Full Stack Deep Learning
<aside>
💡 80-90% of time is spent debugging and tuning
</aside>
1. Why is DL troubleshooting so hard?
[Poor Model Performance]
- Implementation bugs
- Hyperparameter choices
- Data/model fit
- Dataset construction
    - Not enough data
    - Class imbalances
    - Noisy labels
    - Train / test from different distributions
    - etc.
2. Strategy for DL troubleshooting
Start simple → add complexity gradually
[Decision Tree]

2.1 Start Simple
[Choose a simple architecture]
- If the data is images, start with LeNet; consider ResNet later
- If the data is sequences, start with an LSTM; consider Attention-based models or WaveNet later
- Otherwise, start with a fully connected neural net with one hidden layer and consider more advanced networks later
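The last starting point above fits in a few lines; here is a minimal NumPy sketch of a one-hidden-layer fully connected forward pass (function and parameter names are mine, not from the lecture):

```python
import numpy as np

def fc_one_hidden_forward(x, W1, b1, W2, b2):
    """Simplest starting architecture: one hidden layer, ReLU, linear output."""
    h = np.maximum(0.0, x @ W1 + b1)  # hidden layer with ReLU activation
    return h @ W2 + b2                # linear output (logits)

# Illustrative shapes: 4 input features, 8 hidden units, 3 output classes
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)
logits = fc_one_hidden_forward(rng.normal(size=(16, 4)), W1, b1, W2, b2)
```

Once this baseline trains and overfits a small batch, swapping in a deeper network is a controlled change rather than a leap.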
[Use sensible defaults]
- Optimizer: Adam with learning rate 3e-4
- Activations: ReLU (FC and Conv models), tanh (LSTMs)
- Initialization: He et al. normal (ReLU), Glorot normal (tanh)
- Regularization & data normalization: none
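The two default initializers above are easy to reproduce by hand; a minimal NumPy sketch (helper names are mine, not from the lecture):

```python
import numpy as np

def he_normal(fan_in, fan_out, rng):
    # He et al. normal: std = sqrt(2 / fan_in); pairs well with ReLU
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def glorot_normal(fan_in, fan_out, rng):
    # Glorot normal: std = sqrt(2 / (fan_in + fan_out)); pairs well with tanh
    return rng.normal(0.0, np.sqrt(2.0 / (fan_in + fan_out)), size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W_relu = he_normal(512, 256, rng)   # for a ReLU layer
W_tanh = glorot_normal(512, 256, rng)  # for a tanh layer
```

In practice you would use your framework's built-in equivalents; the point is only that the defaults are principled, not magic.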
[Normalize inputs]
- Subtract the mean and divide by the variance
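As a per-feature NumPy sketch (the lecture says divide by the variance; dividing by the standard deviation is the other common convention, and the epsilon is my addition to avoid division by zero):

```python
import numpy as np

def normalize(x, eps=1e-8):
    # Per-feature: subtract the mean, divide by the variance (as in the notes)
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / (var + eps)

x = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 50.0]])
x_norm = normalize(x)  # each column now has zero mean
```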
[Simplify the problem]
- Start with a small training set (~10,000 examples)
- Use a fixed number of objects, classes, image size, etc.
- Create a simpler synthetic training set
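One way to create such a simpler synthetic training set is two well-separated Gaussian blobs, which any working model should classify almost perfectly; a sketch (the helper and its parameters are illustrative, not from the lecture):

```python
import numpy as np

def make_synthetic_blobs(n_per_class=5000, dim=2, sep=4.0, seed=0):
    # Two well-separated Gaussian clusters: an easy sanity-check dataset
    rng = np.random.default_rng(seed)
    x0 = rng.normal(0.0, 1.0, size=(n_per_class, dim))   # class 0 around origin
    x1 = rng.normal(sep, 1.0, size=(n_per_class, dim))   # class 1 shifted by `sep`
    X = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    perm = rng.permutation(len(y))    # shuffle so batches mix both classes
    return X[perm], y[perm]

X, y = make_synthetic_blobs(n_per_class=100)
```

If the model cannot fit this, the bug is in the implementation, not the data.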
2.2 Implement & debug