In deep learning, proper data scaling is critical.
Neural networks are trained using gradient descent. If input features have very different scales, training can become unstable or inefficient.
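As a minimal sketch of why mismatched scales hurt gradient descent, consider a linear model with two hypothetical features, one ranging over [0, 1] and one over [0, 1000]: the MSE gradient component for each weight scales with the magnitude of its feature, so no single learning rate suits both.

```python
import numpy as np

# Hypothetical two-feature regression problem: feature 0 lies in [0, 1],
# feature 1 in [0, 1000]. The MSE gradient w.r.t. each weight scales with
# the magnitude of the corresponding feature.
rng = np.random.default_rng(0)
X = np.column_stack([rng.random(100), rng.random(100) * 1000.0])
y = X @ np.array([2.0, 0.003]) + rng.normal(0.0, 0.1, 100)

w = np.zeros(2)                         # initial weights
grad = X.T @ (X @ w - y) / len(y)       # gradient of 0.5 * mean squared error

# The gradient component for the large-scale feature dominates by orders
# of magnitude, so one learning rate cannot fit both directions well.
print(grad)
```

A step size small enough to keep the large-gradient direction stable makes progress along the other direction extremely slow, which is the instability/inefficiency described above.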
Normalization (typically min-max scaling into a fixed range such as [0, 1]) and standardization (rescaling to zero mean and unit variance) are two common preprocessing techniques.
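The two techniques can be contrasted on a small made-up sample; this is an illustrative sketch, not tied to any particular library.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Min-max normalization: rescale linearly into the [0, 1] range.
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization: subtract the mean, divide by the standard deviation.
x_std = (x - x.mean()) / x.std()

print(x_norm)  # [0.   0.25 0.5  0.75 1.  ]
print(x_std.mean(), x_std.std())  # mean ~ 0, std ~ 1
```

Normalization fixes the output range; standardization fixes the mean and spread but leaves the range open.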
Neural networks rely on gradient-based optimization, and the magnitudes of those gradients depend directly on the scale of the inputs.
If input values are too large, gradients can explode, and saturating activations such as sigmoid or tanh are driven into their flat regions, where learning stalls.
If input values are too small, gradients can vanish and weight updates become negligible, slowing training.
Proper scaling improves convergence speed, training stability, and often the final model quality.
Standardization transforms each feature so that it has zero mean and unit variance:

z = (x − μ) / σ

where μ is the feature's mean and σ its standard deviation. After transformation, every feature has mean 0 and standard deviation 1.
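In practice this is applied per feature, i.e. column-wise over a feature matrix. A minimal NumPy sketch (the matrix and its scales here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical feature matrix: 200 samples, 3 features on very different scales.
X = rng.normal(loc=[5.0, 500.0, 0.01], scale=[1.0, 50.0, 0.001], size=(200, 3))

mu = X.mean(axis=0)     # per-feature mean
sigma = X.std(axis=0)   # per-feature standard deviation
X_std = (X - mu) / sigma

# Each column now has mean ~ 0 and standard deviation ~ 1.
print(X_std.mean(axis=0))
print(X_std.std(axis=0))
```

With real pipelines, μ and σ should be computed on the training set only and reused on validation and test data, so that no test information leaks into preprocessing.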
Standardization centers the distribution and, because it is a linear transformation, preserves the relative distances between samples.
It works well when the features are roughly Gaussian, or when the model expects zero-centered inputs. Unlike min-max normalization, it does not force values into a fixed range, so a single outlier does not compress the remaining data into a narrow interval.
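The outlier behavior can be seen in a small sketch with made-up numbers: one extreme value squeezes the min-max-normalized inliers into a tiny interval, while the standardized inliers retain more of their spread.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # four inliers plus one outlier

x_minmax = (x - x.min()) / (x.max() - x.min())
x_standard = (x - x.mean()) / x.std()

# Spread of the four inliers after each transform.
spread_minmax = x_minmax[:4].max() - x_minmax[:4].min()
spread_standard = x_standard[:4].max() - x_standard[:4].min()

# Min-max crushes the inliers toward 0 because the outlier defines the range;
# standardization leaves them comparatively better separated.
print(spread_minmax, spread_standard)
```

Neither transform is immune to outliers (the outlier still inflates μ and σ), but standardization degrades more gracefully than a range-based rescaling.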