Adam Optimizer · Adaptive Moment Estimation
unseel.com · θ ← θ − η·m̂/(√v̂+ε) · one rate per parameter
Step: 0 · η = 0.001 · β₁ = 0.9, β₂ = 0.999 · Phase: (none)
Panels: Gradient g · 1st moment m · 2nd moment v · Adam step
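
The four stages named here (gradient, 1st moment, 2nd moment, Adam step) can be sketched for a single scalar parameter as below. This is a minimal illustration and not the code behind the visualization: the function name `adam_step`, the value ε = 1e-8, and the example gradients are assumptions chosen for the sketch, and the bias-correction terms follow the standard formulation of Adam. The defaults for η, β₁, and β₂ match the values shown above.

```python
import math

def adam_step(theta, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count.

    eps=1e-8 is an assumed value; the source only shows the symbol ε.
    """
    m = beta1 * m + (1 - beta1) * g        # 1st moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g    # 2nd moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)  # Adam step
    return theta, m, v

# Hypothetical example: two parameters whose gradients differ by a factor of 10,000.
params = [1.0, 1.0]
grads = [0.01, 100.0]

new_params = []
for theta, g in zip(params, grads):
    new_theta, m, v = adam_step(theta, g, m=0.0, v=0.0, t=1)
    new_params.append(new_theta)

# Size of the move each parameter made on its first step.
steps = [old - new for old, new in zip(params, new_params)]
print(steps)
```

On the first step the bias-corrected moments reduce to m̂ = g and v̂ = g², so each parameter moves by roughly η regardless of how large its gradient is. That is the sense in which Adam gives "one rate per parameter": the step is scaled by each parameter's own gradient history rather than by a single global magnitude.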