f(x)=x⁴-0.5x²+0.1x, starting from x₀=1.
The derivative is f'(x)=4x³-x+0.1. For small step sizes, GD typically settles in the
shallow right local minimum, while momentum can overshoot the barrier and reach the deeper left minimum.
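As a quick check on the two minima and the barrier between them, the sketch below locates the stationary points of f by bisection on f' and classifies each one by the sign of f''. The helper names and the bracketing intervals are assumptions made for illustration; the brackets were picked by checking where f' changes sign.

```python
def f(x):
    return x**4 - 0.5 * x**2 + 0.1 * x

def df(x):
    # First derivative f'(x) = 4x^3 - x + 0.1
    return 4 * x**3 - x + 0.1

def d2f(x):
    # Second derivative f''(x) = 12x^2 - 1
    return 12 * x**2 - 1

def bisect_root(g, lo, hi, tol=1e-10):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)

# Hypothetical brackets: f' changes sign once inside each interval.
brackets = [(-1.0, -0.3), (0.0, 0.3), (0.3, 1.0)]
stationary = [bisect_root(df, lo, hi) for lo, hi in brackets]

for x in stationary:
    kind = "minimum" if d2f(x) > 0 else "maximum"
    print(f"x = {x:.3f}  f(x) = {f(x):.4f}  local {kind}")
```

Comparing the printed f(x) values of the two minima shows which basin is deeper; the local maximum between them is the barrier that GD started at x₀=1 would have to cross.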
Set η≈0.02 and γ≈0.95. You should typically see GD settle into the right local minimum
near x≈0.44, while momentum crosses the barrier near x≈0.10 and reaches the deeper left minimum
near x≈-0.54. Decrease γ to make momentum behave more like GD. Increase η too much and both methods can oscillate or diverge.
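The experiment can be reproduced with a few lines of plain Python. This is a minimal sketch, not a reference implementation: the function names, the step count of 2000, and the specific momentum update (v ← γv + ηf'(x), then x ← x − v) are assumptions, since other momentum conventions exist and may behave differently.

```python
def df(x):
    """Derivative of f(x) = x**4 - 0.5*x**2 + 0.1*x."""
    return 4 * x**3 - x + 0.1

def gradient_descent(x0, eta, steps):
    x = x0
    for _ in range(steps):
        x -= eta * df(x)
    return x

def momentum_descent(x0, eta, gamma, steps):
    # Assumed update rule: v <- gamma*v + eta*f'(x), then x <- x - v.
    x, v = x0, 0.0
    for _ in range(steps):
        v = gamma * v + eta * df(x)
        x -= v
    return x

eta, gamma, steps = 0.02, 0.95, 2000  # steps is an arbitrary choice
x_gd = gradient_descent(1.0, eta, steps)
x_mom = momentum_descent(1.0, eta, gamma, steps)

print(f"GD:       x = {x_gd:.3f}")
print(f"momentum: x = {x_mom:.3f}")
```

Under these assumptions GD should finish near x≈0.44 and momentum near x≈-0.54. Calling momentum_descent with gamma=0.0 reduces the update to plain GD, which is a convenient sanity check before experimenting with other values of γ and η.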