In another post, we covered the nuts and bolts of Stochastic Gradient Descent and how to address problems like getting stuck in a local minimum or at a saddle point. In this post, we take a look at another problem that plagues the training of neural networks: **pathological curvature**.

This is a companion discussion topic for the original entry at https://blog.paperspace.com/intro-to-optimization-momentum-rmsprop-adam/