An overview of gradient descent optimization algorithms (2016) (ruder.io)
Gradient descent is one of the most popular algorithms to perform optimization and by far the most common way to optimize neural networks.