Mathematics of Machine Learning

Dissecting Relu: A deceptively simple activation function

What is this post about? The Rectified Linear Unit (Relu) will be beautifully dissected in this post: https://www.youtube.com/watch?v=Un9A90mfO54 This is what you will be able to generate and understand by the end of this post: the evolution of a shallow Artificial Neural Network (ANN) with relu() activation functions during training. The goal is …
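As a quick taste of the topic, here is a minimal sketch of the relu() function itself (a NumPy illustration, not the post's own code):

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: max(0, z), applied element-wise."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))  # negative inputs are clipped to 0, positives pass through
```

Despite its simplicity, this piecewise-linear function is what lets a stack of layers bend straight lines into the curved decision boundaries the post visualizes.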


Stochastic Approximation to Gradient Descent

What will you learn? The video below delivers the main points of this blog post on Stochastic Gradient Descent (SGD) (GitHub code available here): https://youtu.be/gE9HzJ_GaRM In our previous two posts on the gradient, post number 1 and post number 2, we covered gradient descent in all its glory. We even went through …


The Derivative of Softmax(z) Function w.r.t z

What will you learn? Ask any machine learning expert! They will all have to google the answer to this question: “What was the derivative of the Softmax function w.r.t (with respect to) its input again?” The reason behind this forgetfulness is that Softmax(z) is a tricky function, and people tend to forget the process of …
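For the impatient, the answer being teased is the Jacobian: with s = Softmax(z), the standard result is ∂s_i/∂z_j = s_i(δ_ij − s_j), i.e. diag(s) − s sᵀ in matrix form. A small NumPy sketch (my own illustration, verified against finite differences):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift by max(z) for numerical stability
    return e / e.sum()

def softmax_jacobian(z):
    s = softmax(z)
    # d softmax_i / d z_j = s_i * (delta_ij - s_j)  =>  diag(s) - s s^T
    return np.diag(s) - np.outer(s, s)

z = np.array([1.0, 2.0, 0.5])
J = softmax_jacobian(z)

# Sanity check against a central finite-difference approximation
eps = 1e-6
J_num = np.array([
    (softmax(z + eps * np.eye(3)[j]) - softmax(z - eps * np.eye(3)[j])) / (2 * eps)
    for j in range(3)
]).T
print(np.allclose(J, J_num, atol=1e-6))  # True
```

Note that each row of the Jacobian sums to zero, which follows directly from the softmax outputs summing to one.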
