Mathematics of Machine Learning

Dissecting ReLU: A deceptively simple activation function

What is this post about? By the end of it, you will be able to generate and understand the evolution of a shallow Artificial Neural Network (ANN) with relu() activation functions during training. The goal is to fit the black curve, which means that the ANN is a regressor! …


Stochastic Approximation to Gradient Descent

What will you learn? The video below delivers the main points of this blog post on Stochastic Gradient Descent (SGD): (GitHub code available here)  https://youtu.be/gE9HzJ_GaRM In our previous two posts on the gradient (post number 1 and post number 2), we covered gradient descent in all its glory. We even went through …


Back-propagation with Cross-Entropy and Softmax

What will you learn? This post is also available as a video, should you be interested 😉 https://www.youtube.com/watch?v=znqbtL0fRA0&feature=youtu.be In our previous post, we talked about the derivative of the softmax function with respect to its input. We dissected the math and got comfortable with it! In this post, we will …


The Derivative of Softmax(z) Function w.r.t z

What will you learn? Ask any machine learning expert! They will all have to google the answer to this question: “What was the derivative of the Softmax function w.r.t (with respect to) its input again?” The reason behind this forgetfulness is that Softmax(z) is a tricky function, and people tend to forget the process of …
