Neural Network From Scratch with NumPy and MNIST

 Neural Network From Scratch with NumPy and MNIST
Neural Network From Scratch with NumPy and MNIST

This article was first published by IBM Developer at, but authored by Casper Hansen. Here is the Direct link.

Creating complex neural networks with different architectures in Python should be a standard practice for any Machine Learning Engineer and Data Scientist. But a genuine understanding of how a neural network works is equally as valuable. This is what we aim to expand on in this article, the very fundamentals on how we can build neural networks, without the help of the frameworks that make it easy for us.

Please open the notebook from GitHub and run the code alongside reading the explanations in this article.

Prerequisite Knowledge

In this specific article, we explore how to make a basic deep neural network, by implementing the forward and backward pass (backpropagation). This requires some specific knowledge on the functionality of neural networks – which I went over in this complete introduction to neural networks.

Neural Networks: Feedforward and Backpropagation Explained
What is neural networks? Developers should understand backpropagation, to figure out why their code sometimes does not work. Visual and down to earth explanation of the math of backpropagation.
Neural Network From Scratch with NumPy and MNIST

It's also important to know the fundamentals of linear algebra, to be able to understand why we do certain operations in this article. I have a series of articles here, where you can learn some of the fundamentals. Though, my best recommendation would be watching 3Blue1Brown's brilliant series Essence of linear algebra.

Essence of linear algebra - YouTube
A geometric understanding of matrices, determinants, eigen-stuffs and more.
Neural Network From Scratch with NumPy and MNIST


We are building a basic deep neural network with 4 layers in total: 1 input layer, 2 hidden layers and 1 output layer. All layers will be fully connected.

We are making this neural network, because we are trying to classify digits from 0 to 9, using a dataset called MNIST, that consists of 70000 images that are 28 by 28 pixels. The dataset contains one label for each image, specifying the digit we are seeing in each image. We say that there are 10 classes, since we have 10 labels.

Neural Network From Scratch with NumPy and MNIST
10 examples of the digits from the MNIST dataset, scaled up 2x.

For training the neural network, we will use stochastic gradient descent; which means we put one image through the neural network at a time.

Let's try to define the layers in an exact way. To be able to classify digits, we must end up with the probabilities of an image belonging to a certain class, after running the neural network, because then we can quantify how well our neural network performed.

  1. Input layer: In this layer, we input our dataset, consisting of 28x28 images. We flatten these images into one array with $28 times 28 = 7

Source - Continue Reading:


Related post