Neural networks in a nutshell

This is the first post that does justice to the blog’s motto: show me the code motherfucker. In this and the next n posts with the title “Neural networks in a nutshell – k” I will talk about artificial neural networks, showing concepts (theory) and code (practice). The code will be written in Python without any fancy library such as NumPy, SciPy or PyBrain, just because:

  1. I don’t know how to use any of these.
  2. I don’t have time to learn them now.
  3. The focus is on the concepts, not on performance.

Also I will learn how to type \LaTeX code and insert it.

0 – Overview

Artificial neural networks (ANNs, or simply neural networks) are computational models inspired by animals’ central nervous systems (usually the brain) and used to recognize patterns through machine learning. These models are organized as parallel systems composed of simple processing units (neurons) arranged in a directed graph structure (network) with weighted edges (synapses). Although they are based on biological neural networks, ANNs are just simplified versions of those, with only the necessary parts.

An ANN is a statistical pattern recognition technique, but it is not the only one. ANNs are used when the domain of a problem is not entirely known, so the net must learn from examples how to identify the patterns. For example, imagine a financial institution searching for the best way to determine whether a new client is good for a loan. Given a set of characteristics (inputs) such as income, credit history, age, occupation and criminal record, each with a corresponding classification (output), it is possible to teach the ANN how to identify the majority of good and bad payers. As the examples (the input-output tuples) are presented, the net changes its weights in order to minimize the errors in classifying the inputs. This is an example of an ANN with supervised learning, where the correct (or desired) output is known a priori. Not all ANNs learn with a supervisor, but this is a subject for later posts.

1 – The McCulloch–Pitts neuron

As said before, an ANN is a directed graph in which the nodes are neurons and the edges are synapses. To understand this subject, let’s first see how the McCulloch–Pitts model works, represented by the diagram in Fig. 1. The name was given in recognition of the pioneering work done by Warren McCulloch and Walter Pitts (1943) in modelling neural networks.


Figure 1 Model of a neuron, labelled k.

Here we identify three basic elements of this model:

  1. A set of synapses, each characterized by a weight. For each input (dendrite signal) x_j in the neuron k there is a synaptic weight w_{kj} to multiply it.
  2. A linear combiner or adder for summing the weighted input signals.
  3. An activation function \varphi(.) for limiting the amplitude of the neuron’s output. Here the output lies in the interval [0,1].

The model also has an externally applied bias or threshold, denoted by b_k (sometimes written as w_{k0} with a fixed input x_0 = +1, as in Fig. 2), whose effect is to increase or decrease the net input of the activation function.


Figure 2 Another non-linear model of a neuron, with the bias as an input.
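
The two diagrams describe the same computation. A minimal sketch, assuming illustrative weights and inputs (the function names net_input_fig1 and net_input_fig2 are made up for this example and are not part of the post's code), shows that adding the bias after the sum gives the same net input as treating it as weight w_{k0} on a fixed input x_0 = +1:

```python
def net_input_fig1(weights, bias, inputs):
    """Fig. 1 form: weighted sum of the inputs, then add the bias b_k."""
    u = sum(w * x for w, x in zip(weights, inputs))
    return u + bias


def net_input_fig2(weights, bias, inputs):
    """Fig. 2 form: the bias becomes weight w_k0 on a fixed input x_0 = +1."""
    extended_weights = [bias] + list(weights)
    extended_inputs = [1] + list(inputs)
    return sum(w * x for w, x in zip(extended_weights, extended_inputs))


# Illustrative values only (chosen to be exact in binary floating point).
weights = [0.5, 0.25]
bias = -0.5
inputs = [1, 1]

v_fig1 = net_input_fig1(weights, bias, inputs)
v_fig2 = net_input_fig2(weights, bias, inputs)
print(v_fig1 == v_fig2)
```

With values that are not exactly representable as floats, the two forms may differ in the last decimal places because the additions happen in a different order.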

All this given, we can describe the neuron k by the pair of equations:

  • (1) u_k = \sum_{j=1}^{m} w_{kj} x_j
  • (2) y_k = \varphi(u_k + b_k)

where we can write y_k = \varphi(v_k) with v_k = u_k + b_k. The value of v_k is shifted up or down relative to u_k depending on the sign of b_k, as shown in Fig. 3.


Figure 3 Affine transformation produced by the presence of a bias.

The activation function \varphi(.) is the threshold function, which means the output y_k is 1 for any non-negative value of v_k and 0 otherwise, as shown in Fig. 4.


Figure 4 The threshold function.
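
As a worked example of equations (1) and (2) with the threshold function, take some illustrative values (chosen here just for the arithmetic): m = 3, weights w_{k1} = 0.4, w_{k2} = 0.6, w_{k3} = 0.9, inputs x = (1, 0, 1) and bias b_k = -0.8.

```latex
\begin{aligned}
u_k &= \sum_{j=1}^{3} w_{kj} x_j = 0.4 \cdot 1 + 0.6 \cdot 0 + 0.9 \cdot 1 = 1.3 \\
v_k &= u_k + b_k = 1.3 - 0.8 = 0.5 \\
y_k &= \varphi(v_k) = \varphi(0.5) = 1 \quad \text{since } v_k \geq 0
\end{aligned}
```

With the same weights and inputs but a bias of b_k = -1.5, we would get v_k = -0.2 and therefore y_k = 0: the more negative the bias, the harder it is for the neuron to fire.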

Now we will see some code. First let’s define the threshold function:

def threshold(x):
    # 1 for any non-negative net input, 0 otherwise
    if x >= 0:
        return 1
    return 0

Notice that in this case, for x == 0 the return value is 1, not 1/2 (the output must be one class, it can’t be in the middle). Next comes the class defining the neuron (I am not checking whether inputs and self.weights have the same length):

class MCP_Neuron(object):

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def fire(self, inputs):
        summed = sum([i*w for (i,w) in zip(inputs, self.weights)])
        return threshold(summed + self.bias)

The activation function was not set as an attribute. In a more general model it would (and will) be necessary. Also, the implementation is based on the diagram of Fig. 1, since the one shown in Fig. 2 would be better suited to a circuit implementation. Finally, we test the model (the negative bias makes it harder for the neuron to fire, which is usually what we want):

if __name__ == '__main__':
    neuron_1 = MCP_Neuron([0.2, 0.7, 0.3], -1.5)
    neuron_2 = MCP_Neuron([0.4, 0.6, 0.9], -0.8)
    neuron_3 = MCP_Neuron([0.7, 0.4, -0.9], -0.6)

    inputs = [1, 0, 1]

    print('Test #1 - inputs on neuron_1:', neuron_1.fire(inputs))
    print('Test #1 - inputs on neuron_2:', neuron_2.fire(inputs))
    print('Test #1 - inputs on neuron_3:', neuron_3.fire(inputs))

To make the prompt fit on this page, I created a soft link on the desktop with the command ln -s ~/dados/workspace/The-Men-Who-Stare-at-Codes/posts/neural-networks-in-a-nutshell/code/ neural.

embat@hal9000:~/desktop/neural$ python3.3
Test #1 - inputs on neuron_1: 0
Test #1 - inputs on neuron_2: 1
Test #1 - inputs on neuron_3: 0
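
As mentioned, a more general model needs the activation function as an attribute, and the length of the inputs should be checked. One possible sketch of that idea follows; the class name Neuron, the activation parameter and the error message are assumptions made for this example, not the code from the repository:

```python
def threshold(x):
    """Threshold activation: 1 for non-negative x, 0 otherwise."""
    return 1 if x >= 0 else 0


class Neuron(object):
    """Hypothetical generalization of MCP_Neuron with a pluggable activation."""

    def __init__(self, weights, bias, activation=threshold):
        self.weights = list(weights)
        self.bias = bias
        self.activation = activation  # any function of the net input v_k

    def fire(self, inputs):
        # Refuse inputs that do not match the number of synaptic weights.
        if len(inputs) != len(self.weights):
            raise ValueError('expected %d inputs, got %d'
                             % (len(self.weights), len(inputs)))
        # Equations (1) and (2): weighted sum plus bias, then activation.
        v = sum(i * w for (i, w) in zip(inputs, self.weights)) + self.bias
        return self.activation(v)


neuron = Neuron([0.4, 0.6, 0.9], -0.8)
print(neuron.fire([1, 0, 1]))  # prints 1, same as neuron_2 above
```

Passing a different function as activation changes the neuron's behaviour without touching the class, and a call such as neuron.fire([1, 0]) raises a ValueError instead of silently ignoring the missing input.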

The complete code is in the post directory on GitHub, and that’s all for today, folks. In the next post we will see the Perceptron.


Live long and prosper

