A Hopfield(2) neural network is a type of artificial neural network invented by John Hopfield in 1982. It typically works by first learning a number of binary patterns and then, given an input, returning the stored pattern most similar to it.

What defines a Hopfield network

It is composed of only one layer of nodes, or units, each of which is connected to all the others but not to itself. It is therefore a feedback network, meaning that its outputs are redirected to its inputs. Every unit acts as both an input and an output of the network; thus the numbers of nodes, inputs and outputs of the network are all equal. Additionally, each neuron has a binary state, or activation value, usually represented as 1 or -1, which is its particular output. The state of each node generally converges, meaning that it becomes fixed after a certain number of updates.

An example of a Hopfield neural network with 4 nodes:
[Figure: a Hopfield network with 4 fully connected nodes]
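
To make this structure concrete, here is a minimal Python sketch (using NumPy, with arbitrary random weights rather than learnt ones; all names are illustrative) of the data a 4-node Hopfield network carries:

import numpy as np

n = 4
rng = np.random.default_rng(0)

# Symmetric weight matrix with a zero diagonal: w_ij == w_ji, no self-connections
W = rng.normal(size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

# Binary state (+1 or -1) of each node, which is also its current output
s = rng.choice([-1, 1], size=n)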

How a Hopfield network is updated

The nodes of a Hopfield network can be updated either synchronously or asynchronously. In a synchronously updated network, a clock coordinates the nodes so that they are all updated at the same time. In an asynchronously updated network, the nodes are updated one at a time, with the node to update selected at random. Since there is no clock in biological systems such as the brain to synchronise the updates, asynchronous networks are considered more realistic.

In both cases there exists a connectivity weight between any two distinct nodes, since all nodes are connected, and the weight between a node and itself is taken to be 0. We denote the connectivity weight between node i and node j as w_{ij} = w_{ji}. This weight determines how much impact the state of node j has on the new state of node i when node i is updated, and vice versa. If the weight between two nodes is positive, the states of the two nodes will tend to converge towards the same value; otherwise they will tend to diverge. The weight can thus be seen as an indicator of the final state of a node with respect to the final states of the others. Every node i also has a threshold θi that determines whether or not the input is sufficient to give it a state of 1: if the total input of a node is greater than or equal to its threshold then the node's new state is 1, otherwise it is -1.

The input received by a node i, also called the weighted input sum of node i, can be defined as:
I_i=\sum\limits_{j=1}^n w_{ij}s_j

Where:
- n is the number of nodes in the network
- w_{ij} is the connectivity weight between node i and node j
- s_j is the state of node j

And the new state of node i is then given by:
s_i = sign(I_i-\theta_i)

Where:
- I_i is the weighted input sum of node i
- \theta_i is the threshold of node i
- sign is the sign function, taken here to be 1 when I_i - \theta_i \geq 0 and -1 otherwise
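
Putting the two formulas together, one asynchronous update can be sketched as follows (a minimal Python illustration assuming NumPy arrays s, w and theta for the states, weights and thresholds; the function names are hypothetical):

import numpy as np

def update_node(s, w, theta, i):
    """Update node i in place: s_i = sign(I_i - theta_i), with sign(0) taken as 1."""
    weighted_input = w[i] @ s              # I_i = sum_j w_ij * s_j (w_ii = 0, so s_i contributes nothing)
    s[i] = 1 if weighted_input >= theta[i] else -1
    return s

def async_step(s, w, theta, rng=None):
    """One asynchronous step: a single, randomly chosen node is updated."""
    rng = rng or np.random.default_rng()
    return update_node(s, w, theta, rng.integers(len(s)))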

We can also describe the update of a node in terms of matrices:

The respective states of all neurons can be expressed as a row matrix S:
S= \begin{pmatrix} s_{1} & s_{2} & ... & s_{i} & ... & s_{n} \end{pmatrix}
All the connectivity weights of a Hopfield network can be represented in a weight matrix W:
W= \begin{pmatrix} 0 & w_{12} & ... & w_{1i} & ... & w_{1n} \\ w_{21} & 0 & ... & w_{2i} & ... & w_{2n} \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ w_{i1} & w_{i2} & ... & 0 & ... & w_{in} \\ \vdots & \vdots & & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & ... & w_{ni} & ... & 0 \\ \end{pmatrix}

The weight matrix of a Hopfield network is always a symmetric, zero-diagonal matrix. This follows from the fact that no neuron is connected to itself and the fact that w_{ij} = w_{ji}. It also implies that neuron i influences the new value of neuron j as much as neuron j influences the new value of neuron i, and that the state of a neuron at some time t does not directly influence its state at time t+1.

The thresholds of all neurons form a column matrix T:
T= \begin{pmatrix} \theta_{1} \\ \theta_{2} \\ \vdots \\ \theta_{i} \\ \vdots \\ \theta_{n} \\ \end{pmatrix}
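
With these matrices, a synchronous update of the whole network can be sketched as follows (a hypothetical NumPy version that stores S and T as 1-D arrays rather than explicit row and column matrices):

import numpy as np

def synchronous_update(S, W, T):
    """All nodes update simultaneously from the same previous states."""
    I = S @ W                        # weighted input sums: I_i = sum_j w_ij * s_j (W is symmetric)
    return np.where(I >= T, 1, -1)   # s_i = 1 if I_i >= theta_i, else -1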

Training a Hopfield network

There exist several rules to train a Hopfield network, all of which are based on the same process: the modification of the synaptic weights. Given an input to learn, the connectivity weight between nodes with the same state is increased whereas the weight between nodes with opposite states is decreased. A learning rule can have two useful properties: being local and being incremental. It is local if each update only uses information from the two nodes whose connectivity weight is updated. It is incremental if it does not require any information about the patterns previously learnt by the network. Another major property of a learning rule is its capacity, which corresponds to the number of patterns per neuron that the network can recall. The most common learning rule for Hopfield networks is Hebbian(3) learning, introduced by Donald Hebb in 1949.

For a network with N nodes Hebbian learning updates the connectivity weights according to the formula:
w_{ij}=\frac{1}{N}\sum\limits_{\mu=1}^p \epsilon_{i}^{\mu} \epsilon_{j}^{\mu}

Where:
- N is the number of nodes in the network
- p is the number of patterns to learn
- \epsilon_{i}^{\mu} is the state of node i in the pattern \mu
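
A minimal sketch of this rule, assuming the patterns are given as a NumPy array of shape (p, N) with entries 1 and -1:

import numpy as np

def hebbian_weights(patterns):
    """patterns: array of shape (p, N) whose rows are the patterns to learn."""
    N = patterns.shape[1]
    W = patterns.T @ patterns / N    # entry (i, j) is (1/N) * sum over mu of e_i^mu * e_j^mu
    np.fill_diagonal(W, 0)           # keep the zero diagonal: no self-connections
    return W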

As the formula describes, Hebbian learning consists of setting the weight between nodes i and j to the sum, over all learnt patterns, of the products of the states of i and j, scaled by 1/N. While the Hebbian learning rule is the most well known, the Storkey learning rule, introduced by Amos Storkey in 1997, is also both local and incremental and provides the network with a greater capacity.

The Storkey learning rule was mathematically defined by Amos Storkey as:
w_{ij}^{0}=0\ \forall i,j \quad \text{and} \quad w_{ij}^{\nu}=w_{ij}^{\nu -1}+\frac{1}{n}\xi_{i}^{\nu}\xi_{j}^{\nu}-\frac{1}{n}\xi_{i}^{\nu}h_{ji}^{\nu}-\frac{1}{n}h_{ij}^{\nu}\xi_{j}^{\nu}

Where:
- n is the number of nodes in the network
- \xi_{i}^{\nu} is the state of node i in the pattern \nu
- w_{ij}^{\nu} is the weight between nodes i and j once the first \nu patterns have been learnt
- h_{ij}^{\nu} is the local field, defined as h_{ij}^{\nu}=\sum\limits_{k=1,k\neq i,j}^{n} w_{ik}^{\nu -1}\xi_{k}^{\nu}
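
A sketch of this rule under the same assumptions as before (patterns given as rows of a NumPy array with entries 1 and -1; zeroing the diagonal after each pattern is a choice of this sketch, made to preserve the network's zero self-connections, rather than part of Storkey's formula):

import numpy as np

def storkey_weights(patterns):
    p, n = patterns.shape
    W = np.zeros((n, n))
    for xi in patterns:                          # learn one pattern at a time (incremental)
        net = W @ xi                             # sum_k w_ik * xi_k
        # Local field h_ij excludes the contributions of nodes i and j themselves
        H = net[:, None] - W * xi[None, :] - (np.diag(W) * xi)[:, None]
        W = W + (np.outer(xi, xi) - xi[:, None] * H.T - H * xi[None, :]) / n
        np.fill_diagonal(W, 0)
    return W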

The following graph, published by Storkey, shows the difference in capacity between Hebbian learning and Storkey learning:
[Graph: capacity of the Storkey rule compared with the Hebbian rule]
According to this model, for a network with N nodes we have:
Hebbian learning absolute capacity:
\frac{N}{2\ln(N)}
Storkey learning absolute capacity:
\frac{N}{\sqrt{2\ln(N)}}
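
A quick numerical comparison of these two expressions (a hypothetical Python snippet) shows how the gap grows with the size of the network:

import numpy as np

for N in (100, 1000, 10000):
    hebbian = N / (2 * np.log(N))
    storkey = N / np.sqrt(2 * np.log(N))
    print(f"N={N}: Hebbian ~ {hebbian:.0f} patterns, Storkey ~ {storkey:.0f} patterns")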

The energy of a Hopfield network

In a Hopfield network each node has an associated energy, which is defined as:
E_{i}=-\frac{1}{2}I_{i}s_{i}

Where:
- I_i is the weighted input sum of node i
- s_i is the state of node i

From this expression, adding the contribution of the thresholds, we can deduce the total energy of a network with n nodes:
E=-\frac{1}{2}\sum\limits_{j=1}^{n}\sum\limits_{i=1}^{n}w_{ij}s_{i}s_{j}+\sum\limits_{i=1}^{n}\theta_{i}s_{i}
This expression can now be clarified using the previously defined matrices:
E=-\frac{1}{2}SWS^{T}+T^{T}S^{T}
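
This matrix expression translates directly to code (assuming, as in the earlier sketches, that S and T are stored as 1-D NumPy arrays):

import numpy as np

def network_energy(S, W, T):
    """E = -1/2 * S W S^T + T^T S^T, written with 1-D arrays."""
    return -0.5 * S @ W @ S + T @ S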

If a neuron changes sign when updated, then the energy of the network decreases; otherwise it stays constant. The energy therefore indicates whether or not the network is modified when an update occurs, and the network reaches a local minimum of the energy when it becomes stable. The training patterns all correspond to local minima of the energy, called attractors, so once the network reaches a training pattern it stops evolving. However, some local minima may not correspond to any training pattern; these are called spurious minima and constitute an obvious limit of the Hopfield network.
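
This behaviour is what makes the network usable as an associative memory: starting from a corrupted input, repeated asynchronous updates move the state down the energy landscape until an attractor is reached. A minimal recall loop sketching this (function and parameter names are illustrative):

import numpy as np

def recall(s0, W, theta, max_sweeps=100, seed=0):
    """Repeatedly update nodes asynchronously until a full sweep changes nothing."""
    rng = np.random.default_rng(seed)
    s = s0.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(s)):        # visit nodes in random order
            new_state = 1 if W[i] @ s >= theta[i] else -1
            if new_state != s[i]:
                s[i] = new_state                 # each sign flip lowers the energy
                changed = True
        if not changed:
            break                                # stable state: a (possibly spurious) minimum
    return s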