# Deep Learning: Common Architectures

This post is an overview of three of the most common deep learning architectures: multi-layer perceptrons, convolutional neural networks, and recurrent neural networks.

The content in this post is high-level, introducing the main features of each architecture and describing the type of data it tends to be used for. It is complementary to the post Introduction to Deep Learning: What do I need to know…?.

# Multi-layer Perceptron (MLP)

An MLP can have many layers but must have at least three: an input layer, a hidden layer, and an output layer. Every neuron in an MLP is connected to every neuron in the previous layer and to every neuron in the next layer.

MLPs can be used for a variety of data types but are particularly useful for structured (tabular) data. It is good practice to build an MLP first, as an accuracy baseline, before trying more complicated architectures, to check that they are in fact providing a significant improvement.
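The fully connected structure described above can be sketched with plain numpy. This is a minimal forward pass only (no training), with hypothetical layer sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Hypothetical sizes: 4 input features, 8 hidden neurons, 3 outputs.
n_in, n_hidden, n_out = 4, 8, 3

# Every neuron connects to every neuron in the previous layer,
# so each layer is just a weight matrix plus a bias vector.
W1, b1 = rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_out)), np.zeros(n_out)

def mlp_forward(x):
    h = relu(x @ W1 + b1)   # input layer -> hidden layer
    return h @ W2 + b2      # hidden layer -> output layer

x = rng.normal(size=(1, n_in))   # one row of tabular data
y = mlp_forward(x)               # shape (1, 3): one score per output
```

In a real model the weights would be learned by gradient descent rather than drawn at random; the point here is only the layer-by-layer matrix structure.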

# Convolutional Neural Networks (CNN)

Convolutional neural networks tend to be used in computer vision solutions, with images as input. They capture the spatial structure of the data: rather than every pixel being seen as a standalone feature, the fact that pixels are adjacent or in close proximity can be taken into consideration.

Convolutional neural networks take their name from their convolution layers, which try to capture these spatial patterns. They also have pooling layers, which reduce the amount of computation needed in later layers whilst keeping the important information.
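A convolution layer slides a small kernel over the image, and a pooling layer then shrinks the resulting feature map. A minimal numpy sketch of both operations (single channel, no padding or stride, toy sizes):

```python
import numpy as np

def conv2d(image, kernel):
    # "Valid" 2-D convolution (strictly cross-correlation,
    # as implemented in most deep learning libraries).
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Non-overlapping max pooling: keep the strongest
    # response in each window, discarding the rest.
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.default_rng(1).normal(size=(8, 8))  # tiny "image"
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                    # vertical-edge pattern
fmap = conv2d(image, kernel)   # (6, 6) feature map
pooled = max_pool(fmap)        # (3, 3): less work for later layers
```

In a trained CNN the kernel values are learned, and each layer applies many kernels to produce many feature maps.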

After a stack of convolution and pooling layers, the resulting feature maps are flattened into one long vector. This is then passed to a fully connected network (an MLP) to reach the output layer.
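The flattening step is just a reshape. Assuming (hypothetically) that the final conv/pool stage produced 16 feature maps of 3×3 each:

```python
import numpy as np

# Hypothetical output of the last conv/pool stage:
# 16 feature maps, each 3x3.
feature_maps = np.random.default_rng(2).normal(size=(16, 3, 3))

# Flatten into one long vector: 16 * 3 * 3 = 144 values,
# ready to feed into a fully connected (MLP) head.
flat = feature_maps.reshape(-1)   # shape (144,)
```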

# Recurrent Neural Networks (RNN)

A recurrent neural network is named as such because the same computation is repeated at each time step. The architecture takes into consideration that what has happened in the past is likely to impact what happens in the future, which is why it is good for sequential data.

Neurons within an RNN have a “state” which can be interpreted as memory; it can remember important aspects of what has happened and use this to help predict what comes next.

If your data is a time series, the model can take the features at times t-4, t-3, …, t-1 to predict what will happen at time t. Trends and patterns that have previously been seen are likely to be important when predicting what happens next.
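The state-carrying loop can be sketched as a single recurrent cell applied to four past observations. The weights and sizes below are hypothetical; the key point is that the same weights are reused at every time step, and the hidden state h is the "memory":

```python
import numpy as np

rng = np.random.default_rng(3)
n_features, n_hidden = 1, 5

# Shared weights, reused (hence "recurrent") at every time step.
Wx = rng.normal(size=(n_features, n_hidden))  # input -> state
Wh = rng.normal(size=(n_hidden, n_hidden))    # previous state -> state
Wy = rng.normal(size=(n_hidden, 1))           # state -> prediction

def rnn_predict(series):
    h = np.zeros(n_hidden)            # the "memory" starts empty
    for x_t in series:                # values at t-4, t-3, t-2, t-1
        h = np.tanh(x_t @ Wx + h @ Wh)  # state carries the past forward
    return h @ Wy                     # prediction for time t

series = rng.normal(size=(4, n_features))  # four past observations
pred = rnn_predict(series)                 # shape (1,)
```

As with the other sketches, a real RNN would learn Wx, Wh, and Wy from data, typically via backpropagation through time.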

Similarly, for natural language processing, sentences can be passed to a model as a sequence of words. The model should then learn to “remember” the context of what was previously said, as this will impact the likelihood of future words.

# Hybrid Models

Many deep learning solutions, particularly state-of-the-art ones, combine multiple models of different architecture types: the output of one model is passed to another model.

For example, you might pass your input data to a CNN model before an RNN model and then finally pass this output to an MLP which includes the output layer.

# Further Reading

Additional posts on deep learning that might be of interest:

- Introduction to Deep Learning: What do I need to know…?
- Deep Learning: Overview of Neurons and Activation Functions
- Deep Learning: Which Loss and Activation Functions should I use?