What are Neural Networks?
Neural networks are computing systems vaguely inspired by the biological neural networks that constitute animal brains.
Artificial neural networks (ANNs) are statistical models directly inspired by, and partially modeled on, biological neural networks. They are capable of modeling and processing nonlinear relationships between inputs and outputs in parallel. The related algorithms are part of the broader field of machine learning and can be used in a wide range of applications.
Artificial neural networks are characterized by adaptive weights along the paths between neurons, which a learning algorithm tunes based on observed data to improve the model. In addition to the learning algorithm itself, one must choose an appropriate cost function.
The cost function is what is used to learn the optimal solution to the problem being solved. This means determining the best values for all tunable model parameters, with the adaptive weights along neuron paths being the primary target, along with algorithm tuning parameters such as the learning rate. This is usually done through optimization techniques such as gradient descent or stochastic gradient descent.
These optimization techniques iteratively drive the ANN's solution as close as possible to the optimal one; when they succeed, the ANN can solve the intended problem with high performance.
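As a minimal sketch of the idea, the toy example below uses plain gradient descent to tune a single adaptive weight `w` so that `w * x` approximates the observed outputs; the data, learning rate, and iteration count are illustrative choices, not prescribed values.

```python
# Toy gradient descent: tune one weight w so that w * x ≈ y.
# The data below follows y = 2x, so w should converge toward 2.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0    # initial guess for the adaptive weight
lr = 0.01  # learning rate, one of the algorithm tuning parameters

for _ in range(1000):
    # Gradient of the mean squared error cost with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step against the gradient to reduce the cost

print(round(w, 3))  # converges toward 2.0
```

Stochastic gradient descent differs only in that each update uses the gradient from a single example (or a small batch) rather than the full dataset.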
Architecturally, an artificial neural network is modeled as layers of artificial neurons: computational units that receive input and apply an activation function, along with a threshold, to determine whether a signal is passed along.
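A single artificial neuron can be sketched in a few lines: it computes a weighted sum of its inputs plus a bias, then applies an activation function. The sigmoid activation and the specific weight values here are illustrative assumptions.

```python
import math

def neuron_output(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs plus a bias,
    passed through a sigmoid activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)

# Illustrative weights and bias; in practice a learning algorithm tunes them.
print(neuron_output([1.0, 0.5], [0.4, -0.2], 0.1))  # ≈ 0.599
```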
Model of Neural Networks
In a simple model, the first layer is the input layer, followed by one hidden layer, and lastly by an output layer. Each layer can contain one or more neurons.
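The simple input-hidden-output structure described above can be sketched as a forward pass through two layers. The layer sizes, weights, and biases below are arbitrary illustrative values, and the sigmoid activation is an assumption for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of all the layer's
    inputs, adds its bias, and applies the activation function."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Two inputs -> a hidden layer of three neurons -> one output neuron.
x = [0.5, -1.0]                                      # input layer
hidden = layer(x, [[0.1, 0.4], [-0.3, 0.2], [0.5, -0.5]],
               [0.0, 0.1, -0.1])                     # hidden layer
output = layer(hidden, [[0.3, -0.6, 0.9]], [0.05])   # output layer
print(output)  # a single value in (0, 1)
```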
Models can become increasingly complex, gaining abstraction and problem-solving capability, by increasing the number of hidden layers, the number of neurons in any given layer, and/or the number of paths between neurons. Note that increased model complexity also increases the chance of overfitting.
Model architecture and tuning are therefore major components of ANN techniques, in addition to the actual learning algorithms themselves. All these characteristics of an ANN can have significant impact on the performance of the model.
Additionally, models are characterized and tunable by the activation function used to convert a neuron’s weighted input to its output activation. There are many different types of transformations that can be used as the activation function, and a discussion of them is out of scope for this article.
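While a full discussion is out of scope, three commonly used activation functions can illustrate the idea of converting a neuron's weighted input to its output activation. This list is illustrative, not exhaustive.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))  # squashes any input into (0, 1)

def tanh(z):
    return math.tanh(z)                # squashes any input into (-1, 1)

def relu(z):
    return max(0.0, z)                 # passes positives, zeroes out negatives

for f in (sigmoid, tanh, relu):
    print(f.__name__, f(-1.0), f(1.0))
```

The choice of activation function affects how gradients flow during training, which is one reason it is a tunable design decision.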
The abstraction of the output that results from transforming input data through neurons and layers is a form of distributed representation, as contrasted with local representation. The meaning carried by a single artificial neuron, for example, is a form of local representation; the meaning of the entire network, however, is distributed across the many transformations spanning neurons and layers.
One thing worth noting is that while ANNs are extremely powerful, they can also be overly complex and are considered black-box algorithms, meaning that their inner workings are exceedingly difficult to understand and explain. The decision to employ ANNs to solve a problem should therefore be made with that in mind.