
ARTIFICIAL NEURAL NETWORK

INTRODUCTION
Artificial neural networks (ANNs), usually called simply "neural networks" (NNs), are based on biological neurons. The first model of a neuron was developed by McCulloch and Pitts in 1943. An ANN is composed of a large number of highly interconnected processing elements called neurons, and it has a good ability to learn from experience in order to improve its performance. An ANN minimizes error using various algorithms and thereby gives good predictions. One major application area of ANNs is forecasting (Sharda, 1994), where ANNs provide an attractive alternative tool for both researchers and practitioners. An ANN is an information processing system based on a mathematical model inspired by the complex, non-linear, and parallel neural structures that process information in the brains of intelligent beings, which acquire knowledge through experience. A large ANN may contain hundreds or even thousands of processing units.
Artificial Neural Networks (ANNs) are non-linear mapping structures based on the function of the human brain. They are powerful tools for modelling, especially when the underlying data relationship is unknown. A neural network is a mathematical function that computes an output from a set of input values. Each neuron involves a mathematical function that uses weights to convert its inputs into an output value, and each neuron can produce only one output at a time, though that output may be broadcast to several other neurons at the same time. During training of the network, the weights are adapted to the input data set: small changes of an input signal will not change the output of a neuron dramatically, and a change of a weight will only affect the output for a certain number of input patterns. The output of a single neuron can be written as
y = b + w1x1 + w2x2 + … + wnxn

x1, x2, …, xn --- the input signals
w1, w2, …, wn --- the corresponding weights
b --- the bias (changes the output independently of the inputs)
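
As an illustration, the following is a minimal Python sketch of this single-neuron computation; the input values, weights, bias, and the choice of a sigmoid activation are assumptions made for the example.

import math

def neuron_output(inputs, weights, bias):
    """Compute y = b + w1*x1 + ... + wn*xn, then apply a sigmoid activation."""
    # Weighted sum of the inputs plus the bias term
    y = bias + sum(w * x for w, x in zip(weights, inputs))
    # The sigmoid squashes the result into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-y))

# Hypothetical values: two inputs, two weights, one bias
print(neuron_output([0.5, -1.2], [0.8, 0.3], bias=0.1))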

Structure of an artificial neural network


An ANN contains neurons, or nodes, that are organized in layers. The layers between the input and output layers are called hidden layers. The input layer contains the predictors, the hidden layers contain unobservable nodes (units), and the output layer contains the responses.
Designing of an ANN model
An ANN is typically composed of layers of nodes. In the popular multilayer perceptron (MLP), all the input nodes are in one input layer, all the output nodes are in one output layer, and the hidden nodes are distributed into one or more hidden layers in between. In designing an MLP, one must determine the following variables (a small sketch follows this list):

• The number of input nodes.
• The number of hidden layers and hidden nodes.
• The number of output nodes.
• The interconnection of the nodes.
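
To make these design variables concrete, here is a minimal Python (NumPy) sketch of how they translate into the weight matrices of a fully connected MLP; the layer sizes are hypothetical choices for the example.

import numpy as np

# Hypothetical design choices: 4 input nodes, one hidden layer of 8 nodes, 1 output node
layer_sizes = [4, 8, 1]

# A fully connected MLP needs one weight matrix and one bias vector per pair of adjacent layers
rng = np.random.default_rng(0)
weights = [rng.standard_normal((m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

for W in weights:
    print(W.shape)   # (4, 8), then (8, 1)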
The number of input nodes
The number of input nodes corresponds to the number of variables in the input vector used to forecast future values. The number of inputs is usually transparent and relatively easy to choose. However, too few or too many input nodes can impair either the learning or the prediction capability of the network.
The number of hidden layers and hidden nodes
For successful applications of neural networks, the hidden layers and nodes play a very important role. It is the hidden nodes in the hidden layers that allow neural networks to detect features, to capture the patterns in the data, and to perform complicated nonlinear mappings between input and output. Theoretical work shows that a single hidden layer is sufficient for an ANN to approximate any complex nonlinear function to any desired accuracy.
The number of output nodes
The number of output nodes is relatively easy to specify, as it is directly related to the problem under study. For a time series forecasting problem, the number of output nodes often corresponds to the forecasting horizon.
Interconnection of the nodes
The network architecture is also characterized by the interconnections of the nodes in the layers. The connections between nodes basically determine the behaviour of the network. For most forecasting and other applications, the networks are fully connected: every node in one layer is connected to all the nodes in the next higher layer, with the exception of the output layer, which has no higher layer to connect to.
Architecture of Neural Networks
There are several types of ANN architecture. The two most widely used are discussed below:

• Feedforward Networks
Feedforward ANNs allow signals to travel one way only, from input to output. There is no feedback (there are no loops), i.e. the output of any layer does not affect that same layer. They are extensively used in pattern recognition.

• Feedback/Recurrent Networks
Feedback networks can have signals travelling in both directions by introducing loops into the network. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. A small sketch contrasting the two architectures is given below.
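
As a rough illustration (with made-up weights, not a full implementation), a feedforward unit maps the current input straight to an output, while a recurrent unit also feeds its own previous output back in:

import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Feedforward: the output depends only on the current input
def feedforward_step(x, w=0.7, b=0.1):
    return sigmoid(w * x + b)

# Recurrent: the output also depends on the previous state (the loop)
def recurrent_step(x, state, w_in=0.7, w_rec=0.5, b=0.1):
    return sigmoid(w_in * x + w_rec * state + b)

state = 0.0
for x in [1.0, 1.0, 1.0]:            # a constant input
    state = recurrent_step(x, state)
    print(state)                      # converges towards an equilibrium value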
Activation function of an ANN
The activation function, also called the transfer function, determines the relationship between the inputs and outputs of a node and of a network. An activation function may be a linear or a nonlinear function; it introduces the degree of nonlinearity that is valuable for most ANN applications. Chen and Chen (1995) identify general conditions for a continuous function to qualify as an activation function. Loosely speaking, any differentiable function can qualify in theory. In practice, only a small number of 'well-behaved' (bounded, monotonically increasing, and differentiable) activation functions are used:

• The sigmoid (logistic) function


f(x) = 1 / (1 + exp(-x))

• The sine or cosine function

f(x) = sin(x) or f(x) = cos(x)

• The hyperbolic tangent (tanh) function

f(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))

• The linear function:


f(x) = x
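
These activation functions can be written directly in Python; the following small sketch simply restates the formulas above.

import math

def logistic(x):
    # Sigmoid (logistic): squashes x into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Hyperbolic tangent: squashes x into (-1, 1)
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def linear(x):
    # Identity: passes the activation through unchanged
    return x

# The sine and cosine activations are simply math.sin and math.cos
for f in (logistic, tanh, linear, math.sin, math.cos):
    print(f.__name__, f(0.5))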
There are some heuristic rules for the selection of the activation function. For example, Klimasauskas (1991) suggests logistic activation functions for classification problems that involve learning about average behaviour, and hyperbolic tangent functions if the problem involves learning about deviations from the average, such as a forecasting problem. However, it is not clear whether different activation functions have major effects on the performance of the networks.
Generally, a network may have different activation functions for different nodes in the same or different layers. Yet almost all networks use the same activation function for the nodes in the same layer. While the majority of researchers use logistic activation functions for hidden nodes, there is no consensus on which activation function should be used for output nodes.
Conventionally, the logistic activation function seems well suited to output nodes for many classification problems, where the target values are often binary. However, for a forecasting problem which involves continuous target values, it is reasonable to use a linear activation function for output nodes.
It is important to note that feedforward neural networks with linear output nodes have the limitation that they cannot model a time series containing a trend. Hence, for this type of neural network, pre-differencing may be needed to eliminate the trend effects (a small differencing sketch is given below). So far, no research has investigated the relative performance of linear and nonlinear activation functions for output nodes, and there are no empirical results to support a preference for one over the other.
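
For reference, pre-differencing simply means modelling the change between consecutive observations rather than the level itself; here is a minimal sketch with a made-up series and an assumed model output.

import numpy as np

# Hypothetical trending series
series = np.array([10.0, 12.0, 15.0, 19.0, 24.0])

# First differences remove a linear trend; the network is trained on these
diffs = np.diff(series)             # [2. 3. 4. 5.]

# A forecast of the next difference is added back to the last observed level
next_diff_forecast = 6.0            # assumed model output, for illustration
next_level = series[-1] + next_diff_forecast
print(diffs, next_level)            # [2. 3. 4. 5.] 30.0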

The Backpropagation (BP) Algorithm


The backpropagation algorithm (Rumelhart and McClelland, 1986) is used in
layered feed-forward ANNs. This means that the artificial neurons are organized in layers,
and send their signals “forward”, and then the errors are propagated backwards. The
network receives inputs by neurons in the input layer, and the output of the network is given
by the neurons on an output layer. There may be one or more intermediate hidden layers.
The backpropagation algorithm uses supervised learning, which means that we provide the
algorithm with examples of the inputs and outputs we want the network to compute, and
then the error (difference between actual and expected results) is calculated. The idea of
the backpropagation algorithm is to reduce this error, until the ANN learns the training
data. The training begins with random weights, and the goal is to adjust them so that the
error will be minimal.
The activation function of the artificial neurons in ANNs implementing the backpropagation algorithm is a weighted sum (the sum of the inputs x_i multiplied by their respective weights w_ji):

A_j(x, w) = Σ_i x_i w_ji

The activation depends only on the inputs and the weights.


If the output function were the identity (output = activation), then the neuron would be called linear. But linear neurons have severe limitations. The most common output function is the sigmoidal function:

O_j(x, w) = 1 / (1 + exp(-A_j(x, w)))

The goal of the training process is to obtain a desired output when certain inputs are given. Since the error is the difference between the actual and the desired output, the error depends on the weights, and we need to adjust the weights in order to minimize the error. We can define the error function for the output of each neuron:

E_j(x, w, d) = (O_j(x, w) - d_j)^2

The error of the network will simply be the sum of the errors of all the neurons in the output layer:

E(x, w, d) = Σ_j (O_j(x, w) - d_j)^2
The backpropagation algorithm now calculates how the error depends on the output, inputs, and weights. After we find this, we can adjust the weights using the method of gradient descent:

Δw_ji = -η ∂E/∂w_ji

This formula can be interpreted in the following way: the adjustment of each weight (Δw_ji) will be the negative of a constant eta (η) multiplied by the dependence of the previous weight on the error of the network, which is the derivative of E with respect to w. The size of the adjustment will depend on η and on the contribution of the weight to the error of the function.
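
Putting the pieces above together, here is a minimal NumPy sketch of this training loop for a single-hidden-layer network; the 2-4-1 layer sizes, the learning rate, and the XOR training data are assumptions chosen for illustration, not part of the original text.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)

# Hypothetical training data: the XOR problem, 2 inputs -> 1 output
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)    # desired outputs

# Assumed network shape: 2 input nodes, 4 hidden nodes, 1 output node
W1, b1 = rng.standard_normal((2, 4)), np.zeros(4)
W2, b2 = rng.standard_normal((4, 1)), np.zeros(1)
eta = 0.5                                          # learning rate (eta)

for epoch in range(5000):
    # Forward pass: weighted sums passed through the sigmoidal output function
    H = sigmoid(X @ W1 + b1)                       # hidden layer outputs
    O = sigmoid(H @ W2 + b2)                       # network outputs

    # Backward pass: derivative of E = sum((O - d)^2) through each layer
    dO = 2 * (O - d) * O * (1 - O)                 # error signal at the output layer
    dH = (dO @ W2.T) * H * (1 - H)                 # error signal at the hidden layer

    # Gradient descent updates: delta_w = -eta * dE/dw
    W2 -= eta * (H.T @ dO);  b2 -= eta * dO.sum(axis=0)
    W1 -= eta * (X.T @ dH);  b1 -= eta * dH.sum(axis=0)

print(np.round(O, 2))   # should approach [[0], [1], [1], [0]]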

Benefits of ANNs

• Usefulness for pattern recognition, classification, generalization, abstraction, and interpretation of incomplete and noisy inputs (e.g. handwriting recognition, image recognition, voice and speech recognition, weather forecasting).

• Providing some human characteristics to problem solving that are difficult to simulate using the logical, analytical techniques of expert systems and standard software technologies (e.g. financial applications).
• Ability to solve new kinds of problems. ANNs are particularly effective at solving
problems whose solutions are difficult, if not impossible, to define. This opened up a
new range of decision support applications formerly either difficult or impossible to
computerize.
• Robustness. ANNs tend to be more robust than their conventional counterparts. They have the ability to cope with incomplete or fuzzy data and can be very tolerant of faults if properly implemented.
• Fast processing speed. Because they consist of a large number of massively
interconnected processing units, all operating in parallel on the same problem, ANNs
can potentially operate at considerable speed (when implemented on parallel
processors).
• Flexibility and ease of maintenance. ANNs are very flexible in adapting their behaviour
to new and changing environments. They are also easier to maintain, with some having
the ability to learn from experience to improve their own performance.
Limitations of ANNs

• ANNs do not produce an explicit model, even though new cases can be fed into them and new results obtained.
• ANNs lack explanation capabilities. Justification for results is difficult to obtain because the connection weights usually do not have obvious interpretations.

ANN application areas:

• Lapedes and Farber (1987, 1988) used two deterministic chaotic time series generated by the logistic map and the Glass-Mackey equation, and designed feedforward neural networks that can accurately mimic and predict such dynamic nonlinear systems. Their results show that ANNs can be used for modelling and forecasting nonlinear time series with very high accuracy.
• Another application of neural network forecasting is in the study of electric load consumption. Load forecasting is an area which requires high accuracy, since the supply of electricity is highly dependent on load demand forecasting. Sandberg (1991) reports that simple ANNs with inputs of temperature information alone perform much better than the currently used regression-based technique in forecasting hourly, peak, and total load consumption. Bacha and Meyer (1992) discuss why ANNs are suitable for load forecasting and propose a system of cascaded subnetworks. Srinivasan et al. (1994) use a four-layer MLP to predict the hourly load of a power system.
• An ANN for rainfall forecasting was developed by French et al. (1992), who employed a neural network to forecast two-dimensional rainfall.

• Tax form processing to identify tax fraud
• Enhancing auditing by finding irregularities
• Bankruptcy prediction
• Customer credit scoring
• Loan approvals
• Credit card approval and fraud detection
• Financial prediction
• Energy forecasting
• Computer access security (intrusion detection and classification of attacks)
• Fraud detection in mobile telecommunication networks

REFERENCES
Bacha, H., Meyer, W., 1992. A neural network architecture for load forecasting. In:
Proceedings of the IEEE International Joint Conference on Neural Networks, 2, pp. 442–447.
Chen, T., Chen, H., 1995. Universal approximation to nonlinear operators by neural networks
with arbitrary activation functions and its application to dynamical systems. IEEE
Transactions on Neural Networks 6 (4), 911–917.
Hayati, M., Mohebi, Z., 2007. Temperature forecasting based on neural network approach. World Applied Sciences Journal 2 (6), 613–620.
Lapedes, A., Farber, R., 1987. Nonlinear signal processing using neural networks: prediction
and system modeling. Technical Report LA-UR-87-2662, Los Alamos National Laboratory,
Los Alamos, NM.
Lapedes, A., Farber, R., 1988. How neural nets work. In: Anderson, D.Z. (Ed.), Neural Information Processing Systems. American Institute of Physics, New York, pp. 442–456.
Srinivasan, D., Liew, A.C., Chang, C.S., 1994. A neural network short-term load forecaster.
Electric Power Systems Research 28, 227–234.
Tang, Z., Fishwick, P.A., 1993. Feedforward neural nets as models for time series forecasting. ORSA Journal on Computing 5 (4), 374–385.
