Chapter 6 - Artificial Intelligence notes


6. Applications of AI

Neural Network
 A neuron is a cell in the brain whose principal function is the collection, processing, and
dissemination of electrical signals.
 The brain's information-processing capacity comes from networks of such neurons.
 For this reason, some of the earliest AI work aimed to create such artificial networks. (Other names
are connectionism, parallel distributed processing, and neural computing.)
Units of Neural Network
 Nodes (units):
A node represents a cell of the neural network.
 Links:
Links are directed arrows that show the propagation of information from one node to another.
 Activation:
The activation of a unit is its output value, which is passed along its outgoing links as input to other units.
 Weight:
Each link has a weight associated with it, which determines the strength and sign of the connection.
 Activation function:
The function used to derive the output activation from the input activations of a given node is
called the activation function.
 Bias Weight:
The bias weight is used to set the threshold for a unit. A unit is activated when the weighted sum of its
real inputs exceeds the bias weight.

Q. Why Artificial Neural Networks?


 They are extremely powerful computational devices.
 Massive parallelism makes them very efficient
 They can learn and generalize from training data, so
there is no need for enormous feats of programming
 They are particularly fault tolerant.
 They are very noise tolerant, so they can cope with
situations where normal symbolic systems would have difficulty
 In principle, they can do anything a symbolic/logic system can do, and more.
Brains vs. Computers
 Processing elements: There are about 10^14 synapses in the brain, compared with about 10^8 transistors
in the computer
 Processing speed: around 100 Hz for the brain compared to around 10^9 Hz for the computer
 Style of computation: The brain computes in a parallel and distributed mode, whereas the
computer computes mostly serially and in a centralized way.
 Fault tolerant: The brain is fault tolerant, whereas the computer is not
 Adaptive: The brain learns fast, whereas the computer doesn’t even compare with an infant’s
learning capabilities
 Intelligence and consciousness: The brain is highly intelligent and conscious, whereas the
computer shows no comparable intelligence
 Evolution: Brains have been evolving for tens of millions of years; computers have been
evolving for decades.
Network Structure

Weighting Factors (W)


The values W1, W2, W3, …, Wn are weights that determine the strength of the inputs X1, X2, X3, …, Xn.
 The neuron computes the weighted sum of the input signals and compares the
result with a threshold value, θ.
 X = X1W1 + X2W2 + X3W3 + … + XnWn = Σ (i=1..n) XiWi
 If the net input is less than the threshold (θ), the neuron output is -1.
 If the net input is greater than or equal to the threshold (θ), the neuron becomes
activated and its output attains the value +1.
Threshold (θ)
 The threshold is the offset value of the node. It affects the activation of the node output Y
as
Y = f(X) = f( Σ (i=1..n) XiWi − θ )
Activation function
An activation function f performs a mathematical operation on the net input to produce the output signal.
A common example is the sign function, which outputs +1 when the net input reaches the threshold and -1
otherwise.
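A minimal sketch of this weighted-sum-and-threshold neuron in Python (the weights, threshold, and the
logical-AND example are illustrative values, not from the notes):

```python
import numpy as np

def sign_neuron(x, w, theta):
    """Single neuron with a sign (hard-limit) activation function."""
    net = np.dot(x, w)                # X = X1W1 + X2W2 + ... + XnWn
    return 1 if net >= theta else -1  # +1 if the threshold is reached, else -1

# Example: a 2-input neuron behaving as a logical AND (illustrative values)
print(sign_neuron([1, 1], [0.5, 0.5], theta=0.8))  # +1
print(sign_neuron([1, 0], [0.5, 0.5], theta=0.8))  # -1
```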

Types of Neural Network


1. Single layer feed forward network
A neural network in which all the inputs are connected directly to the outputs is called a single-
layer neural network, or a perceptron network. Since each output unit is independent of the
others, each weight affects only one of the outputs.

2. Multilayer feed forward network (Multilayer Perceptron)


A neural network that contains an input layer, an output layer, and one or more hidden layers is
called a multilayer neural network. The advantage of adding hidden layers is that it enlarges the
hypothesis space. Layers of the network are normally fully connected.
3. Recurrent network (feedback)
A network in which outputs are fed back into the network as inputs, so that the network has an
internal state (memory) and its response depends on previous activations.

Perceptron
• A Perceptron is the simplest kind of feed forward neural network invented by Frank
Rosenblatt
• A perceptron can learn any linearly separable function, given enough training.
• The model consists of a linear combiner followed by an activation function.
• The weighted sum of the inputs is applied to the activation function, which produces an output
equal to +1 if its input is positive and -1 if it is negative.
Perceptron algorithm
1. Initialization
- Set the initial weights wi and threshold θ to random numbers in the range [-0.5, +0.5]
- If the error e(p) is positive, we need to increase the perceptron output Y(p); if it is
negative, we need to decrease Y(p)
2. Activation
- Activate the perceptron by applying inputs xi(p) and desired output Yd(p).
Calculate the actual output at iteration p = 1:
Y(p) = step[ Σ (i=1..n) xi(p)wi(p) − θ ]
where n is the number of perceptron inputs, and step is the step activation function.
The error is e(p) = Yd(p) − Y(p).
3. Weight Training
- Update the weights of the perceptron:
wi(p+1) = wi(p) + Δwi(p), where Δwi(p) is the weight correction at iteration p. The weight
correction is computed by the delta rule:
Δwi(p) = α · xi(p) · e(p), where α is the learning rate
4. Iteration
- Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
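A minimal sketch of this training loop in Python (the step activation with 0/1 outputs, the learning-rate
and threshold values, and the logical-AND training set are illustrative assumptions, not values from the
notes):

```python
import numpy as np

def train_perceptron(X, Yd, alpha=0.1, theta=0.2, epochs=100, seed=0):
    """Train a single perceptron with the delta rule (a sketch of the
    algorithm above; parameter values are illustrative)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, size=X.shape[1])    # Step 1: random initial weights

    for _ in range(epochs):
        errors = 0
        for x, yd in zip(X, Yd):
            y = 1 if np.dot(x, w) >= theta else 0  # Step 2: step activation
            e = yd - y                             # error e(p) = Yd(p) - Y(p)
            w += alpha * x * e                     # Step 3: delta rule
            errors += abs(e)
        if errors == 0:                            # Step 4: stop at convergence
            break
    return w

# Learning the (linearly separable) logical AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Yd = np.array([0, 0, 0, 1])
print(train_perceptron(X, Yd))
```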
Adaline Network
The Adaline network is a variation on the perceptron network:
 inputs are +1 or -1
 outputs are +1 or -1
 it uses a bias input
It is trained using the delta rule, which is also known as the least mean squares (LMS) or Widrow-Hoff
rule. The activation function during training is the identity function; after training, the activation is a
threshold function.
Adaline Algorithm
Step 0: initialize the weights to small random values and select a learning rate, α
Step 1: for each input vector s with target output t, set the input to s
Step 2: compute the neuron inputs
Step 3: use the delta rule to update the bias and weights
Step 4: stop if the largest weight change across all the training samples is less than a specified
tolerance, otherwise cycle through the training set again
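A minimal sketch of these steps in Python, using +1/-1 inputs and targets as described above (the
learning rate, tolerance, initial weight range, and AND example are illustrative assumptions):

```python
import numpy as np

def train_adaline(X, T, alpha=0.01, epochs=50, tol=1e-4, seed=0):
    """Adaline training with the Widrow-Hoff (LMS) rule, following the steps above."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.1, 0.1, size=X.shape[1])  # Step 0: small random weights
    b = rng.uniform(-0.1, 0.1)                   # bias input

    for _ in range(epochs):
        max_change = 0.0
        for x, t in zip(X, T):                   # Step 1: present each input vector
            y_in = b + np.dot(x, w)              # Step 2: net input (identity activation)
            delta = alpha * (t - y_in)           # Step 3: delta rule
            w += delta * x
            b += delta
            max_change = max(max_change, np.max(np.abs(delta * x)))
        if max_change < tol:                     # Step 4: stop at small weight change
            break
    return w, b

def adaline_output(x, w, b):
    """After training, the activation is a threshold (sign) function."""
    return 1 if b + np.dot(x, w) >= 0 else -1

# Bipolar AND as a small example
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
T = np.array([-1, -1, -1, 1], dtype=float)
w, b = train_adaline(X, T)
print([adaline_output(x, w, b) for x in X])
```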
Back propagation
It is a supervised learning method, and is an implementation of the Delta rule. It requires a teacher that
knows, or can calculate, the desired output for any given input. It is most useful for feed-forward
networks (networks that have no feedback, or simply, that have no connections that loop). The term is an
abbreviation for "backwards propagation of errors". Back propagation requires that the activation
function used by the artificial neurons (or "nodes") is differentiable.
Back propagation networks are necessarily multilayer perceptrons (usually with one input, one hidden,
and one output layer). In order for the hidden layer to serve any useful function, multilayer networks
must have non-linear activation functions for the multiple layers: a multilayer network using only linear
activation functions is equivalent to some single-layer, linear network.
Back propagation Algorithm
1. Initialization: Set all the weights and threshold levels of the network to random numbers uniformly
distributed inside a small range [-2.4/Ti, +2.4/Ti], where Ti is the total number of inputs of neuron i in the
network.
2. Activation: Activate the back-propagation neural network by applying inputs xi(p) and desired outputs
yd(p).
 Calculate the actual outputs of the neurons in the hidden layer:
 yj(p) = sigmoid[ Σ (i=1..n) xi(p) · wij(p) − θj ]
where n is the number of inputs of neuron j in the hidden layer.
 Calculate the actual outputs of the neurons in the output layer:
 yk(p) = sigmoid[ Σ (j=1..m) yj(p) · wjk(p) − θk ]
where m is the number of inputs of neuron k in the output layer.
3. Weight training: Update the weights in the back-propagation network by propagating backward the
errors associated with the output neurons.
• Calculate the error gradient of the neurons in the output layer:
δk(p) = yk(p) · [1 − yk(p)] · ek(p), where ek(p) = yd,k(p) − yk(p)
• Calculate the weight corrections:
Δwjk(p) = α · yj(p) · δk(p)
• Update the weights at the output neurons:
wjk(p+1) = wjk(p) + Δwjk(p)
• Calculate the error gradient for the neurons in the hidden layer:
δj(p) = yj(p) · [1 − yj(p)] · Σ (k=1..l) δk(p) · wjk(p), where l is the number of neurons in the output layer
• Calculate the weight corrections:
Δwij(p) = α · xi(p) · δj(p)
• Update the weights at the hidden neurons:
wij(p+1) = wij(p) + Δwij(p)
4. Iteration: Increase iteration p by one, go back to Step 2 and repeat the process until the selected error
criterion is satisfied.
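A minimal sketch of these steps in Python for a 2-input, 2-hidden-unit, 1-output network learning XOR
(the network size, learning rate, epoch count, and the use of biases in place of thresholds are illustrative
assumptions; a different random seed or more epochs may be needed if training settles in a local minimum):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Yd = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.uniform(-0.5, 0.5, (2, 2))   # input -> hidden weights
b1 = np.zeros(2)                      # hidden biases (threshold θj with opposite sign)
W2 = rng.uniform(-0.5, 0.5, (2, 1))   # hidden -> output weights
b2 = np.zeros(1)                      # output bias
alpha = 0.5                           # learning rate

for _ in range(10000):                # Step 4: iterate (here for a fixed epoch count)
    for x, yd in zip(X, Yd):
        yj = sigmoid(x @ W1 + b1)                 # Step 2: hidden-layer outputs
        yk = sigmoid(yj @ W2 + b2)                #         output-layer outputs
        ek = yd - yk                              # Step 3: output error
        delta_k = yk * (1 - yk) * ek              # output-layer error gradient
        delta_j = yj * (1 - yj) * (W2 @ delta_k)  # hidden-layer error gradient
        W2 += alpha * np.outer(yj, delta_k)       # delta-rule weight corrections
        b2 += alpha * delta_k
        W1 += alpha * np.outer(x, delta_j)
        b1 += alpha * delta_j

print([round(sigmoid(sigmoid(x @ W1 + b1) @ W2 + b2).item(), 2) for x in X])
```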
Hopfield network
A Hopfield network is a form of recurrent artificial neural network popularized by John Hopfield in
1982, but described earlier by Little in 1974. Hopfield nets serve as content-addressable memory systems
with binary threshold nodes.
Hopfield networks also provide a model for understanding human
memory.
The units in Hopfield nets are binary threshold units, i.e. the units
only take on two different values for their states and the value is
determined by whether or not the units' input exceeds their
threshold. Hopfield nets normally have units that take on values
of 1 or -1, and this convention will be used throughout this page.
However, other literature might use units that take values of 0 and
1.
Every pair of units i and j in a Hopfield network has a connection that is described by the connectivity
weight wij. In this sense, the Hopfield network can be formally described as a complete undirected graph
G = (V, w), where V is a set of McCulloch-Pitts neurons and w : V × V → R is a function that links
pairs of nodes to a real value, the connectivity weight.
The connections in a Hopfield net typically have the following restrictions:
 wii = 0 for all i (no unit has a connection with itself)
 wij = wji for all i, j (connections are symmetric)
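A minimal sketch in Python of storing +1/-1 patterns and recalling one from a noisy cue (the Hebbian
outer-product training rule, the example patterns, and the update count are illustrative assumptions; the
notes only specify the network's structure):

```python
import numpy as np

def train_hopfield(patterns):
    """Store binary (+1/-1) patterns with a Hebbian rule: sum of outer products,
    zero diagonal (w_ii = 0), symmetric weights (w_ij = w_ji)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)              # no self-connections
    return W / patterns.shape[0]

def recall(W, state, steps=100, seed=0):
    """Asynchronous update: a randomly chosen unit becomes +1 if its net input
    reaches the threshold (here 0), otherwise -1."""
    rng = np.random.default_rng(seed)
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if W[i] @ state >= 0 else -1
    return state

patterns = np.array([[1, -1, 1, -1, 1], [1, 1, -1, -1, 1]])
W = train_hopfield(patterns)
noisy = np.array([1, -1, 1, -1, -1])    # corrupted copy of the first pattern
print(recall(W, noisy))                 # should settle on the stored pattern
```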


Kohonen Network
The Self-Organizing Map (SOM), commonly also known as Kohonen network is a computational
method for the visualization and analysis of high-dimensional data, especially experimentally acquired
information.
The Self-Organizing Map defines an ordered mapping, a kind of projection from a set of given data items
onto a regular, usually two-dimensional grid. A model is associated with each grid node, and these
models are computed by the SOM algorithm. A data item will be mapped into the node whose model is
most similar to the data item, i.e., has the smallest distance from the data item in some metric.
Like a codebook vector in vector quantization, the model is then usually a certain weighted local average
of the given data items in the data space. But in addition to that, when the models are computed by the
SOM algorithm, they are more similar at nearby nodes than at nodes located farther away from each
other on the grid. In this way the set of models can be regarded as constituting a similarity graph, and a
structured 'skeleton' of the distribution of the given data items.
The SOM was originally developed for the visualization of distributions of metric vectors, such as
ordered sets of measurement values or statistical attributes, but it can be shown that a SOM-type mapping
can be defined for any data items, the mutual pairwise distances of which can be defined. Examples of
non-vectorial data that are feasible for this method are strings of symbols and sequences of segments in
organic molecules (Kohonen and Somervuo 2002).
Mathematical definition of the SOM
Consider first data items that are n-dimensional Euclidean vectors
x(t)=[ξ1(t),ξ2(t),…,ξn(t)] .
Here t is the index of the data item in a given sequence. Let the ith model
be mi(t)=[μi1(t),μi2(t),…,μin(t)], where now t denotes the index in the sequence in which the models are
generated. This sequence is defined as a smoothing-type process in which the new value mi(t+1) is
computed iteratively from the old value mi(t) and the new data item x(t) as
mi(t+1) = mi(t) + α(t) hci(t) [x(t) − mi(t)],
where α(t) is a learning-rate factor and hci(t) is a neighborhood function that is largest for the winning
(best-matching) node c and decreases with grid distance from it.
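A minimal sketch of this update rule in Python for n-dimensional data on a 2-D grid (the grid size, the
learning-rate and neighborhood schedules, and the Gaussian neighborhood function are illustrative
assumptions):

```python
import numpy as np

def train_som(data, grid_w=5, grid_h=5, epochs=100, alpha0=0.5, sigma0=2.0, seed=0):
    """Iteratively apply m_i(t+1) = m_i(t) + alpha(t) h_ci(t) [x(t) - m_i(t)]."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    models = rng.random((grid_h, grid_w, dim))             # models m_i on the grid
    coords = np.dstack(np.meshgrid(np.arange(grid_w), np.arange(grid_h)))

    for t in range(epochs):
        alpha = alpha0 * (1 - t / epochs)                  # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 0.5            # shrinking neighborhood
        for x in data:
            # Winning node c: the model with the smallest distance to x
            dists = np.linalg.norm(models - x, axis=2)
            c = np.unravel_index(np.argmin(dists), dists.shape)
            # Neighborhood function h_ci: Gaussian over grid distance to c
            grid_d2 = np.sum((coords - np.array([c[1], c[0]])) ** 2, axis=2)
            h = np.exp(-grid_d2 / (2 * sigma ** 2))
            # Smoothing-type update toward the data item
            models += alpha * h[:, :, None] * (x - models)
    return models

data = np.random.default_rng(1).random((200, 3))           # e.g. random RGB colours
som = train_som(data)
print(som.shape)
```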

Expert System
 An expert system is a computer system whose performance is guided by specific, expert
knowledge in solving problems.
 It is a computer system that simulates the decision-making process of a human expert in a
specific domain.
 The expert system is one of the early (large-scale) successes of artificial intelligence.
 An expert system is an "intelligent" program that solves problems in a narrow problem
area by using high-quality, specific knowledge rather than an algorithm.
 Expert systems are used by most large and medium-sized organizations as a major tool
for improving productivity and quality.
 An expert system's knowledge is obtained from expert sources and coded in a form suitable
for the system to use in its reasoning process.

ES Architecture
The expert system is composed of the following main components:
Knowledge Base
 Knowledge is collected from a number of human experts and codified by a knowledge engineer.
 Knowledge base = Facts + Rules
 Facts are a type of declarative knowledge which describes what is known about a given problem.
They are statements which are asserted into the working memory with either a true or false value.
Definitions, hypotheses, theorems, probabilities, images, measurements, relationships, constraints,
observations, etc. are the facts.
Inference Engines
The inference engine makes inferences by deciding which rules are satisfied by the facts, and fires those
rules. The major tasks performed by the inference engine include:
1. The reasoning task: implement the rules to discover new data
2. The control task: determine the order in which rules are "fired"
3. The explanation task: respond to a user request to explain why certain data is required
4. The how task: respond to a user request to report how a conclusion was reached
5. The uncertainty task: manage uncertainty in the data
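A minimal sketch of the reasoning and control tasks (rule firing by forward chaining) in Python; the
facts and rules are made-up examples, not from the notes:

```python
# Working memory: facts asserted as true
facts = {"engine_does_not_start", "battery_is_flat"}

# Each rule: (set of conditions, conclusion)
rules = [
    ({"engine_does_not_start", "battery_is_flat"}, "charge_battery"),
    ({"engine_does_not_start", "fuel_tank_empty"}, "refuel_car"),
    ({"charge_battery"}, "retry_starting"),
]

# Reasoning task: repeatedly fire rules whose conditions are satisfied by the
# facts, adding their conclusions as new facts, until nothing changes.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)       # the rule "fires"
            changed = True

print(facts)   # now also contains charge_battery and retry_starting
```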
User Interface
The component of an expert system that communicates with the user is known as the user interface. The
communication performed by a user interface is bidirectional. At the simplest level, we must be able to
describe our problem to the expert system, and the system must be able to respond with its
recommendations. We may want to ask the system to explain its “reasoning”, or the system may request
additional information about the problem from us.
Features of an Expert System
Although each expert system has its own particular characteristics, there are several features
common to many systems. The following list from Rule-Based Expert Systems suggests seven
criteria that are important prerequisites for the acceptance of an expert System.
 “The program should be useful.” An expert system should be developed to meet a specific need,
one for which it is recognized that assistance is needed.
 “The program should be usable.” An expert system should be designed so that even a novice
computer user finds it easy to use.
 “The program should be educational when appropriate.” An expert system may be used by non-
experts, who should be able to increase their own expertise by using the system.
 “The program should be able to explain its advice.” An expert system should be able to explain
the “reasoning” process that led it to its conclusions, to allow us to decide whether to accept the
system’s recommendations.
 “The program should be able to respond to simple questions.” Because people with different
levels of knowledge may use the system, an expert system should be able to answer questions
about points that may not be clear to all users.
 “The program should be able to learn new knowledge.” Not only should an expert system be
able to respond to our questions, it also should be able to ask questions to gain additional
information.
 “The program’s knowledge should be easily modified.” It is important that we should be able to
revise the knowledge base of an expert system easily to correct errors or add new information.
Advantages of Expert System
- It provides consistent answers for repetitive decisions, processes and tasks.
- It holds and maintains a significant level of information.
- It encourages organizations to clarify the logic of their decision making.
- It asks questions like a human expert.
Disadvantages of Expert System
- Lack of the common sense needed in some decision making.
- Cannot make a creative response as a human expert would in unusual circumstances.
- Errors may occur in the knowledge base and lead to wrong decisions.
- Cannot adapt to a changing environment, unless the knowledge base is changed.
Applications
 Business
 Manufacturing
 Medicine
 Engineering
 Applied science
 Military
 Space
 Transportation
 Education
 Image analysis
Natural Language Processing
NLP is the process of understanding and generating our natural languages (English, Russian, French,
Nepali, etc.) as a means of human-computer interfacing through voice. It is mostly used in database
systems, expert systems, automatic text translation systems, and text summarization systems. The task
of mapping a sound wave to a string of words is called speech recognition. It faces the problems of
background noise, inter-speaker variation and intra-speaker variation.
NLP = NLU + NLG, where
NLU: speech/text to meaning
NLG: meaning to text/speech
Problems in NLP
 Words can mean different things in different parts of the world. For example: "flat" = house/apartment
(for a British speaker) and "flat" = puncture (for an American speaker).
 One sentence may have multiple meanings. For example: "I saw the Taj Mahal flying over Agra" is
ambiguous about whether the Taj Mahal or the speaker is flying.
 A single word may have multiple meanings. For example: copy = notebook, or copy = transfer data
in a computer.
 Phrases can mean something different taken as a whole than taken word by word. For
example: "get rid of" = release, while "get" = obtain.
Phases of NLP
 Speech recognition: including word-spotting, speech separation, sound classification
 Speech coding: Encoding sound wave to binary code
 Speech synthesis: Production of speech by the computer

NLP Processes
A complete NLP system consists of programs that perform all these functions
Syntactic Analysis
• Syntactic analysis takes an input sentence and produces a representation of its grammatical
structure.
• A grammar describes the valid parts of speech of a language and how to combine them into
phrases.
• The grammar of English is nearly context free (a toy context-free grammar is sketched below).
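A minimal sketch of syntactic analysis with a toy context-free grammar in Python (the grammar, the
sentence, and the use of the nltk package are illustrative assumptions, not from the notes):

```python
import nltk

# A tiny context-free grammar: parts of speech and how they combine into phrases
grammar = nltk.CFG.fromstring("""
S  -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N  -> 'dog' | 'cat'
V  -> 'chased'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the dog chased the cat".split()):
    tree.pretty_print()   # prints the grammatical structure of the sentence
```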
Semantic Analysis
• Semantic analysis is a process of converting the syntactic representations into a meaning
representation. This involves the word sense determination, sentence level analysis, and
knowledge representation
Pragmatic Analysis
• Pragmatics comprises aspects of meaning that depend upon the context or upon facts about real
world.
Computer Vision (or Machine Vision)
Computer vision is the technology concerned with the computational understanding and use of the
information present in visual images. The input image is composed of a large array of pixels, each of
which carries very little information. An individual pixel is meaningless, but when we combine similar
types of pixels, they show something meaningful. This kind of organization of pixels into meaningful
information is the goal of machine vision. In manufacturing, vision-based sensing and interpretation
systems help in automatic inspection, such as identification of cracks, holes, and surface roughness,
counting of objects, and alignment of parts. Typical application areas of computer vision are car
manufacturing, X-ray image analysis, satellite image analysis, analysis of the movement of weather
patterns, etc. The process of computer vision can be outlined as:
a. Image acquisition: Convert the analog image signal into a digital image signal.
b. Image processing: Reduce noise; enhance the image; adjust color and gray levels; etc.
c. Image analysis: Classify the different objects contained in an image.
d. Image understanding: Recognize the classified objects of an image, with their
descriptions and relations to one another. Objects are described according to predefined
information.
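A minimal sketch of the first three steps of this pipeline in Python with OpenCV (the file name and the
parameter values are illustrative assumptions; OpenCV 4.x is assumed for the findContours return value):

```python
import cv2

image = cv2.imread("parts.png")                    # (a) acquisition: load an already digitized image

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)     # (b) processing: gray-level conversion
blurred = cv2.GaussianBlur(gray, (5, 5), 0)        #     and noise reduction

edges = cv2.Canny(blurred, 50, 150)                # (c) analysis: detect edges,
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Objects found: {len(contours)}")           #     then count the separate objects
```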
