INTRODUCTION TO
NEURAL
NETWORKS
USING MATLAB 6.0
Information contained in this work has been obtained by Tata
McGraw-Hill, from sources believed to be reliable. However,
neither Tata McGraw-Hill nor its authors guarantee the accuracy
or completeness of any information including the program
listings, published herein, and neither Tata McGraw-Hill nor its
authors shall be responsible for any errors, omissions, or
damages arising out of use of this information. This work is
published with the understanding that Tata McGraw-Hill and its
authors are supplying information but are not attempting to
render engineering or other professional services. If such
services are required, the assistance of an appropriate
professional should be sought.
Tata McGraw-Hill
Copyright © 2006, by Tata McGraw-Hill Publishing Company Limited
Second reprint 2006
RZLYCDRKRQLYC
No part of this publication may be reproduced or distributed in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise or stored in a database or
retrieval system without the prior written permission of the publishers. The program listings
(if any) may be entered, stored and executed in a computer system, but they may not be
reproduced for publication.
This edition can be exported from India only by the publishers,
Tata McGraw-Hill Publishing Company Limited.
ISBN 0-07-059112-1
Published by the Tata McGraw-Hill Publishing Company Limited,
7 West Patel Nagar, New Delhi 110 008, typeset in Times at Script Makers,
19, AI-B, DDA Market, Pashchim Vihar, New Delhi 110063 and
printed at S.P. Printers, E-120, Sector-7, Noida
Cover: Shree Ram Enterprises
Cover Design: Kapil Gupta, Delhi

Contents
Preface xxi
Acknowledgements xxv
1. Introduction to Neural Networks 1
1.1 Neural Processing 1
1.3 The Rise of Neurocomputing 4
1.4 MATLAB - An Overview 5
Review Questions 8
2. Introduction to Artificial Neural Networks 10
2.3 Historical Development of Neural Networks 13
2.4 Biological Neural Networks 15
2.5 Comparison Between the Brain and the Computer 17
2.6 Comparison Between Artificial and Biological Neural Network 17
2.7 Basic Building Blocks of Artificial Neural Networks 19
2.7.1 Network Architecture 19
2.7.2 Setting the Weights 20
2.7.3 Activation Function 22
2.8 Artificial Neural Network (ANN) Terminologies 23
2.8.1 Weights 23
2.8.2 Activation Functions 23
2.8.3 Sigmoidal Functions 24
2.8.4 Calculation of Net Input Using Matrix Multiplication Method 25
2.8.5 Bias 25
2.8.6 Threshold 26
2.9 Summary of Notations Used 27
Summary 29
Review Questions 29
3. Fundamental Models of Artificial Neural Networks
3.2 McCulloch-Pitts Neuron Model 31
3.2.1 Architecture 31
3.3 Learning Rules 43
3.3.1 Hebbian Learning Rule 43
3.3.2 Perceptron Learning Rule 44
3.3.3 Delta Learning Rule (Widrow-Hoff Rule or Least Mean Square (LMS) Rule) 44
3.3.4 Competitive Learning Rule 47
3.3.5 Out Star Learning Rule 48
3.3.6 Boltzmann Learning 48
3.3.7 Memory Based Learning 48
3.4.2 Algorithm 50
3.4.3 Linear Separability 50
Summary 56
Review Questions 57
4. Perceptron Networks
4.2 Single Layer Perceptron 61
4.2.1 Architecture 61
4.2.2 Algorithm 62
4.2.3 Application Procedure 63
4.2.4 Perceptron Algorithm for Several Output Classes 63
4.3 Brief Introduction to Multilayer Networks 84
Summary 85
Review Questions 85
5. Adaline and Madaline 87
5.1 Introduction 87
5.2 Adaline 88
5.2.2 Algorithm 88
5.2.3 Application Algorithm 89
5.3 Madaline 99
Review Questions 107
Exercise Problems 107
6. Associative Memory Networks 109
6.2 Algorithms for Pattern Association 110
6.2.1 Hebb Rule for Pattern Association 110
6.2.3 Extended Delta Rule 111
6.3 Hetero Associative Memory Neural Networks 113
6.3.2 Application Algorithm 113
6.4 Auto Associative Memory Network 129
6.4.1 Architecture 129
6.4.2 Training Algorithm 129
6.4.3 Application Algorithm 130
6.5 Bi-directional Associative Memory 150
6.5.2 Types of Bi-directional Associative Memory Net 151
6.5.3 Application Algorithm 153
6.5.4 Hamming Distance 154
Summary 162
Review Questions 162
Exercise Problems 163
7. Feedback Networks 166
7.2 Discrete Hopfield Net 167
7.2.2 Training Algorithm 167
7.2.3 Application Algorithm 168
7.2.4 Analysis 168
7.3 Continuous Hopfield Net 179
7.4 Relation Between BAM and Hopfield Nets 181
Summary 181
Review Questions 181
8.1 Introduction 184
8.2 _ Back Propagation Network (BPN) 185
8.2.1 Generalized Delta Learning Rule (or) Back Propagation Rule 185
8.2.3 Training Algorithm 187
8.2.4 Selection of Parameters
8.2.5 Learning in Back Propagation 190
8.2.6 Application Algorithm 192
8.2.7 Local Minima and Global Minima 192
8.2.8 Merits and Demerits of Back Propagation Network 193
: .
8.2.9 ions_193
8.3 Radi: Function Network (RBEN} 212
8.3.1 Architecture 2/3
Eléments sous droits d'auteurxi Contents
83.2 Training Algorithm for an RBFN with Fixed Centers 2/3
Summary 217
Review Questions 218
9. Self Organizing Feature Map
9.2 Methods Used for Determining the Winner 221
9.3 Kohonen Self Organizing Feature Maps (SOM) 221
9.3.1 Architecture 222
9.3.2 Training Algorithm 223
9.4 Learning Vector Quantization (LVQ) 237
9.4.2 Training Algorithm 238
9.4.3 Variants of LVQ 239
9.5 Max Net 245
9.5.2 Application Procedure 246
9.6 Mexican Hat
9.6.2 Training Algorithm 250
9.7 Hamming Net 253
9.7.2 Application Procedure 254
Summary 257
Review Questions 257
10. Counter Propagation Network 260
10.1 Introduction 260
10.2 Full Counter Propagation Network (Full CPN) 261
10.2.1 Architecture 261
10.2.2 Training Phases of Full CPN 263
10.2.4 Application Procedure 265
10.3 Forward only Counter Propagation Network 270
10.3.1 Architecture 270
10.3.2 Training Algorithm 271
10.3.3 Application Procedure 272
Summary 274
Review Questions 274
Exercise Problems 274
11. Adaptive Resonance Theory 277
11.1 Introduction 277
11.2.2 Basic Operation 279
11.2.3 Learning in ART 280
11.2.4 Basic Training Steps 280
11.3 ART 1 280
11.3.1 Architecture 281
11.3.2 Algorithm 282
11.4 ART 2 299
11.4.1 Architecture 299
11.4.2 Training Algorithm 300
Summary 309
Review Questions 309
Exercise Problems 310
12. Special Networks 312
12.1 Introduction 312
12.2 Probabilistic Neural Network 312
12.2.1 Architecture 312
12.2.2 Training Algorithm 313
12.2.3 Application Algorithm 314
12.3 Cognitron 314
12.3.2 Training 315
12.3.3 Excitatory Neuron 315
12.3.4 Inhibitory Neuron 316
12.3.5 Training Problems 317
12.4 Neocognitron 318
12.4.1 Architecture 318
12.4.4 Algorithm Calculations 319
12.4.5 Training 320
12.5 Boltzmann Machine 320
12.5.1 Architecture 320
12.5.2 Application Algorithm 321
12.6 Boltzmann Machine with Learning 322
12.6.1 Architecture 323
12.6.2 Algorithm 323
12.7 Gaussian Machine 325
12.8 Cauchy Machine 326
12.9 Optical Neural Networks 327
12.9.1 Electro-optical Matrix Multipliers 327
12.9.2 Holographic Correlators 328
12.10 Simulated Annealing 329
12.10.1 Algorithm 329
12.10.2 Structure of Simulated Annealing Algorithm 331
12.10.3 When to Use Simulated Annealing 332
12.11 Cascade Correlation 333
12.12 Spatio-temporal Neural Network 336
12.12.1 Input Dimensions 336
12.12.2 Long- and Short-term Memory in Spatio-temporal Connectionist Networks 337
12.12.3 Output, Teaching, and Error 337
12.12.4 Taxonomy for Spatio-temporal Connectionist Networks 338
12.12.5 Computing the State Vector 339
12.12.6 Computing the Output Vector 339
12.12.7 Initializing the Parameters 339
12.12.8 Updating the Parameters 340
12.13 Support Vector Machines 340
12.13.1 Need for SVMs 342
12.13.2 Support Vector Machine Classifiers 343
12.14.1 Pulsed Neuron Model (PN Model) 345
12.15 Neuro-dynamic Programming 347
12.15.1 Example of Neuro-dynamic Programming 349
12.15.2 Applications of Neuro-dynamic Programming 350
Summary 350
Review Questions 350
13. Applications of Neural Networks 352
13.1 Applications of Neural Networks in Arts 353
13.1.1 Neural Networks 352
13.1.2 Applications 355
13.1.3 Conclusion 357
13.2 Applications of Neural Networks in Bioinformatics 358
13.2.1 A Bioinformatics Application for Neural Networks 358
13.3 Use of Neural Networks in Knowledge Extraction 360
13.3.1 Artificial Neural Networks on Transputers 361
13.3.2 Knowledge Extraction from Neural Networks 363
13.4 Neural Networks in Forecasting 367
13.4.1 Operation of a Neural Network 368
13.4.2 Advantages and Disadvantages of Neural Networks 368
13.4.3 Applications in Business 369
13.5 Neural Networks Applications in Bankruptcy Forecasting 371
13.5.1 Observing Data and Variables 372
13.5.2 Neural Architecture 372
13.5.3 Conclusion 374
13.6 Neural Networks in Healthcare 374
13.6.1 Clinical Diagnosis 374
13.6.2 Image Analysis and Interpretation 376
13.6.3 Signal Analysis and Interpretation 377
13.6.4 Drug Development 378
13.7 Application of Neural Networks to Intrusion Detection 378
13.7.1 Classification of Intrusion Detection Systems 378
13.7.2 Commercially Available Tools 378
13.7.3 Application of Neural Networks to Intrusion Detection 380
13.7.4 DARPA Intrusion Detection Database 380
13.7.5 Georgia University Neural Network IDS 380
13.7.6 MIT Research in Neural Network IDS 381
13.7.7 UBILAB Laboratory 381
13.7.8 Research of RST Corporation 381
13.7.9 Conclusion 382
13.8.1 Using Intelligent Systems 383
13.8.2 Application Areas of Artificial Neural Networks 383
13.8.3 European Initiatives in the Field of Neural Networks 385
13.8.4 Application of Neural Networks in Efficient Design of RF and
13.8.5 Neural Network Models of Non-linear Sub-systems 387
13.8.6 Modeling the Passive Elements 388
13.8.7 Conclusion 389
13.9 Neural Networks in Robotics 389
13.9.1 Natural Landmark Recognition using Neural Networks for
Autonomous Vacuuming Robots 389
13.9.2 Conclusions 395
13.10 Neural Network in Image Processing and Compression 395
13.10.1 Windows based Neural Network Image Compression and Restoration 395
13.10.2 Application of Artificial Neural Networks for Real Time Data Compression 407
13.10.3 Image Compression using Direct Solution Method Based Neural Network 406
13.10.4 Application of Neural Networks to Wavelet Filter Selection in Multispectral
Image Compression 411
13.10.6 Rotation Invariant Neural Network-based Face Detection 416
13.11 Neural Networks in Business 421
13.11.1 Neural Network Applications in Stock Market Predictions—A
Methodology Analysis 421
13.11.2 Search and Classification of “Interesting” Business Applications
in the World Wide Web Using a Neural Network Approach 427
13.12 Neural Networks in Control 432
13.12.1 Basic Concept of Control Systems 432
13.12.2 Applications of Neural Network in Control Systems 433
13.13 Neural Networks in Pattern Recognition 440
13.13.1 Handwritten Character Recognition 440
13.14 Hardware Implementation of Neural Networks 448
13.14.1 Hardware Requirements of Neuro-Computing 448
13.14.2 Electronic Circuits for Neurocomputing 450
13.14.3 Weights Adjustment using Integrated Circuit 452
14. Applications of Special Networks 455
14.1 Temporal Updating Scheme for Probabilistic Neural Network with Application to
Satellite Cloud Classification 455
14.1.1 Temporal Updating for Cloud Classification 457
14.2 Application of Knowledge-based Cascade-correlation to Vowel Recognition 458
14.2.1 Description of KBCC 459
14.2.2 Demonstration of KBCC: Peterson-Barney Vowel Recognition 460
14.2.3 Discussion 463
14.3 Rate-coded Restricted Boltzmann Machines for Face Recognition 464
14.3.1 Applying RBMs to Face Recognition 465
14.3.2 Comparative Results 467
14.3.3 Receptive Fields Learned by RBMrate 469
14.3.4 Conclusion 469
14.4 MPSA: A Methodology to Parallelize Simulated Annealing and its Application to the
Traveling Salesman Problem 470
14.4.1 Simulated Annealing Algorithm and the Traveling Salesman Problem 470
14.4.2 Parallel Simulated Annealing Algorithms 471
14.4.3 Methodology to Parallelize Simulated Annealing 471
14.4.4 TSP-Parallel SA Algorithm Implementation 474
14.4.5 TSP-Parallel SA Algorithm Test 474
14.5 Application of "Neocognitron" Neural Network for Integral Chip Images Processing 476
14.6 Generic Pretreatment for Spiking Neuron Application on Lip Reading with STANN
(Spatio-Temporal Artificial Neural Networks) 480
14.6.1 STANN 480
14.6.2 General Classification System with STANN 481
14.6.3 A Generic Pretreatment 482
14.6.4 Results 483
14.7 Optical Neural Networks in Image Recognition 484
14.7.1 Optical MVM 485
14.7.2 Input Test Patterns 485
14.7.3 Mapping TV Gray-levels to Weights 486
14.7.4 Recall in the Optical MVM 487
14.7.5 LCLN Spatial Characteristics 488
14.7.6 Thresholded Recall 489
14.7.7 Discussion of Results 489
15. Neural Network Projects with MATLAB 491
15.1 Brain Maker to Improve Hospital Treatment using ADALINE 491
15.1.1 Symptoms of the Patient 492
15.1.2 Need for Estimation of Stay 493
15.1.3 ADALINE 493
15.1.4 Problem Description 493
15.1.5 Digital Conversion 493
15.1.6 Data Sets 494
15.1.7 Sample Data 494
15.1.8 Program for ADALINE Network 495
15.1.9 Program for Di
15.1.10 Program for Digitising the Target 498
15.1.11 Program for Testing the Data 499
15.1.12 Simulation and Results 499
15.1.13 Conclusion 501
15.2 Breast Cancer Detection Using ART Network 502
15.2.1 ART 1 Classification Operation 502
15.2.2 Data Representation Schemes 503
15.2.3 Program for Data Classification using ART 1 Network 508
15.2.4 Simulation and Results 511
15.2.5 Conclusion 512
15.3 Access Control by Face Recognition using Backpropagation Neural Network 513
15.3.1 Approach 513
15.3.2 Face Training and Testing Images 515
15.3.3 Data Description 516
15.3.4 Program for Discrete Training Inputs 518
15.3.5 Program for Discrete Testing Inputs 521
15.3.6 Program for Continuous Training Inputs 523
15.3.7 Program for Continuous Testing Inputs 527
15.3.7 Simulation 529
15.3.8 Results 530
15.3.9 Conclusion 531
15.4 Character Recognition using Kohonen Network 531
15.4.1 Kohonen's Learning Law 531
15.4.2 Winner-take-all 532
15.4.3 Kohonen Self-organizing Maps 532
15.4.4 Data Representation Schemes 533
15.4.5 Description of Data 533
15.4.6 Sample Data 534
15.4.7 Kohonen's Program 536
15.4.8 Simulation Results 540
15.4.9 Kohonen Results 540
15.4.10 Observation 540
15.4.11 Conclusion 540
15.5 Classification of Heart Disease Database using Learning Vector Quantization Artificial
Neural Network 541
15.5.1 Vector Quantization 541
15.5.2 Learning Vector Quantization 541
15.5.3 Data Representation Scheme 542
15.5.4 Sample of Heart Disease Data Sets 543
15.5.5 LVQ Program 543
15.5.6 Input Format 550
15.5.7 Output Format 551
15.5.8 Simulation Results 551
15.5.9 Observation 552
15.5.10 Conclusion 552
15.6 Data Compression using Backpropagation Network 552
15.6.1 Back Propagation Network 552
15.6.2 Data Compression 553
15.6.3 Conventional Methods of Data Compression 553
15.6.4 Data Representation Schemes 554
15.6.5 Sample Data 555
15.6.6 Program for Bipolar Coding 556
15.6.7 Program for Implementation of Backpropagation Network for
Data Compression 557
15.6.8 Program for Testing 561
15.6.9 Results 563
15.6.10 Conclusion 565
15.7 System Identification using CMAC 565
15.7.1 Overview of System Identification 566
15.7.2 Applications of System Identification 566
15.7.3 Need for System Identification 568
15.7.4 Identification Schemes 569
15.7.5 Least Squares Method for Self Tuning Regulators 569
15.7.6 Neural Networks in System Identification 570
15.7.7 Cerebellar Model Arithmetic Computer (CMAC) 572
15.7.8 Properties of CMAC 575
15.7.9 Design Criteria 576
15.7.10 Advantages and Disadvantages of CMAC 578
15.7.11 Algorithms for CMAC 579
15.7.12 Program 581
15.7.13 Results 596
15.7.14 Conclusion 599
15.8 Neuro-fuzzy Control Based on the Nefcon-model under MATLAB/SIMULINK 600
15.8.1 Learning Algorithms 601
15.8.2 Optimization of a Rule Base 602
15.8.3 Description of System Error 602
15.8.4 Example 604
15.8.5 Conclusion 607
16. Fuzzy Systems 608
16.1 Introduction 608
16.2 _ History of the Development of Fuzzy Logic 608
16.3 Operation of Fuzzy Logic 609
16.4 Fuzzy Sets and Traditional Sets 609
16.5 Membership Functions 610
16.6 Fuzzy Techniques 610
16.7 Applications 612
16.8 Introduction to Neuro Fuzzy Systems 612
16.8.1 Fuzzy Neural Hybrids 614
16.8.2 Neuro Fuzzy Hybrids 616
Summary 619
Appendix: MATLAB Neural Network Toolbox 620
A.1 A Simple Example 621
A.1.1 Prior to Training 622
A.1.2 Training Error 622
A.1.3 Neural Network Output Versus the Targets 623
A.2 A Neural Network to Model sin(x) 623
A.2.1 Prior to Training 624
A.2.2 Training Error 624
A.2.3 Neural Network Output Versus the Targets 625
A.3 Saving Neural Objects in MATLAB 626
A.3.1 Examples 627
A.4 Neural Network Object 631
A.4.1 Example 634
A.5 Supported Training and Learning Functions 637
A.5.1 Supported Training Functions 638
A.5.2 Supported Learning Functions 638
A.5.3 Transfer Functions 638
A.5.4 Transfer Derivative Functions 639
A.5.5 Weight and Bias Initialization Functions 639
A.5.6 Weight Derivative Functions 639
A.6.1 Introduction to GUI 640
Summary 645
Index

Preface
The world we live in is becoming ever more reliant on the use of electronic gadgets and computers to
control the behavior of real world resources. For example, an increasing amount of commerce is performed
without a single bank note or coin ever being exchanged. Similarly, airports can safely land and send off
aeroplanes without even looking out of a window. Another, more individual, example is the increasing
use of electronic personal organizers for organizing meetings and contacts. All of these examples share
a similar structure: multiple parties (e.g. aeroplanes, or people) come together to coordinate their activities
in order to achieve a common goal. It is not surprising, then, that a lot of research is being done on how
the mechanics of the coordination process can be automated using computers. This is where neural
networks come in.
Neural networks are important for their ability to adapt. Neural nets represent entirely different models
from those related to other symbolic systems. The difference occurs in the way the nets store and retrieve
information. The information in a neural net is found to be distributed throughout the network and not
localized. The nets are capable of making memory associations. They can handle a large amount of data,
fast and efficiently. They are also fault tolerant, i.e. even if a few neurons fail, it will not disable the
entire system.
The paradigm of artificial neural networks, developed to emulate some of the capabilities of the
human brain, has demonstrated great potential for various low-level computations and embodies salient
features such as learning, fault-tolerance, parallelism and generalization. Neural networks, comprising
processing elements called neurons, are capable of coping with computational complexity, non-linearity
and uncertainty. In view of this versatility of neural networks, it is believed that they hold great potential
as building blocks for a variety of behaviors associated with human cognition. However, the subjective
phenomena such as reasoning and perceptions are often regarded beyond the domain of the neural
network theory. Neural networks can deal with imprecise data and ill-defined activities; thus they offer
low-level computational features.
About the Book
Neural networks are, at present, a much sought-after topic among academicians as well as program
developers. This book is designed to give a broad, yet an in-depth overview of the field of neural networks.
The principles of neural networks are discussed in detail, including information and useful knowledge
available for various network processes. The various algorithms and solutions to the problems given in
the book are well balanced and pertinent to the neural networks research projects, labs and for college
and university level studies. The modern aspects of neural networks have been introduced right from the
basic principles and discussed in an easy-to-understand manner, so that a beginner to the subject is able
to grasp the concept of soft networks with minimal effort.
The wide variety of worked-out examples relevant to the neural network area will help in
reinforcing the concepts explained. The solutions to the problems are programmed using MATLAB
6.0 and the simulated results are given. The MATLAB neural network toolbox is provided in the
Appendix for easy reference.
This book provides the neural network architecture, algorithms and application procedure-oriented
structures to help the reader move into the world of neural networks with ease. It also presents
application of neural networks to a wide variety of fields of current interest. A few field projects are
also included.
Who will Benefit
This book would be an ideal text for undergraduate students of Computer Science, Information
Technology, Electrical and Electronics and Electronics and Communication engineering for their course
on Neural Networks. Those pursuing MCA and taking a course on Neural Networks will find the book
useful. Programmers involved in neural network applications programming will also benefit from this
book.
Organization
The book includes 16 chapters altogether. The chapters are organized as follows:
Chapter 1 gives an introduction to Neural Networks Techniques. An overview of MATLAB is also
discussed.
The preliminaries of the Artificial Neural Network are described in Chapter 2. The discussion is
based on the development of artificial neural net, comparison between the biological neuron and the
artificial neuron, the basic building blocks of a neural net and the terminologies used in neural net. The
summary of notations is given at the end of the chapter.
Chapter 3 deals with the fundamental models of an artificial neural net. The basics of McCulloch
Pitts neuron and the Hebb net along with the concept of linear separability is given. The learning rules
used in neural networks are also described in detail in this chapter.
Chapter 4 provides information regarding the Perceptron Neural Net. The architecture and algorithm
of the perceptron neural net are explained along with suitable example problems. An introduction to
multi layer perceptron is given.
The basic architecture and algorithm along with examples for Adaline and Madaline nets are described
in Chapter 5.
Chapter 6 discusses pattern association nets. Pattern association nets include auto association, hetero
association and bi-directional associative memory net. The learning rules used for pattern association
are also given.
Feedback network is described in Chapter 7. The chapter mainly provides information regarding
Discrete Hopfield and Continuous Hopfield nets. Their architecture, algorithm and application procedure
along with solved examples are discussed in this chapter.
Chapter 8 gives details on feed forward nets. The feed forward nets described here are the Back
Propagation Algorithm and the Radial Basis Function Network. Both the networks are described with
their architecture, algorithm and example problem. The merits and demerits of back propagation algorithm
are also included.
Chapter 9 deals with competitive nets. The nets that come under this category are self-organizing
feature map, learning vector quantization, Max net, Mexican Hat, Hamming net. All these networks are
discussed in detail with their features in this chapter.
The Counter Propagation Net (CPN) used for data compression is discussed in Chapter 10. The two
types of CPN, full CPN and forward only CPN are discussed along with their architecture and algorithms.
Chapter 11 describes the features of Adaptive Resonance Theory (ART). The types of ART, ART
network and ART2 network are described with their respective architecture, algorithms and example
problems.
The information regarding the special nets like Boltzmann machine, cascade correlation, spatio
temporal network, simulated annealing, optical neural net, Cauchy machine, Gaussian machine, cognitron,
neo cognitron, Boltzmann machine with learning, etc. are given in Chapter 12.
Chapter 13 discusses the application of neural network in arts, biomedicine, industrial and control
area, data mining, robotics, pattern recognition, etc. with case studies.
Chapter 14 presents the applications of various special networks dealt with in Chapter 12.
A few projects related to pattern classification and system identification using different networks with
MATLAB programs are discussed in Chapter 15.
Chapter 16 gives a brief introduction to Fuzzy Systems and Hybrid Systems (Fuzzy Neural Hybrid
and Neural Fuzzy Hybrid).
The appendix includes the MATLAB neural network toolbox.
In conclusion, we hope that the reader will find this book a truly helpful guide and a valuable source
of information about the neural networks principles and their numerous practical applications. Critical
comments and suggestions from the readers are welcome as they will help us improve the future editions
of the book.
SN Sivanandam
S Sumathi
SN Deepa

Acknowledgements
First of all, the authors would like to thank the Almighty for granting them perseverance and achievements.
Dr S N Sivanandam, Dr S Sumathi and S N Deepa wish to thank Mr V Rajan, Managing Trustee,
PSG Institutions, Mr C R Swaminathan, Chief Executive, and Dr S Vijayarangan, Principal, PSG
College of Technology, Coimbatore, for their whole-hearted cooperation and encouragement provided
for this endeavor.
The authors are grateful for the support received from the staff members of the Electrical and Electronics
Engineering and Computer Science and Engineering departments of their college.
Dr Sumathi owes much to her daughter S Priyanka, who cooperated even when her time was being
monopolized with book work. She feels happy and proud for the steel-frame support rendered by her
husband. She would like to extend whole-hearted thanks to her parents and parents-in-law for their
constant support. She is thankful to her brother who has always been the "Stimulator" for her progress.
Mrs S N Deepa wishes to thank her husband Mr TS Anand, her daughter Nivethitha TS Anand and
her family for the support provided by them.
Thanks are also due to the editorial and production teams at Tata McGraw-Hill Publishing Company
Limited for their efforts in bringing out this book.

* Scope of neural networks and MATLAB.
* How a neural network is used to learn patterns and relationships in data.
* The aim of neural networks.
* About fuzzy logic.
* Use of MATLAB to develop applications based on neural networks.

1. Introduction to Neural Networks
Artificial neural networks are the result of academic investigations that use mathematical formulations
to model nervous system operations. The resulting techniques are being successfully applied in a variety
of everyday business applications.
Neural networks (NNs) represent a meaningfully different approach to using computers in the work-
place. A neural network is used to learn patterns and relationships in data. The data may be the results of
a market research effort, a production process given varying operational conditions, or the decisions of
a loan officer given a set of loan applications. Regardless of the specifics involved, applying a neural
network is substantially different from traditional approaches.
Traditionally a programmer or an analyst specifically ‘codes’ for every facet of the problem for the
computer to 'understand' the situation. Neural networks do not require explicit coding of the problems.
For example, to generate a model that performs a sales forecast, a neural network needs to be given only
raw data related to the problem. The raw data might consist of history of past sales, prices, competitors’
prices and other economic variables. The neural network sorts through this information and produces an
understanding of the factors impacting sales. The model can then be called upon to provide a prediction
of future sales given a forecast of the key factors.
These advancements are due to the creation of neural network learning rules, which are the algo-
rithms used to ‘learn’ the relationships in the data. The learning rules enable the network to ‘gain knowl-
edge’ from available data and apply that knowledge to assist a manager in making key decisions.
What are the Capabilities of Neural Networks?
In principle, NNs can compute any computable function, i.e. they can do everything a normal digital
computer can do. Especially anything that can be represented as a mapping between vector spaces can
be approximated to arbitrary precision by feedforward NNs (which is the most often used type).
In practice, NNs are especially useful for mapping problems, which are tolerant of some errors, have
lots of example data available, but to which hard and fast rules cannot easily be applied. However, NNs
are, as of now, difficult to apply successfully to problems that concern manipulation of symbols and memory.
Who is Concerned with Neural Networks?
Neural Networks are of interest to quite a lot of people from different fields:
* Computer scientists want to find out about the properties of non-symbolic information processing
with neural networks and about learning systems in general.
* Engineers of many kinds want to exploit the capabilities of neural networks in many areas (e.g.
signal processing) to solve their application problems.
* Cognitive scientists view neural networks as a possible apparatus to describe models of thinking
and conscience (High-level brain function).
* Neuro-physiologists use neural networks to describe and explore medium-level brain function (e.g.
memory, sensory system).
* Physicists use neural networks to model phenomena in statistical mechanics and for a lot of other
tasks.
* Biologists use Neural Networks to interpret nucleotide sequences.
* Philosophers and some other people may also be interested in Neural Networks to gain knowledge
about the human systems namely behavior, conduct, character, intelligence, brilliance and other
psychological feelings. Environmental nature and related functioning, marketing business as well
as designing of any such systems can be implemented via Neural networks.
The development of Artificial Neural Network started 50 years ago. Artificial neural networks (ANNs)
are gross simplifications of real (biological) networks of neurons. The paradigm of neural networks,
which began during the 1940s, promises to be a very important tool for studying the structure-function
relationship of the human brain. Due to the complexity and incomplete understanding of biological
neurons, various architectures of artificial neural networks have been reported in the literature. Most of
the ANN structures used commonly for many applications often consider the behavior of a single neu-
ron as the basic computing unit for describing neural information processing operations. Each comput-
ing unit, i.e. the artificial neuron in the neural network is based on the concept of an ideal neuron. An
ideal neuron is assumed to respond optimally to the applied inputs. However, experimental studies in
neuro-physiology show that the response of a biological neuron appears random and only by averaging
many observations it is possible to obtain predictable results. Inspired by this observation, some re-
searchers have developed neural structures based on the concept of neural populations.
In common with biological neural networks, ANNs can accommodate many inputs in parallel and
encode the information in a distributed fashion. Typically the information that is stored in a neural net is
shared by many of its processing units. This type of coding is in sharp contrast to traditional memory
schemes, where a particular piece of information is stored in only one memory location. The recall
process is time consuming and generalization is usually absent. The distributed storage scheme provides
many advantages, most important of them being the redundancy in information representation. Thus, an
ANN can undergo partial destruction of its structure and still be able to function well. Although redun-
dancy can also be built into other types of systems, ANN has a natural way of implementing this. The
result is a natural fault-tolerant system which is very similar to biological systems.
The aim of neural networks is to mimic the human ability to adapt to changing circumstances and the
current environment. This depends heavily on being able to learn from events that have happened in the
past and to be able to apply this to future situations. For example the decisions made by doctors are
rarely based on a single symptom because of the complexity of the human body, since one symptom
could signify any number of problems. An experienced doctor is far more likely to make a sound deci-
sion than a trainee, because from his past experience he knows what to look out for and what to ask, and
may have etched on his mind a past mistake, which he will not repeat. Thus the senior doctor is in a
superior position than the trainee. Similarly it would be beneficial if machines, too, could use past events
as part of the criteria on which their decisions are based, and this is the role that artificial neural net-
works seek to fill.
Artificial neural networks consist of many nodes, i.e. processing elements analogous to neurons in the
brain. Each node has a node function associated with it which, along with a set of local parameters,
determines the output of the node, given an input. Modifying the local parameters may alter the node
function. Artificial Neural Networks thus is an information-processing system. In this information-pro-
cessing system, the elements called neurons, process the information. The signals are transmitted by
means of connection links. The links possess an associated
weight, which is multiplied along with the incoming signal
(net input) for any typical neural net. The output signal is ob-
tained by applying activations to the net input.
The neural net can generally be a single layer or a multi-
layer net. The structure of the simple artificial neural net is
shown in Fig. 1.1.
Figure 1.1 shows a simple artificial neural net with two
input neurons (x1, x2) and one output neuron (y). The inter-
connected weights are given by w1 and w2. In a single layer
net there is a single layer of weighted interconnections.

Fig. 1.1 A Simple Artificial Neural Net
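As a rough numerical illustration of this computation, the MATLAB sketch below forms the net input of the net in Fig. 1.1 and applies a simple step activation; the input values, weights and threshold are assumed purely for demonstration and are not taken from the text.

% Illustrative sketch of the two-input, one-output net of Fig. 1.1.
% All numeric values below are assumed for demonstration purposes.
x1 = 0.6;  x2 = 0.3;          % activations of the two input neurons
w1 = 0.4;  w2 = 0.7;          % interconnection weights
y_in = x1*w1 + x2*w2;         % net input to the output neuron
theta = 0.5;                  % assumed firing threshold
if y_in >= theta
    y = 1;                    % output neuron fires
else
    y = 0;                    % output neuron stays inactive
end
disp([y_in y])                % display net input and output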
A typical multi-layer artificial neural network, abbreviated as MNN, comprises an input layer, an output layer and a hidden (intermediate) layer of neurons. MNNs are often called layered networks. They can implement arbitrary complex input/output mappings or decision surfaces separating different patterns. A three-layer MNN is shown in Figs 1.2 and 1.3. The layer of input units is connected to a layer of hidden units, which is connected to the layer of output units. The activity of neurons in the input layer represents the raw information that is fed into the network. The activity of neurons in the hidden layer is determined by the activities of the input neurons and the connecting weights between the input and hidden units. Similarly, the behavior of the output units depends on the activity of the neurons in the hidden layer and the connecting weights between the hidden and the output layers. This simple neural structure is interesting because neurons in the hidden layers are free to construct their own representation of the input.

Fig. 1.2 A Densely Interconnected Three-layered Static Neural Network. Each Shaded Circle, or Node, Represents an Artificial Neuron

Fig. 1.3 A Block Diagram Representation of a Three-layered MNN
MNNs provide no increase in computational power over a single-layer neural network unless there is
a nonlinear activation function between layers. Many capabilities of neural networks, such as nonlinear
functional approximation, learning, generalization, etc. are in fact performed due to the nonlinear activa-
tion function of each neuron.
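To make the layered computation concrete, the following MATLAB sketch (with arbitrarily assumed weights, not values from the text) propagates one input vector through a hidden layer and an output layer, using the logistic sigmoid as the nonlinear activation between layers.

% Forward pass through a three-layer MNN with a logistic activation.
% The weight matrices are assumed values used only for illustration.
x = [0.2 0.9];                      % 1x2 input vector
V = [0.1 0.4 -0.3; 0.6 -0.2 0.5];   % 2x3 input-to-hidden weights
W = [0.3; -0.7; 0.8];               % 3x1 hidden-to-output weights
z = 1./(1 + exp(-(x*V)));           % hidden layer activity (logistic function)
y = 1./(1 + exp(-(z*W)));           % output layer activity
disp(y)                             % network output for this input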
ANNs have become a technical folk legend. The market is flooded with new, increasingly technical
software and hardware products, and many more are sure to come. Among the most popular hardware
implementations are Hopfield, Multilayer Perceptron, Self-organizing Feature Map, Learning Vector
Quantization, Radial Basis Function, Cellular Neural, and Adaptive Resonance Theory (ART) networks,
Counter Propagation networks, Back Propagation networks, Neo-cognitron, etc. As a result of the exist-
ence of all these networks, the application of the neural network is increasing tremendously.
Thus artificial neural networks represent a major extension of computation. They perform op-
erations similar to those of the human brain. Hence it is reasonable to expect a rapid increase in our
understanding of artificial neural networks leading to improved network paradigms and a host of appli-
cation opportunities.
1.3 The Rise of Neurocomputing
A majority of information processing today is carried out by digital computers. This has led to the
widely held misperception that information processing is dependent on digital computers. However, if
we look at cybernetics and the other disciplines that form the basis of information science, we see that
information processing originates with living creatures in their struggle to survive in their environ-
ments, and that the information being processed by computers today accounts for only a small part —
the automated portion — of this. Viewed in this light, we can begin to consider the possibility of infor-
mation processing devices that differ from conventional computers. In fact, research aimed at realizing
a variety of different types of information processing devices is already being carried out, albeit in the
shadows of the major successes achieved in the realm of digital computers. One direction that this
research is taking is toward the development of an information processing device that mimics the struc-
tures and operating principles found in the information processing systems possessed by humans and
other living creatures.
Digital computers developed rapidly in and after the late 1940s and, after originally being applied to
the field of mathematical computations, have found expanded applications in a variety of areas, like text
(word), symbol, image and voice processing, i.e. pattern information processing, robotic control and
artificial intelligence. However, the fundamental structure of digital computers is based on the principle
of sequential (serial) processing, which has little if anything in common with the human nervous system.
The human nervous system, it is now known, consists of an extremely large number of nerve cells, or
neurons, which operate in parallel to process various types of information. By taking a hint from the
structure of the human nervous system, we should be able to build a new type of advanced parallel
information processing device.
In addition to the increasingly large volumes of data that we must process as a result of recent devel-
opments in sensor technology and the progress of information technology, there is also a growing re-
quirement to simultaneously gather and process huge amounts of data from multiple sensors and other
sources. This situation is creating a need in various fields to switch from conventional computers that
process information sequentially, to parallel computers equipped with multiple processing elements
aligned to operate in parallel to process information.
Besides the social requirements just cited, a number of other factors have been at work during the
1980’s to prompt research on new forms of information processing devices. For instance, recent neuro-
physiological experiments have shed considerable light on the structure of the brain, and even in fields
such as cognitive science, which study human information processing processes at the macro level, we
are beginning to see proposals for models that call for multiple processing elements aligned to operate in
parallel. Research in the fields of mathematical science and physics is also concentrating more on the
mathematical analysis of systems comprising multiple elements that interact in complex ways. These
factors gave birth to a major research trend aimed at clarifying the structures and operating principles
inherent in the information processing systems of human beings and other animals, and constructing an
information processing device based on these structures and operating principles. The term
‘neurocomputing’ is used to refer to the information engineering aspects of this research.
1.4 MATLAB - An Overview
Dr. Cleve Moler, chief scientist at MathWorks, Inc., originally wrote MATLAB to provide easy access
to matrix software developed in the LINPACK and EISPACK projects. The first version was written in
the late 1970s for use in courses in matrix theory, linear algebra, and numerical analysis. MATLAB is
therefore built upon a foundation of sophisticated matrix software, in which the basic data element is a
matrix that does not require predimensioning.
MATLAB is a product of The MathWorks, Inc. and is an advanced interactive software package
specially designed for scientific and engineering computation. The MATLAB environment integrates
graphical illustrations with precise numerical calculations, and is a powerful, easy-to-use, and compre-
hensive tool for performing all kinds of computations and scientific data visualization. MATLAB has
proven to be a very flexible and useful tool for solving problems in many areas. MATLAB is a high-
performance language for technical computing. It integrates computation, visualization and program-
ming in an easy-to-use environment where problems and solutions are expressed in familiar mathemati-
cal notation. Typical areas of application of MATLAB include:
* Math and computation
* Algorithm development
* Modeling, simulation and prototyping
* Data analysis, exploration, and visualization
* Scientific and engineering graphics
* Application development, including graphical user interface building
MATLAB is an interactive system whose basic element is an array that does not require dimensioning.
This helps in solving many computing problems, especially those with matrix and vector formulations,
in a fraction of the time it would take to write a program in a scalar non-interactive language such as C
or FORTRAN. Mathematics is the common language of science and engineering. Matrices, differential
equations, arrays of data, plots and graphs are the basic building blocks of both applied mathematics and
MATLAB. It is the underlying mathematical base that makes MATLAB accessible and powerful.
MATLAB allows expressing the entire algorithm in a few dozen lines, to compute the solution with
great accuracy in about a second. Therefore it is especially helpful for technical analysis, algorithm
prototyping and application development.
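As a small, self-contained illustration of this matrix/vector orientation, a few typical statements are sketched below; the numbers are arbitrary and serve only to show that arrays need no predimensioning and that whole-array operations replace explicit loops.

% Matrix and vector operations typed directly at the MATLAB prompt.
A = [4 1; 2 3];          % 2x2 coefficient matrix
b = [1; 2];              % right-hand side column vector
x = A\b;                 % solve the linear system A*x = b
t = 0:0.1:1;             % row vector created without predimensioning
s = t.^2 + 2*t;          % elementwise operations on the whole vector
disp(x')                 % solution of the linear system
disp(s)                  % values of s at each element of t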
MATLAB's two- and three-dimensional graphics are object oriented. MATLAB is thus both an envi-
ronment and a matrix/vector-oriented programming language, which enables the user to build his own
reusable tools. The user can create his own customized functions and programs (known as M-files) in
MATLAB code. The Toolbox is a specialized collection of M-files for working on particular classes of
problems. MATLAB Documentation Set has been written, expanded and put online for ease of use. The
set includes online help, as well as hypertext-based and printed manuals. The commands in MATLAB
are expressed in a notation close to that used in mathematics and engineering. There is a very large set of
these commands and functions, known as MATLAB M-files. As a result, solving problems through
MATLAB is faster than with other traditional programming languages. It is easy to modify the functions since most
of the M-files can be opened and modified. For ensuring high performance, the MATLAB software has
been written in optimized C and coded in assembly language.
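As an example of such a user-created function, the short M-file below defines a logistic activation function; the file name binsig.m and the function itself are hypothetical examples written for this sketch, not functions shipped with MATLAB.

function y = binsig(x)
% BINSIG  Binary sigmoidal (logistic) activation function.
%         Example of a user-defined M-file; save it as binsig.m and call it
%         like any built-in command, e.g.  y = binsig([-1 0 2])
y = 1./(1 + exp(-x));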
The main features of MATLAB can be summarized as:
* Advanced algorithms for high-performance numerical computations, especially in the field of ma-
trix algebra.
* A large collection of predefined mathematical functions and the ability to define one's own func-
tions.
* Two- and three-dimensional graphics for plotting and displaying data.
* A complete online help system.
* Powerful, matrix/vector-oriented, high-level programming language for individual applications.
* Ability to cooperate with programs written in other languages and for importing and exporting
formatted data.
* Toolboxes available for solving advanced problems in several application areas.
Figure 1.4 shows the main features and capabilities of MATLAB.
Fig. 1.4 Features and Capabilities of MATLAB (built-in and user-written functions; 2-D and 3-D graphics; computations such as linear algebra, data analysis and signal processing; external interfaces to C and FORTRAN programs; and toolboxes such as Signal Processing, Image Processing, Control System, Optimization, Neural Networks, Communications, Robust Control, Statistics and Splines)
An optional extension of the core of MATLAB called SIMULINK is also available. SIMULINK
means SIMUlating and LINKing the environment. SIMULINK is an environment for simulating linear
and non-linear dynamic systems, by constructing block diagram models with an easy to use graphical
user interface.
SIMULINK is a MATLAB toolbox designed for the dynamic simulation of linear and non-linear
systems as well as continuous and discrete-time systems. It can also display information graphically.
MATLAB is an interactive package for numerical analysis, matrix computation, control system design,
and linear system analysis and design available on most CAEN (Computer Aided Engineering
Network) platforms (Macintosh, PCs, Sun, and Hewlett-Packard). In addition to the standard functions
provided by MATLAB, there exist a large set of Toolboxes, or collections of functions and procedures,
available as part of the MATLAB package. Toolboxes are libraries of MATLAB functions used to
customize MATLAB for solving particular class of problems. Toolboxes are a result of some of the
world's top researchers in specialized fields. They are equivalent to prepackaged "off-the-shelf" software
solution for a particular class of problem or technique. It is a collection of special files called M-files
that extend the functionality of the base program. The various Toolboxes available are:
* Control System: Provides several features for advanced control system design and analysis.
* Communications: Provides functions to model the components of a communication system's physical
layer.
* Signal Processing: Contains functions to design analog and digital filters and apply these filters to
data and analyze the results.
* System Identification: Provides features to build mathematical models of dynamical systems based
on observed system data.
* Robust Control: Allows users to create robust multivariable feedback control system designs based
on the concept of the singular-value Bode plot.
* Simulink: Allows you to model dynamic systems graphically.
* Neural Network: Allows you to simulate neural networks.
* Fuzzy Logic: Allows for manipulation of fuzzy systems and membership functions.
* Image Processing: Provides access to a wide variety of functions for reading, writing, and filtering
images of various kinds in different ways.
* Analysis: Includes a wide variety of system analysis tools for varying matrices.
* Optimization: Contains basic tools for use in constrained and unconstrained minimization prob-
lems.
* Spline: Can be used to find approximate functional representations of data sets.
* Symbolic: Allows for symbolic (rather than purely numeric) manipulation of functions.
* User Interface Utilities: Includes tools for creating dialog boxes, menu utilities, and other user
interactions for script files.
MATLAB has been used as an efficient tool throughout this text to develop the applications based on
Neural Nets.
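As a brief, hedged illustration, assuming the Neural Network Toolbox functions newff, train and sim available with MATLAB 6.x, a typical session looks roughly like the sketch below; the training data are made up purely for demonstration and are not from the text.

% Rough sketch of driving the Neural Network Toolbox from MATLAB 6.x.
P = [0 1 2 3 4 5];                        % input samples (one per column)
T = [0 1 4 9 16 25];                      % corresponding target values
net = newff(minmax(P), [5 1], {'tansig','purelin'});  % 5 hidden neurons
net.trainParam.epochs = 100;              % limit the number of training epochs
net = train(net, P, T);                   % train with the toolbox default method
Y = sim(net, P);                          % simulate the trained network
disp([T; Y])                              % compare targets with network outputs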
Review Questions
1.1 How did neurocomputing originate?
1.2 What is a multilayer net? Describe with a neat sketch.
1.3 State some of the popular neural networks.
1.4 Briefly discuss the key characteristics of MATLAB.
1.5 List the basic arithmetic operations that can be performed in MATLAB.
1.6 What is the necessity of SIMULINK package available in MATLAB?
1.7 Discuss in brief about the GUI toolbox feature of MATLAB.
1.8 What is meant by a toolbox? List some of the toolboxes available for MATLAB.

* The preliminaries of Artificial Neural Networks.
* Definition of an artificial neuron.
* The development of neural networks.
* Comparison between the biological neuron and the artificial neuron based on speed, fault tolerance, memory, control mechanism, etc.
* The method of setting the value for the weights.
* Various methods of training, viz. supervised and unsupervised training.
* Basic building blocks of the artificial neural network, i.e. network architecture, setting the weights, activation function, etc.
* The activation function used to calculate the output response of a neuron.
* Summary of notations used all over in this text.

2. Introduction to Artificial Neural Networks
The basic preliminaries involved in the Artificial Neural Network (ANN) are described in this chapter.
A brief summary of the history of neural networks is given in terms of the development of architectures and
algorithms; the structure of the biological neuron is discussed and compared with the artificial neuron.
The basic building blocks and the various terminologies of the artificial neural network are explained
towards the end of the chapter. The chapter concludes by giving the summary of notations, which are
used in all the network algorithms, architectures, etc. discussed in the forthcoming chapters.
Artificial neural networks are nonlinear information (signal) processing devices, which are built from
interconnected elementary processing devices called neurons.
An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the
way biological nervous systems, such as the brain, process information. The key element of this para-
digm is the novel structure of the information processing system. It is composed of a large number of
highly interconnected processing elements (neurons) working in unison to solve specific problems. ANNs,
like people, learn by example. An ANN is configured for a specific application, such as pattern recogni-
tion or data classification, through a learning process. Learning in biological systems involves adjust-
ments to the synaptic connections that exist between the neurons. This is true of ANNs as well.
ANNs are a type of artificial intelligence that attempts to imitate the way a human brain works.
Rather than using a digital model, in which all computations manipulate zeros and ones, a neural net-
work works by creating connections between processing elements, the computer equivalent of neurons.
The organization and weights of the connections determine the output.
A neural network is a massively parallel-distributed processor that has a natural propensity for stor-
ing experimental knowledge and making it available for use. It resembles the brain in two respects:
1. Knowledge is acquired by the network through a learning process, and,
2. Inter-neuron connection strengths known as synaptic weights are used to store the knowledge.
Neural networks can also be defined as parameterized computational nonlinear algorithms for (nu-
merical) data/signal/image processing. These algorithms are either implemented on a general-purpose
computer or are built into a dedicated hardware.
Artificial Neural Networks thus is an information-processing system. In this information-processing
system, the elements, called neurons, process the information. The signals are transmitted by means of
connection links. The links possess an associated weight, which is multiplied along with the incoming
signal (net input) for any typical neural net. The output signal is obtained by applying activations to the
net input.
An artificial neuron is characterized by:
1, Architecture (connection between neurons)
2. Training or learning (determining weights on the connections)
3. Activation function
All these are discussed in detail in the forthcoming
subsections. The structure of the simple artificial neural
network is shown in Fig. 2.1.
Figure 2.1 shows a simple artificial neural network with two input neurons (x1, x2) and one output neuron (y). The interconnected weights are given by w1 and w2. An artificial neuron is a p-input single-output signal-processing element, which can be thought of as a simple model of a non-branching biological neuron. In Fig. 2.1, various inputs to the network are represented by the mathematical symbol, x(n). Each of these inputs is multiplied by a connection weight. These weights are represented by w(n). In the simplest case, these products are simply summed, fed through a transfer function to generate a result, and then delivered as output. This process lends itself to physical implementation on a large scale in a small package. This electronic implementation is still possible with other network structures, which utilize different summing functions as well as different transfer functions.

Fig. 2.1 A Simple Artificial Neural Net
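When a layer contains several neurons, the net inputs to all of them can be formed in a single matrix multiplication, the method used for net input calculations in later chapters. The sketch below uses assumed values and a simple step activation only to show the mechanics; none of the numbers come from the text.

% Net input to a layer of two output neurons by matrix multiplication.
% Each column of W holds the weights reaching one output neuron.
x = [1 0 1];                 % 1x3 input vector
W = [ 0.2 -0.4;              % 3x2 weight matrix
      0.5  0.1;
     -0.3  0.6];
b = [0.1 -0.2];              % bias of each output neuron
y_in = x*W + b;              % net input to each output neuron
y = double(y_in >= 0);       % binary step activation with threshold 0 (assumed)
disp(y_in)                   % net inputs
disp(y)                      % outputs after the activation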
Why Artificial Neural Networks?
The long course of evolution has given the human brain many desirable characteristics not present in
Von Neumann or modern parallel computers. These include:
* Massive parallelism,
* Distributed representation and computation,
* Learning ability,
* Generalization ability,
* Adaptivity,
* Inherent contextual information processing,
* Fault tolerance, and
* Low energy consumption.
It is hoped that devices based on biological neural networks will possess some of these desirable
characteristics. Modern digital computers outperform humans in the domain of numeric computation
and related symbol manipulation. However, humans can effortlessly solve complex perceptual prob-
lems (like recognizing a man in a crowd from a mere glimpse of his face) at such a high speed and extent
as to dwarf the world’s fastest computer. Why is there such a remarkable difference in their perfor-
mance? The biological neural system architecture is completely different from the Von Neumann archi-
tecture (see Table 2.1). This difference significantly affects the type of functions each computational
model can best perform.
Numerous efforts to develop “intelligent” programs based on Von Neumann's centralized architec-
ture have not resulted in any general-purpose intelligent programs. Inspired by biological neural net-
works, ANNs are massively parallel computing systems consisting of an extremely large number of
simple processors with many interconnections. ANN models attempt to use some "organizational" prin-
ciples believed to be used in the human brain.Introduction to Artificial Neural Networks 13
Table 2.1 Von Neumann Computer Versus Biological Neural System

                          Von Neumann Computer          Biological Neural System
Processor                 Complex                       Simple
                          High speed                    Low speed
                          One or a few                  A large number
Memory                    Separate from a processor     Integrated into processor
                          Localized                     Distributed
                          Noncontent addressable        Content addressable
Computing                 Centralized                   Distributed
                          Sequential                    Parallel
                          Stored programs               Self-learning
Reliability               Very vulnerable               Robust
Expertise                 Numerical and symbolic        Perceptual problems
                          manipulations
Operating environment     Well-defined,                 Poorly defined,
                          well-constrained              unconstrained
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data,
can be used to extract patterns and detect trends that are too complex to be noticed by either humans or
other computer techniques. A trained neural network can be thought of as an "expert" in the category of
information it has been given to analyze. This expert can then be used to provide projections given new
situations of interest and answer “what if” questions.
Other advantages include:
1. Adaptive learning: An ability to learn how to do tasks based on the data given for training or
initial experience.
2. Self-organization: An ANN can create its own organisation or representation of the information it
receives during learning time.
3. Real-time operation: ANN computations may be carried out in parallel, using special hardware
devices designed and manufactured to take advantage of this capability.
4. Fault tolerance via redundant information coding: Partial destruction of a network leads to a
corresponding degradation of performance. However, some network capabilities may be retained
even after major network damage due to this feature.
The historical development of the neural networks can be traced as follows:
• 1943—McCulloch and Pitts: start of the modern era of neural networks
This forms a logical calculus of neural networks. A network consisting of a sufficient number of neurons (using a simple model) with properly set synaptic connections can compute any computable function. A simple logic function is performed by a neuron in this case, based upon the weights set in the McCulloch-Pitts neuron. The arrangement of neurons in this case may be represented as a combination of logic functions. The most important feature of this type of neuron is the concept of threshold: when the net input to a particular neuron is greater than the threshold specified by the user, the neuron fires. Logic circuits are found to use this type of neuron extensively.
• 1949—Hebb's book "The Organization of Behavior"
An explicit statement of a physiological learning rule for synaptic modification was presented for the first time. Hebb proposed that the connectivity of the brain is continually changing as an organism learns differing functional tasks, and that neural assemblies are created by such changes. Hebb's work was immensely influential among psychologists. The concept behind the Hebb theory is that if two neurons are found to be active simultaneously, the strength of the connection between the two neurons should be increased. This concept is similar to that of correlation matrix learning.
• 1958—Rosenblatt introduces the perceptron [Block, 1962; Minsky and Papert, 1988]
In the perceptron network the weights on the connection paths can be adjusted. A method of iterative weight adjustment can be used in the perceptron net. The perceptron net is found to converge if the weights obtained allow the net to reproduce exactly all the training input and target output vector pairs.
• 1960—Widrow and Hoff introduce the adaline
ADALINE, abbreviated from ADAptive LINear NEuron, uses a learning rule called the Least Mean Square (LMS) rule or delta rule. This rule adjusts the weights so as to reduce the difference between the net input to the output unit and the desired output. The convergence criterion in this case is the reduction of the mean square error to a minimum value. This delta rule for a single layer net can be called a precursor of the backpropagation rule used for multi-layer nets. The multi-layer extension of the adaline formed the madaline [Widrow and Lehr, 1990].
• 1982—John Hopfield's networks
Hopfield showed how to use an "Ising spin glass" type of model to store information in dynamically stable networks. His work paved the way for physicists to enter neural modeling, thereby transforming the field of neural networks. These nets are widely used as associative memory nets. The Hopfield nets are found to be both continuous valued and discrete valued. This net provides an efficient solution for the "Travelling Salesman Problem".
• 1972—Kohonen's Self-Organizing Maps (SOM)
Kohonen's Self-Organizing Maps are capable of reproducing important aspects of the structure of biological neural nets. They make use of data representation using topographic maps, which are common in the nervous system. SOM also has a wide range of applications. It shows how the output layer can pick up the correlational structure (from the inputs) in the form of the spatial arrangement of units. These nets are applied to many recognition problems.
• 1985—Parker, 1986—LeCun
During this period the backpropagation net paved its way into neural networks. This method propagates the error information at the output units back to the hidden units using a generalized delta rule. This net is basically a multilayer, feed forward net trained by means of backpropagation. Although the work was originally performed by Parker (1985), the credit for publishing this net goes to Rumelhart, Hinton and Williams (1986). The backpropagation net emerged as the most popular learning algorithm for the training of multilayer perceptrons and has been the workhorse for many neural network applications.
• 1988—Grossberg
Grossberg developed a learning rule similar to that of Kohonen, which is widely used in the counterpropagation net. This Grossberg type of learning is also known as outstar learning. This learning occurs for all the units in a particular layer; no competition among these units is assumed.
• 1987, 1990—Carpenter and Grossberg
Carpenter and Grossberg invented Adaptive Resonance Theory (ART). ART was designed for both binary inputs and continuous valued inputs. The design for binary inputs formed ART1, and ART2 came into being when the design became applicable to continuous valued inputs. The most important feature of these nets is that the input patterns can be presented in any order.
• 1988—Broomhead and Lowe developed Radial Basis Function (RBF) networks. This is also a multilayer net that is quite similar to the backpropagation net.
• 1990—Vapnik developed the support vector machine.
2.4 Biological Neural Networks
A biological neuron or a nerve cell consists of synapses, dendrites, the cell body (or soma), and the axon. These "building blocks" are discussed as follows:
• The synapses are elementary signal processing devices.
• A synapse is a biochemical device which converts a pre-synaptic electrical signal into a chemical signal and then back into a post-synaptic electrical signal.
• The input pulse train has its amplitude modified by parameters stored in the synapse. The nature of this modification depends on the type of the synapse, which can be either inhibitory or excitatory.
• The post-synaptic signals are aggregated and transferred along the dendrites to the nerve cell body.
• The cell body generates the output neuronal signal, a spike, which is transferred along the axon to the synaptic terminals of other neurons.
• The frequency of firing of a neuron is proportional to the total synaptic activities and is controlled by the synaptic parameters (weights).
• The pyramidal cell can receive about 10^4 synaptic inputs and it can fan out the output signal to thousands of target cells — a connectivity difficult to achieve in artificial neural networks.
In general, the function of the main elements can be given as follows (a small MATLAB sketch of this division of labour is given after the list):
Dendrite — Receives signals from other neurons
Soma — Sums all the incoming signals
Axon — When a particular amount of input is received, the cell fires; it transmits the signal through the axon to other cells.
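The same division of labour can be mimicked by an artificial unit. The following MATLAB sketch is only an illustration of the mapping above; the input values, weights and threshold are arbitrary choices, not taken from the text.

% Artificial analogue of dendrite / soma / axon (illustrative values)
x = [0.5 -1.0 0.8];          % signals arriving on the "dendrites"
w = [0.7  0.2 0.9];          % synaptic weights, one per input
theta = 0.4;                 % firing threshold

net_input = sum(w .* x);     % the "soma" sums the weighted inputs
if net_input >= theta
    y = 1;                   % the "axon" fires and transmits an output
else
    y = 0;                   % no output is transmitted
end
disp([net_input y]);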
The fundamental processing element of a neural network is a neuron. This building block of human
awareness encompasses a few general capabilities. Basically, a biological neuron receives inputs from
other sources, combines them in some way, performs a generally nonlinear operation on the result, and
then outputs the final result. Figure 2.2 shows the relationship of these four parts.
[Fig. 2.2  A Biological Neuron. Four parts of a typical nerve cell: Dendrites accept inputs; Soma processes the inputs; Axon turns the processed inputs into outputs; Synapses are the electrochemical contact between neurons.]
The properties of the biological neuron impose some features on the artificial neuron. They are:
1. Signals are received by the processing elements, and the element sums the weighted inputs.
2. The weight at the receiving end has the capability to modify the incoming signal.
3. The neuron fires (transmits output) when sufficient input is obtained.
4. The output produced from one neuron may be transmitted to other neurons.
5. The processing of information is local.
6. The weights can be modified by experience.
7. Neurotransmitters for the synapse may be excitatory or inhibitory.
8. Both artificial and biological neurons have inbuilt fault tolerance.
Figure 2.3 and Table 2.2 indicate how the biological neural net is associated with the artificial neural
net.
[Fig. 2.3  Association of Biological Net with Artificial Net: cell body, dendrites (synaptic weights), summation and axon shown alongside their artificial counterparts.]
Table 2.2  Associated Terminologies of Biological and Artificial Neural Net

Biological neural network        Artificial neural network
Cell body                        Neurons
Dendrite                         Weights or interconnections
Soma                             Net input
Axon                             Output
The main differences between the brain and the computer are:
• Biological neurons, the basic building blocks of the brain, are slower than silicon logic gates. The neurons operate in milliseconds, which is about six orders of magnitude slower than silicon gates operating in the nanosecond range.
• The brain makes up for the slow rate of operation with two factors:
  - a huge number of nerve cells (neurons) and interconnections between them; the human brain contains on the order of 10^14 to 10^15 interconnections;
  - the function of a biological neuron seems to be much more complex than that of a logic gate.
• The brain is very energy efficient. It consumes only about 10^-16 joules per operation per second, compared with about 10^-6 joules per operation per second for a digital computer.
© The brain is a highly complex, non-linear, parallel information processing system. It performs tasks
like pattern recognition, perception, motor control, many times faster than the fastest digital com-
puters.
• Consider the efficiency of the visual system, which provides a representation of the environment that enables us to interact with it. For example, a complex task of perceptual recognition, e.g. recognition of a familiar face embedded in an unfamiliar scene, can be accomplished in 100–200 ms, whereas tasks of much lesser complexity can take hours, if not days, on conventional computers.
As another example, consider the efficiency of the SONAR system of a bat. SONAR is an active echolocation system. A bat's SONAR provides information about the distance from a target, its relative velocity and size, the size of various features of the target, and its azimuth and elevation. The complex neural computations needed to extract all this information from the target echo occur within a brain the size of a plum. The precision and success rate of the target location is very hard for RADAR or SONAR engineers to match.
Table 2.3 shows the major differences between the biological and the artificial neural network.
Feed Forward Net
Feed forward networks may have a single layer of weights, where the inputs are directly connected to the outputs, or multiple layers with intervening sets of hidden units (see Fig. 2.4). Neural networks use hidden units to create internal representations of the input patterns. In fact, it has been shown that, given enough hidden units, it is possible to approximate any function arbitrarily well with a simple feed forward network. This result has encouraged people to use neural networks to solve many kinds of problems.
1. Single layer net: It is a feed forward net. It has only one layer of weighted interconnections. The inputs may be fully connected to the output units. In the usual case, none of the input units are connected to other input units and none of the output units are connected to other output units, although nets in which input units are connected with other input units, and output units with other output units, are also possible. In a single layer net, the weights for one output unit do not influence the weights for the other output units.
2. Multi layer net: It is also a feed forward net, i.e., a net where the signals flow from the input units to the output units in a forward direction. The multi-layer net has one or more layers of nodes between the input and output units. This is advantageous over the single layer net in the sense that it can be used to solve more complicated problems; a small sketch of such a forward pass is given below.
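As an illustration of the forward flow of signals just described, the following MATLAB sketch pushes one input vector through a small two-layer feed forward net. The weight matrices, the input vector and the logistic activation are arbitrary choices made for this example only.

% Forward pass through a small multi layer feed forward net (illustrative values)
x = [1; 0; 1];                        % input vector (3 input units)
V = [0.2 -0.5 0.1; 0.7 0.3 -0.2];     % weights: input layer -> hidden layer (2 x 3)
W = [0.4 -0.6];                       % weights: hidden layer -> output unit (1 x 2)
f = @(z) 1 ./ (1 + exp(-z));          % logistic activation function

z_hidden = f(V * x);                  % activations of the hidden units
y_output = f(W * z_hidden);           % activation of the output unit
disp(y_output);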
Competitive Net
The competitive net is similar to a single-layered feed forward network except that there are connections, usually negative, between the output nodes. Because of these connections the output nodes tend to compete to represent the current input pattern. Sometimes the output layer is completely connected and sometimes the connections are restricted to units that are close to each other (in some neighborhood). With an appropriate learning algorithm the latter type of network can be made to organize itself topologically. In a topological map, neurons near each other represent similar input patterns. Networks of this kind have been used to explain the formation of topological maps that occur in many animal sensory systems, including vision, audition, touch and smell.
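A minimal MATLAB sketch of the competition described above is given below, assuming the simplest winner-take-all form: the output unit with the largest net input wins and is the only unit allowed to respond. The weight matrix and input pattern are arbitrary illustrative values.

% Winner-take-all competition among output units (illustrative values)
x = [0.9; 0.1];                  % input pattern
W = [0.8 0.2;                    % each row: weight vector of one output unit
     0.1 0.9;
     0.5 0.5];

net = W * x;                     % net input to every output unit
[~, winner] = max(net);          % the unit with the largest net input wins
y = zeros(size(net));
y(winner) = 1;                   % only the winning unit responds
disp(y');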
Recurrent Net
The fully recurrent network is perhaps the simplest of neural network architectures. All units are con-
nected to all other units and every unit is both an input and an output. Typically, a set of patterns is
instantiated on all of the units, one at a time. As each pattern is instantiated the weights are modified.
When a degraded version of one of the patterns is presented, the network attempts to reconstruct the
pattern.
Recurrent networks are also useful in that they allow networks to process sequential information.
Processing in recurrent networks depends on the state of the network at the last time step. Consequently,
the response to the current input depends on previous inputs. Figure 2.4 shows two such networks: the
simple recurrent network and the Jordan network.
2.7.2. Setting the Weights
The method of setting the values of the weights enables the process of learning or training. The process of modifying the weights in the connections between network layers with the objective of achieving the expected output is called training a network. The internal process that takes place when a network is trained is called learning. Generally, there are three types of training, as follows.
Binary Step Function
The function is given by,
f(x) = 1   if x ≥ θ
       0   if x < θ
θ > nw − p
This is the condition for absolute inhibition.
The McCulloch-Pitts neuron will fire if it receives k or more excitatory inputs and no inhibitory inputs, where
kw ≥ θ > (k − 1)w
Example 3.1  Generate the output of the logic AND function by the McCulloch-Pitts neuron model.
Solution  The AND function returns a true value only if both the inputs are true, else it returns a false value.
'1' represents a true value and '0' represents a false value.
The truth table for the AND function is,

x1   x2   y
1    1    1
1    0    0
0    1    0
0    0    0
A McCulloch-Pitts neuron to implement the AND function is shown in Fig. 3.2 (McCulloch-Pitts Neuron to Perform Logical AND Function). The threshold on unit Y is 2.
The output Y is,
Y = f(y_in)
The net input is given by,
y_in = Σ (weights × inputs) = 1·x1 + 1·x2
i.e.  y_in = x1 + x2
From this, the activation of the output neuron can be formed as,
y = f(y_in) = 1   if y_in ≥ 2
              0   if y_in < 2
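The computation of Example 3.1 can be checked with a few lines of MATLAB. The sketch below uses exactly the values stated above: weights of 1 on both inputs and a threshold of 2 on unit Y.

% McCulloch-Pitts neuron for the AND function: weights 1 and 1, threshold 2
x1 = [1 1 0 0];
x2 = [1 0 1 0];
theta = 2;

y_in = 1*x1 + 1*x2;            % net input for each input pair
y = double(y_in >= theta);     % the neuron fires only when the threshold is reached
disp([x1' x2' y']);            % reproduces the AND truth table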
The activations of z1 and z2 are given as,
z = f(z_in) = 1   if z_in ≥ 1
              0   if z_in < 1
The calculation of the net input and the activations of z1 and z2 are shown below.

z1 = (x1 AND NOT x2),   z_in1 = x1·w11 + x2·w21

x1   x2   z1
1    1    0
1    0    1
0    1    0
0    0    0

z2 = (x2 AND NOT x1),   z_in2 = x1·w12 + x2·w22

x1   x2   z2
1    1    0
1    0    0
0    1    1
0    0    0
The activation for the output unit y is,
y = f(y_in) = 1   if y_in ≥ 1
              0   if y_in < 1
% McCulloch-Pitts net for the ANDNOT function (continuation of the listing)
    if zin(i)>=theta
        y(i)=1;
    else
        y(i)=0;
    end
end
disp('Output of Net');
disp(y);
if y==z
    con=0;
else
    disp('Net is not learning enter another set of weights and Threshold value');
    w1=input('weight w1=');
    w2=input('weight w2=');
    theta=input('theta=');
end
end
disp('McCulloch-Pitts Net for ANDNOT function');
disp('Weights of Neuron');
disp(w1);
disp(w2);
disp('Threshold value');
disp(theta);
This type of synapse is called a Hebbian synapse. The four key mechanisms that characterize a Hebbian synapse are: time-dependent mechanism, local mechanism, interactive mechanism and correlational mechanism.
The simplest form of Hebbian learning is described by,
Δw_i = x_i y
This Hebbian learning rule represents purely feed forward, unsupervised learning. It states that if the cross product of output and input is positive, this results in an increase of the weight; otherwise the weight decreases.
In some cases, the Hebbian rule needs to be modified to counteract unconstrained growth of weight
values, which takes place when excitations and response consistently agree in sign. This corresponds to
the Hebbian learning rule with saturation of weights at a certain preset level.
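As a sketch of the rule Δw_i = x_i y stated above, the following MATLAB fragment applies the plain Hebbian update to a few bipolar input-output pairs. The patterns and outputs are illustrative values chosen only to show the mechanics of the update.

% Plain Hebbian updates, delta_w = x*y, on a few bipolar pairs (illustrative data)
X = [ 1  1;                     % each row is one input pattern
      1 -1;
     -1  1];
y = [ 1; -1; -1];               % corresponding outputs
w = zeros(1, 2);                % start from zero weights

for p = 1:size(X,1)
    w = w + X(p,:) * y(p);      % weight grows when input and output agree in sign
end
disp(w);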
3.3.2 Perceptron Learning Rule
For the perceptron learning rule, the learning signal is the difference between the desired and actual
neuron’s response. This type of learning is supervised.
The fact that the weight vector is perpendicular to the plane separating the input patterns during the learning process can be used to interpret the degree of difficulty of training a perceptron for different types of input.
The perceptron learning rule states that for a finite number 'n' of input training vectors,
x(n),  n = 1 to N
each with an associated target value,
t(n),  n = 1 to N
which is +1 or −1, and an activation function y = f(y_in), where
y = 1    if y_in > θ
    0    if −θ ≤ y_in ≤ θ
   −1    if y_in < −θ
the weight updation is given by:
if y ≠ t, then
w_new = w_old + α t x
if y = t, then there is no change in the weights.
‘The perceptron learning rule is of central importance for supervised learning of neural networks. The
weights can be initialized at any values in this method.
There is a perceptron learning rule convergence theorem which states, "If there is a weight vector w* such that f(x(p)·w*) = t(p) for all p, then for any starting vector w, the perceptron learning rule will converge to a weight vector that gives the correct response for all training patterns, and this will be done
in a finite number of steps”.
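A minimal sketch of the update w_new = w_old + α t x for a single output unit is given below. The toy data set, the learning rate and the zero threshold are illustrative choices, not part of the rule itself.

% Perceptron learning rule for one output unit (illustrative data)
X = [ 1  1;  1 -1; -1  1; -1 -1];    % bipolar inputs, one pattern per row
t = [ 1;     1;    -1;    -1];       % targets (linearly separable)
w = [0 0];  b = 0;  alpha = 1;

for epoch = 1:10
    for p = 1:size(X,1)
        y = sign(X(p,:)*w' + b);             % threshold output (theta = 0 here)
        if y ~= t(p)                         % update only on a wrong response
            w = w + alpha * t(p) * X(p,:);
            b = b + alpha * t(p);
        end
    end
end
disp([w b]);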
3.3.3 Delta Learning Rule (Widrow-Hoff Rule or Least Mean Square (LMS) Rule)
The delta learning rule is also referred to as the Widrow-Hoff rule, named after its originators (Widrow and Hoff, 1960). The delta learning rule is valid only for continuous activation functions and in the supervised training mode.
the dot product or the Euclidean norm. The Euclidean norm is most widely used because the dot product may require normalization.
3.3.5 Out Star Learning Rule
Out star learning can be well explained when the neurons are arranged in a layer. This rule is designed to produce the desired response t from the layer of n neurons. This type of learning is also called Grossberg learning.
Out star learning occurs for all units in a particular layer and no competition among these units is assumed. However, the forms of the weight updates for Kohonen learning and Grossberg learning are closely related.
In the case of out star learning,
Δw_j = α (t − w_j)   if neuron j wins the competition
       0             if neuron j loses the competition
The rule is used to provide learning of repetitive and characteristic properties of input-output relationships. Though it is concerned with supervised learning, it allows the network to extract statistical properties of the input and output signals. It ensures that the output pattern becomes similar to the undistorted desired output after repeated application to distorted output versions. The weight change here will be α times the error calculated.
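The movement of the weights described above, Δw = α(t − w), can be seen in a few lines of MATLAB. The desired response, initial weights and learning rate below are arbitrary illustrative values.

% Outstar-style update: the weights drift toward the desired response t (illustration)
t = [1; 0; 1];                 % desired response of the layer
w = [0.2; 0.9; 0.4];           % weights fanning out from the winning unit
alpha = 0.3;

for step = 1:20
    w = w + alpha * (t - w);   % delta_w = alpha * (t - w)
end
disp(w');                      % close to t after repeated updates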
3.3.6 Boltzmann Learning
Boltzmann learning is a stochastic learning. A neural net designed based on this learning is called a Boltzmann machine. In this learning, the neurons constitute a recurrent structure and they work in binary form. This learning is characterized by an energy function E, the value of which is determined by the particular states occupied by the individual neurons of the machine, given by,
E = −(1/2) Σ_j Σ_i w_ij x_i x_j,   i ≠ j
where x_i is the state of neuron i and w_ij is the weight from neuron i to neuron j. The condition i ≠ j means that none of the neurons in the machine has self-feedback. The operation of the machine is performed by choosing a neuron at random.
The neurons of this learning process are divided into two groups: visible and hidden. The visible neurons provide an interface between the network and the environment in which it operates, whereas the hidden neurons operate independently of the environment. The visible neurons might be clamped onto specific states determined by the environment, called the clamped condition. On the other hand, there is the free-running condition, in which all the neurons are allowed to operate freely.
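For a concrete picture of the energy function, the sketch below evaluates E for a tiny machine with symmetric weights, a zero diagonal (no self-feedback) and bipolar states. The numbers are illustrative only.

% Energy of a small Boltzmann-style network (illustrative weights and state)
W = [ 0   0.5 -0.3;            % symmetric weights with zero diagonal
      0.5 0    0.2;
     -0.3 0.2  0  ];
x = [1; -1; 1];                % current states of the three neurons

E = -0.5 * (x' * W * x);       % E = -(1/2) * sum over i ~= j of w_ij * x_i * x_j
disp(E);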
3.3.7 Memory Based Learning
In memory based learning, all the previous experiences are stored in a large memory of correctly classified input-output examples: {(x_i, d_i)}, i = 1 to N, where x_i is the input vector and d_i is the desired response. The desired response is a scalar.
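A common concrete instance of memory based learning is the nearest-neighbour rule: every stored pair is kept in memory, and a new query is answered with the desired response of the closest stored input. The sketch below is an illustration with made-up data; it is not an algorithm taken from the text.

% Nearest-neighbour recall from a memory of stored (input, response) pairs
X = [0 0; 0 1; 1 0; 1 1];        % stored input vectors, one per row
d = [-1; 1; 1; -1];              % stored desired responses
q = [0.9 0.8];                   % new query vector

dist = sum((X - q).^2, 2);       % squared Euclidean distance to every stored input
[~, nearest] = min(dist);
disp(d(nearest));                % respond with the output of the closest stored example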
For the 3rd and 4th epoch the separating line remains the same, hence this line separates the boundary
regions as shown in Fig. 3.9.
[Fig. 3.9  Hebb Net for AND Function: the decision line separating the boundary regions in the (x1, x2) plane.]
The same procedure can be repeated for generating the logic functions OR, NOT, AND NOT, etc.
Example 3.10  Apply the Hebb net to the training patterns that define the XOR function with bipolar inputs and targets.
Solution

Input                 Target
x1    x2    b         y
1     1     1         -1
1    -1     1          1
-1    1     1          1
-1   -1     1         -1

By the Hebb training algorithm, assigning the initial values of the weights w1 and w2 to be zero and the bias to be zero:
w1 = w2 = 0 and b = 0
Input               Target     Weight changes          Weights
x1    x2    b       y          Δw1   Δw2   Δb          w1    w2    b
                                                         0     0     0
1     1     1      -1          -1    -1    -1          -1    -1    -1
1    -1     1       1           1    -1     1           0    -2     0
-1    1     1       1          -1     1     1          -1    -1     1
-1   -1     1      -1           1     1    -1           0     0     0
The weight changes are calculated using,
Δw_i = x_i·y  and  Δb = y
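The hand computation above can be reproduced with a short script. The patterns, targets and zero initial weights below are exactly those of Example 3.10.

% Hebb updates for the bipolar XOR patterns of Example 3.10
X = [ 1  1;  1 -1; -1  1; -1 -1];    % x1, x2
t = [-1;     1;     1;    -1];       % targets
w = [0 0];  b = 0;

for p = 1:4
    w = w + X(p,:) * t(p);           % delta_w = x * t
    b = b + t(p);                    % delta_b = t
    fprintf('after pattern %d: w = [%2d %2d], b = %2d\n', p, w(1), w(2), b);
end
% the final weights and bias are all zero, so the Hebb net cannot represent XOR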
Solution  The '*' symbol indicates that there exists a '+1' and the '.' symbol indicates that there exists a '-1'.
The input is given by,
E = [1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 1 1 1];
F = [1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1];
‘The MATLAB program is given as follows
Program
%Hebb Net to classify two dimensional input patterns
clear;
clc;
%Input patterns
E=[1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 1 1 1];
F=[1 1 1 1 1 -1 -1 -1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1];
x(1,1:20)=E;
x(2,1:20)=F;
w(1:20)=0;
t=[1 -1];
b=0;
for i=1:2
    w=w+x(i,1:20)*t(i);
    b=b+t(i);
end
disp('Weight matrix');
disp(w);
disp('Bias');
disp(b);
Summary
The fundamental models of the artificial neural network were discussed in this chapter. The models were used to generate logic functions like AND, OR, XOR, etc. The linear separability concept used to obtain the decision boundary of the regions and the Hebb rule for the pattern classification problem were illustrated. The learning rules used in various networks for the weight updation process were also derived in this chapter.
4. Perceptron Networks

This chapter covers:
• How the perceptron learning rule is better than the Hebb rule
• Layer structure in the original perceptrons
• Learning and training algorithms in the perceptron network
• Architecture, algorithm and the application procedure of the perceptron net
• Derivation of the perceptron algorithm for several output classes
• Applications of multilayer perceptron networks
Frank Rosenblatt [1962], and Minsky and Papert [1988], developed a large class of artificial neural networks called perceptrons. The perceptron learning rule uses an iterative weight adjustment that is more powerful than the Hebb rule. The perceptrons use a threshold output function and the McCulloch-Pitts model of a neuron. Their iterative learning converges to correct weights, i.e. the weights that produce the exact output value for the training input pattern. The original perceptron is found to have three layers, namely sensory, associator and response units, as shown in Fig. 4.1.
Step 5: Compute the activation output of each output unit,
y_j = f(y_in_j) = 1    if y_in_j > θ
                  0    if -θ ≤ y_in_j ≤ θ
                 -1    if y_in_j < -θ

(Epoch 1 weight-update table: starting from zero initial weights and bias, the weights are updated pattern by pattern; the weights at the end of the epoch are w = (0, 0, 0, 2), b = 0 for the first set and w = (-1, 1, -1, -1), b = 1 for the second set.)
The final weights from Epoch 1 are used as the initial weights for Epoch 2. Thus the output is made equal to the target by training for suitable weights.

Testing the response of the net
The final weights are,
For the 1st set of inputs: w1 = 0, w2 = 0, w3 = 0, w4 = 2, b = 0, and
For the 2nd set of inputs: w1 = -1, w2 = 1, w3 = -1, w4 = -1, b = 1
The net input is, y_in = b + Σ x_i w_i
For the 1st set of inputs,
(i) (1 1 1 1)
    y_in = 0 + 0×1 + 0×1 + 0×1 + 2×1 = 2 > 0
    Applying the activation, y = f(y_in) = 1
(ii) (1 1 1 -1)
    y_in = 0 + 0×1 + 0×1 + 0×1 + 2×(-1) = -2 < 0
    Applying the activation, y = f(y_in) = -1
For the 2nd set of inputs,
(i) (-1 1 -1 -1)
    y_in = 1 + (-1)×(-1) + 1×1 + (-1)×(-1) + (-1)×(-1) = 5 > 0
    Applying the activation, y = f(y_in) = 1
(ii) (-1 -1 1 1)
    y_in = 1 + (-1)×(-1) + 1×(-1) + (-1)×1 + (-1)×1 = -1 < 0
    Applying the activation, y = f(y_in) = -1
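The testing step can also be checked directly in MATLAB; the weights, biases and test vectors below are the ones obtained in the example above.

% Checking the trained perceptron responses with the final weights of the example
w1 = [0 0 0 2];    b1 = 0;          % final weights and bias for the 1st set
w2 = [-1 1 -1 -1]; b2 = 1;          % final weights and bias for the 2nd set

x = [ 1  1  1  1;                   % test vectors, one per row
      1  1  1 -1;
     -1  1 -1 -1;
     -1 -1  1  1];

yin1 = x * w1' + b1;                % net inputs using the 1st weight set
yin2 = x * w2' + b2;                % net inputs using the 2nd weight set
disp([yin1 yin2]);                  % e.g. yin1 = 2 and -2 for the first two vectors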
01001;
1d;
10111;
O1111;
10111;
11100;
01010;
10011;
111k
11110;
11tda)
Program
clear;
clc;
cd=open('reg.mat');
input=[cd.A';cd.B';cd.C';cd.D';cd.E';cd.F';cd.G';cd.H';cd.I';cd.J']';
for i=1:10
    for j=1:10
        if i==j
            output(i,j)=1;
        else
            output(i,j)=0;
        end
    end
end
test=[cd.K';cd.L';cd.M';cd.N';cd.O']';
net=newp(aw,10,'hardlim');        % aw: matrix of input ranges used by newp
net.trainParam.epochs=1000;
net.trainParam.goal=0;
net=train(net,input,output);
y=sim(net,test)
Example 4.8  With a suitable example, simulate the perceptron learning network and separate the boundaries. Plot the points assumed in the respective quadrants using different symbols for identification.
Solution  Plot the elements as squares in the first quadrant, as stars in the second quadrant, as diamonds in the third quadrant, and as circles in the fourth quadrant. Based on the learning rule, draw the decision boundaries.
Program
clear;
p1=[1 1]';  p2=[1 2]';    % class 1, first quadrant when we plot the elements: square
p3=[2 -1]'; p4=[2 -2]';   % class 2, 4th quadrant when we plot the elements: circle
p5=[-1 2]'; p6=[-2 1]';   % class 3, 2nd quadrant when we plot the elements: star
p7=[-1 -1]'; p8=[-2 -2]'; % class 4, 3rd quadrant when we plot the elements: diamond
% Now let's plot the vectors
hold on
plot(p1(1),p1(2),'ks',p2(1),p2(2),'ks',p3(1),p3(2),'ko',p4(1),p4(2),'ko')
plot(p5(1),p5(2),'k*',p6(1),p6(2),'k*',p7(1),p7(2),'kd',p8(1),p8(2),'kd')
grid
hold off
axis([-3 3 -3 3])   % set nice axis on the figure
t1=[0 0]'; t2=[0 0]';    % targets for class 1 (square)
t3=[0 1]'; t4=[0 1]';    % targets for class 2 (circle)
t5=[1 0]'; t6=[1 0]';    % targets for class 3 (star)
t7=[1 1]'; t8=[1 1]';    % targets for class 4 (diamond)
% Let's simulate perceptron learning
Example 4.9  Write a MATLAB program for pattern classification using a perceptron network. Train the net and test it with a noisy pattern. Form the input vectors and noisy vectors from the patterns shown below and store them in a *.mat file.
[Patterns: the clean input vectors and the corresponding noisy vectors, drawn as grids of '*' (+1) and '.' (-1) symbols.]
Solution  The input vectors and the noisy vectors are stored in a .mat file, say class.mat, and the required data is taken from the file. Here a subfunction called charplot.m is used. The MATLAB program for this is given below.
Program
%Perceptron for pattern classification
clear;
clc;
%Get the data from file
data=open('class.mat');
%input pattern
%target
ts=data.ts;    %Testing pattern
n=15;
%Initialize the weight matrix
w=zeros(n,m);
b=zeros(m,1);
%Initialize the learning rate and the threshold value
alpha=1;
[Figure: the noisy input pattern used for training and the classified output pattern.]
4.3 Brief Introduction to Multilayer Perceptron Networks
Multilayer perceptron networks are an important class of neural networks. The network consists of a set of sensory units that constitute the input layer and one or more hidden layers of computation nodes. The input signal passes through the network in the forward direction. A network of this type is called a multilayer perceptron (MLP).
The multilayer perceptrons are used with supervised learning and have led to the successful backpropagation algorithm. The disadvantage of the single layer perceptron is that it cannot be extended to a multi-layered version. In MLP networks there exists a non-linear activation function; the widely used non-linear activation function is the logistic sigmoid function. The MLP network also has various layers of hidden neurons. The hidden neurons make the MLP network capable of handling highly complex tasks. The layers of the network are connected by synaptic weights. The MLP thus has a high computational efficiency.
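The logistic sigmoid mentioned above is f(x) = 1/(1 + e^(-x)). The short sketch below simply plots it; the plotting range is an arbitrary choice.

% Logistic sigmoid activation commonly used in MLP hidden units
f = @(x) 1 ./ (1 + exp(-x));

x = -6:0.1:6;
plot(x, f(x));                      % smooth, saturating S-shaped curve between 0 and 1
xlabel('net input'); ylabel('f(net input)');
title('Logistic sigmoid activation');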
rule and are applied to various neural network applications. The weights on the interconnections between the adaline and madaline networks are adjustable. The adaline and madaline networks are discussed in detail in this chapter.
Adaline, developed by Widrow and Hoff [1960], is found to use bipolar activations for its input signals and target output. The weights and the bias of the adaline are adjustable. The learning rule used can be called the Delta rule, Least Mean Square (LMS) rule or Widrow-Hoff rule. The derivation of this rule for a single output unit, for several output units, and its extension has already been dealt with in Section 3.3.3. Since the activation function is an identity function, the activation of the unit is its net input.
When the adaline is to be used for pattern classification, then, after training, a threshold function is applied to the net input to obtain the activation.
The activation is,
y = 1    if y_in ≥ 0
   -1    if y_in < 0
The adaline unit can solve the problem if linear separability holds.
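As an illustration of the behaviour just described, the following MATLAB sketch trains a single adaline with the delta (LMS) rule and then applies the bipolar threshold for classification. The AND training data, the small nonzero initial weights and the learning rate are illustrative choices, not the book's worked example.

% Adaline trained with the delta (LMS) rule on bipolar AND data (illustrative)
X = [ 1  1;  1 -1; -1  1; -1 -1];
t = [ 1;    -1;    -1;    -1];
w = 0.1*[1 -1];  b = 0.1;  alpha = 0.1;   % small nonzero initial weights

for epoch = 1:50
    for p = 1:4
        y_in = X(p,:)*w' + b;             % identity activation during training
        err  = t(p) - y_in;               % difference between target and net input
        w = w + alpha * err * X(p,:);     % delta rule weight update
        b = b + alpha * err;
    end
end
y = 2*(X*w' + b >= 0) - 1;                % bipolar threshold applied after training
disp([t y]);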
5.2.1 Architecture
The architecture of an adaline is shown in Fig. 5.1 (Architecture of an Adaline).
The adaline has only one output unit. This output unit receives input from several units and also from a bias, whose activation is always +1. The adaline also resembles a single layer network as discussed in Section 2.7. It receives input from several neurons. It should be noted that it also receives input from the unit which is always '+1', called the bias. The bias weights are also trained in the same manner as the other weights. In Fig. 5.1, an input layer with x1, ..., xi, ..., xn and the bias, and an output layer with only one output neuron are present. The links between the input and output neurons possess weighted interconnections. These weights change as the training progresses.
5.2.2 Algorithm
Basically, the initial weights of the adaline network have to be set to small random values and not to zero, unlike in the Hebb or perceptron networks, because this may influence the error factor to be considered. After the initial weights are assumed, the activations for the input units are set. The net input is calculated based on the training input patterns and the weights. By applying the delta learning rule discussed in Section 3.3.3, the weight updation is carried out. The training process is continued until the error, which is the difference between the target and the net input, becomes minimum. The step-based training algorithm for an adaline is as follows: