Artificial Intelligence

What is Artificial Intelligence?


In its simplest form, artificial intelligence is a field that combines computer science and robust
datasets to enable problem-solving. It also encompasses the sub-fields of machine learning and deep
learning, which are frequently mentioned in conjunction with artificial intelligence. These
disciplines comprise AI algorithms that seek to create expert systems which make
predictions or classifications based on input data.
Over the years, artificial intelligence has gone through many cycles of hype, but even to skeptics,
the release of OpenAI’s ChatGPT seems to mark a turning point. The last time generative AI
loomed this large, the breakthroughs were in computer vision, but now the leap forward is in
natural language processing. And it’s not just language: Generative models can also learn the
grammar of software code, molecules, natural images, and a variety of other data types.
The applications for this technology are growing every day, and we’re just starting to explore the
possibilities.

Define Artificial Intelligence?


While a number of definitions of artificial intelligence (AI) have surfaced over the last few
decades, John McCarthy offers the following definition, “It is the science and engineering of
making intelligent machines, especially intelligent computer programs. It is related to the similar
task of using computers to understand human intelligence, but AI does not have to confine itself
to methods that are biologically observable."
Alan Turing’s definition would have fallen under the category of “systems that act like humans.”

Applications of AI?

There are numerous, real-world applications of AI systems today. Below are some of the most
common use cases:

 Speech recognition: Also known as automatic speech recognition (ASR), computer
speech recognition, or speech-to-text, this is a capability that uses natural language
processing (NLP) to convert human speech into a written format. Many mobile devices
incorporate speech recognition into their systems to conduct voice search
—e.g. Siri—or provide more accessibility around texting.

 Customer service: Online virtual agents are replacing human agents along the customer
journey. They answer frequently asked questions (FAQs) around topics like shipping, or
provide personalized advice, cross-selling products or suggesting sizes for users, changing
the way we think about customer engagement across websites and social media platforms.
Examples include messaging bots on e-commerce sites with virtual agents; messaging
apps such as Slack and Facebook Messenger; and tasks usually done by virtual assistants
and voice assistants.

 Computer vision: This AI technology enables computers and systems to derive
meaningful information from digital images, videos and other visual inputs, and to take
action based on those inputs. This ability to provide recommendations distinguishes it
from image recognition tasks. Powered by convolutional neural networks, computer vision
has applications in photo tagging in social media, radiology imaging in healthcare, and
self-driving cars in the automotive industry.

 Recommendation engines: Using past consumption behavior data, AI algorithms can help
to discover data trends that can be used to develop more effective cross-selling strategies.
This is used to make relevant add-on recommendations to customers during the checkout
process for online retailers.

 Automated stock trading: Designed to optimize stock portfolios, AI-driven high-frequency
trading platforms make thousands or even millions of trades per day without
human intervention.

History of artificial intelligence


Maturation of Artificial Intelligence (1943-1952)

o Year 1943: The first work that is now recognized as AI was done by Warren McCulloch
and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection
strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950.
He published "Computing Machinery and Intelligence", in which he proposed a test that
checks a machine's ability to exhibit intelligent behavior equivalent to human intelligence,
now called the Turing test.

The birth of Artificial Intelligence (1952-1956)

o Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence
program", named the "Logic Theorist". This program proved 38 of 52 mathematics
theorems and found new, more elegant proofs for some of them.
o Year 1956: The term "Artificial Intelligence" was first adopted by American computer
scientist John McCarthy at the Dartmouth Conference. For the first time, AI was coined as
an academic field.

At that time, high-level computer languages such as FORTRAN, LISP, and COBOL were being
invented, and enthusiasm for AI was very high.

The golden years-Early enthusiasm (1956-1974)

o Year 1966: Researchers emphasized developing algorithms that could solve
mathematical problems. Joseph Weizenbaum created the first chatbot, named ELIZA, in 1966.
o Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in Japan.

The first AI winter (1974-1980)

o The period between 1974 and 1980 was the first AI winter. An AI winter refers to a
period in which computer scientists dealt with a severe shortage of government funding
for AI research.
o During AI winters, public interest in artificial intelligence decreased.

A boom of AI (1980-1987)

o Year 1980: After the AI winter, AI came back with "expert systems". Expert systems
were programs that emulate the decision-making ability of a human expert.
o Also in 1980, the first national conference of the American Association for Artificial
Intelligence was held at Stanford University.

The second AI winter (1987-1993)

o The period between 1987 and 1993 was the second AI winter.
o Investors and governments again stopped funding AI research due to high costs and
inefficient results. Expert systems such as XCON, although initially very cost-effective,
had become expensive to maintain.

The emergence of intelligent agents (1993-2011)

o Year 1997: In 1997, IBM's Deep Blue beat world chess champion Garry Kasparov,
becoming the first computer to beat a reigning world chess champion.
o Year 2002: For the first time, AI entered the home in the form of Roomba, a robotic
vacuum cleaner.
o Year 2006: AI entered the business world. Companies like Facebook,
Twitter, and Netflix started using AI.

Deep learning, big data and artificial general intelligence (2011-present)

o Year 2011: In 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to
solve complex questions as well as riddles. Watson proved that it could understand
natural language and solve tricky questions quickly.
o Year 2012: Google launched an Android app feature, "Google Now", which was able
to provide information to the user as a prediction.
o Year 2014: The chatbot "Eugene Goostman" won a competition based on the
famous "Turing test".
o Year 2018: IBM's "Project Debater" debated complex topics with two master
debaters and performed extremely well.
o Google demonstrated an AI program, "Duplex", a virtual assistant that
booked a hairdresser appointment over the phone, and the person on the other end did not
notice that she was talking with a machine.
AI has now developed to a remarkable level. Deep learning, big data, and data
science are booming fields. Companies like Google, Facebook, IBM, and
Amazon are working with AI and creating impressive devices and services. The future of artificial
intelligence is inspiring and promises even more capable systems.
What is machine learning?
Machine learning is a branch of artificial intelligence (AI) and computer science which
focuses on the use of data and algorithms to imitate the way that humans learn,
gradually improving its accuracy.

Real-world machine learning use cases

Here are just a few examples of machine learning you might encounter every day:

1) Speech recognition: It is also known as automatic speech recognition (ASR),
computer speech recognition, or speech-to-text, and it is a capability which uses
natural language processing (NLP) to translate human speech into a written
format. Many mobile devices incorporate speech recognition into their systems
to conduct voice search—e.g. Siri—or improve accessibility for texting.
2) Customer service: Online chatbots are replacing human agents along the
customer journey, changing the way we think about customer engagement
across websites and social media platforms. Chatbots answer frequently asked
questions (FAQs) about topics such as shipping, or provide personalized advice,
cross-selling products or suggesting sizes for users. Examples include virtual
agents on e-commerce sites; messaging bots, using Slack and Facebook
Messenger; and tasks usually done by virtual assistants and voice assistants.
3) Computer vision: This AI technology enables computers to derive meaningful
information from digital images, videos, and other visual inputs, and then take
the appropriate action. Powered by convolutional neural networks, computer
vision has applications in photo tagging on social media, radiology imaging in
healthcare, and self-driving cars in the automotive industry.
4) Recommendation engines: Using past consumption behavior data, AI
algorithms can help to discover data trends that can be used to develop more
effective cross-selling strategies. This approach is used by online retailers to
make relevant product recommendations to customers during the checkout
process.
5) Automated stock trading: Designed to optimize stock portfolios, AI-driven
high-frequency trading platforms make thousands or even millions of trades per
day without human intervention.
6) Fraud detection: Banks and other financial institutions can use machine
learning to spot suspicious transactions. Supervised learning can train a model
using information about known fraudulent transactions. Anomaly detection can
identify transactions that look atypical and deserve further investigation.
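
As a rough illustration of the anomaly-detection idea for fraud screening (not any particular
bank's system), the sketch below uses scikit-learn's IsolationForest on invented transaction
features; the feature names and values are assumptions made only for this example.

```python
# Hypothetical sketch: flagging atypical transactions with an Isolation Forest.
# The feature values below are invented for illustration only.
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row: [amount_in_dollars, hour_of_day, distance_from_home_km]
transactions = np.array([
    [25.0, 12, 3.0],
    [40.0, 18, 5.0],
    [31.0, 9, 2.5],
    [28.0, 14, 4.0],
    [5200.0, 3, 800.0],   # looks atypical and deserves further investigation
])

model = IsolationForest(contamination=0.2, random_state=0)
model.fit(transactions)

# predict() returns 1 for inliers and -1 for anomalies worth reviewing
print(model.predict(transactions))   # e.g. [ 1  1  1  1 -1]
```

In a real system the supervised route described above would instead train a classifier on
transactions already labeled as fraudulent or legitimate.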

Challenges of machine learning


As machine learning technology has developed, it has certainly made our lives easier.
However, implementing machine learning in businesses has also raised a number of
ethical concerns about AI technologies. Some of these include:
1) Technological singularity

While this topic garners a lot of public attention, many researchers are not concerned
with the idea of AI surpassing human intelligence in the near future. Technological
singularity is also referred to as strong AI or superintelligence. Philosopher Nick
Bostrom defines superintelligence as “any intellect that vastly outperforms the best
human brains in practically every field, including scientific creativity, general wisdom,
and social skills.” Despite the fact that superintelligence is not imminent in society, the
idea of it raises some interesting questions as we consider the use of autonomous
systems, like self-driving cars. It’s unrealistic to think that a driverless car would never
have an accident, but who is responsible and liable under those circumstances? Should
we still develop autonomous vehicles, or do we limit this technology to semi-
autonomous vehicles which help people drive safely? The jury is still out on this, but
these are the types of ethical debates that are occurring as new, innovative AI
technology develops.

2) AI impact on jobs

While a lot of public perception of artificial intelligence centers around job losses, this
concern should probably be reframed. With every disruptive, new technology, we see
that the market demand for specific job roles shifts. For example, when we look at the
automotive industry, many manufacturers, like GM, are shifting to focus on electric
vehicle production to align with green initiatives. The energy industry isn’t going away,
but the source of energy is shifting from a fuel economy to an electric one.

In a similar way, artificial intelligence will shift the demand for jobs to other areas.
There will need to be individuals to help manage AI systems. There will still need to be
people to address more complex problems within the industries that are most likely to
be affected by job demand shifts, such as customer service. The biggest challenge with
artificial intelligence and its effect on the job market will be helping people to transition
to new roles that are in demand.

3) Privacy

Privacy tends to be discussed in the context of data privacy, data protection, and
data security. These concerns have allowed policymakers to make more strides in
recent years. As a result, investments in security have become an increasing priority
for businesses as they seek to eliminate any vulnerabilities and opportunities for
surveillance, hacking, and cyberattacks.

4) Bias and discrimination

Instances of bias and discrimination across a number of machine learning systems
have raised many ethical questions regarding the use of artificial intelligence. How
can we safeguard against bias and discrimination when the training data itself may be
generated by biased human processes?
Bias and discrimination aren’t limited to the human resources function either; they
can be found in a number of applications, from facial recognition software to social
media algorithms. As businesses become more aware of the risks of AI, they’ve also
become more active in the discussion around AI ethics and values. Bias and
discrimination in AI data and algorithms can have serious consequences for
individuals and society, such as unfair treatment, exclusion, or violation of human
rights. As an AI practitioner, you have a responsibility to ensure that your AI systems
are ethical, fair, and transparent.

5) Accountability

Since there isn’t significant legislation to regulate AI practices, there is no real
enforcement mechanism to ensure that ethical AI is practiced. The current incentives
for companies to be ethical are the negative repercussions of an unethical AI system
on the bottom line. To fill the gap, ethical frameworks have emerged as part of a
collaboration between ethicists and researchers to govern the construction and
distribution of AI models within society. However, at the moment, these only serve to
guide. Some research shows that the combination of distributed responsibility and a
lack of foresight into potential consequences aren’t conducive to preventing harm to
society.

Differentiate between Conventional Programming v/s Machine Learning

Conventional Programming

 Conventional programming uses a conventional procedural language. It could be
assembly language or a high-level language such as C, C++, Java, JavaScript,
Python, etc.
 Conventional programming is a manual process, which means the programmer
creates the logic of the program.
 Programmers need to code the rules and write lines of code manually.
 They provide the input data and, based on the program’s logic, it
produces the desired output.
 The conventional programming approach is algorithm dependent, and for a given
program, multiple algorithms can work. It is up to the programmer to
design and develop the logic of the program.
Conventional Program for Activity Recognition

Suppose we use only one piece of information (speed): how do we recognize an activity
such as golf? It is more complicated, right? We need more than one variable to give enough
information to recognize a complex activity. Relying on the speed
variable alone is quite naive: we walk and run at different speeds uphill and
downhill, and other people walk and run at different speeds than we do.

Imagine that you need to create an activity recognition program: how many
variables and if-else statements would it take? It could easily be
tens of variables and thousands of if-else statements. That is unmanageable, and the
resulting system is static. A rough sketch of this rule-based approach is shown below.
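
The following is a minimal sketch of what such a hand-coded recognizer might look like; the
speed thresholds and activity names are invented for illustration, not taken from any real system.

```python
# Hypothetical hand-coded activity recognizer based only on speed (m/s).
# Every rule and threshold must be written and maintained by the programmer.
def recognize_activity(speed_mps: float) -> str:
    if speed_mps < 0.2:
        return "standing"
    elif speed_mps < 2.0:
        return "walking"
    elif speed_mps < 7.0:
        return "running"
    else:
        return "cycling"

print(recognize_activity(1.4))  # walking
print(recognize_activity(4.5))  # running
# Activities such as golf cannot be separated by speed alone, so a real
# rule-based system quickly grows to thousands of if-else branches.
```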

Machine learning (ML)

 Machine learning is the scientific study of algorithms and statistical
models.
 Computer systems use these statistical models to perform a specific task
effectively.
 Here you don’t need to provide explicit instructions; instead, the system relies
on patterns and inference.
 Machine learning is a subset of artificial
intelligence.
 Machine learning algorithms build a mathematical model from
sample data.
 This data is known as “training data.” Using this data and these algorithms, the
system makes predictions or decisions. You don’t need to explicitly program it to
perform the task.
The idea of Machine Learning for Activity Recognition

Let’s say every activity generates specific and unique data which represents that activity.
This data has specific patterns, and we give a label to every pattern. A “label” is the result
you want the system to predict. Machine learning needs high-quality and high-quantity data
for training the system. A sketch of this learning-based approach follows below.
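
Here is a small sketch of the machine-learning alternative. The sensor features (speed and
wrist acceleration), their values, and the choice of a decision tree classifier are all assumptions
made for illustration; any labeled dataset and classifier could take their place.

```python
# Hypothetical sketch: learning activity labels from labeled sensor data
# instead of hand-writing rules. Feature values are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Each sample: [speed_mps, wrist_acceleration]
X_train = [
    [0.1, 0.2], [0.0, 0.1],   # standing
    [1.3, 0.8], [1.5, 0.9],   # walking
    [4.0, 2.5], [4.5, 2.8],   # running
    [0.3, 3.5], [0.2, 3.8],   # golf swing: slow speed, high wrist movement
]
y_train = ["standing", "standing", "walking", "walking",
           "running", "running", "golf", "golf"]

# The labels are the results we want the system to predict
model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.predict([[0.25, 3.6]]))  # likely ['golf']
```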

Concept of Conventional Programming and Machine Learning

Conventional programming decision making is based on IF-ELSE. Many solutions
cannot be modeled with conventional programming because of the variation in the
input variables and the complexity of the problem.
E.g.: Conventional programming is still used to develop and build websites, Android
apps, iOS apps, and other software.

Machine learning solves this problem by modeling the data with training
data and test data and then predicting the result.
E.g.: ML can be used to build intelligent systems for fraud detection,
recommendations, etc.

How is machine learning related to AI?

Artificial Intelligence (AI) started as a subfield of computer science with the focus on
solving tasks that humans can but computers can’t do (for instance, image
recognition). AI can be approached in many ways, for example, writing a computer
program that implements a set of rules devised by domain experts. Now, hand-crafting
rules can be very laborious and time consuming.

The field of machine learning – originally, we can consider it as a subfield of AI – was
concerned with the development of algorithms so that computers can automatically
learn (predictive) models from data.

For instance, say we want to develop a program that can recognize handwritten digits
from images. One approach would be to look at all of these images and come up with a set of
(nested) if-this-then-that rules to say which digit is displayed in a particular image
(for instance, by looking at the relative locations of pixels). Another approach would
be to use a machine learning algorithm, which can fit a predictive model based on
thousands of labeled image samples that we may have collected in a database. Now,
there’s also deep learning, which in turn is a subfield of machine learning, referring to
a particular subset of models that are particularly good at certain tasks such as image
recognition and natural language processing.
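
A brief sketch of the second approach, assuming scikit-learn's small built-in digits dataset and
a plain logistic regression model (any classifier would do); this is illustrative, not a tuned solution.

```python
# Sketch: learning to recognize handwritten digits from labeled examples
# rather than hand-writing pixel rules.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()                      # 8x8 grayscale digit images with labels
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)                 # fit a predictive model to labeled images
print("test accuracy:", model.score(X_test, y_test))
```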

Or in short, machine learning (and deep learning) definitely helps to develop “AI,”
however, AI doesn’t necessarily have to be developed using machine learning –
although, machine learning makes “AI” much more convenient ;).
DATA
Define – Data refers to the raw information or facts that are collected and processed by AI systems
to make predictions, learn patterns, and make decisions. Data in AI can take various forms, including
text, numbers, images, audio, and more. It serves as the input to AI algorithms and models, which use
this data to learn and generate insights or perform specific tasks.

Data in AI can be categorized into different types:


1. Training Data: This is a crucial type of data used to train AI models. Training data consists of
input features and their corresponding output labels or target values. The AI model learns
patterns and relationships within the training data to make predictions or classifications when
presented with new, unseen data.
2. Testing Data: Testing data is a separate dataset used to evaluate the performance of AI models.
It is not used during the training process and helps assess how well a model generalizes to new
data.
3. Unlabeled Data: In unsupervised learning, AI models analyze and discover patterns within
data that does not have associated labels or outcomes. This type of data is used for clustering,
dimensionality reduction, and other tasks.
4. Streaming Data: Some AI applications, such as real-time analytics or monitoring, involve
continuously streaming data from various sources. AI systems process and analyze this data as
it arrives to make timely decisions or predictions.
5. Time Series Data: Time series data is a sequence of data points collected at successive time
intervals. It is often used in forecasting and trend analysis.
6. Structured and Unstructured Data: Data in AI can be structured, meaning it is organized
in a clear, tabular format (e.g., databases), or unstructured, meaning it lacks a predefined
structure (e.g., text documents, images, audio).
7. Big Data: This term refers to exceptionally large and complex datasets that require specialized
tools and techniques to handle, store, and process efficiently.
In AI, the quality, quantity, and relevance of data are critical factors that can significantly impact the
performance of AI models. Data preprocessing, cleaning, and feature engineering are common
practices used to prepare data for AI tasks. Additionally, ethical considerations and data privacy play a
crucial role in data handling and usage within AI systems.
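
As a small, hedged illustration of separating training and testing data before model building,
the sketch below uses scikit-learn's train_test_split on an invented table; the column names
and values are assumptions made only for this example.

```python
# Hypothetical sketch: splitting a dataset into training and testing portions.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38, 29, 44, 36],
    "income": [30000, 42000, 88000, 95000, 52000, 39000, 76000, 58000],
    "bought": [0, 0, 1, 1, 0, 0, 1, 1],          # output label / target value
})

X = df[["age", "income"]]      # input features
y = df["bought"]               # labels

# Hold out 25% of the rows as testing data, never seen during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```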

Data terminology and concepts in AI


Data terminology and concepts in AI encompass a wide range of terms and ideas related to the
collection, processing, analysis, and use of data in artificial intelligence applications. Here are some key
data-related terms and concepts in AI:

1. Data: Raw facts, figures, or information that can be collected and processed. Data can be
structured (e.g., in databases) or unstructured (e.g., text, images).
2. Data Point: A single unit of data, often represented as a single row in a dataset.
3. Dataset: A structured collection of data points, organized for specific analysis or machine
learning tasks.
4. Feature: A specific attribute or variable in a dataset used as input for AI models. Features are
often used to describe and represent data.
5. Label: In supervised learning, a label is the known output or target variable associated with a
data point. It represents the correct answer or outcome.
6. Training Data: The portion of the dataset used to train a machine learning model. It consists
of input features and their corresponding labels.
7. Testing Data: The portion of the dataset used to evaluate the performance of a machine
learning model. It is separate from the training data and helps assess how well the model
generalizes.
8. Validation Data: An additional dataset used for fine-tuning models and hyperparameter
optimization. It helps prevent overfitting.
9. Unlabeled Data: Data without associated labels or outcomes, often used in unsupervised
learning.
10. Supervised Learning: A machine learning paradigm where models are trained on labeled
data to make predictions or classifications.
11. Unsupervised Learning: A machine learning paradigm where models find patterns or
structures in data without labeled outcomes.
12. Semi-Supervised Learning: Combines elements of both supervised and unsupervised
learning, where some data is labeled and some is not.
13. Reinforcement Learning: A type of machine learning where agents learn to make decisions
by interacting with an environment and receiving rewards or penalties.
14. Feature Engineering: The process of selecting, creating, or transforming features to improve
the performance of AI models.
15. Data Preprocessing: Tasks performed on data to clean, normalize, and prepare it for model
training.
16. Overfitting: Occurs when a model learns the training data too well and performs poorly on
new, unseen data.
17. Underfitting: Occurs when a model is too simple to capture the underlying patterns in the
data, resulting in poor performance.
18. Bias-Variance Trade-off: Balancing the model's ability to fit the training data while avoiding
overfitting (high variance) or underfitting (high bias).
19. Feature Selection: Choosing the most relevant features from the dataset to improve model
efficiency and reduce complexity.
20. Feature Extraction: Transforming or summarizing data into a new set of features that capture
essential information.
21. Data Augmentation: Techniques for increasing the size and diversity of the dataset by
applying transformations to the existing data.
22. Data Mining: The process of discovering patterns and insights from large datasets.
23. Anomaly Detection: Identifying rare or unusual data points that deviate from expected
patterns.
24. Data Pipeline: A series of data processing and transformation steps, often automated, that
prepare data for AI modeling.
25. Data Labeling: The process of assigning labels or categories to data points, often done
manually for training data.
26. Feature Scaling: Normalizing or standardizing features to have similar scales, which can
improve model performance.
27. Data Privacy: Ensuring the protection and confidentiality of sensitive or personal data used in
AI applications.
28. Data Ethics: The ethical considerations related to data collection, usage, and potential biases
in AI systems.

These terms and concepts are fundamental to understanding how data is utilized in AI, machine
learning, and deep learning applications. They are essential for both practitioners and those looking to
gain insights into the field of artificial intelligence.
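
To make one of these terms concrete, here is a short sketch of feature scaling (term 26) using
scikit-learn's StandardScaler; the two features and their values are invented for illustration.

```python
# Sketch: standardizing features so they have similar scales (zero mean, unit variance).
from sklearn.preprocessing import StandardScaler

# Two features on very different scales: age in years, income in dollars
X = [[25, 30000],
     [32, 42000],
     [47, 88000],
     [51, 95000]]

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)   # each column now has mean 0 and unit variance
print(X_scaled)
```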

Deep learning
Deep learning is a subset of machine learning that is essentially a neural network with three or more
layers. These neural networks attempt to simulate the behavior of the human brain—albeit far from
matching its ability—allowing them to “learn” from large amounts of data. While a neural network with a
single layer can still make approximate predictions, additional hidden layers can help to optimize and
refine for accuracy.
Deep learning drives many artificial intelligence (AI) applications and services that improve
automation, performing analytical and physical tasks without human intervention. Deep learning
technology lies behind everyday products and services (such as digital assistants, voice-enabled
TV remotes, and credit card fraud detection) as well as emerging technologies (such as
self-driving cars).

Deep learning v/s Machine learning


If deep learning is a subset of machine learning, how do they differ? Deep learning distinguishes itself
from classical machine learning by the type of data that it works with and the methods in which it
learns.
Machine learning algorithms leverage structured, labeled data to make predictions—meaning that
specific features are defined from the input data for the model and organized into tables. This doesn’t
necessarily mean that it doesn’t use unstructured data; it just means that if it does, it generally goes
through some pre-processing to organize it into a structured format.
Deep learning eliminates some of the data pre-processing that is typically involved with machine learning.
These algorithms can ingest and process unstructured data, like text and images, and they automate
feature extraction, removing some of the dependency on human experts. For example, let’s say
that we had a set of photos of different pets, and we wanted to categorize them by “cat”,
“dog”, “hamster”, etc. Deep learning algorithms can determine which features (e.g.
ears) are most important to distinguish each animal from another. In machine learning, this hierarchy
of features is established manually by a human expert.
Then, through the processes of gradient descent and backpropagation, the deep learning algorithm
adjusts and fits itself for accuracy, allowing it to make predictions about a new photo of an animal with
increased precision.
Machine learning and deep learning models are capable of different types of learning
as well, which are usually categorized as supervised learning, unsupervised learning,
and reinforcement learning.
 Supervised learning utilizes labeled datasets to categorize or make predictions; this
requires some kind of human intervention to label input data correctly.
 In contrast, unsupervised learning doesn’t require labeled datasets, and instead, it
detects patterns in the data, clustering them by any distinguishing characteristics.

 Reinforcement learning is a process in which a model learns to become more accurate for
performing an action in an environment based on feedback in order to maximize the reward.
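
A compact sketch contrasting the first two paradigms on the same toy data: a supervised
classifier learns from labels supplied by a human, while k-means clustering groups the points
on its own. The points, labels, and choice of k-nearest neighbors and k-means are assumptions
made purely for illustration.

```python
# Sketch: supervised vs. unsupervised learning on the same invented 2-D points.
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],     # one natural group
     [5.0, 5.2], [5.1, 4.8], [4.9, 5.0]]     # another natural group

# Supervised: labels are provided by a human
y = ["small", "small", "small", "large", "large", "large"]
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[1.1, 1.0]]))             # -> ['small']

# Unsupervised: no labels, the algorithm clusters by distinguishing characteristics
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                            # e.g. [0 0 0 1 1 1]
```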

How deep learning works

Deep learning neural networks, or artificial neural networks, attempt to mimic the human brain
through a combination of data inputs, weights, and bias. These elements work together to
accurately recognize, classify, and describe objects within the data.

Deep neural networks consist of multiple layers of interconnected nodes, each building upon the
previous layer to refine and optimize the prediction or categorization. This progression of computations
through the network is called forward propagation. The input and output layers of a deep neural
network are called visible layers. The input layer is where the deep learning model ingests the data for
processing, and the output layer is where the final prediction or classification is made.

Another process called backpropagation uses algorithms, like gradient descent, to calculate errors in
predictions and then adjusts the weights and biases of the function by moving backwards through the
layers in an effort to train the model. Together, forward propagation and backpropagation allow a
neural network to make predictions and correct for any errors accordingly. Over time, the algorithm
becomes gradually more accurate.
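
The sketch below shows these two processes on the smallest possible example: one hidden layer,
a sigmoid activation, and plain gradient descent in NumPy on the XOR problem. It is a toy
illustration under assumed settings (layer sizes, learning rate, loss), not a production training loop.

```python
# Toy illustration of forward propagation and backpropagation with NumPy.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # inputs
y = np.array([[0.], [1.], [1.], [0.]])                   # targets (XOR)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))        # input -> hidden weights/bias
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))        # hidden -> output weights/bias
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # forward propagation: compute a prediction layer by layer
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # backpropagation: move the error backwards and adjust weights and biases
    err_out = (y_hat - y) * y_hat * (1 - y_hat)
    err_hid = (err_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ err_out;  b2 -= 0.5 * err_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ err_hid;  b1 -= 0.5 * err_hid.sum(axis=0, keepdims=True)

print(np.round(y_hat, 2))   # predictions gradually move toward the targets
```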

The deep learning algorithms are incredibly complex, and there are different types of neural networks
to address specific problems or datasets. For example,

 Convolutional neural networks (CNNs), used primarily in computer vision and image
classification applications, can detect features and patterns within an image, enabling tasks, like
object detection or recognition. In 2015, a CNN bested a human in an object recognition
challenge for the first time.
 Recurrent neural networks (RNNs) are typically used in natural language and speech recognition
applications, as they leverage sequential or time series data.

Deep learning applications

Real-world deep learning applications are a part of our daily lives, but in most cases, they are so well-
integrated into products and services that users are unaware of the complex data processing that is
taking place in the background. Some of these examples include the following:

1. Law enforcement

Deep learning algorithms can analyze and learn from transactional data to identify dangerous patterns
that indicate possible fraudulent or criminal activity. Speech recognition, computer vision, and other
deep learning applications can improve the efficiency and effectiveness of investigative analysis by
extracting patterns and evidence from sound and video recordings, images, and documents, which
helps law enforcement analyze large amounts of data more quickly and accurately.
2. Financial services

Financial institutions regularly use predictive analytics to drive algorithmic trading of stocks, assess
business risks for loan approvals, detect fraud, and help manage credit and investment portfolios for
clients.

3. Customer service

Many organizations incorporate deep learning technology into their customer service
processes. Chatbots—used in a variety of applications, services, and customer service portals—are a
straightforward form of AI. Traditional chatbots use natural language and even visual recognition,
commonly found in call center-like menus. However, more sophisticated chatbot solutions attempt to
determine, through learning, if there are multiple responses to ambiguous questions. Based on the
responses it receives, the chatbot then tries to answer these questions directly or route the conversation
to a human user.

Virtual assistants like Apple's Siri, Amazon Alexa, or Google Assistant extend the idea of a chatbot
by enabling speech recognition functionality. This creates a new method to engage users in a
personalized way.

4. Healthcare

The healthcare industry has benefited greatly from deep learning capabilities ever since the digitization
of hospital records and images. Image recognition applications can support medical imaging specialists
and radiologists, helping them analyze and assess more images in less time.

Neural Network
What is a neural network?
A neural network is a method in artificial intelligence that teaches computers to process data in a way
that is inspired by the human brain. It is a type of machine learning process, called deep learning, that
uses interconnected nodes or neurons in a layered structure that resembles the human brain. It creates
an adaptive system that computers use to learn from their mistakes and improve continuously. Thus,
artificial neural networks attempt to solve complicated problems, like summarizing documents or
recognizing faces, with greater accuracy.

What are neural networks used for?

Neural networks have several use cases across many industries, such as the following:

 Medical diagnosis by medical image classification


 Targeted marketing by social network filtering and behavioral data analysis
 Financial predictions by processing historical data of financial instruments
 Electrical load and energy demand forecasting
 Process and quality control
 Chemical compound identification

Four important applications of neural networks are described below.

1. Computer vision

Computer vision is the ability of computers to extract information and insights from images and videos.
With neural networks, computers can distinguish and recognize images similar to humans. Computer
vision has several applications, such as the following:

 Visual recognition in self-driving cars so they can recognize road signs and other road users
 Content moderation to automatically remove unsafe or inappropriate content from image and video
archives
 Facial recognition to identify faces and recognize attributes like open eyes, glasses, and facial hair
 Image labeling to identify brand logos, clothing, safety gear, and other image details

2. Speech recognition

Neural networks can analyze human speech despite varying speech patterns, pitch, tone, language, and
accent. Virtual assistants like Amazon Alexa and automatic transcription software use speech
recognition to do tasks like these:

 Assist call center agents and automatically classify calls


 Convert clinical conversations into documentation in real time
 Accurately subtitle videos and meeting recordings for wider content reach

3. Natural language processing

Natural language processing (NLP) is the ability to process natural, human-created text. Neural
networks help computers gather insights and meaning from text data and documents. NLP has several
use cases, including in these functions:

 Automated virtual agents and chatbots


 Automatic organization and classification of written data
 Business intelligence analysis of long-form documents like emails and forms
 Indexing of key phrases that indicate sentiment, like positive and negative comments on social media
 Document summarization and article generation for a given topic

4. Recommendation engines

Neural networks can track user activity to develop personalized recommendations. They can also
analyze all user behavior and discover new products or services that interest a specific user. For
example, Curalate, a Philadelphia-based startup, helps brands convert social media posts into sales.
Brands use Curalate’s intelligent product tagging (IPT) service to automate the collection and curation
of user-generated social content. IPT uses neural networks to automatically find and recommend
products relevant to the user’s social media activity. Consumers don't have to hunt through online
catalogs to find a specific product from a social media image. Instead, they can use Curalate’s auto
product tagging to purchase the product with ease.

How do neural networks work?


The human brain is the inspiration behind neural network architecture. Human brain cells, called
neurons, form a complex, highly interconnected network and send electrical signals to each other to
help humans process information. Similarly, an artificial neural network is made of artificial neurons
that work together to solve a problem. Artificial neurons are software modules, called nodes, and
artificial neural networks are software programs or algorithms that, at their core, use computing
systems to solve mathematical calculations.

Simple neural network architecture

A basic neural network has interconnected artificial neurons in three layers:

1. Input Layer

Information from the outside world enters the artificial neural network from the input layer. Input
nodes process the data, analyze or categorize it, and pass it on to the next layer.

2. Hidden Layer

Hidden layers take their input from the input layer or other hidden layers. Artificial neural networks
can have a large number of hidden layers. Each hidden layer analyzes the output from the previous
layer, processes it further, and passes it on to the next layer.

3. Output Layer

The output layer gives the final result of all the data processing by the artificial neural network. It can
have single or multiple nodes. For instance, if we have a binary (yes/no) classification problem, the
output layer will have one output node, which will give the result as 1 or 0. However, if we have a multi-
class classification problem, the output layer might consist of more than one output node.

Deep neural network architecture

Deep neural networks, or deep learning networks, have several hidden layers with millions of artificial
neurons linked together. A number, called weight, represents the connections between one node and
another. The weight is a positive number if one node excites another, or negative if one node suppresses
the other. Nodes with higher weight values have more influence on the other nodes.
Theoretically, deep neural networks can map any input type to any output type. However, they also
need much more training as compared to other machine learning methods. They need millions of
examples of training data rather than perhaps the hundreds or thousands that a simpler network might
need.
What are the types of neural networks?
Artificial neural networks can be categorized by how the data flows from the input node to the output
node. Below are some examples:

1. Feedforward neural networks

Feedforward neural networks process data in one direction, from the input node to the output node.
Every node in one layer is connected to every node in the next layer. During training, a feedforward
network uses a feedback process (backpropagation) to improve its predictions over time.

2. Backpropagation algorithm

Artificial neural networks learn continuously by using corrective feedback loops to improve their
predictive analytics. In simple terms, you can think of the data flowing from the input node to the
output node through many different paths in the neural network. Only one path is the correct one that
maps the input node to the correct output node. To find this path, the neural network uses a feedback
loop, which works as follows:

1. Each node makes a guess about the next node in the path.
2. It checks if the guess was correct. Nodes assign higher weight values to paths that lead to more correct
guesses and lower weight values to node paths that lead to incorrect guesses.
3. For the next data point, the nodes make a new prediction using the higher weight paths and then repeat
Step 1.

3. Convolutional neural networks

The hidden layers in convolutional neural networks perform specific mathematical functions, like
summarizing or filtering, called convolutions. They are very useful for image classification because they
can extract relevant features from images that are useful for image recognition and classification. The
convolved form is easier to process without losing the features that are critical for making a good
prediction. Each hidden layer extracts and processes different image features, like edges, color, and depth.
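
The document does not name a framework, so as one possible illustration, here is a minimal
sketch of a small convolutional network for image classification using the Keras API; the layer
sizes, the 28x28 grayscale input, and the 10 output classes are assumptions, not a tuned design.

```python
# Sketch of a small convolutional neural network for image classification using Keras.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, (3, 3), activation="relu"),   # convolution: extract local features
    layers.MaxPooling2D((2, 2)),                    # pooling: summarize / downsample
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),         # e.g. 10 image classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=5) would train it on labeled images.
```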

Example: How do neural networks work?

Think of each individual node as its own linear regression model, composed of input data, weights, a
bias (or threshold), and an output. The formula would look something like this:

∑wixi + bias = w1x1 + w2x2 + w3x3 + bias

output = f(x) = 1 if ∑wixi + b >= 0; 0 if ∑wixi + b < 0

Once an input layer is determined, weights are assigned. These weights help determine the importance
of any given variable, with larger ones contributing more significantly to the output compared to other
inputs. All inputs are then multiplied by their respective weights and then summed. Afterward, the
output is passed through an activation function, which determines the output. If that output exceeds a
given threshold, it “fires” (or activates) the node, passing data to the next layer in the network. This
results in the output of one node becoming the input of the next node. This process of passing data
from one layer to the next defines this neural network as a feedforward network.

Let’s break down what one single node might look like using binary values. We can apply this concept
to a more tangible example, like whether you should go surfing (Yes: 1, No: 0). The decision to go or
not to go is our predicted outcome, or y-hat. Let’s assume that there are three factors influencing your
decision-making:

1. Are the waves good? (Yes: 1, No: 0)


2. Is the line-up empty? (Yes: 1, No: 0)
3. Has there been a recent shark attack? (Yes: 0, No: 1)

Then, let’s assume the following, giving us the following inputs:

 X1 = 1, since the waves are pumping


 X2 = 0, since the crowds are out
 X3 = 1, since there hasn’t been a recent shark attack

Now, we need to assign some weights to determine importance. Larger weights signify that particular
variables are of greater importance to the decision or outcome.

 W1 = 5, since large swells don’t come around often


 W2 = 2, since you’re used to the crowds
 W3 = 4, since you have a fear of sharks

Finally, we’ll also assume a threshold value of 3, which would translate to a bias value of –3. With all
the various inputs, we can start to plug in values into the formula to get the desired output.

Y-hat = (1*5) + (0*2) + (1*4) – 3 = 6

If we use the activation function from the beginning of this section, we can determine that the output
of this node would be 1, since 6 is greater than 0. In this instance, you would go surfing; but if we adjust
the weights or the threshold, we can achieve different outcomes from the model. When we observe one
decision, like in the above example, we can see how a neural network could make increasingly complex
decisions depending on the output of previous decisions or layers.
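
The same surfing decision can be written out as a tiny piece of Python that follows the numbers
above exactly; only the function name and structure are my own.

```python
# The single-node surfing example from above, expressed directly in code.
def perceptron_node(inputs, weights, bias):
    """Weighted sum plus bias, passed through a step activation function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0   # 1 = "fire" (go surfing), 0 = stay home

inputs  = [1, 0, 1]       # good waves, crowded line-up, no recent shark attack
weights = [5, 2, 4]       # importance of each factor
bias    = -3              # a threshold of 3 expressed as a bias of -3

print(perceptron_node(inputs, weights, bias))   # 1 -> go surfing, since (1*5)+(0*2)+(1*4)-3 = 6
```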

Artificial Neural Network


The term "Artificial neural network" refers to a biologically inspired sub-field of artificial
intelligence modeled after the brain. An Artificial neural network is usually a computational
network based on biological neural networks that construct the structure of the human brain.
Similar to a human brain has neurons interconnected to each other, artificial neural networks also
have neurons that are linked to each other in various layers of the networks. These neurons are
known as nodes.

What is Artificial Neural Network?


The term "Artificial Neural Network" is derived from Biological neural networks that develop the
structure of a human brain. Similar to the human brain that has neurons interconnected to one
another, artificial neural networks also have neurons that are interconnected to one another in
various layers of the networks. These neurons are known as nodes.

The given figure illustrates the typical diagram of Biological Neural Network.

The typical Artificial Neural Network looks something like the given figure.
Relationship between Biological neural network and artificial neural
network:

Biological Neural Network → Artificial Neural Network

Dendrites → Inputs

Cell nucleus → Nodes

Synapse → Weights

Axon → Output

An artificial neural network is a model in the field of artificial intelligence that attempts to mimic
the network of neurons that makes up the human brain, so that computers can understand
things and make decisions in a human-like manner. The artificial neural network is designed by
programming computers to behave simply like interconnected brain cells.

There are around 100 billion neurons in the human brain. Each neuron has connections to
somewhere in the range of 1,000 to 100,000 other neurons. In the human brain, data is stored in a
distributed manner, and we can extract more than one piece of this data from our memory in
parallel when necessary. We can say that the human brain is made up of incredibly powerful parallel processors.

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we first have to understand
what a neural network consists of: a large number of artificial neurons, termed units,
arranged in a sequence of layers. Let us look at the various
types of layers available in an artificial neural network.

Artificial Neural Network primarily consists of three layers:


1. Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the programmer.

2. Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the calculations needed to find
hidden features and patterns.

3. Output Layer:

The input goes through a series of transformations using the hidden layer, which finally results in
output that is conveyed using this layer.

The artificial neural network takes the inputs, computes their weighted sum, and adds a
bias. This computation is represented in the form of a transfer function.

The weighted total is then passed as input to an activation function, which produces the output.
Activation functions decide whether a node should fire or not. Only the nodes that fire make it to the
output layer. There are distinct activation functions available that can be chosen according to the sort of
task we are performing.

Advantages of Artificial Neural Network (ANN)


a. Parallel processing capability:

Artificial neural networks perform many computations in parallel, so they can carry out more than one task simultaneously.
b. Storing data on the entire network:

Unlike traditional programming, where data is stored in a database, the information in an ANN is stored across the
entire network. The disappearance of a few pieces of data in one place doesn't prevent the network from working.

c. Capability to work with incomplete knowledge:

After training, an ANN may produce output even with incomplete input. The loss of
performance depends on the significance of the missing data.

d. Having a memory distribution:

For an ANN to be able to adapt, it is important to select representative examples and to train the network
toward the desired output by demonstrating these examples to it. The success of the
network is directly proportional to the chosen instances; if an event cannot be shown to the network in
all its aspects, the network can produce false output.

e. Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature
makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:

a. Assurance of proper network structure:

There is no particular guideline for determining the structure of artificial neural networks. The
appropriate network structure is accomplished through experience, trial, and error.

b. Unrecognized behavior of the network:

This is the most significant issue with ANNs. When an ANN produces a solution, it does not provide
insight concerning why and how it was reached, which decreases trust in the network.
What Machine Learning can and cannot do
Machine Learning is a subfield of Artificial Intelligence and a part of the Data Science field.

Machine Learning (ML) is considered a subset of AI. You can even say that ML is an implementation of
AI. So whenever you think AI, you can think of applying ML there. As the name makes it pretty clear,
ML is used in situations where we want the machine to learn from the huge amounts of data we give it,
and then apply that knowledge on new pieces of data that streams into the system.

There are various algorithms in ML which could be used for prediction problems, classification
problems, regression problems, and more. You might have heard of algorithms such as simple linear
regression, polynomial regression, support vector regression, decision tree regression, random forest
regression, K-nearest neighbors, and the like.
The main idea of ML is that you compile a data set, feed it to ML algorithms to learn,
and then the ML algorithms make predictions or recommendations based on the data
analyzed. This contrasts with conventional algorithms, which are coded by hand and cannot
learn: when they receive various inputs, they always respond in the same way.

Machine Learning can do:


 Predictive maintenance

 Image recognition

 Voice recognition

 Text recognition

 Recommendations

 Classification

 Text to Speech

 Diagnose pneumonia from ~10,000 labeled images

Machine Learning cannot do:


 A primary example is the problem of learning a language purely from hearing verbal utterances

 Market analysis (due to computational limits)

 Translations

 Generate human-like messages

 Human intention recognition

 Emotion recognition

 Gesture recognition

 Diagnose pneumonia from a textbook image (too few examples)

 Clean the data


JOBs in AI
Career Paths in Artificial Intelligence

The list below includes jobs in AI but also some positions that work closely with those in AI roles.

Career Path: Description

Big Data Analyst: Find meaningful patterns in data by looking at the past to help make predictions about the future.

User Experience (UX) Designer/Developer: Work with products to help customers understand their function and use them easily. Understand how people use equipment and how computer scientists can apply that understanding to produce more advanced software.

Natural Language Processing Engineer: Explore the connection between human language and computational systems; this includes working on projects like chatbots and virtual assistants.

Researcher: Work with computer science and AI research to discover ways to advance AI technology.

Research Scientist: Expert in applied math, machine learning, deep learning, and computational statistics. Expected to have an advanced degree in computer science or an advanced degree in a related field supported by experience.

Software Engineer: Develop programs in which AI tools function. The role may also be referred to as a Programmer or Artificial Intelligence Developer.

AI Engineer: Build AI models from scratch and help product managers and stakeholders understand results.

Data Mining and Analysis: Find anomalies, patterns, etc. within large data sets to predict outcomes.

Machine Learning Engineer: Use data to design, build and manage ML software applications.

Data Scientist: Collect, analyze and interpret data sets.

Business Intelligence (BI) Developer: Analyze complex data sets to identify business and market trends.

Big Data Engineer/Architect: Develop systems that allow businesses to communicate and collect data.

Robotics Engineer: Design, build and test robots or robotic systems.

Computer Vision Engineer: Develop and work on projects and systems involving visual data.

Companies Currently Hiring AI Positions

 Wells Fargo — Sr. Conversational AI Content Strategists


 Nike — Data Scientist, Experience Research & Analytics
 Amazon Web Services — Machine Learning Engineer
 Apple — AI/ML Software Engineer
 Spotify — Research Scientist – Language Technologies
 Microsoft — Senior Researcher

As you can see from the list above, there are many different types of positions within artificial
intelligence. Some of the most common AI-related job titles, courtesy of Glassdoor, include:

 Software engineer
 Data scientist
 Software development engineer
 Research scientist

In general, tech companies (both software and hardware) dominate the list of companies that are
hiring AI professionals. But a quick search on any reputable job listing site will give you a list of
positions that span a variety of industries. Here is a sample of some of the top companies that are
hiring for these types of AI roles:

 Deloitte
 Amazon
 Accenture
 H&R Block
 IBM
 PwC
 Fidelity Investments
 PayPal
 Major League Baseball
 Harvard Business School
 IKEA
