UNIT 1 All Notes
Introduction:
Machine Learning (ML) is a subfield of Artificial intelligence (AI) that focuses on developing
algorithms and statistical models that allow computers to learn from and make predictions or
decisions based on data, without being explicitly programmed for specific tasks. The
fundamental idea behind machine learning is to enable computers to learn from experience,
improve over time, and perform tasks more accurately as they are exposed to more data.
In traditional programming, humans explicitly write rules and instructions for computers to
follow. However, in machine learning, the approach is different. Instead of coding explicit
rules, the computer is trained on large datasets, and the algorithms automatically learn patterns
and relationships within the data. These learned patterns are used to make predictions, classify
new data, or solve complex problems.
Supervised Learning: In this approach, the algorithm is trained on a labeled dataset, where
the input data and corresponding correct outputs are provided. The model learns to map inputs
to outputs, allowing it to make predictions on new, unseen data.
Unsupervised Learning: Here, the algorithm is given an unlabeled dataset and is tasked with
finding patterns, structures, or relationships within the data without explicit guidance.
Clustering and dimensionality reduction are common tasks in unsupervised learning.
Semi-Supervised Learning: This combines elements of both supervised and unsupervised
learning, where the algorithm is trained on a partially labeled dataset.
Reinforcement Learning: In this paradigm, an agent interacts with an environment and learns
to take actions to maximize rewards. The agent receives feedback in the form of rewards or
penalties, allowing it to learn through trial and error.
Machine learning is a branch of artificial intelligence (AI) and computer science which
focuses on the use of data and algorithms to imitate the way that humans learn, gradually
improving its accuracy.
Machine learning is an important component of the growing field of data science. Through the
use of statistical methods, algorithms are trained to make classifications or predictions, and to
uncover key insights in data mining projects. These insights subsequently drive decision
making within applications and businesses, ideally impacting key growth metrics. As big data
continues to expand and grow, the market demand for data scientists will increase. They will
be required to help identify the most relevant business questions and the data to answer them.
Machine learning algorithms are typically created using frameworks that accelerate solution
development, such as TensorFlow and PyTorch.
Supervised learning
Supervised learning, also known as supervised machine learning, is defined by its use of
labeled datasets to train algorithms to classify data or predict outcomes accurately. As input
data is fed into the model, the model adjusts its weights until it has been fitted appropriately.
This occurs as part of the cross-validation process to ensure that the model avoids overfitting or underfitting. Supervised learning helps organizations solve a variety of real-world problems at scale, such as classifying spam into a separate folder from your inbox. Some methods used in supervised learning include neural networks, naïve Bayes, linear regression, logistic regression, random forest, and support vector machines (SVM).
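A minimal sketch of this workflow with scikit-learn is shown below; the choice of dataset, the train/test split, and logistic regression as the classifier are assumptions made purely for illustration.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Labeled dataset: X holds the input features, y the known correct outputs
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit the model on the labeled training data, then evaluate it on unseen data
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print('Test accuracy:', model.score(X_test, y_test))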
Unsupervised learning
Unsupervised learning, also known as unsupervised machine learning, uses machine learning algorithms to analyze and cluster unlabeled datasets. These algorithms discover hidden patterns or data groupings without the need for human intervention. This method's ability to discover similarities and differences in information makes it ideal for exploratory data analysis, cross-selling strategies, customer segmentation, and image and pattern recognition. It's also used to
reduce the number of features in a model through the process of dimensionality reduction.
Principal component analysis (PCA) and singular value decomposition (SVD) are two common
approaches for this. Other algorithms used in unsupervised learning include neural networks,
k-means clustering, and probabilistic clustering methods.
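As a brief illustration, the sketch below assumes scikit-learn and applies k-means clustering and PCA to synthetic, unlabeled data; the two-blob layout and the cluster count are invented for the example.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Unlabeled data: two Gaussian blobs in 5 dimensions (no target values supplied)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(4, 1, (50, 5))])

# Discover groupings without any labels
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Reduce the 5 features to 2 principal components, e.g. for visualization
X_2d = PCA(n_components=2).fit_transform(X)
print(labels[:10], X_2d.shape)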
Semi-supervised learning
Semi-supervised learning offers a middle ground between supervised and unsupervised learning: the algorithm is trained on a dataset that is only partially labelled, so a small amount of labelled data guides learning on the larger unlabelled portion.
Here are just a few examples of machine learning you might encounter every day:
Speech recognition: It is also known as automatic speech recognition (ASR), computer speech
recognition, or speech-to-text, and it is a capability which uses natural language processing
(NLP) to translate human speech into a written format. Many mobile devices incorporate
speech recognition into their systems to conduct voice search—e.g. Siri—or improve
accessibility for texting.
Customer service: Online chatbots are replacing human agents along the
customer journey, changing the way we think about customer engagement across websites and
social media platforms. Chatbots answer frequently asked questions (FAQs) about topics such
as shipping, or provide personalized advice, cross-selling products or suggesting sizes for users.
Examples include virtual agents on e-commerce sites; messaging bots, using Slack and
Facebook Messenger; and tasks usually done by virtual assistants and voice assistants.
Recommendation engines: Using past consumption behavior data, AI algorithms can help to
discover data trends that can be used to develop more effective cross-selling strategies. This
approach is used by online retailers to make relevant product recommendations to customers
during the checkout process.
Technological singularity
While this topic garners a lot of public attention, many researchers are not concerned with the
idea of AI surpassing human intelligence in the near future. Technological singularity is also
referred to as strong AI or superintelligence. Philosopher Nick Bostrom defines
superintelligence as “any intellect that vastly outperforms the best human brains in practically
every field, including scientific creativity, general wisdom, and social skills.” Despite the fact
that superintelligence is not imminent in society, the idea of it raises some interesting questions
as we consider the use of autonomous systems, like self-driving cars. It’s unrealistic to think
that a driverless car would never have an accident, but who is responsible and liable under
those circumstances? Should we still develop autonomous vehicles, or do we limit this
technology to semi-autonomous vehicles which help people drive safely? The jury is still out
on this, but these are the types of ethical debates that are occurring as new, innovative AI
technology develops.
AI impact on jobs
While a lot of public perception of artificial intelligence centers around job losses, this concern
should probably be reframed. With every disruptive, new technology, we see that the market
demand for specific job roles shifts. For example, when we look at the automotive industry,
many manufacturers, like GM, are shifting to focus on electric vehicle production to align with
green initiatives. The energy industry isn’t going away, but the source of energy is shifting
from a fuel economy to an electric one.
In a similar way, artificial intelligence will shift the demand for jobs to other areas. There will
need to be individuals to help manage AI systems. There will still need to be people to address
more complex problems within the industries that are most likely to be affected by job demand
shifts, such as customer service. The biggest challenge with artificial intelligence and its effect
on the job market will be helping people to transition to new roles that are in demand.
Privacy
Privacy tends to be discussed in the context of data privacy, data protection, and data security.
These concerns have prompted policymakers to make further strides in recent years. For example,
in 2016, GDPR legislation was created to protect the personal data of people in the European
Union and European Economic Area, giving individuals more control of their data. In the
United States, individual states are developing policies, such as the California Consumer
Privacy Act (CCPA), which was introduced in 2018 and requires businesses to inform
consumers about the collection of their data. Legislation such as this has forced companies to
rethink how they store and use personally identifiable information (PII). As a result,
investments in security have become an increasing priority for businesses as they seek to
eliminate any vulnerabilities and opportunities for surveillance, hacking, and cyberattacks.
Bias and discrimination
Instances of bias and discrimination across a number of machine learning systems have raised many ethical questions regarding the use of artificial intelligence. How can we safeguard against bias and discrimination when the training data itself may be generated by biased human processes? While companies typically have good intentions for their automation efforts, Amazon's experiment with automated recruiting highlights some of the unforeseen consequences of incorporating AI into hiring practices. In their effort to automate and simplify a process, Amazon unintentionally discriminated against job candidates by gender for technical roles, and the company ultimately had to scrap the project.
Traditional Programming
Machine Learning
This is the basic difference between traditional programming and machine learning: in traditional programming one has to manually formulate/code the rules, while in machine learning the algorithms automatically formulate the rules from the data, without anyone programming the logic. This is what makes it so powerful.
For example, if you feed in customer demographics, transactions as input and the observed
output if they churned or not in the past, the algorithm will formulate the program which would
know how to predict if someone would churn or not. That program is called a model.
Companies, in general, have a lot of such input and output data, and they can feed it to the algorithms to create predictive models. For example, feed in customer information/loan transactions (input) and how many customers defaulted on the loan (observed output), and the algorithm will create a model to predict who will default on the loan.
Example:
A product manager can use this framework to predict business outcomes in any situation where you have input and historical output data:
1. Formulate the business question as a condition to predict (e.g., will a customer pay late?).
2. Identify the input data (e.g., customer demographics, bills).
3. Identify the historically observed output (i.e., data samples for when the condition is true and for when it's false).
For instance, if you want to predict who will pay the bills late, identify the input (customer demographics, bills) and the output (pay late or not), and let machine learning use this data to create your model.
Explanation:
So the problem a company needs to solve to create a predictive model is to identify data samples
for when the condition (churn) is true and when the condition (churn) is false and pass this data
to a predictive algorithm to create the model. That is a simplistic way to look at predicting
outcomes.
This is how you, as a product manager, can treat your business data as a financial asset: formulate a question that has good business value, assemble the historical data that has samples of both positive and negative past outcomes/conditions, and point the algorithm to your data so it can learn powerful rules that can be used to predict future business outcomes.
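A hedged sketch of this input/historical-output workflow follows; the file name customer_history.csv, the column name churned, and the choice of a random forest classifier are hypothetical.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv('customer_history.csv')        # historical records (hypothetical file)
X = df.drop(columns=['churned'])                # inputs: demographics, transactions (assumed numeric)
y = df['churned']                               # observed output: churned or not

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# The fitted model is the 'program' that predicts churn for new customers
print('Held-out accuracy:', model.score(X_test, y_test))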
ML vs AI vs Data Science:
The following are the important differences between descriptive data mining and predictive
data mining −
1. Descriptive data mining tries to understand what happened in the past by analyzing the stored data; predictive data mining tries to understand what could happen in the future using past data analysis.
2. Descriptive data mining provides accurate data; predictive data mining may not give accurate results.
3. Descriptive data mining provides standard reporting as well as ad-hoc reporting; predictive data mining is used in predictive modelling, forecasting, simulation and alerts.
4. Descriptive data mining uses data aggregation and data mining; predictive data mining uses statistics and forecasting methods.
5. Descriptive data mining uses a reactive approach; predictive data mining uses a proactive approach.
6. Descriptive data mining answers questions such as "What happened?", "Where is the problem?" and "What is the frequency of this problem?"; predictive data mining answers questions such as "What would happen next?", "What would be the outcome if the trends continue?" and "What actions need to be taken?"
The most significant difference between the two is that descriptive data mining is used to know what happened in the past by using historical data, while predictive data mining is used to know what could happen in the future by using that historical data.
Descriptive analysis is used to understand the past and predictive analysis is used to predict
the future. Both of these concepts are important in machine learning because a clear
understanding of the problem and its implications is the best way to make the right decisions.
Predictive Analysis: Predictive analytics is a key idea in machine learning. After building a machine learning model with the help of descriptive analysis, the next objective is to predict the system's future behaviour given some starting conditions. Predictive analytics is applied to find and establish the criteria needed to anticipate a specific condition in time. For instance, a self-driving car's object detector can be quite accurate at spotting an obstruction in time, but another model must act on that detection to reduce the danger of harm and increase safety.
Types of Learning:
1.Supervised Learning:
Supervised learning, as the name indicates, has the presence of a supervisor acting as a teacher. Basically, supervised learning is when we teach or train the machine using data that is well labelled, which means some data is already tagged with the correct answer. After that, the machine is provided with a new set of examples (data) so that the supervised learning algorithm analyses the training data (the set of training examples) and produces a correct outcome from the labelled data.
Example: suppose you are given a basket filled with different kinds of fruits. Now the first
step is to train the machine with all the different fruits one by one like this:
• If the shape of the object is rounded, has a depression at the top, and is red in color, then it will be labelled as Apple.
• If the shape of the object is a long curving cylinder with a green-yellow color, then it will be labelled as Banana.
Now suppose that, after training, you are given a new, separate fruit (say a banana) from the basket and asked to identify it. The machine has already learned from the previous data and this time has to use that knowledge wisely. It will first classify the fruit by its shape and color, confirm the fruit name as BANANA, and put it in the Banana category. Thus the machine learns from the training data (the basket containing fruits) and then applies that knowledge to the test data (the new fruit).
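As a toy illustration of the fruit example, the sketch below trains a small decision tree on hand-made shape/color features; the feature encoding and the tiny training set are invented purely for illustration.

from sklearn.tree import DecisionTreeClassifier

# Features: [is_round, has_top_depression, color] with color 0 = red, 1 = green-yellow
X_train = [[1, 1, 0],   # apple
           [1, 1, 0],   # apple
           [0, 0, 1],   # banana
           [0, 0, 1]]   # banana
y_train = ['Apple', 'Apple', 'Banana', 'Banana']

clf = DecisionTreeClassifier().fit(X_train, y_train)

# A new, unseen fruit: long, no depression, green-yellow -> expected 'Banana'
print(clf.predict([[0, 0, 1]]))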
Supervised learning is classified into two categories of algorithms: classification (the output is a category, such as spam or not spam) and regression (the output is a real value, such as a price).
Advantages:-
• Supervised learning allows collecting data and produces data output from
previous experiences.
• Helps to optimize performance criteria with the help of experience.
• Supervised machine learning helps to solve various types of real-world
computation problems.
• It performs classification and regression tasks.
• It allows estimating or mapping the result to a new sample.
• We have complete control over choosing the number of classes we want in the
training data.
Disadvantages:-
• Classifying big data can be challenging.
• Training for supervised learning needs a lot of computation time. So, it requires a
lot of time.
• Supervised learning cannot handle all complex tasks in Machine Learning.
• Computation time is vast for supervised learning.
• It requires a labelled data set.
• It requires a training process.
2.Unsupervised learning:
Unsupervised learning is the training of a machine using information that is neither classified
nor labeled and allowing the algorithm to act on that information without guidance. Here the
task of the machine is to group unsorted information according to similarities, patterns, and
differences without any prior training of data.
Unlike supervised learning, no teacher is provided, which means no training will be given to the machine. Therefore the machine is restricted to finding the hidden structure in unlabeled data by itself.
For instance, suppose the machine is given an image containing both dogs and cats that it has never seen before. The machine has no idea about the features of dogs and cats, so it can't categorize them as 'dogs' and 'cats'. But it can categorize them according to their similarities, patterns, and differences, i.e., it can easily divide the picture into two parts: the first part may contain all the pics having dogs in them, and the second part may contain all the pics having cats in them. Here the machine has not learned anything beforehand, which means there is no training data or examples.
It allows the model to work on its own to discover patterns and information that was
previously undetected. It mainly deals with unlabelled data.
Unsupervised learning is classified into two categories of algorithms: clustering (grouping similar data points) and association (discovering rules that describe relationships between variables).
Common unsupervised learning techniques:-
1. Hierarchical clustering
2. K-means clustering
3. Principal Component Analysis
4. Singular Value Decomposition
5. Independent Component Analysis
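For reference, a brief sketch of hierarchical (agglomerative) clustering, the first technique in the list above, is given here; the toy 2-D points and the choice of two clusters are assumptions.

from sklearn.cluster import AgglomerativeClustering

X = [[1, 2], [1, 4], [1, 0],
     [10, 2], [10, 4], [10, 0]]

# Merge points bottom-up until two clusters remain
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
print(labels)   # two groups found without any labels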
3. Semi-Supervised learning:
Semi-supervised learning combines elements of supervised and unsupervised learning: the algorithm is trained on a partially labelled dataset, using the small labelled portion to guide learning on the much larger unlabelled portion.
4.Reinforcement learning:
Reinforcement learning is a machine learning training method based on rewarding desired
behaviors and/or punishing undesired ones. In general, a reinforcement learning agent is able
to perceive and interpret its environment, take actions and learn through trial and error.
The agent is programmed to seek the long-term, maximum overall reward; these long-term goals help prevent it from stalling on lesser, short-term goals. With time, the agent learns to avoid the negative outcomes and seek the positive ones. This learning method has been adopted in artificial intelligence (AI) as a way of directing unsupervised machine learning through rewards and penalties.
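A minimal tabular Q-learning sketch of this trial-and-error idea follows; the one-dimensional 'corridor' environment, the reward of 1 at the goal, and the hyper-parameters are all assumptions made for illustration.

import random

n_states, goal = 5, 4                         # states 0..4, reward given at state 4
Q = [[0.0, 0.0] for _ in range(n_states)]     # action 0 = move left, action 1 = move right
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    s = 0
    while s != goal:
        if random.random() < epsilon or Q[s][0] == Q[s][1]:
            a = random.randrange(2)                   # explore (or break ties randomly)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1         # exploit the better-looking action
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == goal else 0.0            # reward only for reaching the goal
        # Update the estimate of long-term return for this (state, action) pair
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print([round(max(q), 2) for q in Q])   # values grow toward the rewarding state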
Models of Machine learning:
1.Geometric Model Machine Learning:
Machine learning is a field of artificial intelligence that allows machines to learn from data and
improve their performance without being explicitly programmed. One approach to machine
learning is to use geometric models, which can help us represent data in a mathematical form
that makes it easier to analyze and understand.
2.Probabilistic modelling:
• Probabilistic modeling is a statistical approach that uses the effect of random occurrences or
actions to forecast the possibility of future results. It is a quantitative modeling method that
projects several possible outcomes that might even go beyond what has happened recently.
• Probabilistic modeling considers new situations and a wide range of uncertainty while not
underestimating dangers. The three primary building blocks of probabilistic modeling are
adequate probability distributions, correct use of input information for these distribution
functions, and proper accounting for the linkages and interactions between variables. The
downside of the probabilistic modeling technique is that it needs meticulous development, a process that depends on several assumptions and a large amount of input data.
Example:
• Weather and traffic are two everyday phenomena that are both unpredictable and appear to
have a link with one another. You are all aware that if the weather is cold and snow is falling, traffic will be quite difficult and you will be delayed for an extended period of time. We could even go so far as to predict a substantial association between snowy weather and a higher number of traffic mishaps.
• Based on available data, we can develop a basic mathematical model of traffic accidents as
a function of snowy weather to aid in the analysis of our hypothesis. All of these models are
based on probabilistic modeling. It is one of the most effective approaches for assessing
weather and traffic relationships.
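A small numerical sketch of this idea using Bayes' rule follows; every probability below is invented purely for illustration.

p_snow = 0.10                  # P(snowy day)
p_accident_given_snow = 0.30   # P(accident | snow)
p_accident_given_clear = 0.05  # P(accident | no snow)

# Total probability of an accident on any given day
p_accident = p_accident_given_snow * p_snow + p_accident_given_clear * (1 - p_snow)

# Posterior: how likely was it snowing, given that an accident occurred?
p_snow_given_accident = p_accident_given_snow * p_snow / p_accident
print(round(p_accident, 3), round(p_snow_given_accident, 3))   # 0.075 and 0.4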
3.Logical Model:
Logical models are particularly useful when dealing with structured data, such as knowledge
graphs, where the data is organized in a hierarchical or network structure.
• One of the most well-known logical models in machine learning is the decision tree.
Decision trees are a popular classification algorithm that uses a tree-like model of
decisions and their possible consequences to classify data points. Each internal node
in the decision tree represents a decision based on a feature value, and each leaf node
represents a class label.
• Another logical model is the rule-based system, which uses if-then rules to represent
knowledge. Rule-based systems are particularly useful in expert systems, where
human expertise is encoded as a set of rules.
• In addition to these models, there are other logical models used in machine learning,
such as Bayesian networks, which use directed acyclic graphs to model the
dependencies between random variables, and fuzzy logic, which deals with uncertain
or vague information by assigning degrees of truth to propositions.
• Overall, logical models provide a powerful way to represent and reason about
structured data in machine learning. By encoding knowledge in a logical form, these
models can provide interpretable and transparent explanations for their predictions,
which is essential in many real-world applications.
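As a brief illustration of a rule-based logical model, the sketch below encodes a few hypothetical if-then rules for loan approval; the thresholds and the rule set are invented, not taken from any real expert system.

def approve_loan(income, credit_score, has_default):
    # Rule 1: an existing default always leads to rejection
    if has_default:
        return 'reject'
    # Rule 2: a high credit score and adequate income lead to approval
    if credit_score >= 700 and income >= 30000:
        return 'approve'
    # Rule 3: otherwise refer the case to a human expert
    return 'refer'

print(approve_loan(income=45000, credit_score=720, has_default=False))  # approve
print(approve_loan(income=45000, credit_score=650, has_default=False))  # refer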
In machine learning (ML), grouping and grading models involve various techniques to organize, categorize, and evaluate the performance of models. Here's an overview of these concepts:
Clustering: Grouping similar data points together based on certain features, often used for
unsupervised learning. Examples include k-means clustering, hierarchical clustering, and
DBSCAN.
This means that the choice of grouping and grading methods depends on the specific problem
you're trying to solve and the characteristics of your data. It's essential to have a good
understanding of your data, the problem domain, and the available evaluation techniques to
make informed decisions.
Parametric Learning
Examples:
• Logistic Regression
• Linear Discriminant Analysis
• Perceptron
• Naïve Bayes
• Simple Neural Networks
Advantages:
• Simpler
• Speed
• Less Data
Disadvantages:
• Constrained
• Limited Complexity
• Poor Fit
Non-Parametric Learning
Examples:
• K-Nearest Neighbors
• Decision Tree
• Support Vector Machine
Advantages:
• Flexibility
• Power
• Performance
Disadvantages:
• More Data
• Slower
• Overfitting
Non-Parametric vs Parametric:
• Non-parametric methods can be used on ordinal and nominal scale data, while parametric methods are used mainly on interval and ratio scale data.
• Example of a non-parametric model: K-Nearest Neighbours. Examples of parametric models: Logistic regression, SVM.
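To make the contrast concrete, the hedged sketch below fits one parametric model (logistic regression) and one non-parametric model (k-nearest neighbours) from the lists above on the same scikit-learn toy dataset; the dataset choice and hyper-parameters are assumptions.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Parametric: learns a fixed number of weights, fast and data-efficient
param_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Non-parametric: keeps the training data and compares new points against it
nonparam_model = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

print('Logistic regression accuracy:', param_model.score(X_te, y_te))
print('k-NN accuracy:', nonparam_model.score(X_te, y_te))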
Feature Transformation:
Dimensionality Reduction Techniques:
The number of input features, variables, or columns present in a given dataset is known as
dimensionality, and the process to reduce these features is called dimensionality reduction.
In many cases a dataset contains a huge number of input features, which makes the predictive modeling task more complicated. Because it is very difficult to visualize or make predictions for a training dataset with a high number of features, dimensionality reduction techniques need to be used in such cases.
A dimensionality reduction technique can be defined as "a way of converting a higher-dimensional dataset into a lower-dimensional dataset while ensuring that it provides similar information." These techniques are widely used in machine learning to obtain a better-fitting predictive model while solving classification and regression problems.
It is commonly used in the fields that deal with high-dimensional data, such as speech
recognition, signal processing, bioinformatics, etc. It can also be used for data visualization,
noise reduction, cluster analysis, etc.
• Dimensionality reduction is an important technique in data analysis and machine learning
that allows us to reduce the number of variables in a dataset while retaining the most important
information. By reducing the number of variables, we can simplify the problem, improve
computational efficiency, and avoid overfitting.
import pandas as pd
import numpy as np

# Here we are using an inbuilt dataset of scikit-learn
from sklearn.datasets import load_breast_cancer

# instantiating
cancer = load_breast_cancer(as_frame=True)

# creating dataframe
df = cancer.frame

# checking shape
print('Original Dataframe shape :', df.shape)

# Input features
X = df[cancer['feature_names']]
print('Inputs Dataframe shape :', X.shape)

Output:
Original Dataframe shape : (569, 31)
Inputs Dataframe shape : (569, 30)
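A possible continuation of the example above reduces these 30 input features with PCA; standardizing first and keeping 2 components are assumptions, not part of the original snippet.

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# X is the input-features dataframe created above
X_scaled = StandardScaler().fit_transform(X)             # put all features on the same scale
X_reduced = PCA(n_components=2).fit_transform(X_scaled)
print('Reduced shape :', X_reduced.shape)                # (569, 2)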
For example, we have two classes and we need to separate them efficiently. Classes can have
multiple features. Using only a single feature to classify them may result in some overlapping
as shown in the below figure. So, we will keep on increasing the number of features for proper
classification.
For example, suppose we have two sets of data points belonging to two different classes that
we want to classify. As shown in the given 2D graph, when the data points are plotted on the
2D plane, there’s no straight line that can separate the two classes of the data points completely.
Hence, in this case, LDA (Linear Discriminant Analysis) is used which reduces the 2D graph
into a 1D graph in order to maximize the separability between the two classes.
Here, Linear Discriminant Analysis uses both the axes (X and Y) to create a new axis and
projects data onto a new axis in a way to maximize the separation of the two categories and
hence, reducing the 2D graph into a 1D graph.
In the graph, it can be seen that a new axis (in red) is generated and plotted in the 2D graph
such that it maximizes the distance between the means of the two classes and minimizes the
variation within each class.
In simple terms, this newly generated axis increases the separation between the data points of
the two classes. After generating this new axis using the above-mentioned criteria, all the data
points of the classes are plotted on this new axis and are shown in the figure given below.
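A brief LDA sketch along these lines, assuming scikit-learn and a two-class dataset, is shown below; the dataset choice is an assumption made for illustration.

from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_breast_cancer(return_X_y=True)    # two classes, 30 features
lda = LinearDiscriminantAnalysis(n_components=1)
X_1d = lda.fit_transform(X, y)                # each sample projected onto the new discriminant axis
print(X_1d.shape)                             # (569, 1)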