Machine Learning Basics
1. General Introduction
Compiled for
Ph.D. Course Work
APSU, Rewa, MP, India
Outline
Artificial Intelligence
Machine Learning: Modern Approaches to Artificial Intelligence
Machine Learning Problems
Machine Learning Resources
Our Course
Machine Learning Basics: 1. General Introduction
Intelligence
Intelligence
Ability to solve problems
Examples of Intelligent Behaviors or
Tasks
Classification of texts based on content
Heart disease diagnosis
Chess playing
Example 1: Text Classification (1)
Huge oil platforms dot the Gulf like
beacons -- usually lit up like Christmas
trees at night.
One of them, sitting astride the
Rostam offshore oilfield, was all but
blown out of the water by U.S.
warships on Monday.
The Iranian platform, an unsightly
mass of steel and concrete, was a
three-tier structure rising 200 feet
(60 metres) above the warm waters of
the Gulf until four U.S. destroyers
pumped some …
Human Judgment: Crude, Ship
Example 1: Text Classification (2)
The Federal Reserve is expected to
enter the government securities
market to supply reserves to the
banking system via system repurchase
agreements, economists said.
Most economists said the Fed would
execute three-day system
repurchases to meet a substantial
need to add reserves in the current
maintenance period, although some
said a more …
Human Judgment: Money-fx
Example 2: Disease Diagnosis (1)
Patient 1’s data
Age: 67
Sex: male
Chest pain type: asymptomatic
Resting blood pressure: 160 mm Hg
Serum cholesterol: 286 mg/dl
Fasting blood sugar: < 120 mg/dl
…
Doctor Diagnosis: Presence
Example 2: Disease Diagnosis (2)
Patient 2’s data
Age: 63
Sex: male
Chest pain type: typical angina
Resting blood pressure: 145 mm Hg
Serum cholesterol: 233 mg/dl
Fasting blood sugar: > 120 mg/dl
…
Doctor Diagnosis: Absence
Example 3: Chess Playing
Chess Game
Two players take turns moving, subject to
the rules of the game
Characteristics
To achieve a goal: win the game
Interactive
Artificial Intelligence
Artificial Intelligence
Ability of machines to carry out
intelligent tasks
Intelligent Programs
Programs conducting specific intelligent
tasks
Input -> Intelligent Processing -> Output
Example 1: Text Classifier (1)
Text File: Huge oil platforms dot the Gulf like beacons -- usually lit up …
Preprocessing -> …, fiber = 0, …, huge = 1, …, oil = 1, platforms = 1, …
Classification -> Crude = 1, …, Money-fx = 0, …, Ship = 1
Example 1: Text Classifier (2)
Text File: The Federal Reserve is expected to enter the government …
Preprocessing -> …, enter = 1, expected = 1, …, federal = 1, …, oil = 0, …
Classification -> Crude = 0, …, Money-fx = 1, …, Ship = 0
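The preprocessing step in the two text classifier examples can be sketched as a binary bag-of-words mapping: each vocabulary word becomes a feature that is 1 if the word occurs in the text and 0 otherwise. This is only a minimal sketch. The function name `binary_bag_of_words`, the simple tokenizer, and the seven-word vocabulary are assumptions made for illustration; the slides do not say how the vocabulary is built.

```python
import re

def binary_bag_of_words(text, vocabulary):
    """Map a text to {word: 0 or 1}, where 1 means the word occurs in the text."""
    # Lowercase and keep runs of letters only; a deliberately simple tokenizer.
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {word: int(word in tokens) for word in vocabulary}

# Hypothetical vocabulary; a real one would be built from the training texts.
vocabulary = ["enter", "expected", "federal", "fiber", "huge", "oil", "platforms"]

text_1 = "Huge oil platforms dot the Gulf like beacons -- usually lit up"
text_2 = "The Federal Reserve is expected to enter the government"

features_1 = binary_bag_of_words(text_1, vocabulary)
features_2 = binary_bag_of_words(text_2, vocabulary)

print(features_1)
print(features_2)
```

With this vocabulary, the first text sets huge, oil, and platforms to 1, and the second sets enter, expected, and federal to 1, matching the feature values on the two slides.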
Example 2: Disease Classifier (1)
Preprocessed data of patient 1
Age = 67
Sex = 1
Chest pain type = 4
Resting blood pressure = 160
Serum cholesterol = 286
Fasting blood sugar = 0
…
Classification -> Presence = 1
Example 2: Disease Classifier (2)
Preprocessed data of patient 2
Age = 63
Sex = 1
Chest pain type = 1
Resting blood pressure = 145
Serum cholesterol = 233
Fasting blood sugar = 1
…
Classification -> Presence = 0
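The step from the raw patient records to the preprocessed records can be sketched as a lookup of categorical values in code tables. This is a minimal sketch: the code tables contain only the values that appear for the two example patients (male = 1, typical angina = 1, asymptomatic = 4, fasting blood sugar above 120 mg/dl = 1), and the function and field names are hypothetical.

```python
# Code tables limited to the values shown for the two example patients;
# codes for other categories are not given in the slides and are left out.
SEX_CODES = {"male": 1}
CHEST_PAIN_CODES = {"typical angina": 1, "asymptomatic": 4}

def encode_patient(record):
    """Turn a raw patient record (a dict) into a list of numeric features."""
    return [
        record["age"],
        SEX_CODES[record["sex"]],
        CHEST_PAIN_CODES[record["chest_pain_type"]],
        record["resting_blood_pressure"],
        record["serum_cholesterol"],
        # Binary flag: 1 if fasting blood sugar is above 120 mg/dl, else 0.
        int(record["fasting_blood_sugar_above_120"]),
    ]

patient_1 = {
    "age": 67, "sex": "male", "chest_pain_type": "asymptomatic",
    "resting_blood_pressure": 160, "serum_cholesterol": 286,
    "fasting_blood_sugar_above_120": False,
}
patient_2 = {
    "age": 63, "sex": "male", "chest_pain_type": "typical angina",
    "resting_blood_pressure": 145, "serum_cholesterol": 233,
    "fasting_blood_sugar_above_120": True,
}

encoded_1 = encode_patient(patient_1)
encoded_2 = encode_patient(patient_2)
print(encoded_1)
print(encoded_2)
```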
Example 3: Chess Program
Matrix representing the current board
-> Searching and evaluating
-> Best move (new matrix)
-> Opponent plays his move
-> Matrix representing the current board (the cycle repeats)
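The search-and-evaluate loop can be illustrated on a game far smaller than chess. The sketch below uses a take-away game as a stand-in, which is an assumption made purely for illustration: players alternately remove 1 to 3 stones, and whoever takes the last stone wins. The game is small enough to search exhaustively. A chess program cannot do that and would instead cut the search off at some depth and apply an evaluation function to the resulting boards.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # a player may take 1, 2, or 3 stones per turn

@lru_cache(maxsize=None)
def evaluate(stones):
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        return -1
    # A position is worth the best reply available; the opponent's value is negated.
    return max(-evaluate(stones - m) for m in MOVES if m <= stones)

def best_move(stones):
    """Search all legal moves and pick the one that is worst for the opponent."""
    legal = [m for m in MOVES if m <= stones]
    return min(legal, key=lambda m: evaluate(stones - m))

print(best_move(5))
print(best_move(7))
```

Here the state is a single number rather than a board matrix, but the structure is the same as on the slide: generate the moves, evaluate the resulting positions, and play the best one.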
AI Approach
Reasoning with Knowledge
Knowledge base
Reasoning
Traditional Approaches
Handcrafted knowledge base
Complex reasoning process
Disadvantages
Knowledge acquisition bottleneck
Machine Learning
Machine Learning (Mitchell 1997)
Learn from past experiences
Improve the performance of intelligent
programs
Definition (Mitchell 1997)
A computer program is said to learn
from experience E with respect to some
class of tasks T and performance
measure P, if its performance at tasks
in T, as measured by P, improves with
experience E
Example 1: Text Classification
Classified text files
Text file 1 -> trade
Text file 2 -> ship
… -> …
Training: classified text files -> Text classifier
New text file -> Text classifier -> class
Example 2: Disease Diagnosis
Database of medical records
Patient 1’s data -> Absence
Patient 2’s data -> Presence
… -> …
Training: database of medical records -> Disease classifier
New patient’s data -> Disease classifier -> Presence or absence
Example 3: Chess Playing
Games played:
Game 1’s move list -> Win
Game 2’s move list -> Lose
… -> …
Training: games played -> Strategy of searching and evaluating
New matrix representing the current board -> Strategy of searching and evaluating -> Best move
Examples
Text Classification
Task T
Assigning texts to a set of predefined
categories
Performance measure P
Precision and recall of each category
Training experiences E
A database of texts with their
corresponding categories
How about Disease Diagnosis?
How about Chess Playing?
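The performance measure P for text classification, precision and recall of each category, can be computed as in the sketch below. For a category, precision is the fraction of texts assigned to it that truly belong to it, and recall is the fraction of texts truly belonging to it that were assigned to it. The label lists are invented for illustration, and the sketch assumes one category per text, although a text may carry several (as in the Crude and Ship example).

```python
def precision_recall(true_labels, predicted_labels, category):
    """Return (precision, recall) of one category over paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    # By convention in this sketch, an undefined ratio is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels for five texts.
true_labels = ["crude", "ship", "money-fx", "crude", "ship"]
predicted_labels = ["crude", "crude", "money-fx", "crude", "ship"]

for category in ["crude", "ship", "money-fx"]:
    print(category, precision_recall(true_labels, predicted_labels, category))
```

In this toy data the classifier assigns one ship text to crude, so crude loses precision (2/3) while ship loses recall (1/2).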
Why Is Machine Learning Possible?
Mass Storage
More data available
Higher Computer Performance
Larger memory for handling the data
Greater computational power for
calculating and even online learning
Advantages
Alleviate Knowledge Acquisition
Bottleneck
Does not require knowledge engineers
Scalable in constructing the knowledge base
Adaptive
Adapts to changing conditions
Easy to migrate to new domains
Success of Machine Learning
Almost All the Learning Algorithms
Text classification (Dumais et al. 1998)
Gene or protein classification optionally
with feature engineering (Bhaskar et al.
2006)
Reinforcement Learning
Backgammon (Tesauro 1995)
Learning of Sequence Labeling
Speech recognition (Lee 1989)
Part-of-speech tagging (Church 1988)
Choosing the Training Experience
Choosing the Training Experience
Sometimes straightforward
Text classification, disease diagnosis
Sometimes not so straightforward
Chess playing
Other Attributes
How much does the learner control the
training experience?
How well does the training experience
represent the situations in which the
performance of the program is measured?
Choosing the Target Function
Choosing the Target Function
What type of knowledge will be learned?
How will it be used by the program?
Reducing the Learning Problem
From the problem of improving
performance P at task T with experience
E
To the problem of learning some
particular target function
Solving Real World Problems
What Is the Input?
Features representing the real world
data
What Is the Output?
Predictions or decisions to be made
What Is the Intelligent Program?
Types of classifiers, value functions, etc.
How to Learn from experience?
Learning algorithms
Feature Engineering
Representation of the Real World Data
Features: data’s attributes which may be useful
in prediction
Feature Transformation and Selection
Select a subset of the features
Construct new features, e.g.
Discretization of real-valued features
Combinations of existing features
Post Processing to Fit the Classifier
Does not change the nature of the features
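Feature selection and the two kinds of feature construction can be sketched in a few lines. This is an illustrative sketch only: the cut points 40 and 60, the feature names, and the particular combination are assumptions, not values from the course.

```python
def discretize(value, thresholds):
    """Return the index of the bin that value falls into, given sorted cut points."""
    return sum(value >= t for t in thresholds)

def add_engineered_features(record):
    """Return a copy of the record with two constructed features added."""
    out = dict(record)
    # Discretization of a real-valued feature (hypothetical cut points 40 and 60).
    out["age_bin"] = discretize(record["age"], [40, 60])
    # Combination of existing features: 1 only when both binary features are 1.
    out["male_and_high_sugar"] = record["sex"] * record["fasting_blood_sugar"]
    return out

def select_features(record, names):
    """Keep only the named subset of the features."""
    return {name: record[name] for name in names}

patient = {"age": 67, "sex": 1, "fasting_blood_sugar": 0,
           "resting_blood_pressure": 160}
engineered = add_engineered_features(patient)
selected = select_features(engineered, ["age_bin", "resting_blood_pressure"])
print(engineered)
print(selected)
```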
Intelligent Programs
Value Functions
Input: features
Output: value
Classifiers (Most Commonly Used)
Input: features
Output: a single decision
Sequence Labeling
Input: sequence of features
Output: sequence of decisions
Examples of Value Functions
Linear Regression
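Input: feature vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
Output: $f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{n} w_i x_i + b$
Logistic Regression
Input: feature vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
Output: $f(\mathbf{x}) = \frac{1}{1 + e^{-(\mathbf{w} \cdot \mathbf{x} + b)}}$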
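Both value functions can be written directly from their formulas. The sketch below uses plain Python lists for x and w; the example weights and inputs are arbitrary numbers chosen for illustration.

```python
import math

def linear_value(x, w, b):
    """Linear regression output: the weighted sum w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic_value(x, w, b):
    """Logistic regression output: the linear output squashed into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_value(x, w, b)))

# Arbitrary example parameters and input.
w = [0.5, -1.0]
b = 0.25
x = [2.0, 1.0]

print(linear_value(x, w, b))
print(logistic_value(x, w, b))
```

The logistic output equals 0.5 exactly when the linear output is 0, and moves toward 1 or 0 as the linear output grows positive or negative.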
Examples of Classifiers
Linear Classifier
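Input: feature vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$
Output: $y = \operatorname{sgn}(\mathbf{w} \cdot \mathbf{x} + b) = \operatorname{sgn}\left(\sum_{i=1}^{n} w_i x_i + b\right)$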
Rule Classifier
Decision tree
A tree with nodes representing condition
testing and leaves representing classes
Decision list
If condition 1 then class 1, else if condition 2
then class 2, else if …
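A linear classifier and a decision list can each be sketched in a few lines. The rules, the feature names, and the default class below are hypothetical, and mapping sgn(0) to +1 is a convention chosen for this sketch rather than something the slide specifies.

```python
def sgn(value):
    """Sign function; 0 is mapped to +1 by convention in this sketch."""
    return 1 if value >= 0 else -1

def linear_classify(x, w, b):
    """Linear classifier: the sign of the weighted sum w . x + b."""
    return sgn(sum(wi * xi for wi, xi in zip(w, x)) + b)

def decision_list_classify(x, rules, default):
    """Decision list: rules are tried in order and the first match wins."""
    for condition, label in rules:
        if condition(x):
            return label
    return default

# Hypothetical rules over word-presence features, for illustration only.
rules = [
    (lambda x: x["oil"] == 1, "crude"),
    (lambda x: x["federal"] == 1, "money-fx"),
]

print(linear_classify([1.0, 2.0], [1.0, -1.0], 0.5))
print(decision_list_classify({"oil": 0, "federal": 1}, rules, "other"))
```

A decision tree differs from the decision list only in shape: instead of one ordered chain of conditions, each test branches to further tests until a leaf names the class.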
Examples of Learning Algorithms
Parametric Functions or Classifiers
Given parameters of the functions or
classifier, e.g.
Linear functions or classifiers: w, b
Estimating the parameters, e.g.
Loss function optimization
Rule Learning
Condition construction
Rule induction using divide-and-conquer
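Estimating the parameters w and b by loss function optimization can be sketched with batch gradient descent on the mean squared error of a linear function. This is one possible choice of loss and optimizer, used here as an assumption for illustration; the learning rate, the number of epochs, and the toy data are likewise arbitrary.

```python
def fit_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Estimate w and b by batch gradient descent on mean squared error."""
    n_features = len(xs[0])
    m = len(xs)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error of the current parameters on one example.
            error = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i in range(n_features):
                grad_w[i] += 2 * error * x[i] / m
            grad_b += 2 * error / m
        # Step against the gradient of the loss.
        w = [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1.
xs = [[0.0], [1.0], [2.0], [3.0]]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_linear(xs, ys)
print(w, b)
```

On this noise-free data the estimates approach w = 2 and b = 1, the parameters that generated it.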
Machine Learning Problems
Methodology of Machine Learning
General methods for machine learning
Investigate which method is better under
certain conditions
Application of Machine Learning
Specific application of machine learning
methods
Investigate which features, classifier, and
method should be used to solve a particular
problem
Methodology
Theoretical
Mathematical analysis of the performance of
learning algorithms (usually under
assumptions)
Empirical
Demonstrate the empirical results of
learning algorithms on datasets
(benchmarks or real world applications)
Application
Adaptation of Learning Algorithms
Apply learning algorithms directly, or tailor
them to the specific application
Generalization
Generalize the problems and methods in
the specific application to more general
cases
Introduction Materials
Textbooks
T. Mitchell (1997). Machine Learning,
McGraw-Hill Publishers.
N. Nilsson (1996). Introduction to
Machine Learning (drafts).
Lecture Notes
T. Mitchell’s Slides
Introduction to Machine Learning
Technical Papers
Journals, e.g.
Machine Learning, Kluwer Academic
Publishers.
Journal of Machine Learning Research,
MIT Press.
Conferences, e.g.
International Conference on Machine
Learning (ICML)
Neural Information Processing Systems
(NIPS)
Others
Data Sets
UCI Machine Learning Repository
Reuters data set for text classification
Related Areas
Artificial intelligence
Knowledge discovery and data mining
Statistics
Operations research
…
What I will Talk about
Machine Learning Methods
Simple methods
Effective methods (state of the art)
Method Details
Ideas
Assumptions
Intuitive interpretations
What I won’t Talk about
Machine Learning Methods
Classical but complex and ineffective
methods (e.g., complex neural networks)
Methods not widely used
Method Details
Theoretical justification
What You will Learn
Machine Learning Basics
Methods
Data
Assumptions
Ideas
Others
Problem solving techniques
Extensive knowledge of modern
techniques
References
H. Bhaskar, D. Hoyle, and S. Singh (2006). Machine Learning: a Brief Survey and Recommendations for Practitioners. Computers in Biology and Medicine, 36(10), 1104-1125.
K. Church (1988). A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Texts. In Proc. ANLP-1988, 136-143.
S. Dumais, J. Platt, D. Heckerman, and M. Sahami (1998). Inductive Learning Algorithms and Representations for Text Categorization. In Proc. CIKM-1998, 148-155.
K. Lee (1989). Automatic Speech Recognition: The Development of the Sphinx System. Kluwer Academic Publishers.
T. Mitchell (1997). Machine Learning. McGraw-Hill Publishers.
G. Tesauro (1995). Temporal Difference Learning and TD-gammon. Communications of the ACM, 38(3), 58-68.