
Data Analytics & R Programming: Decision Tree Algorithm


S. Prabhakar, M.C.A., M.E (SEOR)
Assistant Professor, DFT Dept.,
NIFT, Chennai
Today’s Session
• What is a Decision Tree?
• Terminology associated with Decision Trees
• Visualizing a Decision Tree
• Writing a Decision tree classifier from scratch
Types of Machine Learning Algorithms
• Supervised Learning (Task Driven)
• Unsupervised Learning (Data Driven)
• Reinforcement Learning (Learn from Mistakes)
Decision Tree
• A decision tree is a supervised learning algorithm (one with a
pre-defined target variable) that is mostly used for
classification problems.
• It is a graphical representation of all possible solutions to a decision.
• Decisions are based on conditions.
• The decisions made can be easily explained.
• It works for both categorical and continuous input and output
variables.
• It splits the population or sample into two or more
homogeneous sets (or sub-populations) based on the most
significant splitter / differentiator among the input variables.
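The idea of a "most significant splitter" can be sketched in a few lines of R. The slides do not name an impurity measure, so this sketch assumes Gini impurity, one common choice for classification trees. The helper names `gini` and `best_split` and the toy data are made up for illustration.

```r
# Gini impurity of a vector of class labels: 1 - sum(p_k^2).
# 0 means the set is perfectly homogeneous.
gini <- function(labels) {
  if (length(labels) == 0) return(0)
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Try every midpoint between consecutive sorted unique values of a numeric
# feature and return the threshold whose two child sets have the lowest
# size-weighted impurity.
best_split <- function(x, y) {
  values <- sort(unique(x))
  if (length(values) < 2) return(NULL)
  thresholds <- (head(values, -1) + tail(values, -1)) / 2
  scores <- sapply(thresholds, function(t) {
    left  <- y[x <= t]
    right <- y[x > t]
    (length(left) * gini(left) + length(right) * gini(right)) / length(y)
  })
  best <- which.min(scores)
  list(threshold = thresholds[best], impurity = scores[best])
}

# Made-up toy data (not from the slides)
weight <- c(20, 22, 25, 30, 32, 35)
gender <- c("GIRL", "GIRL", "GIRL", "BOY", "BOY", "BOY")

res <- best_split(weight, gender)
print(res$threshold)  # 27.5: separates the two groups completely
print(res$impurity)   # 0: both child sets are homogeneous
```

A full tree builder would apply `best_split` to every input variable, keep the best one, and then repeat the procedure on each child set.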
Decision Tree Terminology

Root Node: Represents the entire population or sample, which is then divided
into two or more homogeneous sets.
Splitting: The process of dividing a node into two or more sub-nodes.
Decision Node: A sub-node that splits into further sub-nodes is called a
decision node.
Leaf / Terminal Node: A node that does not split is called a leaf or terminal node.
Pruning: Removing the sub-nodes of a decision node is called pruning. It is
the opposite of splitting.
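These terms map onto a fitted `rpart` object. The sketch below is only an illustration: it assumes the built-in `iris` data set as stand-in data, and the cut-off `cp = 0.45` is an arbitrary value chosen to trigger some pruning.

```r
library(rpart)

# iris ships with R; it is used here only as stand-in data.
fit <- rpart(Species ~ ., data = iris, method = "class")

# fit$frame has one row per node. The `var` column holds "<leaf>" for
# leaf (terminal) nodes and the splitting variable for the root node
# and the decision nodes.
is_leaf <- fit$frame$var == "<leaf>"
cat("Total nodes:", nrow(fit$frame), "\n")
cat("Leaf nodes: ", sum(is_leaf), "\n")

# Pruning: remove the sub-nodes of every split whose complexity
# parameter does not exceed cp.
pruned <- prune(fit, cp = 0.45)
cat("Leaf nodes after pruning:", sum(pruned$frame$var == "<leaf>"), "\n")
```

Printing `fit` and `pruned` shows the root node first, with leaf nodes marked by an asterisk.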
Weekend Plan – Decision Tree

[Figure: decision tree for a weekend plan. Root node: "Rain", with branches NO (Go Out) and YES (Stay in). Interior nodes: Go Out, Stay in, each with further NO / YES branches. Leaf nodes: Shopping, Movie, Coffee Shop, TV Shows.]
CART ALGORITHM
• Classification and Regression Trees, or CART for short, is a term
introduced by Leo Breiman to refer to decision tree algorithms
that can be used for classification or regression predictive
modeling problems.

• Classically, these algorithms are referred to as "decision trees", but on
some platforms, such as R, they are referred to by the more modern
term CART.

• The CART algorithm provides a foundation for important
algorithms such as bagged decision trees, random forests and boosted
decision trees.
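As a rough illustration of how CART serves as a building block, the sketch below bags several `rpart` trees by hand. It assumes the built-in `iris` data set as stand-in data; the number of trees (25) and the seed are arbitrary. It is a minimal sketch of the bagging idea, not a substitute for a dedicated package.

```r
library(rpart)
set.seed(1)

# Fit each tree on a bootstrap sample (rows drawn with replacement).
n_trees <- 25
trees <- lapply(seq_len(n_trees), function(i) {
  idx <- sample(nrow(iris), replace = TRUE)
  rpart(Species ~ ., data = iris[idx, ], method = "class")
})

# Every tree votes on every row; votes is a 150 x 25 character matrix.
votes <- sapply(trees, function(tr) {
  as.character(predict(tr, iris, type = "class"))
})

# The bagged prediction is the most frequent class in each row.
bagged <- apply(votes, 1, function(v) names(which.max(table(v))))

# Accuracy on the training data (optimistic, shown only as a sanity check)
accuracy <- mean(bagged == iris$Species)
print(accuracy)
```

A random forest adds one more step to this scheme: each split considers only a random subset of the input variables.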
Decision Tree – R Programming
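library(rpart)
library(rpart.plot)

# Read the training data (the formulas below use the columns gender, height and weight)
data <- read.csv("D:/S Prabhakar 14.6.18/1 Even Semester 2020 Jan/R Programming/Class/Decision Tree/data1.csv",
                 stringsAsFactors = TRUE)

# Regression tree: predict height from gender and weight
tree <- rpart(height ~ gender + weight, data = data, method = "anova")

a <- data.frame(gender = "BOY", weight = 30)
result <- predict(tree, a)   # predicted height
print(result)

rpart.plot(tree)

# Classification tree: predict gender from height and weight
tree <- rpart(gender ~ height + weight, data = data, method = "class")

a <- data.frame(height = 12, weight = 20)
result <- predict(tree, a)   # class probabilities, one column per gender
print(result)

rpart.plot(tree)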
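Since `data1.csv` is not included with the slides, the same two models can be tried on synthetic data. Everything in the sketch below is an assumption made for illustration: the means and standard deviations, the sample size, and the names `toy`, `reg_tree`, `cls_tree` and `new_child`.

```r
library(rpart)
set.seed(42)

# Synthetic stand-in for data1.csv (values are invented)
n <- 100
gender <- factor(rep(c("BOY", "GIRL"), each = n / 2))
height <- ifelse(gender == "BOY", rnorm(n, 150, 5), rnorm(n, 140, 5))
weight <- ifelse(gender == "BOY", rnorm(n, 45, 4), rnorm(n, 38, 4))
toy <- data.frame(gender, height, weight)

# Regression tree: the prediction is a single number (a height)
reg_tree <- rpart(height ~ gender + weight, data = toy, method = "anova")
new_boy <- data.frame(gender = factor("BOY", levels = levels(toy$gender)),
                      weight = 30)
pred_height <- predict(reg_tree, new_boy)
print(pred_height)

# Classification tree: type = "prob" returns class probabilities,
# type = "class" returns the predicted label itself.
cls_tree <- rpart(gender ~ height + weight, data = toy, method = "class")
new_child <- data.frame(height = 152, weight = 46)
pred_prob <- predict(cls_tree, new_child, type = "prob")
pred_class <- predict(cls_tree, new_child, type = "class")
print(pred_prob)
print(pred_class)
```

Passing `type = "class"` is often more convenient than the default probability matrix when only the predicted label is needed.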