Training Python


Generic tree flow

[Figure: a decision tree of depth/height 3. Level 0 splits on Feature1, Level 1 on Feature2 and Feature3, Level 2 on Feature4 and Feature5. Each internal node is a conditional test with Y/N branches; the feature to split on is chosen by Gini, entropy, or information gain. Each external (leaf) node holds a class/value.]
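The split criteria named in the tree diagram (Gini, entropy, information gain) can be sketched in a few lines of plain Python. This is a minimal illustration, not the code of any particular library; the function names and the toy Y/N labels are assumptions made for the example.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children


# Toy node (hypothetical data): three Y and three N records,
# split perfectly by some conditional test.
parent = ["Y", "Y", "Y", "N", "N", "N"]
left, right = ["Y", "Y", "Y"], ["N", "N", "N"]

print(gini(parent))                           # impurity before the split
print(entropy(parent))                        # entropy before the split
print(information_gain(parent, left, right))  # gain from the split
```

A tree builder would evaluate candidate tests at each node this way and keep the feature whose split gives the highest gain (or the lowest weighted impurity).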


The name “CatBoost” comes from two words: “Category” and “Boosting”.
“This is the first Russian machine learning technology that’s an open source,” said Mikhail Bilenko, Yandex’s head of machine intelligence and research.

What is CatBoost?
• An open-source algorithm from Yandex.
• It integrates easily with deep learning frameworks such as Google’s TensorFlow and Apple’s Core ML.
• It works with diverse data types, such as audio, text, and image data.
• It provides best-in-class accuracy.
• It is powerful in two ways:
  • It does not require the extensive data training typically required by other ML models.
  • It provides powerful out-of-the-box support for the more descriptive data formats that accompany many business problems.

Advantages
• Performance: CatBoost provides state-of-the-art results and is competitive with any leading ML algorithm on the performance front.
• Handling categorical features automatically: CatBoost can be used without any explicit pre-processing to convert categories into numbers. It converts categorical values into numbers using various statistics on combinations of categorical features and on combinations of categorical and numerical features.
• Robust: It reduces the need for extensive hyper-parameter tuning and lowers the chance of overfitting, which leads to more generalized models. CatBoost nevertheless has multiple parameters to tune, including the number of trees, learning rate, regularization, tree depth, fold size, bagging temperature, and others.
https://tech.yandex.com/catboost/
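To make "converts categorical values into numbers using various statistics" more concrete, here is a simplified sketch of one such statistic: an ordered target encoding, where each record's category is replaced by a smoothed average of the targets seen in earlier records of the same category. This is an illustration only and not CatBoost's actual implementation (which, among other things, uses random permutations and feature combinations); the function name, the `prior` and `prior_weight` parameters, and the toy data are all assumptions.

```python
from collections import defaultdict


def ordered_target_encode(categories, targets, prior=0.5, prior_weight=1.0):
    """Encode each category using only the targets of EARLIER rows.

    encoded = (sum of earlier targets for this category + prior * prior_weight)
              / (count of earlier rows with this category + prior_weight)
    """
    sums = defaultdict(float)   # running target sum per category
    counts = defaultdict(int)   # running row count per category
    encoded = []
    for cat, y in zip(categories, targets):
        encoded.append((sums[cat] + prior * prior_weight)
                       / (counts[cat] + prior_weight))
        # Update the running statistics only after encoding the row,
        # so a row never sees its own target.
        sums[cat] += y
        counts[cat] += 1
    return encoded


# Hypothetical toy data: a colour feature and a binary target.
colors = ["red", "blue", "red", "red", "blue"]
clicked = [1, 0, 1, 0, 1]
codes = ordered_target_encode(colors, clicked)
print(codes)
```

Because each row is encoded from earlier rows only, the encoding does not leak a record's own label into its feature value, which is one reason this style of statistic helps against overfitting.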
eXtreme Gradient Boosting
“It is not the strongest of the species that survives, nor the most intelligent. Rather it is the one that is most adaptable to change.” Charles Darwin

[Figure: XGBoost. Tree models are trained in series, with a boosting step between consecutive models; the models carry weights w1, w2, w3.]

Learning from correcting mistakes
• Each subsequent sample depends on the weights given to the records in the previous sample that were not predicted correctly.
• The final prediction is a weighted (not a simple) average of all predictions.

Image source link is at reference section
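The two ideas on this slide, re-weighting the records that were predicted incorrectly and combining the models with weights, can be sketched with one-split trees ("stumps") in plain Python. Note the assumption: this is an AdaBoost-style re-weighting sketch that matches the slide's description; XGBoost itself fits each new tree to the gradients of a loss function rather than re-weighting records. All function names and the toy data are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """One-split tree: predict `polarity` on the left side, the opposite on the right."""
    return polarity if x <= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def boost(xs, ys, rounds=3):
    """Train stumps in series, up-weighting records the previous stump got wrong."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # model weight (w1, w2, w3, ...)
        model.append((alpha, threshold, polarity))
        # Misclassified records get larger weights, correct ones smaller.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Final prediction: sign of the weighted sum of all stump predictions."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in model)
    return 1 if score >= 0 else -1


# Toy data (labels are +1 / -1) that no single stump can classify correctly.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = boost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

On this toy data no single stump is correct on every record, but the weighted combination of three stumps is, which is the "learning from correcting mistakes" idea in miniature.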
