Introduction


Data Science: Concepts and Practice

Course slides

Course Book
Data Science: Concepts and Practice
Authors: Vijay Kotu & Bala Deshpande
Publisher: Morgan Kaufmann

Course Software
www.rapidminer.com (free download)
Google Classroom

• Google Classroom link:
https://classroom.google.com/c/NTQ0NTE4NTYyMTky?cjc=tczkuji

• Zoom Meeting Link:
Meeting ID: 653 2942 0781
Password: 345
1. Introduction
What is Data Science
Models
Data Science

• Data science starts with data, which can range from a simple array of a few
numeric observations to a complex matrix of millions of observations with
thousands of variables.
• Data science utilizes certain specialized computational methods in order to
discover meaningful and useful structures within a dataset.
• The discipline of data science coexists and is closely associated with a
number of related areas such as database systems, data engineering,
visualization, data analysis, experimentation, and business intelligence
(BI).
Types of Data Science

| Tasks | Description | Algorithms | Examples |
| --- | --- | --- | --- |
| Classification | Predict if a data point belongs to one of the predefined classes. The prediction will be based on learning from a known data set. | Decision trees, neural networks, Bayesian models, rule induction, k-nearest neighbors | Assigning voters into known buckets by political parties, e.g., soccer moms. Bucketing new customers into one of the known customer groups. |
| Regression | Predict the numeric target label of a data point. The prediction will be based on learning from a known data set. | Linear regression, logistic regression | Predicting the unemployment rate for next year. Estimating insurance premiums. |
| Anomaly detection | Predict if a data point is an outlier compared to other data points in the data set. | Distance based, density based, local outlier factor (LOF) | Fraudulent transaction detection in credit cards. Network intrusion detection. |
| Time series | Predict the value of the target variable for a future time frame based on historical values. | Exponential smoothing, ARIMA, regression | Sales forecasting, production forecasting, virtually any growth phenomenon that needs to be extrapolated. |
| Clustering | Identify natural clusters within the data set based on inherent properties within the data set. | k-means, density-based clustering (DBSCAN) | Finding customer segments in a company based on transaction, web, and customer call data. |
| Association analysis | Identify relationships within an itemset based on transaction data. | FP-Growth, Apriori | Finding cross-selling opportunities for a retailer based on transaction purchase history. |
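
To make the difference between two of the tasks above concrete, here is a minimal sketch in plain Python. It is not taken from the course book or from RapidMiner; the function names (`knn_classify`, `fit_line`) and all data values are made up for illustration. Classification predicts a class label for a new data point (here with k-nearest neighbors), while regression predicts a numeric value (here with a simple least-squares line).

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classification: return the majority class among the k nearest labelled points."""
    # Sort labelled points by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical customers: (monthly visits, average basket size) -> known group.
customers = [
    ((1.0, 10.0), "occasional"),
    ((2.0, 12.0), "occasional"),
    ((1.5, 9.0), "occasional"),
    ((8.0, 40.0), "loyal"),
    ((9.0, 45.0), "loyal"),
    ((7.5, 38.0), "loyal"),
]
# Bucket a new customer into one of the known groups.
new_customer_group = knn_classify(customers, (8.5, 41.0))

# Hypothetical yearly values that lie exactly on y = 2x + 1.
years = [1, 2, 3, 4]
values = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(years, values)
# Extrapolate the fitted line to year 5.
next_value = slope * 5 + intercept

print(new_customer_group)
print(next_value)
```

Both functions learn from a known data set, as the table describes; the only difference is the kind of target they predict. Real course work would use the course software or a library rather than hand-written routines like these.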
Course outline

Process Basics
• Data Science Process
• Data Exploration
• Model Evaluation

Core Algorithms
• Classification: Decision Trees, Rule Induction, k-Nearest Neighbors, Naïve Bayesian, Artificial Neural Networks, Support Vector Machines, Ensemble Learners
• Regression: Linear Regression, Logistic Regression
• Association Analysis: Apriori, FP-Growth
• Clustering: k-Means, DBSCAN, Self-Organizing Maps

Common Applications
• Text Mining
• Time Series Forecasting
• Anomaly Detection
• Feature Selection
