
Data Wrangling Python - Suwarti

1) The document outlines an upcoming mini bootcamp on data science that will cover data wrangling in Python. 2) The instructor is Suwarti, who has a master's degree in actuarial mathematics and works as a data scientist. 3) Participants will learn about data cleansing mechanisms including handling missing values, detecting anomalies and outliers, checking and correcting data types.

MINI BOOTCAMP

DATA SCIENCE

Data Wrangling Python


By Suwarti

#MulaiBelajarData
Mini Bootcamp Data Science

Speaker Profile
SUWARTI, M.Si

Education:
Master's in Actuarial Mathematics
Institut Teknologi Bandung (ITB)

Occupation:
Data Scientist at Astra Graphia Information Technology (AGIT)

Speaker Contact
LinkedIn: https://www.linkedin.com/in/suwarti/


Learning Objectives
In this course you will learn to:
• Understand the data cleansing mechanism
• Understand how to check for and handle missing values
• Understand anomaly and outlier detection concepts
• Understand how to check and correct data types


Machine Learning Workflow


Data Cleansing
• Missing Values Checking and Handling
• Duplicates Checking
• Anomaly and Outlier Detection
• Data Type Checking
• Data Type Correction
• Feature Extraction
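
Duplicates checking has no slide of its own, so here is a minimal sketch of that step. It assumes pandas, which the slides do not name. The table and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical customer table; the two rows with id 2 are exact duplicates.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "city": ["Bandung", "Jakarta", "Jakarta", "Bandung"],
})

# duplicated() marks every repeat of an earlier row as True.
n_duplicates = int(df.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row.
deduped = df.drop_duplicates().reset_index(drop=True)

print(n_duplicates)
print(deduped)
```

Whether a repeated row is really a duplicate depends on the data: two identical transactions can both be genuine, so it is worth inspecting the flagged rows before dropping them.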


Missing Values
Why do missing values exist?
• Values are missed during the data acquisition process
• Values are deleted accidentally
• The data is corrupt
• Row and column positions are mismatched
• The real value is not available

If we fill in missing values with the wrong data, we add bias.
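
Before handling missing values, it helps to know where they are and how many there are. The sketch below shows one way to check, assuming pandas (not named in the slides) and a made-up table with hypothetical column names.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing value in each column.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Bandung", "Jakarta", None, "Bandung"],
})

# isna() flags missing cells; summing per column counts them.
missing_per_column = df.isna().sum()

# The mean of the flags gives the share of missing cells per column.
missing_share = df.isna().mean()

print(missing_per_column)
print(missing_share)
```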


Missing Values Handling

Basic Imputation:
• Mean Imputation
• Mode Imputation
• Median Imputation
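
The three basic imputation methods above can be sketched as follows. This assumes pandas, which the slides do not name, and the table, column names, and the choice of method per column are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical table with one missing value in each column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0, 28.0],
    "income": [4.0, 5.0, np.nan, 6.0, 45.0],
    "city": ["Bandung", "Jakarta", None, "Bandung", "Bandung"],
})

# Mean imputation: fill with the average of the observed values.
df["age"] = df["age"].fillna(df["age"].mean())

# Median imputation: fill with the middle observed value.
df["income"] = df["income"].fillna(df["income"].median())

# Mode imputation: fill with the most frequent observed value.
df["city"] = df["city"].fillna(df["city"].mode()[0])

print(df)
```

The median is less affected by extreme values than the mean, which is why it is used here for the skewed `income` column. The mode is the usual choice for categorical columns, where a mean or median is not defined.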


Anomalies

Anomalies are values that deviate from what is standard, normal, or expected. Anomalies are a form of error.
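
One simple way to catch anomalies of this kind is to test each value against a validity rule. The slides do not prescribe a method, so the sketch below is only an illustration: it assumes pandas, a made-up `age` column, and a hypothetical rule that ages must lie between 0 and 120.

```python
import pandas as pd

# Hypothetical data: -3 and 250 cannot be real ages.
df = pd.DataFrame({"age": [25, -3, 31, 250, 40]})

# Hypothetical validity rule: an age must lie between 0 and 120 (inclusive).
is_anomaly = ~df["age"].between(0, 120)

# Keep only the rows that break the rule for inspection.
anomalies = df[is_anomaly]

print(anomalies)
```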


Outliers
Outliers are data points that differ significantly from other observations. Outliers are not a form of error.
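
The slides do not say how outliers are detected. A common rule of thumb, used here as an assumption, is the interquartile range (IQR) rule: flag points more than 1.5 × IQR below the first quartile or above the third. The sketch assumes pandas and a made-up series of values.

```python
import pandas as pd

# Hypothetical spending values; 95 is far from the rest.
spend = pd.Series([12, 14, 15, 15, 16, 18, 19, 95], name="spend")

# First and third quartiles, and the range between them.
q1 = spend.quantile(0.25)
q3 = spend.quantile(0.75)
iqr = q3 - q1

# Fences of the 1.5 * IQR rule.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Points outside the fences are flagged as outliers.
outliers = spend[(spend < lower) | (spend > upper)]

print(outliers)
```

Because outliers are not errors, flagging them is a prompt to look closer, not a reason to delete them automatically.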


Data Type Checking


Data Types:
Object
Integer
Float
Date, datetime
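
The type names above resemble those reported by pandas, so the sketch below assumes pandas, although the slides do not name it. It builds a made-up table with one column of each kind and inspects the types.

```python
import pandas as pd

# Hypothetical table with text, integer, float, and datetime columns.
df = pd.DataFrame({
    "name": ["Ani", "Budi"],
    "visits": [3, 5],
    "score": [7.5, 8.0],
    "joined": pd.to_datetime(["2021-01-15", "2021-02-01"]),
})

# dtypes lists the type of every column; text is typically reported as object.
print(df.dtypes)

# Helper functions give a yes/no answer for a single column.
visits_is_int = pd.api.types.is_integer_dtype(df["visits"])
score_is_float = pd.api.types.is_float_dtype(df["score"])
joined_is_datetime = pd.api.types.is_datetime64_any_dtype(df["joined"])
```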


Data Type Correction
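
The slide gives no details for this step, so the following is only a sketch of what a correction might look like. It assumes pandas and a made-up table in which every column was read as text, and converts each column to a more suitable type.

```python
import pandas as pd

# Hypothetical raw table: every column arrived as text.
raw = pd.DataFrame({
    "visits": ["3", "5", "n/a"],
    "score": ["7.5", "8", "9.25"],
    "joined": ["2021-01-15", "2021-02-01", "2021-03-20"],
})

clean = pd.DataFrame({
    # errors="coerce" turns unparseable text such as "n/a" into NaN.
    "visits": pd.to_numeric(raw["visits"], errors="coerce"),
    "score": pd.to_numeric(raw["score"]),
    # An explicit format avoids ambiguity between day and month.
    "joined": pd.to_datetime(raw["joined"], format="%Y-%m-%d"),
})

print(clean.dtypes)
```

Coercing bad values to NaN creates new missing values, so a correction like this is usually followed by another missing values check.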


THANK YOU
