Importing Datasets
Objectives
Read data using Python's Pandas package
Demonstrate how to import and export data in Python
Importing and Exporting Data in Python
Importing data: the process of loading and reading data into
Python from various resources
What are the important properties?
Resource format: file format (.csv, .xlsx, .json, ...)
Location: resource path
C:/data/yourdata.csv
https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.names
Importing a CSV into Python
Importing a CSV without a header
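As a rough sketch of the two cases above, the snippet below reads the same CSV text with and without header=None. The sample rows are hypothetical and are fed through io.StringIO so the example runs without a file on disk; with a real file you would pass its path or URL to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for a CSV file that has no header row.
csv_text = "3,alfa-romero,gas,13495\n1,audi,gas,13950\n"

# Default behaviour: the first line is consumed as the column names.
df_with_header = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is treated as data and the columns are numbered.
df = pd.read_csv(io.StringIO(csv_text), header=None)

print(df_with_header.shape)  # one data row, the first line became the header
print(df.shape)              # both lines kept as data
print(list(df.columns))      # integer column labels assigned by pandas
```

Forgetting header=None on a headerless file silently loses the first record, which is why the two calls are shown side by side.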
Printing the dataframe in Python
df.head(n): shows the first n rows of the data frame
df.tail(n): shows the bottom n rows of the data frame
We printed out the first five rows of data
Pandas automatically sets the column headers to integers
Difficult to work with without meaningful column names
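A minimal sketch of head and tail on a headerless table follows. The eight rows are generated, hypothetical data, used only so that the first and last rows are easy to tell apart.

```python
import io

import pandas as pd

# Hypothetical data: eight rows of "i, i*10", with no header row.
csv_text = "\n".join(f"{i},{i * 10}" for i in range(8))
df = pd.read_csv(io.StringIO(csv_text), header=None)

first = df.head(5)  # first five rows
last = df.tail(3)   # bottom three rows

print(first)
print(last)
print(list(df.columns))  # integer labels, since the file had no header
```

Calling df.head() or df.tail() with no argument returns five rows.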
Replace default header: df.columns = headers
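The replacement can be sketched as follows. The column names in headers and the sample rows are hypothetical; in practice the list must contain exactly one name per column, in column order.

```python
import io

import pandas as pd

# Hypothetical headerless sample.
csv_text = "3,alfa-romero,gas,13495\n1,audi,gas,13950\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Illustrative names only, one per column.
headers = ["symboling", "make", "fuel_type", "price"]
df.columns = headers

print(df.head())
print(df.loc[1, "make"])  # columns can now be addressed by name
```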
Saving data
Exporting to different formats in Python
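A small sketch of saving a data frame, assuming CSV and JSON as the target formats. The file names and the temporary directory are hypothetical choices made so the example cleans up after itself; pandas also offers other writers such as to_excel, which may need an extra package installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data frame to export.
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [13950, 16430]})

# Write into a temporary directory so nothing is left in the working folder.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "autos.csv")
json_path = os.path.join(out_dir, "autos.json")

df.to_csv(csv_path, index=False)  # index=False omits the row labels
df.to_json(json_path)

# Read the CSV back to confirm the round trip.
restored = pd.read_csv(csv_path)
print(restored)
```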
Analyzing Data in Python
Understand your data before you begin any analysis
You should check:
Data types
Data distribution
Locate potential issues with the data, such as features with the
wrong data type, which may need to be resolved later on.
Data type in Pandas and pure Python
Why do you need to check data types?
Pandas automatically assigns types based on the encoding it
detects from the original data table.
Compatibility with Python methods
In Pandas, you can use dataframe.dtypes to check data
types: df.dtypes
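To illustrate why the check matters, here is a hedged sketch with a hypothetical sample in which a "?" placeholder sits in the price column. Because of that one entry, pandas cannot infer a numeric type for the column, and df.dtypes makes the problem visible.

```python
import io

import pandas as pd

# Hypothetical sample: "?" marks a missing price.
csv_text = "make,doors,price\naudi,4,13950\nbmw,2,?\n"
df = pd.read_csv(io.StringIO(csv_text))

# One dtype per column; price is not numeric because of the "?" entry.
print(df.dtypes)

# Helper checks from pandas.api.types, useful in scripts.
print(pd.api.types.is_integer_dtype(df["doors"]))
print(pd.api.types.is_numeric_dtype(df["price"]))
```

A price column that is not numeric would break numeric methods such as mean, so this is the kind of issue to note for later cleaning.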
df.describe(): returns a statistical summary
df.describe(include="all"): returns a full statistical
summary
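The difference between the two calls can be sketched on a small, hypothetical data frame with one text column and one numeric column.

```python
import pandas as pd

# Hypothetical data: one text column, one numeric column.
df = pd.DataFrame(
    {
        "make": ["audi", "bmw", "audi"],
        "price": [13950.0, 16430.0, 17450.0],
    }
)

# Default: only numeric columns are summarised (count, mean, std, quartiles...).
numeric_summary = df.describe()

# include="all": text columns are added, with unique, top and freq rows.
full_summary = df.describe(include="all")

print(numeric_summary)
print(full_summary)
```

In the full summary, statistics that do not apply to a column (for example the mean of make) appear as NaN.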
df.info(): concise summary of your DataFrame
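A brief sketch of df.info() on a hypothetical data frame with one missing price. info() normally prints straight to the console; here its buf parameter is used to capture the text in a string, which is an optional choice for the sake of the example.

```python
import io

import pandas as pd

# Hypothetical data with one missing price.
df = pd.DataFrame(
    {
        "make": ["audi", "bmw", "audi"],
        "price": [13950.0, None, 17450.0],
    }
)

# Capture the summary in a buffer instead of printing it directly.
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()

# Shows the index range, each column's non-null count and dtype, and memory use.
print(info_text)
```

The non-null counts are a quick way to spot columns with missing values.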
Summary
Read data using Python's Pandas package
Demonstrate how to import and export data in Python
Q&A