Importing Datasets
Objectives
Analyze data in Python using a dataset
Identify three Python libraries and describe their uses
Chapter1- Importing Datasets 2
The Problem
Why Data Analysis?
Data is collected everywhere around us:
Collected manually by scientists, or collected digitally
every time you click on a website or use your mobile device.
Data does NOT mean information.
Data analysis (data science) helps us unlock the information
and insights from raw data to answer our questions.
Data analysis plays an important role:
Discover useful information from the data.
Answer questions.
Predict the future or the unknown.
The Problem
Analyze the scenario: how can we help Mr. Hoa
determine the best price for his car?
Hoa
Give your opinion: how would you estimate used car prices?
The Problem
Let's say we have a friend named Hoa who wants to sell
his car. The problem is that he doesn't know how much
he should sell it for.
Hoa wants to sell his car for as much as he can.
But he also wants to set the price reasonably, so that
someone will want to purchase it.
The price he sets should therefore represent the value
of the car.
The Problem
Let's think like data scientists and clearly define some
of his problems:
Is there data on the prices of other cars and their characteristics?
What features of cars affect their prices?
Color, Brand?
Does horsepower also affect the selling price, or perhaps
something else?
As a data analyst or data scientist, these are some of
the questions we can start thinking about. To answer
these questions, we're going to need some data.
Understanding the Data
We'll be looking at the dataset on used car prices.
The dataset used in this course is an open dataset by Jeffrey
C. Schlemmer.
https://archive.ics.uci.edu/ml/machine-learning-databases/autos/
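As a rough sketch of what importing this dataset could look like, the snippet below reads comma-separated text into rows using only the Python standard library. It rests on assumptions not stated on the slide: that the data file in the directory above is named imports-85.data, that it has no header row, and that missing values are marked with "?". The helper names parse_rows and download_rows are hypothetical, and the two sample lines are made up for an offline demonstration.

```python
import csv
import io
import urllib.request

# Assumption: the data file is named "imports-85.data", has no header row,
# and marks missing values with "?".
DATA_URL = (
    "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/"
    "imports-85.data"
)


def parse_rows(text):
    """Parse comma-separated text into rows, turning "?" into None."""
    reader = csv.reader(io.StringIO(text))
    return [
        [None if cell == "?" else cell for cell in row]
        for row in reader
        if row  # skip blank lines
    ]


def download_rows(url=DATA_URL):
    """Fetch the file over HTTP and parse it (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return parse_rows(response.read().decode("utf-8"))


# Offline demonstration on two short, made-up lines in the same style.
sample = "3,?,gas,two\n1,164,gas,four\n"
rows = parse_rows(sample)
print(rows)
```

If pandas is available, a comparable result could be obtained with pandas.read_csv(DATA_URL, header=None, na_values="?"), which returns a DataFrame instead of a list of rows.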
Understanding the Data
Each of the attributes in the dataset is described at:
https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.names
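Because the attribute names live in a separate description file, one common step is to attach them to each row after loading. The sketch below shows one way to do that. The 26 names are an assumption: they are transcribed from memory of the attribute list in imports-85.names and should be checked against that file. The helper label_row and the placeholder row are hypothetical.

```python
# Assumption: these 26 names follow the attribute list in imports-85.names;
# verify them against the file before relying on them.
COLUMN_NAMES = [
    "symboling", "normalized-losses", "make", "fuel-type", "aspiration",
    "num-of-doors", "body-style", "drive-wheels", "engine-location",
    "wheel-base", "length", "width", "height", "curb-weight", "engine-type",
    "num-of-cylinders", "engine-size", "fuel-system", "bore", "stroke",
    "compression-ratio", "horsepower", "peak-rpm", "city-mpg", "highway-mpg",
    "price",
]


def label_row(row, names=COLUMN_NAMES):
    """Pair each value in a row with its attribute name."""
    if len(row) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(row)}")
    return dict(zip(names, row))


# Placeholder row: the values are just column positions, not real car data.
example_row = [str(i) for i in range(len(COLUMN_NAMES))]
labeled = label_row(example_row)
print(labeled["symboling"], labeled["price"])
```

With labeled rows, a question such as "does horsepower affect price?" can refer to columns by name rather than by position.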
Python Packages for Data
Science
A Python library is a collection of functions and methods
that lets you perform many actions without writing the code yourself.
The libraries usually contain built-in modules providing
different functionalities which you can use directly.
The Python data analysis libraries are divided into three groups:
Scientific computing libraries
Visualization libraries
Algorithmic libraries
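To make the three groups concrete, the sketch below checks which libraries from each group are installed in the current environment. The library names are an assumption: they are commonly used examples for each group, not a list taken from these slides. The helper names installed and report are hypothetical. Only the standard library is needed to run it.

```python
import importlib.util

# Assumption: commonly used example libraries for each group, listed by
# import name (scikit-learn is imported as "sklearn").
LIBRARY_GROUPS = {
    "Scientific computing": ["pandas", "numpy", "scipy"],
    "Visualization": ["matplotlib", "seaborn"],
    "Algorithmic": ["sklearn", "statsmodels"],
}


def installed(module_name):
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report(groups=LIBRARY_GROUPS):
    """Map each group to a {library: is_installed} dictionary."""
    return {
        group: {name: installed(name) for name in names}
        for group, names in groups.items()
    }


status = report()
for group, libs in status.items():
    for name, ok in libs.items():
        print(f"{group:22} {name:12} {'installed' if ok else 'missing'}")
```

Any library reported as missing can usually be added with pip, for example pip install pandas.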
Python Packages for Data
Science
Scientific computing libraries
Python Packages for Data
Science
Visualization libraries
Python Packages for Data
Science
Algorithmic libraries
Summary
Analyze data in Python using a dataset
Identify three Python libraries and describe their uses
Q&A