Data Analysis with Python
Cheat Sheet: Importing Data Sets
| Package/Method | Description | Code Example |
|---|---|---|
| Read CSV data set | Read a CSV file containing a data set into a pandas data frame. | df = pd.read_csv(<CSV_path>, header=None)  # load without a header row |
| | | df = pd.read_csv(<CSV_path>, header=0)  # load using the first row as the header |
| Print first few entries | Print the first few entries (default 5) of the pandas data frame. | df.head(n)  # n = number of entries; default 5 |
| Print last few entries | Print the last few entries (default 5) of the pandas data frame. | df.tail(n)  # n = number of entries; default 5 |
| Assign header names | Assign appropriate header names to the data frame. | df.columns = headers |
| Replace "?" with NaN | Replace "?" entries with NaN from the NumPy library. | df = df.replace("?", np.nan) |
| Retrieve data types | Retrieve the data types of the data frame columns. | df.dtypes |
| Retrieve statistical description | Retrieve the statistical description of the data set. By default, only numerical columns are summarized. Use include="all" to create a summary for all variables. | df.describe()  # default: numerical columns only |
| | | df.describe(include="all")  # all columns |
| Retrieve data set summary | Retrieve a summary of the data set in use from the data frame. | df.info() |
| Save data frame to CSV | Save the processed data frame to a CSV file at a specified path. | df.to_csv(<output CSV path>) |

Note: The labs in this course run in a JupyterLite environment. In JupyterLite, you need to download the required file to the local environment and then use the local path to the file as the CSV_path. However, if you are using JupyterLab or any other Python environment on your local machine, you can use the URL of the required file directly as the CSV_path.
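
The methods in the table can be chained into a short import-and-inspect workflow. The following is a minimal sketch under stated assumptions, not the course's own lab code: the sample rows, the column names in headers, and the output file name cleaned.csv are all hypothetical, and an in-memory text buffer stands in for <CSV_path> so the example runs without downloading a file.

```python
import io
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
# "?" marks a missing value, as in the table's replace() example.
raw_csv = io.StringIO(
    "audi,gas,99.8,?,13950\n"
    "bmw,gas,101.2,101,16430\n"
    "volvo,diesel,104.3,106,?\n"
)

# Load without a header row; pandas labels the columns 0, 1, 2, ...
df = pd.read_csv(raw_csv, header=None)

# Assign header names (illustrative names, chosen for this sketch only).
headers = ["make", "fuel_type", "wheel_base", "horsepower", "price"]
df.columns = headers

# Replace "?" placeholders with NaN so pandas treats them as missing.
df = df.replace("?", np.nan)

# Inspect the first and last entries.
print(df.head(2))
print(df.tail(2))

# Columns that contained "?" were read as text, so they are not numeric yet.
print(df.dtypes)

# Default describe() summarizes numerical columns only (here: wheel_base).
print(df.describe())

# include="all" summarizes every column, numerical or not.
print(df.describe(include="all"))

# info() prints column names, non-null counts, and dtypes.
df.info()

# Save the processed data frame; to_csv also writes the index by default.
out_path = os.path.join(tempfile.mkdtemp(), "cleaned.csv")
df.to_csv(out_path)
```

Because horsepower and price contained "?" in the raw data, pandas reads them as text rather than numbers, which is why the default df.describe() in this sketch reports only wheel_base. Converting such columns to a numeric type is a separate step that this cheat sheet does not cover.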