Python Data Wrangling Tutorial: Pandas Cheatsheet
This Pandas cheatsheet will cover some of the most common and useful functionalities for data wrangling in
Python. Broadly speaking, data wrangling is the process of reshaping, aggregating, separating, or otherwise
transforming your data from one format to a more useful one.
Pandas is the best Python library for wrangling relational (i.e. table-format) datasets, and it will be doing most of
the heavy lifting for us.
To see the most up-to-date full tutorial and download the sample dataset, visit the online tutorial at
elitedatascience.com.
import pandas as pd

df = pd.read_csv('BNC2_sample.csv',
                 names=['Code', 'Date', 'Open', 'High', 'Low',
                        'Close', 'Volume', 'VWAP', 'TWAP'])
*The sample dataset can be downloaded here.
Filter unwanted observations
# Keep only the codes that contain 'GWA_'
gwa_codes = [code for code in df.Code.unique() if 'GWA_' in code]
df = df[df.Code.isin(gwa_codes)]
Reduce-merge the melted data
from functools import reduce

# melted_dfs is assumed to be the list of melted DataFrames from a melt step not shown here
base_df = df[['Date', 'Code', 'Volume', 'VWAP']]
feature_dfs = [base_df] + melted_dfs
# Merge each feature DataFrame onto the base, matching on Date and Code
abt = reduce(lambda left, right: pd.merge(left, right, on=['Date', 'Code']),
             feature_dfs)
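The filter and reduce-merge steps above can be tried without the sample file. The following is a minimal sketch on a small hand-made DataFrame: the rows, the values, and the feature_a and feature_b columns are all hypothetical stand-ins (the real melted_dfs comes from a step not shown here), so treat it as an illustration of the pattern rather than the tutorial's actual data.

```python
from functools import reduce

import pandas as pd

# Hypothetical toy data standing in for BNC2_sample.csv (all values are made up).
df = pd.DataFrame({
    "Code": ["GWA_BTC", "GWA_BTC", "GWA_ETH", "GWA_ETH", "MWA_BTC_USD"],
    "Date": ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02", "2018-01-01"],
    "Volume": [10.0, 12.0, 20.0, 18.0, 5.0],
    "VWAP": [100.0, 110.0, 50.0, 55.0, 101.0],
})

# Filter unwanted observations: keep only codes containing 'GWA_'.
gwa_codes = [code for code in df.Code.unique() if "GWA_" in code]
df = df[df.Code.isin(gwa_codes)]

# Hypothetical stand-ins for melted_dfs: each frame carries the two key
# columns (Date, Code) plus exactly one feature column.
feature_a = df[["Date", "Code"]].assign(feature_a=[0.1, 0.2, 0.3, 0.4])
feature_b = df[["Date", "Code"]].assign(feature_b=[1, 2, 3, 4])
melted_dfs = [feature_a, feature_b]

# Reduce-merge: fold the list into one table, merging pairwise on the keys.
base_df = df[["Date", "Code", "Volume", "VWAP"]]
feature_dfs = [base_df] + melted_dfs
abt = reduce(
    lambda left, right: pd.merge(left, right, on=["Date", "Code"]),
    feature_dfs,
)

print(abt)
```

Because pd.merge defaults to an inner join, any Date and Code pair missing from one of the feature frames would be dropped from the merged table; here every frame shares the same four key pairs, so all four rows survive.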