Introduction to Pandas Programming
4. Data Cleaning
Handling Missing Data:
Check for missing values:
python
print(df.isnull().sum())
Fill missing values:
python
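# Assign the result back; calling fillna(..., inplace=True) on a selected column is deprecated in recent pandas
df["Age"] = df["Age"].fillna(df["Age"].mean())  # Fill with mean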
Drop rows with missing values:
python
df.dropna(inplace=True)
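The three steps above can be tried together on a small table. This is a minimal sketch; the people DataFrame and its values are made up for illustration and are not part of the tutorial's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; names and values are illustrative only.
people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age": [30.0, np.nan, 40.0, np.nan],
    "Salary": [50000.0, 60000.0, np.nan, 55000.0],
})

# Count the missing values in each column.
missing_counts = people.isnull().sum()
print(missing_counts)

# Fill missing ages with the mean of the observed ages (30 and 40).
filled = people.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())

# Drop any row that still contains a missing value (Carol has no salary).
cleaned = filled.dropna()
print(cleaned)
```

Filling before dropping keeps the rows whose only gap was Age; dropping first would have removed them.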
5. Adding and Removing Columns
Add a New Column:
python
df["Bonus"] = df["Salary"] * 0.1
print(df)
Remove a Column:
python
df.drop("Bonus", axis=1, inplace=True)
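If you prefer not to modify the DataFrame in place, the same two operations can be written so that each returns a new DataFrame. A minimal sketch, assuming a hypothetical staff table with a Salary column:

```python
import pandas as pd

# Hypothetical sample data for illustration.
staff = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# assign() returns a new DataFrame with the extra column; staff is unchanged.
with_bonus = staff.assign(Bonus=staff["Salary"] * 0.1)
print(with_bonus)

# drop(columns=...) is equivalent to drop(..., axis=1) and also returns a copy.
without_bonus = with_bonus.drop(columns=["Bonus"])
print(without_bonus)
```

Returning new objects makes it easier to chain steps and to keep the original data around for comparison.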
6. Grouping and Aggregation
Group by a column and calculate summary statistics:
python
grouped = df.groupby("Age")["Salary"].mean()
print(grouped)
Aggregate multiple functions:
python
agg = df.groupby("Age").agg({"Salary": ["mean", "sum"]})
print(agg)
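Passing a dictionary of lists to agg produces two-level column names. When flat, descriptive names are easier to work with, named aggregation is one alternative. A small sketch, using a made-up employees table:

```python
import pandas as pd

# Hypothetical sample data for illustration.
employees = pd.DataFrame({
    "Age": [25, 25, 30, 30],
    "Salary": [40000, 50000, 60000, 80000],
})

# Each keyword becomes an output column: name=(source column, function).
summary = employees.groupby("Age").agg(
    mean_salary=("Salary", "mean"),
    total_salary=("Salary", "sum"),
)
print(summary)
```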
7. Sorting Data
Sort by a single column:
python
df.sort_values("Salary", ascending=False, inplace=True)
print(df)
Sort by multiple columns:
python
df.sort_values(["Age", "Salary"], ascending=[True, False], inplace=True)
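To see what a multi-column sort does, it helps to run it on a few rows. The sketch below uses a made-up employees table; it sorts by Age ascending, breaks ties by Salary descending, and renumbers the rows afterwards.

```python
import pandas as pd

# Hypothetical sample data for illustration.
employees = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age": [30, 25, 30, 25],
    "Salary": [50000, 60000, 70000, 40000],
})

# Youngest first; within the same age, highest salary first.
ranked = (
    employees
    .sort_values(["Age", "Salary"], ascending=[True, False])
    .reset_index(drop=True)  # Replace the shuffled index with 0, 1, 2, ...
)
print(ranked)
```

Without reset_index, the sorted rows keep their original index labels, which can be confusing when selecting rows by position later.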
8. Merging and Joining DataFrames
Merging:
python
df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})
merged = pd.merge(df1, df2, on="ID")
print(merged)
Joining (on the index):
python
df1 = df1.set_index("ID")
df2 = df2.set_index("ID")
joined = df1.join(df2)
print(joined)
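In the examples above every ID appears in both tables. When the keys do not match exactly, the how parameter of pd.merge decides which rows are kept. A minimal sketch with made-up tables in which ID 3 and ID 4 each appear on one side only:

```python
import pandas as pd

# Hypothetical sample data: ID 3 has no salary, ID 4 has no name.
names = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Carol"]})
salaries = pd.DataFrame({"ID": [1, 2, 4], "Salary": [50000, 60000, 70000]})

# Default (inner): keep only IDs present in both tables.
inner = pd.merge(names, salaries, on="ID")

# Left: keep every row of names; missing salaries become NaN.
left = pd.merge(names, salaries, on="ID", how="left")

# Outer: keep every ID from either table.
outer = pd.merge(names, salaries, on="ID", how="outer")

print(inner)
print(left)
print(outer)
```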
9. Exporting Data
To a CSV File:
python
df.to_csv("output.csv", index=False)
To an Excel File:
python
df.to_excel("output.xlsx", index=False)  # Requires an Excel writer package such as openpyxl
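A quick way to check an export is to read it back. The sketch below writes to an in-memory text buffer instead of a file so that it leaves nothing on disk; the report DataFrame is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data for illustration.
report = pd.DataFrame({"Product": ["A", "B"], "Sales": [100, 200]})

# to_csv accepts a file path or any writable text buffer.
buffer = io.StringIO()
report.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back to confirm the round trip preserves the data.
restored = pd.read_csv(io.StringIO(csv_text))
print(restored.equals(report))
```

With index=False the row labels are not written, so the file contains only the Product and Sales columns.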
10. Example: Simple Data Analysis
Dataset Example:
python
data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)
Analysis:
Total sales by product:
python
print(df.groupby("Product")["Sales"].sum())
Sales summary by region:
python
print(df.groupby("Region")["Sales"].agg(["mean", "sum"]))
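The two summaries above can also be combined into a single table with one row per product and one column per region. This is a sketch of one possible extension using pivot_table; it rebuilds the same dataset so that it runs on its own.

```python
import pandas as pd

# Same dataset as in the example above.
data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

# Rows are products, columns are regions, cells are summed sales.
pivot = df.pivot_table(
    index="Product", columns="Region", values="Sales", aggfunc="sum"
)
print(pivot)
```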