Introduction to Pandas Programming 2

The document provides an introduction to Pandas programming, focusing on data cleaning, manipulation, and analysis techniques. It covers handling missing data, adding/removing columns, grouping and aggregation, sorting, merging, and exporting data. Additionally, it includes a simple data analysis example demonstrating total sales by product and sales summary by region.

Uploaded by

Marou fan

Introduction to Pandas Programming

4. Data Cleaning

Handling Missing Data:

• Check for missing values:

python
print(df.isnull().sum())  # number of missing values in each column

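• Fill missing values:

python
# Assign the result back; calling fillna(..., inplace=True) on a selected
# column is deprecated in recent pandas and may not update df.
df["Age"] = df["Age"].fillna(df["Age"].mean())  # fill with the column mean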

• Drop rows with missing values:

python
df.dropna(inplace=True)  # removes every row that has at least one missing value
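
The three steps above assume a DataFrame named df already exists. As a minimal, self-contained sketch, the following builds a small hypothetical dataset (the names and values are made up for illustration) and applies the same checks in order:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not taken from the original document.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age": [30.0, np.nan, 40.0, 50.0],
    "Salary": [50000.0, 60000.0, np.nan, 80000.0],
})

# Count the missing values in each column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Fill missing ages with the mean of the known ages (plain assignment).
filled = df.copy()
filled["Age"] = filled["Age"].fillna(filled["Age"].mean())

# Drop any row that still has a missing value (here, the row with no salary).
cleaned = filled.dropna()
print(cleaned)
```

Working on copies (filled, cleaned) rather than modifying df in place keeps the original data available for comparison.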

5. Adding and Removing Columns

Add a New Column:

python
df["Bonus"] = df["Salary"] * 0.1  # bonus is 10% of salary
print(df)

Remove a Column:

python
df.drop("Bonus", axis=1, inplace=True)  # axis=1 means "drop a column"
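
As an alternative sketch, the same add and remove operations can be written without modifying df in place. The sample data below is hypothetical; assign() and drop(columns=...) are standard pandas methods that return new DataFrames:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

# assign() returns a new DataFrame with the extra column; df is unchanged.
with_bonus = df.assign(Bonus=df["Salary"] * 0.1)
print(with_bonus)

# drop(columns=...) names the axis explicitly instead of using axis=1.
without_bonus = with_bonus.drop(columns=["Bonus"])
print(without_bonus)
```

Returning new objects makes it easier to chain steps and to keep the original DataFrame intact.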

6. Grouping and Aggregation

• Group by a column and calculate summary statistics:

python
grouped = df.groupby("Age")["Salary"].mean()  # mean salary for each age
print(grouped)

• Aggregate multiple functions:

python
# Mean and total salary for each age; the result has two-level column names
agg = df.groupby("Age").agg({"Salary": ["mean", "sum"]})
print(agg)
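
The dictionary form above produces two-level column names. As a hedged sketch, pandas also supports named aggregation, which gives flat, readable column names. The "Department" column and the values here are assumptions made up for illustration:

```python
import pandas as pd

# Hypothetical data; "Department" does not appear in the original document.
df = pd.DataFrame({
    "Department": ["HR", "IT", "IT", "HR", "IT"],
    "Salary": [40000, 60000, 80000, 50000, 70000],
})

# Named aggregation: new_column=(source_column, function).
summary = df.groupby("Department").agg(
    mean_salary=("Salary", "mean"),
    total_salary=("Salary", "sum"),
    headcount=("Salary", "count"),
)
print(summary)
```

Grouping by a categorical column such as a department usually yields more meaningful groups than grouping by a numeric column with many distinct values.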

7. Sorting Data

• Sort by a single column:

python
df.sort_values("Salary", ascending=False, inplace=True)  # highest salary first
print(df)

• Sort by multiple columns:

python
# Age ascending, then Salary descending within each age
df.sort_values(["Age", "Salary"], ascending=[True, False], inplace=True)
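
To make the multi-column ordering concrete, here is a small sketch on hypothetical data. It also resets the index afterwards, since sorting keeps the original row labels:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dan"],
    "Age": [30, 25, 30, 25],
    "Salary": [50000, 60000, 70000, 40000],
})

# Age ascending, then Salary descending within each age.
sorted_df = df.sort_values(["Age", "Salary"], ascending=[True, False])

# Sorting keeps the old row labels; reset_index(drop=True) renumbers from 0.
sorted_df = sorted_df.reset_index(drop=True)
print(sorted_df)
```

In this data the two 25-year-olds come first (Bob before Dan, because Bob earns more), followed by the two 30-year-olds (Carol before Alice).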

8. Merging and Joining DataFrames

Merging:

python
import pandas as pd

df1 = pd.DataFrame({"ID": [1, 2], "Name": ["Alice", "Bob"]})
df2 = pd.DataFrame({"ID": [1, 2], "Salary": [50000, 60000]})

# Combine rows that share the same "ID" value (inner join by default)
merged = pd.merge(df1, df2, on="ID")
print(merged)
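
In the example above every ID appears in both DataFrames, so the join type does not matter. The following sketch uses hypothetical data where the IDs only partly overlap, to show how the how parameter of pd.merge changes the result:

```python
import pandas as pd

# Hypothetical data: ID 3 has no salary record, ID 4 has no name record.
employees = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Alice", "Bob", "Carol"]})
salaries = pd.DataFrame({"ID": [1, 2, 4], "Salary": [50000, 60000, 70000]})

# Default how="inner": keep only IDs present in both DataFrames.
inner = pd.merge(employees, salaries, on="ID")
print(inner)

# how="outer": keep every ID; indicator=True adds a "_merge" column
# that records which side each row came from.
outer = pd.merge(employees, salaries, on="ID", how="outer", indicator=True)
print(outer)
```

Rows that exist on only one side get missing values in the columns that come from the other side.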

Joining:

python
df1 = df1.set_index("ID")
df2 = df2.set_index("ID")

joined = df1.join(df2)
print(joined)
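
Unlike pd.merge, DataFrame.join matches rows on the index and performs a left join by default. A minimal sketch with hypothetical data where one index value has no match:

```python
import pandas as pd

# Hypothetical data, indexed by ID; ID 3 has no salary record.
names = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Carol"]},
    index=pd.Index([1, 2, 3], name="ID"),
)
salaries = pd.DataFrame(
    {"Salary": [50000, 60000]},
    index=pd.Index([1, 2], name="ID"),
)

# Default how="left": keep every row of names; unmatched salaries are missing.
joined = names.join(salaries)
print(joined)

# how="inner": keep only index values present in both DataFrames.
inner_joined = names.join(salaries, how="inner")
print(inner_joined)
```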

9. Exporting Data

To a CSV File:

python
df.to_csv("output.csv", index=False)  # index=False leaves out the row labels

To an Excel File:

python
# Writing .xlsx files requires an Excel engine such as openpyxl to be installed
df.to_excel("output.xlsx", index=False)
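
One way to check an export is to read the file back. The sketch below does a CSV round trip on hypothetical data, writing to a temporary directory (an assumption made here so the example leaves no files behind):

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 60000]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "output.csv")

    # Write without the row labels, then read the file back.
    df.to_csv(csv_path, index=False)
    restored = pd.read_csv(csv_path)

print(restored)
```

If index=False were omitted, the row labels would be written as an extra unnamed column and would reappear as a column when the file is read back.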

10. Example: Simple Data Analysis

Dataset Example:

python
import pandas as pd

data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

Analysis:

• Total sales by product:

python
# Totals for this dataset: A = 500, B = 700, C = 900
print(df.groupby("Product")["Sales"].sum())

• Sales summary by region:

python
# For this dataset: East mean 300, sum 900; West mean 400, sum 1200
print(df.groupby("Region")["Sales"].agg(["mean", "sum"]))
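
The two summaries above each look at one dimension. As an optional sketch that goes slightly beyond the original example, pivot_table can show both dimensions at once, with products as rows and regions as columns. It reuses the same dataset:

```python
import pandas as pd

data = {
    "Product": ["A", "B", "C", "A", "B", "C"],
    "Sales": [100, 200, 300, 400, 500, 600],
    "Region": ["East", "West", "East", "West", "East", "West"],
}
df = pd.DataFrame(data)

# One row per product, one column per region, cells hold total sales.
pivot = df.pivot_table(
    index="Product", columns="Region", values="Sales", aggfunc="sum"
)
print(pivot)
```

Each cell holds the total sales for one product in one region, so summing a row gives that product's total and summing a column gives that region's total.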
