Pandas Notes

Here are detailed notes on the pandas basics we’ve covered. Each section follows
the same five-point pattern, with very simple examples and full code you can run.

1. Creating a DataFrame
1. Real-Life Example
You run a roadside tea stall and note today’s sales of two items:

Cups of tea sold

Packets of biscuits sold

You write it on paper:

Tea – 30 cups
Biscuits – 20 packets

To analyse sales, you want this in a neat computer table.

2. Concise Definition
A DataFrame is pandas’ way to store table data (rows × columns), just like an
Excel sheet.

3. Syntax Breakdown

pd.DataFrame(data,          # your raw data (dict or list of dicts)
             columns=None,  # (optional) list of column names
             index=None)    # (optional) list of row labels

data: often a dict of equal-length lists, e.g. {'item': [...], 'count': [...]}

columns: with dict data, lets you pick which keys become columns and in what order (it does not rename them); with a list of rows, it supplies the column names

index: lets you label rows (e.g. dates or IDs)
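
To see the two optional arguments in action, here is a small sketch using the same tea-stall data. The row labels 'row1' and 'row2' are made up for illustration; any list of the right length would do.

```python
import pandas as pd

# Same tea-stall data as a dict of equal-length lists
data = {
    'item': ['Tea', 'Biscuits'],
    'sold': [30, 20],
}

# columns= picks the order of the dict keys; index= supplies row labels.
# 'row1' and 'row2' are hypothetical labels chosen for this example.
df = pd.DataFrame(data, columns=['sold', 'item'], index=['row1', 'row2'])

print(df)
print(list(df.columns))  # ['sold', 'item']
print(list(df.index))    # ['row1', 'row2']
```

With row labels in place, a single cell can be read as df.loc['row1', 'sold'].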

4. Full, Working Code

import pandas as pd

# 1) Raw sales data as a dictionary
data = {
    'item': ['Tea', 'Biscuits'],
    'sold': [30, 20]
}

# 2) Create the DataFrame
df = pd.DataFrame(data)

# 3) Show the table
print(df)

Output:

       item  sold
0       Tea    30
1  Biscuits    20

5. Three Key Takeaways


1. Dict → Table: A dict of lists becomes a neat table.

2. Check with print(df): Always look at your table right after creating it.

3. Flexible Input: You can also start from a list of records ( [{'item':'Tea','sold':30}, …] ).
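
As a quick check of takeaway 3, the sketch below builds the same table from a list of records and compares it with the dict-of-lists version. The variable names df_records and df_dict are chosen only for this example.

```python
import pandas as pd

# Each dict is one row: keys are column names, values are the cells
records = [
    {'item': 'Tea', 'sold': 30},
    {'item': 'Biscuits', 'sold': 20},
]
df_records = pd.DataFrame(records)

# The dict-of-lists form used earlier in these notes
df_dict = pd.DataFrame({'item': ['Tea', 'Biscuits'], 'sold': [30, 20]})

print(df_records)
print(df_records.equals(df_dict))  # True: same values, columns and types
```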

2. Inspecting with head()


1. Real-Life Example
You have a big list of student marks. You want to peek at the first few entries to
confirm you loaded them correctly.

2. Concise Definition
df.head(n) shows the first n rows of your DataFrame (default n=5).

3. Syntax Breakdown

df.head(n)

df: your DataFrame

.head: the method that returns the top rows

(n): number of rows to show (optional; default = 5)

4. Full, Working Code

import pandas as pd

data = {
    'student': ['Amit', 'Bina', 'Chirag', 'Deepa', 'Esha', 'Farhan'],
    'marks': [85, 90, 78, 92, 88, 75]
}
df = pd.DataFrame(data)

# Show the first 3 students
print(df.head(3))

Output:

  student  marks
0    Amit     85
1    Bina     90
2  Chirag     78

5. Three Key Takeaways


1. Quick Peek: head() avoids scrolling through hundreds of rows.

2. Default = 5: Without (n) , you see the first 5.

3. Errors Show Early: If your data header is wrong, you spot it immediately.
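
The default of 5 rows can be checked with the same student data. This sketch also shows what happens when you ask for more rows than the table has; the variable names first_five and all_rows are just for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'student': ['Amit', 'Bina', 'Chirag', 'Deepa', 'Esha', 'Farhan'],
    'marks': [85, 90, 78, 92, 88, 75],
})

first_five = df.head()   # no argument: first 5 rows
all_rows = df.head(10)   # more rows requested than exist: returns all 6

print(len(first_five))  # 5
print(len(all_rows))    # 6
```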

3. Checking Structure ( shape , columns , dtypes )


1. Real-Life Example
You have a guest list for a family function with names, ages, and gifts they bring.
You want to know:

How many guests?

What columns do you have?

Are ages stored as numbers or text?

2. Concise Definition
df.shape → returns (rows, columns)

df.columns → lists the column names

df.dtypes → shows each column’s data type (int, float, object)

3. Syntax Breakdown

df.shape    # no (), returns a tuple like (10, 3)
df.columns  # no (), returns an Index of column names
df.dtypes   # no (), returns a Series of column:data_type
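
Because these three are attributes that return ordinary objects, you can use their results directly in your code. A minimal sketch with the guest-list data (the variable names n_rows, n_cols, column_names and age_type are chosen for this example):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Ravi', 'Sara', 'Manoj'],
    'age': [28, 25, 30],
    'gift': ['Flowers', 'Chocolates', 'Book'],
})

n_rows, n_cols = df.shape            # shape is a tuple, so it unpacks
column_names = df.columns.tolist()   # Index -> ordinary Python list
age_type = df.dtypes['age']          # dtypes is a Series keyed by column name

print(n_rows, n_cols)   # 3 3
print(column_names)     # ['name', 'age', 'gift']
print(age_type)         # int64
```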

4. Full, Working Code

import pandas as pd

guests = {
    'name': ['Ravi', 'Sara', 'Manoj'],
    'age': [28, 25, 30],
    'gift': ['Flowers', 'Chocolates', 'Book']
}
df = pd.DataFrame(guests)

# Check structure
print("Shape :", df.shape)
print("Columns :", df.columns)
# sep="\n" puts the dtypes on their own lines without a stray leading space
print("Data types:", df.dtypes, sep="\n")

Output:

Shape : (3, 3)
Columns : Index(['name', 'age', 'gift'], dtype='object')
Data types:
name    object
age      int64
gift    object
dtype: object

5. Three Key Takeaways


1. Know Size: shape tells you exactly how many rows and columns.

2. See Fields: columns avoids guessing field names.

3. Type Safety: dtypes lets you catch “ages as text” before you do math.
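
Here is a small sketch of the "ages as text" problem from takeaway 3. The guest list below is hypothetical: the ages were typed in with quotes, so pandas stores them as text. pd.to_numeric is one way to convert them; it is shown here as an illustration, not the only fix.

```python
import pandas as pd

# Hypothetical guest list where ages were entered as text
df = pd.DataFrame({
    'name': ['Ravi', 'Sara', 'Manoj'],
    'age': ['28', '25', '30'],
})

print(df.dtypes)  # 'age' is reported as a text type, not int64

# Convert the text to numbers before doing any math
df['age'] = pd.to_numeric(df['age'])

print(df.dtypes)         # 'age' is now int64
print(df['age'].mean())  # the average now works on real numbers
```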

4. Aggregation with agg()


1. Real-Life Example
You track daily sales of two sweets at your mithai shop. After a week, you want:

Total sweets sold

Average price you charged

Day with maximum laddoos sold

Instead of manual sums, you use pandas to tell you in one step.

2. Concise Definition
df.aggregate() (or df.agg()) computes summary numbers (sum, mean, max, min) across the entire DataFrame.

Combined with groupby() , it does the same per category (e.g., per sweet type).

3. Syntax Breakdown

# Overall summary
df.agg({'sold':'sum', 'price':'mean'})

# By category
df.groupby('sweet').agg({
    'sold': ['sum', 'max'],
    'price': ['mean', 'min']
})

df: your table

.agg / .aggregate: the summary function

groupby('col'): first split rows by that column

func dict/list: choose which statistics you want
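
The dict-of-lists form above gives a result with two-level column headers. If you prefer flat, readable names, pandas also accepts "named aggregation", sketched below. The output names total_sold, max_sold and avg_price are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'sweet': ['laddoo', 'laddoo', 'gulab', 'laddoo', 'gulab', 'gulab', 'laddoo'],
    'sold':  [10, 12, 8, 15, 10, 9, 11],
    'price': [20, 20, 25, 20, 25, 25, 20],
})

# Named aggregation: new_column=(source_column, function)
summary = df.groupby('sweet').agg(
    total_sold=('sold', 'sum'),
    max_sold=('sold', 'max'),
    avg_price=('price', 'mean'),
)

print(summary)
```

Each row of summary is one sweet, and each column is one statistic with the name you gave it.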

4. Full, Working Code

import pandas as pd

data = {
    'day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sweet': ['laddoo', 'laddoo', 'gulab', 'laddoo', 'gulab', 'gulab', 'laddoo'],
    'sold': [10, 12, 8, 15, 10, 9, 11],
    'price': [20, 20, 25, 20, 25, 25, 20]
}
df = pd.DataFrame(data)

# 1) Overall summary
print(df.agg({'sold': 'sum', 'price': 'mean'}))

# 2) Summary by sweet
print(df.groupby('sweet').agg({'sold': ['sum', 'max'], 'price': ['mean', 'min']}))
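
The real-life example also asks for the day with the most laddoos sold, which the summary code does not compute. One possible way, sketched here with the same data, is to filter to the laddoo rows and use idxmax() to find the row with the largest value. The variable names are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'day':   ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'],
    'sweet': ['laddoo', 'laddoo', 'gulab', 'laddoo', 'gulab', 'gulab', 'laddoo'],
    'sold':  [10, 12, 8, 15, 10, 9, 11],
    'price': [20, 20, 25, 20, 25, 25, 20],
})

total_sold = df['sold'].sum()        # 75
average_price = df['price'].mean()   # 155 / 7, about 22.14

# Day with the most laddoos: keep only laddoo rows, find the row label
# of the largest 'sold' value, then read that row's 'day'.
laddoos = df[df['sweet'] == 'laddoo']
best_day = laddoos.loc[laddoos['sold'].idxmax(), 'day']

print(total_sold, round(average_price, 2), best_day)  # 75 22.14 Thu
```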

5. Three Key Takeaways


1. One-Step Summary: .agg() gives totals and averages in one command.

2. Compare Groups: groupby()+agg() shows stats per category (like laddoo vs gulab).

3. Customizable: Pass your own list or dict of functions, such as .agg(['min','max','mean']), or even your own Python function.
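
A short sketch of takeaway 3, using the mithai-shop numbers. The function value_range is a hypothetical helper written for this example; it is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'sweet': ['laddoo', 'laddoo', 'gulab', 'laddoo', 'gulab', 'gulab', 'laddoo'],
    'sold':  [10, 12, 8, 15, 10, 9, 11],
    'price': [20, 20, 25, 20, 25, 25, 20],
})

# A list of function names applies every function to every chosen column
stats = df[['sold', 'price']].agg(['min', 'max', 'mean'])
print(stats)

# Hypothetical custom function: gap between the best and worst value
def value_range(series):
    return series.max() - series.min()

# With groupby, the function receives the 'sold' values of one sweet at a time
spread = df.groupby('sweet')['sold'].agg(value_range)
print(spread)
```

Here spread tells you how much daily sales of each sweet varied over the week.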

End of Notes
Keep practicing with your own data every day!
