Pandas Tutorial

The document provides an introduction to Pandas, a data manipulation library in Python, detailing its primary data structures: Series and DataFrame. It covers basic operations, data manipulation, handling missing data, data aggregation, merging DataFrames, and advanced operations like pivot tables and applying functions. The conclusion emphasizes the importance of practice and using documentation for mastering Pandas.

Uploaded by

vardhankallempudi

Introduction to Pandas

1. Introduction to Pandas

Pandas is built on top of NumPy and provides two primary data structures: Series and DataFrame.

Series

A Series is a one-dimensional labeled array capable of holding any data type.

import pandas as pd

# Creating a Series
s = pd.Series([1, 3, 5, 7, 9])
print(s)
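
The word "labeled" in the definition above means each value has an index label. As a minimal sketch, the Series below uses custom labels; the day names and temperature values are made up for illustration and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical data: the labels and values are illustrative only.
temps = pd.Series([21.5, 19.0, 23.2], index=["mon", "tue", "wed"])

# Label-based access uses the index label, not the position.
tuesday = temps["tue"]

# Position-based access goes through .iloc.
first = temps.iloc[0]

print(tuesday, first)
```

When no index is passed, as in the example with s, pandas assigns the default integer labels 0, 1, 2, and so on.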

DataFrame

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

# Creating a DataFrame
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London']
}
df = pd.DataFrame(data)
print(df)
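
To see what "columns of potentially different types" looks like in practice, one option is to inspect the dtypes and shape attributes. This is a small sketch that rebuilds the same DataFrame; the exact dtype reported for the text columns depends on the pandas version (object in many versions, a dedicated string dtype in newer ones).

```python
import pandas as pd

data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
}
df = pd.DataFrame(data)

# Each column carries its own dtype: Age is an integer column,
# while Name and City hold text.
print(df.dtypes)

# shape is a (rows, columns) tuple.
print(df.shape)
```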

2. Basic Operations on DataFrames

Viewing Data

- head(): View the first few rows of the DataFrame.

- tail(): View the last few rows of the DataFrame.

- info(): Get a summary of the DataFrame.

- describe(): Get descriptive statistics.

print(df.head())
print(df.tail())
df.info()  # info() prints its summary directly and returns None
print(df.describe())
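
The methods above also accept arguments. The sketch below, using the same sample data, shows head and tail with an explicit row count and illustrates that describe(), with default settings, summarizes only the numeric columns. The variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# head() and tail() return 5 rows by default; pass n to change that.
first_two = df.head(2)
last_one = df.tail(1)

# With default arguments, describe() covers numeric columns only (here: Age).
numeric_summary = df.describe()

print(first_two)
print(last_one)
print(numeric_summary)
```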

Selecting Data

- Using column names.

- Using row positions with iloc and row labels with loc.

# Select a column
print(df['Name'])

# Select multiple columns
print(df[['Name', 'City']])

# Select rows by integer position (the end position, 3, is excluded)
print(df.iloc[1:3])

# Select rows and columns by labels (the end label, 2, is included)
print(df.loc[0:2, ['Name', 'City']])
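
A common source of confusion is that iloc slices exclude the end position while loc slices include the end label. The following sketch compares the two on the same sample data with its default integer index.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# Position-based slice: rows at positions 0 and 1 (position 2 is excluded).
by_position = df.iloc[0:2]

# Label-based slice: rows labeled 0, 1 and 2 (label 2 is included).
by_label = df.loc[0:2]

print(len(by_position), len(by_label))
```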

3. Data Manipulation

Adding and Dropping Columns

- Adding new columns.

- Dropping columns.

# Adding a new column
df['Country'] = ['USA', 'France', 'Germany', 'UK']
print(df)

# Dropping a column
df = df.drop(columns=['Country'])
print(df)
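
New columns can also be computed from existing ones, and drop returns a new DataFrame rather than changing the original, which is why the example above reassigns df. A short sketch follows; the AgeNextYear column and the trimmed variable are hypothetical names used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

df['Country'] = ['USA', 'France', 'Germany', 'UK']

# Hypothetical derived column: computed element-wise from an existing column.
df['AgeNextYear'] = df['Age'] + 1

# drop() returns a new DataFrame; df itself still has the Country column.
trimmed = df.drop(columns=['Country'])

print(df.columns.tolist())
print(trimmed.columns.tolist())
```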

Filtering Data

- Using conditions to filter rows.

# Filtering rows where Age > 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)
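
Conditions can be combined. As a sketch on the same sample data: each condition goes in its own parentheses and is joined with & (and) or | (or), and isin tests membership in a list of values. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
})

# Both conditions must hold; the parentheses around each one are required.
older_not_paris = df[(df['Age'] > 25) & (df['City'] != 'Paris')]

# Keep rows whose City is one of the listed values.
in_europe = df[df['City'].isin(['Paris', 'Berlin'])]

print(older_not_paris)
print(in_europe)
```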

4. Handling Missing Data

- Checking for missing data.

- Filling missing data.

- Dropping missing data.

# Creating a DataFrame with missing values
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, None, 35, 32],
    'City': ['New York', 'Paris', None, 'London']
}
df = pd.DataFrame(data)

# Checking for missing data
print(df.isnull())

# Filling missing data (the mean age for 'Age', a placeholder for 'City')
df_filled = df.fillna({'Age': df['Age'].mean(), 'City': 'Unknown'})
print(df_filled)

# Dropping rows that contain any missing value
df_dropped = df.dropna()
print(df_dropped)
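
On larger tables the full True/False grid from isnull() is hard to read, so it can help to count missing values per column instead. The sketch below does that on the same data and also shows dropna restricted to one column through its subset parameter; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, None, 35, 32],
    'City': ['New York', 'Paris', None, 'London'],
})

# Number of missing values in each column.
missing_counts = df.isnull().sum()

# Drop a row only when its Age is missing; a missing City is tolerated.
has_age = df.dropna(subset=['Age'])

print(missing_counts)
print(has_age)
```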

5. Data Aggregation and Grouping

- Using groupby to group data and perform aggregation.

data = {
    'Category': ['A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40]
}
df = pd.DataFrame(data)

# Grouping by 'Category' and calculating the sum of 'Value'
grouped_df = df.groupby('Category').sum()
print(grouped_df)
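
A group can be summarized with several aggregations at once. As a sketch on the same data, agg takes a list of aggregation names and returns one column per aggregation; the summary variable name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'B'],
    'Value': [10, 20, 30, 40],
})

# One row per category, one column per aggregation.
summary = df.groupby('Category')['Value'].agg(['sum', 'mean', 'count'])

print(summary)
```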

6. Merging and Joining DataFrames

- Concatenation.

- Merging based on keys.

# Concatenation
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})
result = pd.concat([df1, df2])
print(result)

# Merging
left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})
result = pd.merge(left, right, on='key')
print(result)
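
In the example above the keys match exactly, so the join type does not matter. The sketch below changes the right table so that its keys only partly overlap (K3 is a hypothetical key added for illustration) and compares the how options of merge. It also shows ignore_index=True, which renumbers the rows after concatenation instead of repeating 0 and 1.

```python
import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
# Hypothetical variation: K3 appears only on the right, K2 only on the left.
right = pd.DataFrame({'key': ['K0', 'K1', 'K3'], 'B': ['B0', 'B1', 'B3']})

# Default (inner) join: only keys present in both tables.
inner = pd.merge(left, right, on='key')

# Left join: every row of left; B is missing where right has no match.
left_join = pd.merge(left, right, on='key', how='left')

# Outer join: every key from either table.
outer = pd.merge(left, right, on='key', how='outer')

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

# Without ignore_index the result keeps the original labels 0, 1, 0, 1.
stacked = pd.concat([df1, df2], ignore_index=True)

print(inner)
print(left_join)
print(outer)
print(stacked)
```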

7. Advanced Data Operations

Pivot Tables

- Creating pivot tables to summarize data.

data = {
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'City': ['New York', 'Paris', 'Berlin', 'London'],
    'Sales': [200, 150, 300, 250]
}
df = pd.DataFrame(data)

pivot_table = df.pivot_table(values='Sales', index='City', columns='Date')
print(pivot_table)
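
In the data above each city appears on only one date, so the pivot table is mostly empty and nothing is actually aggregated. The sketch below uses made-up sales records with repeated city and date pairs to show what the summarizing step does: by default pivot_table averages the values in each cell, aggfunc selects a different aggregation, and fill_value replaces empty cells.

```python
import pandas as pd

# Hypothetical records: Paris has two sales on 2021-01-01.
sales = pd.DataFrame({
    'Date': ['2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02'],
    'City': ['Paris', 'Paris', 'Paris', 'Berlin'],
    'Sales': [100, 50, 200, 300],
})

# Sum the sales per city and date; cells with no records become 0.
totals = sales.pivot_table(
    values='Sales', index='City', columns='Date',
    aggfunc='sum', fill_value=0,
)

# The default aggregation is the mean of each cell's values.
averages = sales.pivot_table(values='Sales', index='City', columns='Date')

print(totals)
print(averages)
```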

Applying Functions

- Using apply to apply functions to data.

# Applying a lambda function to a column (a 10% increase)
df['Sales'] = df['Sales'].apply(lambda x: x * 1.1)
print(df)
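
For simple arithmetic like the 10% increase above, the same result can be obtained by operating on the whole column directly, which is usually the more idiomatic choice. apply is more useful when the function needs custom logic, for example combining several columns of a row with axis=1. The sketch below shows both; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['New York', 'Paris', 'Berlin', 'London'],
    'Sales': [200, 150, 300, 250],
})

# Element-wise function applied to one column.
with_apply = df['Sales'].apply(lambda x: x * 1.1)

# Equivalent column arithmetic without apply.
vectorized = df['Sales'] * 1.1

# axis=1 passes each row to the function, so several columns can be combined.
labels = df.apply(lambda row: f"{row['City']}: {row['Sales']}", axis=1)

print(with_apply)
print(vectorized)
print(labels)
```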

Conclusion

This is a brief overview of some of the basic and intermediate functionalities of pandas. As you work more with pandas, you'll discover many more powerful features and methods that can help you manipulate and analyze data efficiently. Practice is key, so try to work on different datasets and use the pandas documentation for further reference.
