Pandas Vs SQL Concepts Final

Uploaded by

nvinaysastry

Pandas vs SQL: Essential Concepts for Beginners

A Quick, Practice-Oriented Guide
Less read, Practice More, Master Quickly
Merging and Joining
SQL:
SELECT * FROM Orders
INNER JOIN Customers
ON Orders.Customer_ID = Customers.Customer_ID;

Pandas:
merged_df = pd.merge(orders_df, customers_df, on='Customer_ID', how='inner')

Note: Practice all join types (inner, outer, left, right).
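
To practice the four join types side by side, a minimal sketch on made-up data may help. The column names follow the slides; the values and the `row_counts` helper are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data: customer 4 has an order but no customer row,
# and customer 3 has a customer row but no order.
orders_df = pd.DataFrame({
    "Order_ID": [101, 102, 103],
    "Customer_ID": [1, 2, 4],
})
customers_df = pd.DataFrame({
    "Customer_ID": [1, 2, 3],
    "Customer_Name": ["Asha", "Ben", "Chen"],
})

# Same call, four values of `how`; only the handling of unmatched keys changes.
row_counts = {
    how: len(pd.merge(orders_df, customers_df, on="Customer_ID", how=how))
    for how in ["inner", "left", "right", "outer"]
}
print(row_counts)  # {'inner': 2, 'left': 3, 'right': 3, 'outer': 4}
```

The inner join keeps only keys present in both tables, left and right keep every row of one side, and outer keeps every key from both.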

Outer Joins
SQL:
SELECT * FROM Customers
FULL OUTER JOIN Orders
ON Customers.Customer_ID = Orders.Customer_ID;

Pandas:
outer_join_df = pd.merge(customers_df, orders_df, on='Customer_ID', how='outer')

Purpose: Use outer joins to include unmatched rows from both tables.

Left or right join:
SQL:
SELECT Orders.Order_ID, Orders.Customer_ID, Customers.Customer_Name
FROM Orders
LEFT JOIN Customers  -- or RIGHT JOIN
ON Orders.Customer_ID = Customers.Customer_ID;

Pandas:
join_df = pd.merge(orders_df, customers_df, on='Customer_ID', how='left')  # or how='right'
print(join_df)
Unmatched Rows with Outer Joins
SQL:

SELECT * FROM Customers

FULL OUTER JOIN Orders

ON Customers.Customer_ID = Orders.Customer_ID

WHERE Orders.Customer_ID IS NULL OR Customers.Customer_ID IS NULL;

Pandas:

unmatched_rows = outer_join_df[(outer_join_df['Order_ID'].isna()) | (outer_join_df['Customer_Name'].isna())]

Why? Identify gaps in data to ensure completeness.
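
An alternative way to find the gaps, sketched here on made-up data, is the `indicator=True` option of `pd.merge`, which adds a `_merge` column saying where each row came from. The sample values and the `flagged_df` name are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data: customer 3 never ordered, order 103 has no customer row.
customers_df = pd.DataFrame({
    "Customer_ID": [1, 2, 3],
    "Customer_Name": ["Asha", "Ben", "Chen"],
})
orders_df = pd.DataFrame({
    "Order_ID": [101, 102, 103],
    "Customer_ID": [1, 2, 4],
})

# indicator=True adds a `_merge` column: 'left_only', 'right_only' or 'both'.
flagged_df = pd.merge(customers_df, orders_df, on="Customer_ID",
                      how="outer", indicator=True)

# Keep only the rows that matched on one side.
unmatched_rows = flagged_df[flagged_df["_merge"] != "both"]
print(unmatched_rows[["Customer_ID", "_merge"]])
```

This avoids relying on a particular column being null, which can mislead when that column already contains missing values before the join.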

Basic Set Operations (Union)
SQL:

SELECT DISTINCT Category FROM Orders WHERE Region = 'West'

UNION

SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:

west_categories = orders_df[orders_df['Region'] == 'West']['Category']

east_categories = orders_df[orders_df['Region'] == 'East']['Category']

union_categories = pd.concat([west_categories, east_categories]).drop_duplicates()

Find common categories (Intersect)
SQL:

SELECT DISTINCT Category FROM Orders WHERE Region = 'West'

INTERSECT
SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:
intersect_categories = pd.merge(west_categories.drop_duplicates(), east_categories.drop_duplicates(), on='Category')
Categories exclusive to West (Difference)
SQL:
SELECT DISTINCT Category FROM Orders WHERE Region = 'West'
EXCEPT
SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:
difference_categories = west_categories[~west_categories.isin(east_categories)].drop_duplicates()

Practice All: UNION, INTERSECT, DIFFERENCE.
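
The three set operations can be tried together on a tiny frame. This is a sketch under assumptions: the rows are made up, and the short names `west` and `east` are illustrative. Each side is de-duplicated first so the results behave like SQL's DISTINCT sets.

```python
import pandas as pd

# Hypothetical toy data using the Region and Category columns from the slides.
orders_df = pd.DataFrame({
    "Region": ["West", "West", "West", "East", "East"],
    "Category": ["Furniture", "Technology", "Furniture",
                 "Technology", "Office Supplies"],
})

# Distinct categories per region.
west = orders_df.loc[orders_df["Region"] == "West", "Category"].drop_duplicates()
east = orders_df.loc[orders_df["Region"] == "East", "Category"].drop_duplicates()

union_categories = pd.concat([west, east]).drop_duplicates()  # UNION
intersect_categories = west[west.isin(east)]                  # INTERSECT
difference_categories = west[~west.isin(east)]                # EXCEPT

print(sorted(union_categories))     # ['Furniture', 'Office Supplies', 'Technology']
print(list(intersect_categories))   # ['Technology']
print(list(difference_categories))  # ['Furniture']
```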

Slicing and Indexing
Filtering by a column:
SQL:
SELECT * FROM Orders WHERE Region = 'Central US';

Pandas:
filtered_df = orders_df[orders_df['Region'] == 'Central US']

Subset specific columns:
SQL:
SELECT Order_ID, Sales FROM Orders;

Pandas:
subset_df = orders_df[['Order_ID', 'Sales']]
Indexing Techniques
Explore Pandas loc/iloc for advanced indexing.

SQL:
SELECT * FROM Orders LIMIT 10 OFFSET 5;

Pandas:
sliced_df = orders_df.iloc[5:15]

Pandas:
Create a MultiIndex:
multi_index_df = orders_df.set_index(['Region', 'Category'])

Access data for a specific `Region` and `Category`:
multi_index_df.loc[('West', 'Furniture')]
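
The difference between `iloc` (position-based) and `loc` (label-based) is easiest to see when the index labels are not 0, 1, 2, and so on. The sketch below uses made-up rows and a hypothetical index starting at 10. It also sorts the MultiIndex before the lookup and passes `, :` to `loc` so the tuple is read unambiguously as a row key.

```python
import pandas as pd

# Hypothetical toy data; the index labels 10..14 are chosen so that
# positions and labels differ.
orders_df = pd.DataFrame({
    "Region": ["West", "West", "East", "East", "West"],
    "Category": ["Furniture", "Technology", "Furniture",
                 "Technology", "Furniture"],
    "Sales": [100.0, 250.0, 80.0, 300.0, 120.0],
}, index=[10, 11, 12, 13, 14])

by_position = orders_df.iloc[1:3]  # positions 1 and 2; the end is exclusive
by_label = orders_df.loc[11:13]    # labels 11 through 13; the end is inclusive

# Sorting the MultiIndex keeps lookups fast and avoids performance warnings.
multi_index_df = orders_df.set_index(["Region", "Category"]).sort_index()
west_furniture = multi_index_df.loc[("West", "Furniture"), :]

print(len(by_position), len(by_label))          # 2 3
print(sorted(west_furniture["Sales"].tolist())) # [100.0, 120.0]
```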
Ranking and Row Numbering
SQL:
SELECT *, RANK() OVER (ORDER BY Sales DESC) AS rank FROM Orders;

Pandas:
orders_df['rank'] = orders_df['Sales'].rank(ascending=False, method='min')

Partition by Region:
SQL:
SELECT *, RANK() OVER (PARTITION BY Region ORDER BY Sales DESC) AS rank FROM Orders;

Pandas:
orders_df['rank'] = orders_df.groupby('Region')['Sales'].rank(ascending=False, method='min')

Note: method='min' matches SQL RANK(). The pandas default, method='average', gives tied rows a fractional rank.
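
A small worked example of ranking within each region, on made-up sales figures. It shows the pandas default `method='average'` next to `method='min'`, which is the one that reproduces SQL's `RANK()` when values tie. The column names `rank_default` and `rank_in_region` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical toy data: the two West rows with Sales = 300 are tied.
orders_df = pd.DataFrame({
    "Region": ["West", "West", "West", "East", "East"],
    "Sales": [300, 300, 100, 50, 70],
})

grouped_sales = orders_df.groupby("Region")["Sales"]

# Default method='average': tied rows get the mean of their positions (1.5).
orders_df["rank_default"] = grouped_sales.rank(ascending=False)

# method='min': tied rows share the lowest position, like SQL RANK().
orders_df["rank_in_region"] = (
    grouped_sales.rank(ascending=False, method="min").astype(int)
)

print(orders_df)
# rank_default:   1.5, 1.5, 3.0, 2.0, 1.0
# rank_in_region: 1,   1,   3,   2,   1
```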
Ranking and Row Numbering
Rank products based on their sales (highest first).
SQL:
SELECT Product_ID, Sales, RANK() OVER (ORDER BY Sales DESC) AS Rank FROM Orders;

Pandas:
orders_df['Rank'] = orders_df['Sales'].rank(ascending=False, method='min')
print(orders_df[['Product_ID', 'Sales', 'Rank']])

Dense Rank Example
SQL:
SELECT Product_ID, Sales, DENSE_RANK() OVER (ORDER BY Sales DESC) AS Dense_Rank FROM Orders;

Pandas:
orders_df['Dense_Rank'] = orders_df['Sales'].rank(ascending=False, method='dense')
print(orders_df[['Product_ID', 'Sales', 'Dense_Rank']])
Row Numbering
SQL:
SELECT Product_ID, Sales, ROW_NUMBER() OVER (ORDER BY Sales DESC) AS Row_Number FROM Orders;

Pandas:
orders_df['Row_Number'] = orders_df['Sales'].rank(ascending=False, method='first')
print(orders_df[['Product_ID', 'Sales', 'Row_Number']])

Key Differences to Highlight:

• Rank: Tied rows share a rank, and the ranks that follow are skipped (1, 2, 2, 4).
• Dense Rank: Tied rows share a rank, with no gaps in the sequence (1, 2, 2, 3).
• Row Number: Every row gets a unique number, even with ties (1, 2, 3, 4).
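
The three behaviours can be compared in one table. This is a minimal sketch on four made-up sales values with one tie; the `ranks` frame is illustrative and not part of the original slides.

```python
import pandas as pd

# Hypothetical sales values; the two 300s are tied.
sales = pd.Series([500, 300, 300, 100], name="Sales")

ranks = pd.DataFrame({
    "Sales": sales,
    # RANK(): ties share the lowest position, the next rank is skipped.
    "Rank": sales.rank(ascending=False, method="min").astype(int),
    # DENSE_RANK(): ties share a rank, no gaps afterwards.
    "Dense_Rank": sales.rank(ascending=False, method="dense").astype(int),
    # ROW_NUMBER(): ties are broken by order of appearance.
    "Row_Number": sales.rank(ascending=False, method="first").astype(int),
})
print(ranks)
```

Only the row after the tie differs between `Rank` (4) and `Dense_Rank` (3), while `Row_Number` is the only column that separates the tied rows themselves.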
The Power of Practice
1. Explore the previous PowerPoints for **data filtering** and **aggregation**.
2. Practice concepts side-by-side in SQL and Pandas.
3. Use the Global Superstore dataset for meaningful insights.
4. Dataset link: https://github.com/codewithVnsastry/learnNdo
"Less read, practice more: your expertise is just a few queries away!"
