Pandas Vs SQL Concepts Updated

Uploaded by

nvinaysastry

Pandas vs SQL: Essential Concepts for Beginners

A Quick, Practice-Oriented Guide
"Read Less, Practice More, Master Quickly"
Merging and Joining
SQL:
SELECT * FROM Orders
INNER JOIN Customers
ON Orders.Customer_ID = Customers.Customer_ID;

Pandas:
merged_df = pd.merge(orders_df, customers_df, on='Customer_ID', how='inner')

Note: Practice all join types (inner, outer, left, right).
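
A minimal sketch for practicing the four join types, assuming two tiny made-up tables (the IDs and names below are hypothetical, not from any real dataset). It runs the same merge with each `how` value and counts the rows that come back, which makes the difference between the joins easy to see.

```python
import pandas as pd

# Hypothetical sample data; real Orders/Customers tables will differ.
orders_df = pd.DataFrame({
    "Order_ID": [101, 102, 103],
    "Customer_ID": [1, 2, 4],   # customer 4 has no row in customers_df
})
customers_df = pd.DataFrame({
    "Customer_ID": [1, 2, 3],   # customer 3 has placed no orders
    "Customer_Name": ["Asha", "Ben", "Chen"],
})

# Run the same merge with each join type and record the row count.
row_counts = {
    how: len(pd.merge(orders_df, customers_df, on="Customer_ID", how=how))
    for how in ["inner", "left", "right", "outer"]
}
print(row_counts)
```

Inner keeps only matched keys, left and right keep every row of one side, and outer keeps everything from both sides.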

Outer Joins
SQL:
SELECT * FROM Customers
FULL OUTER JOIN Orders
ON Customers.Customer_ID = Orders.Customer_ID;

Pandas:
outer_join_df = pd.merge(customers_df, orders_df, on='Customer_ID', how='outer')

Purpose: Use outer joins to include unmatched rows from both tables.
Unmatched Rows with Outer Joins
SQL:
SELECT * FROM Customers
FULL OUTER JOIN Orders
ON Customers.Customer_ID = Orders.Customer_ID
WHERE Orders.Customer_ID IS NULL OR Customers.Customer_ID IS NULL;

Pandas:
unmatched_rows = outer_join_df[(outer_join_df['Order_ID'].isna()) |
(outer_join_df['Customer_Name'].isna())]

Why? Identify gaps in data to ensure completeness.
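
An alternative sketch, assuming the same kind of toy tables (all values are hypothetical). Checking `Order_ID` or `Customer_Name` for NaN can misfire if those columns already contain missing values in the source data. The `indicator=True` option of `pd.merge` instead labels each row by where it came from, so unmatched rows can be picked out directly.

```python
import pandas as pd

# Hypothetical sample data.
customers_df = pd.DataFrame({
    "Customer_ID": [1, 2, 3],
    "Customer_Name": ["Asha", "Ben", "Chen"],
})
orders_df = pd.DataFrame({
    "Order_ID": [101, 102, 103],
    "Customer_ID": [1, 2, 4],
})

# indicator=True adds a "_merge" column: "left_only", "right_only", or "both".
outer_join_df = pd.merge(customers_df, orders_df, on="Customer_ID",
                         how="outer", indicator=True)

# Keep only the rows that failed to match on one side.
unmatched_rows = outer_join_df[outer_join_df["_merge"] != "both"]

customers_without_orders = unmatched_rows.loc[
    unmatched_rows["_merge"] == "left_only", "Customer_ID"].tolist()
orders_without_customer = unmatched_rows.loc[
    unmatched_rows["_merge"] == "right_only", "Customer_ID"].tolist()
print(customers_without_orders, orders_without_customer)
```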

Basic Set Operations (Union)
SQL:
SELECT DISTINCT Category FROM Orders WHERE Region = 'West'
UNION
SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:
west_categories = orders_df[orders_df['Region'] == 'West']['Category']
east_categories = orders_df[orders_df['Region'] == 'East']['Category']
union_categories = pd.concat([west_categories, east_categories]).drop_duplicates()

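Find common categories (Intersect)
SQL:
SELECT DISTINCT Category FROM Orders WHERE Region = 'West'
INTERSECT
SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:
intersect_categories = pd.merge(west_categories.drop_duplicates(), east_categories.drop_duplicates(), on='Category')

Note: Drop duplicates before merging. Otherwise every repeated category is matched many-to-many and appears multiple times, whereas INTERSECT returns distinct values.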
Categories exclusive to the West (DIFFERENCE):
SQL:
SELECT DISTINCT Category FROM Orders WHERE Region = 'West'
EXCEPT
SELECT DISTINCT Category FROM Orders WHERE Region = 'East';

Pandas:
difference_categories = west_categories[~west_categories.isin(east_categories)].drop_duplicates()

Practice All: UNION, INTERSECT, DIFFERENCE.
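
A small runnable sketch of all three set operations, assuming a toy `orders_df` with made-up rows. It uses `isin` plus `drop_duplicates` for INTERSECT and EXCEPT, which is one common way to do it in pandas, not the only one.

```python
import pandas as pd

# Hypothetical sample data.
orders_df = pd.DataFrame({
    "Region": ["West", "West", "West", "East", "East"],
    "Category": ["Furniture", "Technology", "Furniture",
                 "Technology", "Office Supplies"],
})

west_categories = orders_df[orders_df["Region"] == "West"]["Category"]
east_categories = orders_df[orders_df["Region"] == "East"]["Category"]

# UNION: stack both, then drop repeats.
union_categories = pd.concat([west_categories, east_categories]).drop_duplicates()

# INTERSECT: West values that also appear in East, de-duplicated.
intersect_categories = west_categories[
    west_categories.isin(east_categories)].drop_duplicates()

# EXCEPT: West values that never appear in East, de-duplicated.
difference_categories = west_categories[
    ~west_categories.isin(east_categories)].drop_duplicates()

print(sorted(union_categories))
print(list(intersect_categories), list(difference_categories))
```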

Slicing and Indexing
Filtering by a column:
SQL:
SELECT * FROM Orders WHERE Region = 'West';

Pandas:
filtered_df = orders_df[orders_df['Region'] == 'West']

Subset specific columns:
SQL:
SELECT Order_ID, Sales FROM Orders;

Pandas:
subset_df = orders_df[['Order_ID', 'Sales']]
Indexing Techniques
Explore Pandas loc/iloc for advanced indexing.
SQL:
SELECT * FROM Orders LIMIT 10 OFFSET 5;

Pandas:
sliced_df = orders_df.iloc[5:15]

Create a MultiIndex:
multi_index_df = orders_df.set_index(['Region', 'Category'])

Access data for a specific `Region` and `Category`:
multi_index_df.loc[('West', 'Furniture')]
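
A short sketch that puts `iloc`, `loc`, and a MultiIndex lookup side by side, assuming a six-row toy table (values are hypothetical). Sorting the index after `set_index` is an addition here; it is a common habit that keeps tuple lookups efficient.

```python
import pandas as pd

# Hypothetical sample data.
orders_df = pd.DataFrame({
    "Order_ID": [1, 2, 3, 4, 5, 6],
    "Region": ["West", "East", "West", "East", "West", "East"],
    "Category": ["Furniture", "Furniture", "Technology",
                 "Technology", "Furniture", "Furniture"],
    "Sales": [100.0, 250.0, 300.0, 80.0, 50.0, 120.0],
})

# iloc is positional: skip 2 rows, take the next 3 (like LIMIT 3 OFFSET 2).
sliced_df = orders_df.iloc[2:5]

# loc is label-based: a boolean mask for rows plus a list of column labels.
west_sales = orders_df.loc[orders_df["Region"] == "West", ["Order_ID", "Sales"]]

# Build the MultiIndex, sort it, then select one (Region, Category) pair.
multi_index_df = orders_df.set_index(["Region", "Category"]).sort_index()
west_furniture = multi_index_df.loc[("West", "Furniture"), :]
print(west_furniture)
```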
Ranking and Row Numbering
SQL:
SELECT *, RANK() OVER (ORDER BY Sales DESC) AS rank FROM Orders;

Pandas:
orders_df['rank'] = orders_df['Sales'].rank(ascending=False, method='min')

Partition by Region:
SQL:
SELECT *, RANK() OVER (PARTITION BY Region ORDER BY Sales DESC) AS rank
FROM Orders;

Pandas:
orders_df['rank'] = orders_df.groupby('Region')['Sales'].rank(ascending=False, method='min')

Note: Pandas defaults to method='average' for ties. Use method='min' to match SQL RANK().
Ranking and Row Numbering: Dense Rank Example
Rank products based on their sales (highest first).
SQL:
SELECT Product_ID, Sales, RANK() OVER (ORDER BY Sales DESC) AS Rank FROM Orders;

Pandas:
orders_df['Rank'] = orders_df['Sales'].rank(ascending=False, method='min')
print(orders_df[['Product_ID', 'Sales', 'Rank']])

Dense rank:
SQL:
SELECT Product_ID, Sales, DENSE_RANK() OVER (ORDER BY Sales DESC) AS Dense_Rank FROM Orders;

Pandas:
orders_df['Dense_Rank'] = orders_df['Sales'].rank(ascending=False, method='dense')
print(orders_df[['Product_ID', 'Sales', 'Dense_Rank']])
Row Numbering
SQL:
SELECT Product_ID, Sales, ROW_NUMBER() OVER (ORDER BY Sales DESC) AS Row_Number FROM Orders;

Pandas:
orders_df['Row_Number'] = orders_df['Sales'].rank(ascending=False, method='first')
print(orders_df[['Product_ID', 'Sales', 'Row_Number']])
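
The three rank methods only differ when there are ties. A quick sketch on a toy table with one tie (data is hypothetical) shows the difference. Note that pandas breaks ties for `method='first'` by order of appearance, while SQL `ROW_NUMBER()` gives no guaranteed order among tied rows unless the `ORDER BY` includes a tiebreaker column.

```python
import pandas as pd

# Hypothetical sample data: P2 and P3 tie on Sales.
orders_df = pd.DataFrame({
    "Product_ID": ["P1", "P2", "P3", "P4"],
    "Sales": [500, 300, 300, 100],
})

sales = orders_df["Sales"]
# RANK(): tied rows share the lowest rank, then a gap follows.
orders_df["Rank"] = sales.rank(ascending=False, method="min").astype(int)
# DENSE_RANK(): tied rows share a rank, no gap follows.
orders_df["Dense_Rank"] = sales.rank(ascending=False, method="dense").astype(int)
# ROW_NUMBER(): every row gets a unique number.
orders_df["Row_Number"] = sales.rank(ascending=False, method="first").astype(int)
print(orders_df)
```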
Merging Deep Dive
1. Merge `Orders` with `Customers` using `INNER JOIN`.
2. Experiment with mismatches using `LEFT JOIN`.
3. Use `OUTER JOIN` to identify customers with no orders.

Why? Mastering joins is crucial for data integration.

Real Examples of Set Operations
- Find common categories sold in West and East regions (INTERSECT).
- Combine all categories across both regions (UNION).
- Identify categories sold only in one region (DIFFERENCE).

Exercise: Implement these with the Global Superstore dataset in SQL and Pandas.
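
One way to try the SQL side of this exercise without installing a database is Python's built-in `sqlite3` module. The sketch below uses a four-row stand-in table rather than the Global Superstore data, and the helper `categories` is a made-up name for illustration.

```python
import sqlite3

# In-memory database with a tiny, hypothetical Orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Region TEXT, Category TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("West", "Furniture"), ("West", "Technology"),
     ("East", "Technology"), ("East", "Office Supplies")],
)

west = "SELECT Category FROM Orders WHERE Region = 'West'"
east = "SELECT Category FROM Orders WHERE Region = 'East'"

def categories(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

common = categories(f"{west} INTERSECT {east}")
combined = categories(f"{west} UNION {east}")
west_only = categories(f"{west} EXCEPT {east}")
east_only = categories(f"{east} EXCEPT {west}")
print(common, combined, west_only, east_only)
```

"Sold only in one region" needs EXCEPT in both directions, which is why the sketch computes `west_only` and `east_only` separately.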
Advanced Indexing Techniques
- Multi-level indexing: Index by `Region` and `Category`.
- Combine WHERE clauses: Filter by `Ship Mode` and `Profit`.

Why? Efficient data selection = faster analysis.
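
A possible sketch of the combined filter, the pandas counterpart of `WHERE "Ship Mode" = 'First Class' AND Profit > 0`. The rows and the threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data.
orders_df = pd.DataFrame({
    "Order_ID": [1, 2, 3, 4],
    "Ship Mode": ["First Class", "Standard Class", "First Class", "Second Class"],
    "Profit": [120.0, 40.0, -15.0, 60.0],
})

# Each condition needs its own parentheses; & is AND, | is OR.
mask = (orders_df["Ship Mode"] == "First Class") & (orders_df["Profit"] > 0)
profitable_first_class = orders_df[mask]

# query() reads closer to SQL; backticks quote a column name containing a space.
same_rows = orders_df.query("`Ship Mode` == 'First Class' and Profit > 0")
print(profitable_first_class)
```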

Ranking in Real Scenarios
- Rank customers by total `Sales`.
- Assign row numbers for the top 5 products by category.
- Partition rankings by `Region` for localized insights.

Task: Replicate ranking in both tools.
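
A starting point for the pandas half of this task, assuming a six-row toy table (all values hypothetical). It ranks customers by total `Sales` and numbers products within each `Category`. The cut-off is 2 instead of 5 only to keep the toy output short.

```python
import pandas as pd

# Hypothetical sample data.
orders_df = pd.DataFrame({
    "Customer_ID": ["C1", "C2", "C1", "C3", "C2", "C3"],
    "Category": ["Furniture", "Furniture", "Technology",
                 "Technology", "Furniture", "Technology"],
    "Product_ID": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Sales": [200, 150, 400, 100, 300, 250],
})

# Rank customers by total Sales (GROUP BY, then RANK() over the sums).
customer_totals = orders_df.groupby("Customer_ID")["Sales"].sum().reset_index()
customer_totals["Rank"] = customer_totals["Sales"].rank(
    ascending=False, method="min").astype(int)

# Row numbers within each Category (PARTITION BY Category ORDER BY Sales DESC).
orders_df["Row_Number"] = (
    orders_df.groupby("Category")["Sales"]
    .rank(ascending=False, method="first")
    .astype(int)
)

top_n = 2  # the task asks for 5; 2 suits this small example
top_products = orders_df[orders_df["Row_Number"] <= top_n]
print(customer_totals)
print(top_products)
```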

The Power of Practice
1. Explore my previous PowerPoints for **data filtering** and **aggregation**.
2. Practice concepts side-by-side in SQL and Pandas.
3. Use the Global Superstore dataset for meaningful insights.

Tagline: "Read less, practice more – your expertise is just a few queries away!"
