Python Class 6 Assignment Solution

Uploaded by

Arpit Dubey


PANDAS:

1- Create a Pandas DataFrame from the given data and add a new column “Voter” based on
voter age: if age > 18, the Voter column should be “Yes”; otherwise it should be “No”.

raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}

Solution:

import pandas as pd
import numpy as np

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}

df = pd.DataFrame(raw_Data)

# Create a new column "Voter" based on voter age.
# A missing age (NaN) fails the > 18 comparison, so that row is labelled 'No'.
df['Voter'] = np.where(df['Voter_age'] > 18, 'Yes', 'No')

print(df)
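
The np.where call labels the missing age of Geek8 as 'No', because NaN > 18 evaluates to False. If a missing age should stay visibly unknown, one possible variant is np.select with an explicit extra label. This is a sketch under assumptions: the 'Unknown' label and the choice to treat age 18 as 'No' are not part of the assignment.

```python
import numpy as np
import pandas as pd

raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}
df = pd.DataFrame(raw_Data)

# Conditions are checked in order; the first match wins.
conditions = [df['Voter_age'].isna(), df['Voter_age'] > 18]
labels = ['Unknown', 'Yes']  # 'Unknown' is an assumed label for missing ages

# Rows matching neither condition (age 18 or younger) fall through to 'No'.
df['Voter'] = np.select(conditions, labels, default='No')

print(df)
```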

2 – Create a Pandas DataFrame from the given data and combine the First and Last columns into
one column, Full Name, so that the output contains only Full Name and Age; then set the Age
column as the index.

raw_Data = {'First': ['Manan ', 'Raghav ', 'Sunny '],
            'Last': ['Goel', 'Sharma', 'Chawla'],
            'Age': [12, 24, 56]}

Solution:

raw_Data = {'First': ['Manan', 'Raghav', 'Sunny'],
            'Last': ['Goel', 'Sharma', 'Chawla'],
            'Age': [12, 24, 56]}

df = pd.DataFrame(raw_Data)

# Combine First and Last columns into Full Name
# (str.strip() guards against trailing spaces such as 'Manan ' in the task data)
df['Full Name'] = df['First'].str.strip() + ' ' + df['Last'].str.strip()

# Keep only Full Name and Age, then set Age as index
df = df[['Full Name', 'Age']].set_index('Age')

print(df)

3- Create a Pandas DataFrame from the given data -

raw_Data = {'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
            'Product': ['Umbrella', 'Matress', 'Badminton', 'Shuttle'],
            'Price': [1250, 1450, 1550, 400],
            'Expense': [21525220.653, 31125840.875, 23135428.768, 56245263.942]}

a- Add Index as Item1, Item2, Item3, Item4

b- Find the index labels of all items whose ‘Price’ is greater than 1000.

c- Replace products using map() with their respective codes: Umbrella: 'U', Matress: 'M',
Badminton: 'B', Shuttle: 'S'

d- Round off the Expense column values to two decimal places.

e- Create a new column called ‘Discounted_Price’ by applying a 10% discount to the existing
‘Price’ column (try using a lambda function).

f- Convert the column type of “Date” to datetime format

g- Create a column rank which ranks the products based on the price (one with highest price will
be rank 1).

Solution:

raw_Data = {'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
            'Product': ['Umbrella', 'Matress', 'Badminton', 'Shuttle'],
            'Price': [1250, 1450, 1550, 400],
            'Expense': [21525220.653, 31125840.875, 23135428.768, 56245263.942]}

df = pd.DataFrame(raw_Data)

# Task a: Add Index as Item1, Item2, Item3, Item4
df.index = ['Item1', 'Item2', 'Item3', 'Item4']

# Task b: Find index labels with Price > 1000
indexes_with_price_gt_1000 = df[df['Price'] > 1000].index.tolist()
print(indexes_with_price_gt_1000)

# Task c: Replace products using map()
product_map = {'Umbrella': 'U', 'Matress': 'M', 'Badminton': 'B', 'Shuttle': 'S'}
df['Product'] = df['Product'].map(product_map)

# Task d: Round off Expense column to two decimal places
df['Expense'] = df['Expense'].round(2)

# Task e: Create 'Discounted_Price' column with 10% discount (using a lambda, as the task suggests)
df['Discounted_Price'] = df['Price'].apply(lambda price: price * 0.9)

# Task f: Convert 'Date' column to datetime format.
# The dates are day-first ('13/2/2011' cannot be month 13), so the format is stated explicitly.
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Task g: Create 'Rank' column based on price (highest price gets rank 1)
df['Rank'] = df['Price'].rank(ascending=False).astype(int)

print(df)
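
The four prices in this data set are all different, so rank() returns whole numbers. With tied prices, the default method='average' yields fractional ranks, which astype(int) would silently truncate. The sketch below uses made-up prices (hypothetical data, not from the assignment) to show method='dense' as one way to keep integer ranks:

```python
import pandas as pd

# Hypothetical prices with a tie, for illustration only
prices = pd.Series([1250, 1550, 1550, 400],
                   index=['Item1', 'Item2', 'Item3', 'Item4'])

# Default method='average': tied items share the mean of their positions
average_rank = prices.rank(ascending=False)

# method='dense': tied items share a rank and the next rank is not skipped
dense_rank = prices.rank(ascending=False, method='dense').astype(int)

print(average_rank.tolist())
print(dense_rank.tolist())
```

Whether tied products should share a rank is a judgment call; method='first' would instead break ties by order of appearance.
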
Assignment: Exploring NBA Player Data

Download the nba.csv file containing NBA player data. Complete the following tasks using
Python, Pandas, and data visualization libraries:

1. Load Data:

● Load the nba.csv data into a Pandas DataFrame.

● Display basic information about the DataFrame.

2. Data Cleaning:

● Handle missing values by either removing or imputing them.

● Remove duplicate rows.

3. Data Transformation:

● Create a new column 'BMI' (Body Mass Index) using the formula: BMI = (weight in
pounds / (height in inches)^2) * 703, assuming a fixed height value of 70 inches (5 feet
10 inches).

4. Exploratory Data Analysis (EDA):

● Display summary statistics of the 'age', 'weight', and 'salary' columns.

● Calculate the average age, weight, and salary of players in each 'position' category.

5. Data Visualization:

● Create a histogram of player ages.

● Create a box plot of player salaries for each 'position'.

● Plot a scatter plot of 'age' vs. 'salary' with a different color for each 'position'.

6. Top Players:

● Display the top 10 players with the highest salaries.

7. College Analysis:

● Determine the top 5 colleges with the most represented players.

8. Position Distribution:

● Plot a pie chart to show the distribution of players across different 'positions'.

9. Team Analysis:

● Display the average salary of players for each 'team'.

● Plot a bar chart to visualize the average salary of players for each 'team'.

10. Extras

● Get the index at which minimum weight value is present.

● Sort the rows alphabetically by name (the order of the original DataFrame should not
change).

● Create a Series from the “name” column of the DataFrame and display its first and last 10 values.

Guidelines:

1. Write Python code to complete each task.

2. Provide comments explaining your code.

3. Use meaningful variable names.

4. Include necessary library imports.

5. Present your findings in a clear and organized manner.

6. Feel free to use additional code cells for each task.

Solution:

1. Load Data:

import pandas as pd

# Load the data into a Pandas DataFrame
df = pd.read_csv('nba.csv')

# Display basic information about the DataFrame
# (df.info() prints its report directly and returns None, so it is not wrapped in print)
df.info()

print(df.head())

2. Data Cleaning:

# Handle missing values: drop every row that has a missing value in any column
df.dropna(inplace=True)

# Remove duplicate rows
df.drop_duplicates(inplace=True)
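
The task also allows imputing instead of dropping. A minimal sketch on a small made-up frame fills numeric gaps with the column median and text gaps with a placeholder. The rows are hypothetical, the column names Name, Age, Salary and College are assumed to match nba.csv, and the 'Unknown' placeholder is an arbitrary choice:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows, for illustration only
sample = pd.DataFrame({
    'Name': ['Player A', 'Player B', 'Player C', 'Player D'],
    'Age': [25.0, np.nan, 31.0, 22.0],
    'Salary': [5_000_000.0, 1_200_000.0, np.nan, 900_000.0],
    'College': ['Duke', None, 'Kentucky', None],
})

imputed = sample.copy()

# Numeric columns: replace missing values with the column median
for column in ['Age', 'Salary']:
    imputed[column] = imputed[column].fillna(imputed[column].median())

# Text column: mark missing colleges explicitly
imputed['College'] = imputed['College'].fillna('Unknown')

print(imputed)
```

Imputing keeps every player in the analysis, at the cost of introducing values that were never observed.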

3. Data Transformation: Create 'BMI' column using a fixed height value

# Assuming a fixed height value of 70 inches (5 feet 10 inches)

fixed_height = 70

# Create 'BMI' column

df['BMI'] = (df['Weight'] / (fixed_height ** 2)) * 703
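
As a quick check of the formula, a 200-pound player at the assumed 70-inch height has a BMI of 200 / 70^2 * 703, roughly 28.7. The helper below is a hypothetical function written only for this check:

```python
def bmi_from_imperial(weight_lb, height_in=70):
    """Return BMI from weight in pounds and height in inches."""
    return weight_lb / (height_in ** 2) * 703

# 200 lb at the assumed fixed height of 70 inches
print(round(bmi_from_imperial(200), 2))
```

Because the height is fixed, the BMI column is just Weight scaled by a constant, so it carries no information beyond Weight itself.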

4. Exploratory Data Analysis (EDA):

# Summary statistics

print(df[['Age', 'Weight', 'Salary']].describe())

# Average age, weight, and salary by position
avg_by_position = df.groupby('Position')[['Age', 'Weight', 'Salary']].mean()

print(avg_by_position)

5. Data Visualization:

import matplotlib.pyplot as plt

# Histogram of player ages
plt.hist(df['Age'], bins=20)
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Distribution of Player Ages')
plt.show()

# Box plot of player salaries by position
# (boxplot with by= creates its own figure, so the size is passed to it directly)
df.boxplot(column='Salary', by='Position', figsize=(10, 6))
plt.ylabel('Salary')
plt.title('Box Plot of Player Salaries by Position')
plt.suptitle('')
plt.xticks(rotation=45)
plt.show()

# Scatter plot of 'age' vs. 'salary' by position
plt.figure(figsize=(10, 6))
colors = {'PG': 'red', 'SG': 'blue', 'SF': 'green', 'PF': 'purple', 'C': 'orange'}

# Plot each position separately so every colour gets its own legend entry
for position, group in df.groupby('Position'):
    plt.scatter(group['Age'], group['Salary'],
                color=colors.get(position, 'gray'), label=position, alpha=0.5)

plt.xlabel('Age')
plt.ylabel('Salary')
plt.title('Age vs. Salary by Position')
plt.legend(title='Position')
plt.show()

6. Top Players:

top_players = df.nlargest(10, 'Salary')

print(top_players)

7. College Analysis:

top_colleges = df['College'].value_counts().nlargest(5)

print(top_colleges)

8. Position Distribution:

position_counts = df['Position'].value_counts()
plt.pie(position_counts, labels=position_counts.index, autopct='%1.1f%%', startangle=140)

plt.title('Position Distribution of Players')

plt.axis('equal')

plt.show()

9. Team Analysis:

avg_salary_by_team = df.groupby('Team')['Salary'].mean()

print(avg_salary_by_team)

plt.figure(figsize=(10, 6))

avg_salary_by_team.plot(kind='bar')

plt.xlabel('Team')

plt.ylabel('Average Salary')
plt.title('Average Salary of Players by Team')
plt.xticks(rotation=45)

plt.show()

10. Extras:

min_weight_index = df['Weight'].idxmin()

print("Index with minimum weight value:", min_weight_index)

df_sorted = df.sort_values(by='Name', ignore_index=True)

print(df_sorted)

name_series = df['Name']

print("Top 10 names:\n", name_series.head(10))

print("\nLast 10 names:\n", name_series.tail(10))
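
sort_values returns a new DataFrame unless inplace=True is passed, which is why the original row order survives the sort. A small check on made-up names (hypothetical data) illustrates this:

```python
import pandas as pd

# Hypothetical rows, for illustration only
players = pd.DataFrame({'Name': ['Chris', 'Aaron', 'Blake'],
                        'Weight': [210, 185, 230]})

# Returns a sorted copy; ignore_index=True renumbers the copy from 0
sorted_players = players.sort_values(by='Name', ignore_index=True)

print(players['Name'].tolist())         # original order is unchanged
print(sorted_players['Name'].tolist())  # alphabetical order
```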
