IP - Pandas 1 & 2 (Worksheet) Class 12
Ans:
0 1
1 2
2 2
3 7
4 Sachin
dtype: object
0 1
1 2
2 2
dtype: object
2 Write a program in python to find maximum value over index in Data frame.
Ans:
# importing pandas as pd
import pandas as pd
Ans:
3. It displays all columns with row index 2 to 7.
4. It will display entire dataframe with all rows and columns.
5. It will display all rows except the last four rows.
4 Write a python program to sort the following data according to ascending order
of Age.
Name Age Designation
Sanjeev 37 Manager
Keshav 42 Clerk
Rahul 38 Accountant
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df1=df.sort_values(by='Age')
print(df1)
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df2=df.sort_values(by='Name',ascending=False)
print(df2)
6 Which of the following things can be data in Pandas?
1. A python dictionary
2. An ndarray
3. A scalar value
4. All of the above
Ans:
Ans:
3. Value,size
Ans:
1. True
Ans:
4. None
Ans:
1. Dataframe
12 What will be the output of df.iloc[3:7,3:6]?
Ans:
It will display the rows at positions 3 to 6 and the columns at positions 3 to 5 of the
dataframe 'df'; the end position of each slice is excluded.
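The slice rule can be checked on a small frame. The sketch below assumes a hypothetical 8x8 DataFrame named demo, filled with the numbers 0 to 63; neither the name nor the data comes from the worksheet.

```python
import numpy as np
import pandas as pd

# Hypothetical 8x8 frame (not from the worksheet); values 0..63
demo = pd.DataFrame(np.arange(64).reshape(8, 8))

# iloc slices work on positions and exclude the end position
part = demo.iloc[3:7, 3:6]

print(part.shape)           # number of rows and columns selected
print(list(part.index))     # row positions kept
print(list(part.columns))   # column positions kept
```

The start of each slice is included and the end is left out, which is why 3:7 keeps four rows and 3:6 keeps three columns.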
13 How to select the rows where age is missing?
1. df[df['age'].isnull]
2. df[df['age']==NaN]
3. df[df['age']==0]
4. None
Ans:
'BidPrice':[13,12,7,10,17,15],
'Runs':[1000,2400,900,200,3600,3700]}
df=pd.DataFrame(d)
print(df)
print(df.iloc[:2,:])
print(df.iloc[ -3:,:])
15 Write a command to Find most expensive Player.
Ans:
print(df[df['BidPrice']==df['BidPrice'].max()])
16 Write a command to Print total players per team.
Ans:
print(df.groupby('Team').Player.count())
17 Write a command to Find player who had highest BidPrice from each team.
Ans:
# idxmax() gives the row label of the highest BidPrice within each team
print(df.loc[df.groupby('Team')['BidPrice'].idxmax(),['Team','Player','BidPrice']])
1. Mathematician
2. Statistician
3. Software Programmer
4. All of the above
Ans:
4. All of the above
22 What is the built-in database used for python?
1. Mysql
2. Pysqlite
3. Sqlite3
4. Pysqln
Ans:
3. Sqlite3
23 How can you drop columns in python that contain NaN?
Ans:
df1.dropna(axis=1)
24 How can you drop all rows that contain NaN?
Ans:
df1.dropna(axis=0)
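The two dropna() answers differ only in the axis. A minimal sketch, assuming a small made-up frame df1 with a single NaN (the data is illustrative, not from the worksheet):

```python
import numpy as np
import pandas as pd

# Illustrative data: only column 'B' (in row 1) holds a NaN
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, np.nan, 6], 'C': [7, 8, 9]})

no_nan_cols = df1.dropna(axis=1)   # drops every column that contains a NaN
no_nan_rows = df1.dropna(axis=0)   # drops every row that contains a NaN

print(no_nan_cols)
print(no_nan_rows)
```

Both calls return a new DataFrame; df1 itself keeps all three rows and columns unless the result is assigned back.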
25 A Series is ______ array, which is labelled and ______ type.
Ans:
Ans:
4 All
Ans:
4.6
29 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='inner')
1. 3
2. 4
3. 5
4. 6
Ans:
1. 3
30 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='right')
1. 3
2. 4
3. 5
4. 6
Ans:
2. 4
31 How many rows will the resultant data frame have?
import pandas as pd
df1=pd.DataFrame({'key':['a','b','c','d'], 'value':[1,2,3,4]})
df2=pd.DataFrame({'key':['a','b','e','b'], 'value':[5,6,7,8]})
df3=df1.merge(df2, on='key', how='left')
1. 3
2. 4
3. 5
4. 6
Ans:
3. 5
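The row counts for the three merge questions can be verified by running the merges side by side. This is a small check using the worksheet's own df1 and df2; the counts dictionary is only a helper for this sketch.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})

counts = {}
for how in ('inner', 'right', 'left'):
    merged = df1.merge(df2, on='key', how=how)
    counts[how] = len(merged)   # number of rows in the merged frame
    print(how, len(merged))
```

The key 'b' appears twice in df2, so the single 'b' row of df1 pairs with both of them; that duplicate is what makes the inner and left joins one row longer than a count of distinct keys would suggest.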
Ans:
pop()
33 A ______ is an interactive way to quickly summarize large amounts of data.
Ans:
Pivoting
34 ______ method is used to rename the existing indexes in a data frame.
Ans:
rename
35 ______ is the parameter of the sort_values() method that prevents a new
data frame from being created.
Ans:
inplace
36 Write a program in python to calculate the sum of marks in CS subject in a
given dataset-
'CS':[45,55,78,95,99,97], 'IP':[87,89,98,94,78,77]
Ans:
import pandas as pd
d1={'CS':[45,55,78,95,99,97], 'IP':[87,89,98,94,78,77]}
df=pd.DataFrame(d1)
print(df['CS'].sum())
37 Write a python program to create a data frame with headings (CS and IP) from
the list given below-
[[79,92],[86,96],[85,91],[80,99]]
Ans:
import pandas as pd
l=[[79,92],[86,96],[85,91],[80,99]]
df=pd.DataFrame(l,columns=['CS','IP'])
print(df)
38 How can you find the total number of rows and columns in a data frame?
Ans:
df.shape
39 MaxTemp  MinTemp  City      RainFall
   45       30       Delhi     25.6
   34       24       Guwahati  41.5
   48       34       Chennai   36.8
   32       22       Bangluru  40.2
   44       29       Mumbai    38.5
   39       37       Jaipur    24.9
Ans:
print(df.sum(axis=0))
40 Based on the above data frame df, Write a command to compute mean of
column MaxTemp.
Ans:
print(df['MaxTemp'].mean())
41 Based on the above data frame df, Write a command to compute average
MinTemp, RainFall for first 4 rows.
Ans:
df[['MinTemp', 'RainFall']][:4].mean()
42 Which method is used to read the data from MySQL database through Data
Frame?
Ans:
read_sql_query()
Ans:
execute()
44 What will be the output of following code?
import pandas as pd
df = pd.DataFrame([45,50,41,56], index = [True, False, True, False])
print(df.iloc[True])
Ans:
It will raise a TypeError with the message "Cannot index by location index with a
non-integer key", because iloc accepts only integer positions, not a single boolean value.
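The behaviour can be reproduced as below. This is a sketch using the worksheet's frame; the variables raised and picked are helpers added for illustration. It also shows that a boolean list of the same length as the frame is accepted by iloc.

```python
import pandas as pd

df = pd.DataFrame([45, 50, 41, 56], index=[True, False, True, False])

# A single boolean is not a valid position for iloc
try:
    df.iloc[True]
    raised = False
except TypeError as err:
    raised = True
    print(err)

# A boolean list with one entry per row works as a mask
picked = df.iloc[[True, False, True, False]]
print(picked)
```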
Two functions for pivoting are: pivot() and pivot_table()
52. Write a python code to create a dataframe with appropriate headings from the
list given below:
['S101', 'Amy', 70], ['S102', 'Risha', 69], ['S104', 'Susan', 75], ['S105','George',
82]
import pandas as pd
L=[['S101','Amy',70], ['S102','Risha',69], ['S104','Susan',75], ['S105','George',82]]
df=pd.DataFrame(L,index=[1,2,3,4],columns=['ID','Name','Points'])
print(df)
53. Consider the following dataframe, and answer the questions given below:
import pandas as pd
df = pd.DataFrame({"Quarter1":[2000, 4000, 5000, 4400, 10000],
"Quarter2":[5800, 2500, 5400, 3000, 2900],
"Quarter3":[20000, 16000, 7000, 3600, 8200],
"Quarter4":[1400, 3700, 1700, 2000, 6000]})
Write the code to find mean value from above dataframe df over the index and
column axis. (Skip NaN value)
print(df.mean(axis=0,skipna=True))
print(df.mean(axis=1,skipna=True))
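The effect of the axis and skipna arguments is easier to see on a frame that actually contains a NaN. The two-row frame below is made up for illustration and is not the worksheet's data.

```python
import numpy as np
import pandas as pd

# Illustrative frame with one missing value
demo = pd.DataFrame({'Quarter1': [2000, np.nan], 'Quarter2': [5800, 2500]})

col_means = demo.mean(axis=0, skipna=True)   # one mean per column
row_means = demo.mean(axis=1, skipna=True)   # one mean per row

print(col_means)
print(row_means)
```

With skipna=True the missing value is simply left out of the calculation, so Quarter1 averages its one remaining value and the second row averages only Quarter2.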
54. Use sum() function to find the sum of all the values over the index axis.
print(df.sum(axis=0))
55. Find the median of the dataframe df.
print(df.median())
56. Find the output of the following code:
import pandas as pd
data = [{'a': 10, 'b': 20},{'a': 6, 'b': 32, 'c': 22}]
df1 = pd.DataFrame(data,columns=['a','b'])
df2 = pd.DataFrame(data,columns=['a','b1'])
print(df1)
print(df2)
    a   b
0  10  20
1   6  32
    a  b1
0  10 NaN
1   6 NaN
57.
import pandas as pd
x1=[[10,150],[40,451],[15,302],[40,703]]
df1=pd.DataFrame(x1,columns=['mark1','mark2'])
x2=[[30,20],[20,25],[20,30],[5,30]]
df2=pd.DataFrame(x2,columns=['mark1','mark2'])
print(df1)
print(df2)
58. To add dataframes df1 and df2.
print(df1.add(df2))
60. To change index label of df1 from 0 to zero and from 1 to one.
df1=df1.rename(index={0:'zero',1:'one'})
62. For the given code fill in the blanks so that we get the desired output with
maximum value for Quantity and Average Value for Cost:
import pandas as pd
import numpy as np
d={'Product':['Apple','Pear','Banana','Grapes'],'Quantity':[100,150,200,250],
'Cost':[1000,1500,1200,900]}
df = pd.DataFrame(d)
df1 = ____________
print(df1)
Quantity 250.0
Cost 1150.0
dtype: float64
df1=pd.Series([df['Quantity'].max(),df['Cost'].mean()],index=['Quantity','Cost'])
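One possible way to produce the desired Series in a single call, offered as a sketch rather than as the worksheet's intended answer, is DataFrame.agg() with a dictionary that names one function per column.

```python
import pandas as pd

d = {'Product': ['Apple', 'Pear', 'Banana', 'Grapes'],
     'Quantity': [100, 150, 200, 250],
     'Cost': [1000, 1500, 1200, 900]}
df = pd.DataFrame(d)

# One aggregation per column; the result is a Series indexed by column name
df1 = df.agg({'Quantity': 'max', 'Cost': 'mean'})
print(df1)
```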
import pandas as pd
df1=pd.DataFrame({'Icecream':['Vanila','ButterScotch','Caramel'] ,
'Cookies':['Goodday','Britannia', 'Oreo']})
df2=pd.DataFrame({'Chocolate':['DairyMilk','Kitkat'],'Icecream':['Vanila','ButterScotch'],
'Cookies':['Hide and Seek','Britannia']})
df2.reindex_like(df1)
print(df2)
   Chocolate      Icecream        Cookies
0  DairyMilk        Vanila  Hide and Seek
1     Kitkat  ButterScotch      Britannia
print(df1.add(df2))
68. To sort df1 by Second column in descending order.
df1=df1.sort_values(by='Second',ascending=False)
df2=df2.rename(index={0:'a',1:'b',2:'c',3:'d'})
70. To display those rows in df1 where value of third column is more than 45.
print(df1[df1['Third']>45])
import pandas as pd
student_df=pd.DataFrame({'Name':['Ananmay','Aditi','Mehak','Kriti'],'Class':['XI','XI','XI','XI'],
'Marks':[95,82,65,45]},index=[1,2,3,4])
data={'Name':'Sohail','Class':'XII','Marks':77}
newstd=pd.DataFrame(data,index=[5])
# DataFrame.append() was removed in pandas 2.0; pd.concat() gives the same result
student_df=pd.concat([student_df,newstd])
73. Jitesh wants to sort a DataFrame df. He has written the following code.
df=pd.DataFrame({"a":[13, 24, 43, 4],"b":[51, 26, 37, 48]})
print(df)
df.sort_values('a')
print(df)
He is getting an output which is showing original DataFrame and not the sorted
DataFrame. Identify the error and suggest the correction so that the sorted
DataFrame is printed.
sort_values() returns a new sorted DataFrame and leaves the original unchanged, so the
second print(df) still shows the unsorted data. Sort in place instead:
df.sort_values('a',inplace=True)
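The difference between discarding the result, assigning it, and sorting in place can be seen in the following sketch. It reuses the frame from the question; the names unchanged and sorted_df are added here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [13, 24, 43, 4], "b": [51, 26, 37, 48]})

df.sort_values('a')                 # result is discarded, df is untouched
unchanged = df['a'].tolist()

sorted_df = df.sort_values('a')     # option 1: keep the returned copy

df.sort_values('a', inplace=True)   # option 2: modify df itself
print(df)
```

Assigning the result (df = df.sort_values('a')) is an equally valid correction and avoids changing the frame in place.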
74. Write a command to display the name of the company and the highest car price
from DataFrame having data about cars.
import pandas as pd
car={'Name':['Innova','Tavera','Royal','Scorpio'],
'Price':[300000,800000,250000,650000]}
df=pd.DataFrame(car,index=[1,2,3,4])
print(df[df.Price==df.Price.max()])
75. Write a command in python to Print the total number of records in the
DataFrame.
print(len(df1))
76. Consider a DataFrame ‘df’ created using the dictionary given below, answer
the questions given below:
77. Write a command to create a pivot table based on ‘qualify’ column and display
sum of the score and attempt columns.
print(df.pivot_table(columns=['qualify'],values=['score','attempts'],aggfunc='sum'))
78. Write a command to display the names of students who have qualified.
print(df[df['qualify']=='yes'].name)
79. Consider the following DataFrame df and answer the questions given below:
80. Write command to compute mean of every column of the data frame.
print(df.mean(axis=0))
81. Write command to add one more row to the data frame with data [5,12,33,3]
82.
Emp_ID Name Dept Salary Status
100 Kabir IT 34000 Regular
110 Rishav Finance 28500 Regular
120 Seema IT 13500 Contract
130 David IT 41000 Regular
140 Ruchi HRD 17000 Contract
Consider the above Data frame as df.
Write a Python Code to calculate the average salary of the Regular employees
and the Contract employees separately.
print(df.groupby('Status')['Salary'].mean())
83. Write a Python Code to print the dataframe in the descending order of Salary.
df=df.sort_values(by='Salary',ascending=False)
print(df)
84. Write a Python Code to update the Salary of all Contract employees to Rs
19000
df.loc[df.Status=='Contract','Salary']=19000
85. Write a Python Code to count the total number of employees in each
department.
print(df.groupby('Dept').count().Name)
86. Write a Python Code to display the maximum salary of the “Contract” staff.
print(df[df['Status']=='Contract']['Salary'].max())
print(df.iloc[3:4,:])
del df['Status']
89. Write a Python Code to display the maximum salary of all employees in the
‘IT’ department.
print(df[df.Dept=='IT']['Salary'].max())
90. Write a Python Code to delete the 1st and the last record.
df=df.drop([0,4])
print(df[df>50].count().sum())
93. Write Python Code to count the number of even numbers and number of odd
numbers in the dataframe.
print('No of Even Numbers:',df[df%2==0].count().sum())
print('No of Odd Numbers:',df[df%2==1].count().sum())
94. Consider the following data frame df.
employee  Sales   Quarter  State
Sahay     125600  1        Delhi
George    235600  1        Tamil Nadu
Priya     213400  1        Kerala
Manila    189000  1        Haryana
Raina     456000  1        West Bengal
Manila    172000  2        Haryana
Priya     201400  2        Kerala
import pandas as pd
data={'employee':['Sahay','George','Priya','Manila','Raina','Manila','Priya'],
'Sales':[125600,235600,213400,189000,456000,172000,201400],
'Quarter':[1,1,1,1,1,2,2],
'State':['Delhi','Tamil Nadu','Kerala','Haryana','West Bengal','Haryana','Kerala']}
df=pd.DataFrame(data)
print(df)
95. Write Python Program to find total sales per state.
print(df.groupby('State').sum().Sales)
print(df.groupby('employee').sum().Sales)
97. Write Python Program to find average sales on both employee and state wise.
print(df.groupby(['employee','State'])['Sales'].mean())
98. Write Python Program to find mean,median and minimum sale statewise.
print(df.groupby('State')['Sales'].mean())
print(df.groupby('State')['Sales'].median())
print(df.groupby('State')['Sales'].min())
99. Write Python Program to find maximum sales quarter-wise.
print(df.groupby('Quarter').max().Sales)
100. Write Python Program to create a Pivot Table with State as the index, Sales as
the values and calculating the maximum Sales in each State.
print(df.pivot_table(index='State',values='Sales',aggfunc='max'))
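The state-wise statistics and the pivot table can also be produced together. The sketch below rebuilds the sales frame from the table in question 94 and uses agg() with a list of function names, which is one option among several; the names stats and pivot are added for illustration.

```python
import pandas as pd

data = {'employee': ['Sahay', 'George', 'Priya', 'Manila', 'Raina', 'Manila', 'Priya'],
        'Sales': [125600, 235600, 213400, 189000, 456000, 172000, 201400],
        'Quarter': [1, 1, 1, 1, 1, 2, 2],
        'State': ['Delhi', 'Tamil Nadu', 'Kerala', 'Haryana', 'West Bengal',
                  'Haryana', 'Kerala']}
df = pd.DataFrame(data)

# Selecting 'Sales' first keeps text columns out of the numeric aggregations
stats = df.groupby('State')['Sales'].agg(['mean', 'median', 'min'])
print(stats)

# Same grouping expressed as a pivot table with the maximum per state
pivot = df.pivot_table(index='State', values='Sales', aggfunc='max')
print(pivot)
```

Selecting the Sales column before aggregating matters on recent pandas versions, where mean() and median() no longer skip text columns silently.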