Data Handling using pandas - I Q & ANS (1)

Uploaded by

maazmaaz20061013

Data Handling using pandas – I

Worksheet
1. Write Python pandas code to create a 1D array (Series) of size 8 with all elements as zero.
Assign 20 to the 2nd element. (1)

Ans:
import pandas as pd
import numpy as np
a=np.zeros(8,dtype=int)
s=pd.Series(a)
print(s)
s[1]=20
print(s)
OR
import pandas as pd
s=pd.Series(0,index=[0,1,2,3,4,5,6,7])
print(s)
s[1]=20
print(s)

2. Find the output of the following code: (2)


import pandas as pd
s1=pd.Series([10,20,30,40,50],index=["a","b","b","d","e"])
(i) print(s1[-3:-1]) (ii) print(s1['b'])

Ans:
(i) b 30
d 40
dtype: int64

(ii) b 20
b 30
dtype: int64

3. Differentiate between the Series and DataFrame data structures. (2)
Ans:
Series
➢ One-dimensional labelled array.
➢ Homogeneous data.
➢ One axis. It contains only rows, so it has only a row index.
➢ Data is mutable.
➢ Size is immutable.
DataFrame
➢ Two-dimensional labelled data structure, like a MySQL table.
➢ Heterogeneous data.
➢ Two axes. It contains rows and columns, so it has both a row index and a column
index.
➢ Data is mutable.
➢ Size is mutable.
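
The points above can be checked with a short sketch. The names s and df and the values are illustrative assumptions, not part of the worksheet.

```python
import pandas as pd

# A Series has one axis (the row index).
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A DataFrame has two axes (row index and column index);
# each column may hold a different type of data.
df = pd.DataFrame({"name": ["ali", "giri"], "age": [15, 16]})

print(s.ndim, s.shape)    # 1 (3,)
print(df.ndim, df.shape)  # 2 (2, 2)

# Adding a column changes the size of the DataFrame in place.
df["mark"] = [60, 70]
print(df.shape)           # (2, 3)
```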

4. Differentiate between the head() and tail() functions of a Series/DataFrame. Illustrate the
functions with the help of a program. (2)
Ans:
head(n)
Returns the first n rows of the series/DataFrame. If the value for n is not passed, then by
default n takes 5 and the first five rows are displayed.
tail(n)
Returns the last n rows of the series/DataFrame. If the value for n is not passed, then by
default n takes 5 and the last five rows are displayed.

Program for head() and tail() function in Series


import pandas as pd
s1=pd.Series([10,20,30,40,50,60,70,80,90])
print(s1.head())
print(s1.tail(3))
Output:
0 10
1 20
2 30
3 40
4 50
dtype: int64

6 70
7 80
8 90
dtype: int64

Program for head() and tail() function in DataFrame


import pandas as pd
df=pd.DataFrame([[10,20],[30,40],[50,60],[70,80]])
print(df.head(2))
print(df.tail(3))
Output:
    0   1
0  10  20
1  30  40

    0   1
1  30  40
2  50  60
3  70  80

5. Write a python program to create a pandas DataFrame from a dictionary of Series. (2)
Ans:
import pandas as pd
s1 = pd.Series([1,2,3,4,5],index = ['a', 'b', 'c', 'd', 'e'])
s2 = pd.Series([10,20,30,40,50],index = ['a', 'b', 'c', 'd', 'e'])
# dictionary of Series: each key becomes a column label
df = pd.DataFrame({'s1': s1, 's2': s2})
print(df)

Output
   s1  s2
a   1  10
b   2  20
c   3  30
d   4  40
e   5  50

6. Write a python program to create a pandas DataFrame from a list of dictionaries. (2)
Ans:
import pandas as pd
df=pd.DataFrame([{'name':'ali','age':16,'mark':80},
{'name':'zubair','age':17},
{'name':'amaan','age':16}])
print(df)
Output:
name age mark
0 ali 16 80.0
1 zubair 17 NaN
2 amaan 16 NaN
7. Give the output of the following code: (1)
import pandas as pd
dict={'Name':pd.Series(['Anoop','Abhi','Raju','Mitu']),'Age':pd.Series([16,15,17,18]),
'Score':pd.Series([57,97,76,65])}
df=pd.DataFrame(dict)
print("Dataframe contents")
print("*********************")
print(df)

Ans:
Dataframe contents
*********************
Name Age Score
0 Anoop 16 57
1 Abhi 15 97
2 Raju 17 76
3 Mitu 18 65

8. Differentiate between loc and iloc. Illustrate them with the help of a program. (2)

Ans:
loc: a label-based indexer. It is used for selecting or setting elements of a dataframe by
row label and column label. In a label slice, both end points are included.

iloc: a position-based indexer. It is used for selecting or setting elements of a dataframe
by integer position, counting from 0. In a position slice, the end point is excluded.

import pandas as pd
print("creation of dataframe from dictionary of list")
df2=pd.DataFrame({'name':['ali','giri','mini','geena','meena','reena'],
'age':[15,16,17,16,16,17],
'mark':[60,70,80,90,85,75]},index=['s1','s2','s3','s4','s5','s6'])
print(df2)
print("to display the rows s1,s3,s5 and columns name,mark using loc")
print(df2.loc[['s1','s3','s5'],['name','mark']])
print("to display the rows s1,s3,s5 and columns name,mark using iloc")
print(df2.iloc[0:5:2,0:3:2])

9. We can delete an element from a DataFrame using _____ (1)
a. empty() b. reindex() c. rsub() d. drop()
Ans: d. drop()
10. Name the two important data structures of Pandas library. (1)
Ans:
Series (one-dimensional data structure) and DataFrame (two-dimensional data structure)
11. Write the Python command to display the last 4 records of the dataframe df (1)
Ans:
df.tail(4)
12. Consider the following DataFrame stud (2)
Admno Name Class
S1 101 Ali X
S2 110 Fadil IX
a) Write the program to create a dataframe stud using the list of dictionaries. After
creation change the index of the dataframe and display.
b) Write a program to create a dataframe stud using the dictionary of series. After
creation change the column index and display.
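
One possible answer, as a sketch. The replacement column labels in part (b) are hypothetical, since the worksheet does not say what the new labels should be.

```python
import pandas as pd

# (a) list of dictionaries: each dictionary becomes one row
rows = [{"Admno": 101, "Name": "Ali", "Class": "X"},
        {"Admno": 110, "Name": "Fadil", "Class": "IX"}]
stud = pd.DataFrame(rows)
stud.index = ["S1", "S2"]      # replace the default 0, 1 row index
print(stud)

# (b) dictionary of Series: each Series becomes one column
data = {"Admno": pd.Series([101, 110], index=["S1", "S2"]),
        "Name": pd.Series(["Ali", "Fadil"], index=["S1", "S2"]),
        "Class": pd.Series(["X", "IX"], index=["S1", "S2"])}
stud2 = pd.DataFrame(data)
# hypothetical new column labels, chosen only for illustration
stud2.columns = ["AdmissionNo", "StudentName", "Grade"]
print(stud2)
```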

13. Given two objects, a list object named Mylist and a Series object named MySeries,
both having the same contents, i.e. 1, 3, 5, 7, 9, find the output produced by the
following two statements:
a. print(Mylist * 2) b. print(MySeries * 2) (2)
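
A minimal sketch of the two statements, assuming the objects are created as shown:

```python
import pandas as pd

Mylist = [1, 3, 5, 7, 9]
MySeries = pd.Series([1, 3, 5, 7, 9])

print(Mylist * 2)     # the list is repeated: [1, 3, 5, 7, 9, 1, 3, 5, 7, 9]
print(MySeries * 2)   # each element is doubled: 2, 6, 10, 14, 18
```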

14. Create a python program that creates the 2 series given below, performs any four
arithmetic operations on them and prints the results. (2)
Series1     Series2
0 10        0 2
1 20        1 3
2 30        2 4
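
One possible answer, as a sketch. The variable names s1 and s2 are assumptions, and any four arithmetic operators would do.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30])
s2 = pd.Series([2, 3, 4])

# Arithmetic between two Series is done element by element,
# matching elements that share the same index label.
print(s1 + s2)   # 12, 23, 34
print(s1 - s2)   # 8, 17, 26
print(s1 * s2)   # 20, 60, 120
print(s1 / s2)   # division always gives float results
```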

15. Which command is used for installing Pandas? (1)
Ans: pip install pandas

16. Write a suitable Python code to create an empty Series. (1)
Ans:
import pandas as pd
s=pd.Series()
print(s)
17. What is pandas and what are the benefits of pandas? (2)
Ans:
Pandas (the name is derived from "panel data")
➢ High-level data manipulation tool used for analysing data.
➢ It is very easy to import and export data using the Pandas library.
➢ It is built on packages like NumPy and Matplotlib, and gives us a single, convenient place
to do most of our data analysis and visualisation work.
➢ Pandas has two important data structures, namely Series and DataFrame, to make the
process of analysing data organised, effective and efficient. (A third structure, Panel,
existed in older versions but has been removed from current pandas.)
18. Find the output of the following code: (1)
import pandas as pd
s1=pd.Series([10,20,30,40,50],index=["a","b","c","d","e"])
print(s1[-4:-2])

Ans:
b 20
c 30
dtype: int64

19. Explain Boolean indexing in data frame. Illustrate Boolean indexing using a data frame
program. (2)
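
One possible answer, as a sketch. Boolean indexing means selecting rows with a Series of True/False values, usually produced by a condition on a column. The data and the pass mark of 33 below are illustrative assumptions.

```python
import pandas as pd

# Illustrative data (assumed, not from the worksheet)
df = pd.DataFrame({"name": ["ali", "giri", "mini", "geena"],
                   "mark": [60, 28, 80, 31]},
                  index=["s1", "s2", "s3", "s4"])

mask = df["mark"] >= 33   # Boolean Series: True where the condition holds
print(mask)
print(df[mask])           # only the rows where mask is True (s1 and s3)

# Conditions can be combined with & (and) and | (or), each in parentheses
print(df[(df["mark"] >= 33) & (df["mark"] < 70)])   # only s1
```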

20. Write a program to create and perform following operations on rows and
columns of data frame. (6)
(i) creating new row in existing dataframe
(ii) Creating new column in existing dataframe
(iii) print first 3 rows
(iv) print first and third column
(v) delete a column using drop function

21. Write the command to find the sum of series S1 and S2 (1)
Ans:
print(S1+S2)

22. Consider the following DataFrame stud (2)


Admno Name Class
S1 101 Ali X
S2 110 Fadil IX
Write commands to :
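i. Add a new column 'mark' to the DataFrame stud with values (30, 45)
ii. Add a new row with row index S3 and values (105, Murali, X)

Ans:
i. stud['mark']=[30,45]
ii. stud.loc['S3']=[105,'Murali','X']
Command ii assumes stud still has only the three original columns; once 'mark' has been
added, the list must contain four values.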

23. Consider two objects x and y. x is a list whereas y is a Series. Both have values 20,
40,90, 110. What will be the output of the following two statements considering that the
above objects have been created already. (3)
a. print (x*2) b. print(y*2)
Justify your answer.
Ans:
a. will give the output as: [20, 40, 90, 110, 20, 40, 90, 110]
b. will give the output as:
0 40
1 80
2 180
3 220
dtype: int64
Justification: In the first statement, x is a list, so multiplying it by a number
replicates the list that many times. In the second statement, y is a Series, so multiplying
it by a number multiplies each element of the Series by that number.

24. The command used to display the last 2 rows in a dataframe df is………… (1)
Ans:
df.tail(2)

25. Name any one python library generally used for data analysis. (1)
Ans: pandas

26. Which statement creates an empty DataFrame? (1)
a)>>> s=pd.DataFrame([ ])
b)>>> s=pd.DataFrame(0)
c)>>> s=pd.DataFrame()
d)>>> s=pd.DataFrame([np.nan])
Choose the correct option from the following
i) option a ii) option b iii) option a and c iv) option d

Ans:
iii) option a and c
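
A short sketch to check options a, c and d using the empty and shape attributes. The variable names are assumptions, and float("nan") stands in for the NaN value of option d.

```python
import pandas as pd

a = pd.DataFrame([])               # option a
c = pd.DataFrame()                 # option c
d = pd.DataFrame([float("nan")])   # option d: one row holding NaN

print(a.empty, a.shape)   # True (0, 0)
print(c.empty, c.shape)   # True (0, 0)
print(d.empty, d.shape)   # False (1, 1): a NaN value still occupies a cell
```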

a) Replace the index with the student names [Siya, Ram, Fiza, Diya, Manish].
b) Display the failed students (the passing mark is 33).
Ans:
a) S.index=['Siya','Ram','Fiza','Diya','Manish']
b) print(S[S<33])
28. Consider the following DataFrame df and answer any four questions from (i) to (iv)
i) Write down the command to add a new column "Height" with values
156,173,140,146,185 (1)
a) df['Height']=[156,173,140,146,185]
b) df.Height=[156,173,140,146,185]
c) df(Height)=[156,173,140,146,185]
d) both (a) and (b)

Ans:
a) df['Height']=[156,173,140,146,185]

ii) Write down the command to display the column "Name" from the dataframe. (1)
a) print(df.Name) b) print(df[column]="Name")
c) print(df["Name"]) d) Both (a) and (c)

Ans:
d) Both (a) and (c)

iii) Write command to display the number of rows and columns in dataframe. (1)
a) print(df.size) b) print(df[index,column])
c) print(df.shape) d) print(df.ndim)

Ans:
c) print(df.shape)
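
The three attributes offered in the options can be compared with a short sketch. The data is an illustrative assumption, since the worksheet's DataFrame is not reproduced here.

```python
import pandas as pd

# Illustrative data (assumed, not from the worksheet)
df = pd.DataFrame({"Name": ["Asha", "Ravi", "Meena"], "Age": [15, 16, 15]})

print(df.shape)  # (3, 2): 3 rows, 2 columns
print(df.size)   # 6: total number of elements (rows x columns)
print(df.ndim)   # 2: number of axes
```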

iv) Write command to delete the column "Age" from the dataframe. (1)
a) del df['Age'] b) drop df['Age']
c) df.del["Age"] d) drop["Age"]

Ans:
a) del df['Age']
29. Create a series Month (from Jan to May) from a dictionary having the month names as
keys and the number of days as data. (2)
Ans:
import pandas as pd
dic={'Jan':31,'Feb':28,'Mar':31,'Apr':30,'May':31}
s=pd.Series(dic)
print(s)

30. The average marks of 5 subjects in three divisions are given below: (5)

i) Write a python pandas program to create a dataframe using the above data.
ii) Rename the column DIVISION C to DIVISION D.
iii) Display the marks in DIVISION A from the dataframe.
iv) Display the subject names from the dataframe.

Ans:
i) import pandas as pd
dic={'Division A':{'English':65,'Maths':45,'Science':87},
'Division B':{'English':67,'Maths':34,'Science':87},
'Division C':{'English':87,'Maths':87,'Science':56}}
d=pd.DataFrame(dic)
print(d)
ii) print(d.rename(columns={'Division C':'Division D'}))
iii) print(d['Division A'])
iv) print(d.index.values)

31. In a DataFrame, axis=0 represents the _____________ elements. (1)
a. Row b. Plot c. Column d. Graph
Ans: a. Row
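
As an illustration of axis=0 versus axis=1, here is a sketch using drop() on made-up data. The labels r1 to r3, x and y are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["r1", "r2", "r3"])

rows_left = df.drop("r2", axis=0)   # axis=0: the label is looked up in the row index
cols_left = df.drop("y", axis=1)    # axis=1: the label is looked up in the column index

print(rows_left)
print(cols_left)
```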
