Pandas Dataframe1

The document provides an overview of the DataFrame data structure in pandas, highlighting its features such as mutable size, heterogeneous data types, and labeled axes. It includes examples of creating DataFrames from lists, series, dictionaries, and arrays, along with attributes and methods for accessing and modifying data. Additionally, it demonstrates how to select and manipulate data within a DataFrame using various techniques.

Uploaded by

manishmcamba2013

DATAFRAME-I

It is a two-dimensional data structure of pandas, like a table with rows and columns, which stores heterogeneous data. It is similar to a spreadsheet or a MySQL table.
Features of DataFrame
1. Columns can be of different types, i.e. a column may hold integer, string or floating point data.
2. Size of a DataFrame is mutable, i.e. the number of rows and columns can be increased or decreased.
3. Its data is also mutable and can be changed at any time.
4. Labelled axes, i.e. rows and columns.
5. Arithmetic operations can be performed on rows and columns.
6. Indexes may consist of numbers, strings or letters.
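The features listed above can be tried out with a short sketch. The table below (name, marks, age) is made-up data used only for illustration.

```python
import pandas as pd

# Made-up data, used only to illustrate the features listed above.
df = pd.DataFrame({"name": ["ram", "mohan"], "marks": [55, 75]})
shape_before = df.shape            # 2 rows, 2 columns

df["age"] = [16, 17]               # size is mutable: a column is added
df.loc[2] = ["kapil", 57, 15]      # size is mutable: a row is added
df.loc[0, "marks"] = 60            # data is mutable: one cell is changed

shape_after = df.shape             # 3 rows, 3 columns
print(df)
print(df.index, df.columns)        # both axes carry labels
```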

               COLUMNS
         A   B   C   D   E
     1  25  88  99  87  54
ROWS 2  66  54  45  75  84
     3  84  85  86  95  89
     4  74  75  78  87  65
CREATING DATAFRAME AND DISPLAY
1. CREATING DATAFRAME FROM LISTS
>>> import pandas as pd
>>> a=[10,20,30,40,50]
>>> df=pd.DataFrame(a)
>>> print(a)
[10, 20, 30, 40, 50]
>>> print(df)
    0
0  10
1  20
2  30
3  40
4  50
>>> import pandas as pd
>>> a=[["ram",22],["mohan",15],["sonam",16],["kirti",21]]
>>> df=pd.DataFrame(a)
>>> print(a)
[['ram', 22], ['mohan', 15], ['sonam', 16], ['kirti', 21]]
>>> print(df)
       0   1
0    ram  22
1  mohan  15
2  sonam  16
3  kirti  21
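A DataFrame created from lists gets 0, 1, 2... as its row and column labels. As a small sketch, labels can be given at creation time through the columns and index parameters of pd.DataFrame. The label names used here ("name", "age", "r1" to "r4") are assumptions chosen for illustration.

```python
import pandas as pd

a = [["ram", 22], ["mohan", 15], ["sonam", 16], ["kirti", 21]]

# The column and row labels below are illustrative choices.
df = pd.DataFrame(a, columns=["name", "age"], index=["r1", "r2", "r3", "r4"])
print(df)
```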
2. CREATING DATAFRAME FROM SERIES
>>> import pandas as pd
>>> s=pd.Series(data=["ram","mohan","kapil"],index=['a','b','c'])
>>> df=pd.DataFrame(s)
>>> print(df)
       0
a    ram
b  mohan
c  kapil
>>> import pandas as pd
>>> dic={"jan":31,"feb":28,"mar":31}
>>> s=pd.Series(dic)
>>> df=pd.DataFrame(s)
>>> print(df)
      0
jan  31
feb  28
mar  31
>>> import pandas as pd
>>> sm=pd.Series({"vijaya":80,"rahul":73,"soni":94})
>>> sa=pd.Series({"vijaya":22,"rahul":24,"soni":21})
>>> df=pd.DataFrame({"marks":sm,"age":sa})
>>> print(df)
        marks  age
vijaya     80   22
rahul      73   24
soni       94   21
3. CREATING DATAFRAME FROM DICTIONARY
>>> import pandas as pd
>>> dic={"roll":[1,2,3,4,5],"name":["ram","mohan","kapil","sunil","jyotsana"]}
>>> df=pd.DataFrame(dic)
>>> print(df)
   roll      name
0     1       ram
1     2     mohan
2     3     kapil
3     4     sunil
4     5  jyotsana
>>> import pandas as pd
>>> nm=pd.Series(["ram","mohan","sohan","kapil","sonu"])
>>> eng=pd.Series([55,56,58,59,56])
>>> math=pd.Series([75,78,79,88,98])
>>> ip=pd.Series([89,88,98,87,89])
>>> std={"Name":nm,"English":eng,"Maths":math,"Informatics Practices":ip}
>>> df=pd.DataFrame(std)
>>> print(df)
    Name  English  Maths  Informatics Practices
0    ram       55     75                     89
1  mohan       56     78                     88
2  sohan       58     79                     98
3  kapil       59     88                     87
4   sonu       56     98                     89
>>> import pandas as pd
>>> dic={"Name":["ram","mohan","kapil"],"Eng":[85,95,78],"Hin":[88,77,85]}
>>> df=pd.DataFrame(dic)
>>> print(df)
    Name  Eng  Hin
0    ram   85   88
1  mohan   95   77
2  kapil   78   85
4. CREATING DATAFRAME USING ARRAY
>>> import pandas as pd
>>> import numpy as np
>>> a=np.array([[54,55,56,57],[65,66,67,68],[87,88,89,85]])
>>> df=pd.DataFrame(a)
>>> print(a)
[[54 55 56 57]
 [65 66 67 68]
 [87 88 89 85]]
>>> print(df)
    0   1   2   3
0  54  55  56  57
1  65  66  67  68
2  87  88  89  85
5. CREATING DATAFRAME USING LIST OF DICTIONARIES
>>> import pandas as pd
>>> a=[{"ram":55,"sunil":75,"kapil":75},{"ram":65,"sunil":78,"kapil":77},{"ram":55,"sunil":88,"kapil":87}]
>>> df=pd.DataFrame(a)
>>> print(a)
[{'ram': 55, 'sunil': 75, 'kapil': 75}, {'ram': 65, 'sunil': 78, 'kapil': 77}, {'ram': 55, 'sunil': 88, 'kapil': 87}]
>>> print(df)
   ram  sunil  kapil
0   55     75     75
1   65     78     77
2   55     88     87
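In a list of dictionaries every dictionary becomes one row and every key becomes one column. The sketch below uses shortened, made-up dictionaries to show what happens when a key is missing from one of them: the cell is filled with NaN.

```python
import pandas as pd

# Shortened, made-up data: "kapil" is missing in the first dictionary
# and "sunil" is missing in the second one.
a = [{"ram": 55, "sunil": 75}, {"ram": 65, "kapil": 77}]
df = pd.DataFrame(a)
print(df)                 # missing cells are shown as NaN
print(df.isnull())        # True marks the missing cells
```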
/////////////////////////////////////////////////////
/////////////////////////////
ATTRIBUTES OF DATAFRAME OBJECT
/////////////////////////////////////////////////////
////////////////////////////
index : It returns the index (row labels) of the DataFrame.
>>> import pandas as pd
>>> a={"ram":[55,57,98],"mohan":
[75,95,85],"kapil":[57,85,78]}
>>> b=pd.DataFrame(a)
>>> print(b)
ram mohan kapil
0 55 75 57
1 57 95 85
2 98 85 78
>>> b.index
RangeIndex(start=0, stop=3, step=1)
columns : It returns the column labels of the DataFrame.
>>> b.columns
Index(['ram', 'mohan', 'kapil'], dtype='object')
axes : It returns a list containing both axes, i.e. the row index and the column labels.
>>> b.axes
[RangeIndex(start=0, stop=3, step=1),
Index(['ram', 'mohan', 'kapil'], dtype='object')]
dtypes : It returns the data type of each column of the DataFrame.
>>> b.dtypes
ram int64
mohan int64
kapil int64
dtype: object
size : It returns the number of elements in the DataFrame (rows x columns).
>>> b.size
9
shape : It returns a tuple (rows, columns) representing the dimensionality of the DataFrame.
>>> b.shape
(3, 3)
values : It returns a NumPy array representation of the DataFrame.
>>> b.values
array([[55, 75, 57],
[57, 95, 85],
[98, 85, 78]], dtype=int64)
empty : It indicates whether the DataFrame is empty or not. It returns True if it is empty, otherwise it returns False.
>>> b.empty
False
ndim : It returns the number of axes (array dimensions), which is 2 for a DataFrame.
>>> b.ndim
2
T : It transposes the DataFrame, i.e. rows become columns and vice versa.
>>> b.T
0 1 2
ram 55 57 98
mohan 75 95 85
kapil 57 85 78
count() : It counts the non-NaN values. count(0) counts down the rows and gives one count per column; count(1) counts across the columns and gives one count per row. The default is 0.
>>> b.count(0)
ram 3
mohan 3
kapil 3
dtype: int64
or
>>> b.count(axis=0)
ram 3
mohan 3
kapil 3
dtype: int64
/////////////
>>> b.count(1)
0 3
1 3
2 3
dtype: int64
>>> b.count(axis=1)
0 3
1 3
2 3
dtype: int64
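In the example above both calls return 3 everywhere because no value is missing. A small sketch with one NaN (made-up data) makes the difference between the two axes visible.

```python
import numpy as np
import pandas as pd

# Made-up data with one missing value in column "ram".
b = pd.DataFrame({"ram": [55, np.nan, 98], "mohan": [75, 95, 85]})

per_column = b.count(axis=0)   # non-NaN values in each column
per_row = b.count(axis=1)      # non-NaN values in each row
print(per_column)
print(per_row)
```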

Q. Create a DataFrame of given table.

      NAME  MARKS GRADE
0   VIJAYA     90    A1
1    RAHUL     82    A2
2   MEGHNA     67     C
3  RADHIKA     95    A1
4  SHAURYA     97    A1
>>> import pandas as pd
>>> dic={"NAME":["VIJAYA","RAHUL","MEGHNA","RADHIKA","SHAURYA"],"MARKS":[90,82,67,95,97],"GRADE":["A1","A2","C","A1","A1"]}
>>> df=pd.DataFrame(dic)
>>> print(df)
NAME MARKS GRADE
0 VIJAYA 90 A1
1 RAHUL 82 A2
2 MEGHNA 67 C
3 RADHIKA 95 A1
4 SHAURYA 97 A1
Q.Create DataFrame of given table
2015 2016 2017
Qtr1 34500 41000 54000
Qtr2 56000 63000 75000
Qtr3 47000 57000 57000
Qtr4 49000 59000 58500
>>> import pandas as pd
>>> y2015={"qtr1":34500,"qtr2":56000,"qtr3":47000,"qtr4":49000}
>>> y2016={"qtr1":41000,"qtr2":63000,"qtr3":57000,"qtr4":59000}
>>> y2017={"qtr1":54000,"qtr2":75000,"qtr3":57000,"qtr4":58500}
>>> dic={2015:y2015,2016:y2016,2017:y2017}
>>> df=pd.DataFrame(dic)
>>> print(df)
2015 2016 2017
qtr1 34500 41000 54000
qtr2 56000 63000 75000
qtr3 47000 57000 57000
qtr4 49000 59000 58500

Q. Create DataFrame using given table

         POPULATION      AVG INCOME  PER CAPITA INCOME
DELHI      15478965  45789875465254               6.60
MUMBAI     85647596  75896545874564               6.76
KOLKATA    42598758  95875478545654               9.12
CHENNAI    56987545  43565478545654               1.21
Q. Create DataFrame using given table
     NAME  ENG  ECO  IP  ACCT
0   RINKU   67   85  75    65
1  PANKAJ   88   77  87    85
2  ADITYA   57   75  84    75
3    RITU   68   87  49    87

SELECTING/ACCESSING DATA OF DATAFRAME BY COLUMNS
DF["COLUMN1"]
DF.COLUMNNAME
EITHER FORM DISPLAYS THE VALUES OF THE SPECIFIED COLUMN OF A DATAFRAME
>>> import pandas as pd
>>> import numpy as np
>>>dic={"NAME":
["RINKU","PANKAJ","ADITYA","RITU"],"ENG":
[67,88,57,68],"ECO":[85,77,75,87],"IP":
[75,87,84,49],"ACCT":[65,85,75,87]}
>>> df=pd.DataFrame(dic)
>>> print(df)
NAME ENG ECO IP ACCT
0 RINKU 67 85 75 65
1 PANKAJ 88 77 87 85
2 ADITYA 57 75 84 75
3 RITU 68 87 49 87
>>> print(df.NAME)
0 RINKU
1 PANKAJ
2 ADITYA
3 RITU
Name: NAME, dtype: object
>>> print(df['ENG'])
0 67
1 88
2 57
3 68
Name: ENG, dtype: int64
//////////////////////////////////////
DF[ ["COL1","COL2"] ] IT DISPLAYS THE VALUES OF MULTIPLE COLUMNS
>>> print(df[["ENG","IP"]])
ENG IP
0 67 75
1 88 87
2 57 84
3 68 49
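Single brackets and double brackets do not return the same kind of object. As a short sketch on a cut-down version of the marks table: df["ENG"] gives a Series, while df[["ENG"]] gives a DataFrame with one column.

```python
import pandas as pd

# Cut-down version of the marks table used in these notes.
df = pd.DataFrame({"NAME": ["RINKU", "PANKAJ"], "ENG": [67, 88], "IP": [75, 87]})

one = df["ENG"]        # single brackets: a Series
frame = df[["ENG"]]    # double brackets: a DataFrame with one column

print(type(one).__name__, one.shape)
print(type(frame).__name__, frame.shape)
```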
/////////////////////////////////////
SELECTING DATA USING ROW/COLUMN NAMES
/////////////////////////////////////
DF.loc[start row : end row , start col : end col ]
>>> print(df.loc[0,:]) # IT DISPLAYS A SINGLE ROW
NAME RINKU
ENG 67
ECO 85
IP 75
ACCT 65
Name: 0, dtype: object
>>> print(df.loc[0:2,:]) # IT DISPLAYS ROWS FROM 0 TO 2
NAME ENG ECO IP ACCT
0 RINKU 67 85 75 65
1 PANKAJ 88 77 87 85
2 ADITYA 57 75 84 75
>>> print(df.loc[:,"NAME":"IP"]) # IT DISPLAYS ALL ROWS AND COLS FROM NAME TO IP
NAME ENG ECO IP
0 RINKU 67 85 75
1 PANKAJ 88 77 87
2 ADITYA 57 75 84
3 RITU 68 87 49
>>> print(df.loc[0:2,"NAME":"IP"]) # IT DISPLAYS ROWS FROM 0 TO 2 AND COLS FROM NAME TO IP
NAME ENG ECO IP
0 RINKU 67 85 75
1 PANKAJ 88 77 87
2 ADITYA 57 75 84
>>> print(df.loc[:,:]) # it displays all rows and columns
NAME ENG ECO IP ACCT
0 RINKU 67 85 75 65
1 PANKAJ 88 77 87 85
2 ADITYA 57 75 84 75
3 RITU 68 87 49 87
/////////////////////////////////////////////
SELECTING ROWS/COLS USING INDEX
/////////////////////////////////////////////
DF.iloc[ start row index : end row index , start col index : end col index ]
Note : with iloc the end index is excluded, i.e. end means end-1
>>> print(df.iloc[0:3,0:3]) # it displays rows 0 to 2 and cols 0 to 2
NAME ENG ECO
0 RINKU 67 85
1 PANKAJ 88 77
2 ADITYA 57 75
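The two slicing rules are easy to mix up, so here is a short side-by-side sketch on the marks table. A loc slice is made with labels and includes the end label; an iloc slice is made with positions and excludes the end position.

```python
import pandas as pd

dic = {"NAME": ["RINKU", "PANKAJ", "ADITYA", "RITU"], "ENG": [67, 88, 57, 68],
       "ECO": [85, 77, 75, 87], "IP": [75, 87, 84, 49], "ACCT": [65, 85, 75, 87]}
df = pd.DataFrame(dic)

by_label = df.loc[0:2, "NAME":"ECO"]   # rows 0,1,2 and cols NAME,ENG,ECO (end included)
by_pos = df.iloc[0:2, 0:2]             # rows 0,1 and cols NAME,ENG (end excluded)

print(by_label)
print(by_pos)
```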
/////////////////////////////////////////////
SELECTING/ACCESSING INDIVIDUAL ELEMENTS
OF DATAFRAME
///////////////////////////////////////////
>>> df.NAME[0]
'RINKU'
>>> df.NAME[3]
'RITU'
>>> df.IP[2]
84
//////////////////////////////////////////
USING AT
/////////////////////////////////
df.at[ row,col ]
IT IS USED TO FETCH THE VALUE OF A SINGLE CELL OF A DATAFRAME USING ITS ROW LABEL AND COLUMN LABEL.
>>> df.at[0,"NAME"]
'RINKU'
>>> df.at[3,"NAME"]
'RITU'
>>> df.at[2,"IP"]
84
///////////////////////////////////
USING IAT
////////////////////////////////
df.iat[ row index, col index ]
IT IS USED TO FETCH THE VALUE OF A SINGLE CELL OF A DATAFRAME USING ITS ROW POSITION AND COLUMN POSITION.
>>> df.iat[0,0]
'RINKU'
>>> df.iat[3,0]
'RITU'
>>> df.iat[2,3]
84
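With the default index 0, 1, 2... the row label and the row position are the same number, so at and iat look alike. The sketch below uses student names as row labels (an assumption made only for this illustration) to show the difference.

```python
import pandas as pd

# Row labels are names here, chosen only to separate labels from positions.
df = pd.DataFrame({"ENG": [67, 88], "IP": [75, 87]}, index=["RINKU", "PANKAJ"])

by_label = df.at["PANKAJ", "IP"]   # row label and column label
by_pos = df.iat[1, 1]              # row position and column position
print(by_label, by_pos)
```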
/////////////////////////////////////////////////////
////
ASSIGNING/MODIFYING DATA VALUES IN DATAFRAME
/////////////////////////////////////////////////////
///
>>> df["MAT"]=88
>>> df
NAME ENG ECO IP ACCT MAT
0 RINKU 67 85 75 65 88
1 PANKAJ 88 77 87 85 88
2 ADITYA 57 75 84 75 88
3 RITU 68 87 49 87 88
>>> df.loc[4,:]="ram"
>>> df
NAME ENG ECO IP ACCT MAT
0 RINKU 67 85 75 65 88
1 PANKAJ 88 77 87 85 88
2 ADITYA 57 75 84 75 88
3 RITU 68 87 49 87 88
4 ram ram ram ram ram ram
>>> df.loc[5,:]="rahim"
>>> df
NAME ENG ECO IP ACCT MAT
0 RINKU 67 85 75 65 88
1 PANKAJ 88 77 87 85 88
2 ADITYA 57 75 84 75 88
3 RITU 68 87 49 87 88
4 ram ram ram ram ram ram
5 rahim rahim rahim rahim rahim rahim

>>> df.loc[0,"NAME"]="KAMAL"
>>> df
NAME ENG ECO IP ACCT MAT
0 KAMAL 67 85 75 65 88
1 PANKAJ 88 77 87 85 88
2 ADITYA 57 75 84 75 88
3 RITU 68 87 49 87 88
4 ram ram ram ram ram ram
5 rahim rahim rahim rahim rahim rahim
>>> df.at[3,"ENG"]=88
>>> df
NAME ENG ECO IP ACCT MAT
0 KAMAL 67 85 75 65 88
1 PANKAJ 88 77 87 85 88
2 ADITYA 57 75 84 75 88
3 RITU 88 87 49 87 88
4 ram ram ram ram ram ram
5 rahim rahim rahim rahim rahim rahim
//////////////////////////////////////////////
DELETING COLUMNS
///////////////////////////////////////////
>>> df
NAME ENG ECO IP ACCT
0 KAMAL 67 85 75 65
1 PANKAJ 88 77 87 85
2 ADITYA 57 75 84 75
3 RITU 88 87 49 87
4 ram ram ram ram ram
5 rahim rahim rahim rahim rahim
del df["column name"] # it deletes the named column of the DataFrame along with all its values
>>> del df["ACCT"]
>>> df
NAME ENG ECO IP
0 RINKU 67 85 75
1 PANKAJ 88 77 87
2 ADITYA 57 75 84
3 RAMAN 88 87 65
4 ram ram ram ram
5 rahim rahim rahim rahim
df.drop("col/row",axis=0/1) # it returns a new DataFrame without the given row (axis=0) or column (axis=1); df itself is not changed
OR
df.drop("col/row",axis=0/1,inplace=True) # it deletes the row or column from df itself
>>> df.drop(5,axis=0,inplace=True)
>>> df
NAME ENG ECO IP
0 RINKU 67 85 75
1 PANKAJ 88 77 87
2 ADITYA 57 75 84
3 RAMAN 88 87 65
4 ram ram ram ram
df.pop("col name") # it deletes the given column along with all its data
>>> df.pop("IP") # it deletes the column and returns the deleted values
0 75
1 87
2 84
3 65
4 ram
Name: IP, dtype: object
>>> df
NAME ENG ECO
0 RINKU 67 85
1 PANKAJ 88 77
2 ADITYA 57 75
3 RAMAN 88 87
4 ram ram ram
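The difference between drop() with and without inplace=True can be checked with a short sketch. The three-row table is a cut-down version of the marks table.

```python
import pandas as pd

df = pd.DataFrame({"NAME": ["RINKU", "PANKAJ", "ADITYA"],
                   "ENG": [67, 88, 57], "ECO": [85, 77, 75]})

without_eco = df.drop("ECO", axis=1)    # new DataFrame, df still has ECO
without_rows = df.drop(index=[0, 2])    # new DataFrame with rows 0 and 2 removed
still_there = "ECO" in df.columns       # True: df was not changed by drop()

popped = df.pop("ENG")                  # removes ENG from df and returns it as a Series
print(without_eco)
print(without_rows)
print(popped)
```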
////////////////////////////////////////////////////
iterrows( ) and items( ) functions in DataFrame (items( ) was named iteritems( ) before pandas 2.0)
///////////////////////////////////////////////////
df.iterrows( ) # it iterates over the horizontal subsets (rows) as pairs of row index and row Series
>>> df
NAME ENG ECO
0 RINKU 67 85
1 PANKAJ 88 77
2 ADITYA 57 75
3 RAMAN 88 87
4 ram ram ram
>>> for (ri,rs) in df.iterrows():
        print("Row Index = ",ri)
        print("Row Series = ",rs)

Row Index = 0
Row Series = NAME RINKU
ENG 67
ECO 85
Name: 0, dtype: object
Row Index = 1
Row Series = NAME PANKAJ
ENG 88
ECO 77
Name: 1, dtype: object
Row Index = 2
Row Series = NAME ADITYA
ENG 57
ECO 75
Name: 2, dtype: object
Row Index = 3
Row Series = NAME RAMAN
ENG 88
ECO 87
Name: 3, dtype: object
Row Index = 4
Row Series = NAME ram
ENG ram
ECO ram
Name: 4, dtype: object
/////////////////////////////////////////////////////
///////
df.items( ) # it iterates over the vertical subsets (columns) as pairs of column index and column Series. It was named iteritems( ) before pandas 2.0, where the old name was removed.
/////////////////////////////////////////////////////
///////
>>> df
NAME ENG ECO
0 RINKU 67 85
1 PANKAJ 88 77
2 ADITYA 57 75
3 RAMAN 88 87
4 ram ram ram
>>> for (ci,cs) in df.items():
        print("Column Index=",ci)
        print("Column Series=\n",cs)

Column Index= NAME

Column Series=
0 RINKU
1 PANKAJ
2 ADITYA
3 RAMAN
4 ram
Name: NAME, dtype: object
Column Index= ENG
Column Series=
0 67
1 88
2 57
3 88
4 ram
Name: ENG, dtype: object
Column Index= ECO
Column Series=
0 85
1 77
2 75
3 87
4 ram
Name: ECO, dtype: object
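Both loops can do more than print. The sketch below, on a cut-down numeric table, adds up each column with items() and each row with iterrows(). items() is used because iteritems() is no longer available from pandas 2.0 onwards.

```python
import pandas as pd

df = pd.DataFrame({"ENG": [67, 88, 57], "ECO": [85, 77, 75]})

# Column by column: ci is the column label, cs is the column as a Series.
col_totals = {}
for ci, cs in df.items():
    col_totals[ci] = int(cs.sum())

# Row by row: ri is the row label, rs is the row as a Series.
row_totals = []
for ri, rs in df.iterrows():
    row_totals.append(int(rs["ENG"] + rs["ECO"]))

print(col_totals)
print(row_totals)
```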
/////////////////////////////////////////////////////
///////
BINARY OPERATION IN DATAFRAME
/////////////////////////////////////////////////////
//////
df1+df2    df1.add(df2)    df1.radd(df2)    Reverse Addition
df1-df2    df1.sub(df2)    df1.rsub(df2)    Reverse Subtraction
df1*df2    df1.mul(df2)    df1.rmul(df2)    Reverse Multiplication
df1/df2    df1.div(df2)    df1.rdiv(df2)    Reverse Division

Note : An arithmetic operation between two DataFrames is termed a binary operation.
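The reverse methods swap the two operands, which matters for subtraction and division. A small sketch with two made-up DataFrames:

```python
import pandas as pd

# Made-up numbers, chosen so that the results are easy to check by hand.
df1 = pd.DataFrame({"a": [10, 20], "b": [30, 40]})
df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

diff = df1.sub(df2)     # same as df1 - df2
rdiff = df1.rsub(df2)   # same as df2 - df1
print(diff)
print(rdiff)
```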
//////////////////////////////////////////
INSPECTION FUNCTION info( ) and describe()
/////////////////////////////////////////
>>> import pandas as pd
>>> import numpy as np
>>> dic={"NAME":
["RINKU","PANKAJ","ADITYA","RITU"],"ENG":
[67,88,57,68],"ECO":[85,77,75,87],"IP":
[75,87,84,49],"ACCT":[65,85,75,87]}
>>> df=pd.DataFrame(dic)
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 NAME 4 non-null object
1 ENG 4 non-null int64
2 ECO 4 non-null int64
3 IP 4 non-null int64
4 ACCT 4 non-null int64
dtypes: int64(4), object(1)
memory usage: 208.0+ bytes
>>> df.describe()
             ENG        ECO         IP       ACCT
count   4.000000   4.000000   4.000000   4.000000
mean   70.000000  81.000000  73.750000  78.000000
std    12.987173   5.887841  17.269916  10.132456
min    57.000000  75.000000  49.000000  65.000000
25%    64.500000  76.500000  68.500000  72.500000
50%    67.500000  81.000000  79.500000  80.000000
75%    73.000000  85.500000  84.750000  85.500000
max    88.000000  87.000000  87.000000  87.000000
/////////////////////////////////////////
Using head( ) and tail( ) in DataFrame
////////////////////////////////////////
>>> df.head(2)
NAME ENG ECO IP ACCT
0 RINKU 67 85 75 65
1 PANKAJ 88 77 87 85
>>> df.tail(2)
NAME ENG ECO IP ACCT
2 ADITYA 57 75 84 75
3 RITU 68 87 49 87
/////////////////////////////////////////////////////
//////
DATAFRAME FUNCTIONS
axis=0/1 (by default axis is 0)
df.cumsum(), df.sum(), df.max(), df.min(), df.count(), df.std(), df.idxmax(), df.idxmin()
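A short sketch of some of these functions on the ENG and ECO marks shows what the axis argument changes: axis=0 works down each column, axis=1 works across each row.

```python
import pandas as pd

df = pd.DataFrame({"ENG": [67, 88, 57, 68], "ECO": [85, 77, 75, 87]})

col_sum = df.sum()          # axis=0 (default): one total per column
row_sum = df.sum(axis=1)    # axis=1: one total per row
top_row = df.idxmax()       # row label of the largest value in each column
running = df.cumsum()       # running total down each column

print(col_sum)
print(row_sum)
print(top_row)
print(running)
```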
/////////////////////////////////////////////////////
/////
BROADCASTING AND MATCHING
/////////////////////////////////////////////////////
////
MATCHING
When two DataFrames are combined in an arithmetic operation, pandas by default aligns the data on the basis of matching indexes (row and column labels). This behaviour is called matching.
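Matching can be seen when the two DataFrames do not have the same row labels. In the sketch below (made-up data) only the labels "b" and "c" exist in both, so the other rows become NaN. The fill_value parameter of add() is one way to treat a missing partner as 0.

```python
import pandas as pd

# Made-up data: only the labels "b" and "c" are present in both DataFrames.
df1 = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"x": [10, 20, 30]}, index=["b", "c", "d"])

total = df1 + df2                     # rows are matched by label, not by position
filled = df1.add(df2, fill_value=0)   # a missing partner is treated as 0
print(total)
print(filled)
```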
BROADCASTING
The term broadcasting comes from NumPy. It describes how an arithmetic operation is carried out between an array and a scalar value, or between an array and another array of the same size.
Ex. a=np.array([2,3,4])+2 # arithmetic operation with a scalar value
a=np.array([2,3,4])+np.array([5,6,7]) # arithmetic operation with an array of the same size
SIMPLE BROADCASTING
TWO DATAFRAMES WITH SAME SIZE / SHAPE
>>> import pandas as pd
>>> a=[2,4,5]
>>> b=[4,5,6]
>>> df1=pd.DataFrame(a)
>>> df2=pd.DataFrame(b)
>>> print(df1+df2)
0
0 6
1 9
2 11
BROADCASTING WITH SCALAR/CONSTANT
VALUE
>>> import pandas as pd
>>> a=[5,6,7]
>>> df1=pd.DataFrame(a)
>>> print(df1+8)
0
0 13
1 14
2 15
BROADCASTING USING 1D ARRAY
>>> import pandas as pd
>>> a=[[1,2,3],[4,5,6]]
>>> df1=pd.DataFrame(a)
>>> print(df1+[10,11,12])
0 1 2
0 11 13 15
1 14 16 18
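In the example above the list [10,11,12] is matched with the columns, so every row gets the same three additions. As a sketch of the other direction, a Series can be added along the rows with add() and axis=0, giving one value per row. The values 100 and 200 are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

by_column = df1 + [10, 11, 12]                     # one value per column
by_row = df1.add(pd.Series([100, 200]), axis=0)    # one value per row
print(by_column)
print(by_row)
```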
/////////////////////////////////////////////////////
////////////
HANDLING MISSING DATA & FILLING VALUES
/////////////////////////////////////////////////////
//////////
Values with no computational significance are called missing data. In other words, missing data is data which is undefined or unavailable, or for which the user has not entered any value. Pandas represents these missing values as NaN (Not a Number). Missing values can be filled using the fillna( ) method.
>>> import pandas as pd
>>> a=[[1,2,3,4],[8],[10,4]]
>>> df1=pd.DataFrame(a)
>>> print(df1)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 NaN NaN NaN
2 10 4.0 NaN NaN
>>> df1.fillna(0)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 0.0 0.0 0.0
2 10 4.0 0.0 0.0
FILLING A DIFFERENT VALUE FOR EACH SPECIFIED COLUMN
>>> import pandas as pd
>>> a=[[1,2,3,4],[8],[10,4]]
>>> df1=pd.DataFrame(a)
>>> print(df1)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 NaN NaN NaN
2 10 4.0 NaN NaN
>>> df1.fillna({0 : -8,1 : -10,2 : -15})
0 1 2 3
0 1 2.0 3.0 4.0
1 8 -10.0 -15.0 NaN
2 10 4.0 -15.0 NaN
FILLING WITH THE PREVIOUS VALUE (FORWARD FILL)
>>> import pandas as pd
>>> a=[[1,2,3,4],[8],[10,4]]
>>> df1=pd.DataFrame(a)
>>> print(df1)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 NaN NaN NaN
2 10 4.0 NaN NaN

>>> df1.ffill() # same result as fillna(method='ffill'), which newer pandas versions deprecate
0 1 2 3
0 1 2.0 3.0 4.0
1 8 2.0 3.0 4.0
2 10 4.0 3.0 4.0
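Forward fill copies the last known value downward. When a value lying between two known values is wanted instead, interpolate() can be used. The sketch below compares the options on a single made-up column.

```python
import numpy as np
import pandas as pd

# Made-up column with one gap between two known values.
df = pd.DataFrame({"temp": [20.0, np.nan, 26.0]})

forward = df.ffill()        # gap takes the previous value
backward = df.bfill()       # gap takes the next value
linear = df.interpolate()   # gap takes the value midway between its neighbours
print(forward)
print(backward)
print(linear)
```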
/////////////////////////////////////////////////////
//////////////////
dropna()
It is a method of DataFrame which drops every row that contains a NaN value.
>>> import pandas as pd
>>> a=[[1,2,3,4],[8],[10,4]]
>>> df1=pd.DataFrame(a)
>>> print(df1)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 NaN NaN NaN
2 10 4.0 NaN NaN

>>> df1.dropna()
0 1 2 3
0 1 2.0 3.0 4.0
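By default dropna() removes a row as soon as it has a single NaN. The how, thresh and axis parameters relax or redirect this rule, as sketched below on the same data.

```python
import pandas as pd

a = [[1, 2, 3, 4], [8], [10, 4]]
df1 = pd.DataFrame(a)

all_nan_only = df1.dropna(how="all")   # drop a row only if every value is NaN
at_least_two = df1.dropna(thresh=2)    # keep rows with at least 2 non-NaN values
full_columns = df1.dropna(axis=1)      # drop columns (not rows) that contain NaN
print(all_nan_only)
print(at_least_two)
print(full_columns)
```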
CHECKING NaN VALUES IN DATAFRAME
The isnull() function is used to check for NaN values in a DataFrame. It returns True wherever a value is NaN.
>>> import pandas as pd
>>> a=[[1,2,3,4],[8],[10,4]]
>>> df1=pd.DataFrame(a)
>>> print(df1)
0 1 2 3
0 1 2.0 3.0 4.0
1 8 NaN NaN NaN
2 10 4.0 NaN NaN

>>> df1.isnull()
0 1 2 3
0 False False False False
1 False True True True
2 False False True True
/////////////////////////////////////////////////////
///
CONCATENATING THE DATAFRAME
/////////////////////////////////////////////////////
/
pd.concat([df1,df2],axis=0/1) : it concatenates/appends two DataFrames along the given axis, i.e. row-wise (axis=0) or column-wise (axis=1).
>>> import pandas as pd
>>> r1={"roll":[1,2,3,4],"name":
["mohan","kapil","danish","rahul"]}
>>> r2={"roll":[5,6,7,8],"name":
["kavita","gita","sita","dipika"]}
>>> df1=pd.DataFrame(r1)
>>> df2=pd.DataFrame(r2)
>>> df1
roll name
0 1 mohan
1 2 kapil
2 3 danish
3 4 rahul
>>> df2
roll name
0 5 kavita
1 6 gita
2 7 sita
3 8 dipika
>>> df3=pd.concat([df1,df2],axis=0)
>>> df3
roll name
0 1 mohan
1 2 kapil
2 3 danish
3 4 rahul
0 5 kavita
1 6 gita
2 7 sita
3 8 dipika
>>> df4=pd.concat([df1,df2],axis=1)
>>> df4
roll name roll name
0 1 mohan 5 kavita
1 2 kapil 6 gita
2 3 danish 7 sita
3 4 rahul 8 dipika

>>> df5=pd.concat([df1,df2],axis=0,ignore_index=True)
>>> df5
roll name
0 1 mohan
1 2 kapil
2 3 danish
3 4 rahul
4 5 kavita
5 6 gita
6 7 sita
7 8 dipika
///////////////////////////////////////////////
////
MERGE OPERATION IN DATAFRAME
///////////////////////////////////////////////
///
pd.merge(df1,df2,on="fieldname")
It lets the user merge two DataFrames on a common field. By default only the rows whose value in that field is present in both DataFrames are kept.
>>> import pandas as pd
>>> a={"roll":[1,2,3],"name":
["ram","rama","mangal"]}
>>> b={"roll":[3,4,5],"name":
["sohan","kapil","ram"]}
>>> df1=pd.DataFrame(a)
>>> df2=pd.DataFrame(b)
>>> df3=pd.merge(df1,df2,on="roll")
>>> df3
roll name_x name_y
0 3 mangal sohan
>>> df4=pd.merge(df1,df2,on="name")
>>> df4
roll_x name roll_y
0 1 ram 5
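Both results above have a single row because only matching values are kept. As a sketch, the how parameter of pd.merge can keep the unmatched rows too; the missing side is then filled with NaN.

```python
import pandas as pd

a = {"roll": [1, 2, 3], "name": ["ram", "rama", "mangal"]}
b = {"roll": [3, 4, 5], "name": ["sohan", "kapil", "ram"]}
df1 = pd.DataFrame(a)
df2 = pd.DataFrame(b)

outer = pd.merge(df1, df2, on="roll", how="outer")   # every roll from both DataFrames
left = pd.merge(df1, df2, on="roll", how="left")     # every roll from df1 only
print(outer)
print(left)
```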
//////////////////////////////////////////////////////////////
BOOLEAN INDEXING
////////////////////////////////////////////////////////////
DataFrame indexing can be done on Boolean values, i.e. True/False.
>>> import pandas as pd
>>> a={"roll":[1,2,3,4],"name":["abhishek","balkishore","chandan","danish"]}
>>> df1=pd.DataFrame(a,index=[True,False,True,False])
>>> print(df1)

roll name
True 1 abhishek
False 2 balkishore
True 3 chandan
False 4 danish
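A closely related and very common use of Boolean values is the Boolean mask: a list or Series of True/False values, one per row, placed inside the square brackets. Only the rows marked True are returned. The sketch below uses the same data with the default index; the condition roll > 2 is an example chosen for illustration.

```python
import pandas as pd

a = {"roll": [1, 2, 3, 4], "name": ["abhishek", "balkishore", "chandan", "danish"]}
df1 = pd.DataFrame(a)

picked = df1[[True, False, True, False]]   # a hand-written mask, one value per row
mask = df1["roll"] > 2                     # a mask produced by a condition
selected = df1[mask]
print(picked)
print(selected)
```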
/////////////////////////////////////////////////////////////////
BOOLEAN REDUCTION
///////////////////////////////////////////////////////////////
Using the empty attribute and the any() and all() functions of DataFrame, a Boolean result can be summarised. This is termed Boolean reduction.
df.empty
It returns True if the DataFrame is empty, otherwise False.
>>> import pandas as pd
>>> a=pd.DataFrame({"x":[]})
>>> a
Empty DataFrame
Columns: [x]
Index: []
>>> a.empty
True
df.all()
It returns True for a column if all of its values are True or non-zero.
>>> import pandas as pd
>>> a=pd.DataFrame({"x":[True,True],"y":
[True,False],"z":[False,False]})
>>> a
x y z
0 True True False
1 True False False
>>> a.all()
x True
y False
z False
dtype: bool
df.any()
It returns True for a column if any one of its values is True.
>>> import pandas as pd
>>> a=pd.DataFrame({"x":[True,True],"y":
[True,False],"z":[False,False]})
>>> a
x y z
0 True True False
1 True False False

>>> a.any()
x True
y True
z False
dtype: bool
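any() and all() reduce each column by default. As a short sketch on the same data, axis=1 reduces each row instead, and calling the function twice reduces the whole DataFrame to a single True/False.

```python
import pandas as pd

a = pd.DataFrame({"x": [True, True], "y": [True, False], "z": [False, False]})

row_all = a.all(axis=1)        # is every value in the row True?
row_any = a.any(axis=1)        # is at least one value in the row True?
anything = a.any().any()       # is there a True anywhere in the DataFrame?
everything = a.all().all()     # is every value in the DataFrame True?
print(row_all)
print(row_any)
print(anything, everything)
```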
