python-pandas-dataframe

Pandas is a popular library for scientific data analysis, providing functions for reading/writing data, calculating organized data, selecting subsets, merging datasets, reshaping data, and supporting visualization. It features two main data structures: Series (1-D, homogeneous) and DataFrame (2-D, heterogeneous), with various methods for creation from lists, dictionaries, and other data types. DataFrames can perform arithmetic operations, support mutable values, and have customizable row and column indices.

Uploaded by

Paritosh Srivastava


Pandas is one of the most popular Python libraries for data analysis.

It provides various functions for scientific data analysis:
• It can read and write data of different types, such as int, float and string.
• It can perform calculations on data that is organized in rows and columns.
• It can select subsets of data values and merge two data sets.
• It can reshape data.
• It works with the Matplotlib visualization library.

A data structure is a way to store and organize data values in a specific manner so that particular functions can be applied to them.
Examples: array, stack, queue, linked list. Pandas provides two main data structures of its own: Series and DataFrame.

Property      | Series                                    | DataFrame
Dimensions    | One-dimensional (1-D)                     | Two-dimensional (2-D)
Types of data | Homogeneous (all data values should be    | Heterogeneous (data values may be
              | of the same type)                         | of different types)
Value mutable | Yes (values of elements can be changed)   | Yes (values of elements can be changed)
Size mutable  | No. Once a Series is created, its size    | Yes. Once a DataFrame is created, its size
              | cannot be changed. (If an element is      | can still be changed. (Rows and columns
              | added or deleted, a new Series object     | can be added to or deleted from the
              | is created.)                              | DataFrame.)
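
The differences in the table can be checked with a short sketch. The sample names and marks below are made up for illustration; the exact text printed for the dtypes depends on the pandas version.

```python
import pandas as pd

# A Series holds one dtype for all of its values.
marks = pd.Series([23, 45, 76])

# A DataFrame may hold a different dtype in each column.
students = pd.DataFrame({"Name": ["Tina", "Geeta", "Moti"], "Marks": [23, 45, 76]})

# Values are mutable in both structures.
marks[0] = 25
students.loc[0, "Marks"] = 25

# A DataFrame can grow: adding a column changes its shape.
shape_before = students.shape
students["Sports"] = ["Badminton", "Volleyball", "Kabaddi"]
shape_after = students.shape

print(marks.dtype)
print(students.dtypes)
print(shape_before, shape_after)
```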

A DataFrame is another Pandas data structure that represents a 2-dimensional array of indexed data. It stores different types of data in
tabular form as rows and columns.
• Different columns can store different types of data.
• Rows and columns can be added or deleted, which means the size is mutable.
• Data values can be changed, which means the values are mutable.
• Arithmetic operations can be performed on rows and columns.
• It has two indexes: the row index on axis 0 and the column index on axis 1.
• Each data value is identified by the combination of its row index and
column index.
• The indexes can be numbers, characters or strings.
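
A minimal sketch of the properties listed above, using a made-up marks table (the names `marks`, `Maths` and `Science` are only for illustration):

```python
import pandas as pd

# Hypothetical marks table: rows are students, columns are subjects.
marks = pd.DataFrame(
    {"Maths": [70, 80], "Science": [60, 90]},
    index=["Tina", "Rita"],
)

# Arithmetic between columns works element by element.
marks["Total"] = marks["Maths"] + marks["Science"]

# axis=0 runs down the rows (one result per column);
# axis=1 runs across the columns (one result per row).
column_sums = marks.sum(axis=0)
row_sums = marks[["Maths", "Science"]].sum(axis=1)

# A single value is addressed by its row label plus its column label.
ritas_science = marks.loc["Rita", "Science"]

print(marks)
print(column_sums)
print(row_sums)
print(ritas_science)
```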

A DataFrame object can be created by using the following syntax.

Syntax:
pandas.DataFrame(data, index, columns, dtype, copy)
Where
1. data: takes various forms such as ndarray, Series, map, list, dict,
constants or another DataFrame.
2. index: the row labels (optional).
Default is np.arange(n) if no index is passed.
3. columns: the column labels (optional).
Default is np.arange(n) if no column labels are passed.
4. dtype: a single data type to force on all columns (optional). If it is not passed, the type of each column is inferred.
5. copy: controls whether the input data is copied (optional).
In recent pandas versions the default is None, which behaves like copy=False for DataFrame or 2-D ndarray input and like copy=True for dict input.
A pandas DataFrame can be created using various inputs like:

• Lists
• dict
• Series
• NumPy ndarrays
• Another DataFrame
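
The five constructor parameters can be tried together in one call. This is only a sketch: the array contents and the labels `r1`, `r2`, `a`, `b`, `c` are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 2 rows x 3 columns of integers.
values = np.array([[10, 20, 30], [40, 50, 60]])

df = pd.DataFrame(
    data=values,
    index=["r1", "r2"],        # row labels (axis 0)
    columns=["a", "b", "c"],   # column labels (axis 1)
    dtype=float,               # force every column to float
    copy=True,                 # do not share memory with `values`
)

# Changing the DataFrame leaves the source array untouched.
df.loc["r1", "a"] = -1.0

print(df)
print(values[0, 0])
```

Passing the arguments by keyword, as here, makes the call easier to read than relying on their position.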

1. Creation of empty DataFrame by using DataFrame( ):

Syntax:
DataFrame_object = pandas.DataFrame( )
# D and F are capital in DataFrame( )
Example:
#import the pandas library and aliasing as pd
import pandas as pd
df = pd.DataFrame()
print (df)
Output:
Empty DataFrame
Columns: []
Index: []

2. Create a DataFrame from Lists:
import pandas as pd
data = [1,2,3,4,5]
df = pd.DataFrame(data)
print (df)

Output:
   0
0  1
1  2
2  3
3  4
4  5
(The 0 in the first line is the column index. The left column is the row index and the right column holds the values.)

import pandas as pd
data = [['Tina',10],['Naman',12],['Rita',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print (df)

Output:
    Name  Age
0   Tina   10
1  Naman   12
2   Rita   13

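import pandas as pd
data = [['Tina',10],['Naman',12],['Rita',13]]
# dtype=float in DataFrame() applies to every column; recent pandas versions
# raise an error because 'Name' cannot be converted, so only 'Age' is converted here.
df = pd.DataFrame(data,columns=['Name','Age']).astype({'Age': float})
print (df)
Output:

    Name   Age
0   Tina  10.0
1  Naman  12.0
2   Rita  13.0

Note: astype() changes the type of the Age column to floating point. The dtype parameter of DataFrame() does the same for all columns at once, so it suits data in which every column is numeric.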

3. Create a DataFrame from Dict of ndarrays / Lists

All the ndarrays or lists must be of the same length. If an index is passed, then
the length of the index should equal the length of the arrays.
If no index is passed, then by default the index will be range(n),
where n is the array length.

import pandas as pd
data = {'Name':['Tina', 'John', 'Seema', 'Reena'],'Age':[28,34,29,42]}
df = pd.DataFrame(data)
print (df)

Output:
    Name  Age
0   Tina   28
1   John   34
2  Seema   29
3  Reena   42

Note: The values 0, 1, 2, 3 are the default index, assigned using range(n). The columns appear in the order of the dictionary keys (older pandas versions displayed them in sorted order).

import pandas as pd
data = {'Name':['Tina', 'John', 'Seema', 'Reena'],'Age':[28,34,29,42]}
df = pd.DataFrame(data, index=['rank1','rank2','rank3','rank4'])
print (df)

Output:
        Name  Age
rank1   Tina   28
rank2   John   34
rank3  Seema   29
rank4  Reena   42
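
The rule that all lists must have the same length can be seen by deliberately breaking it. The sketch below uses made-up data and simply catches the error that pandas raises; the wording of the message may differ between versions.

```python
import pandas as pd

# Hypothetical data: 'Age' has one value fewer than 'Name'.
bad_data = {"Name": ["Tina", "John", "Seema"], "Age": [28, 34]}

try:
    pd.DataFrame(bad_data)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# An index with the same length as the lists is accepted.
good = pd.DataFrame(
    {"Name": ["Tina", "John"], "Age": [28, 34]},
    index=["rank1", "rank2"],
)

print(error_message)
print(good)
```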

4. Create a DataFrame from List of Dictionaries

A list of dictionaries can be passed as input data to create a
DataFrame. The dictionary keys are by default taken as column
names. The index and columns parameters work the same way for every kind of
input; they are shown first on a dictionary of lists.
import pandas as pd
student_data={'Student':['Tina','Geeta','Moti','Mangal'],
'Marks':[23,45,76,32],
'Sports':['Badminton','Volleyball','Kabaddi','Cricket']}
print("Dictionary is\n",student_data)
df=pd.DataFrame(student_data)
print("DataFrame is\n",df)
Output:
Dictionary is
 {'Student': ['Tina', 'Geeta', 'Moti', 'Mangal'], 'Marks': [23, 45, 76, 32],
'Sports': ['Badminton', 'Volleyball', 'Kabaddi', 'Cricket']}

DataFrame is
  Student  Marks      Sports
0    Tina     23   Badminton
1   Geeta     45  Volleyball
2    Moti     76     Kabaddi
3  Mangal     32     Cricket

Note: The row indices are 0 to 3, as in a Series. The columns of the
DataFrame are the keys of the dictionary, in the order in which they were
written (older pandas versions displayed them in sorted order).
df2=pd.DataFrame(student_data, index=['I','II','III','IV'])
print(df2)
Output:
    Student  Marks      Sports
I      Tina     23   Badminton
II    Geeta     45  Volleyball
III    Moti     76     Kabaddi
IV   Mangal     32     Cricket

df=pd.DataFrame(student_data,index=['I','II','III','IV'],
columns=['Student','Sports'])
print(df)
Output:
    Student      Sports
I      Tina   Badminton
II    Geeta  Volleyball
III    Moti     Kabaddi
IV   Mangal     Cricket

import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print (df)

Output:
   a   b     c
0  1   2   NaN
1  5  10  20.0

Note: NaN (Not a Number) is filled in where a value is missing.

import pandas as pd
data = [{'a': 1, 'b': 2},{'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print (df)

Output:
        a   b     c
first   1   2   NaN
second  5  10  20.0
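
The NaN entries can be located and, if needed, replaced. The sketch below reuses the same list of dictionaries; replacing NaN with 0 is an arbitrary choice made only for illustration.

```python
import pandas as pd

data = [{"a": 1, "b": 2}, {"a": 5, "b": 10, "c": 20}]
df = pd.DataFrame(data)

# Column 'c' has no value in the first dictionary, so it holds NaN there
# and the whole column is stored as float.
missing_mask = df.isna()
c_dtype = df["c"].dtype

# One possible way to replace NaN with a placeholder value.
filled = df.fillna(0)

print(missing_mask)
print(c_dtype)
print(filled)
```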

5. Creation of DataFrame from dictionary of dictionaries

import pandas as pd
Student={'Sci':{'Name':'Babu','Age':15,'City':'Ajmer'},
'Arts':{'Name':'John','Age':17,'City':'Jaipur'},
'Com':{'Name':'Heera','Age':14,'City':'Bikaner'}}
Df=pd.DataFrame(Student)
print(Df)

Output:
        Sci    Arts      Com
Name   Babu    John    Heera
Age      15      17       14
City  Ajmer  Jaipur  Bikaner

Note: Keys of the inner dictionaries make the index and keys of the outer dictionary
make the columns of the DataFrame. (Older pandas versions displayed both in sorted order.)
d1= {'Year-1':1500,'Year-2':2000}
d2= {'Year-1':2500,'Year-3':3000}
dict={'I':d1,'II':d2}
df=pd.DataFrame(dict)
print(df)
Output:
I II
Year-1 1500.0 2500.0
Year-2 2000.0 NaN
Year-3 NaN 3000.0
Example:
dict={'Population':{'Delhi':2000,'Mumbai':3000,'Kolkata':3500,'Chenni':4000},
'Hospitals':{'Delhi':200,'Mumbai':300,'Kolkata':350,'Chenni':400},
'School':{'Delhi':20,'Mumbai':30,'Kolkata':35,'Chenni':40}}
df=pd.DataFrame(dict)
print(df)

Output:
         Population  Hospitals  School
Delhi          2000        200      20
Mumbai         3000        300      30
Kolkata        3500        350      35
Chenni         4000        400      40

Example:
d1={'Delhi':{'Population':100,'Hospitals':12,'Schools':52},
'Mumbai':{'Population':200,'Hospitals':15,'Schools':60},
'Kolkatta':{'Population':250,'Hospitals':17,'Schools':72},
'Chenni':{'Population':300,'Hospitals':42,'Schools':62}}
df=pd.DataFrame(d1)
print(df)

Output:
            Delhi  Mumbai  Kolkatta  Chenni
Population    100     200       250     300
Hospitals      12      15        17      42
Schools        52      60        72      62

6. Create a DataFrame from Dictionary of Series

A dictionary of Series can be passed to form a DataFrame. The resultant
index is the union of all the Series indexes passed.
import pandas as pd
d = {'one' : pd.Series([1, 2, 3], index=['a', 'b', 'c']),
'two' : pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print (df)

Output:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
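
The union of the indexes can also be computed directly, which shows where the NaN in column 'one' comes from. This is a small sketch on the same two Series; the variable names are chosen here for illustration.

```python
import pandas as pd

one = pd.Series([1, 2, 3], index=["a", "b", "c"])
two = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

df = pd.DataFrame({"one": one, "two": two})

# The row index of the DataFrame is the union of both Series indexes.
union_labels = list(one.index.union(two.index))

# Label 'd' exists only in `two`, so column 'one' is NaN there.
missing_in_one = df["one"].isna()

print(union_labels)
print(list(df.index))
print(missing_in_one)
```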

Problem: Write a program to create a DataFrame from a list containing
two lists. The lists contain the target and actual sales figures of four zonal
offices. Give appropriate row labels.
Solution:
import pandas as pd
Target=[5000,6000,7000,8000]
Sales=[1000,2000,3000,4000]
ZoneSales=[Target, Sales]
Df=pd.DataFrame(ZoneSales, columns=['ZoneA', 'ZoneB', 'ZoneC',
'ZoneD'], index=['Target','Sales'])
print(Df)

Output:
        ZoneA  ZoneB  ZoneC  ZoneD
Target   5000   6000   7000   8000
Sales    1000   2000   3000   4000

Problem: Create two Series objects, "staff" and "salary", to store the number
of staff and the salary in various branches. Create another Series object,
"avg", to calculate the average salary of each branch. Then create a
DataFrame to display the records.
import pandas as pd
staff=pd.Series([10,20,40,50])
salary=pd.Series([20000,10000,30000,40000])
avg=salary/staff
org={'People':staff,'Salary':salary,'Average':avg}
df=pd.DataFrame(org)
print(df)
Output:
   People  Salary  Average
0      10   20000   2000.0
1      20   10000    500.0
2      40   30000    750.0
3      50   40000    800.0

7. Create a DataFrame from 2-D array

import numpy as np
import pandas as pd
Narr1=np.array([[10,20,30],[40,50,60]],np.int32)
Df=pd.DataFrame(Narr1)
print(Df)

Output:
    0   1   2
0  10  20  30
1  40  50  60

Df=pd.DataFrame(Narr1, columns=['One','Two','Three'])
print(Df)

Output:
   One  Two  Three
0   10   20     30
1   40   50     60

Df=pd.DataFrame(Narr1,columns=['One','Two','Three'], index=['A','B'])
print(Df)

Output:
   One  Two  Three
A   10   20     30
B   40   50     60

8. Create a DataFrame from another DataFrame

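import pandas as pd
import numpy as np
Df1=pd.DataFrame([[10,20,30],[40,50,60]])
# Pass the existing DataFrame as the data argument to build a new one.
# (Df2=(Df1) would only give the same object a second name.)
Df2=pd.DataFrame(Df1)
print(Df2)

Output:
    0   1   2
0  10  20  30
1  40  50  60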
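
When one DataFrame is made from another, it helps to know whether the result is the same object, a new object, or a fully independent copy. The sketch below compares the three cases; the names `alias`, `rebuilt` and `independent` are made up here. Whether `rebuilt` shares its data with the original depends on the copy parameter and the pandas version, so copy() is the safer choice when independence matters.

```python
import pandas as pd

df1 = pd.DataFrame([[10, 20, 30], [40, 50, 60]])

alias = df1                  # same object under a second name
rebuilt = pd.DataFrame(df1)  # new DataFrame object built from df1
independent = df1.copy()     # new object with its own copy of the data

# Changing df1 is visible through the alias but not in the independent copy.
df1.loc[0, 0] = 99

print(alias is df1)
print(rebuilt is df1)
print(alias.loc[0, 0])
print(independent.loc[0, 0])
```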
DataFrame Attribute | Description
index   | Row labels (index names of the rows) of the DataFrame
columns | Column labels (index names of the columns) of the DataFrame
axes    | Returns a list of both axes (axis 0 for the row index and axis 1 for the columns)
dtypes  | Returns the data type of each column of the DataFrame
size    | Returns an integer: the number of elements in the DataFrame
shape   | Returns a tuple giving the dimensions of the DataFrame (rows, columns)
values  | Returns the NumPy representation of the DataFrame
empty   | Returns True if the DataFrame is empty, otherwise False
ndim    | Returns an integer: the number of axes / dimensions
T       | Transposes index and columns

DataFrame attributes and their output:

import pandas as pd
data=[[10,20,30],[40,50,60]]
Cols=['Col-1','Col-2','Col-3']
Rows=['Row-1','Row-2']
Df1=pd.DataFrame(data,
columns=Cols, index=Rows)
print(Df1)
Output:
       Col-1  Col-2  Col-3
Row-1     10     20     30
Row-2     40     50     60

print("Index Attribute")
print(Df1.index)
Output:
Index Attribute
Index(['Row-1', 'Row-2'], dtype='object')

print("Columns Attribute")
print(Df1.columns)
Output:
Columns Attribute
Index(['Col-1', 'Col-2', 'Col-3'], dtype='object')

print("Axes Attribute")
print(Df1.axes)
Output:
Axes Attribute
[Index(['Row-1', 'Row-2'], dtype='object'), Index(['Col-1', 'Col-2', 'Col-3'], dtype='object')]

print("dtypes Attribute")
print(Df1.dtypes)
Output:
dtypes Attribute
Col-1    int64
Col-2    int64
Col-3    int64
dtype: object

print("size Attribute")
print(Df1.size)
Output:
size Attribute
6

print("shape Attribute")
print(Df1.shape)
Output:
shape Attribute
(2, 3)

print("Values Attribute")
print(Df1.values)
Output:
Values Attribute
[[10 20 30]
 [40 50 60]]

print("ndim Attribute")
print(Df1.ndim)
Output:
ndim Attribute
2

print("empty Attribute")
print(Df1.empty)
Output:
empty Attribute
False

print("Transposing(T) Attribute")
print(Df1.T)
Output:
Transposing(T) Attribute
       Row-1  Row-2
Col-1     10     40
Col-2     20     50
Col-3     30     60

print("No. of rows in DataFrame")
print(len(Df1))
Output:
No. of rows in DataFrame
2
data=[[None,20,30],[40,50,None]]
Cols=['Col-1','Col-2','Col-3']
Rows=['Row-1','Row-2']
Df1=pd.DataFrame(data,
columns=Cols, index=Rows)
print(Df1)
Output:
       Col-1  Col-2  Col-3
Row-1    NaN     20   30.0
Row-2   40.0     50    NaN

print("No. of non NA Values in DataFrame")
print(Df1.count())
Output:
No. of non NA Values in DataFrame
Col-1    1
Col-2    2
Col-3    1

print("No. of non NA Values in Axes-0 of DataFrame")
print(Df1.count(0))
OR
print(Df1.count(axis='index'))
Output:
No. of non NA Values in Axes-0 of DataFrame
Col-1    1
Col-2    2
Col-3    1

print("No. of non NA Values in Axes-1 of DataFrame")
print(Df1.count(1))
OR
print(Df1.count(axis='columns'))
Output:
No. of non NA Values in Axes-1 of DataFrame
Row-1    2
Row-2    2
print("Access column Values from DataFrame")
print(Df1['Col-1'])
Output:
Access column Values from DataFrame
Row-1     NaN
Row-2    40.0
Name: Col-1, dtype: float64

print("Access Multiple column Values from DataFrame")
print(Df1[['Col-1','Col-3']])
Output:
Access Multiple column Values from DataFrame
       Col-1  Col-3
Row-1    NaN   30.0
Row-2   40.0    NaN

print('Access specific Value')
print(Df1['Col-2']['Row-2'])
Output:
Access specific Value
50
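print("change Values of whole column")
Df1['Col-1']=100
print(Df1)
Output:
change Values of whole column
       Col-1  Col-2  Col-3
Row-1    100     20   30.0
Row-2    100     50    NaN

print("change specific Values")
# loc[row, column] sets the value directly. Df1['Col-2']['Row-2']=200 is
# chained assignment, which recent pandas versions warn about or ignore.
Df1.loc['Row-2','Col-2']=200
print(Df1)
Output:
change specific Values
       Col-1  Col-2  Col-3
Row-1    100     20   30.0
Row-2    100    200    NaN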
print("Add New Column")
Df1['Col-4']=400
print(Df1)
Output (shown for Df1 as originally created, before the changes above):
Add New Column
       Col-1  Col-2  Col-3  Col-4
Row-1    NaN     20   30.0    400
Row-2   40.0     50    NaN    400

print("Add New Row")
# loc[] adds a whole row; at[] is meant for one single value.
Df1.loc['Row-3']=300
print(Df1)
Output:
Add New Row
       Col-1  Col-2  Col-3  Col-4
Row-1    NaN   20.0   30.0  400.0
Row-2   40.0   50.0    NaN  400.0
Row-3  300.0  300.0  300.0  300.0
(Depending on the pandas version, whole-number columns may print without the decimal point.)

print("Delete column")
del Df1['Col-4']
print(Df1)
Output:
Delete column
       Col-1  Col-2  Col-3
Row-1    NaN   20.0   30.0
Row-2   40.0   50.0    NaN
Row-3  300.0  300.0  300.0

print("Delete Row")
Df1=Df1.drop('Row-3')
OR
Df1.drop('Row-3',inplace=True)
print(Df1)
Output:
Delete Row
       Col-1  Col-2  Col-3
Row-1    NaN   20.0   30.0
Row-2   40.0   50.0    NaN

print("Rename Index")
Df1.rename(index={'Row-1':'R-1','Row-3':'R-3'}, inplace=True)
print(Df1)
Output (shown for Df1 before Row-3 was deleted):
Rename Index
       Col-1  Col-2  Col-3
R-1      NaN   20.0   30.0
Row-2   40.0   50.0    NaN
R-3    300.0  300.0  300.0

print("Rename Column")
Df1.rename(columns={'Col-1':'C-1','Col-3':'C-3'}, inplace=True)
print(Df1)
Output:
Rename Column
         C-1  Col-2    C-3
R-1      NaN   20.0   30.0
Row-2   40.0   50.0    NaN
R-3    300.0  300.0  300.0

Accessing / Selecting Sub Set from DataFrame:

DF_Object.loc[start_row: end_row, start_column: end_column]
Both the start and the end labels are included in the result.

import pandas as pd
# The keys are written in alphabetical order so that the rows and columns
# appear in the order used by the outputs that follow.
dict={'Hospitals':{'Chenni':400,'Delhi':200,'Kolkata':350,'Mumbai':300},
'Population':{'Chenni':4000,'Delhi':2000,'Kolkata':3500,'Mumbai':3000},
'School':{'Chenni':40,'Delhi':20,'Kolkata':35,'Mumbai':30}}

df=pd.DataFrame(dict)
print(df)

Output:
         Hospitals  Population  School
Chenni         400        4000      40
Delhi          200        2000      20
Kolkata        350        3500      35
Mumbai         300        3000      30

Access a Row:
print("Access a row---------")
print(df.loc['Delhi'])

Output:
Access a row---------
Hospitals 200
Population 2000
School 20

print("Access Multiple row--------")

print(df.loc[['Delhi','Chenni']])
Output:
Access Multiple row--------
        Hospitals  Population  School
Delhi         200        2000      20
Chenni        400        4000      40

print("Access a range of columns-------")
print(df.loc[:,'Population':'School'])

Output:
Access a range of columns-------
         Population  School
Chenni         4000      40
Delhi          2000      20
Kolkata        3500      35
Mumbai         3000      30

print("Access Mix of row & column-------")

print(df.loc['Delhi':'Mumbai','Hospitals':'Population'])

Output:
Access Mix of row & column-------
         Hospitals  Population
Delhi          200        2000
Kolkata        350        3500
Mumbai         300        3000

loc[] Vs iloc[]:
loc[] is used to select a subset of a DataFrame by the
given row and column labels, whereas iloc[] is used to access the
subset by numeric index positions instead
of labels. A loc[] slice includes its end label; an iloc[] slice excludes its end position.
Syntax:
DataFrame_Object.iloc[start row index: end row index, start
column index: end column index]

print(df.iloc[1:4])
Output:
         Hospitals  Population  School
Delhi          200        2000      20
Kolkata        350        3500      35
Mumbai         300        3000      30

print(df.iloc[0:2,1:2])
Output:

        Population
Chenni        4000
Delhi         2000
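
The difference between label slices and position slices is easy to miss, so a small side-by-side sketch may help. The frame below is a cut-down, assumed version of the city data with only two columns.

```python
import pandas as pd

# Hypothetical frame with rows and columns in alphabetical order.
df = pd.DataFrame(
    {"Hospitals": [400, 200, 350, 300], "Population": [4000, 2000, 3500, 3000]},
    index=["Chenni", "Delhi", "Kolkata", "Mumbai"],
)

by_label = df.loc["Delhi":"Kolkata"]   # end label 'Kolkata' is included
by_position = df.iloc[1:2]             # end position 2 is excluded

print(list(by_label.index))      # ['Delhi', 'Kolkata']
print(list(by_position.index))   # ['Delhi']

# Labels can be translated to positions with get_loc() and then used in iloc[].
row_pos = df.index.get_loc("Kolkata")
col_pos = df.columns.get_loc("Population")
print(df.iloc[row_pos, col_pos])  # 3500
```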

Adding / Modifying Row / Column Value in DataFrame:

1. Adding / Modifying Columns Value

df.column_name = new_value (changes an existing column only)
df[column_name] = new_value (changes an existing column or adds a new one)
print("Add New Column=-----")
df['Density']=1200
print(df)

Output:
Add New Column=-----
Hospitals Population School Density
Chenni 400 4000 40 1200
Delhi 200 2000 20 1200
Kolkata 350 3500 35 1200
Mumbai 300 3000 30 1200

print("Change Values of Column=-----")

df['Density']=[1200,1300,1320,1240]
print(df)

Output:
Change Values of Column=-----
Hospitals Population School Density
Chenni 400 4000 40 1200
Delhi 200 2000 20 1300
Kolkata 350 3500 35 1320
Mumbai 300 3000 30 1240
2. Adding / Modifying Row's Value
df.loc[row_name] = value
df.loc[row_name, :] = value
(at[] reads or sets one single value, as in df.at[row_name, column_name]; use loc[] for a whole row.)

print("Add New Row--------")

df.loc['Banglore']=1500
OR
df.loc['Banglore',:]=1500
print(df)

Output:
Add New Row--------
          Hospitals  Population  School
Chenni        400.0      4000.0    40.0
Delhi         200.0      2000.0    20.0
Kolkata       350.0      3500.0    35.0
Mumbai        300.0      3000.0    30.0
Banglore     1500.0      1500.0  1500.0
(Depending on the pandas version, whole numbers may print without the decimal point.)

print("Change Values of Row--------")

df.loc['Banglore',:]=[1500,1300,2300]
print(df)

Output:
Change Values of Row--------
          Hospitals  Population  School
Chenni        400.0      4000.0    40.0
Delhi         200.0      2000.0    20.0
Kolkata       350.0      3500.0    35.0
Mumbai        300.0      3000.0    30.0
Banglore     1500.0      1300.0  2300.0

3. Change / Modify Single Value

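df.loc[row_name, column_name] = new_value

print("Change Population of Delhi to 5000")

# df.Population['Delhi']=5000 is chained assignment, which recent pandas
# versions warn about or ignore; loc[] sets the value directly.
df.loc['Delhi','Population']=5000
print(df)
Output:
Change Population of Delhi to 5000
         Hospitals  Population  School
Chenni         400        4000      40
Delhi          200        5000      20
Kolkata        350        3500      35
Mumbai         300        3000      30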
4. Delete Row / Column
del df[column_name]
df = df.drop(row_name)
print("Delete column--------")
del df['School']
print(df)

Output:
Delete column--------
         Hospitals  Population
Chenni         400        4000
Delhi          200        5000
Kolkata        350        3500
Mumbai         300        3000

print("Delete row-----")
df=df.drop('Mumbai')
print(df)

Output (shown for df with the School column still present):
Delete row-----
         Hospitals  Population  School
Chenni         400        4000      40
Delhi          200        5000      20
Kolkata        350        3500      35

5. Rename Row Index

print("Rename index---")
df=df.rename(index={'Delhi':'New Delhi','Kolkata':'Colcata'})
print(df)
Output:
Rename index---
           Hospitals  Population  School
Chenni           400        4000      40
New Delhi        200        2000      20
Colcata          350        3500      35
Mumbai           300        3000      30

Note: rename() returns a new DataFrame and leaves the original unchanged, so the result must be assigned back as above. Alternatively, pass inplace=True to change the original DataFrame directly:
df.rename(index={'Delhi':'New Delhi','Kolkata':'Colcata'}, inplace=True)

Without the assignment or inplace=True, df still shows the old labels:
df.rename(index={'Delhi':'New Delhi','Kolkata':'Colcata'})
print(df)
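
The effect of inplace=True can be checked on a small frame. This is a sketch with assumed sample data; only the rename() call itself comes from the notes above.

```python
import pandas as pd

df = pd.DataFrame({"School": [20, 35]}, index=["Delhi", "Kolkata"])
mapping = {"Delhi": "New Delhi"}

# 1. Result discarded: df is unchanged.
df.rename(index=mapping)
labels_after_discard = list(df.index)

# 2. Result assigned to a new name: df is still unchanged.
renamed = df.rename(index=mapping)
labels_after_assign = list(df.index)

# 3. inplace=True: df itself is modified.
df.rename(index=mapping, inplace=True)
labels_after_inplace = list(df.index)

print(labels_after_discard)    # ['Delhi', 'Kolkata']
print(list(renamed.index))     # ['New Delhi', 'Kolkata']
print(labels_after_inplace)    # ['New Delhi', 'Kolkata']
```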
6. Rename Column Index
print("Rename Column Index:----")

df.rename(columns={'School':'College'}, inplace=True)
OR
df= df.rename(columns={'School':'College'})
(Do not combine the assignment with inplace=True, because the result of the call would then replace df.)

print(df)

Output:
Rename Column Index:----
           Hospitals  Population  College
Chenni           400        4000       40
New Delhi        200        2000       20
Colcata          350        3500       35
Mumbai           300        3000       30

7. Boolean Indexing
Boolean indexing here means that the index of the DataFrame consists of Boolean values (True
or False), or of 1 and 0. The advantage of a Boolean index is that it divides the DataFrame into
two subgroups.
Example:

print("Boolean Indexing-----")
d=['Mon','Tue','Wed','Thu','Fri','Sat']
cls=[2,0,0,7,0,6]
dic={'Day':d,'No. of Classes':cls}
df=pd.DataFrame(dic,index=[True,False,False,True,False,True])
print(df)
Output:
Boolean Indexing-----
Day No. of Classes
True Mon 2
False Tue 0
False Wed 0
True Thu 7
False Fri 0
True Sat 6
8. Access Values by using Boolean Index
df.loc[True] => shows all records indexed True (use df.loc[1] when the index is built from 1 and 0)
df.loc[False] => shows all records indexed False (use df.loc[0] when the index is built from 1 and 0)

print("Show True Index records----")

print(df.loc[True])

Output:
Show True Index records----
Day No. of Classes
True Mon 2
True Thu 7
True Sat 6

print("Show False Index records----")

print(df.loc[False])

Output:
Show False Index records----
Day No. of Classes
False Tue 0
False Wed 0
False Fri 0

print("Boolean Indexing-----")
d=['Mon','Tue','Wed','Thu','Fri','Sat']
cls=[2,0,0,7,0,6]
dic={'Day':d,'No. of Classes':cls}
df=pd.DataFrame(dic,index=[1,0,0,1,0,1])
print(df)

Output:
Day No. of Classes
1 Mon 2
0 Tue 0
0 Wed 0
1 Thu 7
0 Fri 0
1 Sat 6
print("Show True Index records----")
print(df.loc[1])

Output:
Show True Index records----
Day No. of Classes
1 Mon 2
1 Thu 7
1 Sat 6
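
The same two groups can also be obtained without storing True/False in the index, by building a Boolean mask from a condition on a column. This is a sketch of that alternative on the same data; the variable names are chosen here for illustration.

```python
import pandas as pd

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
classes = [2, 0, 0, 7, 0, 6]
df = pd.DataFrame({"Day": days, "No. of Classes": classes})

# A comparison on a column yields a Series of True/False values.
has_classes = df["No. of Classes"] > 0

# Passing that Series inside [] keeps only the rows marked True;
# ~ inverts the mask to get the other group.
busy_days = df[has_classes]
free_days = df[~has_classes]

print(busy_days)
print(free_days)
```

With a mask the original row index is kept, so the selected rows can still be traced back to their position in df.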

Exporting DataFrame into CSV file.

Use the following template to export a Pandas
DataFrame to a CSV file:
df.to_csv(r'Path where you want to store the exported CSV file\File Name.csv', index = False)

To include the index, simply remove ", index = False" from the code.

import pandas as pd
cars={'Brand':['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
'Price': [22000,25000,27000,35000]}
df=pd.DataFrame(cars)
print("Write DataFrame into csv file-----")
df.to_csv(r'C:\export_dataframe.csv', index = False, header=True)
print(df)

Output:
Write DataFrame into csv file -----
Brand Price
0 Honda Civic 22000
1 Toyota Corolla 25000
2 Ford Focus 27000
3 Audi A4 35000
Importing csv file into DataFrame
A csv (Comma Separated Values) file can be read into a DataFrame by
using read_csv( ) in Pandas.
Syntax:
pandas.read_csv("Path of csv file", header, sep, index_col)

header: Specifies which row will be used as the column
names of the DataFrame. By default (equivalent to header=0)
the first row of the CSV file is treated as column names.
If the csv file doesn't have a header, then simply set header=None.
sep: Specifies a custom delimiter for the CSV input; the default is a
comma.
pd.read_csv('file_name.csv',sep='\t') # Tab as separator
index_col: Sets which column(s) are used as the
index of the DataFrame. The default value is None, in which case pandas uses
a default integer index starting from 0.
pd.read_csv('file_name.csv',index_col='Name')
# 'Name' column as index

import pandas as pd
print("Read csv file and store into DataFrame-----")
# Raw string (r'...') so that the backslash in the path is not read as an escape.
df=pd.read_csv(r'C:\export_dataframe.csv')
print(df)

Output:
Read csv file and store into DataFrame-----
Brand Price
0 Honda Civic 22000
1 Toyota Corolla 25000
2 Ford Focus 27000
3 Audi A4 35000
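
The sep, header and index_col parameters can be tried without touching the disk by writing to an in-memory text buffer. The sketch below assumes a shortened version of the cars data and uses io.StringIO in place of a file path.

```python
import io
import pandas as pd

cars = pd.DataFrame(
    {"Brand": ["Honda Civic", "Toyota Corolla"], "Price": [22000, 25000]}
)

# Write tab-separated text into an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
cars.to_csv(buffer, sep="\t", index=False)

# Read it back: same separator, first row as header, 'Brand' as the index.
buffer.seek(0)
restored = pd.read_csv(buffer, sep="\t", header=0, index_col="Brand")

print(restored)
print(restored.loc["Honda Civic", "Price"])
```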

**Finish**
