
Data Processing Using Python

Powerful Data Structure and Software Ecosystem
ZHANG Li/Dazhuang
Nanjing University
Department of Computer Science and Technology
Department of University Basic Computer Teaching
WHY DICTIONARY?

Why Dictionary?
Use Python to build a simple employee information table of names and salaries, then use the table to look up the salary of Niuyun.

File
# Filename: info.py
names = ['Wangdachui', 'Niuyun', 'Linling', 'Tianqi']
salaries = [3000, 2000, 4500, 8000]
print(salaries[names.index('Niuyun')])

Output:
2000

With two parallel lists the lookup has to go through names.index(). A direct lookup by name, written salaries['Niuyun'], does not work on a list; it needs a dictionary.

Dictionary
• What is a dictionary?
  A mapping type
  – key
  – value
  – key-value pair

Create a Dictionary
• Create a dictionary
  − directly
  − use dict()

[Figure: a list indexed 0 to 3 holding 'Wangdachui', 'Niuyun', 'Linling', 'Tianqi', next to the key lookup cInfo['Niuyun']]

Source

>>> aInfo = {'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}
>>> info = [('Wangdachui', 3000), ('Niuyun', 2000), ('Linling', 4500), ('Tianqi', 8000)]
>>> bInfo = dict(info)
>>> cInfo = dict([['Wangdachui', 3000], ['Niuyun', 2000], ['Linling', 4500], ['Tianqi', 8000]])
>>> dInfo = dict(Wangdachui=3000, Niuyun=2000, Linling=4500, Tianqi=8000)
>>> dInfo
{'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}

Create a Dictionary
How can the default salary be set to 3000 for every employee?

Source

>>> aDict = {}.fromkeys(('Wangdachui', 'Niuyun', 'Linling', 'Tianqi'), 3000)
>>> aDict
{'Wangdachui': 3000, 'Niuyun': 3000, 'Linling': 3000, 'Tianqi': 3000}
>>> sorted(aDict)
['Linling', 'Niuyun', 'Tianqi', 'Wangdachui']

Generate a Dictionary
How can an employee information dictionary be generated from a list of names and a list of salaries?

Source

>>> names = ['Wangdachui', 'Niuyun', 'Linling', 'Tianqi']
>>> salaries = [3000, 2000, 4500, 8000]
>>> dict(zip(names, salaries))
{'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}

Generate a Dictionary
How can a dictionary of company codes and stock prices be generated from the following data?

lf = [('AXP', 'American Express Company', '78.51'),
      ('BA', 'The Boeing Company', '184.76'),
      ('CAT', 'Caterpillar Inc.', '96.39'),
      ('CSCO', 'Cisco Systems,Inc.', '33.71'),
      ('CVX', 'Chevron Corporation', '106.09')]

Expected result:
{'AXP': '78.51', 'BA': '184.76', 'CAT': '96.39', 'CSCO': '33.71', 'CVX': '106.09'}

Generate a Dictionary
File

# Filename: createdict.py
pList = [('AXP', 'American Express Company', '78.51'),
         ('BA', 'The Boeing Company', '184.76'),
         ('CAT', 'Caterpillar Inc.', '96.39'), …]   # the five records listed above
aList = []
bList = []
for i in range(5):
    aStr = pList[i][0]      # company code
    bStr = pList[i][2]      # stock price
    aList.append(aStr)
    bList.append(bStr)
aDict = dict(zip(aList, bList))
print(aDict)

Output:
{'AXP': '78.51', 'BA': '184.76', 'CAT': '96.39', 'CSCO': '33.71', 'CVX': '106.09'}
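
The loop in createdict.py can also be written as a single dict comprehension. The following is only a minimal sketch, assuming the same five records are held in pList; the name price_by_code is introduced here for illustration and is not part of the slides.

```python
# Five (code, company name, price) records, as listed in the slides.
pList = [('AXP', 'American Express Company', '78.51'),
         ('BA', 'The Boeing Company', '184.76'),
         ('CAT', 'Caterpillar Inc.', '96.39'),
         ('CSCO', 'Cisco Systems,Inc.', '33.71'),
         ('CVX', 'Chevron Corporation', '106.09')]

# Dict comprehension: unpack each tuple and keep code -> price,
# ignoring the company name in the middle.
price_by_code = {code: price for code, _name, price in pList}
print(price_by_code)
```

Unpacking the tuple in the loop header avoids the index arithmetic (pList[i][0], pList[i][2]) and the two temporary lists.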

USING DICTIONARY

Basic Operation of Dictionary
Source

>>> aInfo = {'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}
>>> aInfo['Niuyun']              # search by key
2000
>>> aInfo['Niuyun'] = 9999       # update
>>> aInfo
{'Wangdachui': 3000, 'Niuyun': 9999, 'Linling': 4500, 'Tianqi': 8000}
>>> aInfo['Fuyun'] = 1000        # insert
>>> aInfo
{'Wangdachui': 3000, 'Niuyun': 9999, 'Linling': 4500, 'Tianqi': 8000, 'Fuyun': 1000}
>>> 'Mayun' in aInfo             # membership test
False
>>> del aInfo['Fuyun']           # delete
>>> aInfo
{'Wangdachui': 3000, 'Niuyun': 9999, 'Linling': 4500, 'Tianqi': 8000}

Built-in Functions of Dictionary
• dict()
• len()
• hash()

Source

>>> names = ['Wangdachui', 'Niuyun', 'Linling', 'Tianqi']
>>> salaries = [3000, 2000, 4500, 8000]
>>> aInfo = dict(zip(names, salaries))
>>> aInfo
{'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}
>>> len(aInfo)
4

Built-in Functions of Dictionary

Source

>>> hash('Wangdachui')           # the value differs from run to run
7716305958664889313
>>> testList = [1, 2, 3]
>>> hash(testList)
Traceback (most recent call last):
  File "<pyshell#127>", line 1, in <module>
    hash(testList)
TypeError: unhashable type: 'list'
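
The hash() result matters because dictionary keys must be hashable. A small sketch of the consequence follows; the names salary_by_id and message are illustrative and do not come from the slides.

```python
# Keys must be hashable: immutable built-ins such as str, int and tuple qualify.
salary_by_id = {('Niuyun', 2): 2000}      # a tuple works as a key
print(salary_by_id[('Niuyun', 2)])

# A list is mutable, hence unhashable, so it cannot be used as a key.
message = ''
try:
    bad = {['Niuyun', 2]: 2000}
except TypeError as err:
    message = str(err)
    print('TypeError:', message)
```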

Dictionary Methods
Given the information dictionary {'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}, how can the employees' names and salaries be output separately?

Source

>>> aInfo = {'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500, 'Tianqi': 8000}
>>> aInfo.keys()
dict_keys(['Wangdachui', 'Niuyun', 'Linling', 'Tianqi'])
>>> aInfo.values()
dict_values([3000, 2000, 4500, 8000])
>>> for k, v in aInfo.items():
...     print(k, v)
...
Wangdachui 3000
Niuyun 2000
Linling 4500
Tianqi 8000

Dictionary Methods
There are two dictionaries: the first holds the original information, the second holds new members and updated values. How can the information be merged and updated?

Source

>>> aInfo = {'Wangdachui': 3000, 'Niuyun': 2000, 'Linling': 4500}
>>> bInfo = {'Wangdachui': 4000, 'Niuyun': 9999, 'Wangzi': 6000}
>>> aInfo.update(bInfo)
>>> aInfo
{'Wangdachui': 4000, 'Niuyun': 9999, 'Linling': 4500, 'Wangzi': 6000}

Dictionary Methods
What is the difference between the two kinds of search operation?

Source

>>> stock = {'AXP': 78.51, 'BA': 184.76}
>>> stock['AAA']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'AAA'

Source

>>> stock = {'AXP': 78.51, 'BA': 184.76}
>>> print(stock.get('AAA'))
None
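
The two lookups differ only in how a missing key is reported. A brief sketch, assuming a fallback price of 0.0 purely for illustration:

```python
stock = {'AXP': 78.51, 'BA': 184.76}

# get() returns None for a missing key unless a default is supplied.
missing = stock.get('AAA')
fallback = stock.get('AAA', 0.0)

# Indexing raises KeyError, which can be caught explicitly.
try:
    price = stock['AAA']
except KeyError:
    price = None

print(missing, fallback, price)
```

Neither form adds 'AAA' to the dictionary.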

Dictionary Methods
• Delete a dictionary

Source

>>> aStock = {'AXP': 78.51, 'BA': 184.76}
>>> bStock = aStock
>>> aStock = {}
>>> bStock
{'AXP': 78.51, 'BA': 184.76}

Source

>>> aStock = {'AXP': 78.51, 'BA': 184.76}
>>> bStock = aStock
>>> aStock.clear()
>>> aStock
{}
>>> bStock
{}

Assigning {} only rebinds the name aStock, so the original dictionary is still reachable through bStock; clear() empties the shared object itself.

Dictionary methods:
clear()   copy()   fromkeys()     get()      items()
keys()    pop()    setdefault()   update()   values()
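
Three of the listed methods that the slides do not demonstrate are copy(), setdefault() and pop(). The sketch below reuses the stock data; the 'CAT' price is borrowed from the earlier stock list, and the variable names existing, inserted and removed are illustrative.

```python
aStock = {'AXP': 78.51, 'BA': 184.76}

# copy() makes a new (shallow) dictionary, so clearing the original
# no longer affects the copy.
bStock = aStock.copy()
aStock.clear()

# setdefault() returns the existing value, or inserts the default if the key is absent.
existing = bStock.setdefault('AXP', 0.0)
inserted = bStock.setdefault('CAT', 96.39)

# pop() removes a key and returns its value.
removed = bStock.pop('BA')
print(aStock, bStock, existing, inserted, removed)
```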

Case Study
• JSON format
  − JavaScript Object Notation
  − A lightweight data exchange format

Source

>>> x = {"name": "Niuyun", "address": {"city": "Beijing", "street": "Chaoyang Road"}}   # after decoding
>>> x['address']['street']
'Chaoyang Road'

• Keyword query with a search engine
  Baidu: http://www.baidu.com/s?wd=%s
  Google: http://www.googlestable.com/search/?q=%us
  Bing (China): http://cn.bing.com/search?q=%us
  Bing (USA): http://www.bing.com/search?q=%us

Source

>>> import requests
>>> kw = {'q': 'Python dict'}
>>> r = requests.get('http://cn.bing.com/search', params = kw)
>>> r.url
>>> print(r.text)
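
The nested dictionary above is what a JSON document looks like after decoding. A minimal sketch of the decode and encode steps with the standard-library json module follows; the text value is a hand-written stand-in for a response body, not output from a real service.

```python
import json

# A JSON document as text, e.g. as it might arrive from a web service.
text = '{"name": "Niuyun", "address": {"city": "Beijing", "street": "Chaoyang Road"}}'

x = json.loads(text)                      # decode: JSON object -> dict
street = x['address']['street']
print(street)

encoded = json.dumps(x, sort_keys=True)   # encode: dict -> JSON text
print(encoded)
```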

Variable Length Keyword Parameter (dict)
Parameter types in a Python function:
• Position or keyword parameter
• Only position parameter
• Variable-length position parameter
• Variable-length keyword parameter
(parameters may also have default values)

Source

>>> def func(args1, *argst, **argsd):
...     print(args1)
...     print(argst)
...     print(argsd)
...
>>> func('Hello,', 'Wangdachui', 'Niuyun', 'Linling', a1=1, a2=2, a3=3)
Hello,
('Wangdachui', 'Niuyun', 'Linling')
{'a1': 1, 'a2': 2, 'a3': 3}
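
The same mechanism works in the other direction: an existing list or dictionary can be unpacked into a call. A short sketch, in which func returns its arguments instead of printing them so the result can be inspected; the names names and options are illustrative.

```python
def func(args1, *argst, **argsd):
    """Return the three kinds of arguments so they can be inspected."""
    return args1, argst, argsd

names = ['Wangdachui', 'Niuyun', 'Linling']
options = {'a1': 1, 'a2': 2, 'a3': 3}

# * unpacks the list into positional arguments,
# ** unpacks the dict into keyword arguments.
first, rest, keywords = func('Hello,', *names, **options)
print(first, rest, keywords)
```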

SET

Set
How can duplicate values be removed from the information table?

Source

>>> names = ['Wangdachui', 'Niuyun', 'Wangzi', 'Wangdachui', 'Linling', 'Niuyun']

>>> namesSet = set(names)
>>> namesSet
{'Wangzi', 'Wangdachui', 'Niuyun', 'Linling'}

Set
• What is a set?
  An unordered collection of elements with no duplicates
  – Mutable set (set)
  – Immutable set (frozenset)

Create a Set

Source

>>> aSet = set('hello')

>>> aSet
{'h', 'e', 'l', 'o'}
>>> fSet = frozenset('hello')
>>> fSet
frozenset({'h', 'e', 'l', 'o'})
>>> type(aSet)
<class 'set'>
>>> type(fSet)
<class 'frozenset'>
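
The practical difference between the two types is mutability. A small sketch of what that implies; the names same_elements, letters_owner and mutated are illustrative.

```python
aSet = set('hello')
fSet = frozenset('hello')

# Equality depends only on the elements, not on mutability.
same_elements = (aSet == fSet)

# A frozenset is hashable, so it can be a dictionary key or a member of another set.
letters_owner = {fSet: 'hello'}

# A frozenset has no add(); trying to mutate it raises AttributeError.
try:
    fSet.add('!')
    mutated = True
except AttributeError:
    mutated = False

print(same_elements, letters_owner[frozenset('olleh')], mutated)
```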

Comparison between Sets
Source

>>> aSet = set('sunrise')
>>> bSet = set('sunset')
>>> 'u' in aSet
True
>>> aSet == bSet
False
>>> aSet < bSet
False
>>> set('sun') < aSet
True

Standard type operators:
Mathematical   Python
∈              in
∉              not in
=              ==
≠              !=
⊂              <
⊆              <=
⊃              >
⊇              >=

Relational Operation
Source

>>> aSet = set('sunrise')
>>> bSet = set('sunset')
>>> aSet & bSet
{'u', 's', 'e', 'n'}
>>> aSet | bSet
{'e', 'i', 'n', 's', 'r', 'u', 't'}
>>> aSet - bSet
{'i', 'r'}
>>> aSet ^ bSet
{'i', 'r', 't'}
>>> aSet -= set('sun')
>>> aSet
{'e', 'i', 'r'}

Set type operators:
Mathematical   Python
∩              &
∪              |
- or \         -
Δ              ^

Compound assignment operators:
&=  |=  -=  ^=

Built-in Methods for Set
• Methods can also be used to do similar work
  − For all sets
    s.issubset(t)
    issuperset(t)
    union(t)
    intersection(t)
    difference(t)
    symmetric_difference(t)
    copy()

Source

>>> aSet = set('sunrise')
>>> bSet = set('sunset')
>>> aSet.issubset(bSet)
False
>>> aSet.intersection(bSet)
{'u', 's', 'e', 'n'}
>>> aSet.difference(bSet)
{'i', 'r'}
>>> cSet = aSet.copy()
>>> cSet
{'s', 'r', 'e', 'i', 'u', 'n'}

Built-in Methods for Set
• Methods can also be used to do similar work
  − For mutable sets
    update(t)
    intersection_update(t)
    difference_update(t)
    symmetric_difference_update(t)
    add(obj)
    remove(obj)
    discard(obj)
    pop()
    clear()

Source

>>> aSet = set('sunrise')
>>> aSet.add('!')
>>> aSet
{'!', 'e', 'i', 'n', 's', 'r', 'u'}
>>> aSet.remove('!')
>>> aSet
{'e', 'i', 'n', 's', 'r', 'u'}
>>> aSet.update('Yeah')
>>> aSet
{'a', 'e', 'i', 'h', 'n', 's', 'r', 'u', 'Y'}
>>> aSet.clear()
>>> aSet
set()
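
remove(), discard() and pop() all delete elements but behave differently at the edges. A brief sketch; the names raised and item are illustrative.

```python
aSet = set('sunrise')

aSet.discard('!')            # absent element: discard() does nothing
try:
    aSet.remove('!')         # absent element: remove() raises KeyError
    raised = False
except KeyError:
    raised = True

item = aSet.pop()            # removes and returns an arbitrary element
print(raised, item, len(aSet))
```

Because sets are unordered, which element pop() returns should not be relied on.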

SCIPY LIBRARY

SciPy
Features
• A software ecosystem based on Python
• Open source
• Serves mathematics, science and engineering

Common Data Types in Python
• Dict
• Numeric
• String
• Set
• Tuple
• List

Other Data Structures
• Data structures in SciPy
  Modifications based on the original Python data structures
  – ndarray (n-dimensional array)
  – Series (variable-length dictionary)
  – DataFrame

NumPy
Features
• Powerful ndarray object and ufunc() functions
• Ingenious functions
• Suitable for scientific computation such as linear algebra and random number handling
• Flexible, general-purpose multi-dimensional data structure
• Easy to connect with databases
Source

>>> import numpy as np

>>> xArray = np.ones((3,4))

SciPy
Features
• A key package for scientific computing in Python, built on NumPy. It offers a richer set of functions and methods than NumPy, and where both provide the same function, the SciPy version is often the more capable one.
• Operates efficiently on NumPy arrays, so NumPy and SciPy work well together.
• A toolbox covering different fields of scientific computing, with modules for interpolation, integration, optimization, image processing and more.
Source

>>> import numpy as np

>>> from scipy import linalg
>>> arr = np.array([[1,2],[3,4]])
>>> linalg.det(arr)
-2.0

Matplotlib
Features
• Based on NumPy
• A 2-dimensional plotting library for rapidly generating all kinds of graphs
• The pyplot module provides a MATLAB-like interface

pandas
Features
• Based on SciPy and NumPy
• Efficient Series and DataFrame structures
• Powerful Python library for scalable data processing
• Efficient slicing of large datasets
• Optimized library functions to read and write many types of files, such as CSV and HDF5
Source

…
>>> df[2 : 5]
>>> df.head(4)
>>> df.tail(3)

NDARRAY

Array in Python
Format

• Use data structures such as list and tuple
  − One-dimensional array: list = [1,2,3,4]
  − Two-dimensional array: list = [[1,2,3],[4,5,6],[7,8,9]]

• array module
  − Create an array with array(), e.g. array.array("B", range(5))
  − Provides methods including append, insert and read (read exists only in Python 2; Python 3 uses fromfile)
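
A short sketch of the array module in use, assuming Python 3. The type code "B" comes from the slide; the appended and inserted values are arbitrary.

```python
from array import array

# Type code "B" stores unsigned bytes (0..255); all items share this type.
a = array("B", range(5))
a.append(5)           # add at the end
a.insert(0, 9)        # insert 9 at position 0
as_list = a.tolist()
print(a, a.itemsize, as_list)

# Values outside the type's range are rejected.
try:
    a.append(256)
    overflowed = False
except OverflowError:
    overflowed = True
```

Unlike a list, the array enforces a single element type, which is the property that ndarray builds on.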

Ndarray
• What is ndarray?
  N-dimensional array
  – Basic data type in NumPy
  – Elements are of the same type
  – Also known by the alias array
  – Reduces memory cost and improves computational efficiency
  – Powerful functions

[Figure: a 5 × 5 grid of cells numbered 0 to 24]

Basic Concepts of Ndarray
• Ndarray attributes
  – Dimensions are called axes; the number of axes is the rank.
  – Basic attributes
    • ndarray.ndim (rank)
    • ndarray.shape (dimensions)
    • ndarray.size (total number of elements)
    • ndarray.dtype (type of element)
    • ndarray.itemsize (size of each item, in bytes)

[Figure: the 5 × 5 ndarray grid of cells numbered 0 to 24, with axis = 0 running down the rows and axis = 1 running across the columns]
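
The attributes can be read directly from any array. The sketch below assumes a 5 × 5 array of the integers 0 to 24, built with np.arange and reshape; the names col_sums and row_sums are illustrative.

```python
import numpy as np

# A 5 x 5 array holding 0..24.
a = np.arange(25).reshape(5, 5)

print(a.ndim)      # number of axes (rank)
print(a.shape)     # length along each axis
print(a.size)      # total number of elements
print(a.dtype)     # element type (the default integer type is platform-dependent)
print(a.itemsize)  # bytes per element

col_sums = a.sum(axis=0)   # collapse axis 0: one sum per column
row_sums = a.sum(axis=1)   # collapse axis 1: one sum per row
```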

Creation of Ndarray
Source

>>> import numpy as np
>>> aArray = np.array([1,2,3])
>>> aArray
array([1, 2, 3])
>>> bArray = np.array([(1,2,3),(4,5,6)])
>>> bArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.arange(1,5,0.5)
array([ 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
>>> np.random.random((2,2))
array([[ 0.79777004, 0.1468679 ],
       [ 0.95838379, 0.86106278]])
>>> np.linspace(1, 2, 10, endpoint=False)
array([ 1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9])

Ndarray creation functions:
arange        array
copy          empty
empty_like    eye
fromfile      fromfunction
identity      linspace
logspace      mgrid
ogrid         ones
ones_like     r_
zeros         zeros_like

Creation of Ndarray
Source

>>> np.ones([2,3])
array([[ 1., 1., 1.],
       [ 1., 1., 1.]])
>>> np.zeros((2,2))
array([[ 0., 0.],
       [ 0., 0.]])
>>> np.fromfunction(lambda i,j:(i+1)*(j+1), (9,9))
array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9.],
       [ 2., 4., 6., 8., 10., 12., 14., 16., 18.],
       [ 3., 6., 9., 12., 15., 18., 21., 24., 27.],
       [ 4., 8., 12., 16., 20., 24., 28., 32., 36.],
       [ 5., 10., 15., 20., 25., 30., 35., 40., 45.],
       [ 6., 12., 18., 24., 30., 36., 42., 48., 54.],
       [ 7., 14., 21., 28., 35., 42., 49., 56., 63.],
       [ 8., 16., 24., 32., 40., 48., 56., 64., 72.],
       [ 9., 18., 27., 36., 45., 54., 63., 72., 81.]])

Ndarray Operations
Source

>>> aArray = np.array([(1,2,3),(4,5,6)])
>>> aArray
array([[1, 2, 3],
       [4, 5, 6]])
>>> print(aArray[1])
[4 5 6]
>>> print(aArray[0:2])
[[1 2 3]
 [4 5 6]]
>>> print(aArray[:,[0,1]])
[[1 2]
 [4 5]]
>>> print(aArray[1,[0,1]])
[4 5]
>>> for row in aArray:
...     print(row)
...
[1 2 3]
[4 5 6]

Ndarray Operations

Source

>>> aArray = np.array([(1,2,3),(4,5,6)])
>>> aArray.shape
(2, 3)
>>> bArray = aArray.reshape(3,2)
>>> bArray
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> aArray
array([[1, 2, 3],
       [4, 5, 6]])

Source

>>> aArray = np.array([(1,2,3),(4,5,6)])
>>> aArray.resize(3,2)
>>> aArray
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> bArray = np.array([1,3,7])
>>> cArray = np.array([3,5,8])
>>> np.vstack((bArray, cArray))
array([[1, 3, 7],
       [3, 5, 8]])
>>> np.hstack((bArray, cArray))
array([1, 3, 7, 3, 5, 8])

reshape() returns a new array object and leaves aArray unchanged, whereas resize() changes aArray in place.

Ndarray Calculation
Use basic operators: +  -  *  /  >

Source

>>> aArray = np.array([(5,5,5),(5,5,5)])
>>> bArray = np.array([(2,2,2),(2,2,2)])
>>> cArray = aArray * bArray
>>> cArray
array([[10, 10, 10],
       [10, 10, 10]])
>>> aArray += bArray
>>> aArray
array([[7, 7, 7],
       [7, 7, 7]])
>>> a = np.array([1,2,3])
>>> b = np.array([[1,2,3],[4,5,6]])
>>> a + b
array([[2, 4, 6],
       [5, 7, 9]])
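
The final a + b works even though the shapes differ, because NumPy broadcasts the one-dimensional array across the rows of the two-dimensional one. A small sketch that makes the implicit step explicit; the names explicit, broadcast and compatible are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])                 # shape (3,)
b = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)

# Broadcasting treats a as if it were repeated along a new first axis.
explicit = np.vstack((a, a)) + b
broadcast = a + b
print(broadcast)

# Shapes that cannot be aligned raise ValueError.
try:
    np.array([1, 2]) + b
    compatible = True
except ValueError:
    compatible = False
```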

Ndarray Calculation
Use basic array statistic methods:
sum      mean
std      var
min      max
argmin   argmax
cumsum   cumprod

Source

>>> aArray = np.array([(1,2,3),(4,5,6)])
>>> aArray.sum()
21
>>> aArray.sum(axis = 0)
array([5, 7, 9])
>>> aArray.sum(axis = 1)
array([ 6, 15])
>>> aArray.min()       # return value
1
>>> aArray.argmax()    # return index
5
>>> aArray.mean()
3.5
>>> aArray.var()
2.9166666666666665
>>> aArray.std()
1.707825127659933

Specific Application: Linear Algebra
Common functions:
dot            Matrix product
linalg.det     Determinant
linalg.inv     Inverse matrix
linalg.solve   Solve a system of linear equations
linalg.eig     Eigenvalues and eigenvectors

Source

>>> import numpy as np
>>> x = np.array([[1,2], [3,4]])
>>> r1 = np.linalg.det(x)
>>> print(r1)          # may print -2.0000000000000004 because of floating-point rounding
-2.0
>>> r2 = np.linalg.inv(x)
>>> print(r2)
[[-2.   1. ]
 [ 1.5 -0.5]]
>>> r3 = np.dot(x, x)
>>> print(r3)
[[ 7 10]
 [15 22]]
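
linalg.solve and linalg.eig are listed but not demonstrated. A minimal sketch follows, reusing the same 2 × 2 matrix; the right-hand side [5, 11] is a made-up example chosen so that the solution is easy to check by hand.

```python
import numpy as np

# Solve the system  x + 2y = 5,  3x + 4y = 11.
A = np.array([[1, 2], [3, 4]])
rhs = np.array([5, 11])
solution = np.linalg.solve(A, rhs)
print(solution)

# Eigen-decomposition: each column v of vecs satisfies A @ v = val * v.
vals, vecs = np.linalg.eig(A)
print(vals)
```

Substituting x = 1, y = 2 into both equations confirms the result.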

ufunc() in Ndarray
• A ufunc (universal function) operates on each element of an array. Because many ufuncs in NumPy are implemented in C, they can be fast.

add, all, any, arange, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, diff, dot, exp, floor, …

File

# Filename: math_numpy.py
import time
import math
import numpy as np
x = np.arange(0, 100, 0.01)
t_m1 = time.process_time()
for i, t in enumerate(x):
    x[i] = math.pow((math.sin(t)), 2)
t_m2 = time.process_time()
y = np.arange(0, 100, 0.01)
t_n1 = time.process_time()
y = np.power(np.sin(y), 2)
t_n2 = time.process_time()
print('Running time of math:', t_m2 - t_m1)
print('Running time of numpy:', t_n2 - t_n1)

SERIES

Series
• Basic features
  − An object similar to a one-dimensional array
  − Consists of data and an index

Source

>>> from pandas import Series
>>> aSer = Series([1,2.0,'a'])
>>> aSer
0      1
1    2.0
2      a
dtype: object

Custom Index of a Series
Source

>>> import pandas as pd
>>> bSer = pd.Series(['apple','peach','lemon'], index = [1,2,3])
>>> bSer
1    apple
2    peach
3    lemon
dtype: object
>>> bSer.index         # pandas 2.x prints Index([1, 2, 3], dtype='int64')
Int64Index([1, 2, 3], dtype='int64')
>>> bSer.values
array(['apple', 'peach', 'lemon'], dtype=object)

Basic Operation of Series
Source

>>> aSer = pd.Series([3,5,7],index = ['a','b','c'])

>>> aSer['b']
5
>>> aSer * 2
a 6
b 10
c 14
dtype: int64
>>> import numpy as np
>>> np.exp(aSer)
a 20.085537
b 148.413159
c 1096.633158
dtype: float64

Data Alignment of Series
Source

>>> data = {'AXP':'86.40','CSCO':'122.64','BA':'99.44'}

>>> sindex = ['AXP','CSCO','BA','AAPL']
>>> aSer = pd.Series(data, index = sindex)
>>> aSer
AXP 86.40
CSCO 122.64
BA 99.44
AAPL NaN
dtype: object
>>> pd.isnull(aSer)
AXP False
CSCO False
BA False
AAPL True
dtype: bool

Data Alignment of Series
• Important feature
  − Aligns data with different indexes during computation

Source

>>> aSer = pd.Series(data, index = sindex)
>>> aSer
AXP      86.40
CSCO    122.64
BA       99.44
AAPL       NaN
dtype: object
>>> bSer = {'AXP':'86.40','CSCO':'122.64','CVX':'23.78'}
>>> cSer = pd.Series(bSer)
>>> aSer + cSer
AAPL             NaN
AXP       86.4086.40
BA               NaN
CSCO    122.64122.64
CVX              NaN
dtype: object

The prices here are strings, so + concatenates them instead of adding them. Labels present in only one Series yield NaN.

Data Alignment of Series
• Important feature
  − Aligns data with different indexes during computation

Source

>>> data = {'AXP':86.40,'CSCO':122.64,'BA':99.44}
>>> aSer = pd.Series(data, index = sindex)
>>> aSer
AXP      86.40
CSCO    122.64
BA       99.44
AAPL       NaN
dtype: float64
>>> bSer = {'AXP':86.40,'CSCO':130.64,'CVX':23.78}
>>> cSer = pd.Series(bSer)
>>> (aSer+cSer)/2
AAPL       NaN
AXP      86.40
BA         NaN
CSCO    126.64
CVX        NaN
dtype: float64
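
To make the alignment rule concrete, here is a plain-Python sketch of the behaviour seen above: take the union of the two indexes, add where both sides have a value, and produce NaN otherwise. It illustrates the rule only and is not how pandas implements it; the helper aligned_add is hypothetical.

```python
import math

def aligned_add(a, b):
    """Add two dicts by key; a key missing on either side yields NaN."""
    result = {}
    for key in sorted(set(a) | set(b)):      # union of the two indexes, sorted
        if key in a and key in b:
            result[key] = a[key] + b[key]
        else:
            result[key] = math.nan
    return result

a = {'AXP': 86.40, 'CSCO': 122.64, 'BA': 99.44, 'AAPL': math.nan}
b = {'AXP': 86.40, 'CSCO': 130.64, 'CVX': 23.78}
mean = {k: v / 2 for k, v in aligned_add(a, b).items()}
print(mean)
```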

DATAFRAME

DataFrame
• Basic features
  − A table-like data structure
  − Has ordered columns (like an index)
  − Can be considered a set of Series sharing the same index
Source

>>> data = {'name': ['Wangdachui', 'Linling', 'Niuyun'], 'pay': [4000, 5000, 6000]}
>>> frame = pd.DataFrame(data)
>>> frame
name pay
0 Wangdachui 4000
1 Linling 5000
2 Niuyun 6000

Index and Value of DataFrame
Source

>>> data = np.array([('Wangdachui', 4000), ('Linling', 5000), ('Niuyun', 6000)])

>>> frame =pd.DataFrame(data, index = range(1, 4), columns = ['name', 'pay'])
>>> frame
name pay
1 Wangdachui 4000
2 Linling 5000
3 Niuyun 6000
>>> frame.index
RangeIndex(start=1, stop=4, step=1)
>>> frame.columns
Index(['name', 'pay'], dtype='object')
>>> frame.values
array([['Wangdachui', '4000'],
['Linling', '5000'],
['Niuyun', '6000']], dtype=object)

Basic Operation of DataFrame
• Querying a row or a column of a DataFrame object returns a Series

         name   pay
0  Wangdachui  4000
1     Linling  5000
2      Niuyun  6000

Source

>>> frame['name']
0    Wangdachui
1       Linling
2        Niuyun
Name: name, dtype: object
>>> frame.pay
0    4000
1    5000
2    6000
Name: pay, dtype: int64

Source

>>> frame.iloc[ : 2, 1]
0    4000
1    5000
Name: pay, dtype: int64

Basic Operation of DataFrame
• Modification and deletion of a DataFrame object

Source

>>> frame['name'] = 'admin'
>>> frame
    name   pay
0  admin  4000
1  admin  5000
2  admin  6000

Source

>>> del frame['pay']
>>> frame
    name
0  admin
1  admin
2  admin
[3 rows x 1 columns]

Statistics with DataFrame
• Find the members with the lowest salary and with high salaries in a DataFrame object

         name   pay
0  Wangdachui  4000
1     Linling  5000
2      Niuyun  6000

Source

>>> frame.pay.min()
'4000'

Source

>>> frame[frame.pay >= '5000']
      name   pay
1  Linling  5000
2   Niuyun  6000

Note that pay holds strings here (the results are quoted), so min() and >= compare text rather than numbers.
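
Comparing salaries as strings happens to give the right answer for these three values, but it is fragile. The plain-Python sketch below shows the pitfall; the extra value '900' is hypothetical and added only to expose it. In pandas, a conversion along the lines of frame['pay'].astype(int) before comparing would be the analogous fix.

```python
# Salaries stored as text; '900' is a hypothetical extra value.
pay_text = ['4000', '5000', '6000', '900']

# Text comparison is lexicographic: '900' sorts after '5000' because '9' > '5'.
high_text = [p for p in pay_text if p >= '5000']

# Converting to int first gives the numeric answer.
pay = [int(p) for p in pay_text]
high_num = [p for p in pay if p >= 5000]

print(high_text, min(pay_text), high_num, min(pay))
```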

Summary
