0% found this document useful (0 votes)

27 views

Topic_4_Basic_programming_concepts

Uploaded by

alhammadheba77

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

27 views

Topic_4_Basic_programming_concepts

Uploaded by

alhammadheba77

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 3

1 Basic Programming
1.1 Python Programming
1.2 Learning Outcomes
1.3 Data Series
2 DataFrame (DF)
2.1 Creating DataFrame
2.1.1 Create DF with Given n-D Array
2.1.2 Create DF with Random n-D Array
2.1.3 Create DF from CSV
2.1.4 Create DF from XLSX
2.2 Accessing Elements
2.2.1 Accessing via .loc
2.2.2 Accessing via .iloc
2.3 DF Slicing
2.3.1 Single Column Slice
2.3.2 Slicing via .loc
2.3.3 Slicing via .iloc
2.4 DF Sorting
2.5 DF Selecting
2.5.1 Example: Displaying Data
2.6 Addition DF Methods
2.6.1 Count Rows and Columns
2.6.2 Apply Lambda Functions
2.6.3 Apply General Functions
3 Data Visualization
3.1 Why Visualization
3.2 Typical Visualizations
3.3 Pandas Plots
3.4 Seaborn Plots
3.4.1 Example: Seaborn's Basic Plots
3.4.2 Example: Seaborn's Advanced Plots
4 References:
4.1 Data Sets:
4.2 Others:

Basic Programming
Python Programming and DataFrames

Python Programming
1. In ICS 104, we have seen Python as a general purpose programming language.
2. Although, Python can be used in many areas that require programming, in this chapter we see the usage of Python w.r.t data reading and visualization.
3. Specifically, we highlight the usage of Pandas and Seaborn library.

Learning Outcomes
1. Build, read and/or manipulate DataFrames.
2. Learn DataFrame slicing, sorting and selection methods.
3. Visualize data using histogram, box plots, scatter plots, etc.

Data Series
1. In Pandas, series are labeled/indexed one-dimensional arrays.
2. The elements of the series could be integer, float, string, objects, etc.
3. The index may not be unique.
4. A collection of Data Series is called as Data Frame.
5. Our focus in this chapter will be on Data Frames.

DataFrame (DF)
Collection of series, where each column is a series.

Creating DataFrame
In order to create a data from, following things are typically needed:

1. Data in n-dimensional array format.

2. Optional column title/header

An optional row index can be given too. In addition to the above, there are many parameters, which can be read from pydata docs.

Create DF with Given n-D Array

Data:

[ ]
a11 a12
(1)
a21 a22

col_names: ['col 1', 'col 2']

row_index: ['row 1', 'row 2']

In [1]: import pandas as pd

df = pd.DataFrame([['a11', 'a12'], ['a21', 'a22']],
index=['row 1', 'row 2'],
columns=['col 1', 'col 2'])

display(df)

col 1 col 2

row 1 a11 a12

row 2 a21 a22

Create DF with Random n-D Array

Data: random 3 columns 1000 rows numbers between 0 and 1.
col_names: ['col 1', 'col 2', 'col 3']

In [2]: import pandas as pd

import numpy as np

df = pd.DataFrame(np.random.rand(1000,3),
columns=['col 1', 'col 2', 'col 3'])

display(df)

col 1 col 2 col 3

0 0.900379 0.450864 0.915527

1 0.107990 0.368869 0.516130

2 0.342405 0.492546 0.788363

3 0.597415 0.860259 0.241468

4 0.095641 0.791185 0.959279

... ... ... ...

995 0.247713 0.397768 0.911072

996 0.023047 0.423800 0.863465

997 0.714096 0.294497 0.737279

998 0.494320 0.450755 0.293238

999 0.552515 0.312760 0.456961

1000 rows × 3 columns

Create DF from CSV

Data: Basic-1.csv
col_names: Use from the first row

1. Display the above data.

2. Obtain the summary statistics:
For numeric data (count, mean, std, min, max)
For non-numeric data (count, unique-values, most occurring and its frequency)

In [3]: #1. Display the above data.

import pandas as pd
df = pd.read_csv('data/Basic-1.csv', delimiter=',')#the path to the data file.

## Path relative to the .ipynb file

# df = pd.read_csv('Basic-101.csv', delimiter=',',header=None,names=['Gender','Job Type',"Province"])

# display data
display(df)

#2. Obtain the summary statistics:

# summary statistics for non-numeric columns is available by default only when all columns are non-numeric
display(df.describe())

Gender Job Type Province

0 M Pink-collar Hejaz

1 M White-collar Central

2 M Pink-collar Hejaz

3 M Pink-collar Hejaz

4 M Gold-collar Eastern

... ... ... ...

1273 M Blue-collar Central

1274 M Pink-collar Hejaz

1275 F Blue-collar Hejaz

1276 F White-collar Central

1277 F Blue-collar Eastern

1278 rows × 3 columns

Gender Job Type Province

count 1278 1278 1278

unique 2 4 3

top M White-collar Central

freq 875 568 531

Create DF from XLSX

Data: Basic-2.xlsx
col_names: There are 17 columns, defined respectively as:

1. dur: duration of agreement

2. wage1.wage : wage increase in first year of contract
3. wage2.wage : wage increase in second year of contract
4. wage3.wage : wage increase in third year of contract
5. cola : cost of living allowance
6. hours.hrs : number of working hours during week
7. pension : employer contributions to pension plan
8. stby_pay : standby pay
9. shift_diff : shift differencial : supplement for work on II and III shift
10. educ_allw.boolean : education allowance
11. holidays : number of statutory holidays
12. vacation : number of paid vacation days
13. lngtrm_disabil.boolean : employer's help during employee longterm disability
14. dntl_ins : employers contribution towards the dental plan
15. bereavement.boolean : employer's financial contribution towards the covering the costs of bereavement
16. empl_hplan : employer's contribution towards the health plan
17. empl_hplan : employer's contribution towards the health plan

For the above data, do the following:

1. Display the above data.

2. Obtain the summary statistics:
For numeric data (count, mean, std, min, max)
For non-numeric data (count, unique-values, most occurring and its frequency)

In [3]: # 1. Display the above data.

import pandas as pd

df = pd.read_excel('data/Basic-2.xlsx')

display(df.head())
#df.tail()
#df.sample(5)

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17

0 1.0 5.0 NaN NaN NaN 40.0 NaN NaN 2.0 NaN 11.0 average NaN NaN yes NaN good

1 2.0 4.5 5.8 NaN NaN 35.0 ret_allw NaN NaN yes 11.0 below average NaN full NaN full good

2 NaN NaN NaN NaN NaN 38.0 empl_contr NaN 5.0 NaN 11.0 generous yes half yes half good

3 3.0 3.7 4.0 5.0 tc NaN NaN NaN NaN yes NaN NaN NaN NaN yes NaN good

4 3.0 4.5 4.5 5.0 NaN 40.0 NaN NaN NaN NaN 12.0 average NaN half yes half good

In [5]: # 2. Obtain the summary statistics:

display(df.describe())
# by default the describe gives summary of the numeric columns, when there are mixed columns
# df.describe(include='all') # can be used, but too many NaNs

c1 c2 c3 c4 c6 c8 c9 c11

count 39.000000 39.000000 30.000000 12.000000 37.000000 7.000000 24.000000 38.000000

mean 2.102564 3.620513 3.913333 3.766667 37.810811 6.142857 4.583333 11.105263

std 0.753758 1.331438 1.280822 1.415070 2.716593 4.845223 4.754098 1.371318

min 1.000000 2.000000 2.000000 2.000000 27.000000 2.000000 0.000000 9.000000

25% 2.000000 2.500000 2.625000 2.250000 37.000000 2.000000 3.000000 10.000000

50% 2.000000 3.500000 4.000000 4.600000 38.000000 4.000000 4.000000 11.000000

75% 3.000000 4.500000 4.500000 5.000000 40.000000 10.000000 5.000000 12.000000

max 3.000000 6.900000 7.000000 5.100000 40.000000 13.000000 25.000000 15.000000

In [6]: # to get summary of non-numeric columns

# Idenfity non-numeric columns

cat_columns= df.select_dtypes(include='object').columns

# Show summary of non-numeric columns

df[cat_columns].describe()

## or in one step
# df.select_dtypes(include='object').describe()

Out[6]: c5 c7 c10 c12 c13 c14 c15 c16 c17

count 24 18 18 37 16 25 20 24 40

unique 3 3 2 3 2 3 2 3 2

top none none no below average yes half yes full good

freq 14 8 11 14 11 11 18 12 26

Accessing Elements
.loc and .iloc methods can be used for accessing the elements of a DF.

Accessing via .loc

Consider the dataset given in Basic-1.csv. Display second row, second column element.

In [7]: import pandas as pd

df = pd.read_csv('data/Basic-1.csv', delimiter=',')
print(df.loc[1,'Job Type'])

White-collar

Accessing via .iloc

Consider the dataset given in Basic-1.csv. Display second row, second column element.

In [8]: import pandas as pd

df = pd.read_csv('data/Basic-1.csv', delimiter=',')
display(df.head())
print(df.iloc[1,1])
print(df.loc[1,'Job Type'])

Gender Job Type Province

0 M Pink-collar Hejaz

1 M White-collar Central

2 M Pink-collar Hejaz

3 M Pink-collar Hejaz

4 M Gold-collar Eastern

White-collar
White-collar

DF Slicing
1. Slicing is nothing but extracting a part of the DF.
2. The part could be contiguous or broken chunks of the DF.

Single Column Slice

A single column can be extracted using the following methods:

1. df.COL_NAME
2. df[COL_NAME]

Consider the dataset given in Basic-2.xlsx. Display first column.

In [9]: import pandas as pd

df = pd.read_excel('data/Basic-2.xlsx')
# df.head()
display(df['c1'])

## Other styles to select a column

# display(df.iloc[:,0])
# display(df.loc[:,'c1'])

## Following code can be used to create column names when dataset does not contain any column names.
# col_name=[''.join(['c',str(i+1)]) for i in range(17)]
## Following options can be used to add column names to the dataset
# header=None, names=col_name

0 1.0
1 2.0
2 NaN
3 3.0
4 3.0
5 2.0
6 3.0
7 3.0
8 2.0
9 1.0
10 3.0
11 2.0
12 2.0
13 3.0
14 1.0
15 2.0
16 1.0
17 1.0
18 1.0
19 2.0
20 2.0
21 2.0
22 3.0
23 2.0
24 1.0
25 3.0
26 2.0
27 2.0
28 2.0
29 3.0
30 3.0
31 3.0
32 2.0
33 2.0
34 3.0
35 2.0
36 1.0
37 1.0
38 3.0
39 2.0
Name: c1, dtype: float64

Slicing via .loc

Consider the dataset given in Basic-2.xlsx. Display first 10 rows from the second, third, fourth and fifth column.

In [10]: import pandas as pd

df = pd.read_excel('data/Basic-2.xlsx')

display(df.head())

display(df.loc[0:9,'c2':'c5']) # ends are inclusive

# display(df.loc[0:9:2,'c2':'c5':2]) # ends are inclusive and steps can be used too

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17

0 1.0 5.0 NaN NaN NaN 40.0 NaN NaN 2.0 NaN 11.0 average NaN NaN yes NaN good

1 2.0 4.5 5.8 NaN NaN 35.0 ret_allw NaN NaN yes 11.0 below average NaN full NaN full good

2 NaN NaN NaN NaN NaN 38.0 empl_contr NaN 5.0 NaN 11.0 generous yes half yes half good

3 3.0 3.7 4.0 5.0 tc NaN NaN NaN NaN yes NaN NaN NaN NaN yes NaN good

4 3.0 4.5 4.5 5.0 NaN 40.0 NaN NaN NaN NaN 12.0 average NaN half yes half good

c2 c3 c4 c5

0 5.0 NaN NaN NaN

1 4.5 5.8 NaN NaN

2 NaN NaN NaN NaN

3 3.7 4.0 5.0 tc

4 4.5 4.5 5.0 NaN

5 2.0 2.5 NaN NaN

6 4.0 5.0 5.0 tc

7 6.9 4.8 2.3 NaN

8 3.0 7.0 NaN NaN

9 5.7 NaN NaN none

Slicing via .iloc

Consider the dataset given in Basic-2.xlsx. Display first 10 rows from the second, third, fourth and fifth column.

In [11]: import pandas as pd

df = pd.read_excel('data/Basic-2.xlsx')

display(df.iloc[:10,1:5])

c2 c3 c4 c5

0 5.0 NaN NaN NaN

1 4.5 5.8 NaN NaN

2 NaN NaN NaN NaN

3 3.7 4.0 5.0 tc

4 4.5 4.5 5.0 NaN

5 2.0 2.5 NaN NaN

6 4.0 5.0 5.0 tc

7 6.9 4.8 2.3 NaN

8 3.0 7.0 NaN NaN

9 5.7 NaN NaN none

In [12]: ## Example of loc and iloc

df = pd.DataFrame([['a11', 'a12'], ['a21', 'a22']],
index=['row 1', 'row 2'],
columns=['col 1', 'col-2'])

display(df)

# display(df.loc[:,'col 1']) # Select one column

# display(df['col 1']) # Select one column

# display(df.col-2) # # Select one column, problem is Spaces or Special characters

# display(df.loc['row 2','col 1'])

# display(df.loc['row 2',:])

# display(df['row 2']) ## check...

# display(df.iloc[:,0]) # Select one column

# display(df.iloc[1,0]) # row 2 column 1

col 1 col-2

row 1 a11 a12

row 2 a21 a22

DF Sorting
1. DataFrames can be sorted w.r.t values in the columns.
2. One or more columns can be used for sorting.

Consider the dataset given in Basic-2.xlsx. Sort the rows according to the first column, in ascending order. Repeat the sort in descending order.

In [13]: import pandas as pd

#Read data
df = pd.read_excel('data/Basic-2.xlsx')
display(df.head())

# # sort by single column in ascending order

df.sort_values(by=['c1'], inplace=True)
display(df.head())

# # sort by single column in descending order

df.sort_values(by=['c1'], inplace=True, ascending=False)
display(df.head())

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17

0 1.0 5.0 NaN NaN NaN 40.0 NaN NaN 2.0 NaN 11.0 average NaN NaN yes NaN good

1 2.0 4.5 5.8 NaN NaN 35.0 ret_allw NaN NaN yes 11.0 below average NaN full NaN full good

2 NaN NaN NaN NaN NaN 38.0 empl_contr NaN 5.0 NaN 11.0 generous yes half yes half good

3 3.0 3.7 4.0 5.0 tc NaN NaN NaN NaN yes NaN NaN NaN NaN yes NaN good

4 3.0 4.5 4.5 5.0 NaN 40.0 NaN NaN NaN NaN 12.0 average NaN half yes half good

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17

0 1.0 5.0 NaN NaN NaN 40.0 NaN NaN 2.0 NaN 11.0 average NaN NaN yes NaN good

16 1.0 2.8 NaN NaN NaN 35.0 NaN NaN 2.0 NaN 12.0 below average NaN NaN NaN NaN good

14 1.0 3.0 NaN NaN none 36.0 NaN NaN 10.0 no 11.0 generous NaN NaN NaN NaN good

24 1.0 6.0 NaN NaN NaN 38.0 NaN 8.0 3.0 NaN 9.0 generous NaN NaN NaN NaN good

17 1.0 2.1 NaN NaN tc 40.0 ret_allw 2.0 3.0 no 9.0 below average yes half NaN none bad

c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17

29 3.0 2.0 2.5 NaN NaN 35.0 none NaN NaN NaN 10.0 average NaN NaN yes full bad

7 3.0 6.9 4.8 2.3 NaN 40.0 NaN NaN 3.0 NaN 12.0 below average NaN NaN NaN NaN good

3 3.0 3.7 4.0 5.0 tc NaN NaN NaN NaN yes NaN NaN NaN NaN yes NaN good

4 3.0 4.5 4.5 5.0 NaN 40.0 NaN NaN NaN NaN 12.0 average NaN half yes half good

34 3.0 2.0 2.5 2.1 tc 40.0 none 2.0 1.0 no 10.0 below average no half yes full bad

Consider the dataset given in Basic-2.xlsx. Sort the rows according to the first and then sixth column, where ascending in the first and descending in the sixth column. Display only first 6 columns.

In [14]: # sort by multiple columns

df.sort_values(by=['c1','c6'], inplace=True, ascending=[True, False])

display(df.iloc[:,0:6])

c1 c2 c3 c4 c5 c6

9 1.0 5.7 NaN NaN none 40.0

17 1.0 2.1 NaN NaN tc 40.0

36 1.0 2.0 NaN NaN tc 40.0

0 1.0 5.0 NaN NaN NaN 40.0

24 1.0 6.0 NaN NaN NaN 38.0

37 1.0 2.8 NaN NaN none 38.0

18 1.0 2.0 NaN NaN none 38.0

14 1.0 3.0 NaN NaN none 36.0

16 1.0 2.8 NaN NaN NaN 35.0

12 2.0 3.5 4.0 NaN none 40.0

39 2.0 4.5 4.0 NaN none 40.0

21 2.0 2.5 3.0 NaN NaN 40.0

23 2.0 4.5 4.0 NaN NaN 40.0

35 2.0 2.0 2.0 NaN none 40.0

33 2.0 4.0 5.0 NaN none 40.0

8 2.0 3.0 7.0 NaN NaN 38.0

11 2.0 6.4 6.4 NaN NaN 38.0

32 2.0 2.5 2.5 NaN NaN 38.0

20 2.0 4.3 4.4 NaN NaN 38.0

15 2.0 4.5 4.0 NaN none 37.0

28 2.0 5.0 4.0 NaN none 37.0

1 2.0 4.5 5.8 NaN NaN 35.0

5 2.0 2.0 2.5 NaN NaN 35.0

19 2.0 4.0 5.0 NaN tcf 35.0

27 2.0 3.0 3.0 NaN none 33.0

26 2.0 4.5 4.5 NaN tcf NaN

7 3.0 6.9 4.8 2.3 NaN 40.0

4 3.0 4.5 4.5 5.0 NaN 40.0

34 3.0 2.0 2.5 2.1 tc 40.0

30 3.0 4.5 4.5 5.0 none 40.0

25 3.0 2.0 2.0 2.0 none 40.0

31 3.0 3.0 2.0 2.5 tc 40.0

38 3.0 2.0 2.5 2.0 NaN 37.0

13 3.0 3.5 4.0 5.1 tcf 37.0

10 3.0 3.5 4.0 4.6 none 36.0

29 3.0 2.0 2.5 NaN NaN 35.0

22 3.0 3.5 4.0 4.6 tcf 27.0

3 3.0 3.7 4.0 5.0 tc NaN

6 3.0 4.0 5.0 5.0 tc NaN

2 NaN NaN NaN NaN NaN 38.0

DF Selecting
Selecting is similar to the slicing. Typically, it deals with the following:

1. We are typically interested in extracting rows based on conditions.

2. Conditions are based on the values of one or more columns.

Consider the dataset given in Basic-1.csv. Display all rows related to White-collar jobs.

In [15]: import pandas as pd

df = pd.read_csv('data/Basic-1.csv', delimiter=',')
display(df.head())

seleted_rows = df['Job Type']=='White-collar'

display(df.loc[seleted_rows,:])

Gender Job Type Province

0 M Pink-collar Hejaz

1 M White-collar Central

2 M Pink-collar Hejaz

3 M Pink-collar Hejaz

4 M Gold-collar Eastern

Gender Job Type Province

1 M White-collar Central

5 M White-collar Hejaz

8 M White-collar Eastern

9 F White-collar Hejaz

10 M White-collar Hejaz

... ... ... ...

1266 M White-collar Hejaz

1267 M White-collar Hejaz

1271 F White-collar Central

1272 M White-collar Central

1276 F White-collar Central

568 rows × 3 columns

Consider the dataset given in Basic-1.csv.

1. Display all rows related to White-collar jobs in Eastern Province.

2. Display the statistical summary of Gender column in the above selection.

In [16]: import pandas as pd

df = pd.read_csv('data/Basic-1.csv', delimiter=',')

seleted_rows = (df['Province']=='Eastern') & (df['Job Type']=='White-collar')

display(df.loc[seleted_rows,:])

# Summary
display(df.loc[seleted_rows,:].describe().iloc[:,0])
# display(df.loc[seleted_rows,:].describe())
# display(df.loc[seleted_rows,:].describe().loc[:,'Gender'])

Gender Job Type Province

8 M White-collar Eastern

18 F White-collar Eastern

27 M White-collar Eastern

31 M White-collar Eastern

37 M White-collar Eastern

... ... ... ...

1226 M White-collar Eastern

1232 M White-collar Eastern

1238 F White-collar Eastern

1241 M White-collar Eastern

1256 M White-collar Eastern

138 rows × 3 columns

count 138
unique 2
top M
freq 98
Name: Gender, dtype: object

Example: Displaying Data

Question-A: Consider the dataset given in Basic-1.csv.

1. Display all rows related to White-collar and Blue-collar jobs.

2. Display all rows related to White-collar and Blue-collar jobs for Females.
3. Display the statistical summary of Province column in the above selection.
4. Among all the White-collar and Blue-collar job Females, what proportion works in Eastern Province.

In [17]: #1. Display all rows related to White-collar and Blue-collar jobs.

import pandas as pd
df = pd.read_csv('data/Basic-1.csv', delimiter=',')
display(df)

seleted_rows = df['Job Type'].isin(['White-collar','Blue-collar'])

# seleted_rows = (df['Job Type']=='White-collar') | (df['Job Type']=='Blue-collar')

display(df.loc[seleted_rows,:])

Gender Job Type Province

0 M Pink-collar Hejaz

1 M White-collar Central

2 M Pink-collar Hejaz

3 M Pink-collar Hejaz

4 M Gold-collar Eastern

... ... ... ...

1273 M Blue-collar Central

1274 M Pink-collar Hejaz

1275 F Blue-collar Hejaz

1276 F White-collar Central

1277 F Blue-collar Eastern

1278 rows × 3 columns

Gender Job Type Province

1 M White-collar Central

5 M White-collar Hejaz

6 M Blue-collar Eastern

7 M Blue-collar Central

8 M White-collar Eastern

... ... ... ...

1272 M White-collar Central

1273 M Blue-collar Central

1275 F Blue-collar Hejaz

1276 F White-collar Central

1277 F Blue-collar Eastern

984 rows × 3 columns

In [18]: # 2. Display all rows related to White-collar and Blue-collar jobs for Females.

import pandas as pd
df = pd.read_csv('data/Basic-1.csv', delimiter=',')

seleted_rows = df['Job Type'].isin(['White-collar','Blue-collar']) & (df['Gender']=='F')

display(df.loc[seleted_rows,:])

# 3. Display the statistical summary of Province column in the above selection.

df.loc[seleted_rows,'Province'].describe()

Gender Job Type Province

9 F White-collar Hejaz

11 F Blue-collar Central

13 F White-collar Central

14 F White-collar Hejaz

15 F White-collar Central

... ... ... ...

1254 F White-collar Hejaz

1271 F White-collar Central

1275 F Blue-collar Hejaz

1276 F White-collar Central

1277 F Blue-collar Eastern

311 rows × 3 columns

Out[18]: count 311

unique 3
top Central
freq 136
Name: Province, dtype: object

In [19]: # 4. Among all the White-collar and Blue-collar job Females, what proportion works in Eastern Province.
ndf = df.loc[seleted_rows,'Province']
# display(ndf)
# print(ndf.value_counts())

proportion = ndf.value_counts()[2]/len(ndf.index)*100

## to get length of dataframe

# print(len(ndf.index))
# print(ndf.shape[0])
# print(len(ndf))

print(f'The proportion of Females working as blue/white collar in Eastern province is: {proportion: 0.2f}%')

The proportion of Females working as blue/white collar in Eastern province is: 22.83%

Addition DF Methods

Count Rows and Columns

Consider the dataset given in Basic-2.csv.

1. Count the number of rows and columns.

2. Count the number of non-null rows for each column.

In [1]: import pandas as pd

df = pd.read_excel('data/Basic-2.xlsx')
display(df.head())

print(f'The number of rows are {len(df.index)}, and the number of columns are {len(df.columns)}')

print(f'The number of non-null rows for each column are:\n{df.count()}')

# print(f'The number of null rows for each column are:\n{df.isna().sum()}')

print(df.shape)

File "C:\Users\TAHIR~1.MAH\AppData\Local\Temp/ipykernel_29224/563380489.py", line 11

print(df.shape)
^
IndentationError: unexpected indent
Apply Lambda Functions
Lambda functions can be applied to each row or column using .apply() method.

Consider the data in file Basic-2-Clean.csv.

1. In columns c1, c6 and c11, convert every number to the nearest integer.
2. In column c12, replace the space between two words by underscore.

In [21]: # 1. In columns c1, c6 and c11, convert every number to the nearest integer.
import pandas as pd
df = pd.read_csv('data/Basic-2-Clean.csv')
display(df.head())

df['c1']=df['c1'].apply(lambda x: int(round(x))) # apply for single column

# df['c1']= (lambda x: round(x))(df['c1']) #another style

df.loc[:,['c6','c11']]=df.loc[:,['c6','c11']].applymap(lambda x: int(round(x)) ) #for multiple columns

display(df.head())

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1.000000 5.000000 3.913333 none 40.000000 2.000000 11.000000 average half full good

1 2.000000 4.500000 5.800000 none 35.000000 4.583333 11.000000 below average full full good

2 2.102564 3.620513 3.913333 none 38.000000 5.000000 11.000000 generous half half good

3 3.000000 3.700000 4.000000 tc 37.810811 4.583333 11.105263 below average half full good

4 3.000000 4.500000 4.500000 none 40.000000 4.583333 12.000000 average half half good

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1 5.000000 3.913333 none 40 2.000000 11 average half full good

1 2 4.500000 5.800000 none 35 4.583333 11 below average full full good

2 2 3.620513 3.913333 none 38 5.000000 11 generous half half good

3 3 3.700000 4.000000 tc 38 4.583333 11 below average half full good

4 3 4.500000 4.500000 none 40 4.583333 12 average half half good

In [22]: # 2. In column c12, replace the space between two words by underscore.
display(df.head())
df['c12']=df['c12'].apply(lambda x: x.replace(' ','_'))
display(df.head())

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1 5.000000 3.913333 none 40 2.000000 11 average half full good

1 2 4.500000 5.800000 none 35 4.583333 11 below average full full good

2 2 3.620513 3.913333 none 38 5.000000 11 generous half half good

3 3 3.700000 4.000000 tc 38 4.583333 11 below average half full good

4 3 4.500000 4.500000 none 40 4.583333 12 average half half good

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1 5.000000 3.913333 none 40 2.000000 11 average half full good

1 2 4.500000 5.800000 none 35 4.583333 11 below_average full full good

2 2 3.620513 3.913333 none 38 5.000000 11 generous half half good

3 3 3.700000 4.000000 tc 38 4.583333 11 below_average half full good

4 3 4.500000 4.500000 none 40 4.583333 12 average half half good

Apply General Functions

Similar to Lambda functions, user defined or inbuilt functions can be applied to each row or column using .apply() method. Moreover, .applymap() method can be used for multiple columns.

Consider the data in file Basic-2-Clean.csv.

In columns c2, c3 and c9, round the values to the nearest 0, 0.5 or 1. If any value is negative, then replace it with zero.

In [23]: def custom_round(x):

if x <=0:
return 0
int_x = int(x)
if (x <= int_x+0.25):
return int_x
elif (x > int_x+0.25) and (x <= int_x+0.75):
return int_x+0.5
else:
return int_x+1

import pandas as pd
df = pd.read_csv('data/Basic-2-Clean.csv')
display(df.head())
# df['c2']=df['c2'].apply(custom_round) # for single column
df.loc[:,['c2','c3','c9']]=df.loc[:,['c2','c3','c9']].applymap(custom_round) # for multiple columns
display(df.head())

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1.000000 5.000000 3.913333 none 40.000000 2.000000 11.000000 average half full good

1 2.000000 4.500000 5.800000 none 35.000000 4.583333 11.000000 below average full full good

2 2.102564 3.620513 3.913333 none 38.000000 5.000000 11.000000 generous half half good

3 3.000000 3.700000 4.000000 tc 37.810811 4.583333 11.105263 below average half full good

4 3.000000 4.500000 4.500000 none 40.000000 4.583333 12.000000 average half half good

c1 c2 c3 c5 c6 c9 c11 c12 c14 c16 c17

0 1.000000 5.0 4.0 none 40.000000 2.0 11.000000 average half full good

1 2.000000 4.5 6.0 none 35.000000 4.5 11.000000 below average full full good

2 2.102564 3.5 4.0 none 38.000000 5.0 11.000000 generous half half good

3 3.000000 3.5 4.0 tc 37.810811 4.5 11.105263 below average half full good

4 3.000000 4.5 4.5 none 40.000000 4.5 12.000000 average half half good

Data Visualization
Graphical representation of data

Why Visualization
1. When data is small (no magic numbers, but say 5 rows and 2 columns), then one may extract meaningful patterns by looking at the data.
2. However, when we have large data (again no magic numbers, but say 5000 rows and 2000 columns), then mere observation may not be fruitful in identifying patterns.
3. Visualization is the first typical step in data analysis.

Typical Visualizations
1. Histogram
2. Box Plots
3. Scatter Plots
4. Line Plots

Pandas Plots
1. Pandas provide in-built basic plots using the numpy and matplotlib libraries.
2. Several plots from pandas are under df.plot, which can be accessed via kind option.
3. In addition to that, specific plots like histogram (df.hist()), and boxplot (df.boxplot()) are also available.
4. However, pandas basic plots are not enough, and in the course we look at the seaborn library for plotting.

In the following cells, we will look at most typical plots that can be constructed using seaborn plots.

Seaborn Plots
1. Seaborn library provides a variety of plots, inlcluding histogram, boxplots, lineplots, scatterplots, countplot, violinplot, swarmplot, pairplot, catplot etc.
2. These plots can be accessed through .countplot, .violinplot, catplot, etc.
3. Typical parameters include

x, y, and/or hue

where x, y refers to the x and y axis, and hue defines subsets of the data, which will be drawn on separate facets in the grid.
4. It also comes with huge customization.
5. The Seaborn Plots are aesthetically better than basic plots from pandas library.
6. The library requires numpy and matplotlib libraries.
7. It is integrated with pandas data.

In the following cell, we will look at most typical plots from the seaborn library.

Example: Seaborn's Basic Plots

Question-B: Consider the data in file Basic-3.csv.

1. Read and display information the data.

2. Add a new column to dataframe that represents total score. The total score is sum of score of math, reading, and writing scores.
3. Draw histograms of both numeric and non-numeric columns.
4. Draw box-plots for the numeric columns, and differentiate by test preparation course column.
5. Draw three overlapping scatter plots of columns math score, reading score and writing score w.r.t Total score. Set the markers transparency level (alpha) to 0.5.
6. Draw scatter plot of columns math score and writing score, where the size of the marker is based on column Total score. In addition to that, reduce the marker transparency (alpha) to 0.5.
7. Plot Total score in ascending order on the x axis, and the corresponding pairwise absolute differences of math score, reading score and writing score on the y axis.

In [24]: # 1. Read and display information the data.

import pandas as pd
df = pd.read_csv('data/Basic-3.csv')

display(df.head())
display(df.info())

gender race/ethnicity parental level of education lunch test preparation course math score reading score writing score

0 female group B bachelor's degree standard none 72 72 74

1 female group C some college standard completed 69 90 88

2 female group B master's degree standard none 90 95 93

3 male group A associate's degree free/reduced none 47 57 44

4 male group C some college standard none 76 78 75

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 gender 1000 non-null object
1 race/ethnicity 1000 non-null object
2 parental level of education 1000 non-null object
3 lunch 1000 non-null object
4 test preparation course 1000 non-null object
5 math score 1000 non-null int64
6 reading score 1000 non-null int64
7 writing score 1000 non-null int64
dtypes: int64(3), object(5)
memory usage: 62.6+ KB
None

In [25]: # 2. Add a new column to dataframe that represents total score.

# The total score is sum of score of math, reading, and writing scores

df['Total score']=df['math score']+df['reading score']+df['writing score']

Out[25]: gender race/ethnicity parental level of education lunch test preparation course math score reading score writing score Total score

0 female group B bachelor's degree standard none 72 72 74 218

1 female group C some college standard completed 69 90 88 247

2 female group B master's degree standard none 90 95 93 278

3 male group A associate's degree free/reduced none 47 57 44 148

4 male group C some college standard none 76 78 75 229

... ... ... ... ... ... ... ... ... ...

995 female group E master's degree standard completed 88 99 95 282

996 male group C high school free/reduced none 62 55 55 172

997 female group C high school free/reduced completed 59 71 65 195

998 female group D some college standard completed 68 78 77 223

999 female group D some college free/reduced none 77 86 86 249

1000 rows × 9 columns

In [26]: %matplotlib inline

# 3. (a) Draw histograms of both numeric and non-numeric columns.

# Load all the libraries

import matplotlib.pyplot as plt
import seaborn as sns

# Identify numeric columns

num_columns = df.select_dtypes(exclude='object').columns

for c in num_columns:
plt.figure()
sns.histplot(x=c,bins=10,data=df);
plt.show()

## To draw a single column

# plt.figure()
# sns.histplot(x='math score',bins=10,data=df);
# plt.show()

In [27]: # (b) For non-numeric columns, count plot option may be used.

# Identify non-numeric columns

obj_columns = df.select_dtypes('object').columns

# Plot for each identified column

for c in obj_columns:
plt.figure()
# sns.histplot(y=c,data=df);
sns.countplot(y=c,data=df);
plt.show()

## To draw a single column

# plt.figure()
# sns.countplot(y='gender',data=df);
# plt.show()

In [28]: # 4. Draw box-plots for the numeric columns, and differentiate by *test preparation course* column.

import matplotlib.pyplot as plt

import seaborn as sns

# Identify numeric columns

num_columns = df.select_dtypes(exclude='object').columns

for c in num_columns:
plt.figure()
sns.boxplot(y=c,x='test preparation course',data=df);
plt.show()

# plt.figure()
# sns.boxplot(y='math score',x='test preparation course',data=df);
# plt.show()

In [29]: # 5. Draw three overlapping scatter plots of columns *math score, reading score* and *writing score*
#w.r.t *Total score*. Set the markers transparency level (alpha) to 0.5.

plt.figure(figsize=(10,8))
sns.scatterplot(x='Total score', y='math score', color='red', label='math score', alpha=0.5, data=df)
sns.scatterplot(x='Total score', y='reading score',color='blue',label='reading score', alpha=0.5, data=df)
sns.scatterplot(x='Total score', y='writing score',color='green',label='writing score', alpha=0.5, data=df)
plt.ylabel("Subject Scores");
plt.xlabel("Total Scores");
plt.title("Scores Comparison");
plt.show()

In [30]: # 6. Draw scatter plot of columns *math score* and *writing score*,
#where the size of the marker is based on column *Total score*.
#In addition to that, reduce the marker transparency (alpha) to 0.5.
-

plt.figure(figsize=(10,10))
sns.scatterplot(x='math score', y='writing score',
size='Total score',sizes=(20,200),
alpha=0.5, color='red',
data=df)
plt.show()

In [31]: # 7. Plot Total score in ascending order on the x axis,

# and the corresponding pairwise absolute differences of *math score, reading score* and *writing score*
# on the y axis.

# add new columns

df['m-r']=np.abs(df['math score']-df['reading score'])
df['m-w']=np.abs(df['math score']-df['writing score'])
df['w-r']=np.abs(df['writing score']-df['reading score'])

# select the data

sel_columns=['m-r','m-w','w-r','Total score']
ndf = df.loc[:,sel_columns]
ndf.set_index('Total score',inplace=True)
ndf.sort_index(inplace=True)
# display(ndf.head())

plt.figure(figsize=(20,10))
sns.lineplot(markers=True, data=ndf) #the plot `y values at each value of x and shows an estimate of the central tendency and a confidence interval for that estimate
plt.show()

# # the same plot can be made using the following code:

# plt.figure(figsize=(20,10))
# sns.lineplot(y='m-r',x='Total score', data=df)
# sns.lineplot(y='m-w',x='Total score', data=df)
# sns.lineplot(y='w-r',x='Total score', data=df)
# plt.show()

## Or to display the points without line

# plt.figure(figsize=(20,10))
# sns.scatterplot(data=ndf)
# plt.show()

Example: Seaborn's Advanced Plots

Question-C: Consider the data in file Basic-4-Clean.csv.

1. Read and display information the data. Remove all rows containing NA values.
2. Depict Category column by count. In another plot, depict Category counts (differentiated) by Type.
3. Draw a boxplot plot of Installs vs Rating columns, ordered by Installs.
4. Draw a violinplot plot of Content Rating vs Rating columns, differentiated w.r.t Type .
5. Plot a barplot depicting count of Category that have Installs above 0.5B.
6. Depicting the Size of Category that have Installs above 0.5B.
7. For all the paid apps, depict Genres by count diferentiated by Content Rating. In another plot depict Genres by Rating.

In [3]: # 1. Read and display information the data. Remove all rows containing NA values.

import pandas as pd
df = pd.read_csv('data/Basic-4-Clean.csv')
print(df.info())
#print(df.shape)# Can also be obtained from info
#print(df.count())# Can also be obtained from info
display(df.sample(5))

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7723 entries, 0 to 7722
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 App 7723 non-null object
1 Category 7723 non-null object
2 Rating 7723 non-null float64
3 Reviews 7723 non-null int64
4 Size 7723 non-null float64
5 Installs 7723 non-null object
6 Type 7723 non-null object
7 Price 7723 non-null float64
8 Content Rating 7723 non-null object
9 Genres 7723 non-null object
10 Last Updated 7723 non-null object
11 Current Ver 7723 non-null object
12 Android Ver 7723 non-null object
dtypes: float64(3), int64(1), object(9)
memory usage: 784.5+ KB
None
App Category Rating Reviews Size Installs Type Price Content Rating Genres Last Updated Current Ver Android Ver

2028 Night Photo Frame PHOTOGRAPHY 4.5 4400 8.3 1000000+ Free 0.0 Everyone Photography 28-Jul-18 2.0.1 4.0.3 and up

1913 The wall SHOPPING 4.4 35563 4.2 1000000+ Free 0.0 Everyone Shopping 23-Jul-18 10.2.1 4.1 and up

3291 Remote For ATT U-verse TV - NOW FREE TOOLS 2.5 20 26000.0 5000+ Free 0.0 Everyone Tools 25-Jul-18 6.1.5 4.2 and up

3860 Heroes Arena GAME 4.2 336386 88000.0 10000000+ Free 0.0 Everyone 10+ Action 24-Jul-18 2.0.1 4.0.3 and up

880 Six Packs for Man–Body Building with No Equipment HEALTH_AND_FITNESS 4.6 2880 7.1 100000+ Free 0.0 Everyone Health & Fitness 24-Jul-18 1.0.8 4.1 and up

In [33]: # 2. Depict *Category* column by count. In another plot, depict *Category* counts (differentiated) by *Type*.

#plot category counts

import seaborn as sns
plt.figure(figsize=(10,10))
ax = sns.countplot(y="Category", data=df)
plt.title('Apps available')
plt.show()

#plot category with type

import seaborn as sns
plt.figure(figsize=(10,10))
ax = sns.countplot(y='Category',hue='Type',data=df)
plt.title('Apps popularity')
plt.show()

In [34]: #3. Draw a boxplot plot of *Installs* vs *Rating* columns, ordered by *Installs*.

#adding a new column

df['Installs_num']=df['Installs'].apply(lambda x:x.replace('+','')).apply(pd.to_numeric)

#the box plot

ax = plt.figure(figsize=(10,5))
sns.boxplot(x="Installs", y="Rating", data=df.sort_values(by=['Installs_num']))
plt.title("Installs vs Rating")
plt.xticks(rotation=80)
plt.show()

In [35]: #4. Draw a violinplot plot of *Content Rating* vs *Rating* columns, differentiated w.r.t *Type* .

import seaborn as sns

plt.figure(figsize=(10,10))
sns.violinplot(x ='Content Rating', y ='Rating', hue='Type', data = df)
plt.title('Apps ratings')
plt.show()
#violinplot plot also depitcs the distribution.

# # contrast with box plots

# plt.figure(figsize=(10,10))
# sns.boxplot(x ='Content Rating', y ='Rating', hue='Type', data = df)
# plt.title('Apps ratings')
# plt.show()

In [36]: #5. Plot a barplot depicting count of *Category* that have *Installs* above 0.5B.
import seaborn as sns
seleted_rows = (df['Installs']=='1000000000+') | (df['Installs']=='500000000+')

plt.figure(figsize=(10,2))
sns.countplot(y ='Category', data = df.loc[seleted_rows,:])
plt.title('App Categories in 0.5B+ & above')
plt.show()

In [37]: #6. Depicting the *Size* of *Category* that have *Installs* above 0.5B.

import seaborn as sns

plt.figure(figsize=(5,5))
# ax=sns.swarmplot(x ='Category', y ='Size', data = df.loc[seleted_rows,:])
# ax=sns.violinplot(x ='Category', y ='Size', data = df.loc[seleted_rows,:])
ax=sns.boxplot(x ='Category', y ='Size', data = df.loc[seleted_rows,:])
plt.xticks(rotation=80)
plt.title('Top Apps sizes in Kb')
plt.show()

In [38]: # 7. For all the paid apps, depict *Genres* by count diferentiated by *Content Rating*.
# In another plot depict *Genres* by *Rating*.

import seaborn as sns

seleted_rows = df['Type']=='Paid'

plt.figure(figsize=(10,20))
sns.countplot(y ='Genres',hue='Content Rating',data = df.loc[seleted_rows,:])
plt.title('Typical Paid Apps Genre Content')
plt.show()

plt.figure(figsize=(10,20))
sns.boxplot(y ='Genres',x='Rating',data = df.loc[seleted_rows,:])
plt.title('Typical Paid Apps Genre Prices')
plt.show()
References:
Data Sets:
1. Basic-2: Labor Relations Data, UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. Published in: Bergadano,
F., Matwin, S., Michalski, R., Zhang, J., Measuring Quality of Concept Descriptions, Procs. of the 3rd European Working Sessions on Learning, Glasgow, October 1988.
2. Basic-3: https://www.kaggle.com/spscientist/students-performance-in-exams
3. Basic-4: modified, https://www.kaggle.com/lava18/google-play-store-apps

Others:
1. Series: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html
2. DataFrames: https://pandas.pydata.org/pandas-docs/stable/reference/frame.html
3. Read CSV: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
4. Read XLSX: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html
5. Visualization: https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html
6. Seaborn: https://seaborn.pydata.org/

Python Cheat Sheet 2.0
100% (1)
Python Cheat Sheet 2.0
10 pages
Final Industrial Project I Prison Management System
100% (2)
Final Industrial Project I Prison Management System
62 pages
Python Cheat Sheet: Pandas - Numpy - Sklearn Matplotlib - Seaborn BS4 - Selenium - Scrapy
100% (4)
Python Cheat Sheet: Pandas - Numpy - Sklearn Matplotlib - Seaborn BS4 - Selenium - Scrapy
11 pages
Python Cheat Sheet: Pandas - Numpy - Sklearn Matplotlib - Seaborn BS4 - Selenium - Scrapy
100% (3)
Python Cheat Sheet: Pandas - Numpy - Sklearn Matplotlib - Seaborn BS4 - Selenium - Scrapy
9 pages
Ciros Studio Manual 1
100% (1)
Ciros Studio Manual 1
154 pages
Python Course Cheat Sheet
No ratings yet
Python Course Cheat Sheet
30 pages
UNIT 1 PYTHON PROGRAMMING-II
No ratings yet
UNIT 1 PYTHON PROGRAMMING-II
15 pages
Fds Mannual
No ratings yet
Fds Mannual
39 pages
Fundamentals of Data Science Lab Manual
No ratings yet
Fundamentals of Data Science Lab Manual
34 pages
Ilovepdf Merged (2) Merged
No ratings yet
Ilovepdf Merged (2) Merged
65 pages
FOD Record Sem 1
No ratings yet
FOD Record Sem 1
25 pages
Black and White Blank Note Document
No ratings yet
Black and White Blank Note Document
57 pages
Pandas Cheatsheet DF
No ratings yet
Pandas Cheatsheet DF
1 page
Practical List 2022-23
100% (1)
Practical List 2022-23
4 pages
Ch-2 Panda: #Import The Pandas Library and Aliasing As PD
No ratings yet
Ch-2 Panda: #Import The Pandas Library and Aliasing As PD
5 pages
3rd Semester DDM AI DAA DEV Print Pages For Spiral Record 25-1-24 - Removed
No ratings yet
3rd Semester DDM AI DAA DEV Print Pages For Spiral Record 25-1-24 - Removed
28 pages
RECORD BOOK PROGRAMS 2024-2025
No ratings yet
RECORD BOOK PROGRAMS 2024-2025
11 pages
Khadeeja_DS_PRACTICAL 4
No ratings yet
Khadeeja_DS_PRACTICAL 4
24 pages
FODS_LAB_MANUAL
No ratings yet
FODS_LAB_MANUAL
26 pages
DSF LAB EXP FULL (1) (1)
No ratings yet
DSF LAB EXP FULL (1) (1)
88 pages
Data Science Practical Problems
No ratings yet
Data Science Practical Problems
40 pages
Practical of R
No ratings yet
Practical of R
38 pages
Python Cheat Sheet For Excel Users
No ratings yet
Python Cheat Sheet For Excel Users
5 pages
Untitled8 (2)
No ratings yet
Untitled8 (2)
2 pages
Ip Practical
No ratings yet
Ip Practical
23 pages
Ip Project Work 2
No ratings yet
Ip Project Work 2
52 pages
Fundamentals of Data science Lab manual new
No ratings yet
Fundamentals of Data science Lab manual new
33 pages
FDS RECORD-1-4
No ratings yet
FDS RECORD-1-4
18 pages
AI Final PDF
No ratings yet
AI Final PDF
38 pages
Data Science Fundamentals Lab
No ratings yet
Data Science Fundamentals Lab
24 pages
Pandas Questions Ip File
No ratings yet
Pandas Questions Ip File
13 pages
Python Cheat Sheet For Excel Users
100% (2)
Python Cheat Sheet For Excel Users
5 pages
Class 12 Practical File
No ratings yet
Class 12 Practical File
29 pages
Python Libraries
No ratings yet
Python Libraries
27 pages
Commands SQL, Python (BASICS)
No ratings yet
Commands SQL, Python (BASICS)
7 pages
IP Practical File - Reference
No ratings yet
IP Practical File - Reference
98 pages
Mohit
No ratings yet
Mohit
19 pages
IP Practical PRGM
No ratings yet
IP Practical PRGM
41 pages
I.p file
No ratings yet
I.p file
20 pages
numpy_dataframe
No ratings yet
numpy_dataframe
12 pages
P03 Introduction To Pandas Ans
No ratings yet
P03 Introduction To Pandas Ans
45 pages
Practical File Questions With Answers
No ratings yet
Practical File Questions With Answers
7 pages
Python For Data Science Cheat Sheet 2.0
No ratings yet
Python For Data Science Cheat Sheet 2.0
11 pages
unit5-pandas 100 ques-ans
No ratings yet
unit5-pandas 100 ques-ans
60 pages
DAV EXP 1 t12 31
No ratings yet
DAV EXP 1 t12 31
39 pages
Practical List
No ratings yet
Practical List
9 pages
DV Lab2 Updated
No ratings yet
DV Lab2 Updated
12 pages
Section 7
No ratings yet
Section 7
33 pages
Assignment - 1.ipynb - Colaboratory
No ratings yet
Assignment - 1.ipynb - Colaboratory
4 pages
Data Science Python Cheat Sheet
No ratings yet
Data Science Python Cheat Sheet
25 pages
Record File Work
No ratings yet
Record File Work
34 pages
Python CheatSheet horizontal
No ratings yet
Python CheatSheet horizontal
2 pages
IP Record Python 23-24 Aryan
No ratings yet
IP Record Python 23-24 Aryan
42 pages
Pragya File
No ratings yet
Pragya File
31 pages
Practical File Python
No ratings yet
Practical File Python
25 pages
Time Series Analysis Group 9
No ratings yet
Time Series Analysis Group 9
16 pages
EX-02-Data manipulation pandas matplot
No ratings yet
EX-02-Data manipulation pandas matplot
9 pages
Dsa Record-1
No ratings yet
Dsa Record-1
153 pages
dv_lab_manual_modified
No ratings yet
dv_lab_manual_modified
31 pages
DAV Guidelines
No ratings yet
DAV Guidelines
4 pages
Python Cheat Sheet For Beginners
No ratings yet
Python Cheat Sheet For Beginners
1 page
Mastering C++ Swiftly
From Everand
Mastering C++ Swiftly
Kameron Hussain
No ratings yet
Topic 8-ISE291 Shimaa Alwadie
No ratings yet
Topic 8-ISE291 Shimaa Alwadie
10 pages
Topic 5-ISE291 Shimaa Alwadie
No ratings yet
Topic 5-ISE291 Shimaa Alwadie
8 pages
Topic 9 Basic Clustering Methods
No ratings yet
Topic 9 Basic Clustering Methods
3 pages
Kobe Bryant_ The Game of His Life ( PDFDrive )
No ratings yet
Kobe Bryant_ The Game of His Life ( PDFDrive )
163 pages
Password and Smart Card Based Door Lock
No ratings yet
Password and Smart Card Based Door Lock
15 pages
Test Bank for Concepts of Database Management, 10th Edition, Lisa Friedrichsen, Joseph J. Adamski, Lisa Ruffolo, Ellen Monk, Joy L. Starks Philip J. Pratt Mary Z. Last - Free Download Available To Read All Chapters
100% (16)
Test Bank for Concepts of Database Management, 10th Edition, Lisa Friedrichsen, Joseph J. Adamski, Lisa Ruffolo, Ellen Monk, Joy L. Starks Philip J. Pratt Mary Z. Last - Free Download Available To Read All Chapters
42 pages
0804 Google Domains
100% (2)
0804 Google Domains
239 pages
Ravi Resume
No ratings yet
Ravi Resume
1 page
05 - Junior Network Adm - Bahan Ajar
No ratings yet
05 - Junior Network Adm - Bahan Ajar
18 pages
Winowicz, Paweł - Dark Web in Ten Minutes - A Quick Guide How To Stay Anonymous Using TOR Network (2020)
No ratings yet
Winowicz, Paweł - Dark Web in Ten Minutes - A Quick Guide How To Stay Anonymous Using TOR Network (2020)
21 pages
8000 Series Expansion Modules: Datasheet
No ratings yet
8000 Series Expansion Modules: Datasheet
6 pages
Operating System: Security
No ratings yet
Operating System: Security
2 pages
74F Extended Octal-Plus Family Applications: Integrated Circuits
No ratings yet
74F Extended Octal-Plus Family Applications: Integrated Circuits
14 pages
OM-E-CL500 Rev.4 - Operator's Manual English HTI CL-500
No ratings yet
OM-E-CL500 Rev.4 - Operator's Manual English HTI CL-500
28 pages
Bryan Jang 1.2.5 Practice - Analyzing Business Ethical Dilemmas (Practice)
No ratings yet
Bryan Jang 1.2.5 Practice - Analyzing Business Ethical Dilemmas (Practice)
5 pages
BeginnersGuide ProPanel en US PDF
No ratings yet
BeginnersGuide ProPanel en US PDF
0 pages
p601 5 3
No ratings yet
p601 5 3
433 pages
LLM Agent Overview
No ratings yet
LLM Agent Overview
1 page
Sadhana Java Resume
No ratings yet
Sadhana Java Resume
6 pages
Hurco-ProCobots_Overview-Brochure_Web
No ratings yet
Hurco-ProCobots_Overview-Brochure_Web
8 pages
User Manual Smart Bracelet I5 PLUS: 【Compatibility】
No ratings yet
User Manual Smart Bracelet I5 PLUS: 【Compatibility】
5 pages
TJ1400 Product Family 10.x FD Guide
No ratings yet
TJ1400 Product Family 10.x FD Guide
42 pages
Final Essay 15% ENGLISH FOR ENGINEERS V
No ratings yet
Final Essay 15% ENGLISH FOR ENGINEERS V
2 pages
COA Solved Question Paper July 2023
No ratings yet
COA Solved Question Paper July 2023
12 pages
Knergy Arquitectura
No ratings yet
Knergy Arquitectura
1 page
Module - 3: 8051 Stack, I/O Port Interfacing and Programming
100% (1)
Module - 3: 8051 Stack, I/O Port Interfacing and Programming
51 pages
Open Gapps Arm 7.1 Mini 20190223.versionlog
No ratings yet
Open Gapps Arm 7.1 Mini 20190223.versionlog
2 pages
Unit3 Pps
No ratings yet
Unit3 Pps
90 pages
AS372 Manual
No ratings yet
AS372 Manual
54 pages
React Native
No ratings yet
React Native
3 pages
9.3.1.4 Packet Tracer - Implementing A Subnetted IPv6 Addressing Scheme Instructions
100% (3)
9.3.1.4 Packet Tracer - Implementing A Subnetted IPv6 Addressing Scheme Instructions
3 pages
Best Game Hub
No ratings yet
Best Game Hub
7 pages