Data Science Python All Units

The document outlines a comprehensive curriculum for Data Science using Python, divided into five units covering topics such as Python programming, file handling, object-oriented programming, NumPy, and data manipulation with Pandas. Each unit includes practical examples and exercises on key concepts like data types, exception handling, data cleaning, and visualization techniques. The curriculum aims to equip learners with essential skills for data analysis and manipulation.

Uploaded by

oneshotsejee
Copyright: © All Rights Reserved

DATA SCIENCE USING PYTHON - COMPLETE UNITS

UNIT 1: INTRODUCTION TO DATA SCIENCE AND PYTHON PROGRAMMING


1. Implement basic Python programs for reading input from console.

name = input("Enter your name: ")
print("Hello", name)

2. Perform creation, indexing, slicing, concatenation and repetition operations on
Python built-in datatypes: Strings, List, Tuples, Dictionary, Set

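s = "Python"            # string
print(s[1:4])           # slicing: prints yth
l = [1,2,3]             # list
print(l[0:2] + l*2)     # a slice concatenated with the list repeated twice
t = (1,2,3)             # tuple
print(t[1:])
d = {'a':1}             # dictionary
print(d['a'])           # lookup by key
st = {1,2}              # set (named st so it does not overwrite the string s)
st.add(3)
print(st)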
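
As a further sketch of item 2, the snippet below applies creation, indexing, slicing, concatenation and repetition to each sequence type. The variable names and sample values are illustrative assumptions, not part of the original exercise. Dictionaries and sets are not sequences, so they have no slicing, `+` or `*`; merging is shown in their place.

```python
# Strings, lists and tuples are sequences: they support indexing,
# slicing, concatenation (+) and repetition (*).
text = "Python"
nums = [1, 2, 3]
point = (1, 2, 3)

first_char = text[0]        # indexing
middle = text[1:4]          # slicing
joined = text + "3"         # concatenation
repeated = text * 2         # repetition

nums_twice = nums + nums    # list concatenation
nums_rep = nums * 2         # list repetition
tail = point[1:]            # tuple slicing
point_ext = point + (4,)    # tuple concatenation (note the trailing comma)

# Dictionaries and sets: no slicing, + or *. Merging is the closest match.
ages = {"a": 1}
merged = {**ages, "b": 2}   # new dict built from an existing one
evens = {2, 4}
union = evens | {6}         # set union

print(middle, joined, repeated, nums_twice, tail, point_ext, merged, union)
```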

3. Solve problems using decision and looping statements.

x = 5
if x > 0:
    print("Positive")
for i in range(3):
    print(i)
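
A slightly larger sketch of item 3, combining `if`/`elif`/`else` with `for` and `while` loops. The function name `classify` and the sample problems are assumptions chosen for illustration.

```python
def classify(n):
    """Return 'Positive', 'Negative' or 'Zero' for a number."""
    if n > 0:
        return "Positive"
    elif n < 0:
        return "Negative"
    else:
        return "Zero"

# for loop: add up the even numbers from 0 to 9
even_total = 0
for i in range(10):
    if i % 2 == 0:
        even_total += i

# while loop: count the digits of a positive integer
value = 12345
digits = 0
while value > 0:
    value //= 10
    digits += 1

print(classify(5), classify(-2), classify(0), even_total, digits)
```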

4. Apply Python built-in data types: Strings, List, Tuples, Dictionary, Set and their
methods to solve any given problem

print("hi".upper())
l = [1,2]
l.append(3)
t = (1,2,3)
print(t.count(2))
d = {'a':1}
print(d.get('a'))
s = {1}
s.add(2)
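
To show these methods working together on one problem, here is a minimal word-frequency sketch. The sample sentence and variable names are assumptions made for the example.

```python
sentence = "the quick brown fox jumps over the lazy dog the end"

words = sentence.split()                 # str.split gives a list of words
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1     # dict.get supplies 0 for unseen words

unique_words = set(words)                # a set removes duplicates
sorted_words = sorted(unique_words)      # sorted returns a new list
most_common = max(counts.items(), key=lambda kv: kv[1])   # (word, count) tuple

print(len(words), len(unique_words), most_common)
```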

5. Handle numerical operations using math and random number functions

import math
import random
print(math.sqrt(25))
print(random.randint(1, 10))
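
A few more `math` and `random` calls, as a sketch. The seed value and sample inputs are arbitrary assumptions; seeding only makes repeated runs give the same sequence.

```python
import math
import random

radius = 2.0
area = math.pi * radius ** 2       # area of a circle
hyp = math.hypot(3, 4)             # length of the hypotenuse
rounded_up = math.ceil(4.2)        # smallest integer >= 4.2
fact = math.factorial(5)           # 5 * 4 * 3 * 2 * 1

random.seed(42)                    # fixed seed for repeatable results
die = random.randint(1, 6)         # both ends are included
fraction = random.random()         # float in [0.0, 1.0)
pick = random.choice(["red", "green", "blue"])
deck = list(range(5))
random.shuffle(deck)               # shuffles the list in place

print(round(area, 2), hyp, rounded_up, fact, die, pick, deck)
```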

6. Create user-defined functions with different types of function arguments.

def greet(name, msg="Hi"):
    print(msg, name)

greet("Nitesh")
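
The example above uses a default argument only. The sketch below covers the other common argument types: positional, keyword, variable-length (`*args`, `**kwargs`) and keyword-only. Apart from `greet`, the function names are hypothetical, and `greet` returns its text here instead of printing it so the results can be compared.

```python
def greet(name, msg="Hi"):                 # positional + default argument
    return f"{msg} {name}"

def total(*numbers):                       # any number of positional arguments
    return sum(numbers)

def describe(**fields):                    # any number of keyword arguments
    return ", ".join(f"{k}={v}" for k, v in fields.items())

def divide(a, b, *, precision=2):          # precision must be passed by keyword
    return round(a / b, precision)

r1 = greet("Nitesh")                       # default msg is used
r2 = greet("Nitesh", "Hello")              # both passed by position
r3 = greet(msg="Hey", name="Nitesh")       # keyword arguments, any order
r4 = total(1, 2, 3)
r5 = describe(unit=1, topic="functions")
r6 = divide(10, 3, precision=3)

print(r1, r2, r3, r4, r5, r6, sep=" | ")
```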

UNIT 2: FILE, EXCEPTION HANDLING AND OOP


1. Create packages and import modules from packages.

from mypkg import module1

module1.say_hello()
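
The import above only works if a package named `mypkg` already exists. The sketch below builds one in a temporary folder so the whole flow can be run end to end. The body of `say_hello` is an assumption (the original does not show it), and writing the files from Python is just a convenience; normally you would create `mypkg/__init__.py` and `mypkg/module1.py` by hand.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk:
#   <root>/mypkg/__init__.py
#   <root>/mypkg/module1.py
root = Path(tempfile.mkdtemp())
pkg = root / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")       # marks the folder as a package
(pkg / "module1.py").write_text(
    "def say_hello():\n"
    "    return 'Hello from module1'\n"     # assumed body, for illustration
)

sys.path.insert(0, str(root))              # make the parent folder importable
importlib.invalidate_caches()

from mypkg import module1

message = module1.say_hello()
print(message)
```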

2. Perform File manipulations- open, close, read, write, append and copy from one file
to another.

with open("f1.txt", "w") as f:
    f.write("Hi")
with open("f1.txt") as f:
    data = f.read()
with open("f2.txt", "w") as f:
    f.write(data)
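
The snippet above covers write, read and copy. The sketch below adds the remaining operations from the heading: append mode and an explicit open/close pair. It works in a temporary folder (an assumption, so no files are left behind) and uses `shutil.copyfile` as an alternative way to copy.

```python
import shutil
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
src = workdir / "f1.txt"
dst = workdir / "f2.txt"

with open(src, "w") as f:       # "w" creates the file or empties it
    f.write("Hi\n")
with open(src, "a") as f:       # "a" appends to the end
    f.write("Bye\n")
with open(src) as f:            # default mode is "r" (read)
    lines = f.readlines()

shutil.copyfile(src, dst)       # copy the contents of src into dst

# Explicit open/close without a with-block; finally guarantees the close.
f = open(dst)
try:
    copied = f.read()
finally:
    f.close()

print(lines, repr(copied), f.closed)
```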

3. Handle Exceptions using Python Built-in Exceptions

try:
    x = 1/0
except ZeroDivisionError:
    print("Cannot divide by zero")
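
A sketch that extends the example to several built-in exceptions and to the `else` and `finally` clauses. The function names and the `log` list are hypothetical helpers added for illustration.

```python
log = []

def safe_divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        return "Cannot divide by zero"
    except TypeError:
        return "Operands must be numbers"
    else:
        return result             # runs only when no exception was raised
    finally:
        log.append((a, b))        # runs in every case, even after a return

def parse_int(text):
    try:
        return int(text)
    except ValueError:            # raised for text such as "abc"
        return None

print(safe_divide(6, 3), safe_divide(1, 0), safe_divide("a", 2))
print(parse_int("42"), parse_int("abc"))
```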

4. Solve problems using Class declaration and Object creation.

class A:
    def __init__(self, x):
        self.x = x

a = A(5)
print(a.x)

5. Implement OOP concepts like Data hiding and Data Abstraction

class Test:
    def __init__(self):
        self.__val = 10      # double underscore: name-mangled, hidden from outside

    def get(self):
        return self.__val

t = Test()
print(t.get())
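
The example above shows data hiding. The sketch below repeats it on a small `Account` class and adds data abstraction using the standard `abc` module. All class names here are hypothetical. Note that the double underscore only renames the attribute (to `_Account__balance`); it discourages outside access but does not strictly prevent it.

```python
from abc import ABC, abstractmethod

class Account:
    def __init__(self, balance):
        self.__balance = balance          # stored as _Account__balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self.__balance += amount

    def get_balance(self):
        return self.__balance

class Shape(ABC):                         # abstract: cannot be instantiated
    @abstractmethod
    def area(self):
        ...

class Square(Shape):                      # concrete: supplies area()
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

acct = Account(100)
acct.deposit(50)
hidden = not hasattr(acct, "__balance")   # the plain name is not visible

print(acct.get_balance(), hidden, Square(3).area())
```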

6. Solve any real-time problem using inheritance concept.


class Animal:
    def speak(self):
        print("Sound")

class Dog(Animal):
    def speak(self):         # overrides Animal.speak
        print("Bark")

d = Dog()
d.speak()
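
For a more practical flavour of inheritance, here is a minimal payroll sketch in which a child class reuses the parent through `super()`. The classes, names and figures are all made up for the example.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary              # monthly salary

    def annual_pay(self):
        return self.salary * 12

class Manager(Employee):
    def __init__(self, name, salary, bonus):
        super().__init__(name, salary)    # reuse the parent initialiser
        self.bonus = bonus

    def annual_pay(self):                 # extend the parent method
        return super().annual_pay() + self.bonus

staff = [Employee("Asha", 1000), Manager("Ravi", 2000, 5000)]
payroll = {person.name: person.annual_pay() for person in staff}

print(payroll)
```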

UNIT 3: INTRODUCTION TO NUMPY


1. Create NumPy arrays from Python Data Structures, Intrinsic NumPy objects and
Random Functions.

import numpy as np
np.array([1,2,3])
np.arange(5)
np.random.rand(2)
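
A fuller sketch of the three creation routes named in item 1. The shapes and the seed are arbitrary assumptions. It uses `np.random.default_rng`, the generator interface in current NumPy, alongside the older `np.random.rand` shown above.

```python
import numpy as np

# From Python data structures
from_list = np.array([1, 2, 3])
from_nested = np.array([(1, 2), (3, 4)])     # nested tuples give a 2-D array

# Intrinsic NumPy creation functions
zeros = np.zeros((2, 3))
ones = np.ones(4)
steps = np.arange(0, 10, 2)                  # start, stop (excluded), step
grid = np.linspace(0.0, 1.0, 5)              # 5 evenly spaced points, ends included
identity = np.eye(3)

# Random functions
rng = np.random.default_rng(seed=0)
uniform = rng.random(3)                      # floats in [0, 1)
ints = rng.integers(1, 7, size=5)            # integers 1..6 (upper bound excluded)
normal = rng.normal(0.0, 1.0, size=(2, 2))   # mean 0, standard deviation 1

print(from_nested.shape, steps, grid, ints)
```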

2. Manipulation of NumPy arrays: Indexing, Slicing, Reshaping, Joining and Splitting.

a = np.array([[1,2],[3,4]])
print(a[1,1])
print(a.reshape(4,1))
print(np.hstack([a,a]))
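
The example above covers indexing, reshaping and horizontal joining. The sketch below adds slicing and splitting on a 3x4 array; the array contents are an assumption chosen so the results are easy to read.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]

row = a[1]                   # second row
col = a[:, 2]                # third column
block = a[:2, 1:3]           # first two rows, columns 1 and 2
flat = a.reshape(-1)         # -1 lets NumPy work out the length

stacked_v = np.vstack([a, a])               # join along rows
stacked_h = np.concatenate([a, a], axis=1)  # join along columns

left, right = np.hsplit(a, 2)               # two equal column blocks
top, rest = np.split(a, [1], axis=0)        # split before row index 1

print(row, col, block.tolist(), stacked_v.shape, stacked_h.shape)
```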

3. Computation on NumPy arrays using Universal Functions and Mathematical
methods.

a = np.array([1,2,3])
print(np.mean(a), np.sum(a), np.sqrt(a))
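
A short sketch separating the two ideas in item 3: universal functions work element by element, while aggregation methods reduce an array, optionally along one axis. The sample arrays are assumptions.

```python
import numpy as np

a = np.array([1.0, 4.0, 9.0])
b = np.array([1.0, 2.0, 3.0])

# Universal functions: applied to every element
roots = np.sqrt(a)
added = np.add(a, b)             # same as a + b
powered = np.power(b, 2)         # same as b ** 2
biggest = np.maximum(a, b)       # element-wise maximum of two arrays

# Mathematical (aggregation) methods
m = np.array([[1, 2], [3, 4]])
col_sums = m.sum(axis=0)         # one value per column
row_means = m.mean(axis=1)       # one value per row
overall_max = m.max()

print(roots, added, powered, biggest, col_sums, row_means, overall_max)
```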

4. Import a CSV file and perform various Statistical and Comparison operations on
rows/columns.

data = np.genfromtxt("data.csv", delimiter=",", skip_header=1)
print(np.mean(data, axis=0))     # mean of each column
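
The sketch below adds row-wise statistics and comparison operations. So that it runs without a `data.csv` on disk, it reads the CSV from an in-memory string; the column names and marks are invented for the example.

```python
import io
import numpy as np

csv_text = "math,science\n70,80\n90,60\n50,100\n"
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# Statistical operations
col_mean = data.mean(axis=0)       # per column
row_max = data.max(axis=1)         # per row
col_std = data.std(axis=0)

# Comparison operations
above_avg = data > col_mean                    # each value vs its column mean
math_beats_science = data[:, 0] > data[:, 1]   # compare two columns row by row
all_passed = data[(data >= 60).all(axis=1)]    # rows where every mark is >= 60

print(col_mean, row_max, math_beats_science, all_passed.tolist())
```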

5. Load an image file and do crop and flip operation using NumPy Indexing.

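from imageio import imread

img = imread("img.jpg")          # for a colour image: (height, width, channels)
crop = img[100:200,100:200]      # rows 100-199 and columns 100-199
flip = img[::-1]                 # vertical flip (reverses the row order)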
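
Because an image is just an array, the same indexing can be tried without an image file. The sketch below uses a tiny synthetic array as a stand-in for a loaded image (an assumption made so it runs anywhere) and adds the horizontal flip.

```python
import numpy as np

# Stand-in for a loaded image: height 4, width 6, 3 colour channels
img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

crop = img[1:3, 2:5]         # rows 1-2, columns 2-4
flip_v = img[::-1]           # vertical flip: reverse the rows
flip_h = img[:, ::-1]        # horizontal flip: reverse the columns
rot180 = img[::-1, ::-1]     # both flips together: 180 degree rotation

print(img.shape, crop.shape, flip_v.shape, flip_h.shape)
```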

UNIT 4: DATA MANIPULATION WITH PANDAS


1. Create Pandas Series and DataFrame from various inputs.

import pandas as pd
s = pd.Series([1,2,3])
df = pd.DataFrame({'A':[1,2]})
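
A sketch of some other inputs that `pd.Series` and `pd.DataFrame` accept: dictionaries, NumPy arrays and lists of records. The labels and values are assumptions.

```python
import numpy as np
import pandas as pd

# Series from a list, a dict and a NumPy array
s_list = pd.Series([1, 2, 3])
s_dict = pd.Series({"a": 10, "b": 20})                    # keys become the index
s_array = pd.Series(np.arange(3), index=["x", "y", "z"])

# DataFrame from a dict of lists, a list of dicts and a 2-D array
df_dict = pd.DataFrame({"A": [1, 2], "B": ["u", "v"]})
df_records = pd.DataFrame([{"A": 1, "B": "u"}, {"A": 2, "B": "v"}])
df_array = pd.DataFrame(np.zeros((2, 3)), columns=["p", "q", "r"])

print(s_dict, df_records, df_array, sep="\n\n")
```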

2. Import any CSV file to Pandas DataFrame and perform the following:

df = pd.read_csv("file.csv")

(a) Display the first and last 10 records.

print(df.head(10))
print(df.tail(10))

(b) Get the shape, index and column details

print(df.shape, df.index, df.columns)

(c) Select/Delete the records (rows)/columns based on conditions.

print(df[df['Age'] > 20])                     # select rows by condition
df_without_name = df.drop(columns=['Name'])   # drop returns a new DataFrame; df is unchanged
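
A sketch of a few more ways to select and delete by condition, run on a small in-memory DataFrame. The column names follow the examples above; the rows are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena"],
    "Age": [19, 25, 31],
    "Score": [88, 72, 95],
})

adults = df[df["Age"] > 20]                          # one condition
both = df[(df["Age"] > 20) & (df["Score"] > 80)]     # combine with & and parentheses
names = df.loc[df["Score"] >= 80, "Name"]            # rows by condition, one column

no_name = df.drop(columns=["Name"])                  # delete a column
kept = df.drop(df[df["Age"] < 20].index)             # delete rows matching a condition

print(adults, both, names, no_name, kept, sep="\n\n")
```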

(d) Perform ranking and sorting operations.

df['Rank'] = df['Score'].rank()
df_sorted = df.sort_values(by='Age')     # sort_values returns a new, sorted DataFrame
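
By default `rank()` ranks in ascending order and gives tied values their average rank. The sketch below ranks the highest score first, gives ties the same (lowest) rank, and sorts on more than one column. The data is invented for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meena", "Kiran"],
    "Age": [19, 25, 31, 25],
    "Score": [88, 72, 95, 88],
})

# Highest score gets rank 1; tied scores share the lowest rank of the tie
df["Rank"] = df["Score"].rank(ascending=False, method="min")

by_age = df.sort_values(by="Age")
by_age_score = df.sort_values(by=["Age", "Score"], ascending=[True, False])
top = df.sort_values("Score", ascending=False).reset_index(drop=True)

print(df, by_age_score, sep="\n\n")
```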

(e) Do required statistical operations on the given columns.

print(df.describe())
print(df['Score'].mean())

(f) Find the count and uniqueness of the given categorical values.

print(df['Gender'].value_counts())
print(df['Gender'].unique())

(g) Rename single/multiple columns.

df.rename(columns={'Name':'Full Name'}, inplace=True)

UNIT 5: DATA CLEANING, PREPARATION AND VISUALIZATION


1. Import any CSV file to Pandas DataFrame

import pandas as pd
df = pd.read_csv("data.csv")

(a) Handle missing data by detecting and dropping/ filling missing values.

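print(df.isnull().sum())                          # missing values per column
df_dropped = df.dropna()                          # drop rows that have any missing value
df_zero = df.fillna(0)                            # or fill every missing value with 0
df['Age'] = df['Age'].fillna(df['Age'].mean())    # or fill one column with its mean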
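
The same steps on a small in-memory DataFrame, as a sketch. It also shows per-column fill values and forward fill. The columns and values are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age": [20.0, np.nan, 40.0],
    "City": ["Pune", None, "Delhi"],
})

missing = df.isnull().sum()            # count of missing values per column
dropped = df.dropna()                  # keeps only complete rows

# A different fill value for each column
filled = df.fillna({"Age": df["Age"].mean(), "City": "Unknown"})

ffilled = df.ffill()                   # carry the previous row's value forward

print(missing, dropped, filled, ffilled, sep="\n\n")
```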

(b) Transform data using apply() and map() method.

df['col'] = df['col'].apply(lambda x: x*2)
df['gender'] = df['gender'].map({'M':'Male','F':'Female'})
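
One detail worth seeing in action: when `map()` is given a dictionary, any value that is not a key becomes `NaN`. The sketch below shows that, plus a row-wise `apply`. The data and the new column names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"col": [1, 2, 3], "gender": ["M", "F", "X"]})
labels = {"M": "Male", "F": "Female"}

df["doubled"] = df["col"].apply(lambda x: x * 2)            # element-wise function
df["gender_full"] = df["gender"].map(labels)                # "X" has no key: NaN
df["gender_safe"] = df["gender"].map(labels).fillna("Other")

# apply with axis=1 passes each row to the function
df["label"] = df.apply(lambda row: f"{row['gender']}{row['col']}", axis=1)

print(df)
```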

(c) Detect and filter outliers.

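Q1 = df['col'].quantile(0.25)
Q3 = df['col'].quantile(0.75)
IQR = Q3 - Q1
is_outlier = (df['col'] < Q1 - 1.5*IQR) | (df['col'] > Q3 + 1.5*IQR)
print(df[is_outlier])            # detect: rows outside the IQR fences
df_filtered = df[~is_outlier]    # filter: keep only the rows inside the fences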
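
The IQR rule worked through on a small sample, as a sketch. The numbers are invented so that exactly one value falls outside the fences.

```python
import pandas as pd

df = pd.DataFrame({"col": [10, 12, 11, 13, 12, 11, 100]})

q1 = df["col"].quantile(0.25)
q3 = df["col"].quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

is_outlier = ~df["col"].between(lower, upper)   # between() includes both ends
outliers = df[is_outlier]                       # detect
cleaned = df[~is_outlier]                       # filter

print(q1, q3, lower, upper)
print(outliers)
```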

(d) Perform Vectorized String operations on Pandas Series.

print(df['Name'].str.upper())
print(df['Email'].str.contains('@gmail'))    # True/False per row
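
A sketch of a few more `.str` methods, including how missing values behave: most methods pass them through as missing, and `contains` accepts `na=False` to treat them as "no match". The sample names and addresses are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": [" asha rao", "Ravi Kumar ", None],
    "Email": ["asha@gmail.com", "ravi@example.org", None],
})

clean = df["Name"].str.strip().str.title()               # trim spaces, capitalise words
is_gmail = df["Email"].str.contains("@gmail", na=False)  # missing counts as False
domain = df["Email"].str.split("@").str[1]               # part after the @
length = df["Name"].str.len()                            # characters per value

print(clean, is_gmail, domain, length, sep="\n\n")
```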

(e) Visualize data using Line Plots, Bar Plots, Histograms, Density Plots and Scatter
Plots.

import matplotlib.pyplot as plt

df['Marks'].plot(kind='line')
plt.show()
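
The example above draws only the line plot. The sketch below draws all five plot types from the heading on one figure, using the pandas plotting interface. The data, the `Hours` column and the output file name are assumptions. The density plot (`kind="kde"`) needs SciPy to be installed. The `Agg` backend is selected so the script also runs without a display; remove that line and call `plt.show()` to view the figure interactively.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")                      # draw to a file, no window needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Marks": [55, 62, 70, 71, 85, 90],
    "Hours": [2, 3, 4, 4, 6, 7],
})

fig, axes = plt.subplots(2, 3, figsize=(12, 7))

df["Marks"].plot(kind="line", ax=axes[0, 0], title="Line")
df.plot(kind="bar", x="Name", y="Marks", ax=axes[0, 1], title="Bar")
df["Marks"].plot(kind="hist", bins=5, ax=axes[0, 2], title="Histogram")
df["Marks"].plot(kind="kde", ax=axes[1, 0], title="Density")    # requires SciPy
df.plot(kind="scatter", x="Hours", y="Marks", ax=axes[1, 1], title="Scatter")
axes[1, 2].axis("off")                     # sixth panel is unused

fig.tight_layout()
out_path = os.path.join(tempfile.mkdtemp(), "plots.png")
fig.savefig(out_path)
print("Saved to", out_path)
```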
