Introduction To Data Science Syllabus

20CSH402 INTRODUCTION TO DATA SCIENCE (Honors/Minor Courses)

L T P C
3 1 0 4
Course Objectives:

i. The course teaches critical concepts and skills in computer programming and
statistical inference, in conjunction with hands-on analysis of real-world datasets,
including economic data, document collections, geographical data, and social
networks.
ii. It delves into social issues surrounding data analysis, such as privacy and design.

Course Outcomes:

At the end of the course, the students will be able to:

i. Apply dimensionality reduction tools such as principal component analysis.
ii. Evaluate outcomes and make decisions based on data.
iii. Use exploratory tools such as clustering and visualization tools to analyze data.
iv. Perform basic analysis of network data.

UNIT – I
Introduction: Introduction to Data Science – Evolution of Data Science – Data Science
Roles – Stages in a Data Science Project – Applications of Data Science in Various Fields –
Data Security Issues.

UNIT – II
Data Collection and Data Pre-Processing: Data Collection Strategies – Data
Pre-Processing Overview – Data Cleaning – Data Integration and Transformation – Data
Reduction – Data Discretization.
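
As an illustration of two of the pre-processing topics in this unit, the sketch below shows one common form of data transformation (min-max scaling) and one simple form of data discretization (equal-width binning). It is not taken from the syllabus or its textbooks; the function names and the sample data are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals (0-based index)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last interval, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [18, 22, 25, 31, 40, 58]  # made-up sample data
scaled = min_max_scale(ages)
bins = equal_width_bins(ages, 4)
```

Equal-width binning is only one discretization strategy; equal-frequency binning is a common alternative when the data are skewed.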

UNIT – III
Exploratory Data Analytics: Descriptive Statistics – Mean, Standard Deviation,
Skewness and Kurtosis – Box Plots – Pivot Table – Heat Map – Correlation Statistics –
ANOVA.
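
The descriptive statistics named in this unit can be computed with a few lines of standard-library Python. The sketch below is illustrative only: it uses the population (not sample) formulas, the function name `describe` and the data are hypothetical, and it assumes the values are not all identical (otherwise the standard deviation is zero and the ratios are undefined).

```python
import statistics


def describe(values):
    """Return mean, population standard deviation, skewness and excess kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Third and fourth central moments (population versions).
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / std ** 3
    excess_kurtosis = m4 / std ** 4 - 3.0  # 0 for a normal distribution
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
summary = describe(scores)
```

A positive skewness, as this sample gives, indicates a longer tail on the right-hand side of the distribution.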

UNIT – IV
Model Development: Simple and Multiple Regression – Model Evaluation using
Visualization – Residual Plot – Distribution Plot – Polynomial Regression and Pipelines –
Measures for In-Sample Evaluation – Prediction and Decision Making.
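
To make simple regression and in-sample evaluation concrete, the sketch below fits a one-predictor least-squares line and reports R², one widely used in-sample measure. It is a minimal illustration under stated assumptions, not the course's prescribed method; the function names and the data are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """In-sample R^2: share of the variance in ys explained by the fitted line."""
    y_mean = sum(ys) / len(ys)
    # Residuals are the values a residual plot would show against xs.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


hours = [1, 2, 3, 4, 5]        # made-up predictor
marks = [52, 55, 61, 64, 68]   # made-up response
intercept, slope = fit_simple_regression(hours, marks)
r2 = r_squared(hours, marks, intercept, slope)
```

Because R² here is computed on the same data used for fitting, it says nothing about performance on new data.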

UNIT – V
Model Evaluation: Generalization Error – Out-of-Sample Evaluation Metrics – Cross
Validation – Overfitting – Underfitting and Model Selection – Prediction by using Ridge
Regression – Testing Multiple Parameters by using Grid Search.
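
The sketch below ties together three topics from this unit: ridge regression, k-fold cross validation, and grid search over a regularization parameter. It is a simplified, one-feature illustration in plain Python rather than any particular library's API; the function names, the candidate `alphas`, and the data are all hypothetical.

```python
def fit_ridge(xs, ys, alpha):
    """One-feature ridge regression; the intercept is not penalized."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha > 0 shrinks the slope toward zero
    return y_mean - slope * x_mean, slope


def cv_mse(xs, ys, alpha, k):
    """Mean squared error on held-out folds, averaged over k folds."""
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        intercept, slope = fit_ridge(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx], alpha
        )
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


def grid_search(xs, ys, alphas, k=3):
    """Return the alpha with the lowest cross-validated error, plus all scores."""
    scores = {alpha: cv_mse(xs, ys, alpha, k) for alpha in alphas}
    return min(scores, key=scores.get), scores


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]                          # made-up predictor
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.8, 17.2, 18.9]   # made-up response
best_alpha, scores = grid_search(xs, ys, alphas=[0.0, 0.1, 1.0, 10.0])
```

Each score is an out-of-sample error estimate, so the selected `alpha` reflects expected performance on unseen data rather than fit to the training data.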

Text Books:
1. Data Science for Beginners, by Andrew Park.
2. The Art of Data Science: A Guide for Anyone Who Works with Data, by Roger D. Peng
and Elizabeth Matsui.

References:
1. Jojo Moolayil, “Smarter Decisions: The Intersection of IoT and Data Science”,
PACKT, 2016.
2. Cathy O'Neil and Rachel Schutt, “Doing Data Science”, O'Reilly, 2015.
3. David Dietrich, Barry Heller, Beibei Yang, “Data Science and Big Data Analytics”,
EMC, 2013.
4. Raj, Pethuru, “Handbook of Research on Cloud Infrastructures for Big Data
Analytics”, IGI Global.
