Data Analytics Course handout 2024 29.11.24 anjamma
HOD
Review Report
Department: CSE Date:
Subject Code: 22CS513PE
Title of the subject: Data Analytics
1. Subject Expert:
2. IQAC Coordinator 1:
3. IQAC Coordinator 2:
COURSE FILE
COURSE DESCRIPTION / COURSE INFORMATION SHEET
Email nanjamma@tkrec.ac.in
Phone No 8919192173
Vision: Imparting knowledge and instilling skills in aspiring students in the field of Engineering, Technology, Science and Management to face the emerging challenges of society.
2. Course Handout
b) Program Educational Objectives (PEOs)
1. The students of the program will have a strong foundation in the fundamental principles and gain advanced knowledge in the Basic Sciences, Mathematics and other applications of Advanced Computer Engineering.
2. The students of the program will be prepared for successful careers in the software industry or for higher studies, and will continue to develop.
3. The students of the program will be prepared to engage in professional development through self-study, graduate and professional studies in engineering & business.
4. Graduates shall have good communication skills, leadership skills, and professional, ethical and social responsibilities.
2 Manage the data for Analysis 1
3 Understand various sources of Data like Sensors/Signals/GPS etc. 2
4 Data Management 1
5 Data Quality (noise, outliers, missing values, duplicate data) 2
6 Data Processing 1
7 Data Processing 1
8 Assignment 1
9 Slip Test 1
Total 12
UNIT-II- Data Analytics
19 Regression Concepts 1
20 BLUE Property Assumptions 1
21 Least Square Estimation 1
22 Variable Rationalization 1
23 Model Building 1
24 Logistic Regression 1
25 Model Fit Statistics 1
26 Model Construction 1
27 Analytics Applications 1
28 Assignment 1
29 Slip Test 1
Total 11
UNIT-IV- Object Segmentation
30 Regression vs Segmentation 1
31 Supervised and Unsupervised Learning 1
32 Tree Building - Regression 1
33 Classification & Overfitting 1
34 Pruning and Complexity 1
35 Multiple Decision Trees 1
36 Time Series Methods - ARIMA 1
37 Measures of Forecast Accuracy 1
38 STL Approach 1
39 Extract Features from Models 1
40 Assignment 1
41 Slip Test 1
Course Outcomes
CO1. Understand the impact of data analytics for business decisions and strategy.
CO2. Carry out data analysis/statistical analysis.
CO3. To carry out standard data visualization and formal inference procedures.
CO4. Design Data Architecture.
CO5. Understand various Data Sources.
CO/PO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO-1 2 2 1 2 3
CO-2 1 1 1 1 1 2 3
CO-3 1 2 1 2 2 1
CO-4 1 1 2 2 1
CO-5 2 1 1 2 2 2 1
Average 1
Type  Course Code, Title  PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
Theory  22CS513PE, Data Analytics (Professional Elective - I)  2.25 2.5 2 2.66 2 2.66 2.5 2.25 1.8 1.8
Delivery Methodology
Note: Mention other assessment tools (if any)
Teaching diary for the course
At the end of the course, the students are able to achieve the following course learning outcomes
Data Processing  TB  Black board
Data Processing  TB  Black board
Unit II (CO2, CO3)
Data Analytics: Introduction to Data Analytics  TB & RB  Black board
Introduction to Tools and Environment  TB  Black board
Application of Modeling in Business  TB  Black board
Databases & Types of Data and Variables  TB & RB  Black board
Missing Imputations  TB  Black board
Unit IV (CO3, CO4)
Object Segmentation: Regression vs Segmentation
Supervised and Unsupervised Learning  TB & RB  Black board
Tree Building - Regression  TB  Black board
Classification & Overfitting  TB & RB  Black board
Pruning and Complexity  TB & RB  Black board
Multiple Decision Trees  TB & RB  Black board
Q. No Question CO BL
1. What is Data Management? 1 1
2. Define Design Data Architecture. 1 1
6. What is Data Preprocessing? 1 1
7. Distinguish between Data Analytics and Data Analysis. 3 3
SECTION – B
(Essay Questions for 10 Marks)
Q. No Question CO BL
1. Explain Design Data Architecture and how to manage the data for analysis in detail, with a neat sketch. 1 2
2. Explain the sources of primary data. 1 2
3. Demonstrate briefly about data preprocessing. 1 3
4. Explain in detail the generation of primary data. 1 2
2. Explain the tools used for data analytics. 1 2
4. Explain missing imputations. 2 1
5. Define data variables. Interpret the use of variables for business modeling. 1 1
6. What is the need for Business Modelling? 1 1
SECTION – B
(Essay Questions for 10 Marks)
Q. No Question CO BL
1. Discuss the importance of data analytics. 2 2
2. Describe the tools used for data analytics with an example. 2 2
3. Explain how and where missing imputations are involved in real-world scenarios. 2 2
4. Explain databases and the types of data and variables involved in data analytics. 2 6
5. Explain with an example the need for business modeling. 2 2
UNIT - III
SECTION – A
(Short Answer Questions for 1 Mark)
Q. No Question CO BL
1. State the BLUE property assumptions. 3 2
2. What is variable rationalization? 1 1
3. Explain theoretically an analytics application in the business domain. 2 2
4. How to calculate an LSE regression line? 3 3
5. Explain OLS. 2 2
6. What is Linear Regression? 3 1
7. Define Logistic Regression. 1 1
8. Write about Model Fit Statistics. 5 6
SECTION – B
(Essay Questions for 10 Marks)
Q. No Question CO BL
1. Explain regression and discuss it with an example. 2 2
2. Summarize how LSE works. 2 2
3. Describe the working procedures of Logistic Regression in the business world. 3 2
4. Discuss briefly about variable rationalization. 3 3
5. Explain the model fit statistics used for regression with an example, and also discuss model construction. 2 2
UNIT - IV
SECTION – A
(Short Answer Questions for 1 Mark)
Q. No Question CO BL
1. What is Segmentation in Data Analytics? 1 1
2. Describe segmentation with an example. 2 1
3. Give real-time examples of supervised learning. 4 2
4. What are decision trees? 4 1
SECTION – B
(Essay Questions for 10 Marks)
Q. No Question CO BL
1. What is Linear Regression? Explain with an example. 2 1&2
2. Differentiate between supervised and unsupervised learning. 4 6
3. Write briefly about overfitting and pruning. 5 6
4. Explain the time series method with an example. 4 2
UNIT – V
SECTION – A
(Short Answer Questions for 1 Mark)
Q. No Question CO BL
1. Describe the purpose of data visualization in data analytics. 2 1
2. Define Pixel Oriented Visualization Techniques. 1 1
3. Write short notes on Hierarchical visualization techniques. 5 6
4. Identify some important tools to visualize complex data and relationships in business analytics. 2 2
5. Define Geometric Projection Visualization. 1 1
6. Tell about Icon-based Visualization Techniques. 1 1
SECTION – B
(Essay Questions for 10 Marks)
Q. No Question CO BL
1. How can pixel-oriented visualization techniques be applied to large datasets? Discuss the challenges and solutions related to scalability and interpretability. 2 2
1. Explain Design Data Architecture and how to manage the data for analysis in detail, with a neat sketch.
5. Discuss briefly Data Quality (noise, outliers, missing values, duplicate data) and show these issues in data sets.
UNIT II: Data Analytics
data analytics
UNIT-I
S. No Questions Bloom's Taxonomy Level
1 What is data analytics? L1
2 Explain the tools used for data analytics. L1
3 List out some data modeling techniques. L3
4 Explain missing imputations. L2
5 Define data variables. Interpret the use of variables for business modeling. L1
6 What is the need for Business Modelling? L1
7 Explain the sources of primary data. L2
8 Write about data preprocessing needs. L6
16 Explain regression and discuss it with an example. L2
17 Summarize how LSE works. L2
18 Describe the working procedures of Logistic Regression in the business world. L4
19 Discuss variable rationalization. L4
S. No Questions Bloom's Taxonomy Level
21 What is regression? L1
22 Describe segmentation with an example. L2
23 Give real-time examples of supervised learning. L2
24 What are decision trees? L1
25 Briefly describe the ARIMA method. L3
26 What is Linear Regression? Explain with an example. L1
UNIT-V
S. No Questions Bloom's Taxonomy Level
31 Name some frequently used 2-D space-filling curves. L2
UNIVERSITY PAPERS
Code No: 138FU R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year II Semester Examinations, September-2020
DATA ANALYTICS
(Computer Science and Engineering)
Time: 2 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks
---
1. Make a comparison of Randomized block design and Latin square design. Quote appropriate examples. [15]
2.a) Explain data preprocessing in data management.
b) Discuss the process of handling duplicate values in organizational data. [7+8]
3. Explain data imputation and how repeated imputations can enormously improve the quality of estimation. [15]
4.a) Discuss the importance of business modeling.
b) Compare the techniques for dealing with numerical data with those for categorical data. [7+8]
5. Apply linear regression using the method of least squares to the following data and predict the crop yield for a rainfall of 5 cm. [15]
Rainfall (in cm) 10.5 8.8 13.4 12.5 18.8 10.3 7.0 15.6 16
Paddy yield (quintal per acre) 30.3 46.2 58.8 59.0 82.4 49.2 31.9 76.0 78.8
6.a) Explain the advantages of decision trees.
b) Describe the need for tree pruning in decision trees. [7+8]
c) Discuss in detail the steps involved in the ETL process and the tools available for this process. [15]
7. Explain the challenges in visualizing complex data and relations and suggest suitable mechanisms to address them. [15]
---ooOoo---
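Question 5 in the paper above (rainfall vs. paddy yield) can be checked numerically. The following is a minimal sketch of ordinary least squares with a single predictor, using only the Python standard library. The function name `fit_line` and the variable names are illustrative assumptions, not part of the course material, and the sketch is meant as a way to verify a hand calculation rather than as a model answer.

```python
from statistics import mean

# Data copied from Question 5 (rainfall in cm, paddy yield in quintal per acre).
rainfall = [10.5, 8.8, 13.4, 12.5, 18.8, 10.3, 7.0, 15.6, 16.0]
paddy_yield = [30.3, 46.2, 58.8, 59.0, 82.4, 49.2, 31.9, 76.0, 78.8]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x.

    Hypothetical helper written for this illustration.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    # The fitted line always passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(rainfall, paddy_yield)
predicted = intercept + slope * 5.0  # predicted yield for 5 cm of rainfall

print(f"yield = {intercept:.3f} + {slope:.3f} * rainfall")
print(f"predicted yield at 5 cm: {predicted:.2f} quintal per acre")
```

Note that 5 cm lies below the smallest observed rainfall (7.0 cm), so the prediction is an extrapolation and should be read with caution.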
Code No: 138FU R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year II Semester Examinations, September-2020
DATA ANALYTICS
(Electronics and Communication Engineering)
Time: 2 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks
---
1.a) Data set D = {10K, 15K, 22K, 25K, 36K, 40K, 13K, 19K, 88K, 94K} represents the packages of the students placed in an interview, where "K" represents thousand. Identify the outliers in the dataset and analyze their impact on studying the spread of the data.
b) Illustrate techniques of missing values treatment with an example. [8+7]
2.a) Demonstrate Missing Imputation methods in detail with examples.
b) Illustrate Data modeling techniques. [8+7]
3.a) Explain different types of variables used in Regression modeling.
b) Demonstrate linear regression with a suitable example. [8+7]
6.a) Demonstrate data preprocessing techniques in detail.
b) What is data deduplication? Explain deduplication methods. [9+6]
8.a) Apply logistic regression to demonstrate binary classification.
b) What is the least square estimate? Illustrate its importance in regression modeling. [7+8]
---ooOoo---
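For Question 1.a) in the paper above, one common way to flag outliers is Tukey's 1.5 x IQR rule. The sketch below applies it to data set D (values in thousands) using Python's standard `statistics` module. The choice of rule and the quartile convention (`method="inclusive"`) are assumptions made for illustration; other quartile conventions give different quartiles and can change which points are flagged, so the result should be checked against the method taught in class.

```python
from statistics import quantiles, stdev

# Data set D from Question 1.a), in thousands.
packages = [10, 15, 22, 25, 36, 40, 13, 19, 88, 94]

# Quartiles with the "inclusive" convention (an assumption, see the lead-in).
q1, _median, q3 = quantiles(packages, n=4, method="inclusive")
iqr = q3 - q1

# Tukey's fences: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in packages if v < lower_fence or v > upper_fence]
inliers = [v for v in packages if lower_fence <= v <= upper_fence]

# Impact on spread: compare the sample standard deviation
# with and without the flagged points.
spread_all = stdev(packages)
spread_inliers = stdev(inliers)

print("Q1, Q3, IQR:", q1, q3, iqr)
print("outliers:", outliers)
print(f"stdev with outliers: {spread_all:.2f}, without: {spread_inliers:.2f}")
```

Under this convention the two largest packages (88K and 94K) fall above the upper fence, and removing them noticeably reduces the standard deviation, which is the kind of impact on spread the question asks about.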
LIST OF TOPICS FOR STUDENT SEMINARS (Optional):
1. Data Management
2. Data Analytics
3. Regression
4. Logistic Regression
5. Object Segmentation
6. Time Series Methods
7. Data Visualization
Questions
Q. No  Questions  From Unit  Marks  Blooms Level  CO
Q.4  Illustrate Data modeling techniques with examples.  II  6  L4  III
CSE-A
S. No  H.T. No  Name of the Student  MID  Ext  Grade
1 22R91A0566 ENDRAKANTI ABHISHEK
2 22R91A0567 G SHIRISHA
3 22R91A0568 GANGADHARI THARUN
4 22R91A0569 GANTA NAGARJUNA
5 22R91A0570 GARDAS GANESH
6 22R91A0571 GHANTA EESHA
7 22R91A0572 GILLA SHIVA
8 22R91A0573 GOLKONDA KEERTHANA
9 22R91A0574 GOMASA SHARANYA
10 22R91A0575 GUGGILLA BHARGAVI
11 22R91A0576 GUMMULA SINDHUJA
12 22R91A0577 GUNDAPUNENI AKHIL
13 22R91A0578 GUNDU ADARSH SAI
14 22R91A0579 INDLA RAMYA SRI
15 22R91A0580 ISLAVATH ASHOK KUMAR
16 22R91A0581 ITIKELA ABHINAYA
17 22R91A0582 J BHAVANA
18 22R91A0583 J KARTHIK
19 22R91A0584 JABU SRIMAN NARAYANA
20 22R91A0585 JADHAV SHIV RAJ
21 22R91A0586 JAGANNATH SINGH
22 22R91A0587 JAKKALA SAGAR
23 22R91A0588 JAKKENA VARSHA
24 22R91A0589 JAPA SAMPATH
25 22R91A0590 JAYAVARAPU YUGESH
26 22R91A0591 JIDUGU ANUSHA LAVANYA
27 22R91A0592 JINNA SUDHEER
28 22R91A0593 JINUKALA SHIVA KUMAR
29 22R91A0594 K NIKHIL
30 22R91A0595 KADABOINA SANJANA
31 22R91A0596 KAKINADA AJITH KUMAR
32 22R91A0597 KALERI MANIKANTA
33 22R91A0598 KALLU VARSHITHA
34 22R91A0599 KALUVALA BABY LAHARI
35 22R91A05A0 KAMATHAM VIGNESH
36 22R91A05A1 KAMBALA BHAVANA
37 22R91A05A2 KAMBHAMPATI SUJANA HARITHA
38 22R91A05A3 KANCHARLA JESHWANTH REDDY
39 22R91A05A4 KANDENI TILAK
40 22R91A05A5 KANDUKURI SAI SRUTHI
41 22R91A05A6 KANMARALAPUDI SAI KIRITI
42 22R91A05A7 KANNEVENI ASHWITHA
43 22R91A05A8 KANUGULA ANUSHA
44 22R91A05A9 KANUGULA JEEVAN AVINASH
45 22R91A05B0 KAPILAVAI BHANU PRAKASH
46 22R91A05B1 KARNE VIJAY
CSE-C
S. No  H.T. No  Name of the Student  MID  Ext  Grade
1 22R91A05K6 POCHAMPALLY BABU
2 22R91A05K7 POKURI PREMALATHA
3 22R91A05K8 PULI VAMSHI GOUD
4 22R91A05K9 PUTTA MADHAVI
5 22R91A05L0 PUTTA SREEJIT KUMAR
6 22R91A05L1 RAJKUNDAL BALAJI
7 22R91A05L2 RAJULA NALINKUMAR
8 22R91A05L3 RAMAVATH RAMESH
9 22R91A05L4 RAPARTHI RAHUL
10 22R91A05L5 RAPOLU HARIKA
11 22R91A05L6 SABHAVAT SRINU
12 22R91A05L7 SAMUDRALA ABHINESH
13 22R91A05L8 SANGOJI PRANAV
14 22R91A05L9 SARTHAK KUMAR
15 22R91A05M0 SARVIGARI BHARATH REDDY
16 22R91A05M1 SHAIK AFREEN
17 22R91A05M2 SHAIK AFZAL
18 22R91A05M3 SHAIK ASIF
19 22R91A05M4 SHAIK MALIK
20 22R91A05M5 SHAIK NASEEMA
21 22R91A05M6 SHARTA ABHINAY
22 22R91A05M7 SINGIREDDY VARSHA
23 22R91A05M8 SINGIREDDY YUGANDHAR REDDY
24 22R91A05M9 SIRIGIRI SRUTHI
25 22R91A05N0 SIRIKONDA NITHIN
26 22R91A05N1 SIRVATI MANASA
27 22R91A05N2 SOMISHETTY AKHILA
28 22R91A05N3 SUDIREDDY PALLAVI
29 22R91A05N4 SUNKARABOINA RAKESH
30 22R91A05N5 SYED NABRAAS
31 22R91A05N6 TALAGADADEEVI KHYATHI GAYATHRI
32 22R91A05N7 TALLA ANILKUMAR REDDY
33 22R91A05N8 TALLAPELLI SRINIVAS
34 22R91A05N9 TEKULA SURYAVARDHAN REDDY
35 22R91A05P0 TEKULAPALLY SRILATHA
36 22R91A05P1 THIPPARAPU SAHITH
37 22R91A05P2 THIRUMANI AKHILA
38 22R91A05P3 THODIMA SHASHIDHAR REDDY
39 22R91A05P4 THOGITI ADITHYA
40 22R91A05P5 THOKALA VIVEK REDDY
41 22R91A05P6 THUMBURI ABHINAV
Summary statistics
NO. STUDENTS APPEARED: 70
NO. STUDENTS PASSED:
NO. STUDENTS FAILED:
Graphical statistics (pie chart for CAY and bar chart for the past three years)
REMEDIAL CLASSES DETAILS
Attainment of COs (using course end survey) (bar chart of CO scores vs each CO)
All five units were completed, and 80% of students understood the subject with the help of lab practicals and the various examples given in the classroom.
CERTIFICATE
Certificate by HOD