SEMESTER-IV
Course Code   Course Name   Theory   Virtual Lab   Purely Internal   Credit
V20PBBA06 DATA SCIENCE USING R 4
Course Objectives
• Become acquainted with the use of R tool for Data Science applications.
• Acquire experience in analyzing data using R.
• Develop the skills to use the software for pre-analytic phase data handling operations.
• Learn various methods of using Hadoop and R together.
• Understand how to write interpretations and make decisions.
UNIT I Introduction to Data Science
Introduction to Data Science – Basic concepts – Data – Nature – Process for Data Science – Handling
Data
UNIT II R and its applications
R software – core and optional packages – Data science packages – Exploratory Analytics using R –
Visualizing Data – Applications
UNIT III Pre-Processing
Pre-processing Data with R – Scraping – sampling – munging – cleaning – data from multiple
sources – extraction from databases
UNIT IV Big Data in R
Handling Big Data in R – Hadoop and R – New frameworks – MapReduce with R – Organizing Data
Sources
UNIT V Automation
Automation of Data Analytics – considerations – organizing for Data Science – Interpreting and
Decision making
Learning Resources
1. James (JD) Long (2019), “R Cookbook”, 2nd Edition, O’Reilly Media Inc.
2. Wiktorski, Tomasz (2019), “Data-intensive Systems: Principles and Fundamentals using
Hadoop and Spark”, Springer.
3. Andrew Oleksy (2018), “Data Science with R: A Step By Step Guide with Visual Illustrations
& Examples”, Kindle Edition.
4. Hadley Wickham, Garrett Grolemund (2017), “R for Data Science: Import, Tidy, Transform,
Visualize, and Model Data”, O’Reilly.
5. Thomas Mailund (2017), “Beginning Data Science in R: Data Analysis, Visualization, and
Modeling for the Data