Tools For Data Science-Data Science Methodology
Link: https://www.coursera.org/learn/open-source-tools-for-data-science?action=enroll#syllabus
Anaconda Jupyter environment. Finally, the module gives an overview of cloud based Jupyter
environments and their data science features.
1. Introduction to Jupyter Notebooks
2. Getting Started with Jupyter
3. Jupyter Kernels
4. Jupyter Architecture
5. Additional Anaconda Jupyter Environments
6. Additional Cloud Based Jupyter Environments
Week 5. RStudio & GitHub
R is a statistical programming language and is a powerful tool for data processing and
manipulation. This module will start with an introduction to R and RStudio. You will learn
about the different R visualization packages and how to create visual charts using the plot
function.
1. Introduction to R and RStudio
2. Plotting in RStudio
3. Overview of Git/GitHub
4. Introduction to GitHub
5. GitHub Repositories
6. GitHub - Getting Started
7. GitHub - Working with Branches
8. GitHub Branches
Week 6. Create and Share your Jupyter Notebook
In this module, you will work on a final project to demonstrate some of the skills learned in
the course. You will also be tested on your knowledge of various components and tools in a
Data Scientist's toolkit learned in the previous modules.
Week 7. [Optional] IBM Watson Studio
Watson Studio is a collaborative platform for the data science community and is used
by Data Analysts, Data Scientists, Data Engineers, Developers, and Data Stewards to analyze
data and construct models. In this module, you will learn about Watson Studio and IBM Cloud
Pak for Data as a Service. Then you will create an IBM Watson Studio service and a project
in Watson Studio. After creating the project, you will create a Jupyter notebook and load a
data file. You will also explore the different templates and kernels in a Jupyter notebook.
Finally, you will connect your Watson Studio account to GitHub and publish the notebook in
GitHub.
1. Introduction to Watson Studio
2. Optional: Creating an account on IBM Watson Studio
3. Jupyter Notebooks in Watson Studio
4. Jupyter Notebooks in Watson Studio
5. Linking GitHub to Watson Studio
DATA SCIENCE METHODOLOGY
Link: https://www.coursera.org/learn/data-science-methodology#syllabus
Week 1. From Problem to Approach and From Requirements to Collection
In this module, you will learn why we are interested in data science, what a
methodology is, and why data scientists need a methodology. You will also learn about the
data science methodology and its flowchart. You will learn about the first two stages of the
data science methodology, namely Business Understanding and Analytic Approach. Finally,
through a lab session, you will learn how to complete the Business Understanding and
the Analytic Approach stages, as well as the Data Requirements and Data Collection stages,
pertaining to any data science problem.
1. Business Understanding
2. Analytic Approach
3. Data Requirements
4. Data Collection
Week 2. From Understanding to Preparation and From Modeling to Evaluation
In this module, you will learn what it means to understand data and to prepare or clean
it. You will also learn about the purpose of data modeling and some characteristics of the
modeling process. Finally, through a lab session, you will learn how to complete the Data
Understanding and the Data Preparation stages, as well as the Modeling and the Model
Evaluation stages pertaining to any data science problem.
1. Data Understanding
2. Data Preparation - Concepts
3. Data Preparation - Case Study
4. Modeling - Concepts
5. Modeling - Case Study
6. Evaluation
Week 3. From Deployment to Feedback
In this module, you will learn what happens when a model is deployed and why
model feedback is important. Also, by completing a peer-reviewed assignment, you will
demonstrate your understanding of the data science methodology by applying it to a problem
that you define.
1. Deployment
2. Feedback
3. Course Summary