Tools For Data Science-Data Science Methodology


TOOLS FOR DATA SCIENCE

Link: https://www.coursera.org/learn/open-source-tools-for-data-science?action=enroll#syllabus

Week 1. Overview of Data Science Tools


In this module, you will learn about the different types and categories of tools that data
scientists use and popular examples of each. You will also become familiar with Open Source,
Cloud-based, and Commercial options for data science tools.
1. Course Introduction
2. Categories of Data Science Tools
3. Open Source Tools for Data Science
4. Open Source Tools for Data Science
5. Commercial Tools for Data Science
6. Cloud Based Tools for Data Science
Week 2. Languages of Data Science
For users who are just starting on their data science journey, the range of programming
languages can be overwhelming. So, which language should you learn first? This module
introduces the criteria that determine which language you should learn.
You will learn the benefits of Python, R, SQL, and other common languages such as Java,
Scala, C++, JavaScript, and Julia. You will explore how you can use these languages in Data
Science. You will also look at some sites to locate more information about the languages.
1. Languages of Data Science
2. Introduction to Python
3. Introduction to R Language
4. Introduction to SQL
5. Other Languages for Data Science
Week 3. Packages, APIs, Datasets and Models
In this module, you will learn about the various libraries in data science. In addition, you will
learn what an API is in terms of REST requests and responses. You will also
explore open data sets on the Data Asset eXchange. Finally, you will learn how to use a
machine learning model to solve a problem and navigate the Model Asset eXchange.
1. Libraries for Data Science
2. Application Programming Interfaces (APIs)
3. Data Sets - Powering Data Science
4. Sharing Enterprise Data - Data Asset eXchange
5. Machine Learning Models – Learning from Models to Make Predictions
6. The Model Asset eXchange
Week 4. Jupyter Notebooks and JupyterLab
As digital data grows, Jupyter Notebook lets data scientists record their data experiments
and results so that others can reuse them. This module introduces the Jupyter
Notebook and JupyterLab. You will learn how to work with different kernels in a Notebook
session and about the basic Jupyter architecture. In addition, you will identify the tools in an
Anaconda Jupyter environment. Finally, the module gives an overview of cloud-based Jupyter
environments and their data science features.
1. Introduction to Jupyter Notebooks
2. Getting Started with Jupyter
3. Jupyter Kernels
4. Jupyter Architecture
5. Additional Anaconda Jupyter Environments
6. Additional Cloud Based Jupyter Environments
Week 5. RStudio & GitHub
R is a statistical programming language and is a powerful tool for data processing and
manipulation. This module will start with an introduction to R and RStudio. You will learn
about the different R visualization packages and how to create visual charts using the plot
function.
1. Introduction to R and RStudio
2. Plotting in RStudio
3. Overview of Git/GitHub
4. Introduction to GitHub
5. GitHub Repositories
6. GitHub - Getting Started
7. GitHub - Working with Branches
8. GitHub Branches
Week 6. Create and Share your Jupyter Notebook
In this module, you will work on a final project to demonstrate some of the skills learned in
the course. You will also be tested on your knowledge of various components and tools in a
Data Scientist's toolkit learned in the previous modules.
Week 7. [Optional] IBM Watson Studio
Watson Studio is a collaborative platform for the data science community and is used
by Data Analysts, Data Scientists, Data Engineers, Developers, and Data Stewards to analyze
data and construct models. In this module, you will learn about Watson Studio and IBM Cloud
Pak for Data as a Service. Then you will create an IBM Watson Studio service and a project
in Watson Studio. After creating the project, you will create a Jupyter notebook and load a
data file. You will also explore the different templates and kernels in a Jupyter notebook.
Finally, you will connect your Watson Studio account to GitHub and publish the notebook in
GitHub.
1. Introduction to Watson Studio
2. Optional: Creating an account on IBM Watson Studio
3. Jupyter Notebooks in Watson Studio
4. Jupyter Notebooks in Watson Studio
5. Linking GitHub to Watson Studio

DATA SCIENCE METHODOLOGY
Link: https://www.coursera.org/learn/data-science-methodology#syllabus
Week 1. From Problem to Approach and From Requirements to Collection
In this module, you will learn about why we are interested in data science, what a
methodology is, and why data scientists need a methodology. You will also learn about the
data science methodology and its flowchart. You will learn about the first two stages of the
data science methodology, namely Business Understanding and Analytic Approach. Finally,
through a lab session, you will also learn how to complete the Business Understanding and
Analytic Approach stages, as well as the Data Requirements and Data Collection stages,
for any data science problem.
1. Business Understanding
2. Analytic Approach
3. Data Requirements
4. Data Collection
Week 2. From Understanding to Preparation and From Modeling to Evaluation
In this module, you will learn what it means to understand data, and prepare or clean
data. You will also learn about the purpose of data modeling and some characteristics of the
modeling process. Finally, through a lab session, you will learn how to complete the Data
Understanding and the Data Preparation stages, as well as the Modeling and the Model
Evaluation stages pertaining to any data science problem.
1. Data Understanding
2. Data Preparation - Concepts
3. Data Preparation - Case Study
4. Modeling - Concepts
5. Modeling - Case Study
6. Evaluation
Week 3. From Deployment to Feedback
In this module, you will learn about what happens when a model is deployed and why
model feedback is important. Also, by completing a peer-reviewed assignment, you will
demonstrate your understanding of the data science methodology by applying it to a problem
that you define.
1. Deployment
2. Feedback
3. Course Summary
