Data Science with Python Training
Duration: 60 hours Course Code: SSDN-C250
Overview:
Data science work draws on a cohesive software application that offers a mixture of the basic building blocks essential
for creating many kinds of data science solutions and for incorporating those solutions into business processes,
surrounding infrastructure, and products. These building blocks cover data access and ingestion, data preparation,
interactive exploration and visualization, feature engineering, advanced modelling, testing, training, deployment, and
performance engineering. The course is taught in Python, which is used for production development and is also well
suited to data analysis. Data science is a broad field, and Python is one of the programming languages most widely used in it.
What you’ll learn:
Completing this Data Science with Python course will enable you to master the concepts of data analysis.
Introduction to data science with statistical analysis and business applications.
What Python is, and mathematical computation with Python.
Scientific computation with Python.
Data manipulation and machine learning with Python.
Data visualization in Python with Matplotlib, and data science with Python web scraping.
Target Audience:
Developers aspiring to be a data scientist or machine learning engineer.
Analytics managers who are leading a team of analysts.
Business analysts who want to understand data science techniques.
Information architects who want to gain expertise.
Prerequisite Knowledge:
Familiarity with the fundamentals of Python programming.
Understanding of the basics of statistics and data science.
You can reach us:
SSDN Technologies, M-50, OLD DLF, Sec-14, Gurgaon, Haryana, 122001. www.ssdntech.com
Contact Us: +91-9999.111.686, +91-9999.50.9970, +91-9999.10.9937. bdt@ssdntech.com
SSDN Technologies +91-9999-111-686
Course Content
Module 1: Data Science Overview
Data Science
Data Scientists
Examples of Data Science
Python for Data Science
Module 2: Data Analytics Overview
Introduction to Data Visualization
Processes in Data Science
Data Wrangling, Data Exploration, and Model Selection
Exploratory Data Analysis or EDA
Data Visualization
Plotting
Hypothesis Building and Testing
Module 3: Statistical Analysis and Business Applications
Introduction to Statistics
Statistical and Non-Statistical Analysis
Some Common Terms Used in Statistics
Data Distribution: Central Tendency, Percentiles, Dispersion
Histogram
Bell Curve
Hypothesis Testing
Chi-Square Test
Correlation Matrix
Inferential Statistics
Module 4: Python: Environment Setup and Essentials
Introduction to Anaconda
Installation of Anaconda Python Distribution - For Windows, Mac OS, and Linux
Jupyter Notebook Installation
Jupyter Notebook Introduction
Control Flow
Module 5: Mathematical Computing with Python (NumPy)
NumPy Overview
Properties, Purpose, and Types of ndarray
Class and Attributes of ndarray Object
Basic Operations: Concept and Examples
Accessing Array Elements: Indexing, Slicing, Iteration, Indexing with Boolean Arrays
Copy and Views
Universal Functions (ufunc)
Shape Manipulation
Broadcasting
Linear Algebra
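As a rough illustration of the Module 5 topics (indexing, Boolean indexing, copies and views, ufuncs, broadcasting, and linear algebra), here is a minimal sketch. The array contents and variable names are made up for the example and are not taken from the course material.

```python
import numpy as np

# Hypothetical sample data: 3 rows (observations) by 4 columns (features).
data = np.arange(12, dtype=float).reshape(3, 4)

# Indexing and slicing: the first two columns of every row.
first_two_cols = data[:, :2]

# Indexing with a Boolean array: keep only the values greater than 6.
large_values = data[data > 6]

# Copy and views: a basic slice is a view onto the same memory,
# while copy() produces an independent array.
view = data[0]
independent = data[0].copy()
view[0] = 100.0        # also changes data[0, 0]
independent[1] = -1.0  # leaves data untouched

# Broadcasting: the (4,) vector of column means is applied to all 3 rows.
col_means = data.mean(axis=0)
centered = data - col_means

# Universal function (ufunc): element-wise square root.
roots = np.sqrt(np.abs(centered))

# Linear algebra: solve the small system A x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)
```

The view assignment is the detail most likely to surprise newcomers: modifying `view` changes `data`, whereas modifying `independent` does not.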
Module 6: Scientific Computing with Python (SciPy)
SciPy and its Characteristics
SciPy sub-packages
SciPy sub-packages – Integration
SciPy sub-packages – Optimize
Linear Algebra
SciPy sub-packages – Statistics
SciPy sub-packages – Weave
Module 7: Data Manipulation with Python (Pandas)
Introduction to Pandas
Data Structures
Series
DataFrame
Missing Values
Data Operations
Data Standardization
Pandas File Read and Write Support
SQL Operation
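The Module 7 topics of missing values, data standardization, and data operations can be sketched as follows. The DataFrame contents and column names are hypothetical, and mean imputation with z-score standardization is only one of several possible approaches.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in each numeric column.
df = pd.DataFrame({
    "city": ["Delhi", "Mumbai", "Pune", "Delhi"],
    "sales": [100.0, np.nan, 80.0, 120.0],
    "visits": [10.0, 20.0, np.nan, 30.0],
})

# Missing values: count them per column.
missing_per_column = df.isna().sum()

# Fill each numeric gap with that column's mean (an assumed strategy).
numeric_cols = ["sales", "visits"]
filled = df.copy()
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].mean())

# Data standardization: z-score each numeric column (mean 0, unit std).
standardized = (
    filled[numeric_cols] - filled[numeric_cols].mean()
) / filled[numeric_cols].std()

# Data operations: total sales per city.
totals = filled.groupby("city")["sales"].sum()
```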
Module 8: Machine Learning with Python (Scikit-Learn)
Introduction to Machine Learning
Machine Learning Approach
How Supervised and Unsupervised Learning Models Work
Scikit-Learn
Supervised Learning Models - Linea
Unsupervised Learning Models: Dimensionality Reduction
Pipeline
Model Persistence
Model Evaluation - Metric Functions
Module 9: Natural Language Processing with Scikit-Learn
NLP Overview
NLP Approach for Text Data
NLP Environment Setup
NLP Sentence analysis
NLP Applications
Major NLP Libraries
Scikit-Learn Approach
Scikit-Learn Approach: Built-in Modules
Scikit-Learn Approach: Feature Extraction
Bag of Words
Extraction Considerations
Scikit-Learn Approach: Model Training
Scikit-Learn Grid Search and Multiple Parameters
Pipeline
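To make the bag-of-words, model training, and pipeline topics of Module 9 concrete, here is a minimal sketch. The four toy sentences and their labels are invented for illustration, and the naive Bayes classifier is an assumed choice, not necessarily the one used in the course.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus: 1 = positive review, 0 = negative review.
texts = [
    "good great film",
    "great acting good plot",
    "bad boring film",
    "boring plot bad acting",
]
labels = [1, 1, 0, 0]

# Pipeline: bag-of-words feature extraction followed by a classifier.
model = Pipeline([
    ("bow", CountVectorizer()),  # turns each text into word counts
    ("clf", MultinomialNB()),    # assumed classifier for this sketch
])
model.fit(texts, labels)

# The learned vocabulary maps each distinct word to a column index.
vocabulary = model.named_steps["bow"].vocabulary_

# Classify two unseen texts.
predictions = model.predict(["great good", "bad boring"])
```

Wrapping both steps in one `Pipeline` means the same vocabulary learned during training is reused when new text is classified.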
Module 10: Data Visualization in Python using Matplotlib
Introduction to Data Visualization
Python Libraries
Plots
Matplotlib Features:
o Line Properties Plot with (x, y)
o Controlling Line Patterns and Colors
o Set Axis, Labels, and Legend Properties
o Alpha and Annotation
o Multiple Plots
o Subplots
Types of Plots and Seaborn
Module 11: Data Science with Python Web Scraping
Web Scraping
Common Data/Page Formats on the Web
The Parser
Importance of Objects
Understanding the Tree
Searching the Tree
Navigating Options
Modifying the Tree
Parsing Only Part of the Document
Printing and Formatting
Encoding
Module 12: Python Integration with Hadoop, MapReduce and Spark
Need for Integrating Python with Hadoop
Big Data Hadoop Architecture
MapReduce
Cloudera QuickStart VM Set Up
Apache Spark
Resilient Distributed Datasets (RDD)
PySpark
Spark Tools
PySpark Integration with Jupyter Notebook