10000coders Data Science Curriculum


ADDRESS

RD NO 1, KPHB, Kukatpally,
Hyderabad

CONTACT
8790732332, 9700610241
TABLE OF CONTENTS
01 Foundation & Data Literacy

02 Python Programming & Data Tools

03 Data Analysis & Visualization

04 Database & SQL Skills

05 Machine Learning Fundamentals

06 Advanced Topics & Specializations

07 Cloud & Deployment

08 Business Intelligence & Advanced Analytics

09 Portfolio & Career Preparation


01. FOUNDATION & DATA LITERACY

Module 1: Data Science Landscape & Career Paths


What is Data Science? Real-world impact and applications
Career Paths: Data Scientist vs Data Analyst vs ML Engineer vs Data
Engineer
Industry Applications: Healthcare, Finance, E-commerce, Sports, Social
Media
Success Stories: Case studies of data-driven decisions
Setting Expectations: Salary ranges, skill requirements, job market

Module 2: Data Fundamentals & Ethics


Types of Data: Structured vs Unstructured, Quantitative vs Qualitative
Data Collection Methods: Surveys, APIs, Web scraping, Sensors
Data Quality: Accuracy, Completeness, Consistency, Timeliness
Data Ethics & Privacy: GDPR basics, Algorithmic bias, Responsible AI
Data Lifecycle: Collection → Storage → Processing → Analysis → Insights

Module 3: Excel/Google Sheets Mastery


Core Functions: VLOOKUP, INDEX-MATCH, Conditional formatting
Pivot Tables & Charts: Dynamic reporting and visualization
Data Cleaning: Remove duplicates, handle errors, text-to-columns
Statistical Functions: AVERAGE, MEDIAN, STDEV, CORREL
Project: Create a sales performance dashboard
02. PYTHON PROGRAMMING & DATA TOOLS

Module 4: Python Programming Fundamentals


Environment Setup: Anaconda, Jupyter Notebooks, VS Code
Python Basics: Variables, data types, operators, input/output
Control Structures: Conditions, loops, nested logic
Functions: Definition, parameters, return values, scope
Data Structures: Lists, tuples, dictionaries, sets
File Handling: Reading/writing CSV, TXT, JSON files
Error Handling: Try-except blocks, debugging techniques
Project: Build a student grade management system

Module 5: Data Manipulation with Pandas


Series & DataFrames: Creation, indexing, slicing
Data Loading: CSV, Excel, JSON, SQL databases
Data Exploration: head(), info(), describe(), shape
Data Cleaning: Handle missing values, duplicates, data types
Data Transformation: Filtering, sorting, groupby, pivot tables
String Operations: Text cleaning, regex basics
Date/Time Handling: Parsing dates, time series basics
Project: Clean and analyze a messy real-world dataset

Module 6: Numerical Computing with NumPy


Array Operations: Creation, indexing, slicing, reshaping
Mathematical Functions: Statistics, linear algebra basics
Broadcasting: Efficient array operations
Integration with Pandas: When to use NumPy vs Pandas
Project: Financial portfolio analysis with stock data
03. DATA ANALYSIS & VISUALIZATION

Module 7: Exploratory Data Analysis (EDA)
Statistical Measures: Mean, median, mode, variance, standard deviation
Distribution Analysis: Histograms, box plots, skewness, kurtosis
Correlation Analysis: Pearson, Spearman correlation
Outlier Detection: IQR method, Z-score, visualization techniques
Data Profiling: Automated EDA tools (pandas-profiling, now published as ydata-profiling)
Project: Complete EDA on Titanic/House prices dataset
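
As a rough illustration of the IQR method listed under Outlier Detection, the sketch below flags values that fall outside the usual 1.5 × IQR fences. It uses only the Python standard library; the function name `iqr_outliers`, the multiplier `k`, and the sample fares are assumptions made for this example and are not part of the course material.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up ticket fares, used only to exercise the function.
fares = [7.25, 8.05, 7.9, 13.0, 26.0, 8.5, 10.5, 512.33]
print(iqr_outliers(fares))
```

Quartile conventions differ between tools, so a pandas or NumPy version of the same check may place the fences slightly differently on small samples.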

Module 8: Data Visualization Mastery


Matplotlib Fundamentals: Plots, subplots, customization
Seaborn for Statistical Plots: Heatmaps, pair plots, regression plots
Plotly for Interactive Visualizations: Dynamic charts, dashboards
Chart Selection: When to use bar, line, scatter, histogram, etc.
Storytelling with Data: Design principles, color theory, annotations
Advanced Visualizations: Geographic plots, network graphs
Project: Create an interactive dashboard for COVID-19 data

Module 9: Statistics for Data Science


Descriptive Statistics: Measures of central tendency and spread
Probability Fundamentals: Basic probability, conditional probability
Probability Distributions: Normal, binomial, Poisson
Sampling & Sampling Distributions: Central limit theorem
Confidence Intervals: Interpretation and calculation
Hypothesis Testing: t-tests, chi-square tests, ANOVA
A/B Testing: Design, execution, interpretation
Project: Design and analyze an A/B test for an e-commerce website
04. DATABASE & SQL SKILLS
Managing and querying structured data

Module 10: SQL for Data Analysis


Database Fundamentals: Tables, relationships, normalization
Basic Queries: SELECT, WHERE, ORDER BY, LIMIT
Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG
Joins: INNER, LEFT, RIGHT, FULL OUTER joins
Subqueries: Nested queries, correlated subqueries
Window Functions: ROW_NUMBER, RANK, LAG, LEAD
Date Functions: Date arithmetic, formatting, extraction
Python Integration: SQLite, pandas.read_sql()
Project: Analyze sales data from multiple tables with complex queries
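
To give a feel for how the join, aggregation, and Python integration topics fit together, here is a minimal sketch using the standard-library `sqlite3` module with an in-memory database. The `customers` and `orders` tables, their columns, and the helper `revenue_by_region` are hypothetical and exist only for this illustration.

```python
import sqlite3


def revenue_by_region(conn, min_total=0):
    """Total order amount per region, keeping regions at or above min_total."""
    query = """
        SELECT c.region, COUNT(*) AS n_orders, SUM(o.amount) AS total
        FROM orders AS o
        INNER JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.region
        HAVING SUM(o.amount) >= ?
        ORDER BY total DESC
    """
    return conn.execute(query, (min_total,)).fetchall()


# Hypothetical schema and rows, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 2, 40.0), (3, 3, 60.0), (4, 2, 10.0);
""")
print(revenue_by_region(conn, min_total=100))
```

The same query string could be passed to `pandas.read_sql()` together with the connection to get the result as a DataFrame instead of a list of tuples.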

Module 11: Database Design & Advanced SQL


Data Modeling: ER diagrams, primary/foreign keys
Performance Optimization: Indexes, query optimization
Data Warehousing Concepts: OLTP vs OLAP, ETL basics
NoSQL Basics: When to use MongoDB, document databases
Project: Design and implement a database for a library management system

🤖 05. MACHINE LEARNING FUNDAMENTALS
Building predictive models

Module 12: ML Foundations & Problem Framing


What is Machine Learning? Types of learning, real-world applications
Problem Types: Regression, classification, clustering, recommendation
ML Workflow: Problem definition → Data → Model → Evaluation → Deployment
Data Preparation: Feature engineering, scaling, encoding
Train-Test Split: Validation strategies, cross-validation
Overfitting vs Underfitting: Bias-variance tradeoff
Model Evaluation Metrics: Accuracy, precision, recall, F1-score, ROC-AUC
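
The threshold-based metrics named above can be computed directly from the four confusion-matrix counts. The following is a small sketch in plain Python, assuming binary labels; the function name `classification_metrics` and the sample labels are invented for illustration. In practice a library such as scikit-learn would normally be used, and ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels, used only to exercise the function.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```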
Module 13: Supervised Learning - Regression
Linear Regression: Simple and multiple regression
Polynomial Regression: Non-linear relationships
Regularization: Ridge, Lasso, Elastic Net
Model Evaluation: R², MAE, MSE, RMSE
Feature Selection: Correlation, statistical tests, recursive elimination
Project: Predict house prices with comprehensive feature engineering

Module 14: Supervised Learning - Classification


Logistic Regression: Binary and multiclass classification
Decision Trees: Splitting criteria, pruning, interpretation
k-Nearest Neighbors (k-NN): Distance metrics, choosing k
Naive Bayes: Gaussian, multinomial, Bernoulli
Support Vector Machines: Linear and non-linear kernels
Model Evaluation: Confusion matrix, classification report
Handling Imbalanced Data: SMOTE, class weights, sampling techniques
Project: Build a customer churn prediction model

Module 15: Ensemble Methods & Model Optimization


Bagging: Random Forest, Extra Trees
Boosting: AdaBoost, Gradient Boosting, XGBoost, LightGBM
Stacking: Meta-learning approaches
Hyperparameter Tuning: Grid search, random search, Bayesian
optimization
Feature Importance: Tree-based importance, permutation importance
Model Selection: Comparing multiple algorithms
Project: Kaggle-style competition with ensemble methods

Module 16: Unsupervised Learning


Clustering: K-means, hierarchical clustering, DBSCAN
Dimensionality Reduction: PCA, t-SNE, UMAP
Association Rules: Market basket analysis, Apriori algorithm
Anomaly Detection: Isolation Forest, One-class SVM
Evaluation: Silhouette score, elbow method, cluster interpretation
Project: Customer segmentation for marketing strategy
06. ADVANCED TOPICS & SPECIALIZATIONS
Exploring cutting-edge applications

Module 17: Natural Language Processing (NLP)


Text Preprocessing: Cleaning, tokenization, stemming, lemmatization
Feature Extraction: Bag of words, TF-IDF, n-grams
Sentiment Analysis: Rule-based and ML approaches
Text Classification: Spam detection, topic classification
Named Entity Recognition: Extracting information from text
Word Embeddings: Word2Vec, GloVe basics
Project: Build a movie review sentiment analyzer

Module 18: Time Series Analysis


Time Series Components: Trend, seasonality, noise
Stationarity: Tests and transformations
Moving Averages: Simple and exponential smoothing
ARIMA Models: Autoregression, differencing, moving averages
Seasonal Decomposition: STL decomposition
Forecasting Evaluation: MAE, MAPE, forecast accuracy
Project: Stock price prediction or sales forecasting
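
The two smoothing techniques listed under Moving Averages can be sketched in a few lines of plain Python. The function names, the `window` and `alpha` parameters, and the sample sales figures are assumptions for this example; pandas offers `rolling()` and `ewm()` for the same purpose on real data.

```python
def simple_moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not series:
        return []
    smoothed = [series[0]]  # the first smoothed value is the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Invented monthly sales figures, used only to exercise the functions.
sales = [120, 135, 128, 150, 160, 155]
print(simple_moving_average(sales, window=3))
print(exponential_smoothing(sales, alpha=0.5))
```

A larger `window` or a smaller `alpha` gives a smoother curve that reacts more slowly to recent changes.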

Module 19: Introduction to Deep Learning


Neural Network Fundamentals: Perceptron, multi-layer networks
TensorFlow/Keras Basics: Building simple neural networks
Activation Functions: ReLU, sigmoid, tanh
Loss Functions & Optimizers: SGD, Adam, learning rate
Image Classification: CNN basics with MNIST/CIFAR-10
Transfer Learning: Using pre-trained models
Project: Build an image classifier for everyday objects

Module 20: Computer Vision Basics


Image Processing: OpenCV fundamentals
Feature Detection: Edges, corners, contours
Image Classification: CNN architectures
Object Detection: YOLO concepts
Face Recognition: Basic implementations
Project: Build a face mask detection system
07. CLOUD & DEPLOYMENT
Making models production-ready

Module 21: Cloud Computing Fundamentals


Cloud Platforms: AWS, Google Cloud, Azure overview
Cloud Storage: S3, Google Cloud Storage
Compute Services: EC2, Google Compute Engine
Managed Services: AWS SageMaker, Google AI Platform
Cost Management: Understanding pricing, optimization
Project: Deploy a model on cloud platform

Module 22: Model Deployment & MLOps


Model Serialization: Pickle, joblib, ONNX
API Development: Flask, FastAPI for model serving
Containerization: Docker basics for ML applications
Model Monitoring: Performance tracking, drift detection
Version Control: Git for code, DVC for data and models
CI/CD for ML: Automated testing and deployment
Project: End-to-end ML pipeline with monitoring

Module 23: Web Applications for Data Science


Streamlit: Interactive web apps for ML models
Dash/Plotly: Complex interactive dashboards
Gradio: Quick ML demos and interfaces
Heroku Deployment: Free hosting for prototypes
User Authentication: Basic security concepts
Project: Deploy a complete ML web application
08. BUSINESS INTELLIGENCE & ADVANCED ANALYTICS
Enterprise-level skills

Module 24: Business Intelligence Tools


Power BI: Connecting data sources, DAX formulas
Tableau: Advanced visualizations, calculated fields
Google Data Studio: Free BI tool mastery
Dashboard Design: KPIs, drill-downs, interactivity
Data Storytelling: Executive presentations
Project: Create comprehensive business dashboard

Module 25: Advanced Analytics & Optimization


Statistical Modeling: Advanced regression techniques
Optimization: Linear programming, constraint optimization
Simulation: Monte Carlo methods
Causal Inference: Understanding causation vs correlation
Experimental Design: A/B testing, factorial designs
Project: Business optimization case study

09. PORTFOLIO & CAREER PREPARATION
Becoming job-ready

Module 26: Portfolio Development


GitHub Profile: Professional setup, README files
Project Documentation: Clear explanations, code comments
Jupyter Notebooks: Best practices, storytelling
Portfolio Website: Showcase projects and skills
Resume Building: ATS-friendly data science resumes
LinkedIn Optimization: Professional networking
Module 27: Capstone Project Options
Choose one comprehensive project:
1. E-commerce Recommendation System
2. Healthcare Predictive Analytics
3. Financial Risk Assessment Model
4. Social Media Sentiment Analysis Platform
5. Supply Chain Optimization
6. Real Estate Price Prediction App
Requirements:

Data collection/cleaning
Comprehensive EDA
Multiple ML models
Model comparison and selection
Deployment (web app or API)
Business presentation

Module 28: Interview Preparation & Job Search


Technical Interview Prep: Coding challenges, ML concepts
Case Study Practice: Business problem solving
Behavioral Interviews: STAR method, data science scenarios
Salary Negotiation: Market research, negotiation tactics
Job Search Strategy: Where to apply, networking tips
Mock Interviews: Practice with feedback

🛠 Tools & Technologies Covered


Programming & Development
Python, SQL, Git, Docker
Jupyter, VS Code, Google Colab
Linux command line basics
Data Processing & Analysis
Pandas, NumPy, Scikit-learn
SciPy, Statsmodels
Beautiful Soup, Requests (web scraping)
Visualization & BI
Matplotlib, Seaborn, Plotly
Power BI, Tableau, Google Data Studio
Streamlit, Dash
Cloud & Deployment
AWS (S3, EC2, SageMaker)
Google Cloud Platform
Heroku, Docker
Databases
SQL (PostgreSQL, MySQL, SQLite)
MongoDB basics
Data warehousing concepts

2500+ Students Placed


1600+ Hiring Partners

Everything You Need to Start a Tech Career

Pre-recorded sessions are available in the dashboard, with 1 year of access after registration.
Contact us: 8790732332, 9700610241, 9381823567
Address: Rd no 1, KPHB, Kukatpally, Hyderabad.
www.10000coders.in
