Data Science Roadmap 2024
The following is a roadmap for learning data science skills as a total beginner (no coding or computer
science background needed). It includes FREE learning resources for technical skills (or tool skills)
and soft (or core) skills.
3 hours on Tool Skills + 1 hour on Core Skills = 4 hours of study every day
codebasics.io
These posts alone are NOT sufficient, so do your own additional research.
• https://bit.ly/4at9Jaw
• https://bit.ly/477IOOs
• https://bit.ly/3GPD7dp
• Topics
o Variables, Numbers, Strings
o Lists, Dictionaries, Sets, Tuples
o If condition, for loop
o Functions, Lambda Functions
o Modules (pip install)
o Read, Write files
o Exception handling
o Classes, Objects
• Learning Resources
o Track A (Free)
▪ Free Python Tutorials on YouTube (first 16 videos)
- https://bit.ly/3X6CCC7
▪ Codebasics Python tutorials in Hindi
- https://bit.ly/3vmXrgw
o Track B (Affordable Fees)
▪ Python course: https://codebasics.io/courses/python-for-beginner-and-intermediate-learners
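To give a feel for the Python topics listed above, here is a minimal sketch that touches most of them in one place. The names used here (`Student`, `average`, `marks.txt`, the sample marks) are made up for illustration and do not come from the linked tutorials. It uses only modules that ship with Python; third-party modules are the ones you would add with `pip install`.

```python
import os
import tempfile

# Variables, numbers, strings
name = "Asha"
hours_per_day = 4
greeting = f"{name} studies {hours_per_day} hours every day"

# Lists, dictionaries, sets, tuples
marks = [72, 85, 90]
profile = {"name": name, "marks": marks}
unique_marks = set(marks + [85])      # the duplicate 85 is dropped
point = (3, 4)                        # tuples cannot be changed after creation


# Functions and lambda functions
def average(values):
    """Return the mean of a list of numbers."""
    return sum(values) / len(values)


square = lambda x: x * x

# If condition and for loop
passed = []
for m in marks:
    if m >= 80:
        passed.append(m)


# Classes and objects
class Student:
    def __init__(self, name, marks):
        self.name = name
        self.marks = marks

    def best(self):
        return max(self.marks)


student = Student(name, marks)

# Write and read files (a temporary folder keeps the example tidy)
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "marks.txt")
    with open(path, "w") as f:
        f.write(",".join(str(m) for m in marks))
    with open(path) as f:
        loaded = [int(x) for x in f.read().split(",")]

# Exception handling: averaging an empty list divides by zero
try:
    average([])
except ZeroDivisionError:
    error_message = "cannot average an empty list"
```

Try changing the values and printing each variable to see what every section does before moving on.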
• Motivation
o Physics to Data Scientist Transition -> https://bit.ly/47cA8GU
• Assignment
• Tech Skills
o Numpy
▪ numpy YouTube playlist: https://bit.ly/3GTppa8
o Pandas, Matplotlib, Seaborn
▪ Go through chapter 3 in this course (entire chapter is free):
https://codebasics.io/courses/math-statistics-for-data-professionals
• Core/Soft Skills
o LinkedIn
▪ Start following prominent data science influencers.
▪ Daliana Liu: https://www.linkedin.com/in/dalianaliu/
▪ Nitin Aggarwal: https://www.linkedin.com/in/ntnaggarwal/
▪ Steve Nouri: https://www.linkedin.com/in/stevenouri/
▪ Dhaval Patel: https://www.linkedin.com/in/dhavalsays/
▪ Increase engagement.
▪ Start commenting meaningfully on data science and career-related posts.
▪ This helps you network with others working in the industry and build
connections.
▪ It is also a learning and brainstorming opportunity.
▪ Remember, online presence is a new form of resume.
o Business Fundamentals - Soft Skill
▪ Learn business concepts from ThinkSchool and other YouTube case studies
▪ Example: How Amul beat competition: https://youtu.be/nnwqtZiYMxQ
o Discord
▪ Start asking questions and get help from the community. This post
shows how to ask questions the right way: https://bit.ly/3I70EbI
▪ Join codebasics discord server: https://discord.gg/r42Kbuk
• Assignment
o Learning Resources
▪ Track A (Free)
▪ Learn the above topics from this excellent Khan Academy course
on statistics and probability.
▪ Course link: https://www.khanacademy.org/math/statistics-probability
▪ If you have doubts while doing the Khan Academy course, use the
StatQuest YouTube channel:
https://www.youtube.com/@statquest
▪ Use this free YouTube playlist: https://bit.ly/3QrSXis
• Motivation
o Petroleum engineer to data scientist: https://bit.ly/3REsqiL
• Assignment
☐ Finish all exercises in this playlist: https://bit.ly/3QrSXis
☐ Finish all exercises in the Khan Academy course.
• Assignment
☐ Perform EDA (exploratory data analysis) on at least 2 additional datasets on
Kaggle
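If you are unsure where to start with EDA, the sketch below shows a typical first pass with pandas. The dataset is a tiny made-up table (the column names `city`, `area_sqft` and `price_lakh` are assumptions for illustration only); with a real Kaggle file you would load it with `pd.read_csv` instead. Treat it as a starting point, not a complete analysis.

```python
import pandas as pd

# A tiny made-up dataset standing in for a Kaggle CSV.
# With a real file you would start from pd.read_csv("your_file.csv").
df = pd.DataFrame(
    {
        "city": ["Pune", "Pune", "Delhi", "Delhi", "Mumbai"],
        "area_sqft": [900, 1200, 1000, None, 1500],
        "price_lakh": [45.0, 60.0, 70.0, 65.0, 120.0],
    }
)

shape = df.shape                      # (rows, columns)
missing = df.isna().sum()             # missing values per column
summary = df.describe()               # count, mean, std, min, quartiles, max
avg_price_by_city = df.groupby("city")["price_lakh"].mean()
correlation = df[["area_sqft", "price_lakh"]].corr()

print(shape)
print(missing)
print(summary)
print(avg_price_by_city)
print(correlation)
```

From here, the usual next steps are plotting distributions with Matplotlib or Seaborn and deciding how to handle the missing values you found.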
• Topics
o Basics of relational databases.
o Basic Queries: SELECT, WHERE, LIKE, DISTINCT, BETWEEN, GROUP BY, ORDER
BY
o Advanced Queries: CTE, Subqueries, Window Functions
o Joins: Left, Right, Inner, Full
o No need to learn database creation, indexes, triggers etc. as those things are
rarely used by data scientists.
• Learning Resources
o Track A
▪ Khan Academy: https://bit.ly/3WFku20
▪ https://www.w3schools.com/sql/
▪ https://sqlbolt.com/
o Track B
▪ SQL course for data professionals: https://codebasics.io/courses/sql-beginner-to-advanced-for-data-professionals
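To practice the SQL topics above without installing a database server, one option is the `sqlite3` module that ships with Python. The sketch below is only an illustration: the `customers` and `orders` tables and their rows are made up, and the window function assumes your Python is bundled with SQLite 3.25 or newer. The courses above may use a different database, so some syntax can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi'), (3, 'Meera', 'Pune');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 400.0);
    """
)

# Basic query: WHERE, LIKE, ORDER BY
pune_customers = conn.execute(
    "SELECT name FROM customers WHERE city LIKE 'Pu%' ORDER BY name"
).fetchall()

# LEFT JOIN + GROUP BY: customers without orders are kept with a total of 0
totals = conn.execute(
    """
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY total DESC
    """
).fetchall()

# CTE + window function: rank customers by total spend
ranked = conn.execute(
    """
    WITH spend AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total, RANK() OVER (ORDER BY total DESC) AS spend_rank
    FROM spend
    ORDER BY spend_rank
    """
).fetchall()

conn.close()

print(pune_customers)
print(totals)
print(ranked)
```

Swapping `LEFT JOIN` for `INNER JOIN` in the second query drops the customer with no orders, which is a quick way to see the difference between the two joins.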
• Core/Soft Skills
o Presentation skills
▪ Death by PowerPoint: https://youtu.be/Iwpi1Lm6dFo
• Assignment
☐ Participate in the SQL resume project challenge on https://codebasics.io/
▪ Link: https://codebasics.io/challenge/codebasics-resume-project-challenge/7
▪ These challenges help you improve technical skills, soft skills and
business understanding.
☐ Make a LinkedIn post about your resume project challenge submission
• Sample post: https://bit.ly/48Bg5mB
• Codebasics is promoting winning entries to employers. This way you
can get interview calls. We do this in two ways:
o We have a database of employers hiring for data analyst
positions. We send them the top 10 or 20 profiles based on
performance.
o A LinkedIn post by Dhaval (who has more than 100k followers, some of
whom are HR managers and senior data analytics managers):
https://bit.ly/3jnni5c
• Motivation
o How Kaggle helped this person become an ML engineer: https://bit.ly/3RFVruy
• Assignment
☐ Complete all exercises in ML playlist: https://bit.ly/3io5qqX
☐ Work on 2 Kaggle ML notebooks
☐ Write 2 LinkedIn posts on whatever you have learnt in ML
☐ Discord: Help people with at least 10 answers
• You need to finish two end-to-end ML projects: one on regression, the other on
classification
• Regression Project: Bangalore property price prediction
o YouTube playlist link: https://bit.ly/3ivycWr
o The project covers the following
▪ Data cleaning
▪ Feature engineering
▪ Model building and hyperparameter tuning
▪ Writing a Flask server as a web backend
▪ Building a website for price prediction
▪ Deployment to AWS
• Classification Project: Sports celebrity image classification
o YouTube playlist link: https://bit.ly/3ioaMSU
o The project covers the following
▪ Data collection and data cleaning
▪ Feature engineering and model training
▪ Flask server as a web backend
▪ Building a website and deployment
• ATS Resume Preparation
o Resumes are dying but not dead yet. Focus more on online presence.
o Here is a resume tips video along with some templates you can use for your
data analyst resume: https://www.youtube.com/watch?v=buQSI8NLOMw
o Use this checklist to ensure you have the right ATS Resume: Check here.
• GitHub
o Upload your projects with code to GitHub and create a portfolio website
using github.io
o Sample portfolio website: http://rajag0pal.github.io/
• Linktree
o Helpful for putting multiple links on one page.
• Assignment
o In the above two projects, make the following changes
☐ Use FastAPI instead of Flask. FastAPI tutorial: https://youtu.be/Wr1JjhTt1Xg
☐ Regression project: Instead of property price prediction, take any other project
of your interest from Kaggle for regression
☐ Classification project: Instead of sports celebrity classification, take any
other project of your interest from Kaggle for classification and build an
end-to-end solution along with deployment to AWS or Azure
☐ Add links to your projects in your resume and on LinkedIn.
(Tag Codebasics, Dhaval Patel and Hemanand Vadivel with the hashtag
#dsroadmap24 so we can engage to increase your visibility)
• Topics
o What is a neural network? Forward propagation, back propagation
o Building multilayer perceptron
o Special neural network architectures
▪ Convolutional neural network (CNN)
▪ Sequence models: RNN, LSTM
• Learning Resources
o Deep Learning playlist (tensorflow): https://bit.ly/3vOZ3zV
o Deep learning playlist (pytorch): https://bit.ly/3TzDbWp
o End to end potato disease classification project: https://bit.ly/3QzkVJi
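To make forward propagation and back propagation from the topics above concrete, here is a minimal sketch in plain Python. It trains a single sigmoid neuron (not a full multilayer perceptron) on the logical OR function. The dataset, learning rate and epoch count are arbitrary choices for this illustration; the playlists above use TensorFlow and PyTorch, which compute these gradients for you.

```python
import math
import random

random.seed(0)

# Toy dataset: the logical OR function (inputs -> target)
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

# One neuron: two weights and a bias, started at small random values
w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
b = 0.0
learning_rate = 1.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x):
    """Forward propagation: weighted sum followed by the activation."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def mean_squared_error():
    return sum((forward(x) - y) ** 2 for x, y in data) / len(data)


loss_before = mean_squared_error()

for epoch in range(2000):
    for x, y in data:
        out = forward(x)
        # Back propagation via the chain rule for the squared error (out - y)^2:
        # d(loss)/dz = d(loss)/d(out) * d(out)/dz
        grad_z = 2 * (out - y) * out * (1 - out)
        # Gradient descent step on each parameter
        w[0] -= learning_rate * grad_z * x[0]
        w[1] -= learning_rate * grad_z * x[1]
        b -= learning_rate * grad_z

loss_after = mean_squared_error()
predictions = [round(forward(x)) for x, _ in data]

print(loss_before, loss_after)
print(predictions)
```

A multilayer perceptron repeats the same two steps layer by layer: the forward pass feeds each layer's output into the next, and the backward pass applies the chain rule through every layer in reverse.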
• Assignment
☐ Instead of potato plant images, use tomato plant images or some other image
classification dataset.
☐ Deploy to Azure instead of GCP.
☐ Create a presentation as if you are presenting to stakeholders and upload
the video presentation to LinkedIn.
• Many data scientists choose a specialized track, either NLP or computer
vision. You don’t need to learn both.
• Natural Language Processing (NLP)
o Topics
▪ Regex
▪ Text representation: Count vectorizer, TF-IDF, BOW, Word2Vec,
Embeddings
▪ Text classification: Naïve Bayes
▪ Fundamentals of the spaCy and NLTK libraries
▪ One end-to-end project
o Learning Resources
▪ NLP YouTube playlist: https://bit.ly/3XnjfEZ
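As a small preview of the text representation topics above, the sketch below builds a bag of words and a basic TF-IDF score using only the Python standard library. The three sample sentences and the helper names (`tokenize`, `tf_idf`) are made up for illustration, and the formula is the plain logarithm variant; library implementations often use smoothed versions, so their numbers will differ.

```python
import math
import re
from collections import Counter

docs = [
    "Data science is fun.",
    "Python makes data analysis fun!",
    "SQL is used to query data.",
]


def tokenize(text):
    """Regex step: lowercase the text and keep runs of letters only."""
    return re.findall(r"[a-z]+", text.lower())


tokenized = [tokenize(d) for d in docs]

# Bag of words: how many times each word appears in each document
bags = [Counter(tokens) for tokens in tokenized]

# Document frequency: in how many documents each word appears
doc_freq = Counter(word for tokens in tokenized for word in set(tokens))


def tf_idf(word, doc_index):
    """Term frequency times inverse document frequency (plain log variant)."""
    tf = bags[doc_index][word] / len(tokenized[doc_index])
    idf = math.log(len(docs) / doc_freq[word])
    return tf * idf


score_data = tf_idf("data", 0)        # "data" appears in every document
score_science = tf_idf("science", 0)  # "science" appears in one document only

print(bags[0])
print(score_data, score_science)
```

The word that appears in every document gets a score of zero, while the rarer word scores higher. That is the intuition behind TF-IDF: words that are common everywhere say little about any single document.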
Week 25 onwards….
• More projects
• Online brand building through LinkedIn, Kaggle, Discord, and open-source contributions
FAQs