BATCH ID: FEBRUARY 2024 [2PM - 5PM] DATA SCIENCE AND ARTIFICIAL INTELLIGENCE COURSE
TOPICS SUB TOPICS
1. INDUCTION
    1.1 GENERAL WELCOME
    1.2 STUDENT INTRODUCTION
    1.3 OVERVIEW OF DATA SCIENCE FIELD
    1.4 ROADMAP OF COURSE
    1.5 CODE OF CONDUCT
    1.6 CAREER OPPORTUNITIES
    1.7 RESUME AND LINKEDIN PROFILE BUILDING
    1.8 OVERVIEW OF PLACEMENT PROCESS - RULES AND GUIDELINES
    1.9 SOFTWARE INSTALLATION - ANACONDA, R STUDIO, JUPYTER NOTEBOOK, POWER BI, EXCEL

2. FUNDAMENTALS OF EXCEL
    2.1 OVERVIEW OF EXCEL INTERFACE
    2.2 BASIC FUNCTIONS ON PROVIDED DATA SET - SUM, SUMIF, COUNT, COUNTIF, AVERAGE, FIND, CONCATENATE, LEN, DAYS, COUNTA, VLOOKUP, HLOOKUP, IF, IFERROR, FIND/SEARCH, LEFT/RIGHT, RANK
    2.3 RANGES AND TABLES
    2.4 DATA CLEANING - TEXT FUNCTIONS, DATES AND TIMES
    2.5 CONDITIONAL FORMATTING
    2.6 SORTING AND FILTERING
    2.7 SUBTOTALS WITH RANGES

3. ADVANCED EXCEL
    3.1 PIVOTS
    3.2 DATA ANALYSIS IN EXCEL - TRENDS, PATTERNS
    3.3 DATA VISUALIZATION IN EXCEL - COLUMN CHARTS, LINE CHARTS, PIE CHARTS, BAR CHARTS, AREA CHARTS, SCATTER PLOTS
    3.4 WORKING WITH MULTIPLE WORKSHEETS
    3.5 LINKING & REFERENCING THE DATA BETWEEN WORKSHEETS

4. INTRODUCTION TO PYTHON
    4.1 OVERVIEW OF PYTHON BASICS
    4.2 UNDERSTANDING STATEMENTS, EXPRESSIONS, AND INDENTATION
    4.3 OVERVIEW OF IDENTIFIERS, KEYWORDS, AND COMMENTS
    4.4 VARIABLES: DECLARATION, ASSIGNMENT, AND NAMING CONVENTIONS
    4.5 COMMON DATA TYPES: INTEGERS, FLOATS, STRINGS
    4.6 TYPE CASTING AND CONVERSION
    4.7 OPERATORS IN PYTHON
    4.8 HANDS-ON ACTIVITY

5. LOOPS, FUNCTIONS, ERROR HANDLING
    5.1 CONDITIONAL STATEMENTS & LOOPS
    5.2 LOOP CONTROL STATEMENTS: BREAK, CONTINUE, PASS
    5.3 DEFINING AND CALLING FUNCTIONS
    5.4 FUNCTION PARAMETERS AND RETURN VALUES
    5.5 SCOPE OF VARIABLES (GLOBAL AND LOCAL)
    5.6 ADVANCED FUNCTIONS
    5.7 DEFAULT VALUES AND VARIABLE-LENGTH ARGUMENTS
    5.8 RECURSIVE FUNCTIONS
    5.9 MAP, REDUCE, AND FILTER
    5.10 INTRODUCTION TO EXCEPTIONS
    5.11 TRY, EXCEPT, AND FINALLY BLOCKS
    5.12 HANDLING COMMON ERRORS
    5.13 HANDS-ON ACTIVITY

6. DATA STRUCTURE - 1: LIST AND TUPLE
    6.1 CREATION OF LISTS
    6.2 BASIC OPERATIONS ON LISTS
    6.3 DEMONSTRATION OF LIST MANIPULATION TECHNIQUES
    6.4 SLICING AND INDEXING IN LISTS
    6.5 LIST COMPREHENSION FOR CONCISE AND READABLE CODE
    6.6 CREATION OF TUPLES
    6.7 BASIC OPERATIONS ON TUPLES
    6.8 SLICING AND INDEXING IN TUPLES
    6.9 COMMON OPERATIONS ON BOTH LISTS AND TUPLES
    6.10 HANDS-ON ACTIVITY

7. DATA STRUCTURE - 2: DICTIONARY AND SETS
    7.1 CREATION OF DICTIONARIES
    7.2 BASIC OPERATIONS ON DICTIONARIES
    7.3 MANIPULATING DICTIONARIES
    7.4 DICTIONARY COMPREHENSION FOR CONCISE CREATION
    7.5 CREATION OF SETS
    7.6 MANIPULATING SETS
    7.7 COMMON OPERATIONS ON BOTH DICTIONARIES AND SETS
    7.8 HANDS-ON ACTIVITY

8. INTRODUCTION TO NUMPY
    8.1 INTRO TO NUMPY AND CREATING NUMPY ARRAYS
    8.2 BASIC OPERATIONS ON ARRAYS
    8.3 INDEXING & SLICING
    8.4 RESHAPING, STACKING & SPLITTING
    8.5 ITERATION, FILTERING & BOOLEAN INDEXING
    8.6 IMAGE PROCESSING USING NUMPY & MATPLOTLIB

9. INTRODUCTION TO PANDAS AND DATA VISUALIZATION
    9.1 DATA STRUCTURES IN PANDAS
    9.2 CREATING DATAFRAME & LOADING FILES
    9.3 DATA EXPLORATION OR EDA
    9.4 CREATING & SAVING BASIC PLOTS USING MATPLOTLIB
    9.5 CREATING STATISTICAL PLOTS USING SEABORN
    9.6 EXPLORING RELATIONSHIPS IN DATA: PAIR PLOT & HEAT MAP
    9.7 HANDS-ON ACTIVITY

10. INTRODUCTION TO SQL & BASIC QUERYING
    10.1 WHAT IS SQL AND ITS SIGNIFICANCE?
    10.2 SQL'S ROLE IN DATA RETRIEVAL AND MANIPULATION
    10.3 BASIC SELECT STATEMENT FOR DATA RETRIEVAL
    10.4 RETRIEVING SPECIFIC COLUMNS AND ALL COLUMNS
    10.5 USING DISTINCT TO REMOVE DUPLICATES
    10.6 INTRODUCTION TO DATA MODELS & ER DIAGRAMS
    10.7 RELATIONAL VS. TRANSACTIONAL MODELS
    10.8 HOW DATA IS ORGANIZED IN TABLES
    10.9 FILTERING DATA WITH WHERE CLAUSE
    10.10 SORTING DATA WITH ORDER BY
    10.11 LIMITING RESULTS WITH LIMIT
    10.12 USING ALIASES FOR COLUMN NAMES

11. ADVANCED SQL CONCEPT & DATA MANIPULATION
    11.1 CREATING AND USING TEMPORARY TABLES
    11.2 ADDING COMMENTS TO SQL CODE FOR DOCUMENTATION
    11.3 INTRODUCTION TO DATA MODELING
    11.4 DESIGNING A SIMPLE DATABASE SCHEMA
    11.5 SORTING DATA WITH ORDER BY (ADVANCED)
    11.6 ADVANCED FILTERING WITH IN, OR, AND, NOT
    11.7 PERFORMING MATHEMATICAL OPERATIONS ON DATA
    11.8 INTRODUCTION TO AGGREGATE FUNCTIONS (COUNT, SUM, AVG, MAX, MIN)
    11.9 GROUPING DATA WITH GROUP BY
    11.10 FILTERING GROUPED DATA WITH HAVING
    11.11 UNDERSTANDING SUBQUERIES AND THEIR TYPES
    11.12 PERFORMING JOIN OPERATIONS (INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN)
    11.13 UPDATING AND DELETING DATA WITH SQL
    11.14 ANALYZING DATA WITH SIMPLE STATISTICS

12. FUNDAMENTALS OF STATISTICS & PROBABILITY
    12.1 DEFINE STATISTICS AND ITS IMPORTANCE
    12.2 EXPLAIN THE TYPES OF DATA: CATEGORICAL AND NUMERICAL
    12.3 INFERENTIAL AND DESCRIPTIVE STATISTICS
    12.4 MEASURE OF CENTRAL TENDENCY: MEAN, MEDIAN, MODE
    12.5 MEASURE OF DISPERSION: VARIANCE & STANDARD DEVIATION
    12.6 PROBABILITY BASICS, ITS RULES & NOTATION
    12.7 PROBABILITY DISTRIBUTION - DISCRETE & CONTINUOUS
    12.8 NORMAL DISTRIBUTION & PROPERTIES
    12.9 CENTRAL LIMIT THEOREM & ITS IMPORTANCE
    12.10 SKEWNESS & T-DISTRIBUTIONS

13. ADVANCED STATISTICS & HYPOTHESIS TESTING
    13.1 HYPOTHESIS TESTING - NULL & ALTERNATIVE
    13.2 SIGNIFICANCE LEVEL (ALPHA) & P-VALUE
    13.3 ONE-SAMPLE & TWO-SAMPLE T-TEST
    13.4 VISUALIZATION PLOTS FOR DATA EXPLORATION
    13.5 INTERPRETATION OF VISUALIZATION
    13.6 CORRELATION & REGRESSION
    13.7 CONFIDENCE INTERVAL
    13.8 HYPOTHESIS TESTING WITH Z-TEST
    13.9 CHI-SQUARE TEST FOR CATEGORICAL DATA
    13.10 ONE-WAY & TWO-WAY ANOVA

14. INTRODUCTION TO MACHINE LEARNING AND REGRESSION BASICS
    14.1 INTRO TO ML & ITS ROLE IN DATA ANALYSIS
    14.2 TYPES OF MACHINE LEARNING - SUPERVISED, UNSUPERVISED & REINFORCEMENT
    14.3 DATA PREPROCESSING - HANDLING MISSING VALUES AND OUTLIERS; METHODS TO HANDLE OUTLIERS - IQR METHOD & Z-SCORE METHOD
    14.4 FEATURE SCALING
    14.5 LINEAR REGRESSION AS REGRESSION TECHNIQUE
    14.6 SIMPLE LINEAR REGRESSION

15. MULTIPLE LINEAR REGRESSION & MODEL EVALUATION
    15.1 MODEL EVALUATION METRICS FOR REGRESSION
    15.2 MEAN ABSOLUTE ERROR (MAE)
    15.3 MEAN SQUARED ERROR (MSE)
    15.4 ROOT MEAN SQUARED ERROR (RMSE)
    15.5 R-SQUARED (COEFFICIENT OF DETERMINATION)
    15.6 MULTIPLE LINEAR REGRESSION
    15.7 INBUILT DATASET
    15.8 CALIFORNIA HOUSING DATASET - MODEL EVALUATION

16. LOGISTIC REGRESSION AND CLASSIFICATION METRICS
    16.1 LOGISTIC REGRESSION
    16.2 BINARY CLASSIFICATION PROBLEM, LOGIT FUNCTION & ODDS RATIO
    16.3 BINARY & MULTICLASS LR
    16.4 CLASSIFICATION METRICS: ACCURACY, PRECISION, RECALL & F1-SCORE
    16.5 CONFUSION MATRIX INTERPRETATION
    16.6 ROC CURVES & AUC

17. DECISION TREES AND ENSEMBLE METHODS
    17.1 DECISION TREE & ITS STRUCTURE
    17.2 DECISION NODES & LEAF NODES, PARENT/CHILD NODE
    17.3 SPLITTING CRITERIA - GINI IMPURITY & ENTROPY
    17.4 TREE PRUNING & OVERFITTING
    17.5 TECHNIQUES TO PREVENT OVERFITTING
    17.6 RANDOM FOREST - ENSEMBLE LEARNING & BAGGING
    17.7 GRADIENT BOOSTING AND ADABOOST AS ENSEMBLE METHODS

18. MODEL EVALUATION AND VALIDATION TECHNIQUES
    18.1 K-FOLD CROSS-VALIDATION FOR MODEL EVALUATION
    18.2 HYPERPARAMETER TUNING USING GRID SEARCH
    18.3 DETAILED COVERAGE OF CLASSIFICATION METRICS
    18.4 PRECISION, RECALL, F1-SCORE, ROC CURVES, AUC
    18.5 INTERPRETATION AND PRACTICAL USAGE

19. UNSUPERVISED LEARNING
    19.1 K-MEANS CLUSTERING AND ITS APPLICATIONS
    19.2 K-MEANS ALGORITHM
    19.3 CHOOSING THE NUMBER OF CLUSTERS (K)
    19.4 INTRODUCTION TO HIERARCHICAL CLUSTERING
    19.5 AGGLOMERATIVE HIERARCHICAL CLUSTERING

20. SUPPORT VECTOR MACHINES (SVM) AND K-NEAREST NEIGHBORS (KNN)
    20.1 CLASSIFICATION AND REGRESSION WITH SVM
    20.2 THE CONCEPT OF MARGIN AND SUPPORT VECTORS
    20.3 KERNEL TRICK FOR NON-LINEAR DATA
    20.4 INTRODUCTION TO KNN
    20.5 HOW KNN MAKES PREDICTIONS BASED ON NEAREST NEIGHBORS
    20.6 EUCLIDEAN DISTANCE, MANHATTAN DISTANCE, AND OTHER DISTANCE METRICS
    20.7 CHOOSING THE VALUE OF K

21. TIME SERIES MODELING WITH ARIMA AND SARIMA
    21.1 UNDERSTANDING TIME SERIES DATA
    21.2 ARIMA MODEL & ITS COMPONENTS
    21.3 BUILDING ARIMA MODELS
    21.4 PRACTICAL FORECASTING WITH ARIMA
    21.5 SEASONAL ARIMA (SARIMA) MODEL & ITS COMPONENTS
    21.6 BUILDING AND FORECASTING WITH SARIMA
    21.7 MODEL EVALUATION AND TUNING

22. INTRODUCTION TO DEEP LEARNING
    22.1 OVERVIEW OF ARTIFICIAL NEURAL NETWORKS (ANNS)
    22.2 NEURAL NETWORK BASICS
    22.3 MODEL REPRESENTATION IN DEEP LEARNING
    22.4 DEEP LEARNING APPLICATIONS
    22.5 TRAINING DEEP LEARNING MODELS
    22.6 BUILDING A SIMPLE ARTIFICIAL NEURAL NETWORK
    22.7 PRACTICAL EXAMPLE AND HANDS-ON: ANN
    22.8 CONVOLUTIONAL NEURAL NETWORKS (CNNS)
    22.9 PRACTICAL EXAMPLE AND HANDS-ON: CNN

23. DEEP LEARNING ARCHITECTURES AND TRAINING
    23.1 RECURRENT NEURAL NETWORKS (RNNS)
    23.2 RECURRENT NEURONS
    23.3 VANISHING GRADIENT PROBLEM
    23.4 LSTM & GRU
    23.5 BUILDING & TRAINING RNN
    23.6 OVERFITTING & REGULARIZATION TECHNIQUES
    23.7 DROPOUT & NORMALIZATION
    23.8 MODEL EVALUATION, METRICS & HYPERPARAMETER TECHNIQUES
    23.9 PRACTICAL EXERCISE: RNN, LSTM, GRU

24. INTRO TO NATURAL LANGUAGE PROCESSING (NLP)
    24.1 WHAT IS NLP?
    24.2 CHALLENGES IN NLP
    24.3 KEY NLP TASKS
    24.4 TEXT PREPROCESSING IN NLP - TEXT TOKENIZATION, TEXT CLEANING AND NORMALIZATION, STOP WORDS REMOVAL, STEMMING AND LEMMATIZATION
    24.5 NLP LIBRARIES AND FRAMEWORKS
    24.6 FEATURE EXTRACTION AND REPRESENTATION
    24.7 BUILDING A SIMPLE TEXT CLASSIFICATION MODEL

25. ADVANCED NLP TECHNIQUES
    25.1 ADVANCED WORD EMBEDDINGS
    25.2 GLOVE (GLOBAL VECTORS FOR WORD REPRESENTATION)
    25.3 N-GRAMS
    25.4 RECURRENT NEURAL NETWORKS (RNN)
    25.5 LONG SHORT-TERM MEMORY (LSTM)
    25.6 GRU
    25.7 HANDS-ON ADVANCED NLP TASKS

26. DATA SCIENCE PROJECT - 1
    26.1 INTRODUCTION - DATA SCIENCE WORKFLOW
    26.2 DATA COLLECTION
    26.3 EXPLORATORY DATA ANALYSIS (EDA) AND VISUALIZATION
    26.4 DATA PREPROCESSING
    26.5 MACHINE LEARNING MODEL DEVELOPMENT
    26.6 INTRODUCTION TO MODEL DEPLOYMENT
    26.7 MODEL DEPLOYMENT USING STREAMLIT

27. DATA SCIENCE PROJECT - 2
    27.1 INTRODUCTION
    27.2 DATASET OVERVIEW
    27.3 NLP MODEL DEVELOPMENT
    27.4 DEEP LEARNING MODEL DEVELOPMENT
    27.5 MODEL EVALUATION
    27.6 MODEL DEPLOYMENT USING STREAMLIT

28. POWER BI
    28.1 INTRODUCTION TO POWER BI, KEY FEATURES, INSTALLATION & SETUP
    28.2 UNDERSTANDING THE POWER BI DESKTOP INTERFACE
    28.3 EXPLORING THE WORKSPACE: RIBBONS, PANES, AND MENUS
    28.4 DATA TRANSFORMATION
    28.5 DATA MODELING: RELATIONSHIPS, KEYS, AND HIERARCHIES
    28.6 DATA ANALYSIS EXPRESSIONS (DAX), DAX FUNCTIONS AND CALCULATIONS
    28.7 ADVANCED DAX CALCULATIONS: TIME INTELLIGENCE, FILTERS, AND MEASURES
    28.8 CHARTS AND PAGE LAYOUTS
    28.9 CREATING A POWER BI DASHBOARD
    28.10 PUBLISHING AND SHARING REPORTS AND DASHBOARDS

29. TABLEAU
    29.1 INTRODUCTION TO TABLEAU PREP, DATA CONNECTIONS, DATA CLEANING & TRANSFORMATION
    29.2 TABLEAU DESKTOP, DATA SOURCE CONNECTION & NAVIGATION
    29.3 BASIC VISUAL ANALYTICS - SORTING AND FILTERING DATA, INTERACTIVITY
    29.4 CALCULATED FIELDS, AGGREGATIONS & LEVEL OF DETAIL (LOD) EXPRESSIONS
    29.5 CHARTS AND DASHBOARDS USING TABLEAU

30. Introduction to Generative AI, Transformers & LLMs
    30.1 Overview, Key Features & Applications of GenAI
    30.2 Ethical considerations and potential biases in generative AI
    30.3 Introduction to Transformers & Self-Attention Mechanism
    30.4 Overview & Key Components of Transformers Architecture
    30.5 Large Language Models (LLMs) & their use cases
    30.6 Types of Transformers Model
    30.7 Exploring Hugging Face Community and Setting Up Environment
    30.8 Practical Exercise: Implementing Text Generation, Summarization & Translation using GPT-2 and Flan-T5 Models & Experimenting

31. Training and Fine-tuning LLMs
    31.1 Fine-tuning LLMs for Specific Tasks
    31.2 Dataset preparation and pre-processing techniques
    31.3 Fine-tuning hyperparameter optimization
    31.4 Evaluating the performance of fine-tuned models using metrics like BLEU and ROUGE
    31.5 Introduction to RAG for fine-tuning
    31.6 Practical Exercise 1: Training/Fine-Tuning GPT-2 on Indian Food Recipe Dataset and Evaluating on BLEU & ROUGE Score
    31.7 Practical Exercise 2: Training/Fine-Tuning Flan-T5 on Dialogues from the knkarthick dialogsum Dataset and Evaluating

32. Advanced Fine-tuning and Model Evaluation
    32.1 Fine-tuning for Precision
    32.2 Conditional text generation based on specific contexts
    32.3 Text-to-Speech (TTS) and Speech-to-Text (STT) integration with Hugging Face
    32.4 Advanced Model Evaluation Techniques: PEFT & LoRA
    32.5 Qualitative analysis of generated text and summarization outputs
    32.6 Human Evaluation in Generative Models
    32.7 Practical Exercise: Implementing PEFT & LoRA on Indian Food Recipe Trained GPT-2 model & integrating with TTS & STT

33. Building a Real-Life Chatbot with Gradio Deployment
    33.1 Real-World Applications of Generative AI
    33.2 Case studies of successful LLM applications in various industries
    33.3 Identifying new opportunities for generative AI solutions
    33.4 Ethical considerations and responsible deployment practices
    33.5 Designing and Developing a Chatbot
    33.6 Defining the chatbot's functionalities and target audience
    33.7 Introduction to Gradio - Interactive web-based interface
    33.8 Practical Exercise: Implementing and deploying the Chatbot with Gradio & testing

34. CAPSTONE PROJECT (REMOTE/ONLINE)
    34.1 PROJECT AND DATASET ASSIGNMENT/ALLOCATION BY CAPSTONE MENTOR
    34.2 ORIENTATION SESSION BY CAPSTONE MENTOR - PROJECT EXPECTATIONS
    34.3 MENTORSHIP SESSION BY CAPSTONE MENTOR - DOUBT RESOLUTIONS
    34.4 PROJECT PRESENTATION

35. CAREER ENHANCEMENT SESSION – 1
    35.1 PRESENTATION SKILLS
    35.2 EMAIL ETIQUETTE
    35.3 SOFT SKILLS TRAINING
    35.4 INTERVIEW DO'S AND DON'TS
    35.5 LINKEDIN PROFILE BUILDING
    35.6 PERSONALITY DEVELOPMENT / GROOMING

36. CAREER ENHANCEMENT SESSION – 2
    36.1 CAREER ENHANCEMENT PROCESS OVERVIEW
    36.2 MOCK INTERVIEWS
    36.3 HR AND TECHNICAL INTERVIEW PREP
    36.4 ONE-ON-ONE FEEDBACK

37. INDUSTRY GUEST SESSION
    37.1 GUEST SESSION FROM INDUSTRY PROFESSIONAL