Advanced Data Mining


BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI

WORK INTEGRATED LEARNING PROGRAMMES


Part A: Content Design

Course Title Advanced Data Mining


Course No(s) AIML*ZG548
Credit Units 4
Credit Model 1 - 0.5 - 2.5
1 unit for classroom hours, 0.5 unit for Tutorial, 2.5 units for
Student preparation.
1 unit = 32 hours
Content Authors

Version

Date May 2023

Course Objectives
No. Objective

CO1 Understand diverse knowledge/patterns that can be discovered from a variety of information
repositories.

CO2 Learn methods/techniques of discovering knowledge from structured data, various types of
unstructured data and big data.

CO3 Understand the efficiency and effectiveness of applicable techniques for data mining.

Text Book(s)
T1 Tan P. N., Steinbach M., Karpatne A. & Kumar V., “Introduction to Data Mining”, 2nd Edition,
Pearson Education, 20nn
T2 Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, 3rd Edition,
Morgan Kaufmann Publishers, 2011

Reference Book(s) & other resources

R1 Mining of Massive Datasets, 3rd ed., by Jure Leskovec, Anand Rajaraman, Jeffrey Ullman
R2 Data Mining – The Textbook by Charu Aggarwal, Springer 2015
Topic-wise additional references will be provided with lecture slides
Modular Content Structure

1. Introduction to Data Mining


1.1. Data Mining definitions
1.2. Various Patterns Discovered
1.3. DM process
1.4. DM challenges
2. Data Readiness and Quality
2.1. Data Quality
2.2. Data Exploration
2.3. Approaches for Big Data
3. Classification and Prediction
3.1. Concepts of classification and prediction
3.2. Solution Approaches
4. Association Analysis
4.1. Association analysis concepts
4.2. Apriori Algorithm for frequent itemsets
4.3. FP-Tree technique for frequent itemsets
4.4. Mining association rules
5. Clustering
5.1. Cluster analysis concepts
5.2. Partitioning methods
5.3. Hierarchical methods for cluster analysis
5.4. Density based methods for cluster analysis
5.5. Advanced Methods of Clustering (OPTICS, BIRCH, Grid)
6. Anomaly Detection
6.1. Concepts of Outliers
6.2. Statistical approaches
6.3. Proximity and Density based outlier detection
7. Data mining on unstructured (Big) data
7.1. Big Data
7.1.1 Distributed File System
7.1.2 MapReduce
7.1.3 Stream Processing
7.1.4 Apache Stack
7.2. Graph Mining methods and applications
7.3. Multimedia Data Mining
7.4. Text Mining, Web and Social Media Mining
8. Data Mining Applications
8.1. Recommendation systems
8.2. Fraud Detection
8.3. Sentiment Analysis
9. Data Mining Summary
9.1. Risk of False Discoveries
9.2. Course Summary
Learning Outcomes:
No. Learning Outcome

LO1 Understand the various types of interesting patterns in data.

LO2 Knowledge of data mining techniques for discovering interesting patterns from data.

LO3 Ability to compare applicable methods for efficiency and effectiveness.

Part B: Contact Session Plan

Academic Term S2 2022-23


Course Title Advanced Data Mining
Course No
Lead Instructor
Course Contents

Contact Hours (#) | List of Topic Title (from content structure in Part A) | Topic # (from content structure in Part A) | Text/Ref Book/external resource
1-2 | Introduction to Data Mining: Data Mining definitions; Various Patterns Discovered; DM process; DM challenges | 1 | T1-Ch 1; R2-Ch 1
3-4 | Data Readiness and Quality: Data Quality; Data Exploration; Approaches for Big Data | 2 | T2-Ch 3; R1-Ch 1; Class Notes
5-6 | Classification and Prediction: Concepts of classification and prediction; Solution Approaches | 3 | T2-Ch 9
7-12 | Association Analysis: Association analysis concepts; Apriori Algorithm for frequent itemsets; FP-Tree technique for frequent itemsets; Mining association rules | 4 | T1-Ch 4
13-18 | Clustering: Cluster analysis concepts; Partitioning methods; Hierarchical methods for cluster analysis; Density based methods for cluster analysis; Advanced Methods of Clustering (OPTICS, BIRCH, Grid) | 5 | T1-Ch 5; T2-Ch 10
19-20 | Anomaly Detection: Concepts of Outliers; Statistical approaches; Proximity and Density based outlier detection | 6 | T2: 12.1-12.4
21-28 | Data mining on unstructured (Big) data: Big Data (Distributed File System, MapReduce, Stream Processing, Apache Stack); Graph Mining methods and applications; Multimedia Data Mining; Text Mining, Web, and Social Media Mining | 7 | R1: 2.1-2.3, 4.1-4.4; T2 (Second Edition): Ch 9, Ch 10; Class Notes
29-30 | Data Mining Applications: Recommendation systems; Fraud Detection; Sentiment Analysis | 8 | Class Notes
31-32 | Data Mining Summary: Risk of False Discoveries; Course Summary | 9 | T1: Ch 10; Class Notes

Select Topics for experiential learning
Topic No. Select Topics in Syllabus for experiential learning

1 Classification
2 Association
3 Clustering
4 Anomaly Detection

Evaluation Scheme

Legend: EC = Evaluation Component


No | Name | Type | Duration | Weight | Day, Date, Session, Time
EC-1 | Assignment | Implementation based | - | 10% | To be announced
EC-1 | Quiz-I | MCQs | 1 hour | 5% | To be announced
EC-1 | Quiz-II | MCQs | 1 hour | 5% | To be announced
EC-2 | Mid-Semester Test | TBA | 2 hours | 30% | To be announced
EC-3 | Comprehensive Exam | Open Book | 3 hours | 50% | To be announced
Note - Evaluation components can be tailored depending on the proposed model.

Important Information
Syllabus for Mid-Semester Test: Topics in Weeks 1-8
Syllabus for Comprehensive Exam: All topics given in plan of study

Evaluation Guidelines:
1. EC-1 consists of one Assignment and two Quizzes. Announcements regarding the same will be made
in a timely manner.
2. If a student is unable to appear for the Regular examinations (Mid-Semester and Comprehensive)
due to genuine exigencies, the student should follow the procedure to apply for the Make-Up
Test/Exam. The genuineness of the reason for absence from the Regular Exam shall be assessed before
permission is given to appear for the Make-Up Exam.
