
Stream Processing and Analytics

The document outlines the course content and structure for a course on stream processing and analytics, including 11 topics that will be covered over 16 lectures, with an evaluation scheme consisting of assignments, a mid-semester test, and a comprehensive exam. The course objectives are to introduce frameworks for real-time stream processing and various tools, techniques, and algorithms for processing streaming data.

BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI

WORK INTEGRATED LEARNING PROGRAMMES


Digital
Part A: Content Design

Course Title Stream Processing and Analytics


Course No(s)  
Credit Units 5
Credit Model
Content Authors

Course Objectives
No

CO1 To introduce the framework for real-time stream processing

CO2 To present a survey of tools and techniques for real-time stream processing

CO3 To introduce various stream processing algorithms

CO4 To introduce approaches to evaluate stream learning algorithms

CO5 To introduce designing solutions to stream processing problems

Text Book(s)

T1 Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, Byron Ellis, 2014, Wiley
T2 Knowledge Discovery from Data Streams, João Gama, 2010, Chapman & Hall/CRC

Reference Book(s) & other resources

R1 Streaming Data: Understanding the Real-Time Pipeline, Andrew G. Psaltis, 2017, Manning Publications
R2 Storm Applied, Sean T. Allen, Matthew Jankowski, Peter Pathirana, 2015, Manning Publications
R3 Data Streams: Models and Algorithms, Charu C. Aggarwal, 2007, Springer
R4 Fundamentals of Stream Processing, Henrique C. M. Andrade, Buğra Gedik, Deepak S. Turaga, 2014, Cambridge University Press
Content Structure

1. Introduction to Real Time Big Data Systems

a. Real Time, Streaming Data & Sources

b. Real time streaming system architecture

c. Characteristics of a Real Time Architecture and Processing

2. Configuration and Coordination Systems

a. Motivation

b. Distributed State and Issues

c. Coordination and Configuration using Apache ZooKeeper

3. Data Flow Management

a. Understanding Distributed Data Flows

i. Various Data Delivery and Processing Requirements

ii. N+1 Problem

b. Apache Kafka (High-Throughput Distributed Messaging)

4. Processing Stream Data

a. Elements of Distributed Stream Data Processing

b. Data Processing with Storm

5. Overview of Data Storage – Requirements

a. Need for long-term storage for a real time processing framework

b. Overview of In-memory Storage

c. NoSQL Storage Systems

d. Choosing the right storage solution

6. Visualizing Data

a. Visualizing Streaming Data – Requirements

b. Tools and Examples

7. Introduction to Stream Processing

a. Bounds of Random Variables, Poisson Processes

b. Maintaining Simple Statistics from Data Streams

c. Sliding Windows and computing statistics over sliding windows

d. Data Synopsis (Sampling, Histograms, Wavelets, DFT)

8. Exact Aggregation

a. Timed Counting and Summation

b. Multi Resolution Time Series Aggregation

c. Stochastic Optimization

9. Statistical Approximation to Streaming Data

a. Probabilities and Distributions

b. Working with Distributions

c. Sampling Procedures for Streaming Data

10. Approximating Streaming Data with Sketching

a. Registers and Hash Functions

b. Working with Sets

c. The Bloom Filter

d. Distinct Value Sketches

e. The Count-Min Sketch

11. Advanced Topics

a. Clustering techniques for Streaming Data – Hierarchical Methods

b. Decision Tree (VFDT)

12. Case Studies in Design

Learning Outcomes:

No Learning Outcomes

LO1 Understand real-time streaming systems and applications

LO2 Understand coordination and configuration systems

LO3 Understand storage and processing systems and use them

LO4 Understand processing methods for streaming data

LO5 Understand aggregation and statistical approximations

LO6 Understand approximation techniques for data streams

Part B: Learning Plan

Academic Term
Course Title Stream Processing and Analytics
Course No  
Lead Instructor S P Vimal
Lecture Plan
Lecture # | Topics | Text / Reference
1 | Introduction to Real Time Big Data Systems (Real Time, Streaming Data & Sources, Real time streaming system architecture, Characteristics of a Real Time Architecture and Processing) |
2 | Configuration and Coordination Systems (Motivation, Distributed State and Issues, Coordination and Configuration using Apache ZooKeeper) |
3 | Data Flow Management (Understanding Distributed Data Flows, Various Data Delivery and Processing Requirements, N+1 Problem, Apache Kafka) |
4-5 | Processing Stream Data (Elements of Distributed Stream Data Processing, Data Processing with Storm) |
6 | Overview of Data Storage – Requirements (Need for long-term storage for a real time processing framework, Overview of In-memory Storage, NoSQL Storage Systems, Choosing the right storage solution) |
  | Visualizing Data (Visualizing Streaming Data – Requirements, Tools and Examples) |
7-8 | Introduction to Stream Processing (Bounds of Random Variables, Poisson Processes, Maintaining Simple Statistics from Data Streams, Sliding Windows and computing statistics over sliding windows, Data Synopsis (Sampling, Histograms, Wavelets, DFT)) |
9 | Exact Aggregation (Timed Counting and Summation, Multi Resolution Time Series Aggregation, Stochastic Optimization) |
10-11 | Statistical Approximation to Streaming Data (Probabilities and Distributions, Working with Distributions, Sampling Procedures for Streaming Data) |
12 | Approximating Streaming Data with Sketching (Registers and Hash Functions, Working with Sets, The Bloom Filter, Distinct Value Sketches, The Count-Min Sketch) |
13-14 | Advanced Topics (Clustering techniques for Streaming Data – Hierarchical Methods, Decision Tree (VFDT), Fast Pattern Mining) |
15-16 | Analytics Case Studies, Review |

Evaluation Scheme:
Legend: EC = Evaluation Component; AN = After Noon Session; FN = Fore Noon Session
No | Name | Type | Duration | Weight | Day, Date, Session, Time
EC-1 | Assignment-1, Assignment-2 | | | |
EC-2 | Mid-Semester Test | Closed Book | | |
EC-3 | Comprehensive Exam | Open Book | | |

Notes:
Syllabus for Mid-Semester Test (Closed Book): Topics in Session Nos. 1 to 16 (contact hours)
Syllabus for Comprehensive Exam (Open Book): All topics (Session Nos. 1 to 32) (contact hours)

Important links and information:


Elearn portal: https://elearn.bits-pilani.ac.in
Students are expected to visit the Elearn portal on a regular basis and stay up to date with the
latest announcements and deadlines.
Contact sessions: Students should attend the online lectures as per the schedule provided on
the Elearn portal.
Evaluation Guidelines:
1. EC-1 consists of either two Assignments or three Quizzes. Students will attempt them
through the course pages on the Elearn portal. Announcements will be made on the
portal, in a timely manner.
2. For Closed Book tests: No books or reference material of any kind will be permitted.
3. For Open Book exams: Use of books and any printed / written reference material
(filed or bound) is permitted. However, loose sheets of paper will not be allowed. Use
of calculators is permitted in all exams. Laptops/Mobiles of any kind are not allowed.
Exchange of any material is not allowed.
4. If a student is unable to appear for the Regular Test/Exam due to genuine exigencies,
the student should follow the procedure to apply for the Make-Up Test/Exam which
will be made available on the Elearn portal. The Make-Up Test/Exam will be
conducted only at selected exam centres on the dates to be announced later.

It shall be the responsibility of the individual student to be regular in maintaining the self-study
schedule as given in the course handout, attend the online lectures, and take all the prescribed
evaluation components such as Assignment/Quiz, Mid-Semester Test and Comprehensive
Exam according to the evaluation scheme provided in the handout.
