Curriculum for Data Engineering:
Data Engineering Defined (2 Hours)
Evolution of Data Engineering
Data Engineering Skills and Activities
Data Engineer Responsibilities
Data Engineers and Roles
Data Engineering Lifecycle (2 Hours)
Generation: Source Systems
Storage
Ingestion
Transformation
Serving Data
Data Engineering Lifecycle – Undercurrents (2 Hours)
Security
Data Management
DataOps
Data Architecture
Orchestration
Software Engineering
Data Architecture – Design and Concepts (2 Hours)
Enterprise Architecture Defined
Data Architecture Defined
Principles of Good Data Architecture
Major Architecture Concepts
Examples and Types of Data Architecture (2 Hours)
Data Warehouse
Data Lake
Cloud Data Warehouse
Lambda Architecture
Kappa Architecture
Choosing Technologies Across the Data Engineering Lifecycle (2 Hours)
Interoperability
FinOps
Cost Optimization
Location
Build vs Buy
Today vs Future
Open-Source Software
Monolith vs Modular
Serverless vs Server
Data Architecture – Case Studies: Dropbox, Cloudflare, Netflix (2 Hours)
Data Generation in Source Systems (4 Hours)
Sources of Data
Source System Practical Details
Message Queues and Event-Streaming Platforms
Whom You’ll Work With
Undercurrents and Their Impact on Source Systems
Storage (4 Hours)
Raw Ingredients of Data Storage
Data Storage Systems
Data Engineering Storage Abstractions
Big Ideas and Trends
Undercurrents and Their Impact on Storage
Ingestion (5 Hours)
Data Pipelines Defined
Key Engineering Considerations for the Ingestion Phase
o Batch Ingestion Considerations
o Message and Stream Ingestion Considerations
Ways to Ingest Data
Whom You’ll Work With
Undercurrents and Their Impact on Ingestion
Transformation (5 Hours)
Queries
Modeling
Transformations
Data Pipelines
Undercurrents and Their Impact on Transformations
Serving Data for Analytics, Machine Learning, and Reverse ETL (4 Hours)
General Considerations for Serving Data
Analytics
Machine Learning
Reverse ETL
Ways to Serve Data for Analytics and ML
Whom You’ll Work With
Undercurrents and Their Impact on Serving Data
Security and Privacy (2 Hours)
People
Processes
Example Security Policy
Technology
Future of Data Engineering (2 Hours)
The Data Engineering Lifecycle Isn’t Going Away
The Decline of Complexity and the Rise of Easy-to-Use Data Tools
The Cloud-Scale Data OS and Improved Interoperability
“Enterprisey” Data Engineering
Titles and Responsibilities Will Morph
Moving Beyond the Modern Data Stack, Toward the Live Data Stack
Serialization and Compression (2 Hours)
Serialization Formats
Database Storage Engines
Compression: gzip, bzip2, Snappy, Etc.
Cloud Networking (1 Hour)
Cloud Network Topology
CDNs
The Future of Data Egress Fees
Conclusion (2 Hours)