Principles of Parallel and Distributed Computing
Eras of Computing: Parallelism vs Sequential Computation
• Sequential Computing:
o One task at a time, step-by-step execution.
o Limited efficiency for large-scale problems.
• Parallel Computing:
o Multiple computations happening simultaneously.
o Increases processing speed and efficiency.
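The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a benchmark: `square` is a hypothetical stand-in for one independent unit of work, and the pool size of 4 is an arbitrary assumption. In standard CPython builds, threads do not run CPU-bound Python code simultaneously, so for real speedup on such work `ProcessPoolExecutor` would be the usual substitute; the structure of the program stays the same.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n: int) -> int:
    """Hypothetical stand-in for one independent unit of work."""
    return n * n


def run_sequential(values):
    # One task at a time, step by step.
    return [square(v) for v in values]


def run_parallel(values, workers: int = 4):
    # Tasks are handed to a pool of workers. map() returns results in
    # input order even when tasks finish out of order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


data = list(range(8))
sequential_result = run_sequential(data)
parallel_result = run_parallel(data)
print(sequential_result == parallel_result)
```

Both versions produce the same answer; only the way the work is scheduled differs.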
What is Parallel Computing?
• Definition: Performing multiple calculations or executing multiple processes at the same
time.
• Key Elements:
o MIMD (Multiple Instruction, Multiple Data): A model where different
processors execute different instructions on different data.
o Shared vs. Distributed Memory:
§ Shared Memory: Multiple processors access the same memory space.
§ Distributed Memory: Each processor has its own private memory.
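o Efficiency Factors:
§ Ease of programming: Depends on the system architecture and memory model.
§ Failure tolerance: Redundant processors can take over when one fails.
§ Scalability: Capacity grows by adding processors, which sequential computing cannot do.
§ Code Granularity (Grain Size): The amount of work in each parallel unit, from large (whole tasks) to fine (loops and individual instructions); choosing the right grain size improves processor efficiency.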
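The shared versus distributed memory distinction above can be sketched with threads. This is a simulation under stated assumptions: real distributed memory involves separate machines or processes, whereas here a `queue.Queue` (named `mailbox`, a hypothetical name) stands in for the network, and the three chunks of numbers are made-up data.

```python
import queue
import threading

chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Shared memory: every worker updates the same variable, so access
# has to be protected by a lock.
shared_total = 0
lock = threading.Lock()


def shared_worker(chunk):
    global shared_total
    for value in chunk:
        with lock:
            shared_total += value


# Distributed memory (simulated): each worker keeps a private sum and
# sends only the result to the coordinator as a message.
mailbox = queue.Queue()


def private_worker(chunk):
    local_total = sum(chunk)   # private state, never shared
    mailbox.put(local_total)   # explicit message


def run(worker):
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(shared_worker)
run(private_worker)
message_total = sum(mailbox.get() for _ in chunks)
print(shared_total, message_total)
```

Both approaches reach the same total of 45. The shared version needs synchronization around every update; the message-based version needs none, but pays for explicit communication.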
Elements of Distributed Computing
• Definition: A collection of independent computers that appears to its users as a single coherent system.
• Communication Types:
o Different ways computers interact within a distributed system.
• Software Architectures:
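o Batch Sequential: An ordered sequence of separate programs; each stage runs to completion before the next one starts.
o Pipe-Filter Style: Processing steps (filters) connected by data streams (pipes); a filter can start working as soon as data arrives.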
§ Example: cat filename | grep pattern | wc (Command pipeline)
o Repository Architectural Model (Blackboard Systems):
§ Knowledge sources update a central knowledge base.
§ Control system manages rule-based execution.
o Virtual Machine Architectures:
§ Rule-Based Style: The abstract execution environment is an inference engine; programs are expressed as rules or predicates.
§ Interpreter Style: Uses an interpretation engine with internal memory
and pseudo-code.
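The pipe-filter command pipeline shown above can be mirrored with Python generators, where each generator is a filter and the chained calls are the pipes. This is a rough sketch: the function names and the sample lines are assumptions, and `count` only counts lines (closer to `wc -l` than to plain `wc`).

```python
import re


def source(lines):
    """Filter 1 (like `cat`): emit lines one at a time."""
    for line in lines:
        yield line


def grep(pattern, lines):
    """Filter 2 (like `grep`): pass through only matching lines."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def count(lines):
    """Filter 3 (like `wc -l`): consume the stream and count lines."""
    return sum(1 for _ in lines)


sample = ["error: disk full", "ok", "error: timeout", "ok"]
matches = count(grep(r"error", source(sample)))
print(matches)
```

Because generators are lazy, each line flows through all three stages before the next line is read, which is the behavior that separates pipe-filter from batch sequential processing.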
System Architectures
• Client/Server Models:
o Thin Client Model: Minimal processing on client, heavy processing on server.
o Fat Client Model: Client handles presentation and most data processing; server mainly manages access to the data.
• Multi-Tier Architectures:
o Two-Tier Architecture: Direct communication between client and server.
o Three-Tier & N-Tier Architectures: Intermediate layers for processing and
database management.
• Peer-to-Peer (P2P) Systems: No central server, direct communication between nodes.
Models for Inter-Process Communication (IPC)
• Shared Memory Model: Processes communicate via shared memory space.
• Message Passing Model:
o Point-to-Point Communication: Direct messages between processes.
o Publish-Subscribe Model: Processes subscribe to topics and receive messages.
o Remote Procedure Call (RPC): A process invokes a function on a remote
machine.
• Distributed Objects & Web Services: Objects and services communicating over a
network.
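The publish-subscribe model listed above can be sketched with an in-process broker. This is a minimal illustration only: `Broker`, the topic names, and the messages are hypothetical, and a real system would deliver messages across process or machine boundaries.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-process message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # Register interest in a topic.
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of this topic and
        # report how many received it.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("temperature", inbox_a.append)
broker.subscribe("temperature", inbox_b.append)
broker.subscribe("humidity", inbox_b.append)

delivered = broker.publish("temperature", "21C")
broker.publish("humidity", "40%")
print(inbox_a, inbox_b, delivered)
```

The publisher never names its receivers. That decoupling is the main difference from point-to-point communication, where the sender addresses one specific process.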
Communication Technology for Distributed Computing
• Remote Procedure Call (RPC): A process can request services from another computer.
• Service-Oriented Computing (SOC): A paradigm for designing and integrating
distributed applications.
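A remote procedure call can be tried out with Python's standard `xmlrpc` modules. In this sketch the server and client live in one process and talk over the loopback address, purely for illustration; in practice they would run on different machines. The `add` function is a hypothetical service.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical service offered by the server."""
    return a + b


# Port 0 asks the operating system for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]

# Serve requests in the background so the client can run here too.
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

# The client calls add() as if it were local; arguments and result
# travel over HTTP as XML.
with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
    rpc_result = proxy.add(2, 3)

server.shutdown()
server.server_close()
print(rpc_result)
```

The call site `proxy.add(2, 3)` looks like an ordinary function call, which is the point of RPC: the network exchange is hidden behind a procedure interface.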
What is a Service?
• Definition: A software component offering reusable, coherent functionalities.
• Four Major Characteristics:
1. Explicit Boundaries: Clear interfaces for interaction.
2. Autonomy: Services operate independently.
3. Schema & Contracts (XML, SOAP): Services share structured data, not code.
4. Policy-Based Compatibility: Services interact based on defined policies.
Virtualization
• Concept that allows multiple virtual instances to run on a single physical machine.
• Key in cloud computing for resource efficiency and flexibility.
Why Do We Need Virtualization?
• Increased Performance & Computing Capacity: Modern machines have enough spare capacity to host several virtual machines.
• Underutilized Hardware & Software Resources: Virtualization puts otherwise idle capacity to use.
• Lack of Space: Helps businesses, especially enterprises, optimize physical space.
• Greening Initiatives: Reduces power consumption and enhances sustainability.
• Rising Administrative Costs: Lowers IT infrastructure and maintenance costs.
Characteristics of Virtualized Environments
• Increased Security: Isolation between virtual machines enhances security.
• Managed Execution: Controlled and optimized performance.
• Portability: Virtual images can be transferred across systems easily.
Machine Reference Model
• Different Execution Environments:
o OS Developer (System ISA)
o Application Developer (User ISA)
• Application Binary Interface (ABI):
o Defines low-level data types and executable formats.
• High-Level Abstraction:
o Nonprivileged Mode: Regular application execution.
o Privileged Mode: System-level control.
• Sensitive Instructions:
o Behavior-Sensitive: Expose privileged state, for example instructions that operate on I/O.
o Control-Sensitive: Modify privileged state, for example instructions that alter CPU registers.
Security Rings & Privilege Modes
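• User Mode (Ring 3): Least privileged; regular application execution.
• Supervisor Mode (Ring 0): Most privileged; OS kernel and system control functions.
• Hypervisor Mode: With hardware-assisted virtualization, the hypervisor runs above Ring 0 (often called Ring -1) and controls multiple VMs.
• Managing Traps:
o The original x86 ISA has 17 sensitive instructions that can run in user mode without trapping, so the hypervisor must handle them another way (for example, binary translation or paravirtualization) to control guest OS behavior.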
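The trap problem can be sketched as a set comparison: an instruction that is sensitive but not privileged does not trap when a guest runs it in user mode, so the hypervisor never sees it. The instruction sets below are a small illustrative subset of x86, not a complete list, and the function names are assumptions.

```python
# Illustrative subset of x86 instructions, not a complete list.
PRIVILEGED = {"LGDT", "LIDT", "HLT"}          # trap when run in user mode
SENSITIVE = {"LGDT", "LIDT", "SGDT", "POPF"}  # read or change privileged state


def non_trapping_sensitive(sensitive, privileged):
    """Sensitive instructions the hypervisor cannot catch with a trap."""
    return sorted(sensitive - privileged)


def is_trap_virtualizable(sensitive, privileged):
    """True when every sensitive instruction is also privileged."""
    return sensitive <= privileged


problem = non_trapping_sensitive(SENSITIVE, PRIVILEGED)
print(problem, is_trap_virtualizable(SENSITIVE, PRIVILEGED))
```

Here `SGDT` and `POPF` end up in `problem`: they touch privileged state yet execute silently in user mode, which is why classic x86 needed extra techniques beyond trapping.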
Hardware-Level Virtualization (System Virtualization)
• Hypervisor (Virtual Machine Monitor - VMM):
o Type I (Bare-Metal): Runs directly on hardware, no host OS required.
o Type II (Hosted): Runs on top of an existing OS.
• Virtual Machine Manager (VMM) Components:
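o Dispatcher: Entry point of the monitor; routes each instruction issued by a VM to the allocator or the interpreter.
o Allocator: Decides which system resources are provided to each VM.
o Interpreter: Routines that run when a VM issues a privileged instruction; the instruction traps and the matching routine emulates it.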
• Efficient VMM Criteria (Popek & Goldberg's Properties):
o Equivalence: Guest OS should behave the same as on physical hardware.
o Resource Control: Full control over virtualized resources.
o Efficiency: Most instructions should execute without intervention.
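The three VMM components can be sketched as cooperating classes. This is a toy model under loud assumptions: `ALLOC_MEM` is a hypothetical resource request rather than a real instruction, memory is the only resource, and the interpreter emulates just the interrupt flag (`CLI`/`STI`) per VM.

```python
class Allocator:
    """Decides which system resources each VM receives."""

    def __init__(self, total_memory_mb):
        self.free_mb = total_memory_mb
        self.assigned = {}

    def allocate(self, vm, amount_mb):
        if amount_mb > self.free_mb:
            raise MemoryError(f"only {self.free_mb} MB left")
        self.free_mb -= amount_mb
        self.assigned[vm] = self.assigned.get(vm, 0) + amount_mb
        return self.assigned[vm]


class Interpreter:
    """Emulates privileged instructions against per-VM virtual state."""

    def __init__(self):
        self.interrupts_enabled = {}

    def execute(self, vm, instruction):
        if instruction == "CLI":      # disable interrupts for this VM only
            self.interrupts_enabled[vm] = False
        elif instruction == "STI":    # enable interrupts for this VM only
            self.interrupts_enabled[vm] = True
        else:
            raise ValueError(f"unsupported instruction: {instruction}")
        return self.interrupts_enabled[vm]


class Dispatcher:
    """Entry point: routes each trapped instruction to a module."""

    def __init__(self, allocator, interpreter):
        self.allocator = allocator
        self.interpreter = interpreter

    def handle_trap(self, vm, instruction, operand=None):
        if instruction == "ALLOC_MEM":  # hypothetical resource request
            return self.allocator.allocate(vm, operand)
        return self.interpreter.execute(vm, instruction)


vmm = Dispatcher(Allocator(total_memory_mb=1024), Interpreter())
granted = vmm.handle_trap("vm1", "ALLOC_MEM", 256)
vm1_interrupts = vmm.handle_trap("vm1", "CLI")
vm2_interrupts = vmm.handle_trap("vm2", "STI")
print(granted, vm1_interrupts, vm2_interrupts)
```

Disabling interrupts in `vm1` leaves `vm2` untouched, which reflects the resource control property: the guest changes only its virtual state, while the VMM keeps control of the real hardware.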
Hardware Virtualization Techniques
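• Full Virtualization: Runs an unmodified OS on a VM that completely emulates the underlying hardware.
• Paravirtualization: Exposes a slightly modified hardware interface, which allows a thinner VMM but requires the guest OS to be modified.
• Partial Virtualization: Only part of the hardware is emulated, so a guest OS cannot always run in complete isolation.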
Operating System-Level Virtualization
• Creates separate execution environments for applications that run concurrently on one OS.
• No VMM or Hypervisor Required.
• Single OS Kernel with Multiple User Space Instances:
o Each instance has:
§ Isolated file system.
§ Separate IP configurations.
§ Access control.
• Direct OS System Calls: No emulation, so little or no extra overhead.
Programming Language-Level Virtualization
• Uses Bytecode for Execution.
• Ensures Platform/OS Portability.
• Implemented by process virtual machines such as the JVM and the .NET CLR.
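Bytecode execution can be sketched with a tiny stack machine. The three opcodes and their numeric encoding are invented for this illustration and do not correspond to JVM or CLR bytecode; the point is only that the same `program` list runs unchanged wherever the `run` function exists.

```python
# Hypothetical opcodes for a toy stack machine.
PUSH, ADD, MUL = 0, 1, 2


def run(bytecode):
    """Interpret a flat list of opcodes and operands."""
    stack = []
    pc = 0  # program counter
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH:
            stack.append(bytecode[pc + 1])  # operand follows the opcode
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()


# (2 + 3) * 4 compiled by hand into bytecode.
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL]
result = run(program)
print(result)
```

Portability comes from porting the interpreter once per platform instead of recompiling every program.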
Virtualization & Cloud Computing
• Key Use Cases:
o Mainly computing and storage virtualization; network virtualization plays a smaller, complementary role.
o IaaS (Infrastructure as a Service): Hardware Virtualization.
o PaaS (Platform as a Service): Programming Language Virtualization.
• Important Concepts:
o Server Consolidation: Reducing physical servers through VMs.
o Virtual Machine Migration: Moving VMs between hosts.
o Live Migration: Moving a running VM between hosts with little or no perceptible downtime.
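Server consolidation can be viewed as a packing problem: place VMs on as few hosts as possible without exceeding any host's capacity. The sketch below uses the first-fit decreasing heuristic, which is one simple choice among many and is not guaranteed to be optimal. Loads are made-up percentages of one host's CPU, and `consolidate` is a hypothetical helper.

```python
def consolidate(vm_loads, host_capacity):
    """Pack VM loads onto hosts using first-fit decreasing."""
    hosts = []  # each host is a list of the VM loads placed on it
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"VM with load {load} fits on no host")
        for host in hosts:
            if sum(host) + load <= host_capacity:
                host.append(load)   # first host with enough room
                break
        else:
            hosts.append([load])    # no room anywhere: power on a new host
    return hosts


# Six VMs that previously occupied six separate physical servers.
placement = consolidate([30, 20, 50, 10, 40, 25], host_capacity=100)
print(len(placement), placement)
```

With these numbers the six VMs fit on two hosts. Migration is what makes such a plan actionable, since VMs have to be moved to reach the new placement.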
Advantages & Disadvantages
• Advantages:
o Sandboxed execution environment helps contain security breaches.
o Resource partitioning allows fine-tuned allocation.
• Disadvantages:
o Performance Overhead: Managing virtual processors and privileged instructions.
o Memory Management Challenges: Paging and console functions impact
performance.
Examples of Virtualization Platforms
• Xen (Paravirtualization)
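o Domain 0: Privileged VM loaded first at boot; hosts the management interface used to control the hypervisor and create guest domains.
o Domain U: Unprivileged guest domains; the guest OS runs in Ring 1, below the hypervisor in Ring 0.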
o Requires OS Modification: Not all OSes are compatible.
• VMware (Full Virtualization)
o Type I: Server Virtualization.
o Type II: Desktop Virtualization.
o Binary Translation & Direct Execution:
§ Direct execution for non-privileged tasks.
§ Translates sensitive instructions.
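The split between direct execution and translation can be sketched as a rewrite pass over a block of guest instructions. This is a heavily simplified illustration, not VMware's actual translator: instructions are plain strings, `SENSITIVE` is a small illustrative subset, and the `vmm_emulate_*` targets are hypothetical routine names.

```python
# Illustrative subset of sensitive x86 instructions.
SENSITIVE = {"POPF", "SGDT"}


def translate(block):
    """Rewrite one block of guest instructions.

    Sensitive instructions become calls into (hypothetical) VMM
    emulation routines; everything else is left untouched so it can
    execute directly on the CPU.
    """
    translated = []
    for instruction in block:
        if instruction in SENSITIVE:
            translated.append(f"CALL vmm_emulate_{instruction.lower()}")
        else:
            translated.append(instruction)
    return translated


guest_block = ["ADD", "SUB", "POPF", "XOR"]
safe_block = translate(guest_block)
print(safe_block)
```

Only one instruction in four is rewritten here, which reflects the design goal: most guest code runs at native speed and the translator pays a cost only for the sensitive minority.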
Other Types of Virtualization
• Storage Virtualization: Abstracts physical storage.
• Network Virtualization: Virtualized networking infrastructure.
• Desktop Virtualization: Runs a full OS instance remotely.
• Application Server Virtualization: Presents a group of application servers offering the same services as a single virtual application server, typically through load balancing.