Distributed System Chapter-1
Chapter One
[Figure: weighted graph connecting the airports BOS, PVD, ORD, JFK, SFO, BWI, DFW, LAX, and MIA]
10/16/2024 By Lewyehu Y.
1. Introduction and Definition
Before 1980
Computers were large and expensive
very slow (a few thousand instructions per second)
not connected to one another
All systems were centralized systems
After the mid-1980s: two major developments
cheap and powerful microprocessor-based computers appeared
high-speed computer networks (LANs, WANs) appeared
A distributed system is a collection of independent
computers that appears to its users as a single coherent
system (Tanenbaum & Van Steen).
Other Definitions
◊ A distributed system is a system designed to support the
development of applications and services which can exploit a
physical architecture consisting of multiple, autonomous
processing elements that do not share primary memory but
cooperate by sending asynchronous messages over a
communication network (Blair & Stefani).
Organization of a Distributed System
To support heterogeneous computers and networks and to
provide a single-system view, a distributed system is often
organized by means of a layer of software called middleware
that extends over multiple machines.
Role of Middleware (MW)
Middleware allows independent computers to work together
closely.
Middleware Examples
All of the previous examples support communication across a
network:
Advantages of Distributed Systems
Economics: a collection of microprocessors offers a better
price/performance ratio than mainframes
Resource and Data Sharing
• printers, databases, multimedia servers, ...
Scalability, Extensibility
• the system grows with demand (e.g., extra servers)
Performance
• huge power (CPU, memory, ...) available
Inherent distribution, communication
• organizational distribution, e-mail, video
Disadvantages of Distributed Systems
Security: Easy access also applies to secret data
Privacy: unwanted communication such as spam
Partial failure: we often do not know where the error is
Software: difficult to develop software for distributed
systems
Characteristics of Distributed Systems
Users and applications can interact with a distributed system in a
consistent and uniform way, regardless of location
Concurrency
Resource sharing (Making Resources Accessible)
The main goal of a distributed system is to make it easy for the users
(and applications) to access remote resources, and to share them in a
controlled and efficient way.
There are many reasons for wanting to share resources. One obvious
reason is that of economics.
Connecting users and resources also makes it easier to
collaborate and exchange information, as is clearly illustrated by
the success of the Internet with its simple protocols for
exchanging files, mail, documents, audio, and video.
Distribution Transparency
Transparency means that any distributed system should hide its
distributed nature from its users, appearing and functioning as a
normal centralised system.
A distributed system that is able to present itself to users and
applications as if it were only a single computer system is said to be
transparent.
Types of Transparency
There are many types of transparency:
Access transparency: resources are accessed in a single, uniform
way (regardless of whether they are local or remote). Hide
differences in data representation and how resources are accessed.
Location transparency: users should not be aware of where a
resource is physically located. That is, hide where resources are
located.
Concurrency transparency: multiple users may compete for and
share a single resource; this should not be apparent to them. Hide
that a resource may be shared by several competing users.
Replication transparency: even if a resource is replicated, it should
appear to the user as a single resource (without knowledge of the
replicas). Hide that resources are replicated.
Failure transparency: always try to hide any faults. Hide the failure
and recovery of a resource.
Migration transparency: hide that a resource may move to another
location.
Relocation transparency: hide that a resource may be moved to
another location while in use.
…and more: performance, scaling, persistence, security.
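Access and location transparency can be illustrated with a small sketch. Everything here is hypothetical (the class names, the paths, and the host name are illustrative assumptions), and no real network is involved: the "remote" resource is only a stand-in. The point is that the caller uses one uniform call and never sees where the resource lives.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Uniform interface: callers never see where the data lives."""

    @abstractmethod
    def read(self) -> str: ...


class LocalResource(Resource):
    def __init__(self, content):
        self._content = content

    def read(self) -> str:
        return self._content


class RemoteResource(Resource):
    """Stand-in for a resource on another machine (no real network here)."""

    def __init__(self, host, content):
        self._host = host          # physical location, hidden from callers
        self._content = content

    def read(self) -> str:
        # A real system would send a request to self._host at this point.
        return self._content


class NameService:
    """Maps logical names to resources, wherever they are located."""

    def __init__(self):
        self._table = {}

    def bind(self, name, resource):
        self._table[name] = resource

    def lookup(self, name) -> Resource:
        return self._table[name]


names = NameService()
names.bind("/docs/report", LocalResource("quarterly report"))
names.bind("/docs/manual", RemoteResource("server-7.example", "user manual"))

# Same call for both resources; the location never appears in caller code.
contents = [names.lookup(n).read() for n in ("/docs/report", "/docs/manual")]
```

Moving "/docs/manual" to another host would only change the binding in the name service, not the code that reads it.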
Openness
Openness is the property of a distributed system such that each
subsystem is continually open to interaction with other systems.
Another goal of an open distributed system is that it should be
flexible and extensible; easy to configure the system out of
different components; easy to add new components, replace
existing ones.
An Open Distributed System is a system that offers services
according to standard rules that describe the syntax and semantics
of those services; e.g., protocols in networks
In distributed systems, such services are often specified through
interfaces, which are often described using an Interface Definition
Language (IDL).
IDL definitions specify only syntax: the names of the functions, types of
parameters, return values, possible exceptions, ...
Interface Definition/Description Languages (IDL): used to
describe the interfaces between software components, usually
in a distributed system.
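Python is not an IDL, but its typing.Protocol gives a rough feel for what an interface definition captures: names, parameter types, return types, and possible exceptions, with no implementation. The service and its methods below are hypothetical and chosen only for illustration.

```python
from typing import Protocol


class InsufficientFunds(Exception):
    """Exception that the interface says callers may see."""


class AccountService(Protocol):
    """Interface only: names, parameter types, return types, exceptions."""

    def balance(self, account_id: str) -> int: ...

    def withdraw(self, account_id: str, amount: int) -> int:
        """Return the new balance; may raise InsufficientFunds."""
        ...


class InMemoryAccounts:
    """One possible implementation; clients depend only on AccountService."""

    def __init__(self):
        self._balances = {"acc-1": 100}

    def balance(self, account_id: str) -> int:
        return self._balances[account_id]

    def withdraw(self, account_id: str, amount: int) -> int:
        if amount > self._balances[account_id]:
            raise InsufficientFunds(account_id)
        self._balances[account_id] -= amount
        return self._balances[account_id]


def pay(service: AccountService, account_id: str, amount: int) -> int:
    # Written against the interface, so any conforming implementation works.
    return service.withdraw(account_id, amount)
```

As with a real IDL, the interface says nothing about semantics such as what happens when two withdrawals arrive at the same time; that has to be specified separately.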
Scalability
• A system is described as scalable if it remains effective, without
disproportionate performance loss, when there is an increase in the
number of resources, the number of users, or the amount of input
data.
A distributed system should be scalable in three dimensions:
size: it should be easy to add more users and resources to the system
geography: users and resources may lie far apart
administration: it should be easy to manage even if it spans many
administrative organizations
Scalability Problems
When a system needs to scale, very different types of problems need to be solved.
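If more users or resources need to be supported, we are often confronted with the
limitations of centralized services, data, and algorithms.
For example, many services are centralized in the sense that they are
implemented by a single server running on a specific machine.
The problem with this scheme is obvious: the server can become a bottleneck as
the number of users and applications grows.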
Scaling techniques
How to solve scaling problems: three techniques
hiding communication latencies (important for geographical scalability),
distribution, and replication
a. Hide Communication Latencies
Try to avoid waiting for responses to remote service requests,
i.e., construct requesting applications that use only asynchronous
communication instead of synchronous communication.
Have a separate handler for incoming responses; when a
reply arrives, the application is interrupted and the handler runs.
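A minimal sketch of asynchronous requests with a separate reply handler, using Python's asyncio. The function remote_request is hypothetical, and asyncio.sleep only stands in for network latency; no real remote service is contacted.

```python
import asyncio
import time

replies = []


async def remote_request(name, delay):
    """Hypothetical remote call; asyncio.sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return f"reply to {name}"


def handle_reply(task):
    """Separate handler, invoked when a reply arrives."""
    replies.append(task.result())


async def main():
    start = time.perf_counter()
    tasks = []
    for name in ("a", "b", "c"):
        task = asyncio.create_task(remote_request(name, 0.1))
        task.add_done_callback(handle_reply)   # do not block on the reply
        tasks.append(task)
    # The application is free to do other local work here.
    await asyncio.gather(*tasks)
    return time.perf_counter() - start


elapsed = asyncio.run(main())
```

Because the three requests overlap, the total waiting time is close to one round trip rather than the sum of three, which is the effect this technique aims for.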
(a) a server checking the correctness of field entries
(b) a client doing the job
b. Distribution
Another important scaling technique is distribution.
c. Replication
replicate components across a distributed system to increase
availability and for load balancing, leading to better performance
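The two benefits named above can be sketched with a toy replicated store. This is an illustrative assumption, not a real replication protocol: the replicas are in-memory objects, only reads are modelled, and keeping copies consistent after writes is ignored entirely.

```python
import itertools


class Replica:
    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.up = True

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class ReplicatedStore:
    """Reads rotate over the replicas; a failed replica is skipped."""

    def __init__(self, replicas):
        self.replicas = replicas
        self._next = itertools.cycle(range(len(replicas)))

    def get(self, key):
        for _ in range(len(self.replicas)):
            replica = self.replicas[next(self._next)]
            try:
                return replica.name, replica.get(key)
            except ConnectionError:
                continue      # availability: try the next copy
        raise ConnectionError("all replicas are down")


data = {"x": 1}
store = ReplicatedStore([Replica(f"r{i}", data) for i in range(3)])
served_by = [store.get("x")[0] for _ in range(3)]   # load spread over copies
store.replicas[0].up = False                        # one copy fails
after_failure = store.get("x")                      # still answered
```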
Concurrency
There is a possibility that several clients will attempt to access a
shared resource at the same time.
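A small sketch of why concurrent access needs coordination, using threads within one process as stand-ins for clients. The Counter class and run_clients helper are hypothetical; in a real distributed system the same idea appears as locks or transactions on the server that owns the resource.

```python
import threading


class Counter:
    """Shared resource; the lock makes each update indivisible."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:          # only one client at a time in here
            self.value += 1


def run_clients(counter, clients=8, updates=10_000):
    """Start several clients that all update the same counter."""

    def client():
        for _ in range(updates):
            counter.increment()

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, the final value equals clients times updates; without it, the read-modify-write steps of different clients could interleave and updates could be lost.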
Pitfalls when Developing Distributed Systems
(Assumptions which are usually wrong for DS)
Peter Deutsch, then at Sun Microsystems, formulated these mistakes
as the following false assumptions that everyone makes when
developing a distributed application for the first time:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
1.3 TYPES OF DISTRIBUTED SYSTEMS
a. Cluster Computing
b. Grid Computing
c. Cloud Computing
a. Cluster Computing
Cluster computing involves a group of interconnected computers (or
nodes) that work together as a single system to perform tasks.
These nodes are usually located close to each other, often within the
same physical location.
A characteristic feature of cluster computing is its
homogeneity. In most cases, the computers in a cluster are
largely the same.
They all have the same operating system, and are all connected
through the same network.
Cluster Types & Uses
High Performance Clusters (HPC): run large parallel programs such as scientific,
military, and engineering applications; e.g., weather modelling, web server clusters.
Load Balancing Clusters (LBC):
Load Balancing Clusters are designed to distribute incoming requests across multiple
servers to ensure no single server becomes overwhelmed.
A front-end processor distributes incoming requests across a server farm (e.g., at banks
or popular commercial and financial web sites).
Example: Banks use load balancers to direct customer requests across various servers,
maintaining performance during busy periods.
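A front end of this kind can be sketched in a few lines. The least-loaded policy and the server names are illustrative assumptions; real load balancers offer several policies (round robin, least connections, and others) and also track server health.

```python
class LoadBalancer:
    """Front end that sends each request to the least-loaded server."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}   # open requests per server

    def dispatch(self):
        """Pick the server with the fewest open requests."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def finish(self, server):
        """Record that a request on this server has completed."""
        self.active[server] -= 1


lb = LoadBalancer(["s1", "s2", "s3"])
first_round = [lb.dispatch() for _ in range(3)]   # one request per server
lb.finish("s2")                                   # s2 becomes the least loaded
next_server = lb.dispatch()
```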
High Availability Clusters (HA): provide redundancy through backup systems; may be more
fault tolerant than large-scale mainframe systems.
High Availability Clusters are designed to ensure that systems remain operational and
available, even in the event of hardware or software failures.
Example: telecommunications
b. Grid Computing
Grid computing is a distributed computing model that connects
multiple heterogeneous computing resources across different
locations to work together on complex problems.
• The collective layer: It deals with handling access to multiple
resources and typically consists of services for resource discovery,
allocation and scheduling of tasks onto multiple resources, data
replication, and so on.
c. Cloud Computing
Cloud computing is the delivery of computing services, including
servers, storage, databases, networking, software, and analytics,
over the internet ("the cloud").
Service Models:
Software as a Service (SaaS): Delivers software applications
over the internet on a subscription basis, eliminating the need
for installation and maintenance.
2. Distributed Information Systems
Distributed information systems manage and process data across
multiple nodes, enabling data sharing and collaboration among
users in different locations.
2.1. Transaction Processing Systems
Transaction processing systems are specialized software systems
designed to handle transactions in a consistent, reliable, and
efficient manner.
They are critical in environments where a high volume of
transactions needs to be processed, such as banking,
e-commerce, and inventory management.
Highly Structured Approach: TPS typically use a client-server
model where the client (e.g., ATM) sends requests to a server
(e.g., bank database) to perform operations such as withdrawal,
deposit, or balance inquiry.
In practice, operations on a database are usually carried out in
the form of transactions.
The Transaction Model
the model for transactions comes from the world of business
a supplier and a retailer negotiate on
price
delivery date
Quality etc.
until the deal is concluded they can continue negotiating
or one of them can terminate
but once they have reached an agreement they are bound
by law to carry out their part of the deal
transactions between processes are similar to this
scenario
e.g., assume the following banking operation
withdraw an amount x from account 1
deposit the amount x to account 2
what happens if there is a problem after the first
activity is carried out?
group the two operations into one transaction; either
both are carried out or neither
we need a way to roll back when a transaction is not
completed
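The withdraw/deposit example can be sketched with Python's built-in sqlite3 module on an in-memory database. The table layout, account ids, and the transfer helper are assumptions made for illustration. The connection's context manager commits if the block succeeds and rolls back if an exception escapes, so either both updates happen or neither does.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction."""
    try:
        with conn:   # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

If the deposit step fails after the withdrawal has already been applied (for example, because the destination account does not exist), the rollback restores the source balance, which is exactly the problem raised above.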
(a) transaction to reserve three flights commits
(b) transaction aborts when third flight is unavailable
properties of transactions, often referred to as ACID
1. Atomic: all statements in the transaction either complete
successfully or are all rolled back.
- A transaction either happens completely or not at all;
intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system invariants.
- The transaction leaves the system in a logically consistent state; e.g., in an
internal transfer in a bank, the amount of money in the bank must be the
same as it was before the transfer (the law of conservation of
money)
3. Isolated or Serializable: concurrent transactions do not
interfere with each other; if two or more transactions are
running at the same time, the final result must look as though
all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are
permanent
Classification of Transactions
a transaction could be flat, nested or distributed
Flat Transaction
consists of a series of operations that satisfy the ACID
properties
simple and widely used but with some limitations
some transactions may take too much time
Nested Transaction
constructed from a number of sub-transactions; it is logically
decomposed into a hierarchy of sub-transactions
sub-transactions may run in parallel, on different machines, to gain performance
each may also execute one or more sub-transactions
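SQLite savepoints give a rough, single-machine approximation of nested transactions. This is a sketch under clear assumptions: the sub-transactions run one after another on one database instead of in parallel on different machines, and the table, the flight legs, and the book_trip helper are hypothetical. It shows two properties: a sub-transaction can be undone without aborting its parent, and aborting the parent also undoes sub-transactions that had already completed.

```python
import sqlite3

# isolation_level=None: transactions are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE bookings (leg TEXT)")


def book_trip(conn, legs, unavailable=(), abort_parent=False):
    """Parent transaction; each leg is a sub-transaction (a savepoint)."""
    conn.execute("BEGIN")
    for i, leg in enumerate(legs):
        conn.execute(f"SAVEPOINT leg_{i}")
        conn.execute("INSERT INTO bookings VALUES (?)", (leg,))
        if leg in unavailable:
            # Undo only this sub-transaction; the parent keeps going.
            conn.execute(f"ROLLBACK TO leg_{i}")
        conn.execute(f"RELEASE leg_{i}")
    # The parent decides the fate of everything done inside it.
    conn.execute("ROLLBACK" if abort_parent else "COMMIT")
    rows = conn.execute("SELECT leg FROM bookings ORDER BY rowid")
    return [row[0] for row in rows]
```

Whether the parent should continue or abort when one sub-transaction fails is an application decision; the flag abort_parent simply makes both outcomes visible.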
Nested transactions are important in distributed systems because
they provide a natural way of distributing a transaction across multiple
machines.
They follow a logical division of the work of the original transaction and
are managed by a dedicated software component.
This component was called a transaction processing monitor, or TP monitor
for short.
Its main task was to allow an application to access multiple
servers/databases by offering it a transactional programming model, as
shown in Fig. 1-10.
Distributed Transaction
a flat transaction that operates on data that are distributed
across multiple machines
2.2. Enterprise Application Integration
3. Distributed Pervasive Systems
A pervasive system, also known as ubiquitous computing, refers to a
computing environment where technology is seamlessly integrated into the
background of everyday life.
The goal of pervasive systems is to enhance the user's experience by
making technology available at any time and in any place, often without
the user being explicitly aware of it.
• The first two types of systems are characterized by their stability: nodes
and network connections are more or less fixed. Pervasive systems, in
contrast, are likely to incorporate small, battery-powered, mobile devices.
Home systems
Electronic health care systems – patient monitoring
Sensor networks – data collection, surveillance
Hence, distributed systems are everywhere: the Internet, intranets,
wireless networks.
In order to design a good distributed system, there are six key
design goals:
Concurrency
Scalability
Openness
Fault Tolerance
Privacy & Authentication
Transparency