Table of Contents

Chapter 1 : Introduction to Distributed Systems
1.1  Introduction
1.2  Examples of Distributed Systems
     1.2.1  Internet
     1.2.2  Intranet
     1.2.3  Mobile and Ubiquitous Computing
1.3  Resource Sharing and the Web
     1.3.1  World Wide Web (WWW)
1.4  Issues and Goals
     1.4.1  Heterogeneity
     1.4.2  Making Resources Accessible
     1.4.3  Distribution Transparency
     1.4.4  Scalability
     1.4.5  Openness
1.5  Types of Distributed Systems
     1.5.1  Distributed Computing Systems
            1.5.1(A)  Cluster Computing Systems
            1.5.1(B)  Grid Computing Systems
     1.5.2  Distributed Information Systems
            1.5.2(A)  Transaction Processing Systems
     1.5.3  Enterprise Application Integration
     1.5.4  Distributed Pervasive Systems
1.6  Hardware Concepts
     1.6.1  Multiprocessors
     1.6.2  Homogeneous Multicomputer Systems
     1.6.3  Heterogeneous Multicomputer Systems
1.7  Software Concepts
     1.7.1  Distributed Operating Systems
     1.7.2  Network Operating Systems
1.8  Middleware
     1.8.1  Positioning Middleware
1.9  Models of Middleware
1.10 Services Offered by Middleware

Chapter 2 : Communication
2.1  Interprocess Communication (IPC)
     2.1.1  Types of Communication
2.2  Remote Procedure Call (RPC)
     2.2.1  RPC Operation
     2.2.2  Implementation Issues
     2.2.3  Asynchronous RPC
2.3  Distributed Objects (RMI : Remote Method Invocation)
2.4  Message-Oriented Communication
     2.4.1  Persistence and Synchronicity in Communication
     2.4.2  Combination of Communication Types
     2.4.3  Message-Oriented Transient Communication
     2.4.4  Message-Oriented Persistent Communication
2.5  Stream-Oriented Communication
     2.5.1  Continuous Media Support

Chapter 3 : Synchronization
     Requirements of Mutual Exclusion
     3.5.2  Ricart-Agrawala's Algorithm
     3.5.3  Centralized Algorithm
     3.5.4  Maekawa's Algorithm
3.6  Token-Based Algorithms
     3.6.1  Suzuki-Kasami's Broadcast Algorithm
     3.6.2  Raymond's Tree-Based Algorithm
3.7  Comparative Performance Analysis of Algorithms
     3.7.1  Response Time
     3.7.2  Synchronization Delay
     3.7.3  Message Traffic
3.8  Deadlock
     3.8.1  Introduction
     3.8.2  Resource Allocation Graphs
     3.8.3  Deadlock Prevention and Avoidance in Distributed Systems
     3.8.4  Deadlock Detection
            3.8.4(A)  Centralized Approach
            3.8.4(B)  Fully Distributed Approach

Chapter 4 : Resource and Process Management
Desirable Features of global Scheduling algorithm, Task assignment approach, Load balancing approach, Load sharing approach, Introduction to process management, Process migration, Code Migration.
4.1  Introduction
4.2  Desirable Features of Global Scheduling Algorithm
4.3  Task Assignment Approach
4.4  Load Balancing Approach
     4.4.1  Classification of Load Balancing Algorithms
     4.4.2  Issues in Designing Load-Balancing Algorithms
4.5  Load-Sharing Approach
     4.5.1  Issues in Designing Load-Sharing Algorithms
4.6  Introduction to Process Management
4.7  Process Migration
     4.7.1  Desirable Features of Good Process Migration Mechanism
     4.7.2  Process Migration Mechanisms
     4.7.3  Process Migration in Heterogeneous Systems
     4.7.4  Advantages of Process Migration
4.8  Code Migration
     4.8.1  Approaches to Code Migration
     4.8.2  Migration and Local Resources
     4.8.3  Migration in Heterogeneous Systems

Chapter 5 : Replication, Consistency and Fault Tolerance
Consistency Models, Replica Management, Fault Tolerance : Introduction, Process resilience, Recovery.
5.1  Introduction to Distributed Shared Memory
     5.1.1  Architecture of Distributed Shared Memory (DSM)
     5.1.2  Design Issues of Distributed Shared Memory (DSM)
5.2  Introduction to Replication and Consistency
     5.2.1  Reasons for Replication
     5.2.2  Replication as Scaling Technique
5.3  Data-Centric Consistency Models
     5.3.1  Continuous Consistency
     5.3.2  Consistent Ordering of Operations
5.4  Client-Centric Consistency Models
     5.4.1  Eventual Consistency
     5.4.2  Monotonic Reads
     5.4.3  Monotonic Writes
     5.4.4  Read Your Writes
     5.4.5  Writes Follow Reads
5.5  Replica Management
     5.5.1  Replica-Server Placement
     5.5.2  Content Replication and Placement
     5.5.3  Content Distribution
5.6  Fault Tolerance
     5.6.1  Basic Concepts
     5.6.2  Failure Models
     5.6.3  Failure Masking by Redundancy
5.7  Process Resilience
     5.7.1  Design Issues
     5.7.2  Failure Masking and Replication
     5.7.3  Agreement in Faulty Systems
     5.7.4  Failure Detection
5.8  Recovery
     5.8.1  Introduction
     5.8.2  Stable Storage
     5.8.3  Checkpointing
     5.8.4  Message Logging
     5.8.5  Recovery-Oriented Computing

Chapter 6 : Distributed File Systems
6.1  Introduction
6.2  Desirable Features of a Good Distributed File System
6.3  File Models
     6.3.1  Unstructured and Structured Files
     6.3.2  Mutable and Immutable Files
6.4  File-Accessing Models
     6.4.1  Accessing Remote Files
     6.4.2  Unit of Data Transfer
6.5  File-Caching Schemes
     6.5.1  Cache Location
     6.5.2  Modification Propagation
     6.5.3  Cache Validation Schemes
6.6  File Replication
     6.6.1  Replication and Caching
     6.6.2  Advantages of Replication
     6.6.3  Replication Transparency
     6.6.4  Multicopy Update Problem
6.7  Network File System (NFS)
6.8  Designing Distributed Systems : Google Case Study
     6.8.1  Google Search Engine
     6.8.2  Google Applications and Services
     6.8.3  Google Infrastructure
     6.8.4  The Google File System (GFS)
" Introduction to
Distributed Systems
Issues, Goals, Types of distributed systems, Grid and Cluster computing
Characterization of Distributed System:
Models, Hardware and Software Concepts: NOS, DOS.
Middleware : Models of Middleware, Services offered by middleware, Characterization of Distributed Systems
1.1 Introduction
The development of powerful microprocessors and the invention of high-speed networks are the two major developments in computer technology. Many machines in the same organization can be connected through a local area network, and information can be transferred between machines in a very small amount of time. As a result of these developments, it became easy and practicable to organize computing systems comprising a large number of machines connected by high-speed networks. Over the last thirty years, the price of microprocessors and communication technology has constantly reduced in real terms.
Because of this, distributed computer systems appeared as a practical substitute for uniprocessor and centralized systems. Networks of computers are now present everywhere.
The Internet is composed of many networks. All these networks, separately and in combination, share the essential characteristics that make them relevant topics to study under distributed systems.
In a distributed system, components located on different computers in a network communicate and coordinate by passing messages. Examples of such systems include :
o   Internet
o   Intranet, which is a small part of the internet managed by an individual organization.
o   Mobile and ubiquitous computing.
Resources required for computation are not always available on a single computer. These resources, distributed among many machines, can be utilized for computation, and this is the main motivation behind distributed computing. Following are some of the important issues that need to be considered.
1. Concurrency
o   The different cooperating applications should run in parallel and coordinate by utilizing the resources available at different machines in the network.
2. Global Clock not Available
o   Cooperation among applications running on different machines in the network is achieved by exchanging messages. For cooperation, the time at which an application takes an action matters, but it is not easy to keep the clocks of different machines set to the same time.
o   It is difficult to synchronize the clocks of the different machines in a network. Therefore, it is necessary to take into consideration that a single global clock is not available, and different machines in a LAN, MAN or WAN may have different clock times.
3. Independent Failure of System Components
o   Any individual component failure should not affect the computation. It is obvious that any software or hardware component of the system may fail.
o   Such a failure should not stop the system from running, and the system should take appropriate action for failure detection or recovery.
Definition of Distributed System
A computer network is defined as a set of communicating devices that are connected together by communication links. These devices include computers, printers and other devices capable of sending and/or receiving information from other devices on the network. These devices are often called nodes of the network. So a computer network is an interconnected set of autonomous computers.
A distributed system is a collection of independent computers that appears to its users as a single coherent system. Users of a distributed system feel that they are working with a single system.
Following are the main characteristics of a distributed system :
o   A distributed system comprises computers with distinct architectures and data representations. These dissimilarities, and the ways all these machines communicate, are hidden from users.
o   The manner in which a distributed system is organized internally is also hidden from the users of the distributed system.
o   The interaction of users and applications with the distributed system is consistent and identical, in spite of where and when the interaction occurs.
o   A distributed system should be continuously available, in spite of failures. Failure handling should be hidden from users and applications.
o   A distributed system should be easy to expand with additional computers and facilities.
In Fig. 1.1.1, applications are running on three different machines A, B and C in a network.
Fig. 1.1.1 : Distributed system organized as middleware
1.2 Examples of Distributed Systems
1.2.1 Internet
o   Applications running on different machines communicate with each other by passing messages. This communication is possible due to the different protocols designed for this purpose.
o   The Internet is a very large distributed system. A user at any location in the network can use services such as the World Wide Web, email, file transfer etc. These services can be extended by the addition of servers and new types of services.
o   An Internet Service Provider (ISP) provides internet connections to individual users and small organizations. Local services like email and web hosting are provided by the ISP, and it also enables users to access services available at other locations on the internet.
1.2.2 Intranet
o   An intranet is a part of the internet that is separately administered and has a boundary enforcing its own security policies. An intranet may comprise many LANs which are linked by backbone connections.
o   An intranet is administered by the administrator of a single organization, and its administration scope may vary, ranging from a LAN on a single site to a set of connected LANs belonging to many branches of a company. A router connects the intranet to the internet, and users in the intranet can then use services available in the rest of the internet.
o   A firewall, which filters incoming and outgoing messages, prevents unauthorized messages from entering or leaving the intranet.
1.2.3 Mobile and Ubiquitous Computing
o   Many small and portable devices manufactured with advances in technology are now integrated into distributed systems. Examples of these devices are laptops, PDAs, mobile phones, cameras, pagers, smart watches, and devices embedded in appliances such as cars and refrigerators.
o   As these devices are portable and easily get connected, mobile computing became possible. Mobile users can easily use resources available in the network and services available on the internet.
o   Ubiquitous computing ties together the many small chip-based devices available in the physical environment of a home, office or elsewhere. In mobile computing, the user's location can be anywhere, and resources available everywhere can be used. Whereas in ubiquitous computing, users are in the same physical environment and get benefited.
1.4.2 Making Resources Accessible
o   By connecting users and resources, it becomes easier to work together and exchange information. The success of the Internet is due to its straightforward protocols for exchanging files, mail, documents, audio and video. People spread worldwide can work together by means of groupware. Users can buy and sell a variety of goods without going to a shop or even leaving home.
o   The increase in connectivity and sharing also increases security problems. Presently, systems offer fewer defenses against eavesdropping or intrusion on communication. A communication can be tracked to construct a profile of a particular user's preferences, particularly if it is done without informing the user. An allied problem with increased connectivity is unwanted communication, for example electronic junk mail, called spam. Special information filters can be used to select inward messages based on their content.
1.4.3 Distribution Transparency
The second main goal of a distributed system is to hide the actuality of the physical distribution of processes and resources across several computers. A transparent distributed system offers its feel to users as a single computer system. The following types of transparency are present in a distributed system.
o   Access Transparency : It hides the differences in data representation of different machines in the distributed system and the manner in which remote resources are accessed by the user. For example, Intel machines use the little endian format to order bytes (low-order byte first), whereas SPARC machines use the big endian format (high-order byte first). Different operating systems also have their own file naming conventions. All these dissimilarities should be hidden from users and applications.
o   Location Transparency : It hides the physical location of a resource. For example, the URL used to access a web server and a file does not give any idea about its location. Also, the name of the resource remains the same although the resource changes its location when moved between machines in the distributed system.
o   Migration Transparency : It hides the fact that resources may be moved to other locations without changing the way in which these resources are accessed. Processes or files are often migrated by the system to improve performance. All this should remain hidden from the user of the distributed system.
o   Replication Transparency : It hides the fact that many copies of a resource are placed at different locations. Often resources are replicated to achieve availability, or a copy is placed near the location of its access. All this replication should be transparent to the user.
o   Concurrency Transparency : It hides the sharing of a resource by many users of the distributed system. If one user is using a resource, the others should remain unaware of it. Many users may have stored their files on a file server. It is essential that each user does not become aware of the fact that another is using the same file server. This is called concurrency transparency.
o   Failure Transparency : It hides the failure and recovery of a resource from the user. The user does not become aware of the failure of a resource to work appropriately, nor that the system then recovers from that failure. It is not possible to achieve complete failure transparency; for example, if the network fails, the user can notice this failure.
———— re
‘* Performance Transparency : In order to improve performance system automatically takes some action and it user
remains unaware of this action taken by system. In case of overloading of processor jobs gets transfer to lightly
loaded machine. If many requests are coming from users for a particular resource then resource gets placed to
nearest server of those users.
‘© Scaling Transparency : The main goal of this transparency is to permit the system to expand in scale without
disturbing the operations and activities of existing users of distributed system.
1.4.4 Scalability
If a system adapts to increased service load, then it is said to be a scalable system. There are three ways to measure the scalability of a system :
(i)   Scalable with respect to its size
(ii)  Geographically scalable
(iii) Administratively scalable
Practically, a scalable system leads to some loss of performance as the system scales up. Following are the problems related to scalability.
(i) Scalable with respect to its size
o   In order to support more users and resources, it is necessary to deal with the limitations of centralized services, data and algorithms. If a centralized server is used to provide services to a large number of users and applications, then the server will become a bottleneck.
o   In spite of having more processing power and large storage capacity at a centralized server, the growing number of requests coming to the server and the increased communication will restrict further growth.
o   Like centralized services, centralized data is also not a good idea. Keeping a single database would certainly saturate all the incoming and outgoing communication lines.
o   In the same way, centralized algorithms are also not a good idea. If a centralized routing algorithm is used, it will collect information about the load on machines and communication lines. The algorithm processes this information to compute optimal routes and distributes them to all machines. Collecting such information and sending the processed information will also overload part of the network. Therefore, use of such centralized algorithms must be avoided. Preference should be given to decentralized algorithms.
(ii) Geographically scalable
o   Although users and resources may lie far apart geographically, the system should still allow the users to use the resources and the system.
o   In synchronous communication, a client blocks until a reply comes from the server. Distributed systems designed for local area networks (LANs) are based on synchronous communication due to the small distances between machines.
o   Therefore, it is hard to scale such systems. Care should be taken while designing interactive applications using synchronous communication in wide area systems, as machines are far apart and the time required for communication is three orders of magnitude slower compared to a LAN.
o   A further problem that gets in the way of geographical scalability is that communication in wide area networks is intrinsically unreliable, and virtually always point-to-point. On the contrary, local area networks provide highly reliable communication facilities based on broadcasting, which eases the development of distributed systems.
o   If a system has several centralized components, then geographical scalability will be restricted due to the performance and reliability problems resulting from wide-area communication. Use of centralized components will lead to waste of network resources.
(iii) Administratively scalable
o   Although the system spans many independent administrative organizations, its management should still be easy.
o   In order to scale a distributed system across multiple, independent administrative domains, a key problem that requires to be resolved is that of conflicting policies with respect to resource usage (and payment), management, and security.
o   Distributed system components residing in one domain can always be trusted by users within that same domain. In this scenario, system administration may have tested and authorized applications, and may have taken special measures to make sure that such components cannot be tampered with. Here, the users trust the system administrators. But this trust does not cross domain boundaries naturally.
o   If a distributed system crosses domain boundaries, two forms of security measures require to be taken. First, the distributed system on its own must protect against malicious attacks from the new domain. Second, the new domain on its own must protect against malicious attacks from the distributed system.
Scaling Techniques
Following are the three techniques for scaling :
1. Hiding communication latencies   2. Distribution   3. Replication
1. Hiding communication latencies
o   Hiding communication latency is applicable in the case of geographical scalability. In this case, waiting for a response from a geographically remote server can be avoided by using asynchronous communication in the implementation of the requester's application. The application can carry out other work until the response is received.
o   On the other hand, a new thread of control can be started to carry out the request. Even though the new thread blocks waiting for the reply, other threads in the process can carry on.
o   In case of interactive applications, asynchronous communication cannot be used effectively. In this case, to minimize the communication cost, server-side computation of the request can be moved to the client side. Client-side validation of a form is the best example of this approach. Instead of carrying out validation at the server, it is better to shift it to the client side.
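As a small illustration of hiding latency with a separate thread of control, the sketch below uses Python's standard threading and urllib modules (the URL and the fetch helper are invented for this example) : the slow remote request runs on its own thread while the caller continues with other work.

    import threading
    import urllib.request

    def fetch(url, result):
        # Blocks on the remote server; runs on its own thread so the
        # caller is not held up by wide-area latency.
        with urllib.request.urlopen(url) as resp:
            result["body"] = resp.read()

    result = {}
    worker = threading.Thread(target=fetch, args=("http://example.org/", result))
    worker.start()

    # ... the application carries on with other work here ...

    worker.join()   # wait only when the reply is actually needed
    print(len(result.get("body", b"")), "bytes received")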
2. Distribution
o   Distribution is an important scaling technique. In distribution, a component of the system is divided into smaller parts, and those parts are then spread across the system. A good example of distribution is the Internet Domain Name System (DNS).
o   There is a hierarchical organization of the DNS name space into a tree of domains. These domains are divided into non-overlapping zones. The names belonging to each zone are managed by a single name server.
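The following toy sketch (with invented zone data; real DNS resolution involves delegation between servers) illustrates the idea behind this partitioning : each zone manages a disjoint slice of the name space, so no single server has to hold the whole database.

    # Each zone owns the names directly under it.
    zones = {
        "org":         {"example.org": "192.0.2.10"},
        "example.org": {"www.example.org": "192.0.2.11"},
    }

    def resolve(name):
        # Try ever-shorter suffixes until some zone claims the name.
        parts = name.split(".")
        for i in range(1, len(parts)):
            zone = ".".join(parts[i:])
            if zone in zones and name in zones[zone]:
                return zones[zone][name]
        return None

    print(resolve("www.example.org"))   # 192.0.2.11, served by zone example.org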
3. Replication
o   Although problems related to scalability degrade performance, it is in general a good idea to replicate components across a distributed system. Apart from increasing availability of a component, replication also balances the load between components, which results in achieving better performance. In WAN-based systems, placing a copy close to the accessing location hides much of the communication latency problems described above.
o   Similar to replication, caching leads to making a copy of a resource near the client accessing that resource. But, contrary to replication, the caching decision is taken by the client of a resource instead of the owner of the resource. Caching is carried out on demand, whereas replication is often planned in advance.
o   The disadvantage of caching and replication is that modifying one copy makes that copy different from the others. As a result, caching and replication lead to consistency problems.
1.4.5 Openness
o   Openness is an important goal of a distributed system. An open distributed system provides services as per standard rules that describe the syntax and semantics of those services. For example, in computer networks, standard rules state the message format, its contents, and the meaning of sent and received messages. All these rules are present in protocols.
o   Similarly, in distributed systems, services are usually specified through interfaces, which are expressed in an Interface Definition Language (IDL). The definitions of interfaces written in an IDL almost always capture only the syntax of these services. They state exactly the names of the available functions together with parameter types, return values, exceptions likely to be raised, etc. The semantics of interfaces means the specification of what these interfaces can perform. Usually, these specifications are given in an informal way through natural language. (A small illustrative sketch is given at the end of this section.)
o   Processes make use of interfaces to communicate with each other. There can be different implementations of these interfaces, leading to different distributed systems that function in the same manner.
o   Proper specifications are complete and neutral. Complete specifications specify everything that is essential to make an implementation. But many interface definitions do not obey completeness, so a developer is required to put in implementation-specific details. Specifications should also not impose what an implementation should look like : they should be neutral. Completeness and neutrality are significant for interoperability and portability.
o   An open distributed system should allow configuring the system out of different components obtained from different developers. Also, it should be trouble-free to add new components or replace existing ones without disturbing those components that stay in place. This requires that the system's interfaces are published.
o   A monolithic approach should be avoided. The system should provide the mechanism and let users and applications decide the policy. For example, apart from storing documents, a browser should let the user decide which documents are stored and for how much duration.
o   The user should be able to implement his own policy in the form of a component that can be plugged into the browser. The implemented component must have an interface that the browser can recognize so that it can call procedures of that interface. The interface mechanism is also based on published standards.
o   Open distributed systems use standard communication protocols in order to access the common resources. They also consider heterogeneous environments in terms of hardware and software.
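To make the earlier point about interfaces concrete, here is a small sketch in Python (the FileService interface and its methods are invented for this example, not taken from any standard IDL) : the abstract class fixes only the syntax of the service, while any number of implementations may realize its semantics.

    from abc import ABC, abstractmethod

    class FileService(ABC):
        """Interface definition: fixes only names, parameters and results."""

        @abstractmethod
        def read(self, path: str, offset: int, count: int) -> bytes:
            """Return up to count bytes of path starting at offset."""

        @abstractmethod
        def write(self, path: str, offset: int, data: bytes) -> int:
            """Write data at offset; return the number of bytes written."""

    # One possible implementation; clients written against FileService
    # stay portable across any implementation of the same interface.
    class InMemoryFileService(FileService):
        def __init__(self):
            self.files = {}

        def read(self, path, offset, count):
            return self.files.get(path, b"")[offset:offset + count]

        def write(self, path, offset, data):
            buf = bytearray(self.files.get(path, b""))
            buf[offset:offset + len(data)] = data
            self.files[path] = bytes(buf)
            return len(data)

    svc = InMemoryFileService()
    svc.write("/notes", 0, b"hello")
    print(svc.read("/notes", 0, 5))   # b'hello'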
1.5 Types of Distributed Systems
o   There are different types of distributed systems :
    1. Distributed computing systems
    2. Distributed information systems
    3. Distributed embedded systems
o   The following discussion describes these systems.
1.5.1 Distributed Computing Systems
o   One class of distributed systems is intended for high-performance computing tasks. In cluster computing, the hardware consists of a set of similar workstations or PCs, running the same operating system and connected through a high-speed local-area network.
o   In the case of grid computing, distributed systems are built as a group of computer systems. Here, the administrative domain of each system may be distinct, and the systems may be very dissimilar in hardware, software, and deployed network technology.
1.5.1(A) Cluster Computing Systems
o   Cluster computing systems were accepted because of the reduced cost and improved performance of computers and workstations. From a technical and cost point of view, it became feasible to build supercomputers using off-the-shelf technology by simply connecting a set of comparatively simple computers in a high-speed network.
o   In almost all cases, cluster computing is used for parallel programming, in which a single compute-intensive program is run in parallel on several machines.
o   The common configuration of Linux-based Beowulf clusters is shown in Fig. 1.5.1. Each cluster comprises a collection of compute nodes controlled and accessed by a single master node.
o   The master node characteristically manages the allocation of nodes to a particular parallel program, holds a queue of submitted jobs, and offers an interface for the users of the system.
o   The master in fact runs the middleware required for the execution of programs and management of the cluster, whereas the compute nodes require only a standard operating system. Libraries offer advanced message-passing facilities to the applications.
Fig. 1.5.1 : Cluster computing system
1.5.1(B) Grid Computing Systems
o   In contrast to the homogeneous environment of cluster computing systems, grid computing systems include a high degree of heterogeneity. In this case the hardware, operating systems, networks, administrative domains and security policies are different.
o   In a grid computing system, resources from distinct organizations are brought together to allow the collaboration of a group of people or institutions. This collaboration leads to a virtual organization. The people from the same virtual organization are given access rights to the resources that are offered to that organization.
o   These resources comprise compute servers (together with supercomputers, possibly implemented as cluster computers), storage facilities, and databases. Also, special networked devices like telescopes, sensors, etc., can be made available as well. A layered architecture for grid computing systems is shown in Fig. 1.5.2.
Fig. 1.5.2 : A layered architecture for grid computing systems
o   Fabric Layer : It provides interfaces to local resources at a specific site. These interfaces are customized to permit sharing of resources within a virtual organization.
o   Connectivity Layer : This layer consists of communication protocols for supporting grid transactions that span the usage of multiple resources, along with security protocols to authenticate users and resources. If the user's program is authenticated instead of the user, the handover of rights from user to program is carried out by the connectivity layer.
o   Resource Layer : It is responsible for managing a single resource. It contains functions for getting configuration information on a particular resource or, generally, to carry out specific operations such as process creation or reading data. Hence this layer is responsible for access control, and it will be dependent on the authentication carried out as part of the connectivity layer.
o   Collective Layer : It handles access to multiple resources and contains services for resource discovery, allocation and scheduling of tasks onto multiple resources, and data replication. It contains many protocols for a variety of functions, reflecting the broad range of services it may provide to a virtual organization.
o   Application Layer : This layer contains applications that function within a virtual organization and which use the grid computing environment.
1.5.2 Distributed Information Systems
o   Many organizations deal with networked applications facing problems of interoperability. Many existing middleware solutions provide an easy way to integrate applications into an enterprise-wide information system.
o   In most cases, a networked application is a client-server application in which the server runs the application (database) and clients send requests to it. After processing a request, the server sends a reply to the client.
o   If integration is done at the lowest level, it would permit clients to enclose many requests, perhaps for different servers, into a single larger request and let it be executed as a distributed transaction. The basic scheme is that either all or none of the requests would be executed.
o   Such integration should also happen by allowing applications to communicate directly with each other. This has shown the way to enterprise application integration (EAI). The above two forms of distributed systems are explained below.
1.5.2(A) Transaction Processing Systems
o   Programming using transactions involves primitives that are either provided by the underlying distributed system or by the language runtime system. Following are examples of primitives for transactions :
o   BEGIN_TRANSACTION : For marking the start of a transaction.
o   END_TRANSACTION : For terminating the transaction and trying to commit.
o   ABORT_TRANSACTION : For killing the transaction and restoring the previous state.
o   READ : To read data from a table or file.
o   WRITE : To write data to a table or file.
o   The main property of a transaction is that either all of its operations are executed or none are executed. The operations that are present in the body of a transaction can be system calls, library procedures, or programming language statements. Following are the ACID properties of a transaction :
o   Atomic : Execution of the transaction is indivisible.
o   Consistent : There is no violation of system invariants by the transaction.
o   Isolated : Concurrent transactions do not interfere with each other.
o   Durable : After a transaction commits, the changes are permanent.
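As a concrete illustration, Python's built-in sqlite3 module exposes essentially these primitives : the transaction begins implicitly at the first statement, commit() corresponds to END_TRANSACTION, and rollback() to ABORT_TRANSACTION. The bank-transfer table below is invented for the example.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INT)")
    con.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
    con.commit()

    try:
        # Both WRITEs succeed together or not at all (atomicity).
        con.execute("UPDATE account SET balance = balance - 70 WHERE name = 'A'")
        con.execute("UPDATE account SET balance = balance + 70 WHERE name = 'B'")
        con.commit()        # END_TRANSACTION: changes become permanent
    except sqlite3.Error:
        con.rollback()      # ABORT_TRANSACTION: previous state is restored

    print(con.execute("SELECT * FROM account").fetchall())   # READ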
Nested Transactions
o   In nested transactions, a top-level transaction forks off children, which run in parallel to one another on different computers to improve performance. Every forked child may also execute one or more subtransactions, or fork off its own children. Fig. 1.5.3 shows a nested transaction.
Fig. 1.5.3 : A nested transaction with subtransactions spanning three independent databases
o   Consider that, out of several subtransactions running in parallel, one of them commits and submits its result to the parent transaction; later, the parent aborts, restoring the entire system to the state it had before the top-level transaction began. In this case the committed result of the subtransaction must nevertheless be undone. Therefore the permanence referred to above applies only to top-level transactions.
o   Any transaction or subtransaction basically works on a private copy of all data in the entire system. It manipulates this private copy only. If a transaction or subtransaction aborts, its private copy disappears. If it commits, its private copy replaces the parent's copy. Hence, if after the commit of a subtransaction a new subtransaction is started, the new one notices the results created by the first one. Similarly, if a higher-level transaction aborts, all its underlying subtransactions have to be aborted as well.
o   Nested transactions present a natural means of distributing a transaction across multiple machines in a distributed system. They exhibit a logical splitting-up of the work of the original transaction.
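The private-copy rule can be pictured with the following toy sketch (all names invented; real systems use logs or locks rather than copying all data) : a subtransaction starts from its parent's copy, and committing replaces the parent's copy with its own.

    class Transaction:
        """Toy model of the private-copy rule for nested transactions."""
        def __init__(self, data, parent=None):
            self.parent = parent
            self.copy = dict(data)          # private copy of all data

        def begin_sub(self):
            return Transaction(self.copy, parent=self)

        def commit(self):
            if self.parent is not None:
                self.parent.copy = self.copy   # replace the parent's copy

        def abort(self):
            self.copy = None                   # the private copy just disappears

    top = Transaction({"x": 1})
    sub = top.begin_sub()
    sub.copy["x"] = 2
    sub.commit()
    print(top.copy["x"])       # 2: the parent now sees the committed result
    later = top.begin_sub()
    print(later.copy["x"])     # 2: a later subtransaction notices it too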
1.5.3 Enterprise Application Integration
o   It became obvious that facilities were needed to integrate applications independently of their databases. The reason behind this was the decoupling of applications from databases. There is a requirement for direct communication among application components instead of using the request/reply pattern supported by transaction processing systems.
Fig. 1.5.4 : Middleware as a communication facilitator in enterprise application integration
o   There are many examples of communication middleware. Using Remote Procedure Calls (RPC), an application component at the client side can efficiently send a request to another application component at the server side by means of what looks like a local procedure call. The request at the client side is packed as a message and sent to the called application. In the same way, the result is packed as a message and sent back to the calling application as the result of the procedure call.
o   Like RPC, it is also possible to call remote objects using Remote Method Invocation (RMI). An RMI is similar to an RPC, except that it operates on objects instead of procedures. While communicating using RPC or RMI, both caller and callee have to be up and running; this is the disadvantage of these models.
o   In case of message-oriented middleware, applications just send messages to logical contact points, often described by means of a subject. Similarly, applications can demand a specific type of message, and the communication middleware ensures that those messages are delivered to those applications.
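A minimal in-process sketch of this message-oriented style (the Broker class and the subjects are invented) : senders publish to a subject and the middleware delivers to every subscriber, so neither side ever addresses the other directly.

    from collections import defaultdict

    class Broker:
        """Toy message-oriented middleware: routes messages by subject."""
        def __init__(self):
            self.subscribers = defaultdict(list)

        def subscribe(self, subject, callback):
            self.subscribers[subject].append(callback)

        def publish(self, subject, message):
            # The sender knows only the subject, never the receivers.
            for callback in self.subscribers[subject]:
                callback(message)

    broker = Broker()
    broker.subscribe("orders", lambda m: print("billing saw:", m))
    broker.subscribe("orders", lambda m: print("shipping saw:", m))
    broker.publish("orders", {"id": 42, "item": "disk"})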
1.5.4 Distributed Pervasive Systems
o   The distributed systems discussed above are stable, with fixed and permanent nodes. In distributed systems in which mobile and embedded computing devices are present, instability is the default behavior. The devices in such a system, called a distributed pervasive system, are small in size, battery-powered, mobile, and connected through wireless connections.
o   An important characteristic of this system is the absence of human administrative control. At best, owners configure their devices; otherwise the devices need to discover their environment automatically.
o   In pervasive systems, devices usually connect to the system with the intention of accessing, and probably providing, information. This calls for means to easily read, store, manage, and share information. Considering the irregular and changing connectivity of devices, the memory space where accessible information resides will most likely alter continually.
1.6 Hardware Concepts
o   Hardware concepts illustrate the organization of hardware, its interconnection, and the manner in which machines communicate with each other. Multiple CPUs exist in a distributed system. The machines are basically divided into two groups :
o   Multiprocessors : All processors share the memory.
o   Multicomputers : Each processor has its own private memory.
o   Both multiprocessors and multicomputers are further divided into two categories on the basis of the architecture of the interconnection network. The basic organization is shown in Figs. 1.6.1 and 1.6.2.
o   Bus-based : Machines are connected by a single cable, or processors are connected by a bus and share the memory.
o   Switched : Machines or processors are connected by different wiring patterns, and messages are routed along the wires.
o   Multicomputers can be homogeneous or heterogeneous. Homogeneous multicomputers use the same technology for the interconnection network, and all processors are similar, each accessing the same amount of private memory.
Fig. 1.6.1 : A bus-based shared-memory multiprocessor (processors share the memory) and a bus-based multicomputer (each processor has its private memory)
o   Heterogeneous multicomputers have different architectures for different machines. Machines are connected by different types of networks using different technologies.
Fig. 1.6.2 : A switched multiprocessor (processors share the memory and are interconnected through a switch) and a switched multicomputer (processors have private memory and are interconnected through a switch)
1.6.1 Multiprocessors
o   Multiple CPUs and memory modules are connected through a bus, and a single memory is available. If CPU1 writes to memory and CPU2 then reads the written value, CPU2 gets the same value which was written by CPU1. Memory having such a property is said to be coherent.
o   The disadvantage of this scheme is that if more than a few processors are connected, the bus becomes overloaded and performance degrades. To overcome the problem, a high-speed cache memory is placed between each CPU and the bus. If a referenced word is in the cache, it is accessed by the CPU and the bus is avoided. The typical size of a cache is 512 KB to 1 MB.
o   Again, memory can become incoherent if an updated word in one cache is not propagated to the other copies of the word present in different cache memories; otherwise the other copies will have old values. A bus-based multiprocessor provides less scalability.
Fig. 1.6.3 : A bus-based multiprocessor
o   To allow for a larger number of processors, processors and memory modules are connected by a crossbar switch. When a CPU wants to access a particular memory, a crosspoint switch gets closed.
o   Several CPUs can access different memories simultaneously, but if two CPUs want to access the same memory at the same time, one has to wait. This is shown in Fig. 1.6.4. Here, for n CPUs and n memories, n² crosspoint switches are required.
Fig. 1.6.4 : A crossbar switch
Fig. 1.6.5 : An omega switching network
o   An omega network contains 2 × 2 switches, each switch having 2 inputs and 2 outputs. For any input, either of the two output lines can be selected, and any memory can be reached through the switches. To reduce the latency between CPU and memory, fast switching is needed, and it is costly.
o   A non-uniform memory access (NUMA) architecture allows a CPU to access its local memory fast, while the memory of other CPUs is accessed more slowly.
1.6.2 Homogeneous Multicomputer Systems
o   Nodes are connected through a high-speed interconnection network. Processors communicate through the interconnection network; this traffic is low compared to the traffic between a processor and its memory.
o   If nodes are connected through a broadcast network like fast Ethernet, then it is a bus-based multicomputer. It has limited scalability, and performance drops once more than a few processors (25-100 nodes) are added.
o   Instead of broadcasting as in bus-based multicomputers, routing of messages can be done through a switched interconnection network. A grid, shown in Fig. 1.6.6(a), is suitable for graph-theory-related problems. A 4-dimensional hypercube is shown in Fig. 1.6.6(b); vertices represent CPUs and edges represent connections between CPUs.
o   Massively parallel processors (MPPs) and clusters of workstations are examples of switched multicomputers.
Fig. 1.6.6 : (a) Grid (b) Hypercube
1.6.3 Heterogeneous Multicomputer Systems
o   Today most distributed systems are built on top of multicomputers having different processors, memory sizes, etc. The interconnection network may also use different technologies. These are called heterogeneous multicomputers. In a large-scale heterogeneous multicomputer environment, the performance of services available to applications varies at different locations.
o   The absence of a global system view, intrinsic heterogeneity, and scaling are the factors due to which there is a requirement for sophisticated software to build distributed applications for heterogeneous multicomputers.
1.7 Software Concepts
o   Distributed systems and traditional operating systems provide similar services :
o   Both play the role of resource managers for the underlying hardware.
o   Both permit the sharing of resources like CPUs, memories, peripheral devices, the network, and data of all kinds among multiple users and applications.
o   Distributed systems try to hide the details and heterogeneous nature of the underlying hardware by offering a virtual machine on which applications can be executed without trouble.
o   In a network, multiple machines may have different operating systems installed. Operating systems for distributed computers can be divided into two categories :
o   Tightly-coupled systems : The operating system keeps a single, global view of the resources it manages. Such a tightly-coupled operating system is called a distributed operating system (DOS). It is useful for the management of multiprocessors and homogeneous multicomputers; resources can be shared by many processes and the details remain hidden.
o   Loosely-coupled systems : Here there is a set of computers, each of which has its own OS, and there is coordination between the machines. Such a loosely-coupled operating system is called a network operating system (NOS).
o   Apart from the above operating systems, middleware is used to provide general-purpose services, and it is installed on top of a NOS. It offers distribution transparency.
1.7.1 Distributed Operating Systems
There are two types of distributed operating systems :
o   Multiprocessor operating system : It manages the resources of a multiprocessor.
o   Multicomputer operating system : It is designed for homogeneous multicomputers.
o   A distributed operating system is similar in its functions to a traditional uniprocessor operating system, except that the DOS handles multiple processors.
Uniprocessor Operating Systems
o   These are traditional operating systems, developed for a single CPU. They permit users and applications to share the different resources available in the system. The OS allows different applications to use the same hardware in an isolated manner; several applications executing simultaneously on the same machine each get their required resources.
o   The operating system protects the data of one application from another application if both are executing simultaneously.
o   Applications should use only the facilities provided by the OS. To send messages, applications should use the services offered by the operating system through system calls that are implemented in the kernel.
o   If virtually all of the operating system code executes in kernel mode, then the operating system is said to have a monolithic architecture, and it runs in a single address space. The disadvantage of the monolithic design approach is that it is difficult to replace components. Monolithic operating systems are not a good scheme from the point of view of openness, software engineering, reliability, or maintainability.
o   If the operating system is organized into two parts, it can provide more flexibility. In this design, the first part is a set of modules for managing the hardware, which can equally well be executed in user mode. For example, the memory management module keeps track of the allocated and free space of the processes; it needs to execute in kernel mode only at the time when the registers of the Memory Management Unit (MMU) are set. There is no direct data exchange between modules.
Multiprocessor Operating Systems
o   Multiprocessor operating systems are designed with the intention of handling multiple processors having access to a shared memory. The data structures the OS uses to deal with the hardware, together with the multiple CPUs, are kept in shared memory, and many processors may access these data. Hence protection against simultaneous access is needed to guarantee consistency.
o   A key goal of a multiprocessor operating system is to hide the presence of the number of CPUs from the application. Such transparency can be attained through shared data protected by synchronization primitives, such as semaphores and monitors, described below.
Semaphores
o   A semaphore S is an integer variable that, apart from initialization, is accessed only through two standard atomic operations : wait (also called P, sleep, or down) and signal (also called V, wakeup, or up). These operations were originally termed P (for wait) and V (for signal).
o   Binary semaphores do not assume all integer values; they assume only the two values 0 and 1. On the contrary, counting semaphores (general semaphores) can assume any non-negative value.
o   The wait operation on semaphore S, written as wait(S) or P(S), operates as follows :

    wait(S) :   IF S > 0
                THEN S := S - 1
                ELSE (wait on S)

o   The signal operation on semaphore S, written as signal(S) or V(S), operates as follows :

    signal(S) : IF (one or more processes are waiting on S)
                THEN (let one of these processes proceed)
                ELSE S := S + 1
o   The two operations wait and signal are each done as a single indivisible atomic operation. It means that once a semaphore operation has initiated, no other process can access the semaphore until the operation has finished. Mutual exclusion is enforced within wait(S) and signal(S).
o   If many processes attempt a wait(S) at the same time, only one process will be permitted to proceed. The other processes will be kept waiting in a queue. The implementation of wait and signal promises that processes will not undergo indefinite delay. Semaphores solve the lost-wakeup problem.
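As a concrete illustration, the sketch below uses Python's standard threading.Semaphore as a binary semaphore to protect a shared counter (the counter workload itself is invented for the example) : without the semaphore, concurrent increments could be lost.

    import threading

    mutex = threading.Semaphore(1)   # binary semaphore: initial value 1
    counter = 0

    def worker():
        global counter
        for _ in range(100_000):
            mutex.acquire()          # wait(S): blocks if another thread holds it
            counter += 1             # critical section
            mutex.release()          # signal(S): wakes one waiting thread, if any

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)                   # always 400000: no lost updates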
Monitors
o   If semaphores are used incorrectly, they can produce timing errors that are hard to detect. These errors occur only if some particular execution sequences result, and these sequences do not always occur.
o   In order to handle such errors, researchers have developed a high-level language construct called the monitor.
o   A monitor is a set of procedures, variables, and data structures that are all grouped together in a particular type of module or package. Processes may call the procedures in a monitor whenever required, but direct access to the monitor's internal data structures from procedures declared outside the monitor is restricted. (A sketch of a monitor-style bounded buffer appears at the end of this section.)
o   Monitors achieve mutual exclusion : only one process can be active in a monitor at a time.
o   As monitors are a programming language construct, the compiler manages calls to monitor procedures differently from other procedure calls.
o   Normally, when a process calls a monitor procedure, it is checked whether any other process is currently executing within the monitor.
o   If so, the calling process will be blocked until the other process has left the monitor. If no other process is using the monitor, the calling process may enter.
o   The characteristics of a monitor are the following :
o   Only the monitor's procedures can access the local data variables. External procedures cannot access them.
o   A process enters the monitor by calling one of its procedures.
o   Only one process may be executing in the monitor at a time; any other process that has called the monitor is blocked, waiting for the monitor to become available.
o   Synchronization is supported by the monitor with the use of condition variables that are contained within the monitor and accessible only within the monitor. Condition variables are a special data type in monitors, operated on by two functions :
o   cwait(c) : the calling process's execution gets suspended on condition c. The monitor is now available for use by another process.
o   csignal(c) : a process blocked after a cwait on the same condition resumes its execution. If there are many such processes, choose one of them; if there is no such process, do nothing.
o   Monitor wait and signal operations are dissimilar from those for the semaphore. If a process in a monitor signals and no task is waiting on the condition variable, the signal is lost.
o   When one process is executing in the monitor, any other process trying to enter the monitor joins a queue of processes blocked waiting for monitor availability.
o   Once a process is in the monitor, it may temporarily block itself on condition x by issuing cwait(x); it is then placed in a queue of processes waiting to reenter the monitor when the condition changes, and it resumes execution at the point in its program following the cwait(x) call.
o   If a process that is executing in the monitor detects a change in condition variable x, it issues csignal(x), which alerts the corresponding condition queue that the condition has changed.
o   As an example, consider a bounded buffer. A producer can add characters to the buffer only by means of the procedure append inside the monitor; the producer does not have direct access to the buffer.
o   The procedure first checks the condition notfull to determine whether there is space available in the buffer. If not, the process executing the monitor is blocked on that condition.
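The following is a minimal Python rendering of such a monitor-style bounded buffer (the class and method names are invented for the example). Python's threading.Condition plays the role of cwait/csignal, and the shared lock enforces the one-process-in-the-monitor rule.

    import threading

    class BoundedBuffer:
        """Monitor-style bounded buffer: the lock admits one thread at a
        time; the condition variables play the roles of notfull/notempty."""
        def __init__(self, capacity):
            self.buffer, self.capacity = [], capacity
            self.lock = threading.Lock()
            self.notfull = threading.Condition(self.lock)
            self.notempty = threading.Condition(self.lock)

        def append(self, ch):
            with self.lock:                      # enter the monitor
                while len(self.buffer) == self.capacity:
                    self.notfull.wait()          # cwait(notfull): frees the monitor
                self.buffer.append(ch)
                self.notempty.notify()           # csignal(notempty)

        def take(self):
            with self.lock:
                while not self.buffer:
                    self.notempty.wait()         # cwait(notempty)
                ch = self.buffer.pop(0)
                self.notfull.notify()            # csignal(notfull)
                return ch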
Multicomputer Operating Systems
o   In multicomputer operating systems, data structures for systemwide resource management cannot simply be shared by keeping them in physically shared memory. Instead, the only means of communication is message passing. Fig. 1.7.2 shows the organization of a multicomputer operating system.
Fig. 1.7.2 : General structure of a multicomputer operating system
the local CPU, a local disk and others. Each
Kemel on each machine manages local resources, for example memory,
machine contains separate module for sending and receiving messages to and from other machines.
On the top of each local kernel there is @ common software layer that implements the operating system asa virtual
machine supporting parallel and concurrent execution of various tasks.
This software layer offers a complete software implementation of shared memor
such as, assignment of task to processor, masking hardware failures, providing
ry. Further facilities usually
implemented in this layer are,
transparent storage, and general interprocess communication.
o   Some multicomputer operating systems offer only message-passing facilities to applications and do not offer a shared memory implementation. Different systems can have different semantics of message-passing primitives. Their dissimilarities can be explained by taking into consideration whether or not messages are buffered. When the sending or receiving process should block also needs to be considered.
o   Buffering of messages can be done at the sender's side or at the receiver's side. There are four synchronization points at which a sender or receiver can possibly block. At the sender's side, the sender is blocked if its buffer is full. If a sender buffer is not present, there are three other points at which to block the sender :
o   The message has been sent by the sender.
o   The message has arrived at the receiver's side.
o   The message has been delivered to the receiver application.
o   Another issue that is significant to message-passing semantics is whether or not communication is reliable. In reliable communication, an assurance that the receiver receives the message is given to the sender.
Distributed Shared Memory Systems (DSM)
o   As only message passing is available with multicomputers, programming multicomputers is difficult. On the other hand, programming multiprocessors is simple, because of the shared memory. In the multicomputer case, buffering, blocking, and reliable communication need to be considered as well, which leads to the complex task of programming.
o   Implementing shared memory on multicomputers offers a virtual shared memory machine, running on a multicomputer. This shared memory model can be used to write applications on multicomputers. The role of the multicomputer operating system is important in this case.
o   With page-based distributed shared memory (DSM), the virtual memory capability of each individual node is used. The address space is broken into pages of size 4 KB or 8 KB, and the pages are spread over all the nodes in the system. When a memory reference falls on an address that is locally unavailable, a trap takes place; the operating system fetches the page having the referenced address and starts again the faulting instruction, which now completes. Here the RAM of other machines is used as the backing store instead of the local disk, as in normal paging.
o   Fig. 1.7.3 shows an address space divided into 16 pages, with four processors available.
Fig. 1.7.3 : Pages of a shared global address space spread over 4 machines
o   As shown in Fig. 1.7.3, if processor 4 references instructions or data belonging to pages 13 and 15, the references are done locally. References to instructions and data in pages placed on other machines cause a trap to the operating system.
o   All read-only pages can be replicated so that references are done locally and performance can be improved. If read-write pages are replicated, then when one of the copies is modified, the other copies should also reflect the same changes; otherwise the other copies will have old values. Write-invalidate can be used to perform a write on a page : before performing the write, the other copies are invalidated.
o   If the size of the pages is kept large, the total number of transfers can be reduced when large sections of contiguous data need to be accessed. Conversely, if a page comprises data needed by two independent processes on different processors, the operating system may need to repetitively transfer the page between those two processors. This is called false sharing.
1.7.2 Network Operating Systems
o   Network operating systems assume the underlying hardware is heterogeneous, whereas a DOS assumes the underlying hardware is homogeneous. In a heterogeneous environment, the machines and the operating systems installed on them may be different, and all these machines are connected to each other in a computer network.
o   A NOS permits users to use services available on a particular machine in the network. The remote login service provided by a NOS allows a user to log in to a remote machine from his/her terminal. Using the command for remote copy, a user can copy a file from one machine to another.
o   Information sharing can be achieved by providing a shared, global file system accessible from all the workstations. The file system is implemented by file servers. The file servers accept requests to read and write files from programs running on client machines. Each client request is executed by the file server, and the reply is sent back to the client.
Fig. 1.7.4 : General structure of a network operating system
o   File servers usually support hierarchical file systems, each with a root directory including subdirectories and files. Workstations can import or mount these file systems, augmenting their local file systems with those located on the servers.
o   A distributed operating system attempts to achieve complete transparency in order to offer a single system view to the user, whereas achieving full transparency with a network operating system is not possible. As we have seen for remote login, the user has to explicitly log into the remote machine. In remote copy, the user knows the machine to which he/she is copying the files. The main advantage of a network operating system is that it provides scalability.
1.8 Middleware
o   Neither DOS nor NOS fulfils all the criteria required for a distributed system. A distributed operating system cannot manage a set of independent computers, while a network operating system does not provide transparency. A distributed system combining the advantages of both, that is, the scalability and openness of network operating systems (NOS) and the transparency and related ease of use of distributed operating systems (DOS), is possible to realize by using middleware.
o   A middleware layer can be added on top of network operating systems to hide the heterogeneity of the collection of underlying platforms, as well as to get better distribution transparency. Most modern distributed systems are built by means of such an additional layer of what is called middleware.
1.8.1 Positioning Middleware
o   It is not possible to achieve distribution transparency when distributed applications make direct use of the programming interface provided by network operating systems. Applications also always make use of interfaces to the local file system.
o   If an additional layer of software is placed between applications and the network operating system, distribution transparency can be achieved and a higher level of abstraction is provided. This layer is called middleware, as shown in Fig. 1.8.1.
o   In this organization, local resource management is handled by the NOS, which also provides simple communication means to connect to other computers in the network. A major goal of middleware is to hide the heterogeneity of the underlying platforms from applications. As a result, many middleware systems offer an essentially complete set of services.
from applications. AsWoisiributed computing Introduction to Distigy
1-24
Machine A Machine B Machine C
Distributed Applications
Middleware Services
Network OS Network OS
Services. Services.
Kernel Kernel
—_| _t Network
Fig. 1.8.1 : Organization of distributed system as middleware
1.9 Models of Middleware
o   In order to ease the development and integration of distributed applications, most middleware uses some model to express distribution and communication. A relatively simple model is that of treating everything as a file.
o   The approach of treating everything as a file was introduced in UNIX and is also followed in Plan 9. Here, all resources, such as keyboard, mouse, disk, network interface, etc., are treated as files. In essence, whether a file is local or remote makes no difference. For reading or writing bytes in a file, an application first opens the file, then performs read or write operations, and closes the file again once the operation is done. As files can be shared by many processes, communication reduces to just accessing the same file.
o   Another model of middleware is based on distributed file systems. This model supports distribution transparency only for files that simply store data. This model became popular as it is reasonably scalable.
o   In the Remote Procedure Call (RPC)-based middleware model, a client-side process calls a procedure implemented on a remote machine. In this call, parameters are transparently sent to the remote machine (the server), where the procedure is actually executed, and the server then sends the result of the execution back to the caller. Although the called procedure is executed remotely, it appears as if the call was executed locally : the communication with the remote process remains hidden from the calling process.
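As a small illustration using Python's standard xmlrpc package (the add service is invented for the example), the client invokes the remote procedure with ordinary call syntax, while parameter marshalling and message transport stay hidden :

    # Server process: exposes an ordinary function as a remote procedure.
    from xmlrpc.server import SimpleXMLRPCServer

    def add(x, y):
        return x + y

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(add)
    server.serve_forever()

    # Client process (run separately): the proxy makes the remote call
    # look like a local one.
    import xmlrpc.client

    proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
    print(proxy.add(3, 4))   # 7, computed on the server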
o   Similar to calling a remote procedure in RPC, it is also possible to invoke objects residing on remote computers in a transparent manner. Many middleware systems offer a concept of distributed objects. The main idea of distributed objects is that each object implements an interface that hides all the internal details of the object from its users. An interface contains the methods that the object implements. The only thing a process sees of an object is its interface.
o   In the implementation of distributed objects, the object resides on a single machine, and its interface is made available on many other machines. The interface available on the invoking process's machine converts a method invocation into a message that is sent to the object. After receiving the method invocation message, the object executes the invoked method and sends back the result. The interface implementation then converts the reply message into a return value and submits it to the invoking process. Similar to RPC, the invoking process remains totally unaware of the network communication.
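A toy sketch of this proxy idea (all names invented; the network is reduced to in-process queues for brevity) : the client-side proxy converts a method call into a message, and a server-side skeleton converts the message back into a real invocation.

    import queue
    import threading

    class Counter:
        """The real object, living on the 'server' machine."""
        def __init__(self):
            self.value = 0
        def increment(self, n):
            self.value += n
            return self.value

    class Proxy:
        """Client-side stand-in: converts method calls into messages."""
        def __init__(self, requests, replies):
            self.requests, self.replies = requests, replies
        def __getattr__(self, method):
            def invoke(*args):
                self.requests.put((method, args))   # marshal the invocation
                return self.replies.get()           # wait for the reply message
            return invoke

    def skeleton(obj, requests, replies):
        """Server side: unmarshals one message and invokes the real method."""
        method, args = requests.get()
        replies.put(getattr(obj, method)(*args))

    requests, replies = queue.Queue(), queue.Queue()
    counter, proxy = Counter(), Proxy(requests, replies)
    threading.Thread(target=skeleton, args=(counter, requests, replies)).start()
    print(proxy.increment(5))   # prints 5, computed by the remote object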
o   The World Wide Web became successful due to its very simple but very effective model of distributed documents. In the web model, information is organized in the form of documents, each of which is kept on a machine transparently sited anywhere in the wide area network.
o   Links in the documents refer to other documents. By using a link referring to a particular document, that document is obtained from its location and gets displayed on the user's machine. Apart from textual documents, the web also supports audio, video and interactive graphic-based documents.
1.10 Services Offered by Middleware
o   All middleware implements access transparency. For this, middleware provides high-level communication facilities to hide the low-level message passing through computer networks. How communication is supported depends very much on the model of distribution the middleware offers to users and applications.
o   A naming service is offered by almost all middleware. Due to the name services offered by middleware, it is possible to share and search for entities. A naming service becomes complex to deal with if scalability is considered : for efficient searching of a name in a large-scale system, the place of the entity that is named must be assumed to be fixed. This is the main difficulty which needs to be handled. In the World Wide Web (WWW), a URL is used to refer to a document, and the name of the server on which the document is stored is present in the URL. Therefore, if the document is migrated to another server, its URL fails to locate the document.
o   A persistence service is offered for storage purposes. The persistence service is provided in a simple way through a distributed file system, but many advanced middleware have integrated databases into their systems or provide facilities for applications to connect to databases.
o   Much middleware offers a facility for distributed transactions where data storage is important. Transactions permit multiple read and write operations to take place atomically. Atomicity is the property whereby either the transaction commits, so all its write operations are actually performed, or it fails, leaving all referenced data unchanged. Distributed transactions operate on data which is distributed across multiple machines.
o   An important service provided by middleware is security. Instead of depending on the underlying local operating systems to sufficiently support security for the entire network, security has to be partly implemented additionally in the middleware layer itself.
jiddleware and Openness
Modern distributed systems are built as middleware for different operating systems. Due to this, applications
developed for a specific distributed system become operating system independent and more dependent on
specific middleware. The difficulty arises due to less openness of middleware.
Actual open distributed system is specified through interfaces that are complete, Completeness states that the
details required for implementing the system, has surely been specified. If interfaces are not complete then system
developers must add their own interfaces. Asa result, situation arises in which two middleware systems developed
by different teams follow the same standard, but applications developed for one system cannot ported to the
other without trouble,
Due to incompleteness although two different implementations implement precisely the same set of interfaces but
different underlying protocols, they cannot interoperate.