Introduction To Distributed System

Distributed systems consist of independent computers that communicate via messages, providing resource sharing and coordination across various applications. They are characterized by concurrency, lack of a global clock, and independent failures, which present unique challenges such as heterogeneity, security, and scalability. Examples include cloud computing, grid computing, and distributed storage systems, all of which are evolving due to trends like mobile computing and the demand for multimedia services.

Uploaded by

wande

Chapter-1

Introduction to Distributed Systems


Introduction:
Networks of computers are everywhere. The Internet is one, as are the many networks of which it is
composed. Mobile phone networks, corporate networks, factory networks, campus networks, home
networks, in-car networks – all of these, both separately and in combination, share the essential
characteristics that make them relevant subjects for study under the heading distributed systems.

Distributed systems are everywhere. The Internet enables users throughout the world to access its
services wherever they may be located. Each organization manages an intranet, which provides local
services and Internet services for local users and generally provides services to other users in the
Internet. Small distributed systems can be constructed from mobile computers and other small
computational devices that are attached to a wireless network.

Resource sharing is the main motivating factor for constructing distributed systems. Resources such as
printers, files, web pages or database records are managed by servers of the appropriate type. For
example, web servers manage web pages and other web resources. Resources are accessed by clients –
for example, the clients of web servers are generally called browsers.

We define a distributed system as one in which hardware or software components located at networked
computers communicate and coordinate their actions only by passing messages. This simple definition
covers the entire range of systems in which networked computers can usefully be deployed.
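
As a minimal sketch of this definition, the Python example below lets a client and a server coordinate only by exchanging messages. It is an illustration under stated assumptions, not part of the original notes: two in-process queues stand in for the network, and the names server, fetch and the "/index" page are hypothetical.

```python
import queue
import threading


def server(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Serve requests until a 'stop' message arrives."""
    pages = {"/index": "welcome"}      # resource managed by this server
    while True:
        msg = inbox.get()              # block until a message arrives
        if msg == "stop":
            break
        outbox.put(pages.get(msg, "not found"))


def fetch(path: str) -> str:
    """Client: send one request message and wait for the reply message."""
    to_server: queue.Queue = queue.Queue()
    to_client: queue.Queue = queue.Queue()
    worker = threading.Thread(target=server, args=(to_server, to_client))
    worker.start()
    to_server.put(path)                # request message
    reply = to_client.get(timeout=5)   # reply message
    to_server.put("stop")
    worker.join()
    return reply
```

The client never reads the server's data directly; everything it learns arrives in a reply message, which is the property the definition relies on.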

Definition: A distributed system is a collection of independent computers that appears to the users of
the system as a single computer.

Fig. 1-1 shows four networked computers and three applications, of which application B is distributed
across computers 2 and 3. Each application is offered the same interface. The distributed system
provides the means for components of a single distributed application to communicate with each other,
but also to let different applications communicate. At the same time, it hides, as well as is reasonably
possible, the differences in hardware and operating systems from each application.

Examples:
• A network of workstations allocated to users

• A pool of processors in the machine room allocated dynamically

• A single file system (all users access files with the same path name)

• User command executed in the best place (user workstation, a workstation belonging to
someone else, or on an unassigned processor in the machine room)

Our definition of distributed systems has the following significant consequences:

Concurrency: In a network of computers, concurrent program execution is the norm. I can do my work
on my computer while you do your work on yours, sharing resources such as web pages or files when
necessary. The capacity of the system to handle shared resources can be increased by adding more
resources (for example, computers) to the network. We will describe ways in which this extra capacity
can be usefully deployed at many points in this book. The coordination of concurrently executing
programs that share resources is also an important and recurring topic.

No global clock: When programs need to cooperate they coordinate their actions by exchanging
messages. Close coordination often depends on a shared idea of the time at which the programs’ actions
occur. But it turns out that there are limits to the accuracy with which the computers in a network can
synchronize their clocks – there is no single global notion of the correct time. This is a direct
consequence of the fact that the only communication is by sending messages through a network.
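
The limit on synchronization accuracy can be seen with a little arithmetic. The sketch below assumes a simple scheme that the notes do not describe: a client asks a server for its clock reading and, knowing only the round-trip time, adds half of it to the reading. The function name and the sample delays are hypothetical.

```python
def estimate_error(d_out: float, d_back: float) -> float:
    """Error, in seconds, of the estimate 'server reading + round trip / 2'.

    d_out  is the delay of the request message.
    d_back is the delay of the reply message.
    The client can measure only their sum, not each delay separately.
    """
    server_stamp = 100.0                  # arbitrary reading placed in the reply
    round_trip = d_out + d_back
    true_now = server_stamp + d_back      # server time when the reply arrives
    estimate = server_stamp + round_trip / 2
    return estimate - true_now            # equals (d_out - d_back) / 2
```

With equal delays in both directions the error is zero, but with a 30 ms request and a 10 ms reply the estimate is off by about 10 ms, and the client cannot tell the two cases apart because it sees only the total round-trip time.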

Independent failures: All computer systems can fail, and it is the responsibility of system designers to
plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the
network result in the isolation of the computers that are connected to it, but that doesn’t mean that they
stop running. In fact, the programs on them may not be able to detect whether the network has failed or
has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a
program somewhere in the system (a crash), is not immediately made known to the other components
with which it communicates. Each component of the system can fail independently, leaving the others
still running. The consequences of this characteristic of distributed systems will be a recurring theme
throughout the book.
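
A minimal sketch of why a crash and a slow network look the same to an observer, assuming the observer uses a simple timeout (a common approach, though not one the notes prescribe). The function name and parameters are hypothetical.

```python
from typing import Optional


def suspect_failed(reply_delay: Optional[float], timeout: float) -> bool:
    """Return True if the observer gives up waiting for a reply.

    reply_delay is how long the reply actually takes in seconds, or None
    if the peer has crashed and no reply will ever arrive.
    """
    return reply_delay is None or reply_delay > timeout
```

Both suspect_failed(None, 1.0) and suspect_failed(5.0, 1.0) return True: from the observer's side, a crashed peer and a peer whose reply is merely late produce the same verdict.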

Examples of Distributed Systems:
1. Network of workstations

• Personal workstations + processors not assigned to specific users.
• Single file system, with all files accessible from all machines in the same way and using the
same path name.
• For a certain command the system can look for the best place (workstation) to execute it.

2. Automatic banking (teller machine) system

• Primary requirements: security and reliability.
• Consistency of replicated data.
• Concurrent transactions (operations which involve accounts in different banks; simultaneous
access from several users, etc.).
• Fault tolerance.

3. The cloud

• Computing as a utility: application, storage and computing services, paid for on a per-usage basis.
• Main concerns: scaling, performance, security/reliability.

Advantages and Disadvantages of Distributed Systems:-

Trends in distributed systems:

Distributed systems are undergoing a period of significant change and this can be traced back to a
number of influential trends:

1. The emergence of pervasive networking technology.


The modern Internet is a vast interconnected collection of computer networks of many different
types, with the range of types increasing all the time and now including, for example, a wide range
of wireless communication technologies such as WiFi, WiMAX, Bluetooth and third-generation
mobile phone networks. The net result is that networking has become a pervasive resource and
devices can be connected (if desired) at any time and in any place.

Figure 1.3 illustrates a typical portion of the Internet. Programs running on the computers connected
to it interact by passing messages, employing a common means of communication. The design and
construction of the Internet communication mechanisms (the Internet protocols) is a major technical
achievement, enabling a program running anywhere to address messages to programs anywhere else
and abstracting over the myriad of technologies mentioned above.

The Internet is also a very large distributed system. It enables users, wherever they are, to make use
of services such as the World Wide Web, email and file transfer. The set of services is open-ended
– it can be extended by the addition of server computers and new types of service. The figure shows
a collection of intranets – subnetworks operated by companies and other organizations and
typically protected by firewalls.

Internet Service Providers (ISPs) are companies that provide broadband links and other types of
connection to individual users and small organizations, enabling them to access services anywhere in
the Internet as well as providing local services such as email and web hosting. The intranets are
linked together by backbones. A backbone is a network link with a high transmission capacity,
employing satellite connections, fiber optic cables and other high-bandwidth circuits.

2. Mobile Computing and Ubiquitous computing in distributed systems.
Mobile computing is the performance of computing tasks while the user is on the move, or visiting
places other than their usual environment. In mobile computing, users who are away from their
‘home’ intranet (the intranet at work, or their residence) are still provided with access to resources via
the devices they carry with them. They can continue to access the Internet; they can continue to
access resources in their home intranet; and there is increasing provision for users to utilize resources
such as printers or even sales points that are conveniently nearby as they move around. The latter is
also known as location-aware or context-aware computing. Mobility introduces a number of
challenges for distributed systems, including the need to deal with variable connectivity and indeed
disconnection, and the need to maintain operation in the face of device mobility.

Ubiquitous computing is the harnessing of many small, cheap computational devices that are present
in users’ physical environments, including the home, office and even natural settings. The term
‘ubiquitous’ is intended to suggest that small computing devices will eventually become so pervasive
in everyday objects that they are scarcely noticed. That is, their computational behavior will be
transparently and intimately tied up with their physical function.

Ubiquitous and mobile computing overlap, since the mobile user can in principle benefit from
computers that are everywhere. But they are distinct, in general. Ubiquitous computing could benefit
users while they remain in a single environment such as the home or a hospital. Similarly, mobile
computing has advantages even if it involves only conventional, discrete computers and devices such
as laptops and printers.

3. The increasing demand for multimedia services.


The benefits of distributed multimedia computing are considerable in that a wide range of new
(multimedia) services and applications can be provided on the desktop, including access to live or
pre-recorded television broadcasts, access to film libraries offering video-on-demand services, access
to music libraries, the provision of audio and video conferencing facilities and integrated telephony
features including IP telephony or related technologies such as Skype, a peer-to-peer alternative to IP
telephony.

Distributed multimedia applications such as webcasting place considerable demands on the
underlying distributed infrastructure in terms of:

• Providing support for an (extensible) range of encoding and encryption formats, such as the
MPEG series of standards (including for example the popular MP3 standard otherwise known
as MPEG-1, Audio Layer 3) and HDTV.

• Providing a range of mechanisms to ensure that the desired quality of service can be met.

• Providing associated resource management strategies, including appropriate scheduling policies
to support the desired quality of service.

• Providing adaptation strategies to deal with the inevitable situation in open systems where
quality of service cannot be met or sustained.

4. The view of distributed systems as a utility.
With the increasing maturity of distributed systems infrastructure, a number of companies are
promoting the view of distributed resources as a commodity or utility, drawing the analogy between
distributed resources and other utilities such as water or electricity. With this model, resources are
provided by appropriate service suppliers and effectively rented rather than owned by the end user.

For example, the term cloud computing is used to capture this vision of computing as a utility. A
cloud is defined as a set of Internet-based application, storage and computing services sufficient to
support most users’ needs, thus enabling them to largely or totally dispense with local data storage
and application software (see Figure 1.5). The term also promotes a view of everything as a service,
from physical or virtual infrastructure through to software, often paid for on a per-usage basis rather
than purchased. Note that cloud computing reduces requirements on users’ devices, allowing very
simple desktop or portable devices to access a potentially wide range of resources and services.

Clouds are generally implemented on cluster computers to provide the necessary scale and
performance required by such services. A cluster computer is a set of interconnected computers that
cooperate closely to provide a single, integrated high performance computing capability.

Types of Distributed Systems:

Distributed systems can be classified into the following types:

1. Distributed computing systems.
2. Distributed information systems.
3. Distributed embedded systems.

Distributed Computing Systems:

An important class of distributed systems is the one used for high-performance computing tasks.
Roughly speaking, one can make a distinction between two subgroups. In cluster computing the
underlying hardware consists of a collection of similar workstations or PCs, closely connected by
means of a high speed local-area network. In addition, each node runs the same operating system.

The situation becomes quite different in the case of grid computing. This subgroup consists of
distributed systems that are often constructed as a federation of computer systems, where each system
may fall under a different administrative domain, and may be very different when it comes to hardware,
software, and deployed network technology.

• Cluster Computing Systems

Computers communicating over a high-speed network can be made to work together and present themselves
as a single computer to the users. A set of computers grouped together in such a manner that they
form a single resource pool is called a cluster. A task assigned to the cluster is broken into smaller
self-contained tasks that run in parallel on the computers in the cluster. The results of the smaller
tasks are then combined to form the final result.
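
The split, run in parallel, then combine pattern described above can be sketched in Python. This is an illustration only: a thread pool stands in for the cluster nodes, and the function names and the sum-of-squares workload are hypothetical choices, not taken from the notes.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data: list[int], parts: int) -> list[list[int]]:
    """Break the task into at most `parts` self-contained chunks."""
    size = max(1, -(-len(data) // parts))    # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_sum_of_squares(chunk: list[int]) -> int:
    """The smaller task that each worker runs on its own chunk."""
    return sum(x * x for x in chunk)


def cluster_sum_of_squares(data: list[int], nodes: int = 4) -> int:
    """Run each chunk on a worker, then combine the partial results."""
    chunks = split(data, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)                      # combine step
```

Each chunk can be processed without reference to the others, which is what makes the smaller tasks self-contained; only the final sum needs all the partial results.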

Cluster computing helps organizations to increase their computing power using standard,
commonly available technology. This hardware and software, commonly known as
commodity items, can be purchased at relatively low cost. Cluster computing has seen
tremendous growth in recent years. Around 80 percent of the top 500 supercomputing centers in the
world use clusters. Clusters are used primarily to run scientific, engineering, commercial and
industrial applications that require high availability and high-throughput processing. Examples of
applications that run primarily on clusters include protein sequencing in biomedical applications,
earthquake simulation in civil engineering, petroleum reservoir simulation in earth resource and
petroleum engineering, and replicated and distributed storage and backup servers for high-demand
web-based business applications. The figure below shows a typical arrangement of computers in a
computing cluster.

Figure - Computing cluster

• Grid Computing Systems
A grid is a type of distributed computing system in which a large number of small, loosely
coupled computers are brought together to form a large virtual supercomputer. This virtual
supercomputer performs tasks that are too large for any single computer to complete
within a reasonable time.

A grid is defined as a parallel and distributed system that can select, share and
aggregate geographically distributed resources dynamically at runtime, based on their
availability, capability, performance and cost, to meet the users’ Quality of Service (QoS)
requirements. Grid computing combines computing resources that are distributed across a large
geographical area and belong to different persons and organizations. The main purpose of a
grid system is to work collaboratively across multiple systems to solve a single computing
task by dividing the task into smaller self-contained tasks and distributing those tasks to
different computers.

The middleware used in grid computing is responsible for dividing and apportioning the
tasks. The size of a grid system can vary from a few hundred computers within an
organization to large systems consisting of thousands of nodes across multiple
organizations. A small grid confined to a single organization is commonly known as intra-node
corporation, while a larger, wider system is referred to as inter-node corporation.
The figure below shows a grid system distributed across heterogeneous computing platforms.

Figure - Grid computing system
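
The apportioning role of grid middleware described above can be sketched as follows. This is a hypothetical illustration: the notes do not say how real grid middleware decides, so the rule used here (give each task to the node with the lowest load relative to its capacity) and all names are assumptions.

```python
def apportion(tasks: list[str], capacity: dict[str, int]) -> dict[str, list[str]]:
    """Assign each task to the node that would be least loaded for its capacity.

    capacity maps a node name to a relative capability score (assumed > 0).
    """
    plan: dict[str, list[str]] = {node: [] for node in capacity}
    for task in tasks:
        # Load the node would carry, relative to capacity, if it took this task.
        node = min(capacity, key=lambda n: (len(plan[n]) + 1) / capacity[n])
        plan[node].append(task)
    return plan


plan = apportion([f"t{i}" for i in range(6)], {"fast_node": 2, "slow_node": 1})
```

In this run the node with twice the capacity ends up with four of the six tasks and the other with two, so heterogeneous machines receive work in proportion to what they can handle.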

Grids have been used to solve computationally intensive scientific, mathematical and
academic problems through volunteer computing. Drug discovery, economic forecasting,
seismic analysis and back-office data processing for e-commerce are a few of the tasks that are
commonly solved using grid computing.

• Distributed Storage Systems
The rapid growth of storage volume, bandwidth and computation resources, along with the reduction in
the cost of storage devices, has fueled the popularity of distributed storage systems. The main objectives
of distributing storage across multiple devices are to protect the data in case of disk failure through
redundant storage on multiple devices and to make data available closer to the user in a massively
distributed system. There are four main types of distributed storage systems, namely
Server Attached Redundant Array of Independent Disks (RAID), centralized RAID, Network Attached
Storage (NAS) and Storage Area Network (SAN). NAS and SAN are the most popular of the four.
The figure below shows the typical arrangement of a distributed storage
system.

Figure - Distributed storage system

NAS and SAN differ slightly in the techniques they use to transfer data between devices and, as a
result, in performance. NAS mainly uses the TCP/IP protocols to transfer data across
multiple devices, whereas SAN uses SCSI over Fibre Channel. Hence NAS can be implemented on
any physical network that supports TCP/IP, such as Ethernet, FDDI or ATM, whereas a SAN of this
kind can be implemented only on Fibre Channel. SAN performs better than NAS because TCP has a
higher overhead and SCSI over Fibre Channel is faster than TCP/IP networks.

• Distributed Database Systems

A distributed database system is a collection of independent database systems distributed across multiple
computers that collaboratively store data in such a manner that a user can access data from anywhere as
if it were stored locally, irrespective of where the data is actually stored. The figure below shows an
arrangement of a distributed database system across multiple network sites.

Figure - Distributed Database system
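
The idea that a user reads and writes data without knowing where it is stored can be sketched with a small facade. This is a toy illustration, not how any particular database product works: the sites are plain dictionaries, and the rule that picks a site from the key is a hypothetical placement rule chosen only to keep the example short.

```python
class DistributedStore:
    """Single access point that hides which site holds each record."""

    def __init__(self, site_names: list[str]) -> None:
        self._names = list(site_names)
        self._sites: dict[str, dict[str, str]] = {n: {} for n in self._names}

    def site_of(self, key: str) -> str:
        """Pick a site deterministically from the key (hypothetical rule)."""
        return self._names[sum(key.encode("utf-8")) % len(self._names)]

    def put(self, key: str, value: str) -> None:
        self._sites[self.site_of(key)][key] = value

    def get(self, key: str) -> str:
        return self._sites[self.site_of(key)][key]


store = DistributedStore(["site_a", "site_b", "site_c"])
store.put("customer:42", "Ada")
```

A caller uses only put and get; which of the three sites holds "customer:42" is decided inside the store and never appears in the calling code.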

Challenges:
The construction of distributed systems produces many challenges:

Heterogeneity: They must be constructed from a variety of different networks, operating systems,
computer hardware and programming languages. The Internet communication protocols mask the
difference in networks, and middleware can deal with the other differences. Heterogeneity (that is,
variety and difference) applies to all of the following:

• Networks;
• Computer Hardware;
• Operating Systems;
• Programming Languages;
• Implementations by different developers
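
One concrete case of hardware heterogeneity is byte order: different processors store the bytes of an integer in different orders, so communicating programs agree on a single order for data in messages. The sketch below uses Python's standard struct module to show the difference and the agreed network (big-endian) order. The helper names are hypothetical.

```python
import struct


def encode_network(value: int) -> bytes:
    """Encode a 32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def decode_network(data: bytes) -> int:
    """Decode 4 bytes sent in network byte order, whatever the local CPU uses."""
    return struct.unpack("!I", data)[0]


little = struct.pack("<I", 1)    # bytes of the integer 1 in little-endian order
big = struct.pack(">I", 1)       # bytes of the integer 1 in big-endian order
misread = struct.unpack("<I", big)[0]   # big-endian bytes read as little-endian
```

Here misread is 16777216 rather than 1, which is the kind of silent error that an agreed external representation avoids.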

Openness: Distributed systems should be extensible – the first step is to publish the interfaces of the
components, but the integration of components written by different programmers is a real challenge.

Security: Encryption can be used to provide adequate protection of shared resources and to keep
sensitive information secret when it is transmitted in messages over a network. Denial of service attacks
are still a problem.

Scalability: A distributed system is scalable if the cost of adding a user is a constant amount in terms of
the resources that must be added. The algorithms used to access shared data should avoid performance
bottlenecks and data should be structured hierarchically to get the best access times. Frequently
accessed data can be replicated.

Failure handling: Any process, computer or network may fail independently of the others. Therefore
each component needs to be aware of the possible ways in which the components it depends on may
fail and be designed to deal with each of those failures appropriately.

Concurrency: The presence of multiple users in a distributed system is a source of concurrent requests
to its resources. Each resource must be designed to be safe in a concurrent environment.
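
A minimal sketch of a resource made safe for concurrent use, assuming a lock is an acceptable mechanism (the notes do not prescribe one). The Account class and run_deposits helper are hypothetical, and threads stand in for the concurrent users.

```python
import threading


class Account:
    """A shared resource that stays consistent under concurrent deposits."""

    def __init__(self) -> None:
        self.balance = 0
        self._lock = threading.Lock()

    def deposit(self, amount: int) -> None:
        with self._lock:             # only one thread updates the balance at a time
            self.balance += amount


def run_deposits(threads: int = 8, per_thread: int = 1000) -> int:
    """Have several threads deposit 1 unit repeatedly; return the final balance."""
    account = Account()

    def work() -> None:
        for _ in range(per_thread):
            account.deposit(1)

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return account.balance
```

Because every update happens while holding the lock, the final balance equals the number of deposits made, regardless of how the threads interleave.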

Transparency: The aim is to make certain aspects of distribution invisible to the application
programmer so that they need only be concerned with the design of their particular application. For
example, they need not be concerned with its location or the details of how its operations are accessed
by other components, or whether it will be replicated or migrated. Even failures of networks and
processes can be presented to application programmers in the form of exceptions – but they must be
handled.
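
The last point, that failures can be presented as exceptions but must still be handled, can be sketched as follows. Everything here is hypothetical: RemoteCallError, lookup_balance and the network_up flag simulate a remote invocation and are not part of any real library.

```python
class RemoteCallError(Exception):
    """Hypothetical exception raised when a remote invocation fails."""


def lookup_balance(account_id: str, network_up: bool) -> int:
    """Stand-in for a remote call; network_up simulates the network state."""
    if not network_up:
        raise RemoteCallError(f"no reply for account {account_id}")
    return 100


def show_balance(account_id: str, network_up: bool) -> str:
    """Caller code: the failure is visible and is handled explicitly."""
    try:
        return f"balance: {lookup_balance(account_id, network_up)}"
    except RemoteCallError as err:
        return f"unavailable ({err})"
```

The call looks like a local function call, which is the transparency; the except branch is the part the application programmer still has to write.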

Quality of service: It is not sufficient to provide access to services in distributed systems. In particular,
it is also important to provide guarantees regarding the qualities associated with such service access.
Examples of such qualities include parameters related to performance, security and reliability.

****THE END****
