Introduction to Distributed Computing


DISTRIBUTED COMPUTING:

Distributed computing is a model in which components of a software system are shared among multiple computers or nodes. Even though the software components may be spread out across multiple computers in multiple locations, they're run as one system. This is done to improve efficiency and performance. The systems on different networked computers communicate and coordinate by sending messages back and forth to achieve a defined task.

Distributed computing can increase performance, resilience and scalability, making it a common computing model in database and application design. Distributed computing networks can be connected as local networks or through a wide area network if the machines are in different geographic locations. Processors in distributed computing systems typically run in parallel.

In enterprise settings, distributed computing generally puts various steps in business processes at the most efficient places in a computer network. For example, a typical distributed deployment follows a three-tier model that organizes applications into the presentation tier (or user interface), the application tier and the data tier. These tiers function as follows:

User Interface: Processing occurs on the PC at the user's location.
Application Processing: Processing takes place on a remote computer.
Database Access and Processing Algorithms: These run on another computer that provides centralized access for many business processes.

In addition to the three-tier model, other types of distributed computing include client-server, n-tier and peer-to-peer architectures. Client-Server Architectures use smart clients that contact a server for data, then format and display that data to the user. N-tier System Architectures are used in application servers; these architectures use web applications to forward requests to other enterprise services. Peer-to-Peer Architectures divide all responsibilities among all peer computers, which can serve as clients or servers.
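
To make the client-server pattern concrete, here is a minimal Python sketch; it is an illustration only, and the loopback address, port number, and message contents are assumptions rather than anything stated above. A client sends a request over TCP, and the server returns data that the client then formats for display.

    # Minimal client-server sketch. Host, port, and message format are illustrative.
    import socket
    import threading
    import time

    HOST, PORT = "127.0.0.1", 9000   # hypothetical local address for testing

    def server():
        # The server waits for one request and answers it with data.
        with socket.create_server((HOST, PORT)) as srv:
            conn, _ = srv.accept()
            with conn:
                request = conn.recv(1024).decode()
                conn.sendall(("data for " + request).encode())

    def client():
        # The "smart client" contacts the server for data, then formats it for the user.
        with socket.create_connection((HOST, PORT)) as sock:
            sock.sendall(b"report-42")
            data = sock.recv(1024).decode()
            print("Formatted for the user:", data.upper())

    threading.Thread(target=server, daemon=True).start()
    time.sleep(0.2)                  # crude wait so the server is listening first
    client()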

BENEFITS OF DISTRIBUTED COMPUTING:

Distributed computing includes the following benefits:

Performance: Distributed computing can help improve performance by having each computer in a cluster handle different parts of a task simultaneously.
Scalability: Distributed computing clusters are scalable by adding new hardware when
needed.
Resilience and Redundancy: Multiple computers can provide the same services. This
way, if one machine isn't available, others can fill in for the service. Likewise, if two
machines that perform the same service are in different data centers and one data center
goes down, an organization can still operate.
Cost-Effectiveness: Distributed computing can use low-cost, off-the-shelf hardware.
Efficiency: Complex requests can be broken down into smaller pieces and distributed among different systems. This way, the request is simplified and worked on as a form of parallel computing, reducing the time needed to compute it; a minimal sketch of this idea follows this list.
Distributed Applications: Unlike traditional applications that run on a single system,
distributed applications run on multiple systems simultaneously.
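
The following Python sketch illustrates the efficiency point on a single machine: one large request is split into chunks that separate worker processes handle in parallel, and the partial results are then combined. The work function and chunk size are invented for illustration; in a real distributed system the chunks would be sent to different machines rather than to local processes.

    # Sketch: break a large request into pieces and process them in parallel.
    from concurrent.futures import ProcessPoolExecutor

    def process_chunk(chunk):
        # Stand-in for the real per-chunk computation.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        # Split the request into smaller pieces.
        chunks = [data[i:i + 100_000] for i in range(0, len(data), 100_000)]
        with ProcessPoolExecutor() as pool:        # each worker handles a different part
            partial_results = list(pool.map(process_chunk, chunks))
        print("Combined result:", sum(partial_results))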

GRID COMPUTING, CLOUD COMPUTING AND DISTRIBUTED COMPUTING

Grid computing is a computing model involving a distributed architecture of multiple computers connected to solve a complex problem. In the grid computing model, servers or PCs run independent tasks and are linked loosely by the internet or low-speed networks. Individual participants can contribute some of their computer's processing time to help solve complex problems.

SETI@home is one example of a grid computing project. Although the project's first
phase wrapped up in March 2020, for more than 20 years, individual computer owners
volunteered some of their multitasking processing cycles -- while concurrently still using
their computers -- to the Search for Extraterrestrial Intelligence (SETI) project. This
computer-intensive problem used thousands of PCs to download and search radio
telescope data.

Grid computing and distributed computing are similar concepts that can be hard to tell
apart. Generally, distributed computing has a broader definition than grid computing.
Grid computing is typically a large group of dispersed computers working together to
accomplish a defined task. Conversely, distributed computing can work on numerous
tasks simultaneously. Some may also define grid computing as just one type of distributed
computing. In addition, while grid computing typically has well-defined architectural
components, distributed computing can have various architectures, such as grid, cluster
and cloud computing.

Cloud computing is also similar in concept to distributed computing. Cloud computing is a general term for anything that involves delivering hosted services over the internet. These services are divided into three main types: infrastructure as a service (IaaS), platform as a service (PaaS) and software as a service (SaaS). Cloud computing is also divided into private and public clouds. A public cloud sells services to anyone on the internet, while a private cloud is a proprietary network that supplies a hosted service to a limited number of people, with specific access and permissions settings. Cloud computing aims to provide easy, scalable access to computing resources and IT services.

Cloud and distributed computing both focus on spreading a service or services across a number of different machines; however, cloud computing typically offers a service, such as specific software or storage, for organizations to use in their own tasks, whereas distributed computing involves distributing services across different computers so that they work together on the same task.

EVOLUTION OF DISTRIBUTED COMPUTING:

In this article, we will trace the history of distributed computing systems from the mainframe era to the current day. It is important to understand the history of anything in order to track how far we have progressed. Distributed computing is essentially a story of evolution from centralization to decentralization: it shows how centralized systems evolved, step by step, toward decentralized ones. As early as 1955 we had centralized systems such as mainframes; today we use decentralized systems such as edge computing and containers.

Mainframe:

In the early years of computing, between 1960 and 1967, mainframe-based computing machines were considered the best solution for processing large-scale data, as they provided time-sharing to local clients who interacted with teletype terminals. This type of system conceptualized the client-server architecture: the client connects to and requests the server, and the server processes these requests, enabling a single time-sharing system to send multiple resources over a single medium amongst clients. The major drawback was that mainframes were quite expensive, and this led to the innovation of early disk-based storage and transistor memory.

Cluster Networks:

In the early 1970s came the development of packet switching and cluster computing, which was considered an alternative to mainframe systems, although it was still expensive. In cluster computing, the underlying hardware consists of a collection of similar workstations or PCs, closely connected by means of a high-speed local-area network, where each node runs the same operating system. Its purpose was to achieve parallelism. During 1967-1974 we also saw the creation of ARPANET, an early network that enabled global message exchange, allowing services to be hosted on remote machines across geographic bounds, independent of a fixed programming model. The TCP/IP protocol, which facilitated datagram and stream-oriented communication over a packet-switched autonomous network of networks, also came into existence. Communication was mainly through datagram transport.

Internet & PCs:

During this era, the evolution of the internet took place. New technology such as TCP/IP had begun to transform the Internet into several connected networks, linking local networks to the wider Internet. The number of hosts connected to the network began to grow rapidly, so centralized naming systems such as HOSTS.TXT could no longer scale. Hence the Domain Name System (DNS) came into existence in 1985, translating hosts' domain names into IP addresses. Early GUI-based computers utilizing WIMP (windows, icons, menus, pointers) interfaces were developed, which made computing feasible within the home, providing applications such as video games and web browsing to consumers.
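
As a small illustration of the name-to-address mapping that DNS provides, the Python sketch below resolves a host name to its IP addresses using the standard library; the host name is just an example.

    # Sketch: resolving a domain name to IP addresses -- the service DNS provides.
    import socket

    # "example.com" is only an illustrative name; any resolvable host works.
    for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 80, proto=socket.IPPROTO_TCP):
        print(family.name, "->", sockaddr[0])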

World Wide Web:

During the 1980s and 1990s, the creation of the HyperText Transfer Protocol (HTTP) and the HyperText Markup Language (HTML), developed by Tim Berners-Lee at CERN, resulted in the first web browsers, websites, and web servers. Standardization of TCP/IP provided the infrastructure for the interconnected network of networks known as the World Wide Web (WWW). This led to tremendous growth in the number of hosts connected to the Internet. As the number of PC-based application programs running on independent machines grew, communication between such programs became extremely complex, and application-to-application interaction became a growing challenge. With the advent of network computing, remote procedure calls (RPCs) over TCP/IP turned out to be a widely accepted way for application software to communicate. In this era, servers provide resources described by Uniform Resource Locators (URLs). Software applications running on a variety of hardware platforms, operating systems, and networks faced challenges when required to communicate with each other and share data. These demanding challenges led to the concept of distributed computing applications.
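
The exchange at the heart of the web described above can be sketched in a few lines of Python: a client asks for a resource named by a URL, and the server answers over HTTP. The URL is illustrative, not taken from the text.

    # Sketch: fetching a resource identified by a URL over HTTP.
    from urllib.request import urlopen

    with urlopen("http://example.com/") as response:          # illustrative URL
        html = response.read().decode("utf-8", errors="replace")
        print(response.status, "-", len(html), "characters of HTML received")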

P2P, Grids & Web Services:

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers without the requirement of a central coordinator. Peers share equal privileges; in a P2P network, each machine acts as both a client and a server. P2P file sharing was introduced in 1999, when American college student Shawn Fanning created the music-sharing service Napster. P2P networking enables a decentralized internet. With the introduction of grid computing, multiple tasks could be completed by computers jointly connected over a network. It basically makes use of a data grid, i.e., a set of computers that can directly interact with each other to perform similar tasks by using middleware. During 1994-2000, we also saw the creation of effective x86 virtualization. With the introduction of web services, platform-independent communication was established, using XML-based information exchange over the Internet for direct application-to-application interaction. Through web services, Java can talk with Perl and Windows applications can talk with Unix applications (see the sketch at the end of this section).

Peer-to-peer networks are often created by collections of 12 or fewer machines. Each of these computers maintains its own security for its data, but each also shares data with every other node. In peer-to-peer networks, the nodes both consume and produce resources; therefore, as the number of nodes grows, so does the network's capability for resource sharing. This is distinct from client-server networks, where an increase in nodes causes the server to become overloaded. It is challenging to give nodes in peer-to-peer networks proper security because they function as both clients and servers, and a denial-of-service attack may result from this. The majority of contemporary operating systems, including Windows and macOS, come with software to implement peer-to-peer networking.
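
As promised above, here is a minimal sketch of an XML-based web-service exchange using Python's built-in XML-RPC modules. The port number and the add function are assumptions made purely for illustration; the point is that the call travels between the two programs as XML over HTTP.

    # Sketch: XML-based application-to-application communication via XML-RPC.
    import threading
    import time
    from xmlrpc.server import SimpleXMLRPCServer
    from xmlrpc.client import ServerProxy

    def start_server():
        # Expose one function, then serve a single request and exit.
        server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
        server.register_function(lambda a, b: a + b, "add")
        server.handle_request()

    threading.Thread(target=start_server, daemon=True).start()
    time.sleep(0.2)                          # crude wait for the server to come up

    proxy = ServerProxy("http://127.0.0.1:8000/")
    print("2 + 3 =", proxy.add(2, 3))        # the request and reply are XML over HTTP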

Cloud, Mobile & IoT:

Cloud computing came about with the convergence of cluster technology, virtualization, and middleware. Through cloud computing, you can manage your resources and applications online over the internet without having to install them on your own hard drive or server. Its major advantage is that these resources can be accessed by anyone from anywhere in the world. Many cloud providers offer subscription-based services: after paying for a subscription, customers can access all the computing resources they need. Customers no longer need to update outdated servers, buy hard drives when they run out of storage, install software updates or buy software licenses; the vendor does all that for them. Mobile computing allows us to transmit data, such as voice and video, over a wireless network, so we no longer need to connect our mobile phones to switches. Some of the most common forms of mobile computing are smart cards, smartphones, and tablets. IoT emerged from mobile computing and uses sensors, processing ability, software, and other technologies to connect and exchange data with other devices and systems over the Internet.

Application Programming Interface (API) based communication over the REST model evolved because of the need for scalability, flexibility, portability, caching, and security. Instead of implementing these capabilities separately in each and every API, the requirement arose for a common component that applies these features on top of the API. This requirement led to the evolution of API management platforms, which today are one of the core features of any distributed system (a tiny REST-style exchange is sketched below). Instead of treating one computer as a single machine, the idea of having multiple systems within one computer came into existence. This led to the idea of virtual machines, where the same computer can act as multiple computers and run them all in parallel. Even though this was a good idea, it was not the best option when it comes to resource utilization of the host computer. Popular virtualization platforms available today include VMware Workstation, Microsoft Hyper-V, and Oracle VM VirtualBox.
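
The sketch below illustrates the REST-over-HTTP style referred to above: a tiny service exposes a single resource as JSON and a client fetches it. The path, port, and payload are assumptions for illustration; the cross-cutting concerns mentioned above (caching, security, and so on) are exactly what an API management layer would add in front of an endpoint like this.

    # Sketch: a tiny REST-style endpoint serving JSON, plus a client request.
    import json
    import threading
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import urlopen

    class StatusHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/status":                       # one illustrative resource
                body = json.dumps({"service": "inventory", "healthy": True}).encode()
                self.send_response(200)
                self.send_header("Content-Type", "application/json")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
            else:
                self.send_response(404)
                self.end_headers()

        def log_message(self, *args):                        # keep the sketch quiet
            pass

    server = HTTPServer(("127.0.0.1", 8080), StatusHandler)  # bound before the thread starts
    threading.Thread(target=server.handle_request, daemon=True).start()

    with urlopen("http://127.0.0.1:8080/status") as resp:
        print(json.load(resp))                               # {'service': 'inventory', 'healthy': True}
    server.server_close()
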
Fog and Edge Computing:

When the data produced by mobile computing and IoT services started to grow tremendously, collecting and processing millions of data points in real time was still an issue. This led to the concept of edge computing, in which client data is processed at the periphery of the network; it is largely a matter of location. Instead of moving data across a WAN such as the internet to a centralized data center, which may cause latency issues, the data is processed and analyzed closer to the point where it is created, such as on a corporate LAN. Fog computing greatly reduces the need for bandwidth by not sending every bit of information over cloud channels, and instead aggregating it at certain access points. This type of distributed strategy lowers costs and improves efficiency (a minimal aggregation sketch follows this paragraph). Companies like IBM are a driving force behind fog computing. The combination of fog and edge computing further extends the cloud computing model away from centralized stakeholders to decentralized multi-stakeholder systems that are capable of providing ultra-low service response times and increased aggregate bandwidth.
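
A heavily simplified Python sketch of the aggregation idea described above: raw readings produced at the edge are summarized locally, and only the compact summary would cross the WAN to the cloud. The sensor data and field names are simulated purely for illustration.

    # Sketch: fog-style aggregation -- summarize raw edge data, forward only the summary.
    import random
    import statistics

    def read_sensor_batch(n=1000):
        # Stand-in for readings produced by devices at the edge of the network.
        return [random.gauss(21.0, 0.5) for _ in range(n)]

    def aggregate(readings):
        # Only this compact summary would be sent over the WAN, not every raw point.
        return {
            "count": len(readings),
            "mean": round(statistics.mean(readings), 3),
            "max": round(max(readings), 3),
        }

    raw = read_sensor_batch()
    print("raw points:", len(raw), "-> forwarded payload:", aggregate(raw))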

The idea of using containers became prominent once you could put your application and all of its relevant dependencies into a container image that can run in any environment whose host operating system supports containers. This concept became more popular and improved a lot with the introduction of container-based application deployment. Containers behave much like virtual machines but without the overhead of a separate operating system. Docker is the most popular platform for building and running containers, and Kubernetes is the most popular platform for orchestrating them: together they provide the ability to run containers in large clusters and to manage communication between the services running in them.

Today, distributed systems are programmed by application developers, while the underlying infrastructure management is done by a cloud provider. This is the current state of distributed computing, and it keeps on evolving.

DISTRIBUTED COMPUTING MODELS:

A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages to each other. The following are the categories of distributed computing system models:
i. Minicomputer Model
ii. Workstation Model
iii. Workstation – Server Model
iv. Processor – Pool Model
v. Hybrid Model

i. Minicomputer Model:

The minicomputer model is a simple extension of the centralized time-sharing system. A distributed computing system based on this model consists of a few minicomputers (they may be large supercomputers as well) interconnected by a communication network. Each minicomputer usually has multiple users simultaneously logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on to one specific minicomputer, with remote access to other minicomputers.

The network allows a user to access remote resources that are available on some machine other than the one to which the user is currently logged on. The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired. The early ARPANET is an example of a distributed computing system based on the minicomputer model.

ii. Workstation Model:

A distributed computing system based on the workstation model consists of several workstations interconnected by a communication network. An organization may have several workstations located throughout a building or campus, each workstation equipped with its own disk and serving as a single-user computer.

It has often been found that in such an environment, at any one time a significant proportion of the workstations are idle (not being used), resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to process jobs of users who are logged onto other workstations and do not have sufficient processing power at their own workstations to get their jobs processed efficiently.

iii. Workstation – Server Model:

The workstation model is a network of personal workstations, each with its own disk and
a local file system. A workstation with its own local disk is usually called a diskful
workstation and a workstation without a local disk is called a diskless workstation.
With the proliferation of high-speed networks, diskless workstations have become more
popular in network environments than diskful workstations, making the workstation-
server model more popular than the workstation model for building distributed computing
systems.

A distributed computing system based on the workstation-server model consists of a few minicomputers and several workstations (most of which are diskless, but a few of which may be diskful) interconnected by a communication network.
Note that when diskless workstations are used on a network, the file system to be used by these workstations must be implemented either by a diskful workstation or by a minicomputer equipped with a disk for file storage. One or more of the minicomputers are used for implementing the file system.

Other minicomputers may be used for providing other types of services, such as database service and print service. Therefore, each minicomputer is used as a server machine to provide one or more types of services. Thus, in the workstation-server model, in addition to the workstations, there are specialized machines (which may be specialized workstations) for running server processes (called servers) that manage and provide access to shared resources.

For a number of reasons, such as higher reliability and better scalability, multiple servers
are often used for managing the resources of a particular type in a distributed computing
system.

For example, there may be multiple file servers, each running on a separate minicomputer
and cooperating via the network, for managing the files of all the users in the system. Due
to this reason, a distinction is often made between the services that are provided to clients
and the servers that provide them. That is, a service is an abstract entity that is provided
by one or more servers. For example, one or more file servers may be used in a distributed
computing system to provide file service to the users.

In this model, a user logs onto a workstation called his or her home workstation. Normal computation activities required by the user's processes are performed at the user's home workstation, but requests for services provided by special servers (such as a file server or a database server) are sent to a server providing that type of service; the server performs the user's requested activity and returns the result of the request processing to the user's workstation. Therefore, in this model, the user's processes need not be migrated to the server machines to get the work done by those machines.

iv. Processor-Pool Model:

An alternative approach is to construct a processor pool, full of Central Processing Units (CPUs) within the machine room, which can be allocated dynamically to users on demand.

Instead of giving individual workstations to users, the processor-pool model gives users high-performance graphics terminals. This approach is based on the observation that what users really want is good performance and a high-quality graphical interface. The pool is built with low-cost microprocessors, and the concept is much closer to traditional time-sharing than to the personal computer (PC) model.
The processor-pool model consists of multiple processors and a group of workstations. The model is based on the observation that most of the time a user does not need any computing power, but occasionally may need a very large amount of it for a short time.

In this model, the processors are pooled together to be shared by the users as needed. The pool of processors consists of a large number of microcomputers and minicomputers attached to the network. Each processor has its own memory in which to load and run programs. The processors in the pool have no terminals attached directly to them; instead, users access the system from terminals that are attached to the network via special devices.

v. Hybrid Model:

Of the four models described above, the workstation-server model is the most widely used model for building distributed computing systems. This is because a large number of computer users only perform simple interactive tasks such as editing jobs, sending electronic mail, and executing small programs. The workstation-server model is ideal for such simple usage. However, in a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more attractive and suitable. The hybrid model therefore combines the two: it is based on the workstation-server model but adds a pool of processors, so that jobs requiring massive computation can run on processors allocated from the pool while interactive jobs continue to be handled by the users' workstations.
