Introduction of Distributed Computing
In addition to the three-tier model, other types of distributed computing include client-server, n-tier, and peer-to-peer architectures. Client-server architectures use smart clients that contact a server for data, then format and display that data to the user. N-tier architectures are used in application servers; these architectures use web applications to forward requests to other enterprise services. Peer-to-peer architectures divide all responsibilities among the peer computers, each of which can serve as a client or a server.
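As a rough illustration of the client-server pattern, here is a minimal sketch in Python using the standard socket module. The address, port, and message are hypothetical, and a real deployment would add error handling and a proper protocol; this only shows the shape of the interaction.

import socket
import threading

# Server side: bind and listen before any client tries to connect.
srv = socket.create_server(("127.0.0.1", 9000))  # hypothetical local endpoint

def serve():
    conn, _ = srv.accept()                       # wait for one client
    with conn:
        conn.sendall(b"raw data from server")    # server supplies the data

threading.Thread(target=serve, daemon=True).start()

# Client side: a "smart client" contacts the server for data,
# then formats and displays that data to the user.
with socket.create_connection(("127.0.0.1", 9000)) as sock:
    data = sock.recv(1024)
    print("Display to user:", data.decode())
srv.close()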
SETI@home is one example of a grid computing project. Although the project's first phase wrapped up in March 2020, for more than 20 years individual computer owners volunteered some of their spare processing cycles -- while concurrently still using their computers -- to the Search for Extraterrestrial Intelligence (SETI) project. This compute-intensive project used thousands of PCs to download and search radio telescope data.
Grid computing and distributed computing are similar concepts that can be hard to tell
apart. Generally, distributed computing has a broader definition than grid computing.
Grid computing is typically a large group of dispersed computers working together to
accomplish a defined task. Conversely, distributed computing can work on numerous
tasks simultaneously. Some may also define grid computing as just one type of distributed
computing. In addition, while grid computing typically has well-defined architectural
components, distributed computing can have various architectures, such as grid, cluster
and cloud computing.
In this article, we will trace the history of distributed computing systems from the mainframe era to the current day, to the best of my knowledge. It is important to understand the history of anything in order to track how far we have progressed. Distributed computing is fundamentally a story of evolution from centralization to decentralization: it depicts how centralized systems gradually evolved toward decentralized ones. We had centralized systems such as mainframes in the mid-1950s, but today we mostly use decentralized systems such as edge computing and containers.
Mainframe:
In the mid-1950s, computing was centralized on mainframes: large, expensive machines that many users shared through directly attached terminals, with all processing and data kept in one place.
Cluster Networks:
In the early 1970s came the development of packet switching and of cluster computing, which was considered an alternative to mainframe systems although it was still expensive. In cluster computing, the underlying hardware consists of a collection of similar workstations or PCs, closely connected by means of a high-speed local-area network, where each node runs the same operating system. Its purpose was to achieve parallelism. Between 1967 and 1974, we also saw the creation of ARPANET, an early network that enabled global message exchange, allowing services to be hosted on remote machines across geographic boundaries, independent of a fixed programming model. The TCP/IP protocol suite, which facilitated datagram and stream-oriented communication over a packet-switched autonomous network of networks, also came into existence. Communication was mainly through datagram transport.
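To make "datagram transport" concrete, here is a small sketch using Python's standard socket module: each sendto() is one self-contained packet, with no connection setup and no delivery guarantee, in contrast to a TCP stream. The port number is arbitrary and chosen only for illustration.

import socket

# Receiver: a UDP socket bound to a local port (arbitrary choice).
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 9001))

# Sender: one independent datagram, not part of any stream.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"one datagram", ("127.0.0.1", 9001))

data, addr = recv_sock.recvfrom(1024)   # one packet in, one packet out
print(data.decode(), "from", addr)
send_sock.close()
recv_sock.close()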
During this era, the internet evolved. New technology such as TCP/IP had begun to transform the Internet into a network of interconnected networks, linking local networks to the wider Internet. The number of hosts connected to the network therefore began to grow rapidly, and centralized naming systems such as HOSTS.TXT could no longer scale. Hence the Domain Name System (DNS) came into existence in 1985, translating hosts' domain names into IP addresses. Early GUI-based computers utilizing WIMP (windows, icons, menus, pointers) interfaces were also developed, making computing within the home feasible and bringing applications such as video games and, later, web browsing to consumers.
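The name-to-address translation that DNS performs can be seen in one line of Python. socket.gethostbyname asks the system resolver to turn a domain name into an IPv4 address; the domain below is just an example, and the call needs network access.

import socket

# Resolve a domain name to an IPv4 address via the system's DNS resolver,
# the job that a single HOSTS.TXT file could no longer do at scale.
print(socket.gethostbyname("example.com"))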
During the 1980s and 1990s, the creation of the HyperText Transfer Protocol (HTTP) and the HyperText Markup Language (HTML), both developed by Tim Berners-Lee at CERN, resulted in the first web browsers, websites, and web servers. Standardization of TCP/IP provided the infrastructure for the interconnected network of networks on which the World Wide Web (WWW) was built. This led to tremendous growth in the number of hosts connected to the Internet. As the number of PC-based application programs running on independent machines grew, communication between such programs became extremely complex, and application-to-application interaction became a growing challenge. With the advent of network computing, remote procedure calls (RPCs) over TCP/IP became a widely accepted way for application software to communicate. In this era, servers provided resources described by Uniform Resource Locators (URLs). Software applications running on a variety of hardware platforms, operating systems, and networks still faced challenges when required to communicate with each other and share data. These demanding challenges led to the concept of distributed computing applications.
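A minimal sketch of an RPC exchange, using Python's standard xmlrpc modules (one of many RPC mechanisms; the port and the procedure name "add" are hypothetical): the caller invokes the procedure as if it were a local function, while the call actually travels over TCP/IP.

from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy
import threading

# Server: register one remote procedure and serve it over TCP/IP.
server = SimpleXMLRPCServer(("127.0.0.1", 9002), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client: the remote procedure is called like a local one.
proxy = ServerProxy("http://127.0.0.1:9002")
print(proxy.add(2, 3))    # prints 5, computed by the "remote" server
server.shutdown()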
When the data produced by mobile computing and IoT services started to grow tremendously, collecting and processing millions of data points in real time was still an issue. This led to the concept of edge computing, in which client data is processed at the periphery of the network; it is largely a matter of location. Instead of being moved across a WAN such as the internet to a centralized data center, which may cause latency issues, data is processed and analyzed closer to the point where it is created, such as on a corporate LAN. Fog computing greatly reduces the need for bandwidth by not sending every bit of information over cloud channels, instead aggregating it at certain access points. This type of distributed strategy lowers costs and improves efficiency. Companies like IBM are a driving force behind fog computing. The combination of fog and edge computing further extends the cloud computing model away from centralized stakeholders to decentralized multi-stakeholder systems, which are capable of providing ultra-low service response times and increased aggregate bandwidth.
The idea of using containers became prominent once you could put your application and all its relevant dependencies into a container image that can be run on any environment whose host operating system supports containers. This concept became more popular and improved considerably with the introduction of container-based application deployment. Containers can act much like virtual machines without the overhead of a separate operating system. Docker and Kubernetes are the two most popular container platforms: Docker for building and running container images, and Kubernetes for orchestrating large clusters of containers and the communication between services running in them.
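As a hedged sketch of running an application in a container programmatically, the following uses the Docker SDK for Python; it assumes `pip install docker`, a running Docker daemon, and access to the public alpine image, none of which are part of the article itself.

import docker  # Docker SDK for Python; assumes a reachable Docker daemon

client = docker.from_env()   # connect to the local Docker daemon
# Run a throwaway container: the image bundles the app and its dependencies,
# so the same command works on any host that can run containers.
output = client.containers.run("alpine",
                               ["echo", "hello from a container"],
                               remove=True)
print(output.decode())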
Today, distributed systems are programmed by application programmers, while the underlying infrastructure is managed by a cloud provider. This is the current state of distributed computing, and it keeps evolving.
i. Minicomputer Model:
In this model, a few minicomputers are interconnected by a communication network, and several users may be logged on to each machine at the same time. The network allows a user to access remote resources that are available on some machine other than the one onto which the user is currently logged. The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired. The early ARPANET is an example of a distributed computing system based on the minicomputer model.
ii. Workstation Model:
The workstation model consists of several workstations interconnected by a communication network. It has often been found that in such an environment, at any one time a significant proportion of the workstations are idle (not being used), resulting in the waste of large amounts of CPU time. Therefore, the idea of the workstation model is to interconnect all these workstations by a high-speed LAN so that idle workstations may be used to process jobs of users who are logged onto other workstations and do not have sufficient processing power at their own workstations to get their jobs processed efficiently.
iii. Workstation – Server Model:
The workstation model is a network of personal workstations, each with its own disk and
a local file system. A workstation with its own local disk is usually called a diskful
workstation and a workstation without a local disk is called a diskless workstation.
With the proliferation of high-speed networks, diskless workstations have become more
popular in network environments than diskful workstations, making the workstation-
server model more popular than the workstation model for building distributed computing
systems.
In this model, a few minicomputers are also attached to the network; one or more of them are used to implement the file system, while other minicomputers may be used for providing other types of services, such as database service and print service. Each minicomputer is thus used as a server machine to provide one or more types of services. Therefore, in the workstation-server model, in addition to the workstations, there are specialized machines (possibly specialized workstations) for running server processes (called servers) that manage and provide access to shared resources.
For a number of reasons, such as higher reliability and better scalability, multiple servers
are often used for managing the resources of a particular type in a distributed computing
system.
For example, there may be multiple file servers, each running on a separate minicomputer
and cooperating via the network, for managing the files of all the users in the system. Due
to this reason, a distinction is often made between the services that are provided to clients
and the servers that provide them. That is, a service is an abstract entity that is provided
by one or more servers. For example, one or more file servers may be used in a distributed
computing system to provide file service to the users.
In this model, a user logs onto a workstation called his or her home workstation. Normal
computation activities required by the user’s processes are performed at the user’s home
workstation, but requests for services provided by special servers (such as a file server or
a database server) are sent to a server providing that type of service, which performs the user's requested activity and returns the result of the request processing to the user's workstation. Therefore, in this model, the user's processes need not be migrated to the server machines to get the work done by those machines.
iv. Processor Pool Model:
Instead of giving users individual workstations, the processor pool model gives them high-performance graphics terminals. This approach is based on the observation that what users really want is good performance and a high-quality graphical interface. Because the model pools low-cost microprocessors and shares them among all users, it is much closer to traditional time-sharing than to the personal computer (PC) model.
The processor pool model consists of multiple processors and a group of terminals. The model is based on the observation that most of the time a user does not need any computing power, but once in a while may need a very large amount of it for a short time.
In this model, the processors are pooled together to be shared by the users as needed.
The pool of processors consists of a large number of microcomputers and minicomputers attached to the network. Each processor in the pool has its own memory to load and run a system or application program. The processors in the pool have no terminals attached directly to them, and users access the system from terminals that are attached to the network via special devices.
v. Hybrid Model:
Out of the four models described above, the workstation-server model is the most widely used model for building distributed computing systems. This is because a large number of computer users only perform simple interactive tasks such as editing jobs, sending electronic mail, and executing small programs. The workstation-server model is ideal for such simple usage. However, in a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more attractive and suitable. The hybrid model combines the advantages of both: it is based on the workstation-server model but also includes a pool of processors, so that jobs requiring massive computation can be run on the pooled processors while interactive jobs are handled by the users' workstations.