Distributed Computing
Distributed computing is a model in which components of a software system are shared among
multiple computers or nodes. Even though the software components may be spread out across
multiple computers in multiple locations, they're run as one system. This is done to improve
efficiency and performance.
Distributed hardware cannot use a shared memory due to being physically separated, so the
participating computers exchange messages and data (computation results) over a network.
This inter-machine communication occurs locally over an intranet (e.g., within a data
center) or across the country and around the world via the internet.
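As a minimal sketch of this message exchange (the host, port, and JSON payload format here are illustrative assumptions, not a prescribed protocol), two nodes might pass computation results over TCP as follows:

```python
# Minimal sketch of node-to-node message passing over TCP; the host,
# port, and JSON payload format are illustrative assumptions.
import json
import socket

def send_result(host: str, port: int, result: dict) -> None:
    """Send a computation result to a peer node as one JSON message."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(json.dumps(result).encode("utf-8"))

def receive_result(port: int) -> dict:
    """Accept one connection from a peer node and decode its message."""
    with socket.create_server(("0.0.0.0", port)) as server:
        conn, _addr = server.accept()
        with conn:
            data = conn.recv(65536)
    return json.loads(data.decode("utf-8"))

# e.g. one node calls receive_result(9000) while another calls
# send_result("peer-host", 9000, {"partial_sum": 4950})
```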
Distributed applications often use a client-server architecture, in which clients and servers
divide the work, each covering certain application functions with the software installed on
them. Consider a product search: the client acts as the input instance and user interface,
receiving the user's request and processing it so that it can be sent on to a server. The remote
server then carries out the main part of the search, querying a database. The results are
prepared on the server side, sent back to the client over the network, and finally displayed on
the user's screen.
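A hedged sketch of this request/response flow is shown below; the catalog contents, port, and query parameter name are hypothetical placeholders:

```python
# Hedged sketch of the search flow above; the catalog contents, port,
# and query parameter name are hypothetical placeholders.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

CATALOG = [{"name": "laptop"}, {"name": "laptop stand"}, {"name": "mouse"}]

class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Server side: carry out the main part of the search ...
        query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        hits = [p for p in CATALOG if query in p["name"]]
        # ... and prepare the results to be sent back to the client.
        body = json.dumps(hits).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# Client side (run on another machine): send the processed request,
# then display the results returned over the network.
#   import urllib.request
#   with urllib.request.urlopen("http://server:8000/?q=laptop") as resp:
#       print(json.loads(resp.read()))

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), SearchHandler).serve_forever()
```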
Distributed computing is a much broader technology that has been around for more than three
decades now. Simply stated, distributed computing is computing over distributed autonomous
computers that communicate only over a network. Distributed computing systems are usually
treated differently from parallel computing systems or shared-memory systems, where multiple
computers share a common memory pool that is used for communication between the processors.
Distributed memory systems use multiple computers to solve a common problem, with
computation distributed among the connected computers (nodes) and using message-passing to
communicate between the nodes.
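A common way to express such message passing is MPI; the sketch below uses mpi4py (assuming an MPI runtime is available), with each rank standing in for a node that has its own private memory:

```python
# Illustrative point-to-point message passing with mpi4py, assuming an
# MPI runtime is installed; run with e.g. `mpiexec -n 2 python demo.py`.
# Each rank stands in for a node with its own private memory.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Node 0 computes a partial result and ships it to node 1.
    partial = sum(range(1_000))
    comm.send(partial, dest=1, tag=0)
elif rank == 1:
    # Node 1 receives the message and combines it with its own work;
    # no memory is shared between the two processes.
    received = comm.recv(source=0, tag=0)
    print("combined:", received + sum(range(1_000, 2_000)))
```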
For example, grid computing, studied in the previous section, is a form of distributed computing
where the nodes may belong to different administrative domains. Another example is the
network-based storage virtualization solution described in an earlier section in this chapter,
which used distributed computing between data and metadata servers. Distributed computing, a
method of running programs across several computers on a network, is becoming a popular way
to meet the demands for higher performance in both high-performance scientific computing and
more "general-purpose" applications.
There are many reasons for the increasing acceptance and adoption of distributed
computing, such as performance, the availability of computers to connect, fault tolerance, and
resource sharing. By connecting several machines together, more computation power,
memory, and I/O bandwidth can be accessed. Distributed computing can be implemented in a
variety of ways. For example, groups of workstations interconnected by an appropriate high-
speed network (commonly called a cluster) can provide supercomputer-level computational
power. Combustion simulation, for instance, is important in hydrodynamics and computer
graphics but demands a great deal of computation; such a workload can be sped up by
pipelining it across a distributed system.
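The sketch below illustrates the general idea of pipelining a computation across workers, not any specific method; the advance and render stages are hypothetical stand-ins for simulation phases:

```python
# Generic sketch of pipelined computation, not any specific paper's
# method; `advance` and `render` are hypothetical simulation stages.
from multiprocessing import Process, Queue

def advance(state: int) -> int:
    return state + 1            # placeholder for one simulation step

def render(state: int) -> int:
    return state * 2            # placeholder for a postprocessing step

def stage(worker, inbox: Queue, outbox: Queue) -> None:
    """Apply `worker` to each item until a None sentinel arrives."""
    while (item := inbox.get()) is not None:
        outbox.put(worker(item))
    outbox.put(None)            # propagate shutdown downstream

if __name__ == "__main__":
    q_in, q_mid, q_out = Queue(), Queue(), Queue()
    Process(target=stage, args=(advance, q_in, q_mid)).start()
    Process(target=stage, args=(render, q_mid, q_out)).start()
    for frame in range(5):      # while stage 2 renders frame k,
        q_in.put(frame)         # stage 1 already advances frame k+1
    q_in.put(None)
    while (result := q_out.get()) is not None:
        print(result)
```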
Developing applications for distributed memory machines is much more involved than for
traditional sequential machines. Sometimes new algorithms must be developed to solve even a
well-known problem, such as sorting huge sequences of numbers. To ease the burden on
programmers, parallelizing compilers exist that convert sequential programs written for
traditional computers into distributed message-passing programs, particularly for clusters of
SMPs (symmetric multiprocessors).
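To illustrate why even a well-known problem changes shape, the sketch below sorts data as distributed-memory machines typically must, with processes standing in for separate machines:

```python
# Sketch of how sorting changes shape on distributed memory: each
# "node" (a process here) sorts its local chunk, then sorted runs are
# merged. Real systems redistribute data between machines instead.
import heapq
from multiprocessing import Pool

def sort_chunk(chunk):
    return sorted(chunk)        # local, in-memory sort on one node

if __name__ == "__main__":
    data = [9, 1, 7, 3, 8, 2, 6, 4, 5, 0]
    chunks = [data[0::2], data[1::2]]   # pretend each lives on a node
    with Pool(len(chunks)) as pool:
        runs = pool.map(sort_chunk, chunks)
    print(list(heapq.merge(*runs)))     # merge the sorted runs
```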
Distributed computing, however, can include heterogeneous computations: some nodes may
perform a great deal of computation, some very little, and a few others may provide
specialized functionality (such as processing visual graphics). One of the main advantages
of distributed computing (versus supercomputers such as Crays, where thousands of processors
are housed in a rack and communicate through shared memory) is that efficient, scalable
programs can be designed so that independent processes are scheduled on different nodes and
communicate only occasionally to exchange results, as opposed to working out of a shared
memory with multiple simultaneous accesses to a common memory.
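The pattern can be sketched as follows, with processes standing in for nodes; each task runs independently, and results are exchanged only once, at the end:

```python
# Sketch of the "communicate only occasionally" pattern: independent
# tasks share no memory while running, and results are exchanged once.
from concurrent.futures import ProcessPoolExecutor

def simulate(seed: int) -> int:
    # an independent unit of work; no coordination while it runs
    return sum(i * seed for i in range(100_000))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(simulate, range(8)))
    print("results exchanged once, at the end:", sum(results))
```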
With that description, it is probably obvious that cloud computing is also a specialized form of
distributed computing, in which distributed SaaS applications use thin clients (such as
browsers) that offload computation to cloud-hosted servers and services. Additionally, cloud-
computing vendors providing IaaS and PaaS solutions may internally use distributed
computing to provide highly scalable, cost-effective infrastructure and platforms.
Distributed computing studies the models, architectures, and algorithms used for building and
managing distributed systems. As a general definition of the term distributed system, we use
the one proposed by Tanenbaum: a distributed system is a collection of independent computers
that appears to its users as a single coherent system.
This definition is general enough to include various types of distributed computing systems that
are especially focused on unified usage and aggregation of distributed resources. In this chapter,
we focus on the architectural models that are used to harness independent computers and present
them as a whole coherent system.
Distributed computing is a foundational model for cloud computing because cloud systems are
distributed systems. Besides administrative tasks, mostly connected to the accessibility of
resources in the cloud, the extreme dynamism of cloud systems, where new nodes and services
are provisioned on demand, constitutes the major challenge for engineers and developers.
This characteristic is peculiar to cloud computing solutions and is mostly addressed at
the middleware layer of the computing system. Infrastructure-as-a-Service solutions provide
the capabilities to add and remove resources, but it is up to those who deploy systems on this
scalable infrastructure to use such opportunities wisely and effectively.
Platform-as-a-Service solutions embed into their core offering algorithms and rules that control
the provisioning process and the lease of resources. These can be either completely transparent to
developers or subject to fine control. Integration between cloud resources and existing system
deployment is another element of concern.
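A hedged sketch of the kind of provisioning rule such a platform might embed is shown below; the utilization thresholds and node-count bounds are invented for illustration:

```python
# Hedged sketch of a provisioning rule a platform might embed; the
# utilization thresholds and node-count bounds are invented here.
def autoscale(current_nodes: int, avg_utilization: float,
              min_nodes: int = 2, max_nodes: int = 20) -> int:
    """Return the node count after one control-loop iteration."""
    if avg_utilization > 0.80 and current_nodes < max_nodes:
        return current_nodes + 1    # provision a node on demand
    if avg_utilization < 0.30 and current_nodes > min_nodes:
        return current_nodes - 1    # release an idle node
    return current_nodes            # steady state

# e.g. autoscale(4, 0.92) -> 5, while autoscale(4, 0.10) -> 3
```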
A distributed computer system consists of multiple software components that are on multiple
computers, but run as a single system. The computers that are in a distributed system can be
physically close together and connected by a local network, or they can be geographically distant
and connected by a wide area network. A distributed system can consist of any number of
possible configurations, such as mainframes, personal computers, workstations, minicomputers,
and so on. The goal of distributed computing is to make such a network work as a single
computer.
Distributed computing systems can run on hardware that is provided by many vendors, and can
use a variety of standards-based software components. Such systems are independent of the
underlying software. They can run on various operating systems, and can use various
communications protocols. Some hardware might use UNIX or Linux as the operating system,
while other hardware might use Windows operating systems. For intermachine communications,
this hardware can use SNA or TCP/IP on Ethernet or Token Ring.
As the name implies, a distributed system is a computing system in which the various
components or nodes are spread out across a network of computers (or virtual machines,
containers, or any node that can connect and handle basic tasks). Though physically separated,
the nodes are linked together and pool their resources to maximize efficiency when running a
program.
Practically every form of computer network can function as a distributed system in some sense,
but early forms of distributed systems were difficult to set up and maintain. In the modern
environment, distributed systems are typically cloud-based, operating over the internet. Most
SaaS applications function as distributed systems: end users access what appears to be a single
application running in one place, while cloud computing power behind the scenes processes
their tasks. When you use a ridesharing app like Uber or Lyft, your only interaction with the
system is an app on your phone, but hundreds if not thousands of components elsewhere work
together to deliver rides.
Distributed computing in simple words can be defined as a group of computers that are working
together at the backend while appearing as one to the end-user. The individual computers
working together in such groups operate concurrently and allow the whole system to keep
working if one or some of them fail.
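One simple form of this fault tolerance is client-side failover, sketched below; the replica hostnames and port are placeholders:

```python
# Sketch of client-side failover: try replicas in turn so the system
# keeps working when some nodes fail. Hostnames/port are placeholders.
import socket

REPLICAS = [("node-a.internal", 9000),
            ("node-b.internal", 9000),
            ("node-c.internal", 9000)]

def query_any(payload: bytes, timeout: float = 1.0) -> bytes:
    """Return the first reachable replica's response."""
    for host, port in REPLICAS:
        try:
            with socket.create_connection((host, port), timeout=timeout) as c:
                c.sendall(payload)
                return c.recv(65536)
        except OSError:
            continue    # this node is down; fail over to the next one
    raise RuntimeError("all replicas unavailable")
```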
In a distributed system, multiple computers can host different software components, but all the
computers work to accomplish a common goal. The computers in a distributed system can be
physically located in the same place, connected via a local network, or spread across distant
locations connected by a wide area network. Distributed systems can also consist of different
configurations, or a combination of configurations, such as personal computers, workstations,
and mainframes.
Scalability
The biggest issue with vertical scaling is that even the best and most expensive hardware
proves insufficient after a certain point. Horizontal scaling, on the other hand, allows
increasing traffic and performance demands to be met by adding more computers instead of
constantly upgrading a single system.
The initial costs of horizontal scaling might be higher, but beyond a certain point it becomes
far more efficient: costs associated with vertical scaling start to rise sharply past that point,
which makes horizontal scaling the better option beyond that threshold. Vertical scaling may
therefore not be suitable for tech companies dealing with big data and very high workloads.
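The crossover can be illustrated with a toy cost model; the coefficients below are made-up numbers chosen only to show the shape of the two curves:

```python
# Toy cost model with made-up coefficients, only to show the shape of
# the two curves: vertical cost grows superlinearly, horizontal roughly
# linearly after a fixed overhead.
def vertical_cost(capacity_units: int) -> float:
    return 100 * capacity_units ** 1.8     # ever-bigger single machine

def horizontal_cost(capacity_units: int) -> float:
    return 500 + 150 * capacity_units      # overhead + commodity nodes

for c in (1, 5, 10, 20):
    print(c, round(vertical_cost(c)), round(horizontal_cost(c)))
# Past some threshold the vertical curve overtakes the horizontal one.
```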
Complexity
Distributed computing systems are more difficult to deploy, maintain, and troubleshoot or
debug than their centralized counterparts. The increased complexity is not limited to the
hardware; distributed systems also need software capable of handling security and
communications.
Security Concerns
Data access can be controlled fairly easily in a centralized computing system, but managing
the security of a distributed system is no easy job. Not only does the network itself have to be
secured; users also need to control access to data replicated across multiple locations.
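One common building block for managing replicated data, sketched here with hypothetical site names, is comparing content digests to detect replicas that have diverged:

```python
# Sketch of one replication check, with hypothetical site names:
# comparing content digests to spot replicas that have diverged.
import hashlib

def digest(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

replicas = {"site-a": b"customer-records-v42",
            "site-b": b"customer-records-v42",
            "site-c": b"customer-records-v41"}   # stale or tampered copy

digests = {site: digest(blob) for site, blob in replicas.items()}
if len(set(digests.values())) > 1:
    print("replica mismatch detected:", digests)
```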
Conclusion
Distributed computing helps improve the performance of large-scale projects by combining the
power of multiple machines. It is much more scalable and allows users to add computers
according to growing workload demands. Although distributed computing has its own
disadvantages, it offers unmatched scalability, better overall performance, and greater
reliability, which makes it a better solution for businesses dealing with high workloads and big
data.