Week 1


Week 1: Introduction to Distributed Systems

 Lesson 1: Introduction to Distributed Systems
    Definition and characteristics of distributed systems.
    Importance in modern computing.
    Challenges and advantages.
 Lesson 2: Architectural Models
    Centralized vs. Distributed Systems.
    Client-server and peer-to-peer architectures.
    Hybrid and cloud-based architectures.

 Class Activity: Discuss examples of distributed systems in everyday life.

What is a Distributed System?


A distributed system is a collection of autonomous computer systems that are physically
separated but connected by a computer network and equipped with distributed system
software. The autonomous computers communicate with one another by sharing resources and
files and by performing the tasks assigned to them.

Example of a Distributed System:


A social media platform, for example, can treat the computer network at its headquarters as
the centralized network, while the computer systems through which users access its services
act as the autonomous systems in the distributed system architecture.

 Distributed System Software: This software enables computers to coordinate their
activities and to share resources such as hardware, software, and data.
 Database: It stores the data processed by each node/system of the distributed system
that is connected to the centralized network.
 Each autonomous system runs a common application, and its data is shared through the
centralized database system.
 To transfer data to the autonomous systems, the centralized system needs a middleware
service and a connection to a network.
 Middleware services provide functions that the local systems or the centralized system
do not offer by default, acting as an interface between the centralized system and the
local systems. Systems use middleware components to communicate and manage data.
 The data transferred from the database is divided into segments or modules and shared
with the autonomous systems for processing.
 The processed data is then sent back to the centralized system through the network
and stored in the database.
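The flow described in the last two points (divide the data into segments, let the autonomous systems process them, then collect the results centrally) can be sketched as follows. This is a minimal illustration, not a real deployment: worker threads stand in for separate machines, a dictionary stands in for the database, and the names split_into_segments, process_segment, and run_job are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(data, num_nodes):
    """Divide the data into roughly equal segments, at most one per node."""
    if not data:
        return []
    size = -(-len(data) // num_nodes)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_segment(segment):
    """Work done by one autonomous node (assumed task: sum of squares)."""
    return sum(x * x for x in segment)


def run_job(data, num_nodes=3):
    """Scatter segments to the nodes, then gather and store the results."""
    segments = split_into_segments(data, num_nodes)
    # Each worker thread stands in for a separate machine on the network.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_results = list(pool.map(process_segment, segments))
    # The central system stores the results; a dict plays the database here.
    return {"partials": partial_results, "total": sum(partial_results)}
```

For example, run_job(list(range(10))) splits the ten numbers into three segments and combines three partial sums into one total.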

Characteristics of Distributed System:


 Resource Sharing: The ability to use any hardware, software, or data anywhere in the
system.
 Openness: How easily the system can be extended and improved, which depends on how
openly its software and interfaces are developed and shared with others.
 Concurrency: It is naturally present in distributed systems, because the same activity
or function can be performed at the same time by separate users in remote locations.
Every local system has its own independent operating system and resources.
 Scalability: The system can grow, for example by adding processors, to serve more
users while keeping the system responsive.
 Fault tolerance: It concerns the reliability of the system. If hardware or software
fails, the system continues to operate properly, ideally without degrading the
performance of the system.
 Transparency: It hides the complexity of the distributed system from users and
application programs, so that the system appears to them as a single coherent whole.
 Heterogeneity: Networks, computer hardware, operating systems, programming languages,
and developer implementations can all differ among the components of the system.
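Fault tolerance and transparency can be illustrated together with a small failover sketch. It assumes each piece of data is copied to several replicas; the Replica class, the NodeDown exception, and fault_tolerant_read are hypothetical names used only for this example.

```python
class NodeDown(Exception):
    """Raised when a replica cannot be reached."""


class Replica:
    """One copy of the data, held by one node."""

    def __init__(self, name, data, up=True):
        self.name = name
        self.data = data
        self.up = up

    def read(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.data[key]


def fault_tolerant_read(replicas, key):
    """Try each replica in turn; the caller never sees which one answered."""
    for replica in replicas:
        try:
            return replica.read(key)
        except NodeDown:
            continue  # mask the failure and try the next copy
    raise RuntimeError("all replicas are down")
```

The caller gets its answer as long as at least one replica is up, which is the sense in which the failure is hidden from users.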

Advantages of Distributed System:


 Many applications are inherently distributed, so a distributed system suits them
naturally.
 Information is shared among geographically distributed users.
 Resource sharing (autonomous systems can share resources from remote locations).
 It has a better price-performance ratio and more flexibility.
 It has a shorter response time and higher throughput.
 It has higher reliability and availability in the face of component failure.
 It is extensible: the system can be extended to more remote locations and can grow
incrementally.

Disadvantages of Distributed System:


 Software for distributed systems is harder to design, develop, and debug than software
for a single machine, and suitable software is not always readily available.
 Security poses a problem, because resources shared among multiple systems make data
easier to access.
 Network saturation can hinder data transfer: if the network lags, users will have
trouble accessing data.
 Compared with a single-machine system, the database associated with a distributed
system is much more complex and challenging to manage.
 If every node in a distributed system tries to send data at once, the network may
become overloaded.
Applications Area of Distributed System:
 Finance and Commerce: Amazon, eBay, Online Banking, E-Commerce websites.
 Information Society: Search Engines, Wikipedia, Social Networking, Cloud
Computing.
 Cloud Technologies: AWS, Salesforce, Microsoft Azure, SAP.
 Entertainment: Online Gaming, Music, YouTube.
 Healthcare: Online patient records, Health Informatics.
 Education: E-learning.
 Transport and logistics: GPS, Google Maps.
 Environment Management: Sensor technologies.

Challenges of Distributed Systems:


While distributed systems offer many advantages, they also present some challenges that must be
addressed. These challenges include:
 Network latency: The communication network in a distributed system can introduce
latency, which can affect the performance of the system.
 Distributed coordination: Distributed systems require coordination among the nodes,
which can be challenging due to the distributed nature of the system.
 Security: Distributed systems are more vulnerable to security threats than centralized
systems due to the distributed nature of the system.
 Data consistency: Maintaining data consistency across multiple nodes in a distributed
system can be challenging.
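To see why data consistency is challenging, consider a value stored on three replicas when not every replica is reachable for every operation. One common technique, shown here only as a simplified sketch, is quorum reads and writes with version numbers: if the write quorum w plus the read quorum r exceeds the number of replicas, every read overlaps with the latest write. The names VersionedReplica, quorum_write, and quorum_read are hypothetical.

```python
class VersionedReplica:
    """One copy of a single value, tagged with a version number."""

    def __init__(self):
        self.value = None
        self.version = 0


def quorum_write(replicas, reachable, value, w):
    """Write to the reachable replicas; succeed only if at least w accept."""
    targets = [rep for i, rep in enumerate(replicas) if i in reachable]
    if len(targets) < w:
        return False  # too few replicas reachable, refuse the write
    new_version = max(rep.version for rep in targets) + 1
    for rep in targets:
        rep.value, rep.version = value, new_version
    return True


def quorum_read(replicas, reachable, r):
    """Read r reachable replicas and return the value with the newest version."""
    targets = [rep for i, rep in enumerate(replicas) if i in reachable][:r]
    if len(targets) < r:
        raise RuntimeError("not enough replicas for a read quorum")
    return max(targets, key=lambda rep: rep.version).value
```

With three replicas and w = r = 2, a reader that contacts replicas 0 and 1 still sees a value that was written only to replicas 1 and 2, because replica 1 holds the newer version.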

Conclusion:
Distributed systems are becoming increasingly popular due to their high availability, scalability,
and fault tolerance. However, they also present some challenges that must be addressed. By
understanding the characteristics and challenges of distributed systems, developers can design
and implement effective distributed systems that meet the needs of their users.

Architecture Styles in Distributed Systems


A distributed system is a set of autonomous computers that appears to its users as a single
coherent system. From the perspective of the computers involved, two types of systems exist:
Centralized System:
 A centralized system consists of a single machine.
 All calculations are done by that one computer.
 Its performance is limited because the workload cannot be divided.
 There is no backup machine to take over if the original computer system fails.
Distributed Systems:
 A distributed system consists of multiple machines.
 All computation work is divided among the different systems.
 Its performance can be higher because the workload is divided among different computers
to use their capacity efficiently.
 Backup systems are present, so if the main system fails the work does not stop.
A distributed system contains multiple nodes that are physically separate but linked by
communication networks.
Centralized System vs. Distributed System
Criteria        Centralized System    Distributed System
Economics       Low                   High
Availability    Low                   High
Complexity      Low                   High
Consistency     Simple                High
Scalability     Poor                  Good
Technology      Homogeneous           Heterogeneous
Security        High                  Low

Architecture of Distributed Systems


Cloud-based software, which is built on distributed systems, runs on a complex network of
servers that anyone with an internet connection can access. In a distributed system,
components and connectors are arranged in a way that eases communication. Components are
modules with well-defined interfaces that can be replaced or reused. Similarly, connectors
are communication links between modules that mediate coordination or cooperation among
components.
The architecture of a distributed system is broadly divided into two essential concepts:
software architecture (further divided into layered architecture, object-based architecture,
data-centered architecture, and event-based architecture) and system architecture (further
divided into client-server architecture and peer-to-peer architecture).
Let’s look at each of these architectures in detail:
1. Software architecture
Software architecture is the logical organization of software components and their interaction
with other structures. It is at a lower level than system architecture and focuses entirely on
components; e.g., the web front end of an e-commerce system is a component. The four main
architectural styles for the software components of distributed systems are:

i) Layered architecture
Layered architecture provides a modular approach to software: separating components into
layers makes the system easier to build and maintain. For example, the open systems
interconnection (OSI) model uses a layered architecture. A request passes through the layers
in sequence until it reaches its goal. Some implementations also allow cross-layer
coordination, in which an interaction can skip an adjacent layer to fulfill the request with
better performance.

Layered Architecture
Layered architecture separates software components into layers. A request goes from the top
down, and the response goes from the bottom up. The advantage of layered architecture is that
it keeps things orderly and allows each layer to be modified independently without affecting
the rest of the system.
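A minimal sketch of this top-down request and bottom-up response, assuming three layers (presentation, logic, data). The class names and the sample records are hypothetical; the point is only that each layer talks to the layer directly below it.

```python
class DataLayer:
    """Bottom layer: owns the stored records."""

    def __init__(self):
        self._rows = {1: "keyboard", 2: "monitor"}

    def fetch(self, item_id):
        return self._rows.get(item_id)


class LogicLayer:
    """Middle layer: applies rules and talks only to the layer below."""

    def __init__(self, data_layer):
        self._data = data_layer

    def describe(self, item_id):
        name = self._data.fetch(item_id)
        return name.upper() if name else None


class PresentationLayer:
    """Top layer: receives the request and formats the response."""

    def __init__(self, logic_layer):
        self._logic = logic_layer

    def handle_request(self, item_id):
        result = self._logic.describe(item_id)  # request travels down
        if result is None:
            return f"Item {item_id} not found"
        return f"Item {item_id}: {result}"      # response travels back up
```

Because PresentationLayer depends only on the describe method, the data layer could be swapped for a real database without touching the top layer.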

ii) Object-based architecture


Object-based architecture centers around an arrangement of loosely coupled objects with no
fixed structure such as layers. Unlike layered architecture, object-based architecture doesn’t
have to follow any steps in a sequence. Each component is an object, and all the objects can
interact through an interface (or connector). Under object-based architecture, such interactions
between components can happen through a direct method call.
Object-based Architecture

At its core, communication between objects happens through method invocations; when the
objects are on different machines, these are often called remote procedure calls (RPC).
Popular mechanisms include Java RMI, web services, and REST API calls. The primary design
consideration of these architectures is that they are less structured. Here, component equals
object, and connector equals RPC or RMI.
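The idea of a connector that turns a method call into a message can be sketched in a few lines. This is an in-process illustration only: the "network" is a plain function call, JSON is an assumed wire format, and Calculator, Stub, and server_dispatch are hypothetical names, not part of Java RMI or any real RPC library.

```python
import json


class Calculator:
    """The remote object; in a real system it would live on another machine."""

    def add(self, a, b):
        return a + b


def server_dispatch(target, request_bytes):
    """Server side: decode the request, invoke the method, encode the reply."""
    request = json.loads(request_bytes)
    # A real server would check the method name against an allowed list.
    method = getattr(target, request["method"])
    return json.dumps({"result": method(*request["args"])}).encode()


class Stub:
    """Client-side proxy: turns a local method call into a message."""

    def __init__(self, transport):
        self._transport = transport  # callable taking and returning bytes

    def __getattr__(self, name):
        def remote_call(*args):
            request = json.dumps({"method": name, "args": list(args)}).encode()
            reply = self._transport(request)
            return json.loads(reply)["result"]
        return remote_call
```

A client would write stub.add(2, 3) exactly as if the object were local; the stub and the dispatcher together play the role of the connector.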

iii) Data-centered architecture


Data-centered architecture works on a central data repository, either active or passive. As in
most producer-consumer scenarios, the producer (for example, a business) adds items to the
common data store, and the consumer (for example, an individual) requests data from it.
Sometimes, this central repository is just a simple database.
Data-centered Architecture

In a data-centered system, all communication between components happens through the data
storage system. The repository gives the components persistent storage space, such as an SQL
database, and the system keeps all shared information in this data store.
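A small sketch of a producer and a consumer that never talk to each other directly and communicate only through a shared repository. An in-memory SQLite database stands in for the central store; the table layout and the function names make_repository, produce, and consume are assumptions made for this example.

```python
import sqlite3


def make_repository():
    """Create the shared store (in-memory SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, name TEXT, taken INTEGER DEFAULT 0)"
    )
    return conn


def produce(conn, name):
    """Producer: adds an item to the repository."""
    with conn:  # commits on success
        conn.execute("INSERT INTO items (name) VALUES (?)", (name,))


def consume(conn):
    """Consumer: takes the oldest untaken item, or None if there is none."""
    with conn:
        row = conn.execute(
            "SELECT id, name FROM items WHERE taken = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE items SET taken = 1 WHERE id = ?", (row[0],))
        return row[1]
```

The producer and the consumer share nothing except the connection to the repository, which is what makes the style data-centered.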

iv) Event-based architecture


In event-based architecture, all communication happens through events. When an event occurs,
the system is notified, and every component that has subscribed to the event receives it and
gains access to its information. Sometimes these events carry data, and at other times they
carry URLs to resources. The receiver can then process the information it receives and act
accordingly.
Event-Based Architecture

One significant advantage of event-based architecture is that the components are loosely
coupled, which makes it easy to add, remove, and modify them. To better understand this, think
of publisher-subscriber systems, enterprise service buses, or akka.io. Another advantage is
that heterogeneous components can communicate through the bus, regardless of their
communication protocols.
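The publisher-subscriber idea can be sketched with a tiny in-process event bus. The EventBus class and the topic name used below are hypothetical; real systems would deliver events over a network and usually asynchronously.

```python
class EventBus:
    """Routes published events to whoever subscribed to the topic."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        """Deliver the payload to every subscriber; return how many got it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            # The publisher does not know who, if anyone, is listening.
            handler(payload)
        return len(handlers)
```

Adding or removing a subscriber requires no change to the publisher, which is the loose coupling described above.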

2. System architecture
System-level architecture focuses on the entire system and the placement of the components of
a distributed system across multiple machines. Client-server architecture and peer-to-peer
architecture are the two major system-level architectures in use today. An example would be an
e-commerce system that contains a service layer, a database, and a web front end.

i) Client-server architecture
As the name suggests, client-server architecture consists of a client and a server. The server
is where all the work processes run, while the client is where the user interacts with the
service and other resources (on the remote server). The client sends a request to the server,
and the server responds accordingly. Typically, only one server handles the remote side;
however, using multiple servers improves availability, because the service no longer depends
on a single machine.

The client-server architecture is the most common distributed system architecture. It
decomposes the system into two major subsystems or logical processes:
 Client: the first process, which issues a request to the second process, i.e., the server.
 Server: the second process, which receives the request, carries it out, and sends a
reply to the client.
In this architecture, the application is modeled as a set of services that are provided by
servers and a set of clients that use these services. The servers need not know about the
clients, but the clients must know the identity of the servers. The mapping of processors to
processes is not necessarily 1:1.
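The request-reply exchange can be sketched with two small functions over a local TCP socket. This is a bare-bones illustration that handles a single short request; start_server and send_request are hypothetical helper names, and the loopback address is used so the example stays on one machine.

```python
import socket
import threading


def start_server(handler):
    """Start a one-request TCP server on a free local port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once():
        # The server accepts whichever client connects; it need not know it.
        conn, _addr = listener.accept()
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(handler(request).encode())
        listener.close()

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return port, thread


def send_request(port, message):
    """Client: must know the server's address; sends a request, awaits a reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall(message.encode())
        return sock.recv(1024).decode()
```

Note the asymmetry: the client needs the server's address and port, while the server simply accepts whoever connects.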

Client-server Architecture

Client-server architecture has one standard design feature: centralized security. Data such as
usernames and passwords are stored in a secure database on the server side, where access to
this information is controlled centrally. This makes it more stable and secure than
peer-to-peer, because the security database can govern resource usage in a more meaningful
way. The system is more stable and secure, though not necessarily as fast. The disadvantages
of client-server architecture are that the server is a single point of failure and that it is
not as scalable as peer-to-peer.

Client-server architecture can be classified into two models based on the functionality of the
client:
Thin-client model
In the thin-client model, all the application processing and data management is carried out by
the server. The client is simply responsible for running the presentation software.
 Used when legacy systems are migrated to client-server architectures, in which the legacy
system acts as a server in its own right with a graphical interface implemented on a client.
 A major disadvantage is that it places a heavy processing load on both the server and the
network.

Thick/Fat-client model
In the thick-client model, the server is only in charge of data management. The software on
the client implements the application logic and the interactions with the system user.
 Most appropriate for new client-server (C/S) systems where the capabilities of the client
system are known in advance.
 More complex than a thin-client model, especially for management. New versions of the
application have to be installed on all clients.
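The difference between the two models is where the application logic runs. The sketch below computes the same order total both ways; the price table, class names, and function names are hypothetical, and ordinary method calls stand in for network requests.

```python
PRICES = {"pen": 2.0, "book": 12.5}  # data managed by the server


class ThinClientServer:
    """Server does the data management and the application logic."""

    def order_total(self, cart):
        return sum(PRICES[item] * qty for item, qty in cart.items())


class ThickClientServer:
    """Server only manages data."""

    def get_price(self, item):
        return PRICES[item]


def thin_client(server, cart):
    # Presentation only: one request, then format the answer.
    return f"Total: {server.order_total(cart):.2f}"


def thick_client(server, cart):
    # Application logic runs on the client; the server is asked only for data.
    total = sum(server.get_price(item) * qty for item, qty in cart.items())
    return f"Total: {total:.2f}"
```

Changing how totals are computed means updating one server in the thin-client model but every installed client in the thick-client model.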

Advantages
 Separation of responsibilities such as user interface presentation and business logic
processing.
 Reusability of server components and potential for concurrency
 Simplifies the design and the development of distributed applications
 It makes it easy to migrate or integrate existing applications into a distributed
environment.
 It also makes effective use of resources when a large number of clients are accessing a
high-performance server.

Disadvantages
 Lack of heterogeneous infrastructure to deal with the requirement changes.
 Security complications.
 Limited server availability and reliability.
 Limited testability and scalability.
 Fat clients with presentation and business logic together.

ii) Peer-to-peer (P2P) architecture


A peer-to-peer network, also called a P2P network, works on the concept of no central control
in a distributed system. A node can act as either a client or a server at any given time once
it joins the network. A node that requests something is called a client, and one that provides
something is called a server. In general, each node is called a peer.

Peer-to-Peer Architecture

If a new node wishes to find a service, it can do so in two ways. One way is to ask a
centralized lookup server, with which providers register, and which then directs the node to
the service provider. The other way is for the node to broadcast its service request to every
other node in the network; whichever node responds will provide the requested service.
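Both discovery approaches can be sketched side by side. The Peer and LookupServer classes and the broadcast_find function are hypothetical, and the hop limit (ttl) on the broadcast is an added assumption to keep the flood from running forever.

```python
class Peer:
    def __init__(self, name, services=()):
        self.name = name
        self.services = set(services)
        self.neighbors = []


class LookupServer:
    """Central directory: peers register the services they offer."""

    def __init__(self):
        self._directory = {}

    def register(self, peer):
        for service in peer.services:
            self._directory.setdefault(service, []).append(peer.name)

    def find(self, service):
        return list(self._directory.get(service, []))


def broadcast_find(start, service, ttl=3):
    """Flood the request to neighbors, up to ttl hops; return provider names."""
    found, visited, frontier = [], {start.name}, [start]
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            for neighbor in peer.neighbors:
                if neighbor.name in visited:
                    continue
                visited.add(neighbor.name)
                if service in neighbor.services:
                    found.append(neighbor.name)
                next_frontier.append(neighbor)
        frontier = next_frontier
    return found
```

The lookup server answers in one step but is a central dependency; the broadcast needs no central node but generates traffic on every hop.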

P2P networks of today fall into three categories:


 Structured P2P: The nodes in structured P2P follow a predefined distributed data
structure.
 Unstructured P2P: The nodes in unstructured P2P randomly select their neighbors.
 Hybrid P2P: In a hybrid P2P, some nodes have unique functions appointed to them in an
orderly manner.
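As one example of the "predefined distributed data structure" used by structured P2P, the sketch below places nodes on a hash ring and assigns each key to the first node at or after the key's position. This is a simplified illustration in the spirit of distributed hash tables, not the algorithm of any particular network; ring_position and HashRing are hypothetical names.

```python
import bisect
import hashlib


def ring_position(key, ring_size=2**16):
    """Map a string to a position on the ring using a stable hash."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % ring_size


class HashRing:
    def __init__(self, node_names):
        # Sort the nodes by their position on the ring.
        self._ring = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key):
        """Return the first node at or after the key's position, wrapping around."""
        index = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[index % len(self._ring)][1]
```

Every peer that knows the node list computes the same owner for a key, and when a node leaves, only the keys it owned move to another node.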

Key Components of a Distributed System


The three basic components of a distributed system are a primary system controller, a system
data store, and a database. In a non-clustered environment, optional components consist of
user interfaces and secondary controllers.
Main Components of a Distributed System
1. Primary system controller
The primary system controller is the single primary controller in a distributed system and
keeps track of everything. It’s also responsible for controlling the dispatch and management
of server requests throughout the system. The executive and mailbox services are installed
automatically on the primary system controller.

2. Secondary controller
The secondary controller is a process controller or a communications controller. It’s
responsible for regulating the flow of server processing requests and managing the system’s
translation load. It also governs communication between the system and value-added networks
(VANs) or trading partners.

3. User-interface client
The user interface client is an additional element in the system that provides users with important
system information. This is not a part of the clustered environment, and it does not operate on the
same machines as the controller. It provides functions that are necessary to monitor and control
the system.

4. System datastore
Each system has only one data store for all shared data. The data store is usually on the disk
vault, whether clustered or not. For non-clustered systems, it can be on one machine or
distributed across several devices, but all of the system's computers must have access to it.
5. Database
In a distributed system, a relational database stores all data. Once the data store locates the
data, it shares it with multiple users. Relational databases are widely used in data systems
and allow multiple users to work with the same information simultaneously.

Examples of a Distributed System


When processing power is scarce, or when a system faces unpredictable changes in workload,
distributed systems are ideal because they help balance the load. Distributed systems
therefore have a wide range of use cases, from electronic banking systems to multiplayer
online games. Let’s look at some specific examples of distributed systems:

1. Networks
The 1970s saw the invention of Ethernet and LAN (local area networks), which enabled
computers to connect in the same area. Peer-to-peer networks developed, and e-mail and the
internet continue to be the biggest examples of distributed systems.

2. Telecommunication networks
Telephone and cellular networks are other examples of peer-to-peer networks. Telephone
networks started as an early example of distributed communication, and cellular networks are
also a form of distributed communication system. With the implementation of Voice over
Internet Protocol (VoIP) communication systems, they grow more complex as distributed
communication networks.

3. Real-time systems
Real-time systems are not limited to specific industries. These systems can be used and seen
throughout the world in the airline, ride-sharing, logistics, financial trading, massively
multiplayer online games (MMOGs), and e-commerce industries. The focus in such systems is
on the correspondence and processing of information with the need to convey data promptly to a
huge number of users who have an expressed interest in such data.

4. Parallel processors
Parallel computing splits specific tasks among multiple processors. The partial results are
then put together to complete an extensive computational task. Previously, parallel computing
focused only on running software on multiple threads or processors that accessed the same data
and memory. As operating systems became more prevalent, they too fell into the category of
parallel processing.

5. Distributed database systems


A distributed database is spread out across numerous servers or regions, and data can be
replicated across several platforms. A distributed database system can be either homogeneous
or heterogeneous. A homogeneous distributed database uses the same database management
system and data model across all systems, which makes it easier to control and to scale
performance by adding new nodes and locations.

Heterogeneous distributed databases, on the other hand, allow multiple data models and
database management systems. Gateways are used to translate data across nodes, and such
systems typically arise from the merger of two or more applications or systems.

6. Distributed artificial intelligence


Distributed artificial intelligence is one of many approaches to artificial intelligence. It is
used for learning and involves complex learning algorithms, large-scale systems, and
decision-making. It requires a large set of computational data points spread across various
locations.

A few real-world examples of distributed systems include:


1. Video-rendering systems
2. Scientific computing
3. Airline and hotel reservation
4. Cryptocurrency processors like Bitcoin
5. P2P file-sharing like BitTorrent
6. Multiplayer video games
7. E-learning applications
8. Distributed supply chains like Amazon

Takeaway
Distributed systems are among the most significant drivers of modern computing because they
provide scalable, improved performance. They are an essential component of wireless networks,
cloud computing, and the Internet. Because they can draw on the resources of other devices and
processes, distributed systems offer features that would be hard or even impossible to build
on a single system, and they achieve high reliability by combining the power of multiple
machines.
