Distributed System Unit No 2
What is Middleware?
Middleware is software that provides services to applications beyond those available
from the operating system. It is often described as "software glue".
Middleware makes it easier for software developers to implement communication and
input/output, so they can focus on the specific purpose of their application.
Types of Middleware –
Message-Oriented Middleware (MOM):
Message-oriented middleware is designed to facilitate communication between
distributed applications through asynchronous message passing. Here's a more
detailed breakdown of its features and functionalities:
Asynchronous Communication: MOM systems allow applications to communicate
asynchronously, meaning that senders and receivers do not need to interact with each
other in real-time. Messages can be sent and received independently, enhancing
system flexibility and scalability.
Message Queuing: MOM typically employs message queues, which act as temporary
storage for messages sent from producers (senders) to consumers (receivers). This
decouples the communication process, enabling producers to send messages even if
consumers are temporarily unavailable.
Message Routing: MOM systems provide mechanisms for routing messages to their
intended destinations based on predefined rules or criteria. This ensures that messages
are delivered to the appropriate recipients efficiently.
Some of the routing mechanisms are point-to-point (P2P) messaging, content-based
messaging, and publish-subscribe messaging.
Message Transformation: MOM may offer features for transforming messages
between different formats or protocols. For instance, it can convert messages from one
data format to another to accommodate the requirements of different applications or
systems.
Reliability and Fault Tolerance: MOM systems often incorporate mechanisms for
ensuring message delivery reliability and fault tolerance. They may include features
such as message persistence, acknowledgment mechanisms, and failover capabilities
to handle system failures gracefully.
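The queuing and routing ideas above can be illustrated with a minimal in-memory sketch.
It is not how any particular MOM product is implemented: the Broker class, its method
names, and the queue and topic names are all hypothetical, and persistence and
acknowledgments are left out.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker (hypothetical): P2P queues plus publish-subscribe topics."""

    def __init__(self):
        self.queues = defaultdict(deque)      # queue name -> pending messages
        self.subscribers = defaultdict(list)  # topic name -> one inbox per subscriber

    # Point-to-point: each message is consumed by exactly one receiver.
    def send(self, queue_name, message):
        self.queues[queue_name].append(message)

    def receive(self, queue_name):
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None

    # Publish-subscribe: every subscriber gets its own copy of the message.
    def subscribe(self, topic):
        inbox = deque()
        self.subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        for inbox in self.subscribers[topic]:
            inbox.append(message)


broker = Broker()

# The producer sends while no consumer is running; the queue holds the messages.
broker.send("orders", {"id": 1, "item": "book"})
broker.send("orders", {"id": 2, "item": "pen"})

# A consumer that starts later still receives them, in the order they were sent.
first = broker.receive("orders")
second = broker.receive("orders")

# Two subscribers to the same topic each receive the published event.
billing = broker.subscribe("order-events")
shipping = broker.subscribe("order-events")
broker.publish("order-events", "order 1 created")
```

The producer never calls the consumer directly; both only know the broker and a queue
or topic name, which is the decoupling described above.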
Examples of Message-Oriented Middleware include:
Apache Kafka
RabbitMQ
IBM MQ (formerly known as WebSphere MQ)
ActiveMQ
ZeroMQ
Advantages of Message-Oriented Middleware (MOM):
Asynchronous Communication
Reliability
Decoupling
Message Transformation
Scalability
Content-Centric Middleware:
Content-centric middleware focuses on managing and manipulating data or content
across distributed systems. Unlike MOM, which primarily deals with message
passing, content-centric middleware emphasizes operations related to data
dissemination, storage, and retrieval. Here's a closer look at its functionalities:
Content Discovery: Content-centric middleware enables applications to discover and
access content efficiently across distributed environments. It provides mechanisms for
indexing, cataloging, and searching content repositories, allowing users to locate
relevant data quickly.
Caching: Content-centric middleware often includes caching mechanisms to improve
data access performance and reduce network latency. Cached copies of frequently
accessed content are stored closer to the consumers, minimizing the need to retrieve
data from distant sources repeatedly.
Replication: To enhance data availability and resilience, content-centric middleware
supports content replication across distributed nodes or servers. Replication ensures
that multiple copies of data are available across the network, reducing the risk of data
loss or unavailability due to system failures.
Synchronization: Content-centric middleware facilitates data synchronization
between distributed replicas to ensure consistency and coherence. It manages
synchronization processes to update copies of content with the latest changes,
maintaining data integrity across the system.
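As an illustration of the caching function described above, the following is a minimal
sketch of a read-through cache with least-recently-used eviction. The class name,
the capacity, and the dictionary standing in for a distant content source are
assumptions made for the example.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Keeps recently used content close to the consumer (illustrative only)."""

    def __init__(self, fetch_from_origin, capacity=2):
        self.fetch_from_origin = fetch_from_origin  # slow call to the distant source
        self.capacity = capacity
        self.entries = OrderedDict()                # key -> cached content
        self.origin_fetches = 0                     # counts trips to the origin

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)           # mark as most recently used
            return self.entries[key]
        value = self.fetch_from_origin(key)         # cache miss: go to the origin
        self.origin_fetches += 1
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)        # evict the least recently used
        return value


# A dictionary stands in for a remote content repository.
origin = {"a.html": "<h1>A</h1>", "b.html": "<h1>B</h1>", "c.html": "<h1>C</h1>"}
cache = ReadThroughCache(origin.__getitem__, capacity=2)

cache.get("a.html")  # miss: fetched from the origin
cache.get("a.html")  # hit: served from the cache
cache.get("b.html")  # miss
cache.get("c.html")  # miss: cache is full, so a.html is evicted
```

Four reads cost only three trips to the origin here; with real network latency the
saving on repeated reads is what the caching function is for.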
Examples of Content-Centric Middleware include:
1. Apache CouchDB
2. Riak
3. Amazon DynamoDB
4. Cassandra
5. Redis
Functionalities of Middleware –
1. Data Transformation and Formatting
Middleware often includes features for transforming and formatting data to ensure that
it conforms to the requirements of the receiving application or system. For instance,
consider a middleware component that receives data in JSON format from a web
application and needs to transform it into XML format before sending it to a legacy
system that only accepts XML. The middleware can perform this transformation
seamlessly, ensuring compatibility between the two systems.
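The JSON-to-XML case above can be sketched with the Python standard library. This is a
minimal example that assumes a flat JSON object whose keys are valid XML tag names; the
function name, the "record" root tag, and the sample data are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_text, root_tag="record"):
    """Convert a flat JSON object into a one-level XML document."""
    data = json.loads(json_text)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)  # one child element per JSON field
        child.text = str(value)           # XML text is always a string
    return ET.tostring(root, encoding="unicode")


# Message as received from the web application.
incoming = '{"id": 42, "name": "Ada", "city": "London"}'

# Message as forwarded to the legacy system.
outgoing = json_to_xml(incoming)
# outgoing == '<record><id>42</id><name>Ada</name><city>London</city></record>'
```

Nested objects, arrays, and type information would need additional mapping rules, which
is where real transformation middleware spends most of its effort.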
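What is CORBA?
CORBA (Common Object Request Broker Architecture) is a standard from the Object
Management Group (OMG) that lets objects written in different languages and running on
different machines invoke one another through an Object Request Broker (ORB).
Further reading: https://datascientest.com/en/corba-infrastructure-definition-and-benefits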
What is RMI?
RMI stands for Remote Method Invocation. It is a mechanism that allows an object
residing in one Java Virtual Machine (JVM) to invoke methods on an object running in
another JVM. RMI is used to build distributed applications; it provides remote
communication between Java programs. It is provided in the package java.rmi.
Goals of RMI
Following are the goals of RMI −
To minimize the complexity of the application.
To preserve type safety.
To provide distributed garbage collection.
To minimize the difference between working with local and remote objects.
Architecture of an RMI Application –
In an RMI application, we write two programs: a server program (which resides on the
server) and a client program (which resides on the client).
Inside the server program, a remote object is created and a reference to that
object is made available to the client (using the registry).
The client program looks up the remote object in the registry and invokes its
methods through the reference it gets back.
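The server, registry, and client roles above can be imitated in a few lines. Java RMI
itself is Java-only, so the sketch below is not RMI code: it is a single-process Python
imitation in which every class and name is hypothetical, JSON strings stand in for the
network, and the registry also plays the part of the server-side dispatcher to keep the
example short.

```python
import json


class Calculator:
    """The remote object created inside the server program."""

    def add(self, a, b):
        return a + b


class Registry:
    """Hypothetical stand-in for the registry: maps names to remote objects."""

    def __init__(self):
        self._objects = {}

    def bind(self, name, obj):
        self._objects[name] = obj

    def lookup(self, name):
        # The client never receives the real object, only a stub that refers to it.
        return Stub(self, name)

    def dispatch(self, request_text):
        # Server side: unmarshal the request, call the real object, marshal the result.
        request = json.loads(request_text)
        target = self._objects[request["name"]]
        result = getattr(target, request["method"])(*request["args"])
        return json.dumps({"result": result})


class Stub:
    """Client-side proxy: used like the object, but forwards every method call."""

    def __init__(self, registry, name):
        self._registry = registry
        self._name = name

    def __getattr__(self, method):
        def remote_call(*args):
            request = json.dumps({"name": self._name, "method": method, "args": args})
            reply = self._registry.dispatch(request)  # stands in for the network hop
            return json.loads(reply)["result"]
        return remote_call


# Server program: create the remote object and publish a reference under a name.
registry = Registry()
registry.bind("calc", Calculator())

# Client program: look the object up by name, then call it like a local object.
calc = registry.lookup("calc")
total = calc.add(2, 3)
```

The client line calc.add(2, 3) looks like a local call, which reflects the RMI goal of
minimizing the difference between working with local and remote objects.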
DCE –
Distributed Computing Environment (DCE) is an integrated set of services and tools
for building and running distributed applications. Its software components are
installed as a coherent environment on top of the existing operating system and serve
as a platform for distributed applications.
Components of DCE –
Remote Procedure Call (RPC): A call that a program makes to execute a subroutine on a
different computer on a shared network, written as if it were a local call.
Distributed File System (DFS): DCE includes a distributed file system that allows
clients to access files and directories on remote servers as if they were local.
Directory Service: Keeps track of the location of resources in the distributed
system, such as files, printers, servers, scanners, and other machines. A process asks
the service for a resource and the service supplies it, so processes remain unaware of
the actual location of resources.
Security Service: Lets a process verify that a user is authentic. Only authorized
users, and only authorized computers on the network, can access protected and secured
resources.
Distributed Time Service: Inter-process communication between different system
components requires synchronization so that events take place in a designated order.
This service maintains a global time reference and synchronizes each machine's local
clock with it.
Thread Service: Provides the implementation of lightweight processes (threads) and
helps synchronize multiple threads within a shared address space.
Advantages of DCE:
Security
Lower Maintenance Cost
Scalability and Availability
Reduced Risks
Middleware Issues –
Complexity: Middleware systems can introduce complexity to the software
architecture, especially in large-scale distributed environments. Managing middleware
components, configurations, and interactions can be challenging, requiring specialized
knowledge and expertise.
Performance Overheads: Middleware introduces additional layers of abstraction and
processing overhead, which can impact system performance. Message queuing, data
transformation, and protocol mediation can introduce latency and reduce throughput,
especially in high-volume environments.
Scalability Limitations: Some middleware solutions may have scalability limitations,
particularly when scaling horizontally across distributed nodes or servers. Bottlenecks
in message processing, resource contention, or communication overheads can hinder
the ability to scale the system effectively.
Maintenance and Upgrades: Middleware systems require ongoing maintenance,
updates, and patches to address security vulnerabilities, performance issues, and
compatibility issues with evolving technologies. Managing these maintenance tasks
across distributed deployments can be time-consuming and resource-intensive.
Cost: Deploying and maintaining middleware solutions can incur significant costs,
including licensing fees, infrastructure expenses, and operational overheads.
Organizations need to carefully evaluate the return on investment (ROI) and total cost
of ownership (TCO) of middleware solutions to justify their adoption.
Learning Curve: Middleware technologies and best practices can have a steep learning
curve for developers, administrators, and IT personnel. Training and skill development
programs may be necessary to ensure effective use and management of middleware
resources.
Apache Kafka –
Apache Kafka is an open-source streaming data platform originally developed by
LinkedIn. As it expanded Kafka's capabilities, LinkedIn donated it to the Apache
Software Foundation for further development.
Kafka operates like a traditional pub-sub message queue, such as RabbitMQ, in that it
enables you to publish and subscribe to streams of messages. But it differs from
traditional message queues in three key ways:
1. Kafka operates as a modern distributed system that runs as a cluster and can
scale out to serve a large number of applications.
2. Kafka is designed to serve as a storage system and keeps data for a configurable
retention period, as long as necessary; most message queues remove messages
immediately after the consumer confirms receipt.
3. Kafka handles stream processing, computing derived streams and datasets
dynamically, rather than just passing batches of messages.
Core Concepts of Kafka –
Broadly, Kafka accepts streams of events written by data producers. Kafka groups
records into topics, and each topic is split into partitions that are spread across
brokers (servers); multiple brokers comprise a cluster. Within a partition, records are
stored in the order they arrive. Each record describes an event and consists of a
key-value pair, plus a timestamp and optional headers. Data consumers get their data by
subscribing to the topics they want.
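The topic, partition, and record concepts can be modelled in a few lines. The sketch
below is an in-memory toy, not the Kafka client API: the Topic class and its methods are
hypothetical, and CRC32 is used only as a simple deterministic hash for choosing a
partition (Kafka's own partitioner works differently).

```python
import time
import zlib


class Topic:
    """Toy model of a topic: a fixed set of append-only partitions."""

    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Records with the same key always land in the same partition,
        # so their relative order is preserved.
        index = zlib.crc32(key.encode()) % len(self.partitions)
        record = {"key": key, "value": value, "timestamp": time.time()}
        self.partitions[index].append(record)
        offset = len(self.partitions[index]) - 1  # position within the partition
        return index, offset

    def read(self, partition, offset):
        # Reading does not remove records; a consumer just remembers its offset.
        return self.partitions[partition][offset:]


topic = Topic("page-views")
p1, o1 = topic.append("user-1", "/home")
p2, o2 = topic.append("user-1", "/cart")
topic.append("user-2", "/home")

# A consumer reads the partition from offset 0 and can re-read it later.
history = topic.read(p1, 0)
user1_pages = [r["value"] for r in history if r["key"] == "user-1"]
```

Because reads leave the log untouched, a second consumer (or the same one after a
restart) can replay the same records, which is the storage behaviour that separates
Kafka from queues that delete messages on receipt.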
Kafka Use Cases –
Real-time Data Ingestion
Log Aggregation
Event Sourcing
Real-time Analytics
Stream Processing
Machine Learning Pipelines
Commit Log
Adaptive Software –
Adaptive software refers to software systems or applications that can modify their behavior,
structure, or configuration dynamically in response to changes in the environment, user
preferences, or system requirements. Adaptive software aims to stay flexible and
responsive by adjusting itself autonomously, without manual intervention.
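One small, concrete instance of this idea is a component that tunes its own
configuration from what it observes. The sketch below is a hypothetical example, not a
standard algorithm from any library: a sender shrinks its batch size when observed
latency exceeds a target and grows it again when conditions improve. All names and
thresholds are assumptions.

```python
class AdaptiveBatcher:
    """Adjusts its own batch size in response to observed latency."""

    def __init__(self, batch_size=100, target_latency_ms=200, min_size=10, max_size=1000):
        self.batch_size = batch_size
        self.target_latency_ms = target_latency_ms
        self.min_size = min_size
        self.max_size = max_size

    def observe(self, latency_ms):
        if latency_ms > self.target_latency_ms:
            # Environment got slower: back off quickly by halving the batch.
            self.batch_size = max(self.min_size, self.batch_size // 2)
        else:
            # Environment is healthy: grow cautiously in small steps.
            self.batch_size = min(self.max_size, self.batch_size + 10)
        return self.batch_size


batcher = AdaptiveBatcher()
sizes = [batcher.observe(ms) for ms in (150, 500, 500, 500, 500, 100)]
# sizes == [110, 55, 27, 13, 10, 20]
```

No operator changes the setting; the component reacts to its environment on its own,
which is the defining property of adaptive software described above.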