Security Challenges of DS?: 2.what Do You Mean by Scalability of A System?


DS-MID1

UNIT 1

1.Security challenges of DS?


Many of the information resources that are made available and maintained in distributed systems
have a high intrinsic value to their users. Their security is therefore of considerable importance.
Security for information resources has three components:
1. confidentiality (protection against disclosure to unauthorized individuals),
2. integrity (protection against alteration or corruption), and
3. availability (protection against interference with the means to access the resources).
CHALLENGES :
1. Secure transmission: Security risks are associated with allowing free access to all of the resources in an
intranet. In a distributed system, clients send requests to access data managed by servers, which
involves sending information in messages over a network.
For example, a doctor might request access to hospital patient data or send additions to
that data.
The challenge is to send sensitive information in a message over a network in a secure
manner. But security is not just a matter of concealing the contents of messages – it also
involves knowing for sure the identity of the user or other agent on whose behalf a message
was sent.
2. Identification: The second challenge is to identify a remote user or other agent correctly. Both of these
challenges can be met by the use of encryption techniques developed for this purpose.
3.Denial of service attacks: Another security problem is that a user may wish to disrupt a service
for some reason. This can be achieved by bombarding the service with such a large number of
pointless requests that the serious users are unable to use it. This is called a denial of service
attack.
4. Security of mobile code: Mobile code needs to be handled with care. Consider someone who
receives an executable program as an electronic mail attachment: the possible effects of running
the program are unpredictable.
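As a rough illustration of how cryptographic techniques help with identifying the sender and protecting integrity, the sketch below attaches a keyed hash (HMAC) to a message using Python's standard library. The shared key, the function names sign_message and verify_message, and the sample request are assumptions made for this example. HMAC gives authentication and integrity only; confidentiality would additionally require encrypting the message.

```python
import hashlib
import hmac

# Hypothetical secret shared by the doctor's client and the hospital server.
SHARED_KEY = b"example-shared-secret"


def sign_message(key: bytes, message: bytes) -> bytes:
    """Return an authentication tag computed over the message with the key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


# The client sends the request together with its tag.
request = b"GET patient-record 1234"
tag = sign_message(SHARED_KEY, request)
```

A server holding the same key accepts the request only if verify_message returns True, so a message that was altered in transit or signed with a different key is rejected.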
2.What do you mean by scalability of a system?
Scalability is an important indicator in distributed computing and parallel computing. It describes
the ability of a system to adjust its computing performance dynamically by changing the
available computing resources and scheduling methods. Scalability has two aspects:
hardware and software. Scalability in hardware refers to meeting a changing workload by changing
hardware resources, such as the number of processors, memory, and hard disk capacity.
Scalability in software refers to meeting a changing workload by changing the scheduling method
and the degree of parallelism. Metrics, design, and testing are the three main aspects of system
scalability research.
• Metrics are the basis for designing and testing a scalable system. The
general method is to evaluate how system performance changes as
different system resources and system loads are applied.
• On-demand dynamic allocation and scheduling of resources is an important foundation for
system scalability.
• Testing is the basis for verifying and evaluating scalability.
There are currently two levels of testing for scalability:
1) code-level testing, and
2) system-level testing.
Code-level testing examines each block of code in a parallel program and detects its effect on system
scalability. In general, statistical methods are used to model parallel
algorithms. System-level testing analyzes the workload in advance, monitors
the real-time running status, and combines the results of the pre-analysis with the
real-time monitoring results to assess the scalability of the entire system.
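One common way to put numbers on scalability, not prescribed by the notes above, is to compare run times as resources are added and compute speedup and efficiency. The sketch below assumes a table of run times measured on different processor counts; the figures and function names are made up for illustration.

```python
def speedup(time_on_one: float, time_on_n: float) -> float:
    """How many times faster the run is compared with a single processor."""
    return time_on_one / time_on_n


def efficiency(time_on_one: float, time_on_n: float, n: int) -> float:
    """Speedup per processor; 1.0 would mean perfect scaling."""
    return speedup(time_on_one, time_on_n) / n


# Hypothetical measurements: number of processors -> run time in seconds.
measured = {1: 100.0, 2: 52.0, 4: 28.0, 8: 17.0}

# For each processor count, record (speedup, efficiency).
report = {
    n: (speedup(measured[1], t), efficiency(measured[1], t, n))
    for n, t in measured.items()
}
```

With these sample figures the efficiency falls as processors are added, which is the kind of trend a scalability test is meant to expose.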
3.Client server report sharing system.
The Client-server model is a distributed application structure that partitions tasks or workloads
between the providers of a resource or service, called servers, and service requesters, called
clients. In the client-server architecture, when the client computer sends a request for data to the
server through the internet, the server accepts the request, processes it, and delivers the requested
data packets back to the client. Clients do not share any of their resources. Examples of the
Client-Server Model are Email, the World Wide Web, etc.
How does the Client-Server Model work?
This section looks at the Client-Server model and at how
the Internet works via web browsers, which gives a solid foundation for
working with web technologies.
• Client: In everyday use, a client is a person or an organization
using a particular service. Similarly, in the digital world a Client is a computer (Host) that is
capable of receiving information or using a particular service from the service providers
(Servers).
• Servers: In everyday use, a server is a person or medium that
serves something. Similarly, in the digital world a Server is a remote computer that
provides information (data) or access to particular services.
So, it is basically the Client requesting something and the Server serving it, as long as it is present
in the database.
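The request and reply exchange described above can be sketched with TCP sockets from Python's standard library. The DATABASE contents, the function names, and the "NOT FOUND" reply are assumptions for this example; a real server would loop over many clients and use a proper protocol such as HTTP.

```python
import socket
import threading

# Hypothetical "database" held by the server.
DATABASE = {"index.html": "<h1>Hello</h1>"}


def serve_once(listener: socket.socket) -> None:
    """Accept one client, read the requested key, and send back the stored value."""
    conn, _addr = listener.accept()
    with conn:
        key = conn.recv(1024).decode()   # one recv is enough for these short messages
        reply = DATABASE.get(key, "NOT FOUND")
        conn.sendall(reply.encode())


def request(host: str, port: int, key: str) -> str:
    """Client side: connect, send one request, and return the server's reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(key.encode())
        sock.shutdown(socket.SHUT_WR)    # signal that the request is complete
        return sock.recv(1024).decode()


def demo(key: str) -> str:
    """Run a one-shot server in a thread and query it from the same program."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    try:
        return request("127.0.0.1", port, key)
    finally:
        server.join()
        listener.close()
```

Calling demo("index.html") returns the stored page, while a key that is not in the database yields the "NOT FOUND" reply.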
4.Explain design requirements for challenges of DS?
Issues in designing distributed systems:
1. Heterogeneity
The Internet enables users to access services and run applications over a heterogeneous
collection of computers and networks. The Internet consists of many different sorts of network, but their
differences are masked by the fact that all of the computers attached to them use the Internet
protocols to communicate with one another. For example, a computer attached to an Ethernet has an
implementation of the Internet protocols over the Ethernet, whereas a computer on a different
sort of network will need an implementation of the Internet protocols for that network.
2. Openness
The openness of a computer system is the characteristic that determines whether the system can
be extended and re-implemented in various ways. The openness of distributed systems is
determined primarily by the degree to which new resource-sharing services can be added and be
made available for use by a variety of client programs.
3. Security
Many of the information resources that are made available and maintained in distributed systems
have a high intrinsic value to their users. Their security is therefore of considerable importance.
Security for information resources has three components: confidentiality, integrity, and
availability.
4. Scalability
Distributed systems operate effectively and efficiently at many different scales, ranging from a
small intranet to the Internet. A system is described as scalable if it will remain effective when
there is a significant increase in the number of resources and the number of users.
5. Failure handling
Computer systems sometimes fail. When faults occur in hardware or software, programs may
produce incorrect results or may stop before they have completed the intended computation.
Failures in a distributed system are partial – that is, some components fail while others continue
to function. Therefore the handling of failures is particularly difficult.
6. Concurrency
Both services and applications provide resources that can be shared by clients in a distributed
system. There is therefore a possibility that several clients will attempt to access a shared
resource at the same time. Any object that represents a shared resource in a distributed system must
be responsible for ensuring that it operates correctly in a concurrent environment. This applies
not only to servers but also to objects in applications. Therefore any programmer who takes an
implementation of an object that was not intended for use in a distributed system must do
whatever is necessary to make it safe in a concurrent environment.
7. Transparency
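Transparency can be achieved at two different levels. The easier one is to hide the distribution
from the users. At a lower level, it is also possible, but harder, to make the system look
transparent to programs. The concept of transparency can be applied to several aspects of a distributed
system.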
8. Quality of service
Once users are provided with the functionality that they require of a service, such as the file
service in a distributed system, we can go on to ask about the quality of the service provided. The
main nonfunctional properties of systems that affect the quality of the service experienced by
clients and users are reliability, security and performance. Adaptability to meet changing system
configurations and resource availability has been recognized as a further important aspect of
service quality.
9. Reliability
One of the original goals of building distributed systems was to make them more reliable than
single-processor systems. The idea is that if a machine goes down, some other machine takes
over the job. A highly reliable system must be highly available, but that is not enough. Data
entrusted to the system must not be lost or garbled in any way, and if files are stored redundantly
on multiple servers, all the copies must be kept consistent. In general, the more copies that are
kept, the better the availability, but the greater the chance that they will be inconsistent,
especially if updates are frequent.
10. Performance
Always lurking in the background is the issue of performance. Building a transparent,
flexible, reliable distributed system is of little value if its performance is poor. In particular, when
running a particular application on a distributed system, it should not be appreciably worse than
running the same application on a single processor. Unfortunately, achieving this is easier said
than done.
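The concurrency issue (item 6 above) can be illustrated with a small sketch: an object representing a shared resource guards its state with a lock so that simultaneous client requests do not corrupt it. The SharedCounter class and the run_clients helper are hypothetical names, and threads stand in for remote clients.

```python
import threading


class SharedCounter:
    """A shared resource that stays correct when accessed concurrently."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The lock makes the read-modify-write step atomic.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_clients(counter: SharedCounter, clients: int = 8, requests_per_client: int = 1000) -> int:
    """Simulate several clients hitting the same resource at the same time."""

    def client() -> None:
        for _ in range(requests_per_client):
            counter.increment()

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Because every update happens under the lock, the final count always equals the total number of requests made.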

UNIT 2
1.Explain marshalling in detail.
Marshalling in Distributed System
A distributed system consists of numerous components located on different machines that
communicate and coordinate operations to seem like a single system to the end user.
Because different computers may represent data differently, there should be a means to convert
data to a standard format so that it can be sent
successfully between computers. Two approaches are possible. In the first, the values are
converted to an agreed-upon external format before transmission and then converted to the local format on
receipt; if the two computers are known to be of the same type, the
conversion to external format can be skipped. In the second, values are sent in the sender’s format, along with a description of the format,
and the recipient converts them if necessary. It is worth noting, though, that the bytes themselves are never
changed during transmission.
Marshalling: Marshalling is the process of taking a collection of data items and assembling
them into an external data representation suitable for transmission in a message.
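A minimal sketch of marshalling into an agreed external format, using Python's struct module with network (big-endian) byte order. The record layout (an integer, a floating-point number, and an 8-byte name field) and the function names are assumptions chosen for this example.

```python
import struct

# "!" selects network (big-endian) byte order with standard sizes:
# i = 32-bit integer, d = 64-bit float, 8s = fixed 8-byte string field.
RECORD_FORMAT = "!id8s"


def marshal(person_id: int, balance: float, name: str) -> bytes:
    """Flatten the data items into the agreed external representation."""
    return struct.pack(RECORD_FORMAT, person_id, balance, name.encode("ascii"))


def unmarshal(data: bytes):
    """Rebuild the local values from a received message."""
    person_id, balance, raw_name = struct.unpack(RECORD_FORMAT, data)
    # Short names are padded with zero bytes by pack, so strip them again.
    return person_id, balance, raw_name.rstrip(b"\x00").decode("ascii")


message = marshal(7, 12.5, "Smith")
```

Since both sides agree on the format string, the receiver recovers the same values regardless of its own native byte order.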
Approaches:
There are three common ways to represent data so that it can be communicated between
different sorts of computers.
1. Common Object Request Broker Architecture (CORBA):
CORBA is a specification defined by the Object Management Group (OMG) for middleware that has
been widely used in distributed systems. It allows systems with diverse
architectures, operating systems, programming languages, and computer hardware to work
together.
Marshalling in CORBA:
Marshalling operations can be generated automatically from the specification of the types of
the data items to be transmitted in a message.
2. Java’s Object Serialization:
Java Remote Method Invocation (RMI) allows you to pass both objects and primitive data
values as arguments and results of method calls. In Java, the term serialization refers to the activity of
putting an object (an instance of a class) or a set of related objects into a serial format suitable
for saving to disk or sending in a message.
3. Extensible Markup Language (XML):
Clients communicate with web services using XML, which is also used to define the interfaces
and other aspects of web services. However, XML is utilized in a variety of different
applications, including archiving and retrieval systems; while an XML archive is larger than a
binary archive, it has the advantage of being readable on any machine. Other XML applications
include the design of user interfaces and the encoding of operating system configuration files.
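To make the XML approach concrete, the sketch below marshals a small record into XML text and back with Python's xml.etree.ElementTree. The element names (person, name, place, year) and the function names are assumptions for illustration and are not taken from any particular web service.

```python
import xml.etree.ElementTree as ET


def to_xml(name: str, place: str, year: int) -> str:
    """Marshal the record into a self-describing XML document."""
    person = ET.Element("person")
    ET.SubElement(person, "name").text = name
    ET.SubElement(person, "place").text = place
    ET.SubElement(person, "year").text = str(year)
    return ET.tostring(person, encoding="unicode")


def from_xml(text: str):
    """Unmarshal the XML document back into local values."""
    person = ET.fromstring(text)
    return (
        person.findtext("name"),
        person.findtext("place"),
        int(person.findtext("year")),
    )


document = to_xml("Smith", "London", 1984)
```

The result is larger than a binary encoding of the same three values, but it is readable text that any machine can parse.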
2.Explain multicast transmission in DS?
Multicast is a method of group communication in which the sender sends data to multiple receivers
or nodes in the network simultaneously. Multicasting is a type of one-to-many and
many-to-many communication, as it allows one or more senders to send data packets to multiple receivers
at once across LANs or WANs. This helps to minimize the amount of traffic on the network.
Multicasting works similarly to broadcasting, but in multicasting the information is sent only to the
targeted or specific members of the network. The same task could be accomplished by transmitting
individual copies to each user or node in the network, but sending individual copies
is inefficient and may increase network latency. To overcome these shortcomings,
multicasting allows a single transmission to be delivered to multiple users,
which reduces the bandwidth consumed.

Applications :
Multicasting is used in many areas like:
1. Internet protocol (IP)
2. Streaming Media
It also supports video conferencing applications and webcasts.
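A minimal sketch of IP multicast with UDP sockets in Python follows. The group address 224.1.1.1, the port 5007, and the function names are assumptions chosen for this example, and whether the datagrams are actually delivered depends on the local network configuration.

```python
import socket
import struct

GROUP = "224.1.1.1"   # hypothetical address in the IPv4 multicast range
PORT = 5007           # hypothetical port


def membership_request(group: str) -> bytes:
    """Build the join request: group address followed by 'any local interface'."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def make_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Create a UDP socket that has joined the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    return sock


def send_to_group(payload: bytes, group: str = GROUP, port: int = PORT) -> None:
    """Send one datagram; the network delivers it to every member of the group."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP) as sock:
        # A TTL of 1 keeps the datagram on the local network.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(payload, (group, port))
```

Each receiver calls make_receiver() and then recvfrom() on the returned socket, while the sender issues a single send_to_group() call no matter how many receivers have joined.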
3.Methods in Inter-process Communication.
Inter-process communication (IPC) is a set of programming interfaces that allow
programs to coordinate activities and exchange data among processes that run
concurrently in an operating system. These are the methods in IPC:

1. Named Pipes (Different Processes) –
This is a pipe with a specific name, so it can be used by processes that do not have a shared
common process origin. An example is a FIFO, where the data written to the pipe first is the data read first.
2. Message Queuing –
This allows messages to be passed between processes using either a single queue or
several message queues. The queues are managed by the system kernel, and the messages are coordinated
using an API.
3. Semaphores –
These are used to solve synchronization problems and to avoid race
conditions. They are integer values that are greater than or equal to 0.
4. Shared memory –
This allows the interchange of data through a defined area of memory. Semaphore values
have to be obtained before data can get access to shared memory.
5. Sockets –
This method is mostly used to communicate over a network between a client and a server.
It allows for a standard connection that is computer and OS independent.
6. Pipes (Same Process) –
This allows flow of data in one direction only, analogous to simplex systems
(such as a keyboard). Data from the output is usually buffered until the input process receives it.
The communicating processes must have a common origin.
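As an illustration of pipes between processes with a common origin, the sketch below starts a child process and talks to it through the pipes attached to its standard input and output, using Python's subprocess module. The child program and the function name shout_via_pipes are made up for this example.

```python
import subprocess
import sys

# The child reads everything from its standard input and writes it back in upper case.
CHILD_PROGRAM = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def shout_via_pipes(text: str) -> str:
    """Send text to a child process through one pipe and read its answer from another."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=text,            # written to the pipe connected to the child's stdin
        capture_output=True,   # the child's stdout is read back through a second pipe
        text=True,
        check=True,
    )
    return result.stdout
```

Each pipe carries data in one direction only, which is why two pipes are needed for the round trip between parent and child.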
4.Discuss the issues relating to datagram communication.

The following are some issues relating to datagram communication:

Message size: The receiving process needs to specify an array of bytes of a particular size in
which to receive a message. If the message is too big for the array, it is truncated on arrival.

Blocking: Sockets normally provide non-blocking sends and blocking receives for datagram
communication.

Timeouts: A receive that blocks forever is not always appropriate. A process that has invoked a
receive operation should not wait indefinitely in situations
where the sending process may have crashed or the expected message may have been lost. To
allow for such requirements, timeouts can be set on sockets.

Receive from any: The receive method does not specify an origin for messages. Instead, an
invocation of receive gets a message addressed to its socket from any origin.

Failure model for UDP datagrams


UDP datagrams suffer from the following failures:
Omission failures: Messages may be dropped occasionally, either because of a checksum error
or because no buffer space is available at the source or destination.
Ordering: Messages can sometimes be delivered out of sender order.
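The message size, timeout, and receive-from-any points above can be seen in a short UDP sketch using Python's socket module on the loopback interface. The buffer size, the timeout value, and the function names are assumptions for this example.

```python
import socket


def receive_one(sock: socket.socket, bufsize: int = 1024, timeout: float = 0.5):
    """Wait for one datagram from any origin; return None if the timeout expires."""
    sock.settimeout(timeout)
    try:
        # bufsize is the size of the byte array offered for the incoming message.
        data, origin = sock.recvfrom(bufsize)
        return data, origin
    except socket.timeout:
        return None


def demo():
    """Send one datagram over loopback, then try to receive two."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"hello", receiver.getsockname())   # the send does not block
        first = receive_one(receiver)
        second = receive_one(receiver)   # nothing else was sent, so this times out
        return first, second
    finally:
        sender.close()
        receiver.close()
```

The first receive returns the data together with the sender's address, and the second gives up after the timeout instead of blocking forever.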

UNIT 3
1.Implementation of remote method invocation.

The RMI implementation consists of three abstraction layers.

These abstraction layers are:


1. The Stub and Skeleton layer, which intercepts method calls made by the client to the
interface reference variable and redirects these calls to a remote RMI service.
2. The Remote Reference layer understands how to interpret and manage references made
from clients to the remote service objects.
3. The bottom layer is the Transport layer, which is based on TCP/IP connections between
machines in a network. It provides basic connectivity, as well as some firewall
penetration strategies.

2.Explain communication between distributed objects.

In a distributed computing environment, distributed object communication realizes
communication between distributed objects. The main role is to allow objects to access data and
invoke methods on remote objects (objects residing in non-local memory space). Invoking a
method on a remote object is known as remote method invocation (RMI) or remote invocation,
and is the object-oriented programming analog of a remote procedure call (RPC).

A widely used approach to implementing the communication channel is to
use stubs and skeletons. They are generated objects whose structure and behavior depend on the
chosen communication protocol, but in general they provide additional functionality that ensures
reliable communication over the network.
In RMI, a stub (which is the bit on the client) is defined by the programmer as an interface. The
rmic (rmi compiler) uses this to create the class stub. The stub performs type checking. The
skeleton is defined in a class which implements the interface stub.[1]
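The division of work between stub and skeleton can be sketched in a few lines of Python. This is a toy model and not Java RMI itself: the Calculator class, the JSON message format, and the in-memory transport are all assumptions made so that the example stays self-contained.

```python
import json


class Calculator:
    """The remote object living on the server (hypothetical example)."""

    def add(self, a, b):
        return a + b


class Skeleton:
    """Server side: unmarshals a request, invokes the target, marshals the reply."""

    def __init__(self, target):
        self._target = target

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request.decode())
        method = getattr(self._target, call["method"])
        result = method(*call["args"])
        return json.dumps({"result": result}).encode()


class Stub:
    """Client side: looks like the remote object but forwards each call as a message."""

    def __init__(self, transport):
        self._transport = transport   # any callable that takes and returns bytes

    def __getattr__(self, name):
        def remote_call(*args):
            request = json.dumps({"method": name, "args": list(args)}).encode()
            reply = self._transport(request)
            return json.loads(reply.decode())["result"]

        return remote_call


# Here the "network" is a direct function call; a real system would use sockets.
skeleton = Skeleton(Calculator())
calculator = Stub(transport=skeleton.handle)
```

The client calls calculator.add(2, 3) as if the object were local, while the stub and skeleton handle the marshalling and dispatch behind the scenes.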

3.Discuss about various remote procedure calls.

Types of remote procedure call

These are the five types of remote procedure call.

Synchronous
This is the normal method of operation. The client makes a call and does not continue
until the server returns the reply.
Nonblocking
The client makes a call and continues with its own processing. The server does not
reply.
Batching
This is a facility for sending several client nonblocking calls in one batch.
Broadcast RPC
RPC clients have a broadcast facility, that is, they can send messages to many servers
and then receive all the consequent replies.
Callback RPC
The client makes a nonblocking client/server call, and the server signals completion by
calling a procedure associated with the client.
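The difference between the synchronous, nonblocking, and callback styles can be simulated locally with a thread pool standing in for the server. This is only an analogy under stated assumptions: remote_square, on_complete, and the log list are hypothetical, and no real RPC library is involved.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def remote_square(x: int) -> int:
    """Stands in for a procedure executed by the server."""
    return x * x


def demo() -> dict:
    results = {}
    log = []                              # stands in for state kept on the server
    callback_done = threading.Event()

    def on_complete(future) -> None:
        # Called when the "server" finishes, as in callback RPC.
        results["callback"] = future.result()
        callback_done.set()

    with ThreadPoolExecutor(max_workers=2) as server:
        # Synchronous: the client waits here until the reply arrives.
        results["synchronous"] = server.submit(remote_square, 3).result()

        # Nonblocking: the client issues the call and carries on; no reply is used.
        server.submit(log.append, "started")

        # Callback: the client carries on and is notified through its own procedure.
        server.submit(remote_square, 5).add_done_callback(on_complete)
        callback_done.wait(timeout=5)

    # Leaving the with-block waits for all outstanding calls to finish.
    results["log"] = list(log)
    return results
```

The synchronous call is the only one that holds the client up; the other two return control immediately and differ in whether the client is told when the work is done.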
4.What is the importance of distributed garbage collection and explain its algorithm.

Importance

Distributed systems typically require distributed garbage collection. If a client holds a proxy to
an object in the server, it is important that the server does not garbage-collect that object until the
client releases the proxy (at which point it can be validly garbage-collected). Most third-party distributed
systems, such as RMI, handle the distributed garbage collection, but that does not necessarily
mean it will be done efficiently. The overhead of distributed garbage collection and remote
reference maintenance in RMI can slow network communications by a significant amount when
many objects are involved.

Understanding distributed garbage collection-Algorithm

The RMI subsystem implements reference counting based Distributed Garbage Collection
(DGC) to provide automatic memory management facilities for remote server objects.

When the client creates (unmarshalls) a remote reference, it calls dirty() on the server-side DGC.
After the client has finished with the remote reference, it calls the corresponding clean() method.

A reference to a remote object is leased for a time by the client holding the reference. The lease
period starts when the dirty() call is received. The client must renew the leases by making
additional dirty() calls on the remote references it holds before such leases expire. If the client
does not renew the lease before it expires, the distributed garbage collector assumes that the
remote object is no longer referenced by that client.

DGCClient implements the client side of the RMI distributed garbage collection system. The
external interface to DGCClient is the registerRefs() method. When a LiveRef to a remote object
enters the JVM, it must be registered with the DGCClient to participate in distributed garbage
collection. When the first LiveRef to a particular remote object is registered, a dirty() call is
made to the server-side DGC for the remote object. The call returns a lease guaranteeing that the
server-side DGC will not collect the remote object for a certain time. While LiveRef instances to
remote objects on a particular server exist, the DGCClient periodically sends more dirty calls to
renew its lease. The DGCClient tracks the local availability of registered LiveRef instances using
phantom references. When the LiveRef instance for a particular remote object is garbage
collected locally, a clean() call is made to the server-side DGC. The call indicates that the server
does not need to keep the remote object alive for this client. The RenewCleanThread handles the
asynchronous client-side DGC activity by renewing the leases and making clean calls. So this
thread waits until the next lease renewal or until any phantom reference is queued for generating
clean requests as necessary.
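The lease mechanism described above can be modelled in a few lines. The sketch below is a simplified Python model and not the Java RMI API: the LeaseTable class is hypothetical, its dirty and clean methods only mirror the names used above, and time is passed in explicitly so that the behavior is easy to follow.

```python
class LeaseTable:
    """Server-side record of which clients hold a lease on one remote object."""

    def __init__(self, lease_seconds: float) -> None:
        self._lease_seconds = lease_seconds
        self._expiry = {}   # client id -> time at which that client's lease expires

    def dirty(self, client_id: str, now: float) -> float:
        """Grant or renew a lease for the client and return its expiry time."""
        self._expiry[client_id] = now + self._lease_seconds
        return self._expiry[client_id]

    def clean(self, client_id: str) -> None:
        """The client has finished with its reference."""
        self._expiry.pop(client_id, None)

    def collectable(self, now: float) -> bool:
        """Drop expired leases; the object may be collected once none remain."""
        self._expiry = {c: t for c, t in self._expiry.items() if t > now}
        return not self._expiry
```

A client that crashes without calling clean() simply stops renewing, so its lease eventually expires and the object no longer counts as referenced by it.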
