
Chapter 02: Processes

Processes

We take a closer look at how the different types of processes play a crucial role in
distributed systems.

A process is often defined as a program in execution.

To execute a program, an operating system creates a number of virtual processors,
each one for running a different program. To keep track of these virtual processors,
the operating system has a process table.

Threads play a crucial role in obtaining performance in multi-core and multiprocessor
environments, but also help in structuring clients and servers.

Introduction to threads

Basic idea
We build virtual processors in software, on top of physical processors:
Processor: Provides a set of instructions along with the capability of
automatically executing a series of those instructions.
Thread: A minimal software processor in whose context a series of
instructions can be executed.
Process: A software processor in whose context one or more threads
may be executed.
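
To make the distinction concrete, here is a minimal Python sketch (the language choice
and names are illustrative, not part of the original slides): one process provides the
context in which several threads each execute their own series of instructions.

    import threading

    def task(name):
        # Each thread executes its own series of instructions
        # within the context of the same process.
        print(f"thread {name} running inside the process")

    threads = [threading.Thread(target=task, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()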


Context switching
Contexts
• Processor context: The minimal collection of values stored in the
registers of a processor used for the execution of a series of instructions
(e.g., stack pointer, addressing registers, program counter).
• Thread context: The minimal collection of values stored in registers and
memory, used for the execution of a series of instructions (i.e., processor
context, state).
• Process context: The minimal collection of values stored in registers
and memory, used for the execution of a thread (i.e., thread context).


Why use threads


Some simple reasons
• Avoid needless blocking: a single-threaded process will block when
doing I/O; in a multi-threaded process, the operating system can switch
the CPU to another thread in that process.
• Exploit parallelism: the threads in a multi-threaded process can be
scheduled to run in parallel on a multiprocessor or multi-core processor.
• Avoid process switching: structure large applications not as a collection of
processes, but through multiple threads.

Introduction to threads
Processes Threads

Avoid process switching


Avoid expensive context switching

1. Threads share the same address space. Thread context switching can be done
entirely independently of the operating system.
2. Process switching is generally (somewhat) more expensive as it involves getting
the OS in the loop.
3. Creating and destroying threads is much cheaper than doing so for processes.
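
A rough way to see point 3 is to time worker creation. The following Python sketch is a
hypothetical micro-benchmark (absolute numbers depend on the OS and runtime), but
creating and joining threads is typically much cheaper than doing the same with processes.

    import multiprocessing
    import threading
    import time

    def task():
        pass

    def time_spawn(factory, n=200):
        # Measure how long it takes to create, start, and join n workers.
        start = time.perf_counter()
        workers = [factory(target=task) for _ in range(n)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return time.perf_counter() - start

    if __name__ == "__main__":   # guard required for multiprocessing on some platforms
        print("threads:  ", time_spawn(threading.Thread))
        print("processes:", time_spawn(multiprocessing.Process))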


Threads and operating systems


Main issue
Should an OS kernel provide threads, or should they be implemented as
user-level packages?

User-space solution
• All operations can be completely handled within a single process ⇒
implementations can be extremely efficient.
• All services provided by the kernel are done on behalf of the process in
which a thread resides ⇒ if the kernel decides to block a thread, the
entire process will be blocked.
• Threads are used when there are many external events: threads block
on a per-event basis ⇒ if the kernel can’t distinguish threads, how can it
support signaling events to them?


Threads and operating systems


Kernel solution
The whole idea is to have the kernel contain the implementation of a thread
package. This means that all thread operations are implemented as system calls:
• Operations that block a thread are no longer a problem: the kernel
schedules another available thread within the same process.
• Handling external events is simple: the kernel (which catches all events)
schedules the thread associated with the event.
• The problem is (or used to be) the loss of efficiency because each
thread operation requires a trap to the kernel.

Conclusion – but
Attempts have been made to mix user-level and kernel-level threads into a single
concept; however, the performance gain has generally not outweighed the increased
complexity.


Using threads at the client side


Multi-threaded web client
Hiding network latencies:
• Web browser scans an incoming HTML page, and finds that more files
need to be fetched.
• Each file is fetched by a separate thread, each doing a (blocking) HTTP
request.
• As files come in, the browser displays them.
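
A minimal Python sketch of this idea, assuming a hypothetical list of embedded files
already extracted from the page: each file is fetched by a separate thread doing a
blocking HTTP request, and files are displayed as they come in.

    from concurrent.futures import ThreadPoolExecutor, as_completed
    from urllib.request import urlopen

    # Hypothetical embedded files discovered while scanning the HTML page.
    urls = [
        "http://example.com/logo.png",
        "http://example.com/style.css",
        "http://example.com/script.js",
    ]

    def fetch(url):
        # Each worker thread performs its own blocking HTTP request.
        with urlopen(url) as resp:
            return url, resp.read()

    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        futures = [pool.submit(fetch, u) for u in urls]
        for fut in as_completed(futures):   # display files as they come in
            url, body = fut.result()
            print(url, len(body), "bytes")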

Multiple request-response calls to other machines (RPC)


• A client does several calls at the same time, each one by a different
thread.
• It then waits until all results have been returned.
• Note: if calls are to different servers, we may have a linear speed-up.
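
A minimal Python sketch of this pattern, with a hypothetical stub standing in for the
actual RPC: each call runs in its own thread, and the client joins all threads before
using the results.

    import threading

    results = {}

    def call(server, request):
        # Hypothetical stub: a real client would issue an RPC to `server` here.
        results[server] = f"reply from {server} to {request}"

    servers = ["server-a", "server-b", "server-c"]
    threads = [threading.Thread(target=call, args=(s, "query")) for s in servers]
    for t in threads:        # each call is performed by a different thread
        t.start()
    for t in threads:        # the client waits until all results have been returned
        t.join()
    print(results)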


Using threads at the server side


Improve performance
• Starting a thread is cheaper than starting a new process.
• Having a single-threaded server prohibits simple scale-up to a
multiprocessor system.
• As with clients: hide network latency by reacting to the next request while the
previous one is still being replied to.

Better structure
• Most servers have high I/O demands. Using simple, well-understood
blocking calls simplifies the structure.
• Multi-threaded programs tend to be smaller and easier to understand
due to simplified flow of control.


Why multi-threading is popular: organization


Dispatcher/worker model

Overview
Model                       Characteristics
Multithreading              Parallelism, blocking system calls
Single-threaded process     No parallelism, blocking system calls
Finite-state machine        Parallelism, nonblocking system calls
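
As an illustration of the dispatcher/worker organization, here is a minimal Python
sketch (pool size, queue use, and request format are arbitrary choices): a dispatcher
puts incoming requests in a queue, and a pool of worker threads handles them using
blocking operations.

    import queue
    import threading

    requests = queue.Queue()

    def worker():
        # Worker thread: take the next request handed over by the dispatcher
        # and handle it, possibly using blocking system calls.
        while True:
            req = requests.get()
            print("handling", req)
            requests.task_done()

    for _ in range(4):                      # a small pool of worker threads
        threading.Thread(target=worker, daemon=True).start()

    # Dispatcher: read incoming requests and pass each one on to a worker.
    for i in range(10):
        requests.put(f"request-{i}")
    requests.join()                         # wait until all requests are handled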



Virtualization
Virtualization is important:
• Hardware changes faster than software
• Ease of portability and code migration
• Isolation of failing or attacked components

Principle: mimicking interfaces


Mimicking interfaces
Four types of interfaces at three different levels
1. Instruction set architecture: the set of machine instructions, with two
subsets:
• Privileged instructions: allowed to be executed only by the
operating system.
• General instructions: can be executed by any program.
2. System calls as offered by an operating system.
3. Library calls, known as an application programming interface (API).


Ways of virtualization

(a) Process VM (b) Native VMM (c) Hosted VMM

Differences
a) A separate set of instructions, with an interpreter/emulator running on top of an OS.
b) Low-level instructions, along with a bare-bones, minimal operating system.
c) Low-level instructions, but delegating most work to a full-fledged OS.


Containers

• Namespaces: a collection of processes in a container is given its own view of
identifiers.
• Union file system: combine several file systems in a layered fashion, with only the
highest layer (the one being part of the container) allowing write operations.
• Control groups: resource restrictions can be imposed upon a collection of
processes.

Example: The X Window system


Basic organization

X client and server


The application acts as a client to the X-kernel, the latter running as a
server on the client’s machine.


Virtual desktop environment


Logical development
With an increasing number of cloud-based applications, the question is how to use
those applications from a user's premises.
• Issue: develop the ultimate networked user interface
• Answer: use a Web browser to establish a seamless experience

The Google Chromebook



Servers: General organization


Server: a process implementing a specific service on behalf of a collection
of clients.

Two basic types


• Iterative server: handles a request completely before attending to the next one.
• Concurrent server: uses a dispatcher, which picks up an incoming request and
passes it on to a separate thread/process.

Concurrent servers are the norm: they can easily handle multiple requests,
notably in the presence of blocking operations (to disks or other servers).
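
A minimal Python sketch of a concurrent server, assuming an echo "service" and an
arbitrary port: the main loop acts as the dispatcher, and each accepted request is
passed on to a separate thread.

    import socket
    import threading

    def handle(conn):
        # Worker: handle one request, possibly blocking on I/O, then close.
        data = conn.recv(1024)
        conn.sendall(data)                  # echo back as a stand-in "service"
        conn.close()

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", 9000))                    # hypothetical end point
    srv.listen()
    while True:                             # dispatcher: pick up incoming requests
        conn, _addr = srv.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()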


Contacting a server
Most services are tied to a specific port:

Service    Port   Description
ftp-data   20     File Transfer [Default Data]
ftp        21     File Transfer [Control]
telnet     23     Telnet
smtp       25     Simple Mail Transfer
www        80     Web (HTTP)
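
As a small illustration (host name and request are placeholders), a client contacts
such a service simply by connecting to the machine at the well-known port, here HTTP
on port 80:

    import socket

    # Connect to a web server on its well-known port and issue a request.
    s = socket.create_connection(("example.com", 80))
    s.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    print(s.recv(4096).decode(errors="replace"))
    s.close()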

Dynamically assigning an end point: two approaches


Out-of-band communication
Issue
Is it possible to interrupt a server once it has accepted (or is in the process of
accepting) a service request?

Solution 1: Use a separate port for urgent data


• Server has a separate thread/process for urgent messages
• Urgent message comes in ⇒ associated request is put on hold
• Note: we require that the OS supports priority-based scheduling

Solution 2: Use facilities of the transport layer


• Example: TCP allows for urgent messages in same connection
• Urgent messages can be caught using OS signaling techniques
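
A minimal Python sketch of Solution 2, assuming a server is already listening on an
arbitrary local port; TCP urgent data is sent with the MSG_OOB flag (support and exact
semantics are platform-dependent).

    import socket

    # Assumes some server is already listening on ("localhost", 9000).
    conn = socket.create_connection(("localhost", 9000))
    conn.sendall(b"ordinary request data")        # in-band data
    conn.send(b"!", socket.MSG_OOB)               # one byte of urgent (out-of-band) data
    conn.close()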


Servers and state


Stateless servers
Never keep accurate information about the status of a client after having handled a
request:
• Don't record whether a file has been opened (simply close it again after access)
• Don't promise to invalidate a client's cache
• Don't keep track of your clients

Stateful servers
Keep track of the status of their clients:
• Record that a file has been opened, so that prefetching can be done
• Know which data a client has cached, and allow clients to keep local copies of
shared data
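
The contrast can be sketched with two hypothetical file-server handlers in Python: the
stateless variant opens and closes the file on every request, while the stateful
variant remembers which files each client has opened.

    # Stateless style: nothing about the client survives the request.
    def read_chunk(filename, offset, nbytes):
        with open(filename, "rb") as f:           # open, read, and close every time
            f.seek(offset)
            return f.read(nbytes)

    # Stateful style: the server remembers which files a client has opened.
    open_files = {}                               # (client_id, filename) -> file object

    def open_file(client_id, filename):
        open_files[(client_id, filename)] = open(filename, "rb")

    def read_next(client_id, filename, nbytes):
        return open_files[(client_id, filename)].read(nbytes)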


Code migration
Code migration involves passing programs and data: when a program migrates while running,
its execution status, pending signals, and other parts of its environment, such as the
stack and the program counter, also have to be moved.
Reasons to migrate code
Load distribution
● Ensuring that servers in a data center are sufficiently loaded (e.g., to
prevent waste of energy)
● Minimizing communication by ensuring that computations are close to
where the data is (think of mobile computing).

Flexibility: moving code to a client when needed

Avoids pre-installing software and increases dynamic configuration.



Reasons to migrate code

Privacy and security


In many cases, one cannot move data to another location, for whatever
reason (often legal ones).
Solution: move the code to the data.

Example: federated machine learning


Paradigms for code mobility


Strong and weak mobility


Object components
• Code segment: contains the actual code
• Data segment: contains the state
• Execution state: contains context of thread executing the object’s code

Weak mobility: Move only code and data segment (and reboot
execution)
• Relatively simple, especially if code is portable
• Distinguish code shipping (push) from code fetching (pull)

Strong mobility: Move component, including execution state


• Migration: move entire object from one machine to the other
• Cloning: start a clone, and set it in the same execution state.
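
A minimal Python sketch of weak mobility, with all names hypothetical: the code segment
and data segment are shipped as text, and execution is simply restarted from scratch at
the receiving side.

    # Code segment: shipped as plain text to the target machine.
    code_segment = """
    def compute(data):
        return sum(data)

    result = compute(data_segment)
    """

    # Data segment: the state that travels along with the code.
    data_segment = [1, 2, 3, 4]

    # At the receiving side: rebind the data segment and restart execution.
    env = {"data_segment": data_segment}
    exec(code_segment, env)
    print(env["result"])                          # -> 10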


Summary of models of code migration


Overview of Software Agents

Software agents: software processes or robots that can move around the system
to do specific tasks for which they are specially programmed.

Agent architectures can be mobile, interface, collaborative, information, reactive,
and hybrid.

Software agents can be employed to monitor, detect, prevent, and respond to various
security threats and vulnerabilities.

Challenges in distributed agent systems include coordination mechanisms among the
agents, controlling the mobility of the agents, and their software design and
interfaces.


Software agents main characteristics


Why Software Agents

Automation: automate repetitive and time-consuming tasks.

Proactive monitoring (with real-time responses): continuously monitor systems,
networks, and user activities, enabling proactive detection of security threats and
vulnerabilities.

Scalability: agents can be deployed across multiple nodes, allowing for distributed
and scalable security monitoring and enforcement.

Adaptability: adapt to changing environments and evolving threats.

Collaboration: share information, coordinate responses, and collectively defend
against distributed attacks.

Reduced human error: minimize human error by enforcing security policies
consistently and accurately.

Examples of Software Agents


Web Crawlers

Virtual Personal Assistants

Surveillance agents

Robotic Process Automation (RPA) agents

Smart Home agents

Autonomous Vehicles

Antivirus agents

Buyer agents

etc.
