Unit 1 Notes of Advanced Operating System
Course Objectives:
Understand high-level OS kernel structure.
Gain insight into hardware-software interactions for compute and I/O.
Develop practical skills in system tracing and performance analysis.
Become familiar with research ideas in system structure and behaviour.
Learn how to write systems-style performance evaluations.
Learning Outcomes:
1. Outline the potential benefits of distributed systems.
2. Summarize the major security issues associated with distributed systems, along with the range of techniques available for increasing system security.
3. Apply standard design principles in the construction of these systems.
4. Select appropriate approaches for building a range of distributed systems, including some that employ middleware.
Course Content:
UNIT I:
Overview of UNIX system calls. The anatomy of a system call and x86 mechanisms for system
call implementation. How the MMU/memory translation, segmentation, and hardware traps
interact to create kernel–user context separation. What makes virtualization work? The kernel
execution and programming context. Live debugging and tracing. Hardware and software
support for debugging.
UNIT II:
DTrace: programming, implementation/design, internals. Kprobes and SysTrace: Linux
catching up. Linking and loading. Executable and Linkable Format (ELF). Internals of linking
and dynamic linking. Internals of effective spinlock implementations on x86. OpenSolaris
adaptive mutexes: rationale and implementation optimization. Pre-emptive kernels. Effects of
modern memory hierarchies and related optimizations.
UNIT III:
Process and thread kernel data structures, process table traversal, lookup, allocation and
management of new structures, /proc internals, optimizations. Virtual File System and the
layering of a file system call from API to driver. Object-orientation patterns in kernel code; a
review of OO implementation generics (C++ vtables, etc).
UNIT IV:
OpenSolaris and Linux virtual memory and address space structures. Tying top-down and
bottom-up object and memory page lookups with the actual x86 page translation and
segmentation. How file operations, I/O buffering, and swapping all converged to using the
same mechanism. Kmem and Vmem allocators. OO approach to memory allocation.
Challenges of multiple CPUs and memory hierarchy. Security: integrity, isolation, mediation,
auditing. From MULTICS and MLS to modern UNIX. SELinux type enforcement: design,
implementation, and pragmatics. Kernel hook systems and policies they enable. Trap systems
and policies they enable. Tagged architectures and multi-level UNIX.
UNIT V:
ZFS overview. OpenSolaris boot environments and snapshots. OpenSolaris and UNIX System
V system administration pragmatics: service startup, dependencies, management, system
updates. Overview of the kernel network stack implementation. Path of a packet through a
kernel. Berkeley Packet Filter architecture. Linux Netfilter architecture.
Topics for Programs:
1. Getting Started with Kernel Tracing - I/O
2. Kernel Implications of IPC
3. Micro-Architectural Implications of IPC
4. The TCP State Machine
5. TCP Latency and Bandwidth
Text Books:
1. Jean Bacon, Concurrent Systems, Addison-Wesley, 1998.
2. William Stallings, Operating Systems, Prentice Hall, 1995.
3. Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems: Principles and Paradigms, 2nd Edition, Prentice Hall, 2007.
4. Silberschatz, Galvin, and Gagne, Operating System Concepts Essentials, 9th Edition.
UNIX System Overview
UNIX Architecture
An operating system can be defined as the software that controls the
hardware resources of the computer and provides an environment under
which programs can run. This software is generally called the kernel, since
it is relatively small and resides at the core of the environment. In the usual
diagram of the UNIX System architecture, the kernel sits at the center and is
surrounded, in turn, by the system call interface, the library routines, the
shell, and the application programs.
Linux is the kernel used by the GNU operating system. Some people refer
to this combination as the GNU/Linux operating system, but it is more
commonly referred to as simply Linux. Although this usage may not be
correct in a strict sense, it is understandable, given the dual meaning of the
phrase operating system.
Logging In
When we log in to a UNIX system, we enter our login name; the system looks the
name up in its password file and runs the login shell listed in our entry.
Shells
A shell is a command-line interpreter that reads user input and executes
commands. The user input to a shell is normally from the terminal (an
interactive shell) or sometimes from a file (called a shell script).
Files and Directories
File system
The UNIX file system is a hierarchical arrangement of directories and files.
Everything starts in the directory called root, whose name is the single
character /.
Filename
The names in a directory are called filenames. The only two characters that
cannot appear in a filename are the slash character (/) and the null
character. The slash separates the filenames that form a pathname
(described next) and the null character terminates a pathname. For
portability, POSIX.1 recommends restricting filenames to consist of the
following characters: letters (a-z, A-Z), numbers (0-9), period (.), dash (-),
and underscore (_).
Pathname
A sequence of one or more filenames, separated by slashes and optionally
starting with a slash, forms a pathname. A pathname that begins with a
slash is called an absolute pathname; otherwise, it’s called a relative
pathname.
Working Directory
Every process has a working directory, sometimes called the current
working directory. This is the directory from which all relative pathnames
are interpreted. A process can change its working directory with
the chdir function.
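As a small, hedged illustration (not taken from the course text), the program
below prints its working directory with getcwd, changes it with chdir, and
prints it again; the target directory /tmp is an arbitrary choice.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <limits.h>

int
main(void)
{
    char buf[PATH_MAX];

    /* Print the working directory inherited from the shell. */
    if (getcwd(buf, sizeof(buf)) == NULL) {
        perror("getcwd");
        exit(1);
    }
    printf("before: %s\n", buf);

    /* Change the working directory; /tmp is just an example target. */
    if (chdir("/tmp") < 0) {
        perror("chdir");
        exit(1);
    }

    if (getcwd(buf, sizeof(buf)) == NULL) {
        perror("getcwd");
        exit(1);
    }
    printf("after:  %s\n", buf);
    exit(0);
}

The change affects only this process; the shell that started the program keeps
its own working directory.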
Home Directory
When we log in, the working directory is set to our home directory, which is obtained from
our entry in the password file.
Input and Output
File Descriptors
File descriptors are normally small non-negative integers that the kernel
uses to identify the files accessed by a process. Whenever it opens an
existing file or creates a new file, the kernel returns a file descriptor that we
use when we want to read or write the file.
Standard Input, Standard Output, and Standard Error
By convention, all shells open three descriptors whenever a new program
is run: standard input, standard output, and standard error.
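As a minimal sketch of how these descriptors are used (an assumed example, not
from the notes), the program below copies its standard input to its standard
output using the constants STDIN_FILENO and STDOUT_FILENO (descriptors 0 and 1)
from <unistd.h>.

#include <stdlib.h>
#include <unistd.h>

#define BUFFSIZE 4096

int
main(void)
{
    char buf[BUFFSIZE];
    ssize_t n;

    /* read(2) from descriptor 0 and write(2) to descriptor 1 until end of file */
    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0)
        if (write(STDOUT_FILENO, buf, n) != n)
            exit(1);    /* write error */

    if (n < 0)
        exit(1);        /* read error */
    exit(0);
}

Run as ./a.out < infile > outfile, the same code copies one file to another,
because the shell opens those files on descriptors 0 and 1 before the program
starts.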
Programs and Processes
Program
A program is an executable file residing on disk in a directory.
Processes and Process ID
An executing instance of a program is called a process.
The UNIX System guarantees that every process has a unique numeric
identifier called the process ID. The process ID is always a non-negative
integer.
Process Control
The primary functions for process control are fork, exec, and waitpid.
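A hedged sketch of these calls (the choice of the date command is arbitrary):
the parent forks a child, the child replaces its image with the date(1) program
via execlp, and the parent waits for the child to terminate.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int
main(void)
{
    pid_t pid;
    int status;

    if ((pid = fork()) < 0) {           /* create a new process */
        perror("fork");
        exit(1);
    } else if (pid == 0) {              /* child: run another program */
        execlp("date", "date", (char *)0);
        perror("execlp");               /* reached only if exec fails */
        _exit(127);
    }

    /* parent: wait for the child and report how it exited */
    if (waitpid(pid, &status, 0) != pid) {
        perror("waitpid");
        exit(1);
    }
    printf("child %ld terminated, status %d\n", (long)pid, status);
    exit(0);
}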
Threads and Thread IDs
Threads are identified by IDs. Thread IDs are local to a process. A thread
ID from one process has no meaning in another process. We use thread
IDs to refer to specific threads as we manipulate the threads within a
process.
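The sketch below assumes POSIX threads (pthreads), which the notes do not name
explicitly; it creates one extra thread and prints the IDs returned by
pthread_self in both threads, illustrating that the IDs have meaning only within
this process. On Linux it is compiled with the -pthread option.

#include <stdio.h>
#include <string.h>
#include <pthread.h>

/* The new thread prints its own ID; the cast to unsigned long is only
   for printing and is not guaranteed to be portable. */
static void *
thr_fn(void *arg)
{
    printf("new thread ID:  %lu\n", (unsigned long)pthread_self());
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    int err;

    if ((err = pthread_create(&tid, NULL, thr_fn, NULL)) != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(tid, NULL);
    printf("main thread ID: %lu\n", (unsigned long)pthread_self());
    return 0;
}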
Error Handling
When an error occurs in one of the UNIX System functions, a negative
value is often returned, and the integer errno is usually set to a value that
tells why. Some functions use a convention other than returning a negative
value. For example, most functions that return a pointer to an object return a
null pointer to indicate an error.
The file <errno.h> defines the symbol errno and constants for each value
that errno can assume. Each of these constants begins with the
character E. On Linux, the error constants are listed in the errno(3) manual
page.
POSIX and ISO C define errno as a symbol expanding into a modifiable
lvalue of type integer. This can be either an integer that contains the error
number or a function that returns a pointer to the error number. The
historical definition is:
extern int errno;
The following functions are defined by the C standard to help with printing
error messages.
#include <string.h>
char *strerror(int errnum);
The strerror function maps errnum, which is typically the errno value, into an
error message string and returns a pointer to the string.
#include <stdio.h>
void perror(const char *msg);
The perror function produces an error message on the standard error, based on
the current value of errno, and then returns.
The following code shows the use of these two error functions.
#include "apue.h"   /* the book's common header; pulls in <stdio.h>, <string.h>, <stdlib.h> */
#include <errno.h>

int
main(int argc, char *argv[])
{
    /* strerror: look up the message for an explicit error number */
    fprintf(stderr, "EACCES: %s\n", strerror(EACCES));

    /* perror: print argv[0], a colon, and the message for the current errno */
    errno = ENOENT;
    perror(argv[0]);
    exit(0);
}
$ ./a.out
EACCES: Permission denied
./a.out: No such file or directory
Types of System Calls
System calls fall broadly into the categories below; a short C example follows the list.
1. Process control
o create process (for example, fork on Unix-like systems,
or NtCreateProcess in the Windows NT Native API)
o terminate process
o load, execute
o get/set process attributes
o wait for time, wait event, signal event
o allocate and free memory
2. File management
o create file, delete file
o open, close
o read, write, reposition
o get/set file attributes
3. Device management
o request device, release device
o read, write, reposition
o get/set device attributes
o logically attach or detach devices
4. Information maintenance
o get/set total system information (including time, date, computer
name, enterprise etc.)
o get/set process, file, or device metadata (including author,
opener, creation time and date, etc.)
5. Communication
o create, delete communication connection
o send, receive messages
o transfer status information
o attach or detach remote devices
6. Protection
o get/set file permissions
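As a short C example of the categories above (a sketch; the file name demo.txt
is arbitrary), the program below uses file-management calls to create, write,
and delete a file, and a protection call to change its permission bits.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int
main(void)
{
    const char *path = "demo.txt";          /* arbitrary example file name */
    const char *msg  = "hello, world\n";
    int fd;

    /* File management: create and open, then write and close. */
    if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        perror("open");
        exit(1);
    }
    if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg)) {
        perror("write");
        exit(1);
    }
    close(fd);

    /* Protection: change the file's permission bits. */
    if (chmod(path, 0600) < 0) {
        perror("chmod");
        exit(1);
    }

    /* File management: delete the file again. */
    if (unlink(path) < 0) {
        perror("unlink");
        exit(1);
    }
    exit(0);
}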
Hardware and Software Support for Debugging
Traditional approaches to debugging and testing embedded processors through a JTAG port are no longer sufficient.
With JTAG we have two basic functions: stopping the processor at any instruction or data access, and
examining the system state or changing it from outside. The problem with this approach is that it is obtrusive – the order of
events during debugging may deviate from the order of events during “native” program execution when no interference from
debugging operations is present. These deviations can cause the original problem to disappear in the debug run. Moreover,
stepping through the program is time-consuming for programmers and is simply not an option for real-time embedded
systems. For instance, setting a breakpoint may be impossible or harmful in real-time systems such as a hard drive or
vehicle engine controller. A number of even more challenging issues arise in multi-core systems. They may have multiple
clock and power domains, and we must be able to support debugging of each core, regardless of what the other cores are
doing. Debugging through a JTAG port is not well suited to meet these challenges. To meet these challenges and get
reliable and high-performance products on the market on time, software developers increasingly rely upon on-chip resources
for debugging and tracing.
The existing commercially available trace modules rely either on hefty on-chip buffers to store execution traces of sufficiently
large program segments or on wide trace ports that can sustain a large amount of trace data in real-time. However, large
trace buffers and/or wide trace ports significantly increase the system’s complexity and cost. In addition, the size of the trace
buffers and the number of trace port pins cannot keep pace with the exponential growth in the number of on-chip logic gates
– which is a substantial problem in the era of multicore systems. These costs and scalability issues often make system-on-a-
chip designers reluctant to invest additional chip area for debugging and tracing.
Compressing program execution traces at runtime in hardware can reduce the requirements for on-chip trace buffers and
trace port communication bandwidth. Our research focuses on developing cost-effective compression algorithms and
hardware structures that can support unobtrusive program tracing in real-time. We have introduced several highly effective
mechanisms for capturing, filtering, and compressing instruction address, data address, and data value traces. Our
mechanisms exploit program characteristics such as instruction and data temporal and spatial locality, new hardware
structures (e.g., stream caches and stream predictors), and a sophisticated software debugger to dramatically reduce the
trace size required for faithful reconstruction of program executions.
We have introduced several mechanisms for compression of program execution traces, such as compression based on
stream descriptor cache and predictor, double move-to-front method, and compression based on branch predictors in the
trace module. Our best compression algorithm achieves over a 28-fold improvement over the commercial state of the art and
requires only 0.036 bits per instruction of trace-port bandwidth, at a cost of only 5,200 additional gates. Our
techniques for filtering data values achieve compression ratios between 5 and 20. These results confirm
that investing in debugging infrastructure, and specifically in trace compression algorithms, can enable unobtrusive
real-time program tracing, which in turn promises better software that runs faster, is more reliable, and takes less time
to develop and test. We continue to explore hardware support for debugging in multi-core embedded
systems.
Develop and debug software on Simics
Agent-Based Debugger
When using an agent-based debugger, Simics has to keep running all the time to
keep the debugger happy. If Simics stops, the debugger’s connection to the
debugging agent will eventually time out and drop the connection. The agent-based
debugger is also completely unaware of Simics’ capabilities like reverse execution
and breakpoints on hardware accesses. If such features were used, the result would
be a confused debugger and a disconnected debug session.
An agent-based debugger is intrusive, in that it changes the state of the target
system. Attaching a debugger to a Simics system that contains a target agent
will change how the target behaves compared to a run when the debugger is not
attached, thus giving rise to classic problems like Heisenbugs that disappear as
soon as a debugger or log printouts are added to a system to try to find them.
Agent-based debuggers can be used alongside Simics features like checkpointing.
Just make sure to disconnect the debugger before taking a checkpoint and
reconnect it after the checkpoint has been opened. It makes perfect sense to boot a
machine and save a checkpoint, and then open that checkpoint and attach the
agent-based debugger.
In computer programming and software development, engineers will deploy debugging tools and
processes to find and mitigate "bugs" or problems within programs, applications, and systems.
The word "debugging" was derived in the 1940s when a Mark II computer (Aiken Relay
Calculator) malfunctioned, and engineers subsequently found a moth stuck in a relay, impeding
normal operation.
All kinds of techniques and tools allow engineers to root out problems within a software
environment. As software and electronic systems have become more complex, the various
debugging techniques have broadened with more methods to detect anomalies, assess impact,
and provide software patches or complete system updates.
Debugging ranges in complexity from fixing simple errors to performing lengthy and extensive
tasks, including data collection, analysis, and scheduling updates. The difficulty of software
debugging varies depending on the complexity of the system and, to some extent, on
the programming language used and the available tools.
Software tools enable the programmer to monitor the execution of a program, stop it, restart it,
set breakpoints, and change values in memory, among others. Much of the debugging process is
done in real-time by monitoring the code flow during execution, more so during the development
process before application deployment.
What is Tracing?
One technique that monitors software in real-time debugging is known as "tracing," which
involves a specialized use of logging to record information about a program's execution.
Programmers typically use this information to diagnose common problems with software and
applications. Tracing is a cross-cutting concern, meaning it involves aspects of a program that
can affect other parts of the same system and, in turn, provides detailed information about the
program as it executes.
With debug and trace, programmers are able to monitor the application for errors and exceptions
without the need for an integrated development environment (IDE). In debug mode, a compiler
inserts debugging code inside the executable. Because the debugging code is part of the
executable, it runs on the same thread as the code and, as a result, reduces the code's
efficiency.
Trace works in both debug and logging mode, recording events as they occur in real-time. The
main advantage of using trace over debugging is that it enables performance analysis, which can't be
accomplished with debugging alone.
What's more, trace runs on a different thread. Thus, it doesn’t impact the main code thread.
When used in tandem, tracing and debugging can provide information on program execution and
root out errors in the code as they happen.
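As a hedged sketch of this idea, the fragment below implements tracing as lightweight logging that is
compiled in or out with a preprocessor flag. The names ENABLE_TRACE, TRACE, and trace.log are invented
for this example rather than taken from any particular tool, and the macro relies on the widely
supported GNU ##__VA_ARGS__ extension.

#include <stdio.h>
#include <time.h>

#ifdef ENABLE_TRACE
static FILE *trace_fp;

/* Append a timestamped line to the trace log. */
#define TRACE(fmt, ...)                                              \
    do {                                                             \
        if (trace_fp == NULL)                                        \
            trace_fp = fopen("trace.log", "a");                      \
        if (trace_fp != NULL) {                                      \
            fprintf(trace_fp, "[%ld] " fmt "\n",                     \
                    (long)time(NULL), ##__VA_ARGS__);                \
            fflush(trace_fp);                                        \
        }                                                            \
    } while (0)
#else
#define TRACE(fmt, ...)  do { } while (0)    /* compiled out entirely */
#endif

static int
add(int a, int b)
{
    TRACE("add(%d, %d) called", a, b);       /* recorded only when tracing is on */
    return a + b;
}

int
main(void)
{
    TRACE("program start");
    printf("%d\n", add(2, 3));
    TRACE("program end");
    return 0;
}

Building with -DENABLE_TRACE records timestamped events in trace.log; without the flag the TRACE
calls vanish from the executable, so the released program pays no cost.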
Trace Techniques
It should be noted that tracing and logging are two separate entities; they provide overviews of
software execution, with each functioning differently (Fig. 1). Logging tracks error reporting and
related data in a centralized way, showing discrete events within an application or system, such
as failures, errors, corruption, and states of transformation.
1. Trace applications allow engineers to identify the root cause and the processes that have caused applications to
function improperly via data collection using dynamic and static analysis. (Image credit: Pexels)
On the other hand, tracing follows a program's flow and data progression, providing more
information over a larger spectrum of the app stack. Tracing lets users see when and how the
error occurred, including which function is at fault, duration, parameters, and how deep the
function goes.
To that end, various techniques and applications can carry out that function. These techniques
depend on the ability to collect information about the system under study. Data-collection
techniques can be grouped into two categories: static analysis and dynamic analysis.
Static analysis uses the source code to uncover the system's components and relationships.
Performing static analysis has the benefit of covering all of the program's execution paths.
However, it’s only able to reveal the static aspects of the system, and it's limited in providing
insights into the behavioral characteristics of the program. This insight can be critical for
analyzing distributed applications such as service-based systems due to the high level of
interactions involved.
Dynamic analysis is the study of how the system behaves by analyzing its execution traces.
Unlike static analysis, dynamic analysis allows users to focus only on the parts of the program that need to be
analyzed, which is accomplished by analyzing the interactions of the active components.
Dynamic analysis also can be utilized for applications that require understanding the system's
behavior by relating the system inputs to its outputs.
There are two types of dynamic analysis: online and offline. Online analyzes the behavior of an
active system while it’s running. This type of dynamic analysis comes in handy when the system
under analysis will not terminate its task for long periods. Offline analysis, by contrast, is separated
in time from the collection of event traces: the traces are collected during the execution of the
system, while the analysis is usually performed after the execution completes.
This brings us to distributed tracing, which handles trace-based analysis in a cloud environment,
microservices, container-based deliveries, etc. Distributed tracing follows an interaction by
tagging it with a unique identifier and staying with the transaction as it interacts with those
applications mentioned above.
The unique identifier provides real-time visibility as the application runs through its process. It
offers insight into the flow of requests through that microservice environment and identifies
where failures or performance issues occur within the system.
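To make the idea concrete, here is a purely hypothetical sketch (none of the names come from the
products described below): a unique trace identifier is generated when a request enters the system
and is passed along to each simulated service hop, and every log line carries that identifier so the
request's path can be reconstructed afterwards.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* All names below (gateway, orders, billing) are invented for illustration;
   real systems propagate the identifier in message or RPC headers. */

static void
log_span(unsigned long trace_id, const char *service, const char *event)
{
    /* Every line carries the same trace ID, so the request's path
       through the services can be stitched back together later. */
    printf("trace=%lu service=%s event=%s\n", trace_id, service, event);
}

static void
billing_service(unsigned long trace_id)
{
    log_span(trace_id, "billing", "charge card");
}

static void
order_service(unsigned long trace_id)
{
    log_span(trace_id, "orders", "create order");
    billing_service(trace_id);              /* the ID travels with the request */
}

int
main(void)
{
    unsigned long trace_id;

    srand((unsigned)time(NULL));
    trace_id = ((unsigned long)rand() << 16) ^ (unsigned long)rand();

    log_span(trace_id, "gateway", "request received");
    order_service(trace_id);
    log_span(trace_id, "gateway", "response sent");
    return 0;
}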
Applications
A wide variety of trace apps can provide in-depth insight into every software platform
imaginable, with some being platform-specific for systems that utilize Android, Windows, and
Linux, among a host of others. Below are several widely used trace applications that incorporate
a variety of metrics and analytics for pinpointing bugs along the development chain.
Datadog APM
The Datadog APM is a cloud-based software performance monitor that packs many varieties of
source data, including distributed-tracing messages (Fig 2). The platform can collect and process
OpenTracing and OpenTelemetry messages, which are filed with other indicators to make insight
a breeze. It also aggregates statistics on microservice performance and application processing
environments with agents that can target specific telemetry insight.
2. Datadog's Distributed Tracing platform aggregates metrics and events across the entire application
stack. (Image credit: Datadog)
Beyond monitoring standard distributed-tracing messages, the Datadog APM can interface with a
range of AWS services, including the Lambda microservices platform. In addition, it
generates visual representations that show the connections between microservices operating live
in a hierarchy.
New Relic APM
The New Relic APM is targeted at developers and businesses that want to monitor their
microservices infrastructure. The platform centralizes the data collected from various sources,
including distributed-tracing messages garnered via OpenTelemetry, OpenTracing, OpenCensus,
and Zipkin. It also can monitor other data sources, including application log files from
infrastructure devices, along with a list of AWS services such as Lambda, Azure, Apache, and
operating-system status reports.
As with Datadog, New Relic deploys its own agents to provide additional insight into web and
app performance, driven by microservice actions, including browser monitors and connection
testers.
Dynatrace
Dynatrace is an AI-driven platform that utilizes the cloud, applying machine-learning and
heuristics to identify critical information from large amounts of data generated by reporting and
logging systems. The Dynatrace system takes advantage of the OpenTracing standard and
processes activities that contribute to a given application. It then tracks back by analyzing
distributed-tracing messages to identify all of the microservices that worked on a given session
for that application.
While the platform allows developers and system managers to write and test microservices
efficiently, it also monitors application performance and will produce alerts if a problem is
identified within that microservice.