Linux Lecture
2010
Distribution Number
1. uname -a
2. cat /etc/issue
3. Kernel versions have the form 2.x.y
4. 2 is the major number
5. x is the minor number
6. y is the patch level
7. If x is even, it is a stable release
8. If x is odd, it is a development (beta) release
9. The even/odd minor-number scheme is almost over, as 2.7 has not been used
Linus Torvalds
Architecture Library
1. Kernel
User mode | system call
2. User mode: glibc provides library routines such as printf
3. System call
4. Process space | Kernel
5. Kernel mode is completely privileged
6. In a microkernel, components such as device drivers communicate by message passing
7. In the Linux kernel it is "self-help": the process itself switches context and executes in kernel mode.
8. printf is a library call; it ends in a system call (write), which runs in kernel mode.
9. The API set is a mix of library calls and system calls
Homework:
1. Download a kernel and apply the patch and view the changes
2. Highly useful
PROCESS EXECUTION
1. VM (virtual memory) is a technique through which every process is made to believe that the
entire address space is available to it.
2. Virtual memory fools each process into believing it has the memory to itself
3. It is implemented by the VM layer of the kernel together with the MMU of the CPU
4. Modern PC environment
5. A process image is always used
6. A process is imagined to have memory running from a low address to a high address
7. Runtime image
8. Uninitialized data segment (bss)
9. Homogeneous segments, e.g. text, data and stack.
10.Use ps aux to look at the virtual and physical memory used by a process
11.Process P can exec another program
12.When a process execs another program, the same process image is overlaid by the new
program. So the new program keeps the process ID of its predecessor
13.E.g. try exec in a shell: the shell closes because it is overlaid by the new program. exec bash works.
14.errno lives in the uninitialized data segment. On an error the kernel reports the reason
by updating the error number
15.perror() is a useful function for printing that reason. Then do the clean-up
16.The exec family of routines lets you make the call in whichever form is convenient
[Figure: process image from address 0 (low) to High: text, data, stack; mapped onto physical memory]
EXEC FAMILY
1. argv[0], argv[1], ...
2. execl
a. Pathname, then the argument list
b. execl("/bin/ls", "ls", "-l", "abc", (char *)0);
3. execlp
4. execv
5. execvp: doesn't require the full pathname; the program is searched for in PATH.
6. execle
a. Takes a copy of the environment as an extra argument
b. With the previous 4, the successor inherits the environment of its predecessor.
c. envp is a char **; it lets the caller change the environment variables the new program sees
7. execve is the main call, a very busy system call
a. Everything that runs goes through execve
b. strace on any program shows the execve system call being used
8. Prove that an exec call keeps the pid of the calling process
a. Write 2 C programs
b. Or exec ps from the program, as ps prints its own pid
9. Change an environment variable: setenv() and printenv()
a. Make a copy of the environment
b. strncmp: a bounded version of strcmp that takes the maximum number of characters to
compare
c. printenv
d. ./envp | grep '^TERM'
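10.STACK_FRAME_BUFFER (layout of a stack frame, from the top of the stack down)
a. [ ...
b. LOCALS: top of the stack (ESP); lower addresses
c. ...
d. SFP: saved frame pointer (EBP)
e. RET ADDR: return address
f. ...
g. PARAMS: bottom; higher addresses
h. ... ]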
i. Give a large input file to a program that reads with gets().
j. The process will crash. gets() is a badly designed library API: it has no limit on the
number of bytes it reads
k. Buffer overflow exploit: used by attackers
l. They write code that calls execl("/bin/sh", …….)
m. They compile it and take out the machine code, then place the hex bytes in a
buffer called the "crafted buffer"
n. They make the return address point back to the top of the buffer, so when the function returns it
ends up exec-ing a shell
o. Crafted attacks have been going on for years.
p. Most frequent form of attack on Unix
q. Use fgets(), where the maximum number of bytes can be fixed; use strncmp() with a
maximum length; use snprintf(). Practice this, as it lets you close these security
loopholes
r. SECURE PROGRAMMING FOR LINUX/UNIX -- free book
s. Red Hat has a stack guard feature that keeps byte codes in the data portion from being executed
FORK :
1. fork() makes a copy of the process; the image is not overlaid
2. The parent remains alive, and the new child runs in parallel with the parent
3. A multitasking OS is built using fork().
4. The instruction executed next by both parent and child is the one after fork()
5. The child starts from the same position as the parent
6. So, for a division of labour, the code must find out which of the two it is running in
7. ASS|U|ME – ass out of u and me
8. fork() is called once and returns twice, once in the parent and once in the child
9. The kernel guarantees different return values, allowing the developer to know which is the
parent and which is the child
10.Use a switch on the return value n: switch (n) { case -1: /* error, do the clean-up */ case 0: /* child */ default: /* parent */ }
[Figure: parent p and child c after fork()]
12.Data is copied across the fork
13.The order of execution is indeterminate
14.Synchronization can be used
15.int fd = open("abc", O_RDONLY);
16.User space and kernel space
17.Open file descriptor table (OFDT): simplistically, an array of slots holding pointers to the open file data
structure [file table entry, fte]
18.fd = 0, 1, 2 are reserved for standard input, output and error
19.The child process is given a copy of the parent's OFDT. Open files are
truly shared between the forked processes: the file is already open in the child. This is a
fallout of the Unix implementation.
20.The child process has access to the opened file, as the child inherits all the permissions
21.Simultaneous I/O can corrupt the file. The seek pointer is the same for both parent and
child. There is no guarantee of atomicity.
22.close(fd) in one of the processes: just that process's link goes down.
23.If we need the file in both processes, do file locking. Crude pseudo file locking can be done using a
semaphore.
24.File locking APIs are available in Unix.
25.Advisory locking is preferred over mandatory locking.
26.Advisory: we advise others that we are using the file; it is up to them to handle the file carefully
27.getpid(), getppid(); the return value of fork() gives the parent the child's pid
28.Many attributes are automatically inherited by the child, e.g. pwd, root directory,
umask, credentials (RUID, ...; same privilege level as the parent), open files, environment
variables.
29.pid, ppid, utime, file locks, timers and alarms are not inherited
30.All the time counters are reset for the children.
FORK OPTIMIZATION :
1. The job of fork(): make the child a copy of the parent. fork() is a heavyweight operation
2. The text segment of a process is "r-x": you cannot write to it.
3. So fork() does not copy the text section but shares it between the parent and all its children
4. Data is writable, but even then it is not copied up front; it is handled using a technique called
COW (copy on write)
5. On a write, COW copies only the page containing the modified variable, and that is why
fork() is very fast
6. COW is a very efficient and frugal technique, heavily used in Linux
7. An improvement by BSD system engineers, based on how the shell works: fork() and then exec(). When
exec() follows fork() immediately, the effort spent by fork() is wasted.
8. So, vfork() (virtual fork): the child still runs in the same memory as the parent. When the child
calls execve it gets a new image. Check the man page.
9. vfork() has use cases: it is used in uClinux, where fork() is not available
10.vfork() is not portable and is not in the current POSIX standard. The programmer needs to take
care when using it
11.Implement a shell, mysh.
12.If a parent dies, the kernel changes the ppid of the child process to 1 [init process].
An orphaned process is reparented to the init process.
13.wait() maintains synchronization between parent and child
14.In the Unix design the parent must wait for the child.
15.If the parent doesn't do that and the child dies, the child is called a zombie process: the kernel still keeps
an entry for the process, although the child has really died
16.Zombie: "a process which has died but not yet been buried"
17.Zombies take kernel memory and also take up pids, which can get a server into trouble
18.You cannot kill a zombie. So the Unix design has to be followed
19.In classic Unix, zombies go away only on reboot
20.In Linux, if the parent dies its zombies are reparented to init, which reaps them. IBM AIX also uses a similar technique.
21.Write a program that creates an orphaned process and one that creates a zombie process.
22.ps -l to check for zombies (state Z) and orphans (ppid=1).
23.Zombies are also called defunct processes
FORK()
1. printf("hello"); fork();
2. "hello" is stored in a buffer. stdio output is buffered by default.
3. The child of fork() gets a copy of the parent's memory, so it gets a copy of the buffer too.
4. The buffers are flushed in both parent and child, so "hello" appears twice.
5. The buffer size is around 4096 bytes, the size of a page.
6. The buffer is flushed when it fills up or when the program exits
7. SYN – to synchronize the writes and reads
8. Fully buffered: 4096 bytes
9. Line-buffered I/O: 128 bytes
10.Character-buffered I/O: 8 bytes
11.Raw I/O is unbuffered I/O: read or write. 8 bytes
12.setvbuf() to change the buffering mode and buffer size.
13.The return value of read/write is the number of bytes read or written
14.BOF / EOF – beginning/end of file
15.lseek() to set the file pointer; used for the rewind operation
16.fread/fwrite/fseek – for a higher-level abstraction
IPC
1. pipe() is the system call to create pipes.
2. A pipe is a page of RAM.
3. Visualize it as a pipe: an abstraction
4. Treated as 2 open files.
5. Access to either end is via a file descriptor
6. A pipe object has a read end and a write end.
7. A process can write data to the write end and read it from the read end
8. int fd[2]; pipe(fd);
9. I/O with pipes: use read/write
10.Unidirectional: data flows from the write end to the read end only.
11.fd[0] – read end; fd[1] – write end of the pipe
12.read(fd[0], buf, n);
13.I/O is blocking in nature. If you read from an empty pipe, the process waits.
14.The POSIX default is blocking, which helps in auto-synchronization
15.write(fd[1], buf, n);
16.Macro PIPE_BUF: the largest write that is guaranteed to be atomic. If the pipe is full, the write call will block.
17.File reads are non-destructive, but a pipe read is destructive.
18.EOF (read returns 0) is returned when the pipe is empty and every write end has been closed
19.Signal SIGPIPE is sent when a write is done on a pipe whose read end has been closed.
20.Communication between 2 related processes.
21.The OFDT is copied to the forked process. The 2 descriptors of the pipe are created in the
OFDT.
22.The read end and write end can be accessed by both parent and child
23."Ctrl+c" kills both the parent and the child: the signal is broadcast to all the processes in the
foreground
24.Check return values from read/write and use a signal handler for pipes.
25.fdptoc and fdctop: 2 pipes for two-way talk
26.The system() function runs the command in a child shell
27.popen() gives the power to read the standard output of any command, or to write to the
standard input of any command.
28.popen() internally creates a pipe and helps us read from or write to the process. It returns a file
pointer.
29.pclose() to finish off.
30.fp = popen("date", "r");
31.fgets(p, 80, fp);
32.popen() gives an abstraction layer above the system to run all the ...
33.File descriptor to file pointer: fdopen(); file pointer to file descriptor: fileno()
34.popen() is heavyweight
Named Pipes
1. man mkfifo – user-level command
2. mkfifo --help
3. cat > afifo
System V IPC
1. Pass a key value as part of the creating API; the return value is the handle to the IPC object [id]
2. Another process P2 specifies the same key value to reach the same object.
3. IPC objects are persistent in memory.
4. Once you create an IPC object, it can be shared
5. Clean-up needs to be done at the end, or the IPC object will remain.
6. The API changes with the kind of IPC you are working with.
7. Message queue [similar to a linked list]
8. Semaphore [used as a mutex]
9. Richard Stevens – network programming
10.Beej's Guide to Unix IPC
11.xxxget() to get a handle to the object (xxx = shm, sem, msg), for creation and access
12.Architecture is typically client-server.
13.1st parameter is the key value, key_t key.
14.IPC_PRIVATE means create a new object
15.ftok() to make a new unique key.
16.xxxctl()
17.ipcs -l to check the System V IPC configuration
Message queue
SEMAPHORE
1. Semaphore problem: there is an interval between creation and initialization.
2. semctl() is the API for all kinds of control operations
MULTITHREADING
1. Libraries get memory-mapped into the virtual memory segment.
2. Process VM [virtual memory] has text, data, ... [libraries], stack.
3. A process has an OFDT, IPC segments, environment variables, signal handling and the max
limits.
4. All threads live in a process address space.
5. The threads share everything except the stack. The stack implements function calls; that is
why each thread has a different stack.
6. We need a separate stack space.
7. Sun's LWP library.
8. POSIX.1c = Pthreads: it sets the standard and does not implement it; vendors implement.
9. Thread scheduling can be 2-tier: the thread library selects a winner thread and then
the kernel scheduler selects the winner thread. Unix does so. This is called m-to-n
mapping. Solaris 9-10 changed from 2-tier to single-tier
10.Linux does single-tier: one-to-one mapping.
11.You can map a thread to the CPU you want. The technique is available but less used.
12.Pthread mutex for mutual exclusion on the global data.
13.Local data is on the stack, so it is not shared.
14.The same OFDT is shared. So file coupling is tighter.
15.No COW for data in threads. It's the same data.
16.SUSv1, SUSv2
17.Benchmark using the code: time ./pthread and time ./fork
18.Non-overlapping between the CPUs
19.Master-slave model; the priority of the threads can easily be set.
20.Pthreads are not in the libc library; link with -lpthread.
21.If you call exit(), the entire process will die. Call pthread_exit() to end a specific thread.
22.It returns to the thread that is waiting [pthread_join caller] on this thread.
23.If main() calls exit(), all the threads are killed.
24.But if main() calls pthread_exit(NULL), only main dies and the others remain alive.
25.Verify the shared library list using the command ldd ./pthread_test. The loader is /lib/ld-linux.so. …
The routines are brought into the virtual space of the process. E.g. ldd /bin/ls.
ldd is an important utility
26.which firefox
27.file /usr/bin/firefox
28.Find the binary and run ldd on it
29.The getpid() equivalent is pthread_self()
30.The == operator cannot be used between structs, so use pthread_equal(thread1, thread2).
31.Joining of threads: the closest equivalent to the parent's wait() for the child.
32.You can't make a detached thread joinable after it is created, but the reverse is possible
using pthread_detach().
33.If a thread is joinable but nobody calls pthread_join() on it, the metadata
of the thread is not freed when the thread ends. This is the closest situation to a
zombie.
34.Call pthread_attr_destroy() to prevent a memory leak, as pthread_attr_init() may call malloc.
It is very important to read the man pages to understand what is expected.
35.TSD – thread-specific data.
36.A single global errno is not thread-safe. The -D_REENTRANT flag gives each thread a private errno.
37.Pass by value is better than pass by reference for thread-unsafe data. For thread-safe
data you can pass the address.
38.Pthread mutexes:
39.pthread_mutex_trylock(): similar to the IPC_NOWAIT version in System V; useful for busy
waiting, but with reduced chances of getting the lock.
40.After the unlock, a small delay helps to make the distribution fair and reduces
starvation.
41.Rule of thumb: "lock the data, not the code"
42.Locking granularity is important. Follow the middle path.
43.Priority inversion:
44.Condition variables: different from a mutex; a technique superior to polling
45.A CV is always paired with a mutex.
46.The mutex is passed along with the CV.
47.A signal is sent when the variable reaches a particular required value.
48.pthread_cond_wait(): atomically unlocks the mutex and puts the thread
to sleep, and wakes the thread up only when another thread signals.
49."What really happened on Mars"
50.PRIORITY_INHERITANCE: the thread holding the mutex is temporarily raised to the maximum priority among the waiting threads.
SOCKET PROGRAMMING
1. Different types of socket. A socket file descriptor is generated at the time of socket creation.
2. htons, htonl, ntohs, ntohl [l = 32 bits, s = 16 bits]: an IPv4 address is 32 bits, a port is 16 bits
3. listen() has a backlog number: the length of the queue of connections waiting to be accepted.
Process Scheduling on Linux
PROC
1. /proc lives in RAM, not on the hard disk
2. Always in RAM; set up at start-up time
3. /proc/sys – files used to tune the system.
4. Turn features on or off
5. The PID of each process appears as a folder
6. READ ABOUT PROC ON LINUX
7. ls -l fd (inside /proc/<pid>) – gives the open file descriptors: 0, 1, 2 ... stdin, stdout, stderr
8. /proc/cpuinfo – CPU info
9. /proc/meminfo – memory info
10.Very powerful
11.free -m uses /proc/meminfo internally
12.cat /proc/partitions
13.cat /proc/version
14.cat /proc/sys/kernel/threads-max
15.You can go down the thread list using /proc/<pid>/task and perform similar operations.
16.A shell script that uses /proc should first check whether /proc is
available on that Linux version
17.Read module 13