Multi Process-Multi Threaded: Amir Averbuch Nezer J. Zaidenberg
select and pselect: Ch. 14.5
Processes and forking: Ch. 8
Threads: Ch. 11 + 12
Many times we are faced with a system that must handle multiple requests in parallel:
- Handling multiple inputs from multiple terminals (or sockets, sessions, etc.)
- Processing multiple requests by a server
- Handling several transactions without hanging if one transaction takes too long
- Doing things while waiting for something else (I/O, computation, etc.)
Busy waiting
Busy waiting (v): a process that keeps asking the kernel "Do I have something to do?" (Do I have I/O? Have I waited long enough?)
The solution was to do things in a single process, with an API that allows for concurrency. The select(2) API is the most common. Other APIs exist as well.
Select
The situation:
- Inputs come over several file descriptors (in the UNIX OS an open terminal, a communication socket, and actual file I/O are all done over file descriptors)
- Output may be written to several interfaces, and it may take time to write (less frequent)
- Waiting for exceptions on file descriptors (practically non-existent)
- Usually it takes very little time to process input or output
Select(2)
Select(2) API
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
- nfds: the first nfds file descriptors are checked in each set, so it should equal the highest fd in use + 1 (since fds start from zero)
- fd_set: actually a bit array. The OS provides macros to manipulate it (FD_ZERO, FD_SET, FD_CLR, FD_ISSET)
- timeout: the maximum time to wait. select returns 0 if it expires; a NULL timeout waits indefinitely
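Select example
#include <stdio.h>
#include <sys/select.h>

int main(void)
{
    struct timeval tv;
    fd_set readfds;

    tv.tv_sec = 10;          /* wait up to 10 seconds */
    tv.tv_usec = 0;
    FD_ZERO(&readfds);
    FD_SET(0, &readfds);     /* watch stdin (fd 0) */

    /* nfds is the highest watched fd + 1 */
    if (select(1, &readfds, NULL, NULL, &tv) == -1) {
        perror("select");
        return 1;
    }
    if (FD_ISSET(0, &readfds)) {
        int c = getc(stdin);
        printf("%c was pressed\n", c);
    } else
        printf("timeout\n");
    return 0;
}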
Select problems
- Only file descriptors can be handled. (In Windows ONLY sockets can be handled.)
- One cannot wait for a file descriptor together with a semaphore / computation / mutex / etc.
- Unfairness with large sets on some implementations
- select modifies its arguments (the timeval and the fd_sets), and no assumption can be made about how they are modified (e.g. how much time is left in the timeval)
- Many modern UNIX OSs support poll(2), a select replacement. On such systems select is usually implemented using poll (but poll is not available everywhere!)
When inputs come in different formats it may be possible to use other APIs. However, the same considerations apply.
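Because select modifies its arguments, a common habit is to rebuild the fd_set and the timeval before every call. The following is a minimal sketch of that habit, not code from the course; `wait_readable` and `demo_wait_readable` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: wait up to timeout_ms for fd to become readable.
// Returns 1 if readable, 0 on timeout, -1 on error (see errno).
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    // select() may overwrite both arguments, so build them fresh here
    // instead of reusing values left over from a previous call.
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    if (ready <= 0)
        return ready;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

// Hypothetical demo: check a pipe before and after one byte is written.
// Returns 1 when the first call times out and the second reports data.
int demo_wait_readable(void)
{
    int p[2];
    if (pipe(p) == -1)
        return 0;
    int before = wait_readable(p[0], 50);   // nothing written yet
    char byte = 'x';
    int after = -1;
    if (write(p[1], &byte, 1) == 1)
        after = wait_readable(p[0], 50);    // one byte is now pending
    close(p[0]);
    close(p[1]);
    return (before == 0 && after == 1) ? 1 : 0;
}
```

In a real event loop the same rebuilding step would sit at the top of every iteration.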
Avoid:
- Async I/O methods, unless you know what you are doing (use poll if you like, but take note of the portability issues)
- Busy waiting
- An extra thread/process to do what select can do just fine
Some implementations of UNIX also include the pselect(2) system call, which is similar to select(2). pselect(2) has almost the same parameters, with two differences:
- Wait times can be given in nanoseconds (struct timespec) instead of microseconds (struct timeval)
- A signal mask is given, listing the signals to block while waiting
Process (n): a running program, with its own memory protected by the OS from other processes.
Thread (n): a mini-program, or a program within a program; a separate running environment inside a process.
- A single process may contain numerous threads.
- Threads have memory protection from other processes (including the threads in those processes) but not from other threads in the same process.
Task (n)
Clarification
Cooperative multitasking: each task specifies when it agrees to be moved off the CPU for another task. The lwp library is an example. Pre-emption does not exist (e.g. Mac OS classic).
Pre-emptive multitasking: the kernel decides which process receives the CPU and when. The kernel moves tasks into the running scope.
Scheduler (n): the part of the OS kernel that is responsible for pre-empting tasks and putting new tasks into execution.
Multi-process programming
- Running more than one process.
- Task switching is managed by the OS in pre-emptive multitasking.
- Each process has its own memory space (heap, stack, global variables, process environment).
- Information and synchronization must be delivered from process to process using a multi-process communication API (such as UNIX domain sockets).
fork(2) creates a new process identical to the current one except for the return value of fork(2).
Other methods to invoke new processes under UNIX:
- system(3) (runs an executable)
- The execXXX function family (replaces the current process image, usually right after fork)
Answer
printf(3) works with buffers (which we can fflush(3) later).
- First, printf(3) just copied the text to the buffer
- fork(2) duplicated the process (including the buffer)
- Both buffers were flushed
This example doesn't work on every system (printf(3) buffering and flushing behavior is not standard and depends on the library and compiler version), but when it does work it's KEWL!
- Unmapped shared memory kludge
- Platform-specific APIs (Linux sendfile, Sun Doors, etc.)
In this course
- We will discuss network and UNIX domain sockets as a means to deliver information.
- We will discuss file locking via fcntl(2) as a means to implement semaphores.
- Other methods are described in APUE.
- A process will usually run unaffected by the processes it spawned.
- When a process terminates it returns a return code (the int from int main()) to its parent process.
- The parent process usually (unless we do something smart) ignores it.
- A parent process can wait for a specific child process (or for any child process) to terminate using the wait(2) and waitpid(2) API.
Zombie process
- A process that terminates, but whose parent has not received its termination status (usually this means something is wrong with the parent), remains in the system as a zombie process.
- Orphaned processes are adopted by init (process number 1), which always waits for its children to die.
exit(2)
A process can die and notify its parent about its exit status using the exit(2) system call. Calling this system call terminates the calling process.
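The parent/child exit-status handshake can be sketched as follows. This is an illustrative sketch only; `run_child` is a hypothetical helper name, and the child does nothing except exit with the requested code.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that exits with `code`, wait for it,
// and return the exit status the parent received (-1 on failure).
int run_child(int code)
{
    assert(code >= 0 && code <= 255);   // only the low 8 bits are reported

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                    // child: terminate with this status

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    // Once waitpid() returns, the child has been reaped and is no
    // longer a zombie.
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses `_exit` rather than `exit` so that stdio buffers inherited from the parent are not flushed a second time.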
Beej's Guide to Network Programming provides a helpful tutorial on how to communicate between two processes on a single host. This guide will be described at recitation. http://beej.us/guide/bgnet/output/htmlsingle/bgnet.html
Select - revisited
When a child process terminates, the parent process receives a signal (SIGCHLD). If a handler is installed for it, select(2) aborts, returning -1 with errno set to EINTR. If you code a multi-process application and use select, you should usually ignore this return status and call select again (or block SIGCHLD and use pselect(2)).
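A retry loop around select is one way to ignore EINTR. The sketch below assumes a do-nothing SIGCHLD handler and a child that writes one byte before exiting; `wait_readable_retry`, `on_sigchld` and `demo_child_and_select` are hypothetical names, and error paths are kept short for brevity.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: block until fd is readable, retrying whenever a
// signal (such as SIGCHLD) interrupts select(). Returns 1, or -1 on error.
int wait_readable_retry(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (;;) {
        fd_set readfds;                 // rebuilt on every attempt
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        int ready = select(fd + 1, &readfds, nullptr, nullptr, nullptr);
        if (ready == -1 && errno == EINTR)
            continue;                   // interrupted by a signal: retry
        if (ready == -1)
            return -1;                  // a real error
        return FD_ISSET(fd, &readfds) ? 1 : -1;
    }
}

static void on_sigchld(int) { /* empty: only there so select() can be interrupted */ }

// Hypothetical demo: the child writes one byte and exits while the
// parent waits in select() with a SIGCHLD handler installed.
int demo_child_and_select(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    struct sigaction sa = {}, old_sa = {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        char byte = 'x';
        ssize_t n = write(p[1], &byte, 1);
        _exit(n == 1 ? 0 : 1);
    }

    int result = wait_readable_retry(p[0]);

    while (waitpid(pid, nullptr, 0) == -1 && errno == EINTR)
        ;                               // reap the child, retrying on EINTR
    sigaction(SIGCHLD, &old_sa, nullptr);
    close(p[0]);
    close(p[1]);
    return result;
}
```

Whether the signal arrives before or after the byte, the caller sees the same result, which is the point of the retry.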
- Because processes are memory protected, it is relatively hard to synchronize and pass information between multiple processes.
- Using IPC APIs forces us to accept the constraints of the API.
- Process overhead, especially process creation overhead, is heavy.
- Context switching is expensive.
Multiple processes fit when:
- Several things should be handled simultaneously and select is not suitable.
- Processes are created infrequently (or preferably only once).
- The overall number of processes is relatively low.
- IPC is not needed frequently.
- You want process memory protection.
Multiple processes do not fit when:
- We can (reasonably) do the job in one process.
- Lots of information is transferred.
- High performance is needed and you don't know what you are doing (context switching is expensive).
- In almost any case where a thread would be just as good, much simpler, and won't hurt us.
- Processes are managed in separate memory spaces by the OS, which requires us to use IPC to transfer information between processes.
- Threads are mini-processes. They share the heap, the process environment and the global variable scope, but each thread has its own stack.
- Using threads, the entire heap is shared memory! (Actually, the entire process!)
Threads API
- The POSIX 95 threads API is now common on all UNIX OSs and should be used for all new applications whenever threads are needed on UNIX.
- Legacy applications may use a different threads API (usually predating POSIX 95), such as Solaris threads. Those APIs are usually almost identical to the POSIX API.
- Microsoft Windows has a similar API.
In this course
- We will cover the POSIX threads API.
- We will briefly discuss the Microsoft Windows threads API.
- We will give an example of a cross-platform thread class.
pthread_create(3)
- Creates a new thread
- Gets a function pointer to serve as the thread's main function
- Threads can be manipulated (waited for, prioritized) in a similar way to processes, but only from inside the process
Critical section
- Very often we reach a situation where two tasks need access to the same memory area.
- Allowing access to both tasks will very often result in corrupt reads or writes:
  - When both try to write
  - When one writes and one reads
  - No problem with two reads
- This can happen with processes and shared memory.
- This occurs very frequently with threads.
Memory corruption
When two tasks try to access the same memory space, we would like to guarantee that:
- After a read, either the new or the old state of the memory is returned (not a mishmash)
- After multiple writes, the state of exactly one of the writes resides completely in memory (not a mishmash of two writes)
Failing that, we have memory corruption.
IT. Even if you are 100% sure you know what you are doing!!!! (And if you do, consult someone, think again, and consult somebody else too!)
pthread_mutex
POSIX threads provide two synchronization devices:
- A mutex (mutual exclusion) is a device that serves to lock other threads out of a critical section while I (I am a thread) am using it.
- A cond (condition variable) is a sort of reverse mutex: a device that serves to block myself (I am a thread) from entering a critical section while another thread prepares it for use.
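A minimal sketch of a mutex guarding a critical section: two threads increment one shared counter. The names `counter_lock`, `add_many` and `count_with_two_threads` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

// Thread main function: increment the shared counter *arg times.
static void *add_many(void *arg)
{
    long n = *static_cast<long *>(arg);
    for (long i = 0; i < n; ++i) {
        pthread_mutex_lock(&counter_lock);
        ++counter;                      // the critical section
        pthread_mutex_unlock(&counter_lock);
    }
    return nullptr;
}

// Hypothetical demo: run two incrementing threads and return the total.
long count_with_two_threads(long per_thread)
{
    counter = 0;
    pthread_t a, b;
    int rc = pthread_create(&a, nullptr, add_many, &per_thread);
    assert(rc == 0);
    rc = pthread_create(&b, nullptr, add_many, &per_thread);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, nullptr);
    pthread_join(b, nullptr);
    return counter;
}
```

Without the lock and unlock calls the two read-modify-write sequences could interleave and lose increments.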
Deadlock: a state where two resources are required in order to do something. Each resource is protected by a mutex. Two tasks each lock one mutex and wait for the other mutex to become available. Both tasks hang and no work is done. It is up to the software engineer to avoid deadlocks.
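One common way to avoid this kind of deadlock is to make every task take the mutexes in the same global order. The sketch below orders the locks by address; this is one technique among several, not something prescribed by the course, and `Account`, `Transfer`, `transfer_many` and `total_after_opposite_transfers` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

struct Account {
    pthread_mutex_t lock;
    long balance;
};

struct Transfer {
    Account *from;
    Account *to;
    int rounds;
};

// Thread main function: move one unit from `from` to `to`, `rounds` times.
static void *transfer_many(void *arg)
{
    Transfer *t = static_cast<Transfer *>(arg);
    // Choose the lock order by address, not by role, so both threads
    // always lock the same mutex first.
    Account *first = t->from;
    Account *second = t->to;
    if (reinterpret_cast<std::uintptr_t>(first) >
        reinterpret_cast<std::uintptr_t>(second)) {
        first = t->to;
        second = t->from;
    }
    for (int i = 0; i < t->rounds; ++i) {
        pthread_mutex_lock(&first->lock);
        pthread_mutex_lock(&second->lock);
        t->from->balance -= 1;
        t->to->balance += 1;
        pthread_mutex_unlock(&second->lock);
        pthread_mutex_unlock(&first->lock);
    }
    return nullptr;
}

// Hypothetical demo: two threads transfer in opposite directions.
long total_after_opposite_transfers(int rounds)
{
    Account a, b;
    pthread_mutex_init(&a.lock, nullptr);
    pthread_mutex_init(&b.lock, nullptr);
    a.balance = 100;
    b.balance = 100;

    Transfer ab = { &a, &b, rounds };
    Transfer ba = { &b, &a, rounds };
    pthread_t t1, t2;
    int rc = pthread_create(&t1, nullptr, transfer_many, &ab);
    assert(rc == 0);
    rc = pthread_create(&t2, nullptr, transfer_many, &ba);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);

    long total = a.balance + b.balance;
    pthread_mutex_destroy(&a.lock);
    pthread_mutex_destroy(&b.lock);
    return total;
}
```

If each thread instead locked its own `from` account first, the two threads could each hold one mutex and wait forever for the other.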
Recursive mutex
- What happens if a thread locks a mutex and then, by some chain of events, re-locks it?
- When the thread unlocks the mutex (which was locked twice), is it unlocked, or must it be unlocked twice?
- Different implementations have different answers. Linux requires equal numbers of locks and unlocks, while the default Solaris behavior is to unlock all locks.
- The default behavior can be changed (on Linux or Solaris) by specifying whether the mutex is recursive. Recursive = the Linux interpretation.
- Note that under POSIX, re-locking a default (non-recursive) mutex is undefined and typically deadlocks; a recursive mutex is requested explicitly with PTHREAD_MUTEX_RECURSIVE.
- Using recursive mutexes is usually a deprecated way to write code (since programmers reading the code tend to think the mutex is unlocked while in practice it is still locked). But programmers do it anyway.
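Requesting a recursive mutex explicitly removes the guesswork about defaults. A minimal sketch, with `lock_twice_recursive` as a hypothetical name:

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical demo: lock a recursive mutex twice from one thread and
// return how many lock calls succeeded.
int lock_twice_recursive(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    assert(rc == 0);

    pthread_mutex_t m;
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);

    int locks = 0;
    if (pthread_mutex_lock(&m) == 0)
        ++locks;
    if (pthread_mutex_lock(&m) == 0)    // re-lock by the owner succeeds
        ++locks;

    // A recursive mutex needs one unlock per successful lock.
    for (int i = 0; i < locks; ++i)
        pthread_mutex_unlock(&m);

    pthread_mutex_destroy(&m);
    return locks;
}
```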
pthread_cond
- Conceptually, a cond is a reverse mutex: unlike a mutex, which is available on first use and blocks others until released, a cond blocks the thread that first waits on it and is released when a second thread signals it.
- A cond is typically used in a producer-consumer environment when the consumer is ready to consume before the producer is ready to produce.
- The consumer waits on the cond. The producer signals it when something is available.
- In the pthreads API a cond is always used together with a mutex and a shared condition that the waiter re-checks.
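In pthreads terms the consumer's "lock" is pthread_cond_wait and the producer's "unlock" is pthread_cond_signal, both used under a mutex. The following is a sketch under that reading; `Mailbox`, `producer` and `consume_one` are hypothetical names, and the value 42 is arbitrary.

```cpp
#include <cassert>
#include <pthread.h>

struct Mailbox {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    bool has_item;
    int item;
};

// Producer thread: publish one item and wake the consumer.
static void *producer(void *arg)
{
    Mailbox *box = static_cast<Mailbox *>(arg);
    pthread_mutex_lock(&box->lock);
    box->item = 42;
    box->has_item = true;
    pthread_cond_signal(&box->ready);
    pthread_mutex_unlock(&box->lock);
    return nullptr;
}

// Hypothetical demo: start a producer, wait for its item, return it.
int consume_one(void)
{
    Mailbox box;
    pthread_mutex_init(&box.lock, nullptr);
    pthread_cond_init(&box.ready, nullptr);
    box.has_item = false;
    box.item = 0;

    pthread_t t;
    int rc = pthread_create(&t, nullptr, producer, &box);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&box.lock);
    // Re-check the condition in a loop: the wait can wake spuriously,
    // and the producer may have finished before we got here.
    while (!box.has_item)
        pthread_cond_wait(&box.ready, &box.lock);
    int got = box.item;
    pthread_mutex_unlock(&box.lock);

    pthread_join(t, nullptr);
    pthread_cond_destroy(&box.ready);
    pthread_mutex_destroy(&box.lock);
    return got;
}
```

The `has_item` flag is what makes the order of arrival irrelevant: a signal sent before the consumer waits is not lost, because the consumer never waits once the flag is set.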
Pthread create
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
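As an illustration of the four arguments, the sketch below starts one thread with default attributes and waits for it with pthread_join. `square` and `square_in_thread` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <pthread.h>

// Thread main function: square the int that arg points to.
static void *square(void *arg)
{
    int *value = static_cast<int *>(arg);
    *value = (*value) * (*value);
    return arg;                         // handed back through pthread_join
}

// Hypothetical demo: square n in a separate thread and return the result,
// or -1 if the thread could not be created or joined.
int square_in_thread(int n)
{
    pthread_t tid;                      // 1st argument: receives the thread id
    // 2nd argument nullptr = default attributes; 3rd = thread function;
    // 4th = the single argument passed to that function.
    int rc = pthread_create(&tid, nullptr, square, &n);
    if (rc != 0)
        return -1;

    void *result = nullptr;
    rc = pthread_join(tid, &result);    // wait for the thread to finish
    if (rc != 0)
        return -1;
    assert(result == &n);
    return n;
}
```

Passing the address of a local variable is safe here only because the creator joins the thread before the variable goes out of scope.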
Compared with the Windows thread creation call (CreateThread):
- The 2nd and 5th Windows arguments are contained in the UNIX 2nd argument (the thread attributes)
- The 3rd and 4th Windows arguments correspond to the 3rd and 4th UNIX arguments (the thread function and its argument)
- The 6th Windows argument corresponds to the first UNIX argument (the thread id)
Different OSs have different APIs, but the same principles rule everywhere (including embedded OSs, realtime OSs, mainframes, cellphone OSs, etc.)
Threads benefits
- Threads provide an easy method for multiprogramming (because we have an easier time passing information)
- Threads are lighter to create and delete than processes
- Threads have easy access to other threads' variables (and thus don't need an IPC protocol)
- Context switching is usually cheaper than between processes
- Threads are cool and sexy
- No need to do IPC means all problems with locking and unlocking are up to the programmer
- All seasoned programmers have several horror stories of chasing bugs past midnight in a dreaded threaded environment!
- Context switching makes it more efficient to use a single thread than multiple threads
- Because threads are cool and sexy, thread use is often overdone. De-threading is a common task in many mature applications.
- Each thread requires its own stack in order to enter functions, define automatic variables, etc.
- So the OS gives a new stack to each thread
- But the OS has no memory protection between threads, period
- That means that if we create a pointer that points into a thread's local stack scope, other threads can change it with no locking
As always, when people do things that will only confuse other programmers, this is deprecated.
- strtok(3) is one of the worst atrocities devised by mankind. It is also non-reentrant.
- This function keeps a static char * so that subsequent calls can be made with NULL as the first argument.
- Consider what happens to this function when multiple threads use it simultaneously.
- The first thread calls strtok and gives a char pointer, which is saved in strtok's static char *
- The second thread calls strtok and overwrites the first char pointer
- The first thread calls strtok with NULL
- Poetic justice? Just what the caller deserves?
- The global char * is a CRITICAL SECTION in the sense that it may not be used by two different threads at once
- So the second call to strtok would ruin it for the first call
- A different function was offered that doesn't use a global buffer: strtok_r(). This function takes an external save pointer supplied by the caller.
- Similarly, ctime() now has ctime_r(), localtime() has localtime_r(), etc.
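Because each strtok_r call carries its own save pointer, two tokenizations can be in progress at once. The sketch below shows this with a nested split inside one thread, which would already break with strtok; `count_fields` is a hypothetical helper, and the ';' and ',' separators are arbitrary choices.

```cpp
#include <cassert>
#include <cstdio>
#include <string.h>   // strtok_r is declared here on POSIX systems

// Hypothetical helper: count comma-separated fields across all
// semicolon-separated records in text (inputs longer than 255
// characters are truncated).
int count_fields(const char *text)
{
    assert(text != nullptr);

    char copy[256];                     // strtok_r modifies its input
    std::snprintf(copy, sizeof copy, "%s", text);

    int fields = 0;
    char *outer_save = nullptr;         // state of the record split
    for (char *rec = strtok_r(copy, ";", &outer_save);
         rec != nullptr;
         rec = strtok_r(nullptr, ";", &outer_save)) {
        char *inner_save = nullptr;     // independent state of the field split
        for (char *f = strtok_r(rec, ",", &inner_save);
             f != nullptr;
             f = strtok_r(nullptr, ",", &inner_save))
            ++fields;
    }
    return fields;
}
```

The same property is what makes the function safe across threads: each thread keeps its save pointer on its own stack.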
Multi-threaded code requires several compile-time considerations:
- Usually a compile/link switch (-lpthread or -pthread on UNIX platforms, or /MT (/MTd) on Microsoft Windows)
- Linking multi-threaded and non-multi-threaded code may result in link or runtime errors on different platforms
Threads fit when:
- We cannot do things in a single thread efficiently
- Multi processing is required
- Lots of data is shared between threads
- We don't need OS memory protection
- We think the new thread is absolutely necessary
Avoid creating many "this little thread only does this" threads:
- It becomes impossible to design reasonable locking and unlocking state machines
- Once the number of threads goes up, too many thread-to-thread interfaces with locking and unlocking are guaranteed to cause bugs
Instead, only create threads when things must be done in parallel and no other thread can reasonably do the task.
Summary
Single process:
- Should provide the best overall performance
- Easiest to program
- May be hard to design, specifically if it needs to handle inputs from multiple source types
- May be prone to hanging on a specific request
- Should be preferred whenever the complexity arising from multiplicity is not severe
Multi process:
- Uses the OS to create processes, swap processes and context switch, thus adding load
- IPC makes it hard to program
- Usually easy to design if the process tasks are easily separated
- Should be preferred when IPC is minimal and we wish to have better control over memory access in each process
Multi thread:
- Uses the OS to create threads and context switch, adding load, though not as much as processes because threads are lighter
- Easy to program and to pass information between threads, but also dangerous
- Usually hard to design so as to avoid deadlocks, bottlenecks, etc.
- Should be preferred when lots of IPC is needed
- Dangerous: novice programmers reading and writing unprotected memory segments
Producer - Consumer
(Diagram: produce, queue)
- Producer-consumer is typically used with handler threads.
- Some thread does some work and puts the result where the other thread(s) can consume it.
- Sometimes a series of producer-consumer stages defines a single transaction.
- Real world examples: handling requests in a web server, a db server, or many other servers that get requests over a single pipe and have several handling threads.
Guard
- A Guard or Scope Mutex is a class that wraps a mutex implementation.
- The class destructor releases the mutex.
- By using the C++ destructor mechanism we ensure that the mutex is released when we leave the critical section.
- Calls from multiple threads may enter a non-reentrant scope from many places.
- If we use locking and forget to release the mutex we may suffer from deadlocks (sometimes releasing the mutex is not as trivial as it sounds, because legacy code tends to have many surprises in store, such as break, continue, goto and other goodies).
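A minimal guard over a pthread mutex might look like the sketch below. It is not the course's class; `ScopedLock`, `table_lock`, `store` and `lock_is_free` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical guard: locks in the constructor, unlocks in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t &m) : mutex_(m)
    {
        int rc = pthread_mutex_lock(&mutex_);
        assert(rc == 0);
        (void)rc;
    }
    ~ScopedLock() { pthread_mutex_unlock(&mutex_); }

    // Copying a guard would unlock the same mutex twice.
    ScopedLock(const ScopedLock &) = delete;
    ScopedLock &operator=(const ScopedLock &) = delete;

private:
    pthread_mutex_t &mutex_;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table[4] = { 0, 0, 0, 0 };

// A function with an early return inside the guarded scope.
bool store(int slot, int value)
{
    ScopedLock guard(table_lock);
    if (slot < 0 || slot >= 4)
        return false;                   // destructor unlocks here
    table[slot] = value;
    return true;                        // and here
}

// Helper for checking that no path out of store() left the mutex locked.
bool lock_is_free(void)
{
    if (pthread_mutex_trylock(&table_lock) != 0)
        return false;
    pthread_mutex_unlock(&table_lock);
    return true;
}
```

Every exit path from `store`, including the early return, releases the mutex without an explicit unlock call.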
- Using select is very easy and is very often required.
- We cannot wait on a socket and a cond together using select.
- Instead we will use a socket buffer.
- We will read 1 byte from the socket buffer (using select(2), of course) when we wish to wait for the cond.
- We will write 1 byte when we wish to release the cond.
Signal
Signal header
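class Csignal
{
private:
    int fd[2];     // connected socket pair: send() writes fd[0], wait() reads fd[1]
    char buf;      // the single byte that is sent as the signal
    void InitSignal();
public:
    Csignal();
    virtual ~Csignal();
    // A copy gets its own socket pair instead of sharing descriptors
    Csignal(const Csignal& other) : buf(other.buf) { InitSignal(); }
    int send();
    int signal() { return send(); }
    int wait();
    int GetWaitFD();   // descriptor to pass to select(2)
};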
Signal implementation
Csignal::Csignal()
{
    InitSignal();
    buf = 42;
}

void Csignal::InitSignal()
{
    // THROW_SOCKETERROR is a macro defined elsewhere (not shown in these slides)
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1)
        THROW_SOCKETERROR;
}

Csignal::~Csignal()
{
    close(fd[0]);
    close(fd[1]);
}
Signal example
int Csignal::send()
{
    // Write one byte into the socket pair to release a waiter
    if (::send(fd[0], &buf, sizeof(char), 0) != sizeof(char))
        THROW_ERRNO;
    return 1;
}

int Csignal::wait()
{
    // Block until one byte arrives on the other end
    char res;
    if (recv(fd[1], &res, sizeof(char), 0) != 1)
        THROW_ERRNO;
    return 1;
}

int Csignal::GetWaitFD()
{
    return fd[1];
}
Libraries exist on the net to manage OS services and provide infrastructure design patterns for an efficient multi-platform environment. Examples include:
- NSPR (Netscape Portable Runtime)
- ICE
- ACE (which I prefer)
- This class will create a thread using Windows threads, Solaris threads or POSIX threads.
- The class has an Execute method to be inherited and overridden by derived classes (the derived class is a thread).
- I will only discuss the thread creation.
- A real world implementation should also include:
  - Methods to wait for termination, suspend, and prioritize
  - Attributes to get the status (started, stopped, terminated) and the return code
  - Queues for worker threads
  - Etc.
ctf_thread_create
inline int ctf_thread_create(pthread_t *thread, void *(*start_routine)(void *), void *arg)
{
#ifdef Solaris_threads
    // Solaris threads: default stack (NULL base, size 0) and no flags
    return thr_create(NULL, (size_t)0, start_routine, arg, 0, thread);
#else
    // POSIX threads with default attributes
    return pthread_create(thread, NULL, start_routine, arg);
#endif
}