Net Work
1
Organizing Network Functionality
• Network architecture
– How should different pieces be organized?
– How should different pieces interact?
2
Problem
3
Solution: Indirection
• Solution: introduce an intermediate layer that provides a single
abstraction for various network technologies
– O(1) work to add app/media
– Indirection is an often used technique in computer science
Intermediate
layer
4
Network Architecture
5
Software Modularity
6
Network Modularity
7
Outline
• Layering
– how to break network functionality into modules
8
Layering
9
ISO OSI Reference Model
10
ISO OSI Reference Model
• Seven layers
– Lower two layers are peer-to-peer
– Network layer involves multiple switches
– Next four layers are end-to-end
Application Application
Presentation Presentation
Session Session
Transport Transport
Network Network Network
Datalink Datalink Datalink
Physical Physical Physical
Physical medium A Physical medium B
11
Layering Solves Problem
12
Key Concepts
• Service – says what a layer does
– Ethernet: unreliable subnet unicast/multicast/broadcast
datagram service
– IP: unreliable end-to-end unicast datagram service
– TCP: reliable end-to-end bi-directional byte stream service
– Guaranteed bandwidth/latency unicast service
• Service Interface – says how to access the service
– E.g. UNIX socket interface
• Protocol – says how the service is implemented
– a set of rules and formats that govern the communication
between two peers
13
Physical Layer (1)
14
Datalink Layer (2)
• Service:
– framing (attach frame separators)
– send data frames between peers
– others:
• arbitrate the access to common physical media
• per-hop reliable transmission
• per-hop flow control
15
Network Layer (3)
• Service:
– deliver a packet to specified network destination
– perform segmentation/reassembly
– others:
• packet scheduling
• buffer management
16
Transport Layer (4)
• Service:
– Multiplexing/demultiplexing
– optional: error-free and flow-controlled delivery
17
Session Layer (5)
• Service:
– full-duplex
– access management (e.g., token control)
– synchronization (e.g., provide check points for long transfers)
18
Presentation Layer (6)
19
Application Layer (7)
20
Who Does What?
Host A Host B
Application Application
Presentation Presentation
Session Session
Router
Transport Transport
Network Network Network
Datalink Datalink Datalink
Physical Physical Physical
Physical medium
21
Logical Communication
Host A Host B
Application Application
Presentation Presentation
Session Session
Router
Transport Transport
Network Network Network
Datalink Datalink Datalink
Physical Physical Physical
Physical medium
22
Physical Communication
Host A Host B
Application Application
Presentation Presentation
Session Session
Router
Transport Transport
Network Network Network
Datalink Datalink Datalink
Physical Physical Physical
Physical medium
23
Encapsulation
• A layer can use only the service provided by the layer
immediately below it
• Each layer may change and add a header to data packet
[Figure: the data unit at each of the seven layers on the sending and receiving hosts]
24
Example: Postal System
25
Postal Service as Layered System
Layers:
• Letter writing/reading Customer Customer
• Delivery
Information Hiding:
• Network need not know letter contents Post Office Post Office
Encapsulation:
• Envelope
26
Internet Protocol Architecture
27
Functions of the Layers
(Data) Link Layer (e.g., Ethernet, WiFi, T1)
– Service: Reliable transfer of frames over a link.
– Functions: Synchronization, error control, flow control, etc.
28
Internet Protocol Architecture
FTP FTP
FTP protocol
program program
IP IP protocol IP IP protocol IP
29
Internet Protocol Architecture
IP IP protocol IP IP protocol IP
30
Encapsulation
• As data is moving down the protocol stack, each protocol
is adding layer-specific control information.
[Figure: encapsulation down the protocol stack]
User data
Application: Application header | User data
TCP: TCP header | Application data (together: the TCP segment)
IP: carries the TCP segment
Ethernet: carries the result in an Ethernet frame
31
Hourglass
32
Implications of Hourglass
33
Reality
34
Summary
35
OSI & Internet protocol suite
36
Where we work?
Sockets
API
Open/X
Transport
Interface
37
Two reasons for this design
• Upper three layers handle all the details of
application and know little about communication i.e.
sending, receiving data etc
• Upper three layers form a user process while the
lower four layers are provided as part of operating
system or kernel.
About kernel
Kernel
• the part of the operating system that is mandatory
and common to all other software
• simply the name given to the lowest level of
abstraction that is implemented in software
Functionalities of Kernel
• Process Management
• Memory Management
• Device Management
• System Calls
Process Management
• A kernel typically sets up an address space for the
process,
• loads the file containing the code into memory, sets
up a stack for the program and branches to a given
location inside the program, thus starting its
execution
Memory Management
• The kernel has full access to the system's memory and must
allow processes to safely access this memory as they require it.
• Virtual addressing allows the kernel to make a given physical
address appear to be another address, the virtual address.
• Virtual address spaces may be different for different processes;
Device Management
• Processes need access to the peripherals connected to the
computer, which are controlled by the kernel through device
drivers.
• For example, to show the user something on the screen, an
application would make a request to the kernel, which would
forward the request to its display driver, which is then
responsible for actually plotting the character/pixel
System Calls
• A process must be able to access the services provided by the
kernel. This is implemented differently by each kernel, but most
provide a C library or an API, which in turn invokes the related
kernel functions
• Implemented using software simulated interrupts
Programs and Processes
• A program is an executable file residing on disk. A
program is read into memory and executed by the
kernel
• An executing instance of a program is called a
process
• Every process has a unique non-negative identifier
called process id (PID)
Process Environment
• What happens when we execute a C program?
./a.out
• How the command-line arguments are passed to the
process?
• Memory layout of a process
What happens when we execute a C program?
#include <unistd.h>
• pid_t getpid(void);
Returns: process ID of calling process
• pid_t getppid(void);
Returns: parent process ID of calling process
• uid_t getuid(void);
Returns: real user ID of calling process
• uid_t geteuid(void);
Returns: effective user ID of calling process
• gid_t getgid(void);
Returns: real group ID of calling process
• gid_t getegid(void);
Returns: effective group ID of calling process
fork()
• An existing process can create a new one by calling the fork function.
#include <unistd.h>
pid_t fork(void);
Returns: 0 in child, process ID of child in parent, -1 on error
• The new process created by fork is called the child process. This
function is called once but returns twice. The only difference in the
returns is that the return value in the child is 0, whereas the return value
in the parent is the process ID of the new child
fork()
• Both the child and the parent continue executing with the
instruction that follows the call to fork.
• The child is a copy of the parent. For example, the child gets a
copy of the parent's data space, heap, and stack. Note that this
is a copy for the child; the parent and the child do not share
these portions of memory. The parent and the child share the
text segment
copy-on-write (COW)
• don't perform a complete copy of the parent's data, stack, and
heap
• These regions are shared by the parent and the child and have
their protection changed by the kernel to read-only
• If either process tries to modify these regions, the kernel then
makes a copy of that piece of memory only, typically a "page" in
a virtual memory system.
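#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int glob = 6;                   /* global variable, in the data segment */

int
main (void)
{
  int var;                      /* automatic variable, on the stack */
  pid_t pid;
  var = 88;
  printf ("Before fork\n");
  if ((pid = fork ()) < 0)
    perror ("fork");            /* print the reason fork failed */
  else if (pid == 0)
    {                           /* child: modifies its own copies */
      glob++;
      var++;
      printf ("pid = %ld, glob=%d, var=%d\n", (long) getpid (), glob, var);
      exit (0);
    }
  else
    {                           /* parent: its copies are unchanged */
      printf ("pid = %ld, glob=%d, var=%d\n", (long) getpid (), glob, var);
      exit (0);
    }
  return 1;                     /* reached only if fork failed */
}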
fork()
• In general, we never know whether the child starts executing
before the parent or vice versa. This depends on the
scheduling algorithm used by the kernel.
• To synchronize child and parent, some form of interprocess
communication is required.
File sharing between parent and child
• one characteristic of fork is that all file descriptors that are open
in the parent are duplicated in the child.
• The parent and the child share a file table entry for every open
descriptor .
• Generally shell process has three different files opened for
standard input, standard output, and standard error. When a
command is executed as a process, they are inherited
vfork()
• The vfork function is intended to create a new process when the
purpose of the new process is to exec a new program
• The vfork function creates the new process, just like fork,
without copying the address space of the parent into the child,
as the child won't reference that address space
• vfork guarantees that the child runs first, until the child calls
exec or exit. When the child calls either of these functions, the
parent resumes.
What child inherits?
• Real user ID, real group ID, effective user ID, effective group ID
• Current working directory
• Root directory
• File mode creation mask
• Environment
• Process group ID
• Session ID
• Controlling terminal
• Attached shared memory segments
• Memory mappings
• Resource limits
What values in child are different from parent?
Function          pathname  filename  arg list  argv[]    environ   envp[]
execl             •                   •                   •
execlp                      •         •                   •
execle            •                   •                             •
execv             •                             •         •
execvp                      •                   •         •
execve            •                             •                   •
(letter in name)            p         l         v                   e
Example
#include <signal.h>
int kill( pid_t pid, int signo );
int raise(int signo);
int main()
{
signal( SIGINT, foo );
:
sigemptyset( &newmask );
sigaddset( &newmask, SIGINT );
• A signo signal causes the sa_handler signal handler to be called.
• While sa_handler executes, the signals in sa_mask are blocked.
Any more signo signals are also blocked.
• sa_handler remains installed until it is changed by another
sigaction() call. No reset problem.
• sa_sigaction specifies the handler if the SA_SIGINFO flag is set.
struct siginfo {
int si_signo; /* signal number */
int si_errno; /* if nonzero, errno value from <errno.h> */
int si_code; /* additional info (depends on signal) */
pid_t si_pid; /* sending process ID */
uid_t si_uid; /* sending process real user ID */
void *si_addr; /* address that caused the fault */
int si_status; /* exit value or signal number */
long si_band; /* band number for SIGPOLL */
/* possibly other fields also */
};
Other POSIX Functions
• sigpending() examine blocked signals
• sigsetjmp()
siglongjmp() jump functions for use
in signal handlers which
handle masks correctly
• #include <unistd.h>
unsigned int alarm(unsigned int secs);
• #include <setjmp.h>
int setjmp( jmp_buf env );
• Returns 0 if called directly, non-zero if returning from a call to longjmp().
• #include <setjmp.h>
void longjmp( jmp_buf env, int val );
• In the setjmp() call, env is initialized to information about the current
state of the stack.
• The longjmp() call causes the stack to be reset to its env value.
• Execution restarts after the setjmp() call, but this time setjmp()
returns val.
Example
jmp_buf env; /* global */
int main(){
char line[MAX];
int errval;
if(( errval = setjmp(env) ) != 0 )
printf( "error %d: restart\n", errval );
while( fgets( line, MAX, stdin ) != NULL )
process_line(line);
return 0;
}
continued
:
void process_line( char * ptr )
{
:
cmd_add()
:
}
void cmd_add()
{
int token;
token = get_token();
if( token < 0 ) /* bad error */
longjmp( env, 1 );
/* normal processing */
}
int get_token()
{
if( some error )
longjmp( env, 2 );
}
Stack Frames before calling longjmp()
[Figure: the stack holds only main()'s stack frame at the top of the stack, with the stack growing downward. setjmp(env) returns 0, and env records the stack frame information.]
Stack Frames after longjmp()
[Figure: from the top of the stack downward, the stack holds the frames of main(), process_line(), ..., and cmd_add(). longjmp(env,1) in cmd_add() causes the stack frames to be reset.]
What happens if longjmp() is called in signal
handler?
• A signal is automatically added to the signal mask (which
prevents it from further delivery) when its signal
handler is entered. When the signal handler exits,
the signal is removed from the mask.
• When longjmp() is called in signal handler, the signal
remains blocked.
siglongjmp & sigsetjmp
• POSIX does not specify whether longjmp will restore the signal context. If you
want to save and restore signal masks, use siglongjmp.
• POSIX does not specify whether setjmp will save the signal context. If you
want to save signal masks, use sigsetjmp.
• #include <setjmp.h>
• int sigsetjmp(sigjmp_buf env, int savemask);
Returns: 0 if called directly, nonzero if returning from a call to siglongjmp
• void siglongjmp(sigjmp_buf env, int val);
Inter Process Communication
122
Why do processes communicate?
To share resources
Client/server paradigms
Inherently distributed applications
Reusable software components
etc
123
Types of IPC
• Message Passing
– Pipes, FIFOs, and Message Queues
• Synchronization
– Mutexes, condition variables, read-write locks, file and record locks,
and semaphores
• Shared memory
• Remote Procedure Calls
– Solaris doors and Sun RPC
Sharing of information
What is IPC?
• Each process has a private address space. Normally, no
process can write to another process’s space. How to get
important data from process A to process B?
• Message passing between different processes running on
the same operating system is IPC
• Synchronization is required in case of IPC through shared
memory or file system
Pipes
• Pipes are the oldest form of UNIX System IPC and are provided
by all UNIX systems
• Most commonly used form of IPC
• Historically, they have been half duplex (i.e., data flows in only
one direction).
• Because they don’t have names, pipes can be used only
between processes that have a common ancestor.
– Normally, a pipe is created by a process, that process calls fork,
and the pipe is used between the parent and the child.
UNIX Pipes
Parent process, p1                 Child process, p2
  int p[2];
  pipe(p);
  ....                             ....
  write(p[1], "hello", size);      read(p[0], inbuf, size);
[Figure: the info to be shared is copied from p1 into the pipe for p1 and p2, and from the pipe into p2.]
stdout
who|sort
• Create a pipe in the parent
• Fork a child
• In the child, duplicate the write end of the pipe onto standard output
• In the child, exec the 'who' program
• In the parent, wait for the child
• In the parent, duplicate the read end of the pipe onto standard input
• In the parent, exec the 'sort' program
who|sort
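#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int
main (void)
{
  int p[2];
  pid_t ret;
  if (pipe (p) < 0)
    {
      perror ("pipe");
      exit (1);
    }
  ret = fork ();
  if (ret == 0)
    {                           /* child: becomes "who" */
      close (1);
      dup (p[1]);               /* lowest free descriptor is 1 (stdout) */
      close (p[0]);
      close (p[1]);
      execlp ("who", "who", (char *) 0);
      perror ("execlp who");    /* reached only if exec failed */
      exit (1);
    }
  if (ret > 0)
    {                           /* parent: becomes "sort" */
      close (0);
      dup (p[0]);               /* lowest free descriptor is 0 (stdin) */
      close (p[0]);
      close (p[1]);
      wait (NULL);              /* safe only while who's output fits in the pipe */
      execlp ("sort", "sort", (char *) 0);
      perror ("execlp sort");
      exit (1);
    }
  perror ("fork");
  return 1;
}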
dup and dup2 Functions
• #include <unistd.h>
• int dup(int filedes);
• int dup2(int filedes, int filedes2);
Both return: new file descriptor if OK, -1 on error
• The new file descriptor returned by dup is guaranteed to be the lowest-
numbered available file descriptor.
• With dup2, we specify the value of the new descriptor with the filedes2
argument. If filedes2 is already open, it is first closed. If filedes equals
filedes2, then dup2 returns filedes2 without closing it.
dup and dup2
Popen
• #include <stdio.h>
• FILE *popen(const char *cmdstring, const char *type);
Or
Read and write operations Pipe and FIFO
Writing to pipe/fifo when pipe/fifo is open for
reading
• If data size is less than or equal to PIPE_BUF, the write is atomic i.e.
either all the data is written or no data written
• If there is no room in the pipe for the requested data (<PIPE_BUF), by
default it blocks.
– If O_NONBLOCK option is set, EAGAIN error is returned
• If data is >PIPE_BUF and O_NONBLOCK option is set, even if 1 byte
space is available in the pipe, it will write that much data and return
– Atomicity is not guaranteed
Message Queues
• A message queue is a linked list of messages stored within the
kernel and identified by a message queue identifier
• Any process with adequate privileges can place the message
into the queue and any process with adequate privileges can
read from queue
• There is no requirement that some process must be waiting to
receive message before sending the message
Message Queues
• Every message queue has following structure in kernel
Message Queues
Permissions
• struct ipc_perm {
uid_t uid; /* owner's effective user id */
gid_t gid; /* owner's effective group id */
uid_t cuid; /* creator's effective user id */
gid_t cgid; /* creator's effective group id */
mode_t mode; /* access modes */ . . . };
• Permission Bit
– user-read 0400
– user-write (alter) 0200
– group-read 0040
– group-write (alter) 0020
– other-read 0004
– other-write (alter) 0002
Message Queues
• First msgget is used to either open an existing queue or create a new
queue
• #include <sys/msg.h>
int msgget(key_t key, int flag);
– Returns: message queue ID if OK, -1 on error
• Key value can be IPC_PRIVATE, key generated by ftok() or any key
(long integer)
• Flag value combines the permission bits with
– IPC_CREAT if a new queue has to be created
– IPC_CREAT and IPC_EXCL if we want to create a new queue and fail,
rather than reference the existing one, when the key is already in use
Key Values
• The server can create a new IPC structure by specifying a key of
IPC_PRIVATE
– Kernel generates a unique id
• The client and the server can agree on a key by defining the key in a
common header.
• The client and the server can agree on a pathname and project ID
and call the function ftok to convert these two values into a key.
– #include <sys/ipc.h>
– key_t ftok(const char *path, int id);
– The path argument must refer to an existing file. Only the lower 8 bits of
id are used when generating the key.
Message Queues
• When a new queue is created, the following members of the
msqid_ds structure are initialized.
– The ipc_perm structure is initialized
– msg_qnum, msg_lspid, msg_lrpid, msg_stime, and msg_rtime are
all set to 0.
– msg_ctime is set to the current time.
– msg_qbytes is set to the system limit.
• On success, msgget returns the non-negative queue ID. This
value is then used with the other three message queue
functions.
Messages
• Each message is composed of a positive long integer type field, and the actual
data bytes. Messages are always placed at the end of the queue.
• Message Template
• Most applications define their own message structure according to the needs of
the application
Sending Messages
• #include <sys/msg.h>
int msgsnd(int msqid, const void *ptr, size_t nbytes, int
flag);
• msqid is the id returned by msgget sys call
• The ptr argument is a pointer to a message structure
• nbytes is the length of the user data, i.e. sizeof(struct mesg) -
sizeof(long). Length can be zero.
• A flag value of 0 or IPC_NOWAIT can be specified
• If the queue is full and IPC_NOWAIT is not specified, msgsnd() blocks until one of the following occurs
– Room exists for the message
– Message queue is removed (EIDRM error is returned)
– Interrupted by a signal ( EINTR is returned)
158
Receiving Messages
159
Receiving Messages
• The type argument lets us specify which message we want.
– type == 0: The first message on the queue is returned.
– type > 0:The first message on the queue whose message type equals type
is returned.
– type < 0:The first message on the queue whose message type is the lowest
value less than or equal to the absolute value of type is returned.
• A nonzero type is used to read the messages in an order other than
first in, first out.
– Priority to messages, Multiplexing
160
Receiving Messages
• IPC_NOWAIT flag makes the operation nonblocking, causing msgrcv to
return -1 with errno set to ENOMSG if a message of the specified type
is not available.
• If IPC_NOWAIT is not specified, the operation blocks until
– a message of the specified type is available,
– the queue is removed from the system (-1 is returned with errno set to
EIDRM)
– a signal is caught and the signal handler returns (causing msgrcv to return -1
with errno set to EINTR).
161
Receiving Messages
• If the returned message is larger than nbytes and the
MSG_NOERROR bit in flag is set, the message is truncated.
– no notification is given to us that the message was truncated, and
the remainder of the message is discarded.
• If the message is too big and MSG_NOERROR is not specified,
an error of E2BIG is returned instead (and the message stays
on the queue).
162
Control Operations on Message Queues
• #include <sys/msg.h>
int msgctl(int msqid, int cmd, struct msqid_ds *buf );
• IPC_STAT: Fetch the msqid_ds structure for this queue, storing it in the
structure pointed to by buf.
• IPC_SET: Copy the following fields from the structure pointed to by buf to the
msqid_ds structure associated with this queue: msg_perm.uid, msg_perm.gid,
msg_perm.mode, and msg_qbytes.
• IPC_RMID: Remove the message queue from the system and any data still on
the queue. This removal is immediate.
– Any other process still using the message queue will get an error of EIDRM on its next attempted operation on the queue.
– IPC_SET and IPC_RMID can be executed only by a process whose effective user ID equals msg_perm.cuid or
msg_perm.uid or by a process with superuser privileges
163
Server.c
/* key.h */
#define MSGQ_PATH "/home/students/f2007045/msgq_server.c"

/* server.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include "key.h"

struct my_msgbuf
{
  long mtype;
  char mtext[200];
};

int main (void)
{
  struct my_msgbuf buf;
  int msqid;
  key_t key;
  if ((key = ftok (MSGQ_PATH, 'B')) == -1)
    {
      perror ("ftok");
      exit (1);
    }
  if ((msqid = msgget (key, IPC_CREAT | 0644)) == -1)
    {
      perror ("msgget");
      exit (1);
    }
  printf ("server: ready to receive messages\n");
  for (;;)
    {
      /* nbytes counts only the data part, not the mtype field */
      if (msgrcv (msqid, &buf, sizeof (buf.mtext), 0, 0) == -1)
        {
          perror ("msgrcv");
          exit (1);
        }
      printf ("server: \"%s\"\n", buf.mtext);
    }
  return 0;
}
164
Client.c
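#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include "key.h"

struct my_msgbuf
{
  long mtype;
  char mtext[200];
};

int main (void)
{
  struct my_msgbuf buf;
  int msqid;
  key_t key;
  if ((key = ftok (MSGQ_PATH, 'B')) == -1)
    {
      perror ("ftok");
      exit (1);
    }
  if ((msqid = msgget (key, 0)) == -1)
    {
      perror ("msgget");
      exit (1);
    }
  printf ("Enter lines of text, ^D to quit:\n");
  buf.mtype = 1;
  /* fgets replaces the unsafe gets and bounds the input */
  while (fgets (buf.mtext, sizeof (buf.mtext), stdin) != NULL)
    {
      buf.mtext[strcspn (buf.mtext, "\n")] = '\0';  /* drop the newline */
      /* send the text plus its terminating NUL; mtype is not counted */
      if (msgsnd (msqid, &buf, strlen (buf.mtext) + 1, 0) == -1)
        perror ("msgsnd");
    }
  if (msgctl (msqid, IPC_RMID, NULL) == -1)
    {
      perror ("msgctl");
      exit (1);
    }
  return 0;
}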
165
Multiplexing Messages
166
Multiplexing Messages
167
System V Semaphores
• A semaphore is a primitive used to provide synchronization
between various processes (or between various threads in a
given process)
• Binary Semaphores: a semaphore that can assume only values
0 or 1
• Counting Semaphores: semaphore is initialized to N indicating
the number of resources
168
System V Semaphores
169
Semaphore operations
• Create a semaphore and initialize it
– should be atomically done
• Wait for a semaphore: This tests the value of the semaphore, waits
(blocks) if the value is less than or equal to 0, and then decrements the
semaphore value once it is greater than 0 (aka P, lock, wait)
– Testing and decrementing should be a single atomic operation
• Post a semaphore: This increments the semaphore value. If any
processes are blocked waiting for this semaphore's value to be greater
than 0, one of those processes is woken up (aka V, unlock, signal)
170
Producer Consumer Problem
171
Producer Consumer Problem
• Semaphore put controls whether the producer can place an item into the
shared buffer
• Semaphore get controls whether the consumer can remove an item from the
shared buffer
172
System V Semaphores
• Add one more level of detail by defining “a set of
counting semaphores”
• When we say System V semaphore, it refers to a set
of counting semaphores (max size of set is 25)
173
System V Semaphores
• Kernel maintains the following structure for every set
174
System V Semaphores
• Kernel structure for a semaphore set having 2 counting
semaphores
175
Creating Semaphores
176
Initializing a semaphore value
177
Testing whether semaphore has been
initialized
• When process P1 creates the semaphore, sem_otime is
set to zero.
• When P1 calls semctl to initialize it and then semop,
sem_otime is set to the current time.
• When process P2 sees that sem_otime is nonzero, it
knows that the semaphore has been initialized.
178
semctl() commands
• IPC_STAT, IPC_SET, IPC_RMID same as in message queues
• GETVAL: Return the value of semval for the member semnum.
• SETVAL: Set the value of semval for the member semnum. The value is
specified by arg.val.
• GETPID: Return the value of sempid for the member semnum.
• GETNCNT: Return the value of semncnt for the member semnum.
• GETZCNT: Return the value of semzcnt for the member semnum.
• GETALL: Fetch all the semaphore values in the set. These values are stored in
the array pointed to by arg.array.
• SETALL: Set all the semaphore values in the set to the values pointed to by
arg.array
179
Semaphore operations
180
Semaphore operations
• The operation on each member of the set is specified by the
corresponding sem_op value. This value can be negative, 0, or
positive.
• If sem_op>0:
– returning of resources by the process.
– Semval+=sem_op
– If the SEM_UNDO flag is specified, semadj -=sem_op
– subtracted from the semaphore's adjustment value for this process.
181
Semaphore operations
• If sem_op < 0
– obtain resources that the semaphore controls.
• If semval >= |sem_op|
– the resources are available
– semval -= |sem_op|
– If the SEM_UNDO flag is specified,
– semadj += |sem_op|
– that is, the absolute value of sem_op is added to the semaphore's adjustment value for this process.
182
Semaphore operations
• If semval < |sem_op|
– the resources are not available
– If IPC_NOWAIT is specified, semop returns with an error of EAGAIN.
– If IPC_NOWAIT is not specified, the semncnt value for this semaphore is incremented
(since the caller is about to go to sleep), and the calling process is suspended until
one of the following occurs.
• Semval>=|sem_op| i.e. some other process has released some resources. Semncnt--
• The semaphore is removed from the system. In this case, the function returns an error of
EIDRM.
• A signal is caught by the process, and the signal handler returns. In this case, the function returns an
error of EINTR. semncnt--
183
Semaphore operations
• If sem_op = 0,
– this means that the calling process wants to wait until the semaphore's value becomes 0.
• If the semaphore's value is currently 0, the function returns immediately.
• If the semaphore's value is nonzero, the following conditions apply.
– If IPC_NOWAIT is specified, return is made with an error of EAGAIN.
– If IPC_NOWAIT is not specified, semzcnt++, and the calling process is suspended until one of the
following occurs.
• The semaphore's value becomes 0. semzcnt--
• The semaphore is removed from the system. In this case, the function returns an error of EIDRM.
• A signal is caught by the process, and the signal handler returns. the function returns an error of EINTR. Semzcnt--
184
Semval adjustment on process
termination
• it is a problem if a process terminates while it has resources allocated through a
semaphore.
• Whenever we specify the SEM_UNDO flag for a semaphore operation and we
allocate resources (a sem_op value less than 0), the kernel remembers how
many resources we allocated from that particular semaphore (the absolute value
of sem_op).
• When the process terminates, either voluntarily or involuntarily, the kernel
checks whether the process has any outstanding semaphore adjustments and, if
so, applies the adjustment to the corresponding semaphore value.
• If we set the value of a semaphore using semctl, with either the SETVAL or
SETALL commands, the adjustment value for that semaphore in all processes is
set to 0.
185
Producer Consumer
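/* creator: create the set and initialize the semaphore */
unsigned short val[1];
id = semget (KEY, 1, IPC_CREAT | 0666);
setval.val = 2;
semctl (id, 0, SETVAL, setval);

/* user: open the existing set and prepare a wait (P) operation */
id = semget (KEY, 1, 0666);
operations[0].sem_num = 0;
operations[0].sem_op = -1;
operations[0].sem_flg = 0;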
186
Shared Memory
• Shared memory allows two or more processes to
share a given region of memory.
• This is the fastest form of IPC, because the data
does not need to be copied between the client and
the server
187
Message Passing
188
Shared Memory
189
Memory mapped files
190
Memory mapped files
• prot argument for read-write access is
PROT_READ | PROT_WRITE
• Flags must be either MAP_SHARED or
MAP_PRIVATE
• MAP_SHARED is used to share
memory with other processes
191
Why mmap()?
• It makes file handling easy. We open some file and
map that file into our process address space. To write
or read from file we don’t have to use read(), write()
or lseek()
• Another use is to provide shared memory between
unrelated processes
192
Counter Example
193
System V Shared Memory
• For every shared memory segment kernel maintains
the following structure
194
System V Shared Memory
• Creating or opening shared memory
– #include <sys/shm.h>
– int shmget(key_t key, size_t size, int flag);
– size is the size of the memory segment in bytes
– Size is given as zero if we are referencing existing shared
memory segment
– When a new segment is created, the contents of the
segment are initialized with zeros
195
Attaching shared memory to a process
• Once a shared memory segment has been created, a process attaches
it to its address space by calling shmat.
– #include <sys/shm.h>
– void *shmat(int shmid, const void *addr, int flag);
Returns: pointer to shared memory segment if OK, -1 on error
• The address in the calling process at which the segment is attached
depends on the addr argument
• If addr is 0, the segment is attached at the first available address
selected by the kernel. This is the recommended technique.
196
Detaching shared memory from a
process
• #include <sys/shm.h>
• int shmdt(const void *addr);
• this does not remove the identifier and its associated data
structure from the system.
• The identifier remains in existence until some process (often a
server) specifically removes it by calling shmctl with a command
of IPC_RMID.
197
shmctl
• #include <sys/shm.h>
• int shmctl(int shmid, int cmd, struct shmid_ds *buf);
• IPC_STAT, IPC_SET same as other XSI IPC.
• IPC_RMID:
• Remove the shared memory segment from the system. The
segment is not removed until the last process using the
segment terminates or detaches it.
198
Memory Mapping of /dev/zero
• Shared memory can be used between unrelated processes. But if the processes
are related, some implementations provide a different technique.
• The device /dev/zero is an infinite source of 0 bytes when read. This device also
accepts any data that is written to it, ignoring the data.
• An unnamed memory region is created and is initialized to 0.
• Multiple processes can share this region if a common ancestor specifies the
MAP_SHARED flag to mmap.
void *area;
int fd;
if ((fd = open("/dev/zero", O_RDWR)) < 0) perror("open error");
if ((area = mmap(0, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED, fd, 0)) == MAP_FAILED) perror("mmap error");
close(fd); /* the mapping stays valid after the descriptor is closed */
Anonymous Memory Mapping
• A facility similar to the /dev/zero feature. To use this facility, we specify
the MAP_ANON flag to mmap and specify the file descriptor as -1.
• The resulting region is anonymous (since it's not associated with a
pathname through a file descriptor) and creates a memory region that
can be shared with descendant processes.
void *area;
if ((area = mmap(0, SIZE, PROT_READ | PROT_WRITE,
MAP_ANON | MAP_SHARED, -1, 0)) == MAP_FAILED)
perror("mmap error");
200
Shared Memory
• Between unrelated processes:
– XSI or System V shared memory
– can use mmap to map the same file into another process
address spaces using the MAP_SHARED flag.
• Between related processes
– Memory mapping of /dev/zero
– Anonymous memory mapping
• Pipes and FIFOs
• System V Message Queues, Semaphores, Shared Memory
• Posix Message Queues, semaphores, shared memory
Effect of fork, exec, _exit on IPC
TCP/UDP
TCP/IP
TCP or UDP
• At the internet layer, a destination address identifies a host
computer; no further distinction is made regarding which
process will receive the datagram
• TCP or UDP add a mechanism that distinguishes among
destinations within a given host, allowing multiple processes to
send and receive datagrams independently
UDP (User Datagram Protocol)
• Three-way handshake
• It accomplishes two important functions.
– It guarantees that both sides are ready to transfer data (and that
they know they are both ready)
– it allows both sides to agree on initial sequence numbers.
• Sequence numbers are sent and acknowledged during the
handshake. Each machine must choose an initial sequence
number at random that it will use to identify bytes in the stream
it is sending.
TCP Connection Establishment
TCP/IP Model
TCP/IP
• TCP/IP does not include an API definition.
• There are a variety of APIs for use with TCP/IP:
– Sockets
– TLI, XTI
– Winsock
– MacTCP
Functions needed:
• Specify local and remote communication endpoints
• Initiate a connection
• Wait for incoming connection
• Send and receive data
• Terminate a connection gracefully
• Error handling
Berkeley Sockets
• Generic:
– support for multiple protocol families.
– address representation independence
• Uses existing I/O programming interface as much as
possible.
– The socket API is similar to file I/O
Socket
• A socket is an abstract representation of a communication
endpoint.
• Sockets work with Unix I/O services just like files, pipes &
FIFOs.
• Sockets (obviously) have special needs over files:
– establishing a connection
– specifying communication endpoint addresses
Unix Descriptor Table
Socket Descriptor Data Structure
Creating a Socket
uint16_t htons(uint16_t);
uint16_t ntohs(uint16_t);
uint32_t htonl(uint32_t);
uint32_t ntohl(uint32_t);
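A small sketch of what these conversions do to a port number. The helper names `port_to_wire` and `port_from_wire` are hypothetical; the point is that the byte layout after htons is big-endian on every host.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store `port` in network byte order into out[0..1]. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);      /* host order -> network (big-endian) */
    memcpy(out, &net, sizeof(net));  /* copy the bytes exactly as laid out */
}

/* Hypothetical helper: read a network-order port back into host order. */
uint16_t port_from_wire(const unsigned char in[2])
{
    uint16_t net;
    memcpy(&net, in, sizeof(net));
    return ntohs(net);               /* network order -> host order */
}
```

On a big-endian host both conversions are no-ops; on a little-endian host they swap the two bytes. Either way the most significant byte travels first.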
TCP/IP Addresses
• We don’t need to deal with sockaddr structures
since we will only deal with a real protocol family.
• We can use sockaddr_in structures.
•const!
bind returns 0 if successful or -1 on error.
bind()
• calling bind() assigns the address specified by the
sockaddr structure to the socket descriptor.
• You can give bind() a sockaddr_in structure:
bind( mysock,
(struct sockaddr*) &myaddr,
sizeof(myaddr) );
bind() Example
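int mysock, err;
struct sockaddr_in myaddr;
mysock = socket(PF_INET, SOCK_STREAM, 0);
myaddr.sin_family = AF_INET;
myaddr.sin_port = htons(portnum);
myaddr.sin_addr.s_addr = htonl(ipaddress); /* sin_addr is a struct; assign its s_addr member */
err = bind(mysock, (struct sockaddr *) &myaddr, sizeof(myaddr));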
– Client can ask the O.S. to assign any available port number.
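One way to see the kernel assign a port is to bind with port 0 and then ask which port was chosen. This is a sketch under assumptions: the helper name `bind_any_port` is hypothetical, and the loopback address is used only to keep the example local.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to 127.0.0.1 with port 0 and
 * return the port the kernel picked (host byte order), or -1 on error. */
int bind_any_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int port = -1;

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                      /* 0 = let the kernel choose */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* getsockname reports the address and port actually bound. */
    if (bind(sd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
        getsockname(sd, (struct sockaddr *) &addr, &len) == 0)
        port = ntohs(addr.sin_port);

    close(sd);
    return port;
}
```

A client usually skips bind altogether; connect then picks the ephemeral port in the same way.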
IPv4 Address Conversion
int inet_aton(const char *, struct in_addr *);
End-of-File close()
read()
close()
TCP Client
[Figure: TCP client call sequence]
socket
family: PF_INET, PF_INET6, PF_UNIX, PF_X25
type: STREAM, DGRAM, RAW
protocol: 0, used by RAW socket
connect(sd, server_addr, addr_len); /* server PORT# and IP-ADDR; three way handshaking */
CONNECT actions
1. socket is valid
2. fill remote endpoint addr/port
3. choose local endpoint addr/port (ephemeral port; ip addr chosen by routing)
4. initiate 3-way handshaking
write(sd, *buff, mbytes);
read(sd, *buff, mbytes);
close(sd); /* disconnect sequence */
TCP Server
[Figure: TCP server call sequence]
SOCKET
bind: family, port, addr
listen(sd, backlog);
1. Turn sd from active to passive
2. Queue length
ssd = accept(sd, *cliaddr, *len); /* completes the three way handshaking started by the client's CONNECT */
Process specifies            Result
IP address       port
wildcard         0           kernel chooses IP addr and port
wildcard         nonzero     kernel chooses IP, process specifies port
local IP addr    0           process specifies IP, kernel chooses port
local IP addr    nonzero     process specifies IP and port
int close(int sockfd);
Fputs(recvline, stdout);
}
}
TCP Concurrent Server
#include "unp.h"
int
main(int argc, char **argv)
{
    int listenfd, connfd;
    pid_t childpid;
    socklen_t clilen;
    struct sockaddr_in cliaddr, servaddr;

    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(SERV_PORT);

    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
    Listen(listenfd, LISTENQ);

    for ( ; ; ) {
        clilen = sizeof(cliaddr);
        connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);

        if ( (childpid = Fork()) == 0) { /* child process */
            Close(listenfd);             /* close listening socket */
            str_echo(connfd);            /* process the request */
            exit(0);
        }
        Close(connfd);                   /* parent closes connected socket */
    }
}
for ( ; ; ) {
clilen = sizeof (cliaddr);
if ( (connfd = accept (listenfd, (SA *) &cliaddr,
&clilen)) < 0) {
if (errno == EINTR)
continue; /* back to for () */
else
err_sys ("accept error");
}
Connection Abort before accept Returns
I/O Multiplexing
• We often need to be able to monitor multiple
descriptors:
– a generic TCP client (like telnet)
– need to be able to handle unexpected situations, perhaps a
server that shuts down without warning.
– A server that handles both TCP and UDP
Example - generic TCP client
• Input from standard input should be sent to a TCP
socket.
• Input from a TCP socket should be sent to standard
output.
• How do we know when to check for input from each
source?
TCP SOCKET Generic TCP Client
STDIN
STDOUT
Different Solutions
• Use nonblocking I/O.
– use fcntl() to set O_NONBLOCK
• Use alarm and signal handler to interrupt slow
system calls.
• Use multiple processes/threads.
• Use functions that support checking of multiple input
sources at the same time.
Non blocking I/O
if ((n = read(tcpsock, …)) < 0) {
if (errno != EWOULDBLOCK)
/* ERROR */ ;
} else
write(STDOUT_FILENO, …);
}
The problem with nonblocking I/O
• Using blocking I/O allows the Operating System to
put your program to sleep when nothing is happening
(no input). Once input arrives the OS will wake up
your program and read() (or whatever) will return.
• With nonblocking I/O the process will waste
processor time in a busy-wait
Using alarms
signal(SIGALRM, sig_alrm);
alarm(MAX_TIME);
read(STDIN_FILENO,…);
...
signal(SIGALRM, sig_alrm);
alarm(MAX_TIME);
read(tcpsock,…);
...
Alarming Problem
nonblocking I/O
[Figure: nonblocking I/O model, application vs. kernel. The process
repeatedly calls recvfrom, waiting for an OK return (polling). While no
datagram is ready (wait for data), each system call returns EWOULDBLOCK.
Once a datagram is ready, recvfrom copies the data from kernel to user and
returns OK when the copy is complete; the process then handles the datagram.]
I/O multiplexing(select and poll)
[Figure: I/O multiplexing model, application vs. kernel. The process blocks
in a call to select, waiting for one of possibly many sockets to become
readable. When a datagram is ready, select returns readable. The process
then calls recvfrom and blocks while the data is copied from kernel to user
into the application buffer; recvfrom returns OK when the copy is complete,
and the process handles the datagram.]
signal driven I/O(SIGIO)
[Figure: signal-driven I/O model, application vs. kernel. The process
establishes a SIGIO signal handler with the sigaction system call, which
returns at once, and continues executing while the kernel waits for data.
When a datagram is ready the kernel delivers SIGIO. The signal handler calls
recvfrom, and the process blocks while the data is copied from kernel to
user into the application buffer; recvfrom returns OK when the copy is
complete, and the process handles the datagram.]
asynchronous I/O
[Figure: asynchronous I/O model, application vs. kernel. The process calls
aio_read, which returns at once even though no datagram is ready, and
continues executing while the kernel waits for data and then copies the
datagram from kernel to user. When the copy is complete, the kernel delivers
the signal specified in aio_read, and the signal handler processes the
datagram.]
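Comparison of the I/O Models
[Figure: the five I/O models side by side, split into two phases: wait for
data, then copy data from kernel to user.]
• Blocking: initiate, then blocked through both phases until complete
• Nonblocking: check, check, check, ... until ready; blocked during the copy
• I/O multiplexing: check, blocked until ready; initiate, blocked during the copy
• Signal-driven: notification when ready; initiate, blocked during the copy
• Asynchronous: initiate; notification when complete
• First four models: 1st phase handled differently, 2nd phase handled the same
• Asynchronous I/O: handles both phases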
Select()
int select( int maxfd,
fd_set *readset,
fd_set *writeset,
fd_set *excepset,
const struct timeval *timeout);
maxfd: highest-numbered descriptor to be tested, plus 1.
readset: set of descriptors we want to read from.
writeset: set of descriptors we want to write to.
excepset: set of descriptors to watch for exceptions.
timeout: maximum time select should wait
struct timeval
struct timeval {
long tv_sec; /* seconds */
long tv_usec; /* microseconds */
};
fd_set rset;
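A minimal sketch of select with a timeout, assuming a hypothetical helper `wait_readable` that watches a single descriptor. Note the first argument is the descriptor plus one, and the fd_set must be rebuilt before every call because select overwrites it.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `ms` milliseconds for `fd` to become
 * readable.  Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long ms)
{
    fd_set rset;
    struct timeval tv;

    FD_ZERO(&rset);                  /* start from an empty set */
    FD_SET(fd, &rset);               /* watch only this descriptor */
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;

    int n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rset)) ? 1 : 0;
}
```

Passing NULL as the timeout makes select block indefinitely; a timeval of all zeros makes it return immediately (a poll).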
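[Figure: conditions handled by select() in the client. The client calls
select() for readability on either standard input or the socket.
• stdin: data or EOF
• Socket (TCP): error, EOF]
[Figure: time line between client and server for one request and its reply,
time 0 through time 7.]
Batch input
[Figure: filling the pipe between client and server.
Time 7: request8 request7 request6 request5
Time 8: request9 request8 request7 request6]
[Figure: closing the connection. One side issues write, write, close, which
sends data, data, FIN. On the other side read returns > 0, read returns > 0,
then read returns 0. The last segment is the ack of data and FIN.]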
Shutdown function
• #include<sys/socket.h>
int shutdown(int sockfd, int howto);
/* return : 0 if OK, -1 on error */
• howto argument
SHUT_RD : read-half of the connection closed. No more reads can be issued
SHUT_WR : write-half of the connection closed. Also called half-close. Buffered
data will be sent followed by termination sequence.
SHUT_RDWR : both closed
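The half-close can be observed on a connected socket pair. This is a sketch only: `half_close_demo` is a hypothetical name, and a Unix domain socketpair stands in for a TCP connection.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo on a connected pair of Unix-domain stream sockets.
 * Returns 0 if the half-close behaves as described, -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[16];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* sv[0] sends its last data, then closes only its write half. */
    if (write(sv[0], "bye", 3) != 3 || shutdown(sv[0], SHUT_WR) < 0)
        goto done;

    /* The peer still receives the buffered data, then sees EOF. */
    if (read(sv[1], buf, sizeof(buf)) != 3 || memcmp(buf, "bye", 3) != 0)
        goto done;
    if (read(sv[1], buf, sizeof(buf)) != 0)
        goto done;

    /* The read half of sv[0] is still open, so the peer can reply. */
    if (write(sv[1], "ok", 2) != 2 || read(sv[0], buf, sizeof(buf)) != 2)
        goto done;

    ok = 0;
done:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Calling close instead of shutdown would have discarded the descriptor entirely, so the reply could never be read.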
Str_cli function using select and shutdown
#include "unp.h"
void str_cli(FILE *fp, int sockfd)
{
int maxfdp1, stdineof;
fd_set rset;
char sendline[MAXLINE], recvline[MAXLINE];
stdineof = 0;
FD_ZERO(&rset);
for ( ; ; ) {
if (stdineof == 0) // select on standard input for readability
FD_SET(fileno(fp), &rset);
FD_SET(sockfd, &rset);
maxfdp1 = max(fileno(fp), sockfd) + 1;
Select(maxfdp1, &rset, NULL, NULL, NULL);
Continue…..
Str_cli function using select and shutdown
Before first client has established a connection
fd:0(stdin),1(stdout),2(stderr)
[FD_SETSIZE -1] -1
fd:3 => listening socket fd
Data structure TCP server(2)
After first client connection is established
* fd3 => listening socket fd
[FD_SETSIZE -1] -1
*fd4 => client socket fd
Data structure TCP server(3)
After second client connection is established
* fd3 => listening socket fd
[FD_SETSIZE -1] -1
* fd4 => client1 socket fd
* fd5 => client2 socket fd
Data structure TCP server(4)
After first client terminates its connection
*Maxfd does not change
* fd3 => listening socket fd
[FD_SETSIZE -1] -1
* fd4 => client1 socket fd deleted
* fd5 => client2 socket fd
TCP echo server using single process
#include "unp.h"
int main(int argc, char **argv)
{
int i, maxi, maxfd, listenfd, connfd, sockfd;
int nready, client[FD_SETSIZE];
ssize_t n;
fd_set rset, allset;
char line[MAXLINE];
socklen_t clilen;
struct sockaddr_in cliaddr, servaddr;
listenfd = Socket(AF_INET, SOCK_STREAM, 0);
bzero(&servaddr, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
servaddr.sin_port = htons(SERV_PORT);
Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
Listen(listenfd, LISTENQ);
maxfd = listenfd; /* initialize */
maxi = -1; /* index into client[] array */
for (i = 0; i < FD_SETSIZE; i++)
client[i] = -1; /* -1 indicates available entry */
FD_ZERO(&allset);
FD_SET(listenfd, &allset);
for ( ; ; ) {
rset = allset; /* structure assignment */
nready = Select(maxfd+1, &rset, NULL, NULL, NULL);
if (FD_ISSET(listenfd, &rset)) { /* new client connection */
clilen = sizeof(cliaddr);
connfd = Accept(listenfd, (SA *) &cliaddr, &clilen);
for (i = 0; i < FD_SETSIZE; i++)
if (client[i] < 0) {
client[i] = connfd; /* save descriptor */
break;}
if (i == FD_SETSIZE)
err_quit("too many clients");
FD_SET(connfd, &allset); /* add new descriptor to set */
if (connfd > maxfd)
maxfd = connfd; /* maxfd for select */
if (i > maxi)
maxi = i; /* max index in client[] array */
if (--nready <= 0)
continue; /* no more readable descriptors */
}
for (i = 0; i <= maxi; i++) { /* check all clients for data */
if ( (sockfd = client[i]) < 0)
continue;
if (FD_ISSET(sockfd, &rset)) {
if ( (n = Readline(sockfd, line, MAXLINE)) == 0) {
/*connection closed by client */
Close(sockfd);
FD_CLR(sockfd, &allset);
client[i] = -1;
} else
Writen(sockfd, line, n);
if (--nready <= 0)
break; /* no more readable descriptors */
}
}
}
}
Denial of service attacks
• A malicious client connects to the server, sends 1 byte of
data (other than a newline), and then goes to sleep.
=> The server calls readline and blocks, so no other client is served.
Denial of service attacks
• Solution
– use nonblocking I/O
– have each client serviced by a separate thread of control
(spawn a process or a thread to service each client)
– place a timeout on the I/O operation
pselect function
#include <sys/select.h>
#include <signal.h>
#include <time.h>
RFC 1034
RFC 1035
Hierarchical Namespace
Naming Authorities
DNS Record Types
Types
Sample DNS Records
aix IN A 192.168.42.2
IN AAAA 3ffe:b80:1f8d:2:204:acff:fe17:bf38
IN MX 5 aix.unpbook.com.
IN MX 10 mailhost.unpbook.com.
aix-4 IN A 192.168.42.2
aix-6 IN AAAA 3ffe:b80:1f8d:2:204:acff:fe17:bf38
aix-611 IN AAAA fe80::204:acff:fe17:bf38
Resolvers and Name Servers
DNS library functions
gethostbyname
gethostbyaddr
getservbyname
getservbyport
getaddrinfo
gethostbyname
#include <netdb.h>
struct hostent
struct hostent {
    char  *h_name;      /* official (canonical) name */
    char **h_aliases;   /* other names */
    int    h_addrtype;  /* AF_INET or AF_INET6 */
    int    h_length;    /* address length (4 or 16) */
    char **h_addr_list; /* array of ptrs to addresses */
};
struct hostent
gethostbyname and errors
• On error, gethostbyname returns NULL.
• gethostbyname sets the global variable h_errno to indicate
the exact error:
– HOST_NOT_FOUND
– TRY_AGAIN
– NO_RECOVERY
– NO_DATA
– NO_ADDRESS
Sample code using gethostbyname()
char *ptr, **pptr;
char str[INET_ADDRSTRLEN];
struct hostent *hptr;

while (--argc > 0) {
    ptr = *++argv;
    if ( (hptr = gethostbyname(ptr)) == NULL) {
        err_msg("gethostbyname error for host: %s: %s",
                ptr, hstrerror(h_errno));
        continue;
    }
    printf("official hostname: %s\n", hptr->h_name);

    for (pptr = hptr->h_aliases; *pptr != NULL; pptr++)
        printf("\talias: %s\n", *pptr);

    switch (hptr->h_addrtype) {
    case AF_INET:
        pptr = hptr->h_addr_list;
        for ( ; *pptr != NULL; pptr++)
            printf("\taddress: %s\n",
                   Inet_ntop(hptr->h_addrtype, *pptr, str, sizeof(str)));
        break;
    default:
        err_ret("unknown address type");
        break;
    }
}
gethostbyaddr
• #include <netdb.h>
struct hostent *gethostbyaddr (const char *addr, socklen_t
len, int family);
• The addr argument is not a char*, but is really a pointer to an in_addr
structure containing the IPv4 address. len is the size of this structure: 4
for an IPv4 address. The family argument is AF_INET.
• The function gethostbyaddr takes a binary IPv4 address and
tries to find the hostname corresponding to that address. This is
the reverse of gethostbyname
getservbyname and getservbyport
• bzero(&hints, sizeof(hints) ) ;
• hints.ai_flags = AI_CANONNAME;
• hints.ai_family = AF_INET;
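The three hints lines above are the setup for a getaddrinfo call. A minimal sketch of the complete call follows; the helper name `first_ipv4` and the extra ai_socktype hint are assumptions added for illustration.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve `host` to its first IPv4 address and write
 * the dotted-decimal form into out.  Returns 0 on success, -1 on failure. */
int first_ipv4(const char *host, char *out, socklen_t outlen)
{
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));  /* same effect as bzero */
    hints.ai_flags = AI_CANONNAME;     /* ask for the canonical name too */
    hints.ai_family = AF_INET;         /* IPv4 results only */
    hints.ai_socktype = SOCK_STREAM;   /* assumption: one entry per address */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    /* With ai_family = AF_INET every ai_addr is a sockaddr_in. */
    struct sockaddr_in *sin = (struct sockaddr_in *) res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, out, outlen);

    freeaddrinfo(res);                 /* the result list is heap-allocated */
    return p == NULL ? -1 : 0;
}
```

A numeric string such as "127.0.0.1" is accepted as the host argument and is converted without a DNS query for the address itself.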
• Connectionless, unreliable datagram protocol
• Popular applications using UDP:
– DNS (the Domain Name System)
– NFS (the Network File System)
– SNMP (Simple Network Management Protocol)
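Socket functions for UDP client-server
[Figure: socket functions for UDP client-server]
UDP Server: socket() -> bind() -> recvfrom() (blocks until a datagram is
received from a client) -> process request -> sendto()
UDP Client: socket() -> sendto() -> recvfrom() -> close()
The client's sendto() carries data (request) to the server's recvfrom(); the
server's sendto() carries data (reply) back to the client's recvfrom().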
recvfrom and sendto functions
#include<sys/socket.h>
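ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags,
struct sockaddr *from, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags,
const struct sockaddr *to, socklen_t addrlen);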
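To make the argument roles concrete, here is a sketch in which one process plays both UDP client and server over the loopback interface. The function name `udp_echo_once`, the 2-second receive timeout, and the 512-byte buffer are assumptions chosen for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of reply bytes copied into
 * `reply` (not NUL-terminated), or -1 on error. */
ssize_t udp_echo_once(const char *msg, char *reply, size_t replylen)
{
    struct sockaddr_in servaddr, cliaddr;
    socklen_t len = sizeof(servaddr);
    struct timeval tv = { 2, 0 };      /* do not block forever in recvfrom */
    char buf[512];
    ssize_t got, n = -1;

    int srv = socket(AF_INET, SOCK_DGRAM, 0);
    int cli = socket(AF_INET, SOCK_DGRAM, 0);
    if (srv < 0 || cli < 0)
        goto done;
    setsockopt(srv, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    setsockopt(cli, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Server: bind to loopback, let the kernel pick the port, look it up. */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    servaddr.sin_port = htons(0);
    if (bind(srv, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0 ||
        getsockname(srv, (struct sockaddr *) &servaddr, &len) < 0)
        goto done;

    /* Client: no connect; the destination goes in every sendto call. */
    if (sendto(cli, msg, strlen(msg), 0,
               (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
        goto done;

    /* Server: recvfrom fills in the sender's address, which sendto reuses. */
    len = sizeof(cliaddr);
    got = recvfrom(srv, buf, sizeof(buf), 0,
                   (struct sockaddr *) &cliaddr, &len);
    if (got < 0 || sendto(srv, buf, (size_t) got, 0,
                          (struct sockaddr *) &cliaddr, len) < 0)
        goto done;

    /* Client: the last two arguments may be NULL if the sender is not needed. */
    n = recvfrom(cli, reply, replylen, 0, NULL, NULL);
done:
    if (srv >= 0) close(srv);
    if (cli >= 0) close(cli);
    return n;
}
```

The addrlen argument is a value for sendto but a value-result pointer for recvfrom, which is why `len` is reset before the server's recvfrom.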
[Figure: UDP datagrams arriving at the server's single socket receive buffer.]
Verify Received Response
#include "unp.h"
void dg_cli(FILE *fp, int sockfd, const SA *pservaddr, socklen_t servlen)
{
    int n;
    char sendline[MAXLINE], recvline[MAXLINE + 1];
    socklen_t len;
    struct sockaddr *preply_addr;

    preply_addr = Malloc(servlen);
    while (Fgets(sendline, MAXLINE, fp) != NULL) {
        Sendto(sockfd, sendline, strlen(sendline), 0, pservaddr, servlen);
        len = servlen;
        n = Recvfrom(sockfd, recvline, MAXLINE, 0, preply_addr, &len);
        /* ignore any datagram that did not come from the server we sent to */
        if (len != servlen || memcmp(pservaddr, preply_addr, len) != 0) {
            printf("reply from %s (ignored)\n",
                   Sock_ntop(preply_addr, len));
            continue;
        }
        recvline[n] = 0; /* null terminate */
        Fputs(recvline, stdout);
    }
}
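[Figure: connected UDP socket. Between the application and its peer, UDP
stores the peer IP address and port# from connect. A UDP datagram from the
peer is delivered; a UDP datagram from some other IP address and/or port#
is not (marked ??? in the figure).]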
Lack of Flow Control with UDP
#include "unp.h"
The interface’s buffers were full or they could have been discarded by
the sending host.
The counter “dropped due to full socket buffers” indicates how many
datagram were received by UDP but were discarded because the
receiving socket’s receive queue was full
Solution
fast server, slow client.
Increase the size of socket receive buffer.
TCP and UDP Echo Server Using select
#include "unp.h"
int main(int argc, char **argv)
{
int listenfd, connfd, udpfd, nready, maxfd1;
char mesg[MAXLINE];
pid_t childpid;
fd_set rset;
ssize_t n;
socklen_t len;
const int on = 1;
struct sockaddr_in cliaddr, servaddr;
void sig_chld(int);
TCP and UDP Echo Server Using select
bzero(&servaddr, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
servaddr.sin_port = htons(SERV_PORT);
Listen(listenfd, LISTENQ);
/* Create UDP socket */
udpfd = Socket(AF_INET, SOCK_DGRAM, 0);
bzero(&servaddr, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
servaddr.sin_port = htons(SERV_PORT);
if (FD_ISSET(udpfd, &rset)) {
len = sizeof(cliaddr);
n = Recvfrom(udpfd, mesg, MAXLINE, 0, (SA *) &cliaddr, &len);
Sendto(udpfd, mesg, n, 0, (SA *) &cliaddr, len); /* echo the datagram back */
}
• Advantages of UDP:
– UDP supports broadcasting and multicasting
– UDP has no connection setup or teardown
• For a two-packet request-reply, TCP needs 8 extra packets to be
transmitted
• Minimum transaction time: UDP: RTT + SPT, TCP: 2 * RTT + SPT
(RTT = round-trip time, SPT = server processing time)
When to use UDP instead of TCP?
• Recommendations:
– UDP must be used for broadcast and multicast applications
• Error control or reliability must be added at the application layer if required
– UDP can be used for simple request-reply applications, but error
detection must be built into the application
• Acknowledgements, timeouts, retransmissions
– UDP should not be used for bulk data transfer
• Bulk transfer requires flow control along with error control which is like
replicating TCP at appl layer
Adding Reliability to a UDP Application
• Introduction
• getsockopt and setsockopt function
• socket state
• Generic socket option
• IPv4 socket option
• ICMPv6 socket option
• IPv6 socket option
• TCP socket option
• fcntl function
Introduction
#include <sys/socket.h>
int getsockopt(int sockfd, int level, int optname, void *optval,
socklen_t *optlen);
int setsockopt(int sockfd, int level, int optname, const void *optval,
socklen_t optlen);
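[Figure: default operation of close: it returns immediately.
client: write, then close; close returns at once
client -> server: data, FIN (data queued by TCP)
server -> client: ack of data and FIN
server: application reads queued data and FIN, then close
server -> client: FIN
client -> server: ack of data and FIN]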
SO_LINGER
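[Figure: close with SO_LINGER socket option set and l_linger a positive value.
client: write, then close
client -> server: data, FIN (data queued by TCP)
server -> client: ack of data and FIN; only now does close return
server: application reads queued data and FIN, then close
server -> client: FIN
client -> server: ack of data and FIN]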
SO_LINGER
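[Figure: using shutdown to know that peer has received our data.
client: write, then shutdown, then read blocks
client -> server: data, FIN (data queued by TCP)
server -> client: ack of data and FIN
server: application reads queued data and FIN, then close
server -> client: FIN; the client's read returns 0
client -> server: ack of data and FIN]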
• Another way to know that the peer application has read the data
– use an application-level acknowledgment (application ACK)
– client
char ack;
Write(sockfd, data, nbytes); // data from client to server
n = Read(sockfd, &ack, 1); // wait for application-level ack
– server
nbytes = Read(sockfd, buff, sizeof(buff)); // data from client
// server verifies it received the correct amount of data from the client
Write(sockfd, "", 1); // server's ACK back to client
SO_RCVBUF , SO_SNDBUF
• These options let us change the default send-buffer and receive-buffer sizes.
– Default TCP send and receive buffer size:
• 4096 bytes
• 8192-61440 bytes
– Default UDP buffer size: 9000 bytes (send), 40000 bytes (receive)
• SO_RCVBUF must be set before the connection is established.
– For a client, set it before calling connect()
– For a server, set it on the listening socket before calling listen()
• The TCP socket buffer size should be at least three times the MSS
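A sketch of setting SO_RCVBUF at the point the slide requires, before connect() or listen(). The helper name `rcvbuf_after_request` is hypothetical. The value read back may differ from the request, since kernels are free to round, double, or cap it.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request a receive buffer of `bytes` on a new,
 * unconnected TCP socket and return the size the kernel reports
 * afterwards, or -1 on error. */
int rcvbuf_after_request(int bytes)
{
    int actual = -1;
    socklen_t len = sizeof(actual);

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    /* Set the option first, then read back what the kernel granted. */
    if (setsockopt(sd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0 ||
        getsockopt(sd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        actual = -1;

    close(sd);
    return actual;
}
```

Reading the value back with getsockopt is the only reliable way to learn the size in effect.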
SO_RCVLOWAT , SO_SNDLOWAT
• Every socket has a receive low-water mark and send low-water mark.
(used by select function)
• Receive low-water mark:
– the amount of data that must be in the socket receive buffer for select to
return “readable”.
– Default receive low-water mark : 1 for TCP and UDP
• Send low-water mark:
– the amount of available space that must exist in the socket send buffer for
select to return “writable”
– Default send low-water mark : 2048 for TCP
– For UDP, the available space in the send buffer never changes, because
UDP does not keep a copy of the datagrams it sends.
SO_RCVTIMEO, SO_SNDTIMEO
SO_REUSEADDR
• Allows a listening server to start and bind its well-known port even if
previously established connections exist that use this port as their local
port.
• Allows multiple instances of the same server to be started on the same
port, as long as each instance binds a different local IP address.
• Allows a single process to bind the same port to multiple sockets, as
long as each bind specifies a different local IP address.
• Allows completely duplicate bindings: multicasting
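The first use listed above is the common one for TCP servers. A minimal sketch, assuming a hypothetical helper `listen_reuseaddr` and an arbitrary backlog of 5:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listening socket on the wildcard
 * address with SO_REUSEADDR set.  Returns the descriptor, or -1 on error. */
int listen_reuseaddr(uint16_t port)
{
    const int on = 1;
    struct sockaddr_in addr;

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* The option must be set before bind() to affect the bind. */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(sd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(sd, 5) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

Without the option, restarting a server while old connections still use its port can make bind fail with EADDRINUSE.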
SO_TYPE
• Return the socket type.
• Returned values include SOCK_STREAM and SOCK_DGRAM.
SO_USELOOPBACK
• This option applies only to sockets in the routing
domain(AF_ROUTE).
• The socket receives a copy of everything sent on the
socket.
IPv4 socket option
• Level => IPPROTO_IP
• IP_HDRINCL => If this option is set for a raw IP
socket, we must build our IP header for all the
datagrams that we send on the raw socket.
IPv4 socket option
• IP_OPTIONS=>allows us to set IP option in IPv4
header.(chapter 24)
• IP_RECVDSTADDR=>This socket option causes the
destination IP address of a received UDP datagram
to be returned as ancillary data by recvmsg.
(chapter20)
IP_RECVIF
• Cause the index of the interface on which a UDP
datagram is received to be returned as ancillary data
by recvmsg.(chapter20)
IP_TOS
• lets us set the type-of-service(TOS) field in IP header
for a TCP or UDP socket.
• If we call getsockopt for this option, the current value
that would be placed into the TOS(type of service)
field in the IP header is returned
IP_TTL
• We can set and fetch the default TTL(time to live
field).
ICMPv6 socket option
• This socket option is processed by ICMPv6 and has
a level of IPPROTO_ICMPV6.
• ICMP6_FILTER => lets us fetch and set an
icmp6_filter structure that specifies which of the
256 possible ICMPv6 message types are passed to
the process on a raw socket.(chapter 25)
IPv6 socket option
• These socket options are processed by IPv6 and have a
level of IPPROTO_IPV6.
• IPV6_ADDRFORM=>allow a socket to be converted
from IPv4 to IPv6 or vice versa.(chapter 10)
• IPV6_CHECKSUM=>specifies the byte offset into the
user data of where the checksum field is located.
IPV6_DSTOPTS
• Specifies that any received IPv6 destination options
are to be returned as ancillary data by recvmsg.
IPV6_HOPLIMIT
• Setting this option specifies that the received hop
limit field be returned as ancillary data by recvmsg.
(chapter 20)
• Default off.
IPV6_HOPOPTS
• Setting this option specifies that any received IPv6
hop-by-hop option are to be returned as ancillary
data by recvmsg.(chapter 24)
IPV6_NEXTHOP
• This is not a socket option but the type of an ancillary
data object that can be specified to sendmsg. This
object specifies the next-hop address for a datagram
as a socket address structure.(chapter20)
IPV6_PKTINFO
• Setting this option specifies that the following two
pieces of information about a received IPv6
datagram are to be returned as ancillary data by
recvmsg: the destination IPv6 address and the
arriving interface index.(chapter 20)
IPV6_PKTOPTIONS
• Most of the IPv6 socket options assume a UDP
socket with the information being passed between
the kernel and the application using ancillary data
with recvmsg and sendmsg.
• A TCP socket fetches and stores these values using the
IPV6_PKTOPTIONS socket option.
IPV6_RTHDR
• Setting this option specifies that a received IPv6
routing header is to be returned as ancillary data by
recvmsg.(chapter 24)
• Default off
IPV6_UNICAST_HOPS
• This is similar to the IPv4 IP_TTL.
• Specifies the default hop limit for outgoing datagram
sent on the socket, while fetching the socket option
returns the value for the hop limit that the kernel will
use for the socket.
TCP socket option
• There are five socket options for TCP, but three are
new with Posix.1g and not widely supported.
• Specify the level as IPPROTO_TCP.
TCP_KEEPALIVE
• This is new with Posix.1g
• It specifies the idle time in second for the connection
before TCP starts sending keepalive probe.
• Default 2hours
• this option is effective only when the
SO_KEEPALIVE socket option enabled.
TCP_MAXRT
• This is new with Posix.1g.
• It specifies the amount of time in seconds before a
connection is broken once TCP starts retransmitting
data.
– 0 : use default
– -1:retransmit forever
– positive value:rounded up to next transmission time
TCP_MAXSEG
• This allows us to fetch or set the maximum segment
size(MSS) for TCP connection.
TCP_NODELAY
• This option disables TCP’s Nagle algorithm.
(default this algorithm enabled)
• purpose of the Nagle algorithm.
==>prevent a connection from having multiple small
packets outstanding at any time.
• Small packet => any packet smaller than MSS.
Nagle algorithm
• Default enabled.
• Reduce the number of small packet on the WAN.
• If given connection has outstanding data , then no
small packet data will be sent on connection until the
existing data is acknowledged.
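Turning the Nagle algorithm off is a single setsockopt call at level IPPROTO_TCP. The sketch below uses two hypothetical helpers, `disable_nagle` and `nagle_disabled`, to set the option and read it back.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket.
 * Returns 0 if OK, -1 on error. */
int disable_nagle(int sd)
{
    int on = 1;
    return setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: returns 1 if TCP_NODELAY is set, 0 if the Nagle
 * algorithm is still enabled, -1 on error. */
int nagle_disabled(int sd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

This trades more small packets on the network for lower latency, so it suits interactive traffic rather than bulk transfer.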
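Nagle algorithm disabled
[Figure: the characters h, e, l, l, o, ! are typed at 0, 250, 500, 750,
1000, and 1250 ms, and each is sent in its own packet as soon as it is
typed. Time axis: 0 to 2000 ms.]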
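Nagle algorithm enabled
[Figure: the same characters are typed at 0, 250, 500, 750, 1000, and
1250 ms. h is sent alone at time 0; the later characters wait for the
outstanding data to be acknowledged and go out combined as el, then lo,
then !. Time axis: 0 to 2500 ms.]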
fcntl function
• File control
• This function performs various descriptor control
operations.
• Provides the following features
– Nonblocking I/O(chapter 15)
– signal-driven I/O(chapter 22)
– set socket owner to receive SIGIO signal.
(chapter 21,22)
#include <fcntl.h>
int fcntl(int fd, int cmd, ... /* int arg */);
Returns: depends on cmd if OK, -1 on error
• Each descriptor has a set of file flags that are fetched with
the F_GETFL command and set with the F_SETFL command.
Misuse of fcntl
/* wrong way to set socket nonblocking: this clears every other file status flag */
if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0)
err_sys("F_SETFL error");
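The usual remedy is to fetch the current flags first and OR the new flag in. A minimal sketch, with `set_nonblocking` as a hypothetical helper name:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: turn on O_NONBLOCK while preserving the other
 * file status flags.  Returns 0 if OK, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);     /* fetch the current flags */
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)  /* add one bit */
        return -1;
    return 0;
}
```

After this call, a read on a descriptor with no data ready returns -1 with errno set to EAGAIN or EWOULDBLOCK instead of blocking. To clear the flag later, use `flags & ~O_NONBLOCK` in the same way.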
#include <sys/un.h>
struct sockaddr_un {
uint8_t sun_len;
sa_family_t sun_family; /* AF_LOCAL */
char sun_path[104]; /* null-terminated pathname */
};
• sun_path must be null terminated
socketpair Function
#include <sys/socket.h>
int socketpair(int family, int type, int protocol, int sockfd[2]);
• family must be AF_LOCAL
• protocol must be 0
return: 0 if OK, -1 on error
socketpair Function
#include "unp.h"
int main(int argc, char **argv)
{
    int listenfd, connfd;
    pid_t childpid;
    socklen_t clilen;
    struct sockaddr_un cliaddr, servaddr;
    void sig_chld(int);

    listenfd = Socket(AF_LOCAL, SOCK_STREAM, 0);

    unlink(UNIXSTR_PATH); /* remove any pathname left by an earlier run */
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sun_family = AF_LOCAL;
    strcpy(servaddr.sun_path, UNIXSTR_PATH);

    Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));
    Listen(listenfd, LISTENQ);
    Signal(SIGCHLD, sig_chld);

    for ( ; ; ) {
        clilen = sizeof(cliaddr);
        if ( (connfd = accept(listenfd, (SA *) &cliaddr,
                              &clilen)) < 0) {
            if (errno == EINTR)
                continue; /* back to for() */
            else
                err_sys("accept error");
        }
        if ( (childpid = Fork()) == 0) { /* child process */
            Close(listenfd);  /* close listening socket */
            str_echo(connfd); /* process the request */
            exit(0);
        }
        Close(connfd); /* parent closes connected socket */
    }
}
passing descriptors
• Current Unix systems provide a way to pass any open descriptor from one process to any other
process (using sendmsg).
• The ability to pass an open file descriptor between processes is powerful. It can lead to different
ways of designing client-server applications.
• It allows one process (typically a server) to do everything that is required to open a file (involving
such details as translating a network name to a network address, dialing a modem, negotiating locks
for the file, etc.) and simply pass back to the calling process a descriptor that can be used with all the
I/O functions.
• All the details involved in opening the file or device are hidden from the client.
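A compact sketch of the mechanism, assuming two hypothetical helpers `send_fd_sketch` and `recv_fd_sketch` that carry one descriptor as SCM_RIGHTS ancillary data over a Unix domain socket. These are illustrations only, not the send_fd/recv_fd or Read_fd functions used elsewhere in these slides.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for one int; the union guarantees cmsghdr alignment. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send fd_to_send over `sock`.  Returns 0 or -1. */
int send_fd_sketch(int sock, int fd_to_send)
{
    char byte = 0;                          /* one byte of ordinary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;             /* payload is a descriptor */
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from `sock`.
 * Returns the new descriptor, or -1 on error. */
int recv_fd_sketch(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS)
        return -1;                          /* no descriptor arrived */
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file as the sender's descriptor, much as if it had been duplicated across the process boundary.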
passing descriptors(2)
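[Figure: mycat program after creating stream pipe using socketpair: one
process, mycat, holding both ends [0] and [1].]
[Figure: mycat program after invoking openfile program: mycat forks and
execs openfile (command-line args). mycat keeps end [0], openfile keeps end
[1], and the descriptor is passed from openfile back to mycat.]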
recvmsg and sendmsg
#include <sys/socket.h>
struct msghdr {
void *msg_name; /* protocol address */
socklen_t msg_namelen; /* size of protocol address */
struct iovec *msg_iov; /* scatter/gather array */
size_t msg_iovlen; /* # elements in msg_iov */
void *msg_control; /* ancillary data; must be aligned
for a cmsghdr structure */
socklen_t msg_controllen; /* length of ancillary data */
int msg_flags; /* flags returned by recvmsg() */
};
recvmsg and sendmsg
[Figure: a msghdr{} whose msg_name points to a 16-byte buffer (msg_namelen
16), whose msg_iov points to an array of 3 iovec{} structures (msg_iovlen 3)
with iov_len values 100, 60, and 80, whose msg_control points to a 20-byte
buffer (msg_controllen 20), and whose msg_flags is 0.]
Figure 13.8 Data structures when recvmsg is called
recvmsg and sendmsg
[Figure: the structures of Figure 13.8 with results filled in. The msg_name
buffer holds a sockaddr_in{} (16, AF_INET, 2000, 198.69.10.2). The msghdr{}
shows msg_namelen 16, msg_iovlen 3 (iov_len 100, 60, 80), msg_controllen 20,
and msg_flags 0. The control buffer holds a cmsghdr with cmsg_len 16,
cmsg_level IPPROTO_IP, cmsg_type IP_RECVDSTADDR, and data 206.62.226.35.]
Figure 13.9 Update of Figure 13.8 when recvmsg r…
Ancillary Data
• Ancillary data can be sent and received using the msg_control and
msg_controllen members of the msghdr structure with sendmsg and recvmsg
functions.
Protocol c
IPv4 IP
Ancillary Data
[Figure: msg_control points to two ancillary data objects laid out back to
back, together spanning msg_controllen bytes. Each object is a cmsghdr{}
(cmsg_len, cmsg_level, cmsg_type), pad, then data. cmsg_len, computed with
CMSG_LEN(), covers the header through the data; CMSG_SPACE() also covers the
trailing pad.]
Figure 13.12 Ancillary data containing two ancillar…
Ancillary Data
[Figure: two cmsghdr{} examples.
Left: cmsg_len 16, cmsg_level SOL_SOCKET, cmsg_type SCM_RIGHTS, data: descriptor.
Right: cmsg_len 16, cmsg_level SOL_SOCKET, cmsg_type SCM_CREDS, data: fcred{}.]
Figure 13.13 cmsghdr structure when used with…
Control Message Header
struct cmsghdr {
socklen_t cmsg_len; /* data byte count, including header */
int cmsg_level; /* originating protocol */
int cmsg_type; /* protocol-specific type */
/* followed by the actual control message data */
};
Control Message Header
static struct cmsghdr *cmptr = NULL; /* malloc'ed first time */
/*
* Receive a file descriptor from a server process. Also, any data
* received is passed to (*userfunc)(STDERR_FILENO, buf, nbytes).
* We have a 2-byte protocol for receiving the fd from send_fd().
*/
int
recv_fd(int fd, ssize_t (*userfunc)(int, const void *, size_t))
{
int newfd, nr, status;
char *ptr;
char buf[MAXLINE];
struct iovec iov[1];
struct msghdr msg;
status = -1;
for ( ; ; ) {
iov[0].iov_base = buf;
iov[0].iov_len = sizeof(buf);
msg.msg_iov = iov;
msg.msg_iovlen = 1;
msg.msg_name = NULL;
msg.msg_namelen = 0;
Control Message Header
        if (cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
            return(-1);
        msg.msg_control = cmptr;
        msg.msg_controllen = CONTROLLEN;
        if ((nr = recvmsg(fd, &msg, 0)) < 0) {
            err_sys("recvmsg error");
        } else if (nr == 0) {
            err_ret("connection closed by server");
            return(-1);
        }
        for (ptr = buf; ptr < &buf[nr]; ) {
            if (*ptr++ == 0) {
                if (ptr != &buf[nr-1])
                    err_dump("message format error");
                status = *ptr & 0xFF; /* prevent sign extension */
                if (status == 0) {
                    if (msg.msg_controllen != CONTROLLEN)
                        err_dump("status = 0 but no fd");
                    newfd = *(int *)CMSG_DATA(cmptr);
                } else {
                    newfd = -status;
                }
                nr -= 2;
            }
        }
        if (nr > 0 && (*userfunc)(STDERR_FILENO, buf, nr) != nr)
            return(-1);
        if (status >= 0) /* final data has arrived */
            return(newfd); /* descriptor, or -status */
    }
}
Ancillary Data
#include "unp.h"
int my_open(const char *, int);
int main(int argc, char **argv)
{
int fd, n;
char buff[BUFFSIZE];
if (argc != 2)
err_quit("usage: mycat <pathname>");
if ( (fd = my_open(argv[1], O_RDONLY)) < 0)
err_sys("cannot open %s", argv[1]);
while ( (n = Read(fd, buff, BUFFSIZE)) > 0)
Write(STDOUT_FILENO, buff, n);
exit(0);
}
mycat program (shown in Figure 14.7)
#include "unp.h"
int
my_open(const char *pathname, int mode)
{
int fd, sockfd[2], status;
pid_t childpid;
char c, argsockfd[10], argmode[10];
Socketpair(AF_LOCAL, SOCK_STREAM, 0, sockfd);
if ( (childpid = Fork()) == 0) { /* child process */
Close(sockfd[0]);
snprintf(argsockfd, sizeof(argsockfd), "%d", sockfd[1]);
snprintf(argmode, sizeof(argmode), "%d", mode);
execl("./openfile", "openfile", argsockfd, pathname, argmode,
(char *) NULL);
err_sys("execl error");
}
myopen function(1) : open a file and return a descriptor
/* parent process - wait for the child to terminate */
Close(sockfd[1]); /* close the end we don't use */
Waitpid(childpid, &status, 0);
if (WIFEXITED(status) == 0)
err_quit("child did not terminate");
if ( (status = WEXITSTATUS(status)) == 0)
Read_fd(sockfd[0], &c, 1, &fd);
else {
errno = status; /* set errno value from child's status */
fd = -1;
}
Close(sockfd[0]);
return(fd);
}
myopen function(2) : open a file and return a descriptor
receiving sender credentials
struct fcred {
uid_t fc_ruid; /* real user ID */
gid_t fc_rgid; /* real group ID */
char fc_login[MAXLOGNAME]; /* setlogin() name */
uid_t fc_uid; /* effective user ID */
short fc_ngroups; /* number of groups */
gid_t fc_groups[NGROUPS]; /* supplementary group IDs */
};
#define fc_gid fc_groups[0] /* effective group ID */
receiving sender credentials(2)
• Usually MAXLOGNAME is 16
• NGROUPS is 16
• fc_ngroups is at least 1
• The credentials are sent as ancillary data when data is sent on a Unix domain socket (only if
the receiver of the data has enabled the LOCAL_CREDS socket option).
• On a datagram socket, the credentials accompany every datagram.
• Credentials cannot be sent along with a descriptor.
• Users are not able to forge credentials.
Advanced I/O Functions
Outline
• Socket Timeouts
• recv and send Functions
• readv and writev Functions
• recvmsg and sendmsg Function
• Ancillary Data
• How much Data is Queued?
• Sockets and Standard I/O
Socket Timeouts
• Three ways to place a timeout on an I/O operation involving a
socket
– Call alarm, which generates the SIGALRM signal when the
specified time has expired.
– Block waiting for I/O in select, which has a time limit built in, instead
of blocking in a call to read or write.
– Use the newer SO_RCVTIMEO and SO_SNDTIMEO socket
options.
Connect with a Timeout Using SIGALRM
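/* Wait up to sec seconds for fd to become readable, using the time limit built into select. */
int
readable_timeo(int fd, int sec)
{
    fd_set rset;
    struct timeval tv;
    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = sec;
    tv.tv_usec = 0;
    /* > 0 if descriptor is readable, 0 on timeout */
    return(select(fd + 1, &rset, NULL, NULL, &tv));
}
/* UDP client loop that sets a 5-second receive timeout with SO_RCVTIMEO. */
void
dg_cli(FILE *fp, int sockfd, const SA *pservaddr, socklen_t servlen)
{
    int n;
    char sendline[MAXLINE], recvline[MAXLINE + 1];
    struct timeval tv;
    tv.tv_sec = 5;
    tv.tv_usec = 0;
    Setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    while (Fgets(sendline, MAXLINE, fp) != NULL) {
        Sendto(sockfd, sendline, strlen(sendline), 0, pservaddr, servlen);
        n = recvfrom(sockfd, recvline, MAXLINE, 0, NULL, NULL);
        if (n < 0) {
            if (errno == EWOULDBLOCK) {
                fprintf(stderr, "socket timeout\n");
                continue;
            } else
                err_sys("recvfrom error");
        }
        recvline[n] = 0; /* null terminate */
        Fputs(recvline, stdout);
    }
}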
recv and send Functions
#include <sys/socket.h>
ssize_t recv (int sockfd, void *buff, size_t nbytes, int flags);
ssize_t send (int sockfd, const void *buff, size_t nbytes, int flags);
[Table of flags for recv and send, truncated in source; the surviving fragment is "MSG_DONT…".]
readv and writev Functions
#include <sys/uio.h>
ssize_t readv (int filedes, const struct iovec *iov, int iovcnt);
ssize_t writev (int filedes, const struct iovec *iov, int iovcnt);
struct iovec {
void *iov_base; /* starting address of buffer */
size_t iov_len; /* size of buffer */
};
– readv and writev let us read into or write from one or more
buffers with a single function call.
– These operations are called scatter read and gather write.
readv and writev Functions
– The readv and writev functions can be used with any descriptor, not just sockets.
– writev is an atomic operation. For a record-based protocol such as UDP, one call to
writev generates a single UDP datagram.
– One use of writev relates to the TCP_NODELAY socket option:
• A write of 4 bytes followed by a write of 396 bytes could invoke the Nagle algorithm; a
preferred solution is to call writev for the two buffers.
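A gather write of a small header plus a larger body might look like the sketch below. The helper name send_record is hypothetical; the point is only that both buffers go out through one writev call rather than two write calls.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write a header and its body with a single
 * writev() call. Returns the total number of bytes written, or -1. */
ssize_t send_record(int fd, const void *hdr, size_t hdrlen,
                    const void *body, size_t bodylen)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;   /* iov_base is not const-qualified */
    iov[0].iov_len  = hdrlen;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = bodylen;
    return writev(fd, iov, 2);       /* gather write of both buffers */
}
```

For example, send_record(sockfd, hdr, 4, body, 396) hands all 400 bytes to the kernel in one call.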
Nagle’s Algorithm
struct msghdr {
void *msg_name; /* protocol address */
socklen_t msg_namelen; /* size of protocol address */
struct iovec *msg_iov; /* scatter/gather array */
size_t msg_iovlen; /* # elements in msg_iov */
void *msg_control; /* ancillary data; must be aligned
for a cmsghdr structure */
socklen_t msg_controllen; /* length of ancillary data */
int msg_flags; /* flags returned by recvmsg() */
};
recvmsg and sendmsg
[Figure 13.8: Data structures when recvmsg is called. msghdr{}: msg_name, msg_namelen = 16, msg_iov, msg_iovlen = 3, msg_control, msg_controllen = 20, msg_flags = 0. msg_iov points to an array of three iovec{} entries (iov_base, iov_len) with lengths 100, 60, and 80.]
recvmsg and sendmsg
[Figure 13.9: Update of Figure 13.8 when recvmsg r… (caption truncated in source). msg_name now points to a sockaddr_in{} holding 16, AF_INET, 2000, 198.69.10.2. The msghdr{} and the three iovec{} entries (lengths 100, 60, 80) are as before. The control buffer holds a cmsghdr{} with cmsg_len = 16, cmsg_level = IPPROTO_IP, cmsg_type = IP_RECVDSTADDR, followed by the address 206.62.226.35.]
Ancillary Data
• Ancillary data can be sent and received using the msg_control and
msg_controllen members of the msghdr structure with sendmsg and recvmsg
functions.
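As an illustration of msg_control and msg_controllen in use, the sketch below passes one descriptor over a Unix domain socket with SCM_RIGHTS. The function names send_fd_sketch and recv_fd_sketch are hypothetical and the one-byte payload is an arbitrary choice; the CMSG_SPACE, CMSG_LEN, CMSG_FIRSTHDR, and CMSG_DATA macros do the size and alignment arithmetic.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send descriptor fd_to_send over Unix domain socket sock. */
int send_fd_sketch(int sock, int fd_to_send)
{
    /* The union forces cmsghdr alignment on the control buffer. */
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    char byte = 0;                       /* at least one data byte is sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_len = CMSG_LEN(sizeof(int)); /* header plus data, no trailing pad */
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns it, or -1 on failure. */
int recv_fd_sketch(int sock)
{
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctl;
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The received descriptor is a new descriptor in the receiving process that refers to the same open file as the one that was sent.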
[Table of ancillary data by protocol, truncated in source; the surviving row begins "IPv4, IP…".]
Ancillary Data
[Figure 13.12: Ancillary data containing two ancillar… (caption truncated in source). msg_control points to two consecutive ancillary data objects, and msg_controllen covers both. Each object is a cmsghdr{} (cmsg_len, cmsg_level, cmsg_type) followed by pad, data, and trailing pad. cmsg_len, as computed by CMSG_LEN(), spans the header through the data; CMSG_SPACE() additionally includes the trailing pad.]
How Much Data Is Queued?
• nonblocking I/O
• MSG_PEEK with MSG_DONTWAIT flag
• FIONREAD command of ioctl
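The last two techniques can be sketched as small helpers. The function names queued_bytes and peek_bytes are hypothetical; the calls themselves (ioctl with FIONREAD, recv with MSG_PEEK | MSG_DONTWAIT) are the ones named above.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Number of bytes currently queued for reading on sockfd, or -1 on error. */
int queued_bytes(int sockfd)
{
    int n = 0;

    if (ioctl(sockfd, FIONREAD, &n) < 0)
        return -1;
    return n;
}

/* Copy up to len queued bytes into buf without consuming them and
 * without blocking. Returns the byte count, or -1 if nothing is
 * queued (errno EWOULDBLOCK/EAGAIN) or on another error. */
ssize_t peek_bytes(int sockfd, void *buf, size_t len)
{
    return recv(sockfd, buf, len, MSG_PEEK | MSG_DONTWAIT);
}
```

Either value is only a snapshot: more data can arrive between the query and the following read.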
Sockets and Standard I/O
• Standard I/O streams can be used with sockets, but there are a few
items to consider.
– A standard I/O stream can be created from any descriptor by calling the
fdopen function. Similarly, given a standard I/O stream, we can obtain the
corresponding descriptor by calling fileno.
– The problem with the fseek, fsetpos, and rewind functions is that they all call lseek, which fails on a
socket.
– The easiest way to handle this read-write problem is to open two standard
I/O streams for a given socket: one for reading, and one for writing.
Standard I/O buffers
• Fully buffered: I/O takes place only when the buffer (8192 bytes) is
full, or when fflush() or exit() is called.
• Line buffered: I/O takes place when a newline is
encountered, or when fflush() or exit() is called.
• Unbuffered: I/O takes place each time a standard I/O
output function is called.
Standard I/O buffers
• Standard error is always unbuffered.
• Standard input and standard output are fully buffered,
unless they refer to a terminal device, in which case
they are line buffered.
• All other streams are fully buffered unless they refer
to a terminal device, in which case they are line
buffered.
Sockets and Standard I/O
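#include "unp.h"
void
str_echo(int sockfd)
{
char line[MAXLINE];
FILE *fpin, *fpout;
fpin = Fdopen(sockfd, "r"); /* one stream for reading */
fpout = Fdopen(sockfd, "w"); /* one stream for writing */
for ( ; ; ) {
if (Fgets(line, MAXLINE, fpin) == NULL)
return; /* connection closed by other end */
Fputs(line, fpout);
}
}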
Chapter 12.
Daemon Processes
and inetd Superserver
12.1 Introduction
• A daemon is a process that runs in the background and is
independent of control from all terminals.
• There are numerous ways to start a daemon
1. the system initialization scripts ( /etc/rc )
2. the inetd superserver
3. the cron daemon
4. the at command
5. from user terminals
[Figure: syslogd inputs and outputs. Inputs: a Unix domain socket (/dev/log), a UDP socket (port 514), and /dev/klog. Outputs: files in the filesystem (/var/log/messages), the console, and a remote syslogd.]
12.3 syslog function
#include <syslog.h>
void syslog(int priority, const char *message, . . . );
• Log messages have a level between 0 and 7.
level value description
LOG_EMERG 0 system is unusable (highest priority)
LOG_ALERT 1 action must be taken immediately
LOG_CRIT 2 critical conditions
LOG_ERR 3 error conditions
LOG_WARNING 4 warning conditions
LOG_NOTICE 5 normal but significant condition (default)
LOG_INFO 6 informational
LOG_DEBUG 7 debug-level message (lowest priority)
Figure 12.1 level of log messages.
12.3 syslog function
• A facility identifies the type of process sending the
message.
facility description
LOG_AUTH security / authorization messages
LOG_AUTHPRIV security / authorization messages (private)
LOG_CRON cron daemon
LOG_DAEMON system daemons
LOG_FTP FTP daemon
LOG_KERN kernel messages
LOG_LOCAL0 local use
LOG_LOCAL1 local use
LOG_LOCAL2 local use
LOG_LOCAL3 local use
LOG_LOCAL4 local use
LOG_LOCAL5 local use
LOG_LOCAL6 local use
LOG_LOCAL7 local use
LOG_LPR line printer system
LOG_MAIL mail system
LOG_NEWS network news system
LOG_SYSLOG messages generated internally by syslog
LOG_USER random user-level messages (default)
LOG_UUCP UUCP system
Figure 12.2 facility of log messages.
12. 3 syslog function
• openlog and closelog
– openlog can be called before the first call to syslog, and
closelog can be called when the application has finished
sending log messages.
[Table of openlog options, truncated in source; the surviving entry is LOG_CONS.]
#include <syslog.h>
void openlog(const char *ident, int options, int facility);
void closelog(void);
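A typical calling sequence might look like the sketch below. The function name log_startup, the message text, and the choice of LOG_PID and LOG_DAEMON are illustrative assumptions; the second syslog call shows a level ORed with a facility to override the default given to openlog.

```c
#include <syslog.h>

/* Hypothetical helper: log two messages under the identity ident. */
void log_startup(const char *ident, int port)
{
    /* LOG_PID adds the process ID to each message; LOG_DAEMON becomes
     * the default facility for later syslog() calls. */
    openlog(ident, LOG_PID, LOG_DAEMON);

    /* Level only: the facility defaults to the one given to openlog. */
    syslog(LOG_NOTICE, "server started on port %d", port);

    /* Level ORed with an explicit facility. */
    syslog(LOG_INFO | LOG_LOCAL2, "configuration loaded");

    closelog();
}
```

Where the messages end up depends on the syslogd configuration on the host, so this sketch makes no claim about the destination.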
Unix Login
Process Group
• process group is a collection of one or more
processes, usually associated with the same job
• int setpgid(pid_t pid, pid_t pgid);
• pid_t getpgid(pid_t pid);
• It is possible for a process group leader to create a
process group, create processes in the group, and
then terminate. The process group still exists, as long
as at least one process is in the group, regardless of
whether the group leader terminates
Process Groups in a Session
• pid_t setsid(void);
• This function returns an error if the caller is already a
process group leader.
• To ensure this is not the case, the usual practice is to
call fork and have the parent terminate and the child
continue. We are guaranteed that the child is not a
process group leader, because the process group ID
of the parent is inherited by the child, but the child
gets a new process ID. Hence, it is impossible for the
child's process ID to equal its inherited process group
ID
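The fork-then-setsid pattern can be checked with a small sketch. The function name demo_setsid is hypothetical, and unlike a real daemon the parent here waits for the child only so that the result can be reported.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls setsid(). Returns 0 in the parent if the
 * child became a session leader and process group leader, else -1. */
int demo_setsid(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: it has a new process ID but the parent's process group
         * ID, so it cannot be a group leader and setsid() succeeds. */
        pid_t sid = setsid();
        _exit((sid == getpid() && getpgrp() == getpid()) ? 0 : 1);
    }
    /* Parent: a daemon's parent would simply exit here. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```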
Controlling Terminal
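12.4 daemon_init Function
#include "unp.h" /* Fork() and Signal() wrappers */
#include <syslog.h>
#define MAXFD 64
extern int daemon_proc; /* defined in error.c */
void daemon_init(const char *pname, int facility)
{
int i;
pid_t pid;
if ( (pid = Fork()) != 0)
exit(0); /* parent terminates */
/* 1st child continues */
setsid(); /* become session leader */
Signal(SIGHUP, SIG_IGN);
if ( (pid = Fork()) != 0)
exit(0); /* 1st child terminates */
/* 2nd child continues */
daemon_proc = 1; /* for our err_XXX() functions */
chdir("/"); /* change working directory */
umask(0); /* clear our file mode creation mask */
for (i = 0; i < MAXFD; i++)
close(i); /* close inherited descriptors */
openlog(pname, LOG_PID, facility);
}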
• Figure 12.7: steps performed by inetd: bind(), listen() (if TCP socket), select() for readability, accept() (if TCP socket), fork()
inetd service specification
• WAIT specifies that inetd should not look for new clients for
the service until the child (the real server) has terminated.
• TCP servers usually specify nowait - this means inetd can
start multiple copies of the TCP server program - providing
concurrency
• Most UDP services run with inetd told to wait until the child
server has died.
Broadcasting
Broadcasting 578
Broadcasting
• TCP works only with unicast addresses, UDP supports also broadcasting
and multicasting
Broadcasting
Types of Casting:
Unicast: One to One
Anycast: a set to one in a set
Multicast: a set to all in a set
Broadcast: all to all
Uses of Broadcasting
• Mainly used for resource discovery purposes (server is known to exist in the local
subnet, but IP address is not known)
Broadcast Address Types
• IPv4 address: {netid, subnetid, hostid}
– Subnet-directed broadcast address:
• {netid, subnetid, -1}, where -1 means all bits are 1s
• Example: netid = 128.7 and subnetid = 6 give the
broadcast address 128.7.6.255
• Normally, routers do not forward these broadcasts
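The "all host bits set to 1" rule can be written as a one-line computation. The helper name subnet_broadcast is hypothetical, and the /24 mask used in the example is an assumption that matches the 128.7.6 subnet above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Subnet-directed broadcast address: keep the network and subnet bits
 * and set every host bit to 1. Both arguments and the result are in
 * network byte order; bitwise OR and NOT are byte-order independent. */
in_addr_t subnet_broadcast(in_addr_t addr, in_addr_t mask)
{
    return addr | ~mask;
}
```

With address 128.7.6.5 and mask 255.255.255.0 this yields 128.7.6.255.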
Unicast Vs Broadcast
Unicast
[Figure: a unicast UDP datagram on subnet 128.7.6. The sending application calls sendto with dest IP 128.7.6.5 and dest port 7433; the receiving application is bound to port 7433. Hosts shown: 128.7.6.99 (Ethernet 02:60:8c:2f:4e:00) and 128.7.6.5 (Ethernet 08:00:20:03:f6:42), both with broadcast address 128.7.6.255. Frame on the wire (Enet hdr, IPv4 hdr, UDP hdr, UDP data): dest Enet 08:00:20:03:f6:42, frame type 0800, dest IP 128.7.6.5, protocol UDP, dest port 7433.]
Broadcast
[Figure: a broadcast UDP datagram on subnet 128.7.6. Hosts shown: 128.7.6.99 (Ethernet 02:60:8c:2f:4e:00) and 128.7.6.5 (Ethernet 02:60:20:03:f6:42), both with broadcast address 128.7.6.255; on both hosts the frame passes the datalink (frame type 0800) and IPv4 (protocol UDP) layers. Frame on the wire (Enet hdr, IPv4 hdr, UDP hdr, UDP data): dest Enet ff:ff:ff:ff:ff:ff, frame type 0800, dest IP 128.7.6.255, protocol UDP, dest port 520.]
Programming Requirements
• Enable broadcasting on the socket before sending:
Setsockopt(sockfd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)).
• IP fragmentation: BSD generates EMSGSIZE if the size of a broadcast datagram
exceeds the outgoing MTU
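Without the unp.h wrapper, setting the option looks like the sketch below. The function name udp_broadcast_socket is hypothetical; the option itself is the SO_BROADCAST call shown above.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket that is allowed to send to broadcast addresses.
 * Returns the descriptor, or -1 on error. */
int udp_broadcast_socket(void)
{
    const int on = 1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```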
Race Condition
void dg_cli(…) {
setsockopt(sockfd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));
signal(SIGALRM, func);
while (fgets(…) != NULL) {
sendto(…); /* Problem? */
alarm(1);
for ( ; ; ) {
if ((n = recvfrom(…)) < 0) {
if (errno == EINTR) break;
else err_sys(…);
} else {
recvline[n] = 0;
sleep(1); /* race: if SIGALRM is delivered here rather than while blocked in recvfrom, the next recvfrom blocks forever */
printf(…);
}
}
}
}
void func(int signo) { return; }
Solutions to Race Condition
2. pselect can be used: SIGALRM is first blocked, and pselect is
then called with an empty signal set as its last
argument.
3. Use the non-local goto siglongjmp to jump from the signal
handler back to the caller.
signal(SIGALRM, func);
while (fgets(…) != NULL) {
sendto(…);
alarm(5);
for ( ; ; ) {
if (sigsetjmp(jmpbuf, 1) != 0)
break;
n = recvfrom(…);
recvline[n] = 0;
printf(…);
}
}
void func(…) {
siglongjmp(jmpbuf, 1);
}
4. Use IPC from the signal handler to the function
void dg_cli(…) {
setsockopt(…);
pipe(pipefd);
FD_ZERO(&rset);
signal(SIGALRM, func);
while (fgets(…) != NULL) {
sendto(…);
alarm(5);
for ( ; ; ) {
FD_SET(sockfd, &rset);
FD_SET(pipefd[0], &rset);
if ((n = select(…)) < 0) {
if (errno == EINTR) continue; else err_sys(…); }
if (FD_ISSET(sockfd, &rset)) {
recvfrom(…); printf(…); }
if (FD_ISSET(pipefd[0], &rset)) {
read(pipefd[0], &n, 1); break; }
}
}
}
void func(int signo) {
write(pipefd[1], " ", 1); return; }
Multicasting
[Figure: IPv4 address classes, bits 0 to 31]
CLASS B: 1 0 NET-ID (14b) HOST-ID (16b)
CLASS C: 1 1 0 NET-ID (21b) HOST-ID (8b)
CLASS D: 1 1 1 0 GROUP-ID (28b)
The 32-bit Class D address is called the group address.
Multicasting 593
• A mapping from IPv4 multicast addresses to Ethernet addresses
is also defined
– High order 24 bits always 01:00:5e
– 25th bit is 0
– Low order 23 bits from lowest 23 bits of multicast group address
– Not one-to-one, many (32) multicast addresses to a single Ethernet
address
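The mapping can be written out directly. The helper name mcast_to_ether is hypothetical; the constants come from the rules above (prefix 01:00:5e, 25th bit 0, low-order 23 bits copied from the group address).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Map an IPv4 multicast group (network byte order) to its Ethernet
 * multicast address, written into mac[0..5]. */
void mcast_to_ether(struct in_addr group, uint8_t mac[6])
{
    uint32_t g = ntohl(group.s_addr);

    mac[0] = 0x01;                        /* high-order 24 bits: 01:00:5e */
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (uint8_t)((g >> 16) & 0x7f); /* 25th bit is 0, then 7 group bits */
    mac[4] = (uint8_t)((g >> 8) & 0xff);
    mac[5] = (uint8_t)(g & 0xff);
}
```

Because the upper 5 bits of the 28-bit group ID are discarded, 224.0.1.1 and 225.128.1.1 both map to 01:00:5e:00:01:01, which is why the mapping is not one-to-one.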
multicast address
• IPv4 class D address
– 224.0.0.0 ~ 239.255.255.255
– (224.0.0.1: all hosts group), (224.0.0.2: all-routers group)
Multicast Addresses Scope
Multicast Session
• Especially in the case of streaming multimedia, the combination
of an IP multicast address (either IPv4 or IPv6) and a transport-
layer port (typically UDP) is referred to as a session.
• For example, an audio/video teleconference may comprise two
sessions; one for audio and one for video. These sessions
almost always use different ports and sometimes also use
different groups for flexibility in choice when receiving.
Multicast vs Broadcast
[Figure: a multicast UDP datagram on subnet 128.7.6. The sending application calls sendto with dest IP 224.0.1.1 and dest port 123. The receiving application is bound to port 123 and has joined 224.0.1.1, so its interface receives 01:00:5e:00:01:01. At each host the datalink layer performs imperfect hardware filtering based on the destination Ethernet address, and the IPv4 layer performs perfect software filtering based on the destination IP. Ethernet addresses shown: 02:60:8c:2f:4e:00 and 02:60:20:03:f6:42. Frame on the wire (Enet hdr, IPv4 hdr, UDP hdr, UDP data): dest Enet 01:00:5e:00:01:01, frame type 0800, dest IP 224.0.1.1, protocol UDP, dest port 123.]
Multicasting on a WAN
[Figure: LANs interconnected across a WAN by multicast routers MR1 through MR5.]
Hosts joining a Multicast Group
[Figure: hosts H1 through H5 each join the group; the multicast routers MR1 through MR5 exchange this information using a multicast routing protocol (MRP).]
Sending packets on a WAN
[Figure: packets sent to the group travel across the WAN through the multicast routers to the LANs of the hosts H1 through H5 that joined.]
Multicasting
• Specifically, note that:
– All interested multicast routers receive the packets; MR5 does not
receive any since there are no interested hosts on its LAN.
– Packets are put on a specific LAN only if there are hosts on that LAN
to receive them; MR3 only forwards.
– Multicast router MR2 both puts packets on its LAN for hosts H2 and H3
and makes a copy of the packets to forward to MR3.
– This behavior is unique to multicast forwarding.
Source-Specific Multicast
• Multicasting on a WAN has been difficult to deploy for several
reasons.
– The biggest problem is that the MRP needs to get the data from all
the senders, which may be located anywhere in the network, to all
the receivers, which may similarly be located anywhere.
– Another large problem is multicast address allocation: There are
not enough IPv4 multicast addresses to statically assign them to
everyone who wants one, as is done with unicast addresses.
Source-Specific Multicast
• Source-specific multicast (SSM) combines the group address with a system's source address, which solves the
problems as follows:
– The receivers supply the sender's source address to the routers as part of joining the
group.
– This removes the rendezvous problem from the network, as the network now knows
exactly where the sender is.
– However, it retains the scaling properties of not requiring the sender to know who all
the receivers are. This simplifies multicast routing protocols immensely.
• It redefines the identifier from simply being a multicast group address to being a
combination of a unicast source and multicast destination (which SSM now calls
a channel).
• An SSM session is the combination of source, destination, and port.
struct ip_mreq {
struct in_addr imr_multiaddr; /* IPv4 class D multicast addr */
struct in_addr imr_interface; /* IPv4 addr of local interface */
};
struct ipv6_mreq {
struct in6_addr ipv6mr_multiaddr; /* IPv6 multicast addr */
unsigned int ipv6mr_interface; /* interface index, or 0 */
};
struct group_req {
unsigned int gr_interface; /* interface index, or 0 */
struct sockaddr_storage gr_group; /* IPv4 or IPv6 multicast addr */
};
struct ip_mreq_source {
struct in_addr imr_multiaddr; /* IPv4 class D multicast addr */
struct in_addr imr_sourceaddr; /* IPv4 source addr */
struct in_addr imr_interface; /* IPv4 addr of local interface */
};
struct group_source_req {
unsigned int gsr_interface; /* interface index, or 0 */
struct sockaddr_storage gsr_group; /* IPv4 or IPv6 multicast addr */
struct sockaddr_storage gsr_source; /* IPv4 or IPv6 source addr */
};
Multicast Socket Options
Distributed Program Design
• Communication-Oriented Design (typical sockets approach)
– Design protocol first.
– Build programs that adhere to the protocol.
• Application-Oriented Design (RPC)
– Build application(s).
– Divide programs up and add communication protocols.
RPC
Remote Procedure Call
• Call a procedure (subroutine) that is running on
another machine.
• Issues:
– identifying and accessing the remote procedure
– parameters
– return value
Remote Subroutine
Client:
blah, blah, blah
bar = foo(a,b);
blah, blah, blah
(a protocol connects the client's call to the server's procedure)
Server:
int foo(int x, int y) {
if (x>100)
return(y-2);
else if (x>10)
return(y-x);
else
return(x+y);
}
Sun RPC
• There are a number of popular RPC specifications.
• Sun RPC (ONC RPC) is widely used.
• NFS (Network File System) is RPC based.
• Rich set of support tools.
Sun RPC Organization
Remote Program
deposit(DavesAccount,$100)
• svc_run() is a dispatcher.
• A dispatcher waits for incoming connections and
invokes the appropriate function to handle each
incoming request.
High-Level Library Limitation
• The High-Level RPC library calls support UDP only
(no TCP).
• You must use lower-level RPC library functions to
use TCP.
• The High-Level library calls do not support any kind
of authentication.
Low-level RPC Library
• Full control over all IPC options
– TCP & UDP
– Timeout values
– Asynchronous procedure calls
• Multi-tasking Servers
• Broadcasting
[Figure: a protocol description input file is fed to rpcgen.]
[Figure: XDR layouts. Fixed-length array int x[n]: x0 x1 x2 ... xn-1. Variable-length data: a count n followed by s0 s1 s2 s3 ... sn-1.]
[Legend (color code): keywords; generated symbolic constants; names used to generate stub and procedure names.]
• Procedure #0 is created for you automatically.
– Start at procedure #1!
Client stub:
rtype *proc_1(arg *, CLIENT *);
Server procedure:
rtype *proc_1_svc(arg *,
struct svc_req *);
Program Numbers
• Use something like:
555555555 or 22222222
• Type is CLIENT *
clnt_create
CLIENT *clnt_create(
char *host, /* hostname of server */
u_long prog, /* program number */
u_long vers, /* version number */
char *proto);
• Or Not!
RPC without rpcgen
• Can do asynchronous RPC
– Callbacks
– Single process is both client and server.
• Write your own dispatcher (and provide concurrency)
• Can establish control over many network parameters:
protocols, timeouts, resends, etc.
rpcinfo
rpcinfo -p host prints a list of all registered
programs on host.
rpcinfo -[ut] host program# makes a call to
procedure #0 of the specified RPC program (RPC
ping).
u : UDP
t : TCP
Sample Code
• simple – integer add and subtract
• ulookup – look up username and uid.
• varray – variable length array example.
• linkedlist – arg is linked list.
Example simp
• Standalone program simp.c
– Takes 2 integers from command line and prints out the sum
and difference.
– Functions:
int add( int x, int y );
int subtract( int x, int y );
Splitting simp.c
• Move the functions add() and subtract() to the server.
program SIMP_PROG {
version SIMP_VERSION {
int ADD(operands) = 1;
int SUB(operands) = 2;
} = VERSION_NUMBER;
} = 555555555;
rpcgen -C simp.x
rpcgen reads simp.x and generates:
simp_clnt.c client stubs
simp.h header file
simp_xdr.c XDR filters
simp_svc.c server skeleton
xdr_operands XDR filter
bool_t xdr_operands( XDR *xdrs,
operands *objp){
if (!xdr_int(xdrs, &objp->x))
return (FALSE);
if (!xdr_int(xdrs, &objp->y))
return (FALSE);
return (TRUE);
}
simpclient.c
• This was the main program – is now the client.
• Reads 2 ints from the command line.
• Creates a RPC handle.
• Calls the remote add and subtract procedures.
• Prints the results.
simpservice.c
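[Figure: demultiplexing up the protocol stack. The frame type selects the network protocol above the MAC address; the IP protocol field selects ICMP (1), IGMP (2), TCP (6), EGP (8), UDP (17), IPv6 (41), or OSPF (89); the TCP or UDP port number selects the application, with ports 21, 23, 25, 53, 67, 69, and 161 shown (BOOTP and DHCP among them).]
[Figure: ICMP (ping, echo, timestamp) and IGMP sit beside the TCP and UDP stacks, which deliver to user TCP and UDP processes by port; protocol numbers shown are 1 ICMP, 2 IGMP, 6 TCP, 17 UDP, 89 OSPF.]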
What can raw sockets do?
• Bypass TCP/UDP layers
• Read and write ICMP and IGMP packets
– ping, traceroute, multicast routing daemon
• Read and write IP datagrams with an IP protocol field not processed by the
kernel
– OSPF
• Send and receive your own IP packets with your own IP header using the
IP_HDRINCL socket option
– can build and send TCP and UDP packets
– testing, hacking
– only superuser can create raw socket though
• You need to do all protocol processing at user-level
Creating Raw Sockets
• Only Superuser can create
• socket(AF_INET, SOCK_RAW, protocol)
– where protocol is one of the constants, IPPROTO_xxx, such as
IPPROTO_ICMP.
• bind can be called on the raw socket, but this is rare. This
function sets only the local address: There is no concept of a
port number with a raw socket.
• connect can be called on the raw socket, but this is rare. This
function sets only the foreign address: Again, there is no
concept of a port number with a raw socket.
• We set the identifier to the PID of the ping process and increment the sequence
number by one for each packet we send.
• We store the 8-byte timestamp of when the packet is sent as the optional data.
The rules of ICMP require that the identifier, sequence number, and any optional
data be returned in the echo reply.
• Storing the timestamp in the packet lets us calculate the RTT when the reply is
received.
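[Figure: structure of the ping program. main enters an infinite receive loop (the read loop calls recvfrom, then proc_v4); the SIGALRM handler sig_alrm calls send_v4 to send an echo request once a second.]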
BPF datalink
Accessing a BPF: open a BPF device, then use ioctl to set properties such as
loading the filter, setting the read timeout, setting the buffer size, attaching a datalink to the BPF, and enabling
promiscuous mode.
Disadvantages:
1. No kernel buffering, hence more system calls
2. No device filtering, hence ETH_P_IP will give
packets from Ethernet, PPP, SLIP links, and loopback
devices
Ping Program
• Create a raw socket to send/receive ICMP echo
request and echo reply packets
• Install SIGALRM handler to process output
– Sending echo request packets every t second
– Build ICMP packets (type, code, checksum, id,
seq, sending timestamp as optional data)
• Enter an infinite loop processing input
– Use recvmsg() to read from the network
– Parse the message and retrieve the ICMP packet
– Print ICMP packet information, e.g., peer IP
address, round-trip time
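The packet-building step can be sketched without a raw socket. The function names in_cksum and build_echo_request are hypothetical, and the header is assembled byte by byte to avoid depending on a platform-specific ICMP struct; the checksum is the standard Internet checksum (RFC 1071).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum: one's-complement of the one's-complement sum of
 * the data taken as big-endian 16-bit words. */
uint16_t in_cksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;          /* pad the odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);  /* fold the carries back in */
    return (uint16_t)~sum;
}

/* Build an ICMP echo request in pkt, which must hold 8 + datalen bytes.
 * Returns the packet length. */
size_t build_echo_request(uint8_t *pkt, uint16_t id, uint16_t seq,
                          const void *data, size_t datalen)
{
    uint16_t ck;

    pkt[0] = 8;                              /* type: echo request */
    pkt[1] = 0;                              /* code */
    pkt[2] = 0;                              /* checksum is zero while computing */
    pkt[3] = 0;
    pkt[4] = (uint8_t)(id >> 8);             /* identifier, e.g. the PID */
    pkt[5] = (uint8_t)(id & 0xff);
    pkt[6] = (uint8_t)(seq >> 8);            /* sequence number */
    pkt[7] = (uint8_t)(seq & 0xff);
    memcpy(pkt + 8, data, datalen);          /* optional data, e.g. a timestamp */

    ck = in_cksum(pkt, 8 + datalen);
    pkt[2] = (uint8_t)(ck >> 8);
    pkt[3] = (uint8_t)(ck & 0xff);
    return 8 + datalen;
}
```

A receiver can validate a packet by running in_cksum over the whole message, checksum field included; a result of 0 means the checksum is consistent.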
Traceroute program
Lecture#8
Problem 1
• This problem is about implementing a local chat server and client in a system. The server
and client will facilitate the communication between multiple users of the system. You
should submit client_idno.c and server_idno.c for client and server respectively.
• The chat server supports the following functionalities.
• let us say currently users B, C and D have entered chat server. Then user A joins chat.
Server will tell all the current chatters B, C and D: ‘A just joined’
– command: connect <username>
• A can say a message to every one “Hello! Everyone!” or A can whisper a message to C
alone ‘I want to tell a secret to you’. So server should facilitate one to all and one to one
communication.
– Command: talk * //to talk to all chatters
– Command: talk <username> to talk to one user
• A can also get the list of all chatters.
– Command: list
• A can disconnect from chat
– Command: disconnect
Problem 2
• The server program should
• start like ./server <path>
• use either FIFOs or message queues for inter-process communication,
since it runs within a single system.
• use the select() call to deal with multiple users concurrently
• The client program should
• start like ./client <serverpath>
• take care of interpreting the commands entered by the user.
• process commands until Ctrl-D is pressed. When a user types and then
presses <ENTER>, that is the end of one message, but the program still
waits for the next message until the user presses Ctrl-D (EOF for fgets()).
• handle sending and receiving simultaneously. Any
messages received while the user is typing a message to be sent are
simply flashed on the console.
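A minimal sketch of the FIFO plus select() combination required above. The helper names open_server_fifo and wait_readable are assumptions for this example; opening the read end with O_NONBLOCK is one way to avoid blocking in open() until a writer appears, not the only one.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create the FIFO at path if it does not exist yet and open its read end.
 * O_NONBLOCK lets open() return even though no writer is connected.
 * Returns the descriptor, or -1 on error. */
int open_server_fifo(const char *path)
{
    assert(path != NULL);
    if (mkfifo(path, 0600) < 0 && errno != EEXIST)
        return -1;
    return open(path, O_RDONLY | O_NONBLOCK);
}

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int n;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rset)) ? 1 : 0;
}
```

A real server or client would put several descriptors in the same fd_set (for example standard input and the FIFO) so that one select() call covers both typing and incoming messages.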
Problem 3
• A simple TCP-based chat server could allow two users to use
any TCP client (telnet, for example) to communicate with each
other. Consider a single-process, single-threaded server that
supports exactly 2 clients at once. The server simply forwards
whatever is sent from one client to the other (in both directions):
as soon as something is sent from one client, it is immediately
forwarded to the other client. As soon as either client terminates
the connection, the server exits. Provide server code with
comments.
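The forwarding core of such a server might look like the sketch below. It assumes the two client connections have already been accepted and are passed in as descriptors a and b; the names relay and write_all are hypothetical, and socket setup is left out.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Write all len bytes, retrying after short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Forward data between a and b in both directions until either side
 * closes its connection. Returns 0 on a clean close, -1 on error. */
int relay(int a, int b)
{
    char buf[4096];
    int maxfd = (a > b ? a : b) + 1;

    assert(a >= 0 && b >= 0);
    for (;;) {
        fd_set rset;
        ssize_t n;

        FD_ZERO(&rset);
        FD_SET(a, &rset);
        FD_SET(b, &rset);
        /* Block until at least one client has sent something. */
        if (select(maxfd, &rset, NULL, NULL, NULL) < 0)
            return -1;
        if (FD_ISSET(a, &rset)) {
            n = read(a, buf, sizeof buf);
            if (n <= 0)                 /* 0: a closed, <0: error */
                return (int)n;
            if (write_all(b, buf, (size_t)n) < 0)
                return -1;
        }
        if (FD_ISSET(b, &rset)) {
            n = read(b, buf, sizeof buf);
            if (n <= 0)
                return (int)n;
            if (write_all(a, buf, (size_t)n) < 0)
                return -1;
        }
    }
}
```

Using select() keeps the server in one process and one thread, as the problem requires, because it never blocks reading from one client while the other has data waiting.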
Problem 4
1. When the server starts, it reads from a file a list of domain
names to which access is forbidden. When an HTTP
request comes to the server, for example for
http://discovery.bits-pilani.ac.in/index.html, it checks whether the
domain name "discovery.bits-pilani.ac.in" exists in the list. If it does,
the server sends back HTTP error 403 Forbidden to the client. If
not, it sends the request to the actual server. When it gets the
reply, it sends the reply to the client.
2. Your server takes a port number on the command line. It can be
an iterative server.
3. Your server will be tested with a browser.
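The domain check in item 1 can be sketched as below. This assumes the browser sends a proxy-style request line containing the absolute URL (as in the example above); extract_host and is_forbidden are hypothetical helper names, and reading the list from the file is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Copy the host part of the first "http://host..." in request into host.
 * Returns 0 on success, -1 if no host is found or it does not fit. */
int extract_host(const char *request, char *host, size_t cap)
{
    static const char scheme[] = "http://";
    const char *p;
    size_t n;

    assert(request != NULL && host != NULL);
    p = strstr(request, scheme);
    if (p == NULL)
        return -1;
    p += sizeof scheme - 1;
    n = strcspn(p, "/: \r\n");          /* host ends at path, port or space */
    if (n == 0 || n >= cap)
        return -1;
    memcpy(host, p, n);
    host[n] = '\0';
    return 0;
}

/* Return 1 if host appears in the forbidden list (case-insensitive). */
int is_forbidden(const char *host, const char *const *list, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strcasecmp(host, list[i]) == 0)
            return 1;
    return 0;
}
```

When is_forbidden() returns 1 the server would reply with a "403 Forbidden" status line instead of contacting the origin server.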
Problem 5
• Suppose you are given the task of testing the validity of links in a given web
page. You are expected to test each URL present in the web page and
report the result. A URL is of this form:
• http://<domain name>/<directory1>/<directory2>/ … /<filename>
• Testing a URL for validity means testing the existence of the domain name and
the existence of the file at the given path on the remote server.
• To simplify the problem, you can take a list of URLs in a file, one URL per
line. Your program takes this file name as a command-line argument. Your
program should read each URL and validate it. The result is one
of {VALID, INCORRECT DOMAIN, FILE DOESNT EXIST}. Your
program should display each URL and its result, one URL and result
per line on the console.
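A sketch of the non-network parts of the validator: splitting a URL of the form above and mapping an HTTP status line to a result. All three function names are hypothetical. Treating 2xx and 3xx status codes as VALID is an assumption of this example, and the domain lookup (for example with getaddrinfo()) that yields INCORRECT DOMAIN is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "http://host/path" into host and path ("/" when no path is given).
 * Returns 0 on success, -1 on a malformed URL or a buffer that is too small. */
int split_url(const char *url, char *host, size_t hcap,
              char *path, size_t pcap)
{
    static const char scheme[] = "http://";
    size_t slen = sizeof scheme - 1;
    const char *h, *p;
    size_t hlen, plen;

    assert(url != NULL && host != NULL && path != NULL);
    if (strncmp(url, scheme, slen) != 0)
        return -1;
    h = url + slen;
    hlen = strcspn(h, "/");             /* host runs up to the first '/' */
    p = h + hlen;
    plen = strlen(p);
    if (plen == 0) {
        p = "/";
        plen = 1;
    }
    if (hlen == 0 || hlen >= hcap || plen >= pcap)
        return -1;
    memcpy(host, h, hlen);
    host[hlen] = '\0';
    memcpy(path, p, plen + 1);
    return 0;
}

/* Extract the numeric code from a status line such as "HTTP/1.1 404 Not Found". */
int parse_status(const char *status_line)
{
    int major, minor, code;

    if (sscanf(status_line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}

/* Assumed mapping: success and redirect codes count as VALID. */
const char *result_for_status(int code)
{
    return (code >= 200 && code < 400) ? "VALID" : "FILE DOESNT EXIST";
}
```

Lines read from the URL file should have their trailing newline removed before being passed to split_url().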
Problem 6
Consider the following network. There are n nodes connected in a ring topology.
Communication with any node in the network happens in the clockwise direction, i.e. through the
next node. Each node shares a set of files.
The nodes communicate using SUN RPC. When a node joins the network it invokes
connectMe() on the next node and the previous node. The next node and previous node
addresses are supplied as command-line arguments. When a node searches for a file, it invokes
void* search(Node n, char* filename)
{
    if search is successful then
        return the result set
    else
        return search(nextNode(n), filename);
}
Write the protocol file. Take the help of rpcgen. Develop rpcclient and rpcserver. The demonstration
should print all communications on the console, indicating the IP, port, file, etc.
ISZC462
Tutorial 2
EC1 solutions
Q1
• Write TCP client and server programs for the
following. The connection between client and
server is persistent, i.e. multiple requests are sent
on the same connection. The client sends N
integers to the server. The server sums them
and sends the result back to the client. The server
handles clients concurrently and does not leave
zombie processes behind. [10]
Q1 Ans
Protocol:
Client -> server: 4 bytes: N, 4 bytes: 1st int, 4 bytes: 2nd int, … until last integer
Server -> client: 4 bytes: result
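The message layout above can be sketched as a pair of helpers. The names encode_request and sum_request are hypothetical. One deliberate difference from the layout as stated: this sketch sends every 4-byte field in network byte order (htonl/ntohl), which is an assumption added here so that hosts with different endianness agree; the protocol line itself does not specify a byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode N followed by the N integers into out, 4 bytes each, in network
 * byte order. out must hold 4 * (count + 1) bytes. Returns the length. */
size_t encode_request(const int32_t *vals, uint32_t count, unsigned char *out)
{
    uint32_t word = htonl(count);
    uint32_t i;

    assert(vals != NULL && out != NULL);
    memcpy(out, &word, 4);
    for (i = 0; i < count; i++) {
        word = htonl((uint32_t)vals[i]);
        memcpy(out + 4 * ((size_t)i + 1), &word, 4);
    }
    return 4 * ((size_t)count + 1);
}

/* Decode a complete request and return the sum the server would send back. */
int32_t sum_request(const unsigned char *in, size_t len)
{
    uint32_t word, count, i;
    uint32_t sum = 0;               /* unsigned arithmetic wraps instead of overflowing */

    assert(in != NULL && len >= 4);
    memcpy(&word, in, 4);
    count = ntohl(word);
    assert(len == 4 * ((size_t)count + 1));
    for (i = 0; i < count; i++) {
        memcpy(&word, in + 4 * ((size_t)i + 1), 4);
        sum += ntohl(word);
    }
    return (int32_t)sum;
}
```

Because N comes first, the receiver knows exactly how many more 4-byte integers to read before it replies, which is what lets several requests share one persistent connection.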
/*Client.c*/
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAXN 63			/* buf holds N plus at most MAXN integers */

void error(const char *msg)
{
perror(msg);
exit(1);
}
int main(int argc, char *argv[])
{
int sockfd, portno, n, i, N, result;
int buf[MAXN + 1];
struct sockaddr_in serv_addr;
struct hostent *server;
if (argc < 3) {
fprintf(stderr,"usage: %s hostname port\n", argv[0]);
exit(1);
}
portno = atoi(argv[2]);
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
error("ERROR opening socket");
server = gethostbyname(argv[1]);
if (server == NULL) {
fprintf(stderr,"ERROR, no such host\n");
exit(1);
}
memset(&serv_addr, 0, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);
serv_addr.sin_port = htons(portno);
if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
error("ERROR connecting");
/*Protocol implementation: one request per iteration, same connection*/
printf("Enter number of integers:");
while (scanf("%d", &N) == 1 && N > 0 && N <= MAXN) {
buf[0] = N;
for (i = 0; i < N; i++) {
printf("Enter number %d:", i + 1);
if (scanf("%d", &buf[i+1]) != 1) {
fprintf(stderr, "ERROR, invalid input\n");
exit(1);
}
}
if (write(sockfd, buf, (N + 1) * sizeof(int)) < 0)
error("ERROR writing to socket");
n = read(sockfd, &result, sizeof(result));
if (n < 0)
error("ERROR reading from socket");
if (n == 0) {
printf("Server terminated prematurely\n");
break;
}
printf("The result is: %d\n", result);
printf("Enter number of integers(-1 to exit):");
}
close(sockfd);
return 0;
}
Q1 Ans
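/*server.c*/
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

void
error (const char *msg)
{
  perror (msg);
  exit (1);
}

/* Reap every terminated child so that no zombies are left behind. */
void
sigchldhandler (int signo)
{
  int saved = errno;
  (void) signo;
  while (waitpid (-1, NULL, WNOHANG) > 0)
    ;
  errno = saved;
}

/* Read one 4-byte integer, retrying after short reads.
   Returns 1 on success, 0 on end of file, -1 on error. */
int
readint (int fd, int *out)
{
  char *p = (char *) out;
  size_t got = 0;
  while (got < sizeof (int))
    {
      ssize_t n = read (fd, p + got, sizeof (int) - got);
      if (n == 0)
	return 0;
      if (n < 0)
	{
	  if (errno == EINTR)
	    continue;
	  return -1;
	}
      got += n;
    }
  return 1;
}

int
main (int argc, char *argv[])
{
  int ret, i, N, val, sum;
  int sockfd, newsockfd, portno;
  socklen_t clilen;
  struct sockaddr_in serv_addr, cli_addr;
  signal (SIGCHLD, sigchldhandler);
  if (argc < 2)
    {
      fprintf (stderr, "ERROR, no port provided\n");
      exit (1);
    }
  sockfd = socket (AF_INET, SOCK_STREAM, 0);
  if (sockfd < 0)
    error ("ERROR opening socket");
  memset (&serv_addr, 0, sizeof (serv_addr));
  portno = atoi (argv[1]);
  serv_addr.sin_family = AF_INET;
  serv_addr.sin_addr.s_addr = htonl (INADDR_ANY);
  serv_addr.sin_port = htons (portno);
  if (bind (sockfd, (struct sockaddr *) &serv_addr, sizeof (serv_addr)) < 0)
    error ("ERROR on binding");
  if (listen (sockfd, 5) < 0)
    error ("ERROR on listen");
  for (;;)
    {
      clilen = sizeof (cli_addr);
      newsockfd = accept (sockfd, (struct sockaddr *) &cli_addr, &clilen);
      if (newsockfd < 0)
	{
	  if (errno == EINTR)	/* interrupted by SIGCHLD: retry */
	    continue;
	  error ("ERROR on accept");
	}
      printf ("connection is accepted\n");
      ret = fork ();
      if (ret < 0)
	error ("ERROR on fork");
      if (ret == 0)
	{
	  /* Child: serve requests until the client closes the connection. */
	  close (sockfd);
	  while (readint (newsockfd, &N) == 1)
	    {
	      printf ("N=%d\n", N);
	      sum = 0;
	      for (i = 0; i < N; i++)
		{
		  if (readint (newsockfd, &val) != 1)
		    {
		      fprintf (stderr, "ERROR reading from socket\n");
		      exit (1);
		    }
		  printf ("val[%d]=%d\n", i, val);
		  sum = sum + val;
		}
	      printf ("sum=%d\n", sum);
	      if (write (newsockfd, &sum, sizeof (sum)) < 0)
		error ("ERROR writing to socket");
	    }
	  close (newsockfd);
	  exit (0);
	}
      /* Parent: the child owns this connection now. */
      close (newsockfd);
    }
}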
Q2
1. Write a complete program to implement the
shell command ls -l | grep ^d | wc -l, which
displays the number of subdirectories in
the current directory. Use system calls
such as exec and pipes for inter-process
communication. [8]
Q2 Ans
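#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Close both ends of both pipes once stdin/stdout have been redirected. */
static void
closeall (int p1[2], int p2[2])
{
  close (p1[0]);
  close (p1[1]);
  close (p2[0]);
  close (p2[1]);
}

int
main (void)
{
  int pid, p1[2], p2[2];	/* p2: ls -> grep, p1: grep -> wc */
  if (pipe (p1) < 0 || pipe (p2) < 0)
    {
      perror ("pipe");
      exit (1);
    }
  pid = fork ();
  if (pid == 0)
    {
      pid = fork ();
      if (pid > 0)
	{
	  /* grep runs concurrently with ls; waiting for ls to finish
	     first could deadlock once the pipe buffer fills up */
	  dup2 (p2[0], 0);
	  dup2 (p1[1], 1);
	  closeall (p1, p2);
	  execlp ("grep", "grep", "^d", (char *) NULL);
	}
      else if (pid == 0)
	{
	  dup2 (p2[1], 1);
	  closeall (p1, p2);
	  execlp ("ls", "ls", "-l", (char *) NULL);
	}
    }
  else if (pid > 0)
    {
      dup2 (p1[0], 0);
      closeall (p1, p2);
      execlp ("wc", "wc", "-l", (char *) NULL);
    }
  perror ("fork/exec");		/* reached only if fork or exec failed */
  return 1;
}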
Q3
What is a connected UDP socket? How is it created?
What are the advantages of using it?