Transport Layer: TCP/UDP: Chapter 24, 16
Transport Layer
• Purpose of transport layer services:
– multiplexing/demultiplexing
– reliable data transfer
– flow control
– congestion control
• Connection-less transport: UDP
• Connection-oriented transport: TCP
– reliable transfer
– flow control
– connection management
Transport services and protocols
• provide logical communication between application
processes running on different hosts
• transport protocols run in end systems via software
• transport vs network layer services:
[Figure: logical end-to-end transport between the application/transport/network/data link/physical stacks of two hosts, across intermediate nodes that have only network, data link, and physical layers]
Transport-layer protocols
Internet transport services:
• reliable, in-order unicast delivery (TCP)
– congestion control
– flow control
– connection setup
• unreliable (“best-effort”) delivery (UDP)
• services not available:
– real-time
– bandwidth guarantees
– reliable multicast
Multiplexing/demultiplexing
Recall: segment - unit of data exchanged between
transport layer entities
– aka TPDU: transport protocol data unit or “packet”
Demultiplexing: delivering received segments to
correct app layer processes
[Figure: application-layer data M gets a segment header Ht at the transport layer and a header Hn at the network layer; at the receiver, segments are delivered to processes P1 to P4]
Multiplexing/demultiplexing
Multiplexing: gathering data from multiple app processes,
enveloping data with header (later used for demultiplexing)
multiplexing/demultiplexing:
• based on sender, receiver port numbers, IP addresses
– source, dest port #s in each segment
– recall: well-known port numbers for specific applications
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #; other header fields; application data (message)]
Multiplexing/demultiplexing: examples
port use: simple telnet app
– host A to server B: source port: x, dest. port: 23
– server B to host A: source port: 23, dest. port: x
port use: Web server
– Web client host A to Web server B: Source IP: A, Dest IP: B,
source port: x, dest. port: 80
– Web client host C to Web server B: Source IP: C, Dest IP: B,
source port: y, dest. port: 80
– Web client host C to Web server B: Source IP: C, Dest IP: B,
source port: x, dest. port: 80
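The Web server example can be sketched as a lookup table keyed by source IP, source port, destination IP, and destination port. This is only an illustrative sketch: the class name, the process names P1 to P3, and the concrete values chosen for ports x and y are assumptions, not part of the slides.

```python
from typing import Dict, Optional, Tuple

# (source IP, source port, dest IP, dest port) identifies one connection.
FourTuple = Tuple[str, int, str, int]


class Demultiplexer:
    """Toy demultiplexer: maps connection 4-tuples to receiving processes."""

    def __init__(self) -> None:
        self._table: Dict[FourTuple, str] = {}

    def register(self, src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int, process: str) -> None:
        self._table[(src_ip, src_port, dst_ip, dst_port)] = process

    def deliver(self, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int) -> Optional[str]:
        # Returns the process name, or None if no connection matches.
        return self._table.get((src_ip, src_port, dst_ip, dst_port))


demux = Demultiplexer()
X, Y = 1180, 1181  # made-up values standing in for ports x and y
demux.register("A", X, "B", 80, "P1")
demux.register("C", Y, "B", 80, "P2")
demux.register("C", X, "B", 80, "P3")

print(demux.deliver("A", X, "B", 80))  # P1
print(demux.deliver("C", X, "B", 80))  # P3
```

Hosts A and C both use source port x toward port 80 on B, yet the segments reach different processes because the source IP differs.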
UDP: more
• often used for streaming multimedia apps
– Controversial: no congestion control
• other UDP uses including (why?):
– DNS
– SNMP
• reliable transfer over UDP: add reliability at
application layer
– application-specific error recovery!
[Figure: UDP segment format, 32 bits wide: source port #, dest port #; length, checksum; application data (message). Length is in bytes of the UDP segment, including header.]
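A minimal sketch of the UDP segment format using Python's `struct` module. The helper names are hypothetical, and the checksum is left at 0 here rather than computed.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, dest port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_segment(src_port: int, dst_port: int, payload: bytes,
                      checksum: int = 0) -> bytes:
    # Length counts the 8-byte header plus the application data.
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment: bytes):
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


segment = build_udp_segment(5353, 53, b"hello")
print(len(segment))                # 13 (8 header bytes + 5 data bytes)
print(parse_udp_segment(segment))
```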
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted
segment
Sender:
• treat segment contents as sequence of 16-bit integers
• checksum: 1’s complement of the 1’s complement sum of
segment contents
• sender puts checksum value into UDP checksum field
Receiver:
• compute checksum of received segment
• check if computed checksum equals checksum field value:
– NO - error detected
• Toss data OR
• Pass to app with warning
– YES - no error detected.
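The sender and receiver steps can be sketched as follows. This is a simplified illustration: the function names are made up, and a real UDP checksum also covers a pseudo-header with the IP addresses, which this sketch leaves out.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Adds the data as 16-bit integers, wrapping carries back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def checksum16(data: bytes) -> int:
    """Sender side: the complement of the one's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def verify(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute and compare with the checksum field."""
    return checksum16(data) == checksum_field


data = bytes.fromhex("0001f203f4f5f6f7")
field = checksum16(data)
print(hex(field))                                        # 0x220d
print(verify(data, field))                               # True
print(verify(bytes.fromhex("0001f203f4f5f6f6"), field))  # False
```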
Connection Oriented Transport
Protocol Mechanisms
• Properties of connection-oriented Transport
Protocols:
– Logical connection
• Establishment
• Maintenance
• Termination
– Reliable
– e.g. TCP
Connection-Oriented Transport
via Reliable Network Layer
• Transport Layer Services like TCP are complicated – to
start, let’s first assume we are working with a reliable
network layer service
– e.g. reliable packet switched network using X.25
– e.g. frame relay using LAPF control protocol
– e.g. IEEE 802.3 using connection oriented LLC service
– NOT IP! IP is unreliable
• Assume arbitrary length message
• Transport service is end to end protocol between two
systems on same network
Issues in a Simple Transport
Protocol
• If we have a reliable network layer, then the
transport layer must consider:
– Addressing
– Multiplexing
– Flow Control
– Connection establishment and termination
Addressing
• Target user specified by:
– User identification
• Usually host, port
– Called a socket in TCP/UDP
• Port represents a particular transport service (TS), e.g. HTTPD
– Transport protocol identification
• Generally only one per host
• If more than one, then usually one of each type
– Specify transport protocol (TCP, UDP)
– Host address
• An attached network device
• In an internet, a global internet address (IP Address)
• A well-known address or lookup via name server
Multiplexing
• Multiple users employ same transport protocol
• User identified by port number or service access
point (SAP)
• Described previously
Flow Control
• Can be more difficult than flow control at the data link layer –
data is likely traveling across many networks, not one
network. Some potential problems:
– Longer transmission delay between transport entities compared
with actual transmission time
• Delay in communication of flow control info
– Variable transmission delay
• Difficult to use timeouts
• Flow may be controlled because:
– The receiving user cannot keep up
– The receiving transport entity cannot keep up
– If either happens, the result is a buffer that can fill up and
eventually lose data
Model of Frame Transmission
[Figure: diagram for frame/packet transmission]
Coping with Flow Control
Requirements (2)
• One protocol: Stop-and-Wait
– Sender must wait for recipient to send ACK before
sending the next packet
• Not very efficient usage of the network, only one
outstanding message can be in transit at a time
– Works well on reliable network
• Failure to receive ACK is taken as flow control indication
– Does not work well on unreliable network
• Cannot distinguish between lost segment and flow control
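A rough estimate of why Stop-and-Wait uses the network poorly. The formula assumes an error-free path and ignores ACK transmission time and processing delay; the function name and the example numbers are illustrative only.

```python
def stop_and_wait_utilization(packet_bits: float, rate_bps: float,
                              one_way_delay_s: float) -> float:
    """Fraction of time the sender spends transmitting.

    Each cycle costs one packet transmission time plus a round trip
    spent waiting for the ACK before the next packet may be sent.
    """
    t_packet = packet_bits / rate_bps
    return t_packet / (t_packet + 2 * one_way_delay_s)


# 1000-byte packets on a 1 Mbps path with 20 ms one-way delay.
u = stop_and_wait_utilization(packet_bits=8000, rate_bps=1e6,
                              one_way_delay_s=0.020)
print(f"{u:.3f}")  # 0.167
```

With these numbers the sender is idle about five sixths of the time, which is the motivation for keeping more than one segment in transit.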
Sliding Window Enhancements
• Receiver can acknowledge frames without permitting
further transmission (Receive Not Ready)
• Must send a normal acknowledge to resume
• If full duplex two-way communications, we need two
windows: one for transmit and one for receive
– Piggybacking – if sending data and acknowledgement frame,
combine together
RR N=Receive Ready on N
Use of Header Fields
• For credit-based window size
– When sending, Sequence Number is that of first octet
in segment
– ACK includes AN=i (Acknowledgement Number),
W=j (Window Size)
– All octets through SN=i-1 acknowledged
• Next expected octet is i
– Permission to send additional window of W=j octets
• i.e. octets through i+j-1
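The AN/W bookkeeping above can be sketched from the sender's side. The class and method names are hypothetical, and the sketch ignores cases such as stale or reordered ACKs.

```python
from dataclasses import dataclass


@dataclass
class CreditSender:
    next_seq: int  # SN of the next octet to send
    limit: int     # highest octet number the sender may send (inclusive)

    def on_ack(self, an: int, w: int) -> None:
        # AN=i, W=j: octets through i-1 are acknowledged and
        # octets i through i+j-1 may now be sent.
        self.limit = an + w - 1

    def sendable(self) -> int:
        return max(0, self.limit - self.next_seq + 1)

    def send(self, n: int) -> range:
        # Sends at most the remaining credit; returns the octet numbers sent.
        n = min(n, self.sendable())
        octets = range(self.next_seq, self.next_seq + n)
        self.next_seq += n
        return octets


sender = CreditSender(next_seq=1001, limit=2400)  # initial credit: 1400 octets
sent = sender.send(600)                           # octets 1001..1600
print(sender.sendable())                          # 800
sender.on_ack(an=1601, w=1000)                    # credit now runs through 2600
print(sender.sendable())                          # 1000
```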
Establishment and Termination
• Even with a reliable network service, both ends
need to “set up” the connection:
– Allow each end to know the other exists and is
listening
– Negotiation of optional parameters
• Maximum Segment Size
• Maximum Window Size
– Triggers allocation of transport entity resources
• Buffer space allocated
• Entry in connection tables
SYN=Sync
FIN=Finish
Connection Establishment
Termination
• Connection can be terminated by sending FIN
• Graceful termination
– CLOSE_WAIT state and FIN_WAIT must accept
incoming data until FIN received
– Ensures both sides have received all outstanding data
and that both sides agree to connection termination
before actual termination
Problems
• Ordered Delivery
• Retransmission strategy
• Duplication detection
• Flow control
• Connection establishment
• Connection termination
• Crash recovery
Ordered Delivery
• Segments may arrive out of order
• Number segments sequentially
• TCP numbers each octet sequentially
• Segments are numbered by the first octet number in
the segment
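One way a receiver could use the first-octet numbering to restore order is sketched below. It is a simplified illustration with hypothetical names; it assumes segments never partially overlap.

```python
from typing import Dict


class Reassembler:
    """Buffers out-of-order segments and releases octets in sequence."""

    def __init__(self, initial_seq: int) -> None:
        self.next_expected = initial_seq
        self._pending: Dict[int, bytes] = {}

    def receive(self, seq: int, data: bytes) -> bytes:
        # A segment lying entirely below next_expected is a duplicate.
        if seq + len(data) <= self.next_expected:
            return b""
        self._pending[seq] = data
        delivered = b""
        # Release every buffered segment that is now contiguous.
        while self.next_expected in self._pending:
            chunk = self._pending.pop(self.next_expected)
            delivered += chunk
            self.next_expected += len(chunk)
        return delivered


r = Reassembler(initial_seq=0)
print(r.receive(5, b"world"))  # b'' (held until octets 0..4 arrive)
print(r.receive(0, b"hello"))  # b'helloworld'
print(r.next_expected)         # 10
```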
Retransmission Strategy
• Need to re-transmit when
– Segment damaged in transit
– Segment fails to arrive
• Receiver must acknowledge successful receipt
• Use cumulative acknowledgement
• Time out waiting for ACK triggers
re-transmission
Timer Value
• Fixed timer
– Based on understanding of network behavior
– Cannot adapt to changing network conditions
– Too small leads to unnecessary re-transmissions
– Too large and response to lost segments is slow
– Should be a bit longer than Round Trip Time (RTT)
• Adaptive scheme
– E.g. set timer to average of previous ACK delays
– Problems:
• Receiver may not ACK immediately
• Cannot distinguish between ACK of original segment and re-
transmitted segment
• Conditions may change suddenly
Duplication Detection
• If ACK lost, segment is re-transmitted
• Receiver must recognize duplicates
• Duplicate received prior to closing connection
– Receiver assumes ACK lost and ACKs duplicate
– Sender must not get confused with multiple ACKs
– Sequence number space large enough to not cycle within
maximum life of segment
[Figure: Incorrect Duplicate Detection]
Flow Control
• Can use credit allocation described earlier
Connection Establishment
• Two way handshake
– A sends SYN, B replies with SYN
– Lost SYN handled by re-transmission
• Can lead to duplicate SYNs
– Ignore duplicate SYNs once connected
• Lost or delayed data segments can cause
connection problems
– Segment from old connections
Two Way Handshake: Obsolete Data Segment
[Figure: timeline annotated “A wants new connection, picks SN k” and “B expects SN j”]
Connection Establishment –
Three Way Handshake
• Solution: Explicitly acknowledge each other’s
SYN and sequence number
– Use SYN i
– Need ACK to include i
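A small sketch of why the explicit acknowledgement helps. It assumes the TCP convention that the ACK carries i+1 (the next number expected) rather than i itself; the function names and message format are hypothetical.

```python
def b_handles_syn(syn_seq: int, b_isn: int) -> dict:
    """B answers a SYN with its own SYN plus an ACK naming A's number."""
    return {"syn": b_isn, "ack": syn_seq + 1}


def a_handles_syn_ack(current_isn: int, ack_number: int) -> str:
    """A accepts the reply only if it acknowledges A's current SYN."""
    if ack_number == current_isn + 1:
        return "ACK"  # third segment: completes the handshake
    return "RST"      # reply refers to an obsolete SYN, so reject it


# Normal case: A sends SYN 100, B picks 700.
reply = b_handles_syn(syn_seq=100, b_isn=700)
print(a_handles_syn_ack(100, reply["ack"]))  # ACK

# An old duplicate SYN 40 reaches B while A is using 100.
stale = b_handles_syn(syn_seq=40, b_isn=701)
print(a_handles_syn_ack(100, stale["ack"]))  # RST
```

In a two-way handshake B would consider itself connected after the stale SYN; with the third step, A gets the chance to reject it.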
Three Way Handshake: Examples
Three Way Handshake: State Diagram
Connection Termination
• Same problems we had with connection
establishment can also occur with connection
termination
– Lost or obsolete FIN segment
– Can lose last data segment if FIN arrives before last
data segment
• Solution: associate sequence number with FIN
• Receiver waits for all segments before FIN
sequence number
• Must explicitly ACK FIN
Graceful Close
• Send FIN i and receive AN i
• Receive FIN j and send AN j
• Wait twice maximum expected segment lifetime
Crash Recovery
• If the transport service crashes and restarts, after restart
all state info is lost
• Connection is half open
– Side that did not crash still thinks it is connected
• Close connection using persistence timer
– Wait for ACK for (time out) * (number of retries)
– When expired, close connection and inform user
• Send RST i in response to any i segment arriving
• User must decide whether to reconnect
– Problems with lost or duplicate data
TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)
• point-to-point:
– one sender, one receiver
• reliable, in-order byte stream:
– no “message boundaries”
• pipelined:
– TCP congestion and flow control set window size
• send & receive buffers
• full duplex data:
– bi-directional data flow in same connection
– MSS: maximum segment size
• connection-oriented:
– handshaking (exchange of control msgs) init’s sender,
receiver state before data exchange
• flow controlled:
– sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door]
TCP Properties
• stream orientation. stream of OCTETS (bytes) passed
between send/ recv
• byte stream is full duplex
– think of it as two independent streams joined with a
piggybacking mechanism
– piggybacking - one data stream has control info for the other
data stream (going the other way)
• unstructured stream
– TCP doesn’t show packet boundaries to applications
– But you can still structure your message if you want
– Recall usage with sockets:
• One write() call to send data
• May require multiple read() calls
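Because the stream has no message boundaries, a reader typically loops until it has the number of bytes it expects. The sketch below uses Python's `socket.recv()` in place of `read()`, and a local `socket.socketpair()` stands in for a real TCP connection; `recv_exact` is a hypothetical helper name.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Calls recv() repeatedly until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than asked
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


a, b = socket.socketpair()
message = b"x" * 4000
a.sendall(message)                       # one write on the sending side
received = recv_exact(b, len(message))   # one or more reads on the other
a.close()
b.close()
print(received == message)  # True
```

On a local pair a single `recv()` often returns everything, but over a network the data may arrive in several pieces, so the loop is still needed.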
TCP segment structure
[Figure: TCP segment format, 32 bits wide]
• source port #, dest port #
• sequence number, acknowledgement number
– counting by bytes of data (not segments!)
• head len, not used, flags U A P R S F, rcvr window size
– rcvr window size: # bytes rcvr willing to accept
• checksum, ptr urgent data
– Internet checksum (as in UDP)
• Options (variable length)
• application data (variable length)
Flags:
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
TCP Fields
• Source, Destination Port: 16 bits each
• Sequence Number: 32 bits
– Sequence # of first data octet in the segment, initialized
randomly as described earlier
• ACK Number: 32 bits
– Piggybacked ACK, contains sequence number of the next data
octet the receiver expects
• Header Len: 4 bits
– Number of 32 bit words in the header
• Not Used: 6 bits for future use
TCP Fields
• Flags – 6 bits
– URG – Urgent Pointer field significant
– ACK – Ack field significant
– PSH – Push (flush or “push” buffer now, send data to app)
– RST – Reset connection
– SYN – Synchronize sequence numbers
– FIN – No more data
• Window – 16 bits
– Flow control credit allocation
• Checksum – 16 bits
– One’s complement sum as in UDP
• Urgent Pointer – 16 bits
– Last octet in a seq of “urgent” data. Sometimes not interpreted. Urgent
data should be processed now, even before any data sitting in the buffer
(e.g. send control-c to terminate)
• Options – Variable
– Support for timestamping, negotiating MSS
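The field widths listed above add up to a 20-byte fixed header, which can be unpacked with Python's `struct` module. This is an illustrative sketch: the function and dictionary key names are made up, and the 6 "not used" bits are simply ignored.

```python
import struct

# ports (16+16), seq (32), ack (32), header len/unused/flags (16),
# window (16), checksum (16), urgent pointer (16) = 20 bytes.
TCP_FIXED_HEADER = struct.Struct("!HHIIHHHH")
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 upward


def parse_tcp_header(segment: bytes) -> dict:
    (src, dst, seq, ack, len_flags,
     window, checksum, urgent) = TCP_FIXED_HEADER.unpack_from(segment)
    header_len = (len_flags >> 12) * 4  # top 4 bits count 32-bit words
    flags = len_flags & 0x3F            # low 6 bits are the flags
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": [n for bit, n in enumerate(FLAG_NAMES) if flags & (1 << bit)],
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
        "options": segment[TCP_FIXED_HEADER.size:header_len],
        "data": segment[header_len:],
    }


# A hand-built segment: header length 5 words, PSH and ACK set, 2 data bytes.
raw = TCP_FIXED_HEADER.pack(1234, 80, 92, 500, (5 << 12) | 0x18,
                            65535, 0, 0) + b"hi"
fields = parse_tcp_header(raw)
print(fields["header_len"], fields["flags"], fields["data"])
```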
TCP: retransmission scenarios
[Figure: two timelines between Host A and Host B]
lost ACK scenario:
• A sends Seq=92, 8 bytes data
• B replies ACK=100, which is lost (X)
• A times out and resends Seq=92, 8 bytes data
• B replies ACK=100
premature timeout, cumulative ACKs:
• A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
• B replies ACK=100, then ACK=120
• the Seq=92 timeout expires before ACK=100 arrives,
so A resends Seq=92, 8 bytes data
• B replies with cumulative ACK=120
receiver buffering
TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
• longer than RTT
– note: RTT will vary
• too short: premature timeout
– unnecessary retransmissions
• too long: slow reaction to segment loss
Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until
ACK receipt
– ignore retransmissions, cumulatively ACKed segments
• SampleRTT will vary, want estimated RTT “smoother”
– use several recent measurements, not just
current SampleRTT
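One common way to smooth SampleRTT is an exponentially weighted moving average, with the timeout set to the estimate plus a margin based on how much the samples deviate. The sketch below follows that approach; the weights 1/8 and 1/4 and the factor 4 are the values used in RFC 6298, not something stated on the slide, and the class name is hypothetical.

```python
class RttEstimator:
    """Exponentially weighted moving average of SampleRTT values."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25) -> None:
        self.alpha = alpha         # weight of the newest sample
        self.beta = beta           # weight of the newest deviation
        self.estimated_rtt = None  # no measurement yet
        self.dev_rtt = 0.0

    def update(self, sample_rtt: float) -> None:
        if self.estimated_rtt is None:
            # First measurement initializes both values.
            self.estimated_rtt = sample_rtt
            self.dev_rtt = sample_rtt / 2
            return
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout(self) -> float:
        # "Longer than RTT": the estimate plus a safety margin.
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator()
for sample in (0.100, 0.200):  # seconds; samples from retransmitted
    est.update(sample)         # segments would be skipped
print(round(est.estimated_rtt, 4), round(est.timeout(), 4))  # 0.1125 0.3625
```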
TCP Connection Management
Principles of Congestion Control
Congestion:
• informally: “too many sources sending too much data too
fast for network to handle”
• different from flow control!
• manifestations:
– lost packets (buffer overflow at routers)
– long delays (queueing in router buffers)
• A top-10 problem!
Causes/costs of congestion: scenario 2
[Figure: axis label “offered load”]
Causes/costs of congestion: scenario 3
• four senders
• multihop paths
• timeout/retransmit
Q: what happens as λin and λ'in increase?
Approaches towards congestion control
TCP congestion control:
• “probing” for usable bandwidth:
– ideally: transmit as fast as possible (Congwin as
large as possible) without loss
– Reality:
• increase Congwin until loss (congestion)
• loss: decrease Congwin, then begin probing
(increasing) again
• two “phases”
– slow start
– congestion avoidance
• important variables:
– Congwin
– threshold: defines boundary between the two phases
(slow start phase, congestion control phase)
TCP Slowstart
Slowstart algorithm
initialize: Congwin = 1
for (each segment ACKed)
    Congwin++
until (loss event OR
       Congwin > threshold)
[Figure: Host A sends one segment, then two segments, then four segments to Host B, one RTT apart]
TCP Congestion Avoidance
Congestion avoidance
/* slowstart is over */
/* Congwin > threshold */
Until (loss event) {
every w segments ACKed:
Congwin++
}
threshold = Congwin/2
Congwin = 1
perform slowstart
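The slow start and congestion avoidance pseudocode can be combined into a rough per-RTT simulation. This is a sketch under simplifying assumptions: the window is updated once per RTT instead of once per ACK, losses occur only in the RTTs listed in `loss_rtts`, and all names are illustrative.

```python
from typing import List, Set


def simulate_congwin(rtts: int, threshold: int,
                     loss_rtts: Set[int]) -> List[int]:
    """Returns Congwin (in segments) at the start of each RTT."""
    congwin = 1
    history = []
    for rtt in range(rtts):
        history.append(congwin)
        if rtt in loss_rtts:
            # Loss event: remember half the window, restart slow start.
            threshold = max(congwin // 2, 1)
            congwin = 1
        elif congwin <= threshold:
            # Slow start: one increment per ACK doubles the window each
            # RTT, stopping as soon as Congwin exceeds the threshold.
            congwin = min(congwin * 2, threshold + 1)
        else:
            # Congestion avoidance: one increment per window of ACKs.
            congwin += 1
    return history


print(simulate_congwin(rtts=12, threshold=8, loss_rtts={7}))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 7]
```

The output shows exponential growth up to the threshold, linear growth after it, and the reset to 1 with a halved threshold after the loss in RTT 7.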
AIMD
TCP congestion avoidance:
• AIMD: additive increase, multiplicative decrease
– increase window by 1 per RTT
– decrease window by factor of 2 on loss event
TCP Fairness
Fairness goal: if N TCP sessions share same bottleneck link,
each should get 1/N of link capacity
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity C]
Why is TCP fair?
Two competing sessions:
• Additive increase gives slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to C, with the “equal bandwidth share” line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”]
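The convergence argument can be checked numerically with a small simulation of two sessions. It assumes both sessions have the same RTT and see every loss event at the same time, which is an idealization; the function name and starting values are illustrative.

```python
from typing import Tuple


def aimd_two_flows(x1: float, x2: float, capacity: float,
                   rounds: int) -> Tuple[float, float]:
    """Returns both throughputs after `rounds` RTTs of synchronized AIMD."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions decrease by a factor of 2.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: both sessions increase by 1.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


start = (1.0, 80.0)  # very unequal initial shares
end = aimd_two_flows(*start, capacity=100.0, rounds=500)
print(abs(start[0] - start[1]))  # 79.0
print(abs(end[0] - end[1]))      # a value very close to 0
```

Additive increase leaves the gap between the two sessions unchanged while every multiplicative decrease halves it, so the throughputs drift toward the equal-share line.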