04 DPDK Based Userspace TCPIP Stack Testing

This document discusses using packetdrill to test a DPDK-based userspace TCP/IP stack. It gives background on Luna, a high-performance network framework built on DPDK, and then covers the challenges of testing Luna Stack, its userspace TCP/IP stack: time-series-related bugs and a large number of corner cases. It introduces packetdrill, an open source tool for testing TCP implementations, and explains how it was modified to test Luna Stack by dynamically linking the userspace stack and dispatching function calls to it. Finally, it reports on the experience of writing 75 packetdrill test cases for Luna TCP, which helped fix 3 production bugs and were added to continuous integration.

Uploaded by

Wei Jin

DPDK-based
userspace TCP/IP stack testing
SHU MA
EBS – KUAFU
ALIBABA CLOUD
Agenda

1 Background

2 Current status

3 Our practice

4 Q&A
Background
■ Luna
• high performance network framework
• DPDK
• Luna Stack (userspace lightweight TCP/IP stack)

■ Product
• ESSD (cloud disk)
• hundreds of production clusters
• tens of thousands of machines

■ Latency
• 1/3 of the kernel stack's latency
• nearly as fast as RDMA

https://www.aliyun.com/product/disk
Background
■ Challenges in developing Luna Stack
• Bugs are time-series-related
  • hard to reproduce
  • hard to troubleshoot
• Large number of corner cases
  • hard to fix
  • easy to break other cases
• Convince upper-layer developers
  • correctness
  • robustness

Test Framework
1. bug reproduction
2. troubleshooting
3. regression
4. correctness
Current status
■ Linux kernel, FreeBSD
• Internal
  • low unit test coverage
• External (LTP)
  • 20+ scripts for TCP/IP

■ Testing approaches
• Unit test (white box)
  • need to know code details, hard to write
• Function test (black box)
  • hard to create scenarios with strict time-series
• packetdrill (grey box)
  • Google, open source
  • USENIX ATC 2013
  • 3 new TCP features, 10 Linux kernel bugs found and fixed
Packetdrill: script
■ 4 statements
• packets
  • tcpdump-like syntax
  • inbound, outbound
• system calls
  • strace-like syntax
• shell commands
• python scripts

■ time model
• relative time
  • +0, +.1
• absolute time
  • 0.100, 0.100…0.200

0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <mss 1460, nop, wscale 7, nop, nop, TS val 0 ecr 0>
+0 > S. 0:0(0) ack 1 <mss 1460, nop, nop, TS val 0 ecr 0, nop, wscale 7>
+0 `netstat -anp | grep 8080 | grep SYN_RCVD` // examine TCP state

+.1 < . 1:1(0) ack 1 win 100
+0 accept(3, ..., ...) = 4
+0 %{ assert tcpi_snd_cwnd == 10 }% // examine TCP_INFO

+0 write(4, ..., 1000) = 1000 // send 1 packet
+0 > . 1:1001(1000) ack 1

+.2 > . 1:1001(1000) ack 1 // RTO retrans, 200ms
+.4 > . 1:1001(1000) ack 1 // RTO retrans, 400ms

100 lines of UT -> 13 lines of script
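
The relative time model used by the script can be illustrated by accumulating the offsets into absolute times. The following is a minimal Python sketch, not packetdrill code; the function name `absolute_times` is hypothetical. It resolves the timestamps of the 13 script lines and shows the two retransmissions landing 200 ms and 600 ms after the original write, which matches an RTO that doubles from 200 ms to 400 ms.

```python
from decimal import Decimal


def absolute_times(stamps):
    """Resolve packetdrill-style timestamps into absolute seconds.

    A stamp starting with '+' is relative to the previous event;
    any other stamp is treated as an absolute time.
    """
    now = Decimal("0")
    resolved = []
    for stamp in stamps:
        if stamp.startswith("+"):
            now += Decimal(stamp[1:])
        else:
            now = Decimal(stamp)
        resolved.append(now)
    return resolved


# Timestamps of the 13 script lines, in order.
stamps = ["0", "+0", "+0", "+0", "+0", "+0", "+.1",
          "+0", "+0", "+0", "+0", "+.2", "+.4"]
times = absolute_times(stamps)

write_time = times[9]                 # the write() call
first_retrans = times[11] - write_time
second_retrans = times[12] - write_time
```

Decimal arithmetic is used so that offsets such as `.1` and `.2` add up exactly instead of accumulating binary floating-point error.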


Packetdrill: pros & cons
■ Pros
• time-series
• developer-friendly script syntax
• high maintainability
• reusable among different stacks

■ Cons
• kernel TCP/IP
• TCP_INFO/netstat/ss
• polling related time-series

libluna.so
1. POSIX socket API
2. DPDK send/receive API
3. netstat/ss/pollinghang

packetdrill
1. dynamically link libluna.so
2. DPDK polling thread
3. dispatch function call
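
The idea of dynamically linking a stack and resolving its socket entry points can be sketched in Python with `ctypes`. This is an illustration only: the helper `load_socket_api` and the list `SOCKET_API` are hypothetical, and it is an assumption that libluna.so exports functions under the standard POSIX names. With no filename, the sketch resolves the symbols already loaded into the process (libc on a POSIX system), which stands in for the kernel stack.

```python
import ctypes
import socket as pysocket

# Entry points the test driver needs; assumed to exist under these names.
SOCKET_API = ("socket", "bind", "listen", "accept", "write", "close")


def load_socket_api(so_filename=None):
    """Resolve socket entry points from a shared library.

    so_filename=None looks the symbols up in the running process
    (POSIX only), i.e. the libc wrappers around the kernel stack.
    """
    lib = ctypes.CDLL(so_filename, use_errno=True)
    return {name: getattr(lib, name) for name in SOCKET_API}


kernel_api = load_socket_api()
# luna_api = load_socket_api("libluna.so")  # assumption: same symbol names

# The same call site works regardless of which table was loaded.
fd = kernel_api["socket"](pysocket.AF_INET, pysocket.SOCK_STREAM,
                          pysocket.IPPROTO_TCP)
kernel_api["close"](fd)
```

Keeping the resolved functions in a table means the script interpreter does not need to know which stack it is driving.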
Modified packetdrill
■ Main thread
• read script line by line
• send/receive packets via DPDK
• dispatch function
• run shell tools
  • inspect: netstat, ss
  • interfere: pollinghang?time=10

■ Stack thread
• polling mode
• userspace stack initialization
• call dispatched function

■ Usage
• ./packetdrill ./test.pkt
• ./packetdrill --userspace_stack --so_filename=libluna.so ./test.pkt
• Compare between Luna TCP and kernel TCP
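
The split between the main thread and the stack thread can be sketched as a call queue: the main thread hands a function to the stack thread and waits for the result, so every stack call runs on the thread that owns the stack. This is a minimal Python sketch under that assumption, not the modified packetdrill itself; the class `StackThread` and its methods are hypothetical names.

```python
import queue
import threading


class StackThread(threading.Thread):
    """Polling-mode thread that owns the stack and runs dispatched calls."""

    def __init__(self):
        super().__init__(daemon=True)
        self.calls = queue.Queue()
        self.running = True

    def run(self):
        while self.running:
            # A real stack thread would also poll the NIC here.
            # The short timeout approximates a busy-poll loop.
            try:
                func, args, reply = self.calls.get(timeout=0.001)
            except queue.Empty:
                continue
            reply.put(func(*args))

    def dispatch(self, func, *args):
        """Called from the main thread; blocks until the call has run."""
        reply = queue.Queue(maxsize=1)
        self.calls.put((func, args, reply))
        return reply.get()

    def stop(self):
        self.running = False


stack = StackThread()
stack.start()
executed_on = stack.dispatch(threading.current_thread)  # runs on the stack thread
result = stack.dispatch(lambda a, b: a + b, 3, 4)
stack.stop()
stack.join()
```

Because the main thread blocks on the reply, script lines still execute strictly in order even though the calls run on another thread.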
Modified packetdrill
Script for kernel TCP:

0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <…>
+0 > S. 0:0(0) ack 1 <…>
+0 `netstat -anp | grep 8080 | grep SYN_RCVD`

+.1 < . 1:1(0) ack 1 win 100
+0 accept(3, ..., ...) = 4
+0 %{ assert tcpi_snd_cwnd == 10 }%

+0 write(4, ..., 1000) = 1000
+0 > . 1:1001(1000) ack 1

+.2 > . 1:1001(1000) ack 1
+.4 > . 1:1001(1000) ack 1

Script for userspace TCP:

0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <…>
+0 > S. 0:0(0) ack 1 <…>
+0 `curl http://127.0.0.1:8899/netstat | grep 8080 | grep SYN_RCVD`

+.1 < . 1:1(0) ack 1 win 100
+0 accept(3, ..., ...) = 4
+0 `curl http://127.0.0.1:8899/ss | grep 8080 | sed 's/^.*\(cwnd:[0-9]*\).*$/\1/' | grep 10`

+0 write(4, ..., 1000) = 1000
+0 > . 1:1001(1000) ack 1

+.2 > . 1:1001(1000) ack 1
+.4 > . 1:1001(1000) ack 1
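
The congestion-window check in the userspace script pulls `cwnd:<n>` out of ss-style text. The same extraction can be sketched in Python; the helper `extract_cwnd` and the sample line are hypothetical, and the exact output format of Luna's `/ss` endpoint is an assumption. An exact integer comparison is also stricter than a trailing `grep 10`, which would accept `cwnd:100` as well.

```python
import re


def extract_cwnd(ss_line):
    """Return the cwnd value from one line of ss-style output, or None."""
    match = re.search(r"cwnd:(\d+)", ss_line)
    return int(match.group(1)) if match else None


# Hypothetical line in the style of `ss -i` output.
sample = "cubic wscale:7,7 rto:204 rtt:0.5/0.25 mss:1460 cwnd:10 send 233.6Mbps"
cwnd = extract_cwnd(sample)
```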


Experience in Alibaba
■ 75 test cases for Luna TCP
• TCP state transitions
• exceptional packet handling
• congestion control, keep-alive, custom features …
• RFC 793, 1122, 3042, 5681, 6582

■ reproduction
• fixed 3 bugs in production

■ regression
• added to Jenkins
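
A regression job of this kind could run every script against both stacks and collect the failures. The following Python sketch shows one way to do that under stated assumptions: the helpers `build_command` and `run_suite` are hypothetical, the flags are the ones from the usage slide, and a non-zero exit status is assumed to mean the script failed. It is not the actual Jenkins job.

```python
import subprocess
from pathlib import Path


def build_command(script, so_filename=None, binary="./packetdrill"):
    """Build the packetdrill command line for one script."""
    cmd = [binary]
    if so_filename is not None:
        # Flags as shown on the usage slide.
        cmd += ["--userspace_stack", f"--so_filename={so_filename}"]
    cmd.append(str(script))
    return cmd


def run_suite(test_dir, so_filename=None, runner=None):
    """Run every *.pkt file in test_dir; return the names of failing scripts."""
    if runner is None:
        # Assumption: a non-zero exit status signals a failed script.
        runner = lambda cmd: subprocess.run(cmd).returncode
    failed = []
    for script in sorted(Path(test_dir).glob("*.pkt")):
        if runner(build_command(script, so_filename)) != 0:
            failed.append(script.name)
    return failed
```

Calling `run_suite("tests")` and `run_suite("tests", so_filename="libluna.so")` on the same directory gives a side-by-side view of kernel TCP and Luna TCP; the `runner` parameter exists so the loop can be exercised without a packetdrill binary.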

Thank You !


Q&A
