Commit 6ff50cd

Eric Dumazet authored and davem330 committed
tcp: gso: do not generate out of order packets
GSO TCP handler has following issues :

1) ooo_okay from original GSO packet is duplicated to all segments
2) segments (but the last one) are orphaned, so transmit path can not
   get transmit queue number from the socket. This happens if GSO
   segmentation is done before stacked device for example.

Result is we can send packets from a given TCP flow to different TX
queues (if using multiqueue NICS). This generates OOO problems and
spurious SACK & retransmits.

Fix this by keeping socket pointer set for all segments.

This means that every segment must also have a destructor, and the
original gso skb truesize must be split on all segments, to keep
precise sk->sk_wmem_alloc accounting.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1 parent 5c4b274 commit 6ff50cd
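
Issue 1) in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct seg_flags` and `propagate_ooo_okay()` are hypothetical names invented for illustration, and the only field modelled is `ooo_okay`. The assumption is that the flag tells the transmit path it may move the flow to another TX queue, so letting more than the first segment carry it risks reordering within one GSO burst.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the single sk_buff flag of interest. */
struct seg_flags {
	bool ooo_okay;
};

/*
 * Mirror the rule stated in the patch comments: every segment has
 * ooo_okay cleared, then only the first segment inherits the value
 * that the original GSO packet carried.
 */
void propagate_ooo_okay(struct seg_flags *segs, size_t nsegs,
			bool gso_ooo_okay)
{
	size_t i;

	for (i = 0; i < nsegs; i++)
		segs[i].ooo_okay = false;

	if (nsegs > 0)
		segs[0].ooo_okay = gso_ooo_okay;
}
```

In the patch itself the same effect is obtained by clearing `skb->ooo_okay` before `skb_segment()` and restoring the saved value on `segs` afterwards, rather than by looping over the segments.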

1 file changed, +21 -1 lines changed

net/ipv4/tcp.c

Lines changed: 21 additions & 1 deletion
@@ -2887,6 +2887,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
 	unsigned int mss;
 	struct sk_buff *gso_skb = skb;
 	__sum16 newcheck;
+	bool ooo_okay, copy_destructor;
 
 	if (!pskb_may_pull(skb, sizeof(*th)))
 		goto out;
@@ -2927,10 +2928,18 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
 		goto out;
 	}
 
+	copy_destructor = gso_skb->destructor == tcp_wfree;
+	ooo_okay = gso_skb->ooo_okay;
+	/* All segments but the first should have ooo_okay cleared */
+	skb->ooo_okay = 0;
+
 	segs = skb_segment(skb, features);
 	if (IS_ERR(segs))
 		goto out;
 
+	/* Only first segment might have ooo_okay set */
+	segs->ooo_okay = ooo_okay;
+
 	delta = htonl(oldlen + (thlen + mss));
 
 	skb = segs;
@@ -2950,6 +2959,17 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
 					       thlen, skb->csum));
 
 		seq += mss;
+		if (copy_destructor) {
+			skb->destructor = gso_skb->destructor;
+			skb->sk = gso_skb->sk;
+			/* {tcp|sock}_wfree() use exact truesize accounting :
+			 * sum(skb->truesize) MUST be exactly be gso_skb->truesize
+			 * So we account mss bytes of 'true size' for each segment.
+			 * The last segment will contain the remaining.
+			 */
+			skb->truesize = mss;
+			gso_skb->truesize -= mss;
+		}
 		skb = skb->next;
 		th = tcp_hdr(skb);
 
@@ -2962,7 +2982,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb,
 	 * is freed at TX completion, and not right now when gso_skb
 	 * is freed by GSO engine
 	 */
-	if (gso_skb->destructor == tcp_wfree) {
+	if (copy_destructor) {
 		swap(gso_skb->sk, skb->sk);
 		swap(gso_skb->destructor, skb->destructor);
 		swap(gso_skb->truesize, skb->truesize);
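
The truesize accounting described in the patch comment can be checked with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `struct seg`, `split_truesize()` and `sum_truesize()` are hypothetical names, and only the `truesize` field is modelled. It charges `mss` bytes to every segment but the last and gives the remainder to the last one, so the per-segment values add up to exactly the original GSO truesize.

```c
#include <stddef.h>

/* Hypothetical stand-in for the one sk_buff field being accounted. */
struct seg {
	unsigned int truesize;
};

/*
 * Split gso_truesize over nsegs segments: mss bytes for each segment
 * except the last, which receives whatever remains.
 * Returns 0 on success, or -1 if the inputs cannot be split this way
 * (in that case segs may have been partially written).
 */
int split_truesize(struct seg *segs, size_t nsegs,
		   unsigned int gso_truesize, unsigned int mss)
{
	unsigned int remaining = gso_truesize;
	size_t i;

	if (nsegs == 0)
		return -1;

	for (i = 0; i + 1 < nsegs; i++) {
		if (remaining <= mss)
			return -1;	/* nothing would be left for later segments */
		segs[i].truesize = mss;
		remaining -= mss;
	}
	segs[nsegs - 1].truesize = remaining;
	return 0;
}

/* Sum of the per-segment truesize values, for checking the invariant. */
unsigned int sum_truesize(const struct seg *segs, size_t nsegs)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < nsegs; i++)
		total += segs[i].truesize;
	return total;
}
```

For example, splitting a truesize of 5000 over three segments with an mss of 1448 yields 1448, 1448 and 2104, which sum back to 5000. In the patch, the "remainder to the last segment" step is performed by the existing `swap(gso_skb->truesize, skb->truesize)` after the loop has subtracted `mss` from `gso_skb->truesize` once per earlier segment.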
