
Commit db65a3a

ronenil authored and davem330 committed
netlink: Trim skb to alloc size to avoid MSG_TRUNC
netlink_dump() allocates skb based on the calculated min_dump_alloc or a per socket max_recvmsg_len. min_dump_alloc is the maximum space required for any single netdev's attributes as calculated by rtnl_calcit(). max_recvmsg_len tracks the user provided buffer to netlink_recvmsg. It is capped at 16KiB. The intention is to avoid small allocations and to minimize the number of calls required to obtain dump information for all net devices.

netlink_dump packs as many small messages as could fit within an skb that was sized for the largest single netdev information. The actual space available within an skb is larger than what is requested. It could be much larger, up to nearly 2x with the align to next power of 2 approach.

Allowing netlink_dump to use all the space available within the allocated skb increases the buffer size a user has to provide to avoid truncation (i.e. MSG_TRUNC flag set).

It was observed that with many VLANs configured on at least one netdev, a larger buffer of nearly 64KiB was necessary to avoid the "Message truncated" error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when min_dump_alloc was only a little over 32KiB.

This patch trims skb to the allocated size in order to allow the user to avoid truncation with a more reasonable buffer size.

Signed-off-by: Ronen Arad <ronen.arad@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent c7c49b8 commit db65a3a


net/netlink/af_netlink.c

Lines changed: 22 additions & 12 deletions
@@ -2785,6 +2785,7 @@ static int netlink_dump(struct sock *sk)
 	struct sk_buff *skb = NULL;
 	struct nlmsghdr *nlh;
 	int len, err = -ENOBUFS;
+	int alloc_min_size;
 	int alloc_size;
 
 	mutex_lock(nlk->cb_mutex);
@@ -2793,9 +2794,6 @@ static int netlink_dump(struct sock *sk)
 		goto errout_skb;
 	}
 
-	cb = &nlk->cb;
-	alloc_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
-
 	if (!netlink_rx_is_mmaped(sk) &&
 	    atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
 		goto errout_skb;
@@ -2805,23 +2803,35 @@ static int netlink_dump(struct sock *sk)
 	 * to reduce number of system calls on dump operations, if user
 	 * ever provided a big enough buffer.
 	 */
-	if (alloc_size < nlk->max_recvmsg_len) {
-		skb = netlink_alloc_skb(sk,
-					nlk->max_recvmsg_len,
-					nlk->portid,
+	cb = &nlk->cb;
+	alloc_min_size = max_t(int, cb->min_dump_alloc, NLMSG_GOODSIZE);
+
+	if (alloc_min_size < nlk->max_recvmsg_len) {
+		alloc_size = nlk->max_recvmsg_len;
+		skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
 					GFP_KERNEL |
 					__GFP_NOWARN |
 					__GFP_NORETRY);
-		/* available room should be exact amount to avoid MSG_TRUNC */
-		if (skb)
-			skb_reserve(skb, skb_tailroom(skb) -
-					 nlk->max_recvmsg_len);
 	}
-	if (!skb)
+	if (!skb) {
+		alloc_size = alloc_min_size;
 		skb = netlink_alloc_skb(sk, alloc_size, nlk->portid,
 					GFP_KERNEL);
+	}
 	if (!skb)
 		goto errout_skb;
+
+	/* Trim skb to allocated size. User is expected to provide buffer as
+	 * large as max(min_dump_alloc, 16KiB (mac_recvmsg_len capped at
+	 * netlink_recvmsg())). dump will pack as many smaller messages as
+	 * could fit within the allocated skb. skb is typically allocated
+	 * with larger space than required (could be as much as near 2x the
+	 * requested size with align to next power of 2 approach). Allowing
+	 * dump to use the excess space makes it difficult for a user to have a
+	 * reasonable static buffer based on the expected largest dump of a
+	 * single netdev. The outcome is MSG_TRUNC error.
+	 */
+	skb_reserve(skb, skb_tailroom(skb) - alloc_size);
 	netlink_skb_set_owner_r(skb, sk);
 
 	len = cb->dump(skb, cb);
