Commit 0548740

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Several hash table refcount fixes in batman-adv, from Sven Eckelmann.

 2) Use after free in bpf_evict_inode(), from Daniel Borkmann.

 3) Fix mdio bus registration in ixgbe, from Ivan Vecera.

 4) Unbounded loop in __skb_try_recv_datagram(), from Paolo Abeni.

 5) ila rhashtable corruption fix from Herbert Xu.

 6) Don't allow upper-devices to be added to vrf devices, from Sabrina Dubroca.

 7) Add qmi_wwan device ID for Olicard 600, from Bjørn Mork.

 8) Don't leave skb->next poisoned in __netif_receive_skb_list_ptype, from Alexander Lobakin.

 9) Missing IDR checks in mlx5 driver, from Aditya Pakki.

10) Fix false connection termination in ktls, from Jakub Kicinski.

11) Work around some ASPM issues with r8169 by disabling rx interrupt coalescing on certain chips. From Heiner Kallweit.

12) Properly use per-cpu qstat values on NOLOCK qdiscs, from Paolo Abeni.

13) Fully initialize sockaddr_in structures in SCTP, from Xin Long.

14) Various BPF flow dissector fixes from Stanislav Fomichev.

15) Divide by zero in act_sample, from Davide Caratti.

16) Fix bridging multicast regression introduced by rhashtable conversion, from Nikolay Aleksandrov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  ibmvnic: Fix completion structure initialization
  ipv6: sit: reset ip header pointer in ipip6_rcv
  net: bridge: always clear mcast matching struct on reports and leaves
  libcxgb: fix incorrect ppmax calculation
  vlan: conditional inclusion of FCoE hooks to match netdevice.h and bnx2x
  sch_cake: Make sure we can write the IP header before changing DSCP bits
  sch_cake: Use tc_skb_protocol() helper for getting packet protocol
  tcp: Ensure DCTCP reacts to losses
  net/sched: act_sample: fix divide by zero in the traffic path
  net: thunderx: fix NULL pointer dereference in nicvf_open/nicvf_stop
  net: hns: Fix sparse: some warnings in HNS drivers
  net: hns: Fix WARNING when remove HNS driver with SMMU enabled
  net: hns: fix ICMP6 neighbor solicitation messages discard problem
  net: hns: Fix probabilistic memory overwrite when HNS driver initialized
  net: hns: Use NAPI_POLL_WEIGHT for hns driver
  net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()
  flow_dissector: rst'ify documentation
  ipv6: Fix dangling pointer when ipv6 fragment
  net-gro: Fix GRO flush when receiving a GSO packet.
  flow_dissector: document BPF flow dissector environment
  ...
2 parents 8e22ba9 + bbd669a commit 0548740

122 files changed, 1096 insertions(+), 563 deletions(-)

Documentation/bpf/btf.rst

Lines changed: 4 additions & 4 deletions
@@ -148,16 +148,16 @@ The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
 for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
 
 The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
-for this int. For example, a bitfield struct member has: * btf member bit
-offset 100 from the start of the structure, * btf member pointing to an int
-type, * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
+for this int. For example, a bitfield struct member has:
+* btf member bit offset 100 from the start of the structure,
+* btf member pointing to an int type,
+* the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
 
 Then in the struct memory layout, this member will occupy ``4`` bits starting
 from bits ``100 + 2 = 102``.
 
 Alternatively, the bitfield struct member can be the following to access the
 same bits as the above:
-
 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
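
The offset arithmetic in this hunk can be checked with a small user-space sketch. The helper names below are hypothetical, and the bit numbering (bit 0 is the least significant bit of byte 0) is an assumption made for illustration only; actual bitfield layout depends on the target's endianness.

```c
#include <stdint.h>

/* First bit occupied by the member: member bit offset plus BTF_INT_OFFSET(). */
static unsigned int bitfield_first_bit(unsigned int member_bit_offset,
				       unsigned int int_offset)
{
	return member_bit_offset + int_offset;
}

/*
 * Read nr_bits (1..64) starting at first_bit from buf.
 * Assumption: bit 0 is the least significant bit of buf[0].
 */
static uint64_t bitfield_read(const uint8_t *buf, unsigned int first_bit,
			      unsigned int nr_bits)
{
	uint64_t val = 0;
	unsigned int i;

	for (i = 0; i < nr_bits; i++) {
		unsigned int bit = first_bit + i;

		if (buf[bit / 8] & (1u << (bit % 8)))
			val |= (uint64_t)1 << i;
	}
	return val;
}
```

Both encodings from the documentation, (100, 2) and (102, 0), resolve to first bit 102, so `bitfield_read()` returns the same 4-bit value for either.
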
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
+BPF Flow Dissector
+==================
+
+Overview
+========
+
+Flow dissector is a routine that parses metadata out of the packets. It's
+used in the various places in the networking subsystem (RFS, flow hash, etc).
+
+BPF flow dissector is an attempt to reimplement C-based flow dissector logic
+in BPF to gain all the benefits of BPF verifier (namely, limits on the
+number of instructions and tail calls).
+
+API
+===
+
+BPF flow dissector programs operate on an ``__sk_buff``. However, only the
+limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
+``flow_keys`` is ``struct bpf_flow_keys`` and contains flow dissector input
+and output arguments.
+
+The inputs are:
+* ``nhoff`` - initial offset of the networking header
+* ``thoff`` - initial offset of the transport header, initialized to nhoff
+* ``n_proto`` - L3 protocol type, parsed out of L2 header
+
+Flow dissector BPF program should fill out the rest of the ``struct
+bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be
+also adjusted accordingly.
+
+The return code of the BPF program is either BPF_OK to indicate successful
+dissection, or BPF_DROP to indicate parsing error.
+
+__sk_buff->data
+===============
+
+In the VLAN-less case, this is what the initial state of the BPF flow
+dissector looks like::
+
+  +------+------+------------+-----------+
+  | DMAC | SMAC | ETHER_TYPE | L3_HEADER |
+  +------+------+------------+-----------+
+                              ^
+                              |
+                              +-- flow dissector starts here
+
+
+.. code:: c
+
+  skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
+  flow_keys->thoff = nhoff
+  flow_keys->n_proto = ETHER_TYPE
+
+In case of VLAN, flow dissector can be called with the two different states.
+
+Pre-VLAN parsing::
+
+  +------+------+------+-----+-----------+-----------+
+  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+  +------+------+------+-----+-----------+-----------+
+                        ^
+                        |
+                        +-- flow dissector starts here
+
+.. code:: c
+
+  skb->data + flow_keys->nhoff point the to first byte of TCI
+  flow_keys->thoff = nhoff
+  flow_keys->n_proto = TPID
+
+Please note that TPID can be 802.1AD and, hence, BPF program would
+have to parse VLAN information twice for double tagged packets.
+
+
+Post-VLAN parsing::
+
+  +------+------+------+-----+-----------+-----------+
+  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+  +------+------+------+-----+-----------+-----------+
+                                          ^
+                                          |
+                                          +-- flow dissector starts here
+
+.. code:: c
+
+  skb->data + flow_keys->nhoff point the to first byte of L3_HEADER
+  flow_keys->thoff = nhoff
+  flow_keys->n_proto = ETHER_TYPE
+
+In this case VLAN information has been processed before the flow dissector
+and BPF flow dissector is not required to handle it.
+
+
+The takeaway here is as follows: BPF flow dissector program can be called with
+the optional VLAN header and should gracefully handle both cases: when single
+or double VLAN is present and when it is not present. The same program
+can be called for both cases and would have to be written carefully to
+handle both cases.
+
+
+Reference Implementation
+========================
+
+See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
+implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
+for the loader. bpftool can be used to load BPF flow dissector program as well.
+
+The reference implementation is organized as follows:
+* ``jmp_table`` map that contains sub-programs for each supported L3 protocol
+* ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
+  does ``bpf_tail_call`` to the appropriate L3 handler
+
+Since BPF at this point doesn't support looping (or any jumping back),
+jmp_table is used instead to handle multiple levels of encapsulation (and
+IPv6 options).
+
+
+Current Limitations
+===================
+BPF flow dissector doesn't support exporting all the metadata that in-kernel
+C-based implementation can export. Notable example is single VLAN (802.1Q)
+and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
+for a set of information that's currently can be exported from the BPF context.
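
The three entry states described in the new document (VLAN-less, pre-VLAN, post-VLAN) can be handled by a single routine. The following is a plain user-space C sketch, not BPF and not the reference implementation: `struct dissect_state` and `skip_vlan_tags()` are hypothetical names standing in for the `nhoff`/`n_proto` inputs, and `n_proto` is assumed to be held in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100 /* single VLAN tag TPID */
#define ETH_P_8021AD 0x88A8 /* outer tag TPID for double tagging */
#define VLAN_TAG_LEN 4      /* TCI (2 bytes) + encapsulated EtherType (2 bytes) */

/* Hypothetical stand-in for the nhoff/n_proto inputs described above. */
struct dissect_state {
	uint16_t nhoff;   /* offset of the next header to parse */
	uint16_t n_proto; /* EtherType or TPID, host byte order (assumption) */
};

/*
 * Skip up to two VLAN tags. Returns 0 on success and -1 if the packet is
 * too short or more than two tags are stacked.
 */
static int skip_vlan_tags(const uint8_t *data, size_t len,
			  struct dissect_state *st)
{
	int tags;

	for (tags = 0; tags < 2; tags++) {
		if (st->n_proto != ETH_P_8021Q && st->n_proto != ETH_P_8021AD)
			return 0; /* post-VLAN or VLAN-less: nothing to skip */
		if ((size_t)st->nhoff + VLAN_TAG_LEN > len)
			return -1;
		/* nhoff points at TCI; the inner EtherType follows it. */
		st->n_proto = (uint16_t)(data[st->nhoff + 2] << 8 |
					 data[st->nhoff + 3]);
		st->nhoff += VLAN_TAG_LEN;
	}
	/* Still a VLAN TPID after two tags: treat as a parse error. */
	if (st->n_proto == ETH_P_8021Q || st->n_proto == ETH_P_8021AD)
		return -1;
	return 0;
}
```

The fixed two-iteration bound mirrors the constraint noted in the document that BPF at this point has no loops; a real BPF program would unroll the loop or use tail calls instead.
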

Documentation/networking/index.rst

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Contents:
    netdev-FAQ
    af_xdp
    batman-adv
+   bpf_flow_dissector
    can
    can_ucan_protocol
    device_drivers/freescale/dpaa2/index

MAINTAINERS

Lines changed: 2 additions & 2 deletions
@@ -5833,7 +5833,7 @@ L: netdev@vger.kernel.org
 S: Maintained
 F: Documentation/ABI/testing/sysfs-bus-mdio
 F: Documentation/devicetree/bindings/net/mdio*
-F: Documentation/networking/phy.txt
+F: Documentation/networking/phy.rst
 F: drivers/net/phy/
 F: drivers/of/of_mdio.c
 F: drivers/of/of_net.c
@@ -13981,7 +13981,7 @@ F: drivers/media/rc/serial_ir.c
 SFC NETWORK DRIVER
 M: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
 M: Edward Cree <ecree@solarflare.com>
-M: Bert Kenward <bkenward@solarflare.com>
+M: Martin Habets <mhabets@solarflare.com>
 L: netdev@vger.kernel.org
 S: Supported
 F: drivers/net/ethernet/sfc/

drivers/net/bonding/bond_sysfs_slave.c

Lines changed: 3 additions & 1 deletion
@@ -55,7 +55,9 @@ static SLAVE_ATTR_RO(link_failure_count);
 
 static ssize_t perm_hwaddr_show(struct slave *slave, char *buf)
 {
-	return sprintf(buf, "%pM\n", slave->perm_hwaddr);
+	return sprintf(buf, "%*phC\n",
+		       slave->dev->addr_len,
+		       slave->perm_hwaddr);
 }
 static SLAVE_ATTR_RO(perm_hwaddr);
 
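
In the kernel's printk formats, `%pM` prints a fixed six-byte MAC address, while `%*phC` prints a caller-supplied number of bytes as colon-separated hex. The hunk above switches to the latter so the output length follows `addr_len`. User-space `printf` has no such specifier, so the sketch below emulates the output shape with a hypothetical helper, `format_hwaddr()`; it is an illustration, not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format len bytes of addr as lowercase hex separated by ':' and followed
 * by a newline, mimicking the shape of "%*phC\n" output. Returns the number
 * of characters written (excluding the NUL), or -1 if out is too small.
 */
static int format_hwaddr(char *out, size_t out_sz,
			 const uint8_t *addr, size_t len)
{
	size_t pos = 0, i;

	for (i = 0; i < len; i++) {
		int n = snprintf(out + pos, out_sz - pos, "%s%02x",
				 i ? ":" : "", addr[i]);

		if (n < 0 || (size_t)n >= out_sz - pos)
			return -1;
		pos += (size_t)n;
	}
	if (out_sz - pos < 2)
		return -1;
	out[pos++] = '\n';
	out[pos] = '\0';
	return (int)pos;
}
```

For a six-byte address the result has the same shape as before (for example `00:11:22:aa:bb:cc`), while longer hardware addresses are printed in full instead of being cut off at six bytes.
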

drivers/net/dsa/mv88e6xxx/port.c

Lines changed: 16 additions & 8 deletions
@@ -427,18 +427,22 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 		return 0;
 
 	lane = mv88e6390x_serdes_get_lane(chip, port);
-	if (lane < 0)
+	if (lane < 0 && lane != -ENODEV)
 		return lane;
 
-	if (chip->ports[port].serdes_irq) {
-		err = mv88e6390_serdes_irq_disable(chip, port, lane);
+	if (lane >= 0) {
+		if (chip->ports[port].serdes_irq) {
+			err = mv88e6390_serdes_irq_disable(chip, port, lane);
+			if (err)
+				return err;
+		}
+
+		err = mv88e6390x_serdes_power(chip, port, false);
 		if (err)
 			return err;
 	}
 
-	err = mv88e6390x_serdes_power(chip, port, false);
-	if (err)
-		return err;
+	chip->ports[port].cmode = 0;
 
 	if (cmode) {
 		err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, &reg);
@@ -452,6 +456,12 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 		if (err)
 			return err;
 
+		chip->ports[port].cmode = cmode;
+
+		lane = mv88e6390x_serdes_get_lane(chip, port);
+		if (lane < 0)
+			return lane;
+
 		err = mv88e6390x_serdes_power(chip, port, true);
 		if (err)
 			return err;
@@ -463,8 +473,6 @@ int mv88e6390x_port_set_cmode(struct mv88e6xxx_chip *chip, int port,
 		}
 	}
 
-	chip->ports[port].cmode = cmode;
-
 	return 0;
 }
 

drivers/net/ethernet/cavium/thunder/nicvf_main.c

Lines changed: 12 additions & 8 deletions
@@ -1328,10 +1328,11 @@ int nicvf_stop(struct net_device *netdev)
 	struct nicvf_cq_poll *cq_poll = NULL;
 	union nic_mbx mbx = {};
 
-	cancel_delayed_work_sync(&nic->link_change_work);
-
 	/* wait till all queued set_rx_mode tasks completes */
-	drain_workqueue(nic->nicvf_rx_mode_wq);
+	if (nic->nicvf_rx_mode_wq) {
+		cancel_delayed_work_sync(&nic->link_change_work);
+		drain_workqueue(nic->nicvf_rx_mode_wq);
+	}
 
 	mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN;
 	nicvf_send_msg_to_pf(nic, &mbx);
@@ -1452,7 +1453,8 @@ int nicvf_open(struct net_device *netdev)
 	struct nicvf_cq_poll *cq_poll = NULL;
 
 	/* wait till all queued set_rx_mode tasks completes if any */
-	drain_workqueue(nic->nicvf_rx_mode_wq);
+	if (nic->nicvf_rx_mode_wq)
+		drain_workqueue(nic->nicvf_rx_mode_wq);
 
 	netif_carrier_off(netdev);
 
@@ -1550,10 +1552,12 @@ int nicvf_open(struct net_device *netdev)
 	/* Send VF config done msg to PF */
 	nicvf_send_cfg_done(nic);
 
-	INIT_DELAYED_WORK(&nic->link_change_work,
-			  nicvf_link_status_check_task);
-	queue_delayed_work(nic->nicvf_rx_mode_wq,
-			   &nic->link_change_work, 0);
+	if (nic->nicvf_rx_mode_wq) {
+		INIT_DELAYED_WORK(&nic->link_change_work,
+				  nicvf_link_status_check_task);
+		queue_delayed_work(nic->nicvf_rx_mode_wq,
+				   &nic->link_change_work, 0);
+	}
 
 	return 0;
 cleanup:

drivers/net/ethernet/cavium/thunder/nicvf_queues.c

Lines changed: 14 additions & 16 deletions
@@ -105,20 +105,19 @@ static inline struct pgcache *nicvf_alloc_page(struct nicvf *nic,
 	/* Check if page can be recycled */
 	if (page) {
 		ref_count = page_ref_count(page);
-		/* Check if this page has been used once i.e 'put_page'
-		 * called after packet transmission i.e internal ref_count
-		 * and page's ref_count are equal i.e page can be recycled.
+		/* This page can be recycled if internal ref_count and page's
+		 * ref_count are equal, indicating that the page has been used
+		 * once for packet transmission. For non-XDP mode, internal
+		 * ref_count is always '1'.
 		 */
-		if (rbdr->is_xdp && (ref_count == pgcache->ref_count))
-			pgcache->ref_count--;
-		else
-			page = NULL;
-
-		/* In non-XDP mode, page's ref_count needs to be '1' for it
-		 * to be recycled.
-		 */
-		if (!rbdr->is_xdp && (ref_count != 1))
+		if (rbdr->is_xdp) {
+			if (ref_count == pgcache->ref_count)
+				pgcache->ref_count--;
+			else
+				page = NULL;
+		} else if (ref_count != 1) {
 			page = NULL;
+		}
 	}
 
 	if (!page) {
@@ -365,11 +364,10 @@ static void nicvf_free_rbdr(struct nicvf *nic, struct rbdr *rbdr)
 	while (head < rbdr->pgcnt) {
 		pgcache = &rbdr->pgcache[head];
 		if (pgcache->page && page_ref_count(pgcache->page) != 0) {
-			if (!rbdr->is_xdp) {
-				put_page(pgcache->page);
-				continue;
+			if (rbdr->is_xdp) {
+				page_ref_sub(pgcache->page,
+					     pgcache->ref_count - 1);
 			}
-			page_ref_sub(pgcache->page, pgcache->ref_count - 1);
 			put_page(pgcache->page);
 		}
 		head++;
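
The restructured recycle check in the first hunk can be read as a small pure function. In the removed code, the `else` attached to the XDP test also fired in non-XDP mode, so `page` was set to NULL there regardless of its reference count. The sketch below uses a hypothetical helper, `can_recycle()`, with plain integers in place of the kernel's page reference counters, and models only the new logic.

```c
#include <stdbool.h>

/*
 * Returns true if the cached page may be reused. In XDP mode the page is
 * reusable only when its ref count equals the internal count, and reuse
 * consumes one internal reference. In non-XDP mode the page's ref count
 * must be exactly 1.
 */
static bool can_recycle(bool is_xdp, int page_ref, int *internal_ref)
{
	if (is_xdp) {
		if (page_ref != *internal_ref)
			return false;
		(*internal_ref)--;
		return true;
	}
	return page_ref == 1;
}
```
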

drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c

Lines changed: 8 additions & 1 deletion
@@ -354,7 +354,10 @@ static struct cxgbi_ppm_pool *ppm_alloc_cpu_pool(unsigned int *total,
 		ppmax = max;
 
 	/* pool size must be multiple of unsigned long */
-	bmap = BITS_TO_LONGS(ppmax);
+	bmap = ppmax / BITS_PER_TYPE(unsigned long);
+	if (!bmap)
+		return NULL;
+
 	ppmax = (bmap * sizeof(unsigned long)) << 3;
 
 	alloc_sz = sizeof(*pools) + sizeof(unsigned long) * bmap;
@@ -402,6 +405,10 @@ int cxgbi_ppm_init(void **ppm_pp, struct net_device *ndev,
 	if (reserve_factor) {
 		ppmax_pool = ppmax / reserve_factor;
 		pool = ppm_alloc_cpu_pool(&ppmax_pool, &pool_index_max);
+		if (!pool) {
+			ppmax_pool = 0;
+			reserve_factor = 0;
+		}
 
 		pr_debug("%s: ppmax %u, cpu total %u, per cpu %u.\n",
 			 ndev->name, ppmax, ppmax_pool, pool_index_max);
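
The first hunk replaces a round-up (`BITS_TO_LONGS()`) with a round-down, so the recomputed `ppmax` can no longer exceed the value passed in, and a pool smaller than one `unsigned long` is rejected. A minimal user-space sketch of the two calculations follows; `pool_bits_round_up()` and `pool_bits_round_down()` are hypothetical names, and `BITS_PER_ULONG` stands in for the kernel's `BITS_PER_TYPE(unsigned long)`.

```c
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Old behaviour: round the word count up, which may overstate the pool. */
static unsigned int pool_bits_round_up(unsigned int ppmax)
{
	unsigned long words = (ppmax + BITS_PER_ULONG - 1) / BITS_PER_ULONG;

	return (unsigned int)(words * BITS_PER_ULONG);
}

/* New behaviour: round down; a result of 0 means no pool can be created. */
static unsigned int pool_bits_round_down(unsigned int ppmax)
{
	unsigned long words = ppmax / BITS_PER_ULONG;

	return (unsigned int)(words * BITS_PER_ULONG);
}
```

With one bit more than a full word, the round-up variant reports two words' worth of entries while the round-down variant reports one, which is why the second hunk has to handle `ppm_alloc_cpu_pool()` returning NULL.
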

drivers/net/ethernet/hisilicon/hns/hnae.c

Lines changed: 3 additions & 1 deletion
@@ -150,7 +150,6 @@ static int hnae_alloc_buffers(struct hnae_ring *ring)
 /* free desc along with its attached buffer */
 static void hnae_free_desc(struct hnae_ring *ring)
 {
-	hnae_free_buffers(ring);
 	dma_unmap_single(ring_to_dev(ring), ring->desc_dma_addr,
 			 ring->desc_num * sizeof(ring->desc[0]),
 			 ring_to_dma_dir(ring));
@@ -183,6 +182,9 @@ static int hnae_alloc_desc(struct hnae_ring *ring)
 /* fini ring, also free the buffer for the ring */
 static void hnae_fini_ring(struct hnae_ring *ring)
 {
+	if (is_rx_ring(ring))
+		hnae_free_buffers(ring);
+
 	hnae_free_desc(ring);
 	kfree(ring->desc_cb);
 	ring->desc_cb = NULL;

drivers/net/ethernet/hisilicon/hns/hnae.h

Lines changed: 1 addition & 1 deletion
@@ -357,7 +357,7 @@ struct hnae_buf_ops {
 };
 
 struct hnae_queue {
-	void __iomem *io_base;
+	u8 __iomem *io_base;
 	phys_addr_t phy_base;
 	struct hnae_ae_dev *dev; /* the device who use this queue */
 	struct hnae_ring rx_ring ____cacheline_internodealigned_in_smp;

drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c

Lines changed: 1 addition & 1 deletion
@@ -370,7 +370,7 @@ int hns_mac_clr_multicast(struct hns_mac_cb *mac_cb, int vfn)
 static void hns_mac_param_get(struct mac_params *param,
 			      struct hns_mac_cb *mac_cb)
 {
-	param->vaddr = (void *)mac_cb->vaddr;
+	param->vaddr = mac_cb->vaddr;
 	param->mac_mode = hns_get_enet_interface(mac_cb);
 	ether_addr_copy(param->addr, mac_cb->addr_entry_idx[0].addr);
 	param->mac_id = mac_cb->mac_id;
