
Commit 79c0ef3

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Prevent index integer overflow in ptr_ring, from Jason Wang.

 2) Program mvpp2 multicast filter properly, from Mikulas Patocka.

 3) The bridge brport attribute file is write only and doesn't have a
    ->show() method, don't blindly invoke it. From Xin Long.

 4) Inverted mask used in genphy_setup_forced(), from Ingo van Lil.

 5) Fix multiple definition issue with if_ether.h UAPI header, from
    Hauke Mehrtens.

 6) Fix GFP_KERNEL usage in atomic in RDS protocol code, from Sowmini
    Varadhan.

 7) Revert XDP redirect support from thunderx driver, it is not
    implemented properly. From Jesper Dangaard Brouer.

 8) Fix missing RTNL protection across some tipc operations, from
    Ying Xue.

 9) Return the correct IV bytes in the TLS getsockopt code, from Boris
    Pismenny.

10) Take tclassid into consideration properly when doing FIB rule
    matching. From Stefano Brivio.

11) cxgb4 device needs more PCI VPD quirks, from Casey Leedom.

12) TUN driver doesn't align frags properly, and we can end up doing
    unaligned atomics on misaligned metadata. From Eric Dumazet.

13) Fix various crashes found using DEBUG_PREEMPT in rmnet driver, from
    Subash Abhinov Kasiviswanathan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
  tg3: APE heartbeat changes
  mlxsw: spectrum_router: Do not unconditionally clear route offload indication
  net: qualcomm: rmnet: Fix possible null dereference in command processing
  net: qualcomm: rmnet: Fix warning seen with 64 bit stats
  net: qualcomm: rmnet: Fix crash on real dev unregistration
  sctp: remove the left unnecessary check for chunk in sctp_renege_events
  rxrpc: Work around usercopy check
  tun: fix tun_napi_alloc_frags() frag allocator
  udplite: fix partial checksum initialization
  skbuff: Fix comment mis-spelling.
  dn_getsockopt
  decnet: move nf_{get/set}sockopt outside sock lock
  PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
  cxgb4: fix trailing zero in CIM LA dump
  cxgb4: free up resources of pf 0-3
  fib_semantics: Don't match route with mismatching tclassid
  NFC: llcp: Limit size of SDP URI
  tls: getsockopt return record sequence number
  tls: reset the crypto info if copy_from_user fails
  tls: retrun the correct IV in getsockopt
  docs: segmentation-offloads.txt: add SCTP info
  ...
2 parents 91ab883 + 506b0a3 commit 79c0ef3


52 files changed, 504 insertions(+), 395 deletions(-)

Documentation/networking/segmentation-offloads.txt

Lines changed: 34 additions & 4 deletions
@@ -13,6 +13,7 @@ The following technologies are described:
   * Generic Segmentation Offload - GSO
   * Generic Receive Offload - GRO
   * Partial Generic Segmentation Offload - GSO_PARTIAL
+  * SCTP accelleration with GSO - GSO_BY_FRAGS
 
 TCP Segmentation Offload
 ========================
@@ -49,6 +50,10 @@ datagram into multiple IPv4 fragments. Many of the requirements for UDP
 fragmentation offload are the same as TSO. However the IPv4 ID for
 fragments should not increment as a single IPv4 datagram is fragmented.
 
+UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
+still receive them from tuntap and similar devices. Offload of UDP-based
+tunnel protocols is still supported.
+
 IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
 ========================================================
 
@@ -83,10 +88,10 @@ SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types reflect the
 fact that the outer header also requests to have a non-zero checksum
 included in the outer header.
 
-Finally there is SKB_GSO_REMCSUM which indicates that a given tunnel header
-has requested a remote checksum offload. In this case the inner headers
-will be left with a partial checksum and only the outer header checksum
-will be computed.
+Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
+header has requested a remote checksum offload. In this case the inner
+headers will be left with a partial checksum and only the outer header
+checksum will be computed.
 
 Generic Segmentation Offload
 ============================
@@ -128,3 +133,28 @@ values for if the header was simply duplicated. The one exception to this
 is the outer IPv4 ID field. It is up to the device drivers to guarantee
 that the IPv4 ID field is incremented in the case that a given header does
 not have the DF bit set.
+
+SCTP accelleration with GSO
+===========================
+
+SCTP - despite the lack of hardware support - can still take advantage of
+GSO to pass one large packet through the network stack, rather than
+multiple small packets.
+
+This requires a different approach to other offloads, as SCTP packets
+cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
+IP segments, padding respected. So unlike regular GSO, SCTP can't just
+generate a big skb, set gso_size to the fragmentation point and deliver it
+to IP layer.
+
+Instead, the SCTP protocol layer builds an skb with the segments correctly
+padded and stored as chained skbs, and skb_segment() splits based on those.
+To signal this, gso_size is set to the special value GSO_BY_FRAGS.
+
+Therefore, any code in the core networking stack must be aware of the
+possibility that gso_size will be GSO_BY_FRAGS and handle that case
+appropriately. (For size checks, the skb_gso_validate_*_len family of
+helpers do this automatically.)
+
+This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
+set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.
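The GSO_BY_FRAGS caveat added by this hunk can be illustrated with a small userspace model. Everything below is an illustrative stand-in, not the kernel API: the struct, the helper name `effective_seg_len()`, and the use of 0xFFFF as the reserved sentinel are assumptions made to show the shape of the check the documentation asks core-stack code to perform.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's reserved gso_size marker used by SCTP GSO;
 * assumed here to be the all-ones u16 value for illustration. */
#define GSO_BY_FRAGS 0xFFFF

/* Minimal model of the two fields this check needs. */
struct skb_model {
	unsigned short gso_size;   /* segment payload size, or GSO_BY_FRAGS */
	unsigned int frag_len;     /* length of the current chained frag skb */
};

/* Hypothetical helper mirroring what skb_gso_validate_*_len-style code
 * must do: never treat GSO_BY_FRAGS as a real segment length; the size
 * comes from the chained frag skbs instead. */
static unsigned int effective_seg_len(const struct skb_model *skb)
{
	if (skb->gso_size == GSO_BY_FRAGS)
		return skb->frag_len;
	return skb->gso_size;
}
```

A size check that naively multiplied or compared `gso_size` would silently misbehave on SCTP GSO skbs, which is exactly why the new text routes such checks through the validating helpers.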

drivers/net/ethernet/broadcom/tg3.c

Lines changed: 24 additions & 11 deletions
@@ -820,7 +820,7 @@ static int tg3_ape_event_lock(struct tg3 *tp, u32 timeout_us)
 
 		tg3_ape_unlock(tp, TG3_APE_LOCK_MEM);
 
-		udelay(10);
+		usleep_range(10, 20);
 		timeout_us -= (timeout_us > 10) ? 10 : timeout_us;
 	}
 
@@ -922,8 +922,8 @@ static int tg3_ape_send_event(struct tg3 *tp, u32 event)
 	if (!(apedata & APE_FW_STATUS_READY))
 		return -EAGAIN;
 
-	/* Wait for up to 1 millisecond for APE to service previous event. */
-	err = tg3_ape_event_lock(tp, 1000);
+	/* Wait for up to 20 millisecond for APE to service previous event. */
+	err = tg3_ape_event_lock(tp, 20000);
 	if (err)
 		return err;
 
@@ -946,6 +946,7 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)
 
 	switch (kind) {
 	case RESET_KIND_INIT:
+		tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
 		tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG,
 				APE_HOST_SEG_SIG_MAGIC);
 		tg3_ape_write32(tp, TG3_APE_HOST_SEG_LEN,
@@ -962,13 +963,6 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)
 		event = APE_EVENT_STATUS_STATE_START;
 		break;
 	case RESET_KIND_SHUTDOWN:
-		/* With the interface we are currently using,
-		 * APE does not track driver state. Wiping
-		 * out the HOST SEGMENT SIGNATURE forces
-		 * the APE to assume OS absent status.
-		 */
-		tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG, 0x0);
-
 		if (device_may_wakeup(&tp->pdev->dev) &&
 		    tg3_flag(tp, WOL_ENABLE)) {
 			tg3_ape_write32(tp, TG3_APE_HOST_WOL_SPEED,
@@ -990,6 +984,18 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int kind)
 	tg3_ape_send_event(tp, event);
 }
 
+static void tg3_send_ape_heartbeat(struct tg3 *tp,
+				   unsigned long interval)
+{
+	/* Check if hb interval has exceeded */
+	if (!tg3_flag(tp, ENABLE_APE) ||
+	    time_before(jiffies, tp->ape_hb_jiffies + interval))
+		return;
+
+	tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
+	tp->ape_hb_jiffies = jiffies;
+}
+
 static void tg3_disable_ints(struct tg3 *tp)
 {
 	int i;
@@ -7262,6 +7268,7 @@ static int tg3_poll_msix(struct napi_struct *napi, int budget)
 		}
 	}
 
+	tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
 	return work_done;
 
 tx_recovery:
@@ -7344,6 +7351,7 @@ static int tg3_poll(struct napi_struct *napi, int budget)
 		}
 	}
 
+	tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
 	return work_done;
 
 tx_recovery:
@@ -10732,7 +10740,7 @@ static int tg3_reset_hw(struct tg3 *tp, bool reset_phy)
 	if (tg3_flag(tp, ENABLE_APE))
 		/* Write our heartbeat update interval to APE. */
 		tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_INT_MS,
-				APE_HOST_HEARTBEAT_INT_DISABLE);
+				APE_HOST_HEARTBEAT_INT_5SEC);
 
 	tg3_write_sig_post_reset(tp, RESET_KIND_INIT);
 
@@ -11077,6 +11085,9 @@ static void tg3_timer(struct timer_list *t)
 		tp->asf_counter = tp->asf_multiplier;
 	}
 
+	/* Update the APE heartbeat every 5 seconds.*/
+	tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL);
+
 	spin_unlock(&tp->lock);
 
 restart_timer:
@@ -16653,6 +16664,8 @@ static int tg3_get_invariants(struct tg3 *tp, const struct pci_device_id *ent)
 			       pci_state_reg);
 
 		tg3_ape_lock_init(tp);
+		tp->ape_hb_interval =
+			msecs_to_jiffies(APE_HOST_HEARTBEAT_INT_5SEC);
 	}
 
 	/* Set up tp->grc_local_ctrl before calling
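The new tg3_send_ape_heartbeat() boils down to a wrap-safe jiffies rate limit: skip the register write unless the interval has elapsed since the last one. A minimal userspace sketch of the same pattern follows; `jiffies_t`, the local `time_before` macro, and the `hb_state` struct are simplified stand-ins for the kernel facilities the patch uses, not the driver's actual types.

```c
#include <assert.h>

/* Stand-in for the kernel's jiffies counter type. */
typedef unsigned long jiffies_t;

/* Wrap-safe "a is earlier than b" comparison, mirroring the kernel's
 * time_before() trick of comparing via signed subtraction. */
#define time_before(a, b) ((long)((a) - (b)) < 0)

struct hb_state {
	unsigned int count;   /* models tp->ape_hb */
	jiffies_t last;       /* models tp->ape_hb_jiffies */
};

/* Returns 1 if a heartbeat was sent, 0 if throttled. */
static int send_heartbeat(struct hb_state *s, jiffies_t now, jiffies_t interval)
{
	if (time_before(now, s->last + interval))
		return 0;     /* interval not yet elapsed: skip */
	s->count++;           /* stands in for the HEARTBEAT_COUNT write */
	s->last = now;
	return 1;
}
```

Because the comparison is done on the signed difference, the throttle keeps working across jiffies wraparound, which is why the driver calls it from both the NAPI poll paths and the 5-second timer without extra bookkeeping.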

drivers/net/ethernet/broadcom/tg3.h

Lines changed: 5 additions & 0 deletions
@@ -2508,6 +2508,7 @@
 #define TG3_APE_LOCK_PHY3		5
 #define TG3_APE_LOCK_GPIO		7
 
+#define TG3_APE_HB_INTERVAL		(tp->ape_hb_interval)
 #define TG3_EEPROM_SB_F1R2_MBA_OFF	0x10
 
 
@@ -3423,6 +3424,10 @@ struct tg3 {
 	struct device			*hwmon_dev;
 	bool				link_up;
 	bool				pcierr_recovery;
+
+	u32				ape_hb;
+	unsigned long			ape_hb_interval;
+	unsigned long			ape_hb_jiffies;
 };
 
 /* Accessor macros for chip and asic attributes

drivers/net/ethernet/cavium/common/cavium_ptp.c

Lines changed: 2 additions & 0 deletions
@@ -75,6 +75,8 @@ EXPORT_SYMBOL(cavium_ptp_get);
 
 void cavium_ptp_put(struct cavium_ptp *ptp)
 {
+	if (!ptp)
+		return;
 	pci_dev_put(ptp->pdev);
 }
 EXPORT_SYMBOL(cavium_ptp_put);
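This two-line fix follows the same convention as kfree(NULL): a release helper should tolerate a NULL handle, so error paths can call it unconditionally without caller-side checks. A minimal model of the pattern; the struct and names below are illustrative, not the driver's.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for the driver's handle type. */
struct ptp_model {
	int refcount;
};

static void ptp_put(struct ptp_model *ptp)
{
	if (!ptp)
		return;        /* the added guard: NULL means "no device held" */
	ptp->refcount--;       /* stands in for pci_dev_put(ptp->pdev) */
}
```

Without the guard, a caller that obtained NULL from the lookup path and then ran its cleanup would dereference NULL, which is the crash the patch prevents.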

drivers/net/ethernet/cavium/thunder/nicvf_main.c

Lines changed: 26 additions & 84 deletions
@@ -67,11 +67,6 @@ module_param(cpi_alg, int, S_IRUGO);
 MODULE_PARM_DESC(cpi_alg,
 		 "PFC algorithm (0=none, 1=VLAN, 2=VLAN16, 3=IP Diffserv)");
 
-struct nicvf_xdp_tx {
-	u64 dma_addr;
-	u8 qidx;
-};
-
 static inline u8 nicvf_netdev_qidx(struct nicvf *nic, u8 qidx)
 {
 	if (nic->sqs_mode)
@@ -507,29 +502,14 @@ static int nicvf_init_resources(struct nicvf *nic)
 	return 0;
 }
 
-static void nicvf_unmap_page(struct nicvf *nic, struct page *page, u64 dma_addr)
-{
-	/* Check if it's a recycled page, if not unmap the DMA mapping.
-	 * Recycled page holds an extra reference.
-	 */
-	if (page_ref_count(page) == 1) {
-		dma_addr &= PAGE_MASK;
-		dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
-				     RCV_FRAG_LEN + XDP_HEADROOM,
-				     DMA_FROM_DEVICE,
-				     DMA_ATTR_SKIP_CPU_SYNC);
-	}
-}
-
 static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 				struct cqe_rx_t *cqe_rx, struct snd_queue *sq,
 				struct rcv_queue *rq, struct sk_buff **skb)
 {
 	struct xdp_buff xdp;
 	struct page *page;
-	struct nicvf_xdp_tx *xdp_tx = NULL;
 	u32 action;
-	u16 len, err, offset = 0;
+	u16 len, offset = 0;
 	u64 dma_addr, cpu_addr;
 	void *orig_data;
 
@@ -543,7 +523,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	cpu_addr = (u64)phys_to_virt(cpu_addr);
 	page = virt_to_page((void *)cpu_addr);
 
-	xdp.data_hard_start = page_address(page) + RCV_BUF_HEADROOM;
+	xdp.data_hard_start = page_address(page);
 	xdp.data = (void *)cpu_addr;
 	xdp_set_data_meta_invalid(&xdp);
 	xdp.data_end = xdp.data + len;
@@ -563,7 +543,18 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 
 	switch (action) {
 	case XDP_PASS:
-		nicvf_unmap_page(nic, page, dma_addr);
+		/* Check if it's a recycled page, if not
+		 * unmap the DMA mapping.
+		 *
+		 * Recycled page holds an extra reference.
+		 */
+		if (page_ref_count(page) == 1) {
+			dma_addr &= PAGE_MASK;
+			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
+					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
+					     DMA_FROM_DEVICE,
+					     DMA_ATTR_SKIP_CPU_SYNC);
+		}
 
 		/* Build SKB and pass on packet to network stack */
 		*skb = build_skb(xdp.data,
@@ -576,28 +567,25 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	case XDP_TX:
 		nicvf_xdp_sq_append_pkt(nic, sq, (u64)xdp.data, dma_addr, len);
 		return true;
-	case XDP_REDIRECT:
-		/* Save DMA address for use while transmitting */
-		xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
-		xdp_tx->dma_addr = dma_addr;
-		xdp_tx->qidx = nicvf_netdev_qidx(nic, cqe_rx->rq_idx);
-
-		err = xdp_do_redirect(nic->pnicvf->netdev, &xdp, prog);
-		if (!err)
-			return true;
-
-		/* Free the page on error */
-		nicvf_unmap_page(nic, page, dma_addr);
-		put_page(page);
-		break;
 	default:
 		bpf_warn_invalid_xdp_action(action);
 		/* fall through */
 	case XDP_ABORTED:
 		trace_xdp_exception(nic->netdev, prog, action);
 		/* fall through */
 	case XDP_DROP:
-		nicvf_unmap_page(nic, page, dma_addr);
+		/* Check if it's a recycled page, if not
+		 * unmap the DMA mapping.
+		 *
+		 * Recycled page holds an extra reference.
+		 */
+		if (page_ref_count(page) == 1) {
+			dma_addr &= PAGE_MASK;
+			dma_unmap_page_attrs(&nic->pdev->dev, dma_addr,
+					     RCV_FRAG_LEN + XDP_PACKET_HEADROOM,
+					     DMA_FROM_DEVICE,
+					     DMA_ATTR_SKIP_CPU_SYNC);
+		}
 		put_page(page);
 		return true;
 	}
@@ -1864,50 +1852,6 @@ static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
 	}
 }
 
-static int nicvf_xdp_xmit(struct net_device *netdev, struct xdp_buff *xdp)
-{
-	struct nicvf *nic = netdev_priv(netdev);
-	struct nicvf *snic = nic;
-	struct nicvf_xdp_tx *xdp_tx;
-	struct snd_queue *sq;
-	struct page *page;
-	int err, qidx;
-
-	if (!netif_running(netdev) || !nic->xdp_prog)
-		return -EINVAL;
-
-	page = virt_to_page(xdp->data);
-	xdp_tx = (struct nicvf_xdp_tx *)page_address(page);
-	qidx = xdp_tx->qidx;
-
-	if (xdp_tx->qidx >= nic->xdp_tx_queues)
-		return -EINVAL;
-
-	/* Get secondary Qset's info */
-	if (xdp_tx->qidx >= MAX_SND_QUEUES_PER_QS) {
-		qidx = xdp_tx->qidx / MAX_SND_QUEUES_PER_QS;
-		snic = (struct nicvf *)nic->snicvf[qidx - 1];
-		if (!snic)
-			return -EINVAL;
-		qidx = xdp_tx->qidx % MAX_SND_QUEUES_PER_QS;
-	}
-
-	sq = &snic->qs->sq[qidx];
-	err = nicvf_xdp_sq_append_pkt(snic, sq, (u64)xdp->data,
-				      xdp_tx->dma_addr,
-				      xdp->data_end - xdp->data);
-	if (err)
-		return -ENOMEM;
-
-	nicvf_xdp_sq_doorbell(snic, sq, qidx);
-	return 0;
-}
-
-static void nicvf_xdp_flush(struct net_device *dev)
-{
-	return;
-}
-
 static int nicvf_config_hwtstamp(struct net_device *netdev, struct ifreq *ifr)
 {
 	struct hwtstamp_config config;
@@ -1986,8 +1930,6 @@ static const struct net_device_ops nicvf_netdev_ops = {
 	.ndo_fix_features	= nicvf_fix_features,
 	.ndo_set_features	= nicvf_set_features,
 	.ndo_bpf		= nicvf_xdp,
-	.ndo_xdp_xmit		= nicvf_xdp_xmit,
-	.ndo_xdp_flush		= nicvf_xdp_flush,
 	.ndo_do_ioctl		= nicvf_ioctl,
 };
 
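Both inlined copies of the unmap logic in this revert hinge on one test: a page_ref_count() of exactly 1 means the page was not kept for recycling, so its DMA mapping must be released before the page is freed or handed to the stack. A small userspace model of that decision; the struct and names are hypothetical stand-ins for the kernel's page and DMA APIs.

```c
#include <assert.h>

/* Illustrative model: refcount and a flag for "DMA mapping still live". */
struct page_model {
	int refcount;
	int mapped;
};

static void maybe_unmap(struct page_model *page)
{
	/* A recycled page holds an extra reference, so only a refcount of
	 * exactly 1 means this is the last user and the mapping must go.
	 * The assignment stands in for dma_unmap_page_attrs(). */
	if (page->refcount == 1)
		page->mapped = 0;
}
```

Skipping the unmap on recycled pages is what lets the driver reuse receive buffers without remapping them on every packet.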
