Commit 42a2d92

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) The addition of nftables. No longer will we need protocol aware firewall filtering modules; it can all live in userspace.

    At the core of nftables is a, for lack of a better term, virtual machine that executes byte codes to inspect packet or metadata (arriving interface index, etc.) and make verdict decisions.

    Besides support for loading packet contents and comparing them, the interpreter supports lookups in various data structures as fundamental operations. For example, sets are supported, and therefore one could create a set of whitelist IP address entries which have ACCEPT verdicts attached to them, and use the appropriate byte codes to do such lookups.

    Since the interpreted code is composed in userspace, userspace can do things like optimize it before giving it to the kernel.

    Another major improvement is the capability of atomically updating portions of the ruleset. In the existing netfilter implementation, one has to update the entire rule set in order to make a change, and this is very expensive.

    Userspace tools exist to create nftables rules using existing netfilter rule sets, but both kernel implementations will need to co-exist for quite some time as we transition from the old to the new stuff.

    Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have worked so hard on this.

 2) Daniel Borkmann and Hannes Frederic Sowa made several improvements to our pseudo-random number generator, mostly used for things like UDP port randomization and netfilter, amongst other things.

    In particular the taus88 generator is updated to taus113, and test cases are added.

 3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet and Yang Yingliang.

 4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin Sujir.

 5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet, Neal Cardwell, and Yuchung Cheng.

 6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary control message data, much like other socket option attributes. From Francesco Fusco.

 7) Allow applications to specify a cap on the rate computed automatically by the kernel for pacing flows, via a new SO_MAX_PACING_RATE socket option. From Eric Dumazet.

 8) Make the initial autotuned send buffer sizing in TCP more closely reflect actual needs, from Eric Dumazet.

 9) Currently early socket demux only happens for TCP sockets, but we can do it for connected UDP sockets too. Implementation from Shawn Bohrer.

10) Refactor inet socket demux with the goal of improving hash demux performance for listening sockets. The main goals are being able to use RCU lookups even on request sockets, and eliminating the listening lock contention. From Eric Dumazet.

11) The bonding layer has many demuxes in its fast path, and an RCU conversion was started back in 3.11; several changes here extend the RCU usage to even more locations. From Ding Tianhong and Wang Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav Falico.

12) Allow stackability of segmentation offloads to, in particular, allow segmentation offloading over tunnels. From Eric Dumazet.

13) Significantly improve the handling of secret keys we input into the various hash functions in the inet hashtables, TCP fast open, as well as syncookies. From Hannes Frederic Sowa. The key fundamental operation is "net_get_random_once()" which uses static keys.

    Hannes even extended this to ipv4/ipv6 fragmentation handling and our generic flow dissector.

14) The generic driver layer takes care now to set the driver data to NULL on device removal, so it's no longer necessary for drivers to explicitly set it to NULL any more. Many drivers have been cleaned up in this way, from Jingoo Han.

15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.

16) Improve CRC32 interfaces and generic SKB checksum iterators so that SCTP's checksumming can more cleanly be handled. Also from Daniel Borkmann.

17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces using the interface MTU value. This helps avoid PMTU attacks, particularly on DNS servers. From Hannes Frederic Sowa.

18) Use generic XPS for transmit queue steering rather than internal (re-)implementation in virtio-net. From Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
  random32: add test cases for taus113 implementation
  random32: upgrade taus88 generator to taus113 from errata paper
  random32: move rnd_state to linux/random.h
  random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
  random32: add periodic reseeding
  random32: fix off-by-one in seeding requirement
  PHY: Add RTL8201CP phy_driver to realtek
  xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
  macmace: add missing platform_set_drvdata() in mace_probe()
  ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
  ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
  vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
  ixgbe: add warning when max_vfs is out of range.
  igb: Update link modes display in ethtool
  netfilter: push reasm skb through instead of original frag skbs
  ip6_output: fragment outgoing reassembled skb properly
  MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
  net_sched: tbf: support of 64bit rates
  ixgbe: deleting dfwd stations out of order can cause null ptr deref
  ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
  ...
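
The set-lookup idea from item 1 can be pictured with a toy interpreter. This is only an illustrative sketch: the opcode names (`load`, `lookup`, `verdict`), the `run` function and the packet dictionary are all hypothetical and do not correspond to the real nftables instruction set or its netlink API. It shows a program composed "in userspace" as data, with a set whose elements carry their own verdicts.

```python
ACCEPT, DROP, CONTINUE = "accept", "drop", "continue"


def run(program, packet):
    """Execute a toy rule program against a packet (a dict of fields).

    Hypothetical opcodes, not real nftables expressions:
      ("load", field)    -> copy a packet or metadata field into the register
      ("lookup", table)  -> if the register is a key in `table`, return its verdict
      ("verdict", value) -> return a fixed verdict
    """
    register = None
    for opcode, argument in program:
        if opcode == "load":
            register = packet[argument]
        elif opcode == "lookup":
            if register in argument:
                return argument[register]
        elif opcode == "verdict":
            return argument
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return CONTINUE


# A whitelist set whose entries have ACCEPT verdicts attached.
whitelist = {"192.0.2.1": ACCEPT, "192.0.2.2": ACCEPT}

# Look up the source address in the set; fall through to DROP otherwise.
program = [("load", "saddr"), ("lookup", whitelist), ("verdict", DROP)]
```

Calling `run(program, {"saddr": "192.0.2.1", "iif": 2})` walks the three instructions and stops at the lookup, because the address is in the set.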
2 parents 5cbb3d2 + 75ecab1 commit 42a2d92

1,331 files changed (+78932 -32379 lines changed)

Documentation/ABI/testing/sysfs-class-net-batman-adv

Lines changed: 2 additions & 2 deletions
@@ -1,13 +1,13 @@
 
 What: /sys/class/net/<iface>/batman-adv/iface_status
 Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Indicates the status of <iface> as it is seen by batman.
 
 What: /sys/class/net/<iface>/batman-adv/mesh_iface
 Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 The /sys/class/net/<iface>/batman-adv/mesh_iface file
 displays the batman mesh interface this <iface>

Documentation/ABI/testing/sysfs-class-net-mesh

Lines changed: 12 additions & 22 deletions
@@ -1,30 +1,31 @@
 
 What: /sys/class/net/<mesh_iface>/mesh/aggregated_ogms
 Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Indicates whether the batman protocol messages of the
 mesh <mesh_iface> shall be aggregated or not.
 
-What: /sys/class/net/<mesh_iface>/mesh/ap_isolation
+What: /sys/class/net/<mesh_iface>/mesh/<vlan_subdir>/ap_isolation
 Date: May 2011
-Contact: Antonio Quartulli <ordex@autistici.org>
+Contact: Antonio Quartulli <antonio@meshcoding.com>
 Description:
 Indicates whether the data traffic going from a
 wireless client to another wireless client will be
-silently dropped.
+silently dropped. <vlan_subdir> is empty when referring
+to the untagged lan.
 
 What: /sys/class/net/<mesh_iface>/mesh/bonding
 Date: June 2010
-Contact: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
+Contact: Simon Wunderlich <sw@simonwunderlich.de>
 Description:
 Indicates whether the data traffic going through the
 mesh will be sent using multiple interfaces at the
 same time (if available).
 
 What: /sys/class/net/<mesh_iface>/mesh/bridge_loop_avoidance
 Date: November 2011
-Contact: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
+Contact: Simon Wunderlich <sw@simonwunderlich.de>
 Description:
 Indicates whether the bridge loop avoidance feature
 is enabled. This feature detects and avoids loops
@@ -41,21 +42,21 @@ Description:
 
 What: /sys/class/net/<mesh_iface>/mesh/gw_bandwidth
 Date: October 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Defines the bandwidth which is propagated by this
 node if gw_mode was set to 'server'.
 
 What: /sys/class/net/<mesh_iface>/mesh/gw_mode
 Date: October 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Defines the state of the gateway features. Can be
 either 'off', 'client' or 'server'.
 
 What: /sys/class/net/<mesh_iface>/mesh/gw_sel_class
 Date: October 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Defines the selection criteria this node will use
 to choose a gateway if gw_mode was set to 'client'.
@@ -77,25 +78,14 @@ Description:
 
 What: /sys/class/net/<mesh_iface>/mesh/orig_interval
 Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Defines the interval in milliseconds in which batman
 sends its protocol messages.
 
 What: /sys/class/net/<mesh_iface>/mesh/routing_algo
 Date: Dec 2011
-Contact: Marek Lindner <lindner_marek@yahoo.de>
+Contact: Marek Lindner <mareklindner@neomailbox.ch>
 Description:
 Defines the routing procotol this mesh instance
 uses to find the optimal paths through the mesh.
-
-What: /sys/class/net/<mesh_iface>/mesh/vis_mode
-Date: May 2010
-Contact: Marek Lindner <lindner_marek@yahoo.de>
-Description:
-Each batman node only maintains information about its
-own local neighborhood, therefore generating graphs
-showing the topology of the entire mesh is not easily
-feasible without having a central instance to collect
-the local topologies from all nodes. This file allows
-to activate the collecting (server) mode.

Documentation/DocBook/80211.tmpl

Lines changed: 2 additions & 2 deletions
@@ -152,8 +152,8 @@
 !Finclude/net/cfg80211.h cfg80211_scan_request
 !Finclude/net/cfg80211.h cfg80211_scan_done
 !Finclude/net/cfg80211.h cfg80211_bss
-!Finclude/net/cfg80211.h cfg80211_inform_bss_frame
-!Finclude/net/cfg80211.h cfg80211_inform_bss
+!Finclude/net/cfg80211.h cfg80211_inform_bss_width_frame
+!Finclude/net/cfg80211.h cfg80211_inform_bss_width
 !Finclude/net/cfg80211.h cfg80211_unlink_bss
 !Finclude/net/cfg80211.h cfg80211_find_ie
 !Finclude/net/cfg80211.h ieee80211_bss_get_ie
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+TI CPSW Phy mode Selection Device Tree Bindings
+-----------------------------------------------
+
+Required properties:
+- compatible : Should be "ti,am3352-cpsw-phy-sel"
+- reg : physical base address and size of the cpsw
+ registers map
+- reg-names : names of the register map given in "reg" node
+
+Optional properties:
+-rmii-clock-ext : If present, the driver will configure the RMII
+ interface to external clock usage
+
+Examples:
+
+ phy_sel: cpsw-phy-sel@44e10650 {
+ 	compatible = "ti,am3352-cpsw-phy-sel";
+ 	reg= <0x44e10650 0x4>;
+ 	reg-names = "gmii-sel";
+ };
+
+(or)
+ phy_sel: cpsw-phy-sel@44e10650 {
+ 	compatible = "ti,am3352-cpsw-phy-sel";
+ 	reg= <0x44e10650 0x4>;
+ 	reg-names = "gmii-sel";
+ 	rmii-clock-ext;
+ };

Documentation/networking/batman-adv.txt

Lines changed: 4 additions & 50 deletions
@@ -69,16 +69,15 @@ folder:
 # aggregated_ogms gw_bandwidth log_level
 # ap_isolation gw_mode orig_interval
 # bonding gw_sel_class routing_algo
-# bridge_loop_avoidance hop_penalty vis_mode
-# fragmentation
+# bridge_loop_avoidance hop_penalty fragmentation
 
 
 There is a special folder for debugging information:
 
 # ls /sys/kernel/debug/batman_adv/bat0/
 # bla_backbone_table log transtable_global
 # bla_claim_table originators transtable_local
-# gateways socket vis_data
+# gateways socket
 
 Some of the files contain all sort of status information regard-
 ing the mesh network. For example, you can view the table of
@@ -127,51 +126,6 @@ ously assigned to interfaces now used by batman advanced, e.g.
 # ifconfig eth0 0.0.0.0
 
 
-VISUALIZATION
--------------
-
-If you want topology visualization, at least one mesh node must
-be configured as VIS-server:
-
-# echo "server" > /sys/class/net/bat0/mesh/vis_mode
-
-Each node is either configured as "server" or as "client" (de-
-fault: "client"). Clients send their topology data to the server
-next to them, and server synchronize with other servers. If there
-is no server configured (default) within the mesh, no topology
-information will be transmitted. With these "synchronizing
-servers", there can be 1 or more vis servers sharing the same (or
-at least very similar) data.
-
-When configured as server, you can get a topology snapshot of
-your mesh:
-
-# cat /sys/kernel/debug/batman_adv/bat0/vis_data
-
-This raw output is intended to be easily parsable and convertable
-with other tools. Have a look at the batctl README if you want a
-vis output in dot or json format for instance and how those out-
-puts could then be visualised in an image.
-
-The raw format consists of comma separated values per entry where
-each entry is giving information about a certain source inter-
-face. Each entry can/has to have the following values:
--> "mac" - mac address of an originator's source interface
-(each line begins with it)
--> "TQ mac value" - src mac's link quality towards mac address
-of a neighbor originator's interface which
-is being used for routing
--> "TT mac" - TT announced by source mac
--> "PRIMARY" - this is a primary interface
--> "SEC mac" - secondary mac address of source
-(requires preceding PRIMARY)
-
-The TQ value has a range from 4 to 255 with 255 being the best.
-The TT entries are showing which hosts are connected to the mesh
-via bat0 or being bridged into the mesh network. The PRIMARY/SEC
-values are only applied on primary interfaces
-
-
 LOGGING/DEBUGGING
 -----------------
 
@@ -245,5 +199,5 @@ Mailing-list: b.a.t.m.a.n@open-mesh.org (optional subscription
 
 You can also contact the Authors:
 
-Marek Lindner <lindner_marek@yahoo.de>
-Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
+Marek Lindner <mareklindner@neomailbox.ch>
+Simon Wunderlich <sw@simonwunderlich.de>

Documentation/networking/bonding.txt

Lines changed: 45 additions & 30 deletions
@@ -639,6 +639,15 @@ num_unsol_na
 are generated by the ipv4 and ipv6 code and the numbers of
 repetitions cannot be set independently.
 
+packets_per_slave
+
+Specify the number of packets to transmit through a slave before
+moving to the next one. When set to 0 then a slave is chosen at
+random.
+
+The valid range is 0 - 65535; the default value is 1. This option
+has effect only in balance-rr mode.
+
 primary
 
 A string (eth0, eth2, etc) specifying which slave is the
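
The packets_per_slave text added in the hunk above can be modelled in a few lines. This is a minimal sketch of the documented behaviour, not the bonding driver's code: the class name `RoundRobinPicker`, its `tx_counter` field and the use of slave indices instead of real interfaces are assumptions made for illustration.

```python
import random


class RoundRobinPicker:
    """Toy model of balance-rr slave selection with packets_per_slave."""

    def __init__(self, slave_count, packets_per_slave=1, rng=None):
        if slave_count < 1:
            raise ValueError("need at least one slave")
        # The documented valid range is 0 - 65535, with a default of 1.
        if not 0 <= packets_per_slave <= 65535:
            raise ValueError("packets_per_slave must be in 0..65535")
        self.slave_count = slave_count
        self.packets_per_slave = packets_per_slave
        self.tx_counter = 0
        self.rng = rng or random.Random()

    def next_slave(self):
        """Return the index of the slave that carries the next packet."""
        if self.packets_per_slave == 0:
            # 0 means: choose a slave at random for every packet.
            return self.rng.randrange(self.slave_count)
        # Stay on one slave for packets_per_slave packets, then move on.
        index = (self.tx_counter // self.packets_per_slave) % self.slave_count
        self.tx_counter += 1
        return index
```

With three slaves and `packets_per_slave=2`, successive calls to `next_slave()` stay on each index for two packets before advancing and wrapping around.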
@@ -743,21 +752,16 @@ xmit_hash_policy
 protocol information to generate the hash.
 
 Uses XOR of hardware MAC addresses and IP addresses to
-generate the hash. The IPv4 formula is
-
-(((source IP XOR dest IP) AND 0xffff) XOR
-( source MAC XOR destination MAC ))
-modulo slave count
-
-The IPv6 formula is
+generate the hash. The formula is
 
-hash = (source ip quad 2 XOR dest IP quad 2) XOR
-(source ip quad 3 XOR dest IP quad 3) XOR
-(source ip quad 4 XOR dest IP quad 4)
+hash = source MAC XOR destination MAC
+hash = hash XOR source IP XOR destination IP
+hash = hash XOR (hash RSHIFT 16)
+hash = hash XOR (hash RSHIFT 8)
+And then hash is reduced modulo slave count.
 
-(((hash >> 24) XOR (hash >> 16) XOR (hash >> 8) XOR hash)
-XOR (source MAC XOR destination MAC))
-modulo slave count
+If the protocol is IPv6 then the source and destination
+addresses are first hashed using ipv6_addr_hash.
 
 This algorithm will place all traffic to a particular
 network peer on the same slave. For non-IP traffic,
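
The new layer2+3 formula in the hunk above translates almost line for line into code. The sketch below is an illustration under stated assumptions, not the driver's implementation: the function name `layer23_hash` is hypothetical, only IPv4 is handled (the IPv6 pre-hash via ipv6_addr_hash is left out), and since the text does not say how a 48-bit MAC is reduced, the sketch simply uses the last octet of each address.

```python
from ipaddress import IPv4Address


def layer23_hash(src_mac, dst_mac, src_ip, dst_ip, slave_count):
    """Pick a slave index from MAC and IPv4 addresses (layer2+3 sketch).

    src_mac and dst_mac are 6-byte `bytes` objects; src_ip and dst_ip are
    dotted-quad strings.
    """
    # hash = source MAC XOR destination MAC
    # Assumption: only the last octet of each MAC takes part.
    value = src_mac[-1] ^ dst_mac[-1]
    # hash = hash XOR source IP XOR destination IP
    value ^= int(IPv4Address(src_ip)) ^ int(IPv4Address(dst_ip))
    # hash = hash XOR (hash RSHIFT 16)
    value ^= value >> 16
    # hash = hash XOR (hash RSHIFT 8)
    value ^= value >> 8
    # And then hash is reduced modulo slave count.
    return value % slave_count


mac_a = bytes.fromhex("020000000001")
mac_b = bytes.fromhex("020000000002")
slave = layer23_hash(mac_a, mac_b, "10.0.0.1", "10.0.1.7", slave_count=3)
```

Because every step is an XOR of source and destination fields, swapping the two endpoints yields the same slave index, which matches the statement that all traffic to a particular network peer lands on the same slave.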
@@ -779,32 +783,23 @@ xmit_hash_policy
 slaves, although a single connection will not span
 multiple slaves.
 
-The formula for unfragmented IPv4 TCP and UDP packets is
+The formula for unfragmented TCP and UDP packets is
 
-((source port XOR dest port) XOR
-((source IP XOR dest IP) AND 0xffff)
-modulo slave count
+hash = source port, destination port (as in the header)
+hash = hash XOR source IP XOR destination IP
+hash = hash XOR (hash RSHIFT 16)
+hash = hash XOR (hash RSHIFT 8)
+And then hash is reduced modulo slave count.
 
-The formula for unfragmented IPv6 TCP and UDP packets is
-
-hash = (source port XOR dest port) XOR
-((source ip quad 2 XOR dest IP quad 2) XOR
-(source ip quad 3 XOR dest IP quad 3) XOR
-(source ip quad 4 XOR dest IP quad 4))
-
-((hash >> 24) XOR (hash >> 16) XOR (hash >> 8) XOR hash)
-modulo slave count
+If the protocol is IPv6 then the source and destination
+addresses are first hashed using ipv6_addr_hash.
 
 For fragmented TCP or UDP packets and all other IPv4 and
 IPv6 protocol traffic, the source and destination port
 information is omitted. For non-IP traffic, the
 formula is the same as for the layer2 transmit hash
 policy.
 
-The IPv4 policy is intended to mimic the behavior of
-certain switches, notably Cisco switches with PFC2 as
-well as some Foundry and IBM products.
-
 This algorithm is not fully 802.3ad compliant. A
 single TCP or UDP conversation containing both
 fragmented and unfragmented packets will see packets
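
The layer3+4 formula in the hunk above differs from layer2+3 only in its starting value. The following sketch makes that concrete under explicit assumptions: `layer34_hash` is a hypothetical name, only IPv4 is covered, the "source port, destination port (as in the header)" word is modelled as the big-endian integer `(source_port << 16) | destination_port`, and a missing `ports` argument stands in for fragmented packets or non-TCP/UDP traffic, where the text says port information is omitted.

```python
from ipaddress import IPv4Address


def layer34_hash(src_ip, dst_ip, slave_count, ports=None):
    """Pick a slave index from IPv4 addresses and optional ports.

    ports is a (source_port, destination_port) tuple, or None when the
    port information is omitted (fragments, other IP protocols).
    """
    if ports is None:
        value = 0
    else:
        source_port, destination_port = ports
        # Assumption: the two 16-bit ports are read as one 32-bit word.
        value = (source_port << 16) | destination_port
    # hash = hash XOR source IP XOR destination IP
    value ^= int(IPv4Address(src_ip)) ^ int(IPv4Address(dst_ip))
    # hash = hash XOR (hash RSHIFT 16), then hash XOR (hash RSHIFT 8)
    value ^= value >> 16
    value ^= value >> 8
    # And then hash is reduced modulo slave count.
    return value % slave_count


with_ports = layer34_hash("10.0.0.1", "10.0.1.7", 4, ports=(1234, 80))
fragment = layer34_hash("10.0.0.1", "10.0.1.7", 4)
```

In this example the two calls return different indices, which illustrates the noncompliance the text describes: fragmented and unfragmented packets of one conversation can be striped across different slaves.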
@@ -815,6 +810,26 @@ xmit_hash_policy
 conversations. Other implementations of 802.3ad may
 or may not tolerate this noncompliance.
 
+encap2+3
+
+This policy uses the same formula as layer2+3 but it
+relies on skb_flow_dissect to obtain the header fields
+which might result in the use of inner headers if an
+encapsulation protocol is used. For example this will
+improve the performance for tunnel users because the
+packets will be distributed according to the encapsulated
+flows.
+
+encap3+4
+
+This policy uses the same formula as layer3+4 but it
+relies on skb_flow_dissect to obtain the header fields
+which might result in the use of inner headers if an
+encapsulation protocol is used. For example this will
+improve the performance for tunnel users because the
+packets will be distributed according to the encapsulated
+flows.
+
 The default value is layer2. This option was added in bonding
 version 2.6.3. In earlier versions of bonding, this parameter
 does not exist, and the layer2 policy is the only policy. The
