Commit 3976001

Merge branch 'ipv6-Improve-user-experience-with-multipath-routes'
David Ahern says:

====================
net: ipv6: Improve user experience with multipath routes

This series closes a couple of gaps between IPv4 and IPv6 with respect
to multipath routes:

1. IPv4 allows all nexthops of multipath routes to be deleted using just
   the prefix and length; IPv6 only deletes the first nexthop for the
   route if only the prefix and length are given.

2. IPv4 returns multipath routes encoded in the RTA_MULTIPATH attribute.
   IPv6 returns a series of routes with the same prefix and length - one
   for each nexthop. This happens for both dumps and notifications.

   IPv6 does accept RTA_MULTIPATH encoded routes, but installs them as a
   series of routes.

Patch 1 addresses the first item by allowing IPv6 multipath routes to be
deleted using just the prefix and length.

Patch 2 addresses the second item by allowing IPv6 multipath routes to
be returned encoded in the RTA_MULTIPATH attribute.

Patches 3 and 4 update the RTM_{NEW,DEL}ROUTE notifications to generate
1 notification with RTA_MULTIPATH where applicable.

Patch 5 prints IPv6 addresses in compressed format when showing route
replace errors. This was noticed while testing REPLACE failures.

The end result for multipath routes:

1. Dump - RTA_MULTIPATH used for multipath routes

    $ ip -6 ro ls vrf red
    2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium
    2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium
    2001:db8:200::/120 metric 1024
        nexthop via 2001:db8:1::2 dev eth1 weight 1
        nexthop via 2001:db8:2::2 dev eth2 weight 1
    ...

2. Route Add - one notification with RTA_MULTIPATH attribute

    $ ip -6 ro add vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::2 nexthop via 2001:db8:2::2

    $ ip mon route
    2001:db8:200::/120 table red metric 1024
        nexthop via 2001:db8:1::2 dev eth1 weight 1
        nexthop via 2001:db8:2::2 dev eth2 weight 1

3. Route Replace - one notification with RTA_MULTIPATH attribute

    $ ip -6 ro replace vrf red 2001:db8:200::/120 nexthop via 2001:db8:1::16 nexthop via 2001:db8:2::16

    $ ip mon route
    Replaced 2001:db8:200::/120 table red metric 1024
        nexthop via 2001:db8:1::16 dev eth1 weight 1
        nexthop via 2001:db8:2::16 dev eth2 weight 1

   - on a failure after the insertion of the first nexthop (which means
     the original route has been replaced in the FIB), a notification is
     sent with the successful nexthops and then the nexthops are deleted
     with one notification per hop. This is consistent with how it works
     today except the successful additions are coalesced into 1
     notification.

4. Route Delete
   - delete of entire multipath route using prefix/length: only 1
     notification is generated:

    $ ip -6 ro del vrf red 2001:db8:200::/120

    $ ip mon route
    Deleted 2001:db8:200::/120 table red metric 1024
        nexthop via 2001:db8:1::16 dev eth1 weight 1
        nexthop via 2001:db8:2::16 dev eth2 weight 1

   - if a delete request contains nexthops, one notification is
     generated per nexthop deleted. This is unavoidable since IPv6
     allows a single nexthop to be deleted within a multipath route.

5. Route Appends
   - IPv6 allows nexthops to be appended to an existing route. In this
     case one notification is sent for the new route with the append
     flag set.

    $ ip -6 ro append vrf red 2001:db8:200::/120 nexthop via 2001:db8:2::20 nexthop via 2001:db8:1::20

    $ ip mon route
    Append 2001:db8:200::/120 table red metric 1024
        nexthop via 2001:db8:1::2 dev eth1 weight 1
        nexthop via 2001:db8:2::2 dev eth2 weight 1
        nexthop via 2001:db8:2::20 dev eth2 weight 1
        nexthop via 2001:db8:1::20 dev eth1 weight 1

   - on failure of an append, a notification is sent with the route
     containing all of the nexthops successfully added, and it is
     followed by delete notifications as the hops are removed, returning
     the route to its prior state. This is consistent with how it works
     today except the successful additions are coalesced into 1
     notification.

Addresses some of the inconsistencies also noted by Roopa at netdev0.1:
https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf

v4
- changed series to do encoding in 1 patch and updating notifications
  in separate patches to make it easier to review and understand
- 1 notification for delete when using prefix/length; 1 notification
  for append
- handle delete of a single nexthop without RTA_MULTIPATH in delete
  request
- updated commit messages and cover letter

v3
- removed the need for a user API to opt-in to change. Requiring an API
  just shifts the difference from same API with different behavior to
  different API to achieve equivalent behavior
- route notifications changed to use RTA_MULTIPATH for add and replace
- updated commit messages and cover letter

v2
- fixed locking in patch 1 as noted by DaveM
- changed user API for patch 2 to require an rtmsg with
  RTM_F_ALL_NEXTHOPS set in rtm_flags
- revamped explanation of patch 2 and cover letter
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 parents 4d6308a + 7d4d506 commit 3976001
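
The two delete cases described in the cover letter can be modeled in a few lines of user-space C. This is only a toy sketch: `struct toy_route` and `toy_route_del()` are hypothetical names invented for illustration and are not kernel APIs. The return value stands for the number of notifications the cover letter says a listener would see for one delete request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NH 8

/* Toy multipath route: one prefix with up to MAX_NH nexthop gateways.
 * Hypothetical type, not a kernel structure.
 */
struct toy_route {
	const char *nh[MAX_NH];
	size_t nr_nh;
};

/* Apply one delete request and return the number of notifications.
 *  - gw == NULL: the request carries only prefix/length, so every
 *    nexthop is removed and one notification describes the whole route.
 *  - gw != NULL: only the matching nexthop is removed, which yields one
 *    notification for that nexthop.
 */
int toy_route_del(struct toy_route *rt, const char *gw)
{
	size_t i;

	assert(rt != NULL);
	if (rt->nr_nh == 0)
		return 0;		/* nothing installed */

	if (gw == NULL) {
		rt->nr_nh = 0;
		return 1;
	}

	for (i = 0; i < rt->nr_nh; i++) {
		if (strcmp(rt->nh[i], gw) == 0) {
			/* Close the gap left by the removed nexthop. */
			memmove(&rt->nh[i], &rt->nh[i + 1],
				(rt->nr_nh - i - 1) * sizeof(rt->nh[0]));
			rt->nr_nh--;
			return 1;
		}
	}
	return 0;			/* no such nexthop */
}
```

Calling `toy_route_del(&rt, NULL)` on a two-nexthop route empties it with a single notification, while passing a gateway string removes just that nexthop and leaves the rest of the route in place.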

4 files changed: +227 -25 lines changed


include/net/ip6_fib.h

Lines changed: 3 additions & 1 deletion
@@ -37,7 +37,9 @@ struct fib6_config {
 	int		fc_ifindex;
 	u32		fc_flags;
 	u32		fc_protocol;
-	u32		fc_type;	/* only 8 bits are used */
+	u16		fc_type;	/* only 8 bits are used */
+	u16		fc_delete_all_nh : 1,
+			__unused : 15;
 
 	struct in6_addr	fc_dst;
 	struct in6_addr	fc_src;

include/net/netlink.h

Lines changed: 1 addition & 0 deletions
@@ -229,6 +229,7 @@ struct nl_info {
 	struct nlmsghdr	*nlh;
 	struct net	*nl_net;
 	u32		portid;
+	bool		skip_notify;
 };
 
 int netlink_rcv_skb(struct sk_buff *skb,

net/ipv6/ip6_fib.c

Lines changed: 16 additions & 3 deletions
@@ -318,6 +318,16 @@ static int fib6_dump_node(struct fib6_walker *w)
 			w->leaf = rt;
 			return 1;
 		}
+
+		/* Multipath routes are dumped in one route with the
+		 * RTA_MULTIPATH attribute. Jump 'rt' to point to the
+		 * last sibling of this route (no need to dump the
+		 * sibling routes again)
+		 */
+		if (rt->rt6i_nsiblings)
+			rt = list_last_entry(&rt->rt6i_siblings,
+					     struct rt6_info,
+					     rt6i_siblings);
 	}
 	w->leaf = NULL;
 	return 0;
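
The sibling skip in this hunk can be illustrated with a simplified user-space model. The sketch below replaces the kernel's linked lists with a flat array and assumes, for illustration only, that the siblings of a multipath route sit next to each other and that each entry records how many other siblings it has. `struct toy_rt` and `toy_dump_count()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Toy route entry: nsiblings is the number of *other* nexthops that
 * belong to the same multipath route (0 for a single-path route).
 */
struct toy_rt {
	int prefix_id;
	unsigned int nsiblings;
};

/* Count the route messages a dump would emit over leaf[0..n).
 * After emitting one message for a route, jump to its last sibling so
 * the loop increment moves on to the next distinct route.
 */
size_t toy_dump_count(const struct toy_rt *leaf, size_t n)
{
	size_t i, msgs = 0;

	assert(leaf != NULL || n == 0);
	for (i = 0; i < n; i++) {
		msgs++;				/* one message per route */
		i += leaf[i].nsiblings;		/* skip remaining siblings */
	}
	return msgs;
}
```

With two single-path routes followed by one two-nexthop route, the model emits three messages instead of four, which mirrors the three entries in the `ip -6 ro ls vrf red` output quoted in the cover letter.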
@@ -871,7 +881,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 		*ins = rt;
 		rt->rt6i_node = fn;
 		atomic_inc(&rt->rt6i_ref);
-		inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
+		if (!info->skip_notify)
+			inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
 		info->nl_net->ipv6.rt6_stats->fib_rt_entries++;
 
 		if (!(fn->fn_flags & RTN_RTINFO)) {
@@ -897,7 +908,8 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 		rt->rt6i_node = fn;
 		rt->dst.rt6_next = iter->dst.rt6_next;
 		atomic_inc(&rt->rt6i_ref);
-		inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
+		if (!info->skip_notify)
+			inet6_rt_notify(RTM_NEWROUTE, rt, info, NLM_F_REPLACE);
 		if (!(fn->fn_flags & RTN_RTINFO)) {
 			info->nl_net->ipv6.rt6_stats->fib_route_nodes++;
 			fn->fn_flags |= RTN_RTINFO;
@@ -1442,7 +1454,8 @@ static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp,
 
 	fib6_purge_rt(rt, fn, net);
 
-	inet6_rt_notify(RTM_DELROUTE, rt, info, 0);
+	if (!info->skip_notify)
+		inet6_rt_notify(RTM_DELROUTE, rt, info, 0);
 	rt6_release(rt);
 }
 
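
The `skip_notify` checks only make sense together with a caller that sets the flag, and that caller is not part of the hunks shown here. The following user-space sketch shows one way such a flag can coalesce per-nexthop notifications into one. Every name in it (`toy_info`, `toy_notify`, `toy_add_one`, `toy_add_multipath`) is hypothetical, and the control flow is an assumption based on the cover letter, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct nl_info: the flag plus a sent counter. */
struct toy_info {
	bool skip_notify;
	unsigned int sent;
};

/* Pretend to send one route notification. */
void toy_notify(struct toy_info *info)
{
	info->sent++;
}

/* Low-level insert of a single nexthop: notifies unless told to skip,
 * mirroring the guarded inet6_rt_notify() calls in the diff.
 */
void toy_add_one(struct toy_info *info)
{
	if (!info->skip_notify)
		toy_notify(info);
}

/* Insert nr_nh nexthops while suppressing per-nexthop notifications,
 * then send a single notification for the whole route.
 */
void toy_add_multipath(struct toy_info *info, size_t nr_nh)
{
	bool saved;
	size_t i;

	assert(info != NULL);
	saved = info->skip_notify;
	info->skip_notify = true;
	for (i = 0; i < nr_nh; i++)
		toy_add_one(info);
	info->skip_notify = saved;	/* restore the caller's setting */

	if (nr_nh > 0 && !saved)
		toy_notify(info);	/* one message for the full route */
}
```

Saving and restoring the flag keeps the helper safe to call from a context that has already asked for notifications to be suppressed.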