vPC - Virtual Port Channel 18-Aug


vPC (Virtual port channel)

https://www.cisco.com/c/dam/en/us/td/docs/switches/datacenter/sw/design/vpc_design/vpc_best_practices_design_guide.pdf
Benefits of vPC:

• A virtual port channel (vPC) allows links that are physically connected to two different Cisco
Nexus 7000 Series devices to appear as a single port channel to a third device. The third device
can be a switch, server, or any other networking device that supports link aggregation
technology.

• vPC belongs to the Multi-Chassis EtherChannel (MCEC) family, like VSS.

• Uses all available uplink bandwidth.

• No more spanning tree blocked ports.

• Provides fast convergence upon link or device failure.

• Simplifies the network design.

• Unlike VSS and stacking, in vPC each peer device in the vPC domain runs its own control plane.

• If there is an issue in the control plane of one peer, it does not impact the other peer.

• In the figure, we have two switches, SW1 and SW2, and an L3 device ABC. Suppose we are running
a Layer 3 protocol such as OSPF between the switches and ABC. If you log in to ABC and check its
OSPF neighbors, with stacking or VSS you will see only one neighbor from the perspective of ABC,
because both switches share a single control plane. With vPC you will see two OSPF neighbors
from the perspective of ABC, because each switch has its own control plane. That is why a
problem in the control plane of one device is not propagated to the other vPC device.

• vPC has been supported since NX-OS 4.1.3 (i.e., since the introduction of the Nexus 7k
platform in 2009).

• The vPC feature is included in the base NX-OS software license, so there is no need to
purchase an additional license for vPC.
• vPC is a Layer 2 technology; we can’t have an L3 port-channel in a vPC.

• vPC peers must be identical (same model, modules, line cards, etc.).

• Exceptions:

• N7000 and N7700 in same vPC domain: OK

• N5500 and N5600 in same vPC domain: NOK

• vPC peers can have mixed SUPs (supervisor engines):

• N9k SUP-A and SUP-B

• N7k SUP1, SUP2 or SUP2E

vPC Primary and Secondary switch selection


The selection order:

1. vPC master sticky bit, set to 0 or 1. (The vPC master sticky bit is a programmed protection
mechanism, introduced to avoid unnecessary role changes.)

2. User-defined vPC role priority (lower is better)

3. System MAC address value (lower is better)
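
The selection order above can be read as a three-level comparison. The following is a minimal Python sketch of that reading, not NX-OS code: the `Peer` class and `elect_primary` function are hypothetical names, and it assumes that a peer whose sticky bit is set (1) is preferred over one whose bit is 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    sticky_bit: int      # 1 = sticky bit set, 0 = not set
    role_priority: int   # user-defined vPC role priority, lower is better
    system_mac: str      # e.g. "0023.04ee.be01", lower is better


def election_key(peer: Peer):
    # Strip separators so the MAC can be compared as a number.
    mac_value = int(peer.system_mac.replace(".", "").replace(":", ""), 16)
    # Assumption: a set sticky bit wins, so it sorts first via the negation.
    return (-peer.sticky_bit, peer.role_priority, mac_value)


def elect_primary(a: Peer, b: Peer) -> Peer:
    # Compare sticky bit, then role priority, then system MAC.
    return min(a, b, key=election_key)


sw1 = Peer("SW1", sticky_bit=0, role_priority=100, system_mac="0023.04ee.be02")
sw2 = Peer("SW2", sticky_bit=0, role_priority=200, system_mac="0023.04ee.be01")
print(elect_primary(sw1, sw2).name)
```

In this example SW1 is chosen because its role priority is lower; the system MAC is only consulted when the sticky bit and the priority are equal on both peers.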

vPC terminologies

vPC
• It is a port-channel between the vPC peers and a downstream device.
• A vPC is an L2 port type: switchport mode trunk or switchport mode access.

vPC peer device
• One of the two members of a vPC domain.
• vPC peer devices must be directly connected; we can’t place any other switch or device
between them.

vPC member port
• A port that is a member of a vPC port-channel.

vPC peer-link
• It is the P2P connection between the vPC peer devices.
• It is used to synchronize state between the vPC peer devices (ARP, MAC table, IGMP, etc.).
• It must be a 10-Gigabit link.
• The vPC peer-link is an L2 trunk carrying the vPC VLANs.
• Standard 802.1Q trunk.
• Carries CFS messages.
• Carries flooded traffic from the vPC peers.
• Carries STP BPDUs, HSRP hellos, IGMP updates, etc.

vPC Domain ID
• Domain containing the 2 peer devices.
• Only 2 peer devices can be part of the same vPC domain.
• It is used to create the LACP ID that makes the two peers look like one device to the
downstream devices.

vPC keep-alive
• The keepalive link between the vPC peer devices; this link is used to monitor the liveness
of the peer device.
• Default interval: 1 sec
• Default timeout: 5 sec
• Default hold timeout: 3 sec
• It uses UDP port number 3200.
• It is a 96-byte packet that contains a 32-byte payload.
• The keepalive addresses of the vPC peer devices do not need to be in the same subnet.

vPC VLAN
• The VLANs that are allowed on the vPC peer-link.
• A VLAN carried over the vPC peer-link and used to communicate via vPC with a third device.

Non-vPC VLAN
• A VLAN that is not part of any vPC and not present on the vPC peer-link.

Orphan ports
• A port that belongs to a single-attached device.
• A vPC VLAN is typically used on this port.
• If a single-attached port is not using a vPC VLAN, it is not called an orphan port.

Cisco Fabric Services (CFS) protocol
• Underlying protocol running on top of the vPC peer-link, providing reliable synchronization
and consistency check mechanisms between the 2 peer devices.
• This protocol performs the following functions:
• Configuration validation and comparison (consistency check).
• Synchronization of MAC addresses for vPC member ports.
• vPC member port status advertisement.
• Spanning Tree Protocol management.
• Synchronization of HSRP/ARP and IGMP messages.
• CFS does not require any special command; it starts automatically when you create the vPC
peer-link and configure vPC.

vPC configuration steps

The following order must be followed

• Enable vPC feature

• Configure the domain-id

• Establish the peer keep-alive link

• Establish the vPC peer link

• Configure the vPC member ports


• Enabling vPC feature:

N7K-1(config)#feature vpc

• Configure the domain-id:

N7K-1(config)# vpc domain <1-1000>   (must be the same on both peers)

• Establish the peer keep-alive link:

• Select the interface that you want to use as the peer keep-alive link and
configure an IP address on it on both peers.

• If you are using the management interface for the keepalive, go under the
vpc domain and run the following command:

N7K-1(config-vpc-domain)# peer-keepalive destination <peer-ip>

(the IP address of the peer that you configured in the first step)

• If you are using a regular Gigabit Ethernet port, use the following
command under the vpc domain:

N7K-1(config-vpc-domain)# peer-keepalive destination <peer-ip> vrf default source <self-ip>

Verify the vPC peer keepalive: show vpc peer-keepalive

• Recommendation: With dual SUPs, do not connect the mgmt0 ports back-to-back as the
keepalive link, because keepalive connectivity can be broken when the active SUP fails.

• Suppose the mgmt0 ports of the 2 peer switches are connected back-to-back as the
keepalive link and both switches have dual SUPs. With dual SUPs, the active SUP owns
the mgmt0 interface. When the active SUP goes down, the standby SUP becomes active
and takes over the mgmt0 interface, but no cable is connected to the mgmt0 port of
this new active SUP.

• You might think of cabling both SUPs on both switches back-to-back through their
respective mgmt0 ports, but this does not solve the problem. Suppose SWH-A has
SUP-1A and SUP-2A, and SWH-B has SUP-1B and SUP-2B. SUP-1 is active on both sides,
then SUP-1A suddenly goes down and SUP-2A becomes the active SUP. Now the active
mgmt0 interface on SWH-A belongs to SUP-2A and the active mgmt0 interface on SWH-B
belongs to SUP-1B, and these two ports are not cabled to each other. So connecting
the mgmt0 interfaces back-to-back is not recommended with dual SUPs.

Solution: Use an L2 switch to connect the mgmt0 ports of the peer devices.

• Establish the vPC peer link:

• Go under the interfaces that you want to use as the peer-link.

• Make them trunk ports.

• Put the ports in a port-channel (LACP can be used).

• Then go under the port-channel and run the following command to
use this port-channel as the vPC peer-link:

N7K-1(config)# interface port-channel 1

N7K-1(config-if)# vpc peer-link

Now check the running configuration of the port-channel: you
will see that “spanning-tree port type network” has been added
automatically, which means that Bridge Assurance is enabled.

Bridge Assurance: Bridge Assurance is an STP extension
that protects the L2 network from unidirectional link events
caused by a physical cable failure or an adjacent switch control
plane failure.

With Bridge Assurance enabled, the switch sends BPDUs on all
operational ports that carry the STP port type setting "network".
If such a port stops receiving BPDUs from its neighbor, it is moved
into the blocking state. If the blocked port begins receiving BPDUs
again, it leaves the Bridge Assurance blocking state and goes
through the normal Rapid-PVST transition.

While Bridge Assurance is strongly recommended on the
peer-link, it is not supported on the vPCs themselves.

Recommendations for vPC peer-link

• Member ports must be 10G ports. The peer-link is not supported on 1G line cards or on FEX
ports, even if the FEX port is 10G.

• Use at least 2 x 10G ports for redundancy.

• M1 & M2 line cards support up to 8 member ports, while F1, F2, F2E, F3 & M3 support
up to 16 member ports.

• You must use the same type of line card on both peer devices. The logic behind this: each
port type has its own hardware characteristics in terms of forwarding, queuing and
security, so, to avoid forwarding plane issues, combinations of different line cards are
not supported in vPC.

• Use ports on at least 2 different line cards to increase the availability of the peer-link.
• Configure the vPC member ports:

N7k-1(config)# interface eth1/3-4

N7k-1(config-if-range)# switchport mode trunk

N7k-1(config-if-range)# channel-group 2 mode active   (enable LACP)

N7k-1(config-if-range)# exit

N7k-1(config)# interface port-channel 2

N7k-1(config-if)# vpc 2   (enable vPC on this port-channel)


Recommendations for vPC member ports:

• You must use the same type of line card on both peer devices; the logic behind this is the
same as for the vPC peer-link.

Consistency
Both switches in the vPC domain maintain distinct control planes. The Cisco Fabric Services (CFS)
protocol compares the configuration of the peer devices; if the configurations differ (i.e., they
are not consistent), it takes an action that depends on the type of inconsistency.

System configuration must be kept in sync, with an automated consistency check to help ensure correct
network behavior.

There are two types of consistency checks:

• Type-1:

• For a global configuration type 1 inconsistency, only the vPC member ports on the
secondary peer device are set to the down state.

show vpc consistency-parameters global

• For a vPC interface configuration type 1 inconsistency, only the vPC member ports on
the secondary peer device are set to the down state.

show vpc consistency-parameters interface port-channel <id>

• Type-2:

• For a global configuration type 2 inconsistency, all vPC member ports remain in the up
state and the vPC systems trigger protective actions.

• For a vPC interface configuration type 2 inconsistency, the misconfigured vPC
remains in the up state. However, depending on the discrepancy type, the vPC systems
trigger protective actions. The most typical case involves the allowed VLANs in the vPC
interface trunking configuration. In that case, the vPC systems disable on the vPC
interface the VLANs that do not match on both sides.
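
The allowed-VLAN case above amounts to a set comparison between the two peers. The sketch below is only an illustration of that idea in Python; the function name `vlans_on_vpc` and the VLAN numbers are hypothetical, and it does not reproduce how NX-OS implements the check.

```python
def vlans_on_vpc(primary_allowed, secondary_allowed):
    """Return (kept, disabled) VLAN lists for one vPC interface."""
    primary, secondary = set(primary_allowed), set(secondary_allowed)
    kept = primary & secondary        # VLANs allowed on both peers stay up
    disabled = primary ^ secondary    # VLANs allowed on only one peer are disabled
    return sorted(kept), sorted(disabled)


kept, disabled = vlans_on_vpc([10, 20, 30], [10, 20, 40])
print("kept:", kept)
print("disabled:", disabled)
```

With these example inputs, VLANs 10 and 20 stay active on the vPC while VLANs 30 and 40 are disabled, and the vPC itself stays up.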

vPC and Spanning tree protocol


In the default situation:

• In a vPC environment, BPDUs are handled by the primary device only, regardless of which
switch is the root bridge (it does not matter whether the primary or the secondary switch
is the root bridge; only the primary switch processes BPDUs), and they are sent on all
designated ports.
• The secondary device does not generate BPDUs. If the secondary switch receives a BPDU from
a downstream switch that is connected through a vPC, it does not process the BPDU; it just
forwards the BPDU to the primary switch over the vPC peer-link.
• But what if the secondary switch receives a BPDU from a downstream switch that is not
connected through a vPC?
o In that situation the secondary switch processes the BPDU.
• Issue with the default behavior:
o With the default behavior described above, when the root bridge goes down there is an
outage of 3 seconds, because the downstream device learns that the root bridge is down
and its ports go into the designated state. When the switch comes back online there is
another outage of 3 seconds, because the root bridge election happens again and the port
states of the downstream device change once more. So there is a total outage of
6 seconds in the network.
• Solution: vPC peer-switch

vPC peer-switch

• This feature allows the primary and secondary switches to use a single MAC address, so
that when the root bridge goes down the downstream device never notices it: the other
switch takes over as root bridge without any outage, because both peers already present
the same MAC address to the downstream device.
• NOTE: The MAC address that these switches now use is the same as the LACP ID.
• BPDU flow with this feature:
o Both the primary and the secondary device generate, send and process BPDUs, so the
downstream switch receives duplicate BPDUs. If the downstream switch sends a BPDU to
the secondary device, the secondary does not forward the received BPDU to the primary
switch.
• Enabling the feature:
o Go under the vPC domain and run the following command:
vpc domain 1
peer-switch
• Recommendations:
o To use this feature successfully, make sure that the configured spanning-tree priority
is the same on both switches.

vPC loop avoidance


• vPC performs loop avoidance in the data plane instead of relying on the control plane
(spanning tree protocol).
• All the logic is implemented directly in hardware on the vPC peer-link ports, avoiding any
dependency on CPU utilization.

How a loop can form in vPC:

• Suppose server-1 in the diagram sends a broadcast message to N5K-1.
• N5K-1 forwards the broadcast to server-3, server-2 and N5K-2.
• N5K-2 then forwards the message to server-1, server-2 and server-4. Look carefully: server-1
has received the broadcast that it generated itself.
• If server-1 forwards the message again, there is a loop in the network.

vPC loop avoidance (55xx Switches):


• Nexus 55xx series switches use an FTAG value to avoid loops in vPC.
• The primary switch uses FTAG 256 and the secondary uses FTAG 257: all vPC ports on the
primary switch are members of FTAG group 256, and all vPC ports on the secondary switch
are members of FTAG group 257.
• Orphan ports on the primary switch are members of FTAG group 257.
• Orphan ports on the secondary switch are members of FTAG group 256.
• Loop avoidance packet flow:
o Server-1 sends a broadcast message to N5K-1.
o N5K-1 sends the broadcast onto the vPC peer-link with its FTAG value of 256.
o N5K-2 receives the frame and finds that the FTAG value on it is 256, so it forwards the
frame only to ports that are members of FTAG group 256. It therefore forwards the frame
to server-4 only, and no loop is formed in the network.
• The same thing happens if server-2 sends a broadcast message.
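
The FTAG membership rules above can be illustrated with a small lookup. This is a hedged Python sketch, not switch code: the port names in `N5K2_PORTS` are hypothetical labels for the N5K-2 ports described in the notes (vPCs towards server-1 and server-2, an orphan port towards server-4).

```python
PRIMARY_FTAG = 256    # tag used by the primary switch (N5K-1)
SECONDARY_FTAG = 257  # tag used by the secondary switch (N5K-2)

# FTAG group membership of the ports on the secondary switch:
# its vPC ports are in group 257, its orphan ports are in group 256.
N5K2_PORTS = {
    "vpc-to-server-1": SECONDARY_FTAG,
    "vpc-to-server-2": SECONDARY_FTAG,
    "orphan-to-server-4": PRIMARY_FTAG,
}


def flood_from_peer_link(local_ports, frame_ftag):
    """Ports that may receive a flooded frame arriving on the peer-link."""
    return sorted(port for port, group in local_ports.items() if group == frame_ftag)


# Broadcast from server-1, tagged 256 by N5K-1, arrives on N5K-2's peer-link.
print(flood_from_peer_link(N5K2_PORTS, PRIMARY_FTAG))
```

Only the orphan port towards server-4 matches FTAG 256, so the frame never returns to server-1 or server-2 through N5K-2.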

vPC loop avoidance (7k Switches):


• These devices use the concept of the VSL bit.
• The vPC loop avoidance rule states that traffic coming from a vPC member port and then
crossing the vPC peer-link is NOT allowed to egress any vPC member port; however, it can
egress any other type of port (L3 port, orphan port, …).
• Here N7K-1 sets the VSL bit ON when it sends across the peer-link the traffic that it
received on the vPC port from server-1.
• N7K-2 receives the frame and finds that the VSL bit is ON, so according to the rule it
knows that it can’t send that traffic to its vPC member ports.
o There is only one situation in which the rule is bypassed:
▪ The only exception to this rule occurs when a vPC member port goes down. The vPC
peer devices exchange member port states and reprogram the vPC loop avoidance
logic in hardware for that particular vPC.
▪ Suppose server-1 wants to reach server-2 and assume that the links between
N7K-1 and server-2 are down.
▪ Server-1 forwards the frame to N7K-1. Because N7K-1 has lost its direct
connectivity to server-2, it needs to forward the packet to N7K-2 to reach
server-2.
▪ N7K-2 now forwards the traffic to server-2 because, with the help of the CFS
protocol, N7K-2 knows that the vPC member ports of N7K-1 that belong to the vPC
for server-2 are down. That’s why it forwards the traffic, and there is no
loop in the network.
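
The rule and its single exception can be written as one small decision function. The following Python sketch is an illustration under stated assumptions: `may_egress` and its parameter names are hypothetical, and the real check is done in hardware, not in software.

```python
def may_egress(port_type, vsl_bit_on, peer_member_port_up=True):
    """Decide whether a frame may leave through a given egress port.

    port_type: "vpc", "orphan" or "l3" (type of the egress port)
    vsl_bit_on: True if the frame entered on a vPC member port of the peer
                and crossed the peer-link
    peer_member_port_up: state of the peer's member port for the same vPC,
                         as learned from the peer
    """
    if not vsl_bit_on:
        return True                  # locally received traffic is not restricted
    if port_type != "vpc":
        return True                  # orphan and L3 ports are always allowed
    return not peer_member_port_up   # exception: the peer's leg of this vPC is down


print(may_egress("vpc", vsl_bit_on=True))                             # normal case
print(may_egress("vpc", vsl_bit_on=True, peer_member_port_up=False))  # exception
```

The first call models the default rule (the frame is blocked on the vPC member port); the second models the exception, where N7K-2 forwards to server-2 because N7K-1's member port for that vPC is down.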

HSRP & VRRP in vPC

• HSRP & VRRP operate in active-active mode from the data plane perspective, unlike
classical networks where we have the active/standby concept.
o This means that if we configure HSRP or VRRP in a vPC environment, both
switches forward the traffic that they receive from the downstream
devices.
• No additional configuration is required for this. This is the default behavior.
• But from the control plane perspective the active/standby concept is still followed.
o Only the HSRP active switch responds to ARP requests for the VIP.
o If the HSRP standby switch receives a request for the VIP, it
forwards the request to the HSRP active device.
o NOTE: No matter which switch is the vPC primary device, only the HSRP active
switch answers the ARP requests.
• Recommendations:
o For ease of troubleshooting/operation, make sure that the “primary” vPC switch
is also the HSRP/VRRP “active” device from the control plane perspective.
o Don’t use HSRP/VRRP object tracking in a vPC domain.
▪ Reason: it can cause traffic blackholing, as explained below.

Example: Suppose we are tracking the WAN link on both switches, for both VLAN interfaces.
The link between N7K-2 and the WAN device suddenly goes down, and as a result the VLAN 10 and
VLAN 20 interfaces on N7K-2 go down. Now server-2 tries to reach server-1. Because of the
port-channel hashing algorithm, some traffic is sent towards N7K-1 and some towards N7K-2. The
packets sent towards N7K-1 reach server-1 without any issue. The packets sent towards N7K-2
reach N7K-2 and are then forwarded towards N7K-1, but N7K-1 does not forward these packets on
the vPC towards server-1 because of the loop avoidance mechanism in the 7k switch, so these
packets are dropped at N7K-1.

FHRP & vPC


• From the data plane perspective, both peer devices are forwarding.
o HOW? -> By setting the G-bit (gateway bit) for the HSRP/VRRP vMAC in the MAC
address table on both vPC devices.
• On the active HSRP/VRRP device, the vMAC points to sup-eth1(R).
• On the standby HSRP/VRRP device, the vMAC points to the vPC peer-link.
EVE-NG command to check the HSRP virtual MAC: show system internal l2gwder | in "0000|0001"

vPC Peer gateway

• The vPC peer-gateway capability allows a vPC peer to route packets that are addressed to
the router MAC address of the other vPC peer.
• This functionality is used to overcome scenarios with misconfigurations and issues that
arise with load balancers or network attached storage (NAS) devices that try to optimize
packet forwarding.

Example: In this topology NX-2 and NX-3 are acting as the gateways for VLAN 100 and VLAN
200. NX-2 and NX-3 have a vPC configured for the web server and another for NX-1, which
connects to the NAS. NX-1 is only switching (not routing) packets to or from the NAS device.

When the web server sends a packet to the NAS device (172.32.100.22), it computes a
hash to identify which link it should send the packet on to reach the NAS device. Assume
that the web server sends the packet to NX-2, which then changes the packet’s source MAC
address to 00c1.5c00.0011 (part of the routing process) and forwards the packet on to
NX-1. NX-1 forwards (switches) the packet on to the NAS device.

Now the NAS device creates the reply packet and, when generating the packet headers,
uses the destination MAC address of the HSRP gateway, 00c1.1234.0001, and forwards the
packet to NX-1. NX-1 computes a hash based on the source and destination IP and
forwards the packet towards NX-3. NX-2 and NX-3 both own the destination MAC address of
the HSRP gateway, so either one can route the packet for the 172.32.200.0/24 network and
forward it back to the web server. This is the correct and normal forwarding behavior.

The problem occurs when the NAS server enables a feature for optimizing packet flow.
After the NAS device receives the packet from the web server and generates the reply
packet headers, it simply swaps the source and destination MAC addresses of the packet it
originally received. When NX-1 receives the reply packet, it calculates the hash and forwards
the packet towards NX-3. Now NX-3 does not own the MAC address 00c1.5c00.0011 (NX-2’s
VLAN 100 interface) and can’t forward the packet toward NX-2. The packet is dropped
because a packet received on a vPC member port can’t be forwarded across the peer-link, as a
loop-prevention mechanism.

Enabling vPC peer-gateway on NX-2 and NX-3 allows NX-3 to route packets destined
for NX-2’s MAC addresses, and vice versa. The vPC peer-gateway feature is enabled with the
command “peer-gateway” under the vPC domain configuration. The vPC peer-gateway
functionality is verified with the “show vpc” command.

• Each peer device listens for the SVI MAC of the other peer.
• We can activate it at any time, with no traffic impact.
• It was designed for devices like NAS that do not follow the standard ARP process to
retrieve the MAC of the gateway, or F5, which uses the source MAC for return traffic to
keep the path symmetrical.

vPC peer-gateway exclude VLAN

• If the WAN link on a peer goes down, it can use the backup path over the peer-link for the
routed traffic.
• Because of the peer-gateway feature, the G bit is set for the MAC address of the next-hop IP.
• The traffic is now software switched, and we can experience heavy packet drops or
delay due to CoPP and the hardware rate limiters.
• So, in this situation we need to enable the “vPC peer-gateway exclude VLAN” feature so
that the traffic is not software switched; it is then switched in hardware.
• To enable this feature, go under the vpc domain and run:
peer-gateway exclude vlan <vlan-list>

vPC Failover Scenarios


When only the vPC keepalive link goes down, nothing happens.

Case-1:

• What happens when only the peer-link goes down?

• When the peer-link goes down, the vPC secondary switch shuts down all its
vPC member ports and all the SVIs of the vPC VLANs, to avoid a
dual-active (split brain) scenario.

Case-2:

• The vPC peer-link is down and after some time the primary fails completely. Then?

• On the vPC secondary device, the vPC member ports and the SVIs stay in the down
state until the primary is back.

Quick Fix: We can take a configuration backup on the secondary device and then
remove the vPC configuration from the vPC member ports. After removing the vPC
configuration those ports come up immediately. But this is not a recommended
solution.

Correct Solution: Use “auto-recovery”.

• By default, “auto-recovery” is disabled. Go under the vPC
domain configuration and run the “auto-recovery” command to
enable it. Enable it on both peer devices.
• Now if the peer-link goes down and after that the primary device fails,
the secondary device becomes “operational primary” and
the vPC member ports and the SVIs on the secondary
switch come up immediately.
Case-3:

• Both peer devices went down but only one came UP.
• All the vPC member ports and the SVIs stay in the down state until the other peer
comes UP. If you check the output of “show vpc role” you will find
that the role is “none established”, because the vPC role election cannot
complete until there are two devices in the vPC domain.

Quick Fix: We can take a configuration backup on the device that is UP and then
remove the vPC configuration from the vPC member ports. After removing the vPC
configuration those ports come up immediately. But this is not a recommended
solution.

Correct Solution: Use “auto-recovery”.

• The device that is UP waits until the auto-recovery timeout expires (default
240 seconds, can be modified with the “auto-recovery reload-delay
xxxx” command).
• If either the vPC peer-link or the keep-alive link comes UP,
auto-recovery is not triggered.
• If both the peer-link and the peer keep-alive link stay down,
auto-recovery kicks in and the device becomes primary and
brings UP all the vPCs and the SVIs.
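
The auto-recovery behavior described for Case-3 can be summarized as a short decision function. This is a rough Python sketch of the notes' description only; the function name, parameters and returned strings are hypothetical and do not correspond to NX-OS output.

```python
def auto_recovery_action(enabled, peer_link_up, keepalive_up,
                         seconds_since_boot, reload_delay=240):
    """Model what a lone vPC peer does after a reload."""
    if peer_link_up or keepalive_up:
        # The peer is reachable in some way, so auto-recovery is not triggered.
        return "not triggered"
    if not enabled:
        return "vPCs and SVIs stay down"
    if seconds_since_boot < reload_delay:
        return "waiting for timeout"
    return "become primary, bring up vPCs and SVIs"


print(auto_recovery_action(True, False, False, seconds_since_boot=300))
```

The default of 240 seconds mirrors the timeout mentioned above; a different value would model a changed “auto-recovery reload-delay” setting.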

Case-4:

• Even though orphan ports are not part of a vPC, the SVIs of their VLANs are also shut
down on the secondary, so the orphan ports are isolated as well.

Solution: Use the command below under the vPC domain configuration to keep
the SVIs used by the orphan ports up on the secondary device during a peer-link failure.

dual-active exclude interface-vlan xxx

Case-5: vPC split brain

• It happens when the vPC keep-alive link goes down and afterwards the peer-link also
goes down.
• Both devices will be in the primary role.
• vPC orphan ports can be isolated on both peers.
• The peer-gateway feature no longer works, and HSRP is also broken because the
peer-link is down, so we can have traffic issues.
• New MAC addresses are no longer synced between the peers.
NOTE: When the peer-link and the keep-alive link come back UP, the roles of the peer
devices are reversed. The old primary device becomes operational
secondary and the old secondary device becomes operational primary, because the
master sticky bit is set to 1 on the old secondary device.

Case-6: vPC orphan ports stay active on the secondary when the peer-link fails.

• When the peer-link goes down, the secondary device shuts down the vPC member ports
(orphan ports stay up) and the vPC VLAN SVIs.
• Devices like LB/FW can create problems in this situation.

Solution:

1. Connect the LB/FW through a vPC, not through orphan ports.
2. Connect the LB/FW via an L2 switch that is connected to the vPC domain.
3. Use the “vpc orphan-port suspend” command. Run this command under
the orphan port that you want to suspend when the peer-link goes down.

Case-7:

• The WAN link goes down on the primary after the peer-link failure.

• As we know, when the peer-link goes down the secondary device shuts down the
member ports (orphan ports stay UP) and the vPC VLAN SVIs.
• Because the WAN link on the primary is also down, traffic is blackholed.

Solution:

• Use enhanced object tracking

config t
track 1 interface ethernet 1/2 line-protocol (track the peer-link)
track 2 interface ethernet 1/7 line-protocol (track the WAN link)
track 3 list boolean or (create the final track that combines the
previous two tracks)
object 1 (reference track 1)
object 2 (reference track 2)
exit
vpc domain 1
track 3
exit
show track

NOTE: In the output of show track you will see that track 3 is UP. Track 3 goes
down only if both objects in it go down; if one of the objects is
UP, track 3 remains UP.

When both objects go down, track 3 also goes down.
Then, via the keep-alive link, the primary device tells the secondary device to change the
status of its vPC member ports and vPC SVIs to UP.
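
The behavior of the “boolean or” track list can be checked with a tiny truth-table model. This Python sketch is only an illustration; `track_list_boolean_or` is a hypothetical helper name and the object labels are taken from the comments in the configuration above.

```python
def track_list_boolean_or(objects):
    """Return "UP" if at least one tracked object is up, else "DOWN"."""
    return "UP" if any(objects.values()) else "DOWN"


# Peer-link failed but the WAN link is still up: track 3 stays UP.
print(track_list_boolean_or({"track 1 (peer-link)": False, "track 2 (WAN link)": True}))

# Both the peer-link and the WAN link are down: track 3 goes DOWN.
print(track_list_boolean_or({"track 1 (peer-link)": False, "track 2 (WAN link)": False}))
```

This matches the Case-7 scenario: the vPC domain reacts only when the peer-link and the WAN link on the primary are down at the same time.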

• vPC “self-isolation” command. Run this command under the vPC domain
configuration. Using this feature we get the same result as with object
tracking.
Requires NX-OS 7.2 or later.
