BRKDCN-2958 DC Operation and Maintenance Best Practice
Anis Edavalath
• 11 years with Cisco Advanced Services
• Enterprise Campus and Datacenter across different verticals
• Worked 10 years with BU engineering groups on security, switching, datacenter and network management products
• Design and deployment of next-gen data center architecture for enterprise and cloud customers
• AS team lead for ACI, VxLAN, Tetration, SDA (uniform policy)
• Worked with major telecom vendors and Cloud providers prior
to Cisco
• CCIE Datacenter # 48152
BRKDCN-2958 © 2023 Cisco and/or its affiliates. All rights reserved. Cisco Public 2
Course Objective and Goal
• To help data center operations and engineering staff understand change management best practices for maintaining a data center environment or migrating a legacy environment to a next-gen Cisco Nexus data center network deployment.
• Baseline
• VPC, VxLAN & ACI Refresher
• Operational Best Practices
• Change Management best practices
• Migration Methodology
• Five key Use cases
DC Baseline Refresher
vPC Feature Overview (For Your Reference)
vPC Terminology
[Diagram: vPC peers S1 and S2 with downstream device S3. Labels: vPC Peer, Keepalive Link, CFS, vPC, vPC Member Port, Orphan Port, Orphan Device]
Failure Scenario
• If both peers are active, the secondary vPC peer disables all of its vPCs to avoid a dual-active condition.
• Data automatically forwards down the remaining active port channel ports.
[Diagram: HOST1 (MAC H1) behind leaf VTEP A and HOST2 (MAC H2) behind leaf VTEP B; VLAN 10 → VNI 1000 on both VTEPs. Original frame: DMAC H2 | SMAC H1 | original L2 data]
• The original L2 packet is encapsulated with a VXLAN header in UDP -> IP -> Ethernet
Overcome 4094 VLAN Scale Limitation
• VLANs use a 12-bit VLAN ID; VXLAN uses a 24-bit VNI
Multi-Tenant with virtualization
• Isolation of network traffic by tenant and reusability of networking taxonomy for tenancy
VTEP A or VTEP B in a deployment will be a pair, and this pair provides host redundancy for Layer 2 via vPC.
vPC is still NEEDED, and the VTEP will represent the vPC pair!
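The scale numbers and the encapsulation header on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming the standard 8-byte VXLAN header layout from RFC 7348 (flags byte with the I bit set, 24-bit VNI, reserved fields zero). The function names are illustrative only.

```python
import struct

VLAN_ID_BITS = 12   # IEEE 802.1Q VLAN ID field width
VNI_BITS = 24       # VXLAN Network Identifier field width (RFC 7348)


def usable_vlans() -> int:
    """VLAN IDs 0 and 4095 are reserved, which leaves 4094 usable VLANs."""
    return 2 ** VLAN_ID_BITS - 2


def vni_space() -> int:
    """Number of distinct 24-bit VNI values."""
    return 2 ** VNI_BITS


def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header: flags (I bit), reserved, VNI, reserved."""
    if not 0 <= vni < 2 ** VNI_BITS:
        raise ValueError("VNI must fit in 24 bits")
    # First word: flags 0x08 in the top byte, 24 reserved bits.
    # Second word: VNI in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", 0x08 << 24, vni << 8)


if __name__ == "__main__":
    print(usable_vlans(), vni_space(), vxlan_header(1000).hex())
```

The header is what the ingress VTEP places in front of the original L2 frame before adding the outer UDP, IP and Ethernet headers.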
VxLAN Refresher With BGP EVPN Address Learning
[Diagram: Host A (MAC-A, IP-A) on VTEP-1, Host B (MAC-B, IP-B) on VTEP-2, Host C (MAC-C, IP-C) on VTEP-3]
Hosts' Setup
• VLAN 10: VNI 10000
• VRF FOO: VNI 20000
Sequence (steps 1 to 6 on the slide)
• Each host sends a GARP for its own IP (Target MAC: host MAC, Target IP: host IP).
• The attached VTEP sends a BGP EVPN update carrying the host MAC and IP with their VNIs and Nexthop: its own VTEP IP.
• Remote VTEPs install the advertised entries.
Resulting tables
VTEP-1:
MAC-B 10000 VTEP-2-IP
IP-B VRF FOO VTEP-2-IP
MAC-C 10000 VTEP-3-IP
IP-C VRF FOO VTEP-3-IP
VTEP-2:
MAC-A 10000 VTEP-1-IP
IP-A VRF FOO VTEP-1-IP
MAC-C 10000 VTEP-3-IP
IP-C VRF FOO VTEP-3-IP
VTEP-3:
MAC-A 10000 VTEP-1-IP
IP-A VRF FOO VTEP-1-IP
MAC-B 10000 VTEP-2-IP
IP-B VRF FOO VTEP-2-IP
Note: Broadcast, unknown unicast and multicast traffic can use either a multicast group or ingress replication of traffic (not covered).
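The address-learning result on this slide can be modelled as a small lookup table. This is a simplified sketch, not the BGP EVPN protocol itself: it assumes every VTEP has already received every update, and the `Host` class and `build_remote_table` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

L2_VNI = 10000   # VLAN 10 in the slide's host setup
VRF = "FOO"      # tenant VRF from the slide


@dataclass(frozen=True)
class Host:
    mac: str
    ip: str
    vtep_ip: str   # VTEP the host is attached to


def build_remote_table(local_vtep_ip, hosts):
    """Entries a VTEP installs from EVPN updates sent by the other VTEPs."""
    table = {}
    for host in hosts:
        if host.vtep_ip == local_vtep_ip:
            continue  # locally attached hosts are learned from GARP, not EVPN
        table[(host.mac, L2_VNI)] = host.vtep_ip   # MAC entry in the L2 VNI
        table[(host.ip, VRF)] = host.vtep_ip       # host IP entry in the VRF
    return table


HOSTS = [
    Host("MAC-A", "IP-A", "VTEP-1-IP"),
    Host("MAC-B", "IP-B", "VTEP-2-IP"),
    Host("MAC-C", "IP-C", "VTEP-3-IP"),
]

if __name__ == "__main__":
    for key, next_hop in build_remote_table("VTEP-1-IP", HOSTS).items():
        print(key, next_hop)
```

Each VTEP ends up with entries only for hosts behind the other VTEPs, with the advertising VTEP as next hop.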
SDN ‘with’ FCAPS ‘and’ Automation
Application Centric Infrastructure
• Automated application-centric policy model with embedded security
• Integrated toolset
Programmable EVPN Fabric
• DevOps toolset used for network management (Puppet, Chef, Ansible etc.)
• External tools, NDFC
FCAPS: Fault, Configuration, Accounting, Performance, Security
Application Network Profiles (ANP) & ACI: How Does It Work?
[Diagram: an application profile (APP PROFILE) ties together connectivity policy, security policies, QoS, SLA, classification, L4..7 services, storage and compute]
Operational Best Practices
NX-OS High Availability
Process Modularity
[Diagram: NX-OS software stack: Kernel, Chip/Driver Infrastructure, Chassis Management, Protocol Stack (IPv4 / IPv6 / L2), Future Services]
• Service Restart-ability Possibilities
  • Stateful Restart with Persistent Storage Service (PSS)
    • Checkpoints state to PSS
    • Recovers state from PSS upon restart.
  • Stateful Restart with Graceful Restart
    • Recovers state based on information from other services and/or the network.
    • Mainly routing protocols
  • Stateless Restart
    • Fresh start, no trace of former instantiation.
In-Service Software Upgrade
Nexus# install all nxos bootflash:nxos.9.2.3.bin
Upgrade sequence:
1. Upgrade and reboot
2. Initiate stateful failover
3. Upgrade and reboot
4. Upgrade I/O modules
[Diagram: active and standby supervisors, each running OSPF, BGP, PIM etc. over the HA Manager and Linux kernel, move from release 7.0(3)I7(4) to release 9.2(3); the I/O module images move from 7.0(3)I7(4) to 9.2(3)]
Best Practice: vPCs should be distributed.
NX-OS High Availability
Supervisor Switchover
• Triggers:
• HA Policy Initiated – e.g. 3 component crashes → SSO
• User Initiated – system switchover
• ISSU initiated SSO
Defect Impact
Sally: You know how Boss gets when we call him at 2 AM...
Software Patching in NX-OS
Who’s familiar with Software Maintenance Updates (SMU)?
Overview Benefits
SMU Lifecycle – CLI
[Diagram: SMUs and the SMU repository]
Switch# install add …
Switch# install remove …
Patching Highlights
Stateful Switchover Mode
SSO-Aware and SSO-Compliant Applications
Active Supervisor
Standby Hot Supervisor
Routing Protocol Redundancy With NSF
(Graceful Restart)
Active Supervisor Engine Slot 1 Standby Supervisor Engine Slot 2
EIGRP RIB OSPF RIB ARP Table EIGRP RIB OSPF RIB ARP Table
Prefix Next Hop Prefix Next Hop IP MAC Prefix Next Hop Prefix Next Hop IP MAC
FIB Table
SSO FIB Table
Routing Protocol Redundancy With NSF
(Graceful Restart)
Standby Supervisor Engine Slot 2
EIGRP RIB
Prefix       Next Hop
10.0.0.0     10.1.1.1
10.1.0.0     10.1.1.1
10.20.0.0    10.1.1.1
OSPF RIB
Prefix         Next Hop
192.168.0      192.168.0.1
192.168.55.0   192.168.55.1
192.168.32.0   192.168.32.1
ARP Table
IP          MAC
10.1.1.1    aabbcc:ddee32
10.1.1.2    adbb32:d34e43
10.20.1.1   aa25cc:ddeee8
FIB Table
10.1.1.1      aabbcc:ddee32
10.1.1.2      adbb32:d34e43
192.168.0.0   aa25cc:ddeee8
Routing Protocol Redundancy With NSR
(Stateful Restart)
Active Supervisor Engine Slot 1 Standby Supervisor Engine Slot 2
BGP RIB OSPF RIB ARP Table BGP RIB OSPF RIB ARP Table
Prefix Next Hop Prefix Next Hop IP MAC Prefix Next Hop Prefix Next Hop IP MAC
10.0.0.0 10.1.1.1 192.168.0 192.168.0.1 10.1.1.1 aabbcc:ddee32 10.0.0.0 10.1.1.1 192.168.0 192.168.0.1 10.1.1.1 aabbcc:ddee32
10.1.0.0 10.1.1.1 192.168.55.0 192.168.55.1 10.1.1.2 adbb32:d34e43 10.1.0.0 10.1.1.1 192.168.55.0 192.168.55.1 10.1.1.2 adbb32:d34e43
10.20.0.0 10.1.1.1 192.168.32.0 192.168.32.1 10.20.1.1 aa25cc:ddeee8 10.20.0.0 10.1.1.1 192.168.32.0 192.168.32.1 10.20.1.1 aa25cc:ddeee8
FIB Table
SSO FIB Table
Routing Protocol Redundancy With NSR
(Stateful Restart)
Standby Supervisor Engine Slot 2
BGP RIB OSPF RIB ARP Table
FIB Table
10.1.1.1 aabbcc:ddee32
10.1.1.2 adbb32:d34e43
192.168.0.0 aa25cc:ddeee8
vPC Best Practice
• vPC domain IDs
✓ Use a unique vPC domain ID within a contiguous L2 domain to avoid MAC overlap.
• vPC peer-gateway
✓ The switch acts as the active gateway for frames addressed to its peer switch, which avoids forwarding over the peer link.
• Distribute port-channel member interfaces across line cards within the same chassis.
vPC Configuration Best Practices
• vPC object tracking tracks both the peer link and the uplinks in a Boolean OR list.
• Object tracking is triggered when the tracked list object goes down:
  • Suspends the vPCs on the impaired device.
  • Traffic is forwarded over the remaining vPC peer.
[Diagram: vPC peers S1 and S2 with uplinks to S4 and S5]
! Track the vPC peer link
track 1 interface port-channel11 line-protocol
! Track the uplinks
track 2 interface Ethernet1/1 line-protocol
track 3 interface Ethernet1/2 line-protocol
! Combine all tracked objects into one.
! "OR" means this object goes down only if ALL objects are down
track 10 list boolean OR
object 1
object 2
object 3
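The Boolean OR semantics of the track list can be illustrated in Python. This is a sketch of the logic only, not of the NX-OS implementation; the function names and the state dictionary are hypothetical.

```python
def boolean_or_list_up(object_states):
    """A Boolean OR track list stays up while at least one member object is up."""
    return any(object_states.values())


def vpcs_suspended(object_states):
    """vPCs are suspended on this peer only when the combined object is down."""
    return not boolean_or_list_up(object_states)


if __name__ == "__main__":
    # Keys are the track object numbers from the configuration above:
    # 1 = peer link, 2 and 3 = uplinks. True means line-protocol is up.
    one_uplink_lost = {1: True, 2: False, 3: True}
    isolated = {1: False, 2: False, 3: False}
    print(vpcs_suspended(one_uplink_lost))   # the list is still up
    print(vpcs_suspended(isolated))          # every member is down
```

Losing a single uplink therefore does not suspend the vPCs; only the loss of the peer link and all uplinks together does.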
VXLAN Considerations and Best Practices
• Use the Nexus Dashboard NDFC app for VXLAN configuration and management
• Summarize external routes on the border leaf, or advertise default routes on a per-tenant basis
• Advertise only the LPM prefix routes of internal public Layer 3 subnets out of the border leaf switches
• Include all local loopbacks in underlay routing to make troubleshooting easy
• Use the same VLAN per L2 VNI and consistent naming for each VLAN-VNI pair
Operational Best Practices
MO Naming Convention
• Develop and plan the MO (Managed Object) naming convention according to the organization's best practice
Tags and Aliases
• Workaround to rename objects
• Objects can be grouped to make queries easier
• Tags/aliases have no functional impact, whereas labels do
AAA Fallback to Local Auth
• Fallback domain should be set to local to avoid lockout
Loop Mitigation Settings
• Enable MCP (MisCabling Protocol) per VLAN
• To find and disable loops caused by misbehaving external L2
• Rogue EP Control is preferred over Endpoint loop protection
BD Level Configuration
• Do not enable unicast routing when the gateway is not the BD SVI
• Limit IP Learning to Subnet
• L2 Unknown Unicast set to Flood
• ARP Flooding enabled
Fabric Wide Configuration
• IP Aging Policy
• Disable Remote EP Learning – on border leaf
• Enforce Subnet Check
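The idea behind "Limit IP Learning to Subnet" and "Enforce Subnet Check" is that an endpoint IP is only learned when it falls inside a configured subnet. A minimal sketch of that membership test follows; it illustrates the concept only and does not reflect how the fabric implements it. The function name and sample addresses are assumptions.

```python
import ipaddress


def within_configured_subnets(endpoint_ip, subnets):
    """Return True if the endpoint IP belongs to any configured subnet."""
    ip = ipaddress.ip_address(endpoint_ip)
    # strict=False accepts gateway-style notation such as 10.10.11.1/24
    return any(ip in ipaddress.ip_network(s, strict=False) for s in subnets)


if __name__ == "__main__":
    bd_subnets = ["10.10.11.1/24"]
    print(within_configured_subnets("10.10.11.25", bd_subnets))
    print(within_configured_subnets("10.10.12.5", bd_subnets))
```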
Change Management
NX-OS Graceful Insertion and Removal
Nexus Graceful Removal
router bgp 33
  isolate
  ! Discontinue advertisement of all prefixes.
router eigrp 1
  isolate
  ! Advertises maximum metrics for all K-values.
router ospf 1
  isolate
  ! max-metric router-lsa
router isis 1
  isolate
  ! set-overload-bit
Nexus feature: Graceful Insertion
• Move the switch from Maintenance mode to Normal mode.
• Control plane is maintained throughout isolation of the switch.
• Protocols advertise routes only after they are installed in hardware.
N9372(config)# no system mode maintenance
Following configuration will be applied:
router bgp 33
  no isolate
router eigrp 1
  no isolate
router ospf 1
  no isolate
router isis 1
  no isolate
vPC Shutdown Feature
This feature allows the customer to manually “isolate” a switch from the vPC domain. This is a vPC configuration option.
[Diagram: primary and secondary vPC peers carrying VLAN 1-100, with the peer keepalive (PKA), before and after the shutdown is configured]
• Down Peer Link brought to normal state.
• vPC Members
• Etc.
Graceful Insertion and Removal
Isolate for Change Window
OSPF: max-metric router-lsa
vPC: shutdown
feature ospf
feature vpc
Scripting takes time. It’d be nice to automate this…
Graceful Insertion and Removal
vPC vPC
One command!
Pre-change System Snapshot
Graceful Insertion and Removal
vPC vPC
One command!
Post-change System Snapshot
Configuration Profiles
• Maintenance-mode profile is applied when entering GIR mode.
• Normal-mode profile is applied when GIR mode is exited.
Automatic Profiles
• Generated by default
• Parses configuration to determine changes going into and out of GIR
• Changes based on base protocol configuration settings.
• Use: maintenance windows
Manual Profiles
• User-created profile for maintenance-mode and normal-mode
• Flexible selection of protocols for isolation
• Use: maintenance windows and isolation during troubleshooting using preconfigured scripts.
Enabling Graceful Insertion and Removal
Automatic Profile Generation
N7K-1-Core# show system mode
System Mode : Normal
N7K-1-Core# config
Enter configuration commands, one per line. End with
CNTL/Z.
N7K-1-Core(config)# system mode maintenance
Enabling Graceful Insertion and Removal
Custom Profile Generation
config-profile maintenance-mode type admin
router bgp 65001
  isolate
sleep instance 1 10
router ospf 100
  isolate
sleep instance 3 20
vpc domain 20
  shutdown
system interface shutdown exclude fex-fabric

config-profile normal-mode type admin
router bgp 65001
  no isolate
sleep instance 1 10
router ospf 100
  no isolate
sleep instance 3 20
vpc domain 20
  no shutdown
no system interface shutdown
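The two custom profiles mirror each other line for line, so the normal-mode profile can be derived from the maintenance-mode one. The sketch below is an offline helper under that assumption, using only the commands shown on the slide. `to_normal_mode` is a hypothetical name, not an NX-OS feature.

```python
MAINTENANCE_PROFILE = [
    "router bgp 65001",
    "isolate",
    "sleep instance 1 10",
    "router ospf 100",
    "isolate",
    "sleep instance 3 20",
    "vpc domain 20",
    "shutdown",
    "system interface shutdown exclude fex-fabric",
]


def to_normal_mode(lines):
    """Negate the isolation commands; context and sleep lines stay unchanged."""
    normal = []
    for line in lines:
        if line in ("isolate", "shutdown"):
            normal.append("no " + line)
        elif line.startswith("system interface shutdown"):
            # The slide drops the 'exclude fex-fabric' option on the way back.
            normal.append("no system interface shutdown")
        else:
            normal.append(line)
    return normal


if __name__ == "__main__":
    print("config-profile normal-mode type admin")
    for line in to_normal_mode(MAINTENANCE_PROFILE):
        print(line)
```

Generating one profile from the other keeps the pair consistent when protocols are added or removed from the isolation list.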
Graceful Insertion and Removal Mode for Unplanned Outages
system mode maintenance on-reload reset-reason reason
Reset reasons include:
HW_ERROR - Hardware error
KERN_FAILURE - Kernel panic
WDOG_TIMEOUT - Watchdog timeout
FATAL_ERROR - Fatal error
MANUAL_RELOAD - Manual reload
Nexus GIR Snapshots
• Taken before and after GIR mode to compare pre-change and post-change operational state.
Nexus GIR Snapshots Comparison
Nexus# sh snapshots compare before_maintenance after_maintenance
================================================================================
Feature  Tag                  before_maintenance   after_maintenance
================================================================================
[bgp]
--------------------------------------------------------------------------------
[neighbor-id:100.120.1.221]
         connectionsdropped   2                    **3**
         lastflap             P1DT21H5M12S         **P1DT21H25M47S**
         lastread             P1DT21H25M12S        **PT0S**
         lastwrite            P1DT21H25M14S        **PT0S**
         state                Established          **Idle**
         localport            52737                **0**
         remoteport           179                  **0**
         notificationssent    2                    **3**
<...>

switch# show snapshots compare snapshot1 snapshot2 ipv4routes
metric             snapshot1   snapshot2   changed
# of routes        33          3           *
# of adjacencies   10          4           *

Prefix          Changed Attribute
------          -----------------
23.0.0.0/8      not in snapshot2
10.10.10.1/32   not in snapshot2
21.1.2.3/8      adjacency index has changed from 29 (snapshot1) to 38 (snapshot2)
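The comparison that `show snapshots compare` performs can be sketched offline in Python, for example when snapshot data has been exported to dictionaries. This is an illustration of the idea, not the NX-OS implementation. The values come from the BGP neighbor shown on the slide; the function name is hypothetical.

```python
def compare_snapshots(before, after):
    """Return (tag, before, after, changed) rows for every tag in either snapshot."""
    rows = []
    for tag in sorted(before.keys() | after.keys()):
        old, new = before.get(tag), after.get(tag)
        rows.append((tag, old, new, old != new))
    return rows


BEFORE = {
    "connectionsdropped": 2,
    "state": "Established",
    "localport": 52737,
    "remoteport": 179,
    "notificationssent": 2,
}
AFTER = {
    "connectionsdropped": 3,
    "state": "Idle",
    "localport": 0,
    "remoteport": 0,
    "notificationssent": 3,
}

if __name__ == "__main__":
    for tag, old, new, changed in compare_snapshots(BEFORE, AFTER):
        # Mirror the CLI convention of wrapping changed values in ** **
        shown = "**{}**".format(new) if changed else str(new)
        print("{:<20} {:<14} {}".format(tag, old, shown))
```

A BGP neighbor that moved from Established to Idle after the change window is exactly the kind of regression this comparison is meant to surface.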
Tools to Manage the DC
• Controller (APIC / NDFC)
• Nexus Insights
Health Score
Aggregated View, Fabric Topology View
Aggregation of system-wide health, including pod health scores, tenant health scores, system fault counts by domain and type, and the APIC cluster health state.
Capacity Dashboard
View the capacity of the data center fabric (ACI)
Troubleshoot a Flow
Use Inbuilt Visibility Engine
Faults
Eligible Path
Drops
Maintenance Upgrade #1
Upgrade APIC
Maintenance Upgrade #2
Create Groups
Cisco Nexus Dashboard Fabric Controller (NDFC) With Nexus Dashboard
Part of Nexus Dashboard
Simple to automate, simple to consume
1. Single plane to build and manage multisite fabrics
2. NDFC is an application that you can invoke in a single installation for SAN and for the fabric
3. IPAM integration for VXLAN EVPN fabrics
4. Microservices scaled mode: active/active, increased scale for managed/monitored objects
5. Manages IOS-XE and IOS-XR platforms
6. Granular RBAC applicable to multisite fabric management
[Diagram: Nexus Dashboard services (Insights, Orchestrator, Data broker, Fabric controller, Fabric discovery, SAN controller). Consume all services in one place across private cloud, third-party apps and public cloud]
NDFC Topology View Detailed Views
• Dynamic Arrangement
• Multi-Fabric/Overlay
• Arrange by Tier
• Core, Agg, Access, Leaf, Spine etc.
• Metadata Tags
• Device Pop-Over
Capacity Dashboard
View the capacity of a non-ACI fabric
Efficient Operations at the DC (AIOps): Nexus Dashboard
Note: Application insights using AppD integration with Nexus Insights upgrade is not covered in this session
Unified Datacenter Controller Views
Nexus Dashboard
#1 Benchmarking
#2 Upgrade planning
Verify Fabric Communication & relationship
Analyse the Anomalies
Pre-Change Analysis using Nexus Dashboard
Analyzes and reveals the impact of intended configuration changes.
[Diagram: the base snapshot (the current configuration) goes through assurance analysis to produce results; the pre-change snapshot goes through the same assurance analysis to produce results for comparison]
Other features: anomaly detection and correlation (software, config or hardware), upgrade pre-check and post-check across multiple fabrics, data plane dependency mapping and microburst detection
Create and Run a Pre-Change Analysis
Pre-Change Analysis is under the “Change Management” Option in the main
panel. Users can create, edit, clone or delete a pre-change analysis.
Network Compliance Use Cases
• Golden Configuration
• Naming Convention
Examples:
• EPGs in the SecureVRF_PCI tenant must be segmented
• EPGs in the VDI tenant must always talk to the DHCP service in tenant common
View Compliance Analysis Status
Compliance is a top-level option in
the main panel.
On the Compliance page, the
summary of the compliance
analysis of the site is shown for the
selected time window.
Upgrade Assist
Overview Upgrade Assist
How to do an upgrade analysis Reference
• To start an upgrade analysis, select
Change management
analysis -> Firmware upgrade analysis
3. Check all the boxes against the nodes you wish to analyze
• Wait for the results to populate
• Double click any of the listed analysis results to open and see detailed results
The areas checked during pre-upgrade analysis for ACI are listed below
1 Critical faults No Critical Faults from System -> Faults
2 OOB management IP Ensure static OOB Management IP is configured
3 vPC nodes Configure vPC for the listed leaf nodes to avoid traffic loss during the reboot of leaf nodes.
4 Route reflectors Configure vPC for the listed leaf nodes to avoid traffic loss during the reboot of leaf nodes.
5 NTP status Configure NTP to avoid any issues in DB sync between nodes, SSL certificate check, etc.
6 Infra VLAN id Check if configured infra VLAN ID are same across nodes
7 Fabric Recovery Enabled Check if Fabric Recovery is in progress
8 CIMC compatibility Check if running recommended CIMC version
9 Version compatibility Check version is compatible or multi-hop upgrade is needed
10 Target firmware check Check if image is copied and available in device
11 SNMPv3 auth compatibility Check SNMPv3 authorization and/or privacy
12 Remote leaf compatibility Check if remote leaf is not supported in the target version.
13 Multi-Tier compatibility Check if Remote leaf is not supported in the target version
14 Bootflash storage Check for space in bootflash folder to download image
15 Spine redundancy Check if each pod upgrades spine nodes with at least two separate groups to avoid traffic loss. Spines should not be in maintenance group.
The areas checked for during pre-upgrade analysis for NXOS are listed below
# Validation Step Description
1 validate po summary Check if all port-channel members are in (P) state
2 validate vpcs Check if vpc status is “up”
3 validate vpc role Check if local vPC role is secondary
4 validate sticky bit Check if local switch vPC sticky bit is False
5 validate hsrp state Check if "hsrp mgo state" is Active/Standby
6 validate mods Check if module in ok/active/standby state and diag pass
7 validate ospf Check if OSPF is in FULL or FULL/DR state
8 validate bgp Check if BGP session is in Up State
9 validate free space Check if bootflash on active/standby supervisor is greater than threshold
10 validate logging nvram Check for Severity 1, 2 or 3 messages
11 validate logging log Check for Severity 1, 2 or 3 messages
12 validate mod exceptions Check for non-user initiated resets
13 validate redundancy Check if redundancy status is "Active with HA standby" for EOR
14 validate reset reason Check if module was reset due to reasons other than those initiated by user
15 validate system reset reason Check if module was reset due to reasons other than those initiated by user
16 validate diag module Checks if previously initiated diag had shown failure
17 validate flash ramdisk Check if all filesystems are equal to or below integer percentage
18 validate console mgmt Check if console register bits are RTS|DTR|DSR
19 validate environment Check if all modules are in ok state and backup power present
20 validate arp table Check if certain number of ARP'S are in INCOMPLETE
21 validate cores Check if core files are present
22 validate device connectivity Check if we are able to reach the device with PolicyGateway/NDFC/SIM
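A few of the validation steps above can be sketched as small Python checks run against collected device state. This illustrates the pattern only and is not how Nexus Dashboard implements it. The `state` dictionary layout, its keys and the pass criteria (for example, that the core-file check passes when no core files exist) are assumptions.

```python
def validate_po_summary(state):
    """Pass when every port-channel member is in (P) state."""
    return all(flag == "P" for flag in state["port_channel_members"].values())


def validate_free_space(state):
    """Pass when bootflash free space is greater than the threshold."""
    return state["bootflash_free_mb"] > state["free_space_threshold_mb"]


def validate_cores(state):
    """Pass when no core files are present (assumed pass criterion)."""
    return not state["core_files"]


CHECKS = {
    "validate po summary": validate_po_summary,
    "validate free space": validate_free_space,
    "validate cores": validate_cores,
}


def run_checks(state):
    """Return the names of the checks that failed, in table order."""
    return [name for name, check in CHECKS.items() if not check(state)]


if __name__ == "__main__":
    sample = {
        "port_channel_members": {"Eth1/1": "P", "Eth1/2": "D"},
        "bootflash_free_mb": 4096,
        "free_space_threshold_mb": 2048,
        "core_files": [],
    }
    print(run_checks(sample))
```

Keeping each check as a separate function makes it easy to add further rows of the table without touching the runner.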
Migration Best Practices
Datacenter Migration – Strategy and Approach
[Diagram: source architecture (Spanning Tree, FabricPath) migrating to VXLAN or ACI]
1 Analysis & Planning
• Architecture Governance
• Target Architecture definition
• Scale Considerations
• Impact Analysis
• Prechecks and Validations
2 Application Dependency Mapping
• Nexus Dashboard or Cisco Secure Workload (flow metrics or ADM)
• Open-source automation
• Business Process
• Server to port Mapping
• Automated policy generation
6 Post Migration monitoring and support
• APM monitoring
• Issue resolution and SLA
• Detailed Health analysis
Discovery
Top-down approach
− Core business processes
− Current DR capability
− Available downtime window for migration
Bottom-up approach
− Databases, Application Servers, Web Servers
Datacenter Migration Scenarios and Considerations
• Gateway Considerations
Visibility Considerations and Best Practices for Migration
Application Dependency Mapping
Application Flow Map
1. Selecting Application for Migration
b) Latency considerations
Network Considerations for Datacenter Migration
[Diagram: existing/brownfield datacenter (Core, Distribution) interconnected with the new VXLAN EVPN datacenter (Border Leaf)]
1 Migration Planning
• Build a parallel VXLAN fabric
2 Layer 2
• Establish L2 connection between the legacy and new VXLAN fabric
• Dedicated leaf for L2

• Establish dedicated L3 interconnect between the 2 fabrics
• No summarization on border routers during migration
4 Overlapping VLANs
• VLAN translation on the migration leaf
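Planning the VLAN translation for overlapping VLANs can be sketched as a small helper. This is a hypothetical planning aid, not a device feature. It assumes that the same VLAN ID in both environments denotes different segments, and that the translated IDs are drawn from an operator-chosen pool.

```python
def plan_vlan_translation(legacy_vlans, fabric_vlans, pool):
    """Map each overlapping legacy VLAN ID to an unused ID from the pool."""
    overlapping = sorted(set(legacy_vlans) & set(fabric_vlans))
    in_use = set(legacy_vlans) | set(fabric_vlans)
    free = [vlan for vlan in pool if vlan not in in_use]
    if len(free) < len(overlapping):
        raise ValueError("translation pool is too small")
    return dict(zip(overlapping, free))


if __name__ == "__main__":
    legacy = [10, 20, 30]
    fabric = [20, 30, 40]
    # Hypothetical pool reserved for translated VLANs on the migration leaf
    print(plan_vlan_translation(legacy, fabric, range(3000, 3010)))
```

The resulting map lists, per overlapping legacy VLAN, the ID to present toward the new fabric on the migration leaf.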
Gateway Migrations from Legacy to EVPN environment
Gateway Consideration
• Pre migration
[Diagram: the legacy architecture uses an FHRP approach for the gateway in CE and connects over the DCI (L2 and L3) to the new architecture, where each vPC leaf pair hosts the DAG 192.168.1.1]
ACI Network Centric Deployment
Network configuration
• VRF CORP …. vrf configuration
[Diagram: Tenant: Example-Corp, VRF: Corp, with Corp-L3out toward the WAN/Internet]
ACI Migration Example
[Diagram: APIC-managed Blue tenant and context with BD Blue_1 (VLAN 10,11) and BD Blue_2 (10.10.11.1/24), EPGs, L2Out blue_1, L2Out blue_2 and an L3Out; Layer 3 routing (static, OSPF, BGP) over 1.1.1.0/30 and 1.1.1.12/30; policies migration]
✓ Simple to deploy compared to old DCI technology and uses EVPN concepts
✓ Supports endpoints connected to the BGW, which is cost effective for smaller fabrics
[Diagram label: Dedicated anycast BGW]
Legacy Network to VXLAN EVPN Migration using Border Gateway
• Introduce a pair of BGWs to the legacy sites
✓ Back-to-back vPC provides multipath connectivity
[Diagram label: Layer 3 inter-site network]
Transition Legacy Network to VXLAN EVPN using Border Gateway – Final Step
• Build a parallel Nexus 9000 Hub and Spoke EVPN VXLAN fabric
✓ Workload migration commences at this point.
[Diagram: BGWs on both sides connected by a VXLAN overlay across the Layer 3 inter-site network]
Legacy Network to VXLAN EVPN Migration using
Border Gateway
Layer 3 Inter site Network
VTEP VTEP
VTEP VTEP
Change Management
Maintenance Windows – Golden Rules
• Change Review Board
Traditional vPC Environment Change
Change Best Practice and Window
[Diagram: primary and secondary vPC peers at the core, vPC and FEX at the access]
Core Isolation
1. Graceful L3 Protocol Isolation
2. Layer 2 Isolation
  • vPC
3. Interface Isolation
Using GIR mode, steps 1-3 can be achieved prescriptively.
Access Isolation
1. Layer 2 Isolation
  • vPC
2. Interface Isolation
  1. Fex-fabric (include/exclude)
  2. Dual-attached FEX procedure * Recommended
Using GIR mode, steps 1-2 can be achieved prescriptively.
NOTE: Maintenance mode consideration should be based on Fex-fabric connectivity.
If the change window is for a software upgrade or spot fix, consider ISSU or SMU feasibility.
L3 Environment
Change Best Practice and Window
Core Isolation
1. Graceful L3 Protocol Isolation
2. Interface Isolation
Using GIR mode, steps 1-2 can be achieved prescriptively.
Access Isolation (Layer 3)
1. L3 Protocol Isolation
2. Layer 2 Isolation
  • vPC
3. Interface Isolation
  1. Fex-fabric (include/exclude)
  2. Dual-attached FEX procedure * Recommended
Using GIR mode, prescriptive isolation is possible.
If the change window is for a software upgrade or spot fix, consider ISSU or SMU feasibility.
Summary
Putting it all Together
ISSU ✓ X ✓
GIR + Cold Boot ✓ X ✓
GIR + Disruptive Installer ✓ X ✓
SMU Restart ✓ X X
GIR + SMU ISSU ✓ X X
GIR X ✓ X
Datacenter Operations – Key Takeaways
• Understand the features and how they relate to best practices
• Apply software and hardware best practices when deploying the data center
• Change Management features
• Isolation (GIR)
• Tools used to manage the DC environment
• Day 0, 1, 2: use controller-based automation
• Assurance: use Nexus Insights to manage the DC
Datacenter Migrations – Key Takeaways
1. Verify environment conforms to data center networking best
practices, and leverage DC controllers
2. Isolate Node to minimize the disruption - leverage features like
GIR for change window planning
3. Leverage the Migration methodology and use cases to customize
your transformation
Thank you