Not to publicly shame Cisco or anything, but to publicly shame Cisco…

We’re moving a site to a different physical location next week, so I wanted to do a round of patching beforehand to clear out any lingering software bugs. Our IOS-XE ISR 4431 routers and 3650 switch stack were on 16.12.04, which is very current, but I noticed 16.12.05 came out last week. My general rule of thumb is to never try a Cisco IOS version until it’s been out for at least a month, but you’d think a release ending in a ‘5’ and marked as MD would be good to go, right?

Oh…Cisco. Fool me once, shame on you. Fool me twice? We can’t get fooled again…

So the router came up fine with no errors and seemed to check out, but I soon realized the DMVPN tunnel showed no EIGRP neighbors and ARP showed all entries incomplete. I ultimately had to come in through a console backdoor and noticed the core switch had suspended the LACP bundle, even though the router reported interface Port-Channel1 up/up. Huh.

Looking closer at the router configuration, I spotted the problem. What had been a working LACP configuration in IOS-XE 16.12.04 had become a forced (static) EtherChannel on 16.12.05, with no mode keyword left on the channel-group lines:

interface GigabitEthernet0/0/0
 no ip address
 negotiation auto
 channel-group 1 
!
interface GigabitEthernet0/0/1
 no ip address
 negotiation auto
 channel-group 1 
!

Thus, the core switch (a Nexus 93180YC-EX), which was still configured as LACP passive, had rightfully suspended the bundle.

Recreating the bundles as LACP made them functional again:

conf t
interface GigabitEthernet0/0/0
 no channel-group 1 
!
interface GigabitEthernet0/0/1
 no channel-group 1 
 channel-group 1 mode active
!
interface GigabitEthernet0/0/0
 channel-group 1 mode active
!
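For anyone who wants to catch this kind of silent change before the remote end does, here is a minimal Python sketch that scans a saved copy of the running config for channel-group lines with no LACP mode. The function name and the sample config are my own illustration, and it assumes (as happened here) that a missing or non-LACP mode keyword means the member is being bundled statically.

```python
import re

# Matches "channel-group <id>" with an optional "mode <mode>" suffix.
# Lines starting with "no channel-group" deliberately do not match.
CHANNEL_GROUP = re.compile(r"^\s*channel-group\s+(\d+)(?:\s+mode\s+(\S+))?\s*$")


def find_static_members(config_text):
    """Return (interface, group) pairs whose channel-group is not LACP.

    Hypothetical helper: expects plain text as saved from the router,
    e.g. the output of "show running-config".
    """
    flagged = []
    current = None
    for line in config_text.splitlines():
        if line.startswith("interface "):
            # Remember which interface the following indented lines belong to.
            current = line.split(None, 1)[1].strip()
            continue
        match = CHANNEL_GROUP.match(line)
        if match and current is not None:
            group, mode = match.groups()
            # Assumption: anything other than active/passive is not LACP.
            if mode not in ("active", "passive"):
                flagged.append((current, int(group)))
    return flagged


# Illustrative sample: one member lost its mode, the other kept it.
sample = """\
interface GigabitEthernet0/0/0
 no ip address
 negotiation auto
 channel-group 1
!
interface GigabitEthernet0/0/1
 no ip address
 negotiation auto
 channel-group 1 mode active
!
"""

print(find_static_members(sample))
```

Run against config captures taken before and after an upgrade, an empty list beforehand and a non-empty one afterwards would point straight at the problem above.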

Ironically, LACP support is one of the reasons I had retired our working-perfectly-fine 2921s in favor of the 4431s.

Once again, this example proves three important points:

  • IOS-XE, despite being out for like 10 years, is still buggy
  • Code developed in non-MD trains is being slipped into the MD train and bringing new bugs along with it. In this case, https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvw74609
  • Cisco’s branding of “MD” is meaningless anyway, because maintenance releases clearly are not undergoing adequate regression testing