Nexus C93180YC-EX - link flapping

Hello,

I have C93180YC-EX switches in a vPC running firmware version 9.3(10). I have a Synology NAS connected to them via mLACP, and every time I reboot the NAS, the ports on both switches transition into the errdisable state.

Initially, the Nexus switches report that LACP is not enabled on the peer device, so the port channel member (i.e., the physical port) gets suspended. See the log:

%LACP-5-LACP_SUSPEND_INDIVIDUAL: LACP port Ethernet1/11(0x1a001400) of port-channel Ethernet1/11(0x1a001400) not receiving any LACP BPDUs suspending (individual) port
%LACP-5-LACP_SUSPEND_INDIVIDUAL: LACP port Ethernet1/11(0x1a001400) of port-channel Ethernet1/11(0x1a001400) not receiving any LACP BPDUs suspending (individual) port
%ETH_PORT_CHANNEL-5-PORT_SUSPENDED: Ethernet1/11: Ethernet1/11 is suspended

Afterward, I see the following messages in the log:

%ETHPORT-5-IF_DUPLEX: Interface Ethernet1/11, operational duplex mode changed to Full
%ETHPORT-5-IF_RX_FLOW_CONTROL: Interface Ethernet1/11, operational Receive Flow Control state changed to off
%ETHPORT-5-IF_TX_FLOW_CONTROL: Interface Ethernet1/11, operational Transmit Flow Control state changed to off
%ETHPORT-5-SPEED: Interface port-channel111, operational speed changed to 10 Gbps
%ETHPORT-5-IF_DUPLEX: Interface port-channel111, operational duplex mode changed to Full
%ETHPORT-5-IF_RX_FLOW_CONTROL: Interface port-channel111, operational Receive Flow Control state changed to off
%ETHPORT-5-IF_TX_FLOW_CONTROL: Interface port-channel111, operational Transmit Flow Control state changed to off
%ETHPORT-5-SPEED: Interface Ethernet1/11, operational speed changed to 10 Gbps
%ETHPORT-5-IF_DUPLEX: Interface Ethernet1/11, operational duplex mode changed to Full
%ETHPORT-5-IF_RX_FLOW_CONTROL: Interface Ethernet1/11, operational Receive Flow Control state changed to off
%ETHPORT-5-IF_TX_FLOW_CONTROL: Interface Ethernet1/11, operational Transmit Flow Control state changed to off
%ETHPORT-5-SPEED: Interface port-channel111, operational speed changed to 10 Gbps
%ETHPORT-5-IF_DUPLEX: Interface port-channel111, operational duplex mode changed to Full
%ETHPORT-5-IF_RX_FLOW_CONTROL: Interface port-channel111, operational Receive Flow Control state changed to off
%ETHPORT-5-IF_TX_FLOW_CONTROL: Interface port-channel111, operational Transmit Flow Control state changed to off

Finally, the port transitions into the errdisable state:

%ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface Ethernet1/11 is down (Error disabled. Reason:Too many link flaps)

I am not sure:

a) Whether these two occurrences (i.e., LACP not active and “Too many link flaps”) are related in any way.

b) Whether there is a way to configure errdisable recovery per port, so that it does not have a global impact on the entire switch.

Thank you in advance for your responses.

Hello Michal

Based on the information you shared, the issue appears to come down to two factors. First, the Synology NAS does not send LACP PDUs in time while it reboots, so the Nexus suspends the port-channel member links. Second, the links flap several times in a short period, which triggers the “Too many link flaps” errdisable mechanism. Both appear to result from the timing mismatch between LACP negotiation and the NAS reboot process, so I believe they are related to each other.

So it is indeed quite likely that the LACP break triggers the flaps contributing to the “Too many link flaps” error. Here are some guidelines that may help you to troubleshoot and tweak some configs to eliminate the behavior you are seeing:

  1. From the NAS end, make sure that LACP is set up correctly and is set to active or balance-tcp depending on the bonding options available. Some NAS devices briefly remain in a non-LACP or “standby” state when they reboot, causing the Nexus side to suspend ports. If there is a configuration parameter that minimizes or eliminates this behavior, enable it. The config options on the NAS end are somewhat limited, so you may find that little change can be effected from this end.
  2. From the Nexus end, I suggest you use longer LACP timers to give the NAS time to reboot. Make sure that the lacp rate fast command is not configured on the interfaces facing the NAS. That way LACP PDUs are exchanged every 30 seconds (with a 90-second timeout) rather than every 1 second (with a 3-second timeout). If the option is available on the NAS end, do the same there.
  3. Concerning the errdisable state, consider increasing your link flap errdisable thresholds on the Nexus switches. You can also enable errdisable recovery so that the port returns to an operational state after a predetermined period of time, without the need for manual intervention.
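
As a rough sketch of points 2 and 3, the configuration could look something like the following on each vPC peer. The interface name is taken from your logs, and the 300-second interval is only an illustrative value, so please check the exact syntax against the command reference for 9.3(10) before applying it. Note that the two errdisable recovery commands are entered in global configuration mode, so they apply to every port that is errdisabled for the link-flap cause and not just to Ethernet1/11.

```
! Sketch only: Ethernet1/11 comes from the logs above.
interface Ethernet1/11
  ! Ensure the slow (30-second) LACP rate is in use on the NAS-facing port
  no lacp rate fast
!
! Global commands: automatically recover ports errdisabled due to link flaps
errdisable recovery cause link-flap
! Illustrative value, in seconds
errdisable recovery interval 300
```

After the next NAS reboot, commands along these lines should show whether the member port rejoined the port channel and whether any port is still errdisabled:

```
show port-channel summary
show lacp counters interface port-channel 111
show interface status err-disabled
```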

These are general guidelines that should help you resolve the issue. Let us know your results!

I hope this has been helpful!

Laz