SW_MATM-4-MACFLAP_NOTIF err-disabled

Hi all,

we have a medium-sized network with about 150 switches (many VLANs) and we experienced a major outage yesterday. Apparently due to a failed network device, we got MAC flapping on one VLAN, and within a few seconds almost the whole network went down because many of the switches' uplinks (configured as trunks) had been err-disabled. Now my question is: there are many err-disable recovery options in IOS, but I cannot find out whether any of them addresses the issue we experienced. To me it seems none of them does. Can anyone confirm this or correct me?

Thanks
Daniel

Hi Daniel,

A big flat L2 network is a big risk. When something happens, it can bring the entire network down like you witnessed.

Which exact error message put your interfaces in errdisable mode?

The “%SW_MATM-4-MACFLAP_NOTIF” error is reported through syslog, but by itself it doesn’t put interfaces into err-disable mode.

Rene

Hi Rene,

thanks for your reply. That’s what I thought too, but the only message in the switches' logs is the MAC flap, i.e. “Host f0de.f181.119c in vlan 1 is flapping between port Te1/0/1 and port Gi2/0/16”, and the uplinks got err-disabled. I had to go to each switch and manually do a shutdown and no shutdown, which took me several hours. The question is not so much why it happened but whether it is possible to recover from it automatically. But of course I understand that when the cause cannot be determined, you cannot tell how to recover from it. :slight_smile:

Daniel

Hi Daniel,

I wasn’t sure so I did a little experiment last night :slight_smile: One switch, two hosts with the same MAC/IP address sending pings to some IP address. This produces the “%SW_MATM-4-MACFLAP_NOTIF” error through syslog non-stop but it doesn’t cause an interface to go into err-disable mode. I left it running for hours :slight_smile:

Do you use an external syslog server? The local log of your switches (show logging) probably got swamped with MAC flapping messages, which is probably why you don’t see the log line that explains why the interfaces went into errdisable mode.
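
If a port is still down, the switch itself can usually tell you the reason. A quick sketch with standard IOS show commands (the ERR_DISABLE filter assumes the usual %PM-4-ERR_DISABLE syslog line is still in the local buffer, which may not be the case if it was overwritten by the flapping messages):

```
Switch#show interfaces status err-disabled
Switch#show logging | include ERR_DISABLE
```

The first command lists each err-disabled port together with its reason, and that reason is what you match against the recovery causes.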

You can enable autorecovery, but you will need to know the exact reason why they went into errdisable. It has to be something from this list:

Switch#show errdisable recovery 
ErrDisable Reason            Timer Status
-----------------            --------------
arp-inspection               Disabled
bpduguard                    Disabled
channel-misconfig (STP)      Disabled
dhcp-rate-limit              Disabled
dtp-flap                     Disabled
gbic-invalid                 Disabled
inline-power                 Disabled
l2ptguard                    Disabled
link-flap                    Disabled
mac-limit                    Disabled
link-monitor-failure         Disabled
loopback                     Disabled
oam-remote-failure           Disabled
pagp-flap                    Disabled
port-mode-failure            Disabled
pppoe-ia-rate-limit          Disabled
psecure-violation            Disabled
security-violation           Disabled
sfp-config-mismatch          Disabled
storm-control                Disabled
udld                         Disabled
unicast-flood                Disabled
vmps                         Disabled
psp                          Disabled
dual-active-recovery         Disabled
evc-lite input mapping fa    Disabled
Recovery command: "clear     Disabled

If you don’t use logging, set up an external logging server so you can see it next time :slight_smile: Or enable some of these autorecovery features beforehand :slight_smile:
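
As a rough sketch of both suggestions: the 300-second interval and the syslog server address 192.0.2.10 are placeholders I picked for illustration, not values from your network, and "cause all" is only a starting point until you know the real reason.

```
Switch#configure terminal
Switch(config)#errdisable recovery cause all
Switch(config)#errdisable recovery interval 300
Switch(config)#logging host 192.0.2.10
Switch(config)#logging trap informational
Switch(config)#end
Switch#show errdisable recovery
```

Keep in mind that autorecovery only brings the port back up after the timer expires. If the underlying problem is still there, the port will simply be err-disabled again, so it is better to narrow "cause all" down to the specific cause once you have found it.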

Rene