ccie blog

Loop Guard and UDLD

Loop guard and UDLD are two ways to protect the network against loops caused by a faulty fiber link.  In short, loop guard is a spanning-tree optimisation, and UDLD is a layer 1/2 protocol (unrelated to spanning-tree) that detects unidirectional links so that upper layer protocols such as spanning-tree do not end up causing loops.  To explain these features clearly, see the diagrams below.  The first diagram is the layer 2 spanning-tree topology, and the second diagram is the actual physical wiring used in the topology. You will need to refer to both diagrams together in order to follow how loop guard and UDLD work in the examples I will provide.

[Figure: Loop Guard and UDLD. Layer 2 spanning-tree topology and physical wiring.]

In case you are not familiar with fiber, make sure you understand the connection between Sw2 and Sw3 in the diagram on the right hand side.  It consists of two physical strands: one transmits data and the other receives data. These fibers are usually plugged into an SFP such as the one shown below, and the SFP is then inserted into the switch. The switch shows this as one physical port. In my diagram, it’s shown as Gi0/1 on Sw2 and Sw3.

[Figure: SFP]

So then, let’s start off with loop guard and its purpose. In the spanning-tree topology, Sw2’s Gi0/1 port is currently in the alternate (ALTN) discarding state (in legacy spanning-tree, this is just known as the blocking state). Sw3 is therefore responsible for forwarding frames on the segment between Sw2 and Sw3.

So what happens if the fiber that carries BPDUs from Sw3’s Tx port down to Sw2’s Rx port breaks for some reason? Sw3, although its port is the designated port for the segment, is unable to get BPDUs to Sw2. Sw2 therefore believes there is no other switch on the segment out of its Gi0/1 interface and transitions the port into the designated role/forwarding state (remember, a port must keep receiving BPDUs in order to stay in the ALTN/blocking state). When a port moves into the forwarding state, it is allowed to learn MAC addresses and to send and receive user traffic.

What this effectively means is that we have created a unidirectional loop. To illustrate this, assume a host connected to Sw1 sends a broadcast ARP (refer to the diagram below for this scenario, and note that Sw2 and Sw3 both have designated ports on the segment between them). Sw1 forwards the broadcast frame out all ports except the one it was received on, so Sw2 and Sw3 both get the frame. Sw2 is able to forward the frame to Sw3 because its port is now designated/forwarding. Sw3, when it receives this, forwards the frame out all ports except the one on which it was received, and so sends it to Sw1. And so on: there is an endless unidirectional loop from Sw1 -> Sw2 -> Sw3 -> Sw1.

The key point is that Sw2 is able to send data in the forwarding plane out of Gi0/1 and no port on the segment is in a blocking state, so user traffic ends up being forwarded back upstream towards the root bridge. This is what creates the loop in the network, and what UDLD and loop guard can be used to protect against.
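To make the loop concrete, here is a rough Python sketch of the three-switch triangle. It is only a toy model under my own assumptions: the `flood` function, the `blocked` and `broken` sets and the hop cap are invented for illustration and do not correspond to anything on a real switch.

```python
from collections import deque

# Physical triangle from the diagrams: each switch lists its neighbours.
LINKS = {"Sw1": ["Sw2", "Sw3"], "Sw2": ["Sw1", "Sw3"], "Sw3": ["Sw1", "Sw2"]}

def flood(blocked, broken=frozenset(), max_hops=12):
    """Flood one broadcast entering at Sw1 and count delivered transmissions.

    blocked:  (switch, neighbour) ports that neither send nor accept user frames.
    broken:   (sender, receiver) fiber directions that are physically dead.
    max_hops: cap on the simulation, because a real loop never ends.
    """
    sent = 0
    queue = deque([("Sw1", None, 0)])  # (switch, arrived_from, hops so far)
    while queue:
        switch, came_from, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nbr in LINKS[switch]:
            if nbr == came_from:
                continue  # never flood back out of the receiving port
            if (switch, nbr) in blocked or (nbr, switch) in blocked:
                continue  # a blocking port does not pass user traffic
            if (switch, nbr) in broken:
                continue  # the frame is lost on the dead strand
            sent += 1
            queue.append((nbr, switch, hops + 1))
    return sent

# Healthy network: Sw2 Gi0/1 is blocking, the broadcast dies after two copies.
healthy = flood(blocked={("Sw2", "Sw3")})

# Sw3 Tx -> Sw2 Rx strand broken and Sw2 has wrongly gone forwarding.
looped = flood(blocked=set(), broken={("Sw3", "Sw2")})
```

In the healthy case the frame is delivered twice and stops. In the failure case the count keeps growing with `max_hops`, which is the Sw1 -> Sw2 -> Sw3 -> Sw1 loop described above.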

[Figure: Loopguard Loop]

Enabling loop guard on Sw2 Gi0/1 stops this ALTN port from erroneously transitioning into the forwarding state for the segment when BPDUs stop arriving on the port, i.e. it stops the port going from an ALTN port to a designated port when BPDUs are no longer received. In our scenario, where the fiber connected to Sw2’s Rx port broke, BPDUs are no longer received from Sw3, so the port is put into the loop inconsistent state (which is basically an STP blocking state) until BPDUs are received again.
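The decision loop guard makes can be sketched in a few lines of Python. This is a simplification under my own assumptions: the function names and the role/state strings are made up for illustration, and real spanning-tree recomputes roles rather than flipping a single value.

```python
# Ports that rely on received BPDUs to stay where they are.
NON_DESIGNATED_ROLES = {"root", "alternate"}

def on_bpdu_loss(role, loop_guard):
    """Return the (role, state) a port takes once BPDUs stop arriving."""
    if role not in NON_DESIGNATED_ROLES:
        raise ValueError("only root and alternate ports depend on received BPDUs")
    if loop_guard:
        # Loop guard keeps the role and holds the port in a blocking-like state.
        return role, "loop-inconsistent"
    # Without loop guard the port assumes nobody else is on the segment.
    return "designated", "forwarding"

def on_bpdu_resume(role):
    """Once BPDUs are seen again the port leaves loop-inconsistent by itself."""
    return role, ("forwarding" if role == "root" else "discarding")

without_guard = on_bpdu_loss("alternate", loop_guard=False)
with_guard = on_bpdu_loss("alternate", loop_guard=True)
```

For Sw2 Gi0/1, `without_guard` is the dangerous designated/forwarding result and `with_guard` keeps the port out of the forwarding plane until BPDUs come back.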

If we were to use UDLD instead of loop guard, it would effectively achieve the same thing. When the fiber to Sw2’s Rx port fails and UDLD is in aggressive mode, the port is put into the err-disabled state. The way UDLD works out that there is a unidirectional link (i.e. just one strand of the fiber is broken) is pretty cool. Each switch sends out periodic Ethernet multicast UDLD hellos destined to 0100.0ccc.cccc, listing its own device ID, port ID, time-out value, and a bunch of other parameters. When a switch receives this UDLD frame, it does two things: it caches the neighbor’s information, and it echoes the device ID and port ID it just received back towards the originating switch in its own UDLD hellos. When the originating switch sees a UDLD frame come in carrying its own device ID and port ID, it knows a UDLD neighbor exists out of the interface and that the link works in both directions. These multicast hellos are used to build and maintain the neighbor relationship, and must be received before the time-out interval expires in order to keep the neighbor alive from a UDLD perspective.
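The echo mechanism is easier to see in code. The following Python sketch models only the idea described above; the `UdldPort` class and the dictionary "frame" are my own invention and are not the real UDLD packet format.

```python
from dataclasses import dataclass, field

@dataclass
class UdldPort:
    device_id: str
    port_id: str
    neighbors: set = field(default_factory=set)  # (device, port) pairs heard here
    bidirectional: bool = False

    def hello(self):
        """Build a hello: our own identity plus an echo of everyone we have heard."""
        return {"device_id": self.device_id, "port_id": self.port_id,
                "echo": set(self.neighbors)}

    def receive(self, frame):
        """Cache the sender, then check whether it echoed our own identity."""
        self.neighbors.add((frame["device_id"], frame["port_id"]))
        if (self.device_id, self.port_id) in frame["echo"]:
            self.bidirectional = True

# Healthy link: hellos flow in both directions.
sw2, sw3 = UdldPort("Sw2", "Gi0/1"), UdldPort("Sw3", "Gi0/1")
sw3.receive(sw2.hello())   # Sw3 learns Sw2
sw2.receive(sw3.hello())   # Sw2 sees itself echoed by Sw3
sw3.receive(sw2.hello())   # Sw3 sees itself echoed by Sw2

# Broken Sw3 Tx -> Sw2 Rx strand: only Sw2's hellos are delivered.
bad2, bad3 = UdldPort("Sw2", "Gi0/1"), UdldPort("Sw3", "Gi0/1")
for _ in range(3):
    bad3.receive(bad2.hello())
```

On the healthy link both ports end up with `bidirectional` set. On the broken link neither does: Sw2 hears nothing at all, and Sw3 hears Sw2 but never sees its own ID echoed back.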

So in my topology, when the fiber is broken on the Sw3 Tx port, UDLD on Sw2 notices that UDLD frames are no longer arriving on the Gi0/1 interface (frames that would normally list Sw3’s device ID and port ID). When the UDLD time-out period expires, the switch transmits 8 UDLD frames, one per second, and if no reply is received the port goes into err-disabled. This is the default action of aggressive mode. In normal mode, the port just goes into the unknown state, which is designed for an “informational purpose”. In reality, that’s just useless, so use aggressive mode to prevent loops.
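The difference between the two modes after the time-out can be written down as a small Python function. Treat it as a sketch of the behaviour described above only: the function name, the `reply_seen_at` parameter and the state strings are hypothetical.

```python
def on_neighbor_timeout(mode, reply_seen_at=None, retries=8):
    """Return the port state after the UDLD neighbor has timed out.

    mode:          "normal" or "aggressive".
    reply_seen_at: which one-per-second retry (1..retries) got an answer,
                   or None if the neighbor stayed silent.
    """
    if mode == "normal":
        return "unknown"  # informational only, the port stays up
    for second in range(1, retries + 1):
        if reply_seen_at == second:
            return "bidirectional"  # the neighbor answered a retry in time
    return "err-disabled"  # all retries went unanswered

normal_result = on_neighbor_timeout("normal")
aggressive_result = on_neighbor_timeout("aggressive")
recovered_result = on_neighbor_timeout("aggressive", reply_seen_at=5)
```

With the Sw3 Tx strand broken no retry is ever answered, so aggressive mode ends in err-disabled while normal mode only marks the link as unknown.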

The key difference between UDLD and loop guard, then, is that UDLD protects against mis-wiring of your fiber ports, or any physical wiring problem that would cause upper layer protocols like spanning-tree to break. Note though that UDLD is not a part of spanning-tree, nor does it play any part in a spanning-tree topology. It is merely there as a helper, because spanning-tree by itself is unable to identify a layer 1 fault like this that would cause a loop in the network. Loop guard, on the other hand, is a spanning-tree optimisation, and its function is to stop root or ALTN ports transitioning into the designated/forwarding state when BPDUs stop arriving. A lot of the time loop guard is going to kick in when there is a physical layer problem, as I showed in my example, but it can also protect against some spanning-tree misconfiguration or badly configured ACLs. For example, let’s say someone accidentally went to the Gi0/1 interface on Sw2 and configured spanning-tree bpdufilter enable. The port would neither send nor receive BPDUs, and it would become designated and cause a loop. If loop guard was pre-configured on the port, it would just go into the loop inconsistent state and be blocked. UDLD would be none the wiser, but loop guard would see this problem.

The recommended best practice is to use both UDLD and loop guard together. It’s also recommended to make sure that you tune your UDLD timers to detect a layer 1 problem faster than spanning-tree can transition a port into designated/forwarding state.
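As a back-of-the-envelope check for the timer tuning, here is a small Python sketch. The spanning-tree side uses the classic 802.1D values (max age 20 seconds, forward delay 15 seconds). The UDLD side is a loud assumption: I model detection as a hypothetical hold multiplier of 3 message intervals plus the 8 one-per-second retries mentioned earlier, so check your platform documentation for the real figures before relying on it.

```python
def stp_time_to_forwarding(max_age=20, forward_delay=15):
    """Seconds for a legacy STP blocking port to reach forwarding:
    max age, then the listening and learning forward-delay periods."""
    return max_age + 2 * forward_delay

def udld_detection_time(message_interval, hold_multiplier=3, retries=8):
    """Rough UDLD detection time in seconds (hold_multiplier is hypothetical)."""
    return message_interval * hold_multiplier + retries

def udld_wins(message_interval, max_age=20, forward_delay=15):
    """True if UDLD would shut the port before STP starts forwarding on it."""
    return udld_detection_time(message_interval) < stp_time_to_forwarding(
        max_age, forward_delay)

stp_seconds = stp_time_to_forwarding()   # classic 802.1D timers
fast_enough = udld_wins(7)               # example message interval of 7 seconds
too_slow = udld_wins(20)                 # example message interval of 20 seconds
```

Under these assumptions a 7 second message interval detects the fault well inside the spanning-tree window, while a 20 second interval would not. The point is only that the UDLD figure has to come out smaller than the spanning-tree one.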

4 Comments

Me, myself and I, June 19th, 2015 at 12:25 pm

Nice stuff. Thanks.

aravind, July 31st, 2015 at 8:18 am

hi, very use full nice and great

I need help for my ccie written exam.

Raito, October 30th, 2015 at 8:39 am

Nice article, just answered my question if the only purpose of loopguard was to detect unidirectional links.

Samuel, January 13th, 2016 at 1:14 pm

Awesome!!! you just nailed it!!!!
