STP Problem

NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
I have an idea of what's going on, but I want a second opinion. Can anyone tell by looking at the picture whether there is a loop? (Side note: this one switch alone produces about 150 packets of STP traffic per second.)


[Attached image: stp.jpg]

Comments

  • Forsaken_GAForsaken_GA Member Posts: 4,024
    If you have a loop, you will know it. Your network will be unusable within minutes.
  • XenzXenz Member Posts: 140
    Any other information you can give? It's hard to tell anything from a Wireshark capture that shows only 4 TCNs without any context.
    Currently working on:
    CCNP, 70-620 Vista 70-290 Server 2003
    Packet Tracer activities and ramblings on my blog:
    http://www.sbntech.info
  • kryollakryolla Member Posts: 785
    try this

    sh spanning-tree active det

    and watch your BPDU counts. That might tell you which port is the culprit; if possible, shut that port down and see whether spanning tree stabilizes.

    Also, do you have all the correct protections in place, such as PortFast and BPDU guard/filter?


    VLAN0005 is executing the ieee compatible Spanning Tree protocol
    Bridge Identifier has priority 32768, sysid 5, address 000e.d784.9180
    Configured hello time 2, max age 20, forward delay 15
    Current root has priority 32773, address 0009.b716.ae00
    Root port is 19 (FastEthernet0/19), cost of root path is 19
    Topology change flag not set, detected flag not set
    Number of topology changes 0 last change occurred 1w4d ago
    Times: hold 1, topology change 35, notification 2
    hello 2, max age 20, forward delay 15
    Timers: hello 0, topology change 0, notification 0, aging 300

    Port 19 (FastEthernet0/19) of VLAN0005 is root forwarding
    Port path cost 19, Port priority 128, Port Identifier 128.19.
    Designated root has priority 32773, address 0009.b716.ae00
    Designated bridge has priority 32773, address 0009.b716.ae00
    Designated port id is 128.13, designated path cost 0
    Timers: message age 1, forward delay 0, hold 0
    Number of transitions to forwarding state: 1
    Link type is point-to-point by default
    BPDU: sent 0, received 500826
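One way to act on the suggestion above is to save the command output at two points in time and compare the per-port BPDU counters. The following is a minimal Python sketch, assuming output shaped like the sample quoted above. The function names `bpdu_counts` and `fastest_growing` are made up for illustration, and the regular expressions have only been checked against that sample.

```python
import re

# A few lines copied from the sample output above (one port block).
SAMPLE = """\
Port 19 (FastEthernet0/19) of VLAN0005 is root forwarding
Port path cost 19, Port priority 128, Port Identifier 128.19.
Number of transitions to forwarding state: 1
BPDU: sent 0, received 500826
"""

PORT_RE = re.compile(r"^\s*Port \d+ \((?P<port>\S+)\) of (?P<vlan>\S+) is (?P<state>.+)$")
BPDU_RE = re.compile(r"^\s*BPDU: sent (?P<sent>\d+), received (?P<received>\d+)")


def bpdu_counts(text):
    """Return one dict per port block with its BPDU sent/received counters."""
    results = []
    current = None
    for line in text.splitlines():
        m = PORT_RE.match(line)
        if m:
            # Start of a new port block.
            current = {"port": m["port"], "vlan": m["vlan"], "state": m["state"].strip()}
            continue
        m = BPDU_RE.match(line)
        if m and current is not None:
            current["sent"] = int(m["sent"])
            current["received"] = int(m["received"])
            results.append(current)
            current = None
    return results


def fastest_growing(before, after):
    """Compare two snapshots; return (received_delta, sent_delta, vlan, port), largest first."""
    previous = {(r["vlan"], r["port"]): r for r in before}
    deltas = []
    for r in after:
        old = previous.get((r["vlan"], r["port"]))
        if old is not None:
            deltas.append((r["received"] - old["received"],
                           r["sent"] - old["sent"],
                           r["vlan"], r["port"]))
    return sorted(deltas, reverse=True)


if __name__ == "__main__":
    for row in bpdu_counts(SAMPLE):
        print(row)
```

Ports whose counters grow much faster than one BPDU per hello interval would be the first ones to look at.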
    Studying for CCIE and drinking Home Brew
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    @Forsaken_GA - point taken, but isn't the point of PVST to run a separate spanning tree within each VLAN? Wouldn't it then be possible to have a loop in only a single VLAN, making just that VLAN unusable? (Correct me if I'm wrong, which is quite possible.)

    @Xenz - There are about 15 switches with 5 or so VLANs, all with clients connected to them. The switches feed into a 16th that trunks via fiber out of the building to another location. The 16th switch is where all the STP traffic is occurring.

    @kryolla - good idea, thanks; going to give that a try now. PortFast and BPDU guard are enabled (or so I'm told by another engineer).
  • CyanicCyanic Member Posts: 289
    Spanning tree storms can make the physical devices unstable and affect all traffic flowing through them, regardless of VLAN.
  • XenzXenz Member Posts: 140
    Hopefully someone much smarter than me can explain this, since it is prime teaching material. Based on his limited **** information and on what he/she has told us, am I misunderstanding the capture and what is shown in the info column?

    The packets list all sorts of random root priorities. I would expect to see repetition (the same priority listed over and over), but they are all over the place. I assume the 15 switches in the network aren't all connected to this switch and that he's not capturing traffic from every interface. Nothing he/she could configure would affect the STP priority + system ID other than VLANs or lowering the priority. He's mentioned there are only 5 VLANs, I would assume not counting the defaults.

    I can't see the number of TCNs, but my guess for the high STP packet rate would be an interface flapping. At the same time, I'm sure he/she would have noticed large-scale network outages while STP tried to converge. Unless that's how this was detected?

    Is this just weird behavior that sometimes shows up, or am I missing something about STP that I really should know? I would appreciate any theories or anything else you guys can tell me.
  • kryollakryolla Member Posts: 785
    An interface flapping with PortFast enabled does not generate TCN BPDUs :).
  • XenzXenz Member Posts: 140
    Whoops, missed that post. Am I reading the capture wrong, or does anyone have ideas on why you would see multiple root priorities?
  • kryollakryolla Member Posts: 785
    I noticed that as well. I would like to know whether any ports are transitioning, or whether an unusual number of BPDUs are being sent upstream on the root port, which would mean either TCN BPDUs or BPDUs aging out due to congestion.
  • Forsaken_GAForsaken_GA Member Posts: 4,024
    There's not enough information to tell what's really going on just from the Wireshark capture. Assuming he's forwarding his syslogging to a syslog host (and if he's not, he should) it'd be helpful to ratchet up the log level and post some of that.
  • Forsaken_GAForsaken_GA Member Posts: 4,024
    NightShade03 wrote: »
    @Forsaken_GA - point taken, but isn't the point of PVST to run a separate spanning tree within each VLAN? Wouldn't it then be possible to have a loop in only a single VLAN, making just that VLAN unusable? (Correct me if I'm wrong, which is quite possible.)

    For normal convergence, sure. Broadcast storms don't really care about PVST, though; they just keep propagating packets indefinitely until the physical hardware has no resources left to process them.

    Everyone should cause a broadcast storm in a lab environment at least once, just so they can see the consequences, and know what it looks like if they ever see it in a live environment. I have had the misfortune to witness one in production, and it led to a 2 hour outage, as I had to physically cut off the remote location causing it until someone could drive out there to remedy the problem.
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    Wow, thanks for all the replies, guys/gals. I can provide you with more information if you'd like. I'm not the network engineer, so I have noooo clue how he has this thing configured (although I would imagine there are issues, because he doesn't understand the basics, so I'm not sure how he became a network engineer... but anyway). I can get access to the switch and provide output / large .pcap files if need be. What would make it easier?

    @Forsaken_GA - there is a syslog server but the engineer never configured it properly so I'll try to pull any logs I can off the local switch

    kryolla - you mentioned the output of show spanning-tree active detail....what would be a normal BPDU count assuming STP is functioning properly? It should have small to no sent and a larger amount of received, yes?
  • kryollakryolla Member Posts: 785
    BPDUs are originated by the root switch every hello interval, so start at the root switch: all of its ports should be designated and only sending. If it is receiving BPDUs and the count is increasing abnormally, then a downstream switch is sending a lot of TCN BPDUs.

    The next-hop switch will have one root port, and the rest of its ports will be either blocking or designated ports (DPs) and forwarding. The root port receives the BPDU, and the switch adds its own sending BID and port cost to it and floods it out all DPs. Blocking ports stay blocked and do not send BPDUs as long as they keep receiving a better BPDU. So look at all the DPs: you should see that they are mostly sending BPDUs, not receiving them. Once again, if a DP is receiving, a TCN is most likely happening, so I would investigate that.

    Enable spanning-tree logging for the states the ports go through, because as soon as the BPDU on a blocking port ages out (an indirect failure) or the RP goes down (a direct failure), the port will go through the listening and learning states. Loops are not that common anymore because companies don't use hubs. Check out this link for root guard and unidirectional links:

    Spanning Tree Protocol Problems and Related Design Considerations - Cisco Systems
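The relay step described above (receive on the root port, add the port cost, substitute your own bridge ID, send out the designated ports) can be sketched in a few lines of Python. This is only an illustrative model, not how any switch implements it. The `Bpdu` class and `relay` function are hypothetical names, and the egress port ID (128, 1) is invented. The other values are taken from the VLAN0005 output quoted earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Bpdu:
    # Field order matters: dataclass ordering compares fields top to bottom,
    # which mirrors the STP tie-break order. A "smaller" BPDU is superior.
    root_id: tuple        # (priority, MAC) of the root bridge
    root_path_cost: int   # sender's cost to reach the root
    sender_id: tuple      # (priority, MAC) of the sending bridge
    sender_port: tuple    # (port priority, port number)


def relay(received, my_id, root_port_cost, egress_port):
    """Build the BPDU a non-root switch sends out one of its designated ports."""
    return Bpdu(
        root_id=received.root_id,
        root_path_cost=received.root_path_cost + root_port_cost,
        sender_id=my_id,
        sender_port=egress_port,
    )


if __name__ == "__main__":
    root = (32773, "0009.b716.ae00")
    # BPDU heard on Fa0/19: designated path cost 0, designated port 128.13.
    rx = Bpdu(root, 0, root, (128, 13))
    # This switch: priority 32768 + sysid 5, port path cost 19.
    tx = relay(rx, (32773, "000e.d784.9180"), 19, (128, 1))
    print(tx.root_path_cost)  # 19, matching "cost of root path is 19"
    print(rx < tx)            # True: the BPDU from the root is superior
```

A port that keeps hearing a BPDU superior to the one it would send is the case where it stays blocking or becomes the root port rather than a designated port.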
  • CyanicCyanic Member Posts: 289
    Xenz wrote: »
    The packets list all sorts of random root priorities. I would expect to see repetition (the same priority listed over and over), but they are all over the place.

    If you notice, all the updates with different root priorities are from the same device. Also, if you look at the times, these are being sent way faster than they should be, about 2 ms apart. My guess is that there is nothing wrong with the priorities themselves, as the updates are for different root bridges, i.e., different VLANs.

    Forsaken is right; we don't have enough information from the screen capture to troubleshoot this effectively.
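As a rough sanity check on the rate, here is a small Python sketch. It assumes a port sends one configuration BPDU per STP instance (one per VLAN under PVST) per hello interval, and it ignores TCNs and any extra BPDUs a particular platform may send. The 2-second hello time comes from the output quoted in this thread; the function names are hypothetical.

```python
def expected_bpdu_rate(stp_instances, hello_seconds=2.0):
    """BPDUs per second on one port if every instance sends once per hello."""
    return stp_instances / hello_seconds


def instances_for_rate(packets_per_second, hello_seconds=2.0):
    """How many STP instances it would take to explain an observed rate."""
    return packets_per_second * hello_seconds


if __name__ == "__main__":
    for vlans in (5, 40):
        print(vlans, "VLANs:", expected_bpdu_rate(vlans), "BPDUs/s")
    print("150 BPDUs/s would need", instances_for_rate(150), "instances")
```

Under those assumptions, neither 5 nor 40 VLANs would account for roughly 150 packets per second by themselves, which is consistent with the feeling that the BPDUs are arriving too fast.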
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    Just curious, but if there are multiple VLANs with PVST running on them, shouldn't one of the ports be in a BLK state per VLAN? Isn't the point to have one port in a blocking state so that traffic doesn't just move around in a circle?
  • XenzXenz Member Posts: 140
    He noted there are only 5 VLANs configured, which is why the priorities look wacky. Using 32768 + VLAN ID, we should see only 5 values corresponding to his VLANs, yet well over 20 different priorities show up for this network.

    Since they all revolve around the default priority, my guess is that the network admin has let the root bridge be elected via defaults, which would likely give the same root bridge for each VLAN.

    I know the BPDUs are being sent way too fast; I never disputed it. What I was wondering is why a switch would be sending them way too fast with inconsistent root priorities.

    Kryolla is focused on the solution, as far as what could be causing the 150 pps flood and the TCNs. I'm just curious what would cause these inconsistencies to show.
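The "32768 + VLAN ID" arithmetic can be checked mechanically. With the extended system ID, the 16-bit bridge priority field carries the configured priority in the top 4 bits and the VLAN ID in the low 12 bits. The Python sketch below uses hypothetical helper names and decodes the two priorities that appear in this thread (32773 and 32769).

```python
def split_bridge_priority(value):
    """Split a 16-bit bridge priority into (configured priority, sys-id-ext)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("bridge priority must fit in 16 bits")
    configured = value & 0xF000   # top 4 bits, multiples of 4096
    sys_id_ext = value & 0x0FFF   # low 12 bits, the VLAN ID under PVST
    return configured, sys_id_ext


def expected_priorities(vlans, configured=32768):
    """Priorities a capture should show if every VLAN keeps the default."""
    return sorted(configured + vlan for vlan in vlans)


if __name__ == "__main__":
    print(split_bridge_priority(32773))   # (32768, 5)
    print(split_bridge_priority(32769))   # (32768, 1)
    print(expected_priorities([1, 5]))    # [32769, 32773]
```

Decoding each priority in the capture this way would show how many distinct VLAN IDs are really being advertised.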
  • XenzXenz Member Posts: 140
    NightShade03 wrote: »
    Just curious, but if there are multiple VLANs with PVST running on them, shouldn't one of the ports be in a BLK state per VLAN? Isn't the point to have one port in a blocking state so that traffic doesn't just move around in a circle?

    Kryolla is looking at this situation as an indirect failure in the topology, which allows the max age timer to expire and forces the ports to go through the states.

    The idea is that where a redundant link exists, one port on one side of the link will be in a blocking state. There can be multiple forwarding ports depending on the setup: the root port (the port closest to the root bridge) and the designated port(s) on redundant links.

    Think of a pyramid with the top being the root bridge. The links connecting to the root bridge end in root ports (RPs). The link across the base of the pyramid does not use root ports; it is a redundant link that could allow a loop. STP decides that one port on this link should be in a blocking state while the other becomes the designated port for the segment.

    PVST just means that an instance of STP runs for each VLAN, so there are 5 different STP topologies in the case of 5 VLANs. Figuratively speaking, a pyramid is built for each VLAN.
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    I understand your analogy; it just seems weird to me the way the configuration is set up. If I run "show spanning-tree summary" I see that all ports are in a FWD state, which shouldn't be the case, because there should be a root port, DPs, and then a BLK port (so as not to create a forwarding loop). We do have a pyramid structure (as you would have to for PVST to work); however, something just doesn't look right. I'm trying to get a better packet capture from the same switch.
  • kryollakryolla Member Posts: 785
    NightShade03 wrote: »
    Just curious, but if there are multiple VLANs with PVST running on them, shouldn't one of the ports be in a BLK state per VLAN? Isn't the point to have one port in a blocking state so that traffic doesn't just move around in a circle?

    A blocking port is an alternate to the RP: the BPDU it receives is superior to what you're sending, so it is next in line to become the RP. DP means your BPDUs are superior to those of the next-hop switch. A switch doesn't have to have a blocking port; in that case, if the RP goes down, the switch is isolated, so to speak, and will think it is the root switch and start advertising itself as such to downstream switches.

    EDIT: What impact does your issue have on the network, or are you just being proactive? Also, you can just shotgun it: save and reload. It might or might not fix the situation if the switch is indeed producing all those BPDUs.
  • CyanicCyanic Member Posts: 289
    Looking at this again: if these are all from the same source MAC, doesn't that mean this is all on the same VLAN, and that all the priorities are for the same bridge?

    We have seen similar spanning tree storms in our environment. We were never able to get a root cause, but after we upgraded the 2948Gs on the segments, the problem went away.

    Cisco is recommending routing to the edge to us because of layer 2 issues like this.
  • XenzXenz Member Posts: 140
    NightShade03 wrote: »
    I understand your analogy; it just seems weird to me the way the configuration is set up. If I run "show spanning-tree summary" I see that all ports are in a FWD state, which shouldn't be the case, because there should be a root port, DPs, and then a BLK port (so as not to create a forwarding loop). We do have a pyramid structure (as you would have to for PVST to work); however, something just doesn't look right. I'm trying to get a better packet capture from the same switch.

    My pyramid analogy was just to illustrate the idea behind the port roles (RP, DP, BLK). You don't need a pyramid of any kind for PVST to run. The switch you are looking at is likely the root bridge, because the root bridge has all of its ports in a forwarding state and has no root port. You can check this with the command show spanning-tree vlan <vlan>:

    Root ID Priority 32769
    Address 0003.E46D.7426
    This bridge is the root
    Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec

    Bridge ID Priority 32769 (priority 32768 sys-id-ext 1)
    Address 0003.E46D.7426
    Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
    Aging Time 20

    If you see "This bridge is the root" in the Root ID section, then the switch you are on is the root for that VLAN. If you use show spanning-tree, it will give you a full list of the STP instances for every VLAN.

    *edit* Show spanning-tree summary should tell you which VLANs that switch is root for. I'm curious because you say you used this command, and it shouldn't show you the port state of each interface, just a count of how many ports are forwarding/blocking/etc. along with some STP config info. You're saying the output from show spanning-tree summary shows under forwarding that there are X ports in forwarding with 0 in blocking/listening/learning?
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    Xenz wrote: »
    You're saying the output from show spanning-tree summary shows under forwarding that there are X ports in forwarding with 0 in blocking/listening/learning?

    Yes, exactly. When I run show spanning-tree summary I get all 40 or so VLANs that we have on the network, with ALL of them showing in the forwarding state. I didn't think this was possible, because somewhere on the network something has to be BLK or I don't see the point. Just left work; will post a screenshot of the command output in the A.M.

    SIDE NOTE: This is not my design or implementation of stuff, the network engineer we have I don't think is very good, I'm just trying to look at his network and analyze things to have a better understanding.
  • kryollakryolla Member Posts: 785
    You won't get any blocking ports with the following topologies; there is NO redundancy:

    SW1---Sw2----Sw3----Sw4---Sw5---Sw6--ETC

    or

              SW1
             /   \
            /     \
         SW2       SW3
         / \       / \
        /   \     /   \
      sw4   sw5 sw6   sw7
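Whether a topology has any redundancy for STP to block can be checked by counting the links that close a loop. The sketch below is a generic union-find in Python, not anything from the switches themselves. The link lists encode the two diagrams above, and the extra sw2-sw3 link is a hypothetical addition for contrast.

```python
def redundant_links(links):
    """Count links that join two switches which are already connected."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    extra = 0
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            extra += 1          # this link closes a loop
        else:
            parent[root_a] = root_b
    return extra


CHAIN = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"),
         ("sw4", "sw5"), ("sw5", "sw6")]
TREE = [("sw1", "sw2"), ("sw1", "sw3"), ("sw2", "sw4"),
        ("sw2", "sw5"), ("sw3", "sw6"), ("sw3", "sw7")]

if __name__ == "__main__":
    print(redundant_links(CHAIN))                    # 0
    print(redundant_links(TREE))                     # 0
    print(redundant_links(TREE + [("sw2", "sw3")]))  # 1
```

A count of zero means there is no loop for spanning tree to break, so seeing every port forwarding would be expected in that case.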
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    Attached is the setup that we have running. The fiber line leads out of the building to a core L3 switch, and switches 2 - 7 are configured as a stack. I don't see redundancy here, as this is a pyramid design just laid out differently. If there is no redundancy... what's the point of STP?
  • XenzXenz Member Posts: 140
    I'm not sure how you're capturing, but given that you say you have 40 VLANs, the only thing in the capture that seems odd is the TCNs.

    Maybe Kryolla can answer this. If you use SPAN to monitor traffic, is the source MAC changed to that of the port the monitoring PC is connected to, or does SPAN not change anything? I've never used this feature, hence why I'm asking. So really the only thing out of the ordinary in the screenshot is the TCNs?

    Also, maybe a dumb question: does anyone ever use MST in production networks? If so, would you implement it on this network? Is there a best practice on when to implement MST and when not to? The BCMSN exam certification book really downplayed MST, so I'm not sure if they just didn't elaborate much or if it's rarely used in real networks.
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    There are 40 or so VLANs throughout the whole network, however I'm focusing on a specific 5 because they are the only ones configured in the building that I'm in. The show spanning-tree summary shows information for the whole network though.

    No SPAN ports. I'm plugging directly into a switch with a Cat5 cable as a normal client and just firing up Wireshark, filtering out all the other traffic so that I see only the STP packets.
  • kryollakryolla Member Posts: 785
    That topology seems like it was just pieced together with no thought. There is no point in running STP for it, as sw2-7 are daisy-chained (in series) and the rest are just in parallel off sw1. Are there end users on all the switches?
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    kryolla wrote: »
    That topology seems like it was just pieced together with no thought.

    Haha welcome to my job....
    kryolla wrote: »
    There is no point in running STP for it, as sw2-7 are daisy-chained (in series) and the rest are just in parallel off sw1. Are there end users on all the switches?

    There are about 500-600 end users plugged in sporadically all over the place in random different vlans for that building.
  • kryollakryolla Member Posts: 785
    WOW, all I can say is WOW. And nobody brought this up to management? "What'll happen if I pull this one cable right here?" lol. Did you have any major outages, end users complaining, or pests eating that one fiber? I have a funny story about a squirrel getting into the building, running around on the ladder racks, and building a nest inside a fiber duct.

    What's your traffic pattern? Does the majority of it stay in the building or leave?
  • NightShade03NightShade03 Member Posts: 1,383 ■■■■■■■□□□
    Management?!?!?! What is that?! Seriously, we don't have any management; the engineer is the guy in charge (which is why I'm leaving my current company lol), and he never seems to see anything wrong with the network design lol. I've tried to get him to look at or redo the design, but he doesn't care. The fiber lines are split and hanging out of a junction box next to the rack, waiting for someone to sneeze in the wrong direction and bring down the main trunk line!

    Most of the traffic leaves (95% out, the rest in). The only things in this building are an offline backup server and all the end users. All the traffic is pushed out of the building into the main building where the servers are (routed via a single core L3 switch). It's a freakin' accident waiting to happen lol