6506 packet forwarding issue

DevilWAH Member Posts: 2,997 ■■■■■■■■□□
I have a pair of 6500s set up.

One physical interface is configured as Layer 2 and connected to a Check Point firewall as a trunk port.

VLAN 1001 has a Layer 3 interface, and this is the next hop to the firewall:
check point (192.168.1.1 vlan 1001) ------ (int vlan 1001 192.168.1.2) 6506

There is a mixture of Layer 2 and Layer 3 interfaces on the 6506, but to keep it simple: we also have a VLAN 1 interface set up as the default gateway for the client machines.
int vlan 1 - 172.20.20.1 255.255.255.0

Many ports on the 6506 are trunk links to the distribution/access layer switches that the clients are attached to. Let's say ports 1, 2 and 3 are connected to three different access switches. Port 4 is connected to the firewall and restricted to permit only VLAN 1001.

So the issue is that a conversation starts from outside to client 172.20.20.10, which is connected to the access switch on port 2.

I would expect the packet to come in from the firewall via port 4 to the 6506 and hit the VLAN 1001 interface. The 6506 processes the packet, sees that its destination is on the 172.20.20.0/24 network, knows this is local, does an ARP lookup, finds the client on port 2, and forwards it at Layer 2 out of this port.

But what I see is the packet to the client getting flooded out of ports 1, 2 and 3. It still gets to the client and the conversation happens just fine, but every other switch gets a copy too (I have put a tap in to prove this to myself :) ).

If anyone can suggest why I see this flooding I would be grateful. It only happens with packets coming in from the firewall.

Cheers
  • If you can't explain it simply, you don't understand it well enough. Albert Einstein
  • An arrow can only be shot by pulling it backward. So when life is dragging you back with difficulties. It means that its going to launch you into something great. So just focus and keep aiming.

Comments

  • EdTheLad Member Posts: 2,111 ■■■■□□□□□□
    You have a problem on VLAN 1: the ARP between the SVI and the client isn't working for some reason. The switch connected to the firewall has a route to VLAN 1 but doesn't have the corresponding MAC entry for the client, so it floods to all ports in VLAN 1.
    Check the CAM entries for VLAN 1: have you learned the client's MAC?
    Quick test: configure an SVI in VLAN 1 using the same subnet as the client and try a ping check.
    Networking, sometimes i love it, mostly i hate it.Its all about the $$$$
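
The flooding behaviour described in this reply can be sketched in a few lines of Python. This is only a toy model of a Layer 3 switch, not how the 6500 hardware actually implements forwarding; the class name, port names and the MAC address `aaaa.bbbb.cccc` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class L3Switch:
    """Toy model: ARP table, CAM table and VLAN port membership."""
    arp: dict = field(default_factory=dict)         # IP -> MAC
    cam: dict = field(default_factory=dict)         # (vlan, MAC) -> egress port
    vlan_ports: dict = field(default_factory=dict)  # vlan -> ports carrying it

    def forward(self, dst_ip, vlan):
        """Return the ports a routed packet for dst_ip leaves on."""
        mac = self.arp.get(dst_ip)
        if mac is None:
            return []  # the switch would have to ARP first
        port = self.cam.get((vlan, mac))
        if port is not None:
            return [port]  # known unicast: one egress port
        # Unknown unicast: no CAM entry, so flood to every port in the VLAN
        return list(self.vlan_ports.get(vlan, []))


# Hypothetical values loosely based on the first post
sw = L3Switch(
    arp={"172.20.20.10": "aaaa.bbbb.cccc"},
    cam={(1, "aaaa.bbbb.cccc"): "port2"},
    vlan_ports={1: ["port1", "port2", "port3"]},
)
known = sw.forward("172.20.20.10", vlan=1)
sw.cam.clear()  # simulate the CAM entry disappearing
unknown = sw.forward("172.20.20.10", vlan=1)
print(known, unknown)
```

With the CAM entry present the packet leaves on one port only; once the entry is gone, the same packet is copied to every port in VLAN 1 even though the ARP entry is still valid.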
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    Sorry, I should have said this. Yes, the switch has a record for the destination MAC. This was my first thought, but its ARP table and CAM entries show the destination IP resolved to the correct MAC, and it sees this MAC on the expected port leading to the access switch (and on no other ports).

    So this part all seems fine.
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    Here is what I see for the port leading to an access switch that does not contain the client, using a tap between the 6500 and the access switch.

    On the 6500:

    ASH_6506#sh arp xxx.xxx.xxx.x24.94
    Protocol Address Age (min) Hardware Addr Type Interface
    Internet xxx.xxx.xxx.x24.94 14 0050.564d.c3ac ARPA Vlan1


    ASH_6506#sh mac address-table add 0050.564d.c3ac


    Legend: * - primary entry
    age - seconds since last seen
    n/a - not available
    S - secure entry
    R - router's gateway mac address entry
    D - Duplicate mac address entry


    Displaying entries from active supervisor:


    vlan   mac address      type     learn  age   ports
    ------+----------------+--------+------+-----+------
    *   1  0050.564d.c3ac   dynamic  Yes    5     Po3


    So yes, I should not expect any traffic out of this port, as the MAC is seen on port-channel 3, but the capture was done on a physical interface that is not part of this or any other bundle.
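
The manual cross-check above (ARP entry first, then the CAM entry for the same MAC in the same VLAN) could be scripted roughly as follows. This is a sketch that assumes the whitespace-separated column layout shown in this post; real `show` output varies between platforms and software versions, and the function names are invented for the example.

```python
import re

MAC_RE = re.compile(r"[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}", re.IGNORECASE)


def parse_arp_line(line):
    """Parse one 'show arp' row: Protocol Address Age MAC Type Interface."""
    fields = line.split()
    if len(fields) != 6 or not MAC_RE.fullmatch(fields[3]):
        return None
    return {"ip": fields[1], "mac": fields[3].lower(), "interface": fields[5]}


def parse_cam_line(line):
    """Parse one 'show mac address-table' row: vlan MAC type learn age ports."""
    fields = line.replace("*", " ").split()
    if len(fields) < 6 or not MAC_RE.fullmatch(fields[1]):
        return None
    return {"vlan": int(fields[0]), "mac": fields[1].lower(), "ports": fields[5:]}


def flooding_expected(arp_line, cam_lines):
    """True if the ARP entry has no matching CAM entry in its VLAN."""
    arp = parse_arp_line(arp_line)
    vlan = int(re.sub(r"\D", "", arp["interface"]))  # "Vlan1" -> 1
    cams = [c for c in map(parse_cam_line, cam_lines) if c]
    return not any(c["mac"] == arp["mac"] and c["vlan"] == vlan for c in cams)


# Rows copied from the output in this post (IP address redacted as posted)
arp_row = "Internet xxx.xxx.xxx.x24.94 14 0050.564d.c3ac ARPA Vlan1"
cam_rows = ["vlan mac address type learn age ports",
            "* 1 0050.564d.c3ac dynamic Yes 5 Po3"]
print(flooding_expected(arp_row, cam_rows))
```

For the rows posted here the check reports that no flooding is expected, which matches the conclusion that the tables look correct at the moment they were captured.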
  • EdTheLad Member Posts: 2,111 ■■■■□□□□□□
    Looks like you have a bug. If you can, I would suggest changing the port-channel config into a single link. I've seen many issues with port-channels; maybe the MAC associated with the port-channel isn't being seen correctly. Also, are you using CEF? I'm not sure what show commands are available on the 6500 for CEF, but you could check whether the MAC address is correct for the adjacency.
  • deth1k Member Posts: 312
    This will help:

    conf t

    mac-address-table synchronize
    mac-address-table aging-time 0 routed-mac
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    You know, it's strange you say that, as I have been looking at the port-channel stuff because I could not capture on the EtherChannel interface as expected, and I have seen this:

    ASH_6506#sh etherchannel 3 summary
    Flags:  D - down        P - bundled in port-channel
            I - stand-alone s - suspended
            H - Hot-standby (LACP only)
            R - Layer3      S - Layer2
            U - in use      N - not in use, no aggregation
            f - failed to allocate aggregator

            M - not in use, no aggregation due to minimum links not met
            m - not in use, port not aggregated due to minimum links not met
            u - unsuitable for bundling
            d - default port

            w - waiting to be aggregated
    Number of channel-groups in use: 7
    Number of aggregators:           7
    
    
    Group  Port-channel  Protocol    Ports
    ------+-------------+-----------+-----------------------------------------------
    3      Po3(SN)         LACP      Te1/5/4(P)     Te2/5/4(P)     
    
    
    Last applied Hash Distribution Algorithm: Adaptive
    

    "Not in use, no aggregation". So for a laugh I shut down one of the links and, lo and behold, the flooding stopped. Even after re-enabling it, it came back up fine.

    Interestingly, all interfaces on both sides are configured with "channel-group mode on", so I am not sure what its issue is, but I will have to look into it more. For now, though, I will monitor and see what's happening. I have a feeling this has been causing the traffic to take different paths through the 6506s for upstream and downstream, and causing the flooding.

    Hats off to you, sir, as I think you got it. :)
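
Reading `Po3(SN)` against the legend printed by the command can be done mechanically. The sketch below copies the flag meanings from the legend in this post; the `decode` helper is made up for illustration and only handles tokens of the form `name(flags)`.

```python
import re

# Flag meanings as printed in the "sh etherchannel 3 summary" legend above
LEGEND = {
    "D": "down",
    "P": "bundled in port-channel",
    "I": "stand-alone",
    "s": "suspended",
    "H": "Hot-standby (LACP only)",
    "R": "Layer3",
    "S": "Layer2",
    "U": "in use",
    "N": "not in use, no aggregation",
    "f": "failed to allocate aggregator",
    "M": "not in use, no aggregation due to minimum links not met",
    "m": "not in use, port not aggregated due to minimum links not met",
    "u": "unsuitable for bundling",
    "d": "default port",
    "w": "waiting to be aggregated",
}


def decode(token):
    """Split a token such as 'Po3(SN)' into its name and flag meanings."""
    match = re.fullmatch(r"(\S+)\((\w+)\)", token)
    if match is None:
        raise ValueError(f"unexpected token: {token!r}")
    name, flags = match.groups()
    return name, [LEGEND[flag] for flag in flags]


for token in ["Po3(SN)", "Te1/5/4(P)", "Te2/5/4(P)"]:
    print(decode(token))
```

Decoded this way, the port-channel itself is flagged as Layer2 and "not in use, no aggregation", while both member ports report themselves as bundled, which is the mismatch noticed in this post.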
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    deth1k wrote: »
    This will help:

    conf t

    mac-address-table synchronize
    mac-address-table aging-time 0 routed-mac

    I know that would fix it, but it would also paper over the cause rather than fix the underlying problem. Cheers for the suggestion though :)
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    Sadly, even fixing the EtherChannel issue has not fixed it. I am sure it's caused by frames taking diverse routes through the EtherChannel links, which prevents the CAM from forwarding them correctly and causes the flood.

    I know I could fix this with deth1k's method, or by removing the EtherChannel and using something like STP to bring up the backup link for redundancy.

    But that would mean admitting defeat :) So now I am waiting for my friends at Cisco TAC to get back to me, after going through it with them for most of the day. It seems to have defeated them too, for now.
  • DevilWAH Member Posts: 2,997 ■■■■■■■■□□
    OK, just a little update.

    Breaking the EtherChannel does not fix it :( It just looked like it did, as the problem is sporadic and did not happen during the period I was testing.

    But the last time I saw it happening I went onto the 6506 and looked up the info on the address the packets should be going to.

    A "show arp" shows the IP resolving to the correct MAC address on the VLAN I would expect.

    However, a "show mac address-table address xxxx.xxxx.xxxx" for the shown address showed nothing.

    Well, that would explain the flooding: with no CAM entry, the switch falls back to acting like a hub for that destination.

    So I did a packet capture on the physical interface I know the end station is connected to, and I see both halves of the conversation, so why it's not populating the CAM table I have no idea. Next I simply did a ping from the 6506 to the destination IP address. Looking now, the MAC address table is populated as expected and the flooding stops.

    So, would anyone care to explain why a conversation via the switch fails to populate the CAM table, while a ping to the same end station from the switch does?
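
One mechanism that would produce exactly these symptoms, though nothing in this thread confirms it is the cause here, is the mismatch between the default Cisco IOS ARP timeout (4 hours) and the default MAC aging time (300 seconds) combined with asymmetric paths: if the client's replies do not arrive on this switch, the CAM entry is never refreshed and ages out while the ARP entry stays valid. The toy simulation below illustrates that idea only; the class, the `simulate` function and the MAC address are hypothetical.

```python
CAM_AGING_S = 300       # default MAC address aging time on Cisco IOS
ARP_TIMEOUT_S = 14400   # default ARP timeout on Cisco IOS (4 hours)
CLIENT = "aaaa.bbbb.cccc"  # made-up client MAC


class CamTable:
    """MAC table whose entries expire unless refreshed by source frames."""

    def __init__(self, aging_s=CAM_AGING_S):
        self.aging_s = aging_s
        self.entries = {}  # MAC -> (port, time last seen as a source)

    def learn(self, mac, port, now):
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        entry = self.entries.get(mac)
        if entry is None or now - entry[1] > self.aging_s:
            return None  # aged out: frames to this MAC get flooded
        return entry[0]


def simulate(replies_seen_here, ping_at=None, duration_s=900, step_s=60):
    """Return the times at which an inbound packet would be flooded."""
    cam = CamTable()
    cam.learn(CLIENT, "Po3", now=0)  # learned once, e.g. from an ARP reply
    flooded = []
    for now in range(step_s, duration_s + 1, step_s):
        if now == ping_at:
            cam.learn(CLIENT, "Po3", now)  # ping reply arrives on this switch
        if cam.lookup(CLIENT, now) is None:
            flooded.append(now)  # ARP is still valid, CAM is not
        if replies_seen_here:
            cam.learn(CLIENT, "Po3", now)  # return traffic refreshes the entry
    return flooded


print(simulate(replies_seen_here=True))
print(simulate(replies_seen_here=False))
print(simulate(replies_seen_here=False, ping_at=600))
```

In this model the flooding starts once the entry is older than the aging time, well before the ARP entry would expire, and a single ping from the switch stops it for another aging interval. Whether the return traffic really bypasses the learning path on the 6506 is the part that would still need to be proven.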

    Cheers
  • EdTheLad Member Posts: 2,111 ■■■■□□□□□□
    When you say breaking the EtherChannel doesn't fix it, do you mean just disconnecting a link in the bundle? It would be nice if you could reconfigure the port as a single link, just to check whether the cause is the EtherChannel.
    Regarding the CAM entry getting timed out with an active flow: this might happen due to CEF or fast switching, depending on how the software is coded. The info learned from the current flow gets timed out, the ping is seen as a new flow, and the MAC address gets updated. You never said whether CEF was enabled; my guess would be that you have an issue with CEF and the EtherChannel. It might be a known bug, so you just have to try to narrow it down for the TAC.
    If you can't modify the EtherChannel config, maybe you can try to reproduce the issue on some other test ports? What about logging: do you see any logs around the time of the MAC entry loss? What's the frequency of the loss? If you knew the interval in seconds you might be able to correlate it to a watchdog timer. A workaround could be to use EEM to generate a ping every x seconds to repopulate the CAM before it expires.
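
As a rough stand-in for the periodic ping idea, here is a Python sketch that could run on a monitoring host. It is not EEM and does not run on the switch: it only helps if the client's reply frames actually pass through the 6506 so that the switch can relearn the MAC, which is an assumption. The `ping` flags assume Linux iputils, and the function names are invented for the example.

```python
import subprocess
import time


def build_ping(host, count=1, timeout_s=1):
    """Build a ping command line (flags assume Linux iputils ping)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def keepalive(host, interval_s, rounds, run=subprocess.run, sleep=time.sleep):
    """Ping host once per interval; return one success flag per round."""
    results = []
    for i in range(rounds):
        proc = run(build_ping(host), capture_output=True)
        results.append(proc.returncode == 0)
        if i < rounds - 1:
            sleep(interval_s)  # keep this below the MAC aging time
    return results


if __name__ == "__main__":
    # 240 s is an arbitrary choice below the default 300 s MAC aging time
    print(keepalive("172.20.20.10", interval_s=240, rounds=3))
```

Like the EEM suggestion, this only hides the symptom; it says nothing about why the entry disappears in the first place.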