BGP adjacencies not coming up after reboot

daan5000 Member Posts: 34
Hi there,

I'm currently working on an MPLS lab simulating an ISP network in GNS3. After hours of work my topology finally worked, and I could ping the other Provider Edge routers with MP-BGP. After I started GNS3 today and booted up all routers, the BGP adjacencies between the routers were all gone and were not coming back up either.

I saved my running-configs to startup-configs and saved my project before I closed it. My routing table looks fine, all PE routers are able to ping each other's loopback interfaces, and MPLS works properly. However, when I use the command "sh bgp vpnv4 unicast all summary", all connections are down.

This is my lab:

I can provide the running-configs of all routers if necessary.

Thanks.

Comments

  • deth1k Member Posts: 312
    Not looked at your configs but have you set router id for your BGP?
  • daan5000 Member Posts: 34
    deth1k wrote: »
    Not looked at your configs but have you set router id for your BGP?

    I set unique BGP router IDs on all routers and restarted them. The BGP adjacencies are still not coming up. I'll post my BGP configs.
  • daan5000 Member Posts: 34
    Here are my BGP configs for R10 and R1:
    R10(config)#do sh runn | section bgp
     redistribute bgp 1 metric 1000000 10 255 1 1500
    router bgp 1
     no synchronization
     bgp router-id 10.10.10.10
     bgp log-neighbor-changes
     neighbor 1.1.1.1 remote-as 1
     neighbor 1.1.1.1 update-source Loopback10
     neighbor 4.4.4.4 remote-as 1
     neighbor 4.4.4.4 update-source Loopback10
     neighbor 7.7.7.7 remote-as 1
     neighbor 7.7.7.7 update-source Loopback10
     no auto-summary
     !
     address-family vpnv4
     neighbor 1.1.1.1 activate
     neighbor 1.1.1.1 send-community extended
     neighbor 4.4.4.4 activate
     neighbor 4.4.4.4 send-community extended
     neighbor 7.7.7.7 activate
     neighbor 7.7.7.7 send-community extended
     exit-address-family
     !
     address-family ipv4 vrf VRF10
     no synchronization
     exit-address-family
    
    router bgp 1
     no synchronization
     bgp router-id 1.1.1.1
     bgp log-neighbor-changes
     neighbor 4.4.4.4 remote-as 1
     neighbor 4.4.4.4 update-source Loopback10
     neighbor 7.7.7.7 remote-as 1
     neighbor 7.7.7.7 update-source Loopback10
     neighbor 10.10.10.10 remote-as 1
     neighbor 10.10.10.10 update-source Loopback10
     no auto-summary
     !
     address-family vpnv4
     neighbor 4.4.4.4 activate
     neighbor 4.4.4.4 send-community extended
     neighbor 7.7.7.7 activate
     neighbor 7.7.7.7 send-community extended
     neighbor 10.10.10.10 activate
     neighbor 10.10.10.10 send-community extended
     exit-address-family
     !
     address-family ipv4 vrf VRF1
     no synchronization
     exit-address-family
    
    
    
  • networker050184 Mod Posts: 11,962
    Your BGP config looks fine. Are the IPv4 adjacencies up, or nothing at all? I see that you have the update source set to Loopback10, so make sure that is right.
    An expert is a man who has made all the mistakes which can be made.
  • fredrikjj Member Posts: 879
    Are you able to establish a TCP session (port 179) between the two IP addresses that you are using for peering?

    PS.
    You can use telnet for this. You want to see something like this:

    (my test network is just a p2p connection between R1 [10.0.0.1] and R2 [10.0.0.2])

    R1#telnet 10.0.0.2 179
    Trying 10.0.0.2, 179 ...
    % Connection refused by remote host


    R1#debug ip packet detail
    *Mar 1 00:16:22.531: IP: tableid=0, s=10.0.0.1 (local), d=10.0.0.2 (Serial0/0), routed via FIB
    *Mar 1 00:16:22.531: IP: s=10.0.0.1 (local), d=10.0.0.2 (Serial0/0), len 44, sending
    *Mar 1 00:16:22.531: TCP src=32305, dst=179, seq=1123914025, ack=0, win=4128 SYN
    *Mar 1 00:16:22.579: IP: tableid=0, s=10.0.0.2 (Serial0/0), d=10.0.0.1 (Serial0/0), routed via RIB
    *Mar 1 00:16:22.579: IP: s=10.0.0.2 (Serial0/0), d=10.0.0.1 (Serial0/0), len 40, rcvd 3
    *Mar 1 00:16:22.583: TCP src=179, dst=32305, seq=0, ack=1123914026, win=0 ACK RST

    That is, output where the IP you want to peer with is actually responding. The connection in this case is reset because nothing is enabled that listens on port 179.

    If you enable BGP on the other side and point it to the router you are telnetting from, you would see:

    R1#telnet 10.0.0.2 179
    Trying 10.0.0.2, 179 ... Open


    *Mar 1 00:02:31.543: IP: tableid=0, s=10.0.0.1 (local), d=10.0.0.2 (Serial0/0), routed via FIB
    *Mar 1 00:02:31.543: IP: s=10.0.0.1 (local), d=10.0.0.2 (Serial0/0), len 44, sending
    *Mar 1 00:02:31.543: TCP src=45138, dst=179, seq=1138799739, ack=0, win=4128 SYN
    *Mar 1 00:02:31.603: IP: tableid=0, s=10.0.0.2 (Serial0/0), d=10.0.0.1 (Serial0/0), routed via RIB
    *Mar 1 00:02:31.603: IP: s=10.0.0.2 (Serial0/0), d=10.0.0.1 (Serial0/0), len 44, rcvd 3
    *Mar 1 00:02:31.607: TCP src=179, dst=45138, seq=3265423818, ack=1138799740, win=16384 ACK SYN



    PPS.
    And don't forget to set the source interface to the loopback on the telnet command with the /source-interface parameter.

    PPPS.
    I guess what I'm trying to say is that it's premature to look at the BGP configs.
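The telnet test above can also be scripted from a host that has IP reachability to the peering address. What follows is a minimal Python sketch, not a router feature: `tcp_port_state` is a hypothetical helper name, and the `source_ip` argument only approximates what `/source-interface` does on the router. It distinguishes the outcomes discussed above: the handshake completes, the peer answers with a reset, or nothing usable comes back.

```python
import socket

def tcp_port_state(host, port=179, source_ip=None, timeout=2.0):
    """Classify a TCP connection attempt as 'open', 'refused' or 'no-response'.

    tcp_port_state is a hypothetical helper name, not part of any BGP tool.
    """
    # Binding to source_ip plays the role of telnet's /source-interface option.
    source = (source_ip, 0) if source_ip else None
    try:
        conn = socket.create_connection((host, port), timeout=timeout,
                                        source_address=source)
    except ConnectionRefusedError:
        # The peer replied with RST: reachable, but nothing listens on the port.
        return "refused"
    except OSError:
        # Timeout, unreachable network, or an unusable source address.
        return "no-response"
    conn.close()
    return "open"
```

For example, `tcp_port_state("10.0.0.2", 179, source_ip="10.0.0.1")` would correspond to the R1-to-R2 test in this post. An "open" result only shows that TCP works; as the rest of the thread shows, BGP can still refuse to start the session.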
  • FloOz Member Posts: 1,614
    Do you have NLRI to the loopback addresses? Aka is OSPF advertising the loopbacks properly?
  • darkerz Member Posts: 431
    Is there any TCP session going on? Can you PCAP that for us?

    Do you have end to end IP connectivity? ICMP is good for this.

    Do you have end to end TCP connectivity? Telnet is good for this.
    :twisted:
  • vishaw1986 Member Posts: 40
    Hey

    use this command for each EBGP nei

    neighbor x.x.x.x ebgp-multihop 10
  • EdTheLad Member Posts: 2,111
    vishaw1986 wrote: »
    Hey
    use this command for each EBGP nei
    neighbor x.x.x.x ebgp-multihop 10

    His peers are iBGP, so it's not required.
    Networking, sometimes i love it, mostly i hate it.Its all about the $$$$
  • vishaw1986 Member Posts: 40
    oh sorry my mistake...
  • daan5000 Member Posts: 34
    Sorry for the late answer. My IPv4 adjacencies are up, and all the PE routers can ping each other over the IGP (OSPF) in the backbone. Also, my lab worked before I closed it, so my configs should be correct. The adjacency problem began after I restarted GNS3. Thanks for the help.
  • daan5000 Member Posts: 34
    Hi, thanks for the help. My debug output looks a little different from yours, but I'm not very good at reading these debugs.

    R10(config)#do telnet 1.1.1.1 179 /source-interface lo10
    Trying 1.1.1.1, 179 ... Open

    R1(config)#do telnet 10.10.10.10 179 /source-interface lo10
    Trying 10.10.10.10, 179 ... Open

    *Mar 1 00:38:50.891: IP: tableid=0, s=10.10.10.10 (FastEthernet2/0), d=1.1.1.1 (Loopback10), routed via RIB
    *Mar 1 00:38:50.895: IP: s=10.10.10.10 (FastEthernet2/0), d=1.1.1.1, len 41, rcvd 4
    *Mar 1 00:38:50.899: TCP src=36587, dst=179, seq=2780489803, ack=4227112511, win=4075 ACK PSH
    *Mar 1 00:38:51.103: IP: tableid=0, s=1.1.1.1 (local), d=10.10.10.10 (FastEthernet2/0), routed via FIB
    *Mar 1 00:38:51.107: IP: s=1.1.1.1 (local), d=10.10.10.10 (FastEthernet2/0), len 40, sending
    *Mar 1 00:38:51.111: TCP src=179, dst=36587, seq=4227112511, ack=2780489804, win=16383 ACK
  • daan5000 Member Posts: 34
    darkerz wrote: »
    Is there any TCP session going on? Can you PCAP that for us?

    Do you have end to end IP connectivity? ICMP is good for this.

    Do you have end to end TCP connectivity? Telnet is good for this.

    Thanks for your help.

    I posted the ip packet debug log in one of the other answers.

    I have end to end ICMP connectivity (PE-routers can ping each other's loopback-interfaces).

    I have end to end TCP connectivity (I can telnet all loopback interfaces from each PE-router).
  • deth1k Member Posts: 312
    What's your CPU usage like when all nodes are fired up? Try bringing up only two routers and see if BGP comes up. GNS is weird and wonderful even when configs are correct. The most stable image I've found is 12.2SRD/E for the 7200, which also has more features for MPLS testing.
  • daan5000 Member Posts: 34
    deth1k wrote: »
    What's your CPU usage like when all nodes are fired up? Try bringing up only two routers and see if BGP comes up. GNS is weird and wonderful even when configs are correct. Most stable I've found is for 7200 12.2SRD/E also has more features for MPLS testing.

    My CPU utilization is around 11% and memory 2,1GB/7,9GB. I'm also starting to think it's a GNS-problem (or IOS-problem). I'm using the 3700 routers.
  • fredrikjj Member Posts: 879
    The next step would be to check the BGP state, and enable bgp debugging. BGP can fail to form peering sessions if there's disagreement on certain parameters exchanged in the Open message. Since you have the capability to establish a TCP session, the Open message should be sent.
  • daan5000 Member Posts: 34
    fredrikjj wrote: »
    The next step would be to check the BGP state, and enable bgp debugging. BGP can fail to form peering sessions if there's disagreement on certain parameters exchanged in the Open message. Since you have the capability to establish a TCP session, the Open message should be sent.

    This is the output from bgp debug. Thanks for the help. This is all new for me (this is my exam exercise for school).
    *Mar 1 03:20:37.599: BGP: 4.4.4.4 active open failed - no route to peer, open active delayed 25404ms (35000ms max, 28% jitter)
    R10(config-router)#
    *Mar 1 03:20:44.699: BGP: 7.7.7.7 active open failed - no route to peer, open active delayed 33885ms (35000ms max, 28% jitter)
    R10(config-router)#
    *Mar 1 03:20:56.171: BGP: 1.1.1.1 active open failed - no route to peer, open active delayed 26915ms (35000ms max, 28% jitter)
  • fredrikjj Member Posts: 879
    http://blog.ipexpert.com/2010/11/08/bgp-peering-and-default-routes/
    One of the interesting restrictions of BGP is that neighbor will not establish BGP session with a peer if the only way to reach it is through the default route. Is there anything we can do in that scenario?

    Could that be the problem?

    It explains why ping and telnet would work, but not bgp.
  • daan5000 Member Posts: 34
    fredrikjj wrote: »
    BGP Peering and Default Routes « Ccie « CCIE Blog

    Could that be the problem?

    It explains why ping and telnet would work, but not bgp.

    My BGP adjacencies are coming up again :) R8, R3, R5 and R6 were not-so-stubby routers for the backbone-to-PE areas. Once I made them normal stub routers, my BGP adjacencies worked. I'm still figuring out why that was preventing BGP from bringing up the adjacencies.

    Thanks for the help :)
  • fredrikjj Member Posts: 879
    My guess is that you configured BGP first and established the peerings. After that you configured certain OSPF areas as totally NSSA or totally stubby, which removed the summary LSAs for the loopbacks used for peering and replaced them all with a default route. Since the BGP peerings were already established, the default-route-only problem didn't manifest itself. However, once you rebooted, it did.
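The restriction described in this thread can be modelled in a few lines. The following Python sketch rests on stated assumptions: it mirrors the behaviour reported here (an active open fails when the only match for the peer is 0.0.0.0/0), it is not how IOS implements the check, and `best_route` and `can_open_bgp` are hypothetical names. The prefixes are illustrative.

```python
import ipaddress

def best_route(routes, peer):
    """Return the longest-prefix match for peer, or None if nothing matches."""
    addr = ipaddress.ip_address(peer)
    matches = [ipaddress.ip_network(r) for r in routes
               if addr in ipaddress.ip_network(r)]
    return max(matches, key=lambda n: n.prefixlen, default=None)

def can_open_bgp(routes, peer):
    """Model of the rule: a new session needs a route more specific than the default."""
    route = best_route(routes, peer)
    return route is not None and route.prefixlen > 0

# Illustrative routing tables: a normal area still carries the peer's /32,
# a totally stubby area only has the default route towards it.
normal_area = ["1.1.1.1/32", "10.0.12.0/24", "3.3.3.3/32"]
totally_stubby = ["1.1.1.1/32", "10.0.12.0/24", "0.0.0.0/0"]

print(can_open_bgp(normal_area, "3.3.3.3"))     # True
print(can_open_bgp(totally_stubby, "3.3.3.3"))  # False
```

Ping and telnet only need some route to the destination, so they succeed in the second case, which is why they could not reveal the problem. The check applies when a session is being opened; a session that is already established is not re-evaluated in this model.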
  • fredrikjj Member Posts: 879
    I managed to replicate the problem.

    Topology: R1 - R2 - R3

    R1 in area 1 with loopback 1.1.1.1/32
    R3 in area 3 with loopback 3.3.3.3/32
    R2 has a loopback in area 0

    I established connectivity between 1.1.1.1 and 3.3.3.3 with OSPF and then enabled BGP. I then made area 1 and 3 totally stubby.

    As you can see, the inter-area route to the loopback is gone and BGP is still in the established state (otherwise the State/PfxRcd column would show a state name such as Active instead of a prefix count).


    R1(config-router)#do show ip bgp summary
    BGP router identifier 1.1.1.1, local AS number 1
    BGP table version is 1, main routing table version 1


    Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    3.3.3.3         4     1       5       5        1    0    0 00:02:26        0

    R1(config-router)#do show ip route

    Gateway of last resort is 10.0.12.2 to network 0.0.0.0


    1.0.0.0/32 is subnetted, 1 subnets
    C 1.1.1.1 is directly connected, Loopback0
    10.0.0.0/24 is subnetted, 1 subnets
    C 10.0.12.0 is directly connected, Serial0/0
    O*IA 0.0.0.0/0 [110/65] via 10.0.12.2, 00:00:53,




    R3(config-router)#do show ip bgp summary
    BGP router identifier 3.3.3.3, local AS number 1
    BGP table version is 1, main routing table version 1


    Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    1.1.1.1         4     1       4       4        1    0    0 00:01:50        0

    R3(config-router)#do show ip route
    Gateway of last resort is 10.0.23.2 to network 0.0.0.0


    3.0.0.0/32 is subnetted, 1 subnets
    C 3.3.3.3 is directly connected, Loopback0
    10.0.0.0/24 is subnetted, 1 subnets
    C 10.0.23.0 is directly connected, Serial0/0
    O*IA 0.0.0.0/0 [110/65] via 10.0.23.2, 00:00:11, Serial0/0

    I debugged BGP during the transition from normal areas to totally stubby and nothing happened.


    Once rebooted, this happened:

    *Mar 1 00:09:26.775: BGP: 3.3.3.3 active open failed - no route to peer, open active delayed 31893ms (35000ms max, 28% jitter)

    R1#show ip bgp summary
    BGP router identifier 1.1.1.1, local AS number 1
    BGP table version is 1, main routing table version 1


    Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
    3.3.3.3         4     1       0       0        0    0    0 never    Active


    PS.
    The root cause of this is probably BGP's extremely long default holdtime of 180 seconds. Your OSPF adjacency goes down when you change area types, but unless it takes you several minutes to get it back up, the BGP session won't go down.

    PPS.
    That's not exactly true, because it's enough that one side of the peering has a non-default route. So even if you ran a shorter hold time, BGP would just flap before getting re-established. Really a curious problem, this.
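The hold-time argument in the PS can be sketched as a small simulation. This is a rough Python model under stated assumptions: keepalives every 60 seconds and a 180-second hold time (the Cisco defaults), keepalives sent during the outage are simply lost, and TCP retransmission is ignored. `session_survives` is a hypothetical name, and as the PPS notes, the hold timer is not the whole story.

```python
def session_survives(outage_start, outage_end, hold_time=180,
                     keepalive=60, duration=600):
    """Return True if no gap between received keepalives exceeds hold_time.

    Times are in seconds from the moment the session was last refreshed.
    """
    last_heard = 0
    for t in range(keepalive, duration + 1, keepalive):
        if outage_start <= t < outage_end:
            continue  # keepalive lost while the IGP is reconverging
        if t - last_heard > hold_time:
            return False  # hold timer would already have expired
        last_heard = t
    return duration - last_heard <= hold_time

print(session_survives(100, 130))  # True: only one keepalive is lost
print(session_survives(100, 400))  # False: the gap reaches 360 seconds
```

With the defaults, a short OSPF reconvergence after changing the area type costs at most a keepalive or two, so an established session rides through it, which matches what the debug showed during the transition.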
  • daan5000 Member Posts: 34
    Thanks for taking the time to reproduce the problem. I built the same lab you made to follow along, and it's really clear now when the problem occurs. It's indeed very strange: if you make only one area stub, or no area stub at all, it works. I'll try to narrow down the problem. I'll keep you updated.
  • EdTheLad Member Posts: 2,111
    BGP will not establish a session using a default route; this is a built-in adjacency restriction. Since the adjacency was originally established via a non-default route, the BGP session was up and everything was good. Later you modified your IGP, which caused the BGP keepalives to use the default route. No problem there, as the keepalive is dumb and just verifies IP connectivity, and the adjacency was already established. After bouncing the session, BGP doesn't come up due to the default route restriction.
    Networking, sometimes i love it, mostly i hate it.Its all about the $$$$