Need help - strange site-to-site VPN problem with Cisco PIX

MentholMoose Member Posts: 1,525
I'm having a strange problem with a site-to-site VPN between two locations. Each site has a PIX, and the VPN has been rock solid for years. I know, I know, they are far overdue for a hardware refresh, but they have been so reliable that I just haven't bothered with upgrading the hardware. However, a couple of days ago the VPN stopped working reliably. I am the only person with access and I'm certain there weren't any configuration changes. I compared the configurations with backups and nothing looks out of the ordinary.

I've been pulling my hair out trying to figure out the problem, and I think I have narrowed it down to one of the ISPs. Naturally, they vehemently disagree. So I'm looking for a second opinion and some advice on how to proceed.

Here are the details of the symptoms and the troubleshooting I've done.

Site A - PIX 501 (50 user) with 6.3(5)145
Site B - PIX 506 (UL) with 6.3(5)145

They both have a very simple configuration. They each have a single internal network and a single WAN interface, and they are both configured to do NAT. The VPN is configured like this:

PIX 6.x: Simple PIX-to-PIX VPN Tunnel Configuration Example - Cisco Systems

The tunnel appears to be established successfully. If I reload or power-cycle either PIX, or run clear crypto isakmp sa on it, sh crypto isakmp sa on both sites shows the tunnel quickly going to the "QM_IDLE" state. If I then generate some traffic from inside hosts on each site to the other (e.g. from a Site A host, ping a Site B host, and vice versa) and run sh crypto ipsec sa, I get the following:

Site A:
#pkts encaps: 2139, #pkts encrypt: 2139, #pkts digest 2139
#pkts decaps: 1245, #pkts decrypt: 1245, #pkts verify 1245

Site B:
#pkts encaps: 1245, #pkts encrypt: 1245, #pkts digest 1245
#pkts decaps: 0, #pkts decrypt: 0, #pkts verify 0


So, from the PIX perspective it looks like Site A is successfully receiving and decrypting the traffic from Site B, but NOT vice versa. I can confirm this with tcpdump on the inside hosts being pinged. The Site A host receives the ping request from the Site B host and replies to it, but the Site B host does not receive the reply. The Site B host never receives any ping request from the Site A host.
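The comparison above can be written out as a small script. This is only a sketch: it assumes both counter snapshots were taken at about the same time, and the helper names (`parse_counters`, `lost`) are made up for illustration.

```python
import re

def parse_counters(text):
    """Pull the '#pkts <name>' counters out of 'sh crypto ipsec sa' output."""
    # Some counters are printed with a colon ("encaps: 2139") and some
    # without ("digest 2139"), so the colon is optional in the pattern.
    return {name: int(value)
            for name, value in re.findall(r"#pkts (\w+):? (\d+)", text)}

site_a = parse_counters("""
#pkts encaps: 2139, #pkts encrypt: 2139, #pkts digest 2139
#pkts decaps: 1245, #pkts decrypt: 1245, #pkts verify 1245
""")
site_b = parse_counters("""
#pkts encaps: 1245, #pkts encrypt: 1245, #pkts digest 1245
#pkts decaps: 0, #pkts decrypt: 0, #pkts verify 0
""")

def lost(sender, receiver):
    """Packets the sender encapsulated that the receiver never decapsulated."""
    return sender["encaps"] - receiver["decaps"]

print("A -> B lost:", lost(site_a, site_b))  # 2139
print("B -> A lost:", lost(site_b, site_a))  # 0
```

Every packet Site B sent was decapsulated at Site A, while none of the packets Site A sent were decapsulated at Site B.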

To better understand the problem I decided the next step was to see what was happening on the WAN side of each PIX, using a hub and laptop with tcpdump. When sending a continuous ping from a Site A host to a Site B host, from outside the Site A PIX I see the outgoing VPN traffic but no responses:

03:01:01.864027 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xec), length 132
03:01:02.639307 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xed), length 100

While simultaneously doing a tcpdump outside the Site B PIX, I do not see these incoming packets.

When I do the reverse scenario, i.e. send a continuous ping from a Site B host to a Site A host, from outside the Site B PIX I see the outgoing VPN traffic but no response, and from outside the Site A PIX I see both the incoming and outgoing ESP packets.

So it appears to me that there is either a problem with the outgoing ESP packets being dropped by the Site A ISP, or incoming ESP packets being dropped by the Site B ISP. I haven't been able to definitively confirm either. I don't have a PIX at a third location to try setting up a VPN to either site. The best I've been able to do so far is send some forged ESP packets from a third location to Site B and I confirmed that they are being received. I'm not sure this is a valid test since they are not real ESP packets, just IP packets with protocol 50 set. I haven't yet been able to try sending these forged ESP packets from Site A to that third location, but this is on my TODO list.
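For reference, a forged-ESP probe of the kind described above could look roughly like this in Python. It is a sketch, not the exact tool I used: the function names, the SPI value, and the filler size are arbitrary assumptions, and the packets are only ESP-shaped (protocol 50 with an SPI and sequence number up front), not valid IPsec.

```python
import socket
import struct

ESP_PROTO = 50  # IP protocol number assigned to ESP

def build_fake_esp(spi: int, seq: int, filler_len: int = 92) -> bytes:
    """ESP-shaped payload: 4-byte SPI, 4-byte sequence number, then zero filler.

    Nothing here is encrypted or authenticated, so a real IPsec peer
    would discard it. It only tests whether protocol 50 gets through.
    """
    return struct.pack("!II", spi, seq) + bytes(filler_len)

def send_probes(dst_ip: str, count: int = 5, spi: int = 0xDEADBEEF) -> None:
    """Send `count` probes to dst_ip. Needs root because it opens a raw socket.

    The kernel builds the IP header and sets its protocol field to 50.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_RAW, ESP_PROTO) as s:
        for seq in range(1, count + 1):
            s.sendto(build_fake_esp(spi, seq), (dst_ip, 0))

# Example (as root), with a placeholder address:
#   send_probes("203.0.113.10")
```

On the receiving side, a capture with the tcpdump filter `ip proto 50` shows whether the probes arrive.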

Since it seems these ESP packets are getting dropped somewhere, I tried configuring IPsec to not actually encrypt the traffic... i.e. I changed the configuration that has been working for 5+ years:
crypto ipsec transform-set vpn_to_site_a esp-aes-192 esp-sha-hmac

To:
crypto ipsec transform-set vpn_to_site_a ah-sha-hmac

And it works fine (of course it's configured on both sides accordingly). Instead of ESP packets with tcpdump on the WAN side, I see AH packets, and communication fully works both ways. Unfortunately, AH only authenticates the traffic and does not encrypt it, which is not what I want. I tried several other transform options, and it simply doesn't work whenever ESP is enabled.
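The reason AH can get through while ESP does not is that they are separate IP protocols (ESP is 50, AH is 51), so anything filtering on the protocol field of the IP header can treat them differently. A minimal sketch of reading that field follows; the helper names and the test header are hypothetical, and the addresses are documentation placeholders.

```python
import struct

IP_PROTOCOLS = {50: "ESP", 51: "AH"}

def ip_protocol(packet: bytes) -> str:
    """Return the name (or number) of the protocol carried by an IPv4 packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    proto = packet[9]  # the protocol field is byte 9 of the IPv4 header
    return IP_PROTOCOLS.get(proto, str(proto))

def fake_ipv4_header(proto: int) -> bytes:
    """Minimal 20-byte IPv4 header for illustration (checksum left at 0)."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, 20, 0, 0, 64, proto, 0,
                       bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]))

print(ip_protocol(fake_ipv4_header(50)))  # ESP
print(ip_protocol(fake_ipv4_header(51)))  # AH
```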

I'm not sure what else to try, and I really need to get this back up and running. Both ISPs are essentially blaming my routers. Any comments, ideas, or suggestions?
MentholMoose
MCSA 2003, LFCS, LFCE (expired), VCP6-DCV

Comments

  • MentholMoose Member Posts: 1,525
    One last note. Before doing the above analysis I was leaning toward some strange hardware failure, but given the symptoms and the results of my investigation, I don't see how the routers could be causing the problem. I really need some advice on this issue. Thanks!