Need help - strange site-to-site VPN problem with Cisco PIX
MentholMoose
I'm having a strange problem with a site-to-site VPN between two locations. Each site has a PIX and the VPN has been rock solid for years. I know, I know, they are far overdue for a hardware refresh, but they have been so reliable that I just haven't bothered upgrading them. However, a couple of days ago the VPN stopped working reliably. I am the only person with access and I'm certain there weren't any configuration changes. I compared the configurations with backups and nothing looks out of the ordinary.
I've been pulling my hair out trying to figure out the problem, and I think I have narrowed it down to one of the ISPs. Naturally, they vehemently disagree. So I'm looking for a second opinion and some advice on how to proceed.
Here are the details of the symptoms and the troubleshooting I've done.
Site A - PIX 501 (50 user) with 6.3(5)145
Site B - PIX 506 (UL) with 6.3(5)145
They both have a very simple configuration. They each have a single internal network and a single WAN interface, and they are both configured to do NAT. The VPN is configured like this:
PIX 6.x: Simple PIX-to-PIX VPN Tunnel Configuration Example - Cisco Systems
The tunnel appears to be established successfully. If I reload, power-cycle, or run "clear crypto isakmp sa" on either PIX, "sh crypto isakmp sa" on both sites shows the tunnel quickly going to the "QM_IDLE" state. If I then generate some traffic from inside hosts on each site to the other (e.g. from a Site A host, ping a Site B host, and vice versa) and run "sh crypto ipsec sa", I get:
Site A:
#pkts encaps: 2139, #pkts encrypt: 2139, #pkts digest 2139
#pkts decaps: 1245, #pkts decrypt: 1245, #pkts verify 1245
Site B:
#pkts encaps: 1245, #pkts encrypt: 1245, #pkts digest 1245
#pkts decaps: 0, #pkts decrypt: 0, #pkts verify 0
So, from the PIX perspective it looks like Site A is successfully receiving and decrypting the traffic from Site B, but NOT vice versa. I can confirm this with tcpdump on the inside hosts being pinged. The Site A host receives the ping request from the Site B host and replies to it, but the Site B host does not receive the reply. The Site B host never receives any ping request from the Site A host.
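The counter comparison above can be sketched in a few lines of Python. This is only an illustration, not something that ran on the PIXes: the function names are made up, and it assumes the counters on both sides were cleared at about the same time, so that one side's "encaps" is comparable to the other side's "decaps".

```python
import re

# Matches counter pairs such as "#pkts encaps: 2139" or "#pkts digest 2139"
COUNTER_RE = re.compile(r"#pkts (\w+):? (\d+)")

def parse_counters(text):
    """Return a dict like {"encaps": 2139, "decaps": 1245, ...}."""
    return {name: int(value) for name, value in COUNTER_RE.findall(text)}

def lost_in_transit(sender, receiver):
    """Packets the sender encapsulated that the receiver never decapsulated."""
    return sender["encaps"] - receiver["decaps"]

# Counter lines copied from the "sh crypto ipsec sa" output above
site_a = parse_counters("""
#pkts encaps: 2139, #pkts encrypt: 2139, #pkts digest 2139
#pkts decaps: 1245, #pkts decrypt: 1245, #pkts verify 1245
""")
site_b = parse_counters("""
#pkts encaps: 1245, #pkts encrypt: 1245, #pkts digest 1245
#pkts decaps: 0, #pkts decrypt: 0, #pkts verify 0
""")

print("A -> B lost:", lost_in_transit(site_a, site_b))
print("B -> A lost:", lost_in_transit(site_b, site_a))
```

With the numbers above, everything Site B sent was decapsulated at Site A, while nothing Site A sent was decapsulated at Site B.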
To better understand the problem I decided the next step was to see what was happening on the WAN side of each PIX, using a hub and laptop with tcpdump. When sending a continuous ping from a Site A host to a Site B host, from outside the Site A PIX I see the outgoing VPN traffic but no responses:
03:01:01.864027 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xec), length 132
03:01:02.639307 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xed), length 100
While simultaneously doing a tcpdump outside the Site B PIX, I do not see these incoming packets.
When I do the reverse scenario, i.e. send a continuous ping from a Site B host to a Site A host, from outside the Site B PIX I see the outgoing VPN traffic but no response, and from outside the Site A PIX I see both the incoming and outgoing ESP packets.
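To compare the two WAN-side captures less by eye, the tcpdump text output from each side can be reduced to (source, destination, SPI, sequence number) tuples and diffed. The following is a minimal sketch under the assumption that both captures are saved as text in the format shown above; the function names are hypothetical.

```python
import re

# Matches lines like:
# 03:01:01.864027 IP a > b: ESP(spi=0x80c8acc1,seq=0xec), length 132
ESP_RE = re.compile(
    r"IP (\S+) > (\S+): ESP\(spi=(0x[0-9a-fA-F]+),seq=(0x[0-9a-fA-F]+)\)"
)

def esp_packets(lines):
    """Return a set of (src, dst, spi, seq) tuples, one per ESP line."""
    found = set()
    for line in lines:
        m = ESP_RE.search(line)
        if m:
            src, dst, spi, seq = m.groups()
            found.add((src, dst, int(spi, 16), int(seq, 16)))
    return found

def missing_at_receiver(sent_lines, received_lines):
    """ESP packets seen leaving one site but never seen arriving at the other."""
    return sorted(esp_packets(sent_lines) - esp_packets(received_lines))

# Lines seen outside the Site A PIX (from the capture above)
capture_a = [
    "03:01:01.864027 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xec), length 132",
    "03:01:02.639307 IP site_a_outside_ip > site_b_outside_ip: ESP(spi=0x80c8acc1,seq=0xed), length 100",
]
# Nothing from Site A was seen outside the Site B PIX
capture_b = []

missing = missing_at_receiver(capture_a, capture_b)
for src, dst, spi, seq in missing:
    print(f"{src} -> {dst} spi={spi:#x} seq={seq:#x} never arrived")
```

Matching on SPI and sequence number avoids relying on the two laptops having synchronized clocks.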
So it appears to me that there is either a problem with the outgoing ESP packets being dropped by the Site A ISP, or incoming ESP packets being dropped by the Site B ISP. I haven't been able to definitively confirm either. I don't have a PIX at a third location to try setting up a VPN to either site. The best I've been able to do so far is send some forged ESP packets from a third location to Site B and I confirmed that they are being received. I'm not sure this is a valid test since they are not real ESP packets, just IP packets with protocol 50 set. I haven't yet been able to try sending these forged ESP packets from Site A to that third location, but this is on my TODO list.
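For anyone wanting to repeat the forged-packet test, here is one way such packets could be generated in Python. This is a sketch and not necessarily how the test above was done: it assumes a Linux host with root access, the function names are hypothetical, and the "ESP" payload is just an SPI, a sequence number, and random filler, so the receiving PIX has no matching SA and should discard it. The only goal is to see whether protocol 50 shows up in tcpdump on the far side.

```python
import os
import socket
import struct

IPPROTO_ESP = 50  # IANA protocol number for ESP (AH is 51)

def build_fake_esp(spi, seq, payload_len=64):
    """Return SPI + sequence number + random filler, laid out like an ESP packet."""
    return struct.pack("!II", spi, seq) + os.urandom(payload_len)

def send_fake_esp(dst_ip, spi, count=5):
    """Send `count` fake ESP packets to dst_ip; the raw socket needs root."""
    # With SOCK_RAW and protocol 50, the kernel adds the IP header itself.
    with socket.socket(socket.AF_INET, socket.SOCK_RAW, IPPROTO_ESP) as s:
        for seq in range(1, count + 1):
            s.sendto(build_fake_esp(spi, seq), (dst_ip, 0))

# Build one packet locally (no root needed) to check the layout
pkt = build_fake_esp(0x80C8ACC1, 1, payload_len=16)
print(len(pkt), struct.unpack("!II", pkt[:8]))
```

Running `send_fake_esp("<site B outside IP>", 0x12345678)` as root from the sending location while tcpdump watches outside the receiving PIX shows whether protocol 50 makes it through that path at all. It still does not prove that ESP packets with the real source address and SPI would be treated the same way.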
Since it seems these ESP packets are getting dropped somewhere, I tried configuring IPsec to not actually encrypt the traffic... i.e. I changed the configuration that has been working for 5+ years:
crypto ipsec transform-set vpn_to_site_a esp-aes-192 esp-sha-hmac
To:
crypto ipsec transform-set vpn_to_site_a ah-sha-hmac
And it works fine (configured accordingly on both sides, of course). With tcpdump on the WAN side I now see AH packets instead of ESP packets, and communication works fully in both directions. Unfortunately, AH only authenticates the traffic and does not encrypt it, which is not what I want. I tried several other transform options, and it simply doesn't work whenever ESP is enabled.
I'm not sure what else to try, and I really need to get this back up and running. Both ISPs are essentially blaming my routers. Any comments, ideas, or suggestions?
MentholMoose
MCSA 2003, LFCS, LFCE (expired), VCP6-DCV
Comments
MentholMoose
One last note. Before doing the above analysis I was leaning toward some strange hardware failure, but given the symptoms and the results of my investigation, I don't see how the routers would cause the problem. I really need some advice on this issue. Thanks!
MentholMoose
MCSA 2003, LFCS, LFCE (expired), VCP6-DCV