Hi Guys,
Hoping someone can help me with a mysterious issue we're having on our 6509s at one of our offices. It happened out of the blue and wasn't the result of any work being carried out.
First, the setup: there are two 6509s (SW1 & SW2) with the usual VLAN routing, EtherChannel, HSRP etc., running EIGRP. Yesterday, out of nowhere, the two could no longer communicate or ping each other on certain IP ranges using the DG (default gateway) for those subnets. For example, half of the virtual HSRP IPs are active on SW1 and the other half on SW2; now SW2 cannot ping the virtual IPs on SW1, and SW1 cannot ping certain virtual IPs on SW2.
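For context, the SVIs are set up the standard way; a rough sketch of one of them (the VLAN, addresses and group number here are placeholders, not our actual config):

interface Vlan4
 ip address 10.0.4.2 255.255.255.0      ! this core's own SVI address
 standby 4 ip 10.0.4.1                  ! virtual HSRP IP = the clients' DG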
A big issue is that, for some reason, all of our access switches have only one uplink, to only ONE of the 6500s - don't ask, it was the guy before me! I'm looking to rectify this, but there's a LOT of faulty fibre so it's not possible at the moment, and a disaster like this has been waiting to happen because no one has solid knowledge of this site. So any access switch connected to SW2 can't use VLAN routing to reach servers etc. when the traffic goes through SW2 (because SW2 can't ping the DGs sitting on SW1).
Here's what we've checked so far:
- Cleared all the usual caches; the configs have not changed at all.
- Confirmed the HSRP groups are configured correctly.
- No error messages ANYWHERE on the console or from the debug commands we've run.
- Status lights on both 6500 chassis are green and OK.
- Reseated the fibre connections on each switch.
- Confirmed the EtherChannel is showing as OK, and a "show ip route" for a DG IP shows it routing via the EtherChannel.
- A "sh cdp neigh" from the switch module can't see SW2 as a neighbour, yet from the supervisor it CAN see SW2's supervisor IP as a CDP neighbour?
- All interfaces are showing as up/up.
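For reference, these are roughly the commands I've been using on the IOS side of each core (the CatOS equivalents on the sup differ slightly, e.g. "show cdp neighbor" / "show port channel"; the DG address below is just a placeholder):

show standby brief            ! which HSRP groups are active/standby on this core
show etherchannel summary     ! port-channel and member link state
show ip route 10.0.4.1        ! DG subnet shows as routing via the EtherChannel
show cdp neighbors            ! this is the one that can't see SW2 from the switch module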
We also tried failing HSRP over by forcing the priority across to SW1 (which was having fewer issues than SW2), and when we did that we could no longer see SW2 at all. It was like it fell off the face of the earth, never to return.
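In case it matters, the failover was done the usual way, by bumping the priority on SW1's SVIs along these lines (group and priority numbers here are just illustrative):

interface Vlan4
 standby 4 priority 150     ! raised above SW2's priority so SW1 goes active
 standby 4 preempt          ! so it takes over without waiting for a failure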

I changed the priorities back and it returned to its original state. We did the same on SW2 and, again, the same thing happened as above.
Both cores have now been rebooted. It wasn't my idea, but it was done to put people's minds at rest as they'd been up for around 4 years. Anyway, they came back up and it's still the same.
I've been round and tested a lot of the client machines and found that, regardless of which core switch the access switches go through, VLAN 4 can't route off its subnet. As a temporary measure we've moved all the VLAN 4 clients into VLAN 3 and they work OK. It doesn't seem to be a routing issue though - routing is enabled for that range etc. - and it's strange because these have been running for 10 years, no one touches them, and as I say the config has not changed, so it's odd that it would happen all of a sudden. Also, both core switches are having issues pinging IPs from all VLANs, not just VLAN 4, but operationally only clients on VLAN 4 seem to be affected when they go through SW2 to reach a virtual IP on SW1. For some reason clients doing the same thing on other VLANs are working OK - even though that core switch cannot ping the very same virtual IP those clients are going to!!
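For clarity, the "move" of the VLAN 4 clients was just done on the access switch ports (and getting them an address in VLAN 3's range) - roughly this, with the port being a placeholder:

interface FastEthernet0/1
 switchport access vlan 3     ! was switchport access vlan 4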
Also, when I run "show ip eigrp neigh" from the switch module it shows no neighbours, however if I run it from the supervisor it shows the other core switch's supervisor as its neighbour. BTW, the sup is running CatOS and the switch modules are running IOS.
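If it helps picture the setup, I'm getting between the two the usual hybrid way - something like this (the module number is just an example, it depends which slot the sup is in):

session 15                   ! from the CatOS sup prompt, attach to the IOS side
show ip eigrp neighbors      ! this is where it shows no neighbours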
So, I know there are plenty of people around here with solid 6500 experience, and I'm running out of ideas. I hope I haven't missed anything obvious, but I'd welcome any opinions.
Thanks guys.