IP SLA implementation

We have recently had problems at work with HSRP and IP SLAs. We connect through redundant links that we do not control, and one of them has been intermittently dropping packets. This caused the active HSRP gateway to go to standby, and after a short time it would recover.

There is a single IP SLA configured, so when the network dropped these packets, that one probe alone was responsible for bringing the active HSRP switch down.

I was wondering whether it would be better to have two IP SLAs with longer, staggered timers, so that small packet losses would not be that much of an issue.

An example:

One SLA that pings every 10 seconds and decrements the priority by 20
Two SLAs that ping every 10 seconds and each decrement the priority by 10. One would start 5 seconds after the other.

The only downside would be having another process running on the router, but I don't think a process that sends a ping can be that costly...

So... Does it make sense? Have you seen it done somewhere?

Thanks in advance!

(I edited the timers)


Comments

sandman748

If it's just a short burst of packet loss (less than three minutes), you can delay the object going down on the track statement.

Sample config:

track x ip sla y
 delay down 90

If you want a second SLA to another site in case the object you are tracking is flaky, you can also track a boolean list, so that both sites have to stop responding to ping before HSRP fails over.

Example:

track 1 ip sla 1
track 2 ip sla 2
track 3 list boolean or
object 1
object 2

We use a combination of both for our dual-ISP WAN connection. So on track 3 we also have:

delay down 90 up 180

The end result is that both responders have to be down for 90 seconds before the HSRP state changes, and back up again for 3 minutes before it changes back.

Obviously those timers can be tweaked to your liking. It doesn't have to be that long before going down or back up.
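To make the effect of the delay timers concrete, here is a rough Python sketch of the hold-down behaviour described above. It is a simplified model, not how IOS implements it: the function name `delayed_track`, the 10-second sampling step, and the assumption that the reported state starts out equal to the first sample are all made up for illustration.

```python
def delayed_track(raw_states, delay_down=90, delay_up=180, step=10):
    """Model a tracked object with 'delay down X up Y'.

    raw_states: list of booleans, one probe result every `step` seconds
                (True = reachable). Returns the state the tracker would
                report after each sample in this simplified model.
    """
    reported = raw_states[0]   # assumption: start in the first observed state
    pending_for = 0            # seconds the raw state has disagreed with the reported one
    out = []
    for raw in raw_states:
        if raw == reported:
            pending_for = 0    # the disagreement ended, so the timer resets
        else:
            pending_for += step
            delay = delay_up if raw else delay_down
            if pending_for >= delay:
                reported = raw # the change persisted long enough to be reported
                pending_for = 0
        out.append(reported)
    return out


# 30 seconds of loss followed by recovery: the reported state never flips.
short_blip = [True] + [False] * 3 + [True] * 5
print(delayed_track(short_blip))

# A sustained outage: the reported state flips only after 90 seconds of loss.
outage = [True] + [False] * 10
print(delayed_track(outage))
```

The point of the sketch is only that a short burst of failed probes resets without ever reaching HSRP, while a sustained outage still triggers the failover.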

_Gonzalo_

Thanks sandman748!

That list boolean is definitely something I'll use. In fact, I checked today in detail and discovered that the whole HSRP tracking configuration is a mess, so I'm redoing it tomorrow. I'll post it when it's done.

The timers will still have to be 10, but I think I have it almost clear. By the way, I think you wanted to type "and" instead of "or".

sandman748

I definitely meant to say OR not AND.

The logic is if track 1 OR track 2 = UP then track 3 = UP

I want both sites to be down before I flip the switch.

If you use AND, one site going down will cause the list to be down.

Depends on what behavior you are looking for.
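The OR versus AND distinction above can be checked with a small truth table. This is only an illustrative Python sketch; `track_list` is a hypothetical helper name, and True stands for "the tracked object is up".

```python
from itertools import product


def track_list(op, objects):
    """State of a boolean track list: 'or' is up if any object is up,
    'and' is up only if every object is up."""
    if op == "or":
        return any(objects)
    if op == "and":
        return all(objects)
    raise ValueError("op must be 'or' or 'and'")


# Print the state of the list for every combination of two tracked objects.
for t1, t2 in product([True, False], repeat=2):
    print(f"track1={t1!s:5} track2={t2!s:5} "
          f"or={track_list('or', [t1, t2])!s:5} "
          f"and={track_list('and', [t1, t2])}")
```

With OR, the list (and therefore the HSRP priority) only drops when both targets stop answering; with AND, a single flaky target is enough to bring it down.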

_Gonzalo_

sandman748 wrote: »

I definitely meant to say OR not AND

That you did! Logic told me "AND", but I looked it up and realized that you were right.

Thanks again!

It was really useful. I do not have the config at hand, but I'll try to post it tonight.

_Gonzalo_

Sorry for the delay... The config should end up like this:

On both

ip sla monitor 101
 type echo protocol ipIcmpEcho 10.X.X.17
 frequency 10
ip sla monitor schedule 101 life forever start-time X

ip sla monitor 102
 type echo protocol ipIcmpEcho 10.X.X.18
 frequency 10
ip sla monitor schedule 102 life forever start-time X+2 SECONDS

ip sla monitor 103
 type echo protocol ipIcmpEcho 10.X.X.17
 frequency 10
ip sla monitor schedule 103 life forever start-time X+5 SECONDS

ip sla monitor 104
 type echo protocol ipIcmpEcho 10.X.X.18
 frequency 10
ip sla monitor schedule 104 life forever start-time X+7/8 SECONDS

and

track 101 rtr 101
track 102 rtr 102
track 103 rtr 103
track 104 rtr 104

track 100 list boolean or
 object 101
 object 102
 object 103
 object 104

On sw01a

interface FastEthernet0/1
 ip address 2.X.X.91 255.255.255.0
 standby 99 ip 2.X.X.90
 standby 99 priority 129
 standby 99 preempt
 standby 99 preempt delay minimum 1
 standby 99 name TRACKHSRP1
 standby 99 track Tunnel0
 standby 99 track Tunnel1
 standby 99 track 100 decrement 20

On sw01b

interface FastEthernet0/1
 standby 99 ip 2.X.X.90
 standby 99 priority 110
 standby 99 preempt delay minimum 1
 standby 99 name TRACKHSRP1
 standby 99 track Tunnel10
 standby 99 track Tunnel11
 standby 99 track 100 decrement 20
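As a sanity check on the priorities above, the arithmetic can be worked through in a few lines of Python. This is only a sketch under stated assumptions: the tunnel tracks have no explicit decrement in the posted config, so the HSRP default of 10 is assumed for them; the names `effective_priority` and `active_router` are invented for the example; and the real tie-break on equal priority (highest interface IP) is not modelled.

```python
# Decrements per tracked item. The value 10 for the tunnels is an assumption
# (the HSRP default when no decrement is configured); 20 comes from the config.
SW01A = {"Tunnel0": 10, "Tunnel1": 10, "track100": 20}
SW01B = {"Tunnel10": 10, "Tunnel11": 10, "track100": 20}

BASE_A = 129  # standby 99 priority 129 on sw01a
BASE_B = 110  # standby 99 priority 110 on sw01b


def effective_priority(base, decrements, down):
    """Base priority minus the decrement of every tracked item that is down."""
    return base - sum(dec for name, dec in decrements.items() if name in down)


def active_router(down_a, down_b):
    """Return (winner, priority_a, priority_b) for the given failed items."""
    pa = effective_priority(BASE_A, SW01A, down_a)
    pb = effective_priority(BASE_B, SW01B, down_b)
    if pa == pb:
        return "tie", pa, pb  # real HSRP breaks ties on IP address
    return ("sw01a" if pa > pb else "sw01b"), pa, pb


print(active_router(set(), set()))                # everything up
print(active_router({"track100"}, set()))         # SLA list down on sw01a only
print(active_router({"track100"}, {"track100"}))  # SLA list down on both
print(active_router({"Tunnel0"}, set()))          # one tunnel down on sw01a
```

Under these assumptions, losing track 100 on sw01a alone drops it to 109, just below sw01b's 110, while a single tunnel failure leaves sw01a at 119 and still active.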

The other side of the tunnels would just react to tunnel failure like this:

sw03a:

standby 99 priority 109
standby 99 track Tunnel0 decrement 12
standby 99 track Tunnel10 decrement 8

sw04a:

standby 99 priority 100
standby 99 track Tunnel1 decrement 10
standby 99 track Tunnel11 decrement 10

(I edited the priorities and decrements)

d4nz1g

Let me see if I understand: you have one active router for 2 sites?
If so, they are converging due to hello packet loss, and you would have a split-brain scenario.

Keep in mind that this design is not recommended at all; the right one would be one active/standby pair per site (although we run this where I work, haha).

_Gonzalo_

Hey!

The first two are actually in two different sites, but share two HSRP instances. sw03 and sw04 are on a separate site and share two other instances. There are even more HSRP instances running on the network, and underneath there are a lot of L2 zones that we do not manage. Also, there is some traffic routed out of the tunnel, through a couple more routers and a firewall (per path) that we also manage.

I just limited it to a portion to simplify, as the other factors were not affecting this particular case, but basically that's it.

The fact that the instance number is the same is just due to my being tired, hehehe (but it would not matter, as they share L2 links in pairs: s1-s2, s3-s4)