I’ll also add that this is currently a draft in the IETF GROW WG:

https://tools.ietf.org/html/draft-ietf-grow-bgp-session-culling-01

(and one which I fully support)
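
As a rough sketch of the "script it up into a copy/paste thing" idea from the quoted message below: something like the following could spit out the per-port commands to paste before and after a window. The command templates, the ACL name "BGP-CULL", and the port names are all made-up placeholders, not any vendor's real syntax, so swap in whatever the switches actually take.

```python
# Sketch only: the templates below are hypothetical placeholders, NOT real
# vendor CLI syntax. Replace them with the commands your switches accept.
APPLY_TEMPLATE = "interface {port} ; apply-acl {acl}"
REMOVE_TEMPLATE = "interface {port} ; remove-acl {acl}"


def render(template, ports, acl):
    """Return one command line per port, in the order given."""
    return [template.format(port=p, acl=acl) for p in ports]


def maintenance_plan(ports, acl="BGP-CULL"):
    """Build the command blocks to paste before and after maintenance.

    `acl` is assumed to be a pre-existing ACL that drops tcp/179.
    """
    if not ports:
        raise ValueError("no ports given")
    return {
        "before": render(APPLY_TEMPLATE, ports, acl),
        "after": render(REMOVE_TEMPLATE, ports, acl),
    }


if __name__ == "__main__":
    # Hypothetical port names for the members being moved.
    plan = maintenance_plan(["port-1", "port-2"])
    print("# ~10 minutes before maintenance starts:")
    print("\n".join(plan["before"]))
    print("# once maintenance is complete:")
    print("\n".join(plan["after"]))
```

Only the ports being moved go in the list, so everyone else's sessions are left alone.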

--
Andrew Hoyos
[log in to unmask]



> On Apr 26, 2017, at 9:34 AM, Andrew Hoyos <[log in to unmask]> wrote:
> 
> On Apr 26, 2017, at 9:19 AM, Mike Horwath <[log in to unmask]> wrote:
>> 
>> On Wed, Apr 26, 2017 at 08:07:45AM -0500, Andrew Hoyos wrote:
>>> I would suggest that perhaps we look into filtering BGP (tcp/179)
>>> with an ACL prior to maintenance start on those specific ports being
>>> moved.  Many other IXs are doing this for maintenance as a way to
>>> gracefully take things down, and let bilateral and RS sessions time
>>> out without killing active traffic. As we've noticed, not all
>>> members being moved are bothering to shut down sessions prior, which
>>> causes impact to/from those members.  (i.e.:
>>> https://ripe67.ripe.net/presentations/374-WH-IXPMaintReduce.pdf)
>> 
>> Don't even need ACLs.
>> 
>> Just take down the route servers for the 2 hour period.
>> 
>> Bilateral are unaffected and they can arrange things anyway with their
>> peers.
> 
> I’d disagree. The maintenance currently taking place affects more than just the route servers. Plenty of people are doing bilateral peering on MICE, and that *IS* affected by maintenance events like these.
> 
> Adding an ACL to the port ensures graceful shutdown/end of traffic, rather than an abrupt drop and hold timer fun.
> I’d much rather that someone running the maintenance and in control of the ultimate link up/down events be the one deciding when things are starting/ending and re-enabling traffic gracefully.
> 
>> Adding another step to the process creates more complications as well,
>> and another point of failure if you screw up along the way.
> 
> Disagree, adding an ACL to a port is pretty trivial. Add (pre-existing) ACL to port 10 minutes before maintenance starts. Remove when complete. 
> Script it up into a copy/paste thing with port numbers for bonus points and fewer chances of failure.
> 
>> Clean shutdown of bird is easier, quicker, and will for sure make the
>> multilateral peering not be further affected by bouncing repeatedly.
> 
> Yes, great for MLPA, but not for bilateral. 
> 
> Lastly, in this *specific* case, taking down the route servers creates problems for other members’ ports that are *NOT* affected by the maintenance: they lose traffic if they are doing MLPA. Why break everyone and cause a total route server outage when it’s not necessary at all? Yesterday’s maintenance only affected a portion of members. ACLs on member ports would be the cleanest way to minimize outage duration for all members, with the least impact to the IX as a whole.
> 
> 
>