I’ll also add, this is currently a draft with the IETF GROW WG (and one which I fully support):
https://tools.ietf.org/html/draft-ietf-grow-bgp-session-culling-01

--
Andrew Hoyos

> On Apr 26, 2017, at 9:34 AM, Andrew Hoyos wrote:
>
> On Apr 26, 2017, at 9:19 AM, Mike Horwath wrote:
>>
>> On Wed, Apr 26, 2017 at 08:07:45AM -0500, Andrew Hoyos wrote:
>>> I would suggest that perhaps we look into filtering BGP (tcp/179)
>>> with an ACL prior to maintenance start on those specific ports being
>>> moved. Many other IXs are doing this for maintenance as a way to
>>> gracefully take things down, and let bilateral and RS sessions time
>>> out without killing active traffic. As we've noticed, not all
>>> members being moved are bothering to shut down sessions prior, which
>>> causes impact to/from those members. (e.g.:
>>> https://ripe67.ripe.net/presentations/374-WH-IXPMaintReduce.pdf)
>>
>> Don't even need ACLs.
>>
>> Just take down the route servers for the 2 hour period.
>>
>> Bilateral are unaffected and they can arrange things anyway with their
>> peers.
>
> I’d disagree. The maintenance currently taking place affects more than just the route servers. Plenty of people are doing bilateral peering on MICE, and that *IS* affected by maintenance events like these.
>
> Adding an ACL to the port ensures graceful shutdown/end of traffic, rather than an abrupt drop and hold timer fun.
> I’d much rather that someone running the maintenance and in control of the ultimate link up/down events be the one deciding when things are starting/ending and re-enabling traffic gracefully.
>
>> Adding another step to the process creates more complications as well,
>> and another point of failure if you screw up along the way.
>
> Disagree, adding an ACL to a port is pretty trivial. Add (pre-existing) ACL to port 10 minutes before maintenance starts. Remove when complete.
> Script up into copy/paste thing with port numbers for bonus points and fewer chances of failure.
>
>> Clean shutdown of bird is easier, quicker, and will for sure make the
>> multilateral peering not be further affected by bouncing repeatedly.
>
> Yes, great for MLPA, but not for bilateral.
>
> Lastly, in this *specific* case, this presents issues with other members' ports which are *NOT* affected by the maintenance, and a loss of traffic for them if they are doing MLPA. Why break everyone and cause a total route server outage, when it’s not necessary at all? Yesterday’s maintenance only affected a portion of members. ACLs on member ports would be the cleanest way to minimize outage duration for all members with the least impact to the IX as a whole.
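
For what it's worth, the "script it up into a copy/paste thing" idea quoted above could look something like the following. This is only a rough sketch: the ACL name (BGP-CULL), the port names, and the IOS-style `ip access-group` syntax are all assumptions for illustration, and the real commands would depend on the switch platform in use.

```python
# Sketch only: the ACL name, port names, and IOS-style command syntax below
# are assumptions for illustration, not the IX's actual configuration.

APPLY_TEMPLATE = "interface {port}\n ip access-group {acl} in"
REMOVE_TEMPLATE = "interface {port}\n no ip access-group {acl} in"


def render_commands(ports, acl, template):
    """Return one copy/paste block covering every port in `ports`."""
    return "\n".join(template.format(port=port, acl=acl) for port in ports)


def culling_plan(ports, acl="BGP-CULL"):
    """Build the 'before maintenance' and 'after maintenance' command blocks."""
    return {
        "before": render_commands(ports, acl, APPLY_TEMPLATE),
        "after": render_commands(ports, acl, REMOVE_TEMPLATE),
    }


if __name__ == "__main__":
    # Hypothetical list of member ports being moved in this window.
    plan = culling_plan(["Ethernet1/1", "Ethernet1/2"])
    print("! Apply about 10 minutes before maintenance starts")
    print(plan["before"])
    print("! Remove when maintenance is complete")
    print(plan["after"])
```

The idea is to paste the "before" block roughly 10 minutes ahead of the window so sessions on only the affected ports time out on their own, then paste the "after" block once the work is done.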