AS57777 Network Design

Internal: Multiprotocol BGP Link-Local Address Peering & BFD

Inside AS57777 we use Multiprotocol BGP with Link-Local Address Peering as our Interior Gateway Protocol (IGP): the routers peer over IPv6 Link-Local sessions and exchange both IPv4 and IPv6 routing information in BGP (RFC5549/RFC8950), with each router having its own ASN. We leverage Bidirectional Forwarding Detection (BFD; RFC5880) to rapidly detect link failures.

Internal routers only have their primary IPv4 + IPv6 addresses on a loopback interface. Links between routers are unnumbered and only have an IPv6 Link-Local address assigned to them. Next-hops inside AS57777 are thus all IPv6 Link-Local for both IPv4 and IPv6, and there are no IPv4 BGP sessions internally.

As we have our own IEEE OUI-36 block, we hard-code the MAC address of each interface, thus automatically gaining stable and easily identifiable IPv6 Link-Local (fe80::/10) addresses.
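As an illustration of why a fixed MAC yields a stable Link-Local address, the following is a minimal Python sketch of the modified EUI-64 derivation (RFC 4291). It assumes the operating system derives the interface identifier from the MAC in this way; the function name and the MAC address are hypothetical, the latter chosen so that the result matches the local address used in the Bird example on this page.

```python
import ipaddress


def mac_to_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 IPv6 Link-Local address for a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets in MAC, got {len(octets)}")
    # Flip the universal/local bit of the first octet
    octets[0] ^= 0x02
    # Insert ff:fe in the middle to get the 64-bit interface identifier
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | int.from_bytes(interface_id, "big"))


# Hypothetical MAC address, used for illustration only
print(mac_to_link_local("8c:1f:64:fa:a4:01"))
```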

Internally we use the ASN 4205777xxx, where xxx is the last octet of the IPv4 address increased by 700. For example, r1.zrh.ch.massars.net has the IPv4 address 185.173.128.2; its Router ID is the same and its internal ASN is 4205777702.
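The numbering scheme can be sketched in a few lines of Python. This reads the rule off the example given here (final octet 2 plus 700 gives 702); the function and constant names are hypothetical and not taken from our tooling.

```python
import ipaddress

INTERNAL_ASN_BASE = 4205777000  # the fixed "4205777" part followed by three digits
ASN_OFFSET = 700                # offset added to the final IPv4 octet


def internal_asn(primary_ipv4: str) -> int:
    """Return the internal ASN for a router's primary IPv4 address."""
    last_octet = ipaddress.IPv4Address(primary_ipv4).packed[-1]
    return INTERNAL_ASN_BASE + ASN_OFFSET + last_octet


print(internal_asn("185.173.128.2"))  # 4205777702
```

As the final octet is at most 255, the result stays within the three digits available (at most 955).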

The configuration management tool has a mapping of the primary IPv4 + IPv6 + MAC per router and can then easily generate all configuration (local IPv6 Link-Local and the remote IPv6 Link-Local Address along with ASNs).
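A rough sketch of how such a mapping can be turned into a peering stanza is shown below. Our tool uses Golang templates; Python is used here purely for illustration. The `Router` class, the `render_peer` function, the `example.net` host names and the generated protocol name are all hypothetical; the addresses and ASNs are those of the Bird example on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str        # host name (hypothetical in this sketch)
    asn: int         # internal ASN
    link_local: str  # IPv6 Link-Local address derived from the fixed MAC


PEER_TEMPLATE = """protocol bgp {proto} from tmpl_peer_ipv6ll {{
\tdescription "{local} to {remote}";
\tsource address {local_ll};
\tneighbor {remote_ll} as {remote_asn};
\tinterface "{iface}";
}}
"""


def render_peer(local: Router, remote: Router, iface: str) -> str:
    """Render a Bird peering stanza for the link between two routers."""
    proto = "peer_" + remote.name.split(".")[0]
    return PEER_TEMPLATE.format(
        proto=proto,
        local=local.name,
        remote=remote.name,
        local_ll=local.link_local,
        remote_ll=remote.link_local,
        remote_asn=remote.asn,
        iface=iface,
    )


r0 = Router("r0.example.net", 4205777701, "fe80::8e1f:64ff:fefa:a401")
r1 = Router("r1.example.net", 4205777702, "fe80::8e1f:64ff:fefa:a402")
print(render_peer(r0, r1, "eth0"))
```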

The choice of eBGP as an IGP is common in large-scale datacenter settings and avoids operating a separate IGP like OSPFv3. Scaling is not a real issue for our network, as it is relatively small. Our configuration tool can insert prepends to work around routers that are in maintenance mode.

We mix both Bird and OpenBGPd in the network, and thus also Linux and OpenBSD hosts, with either routing suite running on either OS. As these are all software-based, they are easily replicated in our test and lab environments. For more complex testing we use ContainerLab and netlab, and for other use cases EVE-NG.

Hosts / Servers

While there are a few hosts living in a good old-fashioned subnet, we are moving them to BGP Unnumbered IPv6 Link-Local connections instead. Each host/server that needs flexibility is a BGP speaker announcing the prefixes that it needs. That allows much more flexible placement of hosts, and lets an IP address float around the network together with its host without any need to renumber. In IPv4 this also means avoiding the network (0), gateway and broadcast addresses of a subnet, thus saving a few IPv4 addresses.
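To put a number on the IPv4 savings, here is a small Python sketch comparing a classic subnet with per-host announcements. It assumes that a classic subnet loses three addresses (network address, broadcast address and one gateway); the function names and the documentation prefix are used for illustration only.

```python
import ipaddress


def host_addresses_in_subnet(cidr: str) -> int:
    """Addresses left for hosts in a classic subnet with one gateway."""
    net = ipaddress.ip_network(cidr)
    reserved = 3  # assumption: network address, broadcast address, gateway
    return max(net.num_addresses - reserved, 0)


def host_addresses_as_host_routes(cidr: str) -> int:
    """Addresses usable when every address is announced as its own host route."""
    return ipaddress.ip_network(cidr).num_addresses


print(host_addresses_in_subnet("192.0.2.0/29"))       # 5
print(host_addresses_as_host_routes("192.0.2.0/29"))  # 8
```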

Bird IPv4 over IPv6 Link-Local Nexthop Example


# Template defining both IPv4 + IPv6 in a single peering session
# Thus enabling that single BGP session to carry both IPv4 + IPv6 prefixes
# (normally part of another template with many more options)
# (see https://bgpfilterguide.nlnog.net etc)
router id 192.0.2.1;

protocol device {
	scan time 600;
}

protocol direct {
	ipv4;
	ipv6;
	check link yes;
}

protocol kernel fib4 {
	ipv4 {
		import all;
		export all;
	};
}

protocol kernel fib6 {
	ipv6 {
		import all;
		export all;
	};
}

template bgp tmpl_peer_ipv6ll
{
	# Example Local ASN
	local as 4205777701;

	ipv4
	{
		# Export and import them all, one will want to filter this normally ;)
		export all;
		import all;

		# Use the IPv6-LL address as a nexthop
		next hop self;

		# Enable extended next hop (RFC8950): IPv4 NLRI can be
		# announced with an IPv6 nexthop
		extended next hop on;
	};

	ipv6
	{
		export all;
		import all;

		next hop self;
		extended next hop on;
	};
};

protocol bgp peer_other from tmpl_peer_ipv6ll {
	# Little Description of this peer
	description "Example Peering over IPv6 LL";

	# Local IPv6-LL address, needs to be configured
	source address fe80::8e1f:64ff:fefa:a401;

	# Friendly party IPv6-LL + ASN
	neighbor fe80::8e1f:64ff:fefa:a402 as 4205777702;

	# Interface this peering happens on
	interface "eth0";
}

External: eBGPv4

Naturally, externally we use eBGPv4.

Filtering Management: RPKI, IRR and Spamhaus DROP

RPKI data is fetched using rpki-client, IRR data is retrieved using bgpq4, and additionally the Spamhaus Don't Route Or Peer (DROP) lists (IPv4, IPv6, ASN) are retrieved. All of this is massaged into a JSON file and provided to StayRTR, which serves it over the RTR protocol to either Bird or OpenBGPd.

Based on that information a decision is taken whether the prefix/ASN should be accepted from a BGP peer or not. When a prefix is filtered out it is not in our routing tables, so for us it does not exist, and packets from it are dropped by RPF filters as the source prefix does not exist. A single less-specific announcement, though, would still make the packets viable routing-wise, and thus the IPv4 + IPv6 prefix lists are additionally dropped by our border router firewalls.
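The less-specific case can be illustrated with a small Python model. It is a simplification under stated assumptions: RPF is modelled in loose mode (any covering route is enough), and the function names and prefixes are hypothetical.

```python
import ipaddress


def has_route(routing_table, source: str) -> bool:
    """Loose-mode RPF model: does any route cover the source address?"""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in routing_table)


def accept_packet(routing_table, drop_list, source: str) -> bool:
    """The firewall drop list is checked first, then the RPF check."""
    addr = ipaddress.ip_address(source)
    if any(addr in net for net in drop_list):
        return False
    return has_route(routing_table, source)


# The listed /25 itself was filtered from BGP, but a covering /24 was accepted
routing_table = [ipaddress.ip_network("192.0.2.0/24")]
drop_list = [ipaddress.ip_network("192.0.2.0/25")]

print(has_route(routing_table, "192.0.2.10"))                 # True: RPF alone would pass it
print(accept_packet(routing_table, drop_list, "192.0.2.10"))  # False: the firewall drops it
```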

Additionally we peer with Team Cymru UTRS, thus ensuring that we are not the source of unwanted traffic.

This delivers a clean and happy Internet experience: if one appears on Spamhaus DROP, one has been misbehaving for a while.

Spamhaus DROP through RTR protocol

An example entry from the ASN drop file:
	{ "asn": 23456, "prefix": "192.0.2.0/24", "maxLength": 24, "ta": "asndrop", "expires": 1733258739 }
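A minimal Python sketch of how such entries could be produced from a list of ASNs follows. The field layout mirrors the example entry above; the top-level "roas" key, the one-day lifetime and the function names are assumptions made for illustration and are not taken from our tooling.

```python
import json
import time

MARKER_PREFIX = "192.0.2.0/24"  # fixed documentation prefix used as a placeholder


def asn_drop_entry(asn: int, expires: int) -> dict:
    """Build one entry shaped like the example from the ASN drop file."""
    return {
        "asn": asn,
        "prefix": MARKER_PREFIX,
        "maxLength": 24,
        "ta": "asndrop",
        "expires": expires,
    }


def build_drop_file(asns, ttl_seconds=86400, now=None) -> str:
    """Serialise all ASNs into one JSON document (top-level key is an assumption)."""
    now = int(time.time()) if now is None else now
    entries = [asn_drop_entry(asn, now + ttl_seconds) for asn in sorted(set(asns))]
    return json.dumps({"roas": entries}, indent=2)


print(build_drop_file([23456]))
```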
In Bird we check the above JSON file served with StayRTR using the following configuration snippet:
protocol rpki drop1
{
	roa4 { table drop4; };
	remote 127.0.0.1 port 325;
	retry 300;
}

# birdc "eval test_drop_beacons()"
function test_drop_beacons()
{
	print "Testing ROA (DROP)";
	print "Should be TRUE TRUE";
	print roa_check(drop4, 192.0.2.0/24, 23456) = ROA_VALID;
	print roa_check(drop4, 185.173.128.0/24, 57777) = ROA_UNKNOWN;
}

...

function fn_drop_asn()
{
	# Listed in Spamhaus DROP, as served through RTR
	# We always test against the fixed IPv4 documentation prefix, for both
	# IPv4 + IPv6 routes, as we only care whether the given ASN is listed
	for int n_asn in bgp_path do {
		if roa_check(drop4, 192.0.2.0/24, n_asn) = ROA_VALID then {
			return true;
		}
	}
	return false;
}

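Note that ASNs that should not be dropped are not listed in the file, so a check for them never returns ROA_VALID: it returns ROA_UNKNOWN when no entry covers the prefix, or ROA_INVALID when an entry covers the prefix but carries a different ASN. Hence we simply check for ROA_VALID in this special ROA table to decide whether an ASN should be dropped.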
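The lookup logic can be modelled in Python as follows. This is a simplified stand-in for Bird's roa_check, following the RFC 6811 validation states, and not Bird's actual implementation; the table layout and the function names are hypothetical.

```python
import ipaddress

ROA_UNKNOWN, ROA_VALID, ROA_INVALID = "unknown", "valid", "invalid"
MARKER_PREFIX = "192.0.2.0/24"  # the fixed prefix every drop entry is filed under


def roa_check(table, prefix: str, asn: int) -> str:
    """Simplified RFC 6811 check; table rows are (prefix, max_length, asn)."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, roa_asn in table:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version == roa_net.version and route.subnet_of(roa_net):
            covered = True
            if route.prefixlen <= max_length and roa_asn == asn:
                return ROA_VALID
    return ROA_INVALID if covered else ROA_UNKNOWN


def path_has_dropped_asn(table, as_path) -> bool:
    """Mirror of fn_drop_asn: true if any ASN in the path is listed."""
    return any(roa_check(table, MARKER_PREFIX, asn) == ROA_VALID for asn in as_path)


drop4 = [("192.0.2.0/24", 24, 23456)]
print(path_has_dropped_asn(drop4, [64500, 23456]))  # True
print(path_has_dropped_asn(drop4, [64500, 57777]))  # False
```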

Configuration Management

To easily deploy and test configurations and ensure that they are correct and as expected, we have a new internal configuration management tool that can rsync configurations to our hosts. As it is programmed in Golang, we use Golang templates for configuration snippets. The configuration is kept in git, so we can easily deploy new versions to a single host or to all hosts and roll back configuration changes. The system deploys to a test copy of the network first, and only if that validates correctly against a myriad of tests run on the configuration can it be deployed to production.

BGP peer management is part of the above; we will be providing access to the tool at a later stage for self-management purposes.

Naming

All hosts are named in the format <function>.<site>.massars.net. The site code follows UN/LOCODE as detailed on the Sites page.

Hosts

<loc> indicates the location of the device inside a geographic zone, typically a rack or room identifier.
Format               Role                  Notes
ap-<loc>             Access Point (WiFi)
bcast.<loc>          Broadcast             The broadcast address of a subnet
cam-<loc>            Camera
dk<n>                Docker Host           Runs docker containers
gw.<dom>             Gateway               The gateway of a subnet
mdm-<t>              Modem                 With <t> indicating the type
mm<nn>               Milk Machine          A host that has as primary function to run VMs and containers
ipmi.<host>          IPMI                  IPMI for the host
pi<nn>               Raspberry Pi
r<n>                 Router
sw-<loc>(-[ui|zx])   Switch                ui (unifi) or zx (zyxel) are for the situation where there are two switches in the same location
ups-<loc>            UPS
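As a sketch of how the naming scheme can be used mechanically, the Python snippet below maps the first DNS label of a host name to its role. The patterns are read off the formats listed above, under the assumption that <n> and <nn> are digits; the function name and every host name other than r1.zrh.ch.massars.net are hypothetical.

```python
import re

# (pattern for the first DNS label, role)
ROLE_PATTERNS = [
    (re.compile(r"ap-.+"), "Access Point (WiFi)"),
    (re.compile(r"bcast"), "Broadcast"),
    (re.compile(r"cam-.+"), "Camera"),
    (re.compile(r"dk\d+"), "Docker Host"),
    (re.compile(r"gw"), "Gateway"),
    (re.compile(r"mdm-.+"), "Modem"),
    (re.compile(r"mm\d+"), "Milk Machine"),
    (re.compile(r"ipmi"), "IPMI"),
    (re.compile(r"pi\d+"), "Raspberry Pi"),
    (re.compile(r"r\d+"), "Router"),
    (re.compile(r"sw-.+"), "Switch"),
    (re.compile(r"ups-.+"), "UPS"),
]


def role_of(hostname: str):
    """Return the role implied by the first label, or None if nothing matches."""
    label = hostname.split(".", 1)[0].lower()
    for pattern, role in ROLE_PATTERNS:
        if pattern.fullmatch(label):
            return role
    return None


print(role_of("r1.zrh.ch.massars.net"))  # Router
```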