VyOS 1.2.0 development news in July

Despite the slow news season and the RAID incident that luckily slowed us down only for a couple of days, I think we've made good progress in July.

First, Kim Hagen got cloud-init to work, even though it didn't make it to the mainline image, and WAAgent required for Azure is not working yet. Some more work, and VyOS will get a much wider cloud platform support. He's also working on Wireguard integration and it's expected to be merged into current soon.

The new VRRP CLI and IPv6 support is another big change, but it's got its own blog post, so I won't stop there and cover things that did not get their own blog posts instead.

IPsec and VTI

While I regard VTI as the most leaky abstraction ever created and always suggest using honest GRE/IPsec instead, I know many people don't really have any choice because their partners or service providers are using it. In older StrongSWAN versions it used to just work.

Updating StrongSWAN to the latest version had an unforeseen and very unpleasant side effect: VTI tunnels stopped working. A workaround in form of "install_routes = no" in /etc/strongswan.d/charon.conf was discovered, but it has an equally bad side effect: site to site tunnels stop working when it's applied.

The root cause of the problem is that for VTI tunnels to work, their traffic selectors have to be set to 0.0.0.0/0 for traffic to match the tunnel, even though actual routing decision is made according to netfilter marks. Unless route insertion is disabled entirely, StrongSWAN thus mistakenly inserts a default route through the VTI peer address, which makes all traffic routed to nowhere.

This is a hard problem without a workaround that is easy and effective. It's an architectural problem in the new StrongSWAN, according to our investigation of its source code and its developer responses, there is simply no way to control route insertion per peer. One developer responded to it with "why, site to site and VTI tunnels are never used on the same machine anyway" — yeah, people are reporting bugs just out of curiosity.

While there is no clean solution within StrongSWAN, this definitely has been a blocker for the release candidate. Reimplementing route insertion with an up/down script proved to be a hard problem since there are lots of cases to handle and complete information about the intended SA may not always be available to scripts. Switching to another IKE implementation seems like an attractive option, but needs a serious evaluation of the alternatives, and a complete rewrite of the IPsec config scripts — which is planned, but will take a while because the legacy scripts is an unmaintainable mess.

I think I've found a workable (even if far from perfect workaround) — instead of inserting missing routes, delete the bad routes. I've made a test setup and it seems to work reasonably well. The obvious issue is that it doesn't prevent bad things from happening, but rather undoes the damage, so there may still be a brief traffic disruption when VTI tunnels go up. Another problem is a possible race condition between StrongSWAN inserting routes and the script deleting them, though I haven't seen it in practice yet and I hope it doesn't exist. But, at least you can now use both VTI and site to site tunnels on the same machine.

For people who want to use VTI exclusively, there is now "set vpn ipsec options disable-route-autoinstall" option that disables route insertion globally, thus removing the possible disruption, at cost of making site to site tunnels impossible to use. That option is disabled by default.

I hope it will be good enough until we find a better solution. Your testing is needed to confirm that it is!

When VyOS CLI isn't enough

Sometimes a particular configuration option is supported by the software that VyOS uses, but the CLI does not expose it. Since VyOS is open source, you can always fix that, but sometimes you need it by tomorrow, and there's simply no time to do it.

In a number of places, we've left an escape hatch for it that allows bypassing the CLI and including a raw snippet into the generated config. Of course, you give up the sanity checks of the CLI and take full responsibility for the correctness of the resulting config, but sometimes it's necessary.

OpenVPN

In openvpn, we have an option called "openvpn-option". You can pass any options to OpenVPN process with it, but note that in the current versions, it has to follow the command line rather than config file option, i.e. prepend it with "--". See this example:

set interfaces openvpn vtun10 openvpn-option "--connect-freq 10 60"

Note that the "push" option em is supported. I see OpenVPN configs with openvpn-option heavily overused once in a while — before including an option, make sure what you need to do is really not supported.

DHCP server

In the DHCP server, there is not one, but too escape hatches. One is the "subnet-parameters" option under "subnet". Another one is a "global-parameters" under "shared-network-name".

See an example:

set service dhcp-server shared-network-name LAN subnet 10.0.0.0/24 subnet-parameters "ping-timeout 5;"

Since dhcpd.conf syntax is more complex than just a list of options, it's important to make sure that generated config will be valid. It's easy to make your DHCP server stop loading and spend some time reading the log to see what is wrong, so be careful here.

Note that these options are not supported by the DHCPv6 server. Anyone thinks we should support it?

Dynamic DNS

In dynamic DNS, you can use the generic HTTP method if your provider and protocol is not supported.

set service dns dynamic interface eth0 use-web url http://dyndns.example.com/?update

Since no one can possibly support all providers, I believe it will remain a necessary option forever.

When all else fails: the postconfig script

If something is not supported and doesn't have a handy escape hatch, you still can implement it with the postconfig script. That script is found at /config/scripts/vyatta-postconfig-bootup.script and runs after config.boot loading is complete, so it's particularly conductive to manipulating things like raw iptables rules.

VyOS doesn't delete or overwrite anything in the global netfilter tables after boot, so it's safe to put your commands there, for example "/sbin/iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu" for global MSS clamping.

You still need to be careful not to conflict with any of the rules inserted by VyOS though, in general, so always make sure to check what exactly VyOS generates before using the postconfig script.

Looking forward

VyOS 1.2.0 brings improvements to the postconfig script execution and adds some more escape hatches — stay tuned for updates.