VyOS 1.2.0 development news in July

Despite the slow news season and the RAID incident that luckily slowed us down only for a couple of days, I think we've made good progress in July.

First, Kim Hagen got cloud-init to work, even though it didn't make it to the mainline image, and WAAgent required for Azure is not working yet. Some more work, and VyOS will get a much wider cloud platform support. He's also working on Wireguard integration and it's expected to be merged into current soon.

The new VRRP CLI and IPv6 support is another big change, but it's got its own blog post, so I won't stop there and cover things that did not get their own blog posts instead.

IPsec and VTI

While I regard VTI as the most leaky abstraction ever created and always suggest using honest GRE/IPsec instead, I know many people don't really have any choice because their partners or service providers are using it. In older StrongSWAN versions it used to just work.

Updating StrongSWAN to the latest version had an unforeseen and very unpleasant side effect: VTI tunnels stopped working. A workaround in form of "install_routes = no" in /etc/strongswan.d/charon.conf was discovered, but it has an equally bad side effect: site to site tunnels stop working when it's applied.

The root cause of the problem is that for VTI tunnels to work, their traffic selectors have to be set to 0.0.0.0/0 for traffic to match the tunnel, even though actual routing decision is made according to netfilter marks. Unless route insertion is disabled entirely, StrongSWAN thus mistakenly inserts a default route through the VTI peer address, which makes all traffic routed to nowhere.

This is a hard problem without a workaround that is easy and effective. It's an architectural problem in the new StrongSWAN, according to our investigation of its source code and its developer responses, there is simply no way to control route insertion per peer. One developer responded to it with "why, site to site and VTI tunnels are never used on the same machine anyway" — yeah, people are reporting bugs just out of curiosity.

While there is no clean solution within StrongSWAN, this definitely has been a blocker for the release candidate. Reimplementing route insertion with an up/down script proved to be a hard problem since there are lots of cases to handle and complete information about the intended SA may not always be available to scripts. Switching to another IKE implementation seems like an attractive option, but needs a serious evaluation of the alternatives, and a complete rewrite of the IPsec config scripts — which is planned, but will take a while because the legacy scripts is an unmaintainable mess.

I think I've found a workable (even if far from perfect workaround) — instead of inserting missing routes, delete the bad routes. I've made a test setup and it seems to work reasonably well. The obvious issue is that it doesn't prevent bad things from happening, but rather undoes the damage, so there may still be a brief traffic disruption when VTI tunnels go up. Another problem is a possible race condition between StrongSWAN inserting routes and the script deleting them, though I haven't seen it in practice yet and I hope it doesn't exist. But, at least you can now use both VTI and site to site tunnels on the same machine.

For people who want to use VTI exclusively, there is now "set vpn ipsec options disable-route-autoinstall" option that disables route insertion globally, thus removing the possible disruption, at cost of making site to site tunnels impossible to use. That option is disabled by default.

I hope it will be good enough until we find a better solution. Your testing is needed to confirm that it is!

Pre-config scripts

Apart from post-config scripts, that were always there, VyOS also supports pre-config scripts now, that are executed before the config.boot file is loaded. The pre-config script must be located in /config/scripts/vyos-preconfig-bootup.script

While it looks somewhat more exotic, there are use cases for it, for example copying a config from an external drive or network, or modifying the config. For example, if you are using VRRP transition scripts to modify the running config, you may write a script for the backup node that removes the options that are only supposed to be enabled on master, and no longer worry that they will remain enabled if you happen to save the config when that node is in master state.

Suppose you want your backup node to enable an IPsec tunnel when it becomes master. Then you can put something like this in your /config/scripts/vyos-preconfig-bootup.script:


#!/usr/bin/env python3

from vyos.configtree import ConfigTree

with open('/config/config.boot', 'r') as f:
    config_file = f.read()

config = ConfigTree(config_file)

if config.exists(['vpn', 'ipsec', 'site-to-site', 'peer', '192.0.2.10', 'tunnel', '1', 'disable']):
    config.delete(['vpn', 'ipsec', 'site-to-site', 'peer', '192.0.2.10', 'tunnel', '1', 'disable'])

with open(file_name, 'w') as f:
    f.write(config.to_string())

On that subject, for the post-config script, you can now use an alternavive, Vyatta-free name: /config/scripts/vyos-postconfig-bootup.script

If both old and new style script exist, the old one will be executed first.

DNSSEC support in the DNS forwarding service

Thanks to our new contributor who goes by mb300sd, the DNS forwarding service can configure DNSSEC options. The new command is "set service dns forwarding dnssec (off|process-no-validate|process|log-fail|validate)".

New Python/XML rewrites

The command definitions and scripts for dynamic DNS and for syslog have been rewritten by Christian Poessinger and hagbard respectively.

SNMP improvements

Christian Poessinger and Jules made a number of improvements in SNMP, including the new IPv6 community option (community6) and multiple bugfixes to improves the script robustness.

Planning to drop support for loading configs from very old Vyatta Core options

Kim Hagen and I discussed this issue lately. Right now we technically support loading configs from all Vyatta Core versions theoretically going down to the XORP-based Vyatta 1.0.
In practice, there are two problems with it: first, it relies on a rather clunky migration script system with lots of messy migration scripts, and second, since those versions are hardly seen in the wild now, no one put any real effort into testing if it still works or not.
The third problem is that Vyatta Core 6.5 is itself partially incompatible with older versions and requires "modify" firewalls to be manually rewritten as "policy route" settings, and there is no simple solution for this problem, not mentioning the fact that "policy route" syntax was a bad idea in the hindsight since it deprived users of useful options of modify firewalls without offering an alternative, so it will have to be redesigned anyway — with migration scripts to support previous VyOS and latest Vyatta Core of course!

If we drop support for loading configs from Vyatta Core versions older than 6.5, we can get rid of the old migration system entirely and make a new one without having to care about "bug compatibility". Users of the old Vyatta Core will still have the option to upgrade through VyOS 1.1.8, and since they have neglected updating for at least five years, I guess it's not a big price to pay.