VyOS 1.2.0 development news in June

If we haven't written any new blog posts in a while, that's because we've all been busy with development. Here is what happened since the last status update.

RADIUS authentication and authorization

We've had nominal support for RADIUS authentication for system login for a long time, but it was essentially useless because it required that the user account to already exist in VyOS to actually work, it was just a password checking method.

Kim Hagen found a way to implement real support for it and it appears to work fine (but of course needs testing). Apart from authentication, his implementation also supports authorization through the priv-lvl attribute. Privilege level 15 is mapped to admin users, anything below it is operator. This seems to be the most common way to do that, after Cisco.

Correct naming of PPTP and L2TP interfaces works again

The idea to switch to the upstream PPPD turned out to be premature — while it does have support for pre-up scripts, it still makes an assumption that the interface name is not changed by pre-up script, so remote access VPN interfaces were named pppX, which broke the "run show vpn remote-access" commands, and would break zone-based firewalls of people who rely on that naming.

We've re-applied the old patches for it to the latest upstream PPPD and opened pull requests for them, so if they are merged, we can finally stop building our own, and if not, we have clean patches that should be easy to apply to future releases.

Writing migration scripts (and manipulating VyOS config files outside VyOS) just got easier

Long story short

VyOS 1.2.0-rolling (starting with the next nightly build) includes a library for parsing and manipulating config files without loading them into the system config. It can be used for automatically converting configs from old versions in case an incompatible change was made, and for standalone utilities. Motivation and history are discussed below.

Here is an example of interacting with the new library:
>>> from vyos import configtree

>>> c = configtree.ConfigTree("system { host-name vyos \n } interfaces { dummy dum0 { address 192.0.2.1/24 \n address 192.0.2.20/24 \n disable \n } }  /* version: 1.2.0 */")

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.1/24
        address 192.0.2.20/24
        disable { }
    }
}

 /* version: 1.2.0 */

>>> c.set(['interfaces', 'dummy', 'dum0', 'address'], value='293.0.113.3/32', replace=False)

>>> c.delete_value(['interfaces', 'dummy', 'dum0', 'address'], '192.0.2.1/24')

>>> c.delete(['interfaces', 'dummy', 'dum0', 'disable'])

>>> c.is_tag(['interfaces', 'dummy'])
True

>>> c.exists(['interfaces', 'dummy', 'dum0', 'disable'])
False

>>> c.list_nodes(['interfaces', 'dummy'])
['dum0']

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.20/24
        address 293.0.113.3/32
    }
}

 /* version: 1.2.0 */

As you can see, it largely mimics the API you get for the running config. The only notable differences are that the "set" method requires that you specify the path and the value separately, and to have nodes formatted as tag nodes (i.e. "ethernet eth0 { ..." as opposed to "ethernet { eth0 { ..." you need to mark them as such with "set_tag", unless they were originally formatted that way in the config you parsed.

Things the new style configuration mode definitions intentionally do not support

I've made three important changes to the design of the configuration command definitions, and later I realized that I never wrote down a complete explanation of the changes and the motivation behind them.

So, let's make it clear: these changes are intentional and they shouldn't be reintroduced. Here's the details:

The "type" option

In the old definitions, you can see the "type:" option in the node.def files very often. In the new style XML definitions, there's no equivalent of it, and the type is always set to "txt" in autogenerated node.def's for tag and leaf nodes (which means "anything" to the configuration backend).

I always felt that the "type" option suffers from two problems: scope creep and redundancy.

The scope creep is in the fact that "type" was used for both value validation and generating completion help in "val_help:" option. Also, the "u32" type (32-bit unsigned integer) has a little known undocumented feature: it could be used for range validation in form of "type: u32:$lower-$upper" (e.g. u32:0-65535). It has never been used consistently even by the original Vyatta Core authors, plenty of node.def's use additional validation statements instead.

Now to the redundancy: there are two parallel mechanisms for validations in the old style definitions. Or three, depending on the way you count them. There are "syntax:expression:" statements that are used for validating values at set time, and "commit:expression:" that are checked at commit time.

My feeling from working with the system for scary amount of time was that the "type" option alone is almost never suffucient, and thus useless, since actual, detailed validation is almost always done elsewhere, in those "syntax/commit:expression:" statements or in the configuration scripts. Sometimes a "commit:expression:" is used where "syntax:expression:" would be more appropriate (i.e. validation is delayed) but let's focus on set-time validation only.

But without data to back it up, a feeling is just a feeling, so I made up a quick and dirty script to do some analysis. You can repeat what I've done easily with "find /opt/vyatta/share/vyatta-cfg/templates/ -type f -maxdepth 100 -name 'node.def' | nodecheck.py".

On VyOS 1.1.8 (which doesn't include any rewritten code) the output is:
Has type: 2737
Has type txt: 1387
Has type other than txt: 1350
Has commit or syntax expression: 1700
Has commit or syntax expression and type txt: 740
Has commit or syntax expression and type other than txt: 960

While irrelevant to the problem on hand, the total count of node.def's is 4293). In other words, of all nodes that have the type option, 50% have it set to "txt". Some of them are genuinely "anything goes" nodes such as "description" options, but most use it as a placeholder.

68% of all nodes that have a type are also using either "syntax:expression:" or "commit:expression:". Of all nodes that have a type more specific than "txt", 73% have additional validation. This means that even for supposedly specific types, type alone is enough only in 23% cases. This raises the question whether we need types at all.

Sure, we could introduce more types and add support for something of a sum type, but is it worth the trouble if validation can be easily delegated to external scripts? Besides, right now types are built in the config backend, which means adding a new one requires modifying it starting from the node.def parser.

In the new style definitions, I felt like the only special case that is special enough is regular expression. This is how value constraints checked at set time are defined:

<leafNode name="foo">
  <properties>
    <constraint>
      <regex>(baz|baz)</regex>
      <validator>quux</validator>
    </constraint>
  </properties>
<leafNode>

Here the "validator" tag contains a reference to a script stored in /usr/libexec/vyos/validators/. Since adding a new validator is easy, there's no reason to hesitate to add new ones for common (and even not so common) cases. Note that "regex" option is automatically wrapped in ^$, so there's no need to do it by hand every time.

Default values

The old definitions used to support "default:" option for setting default values for nodes. It looks innocous on the surface, but things get complicated when you look deeper into its behaviour.

You may think a node either exists, or it does not. What is the value of a node that doesn't exist? Sounds rather like a Zen koan, but here's cheap enlightenment for you: it depends on whether it has a default value or not.

Thus, nodes effectively have three states: "doesn't exist", "exists", and "default". As you can already guess, it's hard to tell the latter two apart. It's also very hard to see if a node was deleted from a config or just reset to a default value. It also means that every node lookup cannot operate on the running config tree alone and has to consult the command definitions as well, which is very bad if you plan to use the same code for the CLI and for standalone config handling programs such as migration scripts.

Last time people tried to introduce rollback without reboot, the difficulties of handling the third virtual "default" state was one of the biggest problems, and it's still one of the reasons we don't have a real rollback. VyConf has no support for default values for this reason, so we should eliminate them as we rewrite the code.

Defaults should be handled by config scripts now. Sure, we lose "show -all" and the ability to view defaults, but the complications that come with it hardly make it worth the trouble. There are also many implicit defaults that come from underlying software options anyway.

Embedded shell scripts

That's just a big "no". Have you ever tried to debug code that is spread across multiple node.def's in nested directories and that cannot be executed separately or stepped through?

While it's tempting to allow that for "trivial" scripts, the code tends to grow and things get ugly. Look the the implementation of PPPoE or tunnel interfaces in VyOS 1.1.8.

If it's more than one command, make it an external script, and you'll never regret the decision when it begins to grow.

New-style operational mode command definitions are here

We've had a convertor from the new style configuration command definitions in XML to the old style "templates" for a while in VyOS 1.2.0. As I predicted, a number of issues were only discovered and fixed as we have rewritten more old scripts, but by now they should be fully functional. However, until very recently, there was no equivalent of it for the operational mode commands. Now there is.

The new style

In case you've missed our earlier posts, here's a quick review. The configuration backend currently used by VyOS keeps command definitions in a very cumbesome format with a lot of nested directories where a directory represents a nesting level (e.g. "system options reboot-on-panic" is in "templates/system/options/reboot-on-panic"), a file with command properties must be present in every directory, and command definition files are allowed to have embedded shell scripts.

This makes command definitions very hard to read, and even harder to write. We cannot easily change that format until a new configuration backend is complete, but we can abstract it away, and that's what we did.

The new command definitions are in XML, and a RelaxNG schema is provided, so everyone can see its complete grammar and automatically check if their definitions are correctly written. Automated validation is already integrated in the build process.

Rewriting the command definitions goes along with rewriting the code they call. New style code goes to the vyos-1x package.

VyOS 1.2.0 status update

While VyOS 1.2.0 nightly builds have been fairly usable for a while already, there are still some things to be done because we can make a named release candidate from it. These are the things that have been done lately:

EC2 AMI scripts retargeting and clean up

The original AMI build scripts had been virtually unchanged since their original implementation in 2014, and by this time they've had ansible warnings at every other step, which prompted us to question everything they do, and we did. This resulted in a big spring cleanup of those scripts, and now they are way shorter, faster, and robust.

Other than the fact that they now work with VyOS 1.2.0 properly, one of the biggest improvements from the user point of view is that it's now easy to build an AMI with a custom config file simply by editing the file at playbooks/templates/config.boot.default.ec2

The primary motivation for it was to replace cumbersome in-place editing of the config.boot.default from the image with a single template, but in the end it's a win-win solution for both developers and users.

The original scripts were also notorious for their long execution time and fragility. What's worse is that when they failed (and it's usually "when" rather than "if"), they would leave behind a lot of gargabe they couldn't automatically clean up, since they used to create a temporary VPC complete with an internet gateway, subnet, and route table, all just for a single build instance. They also used a t2.medium instance that was clearly oversized for the task and could be expensive to leave running if clean up failed.

Now they create the build instance in the first available subnet of the default VPC, so even if they fail, you only need to delete a t2.micro instance, a key pair, and a security group.

It is no longer possible to build VyOS 1.1.x images with those scripts from the baseline code, but I've created a tag named 1.1.x from the last commit where it was possible, so you can do it if you want to — without these recent improvements of course.

Package upgrades and new drivers

We've upgraded StrongSWAN to 5.6.2, which hopefully will fix a few longstanding issues. Some enthusiastic testers are already testing it, but everyone is invited to test it as well.

SR-IOV is now basically a requirement for high performance virtualized networking, and it needs appropriate drivers. Recent nightly builds include a newer version of Intel's ixgbe and Mellanox OFED drivers, so the support for recent 10gig cards and SR-IOV in particular has improved.

A step towards using the master branch again

The "current" git branch we use throughout the project where everyone else uses "master" was never intended to be a permanent setup: it always was a workaround for the master branch in packages inherited from Vyatta Core being messed up beyond any repair. It will take quite some time to get rid of the "current" branch completely and we'll only be able to do it when we finally consolidate all the code under vyos-1x, but we've made jenkins builds correctly put the packages built from the "master" branch in our development repository, so we'll be able to do it for packages that do not include any legacy code at least.

IPv6 VRRP status

This is the most interesting part. IPv6 VRRP is perhaps a single most awaited feature. Originally it was blocked by lack of support for it in keepalived. Now keepalived supports it, but integrating it will need some backwards-incompatible changes.

Originally, keepalived allowed mixing IPv4 and IPv6 in the same group, but it no longer allows it (curiously, the protocol standard does allow IPv4 advertisments over IPv6 transport, but I can see why they may want to keep these separate). This means to take advantage of the improvements it made, we also have to disallow it, thus breaking the configs of people who attempted to use it. We've been thinking about keeping the old syntax while generating different configs from it, or automated migration, but it's not clear if automated migration is really feasible.

An incompatible syntax change is definitely needed because, for example, if we want to support setting hello source address or unicast VRRP peer address for both IPv4 and IPv6, we obviously need separate options.

Soon IPv6 addresses in IPv4 VRRP groups will be disallowed and syntax for IPv6-only VRRP groups will be added alongside the old vrrp-group syntax. If you have ideas for the new syntax, the possible automated migration, or generally how to make the transition smooth, please comment on the relevant task.

PowerDNS recursor instead of dnsmasq

The old dnsmasq (which I, frankly, always viewed as something of a spork, with its limited DHCP server functionality built into what's mainly a caching DNS resolver), has been replaced with PowerDNS recursor, a much cleaner implementation.

On security of GRE/IPsec scenarios

As we've already discussed, there are many ways to setup GRE (or something else) over IPsec and they all have their advantages and disadvantages. Recently an issue was brought to my attention: which ones are safe against unencrypted GRE traffic being sent?

The reason this issue can appear at all is that GRE and IPsec are related to each other more like routing and NAT: in some setups their configuration has to be carefully coordinated, but in general they can easily be used without each other. Lack of tight coupling between features allows greater flexibility, but it may also create situations when the setup stops working as intended without a clear indication as to why it happened.

Let's review the knowingly safe scenarios:

VTI

This one is least flexible, but also foolproof by design: the VTI interface (which is secretly simply IPIP) is brought up only when an IPsec tunnel associated with it is up, and goes down when the tunnel goes down. No traffic will ever be sent over a VTI interface until IKE succeeds.

Tunnel sourced from a loopback address

If you have missed it, the basic idea of this setup is the following:

set interfaces dummy dum0 address 192.168.1.100/32

set interfaces tunnel tun0 local-ip 192.168.1.100/32
set interfaces tunnel tun0 remote-ip 192.168.1.101/32 # assigned to dum0 on the remote side

set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 local prefix 192.168.1.100/32
set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 remote prefix 192.168.1.101/32

Most often it's used when the routers are behind NAT, or one side lacks a static address, which makes selecting traffic for encryptions by protocol alone impossible. However, it also introduces tight coupling between IPsec and GRE: since the remote end of the GRE tunnel can only be reached via an IPsec tunnel, no communication between the routers over GRE is possible unless the IPsec tunnel is up. If you fear that any packets may be sent via the default route, you can nullroute the IPsec tunnel network to be sure.

The complicated case

Now let's examine the simplest kind of setup:

set interfaces tunnel tun0 local-ip 192.0.2.100 # WAN address
set interfaces tunnel tun0 remote-ip 203.0.113.200

set vpn ipsec site-to-site peer 203.0.113.200 tunnel 1 protocol gre

In this case IPsec is setup to encrypt the GRE traffic to 203.0.113.200, but the GRE tunnel itself can work without IPsec. In fact, it will work without IPsec, just without encryption, and that is the concern for some people. If the IPsec tunnel goes down due to misconfiguration, it will fall back to the common, unencrypted GRE.

What can you do about it?

As a user, if your requirement is to prevent unencrypted traffic from ever being sent, you should use VTI or use loopback addresses for tunnel endpoints.

For developers this question is more complicated.

What should be done about it?

The opinions are divided. I'll summarize the arguments here.

Arguments for fixing it:

  • Cisco does it that way (attempts to detect that GRE and IPsec are related — at least in some implementations and at least when it's referenced as IPsec profile in the GRE tunnel)
  • The current behaviour is against user's intentions

Arguments against fixing it:

  • Attempts to guess user's intentions are doomed to fail at least some of the time (for example, what if a user intentionally brings an IPsec tunnel down to isolate GRE setup issues?)
  • The only way to guarantee that unencrypted traffic is never sent is checking for a live SA matching protocol and source before forwarding every packet — that's not good for performance).

Practical considerations:

  • Since IKE is in the userspace, the kernel can't even know that an SA is supposed to exist until IKE succeeds: automatic detection would be a big change that is unlikely to be accepted in the mainline kernel.
  • Configuration changes required to avoid the issue are simple
If you have any thoughts on the issue, please share with us!

When VyOS CLI isn't enough

Sometimes a particular configuration option is supported by the software that VyOS uses, but the CLI does not expose it. Since VyOS is open source, you can always fix that, but sometimes you need it by tomorrow, and there's simply no time to do it.

In a number of places, we've left an escape hatch for it that allows bypassing the CLI and including a raw snippet into the generated config. Of course, you give up the sanity checks of the CLI and take full responsibility for the correctness of the resulting config, but sometimes it's necessary.

OpenVPN

In openvpn, we have an option called "openvpn-option". You can pass any options to OpenVPN process with it, but note that in the current versions, it has to follow the command line rather than config file option, i.e. prepend it with "--". See this example:

set interfaces openvpn vtun10 openvpn-option "--connect-freq 10 60"

Note that the "push" option em is supported. I see OpenVPN configs with openvpn-option heavily overused once in a while — before including an option, make sure what you need to do is really not supported.

DHCP server

In the DHCP server, there is not one, but too escape hatches. One is the "subnet-parameters" option under "subnet". Another one is a "global-parameters" under "shared-network-name".

See an example:

set service dhcp-server shared-network-name LAN subnet 10.0.0.0/24 subnet-parameters "ping-timeout 5;"

Since dhcpd.conf syntax is more complex than just a list of options, it's important to make sure that generated config will be valid. It's easy to make your DHCP server stop loading and spend some time reading the log to see what is wrong, so be careful here.

Note that these options are not supported by the DHCPv6 server. Anyone thinks we should support it?

Dynamic DNS

In dynamic DNS, you can use the generic HTTP method if your provider and protocol is not supported.

set service dns dynamic interface eth0 use-web url http://dyndns.example.com/?update

Since no one can possibly support all providers, I believe it will remain a necessary option forever.

When all else fails: the postconfig script

If something is not supported and doesn't have a handy escape hatch, you still can implement it with the postconfig script. That script is found at /config/scripts/vyatta-postconfig-bootup.script and runs after config.boot loading is complete, so it's particularly conductive to manipulating things like raw iptables rules.

VyOS doesn't delete or overwrite anything in the global netfilter tables after boot, so it's safe to put your commands there, for example "/sbin/iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu" for global MSS clamping.

You still need to be careful not to conflict with any of the rules inserted by VyOS though, in general, so always make sure to check what exactly VyOS generates before using the postconfig script.

Looking forward

VyOS 1.2.0 brings improvements to the postconfig script execution and adds some more escape hatches — stay tuned for updates.


DNS forwarding in VyOS

A lot of small networks do not have their own DNS server, but it's not always desirable to just leave hosts to use an external third-party server either, that's why we've had DNS forwarding in VyOS for a long time and are going to keep it there for the foreseeable future.

Experienced VyOS users already know all about it, but we should post something for newcomers too, shouldn't we?

Configuring DNS forwarding is very simple. Assuming you have "system name-server" set, all you need to do to simply forward requests from hosts behind eth0 to it is "set service dns forwarding listen-on eth0". Repeat for every interfaces where you have clients and you are done.

There are some knobs for telling the service to use or not use specific DNS servers though:

set service dns forwarding listen-on eth0

# Use name servers from "system name-server"
set service dns forwarding system

# Use servers received from DHCP on eth1 (typically an ISP interface)
set service dns forwarding dhcp eth1

# Use a hardcoded name server
set service dns forwarding name-server 192.0.2.10

You can also specify cache size:

set service dns forwarding cache-size 1000

One of the less known features is the option to use different name servers for different domains. It can be used for a quick and dirty split-horizon DNS, or simply for using an internal server just for internal domains rather than recursive queries:

set service dns forwarding domain mycompany.local server 192.168.52.100
set service dns forwarding domain mycompany.example.com server 192.168.52.100

And that's all to it. DNS forwarding is not a big feature — useful doesn't always equal complex.

Loopback and the dummies

"There is no place like 127.0.0.1" the old saying goes. While the loopback interface is most often seen as the interface where the 127.0.0.1 address is assigned by default and where the 127.0.0.0/8 network is routed, and just a way for programs on the same host to communicate over the network without actual network, it has uses in networked context as well.

Before we talk about those use cases, we need to discuss interfaces themselves. In some OSes, such as Cisco IOS, and many BSD derivatives, it is possible to create multiple loopbacks. Linux kernel (and thus VyOS) historically allowed only one loopback (named "lo"), and this behaviour has become too traditional and relied upon to change overnight, so to implement multiple loopback, a new interface type called "dummy" was added. Dummy interfaces are functionally identical to loopbacks so the difference is mostly aesthetic.

This is how to setup a dummy interface: "set interfaces dummy dum0 address ...". If your problem does not require independent interfaces, you can also just add another address to the loopback.

So, why would one want to use a loopback/dummy interface instead of assigning another address to a physical NIC?

Case 1: tunnel endpoints

We have already talked about GRE/IPsec behind NAT and/or with dynamic addresses. Since GRE requires fixed local and remote endpoint  addresses to work, and in a setup where dynamic addresses or NAT is involved you do not have fixed addresses, the trick is to use a pair of addresses made up specially for this purpose as GRE endpoints.

Case 2: management addresses

Suppose you have a router A with two NICs, connected to networks B and C that are connected to each other, so that if any of the links fails, the network as a whole is still operational. However, if you choose either NIC A or NIC B address as a management address, it may become inaccessible if one of the NICs fail, forcing you to manually fall back to the other address.

To prevent this situations, people often assign a dedicated management address to a loopback, create a DNS record for it, and advertise that address to all other routers so that as long as there is at least one path to that router that works, they do not need to worry about addresses of physical NICs to SSH to the router, and are free to change those addresses without having to update the DNS or memorize the new address.

Case 3: iBGP peer addresses

Since iBGP uses the same autonomous system number for all routers, it loses the ability to use AS path for path selection and loop detection. This means to keep the network loop-free, one has to setup it as a full mesh, or use a route reflector.

If we use addresses of physical NICs for session endpoints, we run into the same problem as in the previous use case: a session goes down with the link even if there are other valid paths. A possible solution is to select dedicated addresses for iBGP sessions, assign them to loopbacks, and advertise them to all other routers through a link-state protocol such as OSPF.

Your use case?

If you know other cases when a network setup can be improved by using loopback or dummy interfaces, let us know!

Naming of the nightly builds

Historically, we used to use "999.$timestamp" version numbers for development builds, including nightly builds. In our build scripts termninology, a development build is any build that is started without doing "./configure --build-type=release --version=1.2.0" or similar (before the build script rewrite that was "./configure --with-release-build" and you also needed to put a version string in livecd/version or somethinf like that). In short, most builds in existence had that nondescript 999 version. That's how it was before the fork and we just didn't change that.

However, that approach is rather problematic. The 999 version doesn't tell anything about the branch it's built from or the nearest release, so one can only guess from the timestamp what it might be, and even that is not reliable. With introduction of a rolling release that will exist alongside the stable releases, this gets even more problematic, so something needs to be done about it.

We decided to change the format to "$release-rolling+$timestamp", like "1.2.0-rolling+201804060100". I have some hesitations about the "+", so if people think it should rather be "-", we can change it.

If you visit https://downloads.vyos.io/?dir=rolling/current/amd64 , you can see the new naming scheme in action. Let us know if you experience any problems with it!