blog.vyos.net

Loopback and the dummies

"There is no place like 127.0.0.1" the old saying goes. While the loopback interface is most often seen as the interface where the 127.0.0.1 address is assigned by default and where the 127.0.0.0/8 network is routed, and just a way for programs on the same host to communicate over the network without actual network, it has uses in networked context as well.

Before we talk about those use cases, we need to discuss interfaces themselves. In some OSes, such as Cisco IOS, and many BSD derivatives, it is possible to create multiple loopbacks. Linux kernel (and thus VyOS) historically allowed only one loopback (named "lo"), and this behaviour has become too traditional and relied upon to change overnight, so to implement multiple loopback, a new interface type called "dummy" was added. Dummy interfaces are functionally identical to loopbacks so the difference is mostly aesthetic.

This is how to setup a dummy interface: "set interfaces dummy dum0 address ...". If your problem does not require independent interfaces, you can also just add another address to the loopback.

So, why would one want to use a loopback/dummy interface instead of assigning another address to a physical NIC?

Case 1: tunnel endpoints

We have already talked about GRE/IPsec behind NAT and/or with dynamic addresses. Since GRE requires fixed local and remote endpoint addresses to work, and in a setup where dynamic addresses or NAT is involved you do not have fixed addresses, the trick is to use a pair of addresses made up specially for this purpose as GRE endpoints.

Case 2: management addresses

Suppose you have a router A with two NICs, connected to networks B and C that are connected to each other, so that if any of the links fails, the network as a whole is still operational. However, if you choose either NIC A or NIC B address as a management address, it may become inaccessible if one of the NICs fail, forcing you to manually fall back to the other address.

To prevent this situations, people often assign a dedicated management address to a loopback, create a DNS record for it, and advertise that address to all other routers so that as long as there is at least one path to that router that works, they do not need to worry about addresses of physical NICs to SSH to the router, and are free to change those addresses without having to update the DNS or memorize the new address.

Case 3: iBGP peer addresses

Since iBGP uses the same autonomous system number for all routers, it loses the ability to use AS path for path selection and loop detection. This means to keep the network loop-free, one has to setup it as a full mesh, or use a route reflector.

If we use addresses of physical NICs for session endpoints, we run into the same problem as in the previous use case: a session goes down with the link even if there are other valid paths. A possible solution is to select dedicated addresses for iBGP sessions, assign them to loopbacks, and advertise them to all other routers through a link-state protocol such as OSPF.

Your use case?

If you know other cases when a network setup can be improved by using loopback or dummy interfaces, let us know!

Naming of the nightly builds

Historically, we used to use "999.$timestamp" version numbers for development builds, including nightly builds. In our build scripts termninology, a development build is any build that is started without doing "./configure --build-type=release --version=1.2.0" or similar (before the build script rewrite that was "./configure --with-release-build" and you also needed to put a version string in livecd/version or somethinf like that). In short, most builds in existence had that nondescript 999 version. That's how it was before the fork and we just didn't change that.

However, that approach is rather problematic. The 999 version doesn't tell anything about the branch it's built from or the nearest release, so one can only guess from the timestamp what it might be, and even that is not reliable. With introduction of a rolling release that will exist alongside the stable releases, this gets even more problematic, so something needs to be done about it.

We decided to change the format to "$release-rolling+$timestamp", like "1.2.0-rolling+201804060100". I have some hesitations about the "+", so if people think it should rather be "-", we can change it.

If you visit https://downloads.vyos.io/?dir=rolling/current/amd64 , you can see the new naming scheme in action. Let us know if you experience any problems with it!

VyOS to leverage the blockchain technology

Wild success of the blockchain technology on the mass market in the past couple of years is truly remarkable and it allowed many startups to raise substantial capitals. To keep up with those developments, we've been working on vyCoin, a cryptocurrency that will revolutionize the way people think about their network peering.

People will be able to mine vyCoin by routing other people's traffic and pay for traffic routing with it. Through smart contracts, it will be possible to charge vyCoins for routing other people's traffic with it as well. The use cases are endless, from facilitating peering arrangements at Internet exchanges to allowing people to earn something by participating in distributed private networks and ad hoc wireless.

To speed up the adoption, 50% of vyCoins come pre-mined and will be available to early adopters via an ICO. The vyCoin client will be integrated in the VyOS release images along with a kernel module for packet tracking so it will automatically start mining when you route traffic of other networks. The client will be sending 50% of the mined coins back to us as a service fee.

The ICO date will be announced later, meanwhile you can watch the vyCoin introduction video on Youtube.

Take a third option: site to site OpenVPN

I've written a long series of post about setting up IPsec VPNs between NATed machines. As you've already seen, with some creative configuration it's possible, but is it always worth the sacrifice? Sometimes performance requirements, or lack of support for anything else on the other side make it necessary, but if the other side is also a VyOS, or another open source system, there's an alternative.

While OpenVPN is usually associated with road warrior VPN setups, it is not limited to it. It does have a site to site option and it's very quick and easy to setup. For some strange reason, that option is neglected by just about everyone who otherwise supports OpenVPN: in OpenWRT it's possible to setup through custom config options, while in Mikrotik RouterOS it's not possible to setup at all. In VyOS we have an explicit option for it. OPNsense also supports site to site OpenVPN out of the box (I thought it doesn't, but the authors corrected me).

The advantages are that it takes very few commands to get a tunnel to work, and that it will work in any network where you can forward a single port, even is both sides are behind double NAT. The downside is performance: squeezing even 100mbit/s of encrypted traffic out of it can be impossible, typical iperf figures are 10-20 mbit/s. For many use cases that performance is more than enough, though if you plan to use the tunnel for storage replication or another high-traffic job, that option is definitely not for you and you'll have to resort to IPsec.

Ongoing improvements in project

Hi everyone,

While developers are working on the 1.2.0 release candidate, myself I’ve been looking into ways to improve the social and commercial aspects of project.

I’m happy to have established a positive relationship with a few companies, many of whom are themselves open source project developers and are ready to support other open source project.

The following open source tools will be added to our toolchain:

- SonarQube by SonarSource is a source code analyzer that will help us improve code quality of the 1.2.0 and subsequent releases.

It’s odd that we didn’t know about it before, but now that we know of it, I hope it will change the project for the better. It uses an open core model and SonarCloud SaaS offering is available with no cost for open sources projects.

- Discourse is a great discussion board software that will replace our old mybb forum. The old forum is now in read only mode to allow migration of the old topics to Discourse, and you give the new forum a try at https://forum.vyos.io. Discourse kindly offered us a discount for managed hosting via their open source program. The domain will be eventually changes to discourse.vyos.io (as per their discounted hosting plan requirements), but we’ll setup a redirect from both old domains when it’s ready.

On the commercial side, we are happy to announce that we now a part of two important Technology Alliances:

VMware Technology Alliance Program - Access Level will allow us to validate VyOS 1.2 OVA as a VMware Ready solution, which is also the first step towards the vCloud NFV certification. We hope it will help VyOS gain trust of the enterprise users and increase its adoption in that market.

Nutanix Elevate Technology Alliance Partner Program opens doors to the Nutanix platform for us. One of our goals from the beginning has been to bring VyOS to as many virtual platforms as possible, and it already runs on VMware, Hyper-V, Xen, and KVM, so Nutanix AHV support is a logical continuation of that effort, and we will also be listed on the Nutanix Marketplace to make it easy to deploy for Nutanix users.

I suppose these steps will help VyOS move forward and become a more successful project!

How to use AS path matching in your BGP policies

AS path is one of the most fundamental attributes of a (e)BGP advertisments. Its length is the first parameter in the best path selection algorithm (shorter is better), and it's also the sole mechanism of loop detection (if an AS is seen twice, there's a loop). However, despite the important role it plays behind the scenes, it's rather underutilized in routing policies.

A lot of time when prefix-list or specific route-map rule options such as next-hop can do, route filtering and modification based on AS path can do it better.

Let's see how to use it.

Firewall groups today and tomorrow

Substantial work has been done by Marian Tudosoiu to bring IPv6 firewall groups to the current implementation of firewall configuration scripts even before we give it a complete rewrite. It's already merged into the current branch and is expected to be included in the 1.2.0-rc1 release. Now it's probably a good time to make a post about using firewall groups for those who haven't used them yet.

Of course there's still a lot of work to be done, such as integrating groups into NAT, which likely does require a complete rewrite to be feasible.

The concept is simple enough: instead of creating multiple rules that only differ in one address or port number, you create a group with all those addresses and ports, and reference it in a rule.

VyOS has three group types: address groups, network groups, and port groups. In 1.1.8 they can only be used with IPv4 firewall rulesets, including "policy route" rules.

Let's create some groups:

set firewall group port-group ManagementPorts port 22
set firewall group port-group ManagementPorts port 23
set firewall group port-group ManagementPorts port 443

set firewall group address-group Servers address 10.10.0.10
set firewall group address-group Servers address 10.10.0.15
set firewall group address-group Servers address 10.10.0.20

set firewall group network-group TrustedNets network 192.168.5.0/24
set firewall group network-group TrustedNets network 172.18.19.128/25
set firewall group network-group TrustedNets network 10.20.30.144/32

Now we can create a ruleset that uses them. Let's make a rule that references nothing but groups:

set firewall name DMZ-In rule 10 action accept
set firewall name DMZ-In rule 10 protocol tcp
set firewall name DMZ-In rule 10 source group network-group TrustedNets
set firewall name DMZ-In rule 10 destination group port-group ManagementPorts
set firewall name DMZ-In rule 10 destination group address-group Servers

An important part is that you can modify groups on the fly without updating any rules.

As you can see, groups is a simple concept that can be learnt in minutes. Once they are in IPv6 and NAT, their use will be very similar.

The night of living dead protocols: RIPv2

RIP's name seems to have anticipated its ultimate fate. It used to stand for Routing Information Protocol before newer and better protocols killed it. Still, most routers in the world still support it even though few people seriously consider using it, thus making it an undead protocol.

This is mainly for compatibility reasons. After all, if you have an old box at a remote location, connected to a 128kbps ISDN line or worse, that is working fine and is impractical to replace, but supports nothing but RIP, what can you do? Likewise, some modern small routers by Netgear or D-link only have RIP, so if you can't just replace it, there's no other choice.

Besides, RIP remains a valuable teaching tool since it's conceptually simple, and understanding its limitations can help one understand the new routing protocols and their strengths.

What's bad about RIP?

There are some very good reasons to choose something else than RIP if you can. It's one of the oldest protocols in existence, and it was designed at the time when neither modern route selection techniques existed, nor the size of networks was big enough to warrant those techniques. When RIP reached its scalability limits, it was impossible to retrofit those techniques into it, so it was replaced rather than upgraded. RIPv2, the latest version, replaced broadcast advertisments with multicast and added support for CIDR, but that's about it — the fundamental design problems are, well, fundamental.

The biggest issue is that in RIP, routers are only aware of themselves and their immediate peers. They are completely blind to the rest of the network. Link-state and path-vector protocols such as OSPF and eBGP are aware of the full topology (concrete or abstract), and can either reduce the full network graph to loop-free tree or immediately detect route advertisments as loop-inducing respectively.

All information that RIP advertisments include is the network, next hop, and an integer metric value. No router has any idea how its peers are connected to one another, so there is no way to detect loops before they form.

RIP includes a number of solutions to this problem, and they all have a limiting effect on its scalability.

Split horizon

It's simple — do not advertise routes received from peers back to them. It prevents the trivial loop when peers are trying to route traffic to networks they learnt about from someone else through you, but it cannot prevent wider loops. At least it has no effect on convergence time and doesn't create scalability limits either.

Counting to infinity

Before the other mechanisms were developed, this one was the only measure for detecting unreachable or looping routes. To make sure a route that is not actually reachable will be eventually recognized as such, it was decided to choose a maximum value that represents "infinite" (unreachable) metric. Since this process already can take quite some time, the value had to be small. In RIP, the infinite metric was set to 16. This means if a network has paths longer than 15 hops, the protocol just stops working.

Reverse poisoning

Even with 16 as infinite metric, the process of counting to infinity can be slow. The next idea was to not just wait for unreachable routes to become known as unreachable naturally, but actively advertise them to your peers as unreachable if a peer that was advertising them goes down.

This still is not a complete solution because if a router first receives an unreachability advertisment from a router who's aware of the true situation, but later receives an update from a router that is still not aware of it, it will start using the second false advertisment.

Hold timer

The ultimate solution is to ignore any advertisment for a network whose metric has recently increased for some time, to avoid receiving updates from routers who do not know the real situation yet. This time shouldn't be too small, it should be similar to the time it takes for updates to propagate through the entire network. A common default value is 180 seconds. That is, to prevent convergence issues completely, it may take a network of just a few routers three minutes to converge.

Configuration

So, suppose you are aware of all issues, but want or need to configure RIP nonetheless. It's pretty simple. First, enable RIP on all interfaces where you want to send and receive advertisments:

set protocols rip interface eth0
set protocols rip interface eth1

Networks that are configured on those interfaces will become a part of the RIP table automatically. If you want to advertise networks that are on other interfaces, you need to add them explicitly:

set protocols rip network 10.74.74.0/24

You can also use "redistribute" and "default-information originate" commands just like in all other routing protocols.

If everything is right, you will see something like this on the neighbor router:

vyos@vyos# run show ip rip 
Codes: R - RIP, C - connected, S - Static, O - OSPF, B - BGP
Sub-codes:
      (n) - normal, (s) - static, (d) - default, (r) - redistribute,
      (i) - interface

     Network            Next Hop         Metric From            Tag Time
R(n) 10.74.74.0/24      10.217.32.132         2 10.217.32.132     0 02:55
C(i) 10.217.32.0/24     0.0.0.0               1 self              0

If you remove the interface or the network statement, you will see the unreachable metric 16 for the duration of the hold timer, and only when the timer expires it will disappear from your table complely:

vyos@vyos# run show ip rip 
Codes: R - RIP, C - connected, S - Static, O - OSPF, B - BGP
Sub-codes:
      (n) - normal, (s) - static, (d) - default, (r) - redistribute,
      (i) - interface

     Network            Next Hop         Metric From            Tag Time
R(n) 10.74.74.0/24      10.217.32.132        16 10.217.32.132     0 01:57

Using the "policy route" and packet marking for custom QoS matches

There is only that much you can do in a QoS rules to describe the traffic you want it to match. There's DCP, source/destination, and protocol, and that's enough to cover most of the use cases. Most, but not all. Fortunately, they can also match packet marks and that's what enables creating custom matches.

Packet marks are numeric values set by Netfilter rules that are local to the router and can be used as match criteria in other Netfilter rules and many other components of the Linux kernel (ip, tc, and so on).

Suppose you have a few phones in the office and you want to prioritize their VoIP traffic. You could create a QoS match for each of them, but it's quite some config duplication, which will only get worse when you add more phones. If you find a way to group those addresses in one match, wouldn't it be nice? Sadly, there's no such option in QoS. Firewall can use address groups though, so we can make the QoS rule match a packet mark (e.g. 100) and set that mark to traffic from the phones.

# show traffic-policy 
 priority-queue VoIP {
     class 7 {
         match SIP {
             mark 100
         }
         queue-type drop-tail
     }
     default {
         queue-type fair-queue
     }
 }

Now the confusing bit. Where do we set the mark? Around Vyatta 6.5, an unfortunate design decision was made: "firewall modify" was moved under overly narrow and not so obvious "policy route". Sadly we are stuck with it for the time being because it's not so easy to automatically convert the syntax for upgrades. But, its odd name notwithstanding, it still does the job.

Let's create an address group and a "policy route" instance that sets the mark 100:

# show firewall group 
 address-group Phones {
     address 10.4.5.10
     address 10.4.5.11
     address 10.4.5.12
 }
[edit]
# show policy route 
 route VoIP {
     rule 10 {
         set {
             mark 100
         }
         source {
             group {
                 address-group Phones
             }
         }
     }
 }

Now we need to assign the QoS ruleset to our WAN and the "policy route" instance to our LAN interface:

set interfaces ethernet eth0 policy route VoIP
set interfaces ethernet eth1 traffic-policy out VoIP

You can as well take advantage of "policy route" ruleset options for time-based filtering or matching related connections. Besides, you can use it to set DSCP values in case your QoS setup is on a different router:

set policy route Foo rule 10 set dscp 46

Writing the new-style command definitions

Earlier I said new features in Perl code and old style templates will not be merged anymore starting from May the 1st (if you have any such features already working and testing, you still have a chance to get them in, so hurry up!).

Now it's time to write step by step guides to using the new style and we'll start with command definitions.

History and motivation

Old-style command definitions (aka "templates") have quite a lot of design issues and proved to be one of the worst deterrents for new contributors (right after Perl code).

If you are not familiar with them, I'll remind you how they work. Suppose we need to create a command for new interface type "silly" (that's like dummy... but also silly). Suppose we start with address option, "set interfaces silly silly0 address 192.0.2.1/24". What we'd need to do:

Create directory structure interfaces/silly/node.tag/address
Put a node.def file under interfaces/, silly/, and address/, but not under node.tag (otherwise directory will not be recognized as a command definition)
Write a bunch of "tags" such as "help: Silly interface name" in the node.def's

There is a whole lot of problems with this approach:

To get the complete picture of commands of a component, you need to read a lot of files in multiple deeply nested directories
Every such file can contain embedded shell scripts, which means the logic rather than just data can be scattered across dozens files
You cannot check whether your node.def's are even syntactically correct without loading them into VyOS and trying them by hand

Some of these problems such as fragility of the data syntax could possibly be fixed. The problems with data and logic scattering, however, are fundamental, and cannot be cured without changing the approach.

A lot of design and development work went into the configuration mode commands definitions for vyconf and thus VyOS 2.0. However, vyconf is not and will never be a drop-in replacement for the old configuration backend (because it means it would have to reimplement the old unfortunate design decisions to be compatible, which defeats the purpose). And, since the plan is to rewrite VyOS 1.x.x gradually in the new style to have an operational system at all times and be able to reuse the code in 2.0 with minimal changes, we need a way to use new style command definitions alongside the old ones. As a compromise, we've made a convertor from new style definitions to the old style.

To learn how to use the new style and how much better it is, read on...