Writing the new-style command definitions

Earlier I said new features in Perl code and old style templates will not be merged anymore starting from May the 1st (if you have any such features already working and testing, you still have a chance to get them in, so hurry up!).

Now it's time to write step by step guides to using the new style and we'll start with command definitions.

History and motivation

Old-style command definitions (aka "templates") have quite a lot of design issues and proved to be one of the worst deterrents for new contributors (right after Perl code).

If you are not familiar with them, I'll remind you how they work. Suppose we need to create a command for new interface type "silly" (that's like dummy... but also silly). Suppose we start with address option, "set interfaces silly silly0 address 192.0.2.1/24". What we'd need to do:

  • Create directory structure interfaces/silly/node.tag/address
  • Put a node.def file under interfaces/, silly/, and address/, but not under node.tag (otherwise directory will not be recognized as a command definition)
  • Write a bunch of "tags" such as "help: Silly interface name" in the node.def's

There is a whole lot of problems with this approach:

  • To get the complete picture of commands of a component, you need to read a lot of files in multiple deeply nested directories
  • Every such file can contain embedded shell scripts, which means the logic rather than just data can be scattered across dozens files
  • You cannot check whether your node.def's are even syntactically correct without loading them into VyOS and trying them by hand

Some of these problems such as fragility of the data syntax could possibly be fixed. The problems with data and logic scattering, however, are fundamental, and cannot be cured without changing the approach.

A lot of design and development work went into the configuration mode commands definitions for vyconf and thus VyOS 2.0. However, vyconf is not and will never be a drop-in replacement for the old configuration backend (because it means it would have to reimplement the old unfortunate design decisions to be compatible, which defeats the purpose). And, since the plan is to rewrite VyOS 1.x.x gradually in the new style to have an operational system at all times and be able to reuse the code in 2.0 with minimal changes, we need a way to use new style command definitions alongside the old ones. As a compromise, we've made a convertor from new style definitions to the old style.

To learn how to use the new style and how much better it is, read on...

VyOS builds now use the deb.debian.net load balanced mirror

If there are any good things about that packages server migration and restructuring is that it promoted a revamp of the associated part of the build scripts.

Since the start the default Debian mirror was set to nl.debian.org for a completely arbitrary reason. This of course was suboptimal for most users who are far from the Netherlands, and while the mirror is easy enough to change in ./configure options, a better out of the box experience wouldn't harm.

Danny ter Haar (fromport) suggested that we change it to deb.debian.org which is load balanced, which I think is a good idea. There's a small chance that it will redirect you to a dead mirror, but if you run into any issues, you can always set it by hand.

No new features in Perl and shell and no old style templates since May 2018

Now that the Python library for accessing the running config and the generator of old style templates from new style XML command definitions are known to be functional, it's time to set a cutoff date for the old style code. We decided to arbitrarily set it to May 2018.

Since May the 1st, no new code written in the old style will be accepted. We will accept (and make ourselves) fixes to the old code when the bug severity is high enough to affect VyOS operation, but all new features get to be new style.

If you have some new feature already almost done, consider completing it until May. If you are planning a new feature, consider learning about the new style first.

If you are late to the party, please read these blog posts:

For those who are not, I'll reiterate the main points:

  • Contributors who know Perl and are still willing to write it are getting progressively harder to find and many people say they would contribute if it wasn't in Perl.
  • The reasons people abandoned Perl are more than compelling: lack of proper type checking, exception handling, and many other features of post-60's language designs do not help reliability and ease of maintenance at all.
  • Old style handwritten templates are very hard to maintain since both the data and often the logic is spread across dozens, if not hundreds of files in deeply nested directories. Separation of logic from data and a way to keep all command definitions of a feature in a single, observable file are required to make it maintainable.
  • With old style templates, there's no way to verify them without installing them on VyOS and trying them by hand. XML is the only common language that has ready to use tools for verification of its syntax and semantics: right now it's already integrated in our build system and a build with malformed command definitions will fail.

I'll write a tutorial about writing features in the new style shortly, stay tuned. Meanwhile you can look at the implementation of cron in 1.2.0:

  • https://github.com/vyos/vyos-1x/blob/current/src/conf-mode/vyos-update-crontab.py (conf mode script)
  • https://github.com/vyos/vyos-1x/blob/current/interface-definitions/cron.xml (command definitions)


VyOS 1.2.0 repository re-structuring

In preparation for the new 1.2.0 (jessie-based) beta release, we are re-populating the package repositories. The old repositories are now archived, you still can find then in the /legacy/repos directory on dev.packages.vyos.net

The purpose of this is two-fold. First, the old repo got quite messy, and Debian people (rightfully!) keep reminding us about it, but it would be difficult to do a gradual cleanup. Second, since the CI server has moved, and so did the build hosts, we need to test how well the new procedures are working. And, additionally, it should tell us if we are prepared to restore VyOS from its source should anything happen to the packages.vyos.net server or its contents.

For perhaps a couple of days, there will be no new nightly builds, and you will not be able to build ISOs yourself, unless you change the repo path in ./configure options by hand. Stay tuned.

VyOS 2.0 development digest #9: socket communication functionality, complete parser, and open tasks

Socket communication

A long-awaited (by me, anyway ;) milestone: VyConf is now capable of communicating with clients. This allows us to write a simple non-interactive client. Right now the only supported operaion is "status" (a keepalive of sorts), but the list will be growing.

I guess I should talk about the client before going into technical details of the protocol. The client will be way easier to use than what we have now. Two main problems with CLI tools from VyOS 1.x is that my_cli_bin (the command used by set/delete operations) requires a lot of environment setup, and that cli-shell-api is limited in scope. Part of the reason for this is that my_cli_bin is used in the interactive shell. Since the interactive shell of VyConf will be a standalone program rather than a bash completion hack, we are free to make the non-interactive client more idiomatic as a shell command, closer in user experience to git or s3cmd.

This is what it will look like:


SESSION=$(vycli setupSession)
vycli --session=$SESSION configure
vycli --session=$SESSION set "system host-name vyos"
vycli --session=$SESSION delete "system name-server 192.0.2.1"
vycli --session=$SESSION commit
vycli --session=$SESSION exists "service dhcp-server"
vycli --session=$SESSION commit returnValue "system host-name"
vycli --session=$SESSION --format=json show "interfaces ethernet"

As you can see, first, the top level words are subcommands, much like "git branch". Since the set of top level words is fixed anyway, this doesn't create new limitations. Second, the same client can execute both high level set/delete/commit operations and low level exists/returnValue/etc. methods. Third, the only thing it needs to operate is a session token (I'm thinking that unless it's passed in --session option, vycli should try to get it from an environment variable, but we'll see, let me know what you think about this issue). This way contributors will get an easy way to test the code even before interactive shell is complete; and when VyOS 2.0 is usable, shell scripts and people fond of working from bash rather than the domain-specific shell will have access to all system functions, without worrying about intricate environment variable setup.

The protocol

As I already said in the previous post, VyConf uses Protobuf for serialized messages. Protobuf doesn't define any framing, however, so we have to come up with something. Most popular options are delimiters and length headers. The issue with delimiters is that you have to make sure they do not appear in user input, or you risk losing a part of the message. Some programs choose to escape delimiters, other rely on unusual sequences, e.g. the backend of OPNSense uses three null bytes for it. Since Protobuf is a binary protocol, no sequence is unusual enough, so length headers look like the best option. VyConf uses 4 byte headers in network order, that are followed by a Protobuf message. This is easy enough to implement in any language, so it shouldn't be a problem when writing bindings for other languages.

The code

There is a single client library that can be used by all of the non-interactive client and the interactive shell. It will also serve as the OCaml bindings package for VyConf (Python and other languages wil need their own bindings, but with Protobuf, most of it can be autogenerated).

Parser improvements

Inactive and ephemeral nodes

The curly config parser is now complete. It supports the inactive and ephemeral properties. This is what a config with those will look like:

protocols {
  static {
    /* Inserted by a fail2ban-like script */
    #EPHEMERAL route 192.0.2.78/32 {
      blackhole;
    }
    /* DIsabled by admin */
    #INACTIVE route 203.0.113.128/25 {
      next-hop 203.0.113.1;
    }
  }
}

While I'm not sure if there are valid use cases for it, nodes can be inactive and ephemeral at the same time. Deactivating an ephemeral node that was created by scritp perhaps? Anyway, since both are a part of the config format that the "show" command will produce, we get to support both in the parser too.

Multi nodes

By multi nodes I mean nodes that may have more than one value, such as "address" in interfaces. As you remember, I suggested and implemented a new syntax for such nodes:

interfaces {
  ethernet eth0 {
    address [
      192.0.2.1/24;
      192.0.2.2/24;
    ];
  }
}

However, the parser now supports the original syntax too, that is:.

interfaces {
  ethernet eth0 {
    address 192.0.2.1/24;
    address 192.0.2.2/24;
  }
}

I didn't intend to support it originally, but it was another edge case that prompted me to add it. For config read operations to work correctly, every path in the tree must be unique. The high level Config_tree.set function maintains this invariant, but the parser gets to use lower level primitives that do not, so if a user creates a config with duplicate nodes, e.g. by careless pasting, the config tree that the parser returns will have them too, so we get to detect such situations and do something about it. Configs with duplicate tag nodes (e.g. "ethernet eth0 { ... } ethernet eth0 { ... }") are rejected as incorrect since there is no way to recover from this. Multiple non-leaf nodes with distinct children (e.g. "system { host-name vyos; } system { name-server 192.0.2.1; }") can be merged cleanly, so I've added some code to merge them by moving children of subsequent nodes under the first on and removing the extra nodes afterwards. However, since in the raw config there is no real distinction between leaf and non-leaf nodes, so in case of leaf nodes that code would simply remove all but the first. I've extended it to also move values into the first node, which equates support for the old syntax, except node comments and inactive/ephemeral properties will be inherited from the first node. Then again, this is how the parser in VyOS 1.x behaves, so nothing is lost.

While the show command in VyOS 2.0 will always use the new syntax with curly brackets, the parser will not break the principle of least astonishment for people used to the old one. Also, if we decide to write a migration utility for converting 1.x configs to 2.0, we'll be able to reuse the parser, after adding semicolons to the old config with a simple regulat expression perhaps.

Misc

Node names and unquoted values now can contain any characters that are not reserved, that is, anything but whitespace, curly braces, square brackets, and semicolons.

What's next?

Next I'm going to work on adding low level config operations (exists/returnValue/...) and set commands so that we can do some real life tests.

There's a bunch of open tasks if you want to join the development:

T254 is about preventing nodes with reserved characters in their names early in the process, at the "set" time. There's a rather nasty bug in VyOS 1.1.7 related to this: you can pass a quoted node name with spaces to set and if there is no validation rule attached to the node, as it's with "vpn l2tp remote-access authentication local-users", the node will be created. It will fail to parse correctly after you save and reload the config. We'll fix it in 1.2.0 of course, but we also need to prevent it from ever appearing in 2.0 too.

T255 is about adding the curly config renderer. While we can use the JSON serializer for testing right now, the usual format is also just easier on the eyes, and it's a relatively simple task too.

VyOS 2.0 development digest #8: vote for or against the new tag node syntax, and the protobuf schema

Tag node syntax

The change in tag node format I introduced in the previous post turned out quite polarizing and started quite some discussion in the comments. I created a poll in phabricator for it: https://phabricator.vyos.net/V3 , please cast your vote there.

If you missed the post, or found the explanation confusing, here's what it's all about. Right now in config files we format tag nodes (i.e. nodes that can have children without predefined names, such as interfaces and firewall rules) differently from other nodes:


/* normal node */
interfaces {
  /* tag node */
  ethernet eth0 {
    address 192.0.2.1/24
  }
  /* tag node */
  ethernet eth1 {
    address 203.0.113.1/24
  }
}

It looks nice, but complicates the parser. What I proposed and implemented in the initial parser draft is to not use any custom formatting for tag nodes:

/* normal node */
interfaces {
  /* actually a tag node, but rendering is now the same as for normal */
  ethernet {
    eth0 {
      address 192.0.2.1/24;
    }
    eth1 {
     address 203.0.113.1/24
    }
  }
}

This makes the parser noticeable simpler, but makes the syntax more verbose and adds more newlines.

If more people vote against this change than for it, I'll take time to implement it in the parser.

Note: This change only affects the config syntax, and has no effect on the command syntax. The command for the example above would still be "set interfaces ethernet eth0 address 192.0.2.1/24", in user input and in the output of "show configuration commands". Tag nodes will also be usable as edit levels regardless of the config file syntax, as in "edit interfaces tunnel; copy tun0 to tun1".

Protobuf schema

Today I wrote an initial draft of the protobuf schema that VyConf daemon will use for communication with clients (shell, CLI tool, and HTTP bridge). You can find it here: https://github.com/vyos/vyconf/blob/master/data/vyconf.proto

Right now it defines the following operations:

VyOS 2.0 development digest #7: Python coding guidelines for config scripts in 1.2.0, and config parser for VyConf

Python coding guidelines for 1.2.0

In some previous post I was talking about the Python wrapper for the config reading library. However, simply switching to a language that is not Perl will not automatically make that code easy to move to 2.0 when the backend is ready, neither it will automatically improve the design and architecture. It will also improve the code in general, and help keeping it readable and maintainable.

You can find the document here: http://wiki.vyos.net/wiki/Python_config_script_policy 

In short:

  • Logic for config validation, generating configs, and changing system settings/restarting services must be completely separated
  • For any configs that allow nesting (dhcpd.conf, ipsec.conf etc.) template processor must be used (as opposed to string concatenation)
  • Functions should not randomly output anything to stdout/stderr
  • Code must be unit-testable

Config parser for VyConf/VyOS 2.0

Today I pushed initial implementation of the new config lexer and parser. It already supports nodes and node comments, but doesn't support node metadata (that will be used to mark inactive and ephemeral nodes).

You can read the code (https://github.com/vyos/vyconf/blob/master/src/curly_lexer.mll and https://github.com/vyos/vyconf/blob/master/src/curly_parser.mly) and play with it by loading the .cma's into REPL. Next step is to add config renderer. Once the protobuf schema is ready we can wrap it all into a daemon and finally have something to really play with, rather than just run the unit tests.

Informally, here's what I changed in the config syntax.

Old config

interfaces {
  /* WAN interface */
  ethernet eth0 {
    address 192.0.2.1/24
    address 192.0.2.2/24
    duplex auto
  }
}

New config

interfaces {
  ethernet {
    /* WAN interface */
    eth0 {
      address [
        192.0.2.1/24;
        192.0.2.2/24;
      ];
      duplex auto;
      // This kind of comment is ignored by the parser
    }
  }
}

As you can see, the changes are:

  • Leaf nodes are now terminated by semicolons rather than newlines.
  • There is syntax for comments that are ignored by the parser
  • Multi nodes have the array of values in square brackets.
  • Tag nodes do not receive any special formatting.

I suppose the last change may be controversial, because it can lead to somewhat odd-looking constructs like:

interfaces {
  ethernet {
    eth0 {
      vif {
        21 {
          address 192.0.2.1/24
        }
      }
    }
  }
}

If you are really going to miss the old approach to tag nodes (that is "ethernet eth0 {" as opposed to "ethernet { eth0 { ...", let me know and I guess I can come up with something. The main difficulty is that, while this never occurs in configs VyOS config save produces, different tag nodes, e.g. "interfaces ethernet" and "interfaces tunnel" can be intermingled, so for parsing we have to track which ones were already created, and this will make the parser code a lot longer.

I'm pretty convinced that "address 192.0.2.1/24; address 192.0.2.2/24" is simply visual clutter and JunOS-like square bracket syntax will make it cleaner. It also solves the aforementioned problem with interleaved nodes tracking for leaf nodes.

Let me know what you think about the syntax.

VyOS 2.0 development digest #6: new beginner-friendly tasks, design questions, and the details of the config tree

The tasks

Both tasks from the previous post have been taken up and implemented by Phil Summers (thanks, Phil!). New tasks await.

First task was very simple: the Reference_tree module needs functions for checking facts about nodes, analogous to is_multi. For config output, and for high level set/delete/commit operations we need easy ways to know if the node is tag or leaf, or valueless, what component is responsible for it etc. It can be done mostly by analogy with is_multi function and its relatives, so it's friendly to complete beginners. But Phil Summers implemented it before I could make the post (thanks again, Phil!).

Second task is a little bit more involved but still simple enough for anyone who started learning ML not long ago. It's about loading interface definitions from a directory. In VyOS, we may have a bunch of files in /usr/share/vyos/interfaces such as firewall.xml, system.xml, ospf.xml, and so on, and we need to load them into the reference tree that is used for path validation, completion etc.

Design questions

To give you some context, I'll remind you that the vyconf shell will not be bash-based, due to having to fork and modify bash (or any other UNIX shell) to get completion from the first word to begin with, and for variety of other reasons. So, first question: do you think we should use the vyconf shell where you can enter VyOS configuration commands as login shell, or we should go for JunOS-like approach when you login to a UNIX shell and then issue a command to enter the configuration shell? You can cast your vote here: https://phabricator.vyos.net/V2 

Second question is more open-ended: we are going to printing the config as the normal VyOS config syntax, and as set commands, but what else should we support? Some considerations: since "show" will be a part of the config API, it can be used by e.g. web GUI to display the config. This means config output of XML or JSON can be a useful thing. But, which one, or perhaps both? And also we need to decide what the XML and/or JSON shouid look like, since we can go for a generic schema that keeps node names in attributes, or we can use custom tags such as <interfaces> (but then every component should provide a schema).

Now, to the "long-awaited" details of the config tree...

VyOS 2.0 development digest #5: doing 1.2.x and 2.0 development in parallel

There was a rather heated discussion about the 1.2.0 situation on the channel, and valid points were definitely expressed: while 2.0 is being written, 1.2.0 can't benefit from any of that work, and it's sad. We do need a working VyOS in any case, and we can't just stop doing anything about it and only work on 2.0. My original plan was to put 1.2.0 in maintenance mode once it's stabilized, but it would mean no updates at all for anyone, other than bugfixes. To make things worse, some things do need if not rewrite, but at least very deep refactoring bordering on rewrite just to make them work again, due to deep changes in the configs of e.g. StrongSWAN.

There are three main issues with reusing the old code,  as I already said: it's written in Perl, it mixes config reading and checking with logic, and it can't be tested outside VyOS. The fourth issue is that the logic for generating, writing, and applying configs is not separated in most scripts either so they don't fit the 2.0 model of more transactional commits. The question is if we can do anything about those issues to enable rewriting bits of 1.2.0 in a way that will allow reusing that code in 2.0 when the config backend and base system are ready, and what exactly should we do.

My conclusion so far is that we probably can, with some dirty hacks and extra care. Read on.

The language

I guess by now everyone agrees that Perl is a bad idea. There are few people who know it these days, and there is no justification for knowing it. The language is a minefield that lacks proper error reporting mechanism or means to convey the semantics.

If you are new to it, look at this examples:

All "error reporting" enabled, let's try to divide a string by an integer.

$ perl -e 'use strict; use warnings; print "foobar" / 42'
Argument "foobar" isn't numeric in division (/) at -e line 1.
0

A warning indeed... Didn't prevent program from producing a value though: garbage in, garbage out. And, my long time favorite: analogous issues bit me in real code a number of times!

$ perl -e 'print reverse "dog"'
dog

Even if you know that it has to do with "list context", good luck finding information about default context of this or that function in the docs. In short, if the language of VyOS 1.x wasn't Perl, a lot of bugs would be outright impossible.

Python looks like a good candidate for config scripts: it's strongly typed, the type and object system is fairly expressive, there are nice unit test libraries and template processors and other things, and it's reasonably fast. What I don't like about it and dynamically typed languages in general is that it needs a damn good test coverage because the set of errors it can detect at compile time is limited and a lot of errors make it to runtime, but there are always compromises.

But, we need bindings. VyConf will use sockets and protobuf messages for its API which makes writing bindings for pretty much any language trivial, but in 1.x.x it's more complicated. The C++/Perl library from VyOS backend is not really easy to follow, and not trivial to produce bindings for. However, we have cli-shell-api, which is already used in config scripts written in shell, and behaves as it should. It also produces fairly machine-friendly output, even though its error reporting is rudimantary (then again, error reporting of the C++ and Perl library isn't all that nice either). So, for a proof of concept, I decided to make a thin wrapper around cli-shell-api: later it can be rewritten as a real C++ binding if this approach shows its limitations. It will need some C++ library logic extraction and cleanup to replicate the behaviour (why the C++ library itself links against Perl interpreter library? Did you know it also links to specific version of the apt-pkg library that was never meant for end users and made no promise of API stability, for its version comparison function that it uses for soring names of nodes like eth0? That's another story though).

Anyway, I need to add the Python library to the vyatta-cfg package which I'll do soon, for the time being you can put the file to your VyOS (it works in 1.1.7 with python2.6) and play with it.

Upd: since then, the Python library is a part of VyOS 1.2.0+ images and is in the default PYTHONPATH. Check out scripts in https://github.com/vyos/vyos-1x

Right now it exposes just a handful of functions: exists(), return_value(), return_values(), and list_nodes(). It also has is_leaf/is_tag/is_multi functions that it uses internally to produce somewhat better error reporting, though they are unnecessary in config scripts, since you already know that about nodes from templates. Those four functions are enough to write a config script for something like squid, dnsmasq, openvpn, or anything else that can reload its config on its own. It's programs that need fancy update logic that really need exists_orig or return_effective_value. Incidentally, a lot of components that need that rewrite to repair or could seriously benefit from serious overhaul are like that: for example. iptables is handled by manipulating individual rules right now even though iptables-restore is atomic, likewise openvpn is now managed by passing it the config in command line options while it's perfectly capable of reloading its config and this would make tunnel restarts a lot less disruptive, and strongswan, the holder of the least maintainable config script, is indeed capable of live reload too.

Which brings us to the next part...

The conventions

To avoid having to do two rewrites of the same code instead of just one, we need to make sure that at least substantial part of the code from VyOS 1.2.x can be reused in 2.0. For this we need to setup a set of conventions. I suggest the following, and let's discuss it.

Language version

Python 3 SHALL be used.

Rationale: well, how much longer can we all keep 2.x alive if 3.0 is just a cleaner and nicer implementation?

Coding standard

No single function shall SHOULD be longer than 100 lines.

Rationale: https://github.com/vyos/vyatta-cfg-vpn/blob/current/scripts/vpn-config.pl#L449-L1134 ;)

Logic separation and testability

This is the most important part. To be able to reuse anything, we need to separate assumptions about the environment from the core logic. To be able to test it in isolation and make sure most of the bugs are caught on developer workstations rather than test routers, we need to avoid dependendies on the global state whenever possible. Also, to fit the transactional commit model of VyOS 2.0 later, we need to keep consistency checking, generating configs, and restarting services separated.

For this, I suggest that we config scripts follow this blueprint:


def get_config():
    foo = vyos.config.return_value("foo bar")
    bar = vyos.config.return_value("baz quux")
    return {"foo": foo, "bar": bar} # Could be an object depending on size and complexity...

def verify(config):
    result do_some_checks(config)
    if checks_succees(result):
        return None
    else:
        raise ScaryException("Some error")

def generate(config):
    write_config_files(config)

def apply(config):
    restart_some_services(config)

if __name__ == '__main__':
    try:
       c = get_config()
       verify(c)
       generate(c)
       apply(c)
    except ScaryException:
        exit_gracefully()

This way the function that process the config can be tested outside of VyOS by creating the same stucture as get_config() would create by hand (or from file) and passing it as an argument. Likewise, in 2.0 we can call its verify(), update(), and apply() functions separately.

Let me know what you think.

VyOS 2.0 development digest #4: simple tasks for beginners, and the reasons to learn (and use) OCaml

Look, I lied again. This post is still not about the config and reference tree internals. People in the chat and elsewhere started showing some interest in learning OCaml and contributing to VyConf, and Hiroyuki Satou even already made a small pull request (it's regarding build docs rather than the code itself, but that's a good start and will help people with setting up the environment), so I decided to make a post to explain some details and address common concerns.

The easy tasks

There are a couple of tasks that can be done completely by analogy, so they are good for getting familiar with the code and making sure your build environment actually works.

The first one is about new properties of config tree nodes, "inactive" and "ephemeral", that will be used for JunOS-like activate/deactivate functionality, and for nodes that are meant to be temporary and won't make it to the saved config respectively.

The other one is about adding "hidden" and "secret" properties to the command definition schema and the reference tree, "hidden" is meant for hiding commands from completion (for feature toggles, or easter eggs ;), and "secret" is meant to hide sensitive data from unpriviliged users or for making public pastes.

Make sure you reference the task number in your commit description, as in "T225: add this and that" so that phabricator can automatically connect the commits with the task.

If you want to take them up and need any assistance, feel free to ask me in phabricator or elsewhere.