The language-puppet website.

Work with your manifests!

Custom Functions and Language-puppet

Until now, using custom functions, such as those from puppetlabs-stdlib, required that they be implemented in language-puppet itself. I do not believe there are a lot of users, and I have received no complaints. But this might change, and I do not want the current situation to go on:

  • New users will need to meddle with the source code to add their own functions.

  • These functions will not be easy to distribute.

  • Everybody will have access to the custom functions used in the manifests I work with.

In order to make things a bit better, I used the Lua module in Haskell to provide a small scripting language for people to use. It might have been easier for them if I had just embedded a full Ruby interpreter, but that would have gone against my convictions and is probably pretty difficult. Note that this is not yet released.

Now that this is done, it might be possible to add custom types in the same way, as well as one of the Lua templating systems. This would make it possible to get rid of the Ruby dependency and greatly improve performance.

In the next few posts I will describe some of the inner workings of this module and implementation samples.

By the way, there are now links for the documentation.

Binary Distribution

You can now get a compiled version of puppetresources on the Download page.

It is not as cool as running from source because you lack the interactive features, but it is still pretty powerful and much easier to set up.

Version 1.8.0 - Exciting Features !

Version 1.8.0 is out, and breaks the existing API. On the other hand, it is now very well integrated with PuppetDB! The following new features are now available:

  • Facts can be retrieved from PuppetDB.
  • Exported resources can also be retrieved from PuppetDB.
  • The resource query function from this module has been implemented.

This means you can now test all your higher order Puppet recipes. A typical use case is for the configuration of proxy servers. It gets really easy to declare something like this on a host:

@@backend { $::hostname:
    port => 1234,
    type => 'static_content',
    tag  => 'productionproxy';
}

Then you can get them in a hash with :

$h = pdbresourcequery(
        ['and',
            ['=', ['node','active'], true],
            ['=', 'type', 'Backend']]
        , 'parameters')

And use it in a template with :

<%
scope.lookupvar('myclass::h').sort.each ...
%>

This is the kind of thing that drives people to Puppet. On the other hand, it quickly gets tricky to test. How are you supposed to check that your template will look the way you expect it to?

Previously I used to just hardcode the result of the pdbresourcequery function during the testing phase, and use puppetresources to look at the rendered template. Now with PuppetDB integration you get the actual facts and exported resources that Puppet will use when computing a catalog.

So how do you use it? It requires access to a clear text HTTP port that answers PuppetDB queries. Typically, you will want to use an SSH tunnel if you test on your workstation, like this:

$ ssh -L 8080:localhost:8080 your.puppet.master

Once this is done, you have the following options, depending on how you use the language-puppet library:

  • With puppetresources, start with the new -r switch, like this:
$ puppetresources -r http://localhost:8080 ./puppet/ node.test
  • With puppetresources in interactive mode, initialize the daemon with the initializedaemonWithPuppet function, like this:
> queryfunc <- initializedaemonWithPuppet (Just "http://localhost:8080") "./puppet/"
  • With the Daemon API, modify the Prefs record to add the correct URL:
let prefs = genPrefs "./puppet/"
queryfunc <- initDaemon (prefs { puppetDBurl = Just "http://localhost:8080" })

If you wish to use the lower level APIs, you might want to check the new PuppetDB modules (unfortunately hackage doesn’t seem to generate documentation at the moment so I can’t link it). You might use the built-in PuppetDB query function, use your own or just mock their output.

There are probably bugs, and the URL used to access pdbresourcequery is currently hardcoded (it won’t take long to make it right), but this is a major milestone for this project.

Version 1.7.2, With More PuppetDB

A new maintenance version is out. The long awaited resource relationship update will still have to wait, but there are exciting promises in this upgrade, such as the early PuppetDB support. I will blog later about it, but this will rock.

First of all, quite a few bugs were fixed since 1.7.0, and a few nice features have been added. A lot of minor enhancements have been added now that the official language documentation is published. The behaviour of the library should now more closely match that of Puppet. The problem is that there is a huge difference in behaviour that might bite you.

When I started this project, all that was available was the language tutorial and the Puppet source code. As I explained already, I found it less troublesome to just rewrite it from scratch than to try to follow the Ruby code. I recently wrote a pair of resource providers, so I can attest this still holds true. It seemed to me at that time that the various statements could be inserted anywhere in a manifest without changing its meaning.

Some effects would obviously be undefined (putting two conflicting defaults in the same class for example), but everything seemed doable with little effort. One of the early design decisions was to support this feature fully, except in conditionals where I couldn’t see how to handle them efficiently.

The cost of this decision is that all data types have been doubled: one version for the “raw” or “unresolved” form, and one version for the “final” or “resolved” form. Every time a statement is interpreted, it might be left in an unresolved state, which leads to all kinds of performance and logic problems (for example, when you can’t resolve a variable you have to store some pointer to its scope to resolve it later).

It turns out that this was not needed to be Puppet-perfect, as Puppet doesn’t even attempt to do this for variable assignments. This means that every data type could have been fully resolved when first found.

This means that the following code fragment will work as you might expect in language-puppet, but not in Puppet:

file { $filename: ensure => present; }
$filename = 'foo'
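
To make the order-independence concrete, here is a minimal two-pass sketch. It is not the language-puppet implementation: the RawValue and Statement types and the collectVars and resolveTitles functions are hypothetical, and the statement language is reduced to assignments and file resources. The first pass gathers every assignment wherever it appears, and the second resolves the resource titles that were left "raw".

```haskell
import qualified Data.Map as Map

-- Hypothetical "raw" value: either already known, or a reference to a
-- variable that may only be assigned later in the manifest.
data RawValue = Literal String | VarRef String
    deriving (Show, Eq)

-- A deliberately tiny statement language, for illustration only.
data Statement
    = Assign String String      -- $name = 'value'
    | FileResource RawValue     -- file { <title>: ensure => present; }
    deriving (Show, Eq)

-- Pass 1: collect every assignment, wherever it appears in the manifest.
collectVars :: [Statement] -> Map.Map String String
collectVars stmts = Map.fromList [(n, v) | Assign n v <- stmts]

-- Pass 2: resolve resource titles against the complete variable table.
resolveTitles :: [Statement] -> Either String [String]
resolveTitles stmts = mapM resolve [t | FileResource t <- stmts]
  where
    vars = collectVars stmts
    resolve (Literal s) = Right s
    resolve (VarRef n)  =
        maybe (Left ("unknown variable $" ++ n)) Right (Map.lookup n vars)

-- The fragment above: the resource is declared before the assignment.
manifest :: [Statement]
manifest = [FileResource (VarRef "filename"), Assign "filename" "foo"]

main :: IO ()
main = print (resolveTitles manifest)
```

A single-pass evaluator that resolves values as soon as they are found would reject this manifest, since $filename is still unknown when the resource is declared.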

And while language-puppet is much faster than Puppet, despite spawning a ruby process for (almost) every template evaluation, it could have been even faster. Also the internals would be much cleaner …

Anyway, here goes the changelog:

New features

  • Amending attributes with a collector.
  • Stdlib functions : chomp
  • Resource pretty printer now aligns =>.
  • Case statements with regexps.

Bugs fixed

  • Various details have been modified since the official language documentation has been published.
  • Better handling of collector conditions.
  • Solves bug with interpolable strings that are not resolved when first found.

Version 0.1.7 Is Out

The promised work on relationships hasn’t been done, only minor bugfixes for this release. The main difference with the previous version is that some resource types are now better validated. This is not Puppet-perfect, but it should catch a few bugs.

New features

  • Fix bug with ‘<’ in the Erb parser !
  • Assignments can now be any valid Puppet expression.
  • Proper list of metaparameters.

Bugs fixed

  • Quick resolution of boolean conditions.
  • Start of the move to a real PCRE library.
  • Function is_domain_name.
  • New native types : file, zone_record, cron, exec, group, host, mount.

Migrating Machines Between Puppetmasters

I am currently in the process of migrating machines to a new puppetmaster setup. This is something that also might happen to you, especially when you would like to start with fresh, clean, manifests.

In order to facilitate this process, you can use the diffing feature of the puppetresources program. The following lines describe a typical session, where one would like to check that the set of resources described in the old catalog for host oldhost somehow matches that of newhost.

$ ghci Main.hs

> queryold <- initializedaemon "/path/to/old/manifests"
> querynew <- initializedaemon "/path/to/new/manifests"
> oldcatalog <- queryold "oldhost.domain"
> querynew "newhost.domain" >>= diff oldcatalog
...
> querynew "newhost.domain" >>= diff oldcatalog
...
> querynew "newhost.domain" >>= diff oldcatalog
...
> querynew "newhost.domain" >>= diff oldcatalog
...

Note that there will almost always be a difference, as the list of classes is, for now, part of the catalog. On the other hand, the relationships between resources will not be checked as they are not (yet) handled by the tool.

Version 0.1.6 Is Out

A new version is out. I will start work on the relationships between resources just after this one.

New features

  • Errors now print a stack trace (only works with profiling builds)
  • Nested classes
  • generate() function
  • defines with spurious top level statements now should work
  • validate_* functions from puppetlabs/stdlib

Bugs fixed

  • Metaparameters now include stages (not handled)
  • Resolving non empty arrays as boolean returns true
  • Duplicate parameters are now detected

The Puppet Resources Application

This is the official demonstration of the capabilities of the language-puppet library. While it is pretty hackish, it is also quite useful, especially for people working a lot with large manifests.

Installing the application

In order for the application to work, you will need a working Haskell environment for now. As it compiles into native code, it is pretty simple to generate a redistributable binary. As with most Haskell binaries, the only required library is libgmp.

The only trouble is the dependency on a Ruby script that computes the complicated templates, as it would be way too much work to fully emulate this language. This means you will also need a Ruby interpreter installed, and probably not much more.

So once you have a working cabal, and a recent GHC (all coming from your distribution or from The Haskell Platform), just type :

cabal install puppetresources

Checking the parser

You now have a puppetresources executable. If invoked with a single argument, it will parse a Puppet manifest file and show you how it parsed it :

$ puppetresources modules/git/manifests/init.pp 
class git () {
    package { "git-core":
        "ensure" => "installed";
    }
}

This is not terribly useful, but could be used to check for syntax errors.

Computing catalogs

This is the main usage of this application : computing whole catalogs. In order to do this you must invoke it with a path to a standard Puppet directory (such as /etc/puppet) and a node name :

$ puppetresources . test.nod
The defined() function is not implemented for resource references. Returning true at "./modules/apt/manifests/ppa.pp" (line 20, column 3)
The defined() function is not implemented for resource references. Returning true at "./modules/apt/manifests/key.pp" (line 38, column 7)
The defined() function is not implemented for resource references. Returning true at "./modules/apt/manifests/key.pp" (line 42, column 7)
anchor {
    "apt::builddep::glusterfs-server": #"./modules/apt/manifests/builddep.pp" (line 12, column 12)
        name => "apt::builddep::glusterfs-server";
    "apt::key/Add key: 55BE302B from Apt::Source
        debian_unstable": #"./modules/apt/manifests/key.pp" (line 32, column 16)
        name => "apt::key/Add key: 55BE302B from Apt::Source
        debian_unstable";

This will display the whole catalog as a large, top level, Puppet manifest, after the warnings. As not everything is implemented yet, there will probably be quite a few of them for real world catalogs.

You will notice that the relationship between resources is not yet supported, and that a class resource type is used. This is a placeholder for the future relationships system.

The typical use of this feature occurs during manifest development. It serves as a high level correctness checker, as it fails in about the same cases as the real Puppet application, but can be run on your workstation before pushing to the puppet master. It is also significantly faster than the Ruby code.

Debugging templates

Using the previous features helps a lot, but is not very useful when debugging templates. The reason is that the content parameter is displayed as a one line string, and is hard to read in most cases :

$ puppetresources . test.nod  2> /dev/null | grep content
    content => "# debian_unstable\ndeb http://debian.mirror.iweb.ca/debian/ unstable main contrib non-free\ndeb-src http://debian.mirror.iweb.ca/debian/ unstable main contrib non-free\n",
    content => "# debian_unstable\nPackage: *\nPin: origin \"debian.mirror.iweb.ca\"\nPin-Priority: -10\n",
    content => "# karmic-security\nPackage: *\nPin: release a=karmic-security\nPin-Priority: 700\n",
    content => "# karmic-updates\nPackage: *\nPin: release a=karmic-updates\nPin-Priority: 700\n",
    content => "# karmic\nPackage: *\nPin: release a=karmic\nPin-Priority: 700\n",

It is then possible to add a third argument, which is the name of one of the files, to display its content on the standard output :

$ puppetresources samplesite test.nod karmic.pref 2> /dev/null
# karmic
Package: *
Pin: release a=karmic
Pin-Priority: 700

Interactive use and diffing

This is the fun part. A binary distribution will not work for this, as it requires ghci. In order to play with it, you need to run ghci on the Main.hs file, initialize the daemon and start computing catalogs:

session.hs
$ ghci Main.hs
>>> queryfunc <- initializedaemon "./samplesite/"
>>> c1 <- queryfunc "test.nod"
>>> c2 <- queryfunc "test2.nod"
>>> :type c1
c1 :: FinalCatalog

The catalogs returned are of type FinalCatalog, which is a plain Data.Map. This means you can manipulate it as usual, checking its size or typing crazy one liners :

session.hs
>>> Map.size c1
25
>>> mapM_ print $ Map.toList $ Map.map (length . lines . (\x -> case x of (ResolvedString n) -> n) .fromJust . Map.lookup "content" . rrparams) $ Map.filter (Map.member "content" . rrparams) c1
(("file","debian_unstable.list"),3)
(("file","debian_unstable.pref"),4)
(("file","karmic-security.pref"),4)
(("file","karmic-updates.pref"),4)
(("file","karmic.pref"),4)
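
The same kind of query reads better as a small named function. The sketch below does not use the real language-puppet types: Params and Catalog are simplified stand-ins (a resource is reduced to a map of string parameters) and contentLineCounts is a hypothetical helper, so it only illustrates the Data.Map manipulation itself.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins for the library types: a resource is reduced to
-- its parameter map, and a catalog maps (type, name) to a resource.
type Params  = Map.Map String String
type Catalog = Map.Map (String, String) Params

-- For every resource that has a "content" parameter, count its lines.
-- Resources without that parameter are dropped from the result.
contentLineCounts :: Catalog -> Map.Map (String, String) Int
contentLineCounts = Map.mapMaybe (fmap (length . lines) . Map.lookup "content")

sample :: Catalog
sample = Map.fromList
    [ ( ("file", "karmic.pref")
      , Map.fromList [("content", "# karmic\nPackage: *\nPin: release a=karmic\nPin-Priority: 700\n")] )
    , ( ("package", "git-core")
      , Map.fromList [("ensure", "installed")] )
    ]

main :: IO ()
main = mapM_ print (Map.toList (contentLineCounts sample))
```

Using Map.mapMaybe avoids the partial fromJust and the incomplete case expression of the one-liner, at the price of a couple more lines.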

But the real point here is to run diffs between catalogs, to check differences between hosts that should be alike, or progress when altering a catalog :

>>> diff c1 c2
file[karmic-updates.pref] {
# content
+ Pin-Priority: 750
- Pin-Priority: 700
}
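
Since catalogs are maps, a diff of this kind can be expressed with plain Data.Map operations. The following is only a sketch of the idea, not the diff function shipped with puppetresources: the Params and Catalog types are simplified stand-ins, and Change, diffCatalogs and compareParams are hypothetical names.

```haskell
import qualified Data.Map as Map

-- Simplified, hypothetical stand-ins for the library types.
type Params  = Map.Map String String
type Catalog = Map.Map (String, String) Params

-- One entry per resource that differs between the two catalogs.
data Change
    = OnlyInFirst
    | OnlyInSecond
    | ParamsDiffer [String]   -- names of the parameters whose values differ
    deriving (Show, Eq)

diffCatalogs :: Catalog -> Catalog -> Map.Map (String, String) Change
diffCatalogs a b = Map.unions [onlyA, onlyB, changed]
  where
    onlyA   = Map.map (const OnlyInFirst)  (Map.difference a b)
    onlyB   = Map.map (const OnlyInSecond) (Map.difference b a)
    -- Resources present on both sides are kept only if they differ.
    changed = Map.mapMaybe id (Map.intersectionWith compareParams a b)

compareParams :: Params -> Params -> Maybe Change
compareParams p q
    | null names = Nothing
    | otherwise  = Just (ParamsDiffer names)
  where
    keys  = Map.keys (Map.union p q)
    names = [k | k <- keys, Map.lookup k p /= Map.lookup k q]

catalog1, catalog2 :: Catalog
catalog1 = Map.fromList
    [(("file", "karmic-updates.pref"), Map.fromList [("content", "Pin-Priority: 700\n")])]
catalog2 = Map.fromList
    [(("file", "karmic-updates.pref"), Map.fromList [("content", "Pin-Priority: 750\n")])]

main :: IO ()
main = mapM_ print (Map.toList (diffCatalogs catalog1 catalog2))
```

This only reports which parameters differ. Producing a line-by-line view of the content parameter, as in the session above, would need a textual diff on top of it.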

Important note about the facts

All the facts, except those related to hostnames, are extracted from the current host. This means that the differences will not be accurate if part of the catalog comes from the facts, which is pretty common. Handling this properly is left as an exercise to the reader, but should be fairly obvious (roll your own initializedaemon that takes the facts as a parameter).
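
One possible starting point for that exercise is to overlay the facts of the target host on top of the locally gathered ones before computing the catalog. The sketch below assumes facts can be modeled as a map from String to String, which may not match the library's actual representation. The Facts alias, withOverrides and the sample values are all hypothetical.

```haskell
import qualified Data.Map as Map

-- Assumption: facts are modeled as a plain map of strings.
type Facts = Map.Map String String

-- Facts listed in the first argument win over the locally gathered ones,
-- because Map.union is left-biased.
withOverrides :: Facts -> Facts -> Facts
withOverrides overrides local = Map.union overrides local

-- Made-up facts standing in for what the workstation reports.
localFacts :: Facts
localFacts = Map.fromList
    [ ("hostname", "workstation")
    , ("operatingsystem", "Debian")
    , ("memorysize", "8.00 GB")
    ]

-- Made-up facts describing the host whose catalog is being computed.
targetOverrides :: Facts
targetOverrides = Map.fromList
    [ ("operatingsystem", "Ubuntu")
    , ("lsbdistcodename", "karmic")
    ]

main :: IO ()
main = mapM_ print (Map.toList (withOverrides targetOverrides localFacts))
```

The merged map would then be handed to whatever facts-aware variant of initializedaemon you write.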

The Daemon

The library comes with a nice building block in the Puppet.Daemon module. Here is a minimalistic application of this library (note that I have tried to keep this readable by non-Haskellers).

test.hs
1
2
3
4
5
6
7
8
9
10
11
12
13
import Puppet.Daemon
import Puppet.Init
import Puppet.Printers
import Facter

main = do
    let prefs = genPrefs "/etc/puppet"
    queryfunc <- initDaemon prefs
    rawfacts <- allFacts
    o <- queryfunc "node.test" (genFacts rawfacts)
    case o of
        Left err -> error err
        Right c  -> putStrLn (showFCatalog c)

Here is a detailed explanation of what is happening:

  • Lines 1 to 4 import all the required libraries that we will use in this example.
  • Line 6 declares the main function, nothing interesting here.
  • Line 7 computes the preferences from the base puppet directory. As explained in the haddocks, this structure holds the directory for manifests, modules and templates. This helper function just fills the fields with their default values. The astute reader will notice that this structure can also hold pool sizes, for compilation and parsing. This is mainly used if your application needs to scale, and will be covered in a later post.
  • Line 8 initializes the daemon, and returns a function that takes a node name and a set of facts, and should return a catalog object.
  • Line 9 gathers facts from the local computer, using the hsfacter module. This is pretty experimental, but works for the current use cases.
  • Line 10 computes the catalog for node node.test.
  • Lines 11 to 13 display an error if something failed or the actual catalog.

This covers a very basic usage of the tool. In the next post, a simple yet useful application will be described.

What Is This ?

The language-puppet library is the start of a rewrite of the Puppet configuration management tool. While it is far from being fully featured, it is already pretty useful to me, especially when writing or modifying manifests. The ultimate goal is to provide a drop-in replacement of the Puppetmaster for people more concerned with efficiency than features.

This project comes from a growing frustration at using Puppet. While it is so useful that I am not sure how I could live without it, it is written in Ruby, which is possibly the worst choice for a production system (an inflammatory rant can be found here). My main gripes with the current implementation of puppet are its speed (tens of seconds to compile a catalog, and about twice as much to apply it, really?) and memory usage (controlled by restarting the ruby processes).

So here it is, a tentative rewrite of Puppet in a nice language that compiles into native code, namely Haskell. I will cover the usage of this library in the following posts, and the kind of applications that are enabled by the increase in performance.