The language-puppet website.

Work with your manifests!

Version 0.10.1 Out: New PuppetDB Backends, ‘Future’ Parser, and Yet Another New Testing Module

I finally uploaded a new version of language-puppet to Hackage. Here is a selection of the new features since 0.9.0:

  • A complete rewrite of the PuppetDB subsystem, which is the focus of this article.
  • A new pdbQuery binary that can query one of the new PuppetDB backends, and can also dump the content of a real PuppetDB into something that can be consumed by the new TestDB backend. I will probably blog more about this feature in subsequent posts.
  • An actual command line parser for the language-puppet binary.
  • A very tentative new testing module, which is more or less the hspec package in a ReaderT FinalCatalog. This will require more work, and a dedicated post.
  • Limited support for the new lambda methods of the future parser. The limitation is intentional, and will be described in more detail in another post.

The main change to PuppetDB support is in the API. It is no longer a single function with a terribly vague type (T.Text -> Value -> IO (S.Either String Value)).

It is now a set of functions, one for each supported command and endpoint, with a dedicated query type that can only express valid queries. This makes it easier to use a PuppetDB backend programmatically, and to give better error messages to users of functions such as pdbresourcequery.
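To illustrate the idea, here is a rough sketch of such an API (the names are illustrative, not the exact language-puppet types):

QuerySketch.hs
{-# LANGUAGE OverloadedStrings #-}
module QuerySketch where

import Data.Text (Text)

-- A dedicated query type: only well-formed queries can be expressed,
-- unlike with a raw JSON Value.
data Field = FType | FTitle | FTag | FCertname
    deriving (Show)

data Query
    = QEqual Field Text   -- ["=", field, value]
    | QAnd [Query]        -- ["and", q1, q2, ...]
    | QOr  [Query]        -- ["or",  q1, q2, ...]
    | QNot Query          -- ["not", q]
    deriving (Show)

-- Instead of a single stringly-typed entry point, each endpoint gets its
-- own function, e.g. (indicative signatures only):
--   getResources :: PuppetDB -> Query -> IO (Either Error [Resource])
--   getFacts     :: PuppetDB -> Query -> IO (Either Error [Fact])

-- This query is valid by construction:
webBackends :: Query
webBackends = QEqual FType "Haproxy::Backend"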

Behind this API now sit three backends:

  • The dummy backend, which, as its name implies, is just a stub that returns empty answers.
  • The remote backend, which connects to a real PuppetDB server.
  • The TestDB backend, which is brand new, not really tested, and the focus of this article.

This backend is a limited re-implementation of the features offered by a full-fledged PuppetDB server, with a focus on interactive development of manifests and modules. Instead of describing it in detail, I will walk through a (perfectly artificial) scenario where it comes in handy. In this scenario we have two backend servers and a proxy, and we would like to use exported resources so that the backends are registered automatically.

site.pp
node 'proxy' {
    include haproxy
}

node 'back1' {
    @@haproxy::backend { $hostname:
        backend_type   => 'web',
        backend_server => $hostname,
        backend_port   => 80;
    }
}
node 'back2' {
    @@haproxy::backend { $hostname:
        backend_type   => 'web',
        backend_server => $hostname,
        backend_port   => 80;
    }
}

If we run puppetresources with a new command-line option, here is what happens:

Compute the catalog of back1
$ puppetresources --pdbfile /tmp/pdb -p . -o back1
Warning: could not decode /tmp/pdb :InvalidYaml (Just (YamlException "Yaml file not found: /tmp/pdb"))

Exported:
haproxy::backend {
    @@back1: # "./manifests/site.pp" (line 7, column 26) [top level] [Scope []]
        alias          => [],
        backend_port   => "80",
        backend_server => "back1",
        backend_type   => "web";
}

There is a scary warning, and then the usual output of the puppetresources tool. We can see that Haproxy::Backend[back1] is an exported resource, and is not applied on the current node. Here is the content of the newly generated file:

/tmp/pdb
facts: {}
resources:
  back1:
    data:
      resources:
      - aliases:
        - back1
        sourceline: 7
        parameters:
          backend_server: back1
          backend_port: 80
          backend_type: web
        exported: true
        title: back1
        type: haproxy::backend
        sourcefile: ./manifests/site.pp
        tags:
        - haproxy::backend
      transaction-uuid: uiid
      name: back1
      version: version
      edges: []
    metadata:
      api_version: 1

This is a YAML representation of the catalog wire format. You can easily edit this file to test arbitrary conditions, or just inspect it. You can also include arbitrary facts to manually test for edge cases.
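If you would rather inspect the generated file programmatically, here is a minimal sketch using the yaml and aeson packages (the file path matches the example above):

InspectPdb.hs
module InspectPdb where

import Data.Aeson (Value)
import Data.Yaml (ParseException, decodeFileEither)

-- Decode the fake PuppetDB file into a generic aeson Value and print it.
main :: IO ()
main = do
    r <- decodeFileEither "/tmp/pdb" :: IO (Either ParseException Value)
    case r of
        Left err  -> print err
        Right doc -> print doc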

The frontend server includes the haproxy module, which looks like this:

modules/haproxy/manifests/init.pp
class haproxy {
    $backends = pdbresourcequery(['=','type','Haproxy::Backend'],'parameters')

    file { '/etc/haproxy/haproxy.cfg':
        content => template('haproxy/config.erb');
    }
}

Now if we run the puppetresources binary using the generated PuppetDB file, we get:

$ puppetresources --pdbfile /tmp/pdb -p . -o proxy -t file

file {
  /etc/haproxy/haproxy.cfg: # "./modules/haproxy/manifests/init.pp" (line 4, column 12) [class haproxy] [Scope [haproxy]]
    alias   => [],
    content => "backend web\n        server back1 back1:80\n",
    ensure  => "present",
    group   => "root",
    mode    => "0644",
    path    => "/etc/haproxy/haproxy.cfg",
    title   => "/etc/haproxy/haproxy.cfg";
}

Note the new command-line flags: -n (resource name) and -t (resource type) accept regular expressions to filter resources. When used in conjunction with -c, the resource name must be the exact title of a file, and its content field will be displayed verbatim:

$ puppetresources --pdbfile /tmp/pdb -p . -o back2 > /dev/null
$ puppetresources --pdbfile /tmp/pdb -p . -o proxy -n /etc/haproxy/haproxy.cfg -c
backend web
        server back1 back1:80
        server back2 back2:80

This new feature will let you experiment with exported resources in a very natural way, without ever having to compile in-development manifests on a live Puppet master.

Version 0.9.0 Released

This version is out of the beta stage, and I have updated the repository. For some reason I am currently getting a bad request from Hackage when trying to upload it, and will try to solve this ASAP. New binaries are available for download.

New stuff

  • Colors everywhere! The new pretty printers are much nicer to work with. puppetresources defaults to monochrome when it detects it is not printing to a terminal.
  • A lot of the behavior is simply more correct.
  • The hsfacter and puppetresources packages have been deprecated and incorporated into language-puppet.
  • The hruby package (used to embed a Ruby interpreter in your Haskell program) should now support Ruby 1.8, 1.9 and 2.0.
  • The filecache package (it caches computation results keyed by file name, and expires them with inotify) is a more robust solution for caching parsed manifests.
  • The puppetresources command has evolved a lot, and is now much more useful. In particular, it lets you compile catalogs and display only a subset of the resources. I will write another post soon describing this mode of operation, but here is a teaser. When I type puppetresources -p . -o nodename -t '^package$' -n 'git', I get:

[Screenshot: searching for git packages]

As you can see, you can now filter the catalog output using regexps on resource types and names. The output is colored, and you get useful information, such as where the resource was declared, in which container (class or define), and what the context was at the time.

Still not done

  • Support for Hiera. I do not use it at my site, but it is probably very much required.
  • A real Puppet master replacement. All the building blocks are available, and it has even been tested a little bit! I only need to work out how to register things in the PuppetDB (facts, exported resources, catalogs …). I expect it to be around ten times faster at compiling catalogs, while eating a lot less memory than the Ruby version.
  • Polishing the testing DSL: it is usable, but not really nice.
  • Creating an external testing DSL. I plan to use this tool for continuous testing and integration. Having tests written in plain text files that would not require compilation would make this possible.
  • Repopulating the Haddocks, as they were decimated during the big rewrite.
  • Filtering the resources displayed by puppetresources using a standard Puppet selector. This should be a piece of cake, and would really help when designing complex selectors.
  • The hruby package is currently used in a weird way when you need to create Ruby objects from Haskell. If you instantiate a bunch of them, Ruby’s GC might just release them as you go, as they are not referenced. For this reason, you currently need to freeze Ruby’s GC, run everything that needs your shiny objects, and then restore the GC. It would be more efficient to mark these objects as managed by a foreign program, and use something like ResourceT to free them when the computation is over.

Feedback needed!

I am certain this project is almost exclusively used at my site, and this frustrates me a bit. I suppose the fact that it is not written in Ruby plays against it, as most of the Puppet ecosystem happens to be written in that language, and everybody seems to believe that Ruby is for devops. It can, however, do things that just can’t be done with the tools I know of, and quickly at that.

So if you have an idea about what is wrong with language-puppet, or advice on how to put it under the spotlight, please let me know.

Version 0.9.0 Is Out for Testing

Finally, the huge rewrite is almost done!

It is beta

It is currently only accessible through GitHub, in the beta branch. It uses another highly experimental module: filecache. This module should provide a quick and easy method for caching the result of an IO computation on a file. It uses hinotify for cache expiration, and is completely untested!

The behaviour of the interpreter is radically different from the previous version in many ways. It should hopefully be a better approximation of the “real” Puppet implementation. This is one of the major goals of this rewrite, especially concerning resource dependencies, which were totally off the mark in the previous version.

On the other hand, most of the documentation has disappeared.

Performance improvements

The second goal is to improve performance. In order to do so, several radical changes have been made:

  • The Ruby interpreter now lives in its own system thread; as a consequence, this package only works with the threaded runtime!
  • The “unresolved” types that lived between parsing and final interpretation have been ditched.
  • All important types are now strict, and the strict-base-types package is put to good use.

I do not have hard numbers right now, but now that everything is strict, using more cores actually speeds things up. This is still not perfect and will require tuning, but it is much better than the previous state of affairs. The parser seems a bit slower (between 10% and 30%), but the interpreter is much faster. This means that a single test run is slightly slower (about 5%), but future embedding in a long-lived process will give a huge performance increase.

Nicer codebase

First of all, the lens package is now used everywhere. This might look like obfuscation for those not used to it, but I believe it is incredibly nice, especially when working with a complex state.

The parser has been completely rewritten using the parsers abstraction, with a little hack to use Parsec underneath with my own “lexeme” function. I would have liked to use trifecta for its nicer error messages, but I could not figure out how to do the same trick with it.

Speaking of which, almost all text is now pretty-printed in color. This is noticeably slower when rendering a large catalog, but so much nicer on the eyes that it is well worth it.

The “puppetresources” and “facter” modules are now merged, and will be marked as deprecated when this version stabilizes.

There is still a lot of work to do on it, but the testing API should be much easier to use. I am considering writing a DSL for it, so that testing things would not require a development environment.

Finally, the many tuples that used to litter type signatures have been replaced by proper types, making function signatures much more expressive.

before.hs
_unresolvedRels :: ![([(LinkType, GeneralValue, GeneralValue)], (T.Text, GeneralString), RelUpdateType, SourcePos, [[ScopeName]])],
after.hs
_extraRelations     :: ![LinkInformation]
data LinkInformation = LinkInformation { _linksrc  :: !RIdentifier
                                       , _linkdst  :: !RIdentifier
                                       , _linkType :: !LinkType
                                       , _linkPos  :: !PPosition
                                       }

Full Rewrite in Progress

In the process of writing the language-puppet library, I learned quite a lot about Haskell and its libraries. The first part of language-puppet to be written was the parser. At that time I did not understand monads, brute-forced the do-notation until it seemed to do what I wanted, and generally made all kinds of blunders. The other problem was that I was learning Puppet too, at a time when it was changing a lot and nothing was really documented. This led to unfortunate decisions that I have already documented.

I decided to rewrite everything from scratch, directly implementing everything I could find in the reference. I started a new parser over the weekend, encoding as many checks as possible in it, and then tried it on real manifests. Boy, was I naïve! It did not work at all. The specification is good for learning the language or dispelling some common misconceptions, but is of moderate use for my purpose. I relaxed most of the checks and it seems to work now.

[Screenshot: colored pretty-printed output]

On the technical side, I am now using the parsers package, which has a very nice interface. I considered using trifecta as the underlying parser. Its error messages are gorgeous, but it turns out it is not trivial to get my own lexeme system in place with it. I went with parsec and, instead of using the parsec-parsers package, wrote my own instances (to be honest, I copy-pasted those of the package and added a non-default definition of token). Edward Kmett was nice enough to give me pointers on how to do this with trifecta, but it did look quite clumsy. He hinted that he might work on a monad-transformer approach to this problem, so I am just waiting for this to happen. The nice thing about the parsers approach is that switching later is trivial.
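The gist of the “lexeme” trick looks like this in plain Parsec (an illustrative sketch, not the actual parser code): every token consumes the whitespace and comments that follow it, so the grammar rules never have to mention them.

LexemeSketch.hs
module LexemeSketch where

import Control.Monad (void)
import Text.Parsec
import Text.Parsec.String (Parser)

-- the "space consumer": skips whitespace and '#' line comments
sc :: Parser ()
sc = skipMany (void space <|> comment)
  where comment = void (char '#' >> manyTill anyChar (void newline <|> eof))

-- every token is responsible for the spaces and comments trailing it
lexeme :: Parser a -> Parser a
lexeme p = p <* sc

symbol :: String -> Parser String
symbol = lexeme . string

identifier :: Parser String
identifier = lexeme (many1 (alphaNum <|> char '_'))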

As can be seen in the previous screenshot, I am using a nice pretty-printing library that lets me (ab)use color.

Another huge difference is that I now use strict types whenever possible. The previous version seemed to be able to support an arbitrary number of worker threads within 300MB of memory for my catalogs, whereas the Puppet version could go up to 800MB for a single thread. I would like to at least halve this figure for the next version.

The next step is to write the new daemon infrastructure. I already have a generic file-cache module that lets you cache things related to files. When a file is modified, the cached value is automagically invalidated (using inotify). I hope this will work well in practice and will not block all the other threads.
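The idea behind it can be sketched in a few lines (illustrative only; the real module needs to be more careful about races and errors):

FileCacheSketch.hs
{-# LANGUAGE LambdaCase #-}
module FileCacheSketch where

import Control.Concurrent.MVar
import qualified Data.Map.Strict as M
import System.INotify

newtype FileCache a = FileCache (MVar (M.Map FilePath a))

newFileCache :: IO (FileCache a)
newFileCache = FileCache <$> newMVar M.empty

-- Return the cached value for this file, or run the computation, cache
-- the result, and invalidate the entry when inotify reports a change.
-- (Note: recent hinotify versions take a ByteString path instead of FilePath.)
query :: INotify -> FileCache a -> FilePath -> IO a -> IO a
query ino (FileCache mv) path compute =
    M.lookup path <$> readMVar mv >>= \case
        Just v  -> return v
        Nothing -> do
            v <- compute
            modifyMVar_ mv (return . M.insert path v)
            _ <- addWatch ino [Modify, MoveSelf, DeleteSelf] path
                     (\_ -> modifyMVar_ mv (return . M.delete path))
            return v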

Hruby Package Released

I finally released the hruby package, along with an updated version of the language-puppet package. It is very unfortunate that this package will never get proper haddocks on Hackage, as the documentation is quite useful. If someone has a suggestion for generating haddocks without having the ruby1.8 library installed, I am interested. Also, the path to the Ruby include files is hardcoded, meaning it might require manual tweaking to get it right.

Both libraries now have build flags:

  • Hruby has a flag for ruby1.9. This flag is mostly cosmetic, as I didn’t even test it and just copied the files from ruby1.8.
  • Language-puppet now has a -fhruby option, to build with this library.

The immediate result is a twofold speed increase for single runs of puppetresources, and a sixfold speed-up for scripts computing several catalogs. The reason is that the parser is not too fast, but its results get cached. Also, the language-puppet daemon infrastructure still lets you define the number of threads that should be spawned to compute templates. This should be set to 1, as the Ruby interpreter cannot be used in a thread-safe way.

There are still several issues to address. The first is related to the multiple variable assignment problem. In Puppet, all variables are immutable and can’t be reassigned. Well, except when overwriting variables belonging to an inherited class. I wish inheritance had never been introduced, as it brings all kinds of special rules, and seems generally fragile. Moreover, given how I (have not) implemented scopes, it is not trivial to check in a robust way whether an overwrite is valid or not.

The second most important issue is that the dependency system isn’t working as it should. I still get dependency loops in Puppet that are not caught by language-puppet. This is a show-stopper, and must be fixed soon. It is however a big challenge.

Finally, as language-puppet is Linux-only for now, I would like to start using the inotify feature. The current caching mechanism works by issuing stat system calls on all files that might have changed. Inotify would greatly reduce the number of system calls, which is always a good thing. I am not sure this would lead to a big speed increase however.

Embedded Ruby Interpreter and Performance Increase

Despite hitting a nasty (but obvious) bug involving Ruby’s GC, it seems that the feature is now stable.

I have been using a script for a while that computes catalogs for 30 nodes, taking into account exported resources, and runs some tests on the results. This script used to run in around 50 seconds. On my puppet master, the combined catalog generation time for those hosts is around 10 minutes¹.

Language-puppet was about ten times faster than the original implementation, but was wasting a significant amount of time spawning Ruby processes, rendering gobs of data (the list of all known variables and their values), and feeding them to said process, for each template evaluation. On the Ruby side, the data was interpreted (with eval), the templates were loaded and interpolated, and the response spat back to the Haskell executable. For this reason I wrote a minimalist template parser that is capable of interpolating the simplest ones while staying in Haskell land.
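As an illustration of how little such a parser needs to handle, here is a toy version (not the actual language-puppet code) that only supports bare <%= variable %> interpolation:

SimpleErb.hs
module SimpleErb where

import Data.List (isPrefixOf)
import qualified Data.Map.Strict as M

-- Interpolate <%= variable %> from a map of known variables; anything
-- fancier makes the whole template fail (Nothing), in which case we
-- would fall back to the real ERB interpreter.
simpleErb :: M.Map String String -> String -> Maybe String
simpleErb _    [] = Just []
simpleErb vars ('<':'%':'=':rest) = do
    (expr, rest') <- breakOn "%>" rest
    val <- M.lookup (trim expr) vars
    (val ++) <$> simpleErb vars rest'
simpleErb vars (c:cs) = (c:) <$> simpleErb vars cs

-- split on the first occurrence of the needle, dropping it
breakOn :: String -> String -> Maybe (String, String)
breakOn needle = go []
  where
    go _ [] = Nothing
    go acc s@(c:cs)
        | needle `isPrefixOf` s = Just (reverse acc, drop (length needle) s)
        | otherwise             = go (c : acc) cs

trim :: String -> String
trim = strip . strip where strip = reverse . dropWhile (== ' ')

-- >>> simpleErb (M.fromList [("hostname","back1")]) "server <%= hostname %>:80"
-- Just "server back1:80"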

Now the Ruby process is embedded, and variable resolution happens only when needed, by providing a callback Haskell function to the Ruby runtime.

The whole script now runs in less than 10 seconds (six if you omit the tunnelled accesses to PuppetDB). It is now acceptable to run it before almost every commit, which was the goal. It will help make sure nothing gets (too) broken, especially with regard to exported resources.

The software is now stable enough, and I will probably prepare a new binary release soon, along with a Debian-style repository.


¹ This is not a fair comparison however. My script queries the PuppetDB for facts using an SSH tunnel, whereas the puppet master is local. On the other hand, the puppet master does stuff my script doesn’t, such as updating facts and reporting data into PuppetDB (in all fairness, my script updates a local PuppetDB-like database). I do not believe this accounts for an important fraction of those ten minutes, but I might be wrong. Also, the puppet master has a faster CPU, and does not run unit tests on the catalogs.

Incoming: Ruby Bridge

I am working on a quick and dirty Ruby bridge library, which I hope will yield a huge performance gain for template interpolation in the language-puppet library. Right now, it is capable of:

  • Initializing a Ruby interpreter from libruby
  • Calling Ruby methods and functions
  • Registering methods or functions that will be called from Ruby code
  • Converting data between the two worlds (right now the most complex instance is the JSON one, which means that many complex Ruby types can’t be converted, but it is more than enough for passing data)
  • Embedding native Haskell values that can be passed around in Ruby to the Haskell-provided external functions (I will use this for passing the Puppet catalog state around)

There are still a few things to do before releasing it:

  • Making compilation a bit less dependent on the system. This will probably require quite a few flags in the cabal definition …
  • Hunting for memory leaks. I am not sure how to do this with the GHC Runtime in the middle, and I do hope that ruby_finalize frees everything that is managed by the Ruby runtime. After all, restarting processes seems to be the only working garbage collection method for Ruby daemons …
  • Writing stubs for the Puppet library methods that might be needed by templates. I would like to be able to support custom types and functions directly written in Ruby instead of Lua, but this will probably turn into a nightmare …
  • Cleaning things up!

Here is a quick code preview:

test.hs
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Foreign.Ruby.Bindings
import Data.Aeson
import Data.Attoparsec.Number

-- this is an external function that will be executed from the Ruby interpreter
-- the first parameter to the function is probably some reference to some top object
-- my knowledge of ruby is close to nonexistent, so I can't say for sure ...
extfunc :: RValue -> RValue -> IO RValue
extfunc _ v = do
    -- deserialize the Ruby value into some JSON Value
    onv <- fromRuby v :: IO (Maybe Value)
    -- and display it
    print onv
    -- now let's create a JSON object containing all kind of data types
    let nv = object [ ("bigint" , Number (I 16518656116889898998656112323135664684684))
                    , ("int", Number (I 12))
                    , ("double", Number (D 0.123))
                    , ("null", "Null")
                    , ("string", String "string")
                    , ("true", Bool True)
                    , ("false", Bool False)
                    , ("array", toJSON ([1,2,3,4,5] :: [Int]))
                    , ("object", object [ ("k", String "v") ] )
                    ]
    -- turn it into Ruby values, and return this
    toRuby nv

-- this is the function that is called if everything was loaded properly
nextThings :: IO ()
nextThings = do
    -- turn the extfunc function into something that can be called by the Ruby interpreter
    myfunc <- mkRegistered2 extfunc
    -- and bind it to the global 'hsfunction' function
    rb_define_global_function "hsfunction" myfunc 1
    -- now call a method in the Ruby interpreter
    o <- safeMethodCall "MyClass" "testfunc" []
    case o of
        Right v -> (fromRuby v :: IO (Maybe Value)) >>= print
        Left r -> putStrLn r

main :: IO ()
main = do
    -- initialize stuff
    ruby_init
    ruby_init_loadpath
    -- and load "test.rb"
    s <- rb_load_protect "test.rb" 0
    if s == 0
        then nextThings
        else showError >>= putStrLn

And here is the Ruby program, which calls our external function:

test.rb
class MyClass
    def self.testfunc
        hsfunction( [16588,
                    "qsqsd",
                    true,
                    { 'a' => 'b' },
                    :symbol,
                    0.432,
                    5611561561186918918918618789115616591891198189123165165889 ]
                ).each do |k,v|
            puts "#{k} => #{v} [#{v.class}]"
        end
        12
    end
end

And the output, showing that data is properly converted in both directions:

Just (Array (fromList [Number 16588,String "qsqsd",Bool True,Object fromList [("a",String "b")],String "symbol",Number 0.432,Number 5611561561186918918918618789115616591891198189123165165889]))
bigint => 16518656116889898998656112323135664684684 [Bignum]
int => 12 [Fixnum]
double => 0.123 [Float]
array => 12345 [Array]
true => true [TrueClass]
null => Null [String]
string => string [String]
object => kv [Hash]
false => false [FalseClass]
Just (Number 12)

EDIT: added link to the code.

Language-puppet v0.4.0

I just released the latest language-puppet version. For the full list of changes, please take a look at the changelog. Here are the highlights.

PuppetDB code reworked

The PuppetDB code and API have been completely overhauled. They are now more generic: the resource collection and Puppet query functions now work the same way. Additionally, a PuppetDB stub has been created for testing purposes.

Better diagnostic facilities

As the main use of this library is to test stuff, the following features were added:

  • Several error messages have been reworked so that they are more informative.
  • A dumpvariables built-in function has been added. It just prints all known variables (and facts) to stdout, and can be quite handy.
  • The “scope stack” description is stored with the resources. This turned out to be extremely useful when debugging resource name collisions, or to find out where some resource is defined.

Here is an example: let’s say you do not remember which part of your manifests installs the collectd package. Just run this:

» puppetresources . default.domain 'Package[collectd]'

package {
    "collectd": #"./modules/collectd/manifests/base.pp" (line 4, column 9) ["::","site::baseconfig","collectd","collectd::client","collectd::base"]
        ensure          => "installed",
        require         => [Class["collectd"], Class["collectd::base"], Class["collectd::client"], Class["site::baseconfig"]];
}

You now know exactly where the package resource is declared, and the list of “scopes” that were traversed in order to do so. Note that this information is also displayed when resource names collide.

Easier to set up

This library no longer depends on a newish bytestring, and should build with the package provided with GHC compilers of the 7.6.x series.

This is not yet done, but I will certainly soon publish a Debian-style repository of the compiled puppetresources binary. I am interested in suggestions for an automated build system.

Better testing

The testing API seems sufficient to write pretty strong tests, but would still benefit from a few more helper functions. The testing “daemon” has been reworked to use the new PuppetDB stub. It makes it possible to test complex interactions between hosts using the exported resource or PuppetDB query features.

Work in progress

I will probably lensify the code until I get a decent understanding of it.

I do not intend to work on Hiera emulation just yet, as I am probably the only user of this library for now and I do not use this feature.

One area of improvement would be to embed the Ruby interpreter in the library. I am not sure how to do this, but as there are quite a few lightweight interpreter projects sprouting from the earth, it might be possible in the near future. The only problem would be figuring out how to build a large C project with cabal.

Some other considerations

I recently ported the code from random.c to Haskell (here). This has been quite tedious, and the result is quite hard to read. It is an almost naive port of the code found in the Ruby interpreter, without the useless loop variables. For some reason, there are many loops like this:

i=1; j=0;
k = (N>key_length ? N : key_length);
for (; k; k--) {
    mt->state[i] = (mt->state[i] ^ ((mt->state[i-1] ^ (mt->state[i-1] >> 30)) * 1664525U))
        + init_key[j] + j; /* non linear */
    mt->state[i] &= 0xffffffffU; /* for WORDSIZE > 32 machines */
    i++; j++;
    if (i>=N) { mt->state[0] = mt->state[N-1]; i=1; }
    if (j>=key_length) j=0;
}

As you can see, the value of k is never used inside the loop body; it only counts iterations. I am not sure why the author didn’t go for something like:

for(i=1;i<k;i++) {

Anyway, the Haskell code is pretty bad, and will certainly only work on 64-bit builds. I am not sure how I should have written it. I suppose staying in the ST monad would have led to nicer code, and I am open to suggestions.
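For the record, here is how that key-mixing loop could look with a mutable STUArray, which stays close to the C original (a sketch: it assumes a non-empty initKey, and relies on Word32 arithmetic wrapping like the C code’s 0xffffffff masking):

MixKey.hs
module MixKey where

import Control.Monad.ST
import Data.Array.ST (STUArray, readArray, writeArray)
import Data.Bits (shiftR, xor)
import Data.Word (Word32)

n :: Int
n = 624

mixKey :: [Word32] -> STUArray s Int Word32 -> ST s ()
mixKey initKey mt = go 1 0 (max n keyLen)
  where
    keyLen = length initKey
    go _ _ 0 = return ()
    go i j k = do
        prev <- readArray mt (i - 1)
        cur  <- readArray mt i
        writeArray mt i $
            (cur `xor` ((prev `xor` (prev `shiftR` 30)) * 1664525))
              + (initKey !! j) + fromIntegral j   -- non linear
        let j' = if j + 1 >= keyLen then 0 else j + 1
        if i + 1 >= n
            then do v <- readArray mt (n - 1)    -- state[0] = state[N-1]; i = 1
                    writeArray mt 0 v
                    go 1 j' (k - 1)
            else go (i + 1) j' (k - 1)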

The Hslogstash Package

This blog post is not about language-puppet, but might be of interest to my fellow sysadmins with an interest in Haskell. I recently worked with Logstash in a way that might not be typical, as all my messages are emitted by services that are Logstash-aware: they directly write JSON messages to the TCP input of the Logstash server. This means that most of the features (and some would say, the whole point) of Logstash were of no use to me.

I stuck with my grand mission of rewriting the handful of useful Ruby programs, and wrote a new package. I based almost everything around the excellent conduit abstraction. It has the following features:

  • Haskell types for representing Logstash messages, along with the type-classes necessary for converting them from and to JSON
  • An ElasticSearch conduit, using the bulk insert API
  • A Redis source using the pipelining features of the hedis package, and a simple Redis sink
  • A Logstash listener, based on the TCP listener from network-conduit, able to accept latin1 and UTF-8 messages at the same time
  • A pair of “retrying” sinks, one using a Socket and the other establishing TCP connections. They are used for guaranteed delivery of a whole ByteString segment, retrying until it is sent (this is obviously useful for JSON messages; see the sketch after this list)
  • A few functions for handling bulk APIs in Conduits
  • And finally, the coolest part, a few helper functions that will let you route between conduits!
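Here is a minimal sketch of what such a retrying TCP sink can look like (illustrative only, not the actual hslogstash code): every ByteString is sent over a fresh connection, and delivery is retried until the whole segment has been written.

RetrySink.hs
module RetrySink where

import Control.Concurrent (threadDelay)
import Control.Exception (SomeException, try)
import Control.Monad.IO.Class (liftIO)
import qualified Data.ByteString as B
import Data.Conduit
import Network.Socket
import Network.Socket.ByteString (sendAll)

-- every incoming ByteString is delivered in full, retrying forever
retryingTcpSink :: String -> String -> ConduitT B.ByteString o IO ()
retryingTcpSink host port = awaitForever (liftIO . loop)
  where
    loop msg = do
        r <- try (sendOnce msg) :: IO (Either SomeException ())
        case r of
            Right () -> return ()
            Left _   -> threadDelay 500000 >> loop msg  -- wait 0.5s, retry
    sendOnce msg = do
        -- partial pattern: fine for a sketch, getAddrInfo throws on failure
        addr:_ <- getAddrInfo (Just defaultHints { addrSocketType = Stream })
                              (Just host) (Just port)
        s <- socket (addrFamily addr) (addrSocketType addr) (addrProtocol addr)
        connect s (addrAddress addr)
        sendAll s msg
        close s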

The routing feature came out of a little discussion on the Haskell-Cafe mailing list. It is built with stm-conduit, which already has a helper function for merging sources. This package introduces the other useful functionality: the ability to “route” items coming from a source to several sinks. The main function, branchConduits, works by taking a Source, a routing function, and a list of Sinks. The routing function associates a (possibly empty) list of integers with every item coming from the Source. These integers map directly to the corresponding Sinks, letting you define the routing policy.
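To convey the routing idea, here is a simplified, channel-free take (this is not the actual branchConduits implementation, which goes through stm-conduit): each sink is wrapped in a filter on its own index, and the wrapped sinks are combined with ZipSink so that every item is offered to each of them.

BranchSketch.hs
module BranchSketch where

import Data.Conduit
import qualified Data.Conduit.Combinators as CC
import Data.Foldable (traverse_)
import Data.Void (Void)

-- route every item to the sinks whose indices the routing function returns
branchSketch :: Monad m
             => (a -> [Int])            -- routing function
             -> [ConduitT a Void m ()]  -- sinks
             -> ConduitT a Void m ()
branchSketch route sinks =
    getZipSink (traverse_ ZipSink (zipWith wrap [0 ..] sinks))
  where
    wrap i snk = CC.filter (\x -> i `elem` route x) .| snk

Being sequential, this sketch lets one slow sink block all the others; the channel-based version avoids exactly that.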

The package includes a few examples of common tasks, all of them with acceptable runtime performance, such as:

  • Moving messages from a TCP server to Redis
  • Moving messages from Redis to Elasticsearch
  • Routing messages between conduits

So if you need more control, or much better performance, than what you would get from Logstash, and you are not afraid to write (a lot of) code, please use this package and let me know what is missing and/or buggy!

Sneak Peek at the Language-puppet PuppetDB Testing Features

I always thought that one of the most rewarding effects of Puppet is that the whole system gets configured automatically as nodes are added. For me, the main, and for a long time sole, manifestation of this property is in the configuration of the Nagios servers. The built-in types lend themselves pretty well to this exercise.

Now, with PuppetDB, we have a simple and powerful way to create new effects, beyond what could be achieved with just exported resources (I believe it used to be possible before PuppetDB, but required black magic in the template files). I will demonstrate a typical use case, along with a sneak peek of the testing features that will appear in the next version of language-puppet.

Let’s say we have an HTTP proxy and several groups of servers acting as backends. You wish to be able to add servers to the pool just by running the agent on them. The site.pp should look like this:

node 'proxy' {
    include haproxy
}

node 'back1' {
    haproxy::backend { $hostname:
        backend_type   => 'web',
        backend_server => $hostname,
        backend_port   => 80;
    }
}
node 'back2' {
    haproxy::backend { $hostname:
        backend_type   => 'web',
        backend_server => $hostname,
        backend_port   => 80;
    }
}

The haproxy::backend define is empty, and the interesting part is in the haproxy class:

class haproxy {
    $backends = pdbresourcequery(
        ['and',
            ['=',['node','active'],true],
            ['=','type','Haproxy::Backend']
        ],'parameters')

    file { '/etc/haproxy/haproxy.cfg':
        content => template('haproxy/config.erb');
    }
}

The pdbresourcequery function comes from this excellent module, and has been included natively in language-puppet for a while. Its effect here is to fill the $backends variable with an array containing the parameters of all resources of type Haproxy::Backend on any active node.

But now comes the complicated part: how are you supposed to write, and, more importantly, to test, the config.erb template? As far as I know you can’t pull this off with puppet-rspec (and it is way too slow anyway). With the new testing API, you can write a simple program like this:

Main.hs
module Main where

import qualified Data.Map as Map
import Control.Monad (void)
import Puppet.Testing
import Puppet.Interpreter.Types
import Facter

main :: IO ()
main = do
    qfunction <- testingDaemon Nothing "." allFacts
    void $ qfunction "back1"
    void $ qfunction "back2"
    (proxycatalog, _, _) <- qfunction "proxy"
    case Map.lookup ("file","/etc/haproxy/haproxy.cfg") proxycatalog of
        Nothing -> error "could not find config file"
        Just f  -> case Map.lookup "content" (rrparams f) of
                       Just (ResolvedString s) -> putStrLn s
                       _ -> error "could not find content"

Line by line, here is what this program does:

  • lines 1-10: various headers
  • line 11: the catalog-computing function is initialized, using the new testing system
  • line 12: the catalog for the node back1 is computed, and stored into the fake PuppetDB
  • line 13: same thing for back2
  • line 14: same thing for proxy, but this time we keep the final catalog
  • lines 15-19: the content of /etc/haproxy/haproxy.cfg is displayed. This part is terrible and will be replaced by a helper soon.

The template groups the resources by their “backend_type” attribute, creates a backend block for each of them, and populates the blocks with the corresponding backends.

modules/haproxy/templates/config.erb
<%- backends = scope.lookupvar('haproxy::backends').group_by do |x| x["backend_type"] end -%>
<%- backends.each do |backendname, backends| -%>
backend <%= backendname %>
    <%- backends.each do |backend| -%>
        server <%=backend["backend_server"]%> <%=backend["backend_server"]%>:<%=backend["backend_port"]%>
    <%- end -%>
<%- end -%>

And the output is:

backend web
        server back2 back2:80
        server back1 back1:80

It works! With this feature, it will soon be possible to test and experiment with the most complex aspects of inter-node interactions.