Grab the alpha version of
Hspuppetmaster (compiled for x64 Linux)! It will become a full-fledged
replacement for the default puppetmaster, but is still not ready for prime-time.
In order to use it, you must:
untar everything on your puppetmaster host
run it with ./hspuppetmaster /etc/puppet +RTS -N (the first argument is the
location of your puppet repository) as the puppet user
modify your web server configuration to redirect requests for
/production/catalog to 127.0.0.1:3000. This can be done in Apache by
disabling the “High Performance mode” (wasted a few hours on this one), and
adding something like this to your vhost configuration:
There are quite a few things it will not do (such as updating the facts in
PuppetDB), but you should experience much better catalog compilation times (I
currently have catalogs that take longer than the default two-minute timeout to
compile with Puppet) and, sometimes, much clearer error messages. As it is based
on language-puppet, it is generally much stricter than Puppet. For example, it
will fail on any variable that cannot be resolved.
Please give it a try and let me know how it worked for you (do use --noop!).
The language-puppet library was created when I started to learn Haskell. As
a consequence, it uses the dreaded String type to store all kinds of textual
values. It also uses the System.IO module for performing I/O. I was aware of
the file descriptor leak problem that happens when you use readFile, so I
chose the following implementation for the Puppet file function:
This should return Just the content of the first readable file in the
parameter list, or Nothing if there are none, and should not leak any file
descriptor. Now that I am finalizing the hspuppetmaster binary, I can use my
library to (try to) compute catalogs on my production systems, using the
standard puppet agent -t --noop. It turned out that the file function was
misbehaving. Testing it in GHCi illustrates the problem:
It seems to work fine, except all file contents are empty. This behavior seems
to be common knowledge among Haskellers, and is due to the fact that the file
descriptor is closed before the output is evaluated. This is pretty horrible
(and surprising), and what is even worse is my solution:
It is a bit longer because of the use of the non-deprecated version of catch,
and because it explicitly forces evaluation of the output of hGetContents.
This behavior was extremely surprising to me, and I would like to thank the
people on #haskell for their help in devising a correct version (mine was
along the lines of !y <- hGetContents, which worked for my simple examples,
but was certain to fail at some point). This is the only IRC channel I know
of where people are at the same time active, always helpful, and knowledgeable.
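For readers who want something to experiment with, here is a minimal sketch of the idea described above. It is not the code from the library: the names readFileStrict and firstReadable are made up for this example, and it uses try rather than catch to keep things short. The important part is the call to evaluate, which forces the whole content while the handle is still open.

```haskell
import Control.Exception (IOException, evaluate, try)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical helper: read a file and force its whole content
-- before withFile closes the handle.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  content <- hGetContents h
  -- length walks the entire string, so nothing is left unevaluated.
  _ <- evaluate (length content)
  return content

-- Hypothetical helper: Just the content of the first readable file,
-- or Nothing if none of them can be read.
firstReadable :: [FilePath] -> IO (Maybe String)
firstReadable [] = return Nothing
firstReadable (p : ps) = do
  r <- try (readFileStrict p) :: IO (Either IOException String)
  case r of
    Right c -> return (Just c)
    Left _ -> firstReadable ps
```

Forcing the length is the crudest way to defeat lazy I/O. Reading the file through a strict ByteString or Text API would avoid the problem altogether, at the cost of moving away from String.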
The main goal of this project is, for now, to assist sysadmins editing their catalogs. The best illustration is the puppetresources application. It can:
Check a file's syntax, and print what it thinks it is.
Compute a whole catalog and display it in human-readable format or JSON.
Display details about a specific resource in a catalog, including special support for file contents (useful for debugging templates).
And do the two previous items using facts and/or queried data from a real PuppetDB.
It is also fast enough to compute the catalogs of all your nodes in reasonable time, which opens possibilities you would not even dream of in the Ruby Puppet world. One of them is writing “integration tests” that let you check properties related to complex environmental interactions between hosts.
In order to facilitate this, I am in the process of writing a full-fledged testing API (it is still a bit lacking). It is strongly inspired by other testing APIs and should quickly evolve into something that is very easy to use. It is not the current focus (which is to replace an actual Puppet Master with my software), but I have already implemented a test that is built into the puppetresources executable: it now checks that each source parameter in each file resource points to an actual file. This is a common error pattern for me (forgetting to create the file, mistyping its name, or placing it in the wrong directory) that has now disappeared.
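To give an idea of what such a check involves, here is a small sketch. It is not the test built into puppetresources: the Resource record, sourceToPath and missingSources are invented for this example, and it only understands sources of the form puppet:///modules/MODULE/PATH, which Puppet serves from the files directory of the module. Anything else is silently skipped.

```haskell
import Data.List (stripPrefix)

-- Hypothetical, much-reduced resource: type, title and parameters.
data Resource = Resource
  { rType :: String
  , rTitle :: String
  , rParams :: [(String, String)]
  }

-- Map puppet:///modules/MODULE/PATH to modules/MODULE/files/PATH.
sourceToPath :: String -> Maybe FilePath
sourceToPath src = do
  rest <- stripPrefix "puppet:///modules/" src
  case break (== '/') rest of
    (modName, '/' : file)
      | not (null modName), not (null file) ->
          Just ("modules/" ++ modName ++ "/files/" ++ file)
    _ -> Nothing

-- Titles of the file resources whose source maps to a missing file.
-- The existence check is a parameter, so this function stays pure.
missingSources :: (FilePath -> Bool) -> [Resource] -> [String]
missingSources exists rs =
  [ rTitle r
  | r <- rs
  , rType r == "file"
  , Just src <- [lookup "source" (rParams r)]
  , Just path <- [sourceToPath src]
  , not (exists path)
  ]
```

Passing the existence check as a function keeps the example testable without touching the disk. A real check would back it with the file system, for instance with doesFileExist from System.Directory.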
Oh, by the way, a new version is out! Version 0.3.2 mainly changes the license, from GPL3 to BSD3. The choice was dictated by the sudden outburst of horribly uninteresting posts about licensing that has plagued Haskell-cafe during the last few hours. I hope this will end soon, or it will not be possible to differentiate this mailing list from that of Debian.
A new version is already out, this time with JSON catalog generation. It is not
properly tested, but Puppet seems to accept the catalogs. If someone knows how to
get puppet catalog apply to download files from a Puppet server, I am interested.
I will probably write a sample application on top of Warp and modify the
configuration of my Puppetmaster to redirect catalog requests towards it. This
means that there could be an efficient replacement for the Puppetmaster soon.
This version introduces resource relationships handling. It is also full of
nasty bugs :) An improved version is already in the works, along with great
features.
First of all, you will now get notifications when a resource is missing or when
you have created cycles. There are still some bugs:
The aliases are not taken into account.
The relationship metaparameters on classes are ignored.
With the released version, nothing is actually working. Sorry … I realized
too late how broken it was. You might want to check github, or the
updated binary packages.
This is the kind of error message you will get when cycles are found:
puppetresources: The following cycles have been found:
File[/a]
-> File[/b] ["./manifests/site.pp" (line 557, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 556, column 9))
-> File[/c] ["./manifests/site.pp" (line 558, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 557, column 9))
-> File[/d] ["./manifests/site.pp" (line 559, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 558, column 9))
-> File[/e] ["./manifests/site.pp" (line 560, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 559, column 9))
-> File[/f] ["./manifests/site.pp" (line 561, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 560, column 9))
-> File[/g] ["./manifests/site.pp" (line 562, column 9)] link is Just (RRequire,UNormal,"./manifests/site.pp" (line 561, column 9))
Please note how each resource and link position is displayed. This is a lot more
verbose than what the vanilla Puppet spouts, but I believe it is also much more
useful. Chasing after links defined as a resource chain gets old very fast.
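For the curious, finding such a cycle does not take much code. The sketch below is not the implementation used by puppetresources: the Graph type and findCycle are made up for this example, the graph is a plain association list from a resource to the resources it requires, and no source positions are tracked.

```haskell
import Data.Maybe (fromMaybe, listToMaybe, mapMaybe)

type Node = String

-- Hypothetical representation: a resource and the resources it requires.
type Graph = [(Node, [Node])]

-- Depth-first search that carries the current path (most recent first).
-- Meeting a node that is already on the path means a cycle was closed.
findCycle :: Graph -> Maybe [Node]
findCycle g = listToMaybe (mapMaybe (go []) (map fst g))
  where
    go path n
      | n `elem` path = Just (n : reverse (takeWhile (/= n) path))
      | otherwise =
          listToMaybe (mapMaybe (go (n : path)) (fromMaybe [] (lookup n g)))
```

This version keeps no set of already explored nodes, so it can be very slow on large graphs where many paths share the same nodes. It is only meant to show the principle.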
This is the current error message when a relationship points to (or from) an
unknown resource:
puppetresources: Unknown relation ("file","/b") -> ("file","/a") used at "./manifests/site.pp" (line 556, column 9) debug: (False,True,False,False)
This one is terrible, and will need to be reworked. There is still quite a bit
of work, but I am fairly pleased at how everything seems to fall into place.
I started this project at the end of April, 7 months ago. The project is
incredibly useful as it is right now, and I am confident I will be able to
provide a robust and vastly more efficient puppetmaster, with a ton of helpful
tools, before the first anniversary of this project.
A new version is out. Actually, a pair of versions have been released. The most
important change is the fix for a serious bug in how default values worked.
The other visible improvements will be felt on the performance side,
as discussed in the previous entry.
v0.2.2
New features
A few statistics are exported.
v0.2.1
Bugs fixed
The defaults system was pretty much broken; it should be better now.
New features
Basic testing framework started.
create_resources now supports the defaults system.
defined() function works for resource references.
in operator implemented for hashes.
Multithreading works.
The ruby <> daemon communication is now over ByteStrings.
The toRuby function has been optimized, doubling the overall speed for
rendering complex catalogs.
One of the worst decisions made when designing language-puppet was
to use the String type. It is well known that this type is horribly slow. I
had a few tests to run and they were taking forever: it took about 11 seconds
to compute 5 catalogs, which is annoying when you are doing this often. As a
comparison, the puppetmaster on a faster computer (it is an X5660, whereas my
workstation uses an i7 860), running the latest Puppet, takes 23 seconds to
compute those 5 catalogs sequentially.
A quick profiling revealed the following:
The program couldn’t use more than one CPU.
Almost all the time was spent computing templates.
I refactored the code a bit during lunch. The Daemon code was supposed to be
multithreaded, and was designed as such. The only problem was that I forgot
to start several worker threads. This is now fixed.
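As an illustration of what "starting several worker threads" can look like, here is a minimal sketch built only on base. It is not the Daemon code: startWorkers, runAll and the Job type are names invented for this example, and error handling is left out (a worker that throws would leave its caller waiting forever).

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, replicateM_)

-- A job pairs an input with the MVar that will receive its result.
type Job a b = (a, MVar b)

-- Start n worker threads that all read from the same queue.
startWorkers :: Int -> (a -> IO b) -> IO (Chan (Job a b))
startWorkers n work = do
  queue <- newChan
  replicateM_ n $ forkIO $ forever $ do
    (input, out) <- readChan queue
    work input >>= putMVar out
  return queue

-- Submit every input, then wait for the results in submission order.
runAll :: Chan (Job a b) -> [a] -> IO [b]
runAll queue inputs = do
  outs <- mapM submit inputs
  mapM takeMVar outs
  where
    submit x = do
      out <- newEmptyMVar
      writeChan queue (x, out)
      return out
```

With a single worker this behaves like the sequential version, which is exactly the mistake described above. To actually use several cores, the program also has to be built with -threaded and run with +RTS -N.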
The template computing time was reduced by spawning several threads for this
task (see previous commit), and converting the String code to ByteString,
using builders. The time is now mostly spent in the renderString function,
defined as:
I went with the definition in here, but it is much slower. If somebody has a
better implementation, please let me know.
The ByteString move reduced the time it took to compute the catalogs to about
6 seconds, and parallelisation reduced it to about 2 seconds. It is a 5.5x
speedup (from 11 seconds to 2) for a few minutes of work, not too bad. The template generation still
takes most of the running time (50% of the time is spent spawning and waiting
for the Erb code, 20% preparing the inputs for the Ruby processes).
Nice speedups could arise from parsing more complex Ruby expressions from the
Haskell code, but it is not a priority now that the performance is acceptable.
You can grab the latest github repo to test it, but you will need a very recent
bytestring, and you will need to fix the cabal file for luautils.
A tentative test framework is included. It took a few minutes of hacking, and
now the puppetresources application checks that file sources are valid. It
doesn’t work for “private” stuff.
Quick note on versioning. The official version number is 0.2.0, but I have
been omitting the first zero on this blog. Anyway, this version includes two
features, one of them built on the hslua library.
A huge bug was fixed: defaults finishing with a comma were interpreted as a
function working on a hash! For example:
File{owner=>'root',}
Was seen as:
file({'owner'=>'root'})
Custom functions
Testing custom functions was almost impossible until now, as they had to be
compiled with the library to be used. It is now possible to include Lua
implementations of your functions alongside the Ruby versions.
They are a bit harder to write than the Ruby versions, as they can’t (for now)
access anything from the “Puppet” side, such as facts, and you have to put up
with the Lua syntax.
The other caveat is that the Lua file is stored right next to the Ruby function
for now, and Puppet tries to interpret it (and finds syntax errors in it), so it
isn’t very clean. I will move it for the next version, and will rewrite a few
functions from the Puppetlabs stdlib too.
From the implementation point of view, the difficult part was the fact that
there was no instance for Data.Map. I wrote it and it is now part of the
luautils package.
Custom types
As all the infrastructure to find files in subfolders was already there, I also
added a very weak custom type system. It will detect the file names in the usual
places and will know these are valid Puppet types. It doesn’t perform any
additional checks for now.
I might add Lua functionality for fine-grained verifications.
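Here is a sketch of what detecting type names from file names can look like. It is only an illustration, not the code from the library: customTypeName is a made-up name, and I am assuming that the "usual places" are files named lib/puppet/type/NAME.rb inside a module, with paths given relative to the Puppet directory.

```haskell
import Data.List (stripPrefix)

-- Split a path on '/'.
splitPath :: FilePath -> [String]
splitPath p = case break (== '/') p of
  (a, []) -> [a]
  (a, _ : rest) -> a : splitPath rest

-- modules/MODULE/lib/puppet/type/NAME.rb gives Just NAME,
-- anything else gives Nothing.
customTypeName :: FilePath -> Maybe String
customTypeName p = case splitPath p of
  ["modules", _, "lib", "puppet", "type", file] -> stripSuffix ".rb" file
  _ -> Nothing
  where
    stripSuffix suf s = reverse <$> stripPrefix (reverse suf) (reverse s)
```

The name is taken from the file name alone, which matches the "very weak" nature of the check: nothing is known about the parameters the type accepts.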
What’s next
The fabled dependency handling is not coming soon, but there are a few things
that are already implemented and will be released next time:
Support for the defined function. Just like the real thing, it is parse order
dependent.
The create_resources function now accepts the third parameter (default
values).
The in operator now works with hashes. I am not sure why the Puppet stdlib
implements has_key …
In order to celebrate the entry of this blog into planet haskell,
I will talk a bit about the implementation of the interpretation stage of the
program. I also ripped the eggplant color from the puppetlabs.com site.
Module structure
The whole interpretation process can be separated into the following parts
(fitting the module hierarchy):
The parser. This is an ugly part, as it is the first part I wrote, when I had
no clue about monads, applicative functors, or that the do notation was just
syntactic sugar.
The interpreter, which will be discussed here. It is also an ugly part, but
for other reasons.
The “daemon”, which is a piece of code that glues everything together and
provides a simple API for using them. It should scale on multiple cores too, but
this hasn’t been tested yet.
The PuppetDB communication facilities, that need to be reworked.
The native types support, with checks about the consistency of their
parameters.
The plugin system, which currently handles user defined functions and types.
The catalog monad
Most functions in the catalog monad are of type CatalogMonad, which is just an
error/state monad over IO:
type CatalogMonad = ErrorT String (StateT ScopeState IO)
The fact that IO is needed is unfortunate, but cannot really be avoided, as
templates are computed and PuppetDB is queried during catalog evaluation. It
would have been possible to return partially evaluated catalogs to an outer
function that lives in IO and that would be responsible for handling this, but
I thought it would just make things complicated.
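To make the shape of this monad stack more concrete, here is a toy version. Several things in it are assumptions made for the example: ScopeState is reduced to a list of variable bindings, setVar, getVar and runCatalog are invented names, and ExceptT (from the transformers package) stands in for ErrorT, which it has since replaced.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Control.Monad.Trans.State.Strict (StateT, get, modify, runStateT)

-- Hypothetical, much-reduced state: just variable bindings.
type ScopeState = [(String, String)]

-- Same shape as the type above, with ExceptT in place of ErrorT.
type CatalogMonad = ExceptT String (StateT ScopeState IO)

-- Record a variable in the state.
setVar :: String -> String -> CatalogMonad ()
setVar k v = lift (modify ((k, v) :))

-- Look a variable up, failing with an error message if it is unknown.
getVar :: String -> CatalogMonad String
getVar k = do
  st <- lift get
  maybe (throwE ("Unknown variable " ++ k)) return (lookup k st)

-- Unwrap both layers, starting from an empty state.
runCatalog :: CatalogMonad a -> IO (Either String a, ScopeState)
runCatalog m = runStateT (runExceptT m) []
```

With the error layer on the outside, the state accumulated before a failure is still returned next to the error message, which is handy when reporting what went wrong.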
Template computation is another matter. It is theoretically possible to run it
as a pure function (if you accept using only a subset of the Ruby language), but
it would require a full-fledged Ruby interpreter. This is not an option here,
and templates are computed by spawning a Ruby process every time. Now that a Lua
interpreter is embedded, it would be possible to write Lua versions of the
templates and have them interpreted in the same process.
Achieving position independency with a single pass
The catalog computing function basically goes through the AST (following
includes and defines) and tries to resolve everything it processes. The problem
is that the Puppet language is supposed to be referentially transparent,
especially according to the language guide,
which was the only documentation for quite a long time.
Puppet language is a declarative language, which means that its scoping and assignment rules are somewhat different than a normal imperative language. The primary difference is that you cannot change the value of a variable within a single scope, because that would rely on order in the file to determine the value of the variable. Order does not matter in a declarative language.
Reading this, I presumed that the position of a variable assignment, like
everything else, did not matter. It turns out it does, but it is now a bit late
to correct this.
In order to satisfy this false assumption, all data and resource types exist in
two flavors. For example, with values we have Value and ResolvedValue.
It would be a good idea to remove all this cruft right now, but I am not in a
hurry to touch it, as it represents quite a bit of code and seems to work
alright for now.
It presents several challenges, as values are supposed to be transformable
into their resolved counterpart at any time, but many things are context
dependent (such as variable scoping, local variable presence, variables in
defines, etc.). It also doesn’t work at all (just like with Puppet) with control
structures or templates that rely on not-yet-defined values.
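The two-flavor idea can be shown with a toy example. The constructors below are invented for the illustration and are far simpler than the real Value and ResolvedValue types; the Scope type and the resolve function are also assumptions.

```haskell
-- Hypothetical, reduced "unresolved" flavor: may still mention variables.
data Value
  = Literal String
  | VariableReference String
  | Array [Value]
  deriving (Show, Eq)

-- Hypothetical, reduced "resolved" flavor: no variables left.
data ResolvedValue
  = ResolvedString String
  | ResolvedArray [ResolvedValue]
  deriving (Show, Eq)

-- The variables known at the point where resolution is attempted.
type Scope = [(String, ResolvedValue)]

-- Turn a Value into its resolved counterpart, or explain why it failed.
resolve :: Scope -> Value -> Either String ResolvedValue
resolve _ (Literal s) = Right (ResolvedString s)
resolve scope (VariableReference n) =
  maybe (Left ("Could not resolve $" ++ n)) Right (lookup n scope)
resolve scope (Array vs) = ResolvedArray <$> mapM (resolve scope) vs
```

The context dependence mentioned above shows up in the Scope argument: the same Value can resolve, fail, or resolve to something else depending on when and where resolve is called.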
If somebody knows what the “proper” way to do this is, I would be quite
interested. I am almost certain this exists as it seems related to writing fast
compilers, which is a subject that has certainly been explored by smart people.
Room for extension
Finally, the most important type, besides that of the state, is the type of the
main function getCatalog.
One can notice there are quite a few functions that must be passed along. The
reason is that it should be possible to swap backends easily. An example that
might be written shortly would be to give two implementations for the PuppetDB
function:
A generalization of the current version, that queries a real PuppetDB.
An emulated PuppetDB that is populated with other runs.
This would need passing around more functions (a function to query arbitrary
values to start with, and a function to store the exported resources), but would
be immensely useful. It would make it possible to write test suites that cover
the whole node list, including exported resources between hosts.
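Here is a rough sketch of what the emulated backend could look like. Nothing in it comes from the library: the PuppetDB record, its two fields and emulatedPuppetDB are made up for the example, and exported resources are reduced to a type and a title.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical backend: the functions a catalog computation would be given.
data PuppetDB = PuppetDB
  { queryExported :: String -> IO [String]
    -- ^ titles of the exported resources of a given type
  , storeExported :: String -> String -> IO ()
    -- ^ store an exported resource (type, then title)
  }

-- In-memory emulation: everything stored by a run is visible to later runs.
emulatedPuppetDB :: IO PuppetDB
emulatedPuppetDB = do
  ref <- newIORef []
  return
    PuppetDB
      { queryExported = \t -> do
          stored <- readIORef ref
          return [title | (t', title) <- stored, t' == t]
      , storeExported = \t title -> modifyIORef ref ((t, title) :)
      }
```

A backend talking to a real PuppetDB would fill the same record with functions doing HTTP queries, so the code computing catalogs would not need to know which one it was given.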