A new version is out, with a focus on the bleeding edge! The aim of this release was to update important dependencies, so we now have:

- text up to 1.1
- lens up to 4
- and new versions of attoparsec, aeson and parsers

The goal here is to reap the benefits of the speed improvements in attoparsec, text and aeson, and to be ready for the GHC 7.8 release.
There were a few hurdles to get there. The move to Data.Scientific for number representation was not a big problem (even though the code in hruby could be polished), but the parsers upgrade proved more problematic.
Lens-related breakage
The main problem with lens 4 was that it broke strict-base-types. I don't think this will last long, but here is a temporary workaround for the impatient. Other than that, several instances of x ^. contains y were replaced by x & has (ix y) for the map-like types. This was a painless upgrade.
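The new idiom can be sketched like this, assuming lens >= 4 and containers (`sampleMap` and `hasKey` are made-up names for illustration, not the actual code):

```haskell
-- `contains` now only works on set-like containers, so membership tests
-- on map-like types are written with `has` and `ix` instead.
import Control.Lens (has, ix, (&))
import qualified Data.Map.Strict as M

sampleMap :: M.Map String Int
sampleMap = M.fromList [("a", 1), ("b", 2)]

-- old (lens 3): sampleMap ^. contains "a"  -- no longer typechecks on Maps
-- new (lens 4):
hasKey :: String -> M.Map String Int -> Bool
hasKey k m = m & has (ix k)

main :: IO ()
main = print (hasKey "a" sampleMap, hasKey "z" sampleMap) -- (True,False)
```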
The trouble with parsers
I like this package a lot, because it exposes a nice API and supports several underlying parsers. But it comes with a couple of problems.
The first one is related to the TokenParsing typeclass. This lets you define someSpace, the function that skips spaces. Unfortunately, the parsers library comes with an instance for parsec that skips the characters satisfying isSpace. While this certainly is a sane choice, it is a problem for people who would like to use parsec as the underlying parser, but with a different implementation of someSpace. In my case, I also wanted to skip single-line (starting with #) and multi-line (/* these */) comments. A solution is to create a newtype and redefine all the instances. For those wondering how this is done, here is the relevant part. Please let me know if there is a way to do this that does not require so much boilerplate.
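A rough sketch of the newtype trick, assuming the parsers and parsec packages (`MyParser` and the exact comment rules are illustrative, not the real code):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Applicative
import Control.Monad (MonadPlus, void)
import Data.Char (isSpace)
import Text.Parser.Char (CharParsing, anyChar, char, satisfy, string)
import Text.Parser.Combinators (Parsing, manyTill, skipMany, skipSome, try)
import Text.Parser.Token (TokenParsing (someSpace))
import qualified Text.Parsec as P

-- Wrap parsec so we can give it our own TokenParsing instance,
-- deriving everything else from the instances shipped with parsers.
newtype MyParser a = MyParser { unMyParser :: P.Parsec String () a }
  deriving (Functor, Applicative, Alternative, Monad, MonadPlus,
            Parsing, CharParsing)

-- Skip whitespace, '#' line comments and /* block */ comments.
instance TokenParsing MyParser where
  someSpace = skipSome (simpleSpace <|> lineComment <|> blockComment)
    where
      simpleSpace  = void (satisfy isSpace)
      lineComment  = void (char '#' *> skipMany (satisfy (/= '\n')))
      blockComment = void (try (string "/*")
                           *> manyTill anyChar (try (string "*/")))
```

Running such a parser just means unwrapping it first, e.g. `P.parse (unMyParser p) "" input`.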
The second problem is that the expression parser builder (in Text.Parser.Expression) is much slower than the one in parsec. Switching to it from Text.Parsec.Expr resulted in a 25% slowdown, so I switched back to parsec. Unfortunately, I didn't immediately realize this was the culprit, and instead believed it was a case of newtypes lowering performance. My code is now littered with unsafeCoerces, which I will remove in the next version (provided this does not result in a performance hit).
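Since this release targets GHC 7.8, one safe alternative for the newtype case is Data.Coerce.coerce (base >= 4.7): it is checked by the compiler and just as free at runtime. A minimal sketch, with made-up names:

```haskell
-- coerce is a compiler-checked, zero-cost cast between representationally
-- equal types, unlike unsafeCoerce which checks nothing at all.
import Data.Coerce (coerce)

newtype Identifier = Identifier String
  deriving (Show, Eq)

-- Convert a whole list without traversing it:
toIdentifiers :: [String] -> [Identifier]
toIdentifiers = coerce

main :: IO ()
main = print (toIdentifiers ["foo", "bar"]) -- [Identifier "foo",Identifier "bar"]
```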
New features from previous versions I did not blog about
- Several bugs were solved thanks to Pi3r.
- Several new facts were added.
- A parsing bug was fixed (it was also fixed in the official documentation).
- An ‘all nodes’ testing mode was added for puppetresources. This should provide a nice overview of the current state of your manifests, and it tests those nodes about 50 times faster than Puppet can compile the corresponding catalogs.
Behind-the-scenes changes
The Puppet.Daemon machinery has been simplified a lot. It previously worked by spawning a predetermined number of threads specialized in parsing or compiling catalogs, which communicated with each other using shared Chans. The reason was that I wanted to control memory usage, and didn't want too many concurrent threads at the same time. It turns out that most of the memory is used by the parsed AST, which is shared using the [filecache](http://hackage.haskell.org/package/filecache) module, so this is not a real concern.
I ripped all of that out, and now the only threads spawned are an OS thread for the embedded Ruby interpreter and an IO thread for filecache. Users of the library can then spawn as many parallel threads as they want. As a result, concurrency is a bit better, even though there are still contention points:
- The parsed file cache is held by the filecache thread, which communicates over a Chan. I will eventually replace this with an MVar, or some other primitive that doesn't require a dedicated thread.
- The Lua interpreter requires a LuaState that should not be used by several threads at once. It is stored in a shared MVar.
- The Ruby interpreter is the main performance bottleneck. It is single-threaded and very slow. The only ways to speed it up would be to parse more of the Ruby language (a parser for common Erb patterns is included!), or to switch to another interpreter that supports multithreading. Both are major endeavors.
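The Chan-free cache design mentioned above can be sketched with an MVar holding a plain Map, using only base and containers. `Cache`, `newCache` and `query` are hypothetical names, and this is a deliberate simplification of what filecache actually does:

```haskell
import Control.Concurrent.MVar
import qualified Data.Map.Strict as M

-- The whole cache lives in a single MVar; no dedicated thread, no Chan.
type Cache k v = MVar (M.Map k v)

newCache :: IO (Cache k v)
newCache = newMVar M.empty

-- Look the key up; on a miss, run the (possibly expensive) action and
-- memoize its result. The MVar serializes concurrent access, at the cost
-- of holding the lock while the action runs.
query :: Ord k => Cache k v -> k -> IO v -> IO v
query cache k compute = modifyMVar cache $ \m ->
  case M.lookup k m of
    Just v  -> return (m, v)
    Nothing -> do
      v <- compute
      return (M.insert k v m, v)

main :: IO ()
main = do
  cache <- newCache
  a <- query cache "site.pp" (return (1 :: Int))
  b <- query cache "site.pp" (return 2) -- cache hit: the action is not run
  print (a, b) -- (1,1)
```

The same MVar pattern also covers the LuaState bullet: `withMVar` gives one thread at a time access to the shared interpreter state.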
Waiting around the corner
The next releases will probably bring Ruby 2.1 support for hruby, and some performance work.