A new version is out, with a focus on the bleeding edge! The aim of this release was to update important dependencies, so we now have:

- text up to 1.1
- lens up to 4
- and new versions of attoparsec, aeson and parsers

The goal here is to reap the benefits of the speed improvements in attoparsec, text and aeson, and to be ready for the GHC 7.8 release.
There were a few hurdles to get there. The move to Data.Scientific for number representation was not a big problem (even though the code in hruby could be polished), but the parsers upgrade proved more problematic.
Lens-related breakage
The main problem with lens 4 was that it broke strict-base-types. I don't think this will last long, but here is a temporary workaround for the impatient. Other than that, several instances of x ^. contains y were replaced by x & has (ix y) for the map-like types. This was a painless upgrade.
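The new idiom can be sketched like this, assuming lens >= 4 and containers (`sampleMap` and `hasKey` are made-up names for illustration, not the actual code):

```haskell
-- `contains` now only works on set-like containers, so membership tests
-- on map-like types are written with `has` and `ix` instead.
import Control.Lens (has, ix, (&))
import qualified Data.Map.Strict as M

sampleMap :: M.Map String Int
sampleMap = M.fromList [("a", 1), ("b", 2)]

-- old (lens 3): sampleMap ^. contains "a"  -- no longer typechecks on Maps
-- new (lens 4):
hasKey :: String -> M.Map String Int -> Bool
hasKey k m = m & has (ix k)

main :: IO ()
main = print (hasKey "a" sampleMap, hasKey "z" sampleMap) -- (True,False)
```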
The trouble with parsers
I like this package a lot, because it exposes a nice API and supports several underlying parsers. But it comes with a couple of problems.
The first one is related to the TokenParsing typeclass. This lets you define someSpace, the function that skips spaces. Unfortunately, the parsers library comes with an instance for parsec that skips the characters satisfying isSpace. While this certainly is a sane choice, it is a problem for people who would like to use parsec as the underlying parser, but with a different implementation of someSpace. In my case, I also wanted to skip single-line (starting with #) and multi-line (/* these */) comments. A solution is to create a newtype and redefine all the instances. For those wondering how this is done, here is the relevant part. Please let me know if there is a way to do this that does not require so much boilerplate.
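A rough sketch of the newtype trick, assuming the parsers and parsec packages (`MyParser` and the exact comment rules are illustrative, not the real code):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Applicative
import Control.Monad (MonadPlus, void)
import Data.Char (isSpace)
import Text.Parser.Char (CharParsing, anyChar, char, satisfy, string)
import Text.Parser.Combinators (Parsing, manyTill, skipMany, skipSome, try)
import Text.Parser.Token (TokenParsing (someSpace))
import qualified Text.Parsec as P

-- Wrap parsec so we can give it our own TokenParsing instance,
-- deriving everything else from the instances shipped with parsers.
newtype MyParser a = MyParser { unMyParser :: P.Parsec String () a }
  deriving (Functor, Applicative, Alternative, Monad, MonadPlus,
            Parsing, CharParsing)

-- Skip whitespace, '#' line comments and /* block */ comments.
instance TokenParsing MyParser where
  someSpace = skipSome (simpleSpace <|> lineComment <|> blockComment)
    where
      simpleSpace  = void (satisfy isSpace)
      lineComment  = void (char '#' *> skipMany (satisfy (/= '\n')))
      blockComment = void (try (string "/*")
                           *> manyTill anyChar (try (string "*/")))
```

Running such a parser just means unwrapping it first, e.g. `P.parse (unMyParser p) "" input`.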
The second problem is that the expression parser builder (in Text.Parser.Expression) is much slower than the one in parsec. Switching to it from Text.Parsec.Expr resulted in a 25% slowdown, so I switched back to parsec. Unfortunately, I didn't immediately realize this was the culprit, and instead believed it was a case of newtypes lowering performance. My code is now littered with unsafeCoerces, which I will remove in the next version (provided this does not result in a performance hit).
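Since this release targets GHC 7.8, one safe alternative for the newtype case is Data.Coerce.coerce (base >= 4.7): it is checked by the compiler and just as free at runtime. A minimal sketch, with made-up names:

```haskell
-- coerce is a compiler-checked, zero-cost cast between representationally
-- equal types, unlike unsafeCoerce which checks nothing at all.
import Data.Coerce (coerce)

newtype Identifier = Identifier String
  deriving (Show, Eq)

-- Convert a whole list without traversing it:
toIdentifiers :: [String] -> [Identifier]
toIdentifiers = coerce

main :: IO ()
main = print (toIdentifiers ["foo", "bar"]) -- [Identifier "foo",Identifier "bar"]
```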
New features from previous versions I did not blog about
- Several bugs were solved thanks to Pi3r.
- Several new facts were added.
- A parsing bug was fixed (it was also fixed in the official documentation).
- An ‘all nodes’ testing mode was added for puppetresources. This should provide a nice overview of the current state of your manifests, and it tests those nodes about 50 times faster than Puppet can compile the corresponding catalogs.
Behind-the-scenes changes
The Puppet.Daemon machinery has been simplified a lot. It previously worked by spawning a predetermined number of threads specialized in parsing or compiling catalogs, which communicated with each other using shared Chans. The reason was that I wanted to control memory usage, and didn't want too many concurrent threads at the same time. It turns out that most of the memory is used by the parsed AST, which is shared using the [filecache](http://hackage.haskell.org/package/filecache) module, so this is not a real concern.
I ripped all of that out, and now the only threads spawned are an OS thread for the embedded Ruby interpreter and an IO thread for filecache. Users of the library can then spawn as many parallel threads as they want. As a result, concurrency is a bit better, even though there are still contention points:
- The parsed file cache is held by the filecache thread, which communicates over a Chan. I will eventually replace this with an MVar, or some other primitive that doesn't require a dedicated thread.
- The Lua interpreter requires a LuaState that should not be used by several threads at once. It is stored in a shared MVar.
- The Ruby interpreter is the main performance bottleneck. It is single-threaded and very slow. The only ways to speed it up would be to parse more of the Ruby language (a parser for common Erb patterns is included!), or to switch to another interpreter that supports multithreading. Both are major endeavors.
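The Chan-free cache design mentioned above can be sketched with an MVar holding a plain Map, using only base and containers. `Cache`, `newCache` and `query` are hypothetical names, and this is a deliberate simplification of what filecache actually does:

```haskell
import Control.Concurrent.MVar
import qualified Data.Map.Strict as M

-- The whole cache lives in a single MVar; no dedicated thread, no Chan.
type Cache k v = MVar (M.Map k v)

newCache :: IO (Cache k v)
newCache = newMVar M.empty

-- Look the key up; on a miss, run the (possibly expensive) action and
-- memoize its result. The MVar serializes concurrent access, at the cost
-- of holding the lock while the action runs.
query :: Ord k => Cache k v -> k -> IO v -> IO v
query cache k compute = modifyMVar cache $ \m ->
  case M.lookup k m of
    Just v  -> return (m, v)
    Nothing -> do
      v <- compute
      return (M.insert k v m, v)

main :: IO ()
main = do
  cache <- newCache
  a <- query cache "site.pp" (return (1 :: Int))
  b <- query cache "site.pp" (return 2) -- cache hit: the action is not run
  print (a, b) -- (1,1)
```

The same MVar pattern also covers the LuaState bullet: `withMVar` gives one thread at a time access to the shared interpreter state.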
Waiting around the corner
The next releases will probably bring Ruby 2.1 support for hruby, and some performance work.