I just released version 0.12.0, which should compile on GHC 7.8 once the following packages are fixed:
- regex-pcre-builtin (Edit 2014/02/09: now updated)
- strict-base-types (Edit 2014/02/09: now updated)
- charset (There is a pull request under way)
It seems to compile fine against GHC 7.6.3, although I couldn’t really test the resulting executable (I gave Nix a shot, but hruby is somewhat broken as a result).
This release doesn’t bring much to the table apart from hypothetical 7.8 compatibility.
I previously made several claims of performance increases, so here are the results:
| | 0.10.5 | 0.10.6 | 0.11.1 | 0.12.0 |
|---|---|---|---|---|
| 49 nodes, N1 | | 10.74s | 9.76s | 9.03s |
| 49 nodes, N2 | | 10.48s | 7.66s | 7.01s |
| 49 nodes, N4 | | 9.7s | 6.89s | 6.37s |
| 49 nodes, N8 | | 12.46s | 13.4s | 11.77s |
| Single node | 2.4s | 2.24s | 2.02s | 1.88s |
The measurements were done on my workstation, which has a 4-core hyper-threaded processor (8 logical cores).
The performance improvements can be explained in the following way:
- Between 0.10.5 and 0.10.6, the Ruby interpreter mode of execution was modified from a `Channel`-based system to an `MVar`-based one.
- Between 0.10.6 and 0.11.1, all systems that would run on their own thread were modified to use the calling thread instead, reducing synchronization overhead (except for the Ruby thread). This gave a 9% performance boost for single-threaded work, and a 29% performance boost when using four cores. The 8-core performance worsened because of the wasted work of the parser (this is explained in the previous post).
- Between 0.11.1 and 0.12.0, I moved from GHC 7.6.3 to GHC 7.8-rc1, and bumped the version of many dependencies (including `text` and `aeson`, both having received a speed boost recently). This resulted in a “free” 7% speed boost.
As shown here, the initial parsing is extremely costly: computing the catalogs for 49 nodes takes only about 5 times as long as computing one for a single node. As the parsed files get cached, catalog computation becomes more and more efficient (about 50 times faster than Puppet). I don’t think the current parser can be sped up significantly without ditching its readability, so this is about as fast as it will get.
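The percentages and the "5 times" figure can be rechecked from the timings with a few lines of Haskell. This is only a sketch of the arithmetic: the `reduction` helper is a name made up for the example, and it assumes that the three timings in each 49-node row belong to 0.10.6, 0.11.1 and 0.12.0, in that order.

```haskell
module Main where

import Text.Printf (printf)

-- Relative reduction in wall-clock time, as a percentage of the old time.
reduction :: Double -> Double -> Double
reduction old new = (old - new) / old * 100

main :: IO ()
main = do
  -- 0.10.6 -> 0.11.1, 49 nodes (assumed column mapping, see above)
  printf "N1, 0.10.6 -> 0.11.1: %.1f%%\n" (reduction 10.74 9.76)
  printf "N4, 0.10.6 -> 0.11.1: %.1f%%\n" (reduction 9.7 6.89)
  -- 0.11.1 -> 0.12.0
  printf "N1, 0.11.1 -> 0.12.0: %.1f%%\n" (reduction 9.76 9.03)
  printf "Single node, 0.11.1 -> 0.12.0: %.1f%%\n" (reduction 2.02 1.88)
  -- 49 nodes versus a single node, both on 0.12.0 with one core
  printf "49 nodes / single node: %.1fx\n" (9.03 / 1.88 :: Double)
```

The results come out at roughly 9%, 29% and 7% for the three quoted speedups, and the 49-node run takes a little under 5 times as long as the single-node run.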
The next goals are a huge simplification of the testing system, and perhaps an external DSL. Compiled binaries and Ubuntu packages are available at the usual place.