
Version 0.12.0: GHC 7.8 and Performance Increase

I just released version 0.12.0, which should compile on GHC 7.8 once the following packages are fixed:

  • regex-pcre-builtin (Edit 2014/02/09: now updated)
  • strict-base-types (Edit 2014/02/09: now updated)
  • charset (There is a pull request under way)

It seems to compile fine against GHC 7.6.3, even though I couldn’t really test the resulting executable (I gave Nix a shot, but hruby is somewhat broken as a result).

This release doesn’t bring much to the table apart from hypothetical GHC 7.8 compatibility.

I previously made several claims about performance increases, so here are the results:

               0.10.5   0.10.6   0.11.1   0.12.0
49 nodes, N1        -   10.74s    9.76s    9.03s
49 nodes, N2        -   10.48s    7.66s    7.01s
49 nodes, N4        -    9.7s     6.89s    6.37s
49 nodes, N8        -   12.46s   13.4s    11.77s
Single node      2.4s    2.24s    2.02s    1.88s

The measurements were done on my workstation, which sports a 4-core HT processor (8 logical cores); N1 through N8 in the table denote the number of cores used.

The performance improvements can be explained as follows:

  • Between 0.10.5 and 0.10.6, the Ruby interpreter’s mode of execution was changed from a Channel-based system to an MVar-based one (a sketch of this kind of change follows the list).
  • Between 0.10.6 and 0.11.1, all systems that used to run on their own thread (except for the Ruby thread) were modified to run on the calling thread instead, reducing synchronization overhead. This gave a 9% performance boost for single-threaded work, and a 29% performance boost when using four cores. The 8-core performance worsened because of the parser’s wasted work (this is explained in the previous post).
  • Between 0.11.1 and 0.12.0, I moved from GHC 7.6.3 to GHC 7.8-rc1 and bumped the versions of many dependencies (including text and aeson, both of which have received a speed boost recently). This resulted in a “free” 7% speed boost.
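To make the first change more concrete, here is a minimal sketch of the difference (with made-up Request, Response and interpret names; this is not the actual language-puppet code): a Channel-based design pushes every call through a queue drained by the worker thread, while an MVar-based design hands the request over directly.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Monad (forever)

-- Hypothetical stand-ins for a call to the Ruby interpreter.
type Request  = String
type Response = String

interpret :: Request -> IO Response
interpret req = return ("interpreted: " ++ req)

-- Channel-based: a dedicated worker thread drains a queue of
-- (request, reply box) pairs; every call pays for the queue round trip.
channelBased :: IO (Request -> IO Response)
channelBased = do
    queue <- newChan :: IO (Chan (Request, MVar Response))
    _ <- forkIO $ forever $ do
        (req, reply) <- readChan queue
        interpret req >>= putMVar reply
    return $ \req -> do
        reply <- newEmptyMVar
        writeChan queue (req, reply)
        takeMVar reply

-- MVar-based: the worker thread stays, but requests and responses are
-- handed over through a pair of MVars instead of going through a queue,
-- with a lock to keep callers from interleaving.
mvarBased :: IO (Request -> IO Response)
mvarBased = do
    reqBox  <- newEmptyMVar
    respBox <- newEmptyMVar
    _ <- forkIO $ forever $ do
        req <- takeMVar reqBox
        interpret req >>= putMVar respBox
    lock <- newMVar ()
    return $ \req -> withMVar lock $ \_ -> do
        putMVar reqBox req
        takeMVar respBox
```

In this simplified picture, the MVar version avoids the allocation and extra wakeups a Chan incurs on every call, which is the kind of overhead such a change would remove.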

As shown here, the initial parsing is extremely costly: computing the catalogs for 49 nodes takes about 5 times as long as computing a single one. As the parsed files get cached, catalog computation becomes more and more effective (about 50 times faster than Puppet). I don’t think the current parser can be sped up significantly without sacrificing its readability, so this is about as fast as it will get.
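As an illustration of the caching idea (a sketch with assumed ParsedFile and parseManifest names, not the actual language-puppet implementation), a shared map keyed by file path is enough to make sure each manifest is parsed at most once, however many catalogs need it:

```haskell
import           Control.Concurrent.MVar (modifyMVar, newMVar)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the parser's output.
type ParsedFile = String

-- Hypothetical expensive parse of a single manifest file.
parseManifest :: FilePath -> IO ParsedFile
parseManifest = readFile

-- Build a parsing function that caches results per file path, so the
-- cost of parsing is only paid by the first catalog needing the file.
newCachedParser :: IO (FilePath -> IO ParsedFile)
newCachedParser = do
    cache <- newMVar Map.empty
    return $ \path -> modifyMVar cache $ \seen ->
        case Map.lookup path seen of
            Just parsed -> return (seen, parsed)
            Nothing     -> do
                parsed <- parseManifest path
                return (Map.insert path parsed seen, parsed)
```

In this simplified version the MVar is held while parsing, so concurrent cache misses are serialised; a real implementation might be finer-grained, but the effect described above is the same: only the first pass pays the parsing cost, and later catalogs hit the cache.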

The next goals are a huge simplification of the testing system, and perhaps an external DSL. There are compiled binaries and Ubuntu packages at the usual place.
