In the process of writing the language-puppet library, I learned quite a lot about Haskell and its libraries. The first part of language-puppet that was written was the parser. At that time I did not understand monads, brute-forced the do-notation until it seemed to do what I wanted, and generally made all kinds of blunders. The other problem was that I was learning Puppet too, at a time when it was changing a lot and nothing was really documented. This led to unfortunate decisions that I already documented.
I decided to rewrite everything from scratch, by directly implementing all I could find in the reference. I started a new parser during the weekend, encoding as many verifications as possible in it, and then tried it on real manifests. Boy, was I naïve! It did not work at all. The specification is good for learning the language or dispelling some common misconceptions, but is of moderate use for my purpose. I relaxed most of the checks and it seems to work now.
On the technical side, I am now using the parsers package, which has a very nice interface. I considered using trifecta as the underlying parser. Its error messages are gorgeous, but it turns out it is not trivial to get my own lexeme system in place with it. I went with parsec and, instead of using the parsec-parsers package, wrote my own instances (to be honest, I copy-pasted those of the package and added a non-default definition for token). Edward Kmett was nice enough to give me pointers on how to do this with trifecta, but the result looked quite clumsy. He hinted that he might work on a monad-transformer approach to this problem, so I am just waiting for this to happen. The nice thing about the parsers approach is that switching later will be trivial.
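To make the "own instances with a non-default token" idea concrete, here is a minimal sketch, not the actual language-puppet code. It assumes a recent version of parsers that ships its own parsec instances, so the generic instances can be derived through a newtype instead of being copy-pasted. The names `ManifestParser` and `identifiers` are hypothetical, and treating `#` comments as whitespace is just one plausible lexeme rule.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Applicative
import Data.Char (isSpace)
import Data.Functor (void)
import qualified Text.Parsec as P
import Text.Parser.Char
import Text.Parser.Combinators
import Text.Parser.Token

-- Hypothetical wrapper around a plain parsec parser. The generic classes
-- are derived; only the lexeme-related class is written by hand.
newtype ManifestParser a = ManifestParser { runMP :: P.Parsec String () a }
  deriving (Functor, Applicative, Alternative, Monad, Parsing, CharParsing)

instance TokenParsing ManifestParser where
  -- Whitespace between tokens also covers '#' comments up to end of line.
  someSpace = skipSome (void (satisfy isSpace) <|> comment)
    where comment = char '#' *> skipMany (satisfy (/= '\n'))
  -- Non-default token: run the parser, then skip any trailing whitespace.
  token p = p <* (someSpace <|> pure ())

-- Example parser written only against the parsers interface.
identifiers :: ManifestParser [String]
identifiers = whiteSpace *> many (token (some letter)) <* eof
```

Because `identifiers` only uses the class methods, it would keep working if the newtype wrapped another backend; running it with parsec looks like `P.parse (runMP identifiers) "" "foo  # a comment\nbar"`.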
As can be seen in the previous screenshot, I am using a nice pretty-printing library that lets me (ab)use color.
Another huge difference is that I now use strict types whenever possible. The previous version seemed to be able to support an arbitrary number of worker threads with 300 MB of storage for my catalogs, whereas the Puppet version could go up to 800 MB for a single thread. I would like to at least halve this figure for the next version.
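As an illustration of what strict fields change, here is a small sketch with made-up types (`LazyPos` and `StrictPos` are hypothetical, not taken from language-puppet). A strict field is evaluated when the constructor is applied, so no unevaluated thunk is kept alive inside the record.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy fields: each field may hold an unevaluated thunk.
data LazyPos = LazyPos Int Int

-- Strict fields: each field is forced when the value is constructed.
data StrictPos = StrictPos !Int !Int

lazyLine :: LazyPos -> Int
lazyLine (LazyPos l _) = l

strictLine :: StrictPos -> Int
strictLine (StrictPos l _) = l

isFailure :: Either SomeException b -> Bool
isFailure = either (const True) (const False)

-- | True when forcing the argument to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = isFailure <$> try (evaluate x)
```

With these definitions, `throwsWhenForced (lazyLine (LazyPos 1 undefined))` yields `False` because the second field is never looked at, while `throwsWhenForced (strictLine (StrictPos 1 undefined))` yields `True` because construction forces both fields. Note that a strict field is only forced to weak head normal form, so a strict `String` field can still hide a lazy tail.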
The next step is to write the new daemon infrastructure. I already have a generic file-cache module that lets you cache things related to files. When a file is modified, the cached value is automagically invalidated (using inotify). I hope this will work well in practice and will not block all the other threads.
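The shape of such a cache can be sketched as follows. This is not the real module: to stay self-contained it swaps inotify for a comparison of modification times on every lookup, and the names `FileCache`, `newFileCache` and `query` are assumptions made for the example.

```haskell
import Data.IORef
import qualified Data.Map.Strict as M
import Data.Time.Clock (UTCTime)
import System.Directory (getModificationTime)

-- Cached values, stored with the modification time seen when they were computed.
newtype FileCache a = FileCache (IORef (M.Map FilePath (UTCTime, a)))

newFileCache :: IO (FileCache a)
newFileCache = FileCache <$> newIORef M.empty

-- | Return the cached value for a file, recomputing it when the file's
-- modification time differs from the one recorded in the cache.
query :: FileCache a -> FilePath -> (FilePath -> IO a) -> IO a
query (FileCache ref) path compute = do
  mtime  <- getModificationTime path
  cached <- M.lookup path <$> readIORef ref
  case cached of
    Just (t, v) | t == mtime -> return v  -- file unchanged: reuse the value
    _ -> do
      v <- compute path                   -- missing or stale: recompute
      atomicModifyIORef' ref (\m -> (M.insert path (mtime, v) m, ()))
      return v
```

In this sketch two threads that miss at the same time will both run `compute`, which is harmless but wasteful. An inotify-based version would instead drop the entry from a watch callback, so lookups would not need to stat the file at all.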