A faster RWS variant

I recently saw comments stating that the Writer (and thus RWS) monad is slow and leaky. I have several programs and libraries built around the RWS monad, where I use the writer part only for logging (that is, snocing a single element at the end of the log), and already noticed in some cases that the profiler reported up to 80% of the time usage spent doing mappends.

Here is a POC, with a pair of monads I called RSS (for Reader - State - State). They offer an API similar to that of RWS, the only difference being that the writer part is internally riding with the state.

Tests with a (top secret) real program gave a two times speed increase. For language-puppet, it is a 10% speed increase for a single run, but most of the time is dominated by the initial parsing, so I expect the gain to be more substantial for other type of applications (complex tests come in mind). I wrote a benchmark, that compares the lazy and strict versions of the RSS and RWS monads, using several distinct monoids as the writer type. Here are the results on my machine :

	RSS.Lazy	RSS.Strict	RWS.Lazy	RWS.Strict
Seq	193 µs	194 µs	4.95 ms	457 µs
[]	4.46 ms	4.47 ms	4.89 ms	538 µs
Vector.Prim	764 µs	784 µs	47.72 ms	47.46 ms
RevList	173 µs	175 µs	10.57 ms	5.60 ms
IntSet	187 µs	184 µs	4.98 ms	472 µs
DList	177 µs	176 µs	4.56 ms	302 µs

RevList is the best data structure for my use case (it is just a newtype around a regular list, with reverse applied when the list is extracted), which should not come as a surprise. What was surprising (to me) is how terrible it turns out to me under RWS …
The performance of Vector.Prim with the RWS monad is terrible. I suppose this is due to some optimization rule that could not be deduced in that case. I was however surprised at the good performance compared to lists in the RSS case.
The RSS version is much faster than the RWS one … except for lists !
The strict version of the RSS monad doesn’t have a leaky writer (test here).

I also added a tellElement helper function that would snoc an element to the writer (with suitable monoids) :

	tell	tellElement
Seq	193 µs	231 µs
List	4.46 ms	4.45 ms
RevList	173 µs	200 µs

It seems that, to my surprise, there is a small price to pay for calling snoc instead of mappend.

I will probably release this as a package so that I can build language-puppet against it. I will copy’n’paste all the haddocks from the mtl and transformers packages, and move the modules so as to mimick their hierarchy. If you have suggestions, especially concerning the package or module names, strictness properties of the new monads (a subject where I am at loss), please let me know.

EDIT : Added benchmark results for DList

The language-puppet website.

Work with your manifests!

Comments