I recently saw comments stating that the Writer (and thus RWS) monad is slow and leaky. I have several programs and libraries built around the RWS monad, where I use the writer part only for logging (that is,
snocing a single element at the end of the log), and already noticed in some cases that the profiler reported up to 80% of the time usage spent doing
Here is a POC, with a pair of monads I called RSS (for Reader - State - State). They offer an API similar to that of RWS, the only difference being that the writer part is internally riding with the state.
Tests with a (top secret) real program gave a two times speed increase. For
language-puppet, it is a 10% speed increase for a single run, but most of the time is dominated by the initial parsing, so I expect the gain to be more substantial for other type of applications (complex tests come in mind). I wrote a benchmark, that compares the lazy and strict versions of the RSS and RWS monads, using several distinct monoids as the writer type. Here are the results on my machine :
|Seq||193 µs||194 µs||4.95 ms||457 µs|
|||4.46 ms||4.47 ms||4.89 ms||538 µs|
|Vector.Prim||764 µs||784 µs||47.72 ms||47.46 ms|
|RevList||173 µs||175 µs||10.57 ms||5.60 ms|
|IntSet||187 µs||184 µs||4.98 ms||472 µs|
|DList||177 µs||176 µs||4.56 ms||302 µs|
- RevList is the best data structure for my use case (it is just a newtype around a regular list, with
reverseapplied when the list is extracted), which should not come as a surprise. What was surprising (to me) is how terrible it turns out to me under RWS …
- The performance of
Vector.Primwith the RWS monad is terrible. I suppose this is due to some optimization rule that could not be deduced in that case. I was however surprised at the good performance compared to lists in the RSS case.
- The RSS version is much faster than the RWS one … except for lists !
- The strict version of the RSS monad doesn’t have a leaky writer (test here).
I also added a
tellElement helper function that would
snoc an element to the writer (with suitable monoids) :
|Seq||193 µs||231 µs|
|List||4.46 ms||4.45 ms|
|RevList||173 µs||200 µs|
It seems that, to my surprise, there is a small price to pay for calling snoc instead of
I will probably release this as a package so that I can build
language-puppet against it. I will copy’n’paste all the haddocks from the
transformers packages, and move the modules so as to mimick their hierarchy. If you have suggestions, especially concerning the package or module names, strictness properties of the new monads (a subject where I am at loss), please let me know.
EDIT : Added benchmark results for DList