The language-puppet website.

Work with your manifests!

Refactoring: From an IO-based Monad Stack to a Pure Program (Part 1)

UPDATE (2014/03/01)

It turns out that there was a better way to do this, please see this new post.

Rationale

I am currently experimenting with the operational package. This post provides a rough outline on how I moved from an IO based monad stack to an isomorphic pure representation of the computation. I am unfortunately not well endowed on the theoretical side, so this will be a very practical post. It might contain some glaring mistakes, as I just spend a few hours acquainting myself with the concepts and migrating everything, and didn’t test it extensively. I marked the places where I am unsure on how to do something with a number, such as (0).

Here is the type of the main monad, before and after the change :

1
2
3
4
-- Before
type InterpreterMonad = ErrorT Doc (RSST InterpreterReader InterpreterWriter InterpreterState IO)
-- After
type InterpreterMonad = ProgramT InterpreterInstr (State InterpreterState)

I first tried a simple Program InterpreterInstr for the main monad, but I could not write the MonadState instance, as there was a conflicting instance (1). This is the reason why the State monad is there, at the base of the transformer stack.

The goal is to build a representation of the catalog compilation process, in a pure monad, and then transform it into another representation that will actually be executed. In order to do so, all the “effects” need to be encoded as a single type (designated as instr in the operational haddocks). In this case, this is the InterpreterInstr type, detailed here.

You might observe that commands of type m a -> m b become constructors of type m a -> instr b, and not instr a -> instr b (which makes sense if you think about what you are doing, but was not immediately obvious to me when I started writing the types).

Implementing the Program monad

First of all, all the effects given by the original transformer stacks have their own instructions : ErrorT has the ErrorThrow and ErrorCatch instructions, and a similar treatment has been realized on the MonadWriter part of the original RSST transformer (it’s like RWST, except faster and not leaky). The MonadState doesn’t need special instructions, as InterpreterMonad is already an instance of MonadState.

The MonadWriter has been dropped, in favor of more specific instructions (the original reader structure can be directly observed in the new instruction set, along with the exposed PuppetDBAPI). Finally, some additional utility functions have been thrown in, as they rely on IO.

With all of this in place, it becomes trivial to write the following instances :

1
2
3
4
5
6
7
8
instance MonadError Doc InterpreterMonad where
    throwError     = singleton . ErrorThrow
    catchError a c = singleton (ErrorCatch a c)

instance MonadWriter InterpreterWriter InterpreterMonad where
    tell   = singleton . WriterTell
    pass   = singleton . WriterPass
    listen = singleton . WriterListen

Now the refactoring becomes mechanical, and surprisingly non invasive. As can be seen in the patch, it’s mostly about replacing every use of the view and liftIO with the corresponding “singleton” command. I have seen that people write short functions for commonly used instructions, such as :

1
pdbGetResources = singleton . PDBGetResources

I didn’t go for this, as most functions are used at most a couple times.

Running the computation

The interpreter is right here, and is pretty painful to read. Its type is however pretty straightforward : given the “Reader” data and a ProgramT, it will create an equivalent (or not) computation represented by another monad. It is exactly(2) as writing in a DSL, and running it through an interpreter.

I was surprised that I had to write the explicit type signatures for the functions that are in the where section of the interpretIO function, but other than that this was a straightforward exercise. As a reaction to a recent popular reddit thread, the “Overview” given in operational’s documentation was invaluable to get started quickly.

Conclusion

I have seen this kind of design mentioned at several places, as a common way to keep things pure and easy to reason about. I however thought it was better to think about it earlier in the design process, as changing the base monad of all computations would require a significant rewrite.

The first pleasant surprise was that it only took me a few hours to go from “reading the haddocks” to “refactoring done”.

The second, in some sense, even more pleasant surprise was that there doesn’t seem to be any performance penalty whatsoever.

Comments