The language-puppet website.

7 Startups - part 5 - the XMPP backend

2014-08-09T10:48:00+02:00

Note: I ran out of time weeks ago. I could never finish this serie as I envisionned, and I don’t see much free time on the horizon. Instead of letting this linger forever, here is a truncated conclusion. The previous episodes were :

Part 1 : probably the best episode, about the basic game types.
Part 2 : definition of the game rules in an unspecified monad.
Part 3 : writing an interpreter for the rules.
Part 4 : stumbling and failure in writing a clean backend system.

In the previous episode I added a ton of STM code and helper functions in several 15 minutes sessions. The result was not pretty, and left me dissatisfied.

For this episode, I decided to release my constraints. For now, I am only going to support the following :

The backend list will not be dynamic : a bunch of backends are going to be registered once, and it will be not be possible to remove an existing or add a previous backend once this is done.
The backends will be text-line based (XMPP and IRC are good protocols for this). This will unfortunately make it harder to write a nice web interface for the game too, but given how much time I can devote to this side-project this doesn’t matter much …

The MVC paradigm

A great man once said that “if you have category theory, everything looks like a pipe. Or a monad. Or a traversal. Or perhaps it’s a cosomething”. With the previously mentionned restrictions, I was able to shoehorn my problem in the shape of the mvc package, which I wanted to try for a while. It might be a bit different that what people usually expect when talking about the model - view - controller pattern, and is basically :

Some kind of pollable input (the controllers),
a pure stream based computation (the model), sporting an internal state and transforming the data coming from the inputs into something that is passed to …
… IO functions that run the actual effects (the views).

Each of these components can be reasoned about separately, and combined together in various ways.

There is however one obvious problem with this pattern, due to the way the game is modeled. Currently, the game is supposed to be able to receive data from the players, and to send data to them. It would need to live entirely in the model for this to work as expected, but the way it is currently written doesn’t make it obvious.

It might be possible to have the game be explicitely CPS, so that the pure part would run the game until communication with the players is required, which would translate nicely in an output that could be consumed by a view.

This would however require some refactoring and a lot of thinking, which I currently don’t have time for, so here is instead how the information flows :

Here PInput and GInput are the type of the inputs (respectively from player and games). The blue boxes are two models that will be combined together. The pink ones are the type of outputs emitted from the models. The backends serve as drivers for player communication. The games run in their respective threads, and the game manager spawns and manages the game threads.

Comparison with the “bunch of STM functions” model

I originally started with a global TVar containing the state information of each players (for example if they are part of a game, still joining, due to answer to a game query, etc.). There were a bunch of “helper functions” that would manipulate the global state in a way that would ensure its consistency. The catch is that the backends were responsible for calling these helper functions at appropriate times and for not messing with the global state.

The MVC pattern forces the structure of your program. In my particular case, it means a trick is necessary to integrate it with the current game logic (that will be explained later). The “boSf” pattern is more flexible, but carries a higher cognitive cost.

With the “boSf” pattern, response to player inputs could be :

Messages to players, which fits well with the model, as it happened over STM channels, so the whole processing / state manipulation / player output could be of type Input -> STM ().
Spawning a game. This time we need forkIO and state manipulation. This means a type like c :: Input -> STM (IO ()), with a call like join (atomically (c input)).

Now there are helper functions that return an IO action, and some that don’t. When some functionnality is added, some functions need to start returning IO actions. This is ugly and makes it harder to extend.

Conclusion of the serie

Unfortunately I ran out of time for working on this serie a few weeks ago. The code is out, the game works and it’s fun. My original motivation for writing this post was as an exposure on basic type-directed design to my non-Haskeller friends, but I think it’s not approachable to non Haskellers, so I never shown them.

The main takeaways are :

Game rules

The game rules have first been written with an unspecified monad that exposed several functions required for user interaction. That’s the reason I started with defining a typeclass, that way I wouldn’t have to worry about implementing the “hard” part and could concentrate on writing the rules instead. For me, this was the fun part, and it was also the quickest.

As of the implementation of the aforementionned functions, I then used the operational package, that would let me write and “interpreter” for my game rules. One of them is pure, and used in tests. There are two other interpreters, one of them for the console version of the game, the other for the multi-backends system.

Backend system

The backends are, I think, easy to expand. Building the core of the multi-game logic with the mvc package very straightforward. It would be obvious to add an IRC backend to the XMPP one, if there weren’t that many IRC packages to choose from on hackage …

A web backend doesn’t seem terribly complicated to write, until you want to take into account some common web application constraints, such as having several redundant servers. In order to do so, the game interpreter should be explicitely turned into an explicit continuation-like system (with the twist it only returns on blocking calls) and the game state serialized in a shared storage system.

Bugs

My main motivation was to show it was possible to eliminate tons of bug classes by encoding of the invariants in the type system. I would say this was a success.

The area where I expected to have a ton of problems was the card list. It’s a tedious manual process, but some tests weeded out most of the errors (it helps that there are some properties that can be verified on the deck). The other one was the XMPP message processing in its XML horror. It looks terrible.

The area where I wanted this process to work well was a success. I wrote the game rules in one go, without any feedback. Once they were completed, I wrote the backends and tested the game. It turned out they were very few bugs, especially when considering the fact that the game is a moderately complicated board game :

One of the special capabilities was replaced with another, and handled at the wrong moment in the game. This was quickly debugged.
I used traverse instead of both for tuples. I expected them to have the same result, and it “typechecked” because my tuple was of type (a,a), but the Applicative instance for tuples made it obvious this wasn’t the case. That took a bit longer to find out, as it impacted half of the military victory points, which are distributed only three times per game.
I didn’t listen to my own advice, and didn’t take the time to properly encode that some functions only worked with nonempty lists as arguments. This was also quickly found out, using quickcheck.

The game seems to run fine now. There is a minor rule bugs identified (the interaction between card-recycling abilities and the last turn for example), but I don’t have time to fix it.

There might be some interest with the types of the Hub, as they also encode a lot of invariants.

Also off-topic, but I really like using the lens vocabulary to encode the relationship between types these days. A trivial example can be found here.

The game

That might be the most important part. I played a score of games, and it was a lot of fun. The game is playable, and just requires a valid account on an XMPP server. Have fun !

Language-puppet 0.14.0

2014-06-13T13:59:00+02:00

A new version is out, and the version bump is here because it might now break catalog builds (despite no API change). The resource relationship resolver has been completely reworked, and should now work a bit more like Puppet does. It should now properly flag unknown resources used in relationships, or relationship loops.

There is another feature that might be of interest to those who don’t feel like editing modules that are contributed by other persons. An important class of modules are those that are officially supported by puppetlabs. language-puppet should strive to support all of them, but most of them demonstrate bad practices that are frowned upon by Puppetlabs itself (such as the reliance on the inheritance parameters pattern instead of Hiera), and some of them rely on features I specifically want to avoid (such as failed lookups returning undef).

You can now use the --ignoremodules options to prevent puppetresources from interpreting those modules, letting you test the other parts of your manifests. This will however fail if you reference a resource from these modules from your code …

Complete changelog

New features
- Overhauled the dependency check system
- Added an option to skip the user and group checks
- Added an option to ignore some modules
Bugs fixed
- Added vagrant, nagios, www-data, postgres and nginx to the list of known users.
- Fixed how resource relationships were resolved with notify and before.
- Fixed a problem where inheritance whould be used with :: prefix.
- The defined function now works with classes.
- All numbers are now strings in templates.

7 Startups - part 4 - Adding an asynchronous backend

2014-05-22T19:39:00+02:00

This one will be an experimental post, as I have just added a ton of new code. I did not have enough time to do this properly, so this looks like a giant tangle of STM functions now. I will probably rewrite a large part of it, but I still think the though process that led me to this could be interesting to others, so here we are.

TLDR: I wrote a lot of code, it sucks, and I will rewrite a large part of it for the next episode.

Stuff that was refactored since last time

A pair of minor items :

I fixed the PrettyElement problem here
I changed the type of playerActionsDialog so that it only accepts NonEmpty lists.
I did the same for allowableActions. I also rendered the function partial by mistake, can you spot the bug ? :)

The big change came from the fact that I realized my operation types were wrong. In particular this one :

AskCard :: Age -> PlayerId -> NonEmpty Card -> Message -> GameInstr Card

This type seemed right for writing the game rules, and the console version. However, it does suck for a multiplayer version, as this code will ask the second player for his choice only after the first one has answered. This will slow down the game considerably. We should query all players at once, and wait for their answers after that. I decided to model this as an abstract promise, ie. a value container that will eventually be filled. There is a new type parameter p for the GameInstr type, along with a new GetPromise instruction.

Now, all players are consulted at the same time, and the game then waits for them to answer (code).

This is all very abstract, but in practice things are not that simple, and the promise might not get fulfilled. One problem is a player disconnecting from a backend. One way to do this would be to make the return value of the getPromise function be an Either of some sort. But the only sensible option in the game logic would be to throw an error, so instead the interpreter’s getPromise can fail, but not the version that’s exposed to the game.

For the pure backend, the promise type is just Identity, as seen here.

An ambitious step

I decided to get a bit more ambitious for this episode. I wanted to implement code that would be able to run several games at once, over several medias at once, with player being connected on any media combination. I did not write a simple backend to start with, to get a feel of where the actual problems were, and decided to write the code top-down.

So here is what I had in mind :

So basically, there would be a “Hub” that would hold the player list, who is playing which game, and that would also run the games. Backends would interact with it to provide IO with the players. As IRC and XMPP have the same kind of communication model, they would be merged in a single “virtual backend” that would communicate with a pair of “real backends”. Now how to do that ?

Asynchronous communication

Both backends need to listen to two kinds of events :

Requests from the game, such as asking a specific player to choose a card.
Instant messages from a server, a channel, or a person.

From the point of view of a game, the messages from the players are usually of no interest. It just needs them to choose a card, or an action to play from times to times. The backends, however, will need to watch out for administrative commands. This means there should be a lot of filtering.

The main problem resides in asking something to a player, and get his answer. An extremely naive way to go about this would be something along the lines of:

sendTextContent session (RUser playerid) message
resp <- getMessage session

This would be completely wrong because we are not assured the next message will be the one we are expecting. So instead, we need to implement some sort of callbacks, so that when a message arrives, the backend would check if we were expecting something from the sender, and react accordingly. This means that we need an extra thread that will just wait for messages, and handle administrative commands and callbacks. So something like :

forkIO $ forever $ do
    msg <- getMessage session
    let pid = getPlayerId msg
    case callbacks ^. at pid of
        Just cb -> cb msg
        Nothing -> case getContent msg of
            "xxx" -> adminCommandXxx
            "yyy" -> adminCommandYyy
            _     -> return ()
runGame

Where the game would do something like this to implement, for example, the “ask card” function:

askCard :: PlayerId -> NonEmpty Card -> m (promise Card)
askCard pid necards = do
    let msg = buildAskCardMessage necards
    sendMessage session pid msg
    p <- newPromise
    addCallback pid $ \msg ->
        case decodeMsg of
            Just x -> fulfill p x
            Nothing -> askAgain
    return p

Multiple backends

So that was cool, but what if there are multiple backends ? All of them must be able to fulfill that promise ! What I would like to do is to be able to return a promise that will be fulfilled at a later date in another part of my program. Usually, this would be something like that :

promise :: IO a -> IO (Promise a)
getPromise :: Promise a -> IO a

But I decided most of my function will live in the STM (I did not document this choice, but the STM is so useful it’s a no brainer), and I wanted to write code anyway.

Because of “not invented here”, I wrote my own implementation, gave it a bad name (PubSub) and wrote buggy Monoid instances. Code is here and is wrong on many levels. It exposes this set of functions :

newPubSub    :: STM (PubFP e a, SubFP e a)
fulfillPub   :: PubFP e a -> a -> STM ()
failPub      :: PubFP e a -> e -> STM ()
getResult    :: SubFP e a -> STM (Either e a)
addPubAction :: STM () -> PubFP e a -> PubFP e a

The newPubSub gives you a pair of values, one of them you can publish to (using fulfillPub for success and failPub for failure), and one of them you can get results from (with a blocking getResult).

The name is wrong because getResult will always return the same value, so this does not behave at all like a stream, which could be implied by the PubSub name.

The addPubAction is horrible too. I only had a sketchy idea that I needed callbacks at some point when I wrote this module, and that these callbacks should be lifted as soon as possible, so probably as soon as a response is published. This is wrong because :

The type here let you do a ton of stuff, so it’s not obvious what this function does or is supposed to be used for.
It is not actually useful. I realized later this was not needed.

The Monoid instances suffer the same problem, as they are probably not useful. Even worse, one of them doesn’t even work !

instance Monoid (PubFP e a) where
    mempty = PubFP (const (return ()))
    PubFP a `mappend` PubFP b = PubFP (a >> b)

It actually used the monad instance of (->) and not STM, which, if I desugar it properly, does something like that :

    a >> b
    a >>= \_ -> b
    \z -> (\_ -> b) (a z) z
    \z -> b z
    b

So it basically only used the last PubFP. The correct implementation for mappend should have been :

instance Monoid (PubFP e a) where
    PubFP a `mappend` PubFP b = PubFP (\e -> a e >> b e)

Abstracting a backend

Now that I have decided to have a “hub” connecting several backends, I need to find a way to abstract them. I will need to keep a list of them, and it must be possible to add and remove them dynamically, so I need some way to identify each backend. I also need a way to tell the backend that it is to be removed, so that it can clean things up and say goodbye. Finally, I need a way to talk to a backend, and a way for the backend to interact with the hub.

Here is the set of threads I should need without undue multiplexing :

One thread per active game. I don’t think it’s possible to combine them in a single thread due to the API I exposed, and I don’t think it’s worth worrying about this anyway.
One thread per backend, that will listen to messages from the outside world.

That is probably all I need, but because I started writing code before thinking about my architecture, I introduced a bad abstraction. I decided I would talk to the backends using a TChan, which means :

One additional thread per backend, listening to messages from the games.

So backends are defined here. A better abstraction for backendChan :: TChan Interaction could be backendTell :: Interaction -> STM (). You might notice that the comments are talking about a set of function, which was my first shot, and which was a better idea indeed.

Abstracting communication

Communications from the game and to the player are currently of three types :

Asking the player what card he would like to play during normal play. This returns a PlayerAction and Exchange couple.
Asking the player to choose between several cards. This returns a Card.
Telling various things to the player. There is no return value expected from the player.

For all cases, we need to contact each backends and tell them to contact a player or broadcast some message.

For the first two cases we need also to set some callback machinery and wait for them to answer. The function performing this should return quickly some kind of promise object that will be filled once a player has answered.

We would like to write a function with a type looking somehow like :

communicate :: TVar GlobalState -> a -> STM (Promise b)

Where a is one of the three message types, and b the corresponding answer. Obviously, we can’t write such a function with this type. But we can use the TypeFamilies extension to write it :

data IAskingAction   = IAskingAction PlayerId Age (NonEmpty Card) GameState Turn
data IAskingCard     = IAskingCard   PlayerId Age (NonEmpty Card) GameState Message
data ISimpleMessage  = ISimpleMessage CommunicationType GameId

type family InteractionResult a :: *
type instance InteractionResult IAskingAction     = (PlayerAction, Exchange)
type instance InteractionResult IAskingCard       = Card
type instance InteractionResult ISimpleMessage    = ()

communicate :: TVar GlobalState -> a -> STM (Promise (InteractionResult a))

Now the type of a and of the Promise are statically linked, which will be useful for writing generic functions.

Conclusion

This episode was about hasty decisions and code quickly written. I was not exactly in my comfort zone with the multiple backends concept, and should probably have aimed lower to get a feel of the problem first.

I will rework all of this before the next episode, which will be about concurrency, the STM, and how to mix IO and STM actions.

Language-puppet 0.13.0

2014-05-21T20:15:00+02:00

This is a release mostly about the new parser stuff and bugfixes.

New stuff

New parser stuff, such as adding array and hashes, inserting values in them.
Hacky support for scope.get_hash.
Numbers are now numbers, and not strings, just like the new parser does. Protip: before activating the new parser for the real Puppet, make sure ALL your file modes are defined as strings, or you will just break your production. That was not a pleasant experience to me !
Support for structured facts is present.
New stdlib functions : is_hash, has_key, size, values.

Bugs fixed

There were a few very minor bugs fixed, the biggest of them probably being the fact that Puppet seems to define the $title variable for classes declared “define style”. I think this is a terrible decision, but well, it’s now supported in language-puppet.

Haskell stuff

Expression now has a Num and Fractional instance.
A pure evaluation function is now here, for tests and lenses.

Have fun !

7 Startups - part 3 - An interpreter for the rules

2014-05-16T07:31:00+02:00

In the previous episode I implemented the game rules, but did not test them. I also had some reservations about some code I wrote, but predicted it would be mostly right, even without tests. Today’s episode is about pretty printing and operational !

Minor modifications since last time

I refactored the getCardFunding and getCardVictory functions so that they are now pure. I toyed with the idea of having a monad morphism (I learned today it was called like that to integrate Reader GameState actions in the MonadState GameState functions, but this was not warranted as the functions are so simple.
I refactored neighborhood relationship so that it encodes more invariants. A player now must have a left and right neighbor. They might be invalid though.
I refactored the type of the interfaces between the game rules and the players, so that you can’t pass empty lists where they are forbidden. I was later told this type already existed in semigroups.

Why pretty printing ?

I hinted heavily last time that there would be a dedicated pretty printer. An example of such an implementation is in the ansi-wl-pprint package. It introduces functions and combinators that let you easily create a Doc value that will look neat on your screen.

Unfortunately, in order to properly support all text-based backends (IRC, XMPP, email, console) it doesn’t seem to be possible to reuse an existing printer. For example, the color set between all these backends is quite distinct, and some are even capable of printing pictures. I tried to engineer one that would be at the same time flexible, easy to use and good-looking an all backends. Time will tell if this was a success.

I will not give a dissertation on the subject, and have copied the interface from other pretty printing libraries. I will just give some implementation details here.

Basic pretty printing types

Speaking of stealing from other pretty printers, I really should have looked at their code too ! Here are my basic types:

newtype PrettyDoc = PrettyDoc { getDoc :: Seq PrettyElement }
    deriving (Eq, Monoid)

data PrettyElement = RawText T.Text
                   | NewLine
                   | Space
                   | ...

So you basically have all “elements” in PrettyElement, and they can be appended in a monoidal fashion in a PrettyDoc, which is just a newtype for Seq PrettyElement. This is a very inelegant decision, and I will be sure to refactor it for the next episode ! Looking at another implementation, it is clear that a single type was required, and that the Monoidal structure could be achieved by adding Empty and Cat constructors. There is a reason I wrote my type like this though, and it is related to how I intended to solve the problem of backends with poor or no support for multiline messages, but this will featured in another episode !

Specific design choices

I decided to directly encode the game entities as part of the pretty printing types. That should be obvious from the list of elements. A VictoryPoint, a Neighbor or even a CardType are directly representable, so that the backends can optimize their rendering.

Other than that, the code is pretty boring.

A pretty-pretty printer ?

My first backend will be the console, as it will not have any networking or concurrency problems to solve. I used the aforementioned ansi-wl-pprint package, and wrote a pretty instance for PrettyElement and PrettyDoc. This leads to strange code such as print (PP.pretty (pe something)).

Implementing the GameMonad

During the last episode, I wrote all the rules in an abstract monad that is an instance of GameMonad, meaning it featured a few functions for interacting with the players. I took a typeclass approach so that I could start writing the rules without worrying about the actual implementation of this abstract monad.

Now that the rules are written, it is time to give them a try. In order to do so, I ditched the typeclass, and expressed it in terms of ProgramT, from the operational package. It only takes a few steps to refactor :

The instructions GADT

You must start by writing all the operations that must be supported as a GADT.

We previously had :

type NonInteractive m = (MonadState GameState m,
                         Monad m,
                         MonadError Message m,
                         Functor m,
                         Applicative m)

class NonInteractive m => GameMonad m where
    playerDecision    :: Age -> Turn -> PlayerId -> [Card] -> GameState -> m (PlayerAction, Exchange)
    askCard           :: Age -> PlayerId -> [Card] -> GameState -> Message -> m Card
    tellPlayer        :: PlayerId -> Message -> m ()
    generalMessage    :: Message -> m ()

And now have :

data GameInstr a where
    PlayerDecision :: Age -> Turn -> PlayerId -> NonEmpty Card -> GameInstr (PlayerAction, Exchange)
    AskCard        :: Age -> PlayerId -> NonEmpty Card -> Message -> GameInstr Card
    TellPlayer     :: PlayerId -> Message -> GameInstr ()
    GeneralMessage :: Message -> GameInstr ()
    ActionsRecap   :: M.Map PlayerId (PlayerAction, Exchange) -> GameInstr ()
    ThrowError     :: Message -> GameInstr a
    CatchError     :: GameMonad a -> (Message -> GameMonad a) -> GameInstr a

So … there have been some choices going on here. First of all, we need to support all the features we previously had, namely MonadState, MonadError and four game-specific functions. You can spot these four functions quite easily (along with a new one, which will be covered in a minute). We get MonadState and MonadError in the following way :

type GameMonad = ProgramT GameInstr (State GameState)

instance MonadError PrettyDoc (ProgramT GameInstr (State GameState)) where
    throwError = singleton . ThrowError
    catchError a handler = singleton (CatchError a handler)

I decided to use the monad transformer ProgramT over a base State GameState monad, but encode the error part with the provided instructions. It would have been easier to encode the state part that way, except I don’t know how to write an instance for ProgramT (see this post comment).

The interaction functions no longer have a GameState in their types, because the interpreter will have access to the state when decoding this instruction, so it is not necessary to pass it here too.

Mechanically refactor all mentions of GameMonad

Now all you have to do is to replace all type signatures that looked like :

GameMonad m => a -> b -> m c

Into :

a -> b -> GameMonad c

Write an interpreter

I decided to write a generic interpreter, that takes a group of functions in some monad m, a base GameState, and gives you a function that computes any GameMonad a expression in the m monad. The implementation is pretty obvious, and not very interesting, but it should be easy to write backends now.

Perhaps of interest is the fact that the game state is explicitly passed as a parameter all over the place, so it can be passed to the backends at the interpreter level.

A pure backend

The easiest backend to write is a pure one, with no real player interaction. I could have used Identity as the base monad, but instead opted for State StdGen. That way, I can easily have the “players” play random cards, which will help with testing.

The implementation is also nothing special, but made me write a lot of code to support it. In particular, the allowableActions function is pretty tricky, and is not entirely satisfying. Given a game state, a player name and a list of cards in his hands, it gives a list of all the non obviously stupid legal actions that are available. It does so in the most direct way, enumerating all possible combinations of resources, neighbor resources, exchanges, etc. that would work. Then it removes all duplicates, and the actions that are obviously too expensive.

Fortunately, all this code will also be used by the other backends.

So … are there bugs yet ?

I wrote a simple test that checks for errors. Theoretically, the pure backend should always result in games that end well (we should get a Right ... instead of a Left rr. So I wrote a simple property-based test that gets an arbitrary seed and number of players (between 3 and 7), runs a pure game and checks its result.

And there were runtime errors !

The Monoid instance for AddMap had an infinite loop.
The allowableActions function sometimes returned no valid actions. I forgot to always add the possibility to drop a card …

To prevent the second case from happening again, I wrote the “prepend drop actions” before the big case statement, and modified the type of the askCardSafe function so that it can’t accept an empty list. This means that if I introduce another bug in allowableActions, I should get a Left ... instead of a runtime exception.

There also was a “rule” bug, due to the fact that I had not understood a rule correctly. Basically, I use a fictional 7Th round to emulate the efficiency capability, but there should be no “hand rotation” before that turn. I fixed it wrong once, and then properly. However, I did not discover nor fix this bug because of tests.

The console backend

Before writing the console backend I needed a bit of code for pretty printing stuff. Once this was done, the backend was quickly written.

The opponents still play randomly, which explains the kind of results depicted below, but it is a genuine pleasure to finally play !

I also realized when using the console backends that the messaging functions, while generic, would probably not work well on all backends. I decided to include more specialized functions, such as ActionsRecap, which can be passed a map of all the actions the players undertook in a turn. The current version also lacks a way of getting the results of the poacher wars between the ages, but that should be trivial to add.

Next time

Next time should get more interesting, as I will try to write an interesting backend. It will be a bit harder to design because I want players using distinct backends to be able to participate in the same game.

7 Startups - part 2 - Game rules definition

2014-05-12T20:59:00+02:00

Whew, I just added a big pile of code to what was done previously. I wrote all the missing game types and rules. It took about 4 or 5 hours.

In this post, I will describe how I decided to define the main game types, and some various details of interest.

Choosing the rules monad

I will describe the rules using a monad, mainly because I am used to work with them, and because they are mighty convenient in Haskell, with the do notation and the numerous libraries. As is often the case with games, there will be a state, containing the game state at a given time. But while I will just write the rules, I need to graft user interaction at some point. The goal of this project is to write a 7 Wonders clone that might work with multiple backends. To achieve this, I will try not to constraint my implementation any more than necessary.

Player identification

The first important type is to find a way to identify each players. I wrote this :

type PlayerId = T.Text

I currently am not sure this is sufficient / precise enough, but the backends I have in mind (IRC, XMPP, console and email) all have string based identifiers, so it should work for at least those three. Anyway, the backends will probably have to keep a relationship between a player nickname and his actual identity in the system, so this will probably turn out OK.

Game state

data GameState = GameState { _playermap   :: M.Map PlayerId PlayerState
                           , _discardpile :: [Card]
                           , _rnd         :: StdGen
}

data PlayerState = PlayerState { _pCompany         :: CompanyProfile
                               , _pCompanyStage    :: CompanyStage
                               , _pCards           :: [Card]
                               , _pFunds           :: Funding
                               , _pNeighborhood    :: M.Map Neighbor PlayerId
                               , _pPoachingResults :: [PoachingOutcome]
}

makeLenses ''GameState
makeLenses ''PlayerState

This might look pretty obvious, and it might be (as it is my first version), but this model has several shortcomings, the worst of them being the way that neighboring information is encoded. This is originally a tabletop game, and each player has two neighbors : on his left and on his right. Unfortunately, the Map Neighbor PlayerId only means that a player can have up to two neighbors (there are only two constructors in the Neighbor type), and it doesn’t even garantee they have a corresponding state in the GameState.

A type that would properly model this property would be to store [(PlayerId, PlayerState)] in GameState, interpreted as a circular list (the first player in the list being the right neighbor of the last one). But this would be a major PITA to manipulate.

Another idea would be to store the neighboring information in a read-only structure. That way, we can make sure that no invariants are being violated, as the structure can’t be modified, but this also might be too much of a hassle. I will probably refactor some of this for the next episode with something less challenging : a simple pair.

And now, the monad !

As we have seen, we will need a MonadState GameState to model most of the rules. Some parts of the game might also throw errors, so it might be a good idea to have our monad be an instance of MonadError. Finally, we need some user interaction. In order to be able to write any backend, I decided to keep it abstract for now :

type GameStateOnly m = ( MonadState GameState m
                       , Monad m
                       , Functor m
                       , Applicative m)
type NonInteractive m = (MonadState GameState m
                        , Monad m
                        , MonadError Message m
                        , Functor m
                        , Applicative m)

class NonInteractive m => GameMonad m where
    playerDecision    :: Age -> Turn -> PlayerId -> [Card] -> GameState
                             -> m (PlayerAction, Exchange)
    askCard           :: Age -> PlayerId -> [Card] -> GameState -> Message -> m Card
    -- | Tell some information to a specific player
    tellPlayer        :: PlayerId -> Message -> m ()
    generalMessage    :: Message -> m () -- ^ Broadcast some information

First of all are two constraints synonyms :

GameStateOnly : basically MonadState State with all the implied constaints, which will be used in all the functions that can’t fail and that don’t require user interaction.
NonInteractive : just like the previous constraint, but for functions that can throw errors.

Finally, a GameMonad typeclass. The monad our game will work in must implement these four functions, which are all I found was needed for player communication :

playerDecision: this is the main interaction. Given all kinds of data, it asks the player to decide what he will do in the current turn.
askCard: there are two effects where a player must chose a card over a list (copy community, and play for free a card from the discard pile). This is what this function is about, at least for now.
tellPlayer: tells a specific message to a given player.
generalMessage: tells a message to all players. This might not be necessary, as we could just iterate over the list of players and use tellPlayer. On the other hand, for IRC or XMPP backends, it might make sense to display this information on a conversation channel, so that watchers can follow the game.

The reason why it might make sense to have such granularity (pure, GameStateOnly, NonInteractive, GameMonad) is twofold :

It is easier to reason about the functions.
The less “powerful” a function is, the easier it is to test.

What is important to note is that I can’t write arbitrary effects with just the GameMonad constraint. Even better, I know I should be careful only when using the first two functions, as they are the only ones where user input can creep in. This explains why the part of the code that deals with playerDecision is so full of checks.

The choice of a typeclass is debatable, as there probably will only be a single implementation. I chose to do so because it will let me write code without worrying about how the monad itself will be implemented. I will probably ditch the typeclass later.

One problem so far is that these functions don’t have the proper type. Indeed, what happens when I pass askCard an empty list ? How is the player supposed to provide a card ? The other problem now is what to do with this Message type. Right now, it’s a type synonym to String, but it will change for the next episode !

Various notes

No error recovery

I decided not to have error recovery in the game rules description. This is the responsability of the “driver” (which will be described in a later post) to make sure sore losers can’t DoS the game. The game will just end on the first error it encounters.

Lenses everywhere

This code uses the lens library all over the place. This is not surprising, as it involves a lot of mangling of nested structures in the State monad. But the prisms are even better ! Here is an example :

-- | Compute the money that a card gives to a player
getCardFunding :: GameStateOnly m => PlayerId -> Card -> m Funding
getCardFunding pid card = do
    stt <- use playermap
    -- note how we exploit the fact that ^. behaves like foldMap here
    let funding = card ^. cEffect . traverse . _GainFunding . to computeFunding
        computeFunding (n, cond) = countConditionTrigger pid cond stt * n
    return funding

The choice of writing this option in GameStateOnly is debatable, as it just needs a read only access to the state once, and might just have been like that :

getCardFunding :: GameState -> PlayerId -> Card -> Funding

However, what is interesting is how it is working. Here is an anotated of how the funding function is composed :

cEffect                                               :: Traversal' Card [Effect]
cEffect . traverse                                    :: Traversal' Card Effect
cEffect . traverse . _GainFunding                     :: Traversal' Card (Funding, Condition)
cEffect . traverse . _GainFunding . to computeFunding :: Fold Card Funding

So basically we wrote a traversal that goes through all effects of a card, keeping those with the GainFunding constructor, extracting its arguments, and finally using them to compute a Funding.

Now, if I had written funding = card ^.. ..., I would have obtained a [Funding], that I could add with sum. But remember that we made sure that our numerical newtypes, such as Funding and Victory, had a monoid instance for addition. In that case, ^. (or view) will make a monoidal summary, meaning it will give me 0 if there were no matches, or the sum of these matches, which is exactly what I wanted.

Order of execution

In this game, order of execution is really important, as most actions are supposed to happen simultaneously, and some only at very specific steps. In particular, a players can “trade” a resource belonging to a neighbor in exchange for money. A naïve implementation would be something like :

playermap . ix playerid . cFunds -= x
playermap . ix neighbor . cFunds += x

But this would create a (risky) exploit : namely declaring that you want to trade more resource than what you have money for, hoping somebody else will trade with you and that this transaction will be processed before yours.

In order to fix this, the resolveExchange function only removes money from the current player, returning the set of bought resources and an AddMap PlayerId Funding, listing the money that needs to be given to the neighbors.

The AddMap newtype

The resolveAction function also returns this AddMap PlayerId Funding, and the payouts are only processed after all actions are resolved. In order to make the code nicer, we need this AddMap k v newtype to be Traversable and have a Monoid instance that does unionWith (+).

The code is here and is an example on how this is done. I also derived the Ix and At instances, even though I didn’t end up using them. Strangely, someone asked on the cafe mailing list how to do this.

The 7th turn

There are only 6 turns for each age. But there is a company stage that let players use the 7th card, at the end of an age. Instead of having a special case, this is done by having an optional 7th turn.

No tests

Despite my claim that my rules are easy to test, tests are horrible to write, as they need a big setup. For this reason I postponed writing them ;) This will be a good test of the “Haskell code almost works the first time it runs” theory.

Next time

I will refactor a bit, and introduce a custom pretty-printer that will work with multiple backends, so that it is possible to have a nice view of what is going on during play.

7 Startups - part 1 - Introduction and types

2014-05-10T10:14:00+02:00

There have been recently complaints that there wasn’t any resource available for bridging the gap between beginner and experimented Haskeller, and some posts on “Haskell program architecture” have been written to help with this transition. I have found these posts to be pretty interesting, and while I can hardly be called an expert, I would like to contribute to this effort by documenting a few advanced Haskell features, as well as my design decisions, applied to a simple, yet fun, project.

Now that this is out of the way, let’s start !

The project

In this series of posts, I will describe how to model the rules of a well known board game, and how to turn them in an enjoyable program. If time permits, quite a few topics should be discussed, including key design decisions, how to interface a pure description of the rules with multiple backends, concurrency with the STM, and the advantage of always pretty printing your data structures.

The game itself is a shameless clone of the excellent 7 Wonders game (you can find the rules on the official web site), but with Internet giants instead of antique wonders. The theming took me a long time, and I am not particularly satisfied with it, so if you feel like contributing, please give me better names for the cards and resources.

All the code is on github. I will document my decisions and actions as I go, and will tag the repository accordingly. The relevant version for this article is tag Step1.1.

The types

Startups.Base

The Startups.Base module contains all the base elements of the game, with the relationship with the original game written the comments. While all the types are more or less directly transcribed from the rules book, the newtyped numerical types might not be obvious :

newtype Poacher = Poacher { getPoacher :: Integer }
    deriving (Ord, Eq, Num, Integral, Real, Enum, Show)

newtype VictoryPoint = VictoryPoint { getVictory :: Integer }
    deriving (Ord, Eq, Num, Integral, Real, Enum, Show)

newtype Funding = Funding { getFunding :: Integer }
    deriving (Ord, Eq, Num, Integral, Real, Enum, Show)

newtype PlayerCount = PlayerCount { getPlayerCount :: Integer }
    deriving (Ord, Eq, Num, Integral, Real, Enum, Show)

All the derived instances let you use them just like a standard Integer in your code, and the newtype prevents you from mixing them. But the main advantage is that it will make functions type signatures a lot more informative.

Startups.Cards

I usually would have merged this module with the previous one, but for the sake of blogging about it I separated the two. This module is all about modeling the cards. Fortunately, the cards have an obvious representation. But what about the Effect type ?

Modeling the effects

With a functional language, there are several ways to go :

Have some big case statements all over the code that depend on the card names, the effects being encoded where they are needed. This is obviously bad, as it will lead to a lot of verbose code, and it will be a pain to refactor the code.
Have the effect described as a state-changing function (ie. type Effect = PlayerId -> GameState -> GameState). This is the most versatile option, as it lets you add new cards with funky effects without modifying other parts of the code. Unfortunately, your program no longer have an easy way to “observe” the effect, so you will need to write a human-readable description for each card. It might be hard to write an AI for this game too (this point is debatable). There is also the problem of reasoning about new effects, especially concerning the order of application of the effects. A common workaround is to add a “priority” field, so that the order of application is known.
Fully describe all effects with a data type. This is the approach we are going to take, as it has obvious advantages in this particular case : most cards can be described with a handful of distinct “effect components”, where the components are orthogonal. This means they should be implemented in the part of the code that are relevant. It will be quite easy to describe arbitrary effects to the user too.

All the possible effects components can be seen here. Some components have no parameters (such as Recycling), meaning they model a specific rule. But what is nice about this data type is that it models the effects of the cards, but also of the company building stages.

Precise types

The following types are not as obvious as they appear :

data Neighbor = NLeft
              | NRight
              deriving (Ord, Eq, Show)

data EffectDirection = Neighboring Neighbor
                     | Own
                     deriving (Ord, Eq, Show)

My first version was something like :

data EffectDirection = NLeft
                     | Own
                     | NRight
                     deriving (Ord, Eq, Show)

This was simpler, but some effects have no meaning when applied to the current player (such as reduced exchange rates). This will make pattern matching a bit more cumbersome, but it will probably prevent some mistakes.

Modeling the cost

What is more interesting is the Cost data type.

data Cost = Cost (MS.MultiSet Resource) Funding
          deriving (Ord, Eq, Show)

A MultiSet is a collection of objects that can be repeated but for which order is not important (you can also think of it as a sorted list). It perfectly models a resource cost, such as “3 operations, and a marketing”, and it provides us with a isSubsetOf operation that can directly tell whether a player has enough resources to play some card. There is an obvious Monoid instance for it :

instance Monoid Cost where
    mempty = Cost mempty 0
    Cost r1 f1 `mappend` Cost r2 f2 = Cost (r1 <> r2) (f1 + f2)

I don’t think this instance will be too useful, except for writing this cleanly :

instance IsString Cost where
    fromString = F.foldMap toCost
        where
            toCost 'Y' = Cost (MS.singleton Youthfulness) 0
            toCost 'V' = Cost (MS.singleton Vision) 0
            toCost 'A' = Cost (MS.singleton Adoption) 0
            toCost 'D' = Cost (MS.singleton Development) 0
            toCost 'O' = Cost (MS.singleton Operations) 0
            toCost 'M' = Cost (MS.singleton Marketing) 0
            toCost 'F' = Cost (MS.singleton Finance) 0
            toCost '$' = Cost mempty 1
            toCost _ = error "Invalid cost string"

When the OverloadedStrings extension is enabled, the compiler will accept strings in places where another type is expected, by adding a call to the fromString function. For example, "YYY" :: Cost will be replaced by fromString "YYY" :: Cost.

I don’t think this is good practice to advise others to write partial IsString instances, but it greatly helped with writing the card list, speaking of which …

Tests

Writing the first card list was the most tedious and error prone part of this endeavor. In order to make sure I did not introduce a typo, I performed a couple of tests on the card list :

All cards are distinct (got a bug).
For every number of players and ages, there are 7 cards for each player (there were three errors).

What could have been better

Ord instances

Most data types now have Ord instances that are not particularly meaningful. They are here so that the data structures can be used in the standard containers types, such as Data.Map.Strict and Data.Set. It might have been a better idea to use unordered-containers, but this would have meant more boilerplate (for deriving all the Hashable instances).

Why not use an external DSL for describing the cards ?

This indeed would have been a good idea, and wouldn’t have been particularly hard to write. I don’t think it would have added much to the project at this stage though.

Modeling the “Opportunity” effect

This effect currently looks like this : Opportunity (S.Set Age). It is used to describe the fact that a given player can build for free any card, once per age. The Set will contain the set of Ages for which this capacity has not been used. This means that when the player decides to use this capacity, this effect will need to be updated. If this wasn’t for this effect, a player card list would only be modified by adding a card to it, which would have been more pleasant.

Card and Company stages

When I started writing this post, the Card type had a single constructor, and there was a CardType that was not part of the rules used to describe a company stage. I did that because I thought it was more elegant to unify cards and company stages, as they were pretty similar (both have costs and effects that work the same way).

It turned out that I had to enter dummy values for player count, age, card name, etc. for these “cards”. Now there is an additional constructor for company building stages, as can be seen in this commit.

Next time

In the next episode, I will start writing the game rules, starting with choosing (or not) the proper abstraction for describing them. In the meantime, do not hesitate commenting (reddit link) !

New tests : are all users defined ?

2014-03-15T11:39:00+01:00

I recently added to HEAD tests about users and groups definitions (patch here). It is now enabled by default on puppetresources, which might not be a great idea (let me know if you would rather have a switch).

The basic idea is that some built-in types take user and group names as parameters. There is no check for existence during catalog compilation (and there can’t be meaningful tests anyway), so you can get catalog application failures. This particular test (named usersGroupsDefined) checks that all the groups and users used in the file, exec, cron, ssh_authorized_key, ssh_authorized_key_secure and user types are defined somewhere.

Refactoring: from an IO-based monad stack to a pure Program (part 2)

2014-03-01T13:57:00+01:00

In the previous post, I explained how I refactored the language-puppet catalog compiler so that the main monad was a pure Program (from the operational package) instead of an ErrorT Doc (RSST InterpreterReader InterpreterWriter InterpreterState IO). I then wrote an interpreter that would turn it back to this monad stack, so that it could be used with runErrorT and runRSST.

It might have been obvious to many readers that this was a pretty strange move, but I didn’t figure it out until operational’s author told me (thanks !). Here is what my interpreter signature was :

interpretIO :: InterpreterReader
            -> InterpreterMonad a
            -> ErrorT Doc (RSST InterpreterReader InterpreterWriter InterpreterState IO) a

And here is what it should have been from the beginning :

interpretIO :: InterpreterReader
            -> InterpreterState
            -> InterpreterMonad a
            -> IO (Either Doc a, InterpreterState, InterpreterWriter)

The Program should have been converted to the base monad (IO in this case) in the first place, instead going through the intermediate monad stack transformer step. The interpreter is now a lot easier to read.

Refactoring: from an IO-based monad stack to a pure Program (part 1)

2014-02-26T22:36:00+01:00

UPDATE (2014/03/01)

It turns out that there was a better way to do this, please see this new post.

Rationale

I am currently experimenting with the operational package. This post provides a rough outline on how I moved from an IO based monad stack to an isomorphic pure representation of the computation. I am unfortunately not well endowed on the theoretical side, so this will be a very practical post. It might contain some glaring mistakes, as I just spend a few hours acquainting myself with the concepts and migrating everything, and didn’t test it extensively. I marked the places where I am unsure on how to do something with a number, such as (0).

Here is the type of the main monad, before and after the change :

-- Before
type InterpreterMonad = ErrorT Doc (RSST InterpreterReader InterpreterWriter InterpreterState IO)
-- After
type InterpreterMonad = ProgramT InterpreterInstr (State InterpreterState)

I first tried a simple Program InterpreterInstr for the main monad, but I could not write the MonadState instance, as there was a conflicting instance (1). This is the reason why the State monad is there, at the base of the transformer stack.

The goal is to build a representation of the catalog compilation process, in a pure monad, and then transform it into another representation that will actually be executed. In order to do so, all the “effects” need to be encoded as a single type (designated as instr in the operational haddocks). In this case, this is the InterpreterInstr type, detailed here.

You might observe that commands of type m a -> m b become constructors of type m a -> instr b, and not instr a -> instr b (which makes sense if you think about what you are doing, but was not immediately obvious to me when I started writing the types).

Implementing the Program monad

First of all, all the effects given by the original transformer stacks have their own instructions : ErrorT has the ErrorThrow and ErrorCatch instructions, and a similar treatment has been realized on the MonadWriter part of the original RSST transformer (it’s like RWST, except faster and not leaky). The MonadState doesn’t need special instructions, as InterpreterMonad is already an instance of MonadState.

The MonadWriter has been dropped, in favor of more specific instructions (the original reader structure can be directly observed in the new instruction set, along with the exposed PuppetDBAPI). Finally, some additional utility functions have been thrown in, as they rely on IO.

With all of this in place, it becomes trivial to write the following instances :

instance MonadError Doc InterpreterMonad where
    throwError     = singleton . ErrorThrow
    catchError a c = singleton (ErrorCatch a c)

instance MonadWriter InterpreterWriter InterpreterMonad where
    tell   = singleton . WriterTell
    pass   = singleton . WriterPass
    listen = singleton . WriterListen

Now the refactoring becomes mechanical, and surprisingly non invasive. As can be seen in the patch, it’s mostly about replacing every use of the view and liftIO with the corresponding “singleton” command. I have seen that people write short functions for commonly used instructions, such as :

pdbGetResources = singleton . PDBGetResources

I didn’t go for this, as most functions are used at most a couple times.

Running the computation

The interpreter is right here, and is pretty painful to read. Its type is however pretty straightforward : given the “Reader” data and a ProgramT, it will create an equivalent (or not) computation represented by another monad. It is exactly(2) as writing in a DSL, and running it through an interpreter.

I was surprised that I had to write the explicit type signatures for the functions that are in the where section of the interpretIO function, but other than that this was a straightforward exercise. As a reaction to a recent popular reddit thread, the “Overview” given in operational’s documentation was invaluable to get started quickly.

Conclusion

I have seen this kind of design mentioned at several places, as a common way to keep things pure and easy to reason about. I however thought it was better to think about it earlier in the design process, as changing the base monad of all computations would require a significant rewrite.

The first pleasant surprise was that it only took me a few hours to go from “reading the haddocks” to “refactoring done”.

The second, in some sense, even more pleasant surprise was that there doesn’t seem to be any performance penalty whatsoever.

Finding dead code with puppetresources

2014-02-10T13:11:00+01:00

I just released version 0.12.1, along with binary downloads at the usual place.

The main feature of this new version is that puppetresources can now detect dead code. In order to do so, you must supply it with a site.pp containing explicit references to all the nodes you are using, and run it that way :

puppetresources -p . -o deadcode +RTS -N4

Also of interest for those writing automated integration test, puppetresources now exits with an error code when something went wrong.

Version 0.12.0 : GHC 7.8 and performance increase

2014-02-07T13:50:00+01:00

I just released version 0.12.0, which should compile on GHC 7.8 once the following packages are fixed :

regex-pcre-builtin (Edit 2014/02/09: now updated)
strict-base-types (Edit 2014/02/09: now updated)
charset (There is a pull request under way)

It seems to compile fine against GHC 7.6.3, even though I couldn’t really test the resulting executable (I gave a shot at Nix, but hruby is somewhat broken as a result).

This release doesn’t bring much on the table apart from an hypothetical 7.8 compatibility.

I made several claims of performance increase, previously, so here are the results :

	0.10.5	0.10.6	0.11.1	0.12.0
49 nodes, N1		10.74s	9.76s	9.03s
49 nodes, N2		10.48s	7.66s	7.01s
49 nodes, N4		9.7s	6.89s	6.37s
49 nodes, N8		12.46s	13.4s	11.77s
Single node	2.4s	2.24s	2.02s	1.88s

The measurements were done on my workstation, sporting a 4 cores HT processor (8 logical cores).

The performance improvements can be explained in the following way :

Between 0.10.5 and 0.10.6, the Ruby interpreter mode of execution was modified from a Channel based system to an MVar one.
Between 0.10.6 and 0.11.1, all systems that would run on their own thread were modified to use the calling thread instead, reducing synchronization overhead (except for the Ruby thread). This gave a 9% performance boost for single threaded work, and a 29% performance boost when using four cores. The 8-cores performance worsened, because of the wasted work of the parser (This is explained in the previous post).
Between 0.11.1 and 0.12.0, I moved from GHC 7.6.3 to GHC 7.8-rc1, and bumped the version of many dependencies (including text and aeson, both having received a speed boost recently). This resulted in a “free” 7% speed boost.

As it is shown here, the initial parsing is extremely costly, as computing the catalogs for 49 nodes is about 5 times as long as computing it for a single node. As the parsed files get cached, catalog computing becomes more and more effective (about 50 times faster than Puppet). I don’t think the current parser can be sped up significantly without ditching its readability, so this is about as fast as it will get.

The next goals are a huge simplification of the testing system, and perhaps an external DSL. There are compiled binaries and ubuntu packages at the usual place.

Version 0.11.1 - cleanup and performance improvements

2014-01-31T13:03:00+01:00

This is a quick update, that mainly fixes the build issues with strict-base-types (it was updated a few minutes after I released v0.11.0). I also reduced the boilerplate (thanks redditors), and dropped all unsafeCoerce, as they had almost no effect on performance.

There is an important performance improvement however : filecache isn’t built around the ‘fork a master thread and communicate with it through a Chan’ pattern that I have been so fond of. As a quick reminder, the query function looks like that :

query :: Error r => FileCacheR r a
                 -> FilePath
                 -> IO (S.Either r a)
                 -> IO (S.Either r a)

The first parameter is a file cache of actions of type a, that can fail with errors of type r. The second argument is the file path corresponding to the cache entry you are querying (note that nothing in this API forces a relationship between the file and the computation). The last argument is the IO action that should be cached. If the action has already been queried, and the file hasn’t changed since, then a cache hit should occur, and a quick answer is expected. The problem with the previous implementation was that, on cache misses, the action would be executed in the ‘master’ thread, blocking all other consumers of the file cache.

This is no longer the case, as this was all replaced by classic STM operations. This increased the performance a bit for the single threaded case, and a bit more with -N4. This is however a performance killer with -N8, as all threads query the same files at the same time, duplicating work unnecessarily. This was not a problem with the previous implementation, as only the first call would trigger the action, and all subsequent calls would hit the cache. All in all, this is simpler to reason about, and a nice gain in common situations.

Version 0.11.0 - now with a full serving of unsafeCoerce

2014-01-30T21:15:00+01:00

A new version is out, with a focus on the bleeding edge ! The focus of this release was to update important dependencies, so we have now :

lens 4
attoparsec 0.11
aeson 0.7
text up to 1.1
parsers 0.10
and new versions of filecache and hruby

The goal here is to reap the benefits of the speed improvements in attoparsec, text and aeson, and be ready for the GHC 7.8 release.

There were a few hurdles to get there. The move from Data.Attoparsec.Number to Data.Scientific for number representation was not a big problem (even though code in hruby could be polished), but the lens and parsers upgrade proved more problematic.

Lens related breakage

The main problem with lens 4 was that it broke strict-base-types. I don’t think this will last long, but here is a temporary workaround for the impatient. Other than that, several instances of x ^. contains y were replaced by x & has (ix y) for the map-like types. This was a painless upgrade.

The trouble with parsers

I like this package a lot, because it exposes a nice API and supports several underlying parsers. But it comes with a couple problems.

The first one is related to the TokenParsing typeclass. This let you define how someSpace, the function that skip spaces, is defined. Unfortunately, the parsers library comes with an instance for parsec that will skip the characters satisfying isSpace. While this certainly is a sane choice, this is a problem for people who would like to use parsec as the underlying parser, but with a different implementation of someSpace. In my case, I also wanted to skip single line (start with #) and multi-line (/* these */) comments. A solution is to create a newtype, and redefine all instances. For those wondering how this is done, here is the relevant part. Please let me know if there is a way to do that that does not require that much boilerplate.

The second problem is that the expression parser builder (at Text.Parser.Expression) is much slower than what is in parsec. Switching to it from Text.Parsec.Expr resulted in a 25% slowdown, so I switched back to parsec. Unfortunately, I didn’t immediately realize this was the culprit, and instead believed it a case newtypes lowering performance. My code is now littered with unsafeCoerces, that I will remove in the next version (provided this does not result in a performance hit).

New features of previous versions I did not blog about

Several bugs were solved thanks to Pi3r.
Several new facts were added.
A parsing bug was fixed (it was also fixed in the official documentation.
An ‘all nodes’ testing mode was added for puppetresources. This can be used that way :

$ puppetresource -p . -o allnodes +RTS -N4
Problem with workstation.site : template error for userconfig/signature.erb :
undefined method `[]' for nil:NilClass
(erb):5:in `get_binding'
/home/bartavelle/.cabal/share/x86_64-linux-ghc-7.6.3/language-puppet-0.11.0/ruby/hrubyerb.rb:46:in `get_binding'
/home/bartavelle/.cabal/share/x86_64-linux-ghc-7.6.3/language-puppet-0.11.0/ruby/hrubyerb.rb:68:in `runFromContent'
/home/bartavelle/.cabal/share/x86_64-linux-ghc-7.6.3/language-puppet-0.11.0/ruby/hrubyerb.rb:63:in `runFromFile'
 in ./modules/userconfig/templates/signature.erb at # "./modules/userconfig/manifests/init.pp" (line 33, column 9)
 Problem with db.dev.site : The following parameters are unknown: (use_ramdisk) when including class percona at # "./manifests/site.pp" (line 956, column 10)
 Problem with db2.dev.site : The following parameters are unknown: (use_ramdisk) when including class percona at # "./manifests/site.pp" (line 969, column 9)
 Tested 54 nodes.

This should provide a nice overview of the current state of your manifests. And it tested those nodes about 50 times faster than Puppet can compile the corresponding catalogs.

Behind the scene changes

The Puppet.Daemon machinery has been simplified a lot. It previously worked by spawning a pre-determined amount of threads specialized for parsing or compiling catalogs. The threads communicated with each other using shared Chans. The reason was that I wanted to control the memory usage, and didn’t want to have too many concurrent threads at the same time. It turns out that most memory is used by the parsed AST, which is shared using the (filecache)[http://hackage.haskell.org/package/filecache] module, so this is not a real concern.

I did rip all that, and now the only threads that is spawned is an OS thread for the embedded Ruby interpreter, and an IO thread for the filecache thread. The user of the library can then spawn as many parallel threads as he wants. As a result, concurrency is a bit better, even though there are still contention points :

The parsed file cache is held by the filecache thread, and communicates with a Chan. I will eventually replace this with an MVar, or some other primitive that doesn’t require a dedicated thread.
The Lua interpreter requires a LuaState, that should not be used by several threads at once. It is stored in a shared MVar.
The Ruby interpreter is the main performance bottleneck. It is single threaded, and very slow. The only way to speed it up would be to parse more of the ruby language (there is a parser for common Erb patterns included !), or to switch to another interpreter that would support multithreading. Both are major endeavors.

Waiting around the corner

The next releases will probably be Ruby 2.1 support for hruby, and some performance work on filecache.

Prisms and lenses in language-puppet

2014-01-22T20:08:00+01:00

I added a few prisms and a lens to language-puppet. I use it mainly for manipulating manifest files. Here is an example, with some ‘type annotation’ that should give an idea on how this work :

import Control.Lens
import Puppet.Lens
import Control.Lens.Aeson
import qualified Data.Text as T
import qualified Data.Text.IO as T

content :: T.Text
content = "node 'test' { $x = 12 }"

main :: IO ()
main = T.putStrLn $ content               -- T.Text
                  & _PParse               -- [Vector Statement]
                    . traverse            -- [Statement]
                    . _Statements         -- [[Statement]]
                    . traverse            -- [Statement]
                    . _VariableAssignment -- [(T.Text, Expression, PPosition)]
                    . _2                  -- [Expression]
                    . _PResolveExpression -- [PValue]
                    . _Integer            -- [Integer]
                    +~ 2

The result of this program is :

node "test"
{
      $x = "14"
}

This is much nicer than the many ad-hoc transformations I have written.

The stm-firehose package

2014-01-18T11:55:00+01:00

I recently needed a specific kind of publish-subscribe primitive, where the publisher must not block, and that should not leak memory. It is published as the stm-firehose package.

It exposes an STM interface (built upon TBMQueue), but also higher level helpers. My use case is to be able to tap on a message stream at will, and observe some of the messages as they flow through the infrastructure. The clients will join and leave at any time, possibly from a low bandwidth connection. This means that they might not be able to cope with the traffic, but this should not adversely affect the publisher or other clients.

To cover common use cases, two helpers are exposed in the Data.Conduit.Network.Firehose module :

firehoseApp: creates a WAI application that lets firehose clients connect through a web interface.
firehoseConduit: spawns a WARP server waiting for clients to connect, and returns a Conduit that will make all messages traversing it available to clients.

Finally, a specialized version of firehoseConduit, called firehose, has been included in the Data.Conduit.Firehose module of the hslogstash package, in the name of fireHose. This version is specialized for querying logstash messages, and yes, the naming of the function and module could have been better.

It might have been a better idea to just create a TCP server for the helper functions, instead of a more complex HTTP server, but this would have required writing a custom protocol for handling the client “preferences” in the kind of messages they would like to see. As it stands, the user of the helper functions can pass a filtering function that can adjust its behavior based on the client HTTP request. This is exploited in the logstash-specific helper function by letting the client specify the list of message types he would like to see in the URL.

As a side note, there have been quite a few improvement and bug fixes in language-puppet since last time I blogged about it. There still are a few issues to close before releasing a new version.

Version 0.10.3 out - Hiera support, and a score of incremental updates

2013-12-02T23:22:00+01:00

I did not announce the 0.10.2 version, which was mainly a bug fixing release and the first time I bothered to fix the dependencies of the language-puppet package. This version is full of new stuff however, you can get it on hackage or as binaries.

Hiera support, with facts manipulation utilities

This is the main attraction to this release. I did not use Hiera beforehand, so the development only started because of a GitHub issue. It turns out that this is really a great feature ! Most Hiera features should be supported (actually, I think all of them are, but I might have missed something). Facts manipulation is also easier now.

You can now have that kind of sessions :

session.sh

$ ssh remote.node 'facter --yaml' > /tmp/facts.yaml # Retrieve facts from a remote host.
$ vi /tmp/facts.yaml # Edit the facts so as to test variations.
$ puppetresources -p ~/gits/puppet -o remote.node \
        --yaml ~/gits/puppet/hiera.yaml \  # Load the given yaml file.
        --facts-override /tmp/facts.yaml   # Override collected facts with
                                           # those from the remote host.

New stateWriter monad

I previously blogged about a replacement for the standard RWS monad, wrote a package, and moved to it. Code speed increased by 10% on a single run, but I expect subsequent runs to go twice as fast. I will need to benchmark this of course, but this is currently not a priority.

JSON output, for use with puppet apply

This is a funny feature. There is support for this in puppetresources, with the --json flag. Once this is generated, you can copy the file onto a machine and apply it with puppet apply --catalog. This also means that it is now trivial to write a replacement for the puppetmaster, a feature that was experimental before the full rewrite, but that might be fully supported (with debian packages) soon.

Full scope tracking for all resources

Now all resources are stored with the full ‘scope’ (think of it as some kind of stack trace, including the originating host for exported resources), which makes debugging much easier. A typical use case is resource duplication in a define, such as the following case situation :

manifests/site.pp

define filewrapper($filename,$owner)
{
        file { $filename: ensure => present, owner => $owner; }
}

class class1 {
        filewrapper { 'fw1': filename => '/tmp/xxx', owner => 'class1' ; }
}
class class2 {
        filewrapper { 'fw2': filename => '/tmp/xxx', owner => 'class2' ; }
}

node 'a.b.c' {
        include class1,class2
}

The Puppet error messages is :

Error: Duplicate declaration: File[/tmp/xxx] is already declared in file \
/tmp/x/manifests/site.pp:3; cannot redeclare at /tmp/x/manifests/site.pp:3
on node a.b.c

The language-puppet error message is :

This error message is way more informative :

The resources are fully displayed, including all attribute values, which might shed some light on the problem.
The originating host (a.b.c) is displayed. This is crucial for debugging exported resource collisions. The standard output also shows

Various stuff

There are also a few minor changes in this release :

Support for the $classes variable in templates : this one is just as limited as the official one, as its value is parse-order dependent.
Support for @instance variables in templates.
Support for scope['key'] syntax in templates.
The haddocks were updated for some modules.
New is_virtual fact, and stdlib functions (flatten, str2bool, validate_absolute_path).

A faster RWS variant

2013-11-28T22:28:00+01:00

I recently saw comments stating that the Writer (and thus RWS) monad is slow and leaky. I have several programs and libraries built around the RWS monad, where I use the writer part only for logging (that is, snocing a single element at the end of the log), and already noticed in some cases that the profiler reported up to 80% of the time usage spent doing mappends.

Here is a POC, with a pair of monads I called RSS (for Reader - State - State). They offer an API similar to that of RWS, the only difference being that the writer part is internally riding with the state.

Tests with a (top secret) real program gave a two times speed increase. For language-puppet, it is a 10% speed increase for a single run, but most of the time is dominated by the initial parsing, so I expect the gain to be more substantial for other type of applications (complex tests come in mind). I wrote a benchmark, that compares the lazy and strict versions of the RSS and RWS monads, using several distinct monoids as the writer type. Here are the results on my machine :

	RSS.Lazy	RSS.Strict	RWS.Lazy	RWS.Strict
Seq	193 µs	194 µs	4.95 ms	457 µs
[]	4.46 ms	4.47 ms	4.89 ms	538 µs
Vector.Prim	764 µs	784 µs	47.72 ms	47.46 ms
RevList	173 µs	175 µs	10.57 ms	5.60 ms
IntSet	187 µs	184 µs	4.98 ms	472 µs
DList	177 µs	176 µs	4.56 ms	302 µs

RevList is the best data structure for my use case (it is just a newtype around a regular list, with reverse applied when the list is extracted), which should not come as a surprise. What was surprising (to me) is how terrible it turns out to me under RWS …
The performance of Vector.Prim with the RWS monad is terrible. I suppose this is due to some optimization rule that could not be deduced in that case. I was however surprised at the good performance compared to lists in the RSS case.
The RSS version is much faster than the RWS one … except for lists !
The strict version of the RSS monad doesn’t have a leaky writer (test here).

I also added a tellElement helper function that would snoc an element to the writer (with suitable monoids) :

	tell	tellElement
Seq	193 µs	231 µs
List	4.46 ms	4.45 ms
RevList	173 µs	200 µs

It seems that, to my surprise, there is a small price to pay for calling snoc instead of mappend.

I will probably release this as a package so that I can build language-puppet against it. I will copy’n’paste all the haddocks from the mtl and transformers packages, and move the modules so as to mimick their hierarchy. If you have suggestions, especially concerning the package or module names, strictness properties of the new monads (a subject where I am at loss), please let me know.

EDIT : Added benchmark results for DList

Quick facts editing with pdbquery

2013-11-18T23:25:00+01:00

I was made aware that it was not easy to play with facts for newcomers. I added a quick command to pdbquery that works like this :

pdbquery -t test -l /tmp/test.yaml addfacts test.site.com

This should create the file /tmp/test.yaml containing the following facts :

facts:
  test.site.com:
    memorysize: 31.31 GB
    ipaddress: 192.168.0.1
    hostname: test
    osfamily: Debian
    lsbdistcodename: precise
    facterversion: 0.1
    fqdn: test.site.com
    memoryfree: 21.16 GB
    environment: test
    operatingsystem: Ubuntu
    domain: site.com
    lsbdistrelease: 12.04
    puppetversion: language-puppet
    virtual: xenu
    operatingsystemrelease: 12.04
    clientcert: test.site.com
    is_virtual: 'true'
    lsbdistid: Ubuntu
    mountpoints: / /data
    architecture: x86_64
    swapsize: 9.31 GB
    lsbmajdistrelease: 12
    rootrsa: xxx
    swapfree: 9.31 GB
    lsbdistdescription: Ubuntu 12.04.3 LTS
    hardwaremodel: x86_64
resources: {}

You can then edit this file to add/remove/alter facts as you want. You can then use them in the following way:

puppetresources -p . -o test.site.com --pdbfile /tmp/test.yaml

This only works properly in HEAD right now, and will be fixed in 0.10.3, because of this Yaml bug that just got worked around today.

Hiera support in HEAD

2013-11-16T13:58:00+01:00

Hiera support is now in the latest version of the source repository. This is the last important feature that I wanted to implement before hitting 1.0. As with all the other features, this was really quick to develop (an evening to write the Hiera server and integrate it with the interpreter, and two lunch breaks to implement hash, array search types, and to finalize variable interpolation), and it is mostly untested.

I did not look much at Hiera before, but now that I have learned about it I can see how it will simplify my manifests, so I’ll test things a bit more during the following days. I’ll also revamp the test library, and, when I’ll be happy about it, will release 1.0.

I believe several persons are trying to use the library, and I would be quite happy to have feedback about it. If you need a feature, or a bug fixed, do not hesitate to create an issue.