This blog post is not about language-puppet, but might be of interest to my fellow sysadmins with an interest in Haskell. I recently worked with Logstash in a way that might not be typical, as all my messages are emitted by services that are Logstash-aware: they directly write JSON messages to the TCP input of the Logstash server. This means that most of the features (and some would say, the whole point) of Logstash were of no use to me.
I stuck with my grand mission of rewriting the handful of useful Ruby programs, and wrote a new package. I based almost everything around the excellent conduit abstraction. It has the following features:
- Haskell types for representing Logstash messages, along with the type-classes necessary for converting them from and to JSON
- An ElasticSearch conduit, using the bulk insert API
- A Redis source using the pipelining features of the hedis package, and a simple Redis sink
- A Logstash listener, based on the TCP listener from network-conduit, able to accept latin1 and UTF-8 messages at the same time
- A pair of “retrying” sinks, one using a Socket and the other establishing TCP connections. They are used for guaranteed delivery of a whole ByteString segment, reconnecting until it is sent (this is obviously useful for JSON messages)
- A few functions for handling bulk APIs in conduits
- And finally, the coolest part, a few helper functions that will let you route between conduits!
The last part was made after a little discussion on the Haskell-Cafe mailing list. It is built with stm-conduit, which already has a helper function for merging sources. This package introduces the other useful functionality: the ability to “route” items coming from a source to several sinks. The main function, branchConduits, works by taking a Source, a routing function, and a list of Sinks. The routing function associates a (possibly empty) list of integers with every item coming from the Source. These integers map directly to the corresponding Sinks, letting you define the routing policy.
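To make the routing idea concrete, here is a small pure model of it, written against plain lists rather than conduits. It is only a sketch of the semantics described above, not the package's actual branchConduits implementation: the names routeItems, Msg, Level, policy and example are all made up for illustration, and the assumption that sink indices start at 0 is mine.

```haskell
-- A pure, list-based model of "route each item to zero or more sinks".
-- Hypothetical names throughout; this does not use the conduit API.

data Level = Debug | Info | Error
  deriving (Eq, Show)

-- A toy message: a severity and a body.
data Msg = Msg Level String
  deriving (Eq, Show)

-- | Given the number of sinks and a routing function, return the list of
-- items each sink would receive, in arrival order.
-- Assumption: sinks are numbered from 0. Indices outside the valid range
-- are silently ignored, and a repeated index delivers the item only once.
routeItems :: Int -> (a -> [Int]) -> [a] -> [[a]]
routeItems nSinks route items =
  [ [ x | x <- items, i `elem` route x ] | i <- [0 .. nSinks - 1] ]

-- | Example routing policy: errors go to both sinks, informational
-- messages only to sink 0, debug messages are dropped (empty list).
policy :: Msg -> [Int]
policy (Msg Error _) = [0, 1]
policy (Msg Info  _) = [0]
policy (Msg Debug _) = []

-- | What two sinks would receive for a small stream of messages.
example :: [[Msg]]
example =
  routeItems 2 policy
    [ Msg Info  "started"
    , Msg Debug "heartbeat"
    , Msg Error "disk full"
    ]
```

Loading this in GHCi and evaluating example gives two lists: the first sink sees the "started" and "disk full" messages, the second sees only "disk full", and the debug message reaches neither. The real function does the same kind of dispatch, but over streaming sources and sinks instead of in-memory lists.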
The package includes a few examples of common tasks, all of them with acceptable runtime performance, such as:
- Moving messages from a TCP server to Redis
- Moving messages from Redis to Elasticsearch
- Routing messages between conduits
So if you need more control, or much better performance, than what you would get from Logstash, and you are not afraid to write (a lot of) code, please use this package and let me know what is missing and/or buggy!