First Experiments With Dependent Types In GHC
Tags: Haskell, GHC, Dependent Types, FFI, haskus-system March 18, 2016

Update 2016-12-28: rename ViperVM into Haskus system and fix links accordingly

For two weeks I have been getting my feet wet with GHC’s type extensions: DataKinds, TypeFamilies, TypeOperators, etc.

I rely a lot on the Foreign Function Interface (FFI) in my Haskus system project (formerly ViperVM) and I wanted to easily and efficiently map Haskell data types to C data types (unions, arrays, bitsets, bitfields, etc.). Last week I have put together a coherent set of modules to do it, heavily based on GHC’s type extensions. It is not yet documented on this blog, but you can read about it here: Writing bindings

Playing with type extensions gave me a lot of food for thought, hence this article about using them to enhance the way we do error checking and other control-flow stuff.

Control-flow

I am now trying to solve another issue: control-flow in the IO monad with error handling. For instance, if you have read this page, you know that most Linux’s syscalls can either fail and return an error, or succeed and return a value. E.g.:

I will use the following simpler functions to illustrate this article:

EitherT

To compose these calls, it is common to use the EitherT monad transformer.

The corresponding data-flow diagram is then:

Drawbacks

I think that this composition approach, however, is quite limiting:

  • all the functions must share the same Error type
  • when we get an Error, we don’t know which of the function has failed

Suppose we have instead a different error type per function:

We can somehow circumvent the issues mentionned above by enhancing the Error data type so that it can contain the different kinds of error. We could also add an identifier indicating the function which has triggered the error (I haven’t done it here to keep things simple).

This method doesn’t scale well. When we use Error in different places, we never know the real subset of errors that can happen in practice. For instance, in the following example, the type of result2 doesn’t tell us that the ErrorH error will never happen while it may happen in result.

(It is similar to errno in C: it can be any int value so we never know if we have tested all the possible error cases for a given syscall).

Handling errors is already very boring, we don’t want to handle errors that will never happen and we don’t want to create new ErrorXXX data types for each error subset due to composition. Instead, we would like to compose the functions to get:

That is, result should have an extended Either type that can contain an arbitrary number of different types: here C, ErrorF, ErrorG and ErrorH. And we want the help of the compiler to ensure that we don’t forget to handle some errors: we want to pattern match on result so that the compiler signals missing cases.

Variant

I have implemented a Variant type that supports an arbitrary number of content types. Similarly to data types (as implemented in GHC as far as I know), a hidden tag number identifies the actual type of the contained value. The value can be obtained/set with a type index (a Nat) that is used both to index into the content types and to check/set the tag value. It is much more simple to understand with a few examples:

Flow

I have made a little control-flow module that deals with functions returning Variants. Now instead of returning Either Error a, we can make our functions return a Variant '[a,Error] and we can compose them with flowSeq as in the following example:

Notice that by convention, the first value type in the variant is the type of the “correct” value: it is the type of the value that is potentially passed to the next function.

We can easily lift functions returning Either with flowSeqE:

Similarly, we can lift functions that do not fail with flowSeqM:

Note that a function may return several kinds of errors:

Most flow functions are biased in favor of the first element of the variant (the “correct” one). Nevertheless it is totally possible to perform custom operations during a flow:

We can also directly match in the flow with flowMatch:

Finally, if we want to catch all the errors with a given type, we can use flowCatch:

If we catch all the error types, we must be able to match the result with singleVariant, otherwise it doesn’t compile:

Conclusion

Finally, we now have a control-flow framework that seems easy to work with (I still need to use it in production). It looks a little bit like checked-exceptions in Java, but first-class (we can manipulate them, etc.). I should add combinators usually found in Exception frameworks (finally, bracket, etc.).

If you have other ideas, do not hesitate to contact me!

All of this code is part of a larger project now called Haskus system