Extensible ADT (EADT)

Tags: Haskell, EADT, Variant, Expression problem

May 22, 2018

Introduction

The expression problem (1998)

In 1998, Philip Wadler defined the Expression Problem as follow:

The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety

In basic Haskell it is straightforward to add new functions over a data type. Suppose we have the following arithmetic expression data type:

data Expr = Val Int | Add Expr Expr

We can independently add an evaluator function, potentially in another module:

eval :: Expr -> Int
eval (Val x)   =  x
eval (Add x y) = eval x + eval y

However if we want to add a new case to the data type (say support for multiplication), we have to modify both the data type definition and the functions using it:

data Expr = .... | Mul Expr Expr

eval :: Expr -> Int
....
eval (Mul x y) = eval x * eval y

Data types à la carte (2008)

Ten years later (in 2008), Wouter Swierstra described a technique to handle this in his well-known Data types à la carte paper. The first idea is to define data constructors independently and to use a type parameter to leave open the data type they are part of.

-- Independent data constructors. Parameter `e` represents the data type they
-- will be part of. It is required even if it is not used in the right hand
-- side.
data Val e = Val Int
data Add e = Add e e

Defining a new independent constructor is easy:

data Mul e = Mul e e

The second idea is to use a combining data type:

data (f :+: g) e = Inl (f e) | Inr (g e)

It is similar to Either except that it passes the same additional type parameter to both f and g type constructors. It can be used to compose independent data constructors without creating a new data type:

type ExprF = Val :+: Add

ExprF has kind Type -> Type and its type parameter is used as the e parameter of the independent data constructors. We can set it to arbitrary types such as Int to build valid values:

y = Inr (Add 5 8) :: ExprF Int

However the main use of this parameter should be to indicate the type of the expression data type we want to build, say Expr. Hence we would like to write something like this:

type Expr = ExprF Expr

 >error:
 Cycle in type synonym declarations:
   <interactive>:12:1-22: type Expr = ExprF Expr

Oops, we can’t build this cyclic (infinite) type. This leads us to the third idea: use another data type to handle the recursive nature of the expression type:

data Expr = Expr (ExprF Expr)

We can abstract on it to use the same data type for different expression types:

-- `Fix` type as defined in Data.Functor.Foldable for instance
newtype Fix f = Fix (f (Fix f))

type Expr = Fix ExprF

Summary

We use 3 different kinds of data types:

Constructor data types: Val, Add, Mul…
Combinator data type: :+:
Recursivity handling data type: Fix

By using these different data types we have untangled the construction of ADTs (algebraic data types) and we can freely add new constructor data types and mix them into different algebraic data types.

Operations on these algebraic data types can be defined independently by using type-classes and recursion schemes.

EADT - Extensible ADT (2018)

In this post I’d like to slightly revisit and enhance the approach by using modern Haskell and in particular another combinator data type and pattern synonyms.

A different combinator data type

Variant (xs :: [*]) is a sum type that contains a value whose type can be any type in the xs list (see this previous post). We can use it to define the following VariantF type:

newtype VariantF (xs :: [* -> *]) e = VariantF (Variant (ApplyAll e xs))

-- ApplyAll e '[f,g,h] ==> '[f e, g e, h e]

instance Functor (VariantF xs) where ....

We will use this type as a combinator data type. Similarly to the :+: combinator data type presented in the introduction, VariantF passes its e parameter to all of its “member” types and has an instance of the Functor class.

Now instead of writing f :+: g :+: h :+: i to combine constructor data types to form an ADT we can write VariantF '[f,g,h,i].

First benefit: just like using Variant is more efficient – O(1) memory usage and (de)construction – than using a nest of Either, using VariantF is more efficient than using a nest of :+:.

A generic EADT type

Let’s define an EADT type synonym as follow:

type EADT xs = Fix (VariantF xs)

And a pattern synonym to easily build EADT values:

pattern VF :: forall e f cs.
   ( e ~ EADT cs  -- allow easy use of TypeApplication to set the EADT type
   , f :<: cs     -- constraint synonym ensuring `f` is in `cs`
   ) => f (EADT cs) -> EADT cs
pattern VF x = Fix (VariantF (V' x))   -- `V'` match a variant value (without
                                       -- checking the membership: we already
                                       -- do it with :<:)

Second benefit: with modern Haskell we can define bidirectional pattern synonyms that make creation and matching of EADT values much more nicer.

Examples using EADT

Defining an EADT and building some values

Let’s define some constructors:

{-# LANGUAGE DeriveFunctor #-}

data ValF e = ValF Int deriving (Functor)
data AddF e = AddF e e deriving (Functor)

And an algebraic data type:

type AddValADT = EADT '[ValF,AddF]

We can build EADT values in the following (equivalent) ways:

x :: AddValADT
x = VF (ValF 10)

y = VF @AddValADT (ValF 10)

z = VF (ValF 10 :: ValF AddValADT)

But it’s nicer to define some additional pattern synonyms:

pattern Val :: ValF :<: xs => Int -> EADT xs
pattern Val a = VF (ValF a)

pattern Add :: AddF :<: xs => EADT xs -> EADT xs -> EADT xs
pattern Add a b = VF (AddF a b)

x :: AddValADT
x = Val 10

xx = Add (Val 10) (Val 5) :: AddValADT

We can pattern-match EADT values with VF or with pattern synonyms too:

showAddValADT :: AddValADT -> String
showAddValADT e = case e of
   -- match with VF
   VF (ValF u)  -> show u

   -- match with additional pattern synonym
   Add u v      -> "(" ++ showAddValADT u ++ " + " ++ showAddValADT v ++ ")"

   _            -> error "showAddValADT: unhandled AddValADT constructor"

> showAddValADT (Add (Val 10) (Val 3))
"(10 + 3)"

Note that the wildcard match is necessary, otherwise the pattern-match is reported as non exhaustive by GHC (warning). This is one shortcoming due to the use of the Variant type.

Anyway it isn’t really an issue in this case because showAddValADT shouldn’t be written like this to be extensible.

Extending an EADT and converting values from one EADT to another

Suppose we add a constructor:

data MulF e = MulF e e deriving (Functor)

We can define a new EADT:

type MulAddValADT = EADT '[ValF,AddF,MulF]

We only need to change the type to declare values of the new EADT:

x' :: MulAddValADT
x' = Val 10

We can explicitly add constructors to an EADT:

> x' = Val 10 :: EADT '[ValF,AddF]

> :t appendEADT @'[MulF] x'
appendEADT @'[MulF] x' :: EADT '[ValF, AddF, MulF]

We can also lift values from one EADT to the other as follow:

xx' :: MulAddValADT
xx' = liftEADT (Add (Val 10) (Val 5) :: AddValADT)

Thanks to the type-checking, the opposite direction is not allowed:

> liftEADT xx' :: AddValADT

<interactive>:83:1: error:
    • '[MulF]
      is not a subset of
      '[ValF, AddF]
    • In the expression: liftEADT xx' :: AddValADT
      In an equation for ‘it’: it = liftEADT xx' :: AddValADT

Defining operations on EADTs

Extensible operations on EADTS (i.e. that need to be extended when we add new constructors) are defined with type-classes.

Type classes and recursion schemes

We can use recursion schemes, for example:

import Data.Functor.Foldable

class ShowEADT (f :: * -> *) where
   showEADT' :: f String -> String

-- boilerplate instances for the combinator data type

instance ShowEADT (VariantF '[]) where
   showEADT' = undefined

instance (ShowEADT x, ShowEADT (VariantF xs))  => ShowEADT (VariantF (x ': xs)) where
   showEADT' v = case popVariantFHead v of
      Right u -> showEADT' u
      Left  w -> showEADT' w


-- instances for constructors data types

instance ShowEADT ValF where
   showEADT' (ValF i) = show i

instance ShowEADT AddF where
   showEADT' (AddF u v) = "(" ++ u ++ " + " ++ v ++ ")" -- no recursive call

instance ShowEADT MulF where
   showEADT' (MulF u v) = "(" ++ u ++ " * " ++ v ++ ")" -- no recursive call


-- `cata` (for catamorphism) handles the recursion/fold

> cata showEADT' (Add (Val 2) (Add (Val 10) (Val 3)) :: AddValADT)
"(2 + (10 + 3))"

> cata showEADT' (Add (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 + (10 + 3))"

> cata showEADT' (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 * (10 + 3))"


showEADT :: (ShowEADT (Base t), Recursive t) => t -> String -- type inferred by GHC
showEADT = cata showEADT'

> showEADT (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 * (10 + 3))"

Type classes and explicit recursion

We can also use explicit recursion, for example:

class Eval e where
   -- evaluate an expression
   eval :: e -> Int

-- same boilerplate

instance Eval (VariantF '[] e) where
   eval = undefined

instance (Eval (x e), Eval (VariantF xs e))  => Eval (VariantF (x ': xs) e) where
   eval v = case popVariantFHead v of
      Right u -> eval u
      Left  w -> eval w

-- evaluating an EADT value

type EvalAll xs = Eval (VariantF xs (EADT xs))

evalEADT :: EvalAll xs => EADT xs -> Int
evalEADT = eval . unfix

-- instances for constructors

instance Eval (ValF e) where
   eval (ValF i) = i

instance EvalAll xs => Eval (AddF (EADT xs)) where
   eval (AddF u v) = evalEADT u + evalEADT v -- explicit recursion

instance EvalAll xs => Eval (MulF (EADT xs)) where
   eval (MulF u v) = evalEADT u * evalEADT v -- explicit recursion


> evalEADT (Add (Val 2) (Add (Val 10) (Val 3)) :: AddValADT)
15
> evalEADT (Add (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
15
> evalEADT (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
26

Transformation example

Let’s rewrite Swiestra’s example to apply the distributivity law. The operation is not extensible, hence we don’t have to use type classes to define it. It works on any EADT as long as the constraints are fulfilled: supporting Add and Mul constructors.

import Control.Arrow

-- distribute multiplication over addition if it matches
distr :: (AddF :<: f, MulF :<: f) => EADT f -> Maybe (EADT f)
distr (Mul a (Add c d)) = Just (Add (Mul a c) (Mul a d))
distr _                 = Nothing

-- bottom up traversal that performs an additional bottom up traversal in
-- the transformed sub-tree when a transformation occurs. 
bottomUpFixed :: Functor f => (Fix f -> Maybe (Fix f)) -> Fix f -> Fix f
bottomUpFixed f = unfix >>> fmap (bottomUpFixed f) >>> Fix >>> f'
   where
      f' u = case f u of
         Nothing -> u
         Just v  -> bottomUpFixed f v

-- example
expr :: MulAddValADT
expr = Mul (Val 2) (Add (Val 10) (Mul (Val 3) (Add (Val 5) (Val 20))))

> showEADT expr
"(2 * (10 + (3 * (5 + 20))))"

> showEADT (bottomUpFixed distr expr)
"((2 * 10) + ((2 * (3 * 5)) + (2 * (3 * 20))))"

Constructor removal example (~desugaring)

Let’s show how to transform each multiplication into an addition while removing the MulF constructor from the EADT in a generic type-safe manner:

class MulToAdd (f :: * -> *) ys where
   mulToAdd :: f (EADT ys) -> EADT ys

-- boilerplate

instance MulToAdd (VariantF '[]) ys where
   mulToAdd = undefined

instance (MulToAdd x ys, MulToAdd (VariantF xs) ys)  => MulToAdd (VariantF (x ': xs)) ys where
   mulToAdd v = case popVariantFHead v of
      Right u -> mulToAdd u
      Left  w -> mulToAdd w

-- constructor instances

instance{-# OVERLAPPING #-} AddF :<: ys => MulToAdd MulF ys where
   mulToAdd (MulF u v) = Add u v -- Mul becomes Add
                                 -- (it makes no sense but it's just an example)

instance {-# OVERLAPPABLE #-} f :<: ys => MulToAdd f ys where
   mulToAdd = VF  -- the other constructors are kept as-is

-- example
expr :: MulAddValADT
expr = Mul (Val 2) (Add (Val 10) (Mul (Val 3) (Add (Val 5) (Val 20))))

> showEADT (cata mulToAdd expr :: AddValADT)
"(2 + (10 + (3 + (5 + 20))))"

> showEADT (cata mulToAdd (bottomUpFixed distr expr) :: AddValADT)
"((2 + 10) + ((2 + (3 + 5)) + (2 + (3 + 20))))"

This kind of transformation can be used to transform a big language into a smaller one (e.g. Haskell into Core).

Annotation examples (parameter example)

Let’s add a constructor to support annotations in the AST. We let the annotation type parametric.

-- | Annotate `e` with a value having type `a`
data AnnotF a e = AnnotF a e deriving (Functor)

pattern Annot :: AnnotF a :<: xs => a -> EADT xs -> EADT xs
pattern Annot a e = VF (AnnotF a e)

instance Show a => ShowEADT (AnnotF a) where
   showEADT' (AnnotF a e) = "{" ++ show a ++ "} " ++ e

We can easily define an EADT supporting parametric annotations, for instance:

type AnnotExp a = EADT '[ValF,AddF,MulF,AnnotF a]

Now we can build an example of an annotated expression where the annotation type is String:

annotExpr :: AnnotExp String
annotExpr = Mul (Val 8) (Annot "An addition" (Add (Val 10) (Val 5)))

> showEADT annotExpr 
"(8 * {\"An addition\"} (10 + 5))"

We can easily prune annotations from the expression similarly to what we did above to transform multiplications into additions:

class RemoveAnnot (f :: * -> *) ys where
   removeAnnot' :: f (EADT ys) -> EADT ys

instance RemoveAnnot (VariantF '[]) ys where
   removeAnnot' = undefined

instance
   ( RemoveAnnot x ys
   , RemoveAnnot (VariantF xs) ys
   )  => RemoveAnnot (VariantF (x ': xs)) ys
   where
   removeAnnot' v = case popVariantFHead v of
      Right u -> removeAnnot' u
      Left  w -> removeAnnot' w

instance{-# OVERLAPPING #-} RemoveAnnot (AnnotF a) ys where
   removeAnnot' (AnnotF _a e) = e

instance {-# OVERLAPPABLE #-} f :<: ys => RemoveAnnot f ys where
   removeAnnot' = VF

> showEADT (cata removeAnnot' annotExpr :: MulAddValADT)
"(8 * (10 + 5))"

As it is a bit sad to have to specify the type of the resulting EADT without the annotations, we can define the following helper function:

-- remove `AnnotF a` (for all `a`) in `xs`
type family FilterAnnot xs where
   FilterAnnot '[]              = '[]
   FilterAnnot (AnnotF a ': xs) = FilterAnnot xs
   FilterAnnot (x ': xs)        = x ': FilterAnnot xs

removeAnnot ::
   ( Functor (VariantF xs)
   , RemoveAnnot (VariantF xs) (FilterAnnot xs)
   ) => EADT xs -> EADT (FilterAnnot xs)
removeAnnot = cata removeAnnot'

> showEADT (removeAnnot annotExpr)
"(8 * (10 + 5))"

Type-checking example (paramorphism example)

First let’s add support for Float values:

data FloatValF e = FloatValF Float deriving (Functor)

pattern FloatVal :: FloatValF :<: xs => Float -> EADT xs
pattern FloatVal a = VF (FloatValF a)

instance ShowEADT FloatValF where
   showEADT' (FloatValF i) = show i

Now we can write invalid expressions mixing both Int and Float values:

> showEADT (Add (Val 10) (FloatVal 5) :: EADT '[ValF,FloatValF,AddF])
"(10 + 5.0)"

Let’s write a type-checker to detect these cases. For a given expression, our type-checker will either return its valid type (Int or Float) or an erroneous type with an explanation message:

data Typ
   = TInt
   | TFloat
   | TError String   -- the type of an invalid expression with some explanation
   deriving (Show,Eq)

class TypeCheck (f :: * -> *) where
   typeCheck' :: f Typ -> Typ

instance TypeCheck (VariantF '[]) where
   typeCheck' = undefined

instance (TypeCheck x, TypeCheck (VariantF xs))  => TypeCheck (VariantF (x ': xs)) where
   typeCheck' v = case popVariantFHead v of
      Right u -> typeCheck' u
      Left  w -> typeCheck' w

instance TypeCheck ValF where
   typeCheck' _ = TInt

instance TypeCheck FloatValF where
   typeCheck' _ = TFloat

instance TypeCheck AddF where
   typeCheck' (AddF t1 t2)
      | t1 == t2       = t1
      | TError _ <- t1 = t1 -- propagate errors
      | TError _ <- t2 = t2
      | otherwise      = TError $ "can't match " ++ show t1 ++ " with " ++ show t2

Let’s try it:

> cata typeCheck' (Add (Val 10) (Val 5) :: EADT '[ValF,FloatValF,AddF])
TInt

> cata typeCheck' (Add (Val 10) (FloatVal 5) :: EADT '[ValF,FloatValF,AddF])
TError "can't match TInt with TFloat"

The error message is not very useful, let’s write a better type-checker that indicates which expressions are at fault:

class TypeCheck2 (f :: * -> *) ys where
   typeCheck2' :: f (EADT ys, Typ) -> Typ

instance TypeCheck2 (VariantF '[]) ys where
   typeCheck2' = undefined

instance (TypeCheck2 x ys, TypeCheck2 (VariantF xs) ys)  => TypeCheck2 (VariantF (x ': xs)) ys where
   typeCheck2' v = case popVariantFHead v of
      Right u -> typeCheck2' u
      Left  w -> typeCheck2' w

instance TypeCheck2 ValF ys where
   typeCheck2' _ = TInt

instance TypeCheck2 FloatValF ys where
   typeCheck2' _ = TFloat

instance (AddF :<: ys, ShowEADT (VariantF ys), Functor (VariantF ys)) => TypeCheck2 AddF ys where
   typeCheck2' (AddF (u,t1) (v,t2))
      | t1 == t2       = t1
      | TError _ <- t1 = t1 -- propagate errors
      | TError _ <- t2 = t2
      | otherwise = TError $ "can't add `" ++ showEADT u ++ "` whose type is " ++ show t1 ++
                             " with `" ++ showEADT v ++ "` whose type is " ++ show t2

This time instead of using cata we will use another recursion scheme called para (for paramorphism) that also passes expression values (u and v in AddF instance):

> para typeCheck2' (Add (Val 10) (Val 5) :: EADT '[ValF,FloatValF,AddF])
TInt

> para typeCheck2' (Add (Val 10) (FloatVal 5) :: EADT '[ValF,FloatValF,AddF])
TError "can't add `10` whose type is TInt with `5.0` whose type is TFloat"

Let’s check that we can easily extend our expression language and our type-checker without rewriting anything by adding support for type ascriptions (i.e., types specified by the programmer):

data TypedF e = TypedF Typ e deriving (Functor)

pattern Typed :: TypedF :<: xs => Typ -> EADT xs -> EADT xs
pattern Typed t e = VF (TypedF t e)

instance ShowEADT TypedF where
   showEADT' (TypedF t e) = "(" ++ e ++ " :: " ++ show t ++ ")"

instance (Functor (VariantF ys), ShowEADT (VariantF ys)) => TypeCheck2 TypedF ys where
   typeCheck2' (TypedF t (e,te))
      | t == te   = t
      | TError _ <- t  = t
      | TError _ <- te = te
      | otherwise = TError $ "wrong type ascription " ++ show t ++
                             " for expression `" ++ showEADT e ++
                             "` with inferred type " ++ show te

Now let’s check that asserting some types is correctly supported:

> type E = EADT '[ValF,FloatValF,AddF,TypedF]

> para typeCheck2' (Add (Val 10) (Typed TInt $ Val 5) :: E)
TInt

> para typeCheck2' (Add (Val 10) (Typed TFloat $ Val 5) :: E)
TError "wrong type ascription TFloat for expression `5` with inferred type TInt"

> para typeCheck2' (Typed TInt $ Add (Val 10) (Val 5) :: E)
TInt

> para typeCheck2' (Typed TFloat $ Add (Val 10) (Val 5) :: E)
TError "wrong type ascription TFloat for expression `(10 + 5)` with inferred type TInt"

Finally let’s write another type-checker that adds inferred types as type-ascriptions to every node of the expression.

class TypeCheck3 (f :: * -> *) ys where
   typeCheck3' :: f (TypedF (EADT ys)) -> TypedF (EADT ys)

instance TypeCheck3 (VariantF '[]) ys where
   typeCheck3' = undefined

instance (TypeCheck3 x ys, TypeCheck3 (VariantF xs) ys)  => TypeCheck3 (VariantF (x ': xs)) ys where
   typeCheck3' v = case popVariantFHead v of
      Right u -> typeCheck3' u
      Left  w -> typeCheck3' w

instance (TypedF :<: ys, ValF :<: ys) => TypeCheck3 ValF ys where
   typeCheck3' (ValF i) = TypedF TInt (Val i)

instance (TypedF :<: ys, FloatValF :<: ys) => TypeCheck3 FloatValF ys where
   typeCheck3' (FloatValF f) = TypedF TFloat (FloatVal f)

instance
   ( TypedF :<: ys
   , AddF :<: ys
   , ShowEADT (VariantF ys)
   , Functor (VariantF ys)
   ) => TypeCheck3 AddF ys
   where
   typeCheck3' (AddF u@(TypedF t1 _) v@(TypedF t2 _))
      | t1 == t2       = TypedF t1 addVal
      | TError _ <- t1 = TypedF (TError "") addVal -- don't propagate errors
      | TError _ <- t2 = TypedF (TError "") addVal
      | otherwise = TypedF (TError $ "can't add " ++ show t1 ++ " with " ++ show t2) addVal
      where
         addVal = Add (VF u) (VF v)

-- similar for MulF...

instance
   ( TypedF :<: ys
   , Functor (VariantF ys)
   , ShowEADT (VariantF ys)
   ) => TypeCheck3 TypedF ys
   where
   typeCheck3' (TypedF t u@(TypedF te _))
      | t == te        = u                         -- Remove redundant ascription
      | TError _ <- t  = TypedF t (VF u)           -- keep erroring ascription as-is
      | TError _ <- te = TypedF (TError "") (VF u) -- don't propagate errors
      | otherwise      = TypedF (TError $ "wrong type ascription" ++
                                          " (inferred type: " ++ show te ++ ")") (Typed t (VF u))


typeCheck3 :: forall xs.
   ( TypeCheck3 (VariantF xs) xs
   , Functor (VariantF xs)
   , TypedF :<: xs
   ) => EADT xs -> EADT xs
typeCheck3 e = VF (cata typeCheck3' e :: TypedF (EADT xs))

Let’s try it:

> type E = EADT '[ValF,FloatValF,AddF,MulF,TypedF]

> showEADT (typeCheck3 (Add (Val 10) (Val 5) :: E))
"(((10 :: TInt) + (5 :: TInt)) :: TInt)"

> showEADT (typeCheck3 (Add (Val 10) (FloatVal 5) :: E))
"(((10 :: TInt) + (5.0 :: TFloat)) :: TError \"can't add TInt with TFloat\")"

> showEADT (typeCheck3 (Add (Add (Mul (Val 2) (Val 3)) (FloatVal 10)) (Val 5)) :: E)
"(((((((2 :: TInt) * (3 :: TInt)) :: TInt) + (10.0 :: TFloat))
   :: TError \"can't add TInt with TFloat\") + (5 :: TInt)) :: TError \"\")"

It’s a bit hard to read, let’s add an helper:

import Data.Tree

class DisplayEADT (f :: * -> *) where
   displayEADT' :: f (Tree String) -> Tree String

instance DisplayEADT (VariantF '[]) where
   displayEADT' = undefined

instance (DisplayEADT x, DisplayEADT (VariantF xs))  => DisplayEADT (VariantF (x ': xs)) where
   displayEADT' v = case popVariantFHead v of
      Right u -> displayEADT' u
      Left  w -> displayEADT' w

instance DisplayEADT ValF where
   displayEADT' (ValF i) = Node (show i) []

instance DisplayEADT FloatValF where
   displayEADT' (FloatValF i) = Node (show i) []

instance DisplayEADT AddF where
   displayEADT' (AddF u v) = Node "(+)" [u,v]

instance DisplayEADT MulF where
   displayEADT' (MulF u v) = Node "(*)" [u,v]

instance DisplayEADT TypedF where
   displayEADT' (TypedF t (Node l ts)) = Node (l ++ " :: " ++ show t) ts

drawEADT :: (Functor (VariantF xs), DisplayEADT (VariantF xs)) => EADT xs -> IO ()
drawEADT = putStr . drawTree . cata displayEADT'

And try it:

> drawEADT (typeCheck3 (Add (Add (Mul (Val 2) (Val 3)) (FloatVal 10)) (Val 5)) :: E)

(+) :: TError ""
|
+- (+) :: TError "can't add TInt with TFloat"
|  |
|  +- (*) :: TInt
|  |  |
|  |  +- 2 :: TInt
|  |  |
|  |  `- 3 :: TInt
|  |
|  `- 10.0 :: TFloat
|
`- 5 :: TInt

Composition example

Finally, we can write the following one-line example that strips annotations, adds support for type-ascriptions, performs type-checking and displays the resulting EADT:

annotExpr :: EADT '[ValF,AddF,MulF,AnnotF String]
annotExpr = Mul (Val 8) (Annot "An addition" (Add (Val 10) (Val 5)))

> drawEADT (typeCheck3 (appendEADT @'[TypedF] (removeAnnot annotExpr)))

(*) :: TInt
|
+- 8 :: TInt
|
`- (+) :: TInt
   |
   +- 10 :: TInt
   |
   `- 5 :: TInt

The compiler ensures that the EADT is valid at each phase. For instance if we forget to remove annotations, type-checking and drawing phases will complain at compile time:

> drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))

<interactive>:367:1: error:
    • No instance for (DisplayEADT (AnnotF String))
        arising from a use of ‘drawEADT’
    • In the expression:
        drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
      In an equation for ‘it’:
          it = drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))

<interactive>:367:11: error:
    • No instance for (TypeCheck3
                         (AnnotF String) '[ValF, AddF, MulF, AnnotF String, TypedF])
        arising from a use of ‘typeCheck3’
    • In the first argument of ‘drawEADT’, namely
        ‘(typeCheck3 (appendEADT @'[TypedF] annotExpr))’
      In the expression:
        drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
      In an equation for ‘it’:
          it = drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))

This way we can easily encode phase invariants in the types.

Remarks

This post has been motivated by a discussion on Reddit about AST transformations: each phase of a compiler transforms an AST into another and there is a tension between using a different AST data type per phase (best approach if we consider phases independently) or a generic AST data type that can be used by every phase (best approach to easily compose/interleave phases).

The usual answer (as used by GHC) is taken from the “Trees that grow” paper and consists in using an ADT as a generic AST and to parameterize it with type families. I don’t really like it and I think that using EADTs instead could probably avoid the use of AST specific type-families and hence could make the AST more generic (easier to add phases, etc.).

I have used recursion schemes in this post without really introducing them. Patrick Thomson wrote a very good series of posts about them that I highly recommend.

Please let me know if you do some fun or interesting things with EADTs!

Sources can be found here: https://github.com/haskus/haskus-utils in Haskus.Utils.VariantF.

ADDENDUM

Following the discussion on Reddit about this post, I have implemented the speculated “project” function from the Compositional data types paper (Future Work section) with the proposed type. I have called it popEADT and it allows us to perform complete case analysis (with its friends popVariantF and variantFToValue), e.g.:

showMulAddValADT :: MulAddValADT -> String
showMulAddValADT e = case popEADT e of
   Right (ValF u) -> show u
   Left w         -> case popVariantF w of
      Right (AddF u v) ->  "(" ++ showMulAddValADT u ++ " + " ++ showMulAddValADT v ++ ")"
      Left z           -> case variantFToValue z of
         MulF u v ->  "(" ++ showMulAddValADT u ++ " * " ++ showMulAddValADT v ++ ")"

> v = Mul (Add (Val 2) (Val 3)) (Val 5) :: MulAddValADT 
> showMulAddValADT v
"((2 + 3) * 5)"

Notice that contrary to the previous showAddValADT example in the post, there is no need for any extra wildcard match this time because GHC detects that the pattern-matching is complete.