Tags: Haskell, EADT, Variant, Expression problem | May 22, 2018 |
Introduction
The expression problem (1998)
In 1998, Philip Wadler defined the Expression Problem as follow:
The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety
In basic Haskell it is straightforward to add new functions over a data type. Suppose we have the following arithmetic expression data type:
We can independently add an evaluator function, potentially in another module:
However if we want to add a new case to the data type (say support for multiplication), we have to modify both the data type definition and the functions using it:
Data types à la carte (2008)
Ten years later (in 2008), Wouter Swierstra described a technique to handle this in his well-known Data types à la carte paper. The first idea is to define data constructors independently and to use a type parameter to leave open the data type they are part of.
-- Independent data constructors. Parameter `e` represents the data type they
-- will be part of. It is required even if it is not used in the right hand
-- side.
data Val e = Val Int
data Add e = Add e e
Defining a new independent constructor is easy:
The second idea is to use a combining data type:
It is similar to Either
except that it passes the same additional type parameter to both f
and g
type constructors. It can be used to compose independent data constructors without creating a new data type:
ExprF
has kind Type -> Type
and its type parameter is used as the e
parameter of the independent data constructors. We can set it to arbitrary types such as Int
to build valid values:
However the main use of this parameter should be to indicate the type of the expression data type we want to build, say Expr
. Hence we would like to write something like this:
type Expr = ExprF Expr
>error:
Cycle in type synonym declarations:
<interactive>:12:1-22: type Expr = ExprF Expr
Oops, we can’t build this cyclic (infinite) type. This leads us to the third idea: use another data type to handle the recursive nature of the expression type:
We can abstract on it to use the same data type for different expression types:
-- `Fix` type as defined in Data.Functor.Foldable for instance
newtype Fix f = Fix (f (Fix f))
type Expr = Fix ExprF
Summary
We use 3 different kinds of data types:
- Constructor data types:
Val
,Add
,Mul
… - Combinator data type:
:+:
- Recursivity handling data type:
Fix
By using these different data types we have untangled the construction of ADTs (algebraic data types) and we can freely add new constructor data types and mix them into different algebraic data types.
Operations on these algebraic data types can be defined independently by using type-classes and recursion schemes.
EADT - Extensible ADT (2018)
In this post I’d like to slightly revisit and enhance the approach by using modern Haskell and in particular another combinator data type and pattern synonyms.
A different combinator data type
Variant (xs :: [*])
is a sum type that contains a value whose type can be any type in the xs
list (see this previous post). We can use it to define the following VariantF
type:
newtype VariantF (xs :: [* -> *]) e = VariantF (Variant (ApplyAll e xs))
-- ApplyAll e '[f,g,h] ==> '[f e, g e, h e]
instance Functor (VariantF xs) where ....
We will use this type as a combinator data type. Similarly to the :+:
combinator data type presented in the introduction, VariantF
passes its e
parameter to all of its “member” types and has an instance of the Functor
class.
Now instead of writing f :+: g :+: h :+: i
to combine constructor data types to form an ADT we can write VariantF '[f,g,h,i]
.
First benefit: just like using Variant
is more efficient – O(1) memory usage and (de)construction – than using a nest of Either
, using VariantF
is more efficient than using a nest of :+:
.
A generic EADT type
Let’s define an EADT type synonym as follow:
And a pattern synonym to easily build EADT values:
pattern VF :: forall e f cs.
( e ~ EADT cs -- allow easy use of TypeApplication to set the EADT type
, f :<: cs -- constraint synonym ensuring `f` is in `cs`
) => f (EADT cs) -> EADT cs
pattern VF x = Fix (VariantF (V' x)) -- `V'` match a variant value (without
-- checking the membership: we already
-- do it with :<:)
Second benefit: with modern Haskell we can define bidirectional pattern synonyms that make creation and matching of EADT values much more nicer.
Examples using EADT
Defining an EADT and building some values
Let’s define some constructors:
{-# LANGUAGE DeriveFunctor #-}
data ValF e = ValF Int deriving (Functor)
data AddF e = AddF e e deriving (Functor)
And an algebraic data type:
We can build EADT values in the following (equivalent) ways:
But it’s nicer to define some additional pattern synonyms:
pattern Val :: ValF :<: xs => Int -> EADT xs
pattern Val a = VF (ValF a)
pattern Add :: AddF :<: xs => EADT xs -> EADT xs -> EADT xs
pattern Add a b = VF (AddF a b)
x :: AddValADT
x = Val 10
xx = Add (Val 10) (Val 5) :: AddValADT
We can pattern-match EADT values with VF
or with pattern synonyms too:
showAddValADT :: AddValADT -> String
showAddValADT e = case e of
-- match with VF
VF (ValF u) -> show u
-- match with additional pattern synonym
Add u v -> "(" ++ showAddValADT u ++ " + " ++ showAddValADT v ++ ")"
_ -> error "showAddValADT: unhandled AddValADT constructor"
> showAddValADT (Add (Val 10) (Val 3))
"(10 + 3)"
Note that the wildcard match is necessary, otherwise the pattern-match is reported as non exhaustive by GHC (warning). This is one shortcoming due to the use of the Variant
type.
Anyway it isn’t really an issue in this case because showAddValADT
shouldn’t be written like this to be extensible.
Extending an EADT and converting values from one EADT to another
Suppose we add a constructor:
We can define a new EADT:
We only need to change the type to declare values of the new EADT:
We can explicitly add constructors to an EADT:
> x' = Val 10 :: EADT '[ValF,AddF]
> :t appendEADT @'[MulF] x'
appendEADT @'[MulF] x' :: EADT '[ValF, AddF, MulF]
We can also lift values from one EADT to the other as follow:
Thanks to the type-checking, the opposite direction is not allowed:
> liftEADT xx' :: AddValADT
<interactive>:83:1: error:
• '[MulF]
is not a subset of
'[ValF, AddF]
• In the expression: liftEADT xx' :: AddValADT
In an equation for ‘it’: it = liftEADT xx' :: AddValADT
Defining operations on EADTs
Extensible operations on EADTS (i.e. that need to be extended when we add new constructors) are defined with type-classes.
Type classes and recursion schemes
We can use recursion schemes, for example:
import Data.Functor.Foldable
class ShowEADT (f :: * -> *) where
showEADT' :: f String -> String
-- boilerplate instances for the combinator data type
instance ShowEADT (VariantF '[]) where
showEADT' = undefined
instance (ShowEADT x, ShowEADT (VariantF xs)) => ShowEADT (VariantF (x ': xs)) where
showEADT' v = case popVariantFHead v of
Right u -> showEADT' u
Left w -> showEADT' w
-- instances for constructors data types
instance ShowEADT ValF where
showEADT' (ValF i) = show i
instance ShowEADT AddF where
showEADT' (AddF u v) = "(" ++ u ++ " + " ++ v ++ ")" -- no recursive call
instance ShowEADT MulF where
showEADT' (MulF u v) = "(" ++ u ++ " * " ++ v ++ ")" -- no recursive call
-- `cata` (for catamorphism) handles the recursion/fold
> cata showEADT' (Add (Val 2) (Add (Val 10) (Val 3)) :: AddValADT)
"(2 + (10 + 3))"
> cata showEADT' (Add (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 + (10 + 3))"
> cata showEADT' (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 * (10 + 3))"
showEADT :: (ShowEADT (Base t), Recursive t) => t -> String -- type inferred by GHC
showEADT = cata showEADT'
> showEADT (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
"(2 * (10 + 3))"
Type classes and explicit recursion
We can also use explicit recursion, for example:
class Eval e where
-- evaluate an expression
eval :: e -> Int
-- same boilerplate
instance Eval (VariantF '[] e) where
eval = undefined
instance (Eval (x e), Eval (VariantF xs e)) => Eval (VariantF (x ': xs) e) where
eval v = case popVariantFHead v of
Right u -> eval u
Left w -> eval w
-- evaluating an EADT value
type EvalAll xs = Eval (VariantF xs (EADT xs))
evalEADT :: EvalAll xs => EADT xs -> Int
evalEADT = eval . unfix
-- instances for constructors
instance Eval (ValF e) where
eval (ValF i) = i
instance EvalAll xs => Eval (AddF (EADT xs)) where
eval (AddF u v) = evalEADT u + evalEADT v -- explicit recursion
instance EvalAll xs => Eval (MulF (EADT xs)) where
eval (MulF u v) = evalEADT u * evalEADT v -- explicit recursion
> evalEADT (Add (Val 2) (Add (Val 10) (Val 3)) :: AddValADT)
15
> evalEADT (Add (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
15
> evalEADT (Mul (Val 2) (Add (Val 10) (Val 3)) :: MulAddValADT)
26
Transformation example
Let’s rewrite Swiestra’s example to apply the distributivity law. The operation is not extensible, hence we don’t have to use type classes to define it. It works on any EADT as long as the constraints are fulfilled: supporting Add and Mul constructors.
import Control.Arrow
-- distribute multiplication over addition if it matches
distr :: (AddF :<: f, MulF :<: f) => EADT f -> Maybe (EADT f)
distr (Mul a (Add c d)) = Just (Add (Mul a c) (Mul a d))
distr _ = Nothing
-- bottom up traversal that performs an additional bottom up traversal in
-- the transformed sub-tree when a transformation occurs.
bottomUpFixed :: Functor f => (Fix f -> Maybe (Fix f)) -> Fix f -> Fix f
bottomUpFixed f = unfix >>> fmap (bottomUpFixed f) >>> Fix >>> f'
where
f' u = case f u of
Nothing -> u
Just v -> bottomUpFixed f v
-- example
expr :: MulAddValADT
expr = Mul (Val 2) (Add (Val 10) (Mul (Val 3) (Add (Val 5) (Val 20))))
> showEADT expr
"(2 * (10 + (3 * (5 + 20))))"
> showEADT (bottomUpFixed distr expr)
"((2 * 10) + ((2 * (3 * 5)) + (2 * (3 * 20))))"
Constructor removal example (~desugaring)
Let’s show how to transform each multiplication into an addition while removing the MulF
constructor from the EADT in a generic type-safe manner:
class MulToAdd (f :: * -> *) ys where
mulToAdd :: f (EADT ys) -> EADT ys
-- boilerplate
instance MulToAdd (VariantF '[]) ys where
mulToAdd = undefined
instance (MulToAdd x ys, MulToAdd (VariantF xs) ys) => MulToAdd (VariantF (x ': xs)) ys where
mulToAdd v = case popVariantFHead v of
Right u -> mulToAdd u
Left w -> mulToAdd w
-- constructor instances
instance{-# OVERLAPPING #-} AddF :<: ys => MulToAdd MulF ys where
mulToAdd (MulF u v) = Add u v -- Mul becomes Add
-- (it makes no sense but it's just an example)
instance {-# OVERLAPPABLE #-} f :<: ys => MulToAdd f ys where
mulToAdd = VF -- the other constructors are kept as-is
-- example
expr :: MulAddValADT
expr = Mul (Val 2) (Add (Val 10) (Mul (Val 3) (Add (Val 5) (Val 20))))
> showEADT (cata mulToAdd expr :: AddValADT)
"(2 + (10 + (3 + (5 + 20))))"
> showEADT (cata mulToAdd (bottomUpFixed distr expr) :: AddValADT)
"((2 + 10) + ((2 + (3 + 5)) + (2 + (3 + 20))))"
This kind of transformation can be used to transform a big language into a smaller one (e.g. Haskell into Core).
Annotation examples (parameter example)
Let’s add a constructor to support annotations in the AST. We let the annotation type parametric.
-- | Annotate `e` with a value having type `a`
data AnnotF a e = AnnotF a e deriving (Functor)
pattern Annot :: AnnotF a :<: xs => a -> EADT xs -> EADT xs
pattern Annot a e = VF (AnnotF a e)
instance Show a => ShowEADT (AnnotF a) where
showEADT' (AnnotF a e) = "{" ++ show a ++ "} " ++ e
We can easily define an EADT supporting parametric annotations, for instance:
Now we can build an example of an annotated expression where the annotation type is String
:
annotExpr :: AnnotExp String
annotExpr = Mul (Val 8) (Annot "An addition" (Add (Val 10) (Val 5)))
> showEADT annotExpr
"(8 * {\"An addition\"} (10 + 5))"
We can easily prune annotations from the expression similarly to what we did above to transform multiplications into additions:
class RemoveAnnot (f :: * -> *) ys where
removeAnnot' :: f (EADT ys) -> EADT ys
instance RemoveAnnot (VariantF '[]) ys where
removeAnnot' = undefined
instance
( RemoveAnnot x ys
, RemoveAnnot (VariantF xs) ys
) => RemoveAnnot (VariantF (x ': xs)) ys
where
removeAnnot' v = case popVariantFHead v of
Right u -> removeAnnot' u
Left w -> removeAnnot' w
instance{-# OVERLAPPING #-} RemoveAnnot (AnnotF a) ys where
removeAnnot' (AnnotF _a e) = e
instance {-# OVERLAPPABLE #-} f :<: ys => RemoveAnnot f ys where
removeAnnot' = VF
> showEADT (cata removeAnnot' annotExpr :: MulAddValADT)
"(8 * (10 + 5))"
As it is a bit sad to have to specify the type of the resulting EADT without the annotations, we can define the following helper function:
-- remove `AnnotF a` (for all `a`) in `xs`
type family FilterAnnot xs where
FilterAnnot '[] = '[]
FilterAnnot (AnnotF a ': xs) = FilterAnnot xs
FilterAnnot (x ': xs) = x ': FilterAnnot xs
removeAnnot ::
( Functor (VariantF xs)
, RemoveAnnot (VariantF xs) (FilterAnnot xs)
) => EADT xs -> EADT (FilterAnnot xs)
removeAnnot = cata removeAnnot'
> showEADT (removeAnnot annotExpr)
"(8 * (10 + 5))"
Type-checking example (paramorphism example)
First let’s add support for Float values:
data FloatValF e = FloatValF Float deriving (Functor)
pattern FloatVal :: FloatValF :<: xs => Float -> EADT xs
pattern FloatVal a = VF (FloatValF a)
instance ShowEADT FloatValF where
showEADT' (FloatValF i) = show i
Now we can write invalid expressions mixing both Int and Float values:
Let’s write a type-checker to detect these cases. For a given expression, our type-checker will either return its valid type (Int or Float) or an erroneous type with an explanation message:
data Typ
= TInt
| TFloat
| TError String -- the type of an invalid expression with some explanation
deriving (Show,Eq)
class TypeCheck (f :: * -> *) where
typeCheck' :: f Typ -> Typ
instance TypeCheck (VariantF '[]) where
typeCheck' = undefined
instance (TypeCheck x, TypeCheck (VariantF xs)) => TypeCheck (VariantF (x ': xs)) where
typeCheck' v = case popVariantFHead v of
Right u -> typeCheck' u
Left w -> typeCheck' w
instance TypeCheck ValF where
typeCheck' _ = TInt
instance TypeCheck FloatValF where
typeCheck' _ = TFloat
instance TypeCheck AddF where
typeCheck' (AddF t1 t2)
| t1 == t2 = t1
| TError _ <- t1 = t1 -- propagate errors
| TError _ <- t2 = t2
| otherwise = TError $ "can't match " ++ show t1 ++ " with " ++ show t2
Let’s try it:
> cata typeCheck' (Add (Val 10) (Val 5) :: EADT '[ValF,FloatValF,AddF])
TInt
> cata typeCheck' (Add (Val 10) (FloatVal 5) :: EADT '[ValF,FloatValF,AddF])
TError "can't match TInt with TFloat"
The error message is not very useful, let’s write a better type-checker that indicates which expressions are at fault:
class TypeCheck2 (f :: * -> *) ys where
typeCheck2' :: f (EADT ys, Typ) -> Typ
instance TypeCheck2 (VariantF '[]) ys where
typeCheck2' = undefined
instance (TypeCheck2 x ys, TypeCheck2 (VariantF xs) ys) => TypeCheck2 (VariantF (x ': xs)) ys where
typeCheck2' v = case popVariantFHead v of
Right u -> typeCheck2' u
Left w -> typeCheck2' w
instance TypeCheck2 ValF ys where
typeCheck2' _ = TInt
instance TypeCheck2 FloatValF ys where
typeCheck2' _ = TFloat
instance (AddF :<: ys, ShowEADT (VariantF ys), Functor (VariantF ys)) => TypeCheck2 AddF ys where
typeCheck2' (AddF (u,t1) (v,t2))
| t1 == t2 = t1
| TError _ <- t1 = t1 -- propagate errors
| TError _ <- t2 = t2
| otherwise = TError $ "can't add `" ++ showEADT u ++ "` whose type is " ++ show t1 ++
" with `" ++ showEADT v ++ "` whose type is " ++ show t2
This time instead of using cata
we will use another recursion scheme called para
(for paramorphism) that also passes expression values (u
and v
in AddF
instance):
> para typeCheck2' (Add (Val 10) (Val 5) :: EADT '[ValF,FloatValF,AddF])
TInt
> para typeCheck2' (Add (Val 10) (FloatVal 5) :: EADT '[ValF,FloatValF,AddF])
TError "can't add `10` whose type is TInt with `5.0` whose type is TFloat"
Let’s check that we can easily extend our expression language and our type-checker without rewriting anything by adding support for type ascriptions (i.e., types specified by the programmer):
data TypedF e = TypedF Typ e deriving (Functor)
pattern Typed :: TypedF :<: xs => Typ -> EADT xs -> EADT xs
pattern Typed t e = VF (TypedF t e)
instance ShowEADT TypedF where
showEADT' (TypedF t e) = "(" ++ e ++ " :: " ++ show t ++ ")"
instance (Functor (VariantF ys), ShowEADT (VariantF ys)) => TypeCheck2 TypedF ys where
typeCheck2' (TypedF t (e,te))
| t == te = t
| TError _ <- t = t
| TError _ <- te = te
| otherwise = TError $ "wrong type ascription " ++ show t ++
" for expression `" ++ showEADT e ++
"` with inferred type " ++ show te
Now let’s check that asserting some types is correctly supported:
> type E = EADT '[ValF,FloatValF,AddF,TypedF]
> para typeCheck2' (Add (Val 10) (Typed TInt $ Val 5) :: E)
TInt
> para typeCheck2' (Add (Val 10) (Typed TFloat $ Val 5) :: E)
TError "wrong type ascription TFloat for expression `5` with inferred type TInt"
> para typeCheck2' (Typed TInt $ Add (Val 10) (Val 5) :: E)
TInt
> para typeCheck2' (Typed TFloat $ Add (Val 10) (Val 5) :: E)
TError "wrong type ascription TFloat for expression `(10 + 5)` with inferred type TInt"
Finally let’s write another type-checker that adds inferred types as type-ascriptions to every node of the expression.
class TypeCheck3 (f :: * -> *) ys where
typeCheck3' :: f (TypedF (EADT ys)) -> TypedF (EADT ys)
instance TypeCheck3 (VariantF '[]) ys where
typeCheck3' = undefined
instance (TypeCheck3 x ys, TypeCheck3 (VariantF xs) ys) => TypeCheck3 (VariantF (x ': xs)) ys where
typeCheck3' v = case popVariantFHead v of
Right u -> typeCheck3' u
Left w -> typeCheck3' w
instance (TypedF :<: ys, ValF :<: ys) => TypeCheck3 ValF ys where
typeCheck3' (ValF i) = TypedF TInt (Val i)
instance (TypedF :<: ys, FloatValF :<: ys) => TypeCheck3 FloatValF ys where
typeCheck3' (FloatValF f) = TypedF TFloat (FloatVal f)
instance
( TypedF :<: ys
, AddF :<: ys
, ShowEADT (VariantF ys)
, Functor (VariantF ys)
) => TypeCheck3 AddF ys
where
typeCheck3' (AddF u@(TypedF t1 _) v@(TypedF t2 _))
| t1 == t2 = TypedF t1 addVal
| TError _ <- t1 = TypedF (TError "") addVal -- don't propagate errors
| TError _ <- t2 = TypedF (TError "") addVal
| otherwise = TypedF (TError $ "can't add " ++ show t1 ++ " with " ++ show t2) addVal
where
addVal = Add (VF u) (VF v)
-- similar for MulF...
instance
( TypedF :<: ys
, Functor (VariantF ys)
, ShowEADT (VariantF ys)
) => TypeCheck3 TypedF ys
where
typeCheck3' (TypedF t u@(TypedF te _))
| t == te = u -- Remove redundant ascription
| TError _ <- t = TypedF t (VF u) -- keep erroring ascription as-is
| TError _ <- te = TypedF (TError "") (VF u) -- don't propagate errors
| otherwise = TypedF (TError $ "wrong type ascription" ++
" (inferred type: " ++ show te ++ ")") (Typed t (VF u))
typeCheck3 :: forall xs.
( TypeCheck3 (VariantF xs) xs
, Functor (VariantF xs)
, TypedF :<: xs
) => EADT xs -> EADT xs
typeCheck3 e = VF (cata typeCheck3' e :: TypedF (EADT xs))
Let’s try it:
> type E = EADT '[ValF,FloatValF,AddF,MulF,TypedF]
> showEADT (typeCheck3 (Add (Val 10) (Val 5) :: E))
"(((10 :: TInt) + (5 :: TInt)) :: TInt)"
> showEADT (typeCheck3 (Add (Val 10) (FloatVal 5) :: E))
"(((10 :: TInt) + (5.0 :: TFloat)) :: TError \"can't add TInt with TFloat\")"
> showEADT (typeCheck3 (Add (Add (Mul (Val 2) (Val 3)) (FloatVal 10)) (Val 5)) :: E)
"(((((((2 :: TInt) * (3 :: TInt)) :: TInt) + (10.0 :: TFloat))
:: TError \"can't add TInt with TFloat\") + (5 :: TInt)) :: TError \"\")"
It’s a bit hard to read, let’s add an helper:
import Data.Tree
class DisplayEADT (f :: * -> *) where
displayEADT' :: f (Tree String) -> Tree String
instance DisplayEADT (VariantF '[]) where
displayEADT' = undefined
instance (DisplayEADT x, DisplayEADT (VariantF xs)) => DisplayEADT (VariantF (x ': xs)) where
displayEADT' v = case popVariantFHead v of
Right u -> displayEADT' u
Left w -> displayEADT' w
instance DisplayEADT ValF where
displayEADT' (ValF i) = Node (show i) []
instance DisplayEADT FloatValF where
displayEADT' (FloatValF i) = Node (show i) []
instance DisplayEADT AddF where
displayEADT' (AddF u v) = Node "(+)" [u,v]
instance DisplayEADT MulF where
displayEADT' (MulF u v) = Node "(*)" [u,v]
instance DisplayEADT TypedF where
displayEADT' (TypedF t (Node l ts)) = Node (l ++ " :: " ++ show t) ts
drawEADT :: (Functor (VariantF xs), DisplayEADT (VariantF xs)) => EADT xs -> IO ()
drawEADT = putStr . drawTree . cata displayEADT'
And try it:
> drawEADT (typeCheck3 (Add (Add (Mul (Val 2) (Val 3)) (FloatVal 10)) (Val 5)) :: E)
(+) :: TError ""
|
+- (+) :: TError "can't add TInt with TFloat"
| |
| +- (*) :: TInt
| | |
| | +- 2 :: TInt
| | |
| | `- 3 :: TInt
| |
| `- 10.0 :: TFloat
|
`- 5 :: TInt
Composition example
Finally, we can write the following one-line example that strips annotations, adds support for type-ascriptions, performs type-checking and displays the resulting EADT:
annotExpr :: EADT '[ValF,AddF,MulF,AnnotF String]
annotExpr = Mul (Val 8) (Annot "An addition" (Add (Val 10) (Val 5)))
> drawEADT (typeCheck3 (appendEADT @'[TypedF] (removeAnnot annotExpr)))
(*) :: TInt
|
+- 8 :: TInt
|
`- (+) :: TInt
|
+- 10 :: TInt
|
`- 5 :: TInt
The compiler ensures that the EADT is valid at each phase. For instance if we forget to remove annotations, type-checking and drawing phases will complain at compile time:
> drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
<interactive>:367:1: error:
• No instance for (DisplayEADT (AnnotF String))
arising from a use of ‘drawEADT’
• In the expression:
drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
In an equation for ‘it’:
it = drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
<interactive>:367:11: error:
• No instance for (TypeCheck3
(AnnotF String) '[ValF, AddF, MulF, AnnotF String, TypedF])
arising from a use of ‘typeCheck3’
• In the first argument of ‘drawEADT’, namely
‘(typeCheck3 (appendEADT @'[TypedF] annotExpr))’
In the expression:
drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
In an equation for ‘it’:
it = drawEADT (typeCheck3 (appendEADT @'[TypedF] annotExpr))
This way we can easily encode phase invariants in the types.
Remarks
This post has been motivated by a discussion on Reddit about AST transformations: each phase of a compiler transforms an AST into another and there is a tension between using a different AST data type per phase (best approach if we consider phases independently) or a generic AST data type that can be used by every phase (best approach to easily compose/interleave phases).
The usual answer (as used by GHC) is taken from the “Trees that grow” paper and consists in using an ADT as a generic AST and to parameterize it with type families. I don’t really like it and I think that using EADTs instead could probably avoid the use of AST specific type-families and hence could make the AST more generic (easier to add phases, etc.).
I have used recursion schemes in this post without really introducing them. Patrick Thomson wrote a very good series of posts about them that I highly recommend.
Please let me know if you do some fun or interesting things with EADTs!
Sources can be found here: https://github.com/haskus/haskus-utils in Haskus.Utils.VariantF
.
ADDENDUM
Following the discussion on Reddit about this post, I have implemented the speculated “project” function from the Compositional data types paper (Future Work section) with the proposed type. I have called it popEADT
and it allows us to perform complete case analysis (with its friends popVariantF
and variantFToValue
), e.g.:
showMulAddValADT :: MulAddValADT -> String
showMulAddValADT e = case popEADT e of
Right (ValF u) -> show u
Left w -> case popVariantF w of
Right (AddF u v) -> "(" ++ showMulAddValADT u ++ " + " ++ showMulAddValADT v ++ ")"
Left z -> case variantFToValue z of
MulF u v -> "(" ++ showMulAddValADT u ++ " * " ++ showMulAddValADT v ++ ")"
> v = Mul (Add (Val 2) (Val 3)) (Val 5) :: MulAddValADT
> showMulAddValADT v
"((2 + 3) * 5)"
Notice that contrary to the previous showAddValADT
example in the post, there is no need for any extra wildcard match this time because GHC detects that the pattern-matching is complete.