Tags: Haskell | September 12, 2024 |
Yesterday on Reddit, someone asked the following question:
What exactly is the point of recursion schemes?
[..]
I understand it’s usefulness from an academic perspective, understanding the similarities between different types of recursive algorithms and their common patterns, but in practice, it just seems kind of… pointless?
[..]
I’m not not ripping on this particular library or it’s author at all, I think it’s worthwhile to explore topics like this, but as anything other than a mental exercise I struggle to see what the benefit is in practice. Personally I don’t understand what the aversion is to explicit recursion when the algorithm you’re writing calls for it. For other complicated ideas born out of Haskell like lenses I can immediately see the utility that outweighs any added complexity. Am I missing something here?
I thought the same before working on variants and in particular on EADTs. This post is my answer to the question for the posterity and so that I can easily find it back. It’s a revision of my answer on Reddit.
I’ve found a particularly compelling use case for them in my variant package. I’ll try to explain but it’s a bit long to setup…
The idea of variant is to compose a sum type from other types like this:
At some point we may want to use variants to encode ASTs that can get or lose constructors. E.g.
data Let b e = Let ...
data Abs b e = Abs ...
data App e = App ...
data Var b = Var ...
type LC b e = V [Abs b e, App e, Var b]
type LCWithLet b e = V [Abs b e, App e, Var b, Let b e]
However e
in LC b e
/LCWithLet b e
should really be LC b e
/LCWithLet b e
(respectively): e
is the type of an expression in our AST. But we can’t encode a recursive type like this with simple type synonyms: we need to introduce a fixpoint data type.
newtype Fix f = Fix (f (Fix f))
type LCF b e = V [Abs b e, App e, Var b]
type LCWithLetF b e = V [Abs b e, App e, Var b, Let b e]
-- our recursive variant-based ASTs
type LC b = Fix (LCF b)
type LCWithLet b = Fix (LCWithLetF b)
Now to avoid this boilerplate, in my package there is:
-- same as (V xs) but takes an additional argument
-- 'e' applied to every types in the list
newtype VariantF (xs :: [t -> Type]) (e :: t)
= VariantF (V (ApplyAll e xs))
-- fixpoint datatype specialized for variants
newtype EADT fs
= EADT (VariantF fs (EADT fs))
It means that we can define a recursive ADT based on variants (i.e. an EADT, for Extensible ADT) like this:
-- every type embedded in an EADT requires the `e`
-- argument (ADT type), even if it doesn't use it.
-- So rewrite "data Var b = Var ..." into:
data Var b e = Var ...
type LC b = EADT [Abs b, App, Var b]
type LCWithLet b = EADT [Abs b, App, Var b, Let b]
Now suppose we want to write some code to collect the free variables. We can use explicit recursion and write two functions for the two ASTs. But it would be duplicating a lot of code (especially if we have even more ASTs). This is (finally!) where recursion schemes are useful!
It’s easy to add functor instances to the datatypes embedded in the EADTs: just derive them:
data Let b e = Let ... deriving (Functor)
data Abs b e = Abs ... deriving (Functor)
data App e = App ... deriving (Functor)
data Var b e = Var ... deriving (Functor)
VariantF
also has a Functor
instance. It’s Functors all the way down!
So we can define a generic function for all those functors using a typeclass like this:
class FreeVars b f where
-- return free variables. Use this with a bottom-up traversal (catamorphism)
freeVarsF :: f (Set b) -> Set b
-- see below for the instances
And then we finally use the catamorphism recursion-scheme (renamed to bottomUp
) to write our generic freeVars
function:
freeVars :: forall b xs. BottomUpF (FreeVars b) xs => EADT xs -> Set b
freeVars xs = bottomUp (toBottomUp @(FreeVars b) freeVarsF) xs
-- "bottomUp" is just "cata" with a less cryptic name
A fully working example:
#!/usr/bin/env cabal
{- cabal:
build-depends: base, containers, variant
-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
module Main where
import Data.Variant.EADT
import Data.Variant.EADT.TH
import Data.Set (Set)
import qualified Data.Set as Set
-- Definitions of the EADTs
data VarF b e = VarF b deriving (Functor)
data AbsF b e = AbsF b e deriving (Functor)
data AppF e = AppF e e deriving (Functor)
data LetF b e = LetF b e e deriving (Functor)
eadtPattern 'VarF "Var"
eadtPattern 'AbsF "Abs"
eadtPattern 'AppF "App"
eadtPattern 'LetF "Let"
type LC b = EADT [VarF b, AbsF b, AppF]
type LCWithLet b = EADT [VarF b, AbsF b, AppF, LetF b]
-- Generic traversal for FreeVars
class FreeVars b f where
-- | Compute free variables set
freeVarsF :: f (Set b) -> Set b
instance FreeVars b (VarF b) where
freeVarsF (VarF b) = Set.singleton b
instance Eq b => FreeVars b (AbsF b) where
freeVarsF (AbsF b vs) = Set.filter (/= b) vs
instance (Eq b, Ord b) => FreeVars b AppF where
freeVarsF (AppF vs1 vs2) = Set.union vs1 vs2
instance (Eq b, Ord b) => FreeVars b (LetF b) where
freeVarsF (LetF b vs1 vs2) = Set.filter (/= b) (Set.union vs1 vs2)
freeVars :: forall b xs. BottomUpF (FreeVars b) xs => EADT xs -> Set b
freeVars xs = bottomUp (toBottomUp @(FreeVars b) freeVarsF) xs
-- "bottomUp" is just "cata"
-- Examples
type Name = String
example1 :: LC Name
example1 = Abs "x" (Var "k" `App` (Var "(+)" `App` Var "foo" `App` Var "x"))
example2 :: LCWithLet Name
example2 = Let "foo" (Var "(+)" `App` Var "a" `App` Var "b") $
Abs "x" (Var "k" `App` (Var "(+)" `App` Var "foo" `App` Var "x"))
example3 :: LCWithLet Name
example3 = Let "foo" (Var "(+)" `App` Var "a" `App` Var "b") $
liftEADT example1
main :: IO ()
main = do
-- call `freeVars` on different AST types
print (freeVars @Name example1)
print (freeVars @Name example2)
print (freeVars @Name example3)
When executed it prints this expected result:
> ./Main.hs
fromList ["(+)","foo","k"]
fromList ["(+)","a","b","k"]
fromList ["(+)","a","b","k"]