Family Values

I wrote a big thread on the company Slack to compare type families: open vs closed vs associated. I also ended up discussing data families, as well, since they are a good complement to type families. I’ll probably edit this further and include it in my book, Production Haskell, but here’s the Slack transcript for now:

An associated type family is an open type family, but with the requirement that it be defined for any type that’s also an instance of the type class. Consider the mono-traversable library. We have:

type family Element o :: Type

class MonoFunctor o where
omap :: (Element o -> Element o) -> o -> o

class MonoFoldable o where
ofoldMap :: Monoid m => (Element o -> m) -> o -> m

class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where
otraverse :: (Applicative f) => (Element o -> f (Element o)) -> o -> f o


If we wanted to put Element as an associated type, like:

class MonoFunctor o where
type Element o :: Type
omap :: (Element o -> Element o) -> o -> o


Then we’d be a bit stuck - we need to require class (MonoFunctor o) => MonoFoldable o. This is a stronger requirement than regular Foldable, and that’s a problem. Or we define a separate type family on MonoFoldable. But then - what do we use with MonoTraversable?

If we’re desperate to avoid open type families, then we can work around this by having a dummy class that only carries the type family.

class HasElement o where
type Element o :: Type

class (HasElement o) => MonoFunctor o where ...
class (HasElement o) => MonoFoldable o where ...
class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where ...


But “a dummy class that only has a type family” seems obviously worse to me than “a type family” :shrug:

The question: “Should I use an open type family or a closed type family?” has an analog to simpler language features: type classes and sum types.

If you want a closed set, you use a sum type. If you want an open set, you use a type class. So if you’re familiar with the trade-offs there, the trade-offs with open/closed type familes are easier to evaluate

A closed set means you can be exhaustive - “every case is handled.” If you’re pattern matching on a datakind, like type family Foo (x :: Bool), then you can know that handling Foo 'True and Foo 'False that you’ve handled all cases. You don’t have to worry that some user is going to add a case and blow things up. (Well, you kinda do, because even closed type families aren’t actually closed, due to stuckness semantics, but, uhhhhh, that’s a bit of very advanced trickery, talk to csongor if you’re curious)

An open set is a way of allowing easy extensibility. So you’re going to accept something of kind Type or possibly a polymorphic kind variable to allow people to define their own types, and their own instances of these types. For example, if I want to associate a type with a string, I can write:

type family TypeString (sym :: Symbol) :: Type

type instance TypeString "Int" = Int
type instance TypeString "Char" = Char


And that lets me run programs at the type level, that end users can extend. Much like you can write a type class and end users can extend your functionality.

But ultimately you need to do something at the value level. Which means you need to take some type information and translate it to the value level. This is precisely what type classes do - they are morally “a function from a type to a value.” We can write a super basic function, like:

typeStringProxy :: Proxy sym -> Proxy (TypeString sm)
typeStringProxy _ = Proxy


But this is still not useful without further classes. The Default class assigns a special value to any type, and we could do something like this:

typeStringDefault :: forall sm. Default (TypeString sm) => Proxy sm -> TypeString sm
typeStringDefault = def @(TypeString sm)


Since associating a type class and an open type family is so common, it’s almost always better to use an associated type unless you know that the type family is going to be shared across multiple type classes.

“So how do you associate a closed type family with values?” That’s a great question. We can do the same trick with Proxy functions:

type family Closed (a :: Bool) where
Closed 'True = Int
Closed 'False = Char

closed :: Proxy b -> Proxy (Closed b)
closed _ = Proxy


But, until we know what b is, we can’t figure out what Closed b is. To pattern match on a type, we need a type class.

class RunClosed (b :: Bool) where
runClosed :: Proxy b -> Closed b

instance RunClosed 'True where
runClosed _ = 3

instance RunClosed 'False where
runClosed _ = 'a'


I’m going to detour a bit here and mention data families. A data family is like a type family, but instead of allowing you to refer to any type, you have to specify the constructors inline. To take an example from persistent,

data family Key a

newtype instance Key User = UserKey { unUserKey :: UUID }
type UserId = Key User

newtype instance Key Organization = OrganizationKey { unOrganizationKey :: UUID }
type OrganizationId = Key Organization


An advantage of this is that, since you specify the constructors, you can know the type of Key a by knowing the constructor in use - the value specifies the type. OrganizationKey :: UUID -> Key Organization.

It looks a lot like an “open type family,” and in fact is completely analogous. But we don’t call them “open data families,” even though that’s an appropriate name for it. It should make you wonder - is there such a thing as a closed data family?

The nice thing about an “open data family” is that you can learn about types by inspecting values - by knowing a value (like OrganizationKey uuid), I can work ‘backwards’ and learn that I have an Key Organization. But, I can’t write a case expression over all Key a - it’s open! and case only works on closed things. So this code does not work:

whatKey :: Key a -> Maybe UUID
whatKey k = case k of
UserKey uuid -> Just uuid
OrganizationKey uuid -> Just uuid
_ -> Nothing


Indeed, we need a type class to allow us to write get :: Key a -> SqlPersistT m (Maybe a).

A GADT - as a closed data family - allows us to work from a value to a type, and since it is exhaustive, we can write case expressions on them.

data Key a where
UserKey :: { unUserKey :: UUID } -> Key User
OrganizationKey :: { unOrganizationKey :: UUID } -> Key Organization


If I have this structure, then I can actually write get without a type class.

get :: Key a -> SqlPersistT IO (Maybe a)
get k = case k of
UserKey uuid -> do
[userName, userAge] <- rawSql "SELECT * FROM users WHERE id = ?" [toPersistValue uuid]
pure User {..}
OrganizationKey uuid -> ...


A GADT is ‘basically’ a closed type family that gives you constructor tags for applying that type family,. If we look at Closed, we can inline this:

type family ClosedTy (b :: Bool) where
ClosedTy True = Int
ClosedTy False = Char

data ClosedData (a :: Type) where
ClosedData :: Proxy b -> ClosedData (ClosedTy b)

-- inlining:
data Closed (a :: Type) where
ClosedTrue :: Proxy 'True -> Closed Int
ClosedFalse :: Proxy 'False -> Closed Char


When we case on a Closed value, we get:

runClosed :: Closed a -> a
runClosed closed =
case closed of
ClosedTrue (Proxy :: Proxy 'True) -> 3
ClosedFalse (Proxy :: Proxy 'False) -> 'a'


I’m going to conclude now, with this distillation:

• Open type family + type class = extensible, open programming, but no exhaustivity.
• Closed type family + GADT + functions = exhaustive handling of types, but not extensible
• An open type family + a GADT isn’t much fun.
• A closed type family + a type class isn’t much fun