GADTs: Dependent Types at Home (Haskell/OCaml edition)

👻

GhostlyGADT

★ Phantom Type Enjoyer

Posts: 1,432
Rep: 🌸 847
Joined: '22
Lang: GHC HEAD

#1 — OP Nov 4, 2025 · 02:17

hey everyone 🌙 i've been meaning to write this post for a while. been doing a deep dive into how far you can push GADTs in Haskell and OCaml to simulate dependent type patterns without jumping all the way to Idris/Agda, and i have Thoughts™

the classic example everyone starts with is a type-safe expression evaluator. in vanilla ADTs you get runtime type errors. with a GADT, the type checker itself enforces well-typedness:

HASKELL

{-# LANGUAGE GADTs #-}

data Expr a where
  Lit  :: Int              -> Expr Int
  Bool :: Bool             -> Expr Bool
  Add  :: Expr Int -> Expr Int -> Expr Int
  If   :: Expr Bool -> Expr a -> Expr a -> Expr a

eval :: Expr a -> a
eval (Lit n)    = n
eval (Bool b)   = b
eval (Add x y)  = eval x + eval y
eval (If c t e) = if eval c then eval t else eval e

-- This won't compile — the type checker rejects it!
-- bad = Add (Lit 3) (Bool True)

the key insight here — which i think a lot of people miss — is that Expr a is indexed by a *type*, and each constructor can specialize that index. this is what lets the compiler learn more type information when you descend into a pattern match branch.

same idea in OCaml (native since 4.00!):

OCAML

type _ expr =
  | Lit  : int                        -> int expr
  | Bool : bool                       -> bool expr
  | Add  : int expr * int expr       -> int expr
  | If   : bool expr * 'a expr * 'a expr -> 'a expr

let rec eval : type a. a expr -> a = function
  | Lit n      -> n
  | Bool b     -> b
  | Add (x, y) -> eval x + eval y
  | If (c,t,e) -> if eval c then eval t else eval e

note OCaml requires the type a. polymorphic recursion annotation — without it the type checker can't unify a across recursive calls. this trips up a lot of people new to OCaml GADTs.

so the thesis of this thread: GADTs let us encode some dependent type patterns at the cost of verbosity and manual singleton boilerplate. but where does the analogy break down? let's discuss 👇

🐾

PatternMatchPurrr

♦ Core Contributor

Posts: 3,017
Rep: 💜 2,103
Joined: '21
Lang: OCaml + Coq

#2 Nov 4, 2025 · 03:44

great writeup ghost 🐾 the evaluator example is nice but i want to jump straight to where GADTs really shine as "dependent types at home": length-indexed vectors. the classic.

HASKELL

{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

data Nat = Z | S Nat   -- promoted to kind by DataKinds

data Vec :: Nat -> * -> * where
  VNil  ::                    Vec Z a
  VCons :: a -> Vec n a -> Vec (S n) a

-- type-safe head: can only call on non-empty vecs!
vhead :: Vec (S n) a -> a
vhead (VCons x _) = x
-- vhead VNil would be a TYPE ERROR, not a runtime crash!

-- type-safe zip: lengths must match
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip VNil        VNil        = VNil
vzip (VCons x xs) (VCons y ys) = VCons (x, y) (vzip xs ys)

this is basically Idris' Vect n a but achieved through DataKinds promoting Nat to a kind. the length lives in the type, and the compiler knows at compile time that vhead is only callable on a non-empty vector.

in a true dependently typed language like Idris you write essentially the same syntax. the critical difference: in Idris, n is an actual value that can be computed at runtime and still appear in types. in Haskell, n :: Nat lives at the kind level — it's a phantom, erased before runtime.

🔬

FormalFelinae

✦ Proof Mechanist

Posts: 892
Rep: 💙 644
Joined: '23
Lang: Agda, Lean 4

#3 Nov 4, 2025 · 07:12

PatternMatchPurrr wrote:

in Idris, n is an actual value that can be computed at runtime and still appear in types. in Haskell, n :: Nat lives at the kind level — it's a phantom, erased before runtime.

yes! this is THE fundamental distinction and i think it deserves more elaboration 🔬

in Agda, here's the same Vec:

AGDA

data Vec (A : Set) : ℕ → Set where
  []  : Vec A 0
  _∷_ : ∀ {n} → A → Vec A n → Vec A (suc n)

-- We can compute the length INDEX from a runtime value
fromList : (xs : List A) → Vec A (length xs)
fromList []       = []
fromList (x ∷ xs) = x ∷ fromList xs

look at fromList. the return type depends on the value of the argument xs — specifically its length. Agda can compute length xs in the type! in Haskell you'd have to use singletons or type families and it gets... messy.

the singletons library attempts to bridge this gap for Haskell by generating runtime "witnesses" for type-level values, but it's a lot of boilerplate machinery. that's the price of dependent types at home: you're fighting the type system rather than working with it.

😼

TypeNihilistKat

⚡ Chaos Typist

Posts: 4,889
Rep: 🌙 3,211
Joined: '20
Lang: everything & nothing

#4 Nov 4, 2025 · 09:03

ok i'll be the skeptic in the room 😼

the slogan i keep coming back to is: "GADTs are dependent types, but dependent types are not GADTs." a GADT is just a parameterized algebraic data type where constructors can have different return types. that IS a limited form of dependency. but it's missing the full thing.

here's what you fundamentally cannot do in Haskell/OCaml with GADTs alone:

IDRIS

-- A function whose RETURN TYPE depends on a runtime value
replicate : (n : Nat) -> a -> Vect n a
replicate Z     _ = []
replicate (S k) x = x :: replicate k x

-- n is BOTH a runtime value AND a type index simultaneously
-- There's no distinction between "type level" and "value level"

in Haskell you'd need SNat n (a singleton), pass that explicitly, and suddenly you're writing twice as much code to achieve the same thing. OCaml has first-class modules which can help, but it's still not clean.

the two-level architecture (term level vs type level) is the original sin of Haskell's type system. GADTs let you peek across that wall but never truly demolish it.

that said: for most practical purposes, GADTs cover like 80% of the dependent type patterns you actually want. fight me.

👻

GhostlyGADT

★ Phantom Type Enjoyer

Posts: 1,432
Rep: 🌸 847
Joined: '22
Lang: GHC HEAD

#5 Nov 4, 2025 · 10:28

TypeNihilistKat wrote:

the two-level architecture (term level vs type level) is the original sin of Haskell's type system

harsh but fair. ok so let me show the singletons approach since i think it's actually a pretty beautiful hack even if it's verbose:

HASKELL

{-# LANGUAGE GADTs, DataKinds, ScopedTypeVariables, TypeFamilies #-}

-- A singleton: a runtime value that mirrors its type-level counterpart
data SNat :: Nat -> * where
  SZ ::            SNat Z
  SS :: SNat n -> SNat (S n)

-- Now we can replicate: pass the singleton as a runtime witness
vreplicate :: SNat n -> a -> Vec n a
vreplicate SZ     _ = VNil
vreplicate (SS n) x = VCons x (vreplicate n x)

-- Usage:
-- vreplicate (SS (SS SZ)) 'a'  →  VCons 'a' (VCons 'a' VNil) :: Vec (S (S Z)) Char

the singleton SNat n is the runtime witness for the type-level n. when you pattern match on SZ, GHC learns n ~ Z; when you match on SS n', GHC learns n ~ S n'. it's essentially manually implementing what Idris does automatically.

the singletons-th library generates all this boilerplate via Template Haskell. it works! but yeah, you're writing the proof twice: once in the type and once in the value. dependent types collapse that duplication entirely.

🐾

PatternMatchPurrr

♦ Core Contributor

Posts: 3,017
Rep: 💜 2,103
Joined: '21
Lang: OCaml + Coq

#6 Nov 4, 2025 · 12:55

let me show the OCaml side of the same trick. OCaml doesn't have DataKinds so we use phantom types with abstract modules or just raw GADT witnesses:

OCAML

(* Peano naturals as phantom types *)
type z = Z
type 'n s = S

type (_, _) vec =
  | Nil  :                             (z, _) vec
  | Cons : 'a * ('n, 'a) vec   ->  ('n s, 'a) vec

let head : type n a. (n s, a) vec -> a = function
  | Cons (x, _) -> x
(* Nil case is unreachable and the compiler KNOWS it *)

let zip : type n a b. (n, a) vec -> (n, b) vec -> (n, a * b) vec =
  fun v1 v2 -> match v1, v2 with
  | Nil, Nil               -> Nil
  | Cons (a, as_), Cons (b, bs) -> Cons ((a, b), zip as_ bs)

note: OCaml's exhaustiveness checker is aware of GADT constraints. in head, the Nil branch is correctly identified as unreachable because the type index n s (a successor) can't unify with z. no wildcard pattern needed, no partial function warning. this is genuinely beautiful.

one headache in OCaml vs Haskell: you have to be careful with type a. annotations — if a recursive call involves an existential GADT type variable, it can escape its scope and produce a terrifying error message. always annotate your recursive GADT functions in OCaml.

🔬

FormalFelinae

✦ Proof Mechanist

Posts: 892
Rep: 💙 644
Joined: '23
Lang: Agda, Lean 4

#7 ⭐ Notable Nov 4, 2025 · 14:39

can we talk about the type equality witness pattern? this is one of my favorite GADT tricks — you can encode propositional type equality as a GADT and use it to do very dependently-typed-feeling things:

HASKELL

data (:~:) :: k -> k -> * where
  Refl :: a :~: a   -- only constructor: a ~ a

-- If you have a proof that a ~ b, you can coerce
castWith :: a :~: b -> a -> b
castWith Refl x = x

-- Symmetry is provable
sym :: a :~: b -> b :~: a
sym Refl = Refl

-- Transitivity is provable
trans :: a :~: b -> b :~: c -> a :~: c
trans Refl Refl = Refl

this is literally encoding propositional equality from Martin-Löf type theory! Refl is the reflexivity proof. when you pattern match on Refl :: a :~: b, GHC learns a ~ b locally, which is exactly the same mechanism as identity types in Agda/Idris.

where it falls short: in full dependent type theories you have the J eliminator — a general principle for reasoning about equality proofs. Haskell's coercion mechanism is weaker; you can't do full induction over equality proofs without extra machinery.

still, for day-to-day type-safe programming, this is incredibly powerful. i use (:~:) constantly in production OCaml/Haskell code for heterogeneous comparisons and dynamic typing schemes.

😼

TypeNihilistKat

⚡ Chaos Typist

Posts: 4,889
Rep: 🌙 3,211
Joined: '20
Lang: everything & nothing

#8 Nov 4, 2025 · 16:20

FormalFelinae wrote:

this is literally encoding propositional equality from Martin-Löf type theory!

ok felinae u got me, that's genuinely cool. and the OCaml equivalent is the eql type from the Cambridge GADT lecture notes right? same idea:

OCAML

type (_, _) eql =
  | Refl : ('a, 'a) eql

(* At the data level: same representation as unit *)
(* At the type level: carries a proof that 'a = 'b *)

let coerce : type a b. (a, b) eql -> a -> b =
  fun Refl x -> x

let sym : type a b. (a, b) eql -> (b, a) eql =
  fun Refl -> Refl

what gets me every time is that eql has a single nullary constructor at runtime — it's just unit. it carries zero runtime information. all the interesting stuff is purely in the types. this is the platonic ideal of a phantom type. 😼

though this also highlights the limitation: because it's erased, you can't inspect the equality proof itself. in HoTT / Agda with Cubical you can reason about different paths between equal things. in Haskell you get one Refl and that's it.

👻

GhostlyGADT

★ Phantom Type Enjoyer

Posts: 1,432
Rep: 🌸 847
Joined: '22
Lang: GHC HEAD

#9 Nov 4, 2025 · 18:44

let me bring up a practical limitation that doesn't get discussed enough: type inference degrades badly with GADTs.

in regular Hindley-Milner you get global type inference for free. with GADTs, the system can't always infer what you mean because the type equations introduced by pattern matching create ambiguities. you end up needing type annotations everywhere, which can be really noisy.

OCAML

(* This fails without the annotation! *)
let eval_bad = function
  | Int n  -> n
  | Bool b -> b  (* Error: types int and bool are incompatible *)

(* You need a locally abstract type annotation *)
let eval_ok (type a) (v : a value) : a = match v with
  | Int n  -> n
  | Bool b -> b

the reason: OCaml by default won't instantiate ordinary type variables in different ways within the same function body. a locally abstract type (type a) sidesteps this restriction. it's not hard once you know it, but the error message is cryptic without context.

in Idris/Agda this just... works. the whole point is that terms and types are unified, so inference operates over the full bidirectional type checking machinery. the annotation tax is the cost of tacking dependent-style indices onto a language not designed for them from the ground up.

🔬

FormalFelinae

✦ Proof Mechanist

Posts: 892
Rep: 💙 644
Joined: '23
Lang: Agda, Lean 4

#10 Nov 4, 2025 · 20:17

summary post from the agda-pilled corner of the thread 🔬

What GADTs CAN simulate:

Type-indexed data structures (Expr, Vec, Fin)
Type equality witnesses (:~:, eql)
Heterogeneous lists and type-safe heterogeneous maps
Tagless-final interpreters with static type safety
Phantom-type based access control / state machines
Some forms of existential types

What they fundamentally CANNOT do (without full dependent types):

Return types that depend on runtime-computed values directly
Inductive families where the index is an arbitrary expression
Full proof-relevant reasoning (HoTT-style path types)
Dependent functions (Π-types) and dependent pairs (Σ-types)
Termination checking for recursive types over computed indices

the line between "GADT with DataKinds + TypeFamilies + singletons" and "just use Idris" is thinner than most people think. at some point you're writing a dependent type system in macro form. might as well use the real thing for verification-critical code.

that said — for production software in Haskell/OCaml, GADTs are an incredible tool. they're practically zero cost, fit into existing codebases, and catch entire classes of bugs at compile time. use them freely.

😼

TypeNihilistKat

⚡ Chaos Typist

Posts: 4,889
Rep: 🌙 3,211
Joined: '20
Lang: everything & nothing

#11 Nov 5, 2025 · 00:03

one more thing i want to throw in before bed: the existential side of GADTs, which often gets less attention than the index/refinement side.

HASKELL

-- Existential: 'a' appears in constructor args but NOT return type
data ShowBox where
  MkShow :: Show a => a -> ShowBox

-- Can store heterogeneous showables in a list!
boxes :: [ShowBox]
boxes = [MkShow 42, MkShow "hello", MkShow True]

showAll :: [ShowBox] -> [String]
showAll = map (\(MkShow x) -> show x)

this is the existential flavor. in full dependent types, existentials are just sigma types (Σ-types). in Haskell/OCaml you get them "for free" from GADTs by having a type variable appear in constructor arguments but not the return type.

the limitation: you can only eliminate the existential by applying whatever constraint you packed in. you can't recover the actual type. in a proper Σ-type you have both the witness AND the proof that it satisfies some property. with Haskell's existentials, the typeclass constraint IS the only thing you know.

anyway, great thread ghost. bookmarking forever. gonna go cry into my Agda proofs now 😼✨

🐾

PatternMatchPurrr

♦ Core Contributor

Posts: 3,017
Rep: 💜 2,103
Joined: '21
Lang: OCaml + Coq

#12 Nov 5, 2025 · 08:30

closing thought from me — there's a great talk by Stephanie Weirich about this exact trajectory. Haskell has been growing towards dependent types for years through GHC extensions: GADTs, DataKinds, TypeFamilies, PolyKinds, RankNTypes... each one closing the gap a little more.

the honest position: Haskell with dependent types is never going to be Agda, and that's okay. the tradeoffs are different. Haskell keeps its industrial-strength runtime, its ecosystem, its laziness. it sacrifices the clean type-theoretic foundations of a language designed for dependent types from day one.

for most teams: use GADTs in Haskell/OCaml for day-to-day invariant enforcement. reach for Agda/Lean when you need machine-checked proofs of correctness. the two-language problem (model in Agda, implement in Haskell) is real and painful, but that's where we are right now.

maybe Dependent Haskell (GHC proposal) will fix this someday. until then: GADTs are our best tool and honestly? they're pretty great. 🐾

// reply to thread