Type Inference: Hindley-Milner vs Bidirectional - a comparison 📌 PINNED
📬 33 replies 👁 2,847 views 👤 Started by InferenceEngine9 🕐 Posted: 2025-11-04 🏷 #type-inference #hindley-milner #bidirectional
#1  OP

Hey everyone. I've been implementing a small dependently typed language for fun and kept hitting a wall trying to decide: should I go with Hindley-Milner style global inference or bidirectional type checking? I ended up reading a lot of papers and thought this would be a good thread to summarize the key differences and get everyone's thoughts.

The Basics: What is Hindley-Milner?

A Hindley-Milner (HM) type system is a classical type system for the lambda calculus with parametric polymorphism, also known as Damas-Milner or Damas-Hindley-Milner. Among HM's more notable properties are its completeness and its ability to infer the most general type of a given program without programmer-supplied type annotations or other hints.

The key workhorse is Algorithm W. Despite its high theoretical (worst-case exponential) complexity, Algorithm W is efficient in practice and has been successfully applied to large code bases. HM is most commonly used in functional programming languages; it was first implemented as part of the type system of ML. Since then, HM has been extended in various ways, most notably with type class constraints like those in Haskell.

Here's a sketch of the core unification-based inference (Algorithm J style):

-- HM inference: returns a type, unifying constraints along the way
infer :: Ctx -> Expr -> Type

-- Var rule: look up in context, instantiate polymorphic type
infer ctx (Var x) = instantiate (lookup ctx x)

-- App rule: infer function type, unify with argument
infer ctx (App f a) =
  let tf = infer ctx f
      ta = infer ctx a
      tr = freshTyVar ()
  in unify tf (ta -> tr); tr

-- Let rule: generalize before adding to context (let-polymorphism)
infer ctx (Let x e1 e2) =
  let t1 = infer ctx e1
      σ  = generalize ctx t1   -- ∀-quantify vars free in t1 but not in ctx
  in infer (ctx[x ↦ σ]) e2

Inferring polymorphic (generic) types this way relies on unification: the process of solving equations between types that contain type variables. HM sits in a sweet spot: the type system is quite expressive, and there are well-known type inference algorithms that require absolutely no annotations from the programmer.
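
For anyone who hasn't written one, here's a toy first-order unifier in Haskell to match the sketch above. This is my own illustration (Type, Subst, and the helpers are made-up names), and it's simplified: real implementations compose substitutions more carefully or use mutable union-find.

data Type = TVar String | TCon String | TArrow Type Type
  deriving Eq

type Subst = [(String, Type)]

-- Solve t1 = t2, returning a substitution for the type variables (or Nothing)
unify :: Type -> Type -> Maybe Subst
unify (TVar a) t = bind a t
unify t (TVar a) = bind a t
unify (TCon a) (TCon b) | a == b = Just []
unify (TArrow a1 r1) (TArrow a2 r2) = do
  s1 <- unify a1 a2
  s2 <- unify (apply s1 r1) (apply s1 r2)   -- apply what we learned so far
  Just (s2 ++ s1)                            -- toy composition of substitutions
unify _ _ = Nothing                          -- e.g. Int vs Int -> Int

bind :: String -> Type -> Maybe Subst
bind a t
  | t == TVar a = Just []      -- nothing to learn
  | occurs a t  = Nothing      -- occurs check: rejects infinite types like a = a -> a
  | otherwise   = Just [(a, t)]

occurs :: String -> Type -> Bool
occurs a (TVar b)     = a == b
occurs _ (TCon _)     = False
occurs a (TArrow d r) = occurs a d || occurs a r

apply :: Subst -> Type -> Type
apply s (TVar a)     = maybe (TVar a) id (lookup a s)
apply _ (TCon c)     = TCon c
apply s (TArrow d r) = TArrow (apply s d) (apply s r)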

Let-Polymorphism and its Limits

HM distinguishes variables that are immediately bound to an expression from more general λ-bound variables, calling the former let-bound variables, and allows polymorphic types to be assigned only to these. This leads to let-polymorphism: in the canonical example let id = λx. x in ... (id 3) ... (id "text") ..., id can be assigned a polymorphic type.
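
To make the limit concrete, a quick Haskell illustration (my example):

-- OK: id' is let-bound, so HM generalizes it to ∀a. a -> a
ok :: (Int, String)
ok = let id' x = x
     in (id' 3, id' "text")

-- Rejected: f is λ-bound, so it stays monomorphic and cannot be
-- used at both Int and String in the body.
-- bad = (\f -> (f (3 :: Int), f "text")) (\x -> x)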

I'll post more on bidirectional in the next reply. Curious what everyone thinks so far!

#2  OP

Now: Bidirectional Type Checking

Bidirectional typing combines two modes of typing: type checking, which checks that a program satisfies a known type, and type synthesis, which determines a type from the program.

This approach (check an expression against a type when we know what type to expect, synthesize a type from an expression when we don't) is called bidirectional type checking, so named because type information flows in two directions in the abstract syntax tree: from leaves to root when synthesizing, from root to leaves when checking.

The two core judgments in any bidir system are synthesis (Γ ⊢ e ⇒ A, "e synthesizes type A") and checking (Γ ⊢ e ⇐ A, "e checks against type A"). Two representative rules:

Γ, x:A ⊢ M ⇐ B
Γ ⊢ (λx.M) ⇐ (A → B) [⇐Abs]

Γ ⊢ f ⇒ (A → B)    Γ ⊢ a ⇐ A
Γ ⊢ f(a) ⇒ B [⇒App]
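
A tiny worked trace of my own, using just these two rules, to show the direction of information flow:

Checking  Γ ⊢ (λx. x) ⇐ (Int → Int)  with [⇐Abs]:
  split the expected arrow: extend the context with x:Int,
  then check  Γ, x:Int ⊢ x ⇐ Int.
  The variable x synthesizes Int by lookup, which matches the expected Int.
No fresh type variable is ever created: the expected type supplied everything.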

Generally, for the constructors of a type we can propagate type information downward into the term, so they belong in the checking judgment (e ⇐ τ). Conversely, the destructors produce a result of a smaller type from a constituent of larger type, so they can be used for synthesis, propagating information upward.

The key subsumption rule that switches modes:

Γ ⊢ e ⇒ A    A <: B
Γ ⊢ e ⇐ B [Sub]

The Sub rule is what lets checking fall back on synthesis: if you can synthesize a type A for e, and A is a subtype of the expected B, you're done.

#3

Great intro! I want to jump straight to what I think is the most underrated advantage of bidir: error messages.

An advantage bidirectional type checking has over the Hindley-Milner inference algorithm is that we can give a more specific error than simply 'failed to unify types A and B'.

In HM, when unification fails, you often get a spooky "Cannot unify α3 with Int → β7" message where the offending type variables have been propagated across the whole program; you lose locality completely. In bidir, when checking mode fails, you know exactly where the mismatch happened and what type was expected, because that information is structural.

Also want to clarify the annotation burden question. One of the commonly cited disadvantages of bidirectional type inference is that we sometimes need to include type annotations in places where the type seems 'obvious'. However: if we extend our language with annotated let-bindings and force everything at the top level to be a let-binding, then we should almost never need another type annotation in the bodies of the let-bindings! This restriction isn't unrealistic for a language.

-- HM: no annotations needed at all (inference is global)
let compose f g x = f (g x)
-- inferred: (β → γ) → (α → β) → α → γ

-- Bidir: top-level annotation, rest is inferred locally
let compose : (β → γ) → (α → β) → α → γ
let compose f g x = f (g x)
-- body inferred from the pushed-in type, no annotation inside

Honestly for any serious language I'd argue the top-level sig discipline is a feature, not a bug. It makes code self-documenting.

#4

I'll add the formal perspective. The key reason bidir dominates in dependent type theory is decidability and subtyping.

If you try to extend Hindley-Milner type inference with any language feature that requires subtyping, you run into issues pretty quickly. This is because Hindley-Milner exploits a cute hack (keeping all quantifiers in prenex position so that plain first-order unification suffices) that does not generalize well.

In contrast, bidirectional type-checking is not just a method for handling implicit type abstraction/application. Rather, bidirectional type-checking is a framework for introducing any form of subtyping, of which implicit type abstraction/application is just a special case.

The key paper here is Dunfield & Krishnaswami (2013), "Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism." Also the so-called "Pfenning recipe" yields a set of bidirectional typing rules that is small and whose annotation discipline is moderately lightweight and highly predictable.

For HM, the limitation is well understood: in HM type systems, universal quantification can only appear at the top level (prenex, rank-1 polymorphism), and this restriction is exactly what rules out higher-rank polymorphism. Concretely, if you want a function that takes a polymorphic argument like ∀a. a → a at rank 2, HM chokes. Bidir handles it cleanly.

HM let-polymorphism (rank-1 only):

Γ ⊢ e₁ : τ    Γ, x:∀ᾱ.τ ⊢ e₂ : τ'
Γ ⊢ let x = e₁ in e₂ : τ' [Let, ᾱ = ftv(τ)\ftv(Γ)]

Bidir annotation rule - escape hatch for synthesis:

Γ ⊢ e ⇐ A
Γ ⊢ (e : A) ⇒ A [Anno]

The Anno rule is what makes bidir "escape" into synthesis anywhere: you just annotate the term. HM has no equivalent because it doesn't have checking mode at all.
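
To see the rank-2 case in a real language: this compiles in Haskell only because of the signature (via GHC's RankNTypes), which plays exactly the role of [Anno]. My example, not from the paper:

{-# LANGUAGE RankNTypes #-}

-- Rank-2: the argument f must itself be polymorphic.
-- HM cannot infer this type; the annotation makes it checkable.
applyTwice :: (forall a. a -> a) -> (Int, Bool)
applyTwice f = (f 3, f True)

main :: IO ()
main = print (applyTwice id)   -- prints (3,True)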

#5

Wait, question - I've heard "local type inference" used almost synonymously with bidirectional. Are they the same thing?

#6
LambdaCatgirl wrote:
Wait, question - I've heard "local type inference" used almost synonymously with bidirectional. Are they the same thing?

Related but distinct! The term "local type inference" (Pierce and Turner 2000) implies that bidirectional typing should focus on (or even be restricted to) local information, suggesting that annotation disciplines should not need global information.

Local type inference is a pragmatic approach. It does not aim to provide the same degree of type inference that is possible in languages with Hindley-Milner type inference. Instead, as Pierce and Turner state, the goal is to "exchange complete type inference for simpler methods that work well in the presence of more powerful type-theoretic features such as subtyping and impredicative polymorphism".

In local type inference, the main design decision that is made to simplify type inference is to avoid forms of global type inference that employ long-distance constraints such as unification variables.

So: local type inference is a philosophy/design constraint. Bidirectional is the mechanism usually used to implement it. You can have bidir systems that still use long-distance unification (like "Complete and Easy") โ€” those aren't strictly "local" anymore but still benefit from the checking/synthesis structure.

#7

I want to address the "are they mutually exclusive?" question that always comes up. They're not.

When designing a type system, knowing whether you need unification decides a lot for you. Unification sits center stage in Hindley-Milner: when you pick HM, you pick unification.

If you look at the literature, you'll find plenty of examples of bidirectional typing without unification in sight. By introducing annotations at key locations, you can type check sophisticated programs with no type variables at all. A key insight of bidirectional typing is how much you can do without unification.

BUT (and this is important) you can absolutely combine them. A combination of the two, a bidirectional typing system with unification, will be the best approach in many cases. In fact, if you are going to choose a Hindley-Milner type system, you might as well add a few lines of code and make it a bidirectional system.

Here's the combined approach in pseudocode:

-- Bidir + unification: best of both worlds
check :: Env → Ast → Type → Result<(), TypeError>

-- Lambda case: push the type in!
check env (Fun x body) (Fun arg ret) =
  check (env.insert(x, arg)) body ret

-- Everything else: fall back to synthesis, then unify with expected
check env ast ty =
  let inferred = infer env ast
  in unify inferred ty

-- The lambda case is the key win: we avoid a fresh type variable
-- and can directly use the known argument type from context.

The weakness of pure bidirectional typing systems is that they do not talk about "partially-known" types. Bidirectional typing systems require "complete" types, but the inference rules of the lambda calculus fundamentally give us partial information about types. Unification fills exactly that gap.
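
Here's that gap in miniature, in a sketch of mine (freshMeta and extend are hypothetical helpers): synthesizing a bare lambda forces partial types on you.

-- Synthesizing an unannotated lambda: we must invent a metavariable ?a,
-- because nothing tells us the argument type yet.
synth env (Lam x body) = do
  a <- freshMeta                    -- partially-known type ?a
  b <- synth (extend x a env) body
  pure (TArrow a b)                 -- ?a gets solved later, by unification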

#8  OP

Good points all around. Let me add a practical comparison table:

Feature                    │ Hindley-Milner      │ Bidirectional
───────────────────────────┼─────────────────────┼───────────────────
Annotation burden          │ None (global)       │ Top-level sigs
Higher-rank polymorphism   │ Undecidable         │ Supported
Subtyping                  │ Very hard           │ Natural fit
Dependent types            │ Not applicable      │ Standard approach
Error messages             │ Global, confusing   │ Local, precise
Implementation complexity  │ Moderate            │ Lower (pure bidir)
Principal types            │ Yes (guaranteed)    │ Not always
Used in                    │ Haskell, OCaml      │ Agda, Lean, Idris

A few languages use Hindley-Milner type inference, such as Rust, Haskell, and F#. Bidirectional type checking is used in TypeScript, Scala, and others.

One thing I didn't list: Hindley-Damas-Milner inference typically provides a stronger form of principality, the property that type inference never "chooses" among incompatible, valid types for subexpressions. Bidir can be ambiguous about which type to synthesize in certain cases. That's a real tradeoff.

#9

Really appreciate that table!! One thing I'm confused about - why can't HM just... infer dependent types? Like what's the actual theoretical blocker?

#10
LambdaCatgirl wrote:
why can't HM just... infer dependent types?

Wonderful question. The short answer: type checking becomes undecidable once types can depend on terms, because you'd need to evaluate arbitrary programs at compile time. HM's whole power comes from working purely in a stratified world where terms and types are separate layers.

In dependent types, you might have something like Vec A n where n : ℕ is a term. Checking Vec A (f 3) ~ Vec A (g 2) reduces to checking f 3 = g 2, but f and g can be arbitrary programs! Deciding that requires deciding program equality, which is undecidable in general.
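
In checker terms, that equality test bottoms out in evaluation. A sketch (my names; real checkers normalize and compare up to α/β/η-equivalence):

-- Type equality in a dependent checker requires evaluating terms:
convertible :: Ctx -> Term -> Term -> Bool
convertible ctx t u = alphaEq (normalize ctx t) (normalize ctx u)
  -- normalize may diverge if the language admits general recursion,
  -- which is why fully inferring (inventing!) such terms is hopeless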

In complexity theory, it is often relatively easy to check that a given solution is valid, but finding (synthesizing) a solution may be complex or even undecidable. This general rule holds for type systems: synthesizing types may be convenient for the programmer, but computationally intractable.

To go beyond the specific feature set of traditional Damas-Milner typing, it might seem necessary to abandon synthesis. Instead, however, we can combine synthesis with checking. In this approach (bidirectional typing), language designers are not forced to choose between a rich set of typing features and usability.

Bidir handles this by making the programmer provide enough annotations so the checker can verify without needing to invent types from scratch. The ⇒/⇐ discipline gives you a principled way to decide exactly where annotations are mandatory.

-- Dependent type checking example (Agda-like notation)
-- The checking mode pushes the expected type INTO the term:
check Γ (Vec xs) (Vec A n) =
  -- We KNOW the expected element type A and length n from context,
  -- no need to infer them from xs alone:
  check Γ xs (List A) ∧ check_length xs n   -- evaluate n to normal form, compare

#11

One more concept worth covering: mode-correctness, which is the formal property that makes a bidirectional system actually work as an algorithm.

As Dunfield and Krishnaswami outlined, mode-correctness for a bidirectional typing rule means that (i) each 'input' type variable in a premise must be an 'output' variable in 'earlier' premises, or provided by the conclusion if the rule is checking; (ii) each 'output' type variable in the conclusion should be some 'output' variable in a premise if the rule is synthesising.

The algorithm of a bidirectional type synthesiser can often be 'read off' from a well-designed bidirectional type system: as the synthesiser traverses a raw term, it switches between synthesis and checking, following the modes assigned to the judgements in the typing rules.

This is the beautiful part: if your rules are mode-correct, the implementation almost writes itself. You get two mutually recursive functions synth and check, and the rules directly correspond to pattern match arms. No unification variable management, no occurs-check headaches, no backtracking.
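
Here's what that looks like for a tiny simply-typed λ-calculus; a self-contained sketch of mine, with each arm tagged by the rule it implements:

data Ty   = TInt | TArr Ty Ty deriving Eq
data Expr = Var String | Lam String Expr | App Expr Expr
          | Lit Int | Ann Expr Ty

type Ctx = [(String, Ty)]

synth :: Ctx -> Expr -> Maybe Ty
synth ctx (Var x)   = lookup x ctx               -- [⇒Var]: context lookup
synth _   (Lit _)   = Just TInt                  -- [⇒Lit]: literals synthesize
synth ctx (Ann e t) = do check ctx e t; Just t   -- [Anno]: the escape hatch
synth ctx (App f a) = do                         -- [⇒App]: function synthesizes,
  TArr dom cod <- synth ctx f                    --   argument checks
  check ctx a dom
  Just cod
synth _   (Lam _ _) = Nothing                    -- bare lambdas don't synthesize

check :: Ctx -> Expr -> Ty -> Maybe ()
check ctx (Lam x b) (TArr dom cod) =
  check ((x, dom) : ctx) b cod                   -- [⇐Abs]: push the arrow apart
check ctx e t = do
  t' <- synth ctx e                              -- [Sub], with plain equality
  if t == t' then Just () else Nothing           --   standing in for subtyping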

Compare that to implementing Algorithm W, where you have to thread a substitution through everything and apply it carefully. It's not hard, but it's much more stateful.

#12

This thread is making me so much smarter omg. So for someone building a new language in 2025 - just go bidir? Or does HM still make sense?

#13  OP
LambdaCatgirl wrote:
for someone building a new language in 2025 - just go bidir?

My take after all this research: bidir + unification is the sweet spot for most new languages. Here's my decision tree:

do_you_need_subtyping?
  yes → bidir (HM won't survive contact with subtyping)
  no  → do_you_need_dependent_types?
    yes → bidir (the only real option)
    no  → do_you_want_zero_annotations?
      yes → HM (or HM + bidir sugar on top)
      no  → bidir + unification (best error messages)

If you're in the "no subtyping, no deptype, yes ML-style" space, HM is absolutely still valid and time-tested. Hindley-Milner doesn't combine easily with subtyping and union types, but if you don't need those, you're fine.

The rest of the replies are deeper dives - click through to the next page for discussion of constraint-based inference, GHC's OutsideIn(X), and how Lean 4 does it.

โœ๏ธ Post a Reply