On this page:
14.1 A Preview
14.2 Getting Our Theory Closer to Practice
14.3 Doing It with ’Stile
14.3.1 Working with Types
14.3.2 Typing Rules in turnstile
14.3.3 Making Your Definitions Available
7.4.0.4

14 Typed Languages with Turnstile

Jesse Tov

Goals

transition from blackboard rules to a real #lang

see how macro expansion can check types

use the turnstile library to implement a typed language

14.1 A Preview

This afternoon, we’re going to learn how to implement a type system using macros. The end result will be a #lang that checks while it expands and raises a syntax error if it encounters a type error. Here is an example interaction that makes the type checker happy:

> (+ 3 4)

7

> (λ ([x Bool] [y Int])
    (if x y (* 2 y)))

#<procedure>

> (let ([x 5])
    (let ([x 8] [y x])
      y))

5

And here’s an interaction demonstrating some type errors:

> (+ 3 #t)

eval:4.0: #%app: type mismatch: expected Int, given Bool

  expression: #t

  at: #t

  in: (#%app + 3 #t)

> (λ ([x Int] [y Bool])
    (if x y (* 2 y)))

eval:5.0: if: type mismatch: expected Bool, given Int

  expression: x

  at: x

  in: (if x y (* 2 y))

> (λ (x) (+ x 1))

eval:6.0: λ: no expected type, add annotations

  at: (λ (x) (+ x 1))

  in: (λ (x) (+ x 1))

  parsing context:

   while parsing has-expected-type

    term: (λ (x) (+ x 1))

    location: eval:6.0

The last error message suggests that we can fix the function by adding a type annotation. Annotating either the whole function or just the parameter allows the function to type:

> (ann (λ (x) (+ x 1))
       (-> Int Int))

#<procedure>

> (λ ([x Int]) (+ x 1))

#<procedure>

Beyond the rules we saw this morning, the stlc language adds a module-level definition form def, which is visible only to code that follows it. It also supports testing forms, such as check-type and typecheck-fail.These forms originate from the rackunit/turnstile library. Here’s a full-module example using these:

"use-stlc.rkt"

  #lang s-exp "stlc.rkt"
   
  ;; Subtracts two integers.
  (def -
    (λ ([x Int] [y Int])
       (+ x (* -1 y))))
   
  (check-type - : (-> Int Int Int))  ; correct type for -
  (check-type (- 3 8) : Int -> -5)   ; correct type and result
   
  (typecheck-fail (- 3 #t))          ; bad argument type
  (typecheck-fail (- 3 8 12))        ; too many arguments
  (typecheck-fail (- 3))             ; too few arguments
   
   
  ;; Finds the factorial of an integer.
  (def fact
    (rec self (-> Int Int)
      (λ (n)
        (if (<= n 1)
            1
            (* n (self (- n 1)))))))
   
  (check-type (fact 5) : Int -> 120)
   
   
  ;; (iter n f x) applies `f` to `x`, `n` times.
  (def iter (-> Int (-> Int Int) Int Int)
    (rec self
      (λ (n f x)
        (if (<= n 0)
            x
            (self (- n 1) f (f x))))))
   
  ;; Integer exponentiation using `iter`.
  (def expt (-> Int Int Int)
    (λ (n m)
      (iter m (λ (acc) (* n acc)) 1)))
   
  (check-type (expt 2 10) : Int -> 1024)

14.2 Getting Our Theory Closer to Practice

To express turnstile-style rewriting using type inference rules, we’re going to define a second language as that target of our translation. This new language is like the typed source language λZB→, but without types or type annotations:

The compute-mode and check-mode judgments now have a fourth position between the input expression and the type—a rewritten output expression. Here’s the form of the new inference-mode judgment: . And here’s the form of the new check-mode judgment:

We need one more change, to type environments. Part of how turnstile works involves replacing variable names with new ones, and to do this, it uses an environment to map each old name () to both a new name () and the variable’s type:

Now we can express the rule for inferring the type of a variable. It has to be present in the type environment, in which case we rewrite it to the new name from the environment and assign the type from the environment:

Some rules, such as the inference-mode rule for function applications, are the same as before, except for rewriting each piece before putting them back together:

Rules that bind variables also need to participate in the renaming. This means generating a fresh name for each bound variable and including that information in the environment for expressions in the variable’s scope. For example, here’s the check-mode rule for :

Finally, some rules need to eliminate syntactic forms that the target language lacks. For example, the inference-mode rules for type annotation expressions and type-annotated λs both erase the annotations in order to translate to the untyped target language:

14.3 Doing It with ’Stile

Before we see how to write those same rules in turnstile, let’s start with the basics. The easiest way to make a turnstile–based language is to write it in the #lang turnstile/quicklang language, which automatically provides #%module-begin, #%top-interaction, and require from our language:

#lang turnstile/quicklang

(If you want to customize the automatically provided forms then there are lower-level ways to set it up.)

Next, before we define the syntax of expressions, we need to define the syntax of types.

14.3.1 Working with Types

The turnstile library represents types as syntax objects. Types have an external form, which you write directly, and an internal form suitable for comparison and pattern matching. Each type written in a user program will be in the external representation, and it’s our responsibility to normalize it to its internal form. To help with this, turnstile provides a syntax class type (at phase 1). The syntax-parse pattern τ:type

  1. matches only well-formed types,

  2. normalizes type τ to its internal form, and

  3. binds an attribute named τ.norm to that internal form for use in subsequent syntax templates. (Note that pattern variable τ gets bound to the original syntax, not the internal form.)

Setting all that up can be pretty complicated, but turnstile provides macros that will define the right things for you. So, let’s start by defining our types. As before, we want two base types for Booleans and integers, and one type constructor for function types:

"stlc.rkt"

19   (define-base-types Bool Int)
20   (define-type-constructor -> #:arity >= 1)

The notation #:arity >= 1 means that -> requires at least one parameter to be well formed.

Each type definition above binds four names. When you define a type named τ, you get:

Let’s see how constructing and matching types works using the REPL. First we’ll require turnstile and define our types:

> (require turnstile/base)
> (define-base-types Bool Int)
> (define-type-constructor -> #:arity >= 1)

Now, what can we do with them? At run time, not much, since the operations aren’t available:

> Int

Int: Cannot use tag at run time

> #'Int

#<syntax:eval:5:0 Int>

> (Int? #'Int)

Int?: undefined;

 cannot reference an identifier before its definition

  module: 'program

The error is because define-base-types defined Int? in phase 1 (the transformer environment), not phase 0 (the run-time environment). This is because turnstile’s types are meant to be used while type checking, which happens at compile time. Further, some turnstile operations, such as type normalization (and thus, the type syntax class), only work during macro expansion.

To make it easy to experiment with macro expansion in the REPL, we’ll make a variant of syntax-parse that pattern matches phase 0 syntax, evaluates the body of the matching clause in 1, and then shifts the resulting value back to phase 0. That’s tricky, but you don’t need to understand how it works in order to experiment with it. Here’s the definition:

> (require (for-syntax syntax/parse))
> (define-syntax (syntax-parse0 stx)
    (syntax-parse stx
      #:literals (syntax)
      [(_ #'e0 [p e1 ... er] ...+)
       #'(let-syntax
             ([m (λ (stx)
                   (syntax-parse stx
                     [(_ p) e1 ... #`'#,er] ...))])
           (m e0))]))

To see how to use it, let’s start with something simple: Match the syntax of a type and format it as a string (using turnstile’s phase 1 function type->str):

> (syntax-parse0 #'Int  ; ↓ note: this isn't quite right
    [τ:type  (type->str #'τ)])

"Int"

That might seem like it’s not doing anything, but let’s try formatting a couple ill-formed types:

> (syntax-parse0 #'Intt ; ↓ still not right…
    [τ:type  (type->str #'τ)])

eval:8.0: m: eval (10:0) not a well-formed type: Intt

  at: Intt

  in: (m Intt)

  parsing context:

   while parsing type

    term: Intt

    location: eval:10.0

> (syntax-parse0 #'(->) ; ↓ also not right…
    [τ:type  (type->str #'τ)])

eval:8.0: m: eval (11:0) not a well-formed type: (->)

eval:11:0: ->: Improper usage of type constructor ->: (->),

expected >= 1 arguments

  in: (->)

  at: (->)

  in: (m (->))

  parsing context:

   while parsing type

    term: (->)

    location: eval:11.0

In order to write a type checker, we also need to be able to recognize and decompose types, which is where pattern expanders, like ~Int and ~-> come in. To match type Int with syntax-parse, use the pattern ~Int; to match an one-parameter arrow type and bind the domain and codomain to pattern variables, you want the pattern (~-> dom cod). Let’s try matching an arrow type and returning codomain as a string:

> (syntax-parse0 #'(-> Int Bool)
    ; ↓ won't match ↑
    [(~-> dom cod)     (type->str #'cod)])

eval:8.0: m: expected the identifier `#%plain-app'

  at: ->

  in: (m (-> Int Bool))

What’s the problem now? The turnstile library’s pattern expanders work only on internal types, but types that come directly from the program’s syntax haven’t been normalized to internal types yet. That means we first need to match using the type syntax class, and then use the norm attribute to get the normalized (internal) type and match that using syntax-parse’s pattern directive #:with:

> (syntax-parse0 #'(-> Int Bool)
    [τ:type                 ; ↓ this is the solution
     #:with (~-> dom cod) #'τ.norm
     (type->str #'cod)])

"Bool"

It’s important know which types have been internalized and which are still external and unchecked. Since tracking this can be tricky, I adopt the following convention: external types are named using Greek letters (e.g., τ, σ), whereas internal types are named using Roman letters (e.g., t, s, dom, etc.).

As a final example of working with types, let’s write a macro that counts all the arrows in a type:

> (define-syntax (count-arrows stx)
    (syntax-parse stx
      [(_ τ:type)
       #:with (~-> dom ... cod) #'τ.norm
       #'(+ 1 (count-arrows dom) ... (count-arrows cod))]
      [_
       #'0]))
> (count-arrows Int)

0

> (count-arrows (-> Int Bool))

1

> (count-arrows (-> Int Int (-> Int Bool)))

2

> (count-arrows (-> (-> Int Int) Int (-> Int Bool)))

3

14.3.2 Typing Rules in turnstile

Now we’re finally ready to see how to we can implement the rules of our type system using turnstile. We’ll use its define-typed-syntax macro to define each syntactic form in our typed language as a macro that infers or checks the type and expands into untyped Racket code. You can read define-typed-syntax in two complementary ways:

As a first example, let’s see the rule for ann, written in both forms:

"stlc.rkt"

29   (define-typed-syntax ann
30     ; to COMPUTE the type of ‘(ann e τ)’, . . .
31     [(_ e:expr τ:type)    
32      
33      ; CHECK that ‘e’ has type ‘τ’ (normalized), calling its
34      ; expansion ‘e-’, . . .
35      [ e  e-  τ.norm] 
36      ----
37      ; then expand to ‘e-’ and COMPUTE type ‘τ’ (normalized).
38      [ e-  τ.norm]])

There are three main differences you should notice:

  1. The type environment is implicit in turnstile (though there’s a way to add things to it for a premiss).

  2. If a typed syntactic form requires multiple type inference rules on paper, then each of those rules becomes a case, à la syntax-parse, in the same macro definition using define-typed-syntax. This means that:
    • If there are both compute-mode and check-mode rules for the same syntactic form, all of them need to be cases in the single macro that implements that form.

    • As with syntax-parse, the cases are considered in the order they are written.

    • When in check mode, compute-mode rules are also considered to match, which means they should come after the check-mode rules.

    • Racket wraps each literal L as (#%datum . L), so all literals (integers and Booleans, in our case) need to be handled by an implementation of #%datum.

  3. The two notations differ in where the expression-to-be-typed appears. In the traditional notation, appears as part of the judgment in the conclusion, beneath the horizontal line. By contrast, inputs to a define-typed-syntax case appear as patterns at the top of each case, leaving only outputs at the bottom. So the expression-to-be-typed is a syntax-parse pattern at the top of each case, and for check-mode cases, the expected type also appears as a pattern at the top.

Given all that, here’s the basic shape to define a form with one check-mode case followed by one infer-mode case:

(define-typed-syntax ⟨form-name⟩
  [⟨typed-form-pattern⟩
    ⟨expected-type-pattern⟩  ; ← expected type means check mode
   
   ⟨premiss⟩ ...
   ----
   [ ⟨untyped-form-template⟩]]
 
  [⟨typed-form-pattern⟩
   
   ⟨premiss⟩ ...
   ----          ; computed type ↓ means compute mode
   [ ⟨untyped-form-template⟩  ⟨computed-type-template⟩]])

In order to type a turnstile rule (using your keyboard), you need to be able to enter the special characters that turnstile uses. DrRacket has a mechanism for entering special characters by typing a particular sequence starting with \ and ending with Control-\. Here are the sequences you’ll need:

   

\gg

   

\Leftarrow

   

\Rightarrow

   

\succ

   

\vdash

(Alternatively, see turnstile/no-unicode.)

Figure 5: Unicode Text Entry

As a final, larger example, let’s consider how to type λ. For reference, here are the relevant blackboard rules:

And here’s the implementation using define-typed-syntax:

"stlc.rkt"

46   (define-typed-syntax λ
47     [(_ (x:id ...) e:expr)
48       (~-> s ... t)
49      
50      #:fail-when (check-duplicate-identifier (stx->list #'(x ...)))
51      "repeated formal parameter name"
52      #:fail-unless (= (stx-length #'(x ...)) (stx-length #'(s ...)))
53      "wrong number of formal parameters for expected arrow type"
54      [[x  x- : s] ...  e  e-  t]
55      ----
56      [ (λ- (x- ...) e-)]]
57     
58     [(_ ([x:id σ:type] ...) e:expr)
59      
60      #:fail-when (check-duplicate-identifier (stx->list #'(x ...)))
61      "repeated formal parameter name"
62      #:with (s ...) #'(σ.norm ...)
63      [[x  x- : s] ...  e  e-  t]
64      ----
65      [ (λ- (x- ...) e-)  (-> s ... t)]])

Everything else you need to know about turnstile for the exercises is used in the above macro, so let’s consider it line by line:

14.3.3 Making Your Definitions Available

To make your language work, you need to provide the necessary bindings: syntax, types, and values. When making a turnstile-based language, you provide your typed syntax in the way you’ve already seen:

"stlc.rkt"

  (provide ann if def λ let rec #%app #%datum
           (rename-out [λ lambda]))

To export types, you should use type-out, like so:

"stlc.rkt"

5   (provide (type-out Bool Int ->))

To export typed values (usually functions), use typed-out:

"stlc.rkt"

 6   (provide (typed-out [not    (-> Bool Bool)]
 7                       [+      (-> Int Int Int)]
 8                       [*      (-> Int Int Int)]
 9                       [/      (-> Int Int Int)]
10                       [<=     (-> Int Int Bool)]
11                       [zero?  (-> Int Bool)]))

And finally, we want to require and re-provide the testing forms from rackunit/turnstile to our new language:

"stlc.rkt"

13   (require rackunit/turnstile)
14   (provide (all-from-out rackunit/turnstile))

Of course, we haven’t implemented all the syntax in that first provide yet, since that will be your job.