11 More Language Variations
Let’s continue implementing pfsh. We’ll start with a solution to exercise 24:
"pfsh3.rkt"
#lang racket
(require "run.rkt"
         racket/port
         (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [pfsh:run run]
                     [pfsh:define define]))

(define-syntax (pfsh:run stx)
  (syntax-parse stx
    #:datum-literals (<)
    [(_ prog:id arg:id ... < stream:id)
     #'(with-input-from-string stream
         (lambda ()
           (pfsh:run prog arg ...)))]
    [(_ prog:id arg:id ...)
     #`(void (run (as-string prog) (as-string arg) ...))]))

(define-syntax (as-string stx)
  (syntax-parse stx
    [(_ sym:id)
     #`#,(symbol->string (syntax-e #'sym))]))

(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]))
11.1 The Application Form
So far, the biggest difference between the pfsh that we’ve implemented and the pfsh that we want is that we have to put run before every program name. Instead of (run ls), we want to write (ls).
Since macros can do any kind of work at compile time, you might imagine changing pfsh so that it scans the filesystem and builds up a set of definitions based on the programs that are currently available via the PATH environment variable. That’s not how scripting languages are meant to work, though. Also, it’s likely to cause trouble to use the filesystem and environment-variable state at such a fine granularity to determine bindings of a module.
Instead, we would like to change the default meaning of parentheses. In Racket, a pair of parentheses means a function call by default. In pfsh, a pair of parentheses should mean running an external program by default. The “by default” part concedes that an identifier after an open parenthesis can change the meaning of the parentheses, such as when define appears after an open parenthesis. Otherwise, though, it’s as if a function-call identifier appears after the open parenthesis to specify a function-call form... and function-call exists, except that it’s spelled #%app.
For example, (+ 1 2) in racket is implicitly (#%app + 1 2).
"pfsh7.rkt"
#lang racket
....
(provide ....
         (rename-out [pfsh:run #%app]
                     ....))
....
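To see the implicit #%app at work outside of pfsh, here is a minimal sketch. It assumes a hypothetical submodule language named traced-lang (not part of pfsh) whose replacement for #%app, traced-app, pairs the name in function position with the call's result.

```racket
#lang racket

;; Hypothetical language: all of racket, except that #%app is replaced.
(module traced-lang racket
  (require (for-syntax syntax/parse))
  (provide (except-out (all-from-out racket) #%app)
           (rename-out [traced-app #%app]))
  ;; Inside this module, parentheses still mean racket's #%app,
  ;; so the template below performs ordinary function calls.
  (define-syntax (traced-app stx)
    (syntax-parse stx
      [(_ f arg ...)
       #'(list 'f (f arg ...))])))

;; A module written in the hypothetical language.
(module use (submod ".." traced-lang)
  (provide result)
  ;; No identifier here names #%app; the expander adds it implicitly.
  (define result (+ 1 2)))

(require 'use)
result
```

In the use module, (+ 1 2) is treated as (traced-app + 1 2), so result should be a list holding the symbol + and the number 3 rather than just 3.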
11.2 More Implicit Forms
You have now seen two implicit forms that a language can adjust, #%module-begin and #%app, so you may wonder how many implicit forms there are. The others are #%datum, #%top, and #%top-interaction.
11.2.1 #%datum
The #%datum form is implicitly wrapped around a literal constant such as 0, #true, or "apple" when it appears in a place where an expression is expected. Since the #%datum form always has a single subform, it takes advantage of a performance hack internally by being written with parentheses and a ., which corresponds to a non-list pair instead of a list; so, 0 is implicitly (#%datum . 0), and so on.
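The difference between the dotted and list spellings can be checked on plain quoted data. This small sketch uses only quote and pair operations; the names dotted and listed are made up for illustration.

```racket
#lang racket

;; The dotted form is a pair whose cdr is the literal itself.
(define dotted '(#%datum . 0))
;; The undotted spelling would instead be a two-element list.
(define listed '(#%datum 0))

(pair? dotted)  ; the dotted form is a pair
(list? dotted)  ; but it is not a list
(cdr dotted)    ; the literal itself
(cdr listed)    ; a one-element list holding the literal
```

This is why a #%datum macro matches its input with a pattern like (_ . s) instead of (_ s).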
Let’s not allow numbers in pfsh, but let’s allow literal strings, which can be useful for piping to a program’s input. Since a literal string is useful as a program’s input, let’s also change #%app to allow any expression after a < redirection.
"pfsh8.rkt"
#lang racket
....
(provide #%module-begin
         (rename-out [pfsh:run #%app]
                     [pfsh:define define]
                     [pfsh:datum #%datum]))

(define-syntax (pfsh:run stx)
  (syntax-parse stx
    #:datum-literals (<)
    [(_ prog:id arg:id ... < stream:expr)
     #'(with-input-from-string stream
         (lambda ()
           (pfsh:run prog arg ...)))]
    [(_ prog:id arg:id ...)
     #`(void (run (as-string prog) (as-string arg) ...))]))
....
(define-syntax (pfsh:datum stx)
  (syntax-parse stx
    [(_ . s:string) #'(#%datum . s)]
    [(_ . other)
     (raise-syntax-error 'pfsh
                         "only literal strings are allowed"
                         #'other)]))
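The string plumbing behind define and < can be tried without any external program. The sketch below uses only with-output-to-string and with-input-from-string from racket/port. It substitutes display and read-line for a real command, so it illustrates the port redirection only, not what "run.rkt" does.

```racket
#lang racket
(require racket/port)

;; Capture whatever the thunk prints, as pfsh's define does for a command.
(define captured
  (with-output-to-string
    (lambda ()
      (display "apple\nbanana\n"))))

;; Feed a string to code that reads from the current input port,
;; as a `<` redirection does.
(define first-line
  (with-input-from-string captured
    (lambda ()
      (read-line))))
```

Because the value after < is an ordinary string at run time, either a literal string or a name bound by define can appear there.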
11.2.2 #%top
(define-syntax (complain-top stx)
  (syntax-parse stx
    [(_ . x:id)
     (raise-syntax-error 'variable "misplaced" #'x)]))
"pfsh9.rkt"
#lang racket
(require "run.rkt"
         racket/port
         (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [pfsh:run #%app]
                     [pfsh:top #%top]
                     [pfsh:define define]
                     [pfsh:datum #%datum]))

(define-syntax (pfsh:run stx)
  (syntax-parse stx
    #:datum-literals (<)
    [(_ prog arg ... < stream:expr)
     #'(with-input-from-string stream
         (lambda ()
           (pfsh:run prog arg ...)))]
    [(_ prog arg ...)
     #`(void (run prog arg ...))]))

(define-syntax (pfsh:top stx)
  (syntax-parse stx
    [(_ . sym:id)
     #`#,(symbol->string (syntax-e #'sym))]))

(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]))

(define-syntax (pfsh:datum stx)
  (syntax-parse stx
    [(_ . s:string) #'(#%datum . s)]
    [(_ . other)
     (raise-syntax-error 'pfsh
                         "only literal strings are allowed"
                         #'other)]))
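The effect of a replacement #%top can be observed in isolation with a reduced sketch. It assumes a hypothetical submodule language, string-top-lang, that keeps all of racket but converts unbound identifiers to strings in the same way pfsh:top does.

```racket
#lang racket

;; Hypothetical language: racket, except that unbound identifiers
;; become strings instead of raising an error.
(module string-top-lang racket
  (require (for-syntax syntax/parse))
  (provide (except-out (all-from-out racket) #%top)
           (rename-out [string-top #%top]))
  (define-syntax (string-top stx)
    (syntax-parse stx
      [(_ . sym:id)
       #`#,(symbol->string (syntax-e #'sym))])))

(module use (submod ".." string-top-lang)
  (provide result)
  ;; `list` is bound, so it stays a function; `ls` and `-l` are unbound.
  (define result (list ls -l)))

(require 'use)
result
```

Only identifiers without a binding reach #%top, which is why list keeps its usual meaning while ls and -l turn into "ls" and "-l".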
11.2.3 #%top-interaction
Finally, you may have noticed that when you run any of the working programs with "pfsh9.rkt" and earlier variants, DrRacket usually reports “Interactions disabled: language does not support a REPL (no #%top-interaction).”
"pfsh9.rkt"
....
(provide ....
         (rename-out ....
                     [pfsh:top-interaction #%top-interaction]))

(define-syntax (pfsh:top-interaction stx)
  (syntax-parse stx
    [(_ . form) #'form]))
....
11.3 Defining Functions
Our pfsh implementation can now run the original example script, but let’s go a little further. An advantage of a parenthesis-friendly shell is that we can mix in more of Racket to better support abstraction in a script. At a minimum, we’d like to be able to define and call functions in pfsh scripts:
"use-pfsh11.rkt"
#lang s-exp "pfsh11.rkt"

(define (double x)
  (string-append x x))

(define l (ls -l))
(wc -l < l)
(wc -l < (double l))
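It’s easy to make the string-append function available. It’s also easy to change define to match and distinguish function and stream shapes. The harder part is making function calls work, because #%app in pfsh now means running an external program.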
Here are two ways to make the adaptation work:
We can change #%app so that it inspects an identifier in the “function” position to check whether the identifier is bound. If so, the #%app corresponds to a function call.
In this case, a define for a function can expand to a regular define.
We can make define bind a function name as a macro, so that using the name after an open parenthesis triggers a function-specific macro instead of the generic #%app form.
In this case, a define for a function needs to expand to define-syntax to bind the function name as a macro.
Slightly different behaviors fall out from each of these strategies. With the first strategy, an identifier that is bound to a string for a program name cannot be used to run the program, because using the identifier after an open parenthesis would trigger a function call. With the second strategy, a name bound to a string still works as a program name, but a function identifier doesn’t work as an argument to another function (unless we do a little more work to make that possible). Both approaches are viable, and either could be made to fit a preferred behavior, so let’s try both of them.
11.3.1 Detecting Bindings
To try the first strategy, we need #%app to recognize whether an identifier has a binding or not. Since the #%app macro receives only an immediate application form, how can it know what definitions are in the rest of the module? That is, although the #%app macro can do any work it wants at compile time, it doesn’t have a handle on the whole module to inspect it. The macro expander itself must know about bindings, because it uses binding information to determine which macro should handle an expansion. Happily for our #%app, the macro expander shares its binding information with macros in several ways, including through an identifier-binding function.
The identifier-binding function takes an identifier and reports #f if the identifier has no binding. Otherwise, it reports some information about the binding, such as which module (possibly the current one) contains a definition of the identifier. For our purposes, we do not care about the additional details, so we can just check whether identifier-binding returns #f.
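As a quick check of that behavior, the following sketch defines a hypothetical bound? macro (not part of pfsh) that calls identifier-binding at compile time and expands to #t or #f.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Hypothetical helper: expands to #t if `name` has a binding
;; at the point of use, and to #f otherwise.
(define-syntax (bound? stx)
  (syntax-parse stx
    [(_ name:id)
     (if (identifier-binding #'name)
         #'#t
         #'#f)]))

(define greeting "hi")

;; `greeting` is defined in this module, `string-append` is imported
;; from racket, and `ls` has no binding at all.
(define checks
  (list (bound? greeting)
        (bound? string-append)
        (bound? ls)))
```

Note that bound? never evaluates its argument, so (bound? ls) does not raise an unbound-identifier error.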
(define-syntax (pfsh:run stx)
  (syntax-parse stx
    #:datum-literals (<)
    [(_ prog arg ... < stream:expr)
     #'(with-input-from-string stream
         (lambda ()
           (pfsh:run prog arg ...)))]
    [(_ prog:id arg ...)
     #:when (identifier-binding #'prog)
     #'(prog arg ...)]
    [(_ prog arg ...)
     #`(void (run prog arg ...))]))
(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]
    [(_ (proc:id arg:id ...) expr)
     #'(define (proc arg ...)
         expr)]))
(provide .... string-append)
11.3.2 Macro-Defining Macros
With the strategy where define binds a function name as a macro, we don’t have to change #%app. We just have to change pfsh:define to compile a pfsh function definition into a racket macro definition.
(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]
    [(_ (proc:id arg:id ...) expr)
     #'(define-syntax (proc stx)
         (syntax-parse stx
           [(_ arg ...) #'expr]))]))
(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]
    [(_ (proc:id arg:id ...) expr)
     #'(begin
         (define (actual-proc arg ...)
           expr)
         (define-syntax (proc stx)
           (syntax-parse stx
             [(_ arg ...) #'(actual-proc arg ...)])))]))
(define (double x) (string-append x x))
(define (actual-proc x)
  (string-append x x))
(define-syntax (double stx)
  (syntax-parse stx
    [(_ arg ...) #'(actual-proc arg ...)]))
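The helper-function pattern can be tried in plain racket, without the rest of pfsh. In this sketch, define-call-macro, next!, and calls are hypothetical names chosen for illustration. The side-effecting argument is there to confirm that an argument expression is evaluated once per call even though it is used twice in the body.

```racket
#lang racket
(require (for-syntax syntax/parse))

;; Hypothetical stand-in for the function case of pfsh:define:
;; bind `proc` as a macro that forwards to a hidden helper function.
(define-syntax (define-call-macro stx)
  (syntax-parse stx
    [(_ (proc:id arg:id ...) body:expr)
     #'(begin
         (define (actual-proc arg ...)
           body)
         (define-syntax (proc call-stx)
           (syntax-parse call-stx
             [(_ arg ...) #'(actual-proc arg ...)])))]))

;; An argument expression with a visible side effect.
(define calls 0)
(define (next!)
  (set! calls (add1 calls))
  "ab")

(define-call-macro (double x)
  (string-append x x))

(define doubled (double (next!)))
```

Since actual-proc is introduced by the macro, hygiene keeps each use of define-call-macro from colliding with another, so several functions can be defined this way in one module.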
(provide ....
         (rename-out ....
                     [pfsh:string-append string-append]))

(pfsh:define (pfsh:string-append arg1 arg2)
  (string-append arg1 arg2))
11.4 Installing a Language
Let’s take the last step in defining a language, which will let us switch from #lang s-exp "pfsh11.rkt" to #lang pfsh. To enable writing #lang pfsh, we must do two things:
Adjust our language implementation so that it explicitly specifies S-expression parsing, instead of having S-expression parsing imposed externally.
Install our language as a package so that #lang pfsh will work from anywhere.
The part of a language that specifies its parsing from characters to syntax objects is called a reader. A language’s reader is implemented by a reader submodule (i.e., a nested module) inside the language’s module. That submodule must export a read-syntax function that takes an input port, reads characters from it, and constructs a module form as a syntax object. For historical reasons, the submodule should also provide a read function that does the same thing but returns a plain S-expression instead of a syntax object.
(module reader racket
  (provide (rename-out [pfsh:read-syntax read-syntax]
                       [pfsh:read read]))

  (define (pfsh:read-syntax name in)
    (datum->syntax #f `(module anything pfsh
                         (#%module-begin
                          ,@(read-body name in)))))

  (define (read-body name in)
    (define e (read-syntax name in))
    (if (eof-object? e)
        '()
        (cons e (read-body name in))))

  (define (pfsh:read in)
    (syntax->datum (pfsh:read-syntax 'src in))))
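To see what such a reader produces, the same logic can be run on a string port outside of any reader submodule. This is a sketch for inspection only: wrap-as-module, script, and module-form are hypothetical names, and the resulting module form is only converted to a datum, never expanded, so pfsh does not need to be installed.

```racket
#lang racket

;; Same shape as the read-body helper above: read forms until end of input.
(define (read-body name in)
  (define e (read-syntax name in))
  (if (eof-object? e)
      '()
      (cons e (read-body name in))))

;; Wrap the forms in a module whose language is pfsh.
(define (wrap-as-module name in)
  (datum->syntax #f
                 `(module anything pfsh
                    (#%module-begin ,@(read-body name in)))))

;; A two-form script, supplied as a string instead of a file.
(define script
  (open-input-string "(define l (ls -l)) (wc -l < l)"))

(define module-form
  (syntax->datum (wrap-as-module 'example script)))
```

The reader only parses and wraps. The meaning of define, ls, and < is decided later, when the module form is expanded with pfsh as its language.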
(module reader syntax/module-reader pfsh)
In short, we just need to add those two lines to our current pfsh implementation, and then save it as "main.rkt" in a "pfsh" directory. Here’s the complete implementation:
#lang racket
(require "run.rkt"
         racket/port
         (for-syntax syntax/parse))

(provide #%module-begin
         (rename-out [pfsh:run #%app]
                     [pfsh:top #%top]
                     [pfsh:define define]
                     [pfsh:datum #%datum]
                     [pfsh:top-interaction #%top-interaction]
                     [pfsh:string-append string-append]))

(module reader syntax/module-reader
  pfsh)

(define-syntax (pfsh:run stx)
  (syntax-parse stx
    #:datum-literals (<)
    [(_ prog arg ... < stream:expr)
     #'(with-input-from-string stream
         (lambda ()
           (pfsh:run prog arg ...)))]
    [(_ prog arg ...)
     #`(void (run prog arg ...))]))

(define-syntax (pfsh:top stx)
  (syntax-parse stx
    [(_ . sym:id)
     #`#,(symbol->string (syntax-e #'sym))]))

(define-syntax (pfsh:define stx)
  (syntax-parse stx
    [(_ stream:id expr)
     #'(define stream
         (with-output-to-string
           (lambda ()
             expr)))]
    [(_ (proc:id arg:id ...) expr)
     #'(begin
         (define (actual-proc arg ...)
           expr)
         (define-syntax (proc stx)
           (syntax-parse stx
             [(_ arg ...) #'(actual-proc arg ...)])))]))

(pfsh:define (pfsh:string-append arg1 arg2)
  (string-append arg1 arg2))

(define-syntax (pfsh:datum stx)
  (syntax-parse stx
    [(_ . s:string) #'(#%datum . s)]
    [(_ . other)
     (raise-syntax-error 'pfsh
                         "only literal strings are allowed"
                         #'other)]))

(define-syntax (pfsh:top-interaction stx)
  (syntax-parse stx
    [(_ . form) #'form]))
You’ll also need "run.rkt" in the same "pfsh" directory.
After either of those steps, you can run
#lang pfsh
(echo Hello!)