17.4 Better Performance through Smarter Compilation

The compilation strategy used in Untyped, Slow Regular Expressions constructs regular expressions as part of the matching process. That is, it calls Racket’s regexp function only after it gets the string to match against. The regexp function, however, is expensive when compared to the cost of actually matching the string, so we would like to be able to call it earlier, in preparation for matching, and save the result to use multiple times. Happily, Racket’s regular expressions are values that can cross phases (just like numbers and booleans), so we can call the regexp function during the macro expansion process and then simply expand into code that contains the regular expression value.

Computing the regular expression at compile time results in about a 30x speed up for this expression.

(time
(for ([x (in-range 100000)])
   (r^match (r^seq (r^range #\a #\d) (r^range #\x #\z)
                   (r^range #\a #\d) (r^range #\x #\z))
            "bycz")))

To achieve this speed up, first go back to the untyped version of the regular expression matcher and focus on to-r^. Instead of a macro that pushes itself down into its arguments, we can turn it into a compile-time function that constructs something similar to the r^ struct, but where the procedure field is a syntax object containing code that will extract the result from the given match directly.

The test cases from the original solution should also all pass here.

Solution

← prev up next →

1	Language-Oriented Programming
2	Macro Expansion
3	Language Extensions via Macros
4	Lab Simple Macros
5	Advanced Racket Macros
6	Lab Syntax Patterns & Classes
7	Lexical Scope, (Un)Hygienic Macros
8	Lab Comfortable Macros
9	Modules, Macros, Languages
10	Lab Languages via Macros
11	More Language Variations
12	Lab Languages and Readers
13	Types and Type Checking
14	Typed Languages with Turnstile
15	Lab My First Turnstile Language
16	Two Techniques
17	Extended Lab Regexps
18	Some Language Gems
19	Closure

17.1	Untyped, Slow Regular Expressions
17.2	A Weak Type System
17.3	A Better Type System
17.4	Better Performance through Smarter Compilation
17.5	Improving Repeat to Produce Lists