4A Pattern Matching and Rule-Based Substitution⁠

Part 1

Recap

Last time we wrote a program to differentiate symbolic mathematical expressions represented using quotation.
To implement the differentiation rules, we wrote them in a highly stylized form based on conditional dispatch on the types of expressions.
Why did we have to translate the rules of differential calculus into the language of the computer? Is there some other, more clear way of writing this program?

Rules

We follow rules to differentiate expressions. What is a rule, exactly?
A rule has a pattern on the left and a skeleton on the right.
We match the source expression to the pattern, and following the rule, we instantiate the skeleton to get the target expression.
We want to build a language that allows us to directly express these rules. We will work bottom-up, like before.

Representation

We can represent the differentiation rules like this:

(define deriv-rules
  '(((dd (?c c) (? v)) 0)
    ((dd (?v v) (? v)) 1)
    ((dd (?v u) (? v)) 0)
    ((dd (+ (? x1) (? x2)) (? v))
     (+ (dd (: x1) (: v))
        (dd (: x2) (: v))))
    ((dd (* (? x1) (? x2)) (? v))
     (+ (* (: x1) (dd (: x2) (: v)))
        (* (dd (: x1) (: v)) (: x2))))
    ((dd (** (? x) (?c n)) (? v))
     (* (* (: n)
           (** (: x) (: (- n 1))))
        (dd (: x) (: v))))))

It could be prettier, but that doesn’t matter. What matters is that we are writing rules directly in our language.
deriv-rules is a list of rules. Each rule has the form (pattern skeleton).
We have invented two concepts for our language:
- The forms beginning with question marks are called pattern variables.
- The forms beginning with colons are called substitution objects.
Once we have this language, we can reuse it for other things. For example, we can write rules for algebraic simplification:

(define algebra-rules
  '((((? op) (?c e1) (?c e2))
     (: (op e1 e2)))
    (((? op) (? e1) (?c e2))
     ((: op) (: e2) (: e1)))
    ((+ 0 (? e)) (: e))
    ((* 1 (? e)) (: e))
    ((* 0 (? e)) 0)))

Syntax

Patterns:

foo matches exactly the symbol foo.
(a b) matches a list whose first element matches the pattern a and whose second element matches the pattern b. This generalizes to lists of any length.
(? x) matches anything, and binds it to x.
(?c x) matches a constant, and binds it to x.
(?v x) matches a variable, and binds it to x.

Skeletons:

foo instantiates to exactly the symbol foo.
(a b) instantiates to a list whose first element is the instantiation of a and whose second element is the instantiation of b. This generalizes to lists of any length.
(: x) instantiates to the value bound to x in the pattern.

Simplification process

We can imagine the rules as a deck of cards, each one with a pattern and a skeleton.
The patterns feed into the matcher, and the skeletons feed into the instantiator.
The matcher gives the instantiator a mapping from pattern variables to values.
The output of the instantiator goes back into the matcher.
All of the rules must be tried on the expression, and on all its subexpressions. We can stop when it no longer changes.
If you don’t write your rules carefully, there is a danger of going into an infinite loop.

Part 2

Matcher

The matcher takes an expression, a pattern, and a dictionary as input. It outputs another, augmented dictionary.

(define (match pat exp dict)
  (cond ((eq? dict 'failed) 'failed)
        ((atom? pat)
         (if (and (atom? exp) (eq? pat exp))
             dict
             'failed))
        ((arbitrary-constant? pat)
         (if (constant? exp)
             (extend-dict pat exp dict)
             'failed))
        ((arbitrary-variable? pat)
         (if (variable? exp)
             (extend-dict pat exp dict)
             'failed))
        ((arbitrary-expression? pat)
         (extend-dict pat exp dict))
        ((atom? exp) 'failed)
        (else
         (match (cdr pat)
                (cdr exp)
                (match (car pat)
                       (car exp)
                       dict)))))

Consider the pattern (+ (* (? x) (? y)) (? y)). We could draw this as a tree.
The expression matching it, (+ (* 3 x) x), has a very similar tree structure.
We want to traverse both trees simultaneously and compare them.
The matcher can fail at any point. The first clause checks and propagates the failure.
If the pattern is an atom, we make sure the expression is the same atom.
If the pattern is an arbitrary expression variable like (? x), we simply add the expression to the dictionary.
If the pattern is an arbitrary constant or variable, we make sure the expression is the correct one before adding to the dictionary.

Instantiator

The instantiator takes a dictionary and a skeleton as input and outputs an expression.

(define (instantiate skeleton dict)
  (define (loop s)
    (cond ((atom? s) s)
          ((skeleton-evaluation? s)
           (evaluate (eval-exp s) dict))
          (else (cons (loop (car s))
                      (loop (cdr s))))))
  (loop skeleton))

We do a recursive tree walk on the skeleton.
It’s simpler than the matcher because the parts of the tree can be instantiated independently (we don’t change the dictionary).
The only reason for loop is to avoid passing dict through every recursive call.
The skeleton evaluation forms like (: x) using evaluate:

(define (evaluate form dict)
  (if (atom? form)
      (lookup form dict)
      (apply (eval (lookup (car form) dict)
                   user-initial-environment)
             (mapcar (lambda (v) (lookup v dict))
                     (cdr form)))))

For atoms (variable names) we simply look them up in the dictionary.
For more complicated expressions, we lookup and evaluate the car (operator) and apply it to the result of looking up everything from the cdr (operands).
This is magic. We don’t haven’t learned about apply and eval yet. We will later.

Part 3

Plan

We’ve seen the two halves of the rule system: the matcher and the instantiator.
Now we’ll create the control structure by which rules are applied to expressions.
We want to apply all of the rules to every node.
We call the basic idea “garbage in, garbage out.”

Simplifier

We want to be able to evaluate (simplifier rules) and get a procedure that simplifies expressions according to rules.
For example, we could then write (define dsimp (simplifier deriv-rules)).

(define (simplifier the-rules)
  (define (simplify-exp exp)
    (try-rules (if (compound? exp)
                   (simplify-parts exp)
                   exp)))
  (define (simplify-parts exp)
    (if (null? exp)
        '()
        (cons (simplify-exp (car exp))
              (simplify-parts (cdr exp)))))
  (define (try-rules exp)
    (define (scan rules)
      (if (null? rules)
          exp
          (let ((dict (match (pattern (car rules))
                             exp
                             (empty-dictionary))))
            (if (eq? dict 'failed)
                (scan (cdr rules))
                (simplify-exp
                 (instantiate (skeleton (car rules))
                              dict))))))
    (scan the-rules))
  simplify-exp)

The procedures simplify-exp and simplify-parts call each other recursively.
Since the exp in simplify-exp can be either atomic or compound, this naturally recurses through a tree of expressions.
We could have just written one procedure, using map:

(define (simplify-exp exp)
  (try-rules
   (if (compound? exp)
       (map simplify-exp exp)
       exp)))

This is a different idiom; it works the same way, but you think about it a bit differently.
It’s better because otherwise we are basically reinventing the wheel by hiding a definition of map in one of our procedures. It is best to extract those common patterns.
The try-rules procedure scans the rules and tries them one by one. It returns the original expression if none matched.
When there’s a match, it instantiates the skeleton and tries to simplify that further.

Complexity

The pattern of recursions here is very complicated.
We shouldn’t try to think about it all at once. Instead, we make a modular system where we can focus on one part at a time.
As long as we know what a procedure application is supposed to do, we can use it without thinking about how it will work.

Highlights

The key to this—very good programming and very good design—is to know what not to think about. —Sussman

Dictionaries

The implementation of the dictionary is pretty simple:

(define (empty-dictionary) '())
(define (extend-dictionary pat dat dict)
  (let ((name (variable-name pat)))
    (let ((v ((assq name dict))))
      (cond ((null? v)
             (cons (list name dat) dict))
            ((eq? (cadr v) dat) dict)
            (else 'failed)))))
(define (lookup var dict)
  (let ((v (assq var dict)))
    (if (null? v) var (cadr v))))

When extending the dictionary, we check if the name is already present.
It it’s not present, we add it. If it’s present but has the same value, we do nothing. Otherwise it has a conflicting value, and we fail.

4A Pattern Matching and Rule-Based Substitution⁠

# Part 1

# Recap

# Rules

# Representation

# Syntax

# Simplification process

# Part 2

# Matcher

# Instantiator

# Part 3

# Plan

# Simplifier

# Complexity

# Dictionaries

Part 1

Recap

Rules

Representation

Syntax

Simplification process

Part 2

Matcher

Instantiator

Part 3

Plan

Simplifier

Complexity

Dictionaries