2B Compound Data
# Part 1
# Recap
In the beginning, we learned about
- primitive forms
- means of combination
- means of abstraction
Then, we learned how to use higher-order procedures to represent general methods of computation. This gave us extraordinary expressive power.
# Layered system
- When we wrote the square root procedure, we used layers of abstraction. Someone else could have written
good-enough?
for us without understanding the rest.
So the crucial idea is that when we’re building things, we divorce the task of building things from the task of implementing the parts. And in a large system, of course, we have abstraction barriers like this at lots, and lots, and lots of levels. —Abelson
- Now we will look at the same issues for data. There are means of combination for data as well, allowing us to combine primitive data into compound data.
- We will also see a methodology for abstraction with data.
- The key idea is to build the system in layers, with abstraction barriers.
# Rational number arithmetic
- We already know how to express the arithmetic operators for fractions in math.
- Adding, subtracting, multiplying, or dividing two fractions produces another fraction.
- The computations are easy—but how to we represent a fraction?
- We need to apply the strategy of wishful thinking: let’s imagine that we have procedures
make-rat
,numer
, anddenom
. - We can implement a procedure for adding rationals like so:
(define (+rat x y)
(make-rat
(+ (* (numer x) (denom y))
(* (numer y) (denom x)))
(* (denom x) (denom y))))
- The procedure
make-rat
is called a constructor. - The procedures
numer
anddenom
are called selectors.
# Why do we need compound data?
- Why bother with these data objects instead of just passing in four numbers?
- We would have to worry about all these temporary numbers.
- It doesn’t scale. It’s confusing. We need abstraction.
- We need to create compound data objects for the same reason that we separate our programs into procedures built on abstractions.
# Part 2
# Pairs
- How do we make one of these “clouds”? We need a kind of glue to connect things.
- Lisp provides this: it is called list structure. It starts with the primitive procedure
cons
. cons
is obviously the constructor.car
andcdr
are the selectors for pairs.- For any
x
andy
,(car (cons x y))
isx
, and(cdr (cons x y))
isy
. - We can represent pairs with two boxes side by side with an arrow coming from each. This is called box-and-pointer notation.
# Lowest terms
- When we use the system to add a half and a quarter, it gives us six eighths instead of three quarters. This isn’t what we want.
- This isn’t the addition procedure’s problem; the
make-rat
procedure should be responsible for reducing to lowest terms. - We can use the greatest common divisor to fix this:
(define (make-rat n d)
(let ((g (gcd n d)))
(cons (/ n g) (/ d g))))
- Now our fractions will always be reduced to lowest terms.
# Abstraction layer
- The important thing with our rational arithmetic system is the abstraction layer.
- We have the rational arithmetic operators on one side and the pair constructor and selectors on the other. The abstraction barrier between the two is formed by the procedures
make-rat
,numer
, anddenom
. - We always want to separate use from representation.
- This methodology is called data abstraction.
# Why use data abstraction?
- We didn’t have to do it this way.
- We could have used
cons
,car
, andcdr
directly in the rational arithmetic procedures. - Is this talk of data abstraction just self-righteous BS?
- Maybe it would be marginally more efficient to skip the data abstraction!
- It goes back to naming. If you have the name of the spirit, you have control over it.
- One advantage is the flexibility to use alternative representations.
- If we use
cons
directly, we would have to reduce to lowest terms every time we make a rational number. Using data abstraction, we can write the code once in the constructor. - Data abstraction lets us postpone decisions.
# Designing systems
See, in general, as systems designers, you’re forced with the necessity to make decisions about how you’re going to do things, and in general, the way you’d like to retain flexibility is to never make up your mind about anything until you’re forced to do it. The problem is, there’s a very, very narrow line between deferring decisions and outright procrastination. So you’d like to make progress, but also at the same time, never be bound by the consequences of your decisions. —Abelson
I said that computer science is a lot like magic, and it’s sort of good that it’s like magic. There’s a bad part of computer science that’s a lot like religion. And in general, I think people who really believe that you design everything before you implement it basically are people who haven’t designed very many things.
The real power is that you can pretend that you’ve made the decision and then later on figure out which one is right, which decision you ought to have made. And when you can do that, you have the best of both worlds. —Abelson
# Part 3
# Data abstraction
- Data abstraction is a way of controlling complexity in large systems.
- Like all ways of controlling complexity, the real power comes from their use as building blocks for more complicated things.
# Points and line segments
- We could also use pairs to represent points on a plane. The constructor and accessors could use
cons
,car
, andcdr
, just like in the rational number system. - We could use points as a building block to make line segments.
- Now we have a multi-layered system: segments, points, and pairs are all separated.
- Without data abstraction, the procedure for calculating the length of a line segment is very hard to read; worse, it locks you into decisions about representation.
cons
can combine anything, not just numbers. With line segments, we combine pairs.- Closure means we can make pairs of pairs, not just pairs of numbers. We say that the means of combination closes over the things that it makes.
# Part 4
# Abstract data and contracts
- We’ve done a few examples. Now we’re going to discuss what it means—much harder.
- Earlier, we assumed the constructors and selectors for rational numbers existed (without knowing about pairs).
- We had defined a rational number representation in terms of abstract data.
- We had a contract that the procedures have to fulfill: given a rational number
x
created with(make-rat n d)
, we must have(= (/ (numer x) (denom x)) (/ n d))
. - This tells us whether three procedures are suitable as a basis for rational numbers.
# Implementation of pairs
- Rational numbers really are just this contract, this axiom.
- They might be realized as pairs in a particular implementation, but that has nothing to do with what pairs really are.
- Pairs are similar: they happen to satisfy the contract that
(car (cons x y))
isx
and(cdr (cons x y))
isy
. - We can implement pairs with procedures. We don’t even need special primitives—all we need are lambdas:
(define (cons a b)
(lambda (pick)
(cond ((= pick 1) a)
((= pick 2) b))))
(define (car x) (x 1))
(define (cdr x) (x 2))
- All we need to do is show that this satisfies the axiom. We can do that with the substitution model. It works.
- You couldn’t tell if
cons
,car
, andcdr
were implemented in this way or not.
# Conclusion
- We don’t need data at all for data abstraction: we can do everything with procedures.
- This blurs the line between code and data.
- Procedures are not just the act of doing something.
- Procedures are conceptual entities or objects.