Sicp 2.4 3.1 Multiple Representation

I’ve continued pretty quickly through SICP, getting through 2.4, 2.5, and 3.1, although I ended up skipping some of the questions at the end of 2.5. This is because I’m unable to test the solutions, as the problems assume that you already have some procedures that are only introduced in chapter 3. I’ll come back to them when I build up the necessary material.

These sections have been really interesting. Sections 2.4 and 2.5 mainly discussed working with multiple representations of the same data. I’ve been thinking of writing various music notation apps, but never could think of a good internal representation of the notes, due to the various rules and conventions in music theory. 2.4 and 2.5 discussed being able to handle different representations. Section 3.1 pulled back the curtains, and finally defined what functional programming was. The first two chapters were done entirely without using any assignments or state! Anyways, the rest of these notes describe some ideas I had about representing music note names.

«< background info (skip to »> for the data representation discussion)

First, some background on music theory. An octave up is usually divided into 12 different pitches. These pitches are named with 7 ‘note’ names are the letters A through G, plus an optional modifier that either raises the pitch to the next pitch (sharp), or lowers the pitch to the previous pitch (flat). All notes are spaced 2 pitches apart (called a whole-tone), except between B and C, and E and F, both of which are spaced only a single pitch apart (called a semi-tone). Using w to represent 2 pitches (whole-tone), and s to represent 1 pitch (semi-tone), starting on C and going towards the next C, we have:

w w s w w w s ^ ^ ^ ^ ^ ^ ^ C D E F G A B (C - octave higher)

There are 7 distinct note names listed above, and the 5 missing pitches are between the notes that are spaced a whole tone apart. To name these 5 pitches, we use the name of one of the neighbouring notes, and modify it with a sharp or flat. For example, the pitch between the C and D could be named either C# (sharp) or Db (flat). Also, note that the following pairs have the same pitches (E#, F), (E, Fb), (B#, C), (B, Cb). Notes that have the same pitch but are spelled differently are called enharmonically equivalent. (This is just way the language of music theory is - I don’t know the history behind it.)

The particular order of whole-tone and semi-tone spacings between the 7 notes in the example above create what is known as the major scale. The above shows that the C major scale can be written with no sharps or flats. Note that we could have named the third note in the scale an Fb instead of an E, but it is conventional to ensure that each note in the scale is named using a different letter (along with the appropriate modifier to retain the correct order of whole-tone and semi-tones). These conventions tell us which enharmonically equivalent note name to use when identifying a pitch.

For example, let’s build the F major scale:

w w s w w w s F G A Bb C D E F

The 4th note in the scale was named Bb, instead of A#, because the letter A was already used by another note in the scale. If you work this out starting on all 12 pitches, you’ll see that in some cases, the pitches are labelled using sharps and others using flats.

end background info

So, how can we represent notes programmatically? In code, I’d like to be able to think in terms of music theory using the appropriate spellings of scales (sharp or flat note names). For example, I’d like to be able to write something like ‘get-third(A)’ to get the third note in the A major scale. If we just represented each of the 12 pitches by the numbers 1 to 12, the definition of get-third(A) would contain rules for each possible input. Sections 2.4 and 2.5 described two different approaches for multiple representations for the same data - data-driven programming and message-passing. I haven’t exactly thought out how to use data-driven programming for this task, but some ideas are to automatically label each note with its own scale. For example, A would be labelled as part of the A major scale. When get-third(A) is called, the data-driven program looks up the code for the A major scale, and returns the third of A. Of course, the rules would still have to be programmed, but it would be a much cleaner organization of code than putting the rules for all scales within a single get-third procedure.

While going through the exercises, I noticed that data-driven programming was very similar (or maybe equivalent?) to using polymorphism in OOP languages like Java. You define a generic interface, then write concrete implementations for different data types.

Sections 2.4 and 2.5 only briefly introduced message passing, but it seems like it would be a more straightforward way to implement get-third. Also, now I know why Ruby always talks about objects receiving messages - it’s all based on message-passing! To use message-passing, we’d just add the appropriate methods to each note name. So for the note A, define its get-third method to return C#. We’d have to do this for all notes, and for all intervals (e.g. get-second, fourth, etc..). However, using this approach, we could easily construct scales and chords by just calling get-second, get-third, get-fourth, etc… Each of the intervals in a scale would be captured in code. This should be easy to implement in Ruby.