Beat the Puzzle Master Home
Beat the Puzzle Master!
A web site dedicated to solving the NPR Sunday Puzzle

Puzzle November 16, 2008

Podcast of the original challenge

Sure Bet You'll Know

Didn't get enough interactive puzzling Sunday? Try these
The letters S-B stand for seat belt. They also stand for some other familiar two word phrases. See if you can get them from the clues.


Clue:  
Your answer?

 

How to Solve Will's Puzzle With Code

Name a famous author whose last name starts with the letter C. Cross out four letters in it. The remaining letters, in order, will name another famous author also starting with C. Who are these two writers?

As it turns out, I was able to identify eleven author pairs that match each other in this way! And it was pretty straightforward too.

Algorithm Summary

Author
  • Build a dictionary of authors, using their last name as a key
  • A dictionary (a type of hash table in .NET) is quite handy because it is very efficient at performing lookups, and also because we can use it to store additional information, in this case the author's full name.
  • Look at each author (iterate the dictionary) and construct candidate variatons according to Will's instructions.
  • Check whether any candidate is in our authors dictionary, if so, we have a solution!

The only tricky part here is constructing the last name variations. I looked at a number of possible techniques, including recursion, before I realized I could simply use four nested loops. Each loop controls the position of a character that must be removed (blocked). Every inner loop starts just after its parent's value, and runs as far as it can before bumping into its child loop (or the end of the string for the innermost loop).

Nested Loops Control Your Blocking Positions

Blocking positions

The illustration at left shows the first nine variations for the name 'Campbell'. p1..p4 are the positions that are blocked when constructing a new variation. Observe how p4 starts one position higher than it's containing loop (the p3 loop) and goes to the end. Meanwhile, p3 starts one higher than p2. In the picture, p1 and p2 haven't had a chance to move yet.

In case it isn't obvious, the unblocked letters are what we use to form our new variation.

Interpretation:

  • letter c is blocked every time above, because p1 hasn't moved yet
  • letter a is blocked each time in the picture, because p2 hasn't had a chance to move
  • p3 blocks the letter 'm' the first 5 iterations, then blocks the letter 'p' after p4 has finished its first pass
  • p4 is blocking a variety of letters because it is moving in the innermost loop

A code sample follows which is probably easier to read than my explanation!

//p1..p4 are positions to block in the current variation of our last name
//Each is controlled by a nested loop so all possibilities are iterated        
for (int p1 = 0; p1 < lastName.Length - 3; p1++) {
    for (int p2 = p1 + 1; p2 < lastName.Length - 2; p2++) {
        for (int p3 = p2 + 1; p3 < lastName.Length - 1; p3++) {
            for (int p4 = p3 + 1; p4 < lastName.Length; p4++) {
                //Use p1..p4 to construct a variation:
                CopyBlocked(p1, p2, p3, p4, lastName, variations);
                //look in the dictionary:
                string candidate = new string(variations);
                if (authorDic.ContainsKey(candidate)) {
                    //Score!
                    lstResults.Items.Add(lastName + " - " 
                                + authorDic[candidate]);
                }
            }
        }
     }   
}

Note that CopyBlocked is a method I wrote (not shown above) which utilizes p1..p4 to create a new variation. Meanwhile, lstResults is the name of a listbox that displays the results.

Note that this code will generate duplicates if the last name contains duplicated letters. The impact on efficiency is negligible, so I leave it as an exercise for the reader to construct a method to suppress the duplicates in the output.

Algorithm Analysis (for Advanced Readers)

Analysis

In terms of algorithm analysis, the main loop is O(n). (Take a look at the Wikipedia article on "Big O Notation" if you have no idea what I'm talking about here.) That's pretty speedy, though I really can't take much credit for it, since I'd almost have to be a dunce to make it slower! It makes exactly one pass through the data set, and since I used a dictionary to look-up author names (which runs in O(1) time), no significant cost is incurred for the lookup.

Calculating the variations on an author's name is somewhat interesting. Strictly speaking, it runs in O((n-4)4) time, where n is considered to be the name length. 'Interesting' because the length of the names is short (relative to the length of the list they are in), making it debatable whether this additional processing is worthy of consideration. At any rate, since the author data set is small, the overall question of algorithm speed is academic here.

Getting the Author List

Once again, Wikipedia provides the best place to get lists. As a matter of fact, they have a page dedicated to authors whose name begins with the letter c: http://en.wikipedia.org/wiki/List_of_authors_by_name:_C

As usual, I built the data set using my standard techniques of:

  • Writing a spider to scrape the Wikipedia screen
  • Matching author names via a regular expression to extract them from the page markup, and
  • Employing some simple File IO to write the regular expression matches to a file we can use.
If you're curious about the regular expression I utilized, download my C# code and take a look.

Solve the Problem Now!

Name a famous author whose last name starts with the letter C. Cross out four letters in it. The remaining letters, in order, will name another famous author also starting with C. Who are these two writers?

By clicking the button, you will run a small javascript program in your browser to solve the problem. It needs to fetch the author list from the server, and then it should solve the problem in a few seconds. If you're curious, inspect the page source code! (Search for "function dataFetched".)

Note that my JavaScript implementation is different than my C# implementation, mainly because JavaScript doesn't have a Dictionary class, and I didn't feel like implementing my own. So I used an array with a binary search instead.

That works fine except for the existence of duplicated last names; note that I am using last name as a search value for my binary search. The upshot: my C# solution identifies 11 pairs, and my JavaScript only identifies 7 pairs. The difference is wholly due to the problem of duplicated last names.



Take-Home Challenge!

See if you can write your own code to solve a similar problem. If you like, you can modify my code to accomplish this purpose; in any event, if you use my download, you can use my text files for the author list.

See if you can find fourteen authors that whose names start with 'S' and have the same property as the 'C-authors'.




Reader Comments

DaveMc says:
Hi Randy, I just stopped by your site to take a look. I enjoyed your coding solution to solve the author puzzle...another clever coding feather in your cap. Nice job!
Posted 2 year(s) ago

Add a Comment Show comment controls
Comment Details
Your Comment:

Name:
Password:
Your Name*:
Password*:
Confirm*:
Email*: (will not be published!)
  Your Gravatar image
Website?
About Me: