How to Solve the Puzzle With Code
Name a famous actress with four letters in her first name and five letters in her last name.
Drop the last letter of her first name and the last two letters of her last name.
The remaining letters, in order, will name a well-known world capital. Who is the
actress and what is the capital?
Slam dunk on this one! It nicely raised my self-esteem, which was somewhat
battered by the puzzle two weeks ago.
The code is almost easier than the explanation! (But if you prefer to read
English, read on; my explanation follows.)
foreach (Actress anActress in actressList) {
    // Make sure the names are the correct length:
    if (anActress.FirstName.Length == 4 &&
        anActress.LastName.Length == 5) {
        // Chop off the letters according to Will's rules:
        string candidate = anActress.FirstName.Substring(0, 3)
                         + anActress.LastName.Substring(0, 3);
        // Now check if the candidate is a capital:
        if (capitalsDic.ContainsKey(candidate)) {
            // Score!
            lstResults.Items.Add(candidate + "-" + anActress.ToString());
        }
    }
}
In English: look at each actress and, if her name has the right number of letters,
build a candidate capital by chopping off the appropriate trailing letters and concatenating
what remains. If the result is contained in our dictionary, we've solved the puzzle!
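Solving this one involved three tasks: getting a list of actresses, getting a list of
world capitals, and writing the logic that compares the two.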
My Simple Logic
Getting the two lists is pretty straightforward, so first let's talk about what we do with
them. In code (not shown yet), I use those two lists to build 1) a dictionary of capitals and 2) a
list of Actress objects.
Why a dictionary? In .NET, a dictionary is a kind of hash table, which basically means that
when I check whether a capital is in my dictionary, the lookup takes roughly the same short
time no matter how many capitals it holds.
Your code won't impress your girlfriend if all she sees is an hourglass while
it solves the puzzle, so efficiency makes a difference!
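For readers without the download handy, here is a minimal sketch of how such a dictionary could be built. The CapitalLookup class and its Build method are hypothetical names used only for illustration, and the sample code may well do it differently. One assumption deserves a loud label: if the names are stored with normal capitalization, the candidate has an upper-case letter in the middle (the first letter of the last name), so this sketch makes the dictionary ignore case.

```csharp
using System;
using System.Collections.Generic;

public static class CapitalLookup
{
    // Hypothetical helper: builds a case-insensitive dictionary keyed by capital name.
    // The value is unused by the puzzle logic; only the key lookup matters.
    public static Dictionary<string, string> Build(IEnumerable<string> capitals)
    {
        var dic = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string capital in capitals)
        {
            string name = capital.Trim();
            if (name.Length > 0)
            {
                // Indexer assignment tolerates duplicate entries in the input.
                dic[name] = name;
            }
        }
        return dic;
    }
}
```

With this in place, CapitalLookup.Build(new[] { "Paris", "Lima" }).ContainsKey("paRis") is true, because the comparer ignores case.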
'Actress' objects - OK, I didn't really need a class for them, but I think you
can read my code more easily because of it. My Actress class merely contains a FirstName,
a LastName, and a MiddleName. Oh, and some minor convenience stuff like ToString() and
a way for one to build itself from a string. (Download the sample code for details.)
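As a rough idea of what such a class might look like, here is a sketch. Only the three name properties and ToString() come from the description above; the Parse method name and the rule of splitting on spaces (first word, last word, everything in between as the middle name) are assumptions made for illustration, not the code in the download.

```csharp
using System;

public class Actress
{
    public string FirstName { get; set; } = "";
    public string MiddleName { get; set; } = "";
    public string LastName { get; set; } = "";

    // Hypothetical factory: splits "First Last" or "First Middle Last" on spaces.
    public static Actress Parse(string fullName)
    {
        string[] parts = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var actress = new Actress();
        if (parts.Length > 0) actress.FirstName = parts[0];
        if (parts.Length > 1) actress.LastName = parts[parts.Length - 1];
        // Anything between the first and last word is treated as the middle name.
        if (parts.Length > 2) actress.MiddleName = string.Join(" ", parts, 1, parts.Length - 2);
        return actress;
    }

    public override string ToString()
    {
        return MiddleName.Length > 0
            ? FirstName + " " + MiddleName + " " + LastName
            : (FirstName + " " + LastName).Trim();
    }
}
```

For example, Actress.Parse("Jane Q Public") yields a FirstName of "Jane", a MiddleName of "Q" and a LastName of "Public".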
If you read the code sample above, you can see that I have a list of Actress
objects that I use.
Once I have my capitals dictionary (named 'capitalsDic' in the code above) and my actress
list (named 'actressList' above), finding a solution is a simple matter of looking at each actress
and checking whether we can build a capital from her name.
Getting The Lists
If you're a regular reader, you already know how I 1) scraped the screen
at Wikipedia (if not, read this:
Write Your Own Spider) and 2) saved the results to a file (primer:
Simple File IO). The only new part is the regular
expressions I used to isolate actress names in a pile of HTML gibberish!
Regex for Capitals
Trust me, the regex below is extremely simple! I literally built and tested it
in less than 10 minutes. It only looks nasty because it has a bunch of HTML literals in
it. I need those literals for a good reason, which I will explain shortly. What's
that - you say my regular expression is ugly? Remember, doing the same
task without a regex would take a huge amount of code.
In simple terms, this regular expression looks for the text embedded in a link,
which, in turn, is inside the last table cell of a table row. The actress version
further down does the same thing with a list item tag: the original page contains
many links, but only the actress links are inside <li> tags.
The other little snag is that each link has its own URL and its own title, so I modified
my regex so it basically accepts any value for those two attributes. To accomplish that, I used this little
expression: [^"]*.
What is this nasty looking thing? Merely a simple list of unacceptable characters:
- Anything inside [] brackets is a list of acceptable characters, called a character class
- The caret symbol ^ means not, so [^"] means: no quotes allowed!
- Asterisk means zero or more of the preceding item
- In short, my little expression says 'I'll take anything except the close quote'!
<td><a #literal match against html (table cell, link)
\s #whitespace
href="/wiki/[^"]*" #literal match except page name
\s #whitespace
title="[^"]*"> #literal except title
([^<]*) #Our group - anything up to the angle bracket
</a></td> #literal match
\n? #line feed
</tr> #literal - close of table row
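To see the [^"]* trick on its own, away from all the HTML literals, here is a tiny sketch. The NegatedClassDemo class and FirstTitle method are hypothetical names invented for this illustration; the only real API used is Regex.Match from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Returns the text between the quotes of the first title="..." attribute,
    // or an empty string if the input has no title attribute.
    public static string FirstTitle(string html)
    {
        // [^"]* means: zero or more characters that are not a double quote.
        Match m = Regex.Match(html, "title=\"([^\"]*)\"");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Given the input <a href="/wiki/Some_Page" title="Some Page">Some Page</a>, FirstTitle returns Some Page: the match simply stops at the first quote it is not allowed to swallow.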
For more, refer to one of the many online Regex cheat sheets, such as
this one. It is easy to pick and choose, and just experiment, when you are getting
started.
Regex for Actresses
My regular expression for actresses is almost identical, so I won't bore you with details:
<li><a #Match literal HTML
\s #Whitespace
href="/wiki/ #Match more literal HTML
[^"]* #Match anything except the close quote for the URL
" #Literal match
\s #Whitespace
title=" #Match HTML literally
[^"]* #Anything except the close quote for the title
"> #Literal match
([^<]*) #The group we want-anything up to the angle bracket
</a></li> #More literal HTML
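If you want to try the commented pattern as written, one way is .NET's RegexOptions.IgnorePatternWhitespace, which ignores layout whitespace and treats # as the start of a comment. The sketch below is a guess at how the pieces could be wired together, not necessarily how the download does it; the ActressScraper class and ExtractNames method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ActressScraper
{
    // The commented pattern from the article. In a verbatim C# string,
    // a double quote is written as two double quotes.
    private const string Pattern = @"
        <li><a          # match literal HTML
        \s              # whitespace
        href=""/wiki/   # match more literal HTML
        [^""]*          # anything except the close quote for the URL
        ""              # literal match
        \s              # whitespace
        title=""        # match HTML literally
        [^""]*          # anything except the close quote for the title
        "">             # literal match
        ([^<]*)         # the group we want: anything up to the angle bracket
        </a></li>       # more literal HTML
    ";

    // Returns the link text of every <li><a ...>...</a></li> item in the page.
    public static List<string> ExtractNames(string html)
    {
        var names = new List<string>();
        // IgnorePatternWhitespace lets the pattern keep its layout and # comments.
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnorePatternWhitespace))
        {
            names.Add(m.Groups[1].Value);
        }
        return names;
    }
}
```

Each returned string is the text of one list-item link, ready to be turned into an Actress object. A link that is not directly inside an <li> tag is skipped, which is exactly why the HTML literals are in the pattern.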
For full details, download my .NET sample code here: download. If you have
solved the puzzle using a language other than C#, leave a comment; I am
considering adding a section for users to submit their own solutions.
Take Home Challenge!
See if you can write some code to accomplish this task: identify a former football player whose name
can be used to form a world capital by taking the first three letters of his given name, then the first
two letters of his last name, and then appending the first letter of his given name.
You can use my list of famous Americans in the download code for my
Sept. 7 Puzzle.