Jumble Solver
I wrote a program to help solve Jumble word puzzles.
Each daily Jumble has several small sub-puzzles and one more difficult final puzzle. The sub-puzzles involve finding a word or words that match a set of 4 to 7 letters. That is, you’re given a set of letters, say “GEEINN”, and you must find the correctly spelled word(s) that contain all those letters. There are usually 4 to 7 of these scrambled-word anagram sub-puzzles. The example above has 6 six-letter sub-puzzles.
The final puzzle uses selected letters (circled in the above example) of the solved sub-puzzles. All of the selected letters go into one or more words that answer a question, or complete the punchline of a joke, posed in the cartoon associated with the day’s puzzle.
Data Structure
My solver depends on one data structure: a dictionary of arrays of strings, keyed by a string.
Since I wrote it in Go, that data structure is: type Dictionary map[string][]string. I wrote an earlier version in Python, and the type was a dict, but a hashtable in plain C would work, as long as the “value” was char *[], an array of strings.
On startup, the program reads in a file of words to make into the dictionary of arrays of strings, but it also reads in a file of words to throw out. For each word read in, for example “range”, the program checks whether the word is in the list of words to throw out, and discards it if so. If it keeps the word, the program creates an anagram by alphabetizing the letters in the word: “aegnr”. The program checks if the alphabetized anagram (“aegnr”) appears as a key in the dictionary. If it does, the program adds the word to the array of words associated with that key. If not, the program creates an array with the new word as the first element, then adds the array to the dictionary under the alphabetized anagram as key.
The dictionary ends up getting used to solve both the sub-puzzles and the main puzzle.
I use the Linux file /usr/share/dict/words as the file full of words, and I’ve developed another file with strings that appear in the file words that really aren’t “words”. The Linux file /usr/share/dict/words is usually used as input for a spell-checker, so it has “words” like “xvi”, Roman numerals for 16.
My Jumble Solver accepts a file of “stop words”, words to ignore that might appear in the dictionary file. This is a general problem, not just one with /usr/share/dict/words. You don’t want “the”, “a” and “an” cluttering up possible main puzzle solutions: the Jumble authors often provide them, as with “the” above.
Solving the single-word sub-puzzles
The Jumble Solver solves an individual sub-puzzle by alphabetizing the letters in a sub-puzzle. The letters for the first sub-puzzle are “DFLIED”. Alphabetized, the letters would be “DDEFIL”.
It turns out that “FIDDLE” is the only word that has “DDEFIL” as an anagram, but some anagrams match multiple words. “AEMT” has “TEAM”, “MEAT” and “META”, for example.
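Solving a sub-puzzle is a single map lookup once the letters are alphabetized. The sketch below assumes a tiny hand-built dictionary, and the function name unjumble is invented for the example, not taken from the repository.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Dictionary maps an alphabetized anagram to the words it can spell.
type Dictionary map[string][]string

// sortLetters (hypothetical helper) alphabetizes the letters of s.
func sortLetters(s string) string {
	letters := []rune(s)
	sort.Slice(letters, func(i, j int) bool { return letters[i] < letters[j] })
	return string(letters)
}

// unjumble (hypothetical name) returns every dictionary word that uses
// exactly the scrambled letters. A missing key gives a nil slice.
func unjumble(dict Dictionary, scrambled string) []string {
	return dict[sortLetters(strings.ToLower(scrambled))]
}

func main() {
	// A hand-built dictionary standing in for the real word file.
	dict := Dictionary{
		"ddefil": {"fiddle"},
		"aemt":   {"team", "meat", "meta"},
	}
	fmt.Println(unjumble(dict, "DFLIED"))    // [fiddle]
	fmt.Println(unjumble(dict, "AEMT"))      // [team meat meta]
	fmt.Println(len(unjumble(dict, "ZZZZ"))) // 0
}
```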
My program has a rather ungainly input (I can’t figure out a better way):
In the image above, the Jumble Solver has performed the first level of solving: each of the 6 sub-puzzles has exactly 1 matching word. The current Jumble writers apparently work to have sub-puzzles with a single solution, although occasionally they include a sub-puzzle with multiple solutions.
The consequence of sub-puzzles with multiple solutions is multiple selections of letters from which to form the Jumble’s main puzzle solution.
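One way to picture this: every combination of candidate words yields its own string of selected letters. The following sketch is an assumption about how that could be computed, not the author's code. The SubPuzzle type, the letterSets function, and the 0-based circled positions are all hypothetical, and the words are assumed to be ASCII.

```go
package main

import (
	"fmt"
	"sort"
)

// SubPuzzle is a hypothetical type: the candidate solutions for one
// sub-puzzle plus the 0-based positions of its circled letters.
type SubPuzzle struct {
	Candidates []string
	Circled    []int
}

// letterSets (hypothetical name) returns one alphabetized string of
// selected letters per combination of candidate words, without duplicates.
func letterSets(puzzles []SubPuzzle) []string {
	sets := []string{""}
	for _, p := range puzzles {
		var next []string
		for _, prefix := range sets {
			for _, word := range p.Candidates {
				picked := prefix
				for _, pos := range p.Circled {
					picked += string(word[pos]) // assumes ASCII words
				}
				next = append(next, picked)
			}
		}
		sets = next
	}
	seen := make(map[string]bool)
	var result []string
	for _, s := range sets {
		b := []byte(s)
		sort.Slice(b, func(i, j int) bool { return b[i] < b[j] })
		if key := string(b); !seen[key] {
			seen[key] = true
			result = append(result, key)
		}
	}
	return result
}

func main() {
	puzzles := []SubPuzzle{
		{Candidates: []string{"fiddle"}, Circled: []int{0, 5}},
		{Candidates: []string{"team", "meat"}, Circled: []int{0}},
	}
	// "fiddle" contributes f and e; "team" gives t, "meat" gives m.
	fmt.Println(letterSets(puzzles)) // [eft efm]
}
```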
Solving the big, multi-word main puzzle
The solution to a Jumble is a word or words composed from selected letters of the sub-puzzles. Once the Jumble Solver has determined the set(s) of selected letters, the human user inputs the number of words in the solution, and the size of each word in characters. In the example, that is 2 words, of 11 and 4 characters.
This is the part that distinguishes my Jumble Solver from others found on the web. Most or all of the other Jumble helpers give suggestions for words that match a single sub-puzzle’s letters. My Jumble Solver can suggest possible answers to the Jumble’s ultimate solution.
The 15 selected letters from the 6 sub-puzzles are:
a a e e e i l l m n s s s t y
Because Jumble Solver found only one word for each sub-puzzle, it has only 1 set of 15 selected letters. For each set of selected letters, the Jumble Solver creates alphabetized anagrams of each possible 11-letter and 4-letter subset of the selected letters. It looks up those alphabetized anagrams in the main data structure. If the Jumble Solver finds word(s) matching the alphabetized anagrams, it has a solution to the main puzzle. It usually finds a number of solutions, sometimes in the hundreds. This sounds worse than it is: most alphabetized anagrams have no dictionary words associated.
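A rough sketch of that subset search follows. It is one possible approach under stated assumptions, not the repository's implementation: the function names subsets and solve are invented, the letters must already be alphabetized, and the three 4-letter words in the sample dictionary are illustrative choices. Because picks are taken in order from an alphabetized string, every picked subset is already an alphabetized anagram and can be used as a dictionary key directly.

```go
package main

import "fmt"

// Dictionary maps an alphabetized anagram to the words it can spell.
type Dictionary map[string][]string

// subsets (hypothetical name) returns each distinct way to pick k letters
// from an alphabetized string, as {picked, rest} pairs, both alphabetized.
func subsets(letters string, k int) [][2]string {
	var out [][2]string
	seen := make(map[string]bool)
	var walk func(i int, picked, rest []byte)
	walk = func(i int, picked, rest []byte) {
		if len(picked) == k {
			if p := string(picked); !seen[p] {
				seen[p] = true
				out = append(out, [2]string{p, string(rest) + letters[i:]})
			}
			return
		}
		if len(letters)-i < k-len(picked) {
			return // not enough letters left to finish this pick
		}
		walk(i+1, append(picked, letters[i]), rest) // take letter i
		walk(i+1, picked, append(rest, letters[i])) // leave letter i
	}
	walk(0, nil, nil)
	return out
}

// solve (hypothetical name) returns every phrase whose words have the
// given sizes and that together use all of the letters.
func solve(dict Dictionary, letters string, sizes []int) [][]string {
	if len(sizes) == 0 {
		if letters == "" {
			return [][]string{{}}
		}
		return nil
	}
	var phrases [][]string
	for _, split := range subsets(letters, sizes[0]) {
		words := dict[split[0]]
		if len(words) == 0 {
			continue // most anagrams match no dictionary word
		}
		for _, tail := range solve(dict, split[1], sizes[1:]) {
			for _, w := range words {
				phrases = append(phrases, append([]string{w}, tail...))
			}
		}
	}
	return phrases
}

func main() {
	// Sample entries only; the 4-letter words are illustrative choices.
	dict := Dictionary{
		"aeeillnssty": {"essentially"},
		"aems":        {"same", "seam", "mesa"},
	}
	for _, phrase := range solve(dict, "aaeeeillmnsssty", []int{11, 4}) {
		fmt.Println(phrase) // three phrases, each starting with "essentially"
	}
}
```

Deduplicating on the picked string keeps repeated letters, such as the three s's here, from producing the same lookup many times.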
This is a good Jumble for my solver: only 2 words in the solution, an 11-letter word and a 4-letter word, each with more than 3 letters. The 4-letter word illustrates what happens when a single alphabetized anagram matches more than one dictionary word. The Jumble Solver finds only one 11-letter word that matches the specified letters from the sub-puzzles, but it finds three 4-letter words that match “AEMS” as an anagram.
Running the Jumble Solver
- Clone the GitHub repo:
git clone https://github.com/bediger4000/jumblesolver.git
- Compile it:
cd jumblesolver; go build runserver.go
- Execute it:
./runserver -v -d /usr/share/dict/words -s ./stopwords.dat
Now switch to Firefox on the same machine and open http://localhost:8012
You’ll need a Jumble from the newspaper or the web site linked above to try it.
Don’t run this facing The Internet. I wrote it without regard to security at all. It doesn’t even do HTTPS.
Select the number of sub-puzzles, enter their letters, and check off the selected (circled in newspaper Jumbles) letters.
Click Unjumble Words to solve the sub-puzzles.
Enter the number of words in the main puzzle, and their sizes.
Click Solve and enjoy the benefits of artificial intelligence.