The "difficulty" of the past two Guardian daily crosswords
Defining the difficulty of a crossword automatically is an extremely hard problem in general, but one dimension of it that we can easily measure is how unusual the vocabulary in the answers is.
To take this approach we need some way to score each word that might appear in the crossword. One option would be to count how many times each word appears in Project Gutenberg (or some other large text corpus), but this doesn't really reflect the difficulty of words in crosswords. For example, because of its useful checking letters the word "OKAPI" is one of the most frequently used in the Guardian crossword (see below for the top 25) but would occur relatively rarely in most texts. So, for the graphs below, I'm measuring the easiness of a word by how often it has previously appeared in the Guardian crossword. The distribution of words according to this score is quite interesting: about 35% of the answers in the Guardian crossword hadn't been clued in the previous 10 years.
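As a rough illustration of this scoring, here is a minimal Python sketch. It assumes the archive of previous answers is available as a plain list of strings; the function names (`normalise`, `build_easiness`, `score_puzzle`) and the sample data are invented for the example and are not the code behind this page.

```python
from collections import Counter


def normalise(answer):
    """Upper-case and trim an answer so 'okapi' and 'OKAPI ' count as one word."""
    return answer.strip().upper()


def build_easiness(past_answers):
    """Count how often each answer appeared in earlier puzzles."""
    return Counter(normalise(a) for a in past_answers)


def score_puzzle(puzzle_answers, easiness):
    """Return (answer, score) pairs; answers never seen before score 0."""
    return [(normalise(a), easiness[normalise(a)]) for a in puzzle_answers]


# Made-up archive, for illustration only.
archive = ["EXTRA", "okapi", "EXTRA", "STYE", "OKAPI", "EXTRA"]
easiness = build_easiness(archive)
scored = score_puzzle(["Extra", "Okapi", "Zymurgy"], easiness)
```

A `Counter` returns 0 for a missing key, so an answer that has never been clued before gets the lowest possible easiness without any special handling.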
The graphs below show on the Y-axis the proportion of words in one of three sets: this particular crossword, all crosswords by that setter, or all Guardian crosswords. The X-axis shows the "easiness" of the words as defined above, that is, the number of occurrences in the Guardian crossword overall. In other words, flatter graphs represent puzzles with easier vocabulary for the experienced crossword solver.
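One way to turn a list of easiness scores into the points of such a graph is sketched below in Python. This is an assumption about the presentation rather than a description of how the graphs here are drawn: it takes the most literal reading, a plain distribution with no binning or smoothing, and the function name `easiness_distribution` and the sample scores are hypothetical.

```python
from collections import Counter


def easiness_distribution(scores):
    """Map each easiness score to the proportion of answers that have it."""
    total = len(scores)
    counts = Counter(scores)
    # Sorted by score so the result can be plotted left to right.
    return {score: counts[score] / total for score in sorted(counts)}


# Invented scores for the answers of a single puzzle.
puzzle_scores = [0, 0, 1, 3, 3, 3, 7, 12]
distribution = easiness_distribution(puzzle_scores)
```

The same function could be applied to the scores of every answer by one setter, or of every answer in the archive, to get the other two curves.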
The next two graphs are updated each day shortly after 3am and 9am. If it's at least a day since the publication of the daily crossword then you will be able to hover your mouse over the yellow graph to show the words which make up the data for that point.
Guardian Crossword 24926 by Rover on 05 February 2010
Guardian Crossword 24925 by Orlando on 04 February 2010
Compare the "difficulty" of setters
You can compare the "difficulty" curves (as explained above) of the top 20 Guardian crossword setters using the checkboxes below. By this metric, the setter with the most surprising vocabulary is Bunthorne (the late Bob Smithies), while the setter with the least surprising vocabulary is Chifonie.
Araucaria
Rufus
Paul
Gordius
Shed
Bunthorne
Chifonie
Orlando
Rover
Taupi
Pasquale
Quantum
Logodaedalus
Janus
Brummie
Enigmatist
Brendan
Audreus
Crispa
Mercury
Auster
The most often clued words or phrases in the Guardian crossword
The following are the most often clued words in the Guardian daily and prize crosswords. This is a marvellous list, I think. The popularity of the words seems to be largely a function of:
- Containing a common pattern of checked letters that few other words fit (EXTRA, STYE)
- Having useful but unusual synonyms (e.g. OUNCE (cat), STUD (boss), EGGSHELL (finish, paint, china))
I have been very careful on this page to present only data which I hope the setters and the editor wouldn't be upset to see posted here, so I have omitted the clues for these words. However, I will say that the list of clues for the top word (EXTRA) is remarkable for its incredible variation and inventiveness. As someone who struggles to set even a single good clue for a given word, I'm incredibly impressed by this...
Number of Occurrences | Answer |
---|---|
38 | EXTRA |
25 | STUD |
25 | STYE |
24 | ISLE |
24 | ANON |
24 | ECHO |
23 | ESTATE |
23 | STUN |
23 | BLUE |
23 | OUNCE |
22 | ISSUE |
22 | REIGN |
21 | UNIT |
21 | ERROR |
21 | EDGE |
21 | ETERNAL |
21 | USED |
20 | ADDRESS |
20 | RATIO |
20 | NIECE |
20 | ACHE |
20 | SCAR |
20 | ARCH |
20 | ERATO |
20 | IRIS |
The following image is a Wordle visualization of the most frequently clued words, generated using data between 1995 and 2009:
Number of crosswords set by the most frequent setters
The bar graph below shows how many crosswords have been set by each of the 20 most prolific setters, both in the daily and prize crosswords. (If we limit this to just daily crosswords then Rufus is the top setter, since Araucaria sets a much larger proportion of the prize crosswords than anyone else.)
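The counting behind a chart like this can be sketched in a few lines of Python. The record layout (a `(setter, kind)` pair per puzzle), the function name `puzzles_per_setter` and the sample records are all assumptions made for illustration; the numbers are invented and do not reflect the real totals.

```python
from collections import Counter


def puzzles_per_setter(puzzles, kinds=("daily", "prize")):
    """Count puzzles per setter, restricted to the given puzzle kinds."""
    return Counter(setter for setter, kind in puzzles if kind in kinds)


# Invented records, one (setter, kind) pair per puzzle.
puzzles = [
    ("Araucaria", "prize"),
    ("Araucaria", "prize"),
    ("Araucaria", "daily"),
    ("Rufus", "daily"),
    ("Rufus", "daily"),
    ("Paul", "daily"),
]

overall = puzzles_per_setter(puzzles)
daily_only = puzzles_per_setter(puzzles, kinds=("daily",))
```

Restricting `kinds` to daily puzzles is what lets the ranking change between the two views, as described for Rufus and Araucaria above.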