Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language. (arXiv:2103.01242v1 [cs.CL])

Current NLP datasets targeting ambiguity can be solved by a native speaker
with relative ease. We present Cryptonite, a large-scale dataset based on
cryptic crosswords, which is both linguistically complex and naturally sourced.
Each example in Cryptonite is a cryptic clue: a short phrase or sentence with a
misleading surface reading. Solving it requires disambiguating semantic,
syntactic, and phonetic wordplay, as well as world knowledge. Cryptic clues
pose a challenge even for experienced solvers, though top-tier experts can
solve them with almost 100% accuracy. Cryptonite is challenging for
current models: T5-Large fine-tuned on 470k cryptic clues achieves only 7.6%
accuracy, comparable to that of a rule-based clue solver (8.6%).



