There are now many computer programs for automatically determining the sense of a word in context (Word Sense Disambiguation or WSD). The purpose of SENSEVAL is to evaluate the strengths and weaknesses of such programs with respect to different words, different varieties of language, and different languages.
The first SENSEVAL took place in the summer of 1998 for English, French, and Italian, culminating in a workshop held at Herstmonceux Castle, Sussex, England on September 2-4.
SENSEVAL-2 is now over!
We evaluated word sense disambiguation systems on three types of task over 12 languages. In the "all-words" task, systems were evaluated on almost all of the content words in a sample of texts. In the "lexical sample" task, we first sampled the lexicon, then found instances of the sampled words in context, and evaluated systems on those instances only. In the "translation" task (Japanese only), senses corresponded to distinct translations of a word into another language. The all-words and lexical sample tasks covered the following languages:
|Task|Languages|
|---|---|
|All-words|Czech, Dutch, English, Estonian|
|Lexical sample|Basque, Chinese, Danish, English, Italian, Japanese, Korean, Spanish, Swedish|
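The difference between the all-words and lexical sample tasks comes down to which tokens count as test instances. The following is a minimal sketch of that selection step only, not the SENSEVAL-2 tooling: the toy sentence, the part-of-speech tags, the `CONTENT_POS` set, and both function names are assumptions made for illustration. The sketch also selects every content word, whereas the actual all-words task covered almost all of them.

```python
# Hypothetical toy data: (token, lemma, part-of-speech) triples for one text.
TEXT = [
    ("The", "the", "DET"),
    ("bank", "bank", "NOUN"),
    ("raised", "raise", "VERB"),
    ("interest", "interest", "NOUN"),
    ("rates", "rate", "NOUN"),
    ("sharply", "sharply", "ADV"),
    (".", ".", "PUNCT"),
]

# Assumed definition of "content word" for this sketch.
CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def all_words_instances(text):
    """All-words style: every content word in the text is a test instance."""
    return [i for i, (_, _, pos) in enumerate(text) if pos in CONTENT_POS]


def lexical_sample_instances(text, sample_lemmas):
    """Lexical-sample style: only occurrences of the sampled lemmas are test instances."""
    return [i for i, (_, lemma, _) in enumerate(text) if lemma in sample_lemmas]


if __name__ == "__main__":
    print(all_words_instances(TEXT))
    print(lexical_sample_instances(TEXT, {"bank", "interest"}))
```

With this toy input, the all-words selection returns the positions of "bank", "raised", "interest", "rates", and "sharply", while the lexical sample selection for the lemmas "bank" and "interest" returns only those two positions.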
About 35 teams participated, submitting over 90 systems. The review of the workshop gives more statistics and some follow-up information. We will publish the workshop proceedings later this year (free to workshop attendees, otherwise available through the ACL). All of the evaluation results and data are now in the public domain.
A related workshop, Word Sense Disambiguation: Recent Successes and Future Directions, was held at ACL-02 in Philadelphia, USA, on Thursday 11 July 2002.
Scott Cotton, University of Pennsylvania
Phil Edmonds, Sharp Laboratories of Europe
Adam Kilgarriff, ITRI, University of Brighton
Martha Palmer, University of Pennsylvania
last updated: 14 November, 2002 17:45 -0000