A Similarity Measure for Unsupervised Semantic Disambiguation

In this paper we propose a similarity measure intended to support an unsupervised approach to semantic tagging. The proposal is a variant of the notion of Conceptual Density, previously suggested as a tool for sense disambiguation. The major difference is the learning framework: applied to the WordNet hierarchy, the measure enables a natural, corpus-driven empirical estimation of lexical and contextual probabilities for probabilistic semantic tagging. Experimental results over a hand-annotated portion of the British National Corpus (about 5 M words) are also discussed. Although its results fall below those obtained by a supervised method (Maximum Entropy trained on hand-labelled data), the proposed unsupervised tagger confirms the effectiveness of the metric and points to a promising research direction.
Published in 2004