Will very Large Corpora Play for Semantic Disambiguation the Role that Massive Computing Power is Playing for other AI-Hard Problems

In this paper we formally analyze the relation between the amount of (possibly noisy) examples provided to a word-sense classification algorithm and the performance of the classifier. In the first part of the paper, we show that Computational Learning Theory provides a suitable theoretical framework to establish one such relation. In the second part of the paper, we will apply our theoretical results to the case of a semantic disambiguation algorithm based on syntactic similarity
Published in 2000