Building Lexical Resources for PrincPar, a Large Coverage Parser that Generates Principled Semantic Representations

Parsing, one of the more successful areas of Natural Language Processing, has mostly been concerned with syntactic structure. Though uncovering the syntactic structure of sentences is very important, in many applications a meaning representation for the input must be derived as well. We report on PrincPar, a parser that builds full meaning representations. It integrates LCFLEX, a robust parser, with a lexicon and ontology derived from two lexical resources, VerbNet and CoreLex, which represent the semantics of verbs and nouns respectively. We show that these two different lexical resources, one focused on verbs and the other on nouns, can be successfully integrated. We report parsing results on a corpus of instructional text and assess the coverage of those lexical resources. Our evaluation metric is the number of verb frames that are assigned a correct semantics: 72.2% of verb frames are assigned a perfect semantics, and another 10.9% are assigned a partially correct semantics. Our ultimate goal is to develop a (semi)automatic method to derive domain knowledge from instructional text, in the form of linguistically motivated action schemes.
Published in 2006