CLIPS, a Multi-Level Italian Computational Lexicon: a Glimpse to Data

CLIPS is a multi-layered Italian computational lexicon based on the PAROLE-SIMPLE model. In this paper we briefly recall the main characteristics of the model and then focus on issues that emerge from encoding large quantities of data, especially for the types of syntactic and semantic information that are specific to our lexicon and reflect innovative features of the underlying model. At the syntactic level, we show how alternating structures can be encoded in a linguistically more elegant way by using framesets. We illustrate the connection between syntactic and semantic information, and show how the SIMPLE Italian lexicon's approach to predicate selection has been refined in CLIPS. At the semantic level, we illustrate the richness of the information types encoded in a word sense description and the ways in which this wealth of data can be exploited. In particular, we stress the expressive power of the Extended Qualia Structure, while also noting some of its problematic aspects. We show that queries on qualia relations make it possible to retrieve lexical collocates and to extract domain-specific information and semantic networks, and that they help in interpreting modifying PPs in complex nominals. Finally, we show that features, which cut across the type hierarchy, have greater expressive power than semantic types in identifying selectional preferences.
Published in 2002