Towards a Standardized Linguistic Annotation of the Textual Content of Labels in Knowledge Representation Systems

WWe propose applying standardized linguistic annotation to terms included in labels of knowledge representation schemes (taxonomies or ontologies), hypothesizing that this would help improving ontology-based semantic annotation of texts. We share the view that currently used methods for including lexical and terminological information in such hierarchical networks of concepts are not satisfactory, and thus put forward ? as a preliminary step to our annotation goal ? a model for modular representation of conceptual, terminological and linguistic information within knowledge representation systems. Our CTL model is based on two recent initiatives that describe the representation of terminologies and lexicons in ontologies: the Terminae method for building terminological and ontological models from text (Aussenac-Gilles et al., 2008), and the LexInfo metamodel for ontology lexica (Buitelaar et al., 2009). CTL goes beyond the mere fusion of the two models and introduces an additional level of representation for the linguistic objects, whereas those are no longer limited to lexical information but are covering the full range of linguistic phenomena, including constituency and dependency. We also show that the approach benefits linguistic and semantic analysis of external documents that are often to be linked to semantic resources for enrichment with concepts that are newly extracted or inferred
Published in 2010