Building a Linguistically Interpreted Corpus of Bulgarian: the BulTreeBank

In the field of Human Language Technology (HLT), the existence of linguistically interpreted real-world texts provides the license necessary for a given language to enter the area of high-tech applications. The significance of BulTreeBank lies in granting such an HLT license to a ``less processed'' language like Bulgarian, which until recently has been formally modelled and processed mainly at the morphological level. The BulTreeBank project aims to create syntactically annotated data for Bulgarian, together with the tools for their production, management and automatic processing. It not only provides language resources but also develops an infrastructure of research solutions, production scenarios and services.
Published in 2002