Publications (10)

1. Efficient Stochastic Part-of-Speech Tagging for Hungarian (2002). Csaba Oravecz, Peter Dienes.
   models widely used in state-of-the-art taggers: lexical probabilities are calculated from a wordform lexicon generated during... suggestions for further work will follow in section 6. 2. Data sparseness in highly inflective languages It is a computational...

2. Combining Symbolic and Statistical Methods in Morphological Analysis and Unknown Word Guessing.
   basis of some training corpus often face the problem of data sparseness. A possible solution to this problem is to apply a c...

3. Training a Statistical Machine Translation System Without GIZA++.
   transducer for the sentence pair. The weights are lexical probabilities. wir koennen den Zug nehmen Figure 3: Linear automaton...

4. A Galician Textual Corpus for Morphosyntactic Tagging with Application to Text-to-Speech Synthesis (2004). Ana Martinez Insua, Eduardo Rodriguez Banga, Francisco Campillo Diaz, Francisco Mendez Pazo, Lorena Seijo Pereiro.
   too ample and detailed, due to the problem of training data sparseness, for using it in the estimation of the probabilities...

5. DIAC+: a Professional Diacritics Recovering System (2008). Alexandru Ceausu, Dan Tufis.
   ...y this is not a linear dependency. To avoid severe data sparseness and accuracy degradation, a huge amount of manual work... 2000) is a two-stage technique addressing the issue of data-sparseness. In general terms, tiered tagging uses a hidden tagset...

6. POS Tagging for Grammaticalization and Grammatical Neologism Detection (2012). Maarten Janssen.
   provided, it is also possible to use the external lexical probabilities for unknown words. NeoTag has an accuracy rating of... technique in taggers, and is used to counter problems with data sparseness. It comes traditionally in two types: transition smoothing...

7. The Hungarian National Corpus (2002). Tamas Varadi.
   richness and productivity of morphology, which result in data sparseness and computational inefficiency mentioned in 2.2.1., what...

8. Using a Morphological Analyzer in High Precision POS Tagging of Hungarian.
   languages with a large number of possible word forms, if lexical probabilities are calculated from a word form lexicon generated... Hungarian (17.1%) than in English (4.5%). To cope with this data sparseness problem three alternative strategies can be followed:...

9. Identifying Multi-Word Expressions in Statistical Machine Translation.
   which can draw on richer statistics and overcome the data sparseness problems. In Table 1 we give an example of MWE produced... directions translation probabilities and set the lexical probabilities to 1 for simplicity. So, for each phrase in a given...

10. Using a Large Set of Eagles-Compliant Morpho-Syntactic Descriptors as a Tagset for Probabilistic Tagging (2000). Dan Tufis.
   Abstract: The paper presents one way of reconciling data sparseness with the requirement of high accuracy tagging in terms... the possible interpretations being assigned equal lexical probabilities. Table 6 displays the results of this second experiment...