Sejong Korean Corpora in the Making

We introduce a set of Korean corpora in the making. One of them, the Sejong Morph Tagged Corpus, consists of morphologically analyzed Korean words. It is part of the Sejong Corpora, the product of a government-sponsored project in Korea to compile language resources. We outline the corpus-building component of the project and describe the Sejong Morph Tagged Corpus in some detail. This corpus is being further processed for disambiguation, to be turned into the Sejong Morph Sense Tagged Corpus and into a Korean Treebank of syntactically parsed sentences.
Published in 2004