Proposal of a Very-Large-Corpus Acquisition Method by Cell-Formed Registration

One promising way to improve the performance of a speech translation system is to collect a large volume of data in the target tasks and domains. However, naively scaling up the traditional data collection scheme consumes valuable resources. Advanced speech recognition technology can provide a highly accurate recognizer if machine-friendly speech is permitted. We propose a new data collection scheme that is supported by this speaking style. Preliminary data collection results show that the proposed scheme has a three-digit efficiency.
Published in 2002