The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text

This paper describes, within the context of the DARPA EARS program, the design and implementation of the Fisher protocol for collecting conversational telephone speech which has yielded more than 16,000 English conversations. It also discusses the Quick Transcription specification that allowed 2000 hours of Fisher audio to be transcribed in less than one year. Fisher data is already in use within the DARPA EARS programs and will be published via the Linguistic Data Consortium for general use beginning in 2004
Published in 2004