A Comprehensive Resource to Evaluate Complex Open Domain Question Answering

Complex Question Answering is a discipline that involves a deep understanding of question/answer relations, such as those characterizing definition and procedural questions and their answers. To contribute to the improvement of this technology, we deliver two question and answer corpora for complex questions, WEB-QA and TREC-QA, extracted by the same Question Answering system, YourQA, from the Web and from the AQUAINT-6 data collection respectively. We believe that such corpora can be useful resources to address a type of QA that is far from being efficiently solved. WEB-QA and TREC-QA are available in two formats: judgment files and training/testing files. Judgment files contain a ranked list of candidate answers to TREC-10 complex questions, extracted using YourQA as a baseline system and manually labelled according to a Likert scale from 1 (completely incorrect) to 5 (totally correct). Training and testing files contain learning instances compatible with SVM-light; these are useful for experimenting with shallow and complex structural features such as parse trees and semantic role labels. Our experiments with the above corpora have allowed to prove that structured information representation is useful to improve the accuracy of complex QA systems and to re-rank answers
Published in 2010