More Publications (39)

1. DOCUMENT CLASSIFICATION BY MACHINE: Theory and Practice (1994). Louise Guthrie, Elbert Walker. ...with the proportion of type j being Pij. We are given a random sample of size n from one of the populations, and are asked...

2. A Joint Information Model for N-Best Ranking (2008). Patrick Pantel, Vishnu Vyas. ...properties extracted by our system (described below) for a random sample of two instances from a cluster of food, {apple, beef}...

3. High-Performance Tagging on Medical Texts (2004). Udo Hahn, Joachim Wermter. ...annotation was 96.7% (standard deviation: 0.6%), based on a random sample of 2000 tokens (10% of the evaluation corpus). The p...

4. Open Entity Extraction from Web Search Query Logs (2010). Alpa Jain, Marco Pennacchiotti. ...experiments we use the following datasets: Query log: A random sample of 100 million, fully anonymized queries collected by...

5. Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis (2004). Michael Gamon. ...dealing with here are extremely noisy. Recall that on a random sample of 200 pieces of feedback even a human evaluator could...

6. Genus Disambiguation: A Study in Weighted Preference (1992). Rebecca Bruce, Louise Guthrie. ...proposed sense selection criteria were run on the same random sample of 520 definitions. Table I provides a summary of the...

7. Parsing with the Shortest Derivation (2000). Rens Bod. ...subtrees larger than depth 1 by taking for each depth a random sample of 400,000 subtrees. No subtrees larger than depth 14...

8. Experiments in Automated Lexicon Building for Text Searching. ...lexicons from the different configurations. We had chosen a random sample of 10 percent of the 2,700 words that occurred at least...

9. Fine Grained Classification of Named Entities (2002). Michael Fleischman, Eduard Hovy. ...generation is that the training set created is not a random sample of person instances in the real world. Rather, the training...

10. Concept Discovery from Text (2002). Dekang Lin, Patrick Pantel. ...clustering. Buckshot first applies average-link to a random sample of n elements to generate K clusters. It then uses...

11. Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. ...differences between the training sets, we hand-examined a random sample of sentence pairs from each corpus type. The most common...

12. Cognate Mapping - A Heuristic Strategy for the Semi-Supervised Acquisition of a Spanish Lexicon from a Portuguese Seed Lexicon. ...Semantic Validation One of the authors evaluated manually a random sample of 388 (3.5% of all generated) cognate pairs in order...

13. FASIL Email Summarisation System. ...for 100 emails chosen at random, separated into 10 random sample groups from representative subsets of the three main...

14. Robust Sub-Sentential Alignment of Phrase-Structure Trees. ...translation via Monte Carlo sampling involves taking a random sample of derivations and outputting the most frequently occurring...

15. Authorship Attribution and Verification with Many Authors and Limited Data (2008). Kim Luyckx, Walter Daelemans. ...order to minimize the effect of chance, then select one random sample of 20, 50, 100 authors and finally experiment with all...

16. Automatically Learning Source-side Reordering Rules for Large Scale Machine Translation (2010). Dmitriy Genzel. ...104106 per sentence. It is sufficient, however, to take a random sample of the input, extract top candidates, and reevaluate those...

17. Exploring variation across biomedical subdomains. ...reference subdomain, “Newswire”, composed of a 6 million word random sample from the English Gigaword corpus (Graff et al., 2005)...

18. Exploring the Data-Driven Prediction of Prepositions in English. ...evidence for them can be found on the web. For the random sample of the BNC section J we tested on, the surface-based approach...

19. The Role of Queries in Ranking Labeled Instances Extracted from Text (2010). Marius A. Pasca. ...Evaluation Results 4.1 Evaluation Sets of Queries A random sample of anonymized, class-seeking queries (e.g., video game...

20. Notes on the Evaluation of Dependency Parsers Obtained Through Cross-Lingual Projection (2010). Kathrin Spreyer. ...by the individual parsers, each trained on a different random sample of 100,000 words, drawn from the pool of all projected...

21. Summarization of Business-Related Tweets: A Concept-Based Approach (2012). Annie Louis, Todd Newman. ...ranking are selected to create a company-word dictionary (a random sample is shown in Table 3). Next we group these words using...

22. Syntactic Patterns In A Sample Of Technical English (1969). Victor J. Streeter. ...of nonrandom sampling and a changing population. The random sample-uniform population (RSUP) model for a single wri...

23. Automated Translation of Semantic Relationships (2010). Dmitry Davidov, Ari Rappoport. ...our Generic evaluation setting, we utilized as input a random sample of 15 automatically discovered relationship definitions...

24. A Large Scale Ranker-Based System for Search Query Spelling Correction. ...four LMs trained on different data sources tested on a random sample of 733,147 queries. The results show that (1) higher order...

25. A Structured Vector Space Model for Hidden Attribute Meaning in Adjective-Noun Phrases (2010). Matthias Hartung, Anette Frank. ...nouns with attributes. This gold standard builds on a random sample extracted from TN (cf. section 3.3). Running N1N4 on...

26. FactRank: Random Walks on a Web of Facts (2010). Alpa Jain, Patrick Pantel. ...However, to keep our evaluation manageable, we draw a random sample from these facts. Specifically, we first generate a ranked...

27. Enhanced Sentiment Learning Using Twitter Hashtags and Smileys. ...hashtags/smileys from the whole dataset assuming that such a random sample is unlikely to contain a significant amount of sentiment...

28. Incremental Chinese Lexicon Extraction with Minimal Resources on a Domain-Specific Corpus (2010). Gaël Patin. ...other approaches. The extraction results, evaluated on a random sample of the working corpus, show a recall of 68.4% and precision...

29. Tailored Feature Extraction for Lexical Disambiguation of English Verbs Based on Corpus Pattern Analysis. ...for each annotator pair. Provided the annotation of the random sample reached a satisfactory IAA, the disagreements were manually...

30. Underspecified Query Refinement via Natural Language Question Generation. ...Figure 1: Left: Percentage of query types found in a random sample of the Avatar Dataset. Right: Percentage of relevant question...

31. Document and Corpus Level Inference For Unsupervised and Transductive Learning of Information Structure of Scientific Documents (2012). Roi Reichart, Anna Korhonen. ...to the models. In the last two lines the classes of a random sample of 5% or 10% of the sentences are known. that do not...

32. Automatic Detection of Point of View Differences in Wikipedia. ...Wikipedia. The random subset (15 pairs) is a standard random sample. The length range of downloaded articles is 1–1128 sentences...

33. Native Language Identification using Recurring n-grams – Investigating Abstraction and Domain Dependence (2012). Serhiy Bykh, Detmar Meurers. ...secondary school. For the present work, we took a 8% random sample of the whole corpus, consisting of manually tagged 77...

34. Analysis and Enhancement of Wikification for Microblogs with Context Expansion. ...in (Meij et al., 2012), which we refer to as gold1. A random sample of verified twitter accounts were selected, and up to...

35. S-Restricted Monotone Alignments: Algorithm, Search Space, and Applications (2012). Steffen Eger. ...method as follows. From the aligned data, we extract a random sample of size 1000 and train an n-gram graphone model (that...

36. Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation. ...these represent 2.8% of the instances. Again, we drew a random sample of these instances, and observed that in all of them,...

37. Attribute Extraction from Conjectural Queries (2012). Marius A. Pasca. ...Setting Textual Data Sources: The experiments rely on a random sample of around 500 million fully-anonymized Web search queries...

38. Improving Supervised Sense Disambiguation with Web-Scale Selectors. ...with the selector class of features (w/ sels) across a random sample of the WSJ, Xh and Sr portions of OntoNotes. (mfc: accuracy...

39. Modeling ESL Word Choice Similarities By Representing Word Intensions and Extensions (2012). Huichao Xue, Rebecca Hwa. ...keep all instances that contain an error and retain a random sample of q percent of the correct instances in the training...