Timeline Summary Evaluation Dataset for Nuggets vs. Clusters Evaluation

This is the dataset associated with the paper ‘A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries’, published at the ACM Conference on Information and Knowledge Management (CIKM) 2017.

This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) licence. See https://creativecommons.org/licenses/by/4.0/. Under this licence, you are free to:
 - Share — copy and redistribute the material in any medium or format
 - Adapt — remix, transform, and build upon the material for any purpose, even commercially.

However, you must give appropriate credit, provide a link to the licence, and indicate if changes were made. For academic works or technical reports, please cite the original paper:

@INPROCEEDINGS{Baruah_2017CIKM,
   author = "Gaurav Baruah and Richard McCreadie and Jimmy Lin",
   title = "A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries",
   booktitle = "CIKM",
   year = 2017,
}

The dataset consists of ground-truth summary nuggets and matches for the pooled updates submitted by participating systems to the Text Retrieval Conference (TREC) Temporal Summarization (TS) track in 2013 and 2014. The ‘nuggets’ represent the key pieces of information that a good summary should contain. The ‘matches’ are manual judgments linking the updates returned by participant systems to the nuggets. Together, these can be used to estimate the performance of timeline summarization systems on the TREC-TS task for those two years. Track information can be found at http://www.trec-ts.org/.
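
As a rough illustration of how nuggets and matches fit together, the sketch below computes a simple nugget recall for one system. Everything in it is an assumption made for illustration: the IDs, the in-memory layout, and the function name `nugget_recall` are hypothetical, they do not reflect the actual file format of this dataset, and the score is not the official TREC-TS metric.

```python
# Hypothetical data: IDs and layout are illustrative only and do not
# reflect the actual file format of this dataset.
nuggets = {"n1", "n2", "n3", "n4"}

# Manual matches: update ID -> set of nugget IDs the update was judged to contain.
matches = {
    "u1": {"n1"},
    "u2": {"n1", "n3"},
    "u3": set(),  # an update that matched no nugget
}


def nugget_recall(system_updates, matches, nuggets):
    """Fraction of ground-truth nuggets covered by a system's updates."""
    covered = set()
    for update_id in system_updates:
        # Updates without a match entry contribute nothing.
        covered |= matches.get(update_id, set())
    return len(covered & nuggets) / len(nuggets) if nuggets else 0.0


recall = nugget_recall(["u1", "u2", "u3"], matches, nuggets)
print(recall)
```

Because covered nuggets are collected in a set, an update that repeats an already-covered nugget (here `u2` repeating `n1`) earns no additional credit.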

The dataset is split into two parts, denoted ‘cluster-qrels’ and ‘nugget-qrels’. The goal of the original paper was to compare two methods for generating the ground-truth nuggets and matches. The ‘nugget-qrels’ represent the original TREC-TS method, which extracts nuggets from Wikipedia pages and then manually matches each update to one or more nuggets. The ‘cluster-qrels’ represent an alternative approach developed for the Tweet Timeline Generation (TTG) task at TREC, in which human assessors instead cluster the updates based on the information they contain.
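
To make the cluster-based view concrete, here is a minimal sketch in which each cluster of updates stands for one piece of information, and a system is credited once per cluster it touches. The cluster IDs, update IDs, data layout, and the function name `cluster_recall` are all hypothetical and chosen only for illustration; this is not the dataset's file format, nor the exact measure used in the paper or at TREC.

```python
# Hypothetical cluster judgments: cluster ID -> update IDs that assessors
# grouped together because they carry the same information.
clusters = {
    "c1": {"u1", "u2"},
    "c2": {"u4"},
    "c3": {"u5", "u6"},
}


def cluster_recall(system_updates, clusters):
    """Fraction of clusters from which the system returned at least one update."""
    returned = set(system_updates)
    # A cluster counts once, no matter how many of its updates were returned.
    hit = [cid for cid, members in clusters.items() if members & returned]
    return len(hit) / len(clusters) if clusters else 0.0


score = cluster_recall(["u1", "u2", "u5"], clusters)
print(round(score, 3))
```

In this toy example, returning both `u1` and `u2` covers cluster `c1` only once, which mirrors how redundant updates add no new information.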