Debater Datasets

On this page you can download copies of the IBM Debater Datasets.

The datasets are released under the following licensing and copyright terms:

To download, please fill in the request form below.

Other datasets are expected to be released over time.


Ranit Aharonov, Manager, Debater group, IBM Research - Haifa

The following datasets are available:



Debater Datasets - Licensing Notice

Each copy or modified version that you distribute must include a licensing notice stating that the work is released under CC-BY-SA and either a) a hyperlink or URL to the text of the license or b) a copy of the license. For this purpose, a suitable URL is:


IBM Unraveling Language Patterns

(GReedy Augmented Sequential Patterns)

IBM Unraveling Language Patterns is an algorithm for automatically extracting patterns that characterize subtle linguistic phenomena.
To that end, IBM Unraveling Language Patterns augments each term of the input text with multiple layers of linguistic information. These facets of each term are then systematically combined to reveal rich patterns.
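
The idea of augmenting terms and combining their facets can be illustrated with a toy sketch. This is not the IBM implementation: the lexicons, the attribute names, and the functions `augment`, `augment_sentence`, `candidate_patterns`, and `matches` are all hypothetical, and the sketch assumes patterns over two adjacent tokens only. The greedy selection of the most informative patterns is not shown.

```python
from itertools import product

# Hypothetical toy lexicons standing in for real linguistic resources.
POSITIVE = {"good", "great", "excellent"}
NEGATIONS = {"not", "never", "no"}


def augment(token):
    """Return the set of attributes (layers) that describe one token."""
    lowered = token.lower()
    attrs = {f"text:{lowered}"}          # surface-form layer
    if lowered in POSITIVE:
        attrs.add("sentiment:positive")  # lexicon layer
    if lowered in NEGATIONS:
        attrs.add("negation")            # lexicon layer
    if token[0].isupper():
        attrs.add("shape:capitalized")   # orthographic layer
    return attrs


def augment_sentence(sentence):
    """Augment every whitespace-separated token of a sentence."""
    return [augment(tok) for tok in sentence.split()]


def candidate_patterns(augmented):
    """Combine the attributes of adjacent tokens into two-element patterns."""
    patterns = set()
    for left, right in zip(augmented, augmented[1:]):
        patterns.update(product(left, right))
    return patterns


def matches(pattern, augmented):
    """True if the pattern's attributes occur on consecutive tokens."""
    n = len(pattern)
    return any(
        all(attr in augmented[i + k] for k, attr in enumerate(pattern))
        for i in range(len(augmented) - n + 1)
    )


negated_positive = ("negation", "sentiment:positive")
print(matches(negated_positive, augment_sentence("This is not good")))
print(matches(negated_positive, augment_sentence("It was never great")))
print(matches(negated_positive, augment_sentence("This is good")))
```

Because the pattern is expressed over attribute layers rather than surface words, the same pattern covers both "not good" and "never great", which is the kind of generalization the description above points to.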