Abstract: Extracting names from natural-language text

Authors:
Yael Ravin, T.J. Watson Research Center, IBM, Yorktown Heights, NY 10598
Nina Wacholder, CRIA, Columbia University, New York, NY 10027

We describe Nominator, a module we developed to extract proper names from natural-language text, which is currently being integrated into IBM products and services. Using fast and robust heuristics, Nominator locates names in text, determines what type of entity they refer to -- such as person, place or organization -- and groups together all the variant names that refer to the same entity. For example, "President Clinton", "Mr. Clinton" and "Bill Clinton" are grouped as referring to the same person. Each group is assigned a "canonical name" (e.g., "Bill Clinton") to distinguish it from other groups referring to other entities (e.g., "Clinton, New Jersey"). Nominator produces a dictionary, or database, of names associated with a collection of documents.

Identifying the occurrences of proper names in text and the entities they refer to can be a difficult task because of the many-to-many mapping between names and their referents: one entity may be referred to by several names, and one name may refer to several entities. We analyze the types of ambiguity -- structural and semantic -- that make the discovery of proper names in text difficult, and describe the heuristics used to disambiguate names in Nominator, a fully implemented module for proper name recognition developed at the IBM T.J. Watson Research Center.
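
The grouping of variant names under a canonical name can be illustrated with a toy sketch. This is not Nominator's implementation, and the abstract does not specify its heuristics. The title list, the "shared last token" rule, the comma test for places, and the names `TITLES`, `strip_titles` and `group_variants` are all assumptions made for illustration only.

```python
# Toy illustration only: NOT Nominator's actual heuristics.
# Assumption: a small, hypothetical list of titles that can prefix a person name.
TITLES = {"president", "mr.", "mrs.", "ms.", "dr."}


def strip_titles(name):
    """Return the tokens of `name` with any leading titles removed."""
    tokens = name.split()
    while tokens and tokens[0].lower() in TITLES:
        tokens = tokens[1:]
    return tokens


def group_variants(names):
    """Group name variants and pick a canonical name for each group.

    Assumed toy rules:
      * a name containing a comma ("Clinton, New Jersey") is a place
        and forms its own group;
      * other names are treated as persons and grouped by last token;
      * the canonical name is the longest title-free variant in the group.
    """
    groups = {}
    for name in names:
        if "," in name:
            key = ("place", name)
        else:
            # Fall back to the raw tokens if the name is only a title.
            core = strip_titles(name) or name.split()
            key = ("person", core[-1].lower())
        groups.setdefault(key, []).append(name)

    dictionary = {}
    for (entity_type, _), variants in groups.items():
        if entity_type == "place":
            canonical = variants[0]
        else:
            stripped = [" ".join(strip_titles(v) or v.split()) for v in variants]
            canonical = max(stripped, key=len)
        dictionary[canonical] = {"type": entity_type, "variants": variants}
    return dictionary


if __name__ == "__main__":
    names = ["President Clinton", "Mr. Clinton", "Bill Clinton",
             "Clinton, New Jersey"]
    for canonical, entry in group_variants(names).items():
        print(canonical, entry)
```

With the four names from the example in the abstract, this sketch places the three person variants in one group with canonical name "Bill Clinton" and keeps "Clinton, New Jersey" as a separate place entry. A real system would need far richer evidence than a shared last token, since that rule would wrongly merge different people with the same surname.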