Abstracts for Talks
Translation Models for Natural Language Understanding, Kishore
Language understanding in a restricted domain can be posed as the problem
of selecting the most probable formal-language sentence given the natural-language
utterance. Often, a finite set of formal-language sentences
covers the intended domain very well. NLU then involves building a discrete
conditional probability distribution over the finite set of formal sentences,
conditioned on the natural-language utterance. We discuss building this
a posteriori model directly using maximum entropy and related frameworks.
Results will be presented in the Air Travel Information Service domain.
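The conditional model described above can be sketched as a small maximum-entropy
(softmax over weighted features) classifier. The formal sentences, feature
function, and weights below are illustrative assumptions, not the talk's actual model.

```python
import math

# A minimal sketch of a conditional maximum-entropy model: a discrete
# distribution over a finite set of formal sentences, conditioned on a
# natural-language utterance. Features and weights are hypothetical.
FORMAL_SENTENCES = ["LIST_FLIGHTS", "SHOW_FARES"]

def features(utterance, formal):
    # Hypothetical binary features: (word, formal-sentence) pairs.
    return {(w, formal): 1.0 for w in utterance.split()}

def maxent_posterior(utterance, weights):
    # p(f | u) proportional to exp(sum of weighted active features).
    scores = {}
    for f in FORMAL_SENTENCES:
        s = sum(weights.get(k, 0.0) * v
                for k, v in features(utterance, f).items())
        scores[f] = math.exp(s)
    z = sum(scores.values())
    return {f: s / z for f, s in scores.items()}

weights = {("flights", "LIST_FLIGHTS"): 2.0, ("fares", "SHOW_FARES"): 2.0}
post = maxent_posterior("show flights to boston", weights)
best = max(post, key=post.get)
```

In a real system the weights would be trained (e.g. by iterative scaling) rather
than set by hand; the sketch only shows the shape of the a posteriori model.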
Inverse Document Frequency is I-Divergence Optimal, Kishore
Inverse Document Frequency (IDF), widely used in information retrieval,
is a popular measure of a word's importance. While its use is justified
in many ways, it has not been shown to be optimal in a formal framework.
In this talk, we show that IDF is the optimal weight of a word with
respect to I-Divergence in an information retrieval setting assuming
that relevant documents contain all query terms. We also assign
optimal weights to longer n-grams, with these weights seen as a natural
extension of the single-word IDF. We use this framework for unsupervised
identification of phrases in text corpora.
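A minimal sketch of the IDF weight on a toy corpus; idf(w) = log(N / df(w)) is
one common formulation, and the talk's I-Divergence-optimal weighting may differ
in its exact form. The corpus below is illustrative.

```python
import math

# Toy corpus: each document is its set of distinct words.
docs = [
    {"air", "travel", "fare"},
    {"air", "ticket"},
    {"hotel", "fare"},
]

def idf(word, docs):
    # df = number of documents containing the word;
    # rarer words receive higher weight.
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / df) if df else float("inf")
```

Under the talk's framework, analogous weights would be assigned to longer
n-grams as a natural extension of this single-word case.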
of Natural Language, Wlodek Zadrozny
Is it possible
to estimate the difficulty of creating natural language applications?
This problem is of growing practical and theoretical interest. One measure
of complexity is the number of objects and relations in the domain of
consideration. However, for practical applications, measures of complexity
must also take into account the accuracy of processing, the relationship to world
knowledge, evaluation methods, etc. The talk surveys initial results and describes
research issues ranging from purely mathematical and linguistic to technological.
Language Semantics, Wlodek Zadrozny
We look at
formal relations between synonymy and compositionality. We prove that
synonymy does not put any formal constraints on compositionality; i.e.,
with a proper encoding, if two expressions are synonymous, then their
compositional meanings are identical. We suggest that the reason that
semanticists have been anxious to preserve compositionality as a significant
constraint on semantic theory is that it has been mistakenly regarded
as a condition that must be satisfied by any theory that sustains a
systematic connection between the meaning of an expression and the meanings
of its parts. Recent developments in formal and computational semantics
show that systematic theories of meanings need not be compositional
(joint work w. Shalom Lappin, King's College).
Parser Adaptation Based on Model Transform, Xiaoqiang
It is observed
that the performance of a statistical parser degrades rapidly when the style
of the test text differs from that of the training text. I will talk about how
parsing accuracy can be improved by transforming models. Two types of
transforms have been explored: 1) a Markov matrix; 2) a Householder
transform. Results show that classing errors can be reduced by 20-30%
in the AirTrav domain.
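The first of the two transforms can be sketched directly: applying a Markov
(row-stochastic) matrix to a probability mass function yields another pmf, so the
adapted model stays on the simplex. The pmf and matrix below are illustrative, not
values from the talk.

```python
# Adapt a pmf p by a Markov matrix M (each row sums to 1): p' = p M.
p = [0.7, 0.2, 0.1]
M = [
    [0.9, 0.05, 0.05],
    [0.1, 0.8,  0.1],
    [0.0, 0.1,  0.9],
]
p_new = [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]
```

Because each row of M sums to one and has nonnegative entries, p_new is
guaranteed to be a valid pmf.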
Parser Adaptation Via Householder Transform,
We propose a method of adapting a statistical parser using a special
orthogonal transform, the Householder transform. Probability mass functions
(pmfs) in the parser are first mapped to the unit sphere; a Householder
transform is then applied, which maps a point on the unit sphere to another point
on the unit sphere. The final model is obtained by mapping the transformed
point back to the simplex through a squaring map. The proposed
method is tested on a semantic parser, where over 20% relative reduction
in parse errors is achieved.
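The three-step pipeline above (square-root map to the sphere, Householder
reflection, squaring map back to the simplex) can be sketched in a few lines.
The reflection vector v below is an illustrative assumption, not a vector actually
estimated during adaptation.

```python
import math

def pmf_to_sphere(p):
    # Square-root map: a pmf becomes a unit vector on the sphere.
    return [math.sqrt(x) for x in p]

def householder(x, v):
    # Householder reflection H x = x - 2 (v . x) v, for unit vector v;
    # H is orthogonal, so it keeps x on the unit sphere.
    c = 2.0 * sum(xi * vi for xi, vi in zip(x, v))
    return [xi - c * vi for xi, vi in zip(x, v)]

def sphere_to_pmf(x):
    # Squaring map: a unit vector becomes a point on the simplex.
    return [xi * xi for xi in x]

p = [0.5, 0.3, 0.2]
v = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]  # illustrative unit vector
q = sphere_to_pmf(householder(pmf_to_sphere(p), v))
```

Since the reflection preserves the unit norm and squaring restores nonnegativity,
q sums to one by construction.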
Estimating the Probabilities of Unseen Events: A Language Modeling Perspective,
In many
tasks, we must estimate the probabilities of events that we have never
seen before. For example, consider estimating the probability
that the next president of the U.S. is a woman. A naive estimate can
be found by dividing the number of past female presidents by the total
number of past presidents, but the resulting estimate of zero is clearly
an underestimate. A wide range of techniques has been developed in
the field of "language modeling" to improve on these naive "maximum
likelihood" estimates. Language modeling deals with estimating the
probability of a word given the preceding words in a text. In
this talk, we present a history of the "smoothing" techniques studied
in language modeling to estimate unseen event probabilities, and we
discuss the impact of improved smoothing in language modeling on the
tasks of text compression and speech recognition.
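The simplest of the smoothing techniques surveyed, add-one (Laplace) smoothing,
illustrates the fix for the presidents example. The count of 45 past presidents is
an illustrative figure, and the talk covers far more refined methods.

```python
def laplace(count, total, vocab_size, alpha=1.0):
    # Add-one smoothing: pretend every outcome was seen alpha extra times,
    # so no event gets probability zero.
    return (count + alpha) / (total + alpha * vocab_size)

total = 45                        # illustrative count of past presidents
ml_estimate = 0 / total           # naive maximum likelihood: exactly zero
smoothed = laplace(0, total, 2)   # two outcomes: woman / man
```

The smoothed estimate (1/47 here) is small but nonzero, which is exactly what
the maximum-likelihood estimate fails to provide for unseen events.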
Representations of Finite Distributive Lattices, Frank
A goal of
this talk is to show how ideas from lattice theory can be used in the
implementation of a knowledge representation language. First, the semantics
of a simple knowledge representation language is presented. Then we
show how to use Birkhoff's Representation Theorem for Finite Distributive
Lattices to build incrementally what we call a Birkhoff implementation
of a knowledge base by processing a sequence of terminological axioms.
A mathematical proof of the correctness of our technique with respect
to the given semantics is an integral part of the development. While
the intended application is to knowledge representation, these methods
can be used whenever a computationally tractable representation of a
finite distributive lattice needs to be implemented, such as when an
ontology is required for natural language processing.
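Birkhoff's theorem says a finite distributive lattice is isomorphic to the lattice
of down-sets of its poset of join-irreducible elements, with join as union and
meet as intersection. A minimal sketch of that representation follows; the
three-element poset is an illustrative example, not one from the talk.

```python
from itertools import combinations

# Poset on {a, b, c} with a <= c and b <= c, stored as "strictly below" sets.
below = {"a": set(), "b": set(), "c": {"a", "b"}}

def is_downset(s):
    # A down-set contains everything below each of its members.
    return all(below[x] <= s for x in s)

elements = set(below)
# Enumerate all down-sets; union and intersection of down-sets are again
# down-sets, so these form a distributive lattice.
downsets = [set(c) for r in range(len(elements) + 1)
            for c in combinations(sorted(elements), r)
            if is_downset(set(c))]
```

A Birkhoff implementation of a knowledge base would maintain such a family of
down-sets incrementally as terminological axioms are processed, rather than
enumerating them exhaustively as this sketch does.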