Analysis of TimeBank as a resource for TimeML parsing
Branimir Boguraev, Rie Kubota Ando
LREC 2006
We present a novel algorithm that creates document vectors with reduced dimensionality. This work was motivated by an application characterizing relationships among documents in a collection. Our algorithm yielded inter-document similarities with an average precision up to 17.8% higher than that of singular value decomposition (SVD) used for Latent Semantic Indexing. The best performance was achieved with dimensional reduction rates that were 43% higher than SVD on average. Our algorithm creates basis vectors for a reduced space by iteratively "scaling" vectors and computing eigenvectors. Unlike SVD, it breaks the symmetry of documents and terms to capture information more evenly across documents. We also discuss correlation with a probabilistic model and evaluate a method for selecting the dimensionality using log-likelihood estimation.
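The abstract above only outlines the procedure (iteratively scaling vectors, then computing eigenvectors), so the following is a minimal sketch of one way such a loop could look, not the authors' implementation. The scaling rule (multiplying each residual vector by its own length raised to a power `q`), the use of power iteration to find the dominant eigenvector, and all function names are assumptions made for illustration.

```python
import math


def dominant_direction(columns, iters=200):
    """Approximate the top eigenvector of sum_j c_j c_j^T by power iteration.

    Assumption: a uniform start vector is good enough for a sketch; it can
    fail if that vector happens to be orthogonal to every column.
    """
    dim = len(columns[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # w = (C C^T) v, computed as sum_j (c_j . v) c_j
        w = [0.0] * dim
        for c in columns:
            dot = sum(ci * vi for ci, vi in zip(c, v))
            for i in range(dim):
                w[i] += dot * c[i]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def reduced_basis(docs, k, q=2.0):
    """Build k basis vectors by repeatedly rescaling residuals (hypothetical rule)."""
    residuals = [list(d) for d in docs]  # one term-weight vector per document
    basis = []
    for _ in range(k):
        # Assumed scaling step: longer residuals get more weight, so documents
        # that are still poorly represented pull the next basis vector harder.
        scaled = []
        for r in residuals:
            length = math.sqrt(sum(x * x for x in r))
            scaled.append([x * (length ** q) for x in r])
        b = dominant_direction(scaled)
        basis.append(b)
        # Remove the component along b from every residual.
        for r in residuals:
            dot = sum(ri * bi for ri, bi in zip(r, b))
            for i in range(len(r)):
                r[i] -= dot * b[i]
    return basis


def project(doc, basis):
    """Coordinates of a document vector in the reduced space."""
    return [sum(d * b for d, b in zip(doc, bv)) for bv in basis]


docs = [[3.0, 1.0, 0.0], [2.0, 0.0, 1.0], [0.0, 2.0, 2.0], [1.0, 0.0, 3.0]]
basis = reduced_basis(docs, k=2)
reduced = [project(d, basis) for d in docs]
```

With `q = 0` the scaling step leaves the residuals unchanged, so the weighting by residual length is the only part of this sketch that treats documents differently from a plain eigenvector computation on the residuals.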
Branimir Boguraev, Rie Kubota Ando
LREC 2006
Rie Kubota Ando, Tong Zhang
ICML 2007
Rie Kubota Ando, Lillian Lee
Natural Language Engineering
Branimir Boguraev, Rie Kubota Ando
Dagstuhl Seminar Proceedings 2005