Research papers


Here are some of my research papers. If you have questions or comments on these papers, have problems downloading them, or would like a copy of any paper, please feel free to contact me.

Book:

Sholom M. Weiss, Nitin Indurkhya, Tong Zhang, and Fred Damerau. Text Mining: Predictive Methods for Analyzing Unstructured Information, Springer-Verlag, New York, 2004.

Recent:

[RC23462]  Rie K. Ando and Tong Zhang. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data. Technical Report RC23462, IBM T.J. Watson Research Center, 2004.

[RC22980] Tong Zhang. From epsilon-entropy to KL-entropy: analysis of minimum information complexity density estimation. Technical Report RC22980,  IBM T.J. Watson Research Center, 2003.


2005:

[54] Tong Zhang and Bin Yu. Boosting with early stopping: Convergence and consistency. The Annals of Statistics, to appear.

[53] Tong Zhang. Learning Bounds for Kernel Regression using Effective Data Dimensionality. Neural Computation, to appear.

[52] Christoph Tillmann and Tong Zhang. A Localized Prediction Model for Statistical Machine Translation. ACL 05.

[51] Rie Ando and Tong Zhang. A High-Performance Semi-Supervised Learning Method for Text Chunking. ACL 05.

[50] Tong Zhang. Localized Upper and Lower Bounds for Some Estimation Problems. COLT 2005.

[49] Tong Zhang. Data Dependent Concentration Bounds for Sequential Prediction Algorithms. COLT 2005.

2004:

[48] Tong Zhang. Statistical Analysis of Some Multi-Category Large Margin Classification Methods. JMLR, 5:1225-1251, 2004.

[47] Fred J. Damerau, Tong Zhang, Sholom M. Weiss, and Nitin Indurkhya. Text categorization for a comprehensive time-dependent benchmark. Information Processing & Management, 40:209-221, 2004.

[46] Tong Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization. The Annals of Statistics, 32:56-85, 2004 (with discussion).

[45] Tong Zhang. Class-size independent generalization analysis of some discriminative multi-category classification methods. NIPS, 2004.

[44] Jinbo Bi and Tong Zhang. Support vector classification with input data uncertainty. NIPS, 2004.

[43] Tong Zhang. Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms. ICML, 2004.

[42] Li Zhang, Yue Pan, and Tong Zhang. Focused Named Entity Recognition using Machine Learning. SIGIR, 2004.

[41] Tong Zhang. On the Convergence of MDL Density Estimation. COLT, 2004.

[40] Jinbo Bi, Tong Zhang, and Kristin P. Bennett. Column-Generation Boosting Methods for Mixture of Kernels. KDD, 2004.

2003:

[39] Ron Meir and Tong Zhang. Generalization error bounds for Bayesian mixture algorithms. Journal of Machine Learning Research, 4:839-860, 2003.

[38] Shie Mannor, Ron Meir, and Tong Zhang. Greedy algorithms for classification - consistency, convergence rates, and adaptivity. Journal of Machine Learning Research, 4:713-741, 2003.

[37] Tong Zhang. Sequential greedy approximation for certain convex optimization problems. IEEE Transactions on Information Theory, 49:682-691, 2003.

[36] Tong Zhang. Leave-one-out bounds for kernel methods. Neural Computation, 15:1397-1437, 2003.

[35] Sholom M. Weiss and Tong Zhang.  The Handbook of Data Mining, Chapter on Performance Analysis and Evaluation. Lawrence Erlbaum Associates, 2003.

[34] Tong Zhang. An infinity-sample theory for multi-category large margin classification. In NIPS 03, 2004. to appear.

[33] Tong Zhang.  Learning bounds for a generalized family of Bayesian posterior distributions. In NIPS 03, 2004. to appear. (also see [RC22980])

[32] Tong Zhang and Bin Yu. On the convergence of boosting procedures. In ICML 03, pages 904-911, 2003.  (full paper)

[31] Radu Florian, Abe Ittycheriah, Hongyan Jing, and Tong Zhang. Named entity recognition through classifier combination. In Proceedings CoNLL 03, pages 168-171, 2003.

[30] Tong Zhang and David E. Johnson. A robust risk minimization based named entity recognition system.  In Proceedings CoNLL 03, pages 204-207, 2003.

[29] Tong Zhang, Fred Damerau, and David E. Johnson. Updating an NLP system to fit new domains: an empirical study on the sentence segmentation problem. In Proceedings CoNLL 03, pages 56-62, 2003.

[28] Hongyan Jing, Radu Florian, Xiaoqiang Luo, Tong Zhang, and Abraham Ittycheriah. How to get a Chinese name (entity): Segmentation and combination issues. In EMNLP 03, 2003.

2002:

[27] David E. Johnson, Frank J. Oles, Tong Zhang, and Thilo Goetz.  A decision-tree-based symbolic rule induction system for text categorization. IBM Systems Journal, 41:428-437, 2002.

[26] Tong Zhang and Carlo Tomasi. On the consistency of instantaneous rigid motion estimation. International Journal of Computer Vision, 46:51-79, 2002.

[25] Tong Zhang. Covering number bounds of certain regularized linear function classes. Journal of Machine Learning Research, 2:527-550, 2002.

[24] Tong Zhang and Vijay S. Iyengar. Recommender systems using linear classifiers. Journal of Machine Learning Research, 2:313-334, 2002.

[23] Tong Zhang, Fred Damerau, and David E. Johnson. Text chunking based on a generalization of Winnow. Journal of Machine Learning Research, 2:615-637, 2002.

[22] Tong Zhang.  On the dual formulation of regularized linear systems. Machine Learning, 46:91-129, 2002.

[21] Tong Zhang. Approximation bounds for some sparse kernel regression algorithms. Neural Computation, 14:3013-3042, 2002.

[20] Jane Cullum and Tong Zhang. Two-sided Arnoldi and non-symmetric Lanczos algorithms. SIAM Journal on Matrix Analysis and Applications, 24:303-319, 2002.

[19] Ron Meir and Tong Zhang. Data-dependent bounds for Bayesian mixture methods. In NIPS 02, 2003. (full paper [39])

[18] Tong Zhang. Effective dimension and generalization of kernel learning. In NIPS 02, 2003. (full paper)

[17] Shie Mannor, Ron Meir, and Tong Zhang.  The consistency of greedy algorithms for classification. In COLT 02, pages 319-333, 2002. (also see [38])

[16] Tong Zhang. Statistical behavior and consistency of support vector machines, boosting, and beyond. In ICML 02, pages 690-697, 2002. (full paper [46])

[15] Fred J. Damerau, Tong Zhang, Sholom M. Weiss, and Nitin Indurkhya. Experiments in high-dimensional text categorization. In SIGIR 2002, 2002. (full paper [47])

2001:

[14] Tong Zhang and Frank J. Oles. Text categorization based on regularized linear classification methods. Information Retrieval, 4:5-31, 2001.

[13] Tong Zhang and Gene H. Golub. Rank-one approximation to high order tensors. SIAM Journal on Matrix Analysis and Applications, 23:534-550, 2001.

[12] Tong Zhang.  A general greedy approximation algorithm with applications.  In NIPS 01, 2002. (Also see [37])

[11] Tong Zhang. Generalization performance of some learning problems in Hilbert functional spaces. In NIPS 01, 2002.

[10] Vijay S. Iyengar and Tong Zhang. Empirical study of recommender systems using linear classifiers. In The Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 16-27, 2001. (full paper [24])

[9] Tong Zhang.  Some sparse approximation bounds for regression problems. In ICML 01, pages 624-631, 2001. (full paper [21])

[8] Tong Zhang, Fred Damerau, and David E. Johnson.  Text chunking using regularized Winnow. In ACL 01, pages 539-546, 2001. (full paper [23])

[7] Tong Zhang.  A sequential approximation bound for some sample-dependent convex optimization problems with applications in learning. In  COLT 01, pages 65-81, 2001.

[6] Tong Zhang. A leave-one-out cross validation bound for kernel methods with applications in learning. In COLT 01, pages 427-443, 2001. (full paper [36])

2000:

[5] Jane Cullum, Albert Ruehli, and Tong Zhang. A method for reduced-order modeling and simulation of large interconnect circuits and its application to PEEC models including retardation. IEEE Trans. Circ. Sys., 47:261-273, 2000.

[4] Tong Zhang. Convergence of large margin separable linear classification. In NIPS 00, pages 357-363, 2001.

[3] Tong Zhang.  Regularized Winnow methods.  In NIPS 00, pages 703-709, 2001.  (note: A typo in Thm 3.2 of the original paper is fixed)

[2] Tong Zhang and Frank J. Oles. A probability analysis on the value of unlabeled data for classification problems.  In ICML 00, pages 1191-1198, 2000.  (note: we didn't write a longer version of the paper, in spite of comments in the paper suggesting so)

[1] Vijay S. Iyengar, Chid Apte, and Tong Zhang.  Active learning using adaptive resampling. In The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 91-98, 2000.



Some earlier papers:

T. Zhang, G. Golub, and K.H. Law.  Subspace iterative methods for eigenvalue problems. Lin. Alg. and Appl., 294:239-258, 1999.

T. Zhang.  Some theoretical results concerning the convergence of composition of regularized linear functions. In NIPS 99, pages 370-376, 2000.

T. Zhang and C. Tomasi.  Fast, robust, and consistent camera motion estimation. In CVPR 99, pages 164-170, 1999.

T. Zhang. Theoretical analysis of a class of randomized regularization methods. In COLT 99, pages 156-163, 1999.

T. Zhang, K.H. Law, and G. Golub.  On the homotopy method for perturbed symmetric generalized eigenvalue problems.  SIAM J. Sci. Comput., 19:1625-1645, 1998.

T. Zhang, G. Golub, and K.H. Law. Eigenvalue perturbation and the generalized Krylov subspace method. J. Applied Numer. Math., 27:185-202, 1998.

T. Zhang.  Compression by model combination.  In Proceedings of IEEE Data Compression Conference, DCC'98, pages 319-328, 1998.

J. Cullum, A. Ruehli, and T. Zhang. Model reduction for PEEC models including retardation. In Proc. IEEE 7th topical meeting on Electrical performance of electronic packaging, EPEP'98, pages 287-290, 1998.

D. Greene, F. Yao, and T. Zhang.  A linear algorithm for optimal context clustering with application to bi-level image coding. In IEEE Conference on image processing, ICIP'98, pages 508-511, 1998.

D. Greene, M. Vishwanath, F. Yao, and T. Zhang. A progressive Ziv-Lempel algorithm for image compression. In Proceedings of Compression and Complexity of Sequences, SEQUENCE'97, pages 136-144, 1997.

G. Taubin, T. Zhang, and G. Golub. Optimal surface smoothing as filter design.  In Proceedings of Fourth European Conference on Computer Vision, pages 283-292, 1996.

R.S. Strichartz, A. Taylor, and T. Zhang. Densities of self-similar measures on the line. Exper. Math., 4:101-128, 1995.