Publications
2005
- Embedded Predictive Modeling in a Parallel Relational Database by A. Dorneich, R. Natarajan, E. Pednault and C. Apte, to appear in Proceedings of the 21st ACM Symposium on Applied Computing, Special Track on Data Mining, April 2006, Dijon, France.
- Data Mining and Clinical Data Repositories: Insights from a 667,000 Patient Data Set by B. Robson, C. Apte, S. Weiss et al., to appear in Computers in Biology and Medicine, 2005.
- Ranking-Based Evaluation of Regression Models by S. Rosset, C. Perlich and B. Zadrozny, in Proceedings of the Fifth IEEE International Conference on Data Mining, November 2005.
- An Improved Categorization of Classifier's Sensitivity on Sample Selection Bias by W. Fan, I. Davidson, B. Zadrozny, and P. S. Yu, in Proceedings of the Fifth IEEE International Conference on Data Mining, November 2005.
- Business Performance Management System for CRM and Sales Execution by M. Ettl, B. Zadrozny, P. Chowdhary and N. Abe, in Proceedings of the Sixteenth International Conference on Database and Expert Systems Applications, pp. 908-913, August 2005.
- Robust Boosting and Its Relation to Bagging by S. Rosset, in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- Gene Classification: Issues and Challenges for Relational Learning by C. Perlich and S. Merugu, in Proceedings of the Workshop on Multi-Relational Data Mining (MRDM), at the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- One-Benefit Learning: Cost-Sensitive Learning with Restricted Cost Information by B. Zadrozny, in Proceedings of the Workshop on Utility-Based Data Mining (UBDM), at the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2005.
- ROC Confidence Bands: An Empirical Evaluation by S. Macskassy, F. Provost and S. Rosset, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Error Limiting Reductions Between Classification Tasks by A. Beygelzimer, V. Dani, T. Hayes, J. Langford and B. Zadrozny, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Relating Reinforcement Learning Performance to Classification Performance by J. Langford and B. Zadrozny, in Proceedings of the Twenty-Second International Conference on Machine Learning, August 2005.
- Approaching the ILP Challenge 2005: Class-Conditional Bayesian Propositionalization for Genetic Classification by C. Perlich, in Proceedings of the Fifteenth International Conference on Inductive Logic Programming, August 2005.
- Weighted One-Against-All by A. Beygelzimer, J. Langford and B. Zadrozny, in Proceedings of the Twentieth National Conference on Artificial Intelligence, July 2005.
- Learning from Identifier Attributes: Distribution-Based Aggregation for Relational Learning by C. Perlich and F. Provost, in Proceedings of the Dagstuhl Seminar 05051 (Probabilistic, Logical and Relational Learning - Towards a Synthesis), February 2005.
- Sparsity and Smoothness via the Fused Lasso by R. Tibshirani, M. Saunders, S. Rosset, J. Zhu and K. Knight, in Journal of the Royal Statistical Society Series B, Vol. 67 No. 1, 2005.
- Estimating Class Membership Probabilities using Classifier Learners by J. Langford and B. Zadrozny, in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, January 2005.
- Improvements to the Linear Programming based Scheduling of Web Advertisements by A. Nakamura and N. Abe, in Journal of Electronic Commerce Research, 5(1), 75-98, 2005.
- Sequential Risk Management in E-Business by Reinforcement Learning, by N. Abe, E. Pednault, B. Zadrozny, H. Wang, W. Fan, and C. Apte, in Handbook of Integrated Risk Management for E-Business: Measuring, Modeling and Managing Risk, A. Labbi, ed., J. Ross Publishing, 2005.
2004
- A Grid-based Approach for Enterprise-Scale Data Mining by R. Natarajan, R. Sion, C. Apte and I. Narang, in Proceedings of the Workshop on Data Mining and the Grid at the Fourth IEEE International Conference on Data Mining, November 2004.
- Boosting as a Regularized Path to A Maximum Margin Classifier by S. Rosset, J. Zhu and T. Hastie. Journal of Machine Learning Research, 5(Aug):941-973. August 2004.
- Tracking Curved Regularized Optimization Solution Paths by S. Rosset, in Neural Information Processing Systems, December 2004.
- A Method for Inferring Label Sampling Mechanisms in Semi-Supervised Learning by S. Rosset, J. Zhu, H. Zou and T. Hastie, in Neural Information Processing Systems, December 2004.
- The Entire Regularization Path for the Support Vector Machine by T. Hastie, S. Rosset, R. Tibshirani and J. Zhu. Journal of Machine Learning Research 5(Oct):1391-1415, October 2004. R package (short version to appear in NIPS 2004).
- An Iterative Method for Multi-Class Cost-Sensitive Learning by N. Abe, B. Zadrozny and J. Langford, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Seattle, August 2004.
- Cross Channel Optimized Marketing by Reinforcement Learning by N. Abe, N. Verma, C. Apte and R. Schroko, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Seattle, August 2004.
- Empirical Evaluation of Feature Subset Selection Based on a Real-world Data Set by P. Perner and C. Apte. Engineering Applications of Artificial Intelligence, Volume 17, Issue 3, Pages 285-288, April 2004.
- Transform Regression and the Kolmogorov Superposition Theorem by E. Pednault. IBM Research Report RC-23227, June 2004.
- Model Selection via the AUC by S. Rosset, in Proceedings of the Twenty-First International Conference on Machine Learning, July 2004.
- Learning and Evaluating Classifiers under Sample Selection Bias by B. Zadrozny, in Proceedings of the Twenty-First International Conference on Machine Learning, July 2004.
- Discussion of "Least Angle Regression" by Efron et al. by S. Rosset and Ji Zhu, in Annals of Statistics, April 2004 (preliminary version).
- Sampling Approach to Resource Light Mining by N. Abe, C. Apte, B. Bhattacharjee, K. Goldman, J. Langford and B. Zadrozny, in Proceedings of the Data Mining in Resource Constrained Environments Workshop at the Fourth SIAM International Conference on Data Mining, April 2004.
2003
- 1-norm Support Vector Machines by J. Zhu, S. Rosset, T. Hastie, and R. Tibshirani, in Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), 2003.
- Margin Maximizing Loss Functions by S. Rosset, J. Zhu, and T. Hastie, in Seventeenth Annual Conference on Neural Information Processing Systems (NIPS), 2003.
- Integrating Customer Value Considerations into Predictive Modeling by S. Rosset and E. Neumann, in IEEE International Conference on Data Mining (ICDM), 2003.
- Cost-Sensitive Learning by Cost-Proportionate Example Weighting by B. Zadrozny, J. Langford, and N. Abe, in IEEE International Conference on Data Mining (ICDM), 2003.
- Knowledge-Based Data Mining by S.M. Weiss, S.J. Buckley, S. Kapoor, and S. Damgaard, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Washington DC, August 24-27, 2003.
- Passenger-Based Predictive Modeling of Airline No-show Rates by R. D. Lawrence, S.J. Hong, and J. Cherrier, in Proceedings of the International Conference on Knowledge Discovery and Data Mining, Washington DC, August 24-27, 2003.
- Data Mining Analytics for Business Intelligence and Decision Support by C. Apte, in OR/MS Today, February 2003.
- Data Intensive Analytics for Predictive Modeling by C. Apte, S.J. Hong, R. Natarajan, E.P.D. Pednault, F. Tipu, and S. Weiss, in IBM Journal of R&D, Vol. 47, No. 1, Pages 17-23, January 2003.
- A Machine-Learning Approach to Optimal Bid Pricing by R. D. Lawrence, in Proceedings of the Eighth INFORMS Computing Society Conference on Optimization and Computation in the Network Era, Chandler, Arizona, January 2003.
- Reinforcement Learning with Immediate Rewards and Linear Hypotheses by Naoki Abe, Alan Biermann, and Philip Long, in Algorithmica, 37, 263-293, 2003.
2002
- Empirical Comparison of Various Reinforcement Learning Strategies in Sequential Targeted Marketing by N. Abe, E.P.D. Pednault, H. Wang, B. Zadrozny, W. Fan, and C. Apte, in Proceedings of the 2002 IEEE International Conference on Data Mining, December 2002.
- Prediction of MHC Class I Binding Peptides by Dynamic Experiment Design based on Query Learning with Hidden Markov Models by K. Udaka, H. Mamitsuka, Y. Nakaseko and N. Abe, in Journal of Immunology, 169(10), 5744-5753, 2002.
- Business Applications of Data Mining, by C. Apte, B. Liu, E.P.D. Pednault, and P. Smyth, in Communications of the ACM, Vol. 45, No. 8, August 2002.
- A Probabilistic Estimation Framework for Predictive Modeling Analytics, by C. Apte, R. Natarajan, E.P.D. Pednault, and F. Tipu, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Predictive Algorithms in the Management of Computer Systems, by R. Vilalta, C. Apte, J. Hellerstein, S. Ma, and S.M. Weiss, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Automated Generation of Model Cases for Help-Desk Applications, by S.M. Weiss and C. Apte, in IBM Systems Journal, Vol. 41, No. 3, August 2002.
- Experiments in High-Dimensional Text Categorization, by F. Damerau, T. Zhang, and S.M. Weiss, in Proceedings of ACM SIGIR International Conference on Information Retrieval, August 2002.
- Sequential Cost-Sensitive Decision Making with Reinforcement Learning, by E.P.D. Pednault, N. Abe, and B. Zadrozny, in Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada, July 2002.
- A System for Real-time Competitive Market Intelligence, by S.M. Weiss and N.K. Verma, in Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada, July 2002.
- Segmented Regression Estimators for Massive Data Sets by R. Natarajan and E.P.D. Pednault, in Proceedings of the SIAM Second International Conference on Data Mining, Crystal City, Virginia, April 2002.
- Multiplicative Adjustment of Class Probability: Educating Naive Bayes by S.J. Hong, J. Hosking, R. Natarajan, IBM Research Report RC-22393, April 2002. Condensed version in IEEE ICDM 2002.
2001
- Personalization of Supermarket Product Recommendations by R. D. Lawrence, G. Almasi, V. Kotlyar, M. Viveros, and S. Duri, in Data Mining and Knowledge Discovery, 5, 11-32, 2001.
- Segmentation-Based Modeling for Advanced Targeted Marketing, by C. Apte, E. Bibelnieks, R. Natarajan, E.P.D. Pednault, F. Tipu, D. Campbell, and B. Nelson, IBM Research Report RC-21982. In Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Francisco, August 2001.
- Using Simulated Pseudo Data to Speed Up Statistical Predictive Modeling from Massive Data Sets, by R. Natarajan and E.P.D. Pednault, in SIAM First International Conference on Data Mining, Chicago, IL, April 2001.
- Solving Regression Problems with Rule-Based Ensemble Classifiers, by N. Indurkhya and S.M. Weiss, in Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), San Francisco, August 2001.
- Lightweight Collaborative Filtering Method for Binary-Encoded Data, by S.M. Weiss and N. Indurkhya, in Fifth European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), Freiburg, Germany, September 2001.
- A New Approach for Item Choice Recommendations, by S.J. Hong, R. Natarajan, and I. Belitskaya, IBM Research Report RC-21962, in Third International Conference on Data Warehousing and Knowledge Discovery (DaWaK'01), September 2001, Munich, Germany.
2000
- Active Learning using Adaptive Resampling by V. Iyengar, C. Apte, and T. Zhang, in Proceedings of ACM SIGKDD 2000.
- Lightweight Rule Induction, by S.M. Weiss and N. Indurkhya, in Proceedings of the International Conference on Machine Learning (ICML) 2000.
- AI at IBM Research, by C. Apte, L. Morgenstern, and S.J. Hong, in IEEE Intelligent Systems, Nov./Dec. 2000, Volume 15, Number 6, pages 51-57. IBM Research Report RC-21907.
- Operational Data Analysis: Improved Predictions Using Multi-Computer Pattern Detection, by R. Vilalta, C. Apte, and S.M. Weiss, in Proceedings of the 11th IFIP/IEEE International Workshop on Distributed Systems: Operations & Management (DSOM 2000). Austin, Texas, USA.
- Lightweight Document Clustering, by S.M. Weiss, B.F. White, and C. Apte, IBM Research Report RC-21684, in Proceedings of PKDD 2000.
- Decision-Rule Solutions for Data Mining with Missing Values, by S.M. Weiss and N. Indurkhya, IBM Research Report RC-21783, in Proceedings of SBIA/IBERAMIA 2000 (Springer Verlag).
- Advances in Predictive Models for Data Mining by S.J. Hong and S.M. Weiss, in Pattern Recognition Letters Journal, 2000, IBM Research Report RC-21570. Earlier version appears in Proceedings of MLDM'99 (Springer), Predictive Data Mining Methods, pp. 13-20 (1999).
- The Importance of Estimation Errors in Cost Sensitive Learning, by E.P.D. Pednault, B.K. Rosen, and C. Apte, IBM Research Report RC-21757.
- Handling Imbalanced Data Sets in Insurance Risk Modeling, by E.P.D. Pednault, B.K. Rosen, and C. Apte, IBM Research Report RC-21731.
- Analysis of Regularized Linear Functions for Classification Problems, by T. Zhang, IBM Research Report RC-21572.
- Lightweight Document Matching by S.M. Weiss, B.F. White, C. Apte, and F. Damerau, in IJCAI-99 Workshop on Text Mining: Foundations, Applications, and Techniques, 1999. Also in IEEE Intelligent Systems, Volume 15, Number 2, March/April 2000.
1999
- A Scalable Parallel Algorithm for Self-Organizing Maps with Applications to Sparse Data Mining Problems by R. D. Lawrence, G.S. Almasi, and H. Rushmeier, in Data Mining and Knowledge Discovery, 3, 171-195, 1999.
- Probabilistic Estimation Based Data Mining for Discovering Insurance Risks by C. Apte, E. Grossman, E. Pednault, B. Rosen, F. Tipu, and B. White. IBM Research Report RC-21483, in IEEE Intelligent Systems, Volume 14, Number 6, November/December 1999.
- Maximizing Text-Mining Performance by S.M. Weiss, C. Apte, F. Damerau, D.E. Johnson, F.J. Oles, T. Goetz, and T. Hampp, in IEEE Intelligent Systems, Volume 14, Number 4, July/August 1999.
- Partitioning Nominal Attributes in Decision Trees by D. Coppersmith, S.J. Hong, and J. Hosking, in Journal of Data Mining and Knowledge Discovery, Volume 3, Number 2, June 1999. Earlier version appears in Intelligent Data Engineering and Learning, Proceedings of the 1st International Symposium, IDEAL '98, ed. L. Xu, L. W. Chan, I. King and A. Fu, 393-400. Singapore: Springer-Verlag. IBM Research Division Technical Report RC-21114.
- Insurance Risk Modeling Using Data Mining Technology by C. Apte, E. Grossman, E. Pednault, B. Rosen, F. Tipu, and B. White, in Proceedings of The Third International Conference on The Practical Applications of Knowledge Discovery and Data Mining, April 1999. IBM Research Division Technical Report RC-21314.
1998
- Data Mining with Extended Symbolic Models by C. Apte, E. Pednault, and S.M. Weiss, in Proceedings of Joint Statistical Meeting (JSM'98), Statistical Computing Section, 1998.
- Text Mining with Decision Trees and Decision Rules by C. Apte, F. Damerau, and S.M. Weiss, in Conference on Automated Learning and Discovery, Carnegie-Mellon University, June 1998.
- Estimating Performance Gains for Voted Decision Trees by N. Indurkhya and S.M. Weiss, IBM Research Division Technical Report RC-21199, in Intelligent Data Analysis (IDA).
- Decomposition of Heterogeneous Classification Problems by C. Apte, S.J. Hong, J. Hosking, J. Lepre, E. Pednault, and B. Rosen, in Intelligent Data Analysis, 1998. Expanded version of paper with same title in Proceedings of IDA'97, August 1997.
- Statistical Learning Theory by E.P.D. Pednault, in the MIT Encyclopedia of the Cognitive Sciences, 1998.
1997
- Attribute Selection for Modeling by I. Kononenko and S.J. Hong, in Future Generation Computer Systems, November 1997.
- Data Mining: Guest Editorial by S.J. Hong, in Future Generation Computer Systems, November 1997.
- Data Mining with Decision Trees and Decision Rules by C. Apte and S.M. Weiss, in Future Generation Computer Systems, November 1997.
- A Statistical Perspective on Data Mining by J. Hosking, E. Pednault and M. Sudan, in Future Generation Computer Systems, November 1997.
- Data Mining - An Industrial Research Perspective by C. Apte, in IEEE Computational Science and Engineering, April-June 1997.
- Use of Contextual Information for Feature Ranking and Discretization by S.J. Hong, in IEEE Transactions on Knowledge and Data Engineering, 1997.
- R-MINI: An Iterative Approach for Generating Minimal Rules from Examples by S.J. Hong, in IEEE Transactions on Knowledge and Data Engineering, 1997.
1996 and before
- RAMP: Rules Abstraction for Modeling and Prediction by C. Apte, S.J. Hong, J. Lepre, S. Prasad, and B. Rosen, IBM Research Division Technical Report RC-20271.
- Use of Randomization to Normalize Feature Merits by S.J. Hong, J. Hosking, and S. Winograd, in Proceedings of ISIS'96, 1996.
- Predicting Equity Returns from Securities Data by C. Apte and S.J. Hong, in Advances in Knowledge Discovery and Data Mining, AAAI Press, 1995.
- Automated Learning of Decision Rules for Text Categorization by C. Apte, F. Damerau, and S.M. Weiss, in ACM Transactions on Information Systems, 1994.
- Towards Language Independent Automated Learning of Text Categorization Models by C. Apte, F. Damerau, and S.M. Weiss, in ACM SIGIR'94, July 1994.
- Case Studies in High-Dimensional Classification by C. Apte, R. Sasisekharan, V. Seshadri, and S.M. Weiss, in Journal of Applied Artificial Intelligence, Vol. 4, No. 3, July 1994.
- Predicting Defects in Disk Drive Manufacturing: A Case Study in High-Dimensional Classification by C. Apte, S.M. Weiss, and G. Grout, in IEEE Annual Conference on AI Applications, CAIA-93, March 1993.
Revised December 9, 2004