The Data Analytics Research Project
Publications and Technical Reports
- 1-norm
Support Vector Machines by J. Zhu, S. Rosset, T. Hastie,
and R. Tibshirani, in Seventeenth Annual Conference on Neural Information
Processing Systems (NIPS), 2003.
- Margin
Maximizing Loss Functions by S. Rosset, J. Zhu, and T.
Hastie, in Seventeenth Annual Conference on Neural Information Processing
Systems (NIPS), 2003.
- Integrating
Customer Value Considerations into Predictive Modeling
by S. Rosset and E. Neumann, in IEEE International Conference on Data
Mining (ICDM), 2003.
- Cost-Sensitive
Learning by Cost-Proportionate Example Weighting by B.
Zadrozny, J. Langford, and N. Abe, in IEEE International Conference on
Data Mining (ICDM), 2003.
- Knowledge-Based
Data Mining by S.M. Weiss, S.J. Buckley, S. Kapoor, and
S. Damgaard, in Proceedings of the International Conference on Knowledge
Discovery and Data Mining, Washington DC, August 24-27, 2003.
- Passenger-Based
Predictive Modeling of Airline No-show Rates by R. D. Lawrence,
S.J. Hong, and J. Cherrier, in Proceedings of the International Conference
on Knowledge Discovery and Data Mining, Washington DC, August 24-27,
2003.
- A
Machine-Learning Approach to Optimal Bid Pricing by R.
D. Lawrence, in Proceedings of the Eighth INFORMS Computing Society
Conference on Optimization and Computation in the Network Era, Chandler,
Arizona, January 2003.
- Empirical
Comparison of Various Reinforcement Learning Strategies in Sequential
Targeted Marketing by N. Abe, E.P.D. Pednault, H. Wang,
B. Zadrozny, W. Fan, and C. Apte, in Proceedings of the 2002 IEEE
International Conference on Data Mining, December 2002.
- Prediction
of MHC Class I Binding Peptides by Dynamic Experiment Design based
on Query Learning with Hidden Markov Models by K. Udaka,
H. Mamitsuka, Y. Nakaseko and N. Abe, in Journal of Immunology, 169(10),
5744-5753, 2002.
- Business
Applications of Data Mining, by C. Apte, B. Liu, E.P.D. Pednault,
and P. Smyth, in Communications of the ACM, Vol. 45, No. 8, August
2002.
- A
Probabilistic Estimation Framework for Predictive Modeling Analytics,
by C. Apte, R. Natarajan, E.P.D. Pednault, and F. Tipu, in IBM Systems
Journal, Vol. 41, No. 3, August 2002.
- Predictive
Algorithms in the Management of Computer Systems, by R. Vilalta,
C. Apte, J. Hellerstein, S. Ma, and S.M. Weiss, in IBM Systems Journal,
Vol. 41, No. 3, August 2002.
- Automated
Generation of Model Cases for Help-Desk Applications, by S.M.
Weiss and C. Apte, in IBM Systems Journal, Vol. 41, No. 3, August
2002.
- Experiments
in High-Dimensional Text Categorization, by F. Damerau, T. Zhang,
and S.M. Weiss, in Proceedings of ACM SIGIR International Conference
on Information Retrieval, August 2002.
- Sequential
Cost-Sensitive Decision Making with Reinforcement Learning, by
E.P.D. Pednault, N. Abe, and B. Zadrozny, in Eighth ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (SIGKDD), Edmonton,
Canada, July 2002.
- A
System for Real-time Competitive Market Intelligence, by S.M.
Weiss and N.K. Verma, in Eighth ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (SIGKDD), Edmonton, Canada,
July 2002.
- Segmented
Regression Estimators for Massive Data Sets by R. Natarajan
and E.P.D. Pednault, in Proceedings of the SIAM Second International
Conference on Data Mining, Crystal City, Virginia, April 2002.
- Multiplicative
Adjustment of Class Probability: Educating Naive Bayes
by S.J. Hong, J. Hosking, and R. Natarajan, IBM Research Report RC-22393,
April 2002. Condensed version in IEEE ICDM 2002.
- Personalization
of Supermarket Product Recommendations by R. D. Lawrence,
G. Almasi, V. Kotlyar, M. Viveros, and S. Duri, in Data Mining and
Knowledge Discovery, 5, 11-32, 2001.
- Segmentation-Based
Modeling for Advanced Targeted Marketing, by C. Apte, E. Bibelnieks,
R. Natarajan, E.P.D. Pednault, F. Tipu, D. Campbell, and B. Nelson,
IBM Research Report RC-21982. In Seventh ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (SIGKDD), San Francisco,
August 2001.
- Using
Simulated Pseudo Data to Speed Up Statistical Predictive Modeling
from Massive Data Sets, by R. Natarajan and E.P.D. Pednault, in
SIAM First International Conference on Data Mining, Chicago, IL, April
2001.
- Solving
Regression Problems with Rule-Based Ensemble Classifiers, by N.
Indurkhya and S.M. Weiss, in Seventh ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (SIGKDD), San Francisco, August
2001.
- Lightweight
Collaborative Filtering Method for Binary-Encoded Data, by S.M.
Weiss and N. Indurkhya, in Fifth European Conference on Principles
and Practice of Knowledge Discovery in Databases (PKDD), Freiburg,
Germany, September 2001.
- A
New Approach for Item Choice Recommendations, by S.J. Hong, R.
Natarajan, and I. Belitskaya, IBM Research Report RC-21962, in Third
International Conference on Data Warehousing and Knowledge Discovery
(DaWaK'01), September 2001, Munich, Germany.
- Active
Learning using Adaptive Resampling by V. Iyengar, C. Apte, and
T. Zhang, in Proceedings of ACM SIGKDD 2000.
- Lightweight
Rule Induction, by S.M. Weiss and N. Indurkhya, in Proceedings
of the International Conference on Machine Learning (ICML) 2000.
- AI
at IBM Research, by C. Apte, L. Morgenstern, and S.J. Hong, in
IEEE Intelligent Systems, Nov./Dec. 2000, Volume 15, Number 6, pages
51-57. IBM Research Report RC-21907.
- Operational
Data Analysis: Improved Predictions Using Multi-Computer Pattern Detection,
by R. Vilalta, C. Apte, and S.M. Weiss, in Proceedings of the 11th
IFIP/IEEE International Workshop on Distributed Systems: Operations
& Management (DSOM 2000). Austin, Texas, USA.
- Lightweight
Document Clustering, by S.M. Weiss, B.F. White, and C. Apte, IBM
Research Report RC-21684, in Proceedings of PKDD 2000.
- Decision-Rule
Solutions for Data Mining with Missing Values, by S.M. Weiss and
N. Indurkhya, IBM Research Report RC-21783, in Proceedings of SBIA/IBERAMIA
2000 (Springer Verlag).
- Advances
in Predictive Models for Data Mining by S.J. Hong and S.M. Weiss,
in Pattern Recognition Letters, 2000, IBM Research Report RC-21570.
Earlier version appears in Proceedings of MLDM'99 (Springer), Predictive
Data Mining Methods, pp. 13-20 (1999).
- The
Importance of Estimation Errors in Cost Sensitive Learning, by
E.P.D. Pednault, B.K. Rosen, and C. Apte, IBM Research Report RC-21757.
- Handling
Imbalanced Data Sets in Insurance Risk Modeling, by E.P.D. Pednault,
B.K. Rosen, and C. Apte, IBM Research Report RC-21731.
- Analysis
of Regularized Linear Functions for Classification Problems, by
T. Zhang, IBM Research Report RC-21572.
- Lightweight
Document Matching by S.M. Weiss, B.F. White, C. Apte, and
F. Damerau, in IJCAI-99 Workshop on Text Mining: Foundations, Applications,
and Techniques, 1999. Also in IEEE Intelligent Systems, Volume 15,
Number 2, March/April 2000.
- A
Scalable Parallel Algorithm for Self-Organizing Maps with Applications
to Sparse Data Mining Problems by R. D. Lawrence, G.S.
Almasi, and H. Rushmeier, in Data Mining and Knowledge Discovery,
3, 171-195, 1999.
- Probabilistic
Estimation Based Data Mining for Discovering Insurance Risks
by C. Apte, E. Grossman, E. Pednault, B. Rosen, F. Tipu, and B. White.
IBM Research Report RC-21483, in IEEE Intelligent Systems, Volume
14, Number 6, November/December 1999.
- Maximizing
Text-Mining Performance by S.M. Weiss, C. Apte, F. Damerau,
D.E. Johnson, F.J. Oles, T. Goetz, and T. Hampp, in IEEE Intelligent
Systems, Volume 14, Number 4, July/August 1999.
- Partitioning
Nominal Attributes in Decision Trees by D. Coppersmith,
S.J. Hong, and J. Hosking, in Journal of Data Mining and Knowledge
Discovery, Volume 3, Number 2, June 1999. Earlier version appears
in Intelligent Data Engineering and Learning, Proceedings of the 1st
International Symposium, IDEAL '98, ed. L. Xu, L. W. Chan, I. King
and A. Fu, 393-400. Singapore: Springer-Verlag. IBM Research Division
Technical Report RC-21114.
- Insurance
Risk Modeling Using Data Mining Technology by C. Apte,
E. Grossman, E. Pednault, B. Rosen, F. Tipu, and B. White, in Proceedings
of The Third International Conference on The Practical Applications
of Knowledge Discovery and Data Mining, April 1999. IBM Research Division
Technical Report RC-21314.
- Data
Mining with Extended Symbolic Models by C. Apte, E. Pednault,
and S.M. Weiss, in Proceedings of Joint Statistical Meeting (JSM'98),
Statistical Computing Section, 1998.
- Text
Mining with Decision Trees and Decision Rules by C. Apte,
F. Damerau, and S.M. Weiss, in Conference on Automated Learning and
Discovery, Carnegie-Mellon University, June 1998.
- Estimating
Performance Gains for Voted Decision Trees by N. Indurkhya
and S.M. Weiss, IBM Research Division Technical Report RC-21199, in
Intelligent Data Analysis (IDA).
- Decomposition
of Heterogeneous Classification Problems by C. Apte, S.J.
Hong, J. Hosking, J. Lepre, E. Pednault, and B. Rosen, in Intelligent
Data Analysis, 1998. Expanded version of paper with same title in
proceedings of IDA'97, August 1997.
- Statistical
Learning Theory by E.P.D. Pednault, in the MIT Encyclopedia
of the Cognitive Sciences, 1998.
- Attribute
Selection for Modeling by I. Kononenko and S.J. Hong, in
Future Generation Computer Systems, November 1997.
- Data
Mining: Guest Editorial by S.J. Hong, in Future Generation
Computer Systems, November 1997.
- Data
Mining with Decision Trees and Decision Rules by C. Apte
and S.M. Weiss, in Future Generation Computer Systems, November 1997.
- A
Statistical Perspective on Data Mining by J. Hosking, E.
Pednault and M. Sudan, in Future Generation Computer Systems, November
1997.
- Data
Mining - An Industrial Research Perspective by C. Apte,
in IEEE Computational Science and Engineering, April-June 1997.
- RAMP:
Rules Abstraction for Modeling and Prediction by C. Apte,
S.J. Hong, J. Lepre, S. Prasad, and B. Rosen, IBM Research Division
Technical Report RC-20271.
- Use
of Randomization to Normalize Feature Merits by S.J. Hong,
J. Hosking, and S. Winograd, in proceedings of ISIS'96, 1996.
- Use
of Contextual Information for Feature Ranking and Discretization
by S.J. Hong, in IEEE Transactions on Knowledge and Data Engineering,
1997.
- R-MINI:
An Iterative Approach for Generating Minimal Rules from Examples
by S.J. Hong, in IEEE Transactions on Knowledge and Data Engineering,
1997.
- Predicting
Equity Returns from Securities Data by C. Apte and S.J.
Hong, in Advances in Knowledge Discovery and Data Mining, AAAI Press,
1995.
- Automated
Learning of Decision Rules for Text Categorization by C.
Apte, F. Damerau, and S.M. Weiss, in ACM Transactions on Information
Systems, 1994.
- Towards
Language Independent Automated Learning of Text Categorization Models
by C. Apte, F. Damerau, and S.M. Weiss, in ACM SIGIR'94, July 1994.
- Case
Studies in High-Dimensional Classification by C. Apte,
R. Sasisekharan, V. Seshadri, and S.M. Weiss, in Journal of Applied
Artificial Intelligence, Vol. 4, No. 3, July 1994.
- Predicting
Defects in Disk Drive Manufacturing: A Case Study in High-Dimensional
Classification by C. Apte, S.M. Weiss, and G. Grout, in
IEEE Annual Conference on AI Applications, CAIA-93, March 1993.
Revised January 26, 2004