IBM Journal of Research and Development 
Volume 48, Number 5/6, 2004
IBM Research in Asia

The eShopmonitor: A comprehensive data extraction tool for monitoring Web sites - References

by N. Agrawal, R. Ananthanarayanan, R. Gupta, S. Joshi, R. Krishnapuram, and S. Negi

References

  1. N. Kushmerick and B. Thomas, “Adaptive Information Extraction: Core Technologies for Information Agents,” in Intelligent Information Agents R&D in Europe: An AgentLink Perspective, Springer-Verlag, New York, 2002.
  2. N. Kushmerick, “Wrapper Induction: Efficiency and Expressiveness,” Artificial Intelligence 118, No. 1/2, 15–68 (2000).
  3. C. A. Knoblock, K. Lerman, S. Minton, and I. Muslea, “Accurately and Reliably Extracting Data from the Web: A Machine Learning Approach,” IEEE Data Eng. Bull. 23, No. 4, 33–41 (2000).
  4. G. Huck, P. Fankhauser, K. Aberer, and E. J. Neuhold, “Jedi: Extracting and Synthesizing Information from the Web,” Proceedings of the 3rd International Conference on Cooperative Information Systems (CoopIS), 1998, pp. 32–43.
  5. V. Crescenzi, G. Mecca, and P. Merialdo, “Roadrunner: Towards Automatic Data Extraction from Large Web Sites,” Proceedings of the 27th Very Large Database (VLDB) Conference, Rome, Italy, 2001, pp. 109–118.
  6. J. Hammer, H. Garcia-Molina, J. Cho, A. Crespo, and R. Aranha, “Extracting Semistructured Information from the Web,” Proceedings of the Workshop on Management of Semistructured Data, 1997, pp. 18–25.
  7. B. Adelberg, “NoDoSE—A Tool for Semiautomatically Extracting Structured and Semistructured Data from Text Documents,” Proceedings of the ACM SIGMOD Conference on Management of Data, 1998, pp. 283–294.
  8. C.-H. Chang and S.-C. Lui, “IEPAD: Information Extraction Based on Pattern Discovery,” Proceedings of the 10th International Conference on the World Wide Web, ACM, 1-58113-348-0/01/0005, 2001.
  9. J. Myllymaki, “Effective Web Data Extraction with Standard XML Technologies,” Proceedings of the 10th International Conference on the World Wide Web, ACM, 1-58113-348-0/01/0005, 2001.
  10. J. Freire, B. Kumar, and D. Lieuwen, “Webviews: Accessing Personalized Web Content and Services,” Proceedings of the 10th International Conference on the World Wide Web, ACM, 1-58113-348-0/01/0005, 2001.
  11. M. Abe and M. Hori, “Robust Pointing by XPath Language: Authoring Support and Empirical Evaluation,” Proceedings of the IEEE Symposium on Applications and the Internet, 2003, pp. 156–165.
  12. J. Kahan, M.-R. Koivunen, E. Prud'Hommeaux, and R. R. Swick, “Annotea: An Open RDF Infrastructure for Shared Web Annotations,” Proceedings of the 10th International Conference on the World Wide Web, ACM, 1-58113-348-0/01/0005, 2001.
  13. L. Liu, C. Pu, and W. Han, “Xwrap: An XML-Enabled Wrapper Construction System for Web Information Sources,” Proceedings of the 16th International Conference on Data Engineering (ICDE), 2000, pp. 611–621.
  14. R. Baumgartner, S. Flesca, and G. Gottlob, “Visual Web Information Extraction with Lixto,” Proceedings of the 27th Very Large Database (VLDB) Conference, Rome, Italy, 2001, pp. 119–128.
  15. M. Garofalakis, A. Gionis, R. Rastogi, S. Seshadri, and K. Shim, “XTRACT: A System for Extracting Document Type Descriptors from XML Documents,” Proceedings of the ACM SIGMOD Conference, 2000, pp. 165–176.
  16. V. Boyapati, K. Chevrier, A. Finkel, N. Glance, T. Pierce, R. Stockton, and C. Whitmer, “Changedetector™: A Site-Level Monitoring Tool for the WWW,” Proceedings of the 11th International Conference on the World Wide Web, ACM, 2002, pp. 570–579.
  17. N. Agrawal, R. Ananthanarayanan, R. Gupta, S. Joshi, R. Krishnapuram, and S. Negi, “eShopmonitor: A Web Content Monitoring Tool,” Proceedings of the 20th International Conference on Data Engineering (ICDE), 2004, in press.
  18. S. Joshi, N. Agrawal, R. Krishnapuram, and S. Negi, “A Bag of Paths Model for Measuring Structural Similarity in Web Documents,” Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2003, pp. 577–582.