Dynamic Web content can consume orders of magnitude more CPU time to serve
than static content. Generating dynamic content is often the performance
bottleneck for Web sites, even if only a fraction of the content is generated
dynamically. The key challenge is to serve dynamic content efficiently while
providing current and consistent information.
The following papers describe techniques which we have developed for efficiently
publishing dynamic content. Several of these techniques can be applied to
changing content in many different forms and are not restricted to the Web.
The ACM Transactions on Internet Technology and SC
papers describe some of our experiences when we deployed our techniques
commercially.
- "Automatic Fragment Detection in Dynamic Web Pages and its Impact on Caching" (pdf, with Lakshmish Ramaswamy, Ling Liu, and Fred Douglis). In IEEE Transactions on Knowledge and Data Engineering vol. 17 #6, June 2005.
- "A Fragment-Based Approach for Efficiently Creating Dynamic Web Content"
(pdf, with Jim Challenger, Paul Dantzig, and Karen Witting). In
ACM Transactions on Internet Technology vol. 5 #2, May 2005.
- "Automatic Detection of Fragments in Dynamically Generated Web Pages" (pdf, with Lakshmish Ramaswamy, Ling Liu, and Fred Douglis). Best Paper Award. In Proceedings of the 13th International World Wide Web Conference (WWW2004), New York City, May 2004.
- "Application-Specific Delta-Encoding via Resemblance Detection"
(with Fred Douglis). In Proceedings of the 2003 USENIX Annual Technical Conference
(USENIX '03), San Antonio, Texas, June 2003.
- "A Scalable and Highly Available System for Serving Dynamic Data at Frequently Accessed Web Sites"
(with Jim Challenger and Paul Dantzig).
In Proceedings of ACM/IEEE Supercomputing '98 (SC98), Orlando, Florida,
November 1998.
The ability to cache dynamic content can greatly improve performance. Many systems do not allow dynamic
content to be cached because of the problem of maintaining consistency. The following papers describe
techniques we have developed for consistently caching dynamic data. The IEEE/ACM Transactions on
Networking and USITS papers describe our experiences caching dynamic data for commercial Web sites.
- "Efficiently Serving Dynamic Data at Highly Accessed Web Sites" (pdf, with Jim Challenger, Paul Dantzig, Mark Squillante, and Li Zhang). In
IEEE/ACM Transactions on Networking vol. 12 #2, April 2004.
- "Engineering Web Cache Consistency"
(pdf, with Jian Yin, Lorenzo Alvisi, and Mike Dahlin).
In ACM Transactions on Internet Technology vol. 2 #3, August 2002.
- "Engineering server-driven consistency for large scale dynamic web services"
(pdf, with Jian Yin, Lorenzo Alvisi, and Mike Dahlin). Best Paper Award.
In Proceedings of the 10th International World Wide Web Conference (WWW10), Hong Kong,
May 2001.
- "Improving Web Server Performance by Caching Dynamic Data"
(postscript, with Jim Challenger). In Proceedings of the USENIX 1997
Symposium on Internet Technologies and Systems (USITS '97), Monterey, CA, December 1997.
Caching is extremely important for improving the performance of distributed systems and can be deployed at
multiple places. The following papers describe our work on caches that apply to a broad range of
distributed applications and are not limited to the Web.
The caching systems described in the Computer Networks, Middleware, and IPCCC papers have been deployed commercially.
- "Network-Aware Partial Caching for Internet Streaming Media Delivery"
(pdf, with Shudong Jin and Azer Bestavros). In ACM/Springer Multimedia Systems Journal vol. 9 #4, October 2003,
copyright Springer-Verlag.
- "Architecture of a Web Server Accelerator"
(pdf, with Junehwa Song, Eric Levy-Abegnoli, and Daniel Dias).
In Computer Networks, vol. 38 #1, January 2002.
- "Web Proxy Acceleration"
(with Daniela Rosu and Daniel Dias). In
Cluster Computing,
Vol. 4 #4, October 2001.
- "A Middleware System Which Intelligently Caches Query Results"
(pdf, with Louis Degenaro, Ilya Lipkind, and Isabelle Rouvellou).
In Proceedings of ACM/IFIP Middleware 2000, Palisades, New York,
April 2000 (published by Springer-Verlag).
- "Design and Performance of a General-Purpose Software Cache"
(postscript). In Proceedings of the 18th IEEE
International Performance, Computing, and Communications Conference (IPCCC'99),
Phoenix/Scottsdale, Arizona, February 1999.
The following papers provide a general overview of Web performance.
- "Improving Web Site Performance" (with Erich Nahum, Anees Shaikh, and Renu Tewari). In The Practical Handbook of Internet Computing,
Copyright 2005, Chapman & Hall/CRC Press, Munindar P. Singh ed.
- "Web Caching, Consistency and Content Distribution" (with Erich Nahum, Anees Shaikh, and Renu Tewari). In The Practical Handbook of Internet Computing,
Copyright 2005, Chapman & Hall/CRC Press, Munindar P. Singh ed.
- "Architecting Web Sites for High Performance"
(pdf, with Daniela Rosu).
In Scientific Programming, vol. 10 #1, June 2002.
- "High-Performance Web Site Design Techniques"
(pdf, with Jim Challenger, Daniel Dias, and Paul Dantzig). In
IEEE Internet Computing, vol. 4 #2, March/April 2000.
Most of my work in load balancing and scheduling has been for client-server systems handling
high request rates. The techniques described in the Computer Networks and SC
papers have been commercially deployed.
- "How to determine a good multi-programming level for external scheduling"
(pdf, with Bianca Schroeder, Mor Harchol-Balter, Erich Nahum,
and Adam Wierman). In Proceedings of the 22nd IEEE International
Conference on Data Engineering, Atlanta, Georgia, April 2006.
- "A Tiered System for Serving Differentiated Content"
(pdf, with Huamin Chen).
In World Wide Web: Internet and Web Information Systems vol. 6 #4, December 2003.
- "Architecture of a Web Server Accelerator"
(pdf, with Junehwa Song, Eric Levy-Abegnoli, and Daniel Dias).
In Computer Networks, vol. 38 #1, January 2002.
- "A Scalable and Highly Available System for Serving Dynamic Data at Frequently Accessed Web Sites"
(with Jim Challenger and Paul Dantzig).
In Proceedings of ACM/IEEE Supercomputing '98 (SC98), Orlando, Florida,
November 1998.
- "Dual-Quorum Replication for Edge Services" (pdf, with Lei Gao, Mike
Dahlin, Jiandan Zheng and Lorenzo Alvisi). In Proceedings of the ACM/IFIP/USENIX 6th International
Middleware Conference (Middleware 2005), Grenoble, France, November/December 2005.
- "Thema: Byzantine-Fault-Tolerant Middleware for Web-Service Applications"
(with Michael G. Merideth, Thomas Mikalsen, Stefan Tai, Isabelle Rouvellou, and Priya
Narasimhan). In Proceedings of the 24th IEEE Symposium on Reliable Distributed Systems (SRDS 2005), Orlando, Florida, October 2005.
- "Improving Availability and Performance with Application-Specific Data Replication" (pdf, with Lei Gao, Mike Dahlin, Amol Nayate, and Jiandan Zheng). In IEEE Transactions on Knowledge and Data Engineering vol. 17 #1, January 2005.
- "Transparent Information Dissemination" (pdf, with Amol Nayate and Mike Dahlin). In Proceedings of ACM/IFIP/USENIX Middleware 2004, Toronto, Canada, October 2004.
- "Application Specific Data Replication for Edge Services"
(pdf, with Lei Gao, Mike Dahlin, Amol Nayate, and Jiandan Zheng). Best Student Paper Award.
In Proceedings of the 12th International World Wide Web Conference (WWW2003), Budapest, Hungary,
May 2003.
- "Design and Implementation of a Secure Distributed Data Repository"
(postscript, with Robert Cahn, Juan Garay, and Charanjit Jutla).
In Proceedings of the 14th IFIP International Information Security Conference (SEC '98),
Vienna, Austria and Budapest, Hungary, September 1998.
- "Software Exploitation of a Fault-Tolerant Computer with a Large Memory"
(postscript, with Frank Eskesen, Michel Hack, Richard King, and
Nagui Halim). In Proceedings of the 28th IEEE International Symposium on Fault-Tolerant
Computing Systems (FTCS '98), Munich, Germany, June 1998.
I am interested in storage allocation for both main memory and disk.
A major theme of my work in this area has been storage allocators which adapt
themselves to request size distributions to optimize performance and minimize
fragmentation.
The first paper describes a disk storage allocation system which outperforms both
file systems and databases for the workloads we used. This disk storage
allocation system has been deployed commercially. The second paper presents
main memory storage allocation algorithms which are particularly well suited for
parallel computer systems.
I am also interested in capacity planning and performance modelling for both
scientific and commercial workloads. I am developing improved techniques
for predicting future customer workloads from past behavior. These predictions
allow customers to estimate how much capacity they will need. They also help
schedule tasks to optimize utilization of system resources and to make
efficient use of spare CPU cycles in grid environments.
Copyright Notices
- ACM Papers - Copyright © by Association
for Computing Machinery, Inc. Permission to make digital or hard copies of part
or all of this work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit or commercial advantage.
To copy otherwise, to republish, to post on servers, or to redistribute to lists,
requires prior specific permission and/or a fee.
- IEEE Papers - Copyright
© by IEEE. Permission to make digital or hard copies of part or
all of this work for personal or classroom use is granted without fee provided
that copies are not made or distributed for profit. To copy otherwise, to republish,
to post on servers, or to redistribute to lists, requires prior specific permission
and/or a fee.