[ IBM Research ]
[ Find ] [ News ] [ Products ] [ Support ] [ Business solutions ] [ Inside IBM ] [ Interest groups ]

Mark N. Wegman's professional interests:

(or click here to find out more about me personally)

There are a number of areas that I've been interested in, and have done research in. Some of these areas have overlapped in a synergistic way. Detailed information can be found in my papers or patents.
 I'm very proud of a group that I manage at IBM.  Some of the work for this group is described as part of the work we do in software technology

Some of the things I'm best known for include work on

  • Data compression -- where with Victor Miller we discovered a variant on Lempel-Ziv's compression algorithm, which in the early '80's was probably the best compression algorithm known.  Part of this algorithm were independently invented by Welch and form the basis of many current standards  including GIF's, the Unix Compress algorithm and various modem standards.  Our work is covered by US patent no. US04814746.
  • Program optimization and Static Single Assignment analysis for programming languages.  This work done jointly with Ken Zadeck and many others allows us to separate a variable in a program into several parts.  We then give several algorithms which can efficiently analyze a program and prove various kinds of information which will be true about the various parts of a variable.  One key insight is that by separating the variables into parts where there is only one assignment per part we can prove more precise information about the parts than the whole variable.  Moreover, by doing this we transform the program into almost a functional form where flow independent analysis, which is usually considered more efficient, yields the same results as a flow dependent analysis.

  • More details can be found in our papers, but also in a presentation I was invited to make as a retrospective at the 25th annual POPL.  It's best view with a Freelance viewer, in this form, but is adequately viewed in postscript.  I've been invited to, and have given this at a number of universities.

    My more recent work on optimization has veered into the idea of trace based optimizations one variant of which we've been exploring by dynamic application partitioning.

  • Universal Hashing (done with Larry Carter and others) -- allows one to find fast hash functions which for any inputs will be expected to store and retrieve a set of values in linear time.  This was one of the first and most important algorithms where an algorithm was randomized so that for even the worst input, for most choices of the algorithm good behavior is expected.  In other words good average behavior has been moved from being viewed as over inputs to being over algorithms.

  • Universal hash functions have become important for cryptography where the hash value of a message becomes an unbreakable certification for the authenticity of the message. A team of us( Shai Halevi, Hugo Krawczyk, David Chase, Ken Zadeck, Larry Carter and I) recently came up with a very fast Java implementation, which is almost as fast as a very trivial hash function and yet will distribute all inputs well no matter how deviously constructed with high probability. I'm hoping to put the code here as soon as I can get it cleaned up.

  • Programming Languages and Environments:  Much of my motivation for exploring program optimization is to raise the level at which people program.  I've also explored this in the context of programming environments.  The first significant IBM product I was involved in was Lisp/VM and it's associated programming environments which had one of the earliest source level debugger combined with a syntax directed editor.

  • More recently I've worked with some groups at IBM where we've built technologies for visualizing the behavior of object oriented programs and automatically animating them.

I've also done some work that is not yet known well.
  • I am very interested in some work which I'm hoping we will be able to publish soon, which can be thought of as a very high level language for the description of the IT needs for a business process.  I'm hoping that eventually we will be able to generate very flexible and pleasant to use business applications automatically.
  • Information retrieval work (a minor part of which is in patent US05826260 ). More importantly with my colleagues, I've developed an interesting probabilistic Baysian model for information retrieval, which in the context of IBM's Trek submissions seems to get very nice results.
I am currently General Chair of POPL 1900 (or was that 2000?). 
Assist Legal Privacy Orders IBMIBM Research