|
The Sixth IBM Programming Languages Day will be held at the IBM
Thomas J. Watson Research Center on Friday, April 22, 2005. The
day will be held in cooperation with the New
Jersey and New England Programming
Languages and Systems Seminars. The main goal of the event is to
increase awareness of each other's work, and to encourage interaction
and collaboration.
The Programming Languages Day features a keynote presentation and
six regular presentations. Simon
Peyton Jones of Microsoft Research will deliver the keynote
presentation this year.
You are welcome from 9AM onwards, and the keynote presentation will
start at 10AM sharp. We expect the program to run until 4PM. The
Programming Languages day will be held in room GN-F15 in the Hawthorne-1
building in Hawthorne, New
York.
If you plan to attend the Programming Languages Day, please register
by sending an e-mail with your name, affiliation, and contact information
to raghavac@us.ibm.com
so that we can plan for lunch and refreshments.
Agenda
Abstracts
Composable memory transactions
Simon Peyton Jones
Microsoft Research (Cambridge)
Concurrent programs are increasingly important, but are notoriously
difficult to write. Familiar problems include forgetten locks, deadlock,
missed wake-up calls, and complex error-recovery code. Worse than
all these, however, is lack of compositionality: even correctly-implemented
concurrency abstractions cannot be composed together to form larger
abstractions.
After a decade of stumbling around in the dark, I went to a talk
about \emph{Software Transactional Memory} (STM), and my life was
changed. In this talk I’ll share my Damascus-road experience
with you, by presenting a new shared-memory concurrency model that
avoids, by construction, many of the problems that plague lock-based
synchronisation schemes. Better still, memory transactions compose
beautifully, both in sequence and as choices.
Transactional memory works particularly nicely in a declarative
setting, and I’ll describe our realisation in Haskell. Gratifyingly,
the pure setting allowed us to come up with two ideas that were
previously inaccessible: modular primitives for blocking and choice.
This isn’t a Haskell-ghetto talk, though: I think that transactional
memory is the best hope yet for shared-memory concurrency in any
language.
Joint work with Tim Harris, Maurice Herlihy, Simon Marlow
A Calculus for Specifying Ad Hoc Data
Formats
Yitzhak Mandelbaum
Princeton University
An ad hoc data format is any nonstandard data format for which parsing,
querying, analysis or transformation tools are not readily available.
Vast amounts of useful data are stored and processed in ad hoc formats.
Traditional databases and XML systems provide rich infrastructure
for processing well-behaved data, but are of little help when dealing
with ad hoc formats. Before anything can be done with an ad hoc
data source, someone has to produce a suitable parser for it. Today,
people tend to use C or PERL for this task. Unfortunately, writing
such parsers is tedious and error-prone. It is complicated by the
lack of documentation, convoluted encodings designed to save space,
the need to produce efficient code to cope with the scale of the
data, and the need to handle errors robustly to avoid corrupting
precious data.
As an alternative, a number of languages have been designed to enable
users to describe data declaratively in terms of types and constraints,
including PADS, DataScript and PacketTypes. While these languages
differ significantly in their application, their essential semantics
are quite similar. Unfortunately, no formal specification of the
semantics of any of these languages exists. Such semantics would
be very useful in understanding the existing implementations, as
well as in developing new bindings and alternative implementations.
To address this shortcoming, we have developed a calculus of dependent
types for describing the physical layout and semantic properties
of a data source. It consists of a small number of orthogonal type
constructors that can be composed with one another to describe rich
and varied data formats. The calculus captures many of the commonalities
of the aforementioned data description languages. We present a semantics
for the calculus in which we interpret the types simultaneously
as the simple types of the data described and as parsing functions
for a data stream. The parsing functions are robust, automatically
detecting and recording errors in the data stream without halting
parsing. We prove two important correctness properties of the parsers.
First, we show they are type-correct, outputing data whose type
matches the simple-type interpretation of the specification, and
second, we show they are “error-correct,” reporting
exactly the number of physical and semantic errors that occur in
the returned data. Finally, we present a case study of using this
calculus to describe the PADS language.
Joint work with: Kathleen Fisher and David Walker.
Type Erasure: Breaking the Java Type System
Suad Alagic
University of Southern Maine
A Type erasure is an implementation idiom for extending Java with
generics (parametric polymorphism) underlying the most recent release
Java 5.0. We show that this idiom is provably incorrect and as such
it leads to major violations of the Java type system that appear
in the new release. The new release thus turns a type safe language
into an unsafe one as static type checking does not work any more.
The run-time type information in this release is incorrect so that
dynamic type checking can fail in unexpected ways. This leads to
major problems for the programmers relying on the Java reflective
capabilities. Further problems are in type violations related to
persistence and the Java serialization mechanism. We show that the
problem of extending Java with parametric polymorphism does not
have a correct solution unless the Java Virtual Machine is extended
to handle it properly. We explain the subtleties required by a correct
implementation technique that includes representation of parametric
classes in the standard Java class file format, representation of
the instantiated parametric class objects, extensions of the Java
Core Reflection to report type information about (instantiated)
parametric classes, and the loading techniques required by this
solution for extending Java with generics. Previous solutions for
this problem are analyzed as well.
Background paper:
Parametric polymorphism for Java: Is there any hope in sight?, ACM
SIGPLAN Notices, December 2004.
Joint work with Brian Cabana, Jeff Faulkner and Mark Royer
X10: An Object-Oriented Approach to Non-Uniform
Cluster Computing
Vijay Saraswat
IBM Research
The next generation of high performance computers (e.g. those capable
of O(10^{15}) operations per second) will be based on scale-out
techniques rather than increasing clock rates. This leads to a notion
of clustered computing: a single computer may contain hundreds of
thousands of tightly coupled nodes. Unlike a distributed model,
failure of a single node is tantamount to failure of the entire
machine. However, the cost of memory access by a hardware processor
may vary as much as five orders of magnitude across the cluster;
hence the notion of a single shared memory may no longer be appropriate
for such machines.
We have designed a concrete modern object-oriented programming language,
X10, for high performance, high productivity programming of such
machines. Past work in the Java Grande Forum has exposed the need
for substantial changes in modern OO languages in order to support
high performance computation (e.g. support for true multi-dimensional
arrays, value types, relaxed exception model, changes to the concurrency
model, support for distribution etc.) X10 builds on this past work.
A member of the *partitioned global address space* family of languages
(which includes Titanium, UPC and Co-Array Fortran), X10 is distinguished
by the explicit reification of locality (*places*), termination
detection (*finish*), by the use of lock-free synchronization (conditional
atomic sections), and by the ability to express clustered data-structures
(e.g. arrays scattered across multiple places). X10 smoothly integrates
concurrent shared memory access (e.g. {} as expressed by OpenMP
or Java threads with message-passing (e.g. as expressed by MPI).
We present the design of the core features of the language, experience
with a reference implementation and present results from some initial
productivity studies.
Joint work with: Philippe Charles, Christopher Donawa, Kemal Ebcioglu,
Christian Grothoff, Allan Kielstra, Christoph von Praun, Vivek Sarkar
Generalizing Parametricity Using Information
Flow
Geoffrey Washburn
University of Pennsylvania
Run-time type analysis allows programmers to easily and concisely
define operations based upon type structure, such as serialization,
iterators, and structural equality. However, when types can be inspected
at run time, nothing is secret. A module writer cannot use type
abstraction to hide implementation details from clients: they can
use type analysis to determine the structure of these supposedly
“abstract” data types. Furthermore, access control mechanisms
do not help isolate the implementation of abstract datatypes from
their clients. Buggy or malicious authorized modules may simply
leak type information to unauthorized clients, so module implementors
cannot reliably tell which parts of a program rely on their type
definitions.
Currently, module implementors rely on parametric polymorphism to
provide guarantees about the use of their abstract datatypes. Standard
parametricity does not hold for a language with run-time type analysis,
but in this paper we show how to generalize parametricity so that
it does hold in the presence of type analysis and still encompasses
the integrity and confidentiality policies that are derived from
parametricity. The key is to augment the type system with annotations
about information-flow. By tracking the flow of dynamic type information,
the implementor of an abstract data type can easily see which parts
of the program depend on their chosen implementation.
Joint work with: Stephanie Weirich.
SIF, A Typed Assembly Language for Non-Interference
Adriana Compagnoni
Stevens Institute for Technology
In a multilevel security architecture information can range from
having low (public) to high (confidential) security level. Information
flow analysis studies whether an attacker can obtain information
about the confidential data by observing the output of the system.
In particular, the non-interference property states that any two
executions of the same program, where only the high-level inputs
differ in both executions, does not exhibit any observable difference
in the program's output.
Standard perimeter security mechanisms such as access control or
digital signatures fail to address the enforcement of information-flow
policies. On the other hand, language-based strategies offer a promising
approach to information flow security. However, the
challenge of studying information flow for assembly languages is
that the control flow constructs that guide the analysis in high-level
languages are not present. To address this problem, we developed
a typed assembly language that uses low-level linear continuations
to impose a stack discipline on the control flow of programs.
In this talk we will present SIF, a typed assembly language for
secure information flow analysis with security types. This language
provides a stack of continuations that indicate the program points
where different branches of code converge. We present a type system
that guarantees that these continuations are treated linearly in
the sense that they are obligations and must be discharged. Well-typed
SIF programs are assembled to machine code that satisfy non-interference.
Joint work with: Ricardo H. Medel and Eduardo Bonelli
Encapsulated Upgradable Components
Eric Allen,
Sun Microsystems, Inc.
Modern software applications and libraries are difficult to deploy,
upgrade and maintain. Users regularly encounter incompatible software
and broken upgrades with hidden dependencies that are difficult
to diagnose and repair,often requiring reinstallation of applications,
or even of entire operating systems. In this paper, we argue that
these problems correspond to other, well understood, issues that
occur during software development, which are solved through the
use of module systems. We extend the notion of software modules
and present a single solution to problems in both contexts:
encapsulated upgradable components that are carried throughout the
software lifecycle.
We propose a set of desirable features that an upgradable component
system should provide and present a system with these features,
including mechanisms for controlled compilation, testing, installation,
linking, and upgrade. Our system allows users and programmers to
reason about software as if each deployed component were a self-contained
entity. In addition, users can safely upgrade exposed libraries
used by components, and programmers can impose restrictions on installations
and upgrades. We formally state the key properties of our system.
We show how the structure of this component system can be expressed
in the component system itself, enabling management and upgrades
of components by other components, and how such a system can be
efficiently implemented with resource requirements comparable to
dynamic linking.
Joint work with: Victor Luchangco and Sam Tobin-Hochstadt
Program committee:
Patricia
Johann, Rutgers University
Shriram Krishnamurthi,
Brown University
Mukund Raghavachari,
IBM T.J. Watson Research Center
|