Multicore Streaming Framework (MSF)

Overview

The Multicore Streaming Framework (MSF) is an implementation of a streaming programming model for parallel platforms. The main objectives of the MSF design are:

  • Easing parallel programming on a multi-core environment
  • Generic usage for various platforms (heterogeneous cores, distributed memory, different operating systems)
  • High run-time efficiency

Programming for a multicore environment inherently involves transferring data and code between processors. In many cases, managing these transfers is left to the programmer, which makes parallel programming difficult. MSF handles code transfers internally and provides an abstraction of data transfers. These features relieve programmers of managing the parallel execution and the code and data transfers associated with it. The same abstraction allows MSF to be implemented on different platforms, so the programmer does not need to write platform-specific code.

High efficiency is achieved by optimizing the MSF implementation for each platform and taking advantage of that platform's specific features. In particular, efficiency is achieved by moving data and code in parallel with processing. This is done on two levels. First, the MSF API lets the framework know in advance which task should be scheduled next, which enables it to pre-fetch the task's code and data. Second, during task execution, the API allows the task code to pre-fetch data using the framework's internal double buffering. Processing can therefore take place while data and code are being moved, so the transfer overhead is reduced or even hidden.
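The double-buffering idea can be illustrated with a minimal Python sketch. This is not the MSF API: `fetch`, `process`, and `run_double_buffered` are hypothetical names, and a background thread stands in for whatever transfer mechanism a real platform would provide. While one buffer is being processed, the next one is already being filled.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(chunk_id):
    """Hypothetical stand-in for transferring one chunk into a local buffer."""
    return list(range(chunk_id * 4, chunk_id * 4 + 4))


def process(buffer):
    """Hypothetical stand-in for the task's computation on one buffer."""
    return sum(buffer)


def run_double_buffered(num_chunks):
    """Process chunks in order while the next chunk is fetched in the background."""
    results = []
    if num_chunks == 0:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, 0)              # start filling the first buffer
        for i in range(num_chunks):
            current = pending.result()               # wait for the buffer in flight
            if i + 1 < num_chunks:
                pending = pool.submit(fetch, i + 1)  # start filling the other buffer
            results.append(process(current))         # runs while the fetch proceeds
    return results
```

Only two buffers are alive at any time: the one being processed and the one being filled. When a fetch takes no longer than the processing of the previous chunk, the transfer time is hidden behind the computation.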

Programming with MSF is based on the concept of data streaming between tasks. Tasks are functions that are activated by the availability of data. They are independent of the application and are provided to the application programmer in libraries. Using the MSF API, the application programmer loads a set of tasks and defines the graph of data-dependency connections between them. MSF then executes the tasks as long as data is available for them.
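As a rough illustration of data-driven activation, the following Python sketch connects two tasks through streams and fires a task whenever all of its inputs hold data. The names `Stream`, `Task`, and `run` are assumptions made for this example and do not reflect the actual MSF API, which also handles code loading and placement on cores.

```python
from collections import deque


class Stream:
    """Hypothetical FIFO connection between two tasks."""

    def __init__(self):
        self.items = deque()


class Task:
    """Hypothetical task: a function plus its input and output streams."""

    def __init__(self, name, func, inputs, output):
        self.name = name
        self.func = func
        self.inputs = inputs      # non-empty list of Stream objects
        self.output = output      # a Stream, or None for a sink task

    def ready(self):
        # A task is activated only when every input stream holds data.
        return all(stream.items for stream in self.inputs)

    def fire(self):
        args = [stream.items.popleft() for stream in self.inputs]
        result = self.func(*args)
        if self.output is not None:
            self.output.items.append(result)


def run(tasks):
    """Fire ready tasks until no task has data left; return the firing order."""
    fired = []
    progress = True
    while progress:
        progress = False
        for task in tasks:
            if task.ready():
                task.fire()
                fired.append(task.name)
                progress = True
    return fired


# The application defines the connection graph: src -> scale -> mid -> offset -> out
src, mid, out = Stream(), Stream(), Stream()
tasks = [
    Task("scale", lambda x: x * 2, [src], mid),
    Task("offset", lambda x: x + 1, [mid], out),
]
src.items.extend([1, 2, 3])
order = run(tasks)
```

The tasks know nothing about each other or about the application; only the graph built from the streams ties them together. This sketch runs sequentially, whereas a real framework would dispatch ready tasks to different cores.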

Tasks are independent components of code that may process large amounts of input data and produce large amounts of output data using the provided API. Tasks are written in a generic form: they are independent of the application, of other tasks, and of the target core, and they can process any amount of data in each invocation, as instructed by the application. A task can run on any of the available processors in the multicore platform, depending on its compilation target. The framework is responsible for providing the right core with the appropriate task.

Figure 1.

Any task may run in parallel with the others. Streaming processing can be viewed as parallelism along both the “Y axis” and the “X axis” of the illustration above. The number of tasks that run in parallel at any given time depends on the availability of data (or of buffer space) and on the processing resources available on the platform at that time. The framework is responsible for managing this parallelism while keeping it transparent to programmers.
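The dependence on data and space availability can be sketched in Python with bounded queues: a stage waits when its input is empty (no data) or its output is full (no space), and otherwise runs concurrently with the other stages. The names `stage`, `run_pipeline`, and `capacity` are hypothetical, and Python threads here only illustrate the structure; they are not how MSF schedules tasks on cores.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def stage(func, inbox, outbox):
    """Run one pipeline stage until the end-of-stream sentinel arrives."""
    while True:
        item = inbox.get()        # blocks until data is available
        if item is DONE:
            outbox.put(DONE)      # pass the sentinel downstream and stop
            return
        outbox.put(func(item))    # blocks until the output buffer has space


def run_pipeline(values, funcs, capacity=2):
    """Stream values through funcs, one thread per stage, with bounded buffers."""
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(funcs)
    ]

    def feed():
        for value in values:
            queues[0].put(value)
        queues[0].put(DONE)

    workers.append(threading.Thread(target=feed))
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = queues[-1].get()   # drain the final stream as results arrive
        if item is DONE:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results
```

For example, `run_pipeline(range(5), [lambda x: x * 2, lambda x: x + 1])` streams five values through two stages. A smaller `capacity` limits how far an upstream stage can run ahead of a slower downstream stage, which is the "space availability" constraint in miniature.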

For more information, contact Uzi Shvadron: shvadron@il.ibm.com