Development environment
The eLite DSP architecture specification is maintained in the
centralized ISA database. This database contains all the information describing the architecture, including instruction formats, the operands of each instruction, and machine-readable pseudo-code
that describes the behavior of each instruction at each stage of the pipeline. All the tools have been built as semi-generic programs: each provides a framework that takes most of the architecture-specific details
from a set of configuration files generated automatically from the ISA database. Because the contents of the database reflect the attributes and behavior of the architecture, the tools automatically
track any changes to the architecture.
Binary code generation
The binary form of an
instruction is a sequence of bits formed by concatenating constants with the binary encodings of the instruction's operands (parameters). Instructions are bundled together into Long
Instruction Words (LIWs), and the sequence of LIWs forms the output program code. The way bits are arranged in an instruction or LIW is often non-trivial, so a mechanism is needed to insert (when
assembling) and extract (when disassembling) information from the instruction or LIW. Since all the details regarding the position of the fields are described in the ISA database, it is possible to
automatically generate code that handles these insertions and extractions. For this purpose, the concept of Inserters was developed. An Inserter is an abstract interface that provides the two
basic operations of insert and extract. Given a location within the program (e.g., an LIW offset) and a value, the Inserter places the value into the binary code at the given location. The concept
applies at every level, from inserting the value of an operand into an instruction up to inserting a whole instruction into the code section of the output program. Inserters are also useful for
late binding of external symbols at link time.
The instruction syntax inside the tools is based on a free-form format string containing operand fields and any delimiting text. This approach
allows setting custom formats for various instructions without any modification to the assembler source. For example, the configuration of an integer add instruction is as follows:
{"iadd", "OP=0x10", "$(RT),$(RA),$(RB)"}
With this format, the instruction is written as iadd r1,r2,r3. It can just as easily be written as iadd r1=r2+r3 by replacing the commas in the
format string with the appropriate symbols, as in "$(RT)=$(RA)+$(RB)".
Conflict detection
The eLite DSP architecture has an exposed pipeline, which makes the compiler or programmer responsible for resolving data dependencies and resource conflicts in the
program. The presence of instruction-level parallelism, combined with the pipeline latencies, makes this task quite difficult for the assembly-level programmer. A special tool was developed to help
with this task; it scans the assembly code and detects data dependencies and machine resource conflicts. In addition to detecting the simple cases, where instructions follow each other in the
code section of the program, the tool is also aware of branches in the code and can speculate about conflicts and dependencies that may arise if a certain branch is taken (or not taken).
The tool is integrated into the development environment user interface, allowing the assembly-level programmer to visualize the conflicts in the source code.
The screenshot above shows an example of a conflict and a dependency found and displayed in the IDE. The
orange marks show the source lines involved in a dependency. In this example, the register va2 is loaded in the first
orange-marked line and used in the second orange-marked line. Because the load has not completed
by the time the second instruction uses the loaded value, the conflict checker reports a dependency
hazard. This example also shows a conflict (marked in red) between two instructions. Conflicts are detected on shared resources; in this case, the resource is the address unit decoder.
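The two kinds of findings described above can be illustrated with a minimal sketch. It assumes a much simpler model than the actual tool: every name, field, and latency value below is hypothetical rather than taken from the ISA database, only straight-line code is checked (no branch speculation), and intervening writes to the same register are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one scheduled instruction; none of these fields
// come from the eLite ISA database.
struct Instr {
    int line;                       // source line number
    int cycle;                      // issue cycle in the schedule
    int latency;                    // cycles until the result is available
    std::string dest;               // register written ("" if none)
    std::vector<std::string> srcs;  // registers read
    std::string resource;           // shared unit used ("" if none)
};

struct Finding {
    enum Kind { Dependency, Conflict } kind;
    int first_line;
    int second_line;
};

// Checks every pair (a, b) where b follows a in program order.
std::vector<Finding> check(const std::vector<Instr>& code) {
    std::vector<Finding> out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        const Instr& a = code[i];
        assert(a.latency >= 0);
        for (std::size_t j = i + 1; j < code.size(); ++j) {
            const Instr& b = code[j];
            // Resource conflict: the same unit is claimed in the same cycle.
            if (!a.resource.empty() && a.resource == b.resource && a.cycle == b.cycle)
                out.push_back({Finding::Conflict, a.line, b.line});
            // Dependency hazard: b issues after a but before a's result is ready.
            if (a.dest.empty()) continue;
            for (const std::string& s : b.srcs) {
                if (s == a.dest && b.cycle > a.cycle && b.cycle < a.cycle + a.latency) {
                    out.push_back({Finding::Dependency, a.line, b.line});
                    break;
                }
            }
        }
    }
    return out;
}
```

In this model, a load of va2 with an assumed latency of three cycles followed one cycle later by a reader of va2 yields a Dependency finding, and two instructions that claim the same unit in the same cycle yield a Conflict finding.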
Serialization packer
The architecture features a mechanism for packing several instructions into a single LIW, even if those
instructions will not be executed in parallel. This mechanism is called serialization and is, from the programmer's
point of view, a transparent way to reduce code size. Serialization can be requested
explicitly in the assembly code, with a notation similar to the one that indicates parallelism. The disadvantage of serializing
manually is that the serialization may change drastically as instructions are modified or shuffled during
optimization. The serialization packer is an automated tool that processes the code and finds the optimal
packing for a given code schedule (i.e., it does not change the order of instructions).
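As a rough illustration of packing without reordering, the sketch below assumes a deliberately simplified model: each instruction has a single encoded size, an LIW has a fixed capacity, and any consecutive run of instructions that fits may share an LIW. The function name, the sizes, and the capacity are hypothetical, and the real packer presumably has to respect encoding constraints that are not modeled here. Under these assumptions, filling each LIW greedily gives the minimum number of LIWs for the given order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns, for each LIW, the indices of the instructions packed into it.
// sizes[i] is the hypothetical encoded size of instruction i, in the same
// unit as capacity (both are assumptions, not eLite parameters).
std::vector<std::vector<std::size_t>> pack(const std::vector<int>& sizes, int capacity) {
    assert(capacity > 0);
    std::vector<std::vector<std::size_t>> liws;
    int used = 0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        assert(sizes[i] > 0 && sizes[i] <= capacity);
        // Open a new LIW when the current one cannot hold the next instruction.
        if (liws.empty() || used + sizes[i] > capacity) {
            liws.emplace_back();
            used = 0;
        }
        liws.back().push_back(i);  // instruction order is never changed
        used += sizes[i];
    }
    return liws;
}
```

With sizes {2, 1, 1, 3, 2, 2} and a capacity of 4, the instructions are grouped as {0, 1, 2}, {3}, and {4, 5}.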
Static and dynamic code profilers
The code profilers are tools developed to collect statistics from the binary code of programs. The
static profiler runs over the code section of the program and collects information such as code size, instruction
distribution, and the usage of no-ops and predicated instructions. The dynamic profiler collects the same information as the
static profiler, but instead of running over the static code section, it runs over the instructions that were decoded in
an actual run of a program. It uses the simulator to generate a trace of the run and collects the
information from that trace. These statistics are useful in the architecture evaluation process, as they allow assessing the importance of adding instructions to, or removing them from, the ISA.
Instruction set architecture (ISA) simulator
Just like the machine code generation tools, the ISA simulator is closely connected
to the ISA database. Each instruction in the database has a formal (machine-readable) definition of its
behavior, and the simulator guarantees that this behavior is preserved and that simulation results are consistent with
expected results. As with the operands in the assembler, a preprocessing program goes over the concise
behavior information of the instructions and produces source code that is compiled into the simulator.
The instruction behavior description contains a local state for the instruction and a set of events describing what
the instruction does at the different stages of the pipeline. The ISA simulator has additional state for each
instruction instance used for storing local temporary variables. This allows building the instructions in a modular
fashion. There is of course a shared state of the machine, in the form of register files and memory. This state is
accessed indirectly via a set of functions representing the hardware ports used to access these resources in the hardware implementation.
The event behavior code of the instruction is source code (C++ in this case) that operates on the local variables
and the resource functions (ports) to perform its task. An event can be either a hardware-related event, such as
performing the operations associated with a specific cycle in the instruction's execution, or a software event artificially added to the ISA database as a helper function for the instruction.
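The structure described above (per-instance local state, events tied to pipeline stages, and shared state reached only through ports) might look roughly like the following sketch. It is not the generated simulator code: the class names, the three stage names, and the 16-entry register file are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Shared machine state, reached only through port functions.
// The register count of 16 is an arbitrary choice for this sketch.
class RegisterFile {
public:
    std::int32_t readPort(int reg) const { assert(valid(reg)); return regs_[reg]; }
    void writePort(int reg, std::int32_t v) { assert(valid(reg)); regs_[reg] = v; }
private:
    static bool valid(int reg) { return reg >= 0 && reg < 16; }
    std::array<std::int32_t, 16> regs_{};  // zero-initialized
};

// One instance of a hypothetical integer add moving through the pipeline.
struct IAdd {
    IAdd(int t, int ra_, int rb_) : rt(t), ra(ra_), rb(rb_) {}

    int rt, ra, rb;                         // operand fields of this instance
    std::int32_t a = 0, b = 0, result = 0;  // local temporaries of this instance

    // Hardware events, one per (assumed) pipeline stage.
    void onRead(const RegisterFile& rf) { a = rf.readPort(ra); b = rf.readPort(rb); }
    void onExecute() { result = a + b; }
    void onWriteBack(RegisterFile& rf) { rf.writePort(rt, result); }
};
```

Because the temporaries live in the instruction instance, several instances can be in flight at different stages without interfering with one another, and the destination register changes only when the write-back event fires.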
Integrated Development Environment (IDE)
The entire tool set is integrated with a graphical user interface (GUI) that provides the programmer with a more
convenient way to go through the process of creating projects, editing source code, building and running the applications.
The IDE allows various operations to be performed at the source code level, within the integrated text editor. This
makes tasks such as profiling a block of code easier, since it is simpler for the programmer
to identify locations in the source code (e.g., the beginning and end of the block). A performance profiler is available to measure the distribution of runtime cycles among the functions in the code.
Visual Debugger
The IDE features an integrated visual debugger, which is a GUI front end for the simulator or the actual hardware.
The debugger has a variety of features, ranging from common debugging facilities, such as stepping over code,
breakpoints, and data inspection, to less common features, some of them custom-tailored to the architecture.
One of the biggest problems in debugging eLite DSP application code is that the architecture uses a
long pipeline, which makes tracking data flow while debugging somewhat difficult. The source of the problem is that
normal inspection techniques allow viewing the contents of register files or memory, but by the time this information
is available (due to the long pipeline), the programmer may have lost track of the instructions that follow the
instruction in question and that may have a shorter pipeline. To address this, an execution view feature was created that allows
watching the data flow (input to output) of instructions as they execute. For example, the following figure
shows a view of an iadd instruction at execution time. It is possible to see the two inputs (3, 5) and the output (8).
It is often necessary to test an application that works on a large stream of data, which does not fit in its entirety in
the chip's memory. Real-world applications usually use streaming to handle such data. To facilitate
testing of such streaming applications without the need to set up the actual streaming I/O, the IDE allows creating
probes. Probes are cyclic buffers in host memory (virtually unlimited in size) that may hold any kind of data. The debugger
allows setting a special kind of breakpoint that, upon triggering, transfers data between the probe's buffer and a
specified register. This operation allows simulating I/O with very little effort. The debugger can also display a visual representation of the buffer for analysis.
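The probe mechanism can be sketched as a cyclic host-side buffer plus a breakpoint hook. The class names, the onFetch hook, and the use of a plain pointer as the target register are hypothetical stand-ins; the actual debugger interface is not described in this text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical probe: a cyclic host-side buffer that feeds or drains a register.
class Probe {
public:
    explicit Probe(std::vector<std::int32_t> data) : data_(std::move(data)) {
        assert(!data_.empty());
    }
    // Input direction: return the next sample, wrapping around at the end.
    std::int32_t next() {
        std::int32_t v = data_[pos_];
        pos_ = (pos_ + 1) % data_.size();
        return v;
    }
    // Output direction: store a register value at the current position.
    void store(std::int32_t v) {
        data_[pos_] = v;
        pos_ = (pos_ + 1) % data_.size();
    }
    const std::vector<std::int32_t>& contents() const { return data_; }
private:
    std::vector<std::int32_t> data_;
    std::size_t pos_ = 0;
};

// Stand-in for the debugger hook: when the breakpoint at `address` is hit,
// the next probe sample is loaded into the target register.
struct ProbeBreakpoint {
    std::uint32_t address;
    Probe* probe;
    std::int32_t* target_register;

    // Returns true if the breakpoint triggered for this program counter.
    bool onFetch(std::uint32_t pc) {
        if (pc != address) return false;
        *target_register = probe->next();
        return true;
    }
};
```

Each time execution reaches the breakpoint address, the register receives the next sample, so a loop that reads the register sees a stream of input without any real streaming I/O being configured.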
See the Publications and Presentations
section for further information regarding the eLite IDE.