We developed an experimental prototype of a VLIW processor capable of
multiway branching and conditional execution. The prototype is currently
operational and has helped us investigate some of the hardware
constraints in building VLIWs.
This processor executes tree-instructions within a ``classical'' VLIW
architecture, that is, fixed-length VLIWs with preassigned slots for
the different operations. The register state consists of 64 32-bit
general purpose registers, 8 single-bit condition code registers,
4 memory address registers, a program status word register, and some
special registers. Each Very Long Instruction Word is 759 bits, which
include:
- eight ALU operations;
- four loads or stores (the sum of ALU operations and loads must not exceed 8);
- seven conditional branches (seven test nodes in a tree, producing
an eight-way branch); and
- a binary encoding of the tree-instruction.
Register to register operations (arithmetic, logical, shifting) are
RISC-like, whereas load/store operations use memory address registers
(MAR) as base registers. For each tree-path, the following information is
provided in the long instruction:
- a mask indicating which condition codes must be true, which condition
codes must be false, and which are don't care, in order for the
path to be taken;
- the address of the target instruction for the path; and
- a mask indicating which subset of ALU/memory operations should be
committed, if the path is taken (that is, the set of operations on
the path).
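As a rough illustration of how such per-path information could drive branch selection, the following Python sketch models each path with two condition-code masks, a target address, and a commit mask. The field names, the bit ordering (bit i stands for condition code i), the split of the three-valued mask into two bit vectors, and the example addresses are all assumptions made for illustration; the actual encoding used in the prototype is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreePath:
    """One path of a tree-instruction (hypothetical field layout)."""
    must_be_true: int   # bit i set: condition code i must be 1
    must_be_false: int  # bit i set: condition code i must be 0
    target: int         # address of the target VLIW for this path
    commit_mask: int    # bit j set: operation slot j is committed on this path
    # Condition codes with a bit in neither mask are don't-care.


def path_matches(path, cc):
    """Return True if the condition-code vector cc satisfies the path's masks."""
    true_ok = (cc & path.must_be_true) == path.must_be_true
    false_ok = (cc & path.must_be_false) == 0
    return true_ok and false_ok


def select_path(paths, cc):
    """Return the single path taken for cc; a well-formed tree has exactly one."""
    matches = [p for p in paths if path_matches(p, cc)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching path, found {len(matches)}")
    return matches[0]


# A three-way tree: if cc0 take A; else if not cc1 take B; else take C.
# Targets and commit masks are made-up values.
EXAMPLE_PATHS = [
    TreePath(must_be_true=0b01, must_be_false=0b00, target=0x30, commit_mask=0b0001),
    TreePath(must_be_true=0b00, must_be_false=0b11, target=0x40, commit_mask=0b0010),
    TreePath(must_be_true=0b10, must_be_false=0b01, target=0x50, commit_mask=0b0100),
]
```

Because the masks of distinct paths in a tree are mutually exclusive, all paths can in principle be checked in parallel, and the matching path supplies both the next address and the set of operations to commit.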
The data memory is 4M bytes, whereas the instruction memory is 64k
VLIWs. This VLIW processor and its associated memory are attached to
a host (a PS/2 computer), which performs I/O operations and loads
programs into the prototype memory.
The VLIW processor includes architectural support for speculative
operations and exceptions
(as described in an earlier publication), as follows:
- a speculative flag (bit) in each opcode;
- an additional bit in general-purpose registers and in condition
codes, to indicate results of speculative operations that caused
an error (e.g. overflow).
A speculative operation that generates an error does not cause an
exception; instead, it just sets the extra bit of its result register.
Further speculative operations that use this result propagate the extra
bit to their own results. An exception occurs if and when a register
with the extra bit set is used by a nonspeculative operation. Software
can use this feature to move operations speculatively above branches
while still detecting interrupts properly.
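The behavior of the extra bit can be sketched in a few lines of Python. This is only a model of the rules stated above, restricted to a signed 32-bit ADD; the names Reg, add and DeferredException are hypothetical, the value left in a tagged register (0 here) is an arbitrary placeholder, and the immediate OverflowError for a nonspeculative overflow is an assumption about ordinary exception behavior.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class DeferredException(Exception):
    """Raised when a nonspeculative operation reads a register whose extra bit is set."""


@dataclass(frozen=True)
class Reg:
    value: int = 0
    error: bool = False  # the extra bit attached to the register


def add(a, b, speculative):
    """Model a signed 32-bit ADD carrying a speculative flag in its opcode."""
    if a.error or b.error:
        if not speculative:
            # A nonspeculative use of a tagged register triggers the exception.
            raise DeferredException("nonspeculative use of a tagged register")
        # A speculative use propagates the extra bit to the result.
        return Reg(0, True)
    total = a.value + b.value
    if INT32_MIN <= total <= INT32_MAX:
        return Reg(total, False)
    if speculative:
        # The error is recorded in the result register instead of being raised.
        return Reg(0, True)
    # Assumption: a nonspeculative overflow raises immediately.
    raise OverflowError("overflow in nonspeculative ADD")
```

For example, a speculative add(Reg(INT32_MAX), Reg(1), speculative=True) returns a tagged register without raising; the exception surfaces only if a later nonspeculative operation consumes that register, which is the case when the speculated path is actually taken.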
The prototype is optimized to perform a multiway branch every cycle,
without branch stalls. Consequently, the timing is as follows: compare
operations set the condition codes in a given cycle (say n); in cycle
n+1, the multiway branch is resolved using these condition codes and
the correct target VLIW is fetched; and this VLIW is executed in cycle
n+2. The branching process is illustrated in the following code sequence
(wherein L1, L2 and L3 are the labels of successive VLIWs):
(L1 ((EQ r$1 7 (cc$0)) (GT r$1 r$7 (cc$1)) (GOTO L2)))
(L2 ((IF cc$0 ((GOTO L3))
      ELSE ((IF (NOT cc$1) ((GOTO L4))
             ELSE ((GOTO L5)))))))
(L3 ((ADD r$7 r$10 (r$7)) (GOTO L6)))
In this code fragment, the first VLIW (L1) sets condition codes cc0 and
cc1, and performs an unconditional branch to VLIW L2. This VLIW (L2)
uses the condition codes just set to determine the next VLIW to be
executed; notice that L2 is a 3-way VLIW, whose targets are L3, L4
and L5. Finally, assuming L3 is the selected target, VLIW L3 is
executed in the next cycle. In addition to the branching components
shown above, the tree-instructions contain other operations, as
illustrated in L3. The following, more complicated code example shows
how multiway branching is sustained in every cycle.