Handling Irregular ILP Within Conventional VLIW Schedulers
Using Artificial Resource Constraints [abstract] (ACM DL)
Subramanian Rajagopalan, Manish Vachharajani, and Sharad Malik
Proceedings of the International Conference on Compilers and Synthesis for Embedded Systems, November 2000.
Instruction set architects and compiler writers often have conflicting
requirements on the design of the instruction set. The architects are driven
by potential micro-architecture complexity and possibly by code size
considerations; while the compiler writers prefer clean orthogonal instruction
sets amenable to regular compiler algorithms. This struggle has stabilized
with a resulting uneasy boundary between the RISC and CISC architectural
styles in general purpose computing. However, in the embedded computing world,
with domain specific processors (e.g. Digital Signal Processors (DSPs)), this
struggle continues. It has resulted in very irregular architectures that
are very hard to compile for and that need specialized optimization
techniques.
An increasing trend in embedded processing is to use VLIW (Very Long
Instruction Word) processors, which need to be supported by efficient
compilers that can extract the instruction level parallelism (ILP) at
compile time. However, the ISAs (Instruction Set Architectures) for
these machines are still driven by embedded processing constraints
(small code size, simpler hardware), and consequently all available
ILP may not always be cleanly reflected in the ISA. This presents a
significant problem for the compiler, since tradition VLIW compilers
work directly with physical resources to determine what can and cannot
be scheduled in parallel. It is obviously desirable to leverage the
large body of work done in VLIW compilation - however, many embedded
ISAs make this impossible. This forces the development of specialized
optimization techniques for these processors - an undesirable
allocation of research effort; or even worse, coding in assembly
language - an undesirable allocation of engineering effort.
In this paper we show how the irregularities in the ISA can actually
be matched with the regular resource based requirements of classic
VLIW compilers by generating a set of artificial resources based on
the ISA specification. These resources may not correspond to any real
physical resources, but are created to provide a uniform specification
to the VLIW compiler (actually the scheduler) as to what can and
cannot be done in parallel. This enables the use of the large body of
work (and software) in VLIW compilation to be directly used for these
irregular processors.
The artificial resources are generated by solving a combinatorial
graph labeling problem which is generated from the ISA
specification. We show that this problem is equivalent to the problem
of covering all the edges in a given graph using the minimum number of
cliques (complete subgraphs). This is a known NP-complete problem -
thus, we can leverage the heuristics already developed for this. In
addition, there is a direct transformation between this problem and
the well studied graph coloring problem. This transformation enables
us to use the large body of work (and software!) in the graph
coloring problem domain.
Finally, we demonstrate the application of these ideas in the
development of compilers for two proprietary DSPs - the Elixir and the
Hiperion DSPs from Fujitsu. In the past, compilers were developed for
these processors using highly specialized optimization
algorithms. This is now being replaced by using the proposed technique
of generating artificial resources with an existing VLIW compiler (the
IMPACT compiler).
We believe that the development of these ideas opens the door for
using classic VLIW compilation methods for irregular embedded VLIW
DSPs and is a significant step forward in the development of software
tools for embedded processing.