Automatic Exploitation of Input Parallelism [abstract] (PDF)
Taewook Oh
Ph.D. Thesis, Department of Computer Science,
Princeton University, September 2015.
Parallelism may reside in the input of a program rather than the program
itself. A script interpreter, for example, is hard to parallelize
because its dynamic behavior is unpredictable until an input script is given.
Once the interpreter is combined with the script, the resulting program becomes
predictable, and even parallelizable if the input script contains parallelism.
Despite recent progress in automatic parallelization research, however, existing
techniques cannot take advantage of the parallelism within program inputs, even
when the inputs remain fixed across multiple executions of the program.
This dissertation shows that the automatic exploitation of parallelism within
fixed program inputs can be achieved by coupling program specialization
with automatic parallelization techniques. Program specialization
marries a program with the values that remain
invariant across the program execution, including fixed inputs, and creates a
program that is highly optimized for the invariants. The proposed technique exploits
program specialization as an enabling transformation for automatic
parallelization; through specialization, the parallelism within the fixed
program inputs can be materialized within the specialized program.
First, this dissertation presents Invariant-induced Pattern-based Loop
Specialization (IPLS).
IPLS folds the parallelism within the program invariants into the specialized
program, thereby creating a more complete and predictable program that is easier
to parallelize.
Second, this dissertation applies automatic speculative
parallelization techniques to specialized programs to exploit parallelism in
inputs. As existing techniques fail to extract parallelism from complex programs
such as IPLS specialized programs, context-sensitive speculation and
optimized design of the speculation run-time system are proposed to improve the
applicability and minimize the execution overhead of the parallelized program.
A prototype of the proposed technique is evaluated against two widely-used
open-source script interpreters. Experimental results demonstrate the
effectiveness of the proposed techniques.