The Omega test: a fast and practical integer programming algorithm for dependence analysis

Published on Aug 1, 1991
· DOI :10.1145/125826.125848
William Pugh51
Estimated H-index: 51
(UMD: University of Maryland, College Park)
Sources
Abstract
The Omega test is an integer programming algorithm that can determine whether a dependence exists between two array references, and if so, under what conditions. Conventional wisdom holds that integer programming techniques are far too expensive to be used for dependence analysis, except as a method of last resort for situations that cannot be decided by simpler methods. We present evidence that suggests this wisdom is wrong, and that the Omega test is competitive with approximate algorithms used in practice and suitable for use in production compilers. The Omega test is based on an extension of FourierMotzkin variable elimination to integer programming, and has worst-case exponential time complexity. However, we show that for many situations in which other (polynomial) methods are accurate, the Omega test has low order polynomial time complexity. The Omega test can be used to simplify integer programming problems, rather than just deciding them. This has many applications, including accurately and efficiently computing dependence direction and distance vectors.
📖 Papers frequently viewed together
1988
1 Author (Utpal Banerjee)
511 Citations
1986
5,999 Citations
1977POPL: Symposium on Principles of Programming Languages
5,600 Citations
References24
Newest
#1Michael Wolfe (Nvidia)H-Index: 22
From the Publisher: Effective use of a supercomputer requires users to have a good algorithm and to express this algorithm in an appropriate language, and requires compilers to generate efficient code. This book investigates several problems facing compiler design for supercomputers, including building efficient and comprehensive data dependence graphs, recurrence relations, the management of compiler temporary variables, and WHILE loops. The book first proposes an efficient means of representin...
720 CitationsSource
#1Michael Wolfe (OHSU: Oregon Health & Science University)H-Index: 22
#2Chau-Wen TsengH-Index: 36
A data dependence decision algorithm called the power test is introduced. The power test is a combination of the extended GCD algorithm and the Fourier-Motzkin method to eliminate variables in a system of inequalities. This is the first test that can generate the information needed for some advanced transformations, and that can handle complex simultaneous loop limits. Previous work in data dependence decision algorithms is reviewed. Some examples which motivated the development of this test are...
125 CitationsSource
#1David Klappholz (Stevens Institute of Technology)H-Index: 9
#2Xiangyun Kong (Sun Microsystems)H-Index: 1
The Banerjee-Wolfe test is one of the major data dependence tests used in automatic parallelization of sequential code. Though it is only an approximate test, its relatively high accuracy and its relatively low cost account for its great popularity. Being an approximate test, the Banerjee-Wolfe test does, however, sometimes result in a loss of parallelism. One of its potential sources of failure is the fact it does not traditionally take execution conditions into account. The purpose of the pres...
3 CitationsSource
#1William Pugh (UMD: University of Maryland, College Park)H-Index: 51
63 CitationsSource
#1François Irigoin (ENSMP: Mines ParisTech)H-Index: 15
#2Pierre Jouvelot (ENSMP: Mines ParisTech)H-Index: 18
Last. Rémi Triolet (ENSMP: Mines ParisTech)H-Index: 3
view all 3 authors...
137 CitationsSource
May 1, 1991 in PLDI (Programming Language Design and Implementation)
#1Gina Goff (Rice University)H-Index: 1
#2Ken Kennedy (Rice University)H-Index: 83
Last. Chau-Wen Tseng (Rice University)H-Index: 36
view all 3 authors...
Precise and efficient dependence tests are essential to theeffectivermss ofaparallelizing compiler. This paper proposes a dependence testing scheme based on classifyingpairs ofsubscripted variable references. Exact yet fast dependence tests are presented for certain classes ofarray references, as well as empirical results showing that these references dominate scientific Fortran codes. These dependence tests are being implemented at Rice University in both PFC, aparallelizing compiler, and ParaS...
220 CitationsSource
May 1, 1991 in PLDI (Programming Language Design and Implementation)
#1Dror E. Maydan (Stanford University)H-Index: 9
#2John L. Hennessy (Stanford University)H-Index: 72
Last. Monica S. Lam (Stanford University)H-Index: 77
view all 3 authors...
Data dependence testing is the basic step in detecting loop level parallelism in numerieal programs. The problem is equivalent to integer linear programming and thus in general cannot be solved efficiently. Current methods in use employ inexact methods that sacrifice potential parallelism in order to improve compiler efficiency. This paper shows that in practice, data dependence can be computed exactly and efficiently. There are three major ideas that lead to this result. First, we have develope...
188 CitationsSource
Apr 1, 1991 in PPoPP (ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)
#1Corinne AncourtH-Index: 10
#2François IrigoinH-Index: 15
306 CitationsSource
#1Michael WolfeH-Index: 22
60 Citations
#1Paul Havlak (Rice University)H-Index: 23
#2Ken Kennedy (Rice University)H-Index: 83
The authors describe an implementation of regular section analysis in the Rice Parallel Fortran Converter (PFC), an automatic parallelization system. The overriding concern in the implementation is that it be efficient enough to be incorporated in a practical compilation system. This implementation of regular section analysis describes interprocedural side effects on subarrays in a form useful to dependence analysis while avoiding the complexity of prior solutions. The authors also examine the p...
23 CitationsSource
Cited By717
Newest
#1Ahmed Al-Ajeli (College of Information Technology)H-Index: 2
#2David Parker (University of Birmingham)H-Index: 72
Abstract null null We propose techniques for fault diagnosis in discrete-event systems modelled by labelled Petri nets, where fault events are modelled as unobservable transitions. The proposed approach combines an offline and an online algorithm. The offline algorithm constructs a diagnoser in the form of sets of inequalities that capture the legal, normal and faulty behaviour. To implement the offline algorithm, we adopt the Fourier–Motzkin method for elimination of variables from these sets o...
Source
#1Armin BiereH-Index: 56
#2Marijn J. H. HeuleH-Index: 26
Last. Toby WalshH-Index: 73
view all 4 authors...
'Satisfiability (SAT) related topics have attracted researchers from various disciplines: logic, applied areas such as planning, scheduling, operations research and combinatorial optimization, but also theoretical issues on the theme of complexity and much more, they all are connected through SAT. My personal interest in SAT stems from actual solving: The increase in power of modern SAT solvers over the past 15 years has been phenomenal. It has become the key enabling technology in automated ver...
515 Citations
#1Seungbin Song (Yonsei University)H-Index: 2
#2Heelim Choi (Yonsei University)
Last. Hanjun Kim (Yonsei University)H-Index: 10
view all 3 authors...
Network programming languages enable programmers to implement new network functions on various hardware and software stacks in the domain of Software Defined Net-working (SDN). Although the languages extend the flexibility of network devices, existing compilers do not fully optimize the network programs due to their coarse-grained parallelization methods. The compilers consider each packet processing table that consists of match and action functions as a unit of tasks and parallelize the program...
Source
Automatic parallelization improves the performance of serial program by automatically converting to parallel program. Automatic parallelization typically works in three phases: check for data dependencies in the input program, perform transformations, and generate the parallel code for target machine. Though automatic parallelization is beneficial, it is not done as a part of compiling process because of the time complexity of the data dependence tests and transformation techniques. Data depende...
Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library that supports this model in hybrid shared/distributed memory systems on top of a Partitioned Global Address Space environment. While the initial version of the library provided good results, it suffered from a key restriction that heavily limited its performance and scalability. Namely, each ...
Source
The Isabelle/HOL proof assistant provides quite advanced and reliable support for discharging proof goals with external SMT solvers by reconstructing the resulting proof tree within the Isabelle/Pure inference kernel. In particular, currently Isabelle fully and efficiently supports proof reconstruction for the quantifier-free fragments of the theories of equality and uninterpreted functions (QF_UF) and linear integer arithmetic (QF_LIA). Yet there are a number of practically relevant theories th...
Source
#1Hendrik BrinksmaH-Index: 1
#2W.R. CleavelandH-Index: 2
Last. Kim Guldstrand LarsenH-Index: 84
view all 5 authors...
9 Citations
Sep 30, 2020 in PACT (International Conference on Parallel Architectures and Compilation Techniques)
#1Lorenzo Chelini (TU/e: Eindhoven University of Technology)H-Index: 5
#2Tobias Gysi (ETH Zurich)H-Index: 6
Last. Henk Corporaal (TU/e: Eindhoven University of Technology)H-Index: 40
view all 5 authors...
To this day, polyhedral optimizing compilers use either extremely rigid (but accurate) cost models, one-size-fits-all general-purpose heuristics, or auto-tuning strategies to traverse and evaluate large optimization spaces. In this paper, we introduce an adaptive and automatic scheduler that permits to generate novel loop transformation sequences (or recipes) capable of delivering strong performance for a variety of different architectures without relying on auto-tuning, nor on pre-determined tr...
4 CitationsSource
#1František Silváši (Technical University of Košice)H-Index: 1
#2Martin Tomášek (Technical University of Košice)H-Index: 3
Abstract We present a formalization of bounded grids using Lean proof assistant and provide a formalized implementation along with an interface consisting of various definitions together with their proven–correct properties serving to manipulate grids in general fashion regardless of the intended use case. We then proceed to demonstrate the applicability of the grids by interpreting them as matrices, followed by developing a formalization of cellular automata with emphasis on preservation of com...
2 CitationsSource
#1Hamid Arabnejad (University of Porto)H-Index: 12
#2João Bispo (University of Porto)H-Index: 11
Last. Jorge G. Barbosa (University of Porto)H-Index: 16
view all 4 authors...
Directive-driven programming models, such as OpenMP, are one solution for exploring the potential parallelism when targeting multicore architectures. Although these approaches significantly help developers, code parallelization is still a non-trivial and time-consuming process, requiring parallel programming skills. Thus, many efforts have been made toward automatic parallelization of the existing sequential code. This article presents AutoPar-Clava, an OpenMP-based automatic parallelization com...
3 CitationsSource