The first step in eliminating local common subexpressions is to detect the common subexpression in a basic block. The common subexpressions in a basic block can be automatically detected if we construct a directed acyclic graph (DAG).
DAG Construction
For constructing a basic block DAG, we make use of the function node(id), which returns the most recently created node associated with id. For every three-address statement x = y op z, x = op y, or x = y in the block we:
do
{
-
If node(y) is undefined, create a leaf labeled y, and let node(y) be this node. If node(z) is undefined, create a leaf labeled z, and let that leaf be node(z). If the statement is of the form x = op y or x = y, then if node(y) is undefined, create a leaf labeled y, and let node(y) be this node.
-
If a node exists that is labeled op whose left child is node(y) and whose right child is node(z) (to catch the common subexpressions), then return this node. Otherwise, create such a node, and return it. If the statement is of the form x = op y, then check if a node exists that is labeled op whose only child is node(y). Return this node. Otherwise, create such a node and return. Let the returned node be n.
-
Append x to the list of identifiers for the node n returned in step 2. Delete x from the list of attached identifiers for node(x), and set node(x) to be node n.
}
Therefore, we first go for a DAG representation of the basic block. And if the interior nodes in the DAG have more than one label, then those nodes of the DAG represent the common subexpressions in the basic block. After detecting these common subexpressions, we eliminate them from the basic block. The following example shows the elimination of local common subexpressions, and the DAG is shown in Figure 1.
-
S1 : = 4 * I
-
S2 : addr(A) − 4
-
S3 : S2 [S1]
-
S4 : 4 * I
-
S5 : = addr(B) − 4
-
S6 : = S5 [S4]
-
S7 : = S3 * S6
-
S8 : PROD + S7
-
PROD : = S8
-
S9 : = I + 1
-
I = S9
-
if I ≤ 20 goto (1).
In Figure 1, PROD 0 indicates the initial value of PROD, and I0 indicates the initial value of I. We see that the same value is assigned to S8 and PROD. Similarly, the value assigned to S9 is the same as I. And the value computed for S1 and S4 are the same; hence, we can eliminate these common subexpressions by selecting one of the attached identifiers (one that is needed outside the block). We assume that none of the temporaries is needed outside the block. The rewritten block will be:
-
S1 : = 4 * I
-
S2 : = addr(A) − 4
-
S3 : = S2 [S1]
-
S5 : = addr(B) − 4
-
S6 : = S5 [S1]
-
S7 : = S3 * S6
-
PROD : = PROD + S7
-
I : = I + 1
-
if I ≤ 20 goto (1)