11. Structure and Extensions of the Backend Process

Contents

11.1. How to call Backend Processes

The backend processes are called from coins.driver.Driver as follows:
00    /* pass 7 -- Code generation */
01    Root root = new Root(spec, new PrintWriter(System.out, true));
02    String targetName
03      = coinsOptions.getArg(CommandLine.COINS_TARGET_NAME_OPTION);
04    String targetConvention
05      = coinsOptions.getArg(CommandLine.COINS_TARGET_CONVENTION_OPTION);
06    trace.trace(myName, 5000, "target name = " + targetName);
07    Module unit = Module.loadSLir(sexp, targetName, targetConvention, root);
08    makeCSourceFromLir("new", lirToCTimings, unit, io);
09
10    if(spec.getCoinsOptions().isSet("snapshot"))
11      snap.shot(unit,"Generated LIR");
12
13    /* SSA optimization */
14    if (spec.getCoinsOptions().isSet(coins.ssa.OptionName.SSA_OPT)) {
15      unit.apply(new coins.ssa.SsaDriver(unit, io, spec));
16      /* Simple/JumpOpt again */
17      unit.apply(coins.backend.opt.SimpleOpt.trig);
18      unit.apply(coins.backend.opt.JumpOpt.trig);
19    } else {
20      unit.basicOptimization();
21    }
22    if (spec.getCoinsOptions().isSet("simd")) {
23      unit.apply(new coins.simd.SimdDriver(unit, io, spec));
24    }
25    makeCSourceFromLir("opt", lirToCTimings, unit, io);
26
27    if(spec.getCoinsOptions().isSet("snapshot"))
28      snap.shot(unit,"Optimized LIR");
29
30    if(spec.getCoinsOptions().isSet("snapshot"))
31      snap.generateXml();
32
33    unit.generateCode(out);
34
35    if (trace.shouldTrace("Sym", 1)) {
36      trace.trace("Sym", 1, "\nSym after code generation ");
37      symRoot.symTable.printSymTableAllDetail(symRoot.symTableRoot);
38      symRoot.symTableConst.printSymTableDetail();
39    }
The instance root made at the line 1 has the root information of the backend process. It has the information of the compiler options (spec) and output assembly file (System.out). The backend processes are called at the line 7, 20, and 33 as:
Module unit = Module.loadSLir(sexp, targetName, targetConvention, root);
unit.basicOptimization();
unit.generateCode(out);
The instance unit has the internal representaion of the LIR made from the S-expression sexp.

The above three methods, loadSLir, basicOptimizations, and generateCode, are composed of several steps as follows:

loadSLir
Module unit = new Module(sexp, targetName, convention, root);
unit.apply(new String[] {
    "IntroVirReg",
    "EarlyRewriting",
    "+AfterEarlyRewriting",
    "If2Jumpc" });
return unit;
basicOptimization
unit.apply(new String[] {
    "+BeforeBasicOpt",
    "SimpleOpt",
    "JumpOpt",
    "PreHeaders",
    "LoopInversion",
    "+AfterBasicOpt" });
generateCode
unit.apply(new String[] {
    "+BeforeCodeGeneration",
    "LateRewriting",
    "+AfterLateRewriting",
    "ToMachineCode",
    "+AfterToMachineCode",
    "ConvToAsm"  });
For example, by the first
unit.apply("IntroVirReg");
the method doIt in the class coins.backend.opt.IntroVirReg is executed, because the Module constructor has registered the transformer IntroVirReg.trig with the name "IntroVirReg" to the instance root by the following statement:
root.registerTransformer(IntroVirReg.trig);
By the execution of doIt of coins.backend.opt.IntroVirReg, frame variables are named as virtual registers.

Furthermore, the execution of

unit.apply("ToMachineCode");
in generateCode causes the excecution of the instance method toMachineCode of the class Function, and the toMachineCode method executes the following steps:
    apply(LiveRange.trig);   // live range analysis
    apply(JumpOpt.trig);   // jump optimization
    apply(JumpCanon.trig);   // canonicalize JUMPC instructions
    apply("+BeforeFirstInstSel");   // call Hook point
    apply(module.targetMachine.instSelTrig);   // instruction selection
    apply("+AfterFirstInstSel");  // call Hook point
    int count = 0;
    for (;;) {
      if (count++ >= REGALOLIMIT)
        throw new CantHappenException("Looks like an infinite loop during Register allocation");
      if (apply(RegisterAllocation.trig))  // register allocation
        break;          //  if success
      apply("+BeforeSecondInstSel");  // call Hook point
      apply(module.targetMachine.instSelTrig);   // instruction selection again
      apply("+AfterSecondInstSel");  // call Hook point
    }
    postProcess(); // eliminate redundant jumps
The argument string with prefixed "+" in the above calls of the apply method means that it is a hook name. Usually this call does nothing. But, if the method
 Root#addHook(String, Transformer)
has been called with a hook name and a Transformer object, the doIt method of the Transformer object is executed when the apply method is called with the hook name.

11.2. How to Add New Process to Backend

By using the previous hooking features a new process can be added to the backend without modifing the backend. The class for a new process should be in the following form:
class NewProcess {
  class NewTrigger  implements LocalTransformer{ // or GlobalTransformer
    void doIt(...){...}
    ...
  }
  static void attach(CompileSpecification spec, Root root){
//  if (spec.getCoinsOptions().isSet("newprocess")) {
      root.addHook("+xxx", new NewTrigger());
// }
  }
}
where "+xxx" is a hook name. LocalTransformer is applied to each Function instance, and GlobalTransformer is applied to a Module instance.

When the compiler option

-coins:attach=NewProcess
is specified, the NewProcess.attach method is called at the entry of the backend process. And the doIt method of the NewTrigger object is executed at the hook point "+xxx".
If the commented-out if-statement is effective, this execution occurs only when the option
-coins:newprocess
is specified.

11.2.1. Example 1: Instruction Scheduler

The instruction scheduler is called when the following compiler options are specified:
-coins:schedule,attach=coins.backend.sched.Schedule
The attach method of coins.backend.sched.Schedule is defined as follows:
public static void attach(CompileSpecification spec, Root root) {
    if (spec.getCoinsOptions().isSet("schedule")) {
      root.addHook("+AfterFirstInstSel", before);
      root.addHook("+AfterToMachineCode", after);
    }
  }
Therefore, the instruction scheduler is called twice by two triggers, one is the before trigger at the "+AfterFirstInstSel" hook point, and the other is the after trigger at the "+AfterToMachineCode" hook point. The former point is before the register allocation, and the latter is after the regisiter allocation.

The two triggers are almost the same except that the latter does the "nop"-instruction elimination if the target CPU has delayed branch instructions. They are written machine independently, and can be applied to any target machine.

See Fig.12-2 in 12. Performance of COINS compilers for some results of this optimization.

11.2.2. Example 2: Register Promotion

The process of register promotion is called when the following compiler options are specified:
-coins:regpromote,attach=RegPromote
The class RegPromote is in the package coins.backend.contrib. The name of this special package can be omitted.

The register promotion is the process to promote static variables to registers within some loops. This promotion is effective especially for such a program as:

int candidat, quotient, remaindr, index, nth, primenum, loopend ;
int primeNumbers[SIZE] ;

void getPrime(int primevec[SIZE], int count)
{
 primevec[0] = 2 ;
 nth = 1 ;
 candidat = 3 ;
 while (nth<count) {
   remaindr = 1 ;
   index = 0 ;
   loopend = 0 ;
   while(loopend==0) {
     primenum = primevec[index] ;
     quotient = candidat / primenum ;
     remaindr = candidat - quotient*primenum ;
     if (remaindr==0)
       loopend = 1 ;
     if (quotient*quotient<candidat)
       loopend = 1 ;
     index = index + 1 ;
     if (index>=nth)
       loopend = 1 ;
   }
   if (remaindr != 0) {
     primevec[nth] = candidat ;
     nth = nth + 1 ;
   }
   candidat = candidat + 2 ;
 }
 nth   = 0 ;
 while (nth<count) {
/*   print(primevec[nth]) ;
*/
   nth   = nth + 1 ;
 }
 return;
}
This program is the prime program in the Fig.12-2 of the chapter 12. Performance of COINS compilers. This register promotion produces almost all the performance gain of the prime program.