Compiler Designing W07L1304

COOL Semantics II

  • The declared object for a class C would return list of argument with their types and values.
 Class(C)={a1 : T1 ← e1, a2 : T2 ← e2... an : Tn ← en}

If class C is inherited from other class, then the list of the attributes generated would in same order as they are in memory, i.e. first come the superclass’s attributes then the attributes of class.

These are the rules for creating a new object of any class:

gif (1)

See the self object is the allocated object, first three lines are for allocating memory to the class object and remaining for initializing it. Defaults are given to values initially and see that all the attributes are in scope.

  • Discussion about the informal semantics of e0.f(a1,a2…an)
    • Evaluation of the arguments.
    • evaluate target object e0 and let X is the dynamic type of the target object e0
    • Fetch from X the definition of f with n arguments.
    • create n new locations and make environment mapping from attributes to these new locations.
    • set self to target object and evaluate f’s body.
  • impl(A,f)={args, f_body} –lookup in memory of function
  • Example of above @14:00 min, in the example, E have the formal arguments and self attributes. S contains the locations of attributes and formal arguments.

[II]

  • Operational semantics do not cover all the cases.
  • When function impl(A, f) is used, there is not this case of checking that if f exists in A or not that work is already been done by type checker. But there are some error that type checker cannot prevent:
    • Heap overflow
    • string out of range
    • division by zero
    • dispatch on void
  • Operational semantics are good if portability is expected.
Advertisements

Compiler Designing W07L1206

Object Layout I: Attributes

  • Object layout is the structure for storing an object of a class in memory. According to OO implementation techniques, if B inherits from class A then every place object of B can be used where A’s object is expected. This means that the structure is such that the class B’s implementation in memory should be same until it is same as A.
  • Every attribute is stored at some fixed offset in object. Also object have contiguous memory allocation for attributes and its header.
  • Object layout for an object of class A:(contiguous memory allocation)

object

The first three entries are collectively called the header of the class. Class tag is an integer used to identify the class. Dispatch Pointer points to the dispatch table.

When a class inherits from A then layout till the attributes is the same, and this layout is extended for the attributes of the inherited class.

Object Layout II: Dispatch Table

  • Dispatch table contains the pointer to the functions and these functions remain at same offset in the class and all of its subclasses. If any subclass overrides to the function of its super class then it is at the same offset.
  • All the objects of the same class would use the dispatch table.

Compiler Designing W07L1205

Temporaries (Improvement)

  • Remember the case for adding two expressions e1 and e2 it used one temporary register to reload the value of e1. This temporary register can be reused by other expressions.
  • The number of temporary for an expression can be calculated at the compil time, hence memory in AR is reserved for the number of temporaries. Some formula for counting number of temporaries are:(NT: number of temporaries)
NT(e + e)= max(NT(e), 1+ NT(e))
NT(e - e)= max(NT(e), 1+ NT(e))
NT(if a=b then c else d)= max(NT(a=b), NT(c), NT(d))   //NT(a=b)=max(NT(a), 1 + NT(b))
NT(id(arg1, ... argn))=max(NT(arg1), NT(arg2)...NT(argn))
NT(int)=0
NT(id)=0
  • Including the temporary changes the structure of AR as:

ar

  • The explanation is necessary to understand the concept.

[II]

  • The a + b assembly code with this structure becomes:
cgen(a + b, nt)=
     cgen(a, nt)
     sw $a0 nt($fp)
     cgen(b, nt+4)
     lw $t1 nt($fp)
     add $a0 $t1 $a0
  • A code generator should know how many temporaries are in use so that the empty temporary can be used. Then temporary area is also a stack and it’s small and fixed-size.

Compiler Designing W07L1203 & 04

Code Generation II and Example

  • Code generation of functions calls and definition is done using AR.
  • As the result of a funtion call would be stored in accumulator, hence no need to store it in AR.
  • Arguments of a function that is being called is stored in AR of the caller function.
  • When the called function is executed successfully, there will be no change in stack (property of stack machine). As $sp would be at the same place hence no need to use control link.
  • Return address of the called function is stored in caller’s AR. As this would change depending on the other function called.
  • $fp is defined which is frame pointer that points to current activation. It is used to access the argument of the function as it remains at the constant location during the execution of the function body.
so cgen(xi)= lw $a0 z($fp) where z=4*i and i ∈ {1,n}
  • AR of the caller’s frame ccontains arguments of the called function return address of that function(old $fp).
caller implements 'jal lable' on function call //jump and link, stores next instruction's 
                                                 value in register $ra: return address.
The called implements 'jr reg' after executing itself //jump register, jums to reg.
  • A very good explanation of function call process in stack is given here, for example of code with commnets see here from slide 30. Other link is this.

[II]

  • AR and code generator has to be designed together.
  • Code generation can be implemented as recursive descent of AST just like type checking.
  • Production compilers keep values in register as this is fast and intermediate results are kept in AR not in stack.

[04]

  • Every binary operation would be done by the same procedure as written in some previous post that first will goto stack after being loaded into accumulator then the second one comes into accumulator and stack is popped to perform the operation and then result is pushed to stack. Note that the result of the expression is in accumulator.Example.

Compiler Designing W07L1202

Code Generation I

  • A function cgen(e) is introduced that produces the code for expression for e.
  • Red: Compile time operation; Blue: Run-time operation
cgen(constant) = li $a0 constant

/*Add Operation*/
cgen(a+b) = 
      cgen(a)
      sw $a0 0($sp)
      addiu $sp $sp -4
      cgen(b)
      lw $t1 4($sp)
      add $a0 $a0 $t1
      addiu $sp $sp 4 
/*Subtract Operation*/
cgen(a-b) = 
      cgen(a)
      sw $a0 0($sp)
      addiu $sp $sp -4
      cgen(b)
      lw $t1 4($sp)
      sub $a0 $a0 $t1                      //just changes here
      addiu $sp $sp 4 
  • One optimization can be proposed using $t1 to load e1, but it won’t work correctly. Take the case of 1+(2+3).
  • From the above, see that the cgen is actually a template, it is to be plugged with different values and get the result.
  • Stack machine code generation is recursive, hence a recursive-descent of AST can be used to implement it.

[II]

  • Two other operations are:
beq $a $b branch_label          //beq: branch if equal
b branch_label                  //b: unconditional branch
  • Example execution of if (a== b) then c else d;
      cgen(a)
      sw $a0 0($sp)
            addiu $sp $sp -4
      cgen(b)
       lw $t1 4($sp)
              addiu $sp $sp 4 
              beq $a0 $t1 true_branch
              b false_brach
   
   true_branch cgen(c)
   end_if
     false_branch cgen(d)
   end_if

Compiler Designing W07L1201

Introduction to Code Generation

  • Code generation would be based on stack machine with accumulator or 1-register stack machine.
  • SPIM is used as simulator, code generated by MIPS can be used on any kind of hardware.
  • Accumulator would be kept in register $a0, and stack in memory.
  • To access stack there is this stack pointer denoted by $sp, this is actually the next unallocated space and as the stack grows towards the low address the MIPS then the address of top is $sp+4, 4 because MIPS works on 4 byte words.
  • MIPS architecture:
    • Its a RISC machine
    • for operands and results, registers are used.
    • use load and store to access memory
    • 32 bit general purpose registers(32-bit each), we use three of them $a0, $sp and for temporary values $t1
  • The register operation done in a MIPS program are:
lw reg1 offset(reg2)
     //load word into reg1 the contents of offset+reg2
add reg1 reg2 reg3
     //performs add operation reg1 ← reg2 + reg3
sw reg1 offset(reg2)
     //stores the contents of reg1 in reg2 + offset
addiu reg1 reg2 imm
     //adds immediate unsigned: imm is a constant value; reg1 ← reg2 + imm
li reg1 imm
     //load immediate; load imm in reg1

lets take the example of 7 + 5

load 7                             li $a0 7
push 7                             sw $a0 0($sp)
                                   add $sp $sp -4
load 5                             li $a0 5
acc ← acc + top_of_stack           lw $t1 4($sp)
                                   add $a0 $a0 $t1
pop                                addiu $sp $sp 4