3. Semantics of High level Intermediate Representation HIR

Preface

This document describes the semantics of HIR, such as the meaning of type and operations corresponding to each HIR representation. The operation of an expression will be defined according to the type of the expression and types of its operands. And so, explanation of types is given before explanation of operations. As for the overview and syntax of HIR, see
Outline of the High level Intermediate Representation HIR.
HIR is almost independent of source languages or target machines, however, some detailed items may be specified by language parameters in SourceLanguage class and machine parameters in MachineParam class. The SourceLanguage class may have a subclass corresponding to each source language. The MachineParam class may have a subclass corresponding to each target machine architecture.

3.1. HIR-base and Transformations from source language to HIR

In some explanations, it is preferable to mention about a particular source language or a particular target machine to draw clear images. When that is the case, the portion of the explanation with language/machine dependencies is enclosed with double brackets [[ ]] in this document.
In the course of transformation from source language to HIR, it may be convenient to use some intermediate representation with some dependencies to the source language. To distinctly address the HIR free of this kind of dependencies, a term "HIR-base" is used.

[[
Transformation of a C program into HIR goes through an intermediate representation named HIR-C having some items specific to C before generating HIR-base, that is,
   C  -->  HIR-C  -->  HIR-base
HIR-C is introduced to lessen, by the above decomposition, the complexity of the single step transformation from C to HIR, and to prepare intermediate representations close to C source program so that software engineering tools and alike, which handle languages close to C in representations, can be developed with ease. When mentioned simply as HIR, that always means HIR-base.
HIR-C is an intermediate representation which covers HIR. The notations present in HIR-C but not in HIR are:
      Short circuit conditional expression (&&, ||)
      Ternary operator (?:)
      Embedded assign statement (assign statement as a factor of an expression)
      Comma operator (,)
      Increment, decrement (++, --)
      Operators combined with assignment
          ( +=, -+, *=, /*, %=, <<=, >>=, &=, ^=, |=) 
      Addition and subtraction operations on pointers    
Among the types in HIR-C, there are some which are decomposed further when transformed to HIR-base. For example, "int" in HIR-C is broken down to "int", "char" and "enum" in HIR-base. Pointer subtraction operations in HIR-C and in HIR-base share the same notation but differ in meanings. If a flag named HIR.FLAG_C_PTR is attached to a pointer operation node, then the operator has the meaning of the pointer operation in C, and if the flag is not attached, then the operator has the meaning of pointer operation in HIR-base. When input program is transformed to complete HIR-base, then no node has HIR.FLAG_C_PTR flag.
]]

3.2. Types of HIR-base

3.2.1. Notations of types

3.2.1.1. Nodes and types
HIR is represented as a tree structure, and the tree structure has a corresponding text representation. A tree structure consists of nodes. Each node is either a leaf with no child or a non-leaf with children. Each node has a type as its attribute. A leaf represents one symbol among symbol for variable name, constant, subprogram name or others, which appear in the source program. The type of a leaf node for a symbol "s" is the same as the type of the symbol "s" (as explained in "The outline of the symbol table", a symbol is given its type). A non-leaf has one or more child nodes and an operator. The child nodes are operands of the operator.
A node, regardless of leaf or non-leaf, is given a coded operator which indicates what the node is. A leaf node is presented with an operator which does not have operands. In text representation, a leaf node is enclosed by angular brackets < >, and a non-leaf node is enclosed in parentheses, and a constructed type (non basic type) is also enclosed in angular brackets.
The type of a non-leaf node of an operator "op" shows the type of resulting expression which is obtained by applying "op" to its operands. The type is in most cases decided by the combination of the operands and the operator at the time of HIR generation. The resulting type of the type conversion operator "conv" is decided by the context of the upper level syntax and "conv" node has that type. The context decides which type must be given to the node. If a node does not need a specific type, then the node has "void" type.
In general, a non-leaf represents an expression, a statement, a subprogram and so forth, and a leaf represents a simple variable, a constant, a label and so forth. Both non-leaves and leaves, as a whole, may be assumed as expressions in a broader sense. The type of a node in the HIR tree is the one decided for each of the expression in that broad sense upon HIR generation.
3.2.1.2. Objects
A storage area where a value is stored is called an object in this document. The size of an object is denoted in bytes. An expression other than of void type has a value. Once a value is stored in an object which is specified by the first child of an assignment operator (assign), it is retained there until another new value is assigned to that object, and the values of expressions which refer to that storage area are equal to the value last stored there. (As mentioned later, though, if a "volatile" qualifier is attached, this is not always the case.)
An expression which represents an object to store a value is called an "l-value". The type of an object is the type of the value which it stores. Generally, a variable represents a storage area to which a value can be assigned. If a storage area has "const" qualifier, its value can not be reassigned but such storage area may be represented by a variable. Later, aggregate types named array or structure or union will be explained. A storage areas which stores array, structure, or union is also treated as a variable of the sort and is called array variable, structure variable and union variable, respectively.
Thus, all variables represent objects. But there are objects which are represented by expressions of non variables. An array, a structure, and a union are called an aggregate and they are objects, and the expressions which represent them are l-values. If an l-value has the qualifier "const", then its value can not be reassigned, that is it can not be used as the first operand of assignment statement (AssignStmt). A scalar constant is not always an object, but an aggregate constant is always an object.
Objects are generated not only in accordance to the declarations in a source program, but also during the execution of the program. A part of an object may be used as another object. In such case, which storage area is to be used for which object should be decided according to the semantics of the aggregates.
3.2.1.3. Basic types
The basic types in HIR are:
    short int                short integer
    int                      integer
    long int                 long integer
    long long int            long long integer
    float                    floating point number
    double                   double precision floating point number
    long double              long double precision floating point number
    char                     character
    bool                     similar to "enum" type, with values of false and true
    unsigned short int       unsigned short integer
    unsigned int             unsigned integer 
    unsigned long int        unsigned long integer 
    unsigned long long int   unsigned long long integer 
    unsigned char            unsigned character
    offset                   offset of an element of a structure or an array, or
                             number to show size in bytes of an object
    void                     type without value (empty set)
The types short int, int, long int and long long int are given a generic name of signed integer type. The types unsigned short int, unsigned int, unsigned long int and unsigned long long int are given a generic name of unsigned integer type. Signed integer and unsigned integer again have a generic name of integer type. The generic name for float, double and long double is floating point type. The generic name of char and unsigned char is character type. As it is mentioned later, char can be converted to integer type, but the conversion is an implementation defined operation. How many bytes are used to represent each of these basic types is target machine dependent, and it is represented by a parameter corresponding to it in the MachineParam class.

Note: The term "implementation defined" means that a compiler developer who uses this compiler infrastructure can decide how to implement it, but the details in which way it is decided have to be documented.

Constants have their types and values. The internal representations of the values are implementation defined. The values also have their notational forms as their external representations. The notations in HIR may differ from those of the source language. The notations for values of HIR are used to put an intermediate representation on display and so forth. Such methods as makeIntConstString, makeFloatConstString, makeStringBody in the SourceLanguage class can be used to convert the notation of a constant from that of the source language's to that of HIR's.
In HIR, an integer constant is textually represented by a sequence of digits. A sequence of digits followed by a letter L represents a "long int" constant, and a "long long int" constant is followed by LL at the end. An "unsigned int" constant is textually represented by a sequence of digits with trailing U, and an "unsigned long int" constant and an "unsigned long long int" constant are with trailing UL and ULL, respectively. For example, a constant 1 takes forms of 1, 1L, 1LL, 1U, 1UL or 1ULL according to its type. The notation for a floating point number constant is basically the same as that of C language's, but to make different types distinct, trailing F, D and L are added for "float", "double" and "long double", respectively.
A string of characters is represented as a character array. Constants of string of characters are called "string" in HIR. In some source languages, a string constant may have additional marks to show its start and end, or escape characters. A term named "pure sequence of characters" means the array of characters that is obtained by disregarding these additional marks and escape characters. In textual representation, a string constant in HIR is represented by its pure sequence of characters enclosed with quotation marks ("). Since a pure sequence of characters may include unprintable characters, an escape character \ is supplemented at print out, in compliance with the notation in Java which is the compiler description language. The notation for a null string is "" that has no pure sequence of characters. Methods in SourceLanguage class are to be prepared to handle notational changes between the source program and HIR reciprocally.
To the value of a "char", there corresponds a character code of "int" type.
To the value of an "unsigned char", there corresponds a character code of "unsigned int" type. The value of a "bool" is either "true" or "false". To the value of "bool", there corresponds an integer called ordinal number. The ordinal number corresponding to "true" is 1 and that of "false" is 0. An "offset" represents the offset of an element of a structure or of an array, or the size of an object.
To the value of an offset, there corresponds an integer value of either "int", "long int" or "long long int". The selection of one of these integer types corresponding to the offset depends on the target machine and the corresponding integer type is called as origin-type of the "offset" as it is described later.
A label is not an object holding a value, and its type should be set to "void".
Many languages have matching types to some of the basic types mentioned above. The correspondence between the matching types of an individual source language to the types of HIR is decided at HIR generation (by the parser of its compiler).

[[
"int", "short int", and "long int" types of C's are converted to "int", "short int", and "long int" types of HIR's, respectively. "signed int", "signed short int", and "signed long int" types of C's are converted to "int", "short int", and "long int" types of HIR's, respectively. To convert "char" of C's to "char" or "unsigned char" of HIR's is implementation defined (see SourceLanguage class). HIR-base "char", "bool" and "offset" can be treated, as elaborated later, as integer type in HIR-C. A "\0" is added at the end of a string in C, and is not to be included in the pure sequence of characters of a string.
]]
3.2.1.4. Type constructors and origin-types
A type constructor constructs a new type from basic and other types. Listed below are type constructors in HIR.
    pointer      pointer (a type which points to an object) 
    vector       one dimensional array
    struct       structure
    union        union
    enum         enumeration
    subprogram   subprogram
    typedef      type name definition
The enumeration "enum" constructs a scalar type giving a name to a set of elements named enumerators each of which has corresponding ordinal number of integer type as its attribute. The basic type "bool" has many in common with "enum" such as ordinal number but it is different from "enum". The value of "pointer" is the address of an object pointed by the pointer, where the address is represented as integer.
The type qualifiers "const" and "volatile" may be attached to types. If a type t1 is a type t with a qualifier, then t1 and t are different types, but they are treated just the same way as far as the operations are concerned, and no explicit type conversion is required. No assignment operations (operations by assign operator) should be performed on a variable of a type qualified with "const". As for variables of a type qualified with "volatile", eliminations of assignments or references to them or rearrangement of their operation order should not be done during the course of code optimizations.
Type constructors introduce a new type based on some underlining type and define a new type name for the introduced type. The underlining type used to define a new type t is called the "origin-type" of t. The "origin-type" should be a basic type or a defined type that is already defined at the type of construction. Origin-types are explained in detailed for each type constructor. Operations on introduced types are carried out by going back to the origin-type, but application of the operations of the origin-type's on the introduced type may be under certain constraints. These constraints will be elaborated individually. Unless otherwise specified, the basic types do not have origin-types, however, some source languages permit giving origin-types to some basic types. The origin-type of "offset" can be selected, using a target machines parameter, to be "long", "long long" and so forth (see MachineParam class).
[[
When the source language is C, specify "int" as the origin-type of "char" and "bool" (see SourceLanguag class).
]]

When an operator requires of its operands to be the same type, a type conversion of one to the other should take place in compliance with the conversion rules, even among types which share the same origin-type, with some exceptions later described. In the case where operands type t1 and type t2 are of the same origin-type, though their value rages may differ, their values within the intersection of the value ranges of t1 and t2 stay unchanged before and after the type conversions, but are simply looked at from different angles after the conversion. It is quite often the case when types with different names are introduced by type name definition "typedef"s based on the same origin-type. In such cases, practically there is no need of value conversion among two types.
A type with a type qualifier and an introduced type whose range of value is the same as that of the origin-type inherit the operations of their respective origin-types. In these cases, either the introduced type or the origin-type may be used in HIR representations. The choice is implementation defined. The same operation may have different meanings among a type and its introduced type whose ranges of value are not the same.
If we go back the chain of origin-types starting from an introduced type, we will reach to a basic type at the end of this backward chain, and can not go back any further. This basic type is called the ultimate origin-type of the introduced type.
3.2.1.5. Notations of types
In the text representation of HIR, types are also represented in text. A set of notations for basic types in short forms are defined as follows.
    Notation     basic type         explanation
     int           int              (integer)
     short         short int        (short integer)
     long          long int         (long integer)
     long_long     long long int
     u_int         unsigned int
     u_short       unsigned short
     u_long        unsigned long int
     u_long_long   unsigned long long int
     char          char             (character)
     u_char        unsigned char
     float         float            (floating point number)
     double        double           (double floating point number)
     long_double   long double
     bool          bool             (Boolean type, similar to enum)
     void          void             (has no value)
     offset        offset           (structure element offset, 
                                     vector element offset, 
                                     object size, pointer difference)
A set of notations for type constructors in short forms are also defined.
 
    Notation     type constructor   
     ENUM         enum
     PTR          pointer
     VECT         vector
     STRUCT       struct
     UNION        union
     SUBP         subprogram
     TYPEDEF      typedef
In textual representation of HIR, compound types (non basic type) are enclosed with angular brackets < > to avoid confusions.
Integer, float, character, pointer, enum, bool, and offset types are collectively called scalar types, and vector, struct, and union are collectively called aggregates. An introduced type belongs to the same type group as its ultimate origin-type. The groupings of types are shown below:
  basic types
    integer types
      signed integer
        int, short int, long int, long long int
      unsigned integer
         unsigned int, unsigned short,
         unsigned long int, unsigned long long int
    character types
       char, unsigned char
    floating point types
        float, double, long double
    bool
    void
    offset
  type constructors
    ENUM, PTR, VECT, STRUCT, UNION, SUBP, TYPEDEF

  scalar types
    int, short int, long int, long long int,
    unsigned int, unsigned short,
    unsigned long int, unsigned long long int
    char, unsigned char,
    float, double, long double,
    bool,
    offset,
    ENUM, PTR
  aggregates
    VECT, STRUCT, UNION
3.2.1.6. Introduction of types
An introduced type belongs to the same type group to which its origin-type belongs. A type introduced from an integer type also is an integer type. A type introduced from a scalar type also is a scalar type. The criteria, such as structural equivalence, name equivalence, and so forth, to decide whether or not two introduced types are of the same type differ among languages, therefore it is advisable to prepare a method in SourceLanguage class to make the judgment. As mentioned later, there are some cases for which the equality relation is defined.
An object represented as a storage location is often represented as a variable, but it is also represented by a pointer expression. The type of a pointer which points to an object of type T is denoted by
     <PTR T>
An object is of type T, if it is pointed by a pointer of <PTR T> type. A pointer of the type shown below can be converted to and from any type of pointers:
     <PTR void>
3.2.1.6.1. Arrays
An array is an ordered sequence of elements of the same type. The size of an array is the size of an element multiplied by the number of elements. The type of an array which consists of n elements of type T and its smallest subscript is lb is denoted by
     <VECT n lb T>
The type of elements of an array of type <VECT n lb T> is T. One dimensional array is also called vector.
A type of a multi dimensional array would take such form as
     <VECT 10 0 <VECT 20 0 int>>
This shows a type of an array of arrays. The number of elements and the smallest subscript of an array may take forms of integer expressions as well as integer constants. An array type is an incomplete type if its number of elements or its smallest subscript is not specified at the time of compilation and it takes such forms as
     <VECT null null int>, and 
     <VECT null 0 char>
As mentioned earlier, the term "null" is put in places for the number of elements and the smallest subscript. An incomplete array type does not need a type conversion at operation involving an array type with the same element type.
[[
All elements of an array are usually placed in consecutive storage locations, but the specification of HIR does not request it to be as compulsory. Therefore, they do not have to be placed in ascending order of subscripts. In which way elements of an array are arranged and are accessed are implementation defined.
]]
3.2.1.6.2. TYPEDEF
A declaration
     (typedef T name)
declares that a type T is called "name" which can be used instead of T, and introduces a new type in the form of
     <TYPEDEF name>
or in some other form defined according to the source language. That is, "name" is another representation of T and the type of "name" is <TYPEDEF name>. Here, type T is called the origin-type of the introduced type <TYPEDEF name>, and the introduced type is treated in the same way as the origin-type T in evaluating HIR expressions.
Once this declaration has been done, the introduced type in the form of "name" can be used instead of T. (How to define an introduced type whose range of values differs from that of its origin-type is elaborated later). The "name" here is usually an identifier, but other exceptional representations may be allowed for some source languages.
An operation which involves expressions using an introduced type follows the rules applicable to the origin-type. If the origin-type is also an introduce type, the rules for the ultimate origin-type apply. Where their ranges of value differ, the operation might partially takes different courses.
[[
In C language, statements
    typedef int cm;
    typedef cm Length;
    Length leng1;
introduce types <TYPEDEF cm> and <TYPEDEF Length>, and the type of leng1 is <TYPEDEF Length>. The origin-type of <TYPEDEF Length> is <TYPEDEF cm>, and that of <TYPEDEF cm> is "int". Operations on leng1 are carried out as "int" type. (As is mentioned later, <STRUCT identifier>, <UNION identifier> and <ENUM identifier> are accepted as names of introduced types).
A declaration in Pascal;
     type price = integer;
corresponds to
     (typedef int price)
in HIR and allows to use "price" instead of "integer". Here, the declaration introduces a new type
     <TYPEDEF price>
and its origin-type is "int".
]]
(Note1:
It is an acceptable implementation to replace an introduced type with its origin-type at HIR generation, in stead of making use of type constructor TYPDEF. Current C parser takes that approach. The way with current C parser does not suit to some languages which make distinctions between types with different names. It also is not well prepared for the cases, such as a debugger, which requests the notations in the source language. In the case of C, since it does not make distinctions among type names, as long as the problem with debuggers can be put aside, replacing an introduced type with its origin type at HIR generation can be an acceptable choice.)
3.2.1.6.3. Enumeration
The notation for an enumeration in HIR is
     <ENUM <enumerator-specification enumerator-specification ...>>
where an enumerator-specification is a pair of an enumerator and its ordinal number:
    <enumerator ordinal number>
An ordinal number is an integer constant. The ordinal numbers corresponding to the enumerators often are 0, 1, 2, 3, ... in accordance with the order of the enumerators, but they do not have to be consecutive numbers, and a notation
     <ENUM <LOW 3> <MEDIUM 5> <HIGH 8>>
can be accepted as valid. The type of an enumerator is "enum", its origin-type is integer, and its value is the corresponding ordinal number. (Which one of the integer types is the origin type of "enum" is to be specified in MachineParam class).
[[
If an enumeration in C is attached with an identifier t as its tag, a type <ENUM t> is assumed to be introduced, and the notation <ENUM t> is used for that enumeration type. If a tag is absent, a tag (e.g., named t1) is generated at the time of transformation to HIR, and introduction of type <ENUM t1> is assumed. The notation <ENUM t1> is used for this enumeration type. This means that the notation <ENUM identifier> is a valid form of introduced type name. The origin-type of <ENUM t> is the enumeration declaration itself taking off its tag.
For example, consider a declaration for s1 in C
     enum signal {RED, YELLOW, GREEN} s1;
This declaration is to be translated that the type of s1 is <ENUM signal> whose origin-type is <ENUM <<RED 0> <YELLOW 1> <GREEN 2>>>. In other words, it is assumed that this declares, as well as variable s1, (typedef <ENUM <<RED 0> <YELLOW 1> <GREEN 2>>> <ENUM signal>) and thus declares two types,
    <ENUM signal> and 
    <ENUM <<RED 0> <YELLOW 1> <GREEN 2>>
Consider another example in C
     enum {OFF, ON} flag;
Assuming t1 to be the tag name generated by the compiler for this declaration, the declared type is denoted by
     <ENUM t1>
and its origin-type is
     <ENUM <<OFF 0> <ON 1>>>
]]
3.2.1.6.4. Structures and unions
A structure is an ordered list of its elements. How to calculate the size of a structure is implementation defined. A union is an aggregate whose elements share an overlapping storage location, and how to calculate its size is implementation defined, but its size is no less than the size of the element largest in size. How to access to elements of a structure or of a union is also implementation defined.
A structure and a union take respective forms of
     <STRUCT <Type_1 Name_1> <Type_2 Name_2> ... <Type_n Name_n>>
and
     <UNION  <Type_1 Name_1> <Type_2 Name_2> ... <Type_n Name_n>>
Here, "STRUCT" or "UNION" is followed by a list of element specifications which are in compliance with the notational conventions of types described in this section. The notations above show a structure or a union whose elements, Name_1, Name_2, ..., and Name_n are of types Type_1, Type_2, ..., and Type_n, respectively. As for the elements of the union above, Name_1, Name_2, ... , and Name_n are stored in storage areas which start at the same address.
To define a type of structure or union having element refering the type to be defined, an incomplete type
    (typedef null introduced-type-name) 
may be defined to start with, in preparation of the introduced type of structure or union type. The incomplete type thus introduced may be redefined later by
    (typedef structure introduced-type-name) 
or 
    (typedef union introduced-type-name) 
so as to make it a complete type. An incomplete type can be specified as the type of a target of a pointer reference and so forth.
[[
Consider a declaration in C
     struct point { int x; int y; } s1;
having the tag named "point". It is assumed that an introduced type name <STRUCT tag-name> (<STRUCT point> in this case) is declared and it is then used to declare the type of structure s1. If a tag name is not given in the source program, the compiler generates a tag name at transformation to HIR, and it appears in the notation likewise. In the case of unions, similarly, <UNION tag-name> is assumed to be introduced and declared. Thus, <STRUCT identifier> and <UNION identifier> are now included in the valid forms of introduced types.
Consider following declarations in C
     struct point { int x; int y; } s1;
     union value { int iValue; float fValue; } u1;
     struct { char* name; int price; } b1;
Introduced types and their origin types are
     type of s1:  <STRUCT point>
     type of u1:  <UNION value>
     type of b1:  <STRUCT t2>
     origin-type of <STRUCT point>: <STRUCT <int x> <int y>>
     origin-type of <UNION value>:  <UNION <int iValue> <float fValue>>
     origin-type of <STRUCT t2>:    <STRUCT <<PTR char> name> <int price>>
where t2 is the generated tag from the declaration of b1.
In the following example of C
     typedef struct tnode *TreePtr;
     typedef struct tnode {
       char *name;
       TreePtr left;
       TreePtr right;
     } TreeNode;
     TreeNode root;
the types of root and left are
     type of root: <TYPEDEF TreeNode>
     type of left: <TYPEDEF TreePtr>
     origin-type of <TYPEDEF TreeNode>: <STRUCT tnode>
     origin-type of <TYPEDEF TreePtr>:  <PTR <STRUCT tnode>>
where <STRUCT tnode> is first defined as an incomplete type, and then it is redefined as a complete type later.
Another example of C follows.
     typedef int Int;
     typedef Int *IntP;
     IntP p;
Here, the introduced types are:
     <TYPEDEF Int>
     <TYPEDEF <PTR Int>>
     <TYPEDEF IntP>
The type of p is <TYPEDEF IntP>, the origin type of <TYPEDEF IntP> is <TYPEDEF <PTR <TYPEDEF Int>>>, and the origin type of <TYPEDEF Int> is int.
]]
3.2.1.6.5. Subprograms
The subprogram type is formed with, apart from the type name SUBP as the first item, a list of types of arguments enclosed with angular brackets < > at the second place, a bool constant indicating whether or not a variable number of arguments are expected at the third, and the type of return value at the forth in such way
   <SUBP <type-of-argument-1 type-of-argument-2...> bool type-of-return-value>
For a subprogram with no argument, the second item is an empty list <>. If a variable number of arguments are expected, the third item is true, otherwise it is false. For a subprogram without a return value, a void is set to the forth item.
[[
For the following example in C,
     int (*comp)(void *, void *);
the type of comp is
     <PTR <SUBP <<PTR void> <PTR void>> false int> > >
In C, if an expression representing a subprogram appears at a place other than an operand of "&" or "sizeof", the expression is taken to be a pointer to the subprogram. According to this rule, in the subprogram call expressions
     comp(i, j), and (*comp)(i, j) 
the type of the subprogram specification part of either expression is
     <PTR <SUBP <<PTR void> <PTR void>> false int> > >
Consider slightly more complicated example
     int *fpi();          /* A function returning a pointer to int. */
     int  *(*pfpi)();     /* A pointer to a function returning
                             a pointer to int. */
     int  *(*fpfpi())()   /* A function returning a pointer to a function */
     {                    /* returning a pointer to int. */
       return pfpi;
     }
        ...
       pfpi = fpi;
       pi1  = pfpi();
       i5   = (*fpfi())();
The types of subprograms or subprogram expressions for the statements above are:
     int *fpi();          <SUBP <> false <PTR int>>
     int  *(*pfpi)();     <PTR <SUBP <> false <PTR int>>>
     int  *(*fpfpi())()   <SUBP <> false
                            <PTR <SUBP <> false <PTR int>>>>
     {  return pfpi;      <PTR <SUBP <> false <PTR int>>>
     }
        ...
       pfpi = fpi;        <PTR <SUBP <> false <PTR int>>>
       pi1  = pfpi();     <PTR <SUBP <> false <PTR int>>>
       i5   = (*fpfi())();  <PTR <SUBP <> false
                                 <PTR <SUBP <> false <PTR int>>>>>
]]
3.2.1.6.6. Qualified type
A type t with type qualifier takes the form of <t qualifier>, such as
    <float const>
    <int volatile>
Operations for the qualified type are carried out in the same way as the origin-type t. For expressions that request operands of the same type, if types of two operands are the same except for their qualifiers, no insertions of type conversion operations are needed.

3.2.2. Types of operational expressions

The type of an operational expression is in most cases decided from the combination of the type of its operands and its operator, but there are some operators, like "conv", whose type is to be decided at the time of transformation to HIR-base following the specifications of the source language. Since each operator requests operands of certain types to suit it, explicit conversions of operands should be inserted by the parser of the source language, that is, all operand conversions should be explicitly represented in HIR. In this section, explanation of type is given for each of operational expressions.
The syntax of HIR-base is common to multiple languages and machines, but some instances may have some items whose values are language/machine dependent. That is to say, the interface of HIR is common to multiple languages/machines, but there are some fields in its real implementation (instance) which hold language/machine dependent values. Since a real compiler is for a specific source language and for a specific target machine, the dependencies of this kind do not cause serious problems. Compiler modules processing HIR can be built to be commonly applicable to multiple languages and machines because language/machine dependent values of HIR instances can be get from SourceLanguage class or MachineParam class.
3.2.2.1. Constants
HIR has integer constants and floating point constants. The integer constants are constants of either int, long int, long long int, unsigned int, unsigned long int and unsigned long long int. The floating type constants are constants of either float, double and long double. Which constant in source language corresponds to which one in HIR is decided by the parser of the source language.
[[
Among constants in C language, the types signed integer, unsigned integer and floating point have corresponding constant types in HIR-base. The result of sizeof operator is a constant of offset type (, but its values may differ from one target machine to another, even if the source program is the same). In HIR-base, not like in C, a character constant is of char type rather than int, and an enumerator, which is a named constant defined by enum, is of enum type rather than int. Though there is a notational form for a character string as a constant, in view of types, a character string is handled as an array of characters. Therefore, when decomposing a character string constant to its elemental letters, they are treated as a char array.
]]

A constant of integer, offset or enum type generally has internal value (in binary representation). If the value is beyond the rage of representation defined for integer of Java, isOutOfValueRange() method returns true, and the constant does not have a valid internal value. Transformations (such as constant folding) operated on the intermediate representation are carried out only within the range where an internal value can be represented as an integer of Java. Also, a constant of floating point type generally has internal value. If the value is beyond the rage of representation as a floating point of Java, isOutOfValueRange() method returns true, and the constant does not have a valid internal value. The internal value of a character constant is that of the corresponding int, or that of an int obtained by a type conversion from unsigned char to int.
The internal representation of a string constant takes form of the pure sequence of characters enclosed with quotation marks ("").
[[
In the case of C language, the pure sequence of characters is a string whose quotation marks (") at the both ends are stripped off, escape characters (say, with leading \) in the middle are translated, and 0x00 at the end of the string is eliminated.
]]

An HIR node for a constant whose type is T and value is v takes the form of
     <const T v>
For example, the node for an integer constant 123 is
     <const int 123>
Nodes for bool constants are either one of
     <const bool true>, or <const bool false>
[[
A node for a character string constant of C language, counting 0x00 at the end as an element of the character array, takes the form of
     <const <VECT 4 0 u_char> "abc">
]]

The symbol null shows the absence of an expression. (NullNode in a tree structure shows a leaf which does nothing, and is not null. The type of NullNode is void.)
3.2.2.2. Variables
If a type T in the source language corresponds to a type t in HIR, a variable of T type in the source language becomes a variable of t type in HIR. If t is scalar type, the variable is a scalar variable, and if t is aggregate type, the variable is an aggregate variable.
[[
Variables of short int, int, long int, char, float, double, long double, unsigned short int, unsigned int, unsigned long int, and unsigned char in C language have variables of corresponding types in HIR-base.
In C language, let T be a type specification and D a declarator, then
     T  D[n]
declares that the type of D is an array of n elements of type T,
     <VECT n 0 T>
where n is an unsigned integer constant.
     T D[ ]
declares that the type of D is an arrary of undefined number of elements of type T
     <VECT null 0 T>
The declaration
     T *D
declares that the type of D is a pointer which points to an object of type T, that is
     <PTR T>
For example,
     char *s;
declares that the type of s is a pointer
     <PTR char>
that points to a char.
The types defined by declarations for enum, struct, union, and function are already explained.
]]

A leaf node for a variable v of type T is represented in textual form
     <var T v> 
The i-th element of an array whose element type is T is is represented as
     (subs T aa' i')
where aa' and i' stand for the respective notations in HIR for aa and i. The type of aa' is <VECT n lb T>, where n is the number of elements of the array, and lb is the lower bound of its subscript. If aa and i are leaf nodes for variables, the above notation becomes
     (subs T
       <var <VECT n lb T> aa>
       <var int i> )
To calculate the location of i-th element of the array aa from this notation, information such as the difference of displacement in memory address for each increment of the subscript is required. The size of an element and the word boundary, which are used to obtain the information, are target machine dependent. As elaborated later, aa' can be a pointer expression pointing to the array aa, or to the top of the array aa. A multi-dimensional array is expressed as an array whose elements are arrays.
An element x of a structure s is
     (qual T
       s'
       <elem T x> )
where T is the type of x, and s' stands for the notation in HIR for s.
The example earlier mentioned is thus
     (qual int
       <var <STRUCT <int x> <int y>> s>
       <elem int x> )
For unions, except for replacing STRUCT with UNION, the rest is the same. To obtain the offset of the element x, use evaluateDisp method in Elem interface
     x.evaluateDisp()
Usually this value is a constant, but a constant expression may be the case. Whether it is a constant value or not can be confirmed using isDispEvaluable.
A type T element x of a structure or a union which is pointed by a pointer p is
     (arrow T  p'  x')
where p' and x' stand for the respective notations in HIR for p and x.
When p and x are leaf nodes, the expression becomes
     (arrow T
       <var <PTR T> p>
       <elem T x> )
and the value of the object (of scalar type) to which the pointer p points is
     (contents T
       (p' <PTR T> )
If a pointer p points to the top of an array aa and it is used only to point to an i-th element of the array, as shown below,
     (contents int
      (add      <PTR int>
       <var      <PTR int> p>
       (mult     offset
        <const    offset 4>
        (conv     offset
         <var      int i>))))
then the i-th array element to which the expression including p points can be represented by an array element expression using "subs" after "undecay"ing the pointer to an array in such way as
    (subs T
      (undecay <VECT n lb T>
        <var <PTR T> p>
        <const int n> 
        <const int lb> )
      (i' int) )
where i', n and lb stand for the notation in HIR for i, the number of elements and the lower bound of the subscript of the array to which p points, respectively. If p points to the array itself, then either
     (subs T 
       (contents <VECT n lb T> 
         <var <PTR <VECT n lb T>> p> )
       (i' int) ) 
or
    (subs T
      (undecay <VECT n lb T>
        <var <PTR <VECT n lb T>> p>
        <const int n>
        <const int lb> )
      (i' int) )
can be used instead of above pointer expression. In many cases, it is easier to optimize or to parallelize an array element expression than a pointer expression.
[[
If a formal parameter of a function in C declared to be an array satisfies the conditions above, the elements of the array may be expressed using subs. For example, if a function declaration
   void f(int aa[10], int bb[3][5]) { ... }
is given, aa[i], and bb[j][k] may be expressed as follows:
     (subs int
       (undecay <VECT 10 0 int>
         <var <PTR int> aa> )
       <var i int> )
     (subs int
       (subs <VECT 5 0 int>
         (undecay <VECT 3 0 <VECT 5 0 int>)
           <var <PTR <VECT 5 0 int>> bb>
           <const int 3>
           <const int 0> )
         <var j int> )
       <var k int> )
According to the C language specifications, an array formal parameter in C means a pointer that points to the top element of an array. Therefore, the 10 in "int aa[10]" and the 3 in "int bb[3][5]" do not assure that they are the respective number of elements of aa and bb. Even so, as long as an array formal parameter is used in a straightforward way for an array, and is judged to satisfy the conditions above, it is possible to express a formal parameter using an array rather than a pointer. The former is preferable because array elements are important targets in optimization and parallelization. But whether or not a real argument has the number of elements as declared for the corresponding formal parameter can not usually be tested in C. So, use of this assumption at optimization and parallelization should be restricted by a compilation time option to that effect.
There is a pragma available to declare that array element expression can be safely used instead of pointer expression. For pointer p,
    #pragma optControl safeArray p
declares that the pointer p is used to represent array element. For array aa,
    #pragma optControl safeArray aa
declares that subscript values for the array aa never exceeds the range of safely accessing the storage area allocated to the array aa. Although the above pragma gives information to the compiler, it does not certify to detect violations. Incidentally, in the following example:
     int f(int (*p)[10], (*q)[4][5]) {
     #pragma optControl safeArray p q
       int i, j, k, x;
       ....
       x = (*p)[i] + (*q)[j][k];
       ....
     }
(*p) can be treated to be an array with number of elements 10, and (*q) a 2-dimensional array with [4][5] elements.
]]
3.2.2.3. Arithmetic expressions
Arithmetic operators add, sub, mult, div and mod stand for the four basic rules of arithmetic and modulo arithmetic. They handle all types of integer and floating point, and generate a computed result from their 2 operands of the same type and the type of the computed result is the same to the type of operands.
In general, the two operands of arithmetic operators must be of the same type, unless specifically explained otherwise. The distinction of operations among signed and unsigned, and short, long, double and so fourth are decided by the type of the operands, that is, the type of the computed result is the same to the type of operands, unless explained otherwise.
When applying an arithmetic operator to two operands of different types, type conversion has to be explicitly made before the arithmetic operation. The type conversion should be towards the type with richer power of expression among the two. No type conversion is needed between a qualified type and its origin-type. Such conversion should be inserted at the time of HIR-base generation in compliance with the specifications of the source language.
When unary minus operator "neg" is applied to an operand whose type is either signed integer, offset, or floating point number type, it generates the result of the same type with reversed sign. If "neg" is to be applied to an operand of other type, such as unsigned integer and character, the operand should be explicitly converted to a signed integer at the time of HIR-base generation.
3.2.2.4. Pointer expressions
There are two groups of arithmetic operations on pointers in HIR-base. The first group includes addition and subtraction using a pointer as the first operand and an offset as the second operand. The second group is subtraction between two operands of the same pointer type.
The type of the result of an operation from the first group is the same pointer type of its first operand. The "add" operator adds the value of its second offset operand to the memory address to which its first pointer operand points, the address thus obtained is made the address to which the resulting pointer points. The "sub" operator subtracts the value of its second operand (of offset type) from the address to which its first operand points, the address thus obtained is the address to which the resulting pointer points.
The type of the result of an operation from the second group is offset. Subtraction between pointers here means the address represented by the second pointer operand is subtracted from the address represented by the first pointer operand, generating the result of offset type.
In HIR-base, an addition or a subtraction between two offsets generates the result of offset type. An offset added, subtracted or multiplied by an integer generates the result of offset type. A division between two offsets generates the result of integer type. The division between offsets is used, for example, to obtain the number of elements of an array by utilizing the offset generated by sizeof operator.
[[
In HIR-C, addition and subtraction of a pointer, "add" and "sub", take two operands. One of them is a pointer p which points to an object of type T, and the other is integer i. Addition adds sizeof(T)*i to p, and subtraction subtracts the same from p. In HIR-base, both "add" and "sub" operators also take two operands, but their types are pointer and offset. The value obtained from sizeof(T) multiplied by i must be given as the offset operand. This means that HIR-base requires explicit insertion of such multiplication.
Similarly, in C, a difference of two pointers is expressed in terms of the number of objects which they point. In HIR-base, where a difference in terms of memory address is obtained by the "sub" operator, an operation to divide that difference by sizeof(T) has to be explicitly inserted if desired result is the difference in terms of the number of objects. The operator "offset" is available only in HIR-C, which expresses differences between pointers in terms of number of elements. Since HIR-C is an extension of HIR-base, if differences of pointers are to be expressed by "sub" operator rather than "offset" operator, division by sizeof(T) aught to be explicitly inserted, just like in HIR-base. If a pair of a pointer and an offset type rather than a pair of a pointer and an integer type is used in HIR-C, the meaning of the expression is the same as in HIR-base.
]]

Below is the summary of the pointer related operations in HIR-base.
  Operator  first    second  result
            operand  operand
    add     pointer  offset   pointer
    sub     pointer  offset   pointer
    sub     pointer  pointer  offset   (not a number of elements) 
    add     offset   offset   offset
    add     offset   integer  offset
    sub     offset   offset   offset
    sub     offset   integer  offset
    mult    offset   integer  offset
    div     offset   offset   long
    neg     offset            offset
    offset  pointer  pointer  long   (number of elements) available 
                                            in HIR-C only
[[
In HIR-C, in addition to listed above, those pairs which are accepted in C, such as a pointer and an integer, an integer and a pointer, and so forth, are acceptable, as a pair of operands of pointer related arithmetic. For operators "add", "sub" that have different meaning between HIR-base and HIR-C, a flag HIR.FLAG_C_PTR should be attached to the pointer expression if it is used in the meaning of C language, otherwise operation of HIR-base is assumed.
]]

Address operator "addr" takes an operand of an l-value or a subprogram type. Assuming that the type of operand is T, a pointer to T, <PTR T>, is the type of the result.
The indirect operator "contents" takes a pointer operand p, and its result is the value of the object to which p points. Assuming that p points to an object of type T, the resultant type of "contents" is T.
The operator "decay" takes as its operand an array, its result is the pointer which points to the top element of the array. The type of the result is, assuming that the type of the element of the array is T, a pointer to T, <PTR T>. If "decay" is applied on a character string constant, a pointer of type <PTR char> pointing to the top of the character string is obtained.
The operator "undecay" takes three operands. The first operand is a pointer, the second is an integer expression showing a number of elements of an array, and the third is an integer expression showing the lower bound value of the subscripts of the array. Let <PTR T>, n and lb be the type of the first, the second and the third operands, respectively, the result of undecay is an array with n elements of type T, and the type of the result is
    <VECT n lb T>
The second and third operands of undecay may be null, meaning the corresponding information about the array are indefinite, and the type of the array is expressed putting a null in respective places in such way as
    <VECT null null T>
The differences between "addr", "decay", "contents" and "undecay" focused on types are as follows:
    addr:     T --> <PTR T>
    decay:    <VECT n lb T> --> <PTR T>
              string --> <PTR char>  or  <PTR u_char>
              i.e. applicable only to arrays and character strings.
              Never apply decay to a subprogram.
    contents: <PTR T> --> T
    undecay:  <PTR T> --> <VECT n lb T>
              i.e. "undecay" converts a pointer expression 
              to an array expression.
here, the left hand side of "-->" is operand type and the right hand side is the resultant type.
3.2.2.5. Comparison expressions
Comparison operators, "cmpGt", "cmpGe", "cmpLt" and "cmpLe", take two operands of the same type which is either integer, floating point, enumeration, bool, offset or pointer type. The operators compare if the first operand is greater than, greater than or equal to, less than, and less than or equal to the second operand, respectively, and generate a result of bool type. When the condition is satisfied, the result is true, if not satisfied, the result is false. If operand type of a comparison operator is enumeration type, the result is the same to the comparison of ordinal numbers of the operands. Similarly, if operand type is bool, the comparison is the same to the comparison of ordinal numbers (1 for true, 0 for false) of operands. Comparison for offset type is the same as that for its origin-type.
Equality comparison operators, "cmpEq" and "cmpNe", take two operands of the same arbitrary type of scalar types. Whether or not the values of the two operands are equal is tested, and true or false in bool is generated, accordingly.
When applying a comparison operator to two operands of different types, type conversion, in compliance with the specifications of the source language, has to be explicitly carried out before the comparison operation, in the same way as arithmetic expressions. The type conversion should be towards the type with richer power of expression among the two. In spite of this, in equal comparisons, void or null may be compared with any pointer regardless to which type it points. No type conversion is needed between a qualified type and its origin-type (the type without a type qualifier).
3.2.2.6. Logical expressions
Logical operators, "and", "or", and "xor", operate on two operands of bool type, and perform logical AND, logical OR, and exclusive OR, respectively. They generate a result of bool type. When operands are integer or offset types, they perform the operation on the internal representations of the operands, and generate the results of the same types as the operands'. If the compliment operator "not" is applied to an operand of bool type, it generates the reversal of the operand, that is false for true, and true for false.
3.2.2.7. Bit operation expressions
If the complement operator "not" is applied to an operand of a type of scalar types other than floating point, which means integer, character, enumeration, bool or offset type, it generates a result which holds the one's complement of the internal expression of the operand (bitwise reversals), and is of the same type as the operand. If one of operators "and", "or" and "xor" are applied to operands of a type of scalar types other than bool, it generates a result whose internal representation is the bitwise logical AND, logical OR and exclusive OR, respectively, of the internal representations of the operand. The result of the operation is of the same type of the operand. The two operands of "and", "or" and "xor" must be the same type.
3.2.2.8. Shift expressions
Shift operators, "shiftR", "shiftLL" and "shiftRL", take two operands. The type of their first operand is one of scalar types other than bool, floating point and pointer, and they operate on the internal representation of the first operand. Their second operand should be of int type and the type of result is the same as the type of the first operand. Arithmetic right shift "shftR" shifts bits to the right extending the sign bit. Logical left shift "shiftLL" shifts bits to the left together with the sign bit, and the emptied bits at the right side are filled with zeros (0). Logical right shift "shiftRL" shifts bits to the right together with the sign bit, and the emptied bits at the left side are filled with zeros.
[[
In transforming from C, if the type of operand is a signed integer or a signed character, use arithmetic shift. Otherwise use logical shift.
]]
3.2.2.9. Other operators
Type conversion operator "conv" converts the type of its operand into the type specified by the node type of "conv" node, that is the "conv" node type shows the type of converted result. The type of the result is decided at HIR generation, and it is the type requested in the context by the operator which uses the conv result as its operand. A pointer to "void" type can be converted to and from a pointer to any type.
The operator "enclose" has an operand representing an expression to be evaluates as one value, that is, to be protected from partially combined with other operand that is not contained in the operand of "enclose". Some languages may represent such restriction by enclosing an expression by parenthesis. If it is not necessary to consider such restriction, then it is not necessary to apply "enclose" to expressions enclosed by parenthesis in the source program.
The operator "subs" construct a subscripted variable expression taking an expression representing an array as the first operand, and a subscript expression as the second operand. The type of the result is the type of the array element. If the type of the first operand is <VECT n lb T>, the type of the result is T. The type of the subscript expression should be integer.
"subs" can be used to readily obtain the location of an element of a rectangular array from the value of the subscript, since how far the element is placed from the top position of the rectangular array can be calculated from the size of the array element and the subscript value. Arrays with sub-arrays of different sizes are expressed using a pointer or as arrays whose elements are pointers.
For SSA function phi (ƒÓ) , "phi", a variable or register (v) to store the result is specified as the first operand, and a list of pairs of a variable or a register and a label (v_i L_i) is specified as the second operand. The labels are of the preceding basic blocks, and show the paths the control flow took.
     (phi  T
       (v T)
       (list (v_1 L_1 T) (v_2 L_2 T) ... ) )
This means that if the control is passed from the basic block labeled with L_i, the corresponding v_i is selected as the result v (i = 1,2, ...). v_1, v_2, ... and v are of the same type T, and that type T is the type of the result of "phi".

We sum up here the operators and the operands, and the type of results, where O denotes available and X denotes undefined.
  Type           Comparison of    sign      conversion to
                  size  equality  reversal  integer type
  signed integer    O      O       O         not needed
  unsigned integer  O      O       X         not needed
  floating point    O      O       O         convet
  char              O      O       X         character code or origin type 
  u_char            O      O       X         character code or origin type
  bool              O      O       X         false 0, true 1
  offset            O      O       O         origin-type (internalrepresentation) 
  PTR               O      O       X         memory address
  ENUM              O      O       X         origin-type (ordinal number)
3.2.2.10. Subprograms
As mentioned earlier, the type of a subprogram is expressed as below:
     <SUBP <argument-1-type argument-2-type ... > bool return-value-type>  
A pointer to a subprogram of type T is, like ordinary pointers, expressed as:
     <PTR T>
If a subprogram is a function returning a value, then the type of the expression calling the function is the type of the return value. The type of expression calling a subprogram that returns no value is void, that is, a subprogram that returns no value may be treated as a function returning void value.
Arguments are passed by value at a subprogram call. For an argument to be passes as reference, the memory address of the argument obtained by "addr" operator is passed to the subprogram. For more detail, refer to the section for subprogram references.
3.2.2.11. Assign statements
An assign statement assigns the value of the second operand to the first operand which is an l-value object so that the value of the first operand becomes the value assigned. The types of the two operands must be the same, and the node for assign has the same type (except for a type qualifier). If the two operands are a pair of structures or unions of the same type, the whole of the contents of the second operand is set to the first operand.
A type with const type qualifier can not be placed at the left hand side of an assignment statement. If a value of a type other than the type of the first operand is to be assigned, an explicit type conversion on the second operand to the type of the first operand is required at HIR-base generation. A null (NullNode) is the exception to this, and may be assigned to pointers of any type.
All types, scalars and aggregates, except for void are valid as the type of operands of an assign statement. This means, the value of an expression of any one of the types of integer, character, floating point, bool, offset, pointer, array, enumeration, structure and union may be assigned to an l-value of the same type.

Note: Ways to handle assignments between arrays, and returning an array as the return value of a function are yet to be completed.
3.2.2.12. Control statements
The type of the result of an if-statement, a jump-statement, a loop-statement, a return-statement without return value or a switch-statement is void. The type of a return-statement returning a value is the same as that of the value. The type of a subprogram call-statement is the same as that of the return value of the subprogram.
3.2.2.13. Other statements
The type of an expression-statement (ExpStmt) is the same as that of its operand which is an expression. The type of a block-statement is void.
3.2.2.14. Labels
The symbol for a label is used for label definition and label reference only, and never is the subject of other operation. The types of label definition and label reference are void.
3.2.2.15. Other notations
The operator "null" represents NullNode which stands for nothing is there.
     <null void>
Its type is void. The NullNode is an instance of HIR and is not null.
The notation null represents the absence of operand and it is not an instance of HIR. The notation null can be placed where an operand or a symbol is to be placed representing the absence of them, if absence is permitted at the position.

3.2.3. Type conversions

The "conv" operator is used to convert the type of an operand in an expression, such as an arithmetic expression, to the required type. In what situation and how a conversion has to be done is decided by the source language specifications. Required conversion operations have to be explicitly inserted at the time of HIR-base generation. HIR-base does not perform any implicit type conversions.
Among the types of integer type, there is an ordered rank of conversion:
      short int < int < long int < long long int
A type conversion to a type higher in the rank (integer promotion) does not cause any information loss. The rank order among the types of floating point type is:
      float < double < long double 
A type conversion to a type higher in this rank also does not cause any information loss.
If converted to integer, false and true of type bool are 0 and 1, respectively. A character can be converted to an integer whose value is the character code of the character. An enumerator can be converted to an integer whose value is the ordinal number of the enumerator. An enumerator can be converted to a character whose character code is the ordinal number of the enumerator. A type conversion can be explicitly done to or from integer and floating point types. The value of an offset can be assumed to be a value of signed int or signed long int type, in the sense the internal (binary) representations before and after the conversion by "conv" are the same.
The value of a pointer represents an address, and can be assumed to be a value of either signed int, signed long int, unsigned int or unsigned long int type. Signed or unsigned is dependent on the target machine. Unless otherwise specified, signed is assumed.
In general, when a type conversion of an operand is necessary for arithmetic or other operations, the operand whose type is lower in the rank order is converted to the type of the other which is higher in the rank order. But the conversion rules in the syntax of the source language have the priority.
For example, when transforming from C to HIR-base, type conversions to make operands compatible must be explicitly done in compliance with the language specifications of C.
The operator "conv" is used not only for substantive basic type conversions, but also for just a change in how a pointer p is seen, from one that points to an object of type T1, to another that points to an object of type T2:
     (conv  (PTR T2)
       (p (PTR T1)) )
The following is a summary of the relation between operand (x), result (r) of "conv" and the effect of the conversion operation. Here, Ts stands for the type of the operand, and Tr for the result. The same rules for "enum" apply to "bool". The details of the conversion are implementation defined. In this summary "int" is used as a representative of all integer types. The conversions among different integer types are omitted here.
  Ts of      Tr of       Effect of
  Operand x  Result r    Conversion operation
  -------    --------    --------------------
   int       char     The character code of r is the value of x.
             enum     The ordinal number of r is the value of x.
             offset   The value of r is the value of x.
             pointer  The value of r is the value of x.
   char      int      The value of r is the character code of x.
             enum     The ordinal number of r is the character code of x.
             offset   Same as (conv offset (conv int (x char))).
   enum      int      The value of r is the ordinal number of x.
             enum     The ordinal number of r is the ordinal number of x.
             offset   Same as (conv offset (conv int (x Ts))).
   offset    int      The value of r is the value of x..
             enum     The ordinal number of r is the value of x.
             pointer  If the type of pointer is signed,  
                      then the value of r is the value of x.
                      If the type of pointer is unsigned, 
                      then the value of r is signed-to-unsigned
                      converted value of x.
   pointer   int      The value of r is the address represented
                      by x.
             enum     The ordinal number of r is the value of x.
             offset   If address is signed,
                      then the value of r is the value of x.
                      If address is unsigned,
                      then the value of r is unsigned-to-signed
                      converted value of x.
             pointer  The address represented by r and the address 
                      represented by x are the same.
                      Whether or not the internal representation of the
                      object pointed by x may also be treated as the
                      object pointed by r is decided by programmer's
                      responsibility.
[[
Implicit type conversions
As mentioned earlier, all implicit type conversions in the source program are to be expressed as explicit type conversions in HIR-base. Some HIR-C expressions still have hidden implicit type conversions. All of these are to be made explicit in HIR-base, following the language specifications of C and the rules of HIR-base. For example, in C, char and bool are treated to be the same as int, but in HIR-base, char and bool and int are treated as different types, so explicit type conversions must be inserted to keep the compatibility between operands where necessary.

C
     int a, plus;
     plus = a > 0;
HIR-C
     (assign int
       <var int plus>
       (compGt int
         <var int a>
         <const int 0> ) )
HIR-base
     (assign int
       <var int plus>
       (conv int
         (compGt bool
           <var int a>
           <const int 0> ) ) )
Different handlings of integer types in HIR-base and HIR-C
As mentioned above, HIR-base treats char, bool and enum as different types from int type. But HIR-C treats them, like C, as a kind of int types. HIR-C permits, like C, an integer addition to and subtraction from a pointer, and subtraction between pointers. Whereas in HIR-base, these addition and subtraction with an integer should be done only after the pointer is converted to integer type, and subtraction between pointers should be done after type conversions to offset type have been done.
Arithmetic expressions
In C, variables and function values which are declared either as char, unsigned char, bool, and enum can be used as a kind of integer types and can be made operands of arithmetic operations. Whereas in HIR-base, these types have to be explicitly converted to int or unsigned int before they are used at a place where an integer type is requested by context, such as arithmetic operations and assignment to integer. Where a char, unsigned char, bool or enum type is requested by context, such as assignment to a variable of type char, bool or enum, int and unsigned int have to be explicitly and appropriately converted to char, bool or enum, respectively, before they are used.
Example
C
     char a, b;
     a = 'a';
     b = a + 1;
HIR-C
     (assign char
       <var char a>
       (conv char
         <const int 97> ) )
     (assign char
       <var char b>
       (conv char
         (add int
           (conv int
             <var char a> )
         <const int 1> ) )
HIR-base
     (assign char
       <var char a>
       <const char 'a'> )
     (assign char
       <var char b>
       (conv char
         (add int
           (conv int
             <var char a> )
           <const int 1> ) ) )
Pointer expressions
In HIR-C, like in C, operations on pointers are permitted. The representations in HIR-C are transformed to HIR-base in the manner shown below. Let p and q be pointers of type T, and i, j and k be of integer type. (If p or q, in the expressions below, is not a simple variable but is a general pointer expression, p and q in the notations
   <var <PTR T> p> and <var <PTR T> q> 
are to be replaced by the corresponding expression.)

C
     p + i
HIR-C
     (add <PTR T>
       <var <PTR T> p>
       <var int i> )
HIR-base
     (add  <PTR T>
       <var  <PTR T> p>
       (mult  offset
         <const  offset s>      where, s is sizeof T
         <var  int i>))
For the expression
      p - i
replace all add operators to sub operator. Here, sizeof T is, in general, a constant of offset type.

C
     p - q
HIR-C
     (offset  offset
      <var  <PTR T> p2>
      <var  <PTR T> p1>)
HIR-base
     (div   int
      (sub   offset
       <var   <PTR T> p2>
       <var   <PTR T> p1>)
      <const   offset s>))      where, s is sizeof T
]]

3.2.4. Specifying initial values

A statement below is used to specify the initial values of variables.
    (setData l-value-expression value-specification)
This statement means that the value specified by value-specification is set to the address expressed by l-value-expression. Where l-value-expression is a constant expression of an l-value, and value-specification is either a constant, a list of value specifications, or a repetition of a value specification.
   value-specification -> constant
         | (expList value-specification value-specification ... )
         | (expRepeat value-specification number-of-repetitions)
If the l-value is a scalar variable and the value specified is a constant of a compatible type to the type of l-value, the constant is set to the scalar variable. Here,
    (expList value-specification value-specification ... )
is a list of an arbitrary number of value-specifications, and
    (expRepeat value-specification number-of-repetitions)
is a list of the value-specification repeated number-of-repetitions (an integer constant) times.
If the l-value represents a vector aa with n elements, and the value-specification represents a list of n elements, then the above statement means that for all elements of the vector aa, following assignment
    ( setData aa[i] v_i )
is repeated, where
    aa[i] is the i-th element of the vector aa, and 
    v_i is the i-th element of the value specification list.
If the l-value represents a structure s with n elements, and the value-specification represents a list of n elements, the statement above means that for all elements of the structure, following assignment
    ( setData s_i v_i )
is repeated, where
    s_i is the i-th element of the structure s, and 
    v_i is the i-th element of the value specification list.
Initial value specification statements (setData ... ) are used to specify values to be set prior to the start of the execution of a program. To specify values to be set after starting execution, such as at the entries to subprograms, assignment statements or loop statements should be used in HIR-base.

3.2.5. Subprogram references

References to subprograms are made by
    (call subprogram-expression argument-list)
where subprogram-expression is a subprogram or a pointer to a subprogram, and argument-list is a list of arguments or an empty list. For example, a subprogram call to a subprogram (function) sqrt to obtain a square root of an argument x of double type is as follows.
     (call  double
      (addr <PTR <SUBP <double> false double>>
       <subp <SUBP <double> false double> sqrt> )
      (list
       <var double x>) )
If the subprogram does not have any argument, the argument list is an empty list. If a scalar expression with an l-value is specified as an argument, its value is passed to the subprogram. If a pointer is specified, the value of the pointer is passed.
There are three ways to pass an array: Which one of these three ways is to be used is decided at HIR generation in compliance with the specifications of the source language. There are 2 cases in the treatment of pointer: Interpretations to these cases are implementation defined. It is desirable to decide on one uniform approach to deal with such cases for each source language.
There are two ways to pass a structure or a union: Which one of these two ways is to be used is decided at HIR generation in compliance with the specifications of the source language. If a pointer is passed, then the call is call-by-reference, and if a structure or a union itself is passed, then the call is call-by-value.
If the type of the return value of a subprogram is a structure or a union, an expression representing a structure or a union is to be specified as the operand of "return". In this case, the value of the specified structure or union is to be made the return value. When returning an aggregate is intended, a pointer to an address where the aggregate is to be stored may be passed and let the subprogram write values to the address, but all processes involved have to be explicitly stated in HIR.
Note: In stead of using a pointer to an array, passing the array itself as the argument, and receiving the array itself as the return value are not yet completely implemented.
[[
The HIR representation for the following function call in C
   int func( int a[3], int i ){
     return a[i];
   }
    ....
   r = func(aa, 2);
    ....
may be, with considerations to the conditions stated in section 2.2, as follows:
   (subpDef  void
    <subp  <SUBP <<PTR int> int > false int> func>
   <null  void>            // Initialization part.
    (labeldSt  void
     (list
      <labelDef _lab1>)    // Label generated by subpDefinition of HIR.
     (block  void
      (return  int
       (subs  int
        (undecay  <VECT 3 0 int>
         <var  <PTR int> a>
         <const  int 3>
         <const  int 0>)
        <var  int i> ))))    
     ... )
    ....
   (assign  int
    <var  int r>
    (call  int
     (addr  <PTR <SUBP <<PTR int> int > false int>>
      <subp  <SUBP <<PTR int> int > false int> func>)
     (list
      (decay  <PTR int>
       <var  <VECT 3 0 int> aa>)
      <const  int 2>)))
Note that, as mentioned earlier, from the point of language specifications of C, "a", in this example, is a pointer to int, and does not secure that the number of elements of the array to be 3.
]]

[[
In Fortran, a real argument address is passed to the called side. The address thus passed may be used as call-by-reference. As alternative implementation, a copy of the formal parameter (dummy argument) may be created as a local variable and write the update of the copy back to the address before "return" when the value of the dummy argument is changed.
For a scalar argument, its address is passed. For an array, a pointer pointing to the array is passed.
Since the (Fortran) language specifications prohibits the called procedure to write to a storage location which is an overlap of two real arguments passed from the calling procedure, the two methods of call-by-reference and the use of copied local variables should give the same result. If the overlapping is to be checked out, it has to be done in the front part by embedding checking operations, etc.
A formal parameter which is the pointer to an array is to be undecayed to an array, and "subs" is to be used for accesses to elements of the array.
For example:
   PROGRAM SIMPLE
   INTEGER AA(10), BB(3, 5), CC, DD
     ....
   DD = F(AA, BB, CC)
     ....
   END
   FUNCTION F( A, B, C )
   INTEGER A(10), B(3, 5)
   INTEGER I, X
     ....
   X = C + A(I)
   B(2, 3) = X
   C = C + 1
   RETURN X
   END
The codes above in Fortran may be transformed to codes in HIR similar to the one below;
  (prog             // PROGRAM SIMPLE
   <sym SIMPLE>     // program name
   <nullNode >      // initialization part
   (subpDef void
    <subp <SUBP <> false int> SIMPLE_MAIN>
            ;       // generated subprogram name
    <null void>     // initialization part
    (labeldSt  void
     (list
      <labelDef _lab1>)    // Label generated by subpDefinition of HIR.
     (block          // subprogram body
     ....
      (assign int    // DD = F(AA, BB, CC)
       <var int DD>
       (call int
        (addr <SUBP <<PTR <VECT 10 1 int>>
                     <PTR <VECT 5 1 <VECT 3 1 int>>>
                     <PTR int>> false int>
         <subp <SUBP <<PTR <VECT 10 1 int>>
                      <PTR <VECT 5 1 <VECT 3 1 int>>>
                      <PTR int>> false int> F>)
        (list     // argument list
         (addr <PTR <VECT 10 1 int>>  // Change to array pointer.
          <var <VECT 10 1 int> AA>)
         (addr <PTR <VECT 5 1 <VECT 3 1 int>>>
          <var <VECT 5 1 <VECT 3 1 int>> BB>)
         (addr <PTR int>  // Change to scalar pointer.
          <var int CC>))))
     ....
     (return void )
    )))
   (subpDef void
    <subp <SUBP <<PTR <VECT 10 1 int>>
                 <PTR <VECT 5 1 <VECT 3 1 int>>>
                 <PTR int>> false int> F>
     <null void>  // Initialization part
     (labeldSt  void
      (list
       <labelDef _lab1>)    // Label generated by subpDefinition of HIR.
      (block void
       (assign int   // Copy the contents of the scalar pointer argument C
            ;       
        <var int _var1>    //  to _var1.
        (contents int
         <var <PTR int> C>))
       (assign int
        <var int X>        
        (add int
         <var int _var1>
         (subs int
          (contents <VECT 10 1 int>      
           <var <PTR <VECT 10 1 int>> A>)
          <var int I>)))
      (assign int
       (subs int
        (subs <VECT 3 1 int>
         (contents <VECT 5 1 <VECT 3 1 int>>       
          <var <PTR <VECT 5 1 <VECT 3 1 int>>> B>)
         <const int 3>)
        <const int 2>)
       <var int X>)
      (assign int
       <var int _var1>
       (add int
        <var int _var1>
        <const int 1>))
      (assign int    // Copy back _var1 
       (contents int          // as the contents of the scalar pointer argument C.
        <var <PTR int> C>)
       <var int _var1>)
      (return int
       <var int X>)
    )))
  )
The first argument of subprogram call in C is always the pointer to a subprogram. In Fortran, whether it is a subprogram symbol or the pointer to a subprogram is implementation defined.
When a constant is to be made an argument, it is better to create a temporary variable whose value is the constant of concern, and pass the address of the temporary variable, than to pass the address of the constant. This prevents occurrence of bad effects from a program error which overwrites a constant argument.
The above mentioned is just one of many implementation choices, others may as well be chosen, as long as they comply with the language specifications in a coherent manner.
]]