
21st Telecommunications forum TELFOR 2013

Serbia, Belgrade, November 26-28, 2013.

Adding microMIPS Backend to the LLVM Compiler Infrastructure

Jozef Kolek, Zoran Jovanović, Nenad Šljivić, Dragan Narančić

Jozef Kolek, RT-RK Computer Based Systems, Narodnog Fronta 23a, 21000 Novi Sad, Serbia (e-mail: jozef.kolek@rt-rk.com)
Zoran Jovanović, RT-RK Computer Based Systems, Narodnog Fronta 23a, 21000 Novi Sad, Serbia (e-mail: zoran.jovanovic@rt-rk.com)
Nenad Šljivić, RT-RK Computer Based Systems, Narodnog Fronta 23a, 21000 Novi Sad, Serbia (e-mail: nenad.sljivic@rt-rk.com)
Dragan Narančić, RT-RK Computer Based Systems, Narodnog Fronta 23a, 21000 Novi Sad, Serbia (e-mail: dragan.narancic@rt-rk.com)

Abstract - This work describes the extension of the LLVM Compiler Infrastructure with a new backend for microMIPS, an architecture from the MIPS family of architectures. The new backend supports 16- and 32-bit instructions: 180 of the 32-bit instructions are recoded MIPS32 instructions, 14 of the 32-bit instructions are new microMIPS instructions, and there are 39 highly optimized 16-bit instructions.
Keywords - Compilers, LLVM, microMIPS

I. INTRODUCTION
As the LLVM Compiler Infrastructure [1] has become popular due to its advanced design and handy set of tools, there is a need to support more architectures. MicroMIPS is one of the latest architectures from the MIPS family of architectures. MicroMIPS is designed to reduce code size without losing performance. Tests have shown that it is possible to reduce the code size by up to 35% with only a 2% performance loss [2].
This work describes how to add a microMIPS backend to the LLVM Compiler Infrastructure. Part II describes LLVM in general, as well as its domain specific language called TableGen, which is used to add new backends. Part III describes the microMIPS architecture, how it is designed and how it differs from the MIPS32 architecture. Part IV shows how TableGen is used to add the new microMIPS backend, and mentions other steps that must be performed in order to get the new backend working.
II. THE LLVM COMPILER INFRASTRUCTURE
The LLVM Compiler Infrastructure is a set of modular and extensible compiler tools and libraries. It consists of many subprojects, the most important of which are [1]:
LLVM Core libraries - represent the optimizer, which operates on a source-language and machine-target independent intermediate representation known as LLVM IR. Along with the optimizer comes code generation support for many CPUs.
Clang - a C/C++/Objective-C front-end for LLVM. Clang parses the source code and generates the LLVM IR.
One interesting tool that is built as a library on top of this front-end is the Clang Static Analyzer, which automatically finds bugs in source code.
Dragonegg - integrates GCC front-ends with the LLVM optimizers and code generators.
LLDB - LLVM's native debugger.
libc++ and libc++ ABI - an implementation of the C++ Standard Library.
VMKit - LLVM based implementations of Java and .NET virtual machines.
In addition to these projects there are many other official and unofficial LLVM subprojects that will not be mentioned here.

Fig. 1. LLVM compiler phases.


A very important part of the LLVM project is its Intermediate Representation (IR), which is a programming-language and target independent program representation in Static Single Assignment (SSA) form. The LLVM IR can be represented in three different forms: 1) a textual form, which is a human-readable assembly-like form, 2) a bitcode form, which is a sequence of bits, and 3) in-memory data structures [3].
The next example shows a C function that returns the sum of two numbers:
int sum(int a, int b) {
  return a + b;
}

And this is its corresponding LLVM IR function after translation:

define i32 @sum(i32 %a, i32 %b) {
entry:
  %tmp1 = add i32 %a, %b
  ret i32 %tmp1
}

After Clang generates the LLVM IR from the source files, this LLVM IR serves as an input to the optimizer and later to the Code Generator. Code generation is a very complex process that consists of many phases. The main data structure that the Code Generator operates on is a Directed Acyclic Graph (DAG), which is formed to contain information about the dependencies between the functions and instructions of the IR, as well as other information about the functions and instructions themselves. The list of LLVM IR instructions is transformed into the corresponding DAG [4].


The phases of the code generation are [5]:


DAG lowering
DAG legalization
Instruction selection
Instruction scheduling
SSA based optimization
Register allocation
Post allocation passes
Prologue/epilogue insertion
Late machine code optimizations
Assembly printing
An important tool in backend definition is TableGen [6]. LLVM TableGen is a domain specific programming language for backend definition that comes with LLVM Core, and it is the primary tool used to define a backend. The TableGen tool parses a target description file (.td), instantiates the declarations and sends the results to a TableGen backend for processing. The final result is C++ code ready for compiling. Figure 2 shows the relation between .td files, C++ files and LLVM libraries. Two key parts of the TableGen language are classes and definitions. Both of them are considered 'records'. Each TableGen record has a unique name, a list of values and a list of super-classes. TableGen classes are abstractions that are used to describe other records. TableGen definitions are concrete instances of TableGen classes. There are also so called TableGen multi-classes that are used to instantiate multiple TableGen definitions at once (a minimal sketch is shown after the list of types below). TableGen takes care of the syntax and supports a very simple type system [6]:
bit - Boolean value that holds 0 or 1
int - Simple 32-bit integer type
string - Textual string
bits<n> - Fixed size array of n bits
list<tp> - List of elements of another type
class type - Type defined as a class
dag - Elements of a directed acyclic graph
code - Program code
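As a minimal illustration of how multi-classes expand (the class and record names here are invented for the example and are not taken from the MIPS backend), the following sketch instantiates two records with a single defm statement:

class MnemonicOnly<string asm> {
  string AsmString = asm;
}
multiclass ALUOps<string opstr> {
  // Register-register and register-immediate variants of the same operation.
  def _RR : MnemonicOnly<!strconcat(opstr, "\t$rd, $rs, $rt")>;
  def _RI : MnemonicOnly<!strconcat(opstr, "\t$rt, $rs, $imm")>;
}
// Expands into the records ADD_RR and ADD_RI.
defm ADD : ALUOps<"add">;

Instantiating the multiclass once per mnemonic avoids writing two nearly identical definitions by hand.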

Fig. 2. Target definition.

As an example we will show the definition of the ADDIU instruction together with its superclasses, taken from the MIPS32 backend definition [6]:
// Format for arithmetic instructions with 2
// register operands.
class ADDI_FM<bits<6> op> {
  bits<5> rs;
  bits<5> rt;
  bits<16> imm16;
  bits<32> Inst;

  let Inst{31-26} = op;
  let Inst{25-21} = rs;
  let Inst{20-16} = rt;
  let Inst{15-0}  = imm16;
}
// Arithmetic and logical instructions with 2
// register operands.
class ArithLogicI<string opstr,
                  Operand Od, RegisterOperand RO,
                  InstrItinClass Itin = NoItinerary,
                  SDPatternOperator imm_type = null_frag,
                  SDPatternOperator OpNode = null_frag> :
  InstSE<(outs RO:$rt),
         (ins RO:$rs, Od:$imm16),
         !strconcat(opstr, "\t$rt, $rs, $imm16"),
         [(set RO:$rt,
               (OpNode RO:$rs, imm_type:$imm16))],
         Itin, FrmI, opstr> {
  let isReMaterializable = 1;
  let TwoOperandAliasConstraint = "$rs = $rt";
}

def ADDIU : ArithLogicI<"addiu",
                        simm16, GPR32Opnd, IIArith,
                        immSExt16, add>,
            ADDI_FM<0x9>;

As we can see, the definition of the ADDIU instruction has two super-classes: ArithLogicI and ADDI_FM. The first superclass collects information about instructions with two register operands and one immediate operand. The first parameter of this class, named opstr, represents the name of the instruction in textual form, the second parameter Od represents the type of the immediate input operand, and the third parameter RO represents another record which is a predefined set of registers from which the operands of this instruction can be chosen. The parameters imm_type and OpNode are used in the instruction selection phase to select this instruction when the corresponding pattern is matched; LLVM implements an instruction selector based on pattern matching algorithms. These parameters of the ArithLogicI class are then combined and passed to its superclass InstSE, which is the common superclass of all MIPS32 instructions. The class InstSE receives the output operands, the input operands, the assembly string that tells how the instruction should be emitted into the assembly file, the pattern for the instruction selector, etc. The second superclass of the ADDIU definition is used to define the instruction's binary format. As we can see, the instruction opcode is passed as a parameter to this class and is bound to bits 31 to 26 of the instruction.
The record derived from these definitions and classes would be a single definition with more than 40 members (an abridged illustration is shown below). To avoid manual data specification for every single definition, TableGen supports the previously explained mechanisms based on object oriented concepts, in this particular case inheritance, which leads to maintainable and less error prone backend construction.
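For illustration, the record that TableGen derives for ADDIU is conceptually equivalent to a single definition along the following lines. This is an abridged, hand-written sketch rather than the literal TableGen output, and only a handful of the members are shown, with values taken from the classes above:

def ADDIU {
  // Inherited through InstSE from the common Instruction class:
  string AsmString = "addiu\t$rt, $rs, $imm16";
  dag OutOperandList = (outs GPR32Opnd:$rt);
  dag InOperandList = (ins GPR32Opnd:$rs, simm16:$imm16);
  list<dag> Pattern =
      [(set GPR32Opnd:$rt, (add GPR32Opnd:$rs, immSExt16:$imm16))];
  // Set by ArithLogicI:
  bit isReMaterializable = 1;
  string TwoOperandAliasConstraint = "$rs = $rt";
  // Filled in by ADDI_FM<0x9>: opcode bound to Inst{31-26}, operands to the rest.
  bits<32> Inst;
}

In practice none of these members are written by hand; they are all filled in through the inheritance chain shown above.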
III. MICROMIPS ARCHITECTURE
MicroMIPS is one of the latest architectures from the MIPS family of architectures. It comes with a new set of 16- and 32-bit instructions and is aimed at using less program memory than the standard 32-bit MIPS architectures. There are two microMIPS compatible processor cores: MIPS32 M14K and M14Kc [7]. MicroMIPS is, like all architectures from the MIPS family, a load/store RISC architecture.
In contrast to the earlier MIPS16e architecture, which is an extension to the existing MIPS32 architecture that introduces a 16-bit instruction set, microMIPS is a standalone architecture that is backward compatible with the MIPS32 architecture and has full support for the recoded MIPS32 instructions. Therefore microMIPS does not need to change mode to use 16-bit instructions, and this is one of the main reasons for its great speed advantage over MIPS16e. For example, in the case of the MIPS16e architecture, if the processor is in 16-bit mode and privileged instructions are needed for exception handling, the processor must switch to 32-bit mode to perform this operation. Usage of floating point instructions also requires 32-bit mode, so this is another case where a switch from 16- to 32-bit mode must be performed. This is not the case with microMIPS [2].
Another reason for better performance is that microMIPS's new 16- and 32-bit instructions are highly optimized. For example, there are 16- and 32-bit instructions that can perform multiple load and store operations within a single instruction. MicroMIPS also supports specialized instructions; an example is the 16-bit instruction JRADDIUSP [8], which jumps to the address specified in the return address register (RA) and adds a left-shifted 7-bit offset to the stack pointer, so this single instruction delivers fewer instructions and therefore smaller code size. When exiting a function, where normally two separate instructions are used to adjust the stack pointer and to jump to the desired address, the single JRADDIUSP instruction can be used instead [9]. CsiBE code-size and Dhrystone performance benchmarks have shown that the microMIPS architecture delivers similar code size reductions (35% relative to MIPS32) as the MIPS16e ASE, but with much better performance (98% of the performance of MIPS32) [2].
Because microMIPS supports the old MIPS32 ISA, as with MIPS16e the mode can be switched between microMIPS and MIPS32. When the mode is switched from microMIPS to MIPS32, legacy MIPS32 instructions are processed [7]. MicroMIPS processors contain two decoders, one for microMIPS instructions and a second one for legacy MIPS32 instructions [6]. Since microMIPS is a complete and self-contained architecture, the only need for a mode switch is when legacy MIPS32 code must be executed.
There are a few recoded MIPS32 instructions whose format differs from the legacy MIPS32 instructions, for example the unaligned load/store instructions, which in microMIPS have a 12-bit offset while in the legacy MIPS32 format they have a 16-bit offset.
To summarize, there are four key characteristics of the microMIPS architecture [2]:
No performance loss when compared to older MIPS32 architectures.
A code size reduction of up to 35%, with an equivalent memory reduction.
Support for the existing MIPS32 and MIPS64 ISAs with remapped opcodes.
Maintenance of both the high-performance micro-architecture and legacy MIPS32 support.


IV. ADDING MICROMIPS BACKEND TO THE LLVM
COMPILER INFRASTRUCTURE
Since microMIPS supports the recoded MIPS32 instructions, the first step was to map microMIPS's recoded instructions to the existing MIPS32 instructions in the LLVM code generator. The LLVM backend infrastructure supports a mechanism that allows instruction mapping in a late phase of code generation; in other words, after the MIPS32 instructions are selected and scheduled, their binary encodings are simply replaced by the microMIPS encodings [10].
Instruction mapping is most beneficial when the new instruction set is similar to one that is already implemented, because much of the existing implementation can be reused. Thus, the implementation of the first four phases of code generation (DAG lowering, DAG legalization, instruction selection and instruction scheduling) can be completely reused from the existing backend. The only addition to the code generation is that instructions are mapped at the very end of the process; in other words, the binary encodings of the selected instructions are simply replaced with the new ones.
Here we show an example of mapping between a MIPS32 instruction and the corresponding microMIPS instruction. The next example shows the MIPS32 instruction with the MMRel class added. The MMRel class is the key to instruction mapping because it is used to relate instructions to each other:

def ADDIU : MMRel, ArithLogicI<"addiu",
                               simm16, GPR32Opnd, IIArith,
                               immSExt16, add>,
            ADDI_FM<0x9>;

And here is the corresponding microMIPS instruction, with the new format class ADDI_FM_MM defined:

class ADDI_FM_MM<bits<6> op> {
  bits<5> rs;
  bits<5> rt;
  bits<16> imm16;
  bits<32> Inst;

  let Inst{31-26} = op;
  let Inst{25-21} = rt;
  let Inst{20-16} = rs;
  let Inst{15-0}  = imm16;
}

def ADDIU_MM : MMRel, ArithLogicI<"addiu",
                                  simm16, GPR32Opnd, IIArith,
                                  immSExt16, add>,
               ADDI_FM_MM<0x9>;

These mapped instructions are very similar in format; only the rs and rt operands are switched. Once the relation class has been added to the instruction definitions, the relation table is created, and the final step is to perform the mapping. The actual mapping is done in the function MipsMCCodeEmitter::EncodeInstruction in the file MipsMCCodeEmitter.cpp:
// Get the MIPS32 binary encoding
uint32_t Binary = getBinaryCodeForInstr(TmpInst, Fixups);
unsigned Opcode = TmpInst.getOpcode();

if (IsMicroMips) {
  // Get the microMIPS instruction opcode
  int NewOpcode = Mips::Std2MicroMips(Opcode,
                                      Mips::Arch_micromips);
  if (NewOpcode != 0xFFFF) {
    Opcode = NewOpcode;
    TmpInst.setOpcode(NewOpcode);
    // Get the microMIPS binary encoding
    Binary = getBinaryCodeForInstr(TmpInst, Fixups);
  }
}

The function Std2MicroMips performs a lookup in the relation table and returns the opcode of the microMIPS instruction related to the opcode of the corresponding MIPS32 instruction that is passed as a parameter.
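Both the Std2MicroMips function and the relation table it searches are generated by TableGen from an instruction-mapping (InstrMapping) record. The sketch below shows how such a record could look; the field values are illustrative of the MMRel-based relation, and the exact record can be found in the MIPS target description files:

// Sketch of the mapping record that drives relation table generation.
def Std2MicroMips : InstrMapping {
  // Only records that derive from the MMRel class participate.
  let FilterClass = "MMRel";
  // Instructions that share the same base opcode form one row of the table.
  let RowFields = ["BaseOpcode"];
  // The column is selected by the architecture variant of the record.
  let ColFields = ["Arch"];
  // The standard-encoding ("se") column is the key column...
  let KeyCol = ["se"];
  // ...and the microMIPS column holds the related instruction.
  let ValueCols = [["micromips"]];
}

From such a record TableGen emits the relation table together with the lookup function used in the code emitter, so the mapping itself does not have to be written by hand.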
Although most of the microMIPS recoded instructions are covered this way, there are a few exceptions to this approach. For example, the microMIPS recoded unaligned load/store instructions cannot be handled by mapping because their offset sizes differ from those of the corresponding MIPS32 instructions. The offset of the microMIPS unaligned load/store instructions is 12 bits wide, while the offset of the corresponding MIPS32 instructions is 16 bits wide, so a problem would arise whenever the instruction selector chose one of these MIPS32 instructions with an offset larger than 12 bits: when the binary encoding of that instruction was later replaced by the microMIPS encoding, only 12 bits of the offset would be kept. Therefore new definitions of the microMIPS unaligned load/store instructions have been introduced, together with the corresponding selection patterns.
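As an illustration of what such a new definition involves, the sketch below shows a load/store format class that encodes a 5-bit base register together with a 12-bit offset. The class name, field layout and operand packing here are illustrative and are not the exact ones used in the LLVM source tree:

// Illustrative microMIPS load/store format with a 12-bit offset.
class MEM_FM_MM12<bits<6> op, bits<4> funct> {
  bits<5>  rt;
  bits<17> addr;                  // 5-bit base register + 12-bit signed offset
  bits<32> Inst;

  let Inst{31-26} = op;
  let Inst{25-21} = rt;
  let Inst{20-16} = addr{16-12};  // base register
  let Inst{15-12} = funct;
  let Inst{11-0}  = addr{11-0};   // 12-bit offset (16 bits in MIPS32)
}

An instruction definition using such a format also needs its own selection pattern built around a 12-bit address operand, so that the instruction selector never produces an offset that does not fit.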
Besides the instruction definitions in the target description files, there are other things that need to be implemented manually in order to get the new backend working. These are function call support (stack frames), ELF support which also includes the implementation of relocations, support for instructions whose selection must be handled manually, the delay slot filler, support for endianness, various target dependent optimizations, the assembler, the disassembler, etc. For each of these there is a corresponding file in the backend directory. Because there are many of these files, only a few of them are mentioned here:
MipsAsmPrinter.cpp - assembly printer
MipsDelaySlotFiller.cpp - support for delay slots
MipsFrameLowering.cpp - prologue/epilogue functions
MipsISelDAGToDAG.cpp - manual instruction selection implementations
Another specific thing about microMIPS is that its little endian byte order differs from the one found in other MIPS architectures: the first two bytes are swapped with the second two bytes, as shown in Figure 3. This must be kept in mind all the time when implementing the microMIPS backend, because besides the final byte emission there are other places where the binary encoding of an instruction must be changed manually, such as fixups of relocations.

Fig. 3. Little endian byte ordering.

There are three groups of tests that are used in this project. The first group consists of so called regression tests. These tests serve to prove the correctness of instruction encodings, generated relocations, etc., and are run by the supporting tools in LLVM. The second group are DejaGNU tests that are run with an emulator such as Qemu, and natively on a microMIPS M14K board. The third group are the SingleSource and MultiSource tests that are part of the LLVM test-suite [11].

CONCLUSION
Instruction mapping reduces the amount of code, the implementation time and therefore the overall effort that the implementation of a new backend takes. A precondition for instruction mapping is that the corresponding backend is already implemented, so that some of the already implemented code generation phases can be reused. In our case almost all of the 32-bit recoded MIPS32 instructions in microMIPS are mapped to the MIPS32 instructions that were already implemented in LLVM: 175 instructions are mapped and 58 instructions are implemented from scratch. For comparison, the target description file for the microMIPS instructions (MicroMipsInstrInfo.td) is nearly 500 lines of code long, the target description file for the MIPS32 instructions (MipsInstrInfo.td) is nearly 1400 lines of code long, and the target description file for the MIPS16 architecture is over 1800 lines of code long, which is more than three times longer than the microMIPS target description file.
Another benefit of instruction mapping is that the testing and verification time of the implemented backend is much shorter than it would be otherwise. The number of newly developed tests is reduced because the reused parts of the implementation are already verified and tested. The same stands for the number of detected bugs and the effort needed to fix them.
ACKNOWLEDGMENT
This work was partially supported by the Ministry of
Education, Science and Technological Development of the
Republic of Serbia under Grant TR-32034.
REFERENCES
[1] LLVM Compiler Infrastructure, http://llvm.org
[2] MIPS Technologies, microMIPS Instruction Set Architecture, MD00690, Revision 01, October 2009.
[3] LLVM, The Architecture of Open Source Applications, http://aosabook.org/en/llvm.html
[4] The LLVM Target-Independent Code Generator, http://llvm.org/docs/CodeGenerator.html
[5] Christoph Erhardt, Design and Implementation of a TriCore Backend for the LLVM Compiler Framework, September 2009.
[6] TableGen Fundamentals, http://llvm.org/docs/TableGenFundamentals.html
[7] Tom R. Halfhill, "MicroMIPS Crams Code," Microprocessor Report, November 2009.
[8] MIPS Technologies, MIPS Architecture for Programmers, Volume II-B: The microMIPS32 Instruction Set, December 2010.
[9] MIPS Technologies, microMIPS ASE Usage, June 2010.
[10] Writing an LLVM Backend, http://llvm.org/docs/WritingAnLLVMBackend.html
[11] LLVM Testing Infrastructure Guide, http://llvm.org/docs/TestingGuide.html
