TableGen for instruction selection
We have looked at how the instruction selection works. Here, we will see how you can select LLVM IR to your target’s instruction set.
TableGen
TableGen does a lot of heavy-lifting for us in the instruction selection process. Refer to the TableGen documentation for how to use TableGen.
A TableGen introduction is coming soon to this site.
Essentially, we specify instructions and rules on how to match SDNodes to our instructions by using TableGen records. Then, the instruction selection TableGen backend generates the necessary code to actually perform the selection.
This page can be updated
This page is a work in progress. The intention is to provide a comprehensive documentation on how to use TableGen for instruction selection.
The documentation needs some commentary from more experienced developers to be more informative. If you happen to be one of them, please feel free to contribute.
Registers
Before we can write any instruction, we must define some registers for our architecture.
Registers are quite simple, so refer directly to the Adding Registers chapter for how to define them.
Register Classes
The important part in defining the register classes is the supported types of the class.
For example, MIPS defines the GPR32
class for register supporting i32
types:
def GPR32 : GPR32Class<[i32]>;
AMDGPU defines the VGPR
class for registers supporting v4i32
types:
// VGPR 32-bit registers// i16/f16 only on VI+def VGPR_32 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32, (add (sequence "VGPR%u", 0, 255))> { let AllocationPriority = 0; let Size = 32; let Weight = 1; let BaseClassOrder = 32;}...// Types are defined as a list:def Reg16Types : RegisterTypes<[i16, f16, bf16]>;def Reg32Types : RegisterTypes<[i32, f32, v2i16, v2f16, v2bf16, p2, p3, p5, p6]>;
Defining Instructions
Instructions refer to the MachineInstr
class, which holds a MCInstrDesc
object and are generated from
records that derive from the TableGen Instruction
class.
The MCInstrDesc
class describes the properties of an instruction, like:
- Opcode
- Arguments
- Arguments that are definitions
- Encoding size
- Implicit uses and defs
- Target-independent flags
- Target-specific flags
Instances of this class are created by the InstrInfo TableGen backend. The Opcodes are generated in the target’s namespace and are the same name as the records defined for each instruction.
Types in TableGen
Types come from the ValueTypes.td
file. Whenever a valuetype
is used here, it refers to records
that subclass the ValueType
class. Here are some examples:
Integers: i1, i2, ... i128Floating point: bf16, f16, f32... f128, ppcf128Vectors: v1i1, ... v2048i1, v64i4, ...ScalableVectors: nxv1i1 etc
Glue
- used for glue operandsOtherVT
- the “Other” value, used as the type of return values of nodes of opcodes like ISD::BasicBlock or ISD::Br.
Find all of them in the llvm/include/llvm/CodeGen/ValueTypes.td
file.
SD Nodes in TableGen
To describe patterns, we need to specify the SDNodes that we want to match. An SDNode is defined by its type profile, opcode and node properties.
Syntax:
def SDNodeName : SDNode<"XXXISD::OpcodeName", {type profile record}, [{node properties}...], {sdnode C++ class}>;
Type Profile
Instances of this class are used to describe the types of the operands and results of the SDNode.
Type constraints
This uses a class called SDTypeConstraint
to describe the input and output types of the SDNode.
TableGen has hardcoded a list of subclasses of SDTypeConstraint
(like SDTisFP
) that are documented in the
TargetSelectionDAG.td
file.
class SDTypeConstraint<int opnum>;
OpNum argument
The argument int OpNum
is the operand number of the SDNode that we want to apply this to.
The 0th operand is the return value of the SDNode’s, and the operands of the SDNode are numbered from 1 to N.
To use these classes, we create anonymous records in the type profile.
Creating a type profile
To create a type profile, define a record that subclasses the SDTypeProfile
class.
Here is an example:
// SDTCisFP - The specified operand has floating-point type.class SDTCisFP<int OpNum> : SDTypeConstraint<OpNum>;
// Use it by creating an anonymous record in the type profile like: SDTCisFP<0>// Create a floating point binary operation in SDAG:
def SDFloatingBinaryOp : SDTypeProfile< 1, // the number of result values 2, // the number of operands [ SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, // all operands and result are same SDTCisFP<0> // all are floating point ]>;
There are loads of pre-defined type profiles in TargetSelectionDAG.td
that you can use.
SDNode Properties
The properties of the SDNode are defined in SDNodeProperties.td
file.
Show all properties
list<SDNodeProperty> Properties = [];}//===----------------------------------------------------------------------===//// Selection DAG Node Properties.//// Note: These are hard coded into tblgen.//def SDNPCommutative : SDNodeProperty; // X op Y == Y op Xdef SDNPAssociative : SDNodeProperty; // (X op Y) op Z == X op (Y op Z)def SDNPHasChain : SDNodeProperty; // R/W chain operand and resultdef SDNPOutGlue : SDNodeProperty; // Write a flag resultdef SDNPInGlue : SDNodeProperty; // Read a flag operanddef SDNPOptInGlue : SDNodeProperty; // Optionally read a flag operanddef SDNPMayStore : SDNodeProperty; // May write to memory, sets 'mayStore'.def SDNPMayLoad : SDNodeProperty; // May read memory, sets 'mayLoad'.def SDNPSideEffect : SDNodeProperty; // Sets 'HasUnmodelledSideEffects'.def SDNPMemOperand : SDNodeProperty; // Touches memory, has assoc MemOperanddef SDNPVariadic : SDNodeProperty; // Node has variable arguments.
SD Nodes
Targets can create their own SDNodes to match with this TableGen class.
MIPS uses an SDNode with the opcode MIPSISD::Ret
to lower return instructions from LLVM IR to SDAG. This gets selected into
a RetRA machine pseudo instruction.
class SDNode<string opcode, SDTypeProfile typeprof, list<SDNodeProperty> props = [], string sdclass = "SDNode">;
Here is the example from MIPSInstrInfo.td:
// Returndef MipsRet : SDNode<"MipsISD::Ret", SDTNone, [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
// This is used in the RetRA instruction's inline pattern:def RetRA : PseudoSE<(outs), (ins), [(MipsRet)]>;
Patterns
Patterns in TableGen are used to match the SDNodes to the instructions.
You can jump to the Patterns Semantics section to just see the semantics.
Before we get into the details, here is a simple example of a pattern:
def : Pat<(add GPR32:$gp, (MipsGPRel tglobaladdr:$in)), (ADDiu GPR32:$gp, tglobaladdr:$in)>;// where:// add = builtin SDNode for the add operation// GPR32 = register class// MipsGPRel = SDNode for the global address// tglobaladdr = builtin SDNode for a global address// ADDiu = The target's instruction for the add immediate unsigned operation
We match the add
SDNode with the ADDiu
instruction where
one of the operands is a global address.
Syntax
To match a SDAG pattern, we use the Pat
class. The following defines a single pattern:
Signature:class Pattern<dag patternToMatch, list<dag> resultInstrs>;
Pat
Usually we use the Pat
class instead of Pattern
directly.
Targets create their own Pat
class wrapper.
class MipsPat<dag pattern, dag result> : Pat<pattern, result>, PredicateControl;
Input pattern
The LHS of the Pat
is defined as the input pattern. (the patternToMatch
in the above signature)
The dag patternToMatch
is the SDNode that we want to match in the SDAG.
Input patterns cannot contain instructions, so we cannot match instructions to .other instructions through these patterns.
Output pattern The RHS is the output pattern.
The list<dag> resultInstrs
is the list of result SDAGs that we want to generate
on matching the previous pattern. These cannot contain SDNodes (like add
).
def : Pat<(input_pattern), (output_pattern)>;
The list<dag> resultInstrs
is the list of result SDAGs that we want to generate
on matching the previous pattern. This is called as the output pattern. These cannot
contain SDNodes (like add
).
Inline patterns
Patterns can also be defined inline, in the Instruction record.
These are input patterns only (so we cannot use OutPatFrags
).
def ADD32 : XXXInstrFormat<(outs GPR32:$dst), (ins GPR32:$src1, GPR32:$src2), /*inline pattern*/ (set GPR32:$dst, (add GPR32:$src1, GPR32:$src2)) >;
The special record set
is used to match the output operand of the instruction.
Using patterns
Using patterns is quite simple. Use the Pat
class to create a pattern.
You don’t have to name the pattern record.
def : Pat<(add i32:$src1, i32:$src2)), (ADDiu i32:$src1, i32:$src2)>;
PatFrags
Pattern fragments are like macros for patterns. They can be used as an operator in other patterns or other pattern fragments. These get expanded at the use site into their corresponding output patterns.
def not_pat_frag : PatFrag<(ops node:$in), (xor node:$in, -1)>;
Syntax
def {pat_frag_name} : PatFrags<(ops { node:$nameN }*), [ {output patternN}* ]>
The ‘input’ of a pattern fragment is the (ops node, ...)
pattern. The output pattern
(xor node:$in, -1)
is what gets inlined at the use of this pattern fragment.
`ops`, `ins`, `outs`, `node`, `set`
All these are “keywords” in the ISel TableGen language.
“keywords” mean just special records that have a hardcoded meaning in the TableGen backend.
ops
,ins
andouts
are all equivalent nodes used only in input patterns for fragments and inline patterns in instructions.ins
is mostly used for input operand pattern of an instruction andouts
for the output.ops
is mostly used in pattern fragments.
set
is a special operator that can be only used in instruction’s inline patterns.node
can be only used in the input pattern of a pattern fragment.
def : Pat<(not_pat_frag i16:$in), (NotInstr i16:$in)>;// not_pat_frag gets expanded to:def : Pat<(xor i16:$in, -1), (NotInstr i16:$in)>;
The labels like $in
cannot be used more than once in the same pattern fragment.
We can use more than one output for the pattern fragment to expand to.
def multiple_pat_frag : PatFrags<(ops node:$in), [(xor node:$in, -1), (add node:$in, 1)]>;
The use of this PatFrag will get expanded to match both the xor
and add
instructions
to the same output instruction pattern.
def : Pat<(multiple_pat_frag i16:$in), (NotInstr i16:$in)>;// xor with -1 and add with 1 will match to the NotInstr in the// target-specific DAG.
OutPatFrags
The PatFrags
we saw stand for input pattern macros. For output patterns,
we need to use OutPatFrags
.
The syntax is the same as PatFrags
, but we cannot use SDNodeXForm
or SDNode
records
as operands.
def out_pat_frag : OutPatFrag< (ops node:$src1, node:$src2), (ADD32 $src1, // ADD32 is an instruction (ADD32 GPR32, $src2) ) >;// We cannot use SDNodes like `add` or `xor` in the output pattern.def invalid_pat_frag : OutPatFrag< // use PatFrag instead (ops node:$src1, node:$src2), (add $src1, $src2) // invalid >;
These can be used in other pattern fragments also.
def : Pat<(fmul f32:$src1, f32:$src2), (out_pat_frag $src1, $src2)>;
ComplexPatterns
Complex patterns are used mainly for addressing modes where we want a C++ function to do the matching manually.
class ComplexPattern<ValueType ty, // the return type of emitted node by the select function int numpos, // number of operands the select function takes string fn, // name of the select function. This is a member // of the ISelDAGToDAG class where we include the // tablegen'erated file. >;
These can be used as nodes in pattern or pattern fragments.
def : Pat<(i32 (load addrRegImm:$a), (LW addrRegImm:$a))>;
Using constants
Patterns can match or emit constants. The constants are represented as TargetConstant
nodes
if emitted, otherwise are just Constant
nodes.
def : Pat<(store (i32 0), addr:$dst), (SW ZERO, addr:$dst)>;// where:// store = SDNode for the store operation. This is a builtin PatFrag.// i32 = SDNode for the i32 type// 0 = the constant value// ZERO = the target's zero register (MIPS zero register)// addr = a complex pattern for a MIPS addressing mode.
Type inference
The input pattern (in a Pat
definition) must resolve to a single type.
This is done by intersecting the possible types of the nodes in the pattern DAG.
For example, the following pattern will not work because the types of the add
operands can be i32 or i64:
// GPR32 register class supports [i32, i64] typesdef : Pat<(add GPR32:$src1, GPR32:$src2), (ADDrr GPR32:$src1, GPR32:$src2)>;
To constrain the types, we use the type cast operator:
TypeCastOperator ::= TGTypes::List<recordOfType(ValueType)> // for intrinsics returning multiple values // like ([i16, i32] (intrinsic_with_two_ret_values ...)) | recordOfType(ValueType) // single ret value type cast // like (i16 (op_ret_one_val ...))
Basically we just wrap the multi-typed operation in a type node.
// we need to cast to the type of the return// value of the extloadi8 operation.def : Pat<(i32 (extloadi8 addr:$src), (LBu addr:$src))>;
Pattern semantics reference
I have attempted to condense the allowed combinations of operations in patterns with this BNF-like grammar syntax. This is not exhaustive, but should be enough to serve as a quick reference.
using TGTypes = TableGenTypes; // only for readability,// this virtual namespace refers to types in the TableGen language.function recordOfType(TGTypes::Class className) -> TGTypes::Record;// this grammar function is to represent an instance record of a// tablgen class.
// values in single quotes are just characters or built-in records.
pattern ::= '(' PatternOperator ')' // only operator, no arguments | '(' PatternOperator ArgList ')' // operator with arguments
attribute bool pattern.isInput; // these match the SDAG nodesattribute bool pattern.isOutput; // these are what replace SDAG nodes.
PatternOperator ::= 'null_frag' | TypeCastOperator | SDNodeOperator | recordOfType( PatFrags | Instruction if pattern.isOutput | SDNodeXForm if pattern.isOutput | Intrinsic if pattern.isInput | ComplexPattern ) | 'set' // can only be used in instruction inline patterns | 'ops' | 'ins' | 'outs' // only for pattern fragment operators
SDNodeOperator ::= OutputSDNodeOperator if pattern.isOutput | InputSDNodeOperator if pattern.isInput
OutputSDNodeOperator ::= 'imm' | 'timm' | 'fpimm' | 'tglobaltlsaddr' | 'tglobaladdr' | 'tblockaddress' | 'texternalsym' | 'tconstpool' | 'tjumptable' | 'tframeindex' | 'bb' | 'vt' | 'mcsym' // no other SDNodes' can be emitted in the DAG
InputSDNodeOperator ::= recordOfType(SDNode) // like (add imm:$immediate, ...) // any SDNode can be used here
TypeCastOperator ::= TGTypes::List<recordOfType(ValueType)> // for intrinsics returning multiple values // like ([i16, i32] (intrinsic_with_two_ret_values ...)) | recordOfType(ValueType) // single ret value type cast // like (i16 (op_ret_one_val ...))
ArgList ::= PatternArg ( ',' PatternArg )* // comma separated list of arguments
PatternArg ::= PatternArgVal ArgName? // argument can have an optional name | '$' name // no value | 'node' ArgName // special record to indicate an input argument // for PatFrag operands only | instanceOfType(TGTypes::Int) // int value, to match or emit constants | instanceOfType(TGTypes::Bit) // converted to an int value | instanceOfType(TGTypes::Bits) ArgName? // Bits is also converted to an int value // and emitted as TargetConstant node.
ArgName ::= ':' '$' identifier
PatternArgVal ::= pattern // note: (add imm:$imm) is the same as (add (imm):$imm) | '?' // unset value | SDNodeOperator // like 'imm' in (add imm:$immediate, ...) // this is wrapped into its own dag: imm:$name -> (imm):$name | recordOfType(PatFrags) // this is wrapped similarly