The Return Instruction
Instruction encodings
Since we are not going to generate object code yet, we do not need to write the instruction encodings.
To support instruction selection for the Nova target, we have to write some TableGen and C++ code. Here is what we have to do:
- Write instructions in NovaInstrInfo.td. These should cover all instructions we want to support from the MIPS ISA. Normally you would also add instruction encodings in this step, but we don’t need encodings for compiling to assembly, so we just move on to the next step.
- Write matching patterns in NovaInstrPats.td. This is where we implement the instruction selection.
- Write the NovaISelLowering.cpp file to lower LLVM code to the target-specific SelectionDAG nodes.
- Write the NovaISelDAGToDAG.cpp file to implement the instruction selection.
You’ll know what each step means when we get to it.
Instruction Selection
LLVM uses a two-phase process to select instructions. Most of the time, TableGen generates the matching code from the patterns you write.
Selection phase: LLVM IR -> SelectionDAG -(optimize)-> SelectionDAG -> Target-specific SelectionDAG -(optimize)-> Target-specific SelectionDAG
Lowering phase: Target-specific SelectionDAG -(optimize, then lower)-> Target-specific SelectionDAG -> MachineInstr -(optimize)-> MachineInstr
To define Nova’s instructions, we write one TableGen record per instruction, each an instance of the Instruction class. This is done in the NovaInstrInfo.td file.
Instruction formats
Instructions are encoded in a small set of formats, with fields such as rs, rt, rd, shamt and funct, so we usually define these formats in the XXXInstrFormats.td file. Read the Target.td file (llvm/include/llvm/Target/Target.td) to see how the instruction formats are defined.
Let’s add our file NovaInstrFormats.td to the llvm/lib/Target/Nova/ directory.
//===-- NovaInstrFormats.td - Nova Instruction Formats --------------------===//
// This file contains the instruction formats for the Nova architecture.
//===----------------------------------------------------------------------===//
Since we are not going to generate object code yet, we do not need to add the instruction encoding formats. We will just create a simple base class for Nova instructions.
//===----------------------------------------------------------------------===//
class NovaInst<dag outs, dag ins, string asmString> : Instruction {
  let Namespace = "Nova";
  let OutOperandList = outs;
  let InOperandList = ins;
  let AsmString = asmString;
}
Remember that the instructions we define are the MachineInstrs that LLVM IR instructions map to. Ideally, these match the target’s instruction set architecture (ISA) instructions.
But sometimes we need additional instructions that are not part of the ISA. These are called “pseudo instructions”. A pseudo instruction is not a real machine instruction; it stands in for one or more real instructions and is expanded or printed as them later. Pseudo instructions simplify instruction selection and the passes that run after it.
For example, the MIPS backend uses PseudoRet to represent a return instruction. PseudoRet is then printed as jr or jalr depending on the MIPS ISA revision.
Let’s add a Pseudo instruction class for Nova instructions.
}
class PseudoNovaInst<dag outs, dag ins, string asmString>
    : NovaInst<outs, ins, asmString> {
  let isPseudo = 1;
  let isCodeGenOnly = 1;
}
Defining the instruction
We’ll start with the return instruction.
Create the NovaISD::Ret
enum value. In our LLVM version, these enum values are not generated by TableGen, but
work is in progress to generate them. See this RFC.
//==-- Nova DAG Lowering Interface --------//
#ifndef LLVM_LIB_TARGET_NOVA_NOVAISELLOWERING_H
#define LLVM_LIB_TARGET_NOVA_NOVAISELLOWERING_H

#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/CodeGen/TargetLowering.h"

namespace llvm {
namespace NovaISD {
enum NodeType : unsigned {
  FIRST_NUMBER = ISD::BUILTIN_OP_END,
  // Return
  Ret,
};
} // end namespace NovaISD
While we are in this file, add the NovaTargetLowering class. This is responsible for lowering LLVM IR to the target-specific DAG nodes.
} // end namespace NovaISD
class NovaSubtarget;
class NovaTargetLowering : public TargetLowering {
public:
  explicit NovaTargetLowering(const TargetMachine &TM,
                              const NovaSubtarget &STI);

  SDValue LowerReturn(SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
                      const SmallVectorImpl<ISD::OutputArg> &Outs,
                      const SmallVectorImpl<SDValue> &OutVals, const SDLoc &dl,
                      SelectionDAG &DAG) const override;

  SDValue LowerCall(TargetLowering::CallLoweringInfo &CLI,
                    SmallVectorImpl<SDValue> &InVals) const override;

  bool CanLowerReturn(CallingConv::ID CallConv, MachineFunction &MF,
                      bool IsVarArg,
                      const SmallVectorImpl<ISD::OutputArg> &Outs,
                      LLVMContext &Context, const Type *RetTy) const override;

  SDValue LowerFormalArguments(SDValue Chain, CallingConv::ID /*CallConv*/,
                               bool /*isVarArg*/,
                               const SmallVectorImpl<ISD::InputArg> & /*Ins*/,
                               const SDLoc & /*dl*/, SelectionDAG & /*DAG*/,
                               SmallVectorImpl<SDValue> & /*InVals*/)
      const override {
    return Chain;
  }

  /// getTargetNodeName - This method returns the name of a target specific
  // DAG node.
  const char *getTargetNodeName(unsigned Opcode) const override;
};

} // namespace llvm

#endif
Add the SDNode that LLVM IR’s ret maps to. The opcode of this node is NovaISD::Ret and it takes a variable number of operands. This supports return values that span several registers (for example, returning an i64 value needs two i32 registers, $v0 and $v1).
//===- Nova Instruction Definitions ----------------------------===//
include "NovaInstrFormats.td"

//==---------- All SD nodes for Nova ------------------===//
def NovaRetSDN : SDNode<"NovaISD::Ret", SDTNone, // 0 results and 0 operands
                        [SDNPHasChain, SDNPVariadic, SDNPOptInGlue]>;
//==---------- End SD Node definitions ----------------===//
This node will get selected to the PseudoRet instruction.
                        [SDNPHasChain, SDNPVariadic, SDNPOptInGlue]>;
//==---------- End SD Node definitions ----------------===//

//==----------- Nova Instruction Definitions ----------===//
def PseudoRet : PseudoNovaInst<(outs), (ins), "ret"> {
  let isReturn = 1;
  let isTerminator = 1;
}
//==--------- End Nova Instruction Definitions --------===//
Add the pattern that will select the NovaISD::Ret node.
}
//==--------- End Nova Instruction Definitions --------===//

//==---- All patterns to match SD nodes -----------==//
def : Pat<(NovaRetSDN), (PseudoRet)>;
All target TableGen files are included in the top-level XXX.td file.
Include the new NovaInstrInfo.td file in Nova.td:
include "NovaRegisterInfo.td"
include "NovaInstrInfo.td"
def : ProcessorModel<"generic", NoSchedModel, []>;
InstrInfo class
The instruction records generated by TableGen are exposed through the NovaInstrInfo class.
Following the common TableGen pattern, we derive our class from the generated NovaGenInstrInfo class.
#ifndef LLVM_LIB_TARGET_NOVA_NOVAINSTRINFO_H
#define LLVM_LIB_TARGET_NOVA_NOVAINSTRINFO_H

#include "Nova.h"
#include "NovaRegisterInfo.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/TargetInstrInfo.h"

#define GET_INSTRINFO_HEADER
#include "NovaGenInstrInfo.inc"

namespace llvm {
class NovaSubtarget;

class NovaInstrInfo : public NovaGenInstrInfo {
public:
  explicit NovaInstrInfo(const NovaSubtarget &STI);

protected:
  const NovaSubtarget &Subtarget;
};
} // end namespace llvm

#endif
Before we create the constructor, we need stack manipulation instructions.
These instructions and the callseq_end SDNode are just placeholders for now. We will use them while lowering call nodes.
def : Pat<(NovaRetSDN), (PseudoRet)>;
def callseq_end : SDNode<"ISD::CALLSEQ_END", SDTNone, [SDNPHasChain, SDNPOptInGlue]>;
def ADJCALLSTACKDOWN : Instruction { let OutOperandList = (outs); let Namespace = "Nova"; let InOperandList = (ins); let AsmString = "ADJCALLSTACKDOWN"; let Pattern = [(callseq_end)];}
def ADJCALLSTACKUP : Instruction { let OutOperandList = (outs); let Namespace = "Nova"; let InOperandList = (ins); let AsmString = "ADJCALLSTACKUP"; let Pattern = [(callseq_end)];}
Create the NovaInstrInfo.cpp
file and implement the constructor.
#include "NovaInstrInfo.h"
#include "MCTargetDesc/NovaMCTargetDesc.h"
#include "NovaTargetMachine.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"

using namespace llvm;

#define DEBUG_TYPE "nova-instr-info"

#define GET_INSTRINFO_CTOR_DTOR
#include "NovaGenInstrInfo.inc"
NovaInstrInfo::NovaInstrInfo(const NovaSubtarget &STI) : NovaGenInstrInfo(Nova::ADJCALLSTACKDOWN, Nova::ADJCALLSTACKUP), Subtarget(STI) { }
Include this in CMakeLists.txt
to build the file.
NovaRegisterInfo.cpp MCTargetDesc/NovaMCTargetDesc.cpp NovaTargetObjectFile.cpp NovaSubtarget.cpp MCTargetDesc/NovaMCAsmInfo.cpp NovaInstrInfo.cpp
// Add the GenInstrInfo.inc include to MCTargetDesc files.
Registering the InstrInfo
Each instruction is identified by an enum value, and the per-instruction information is stored in MCInstrDesc objects.
Include the enum declaration in the MCTargetDesc header file.
#include "NovaGenSubtargetInfo.inc"
#define GET_INSTRINFO_ENUM
#include "NovaGenInstrInfo.inc"
#endif
TableGen generates the descriptions of all instructions in an MCInstrDesc[] array.
using namespace llvm;
#define GET_INSTRINFO_MC_DESC
#define ENABLE_INSTR_PREDICATE_VERIFIER
#include "NovaGenInstrInfo.inc"
#define GET_REGINFO_MC_DESC
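The relationship between opcode enum values and their descriptors can be sketched without LLVM. The snippet below is a toy model, not the generated code: the `toy` namespace, the `InstrDesc` struct and its fields are assumptions made for illustration. The real generated enum also contains the target-independent opcodes, and the real MCInstrDesc holds much more information.

```cpp
#include <cassert>

namespace toy {
// Hypothetical opcode enum; each value doubles as an index into Descs.
enum Opcode : unsigned {
  ADJCALLSTACKDOWN,
  ADJCALLSTACKUP,
  PseudoRet,
  NUM_OPCODES
};

// Hypothetical, heavily reduced stand-in for a per-instruction descriptor.
struct InstrDesc {
  const char *Name;
  bool IsReturn;
  bool IsTerminator;
  bool IsPseudo;
};

// One descriptor per opcode, in enum order.
const InstrDesc Descs[NUM_OPCODES] = {
    {"ADJCALLSTACKDOWN", false, false, false},
    {"ADJCALLSTACKUP", false, false, false},
    {"PseudoRet", true, true, true},
};

// Look up the descriptor that belongs to an opcode.
const InstrDesc &get(Opcode Op) {
  assert(Op < NUM_OPCODES && "opcode out of range");
  return Descs[Op];
}
} // namespace toy
```

In this model, `toy::get(toy::PseudoRet).IsReturn` is true because the PseudoRet record sets isReturn, which is the kind of query later passes make against the descriptor table.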
We should now also include the necessary files for the definitions.
#include "NovaTargetInfo.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/MCInstrInfo.h"
Register the instruction info in the createNovaMCInstrInfo
function.
}
static MCInstrInfo* createNovaMCInstrInfo() { MCInstrInfo *X = new MCInstrInfo(); InitNovaMCInstrInfo(X); return X;}
static MCInstPrinter* createNovaMCInstPrinter(const Triple &T, unsigned SyntaxVariant, const MCAsmInfo &MAI, const MCInstrInfo &MII, const MCRegisterInfo &MRI) {
TargetRegistry::RegisterMCRegInfo(*T, createNovaMCRegisterInfo); TargetRegistry::RegisterMCSubtargetInfo(*T, createNovaSubtargetInfo); TargetRegistry::RegisterMCInstrInfo(*T, createNovaMCInstrInfo);
With this, we have defined everything required to support the return instruction.
Lowering to SelectionDAG
We have to tell the SelectionDAGBuilder
how to lower the LLVM IR ret
instruction to Nova’s SDNodes.
More specifically, we have to construct physical register nodes for the return values and insert the actual return SDNode.
This is done in the LowerReturn
method of the TargetLowering
class.
Let’s consider an example of a return statement that needs to be lowered.
define i64 @rett(i32 %a, i32 %b) {
entry:
  %aext = zext i32 %a to i64
  %bext = zext i32 %b to i64
  %ret = add i64 %aext, %bext
  ret i64 %ret
}
The MIPS backend converts this into the following selection DAG (note the MipsISD::Ret node; our backend will emit NovaISD::Ret instead):
Initial selection DAG: %bb.0 'rett:entry'
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t5: i64 = zero_extend t2
  t4: i32,ch = CopyFromReg t0, Register:i32 %1
  t6: i64 = zero_extend t4
  t7: i64 = add t5, t6
  t9: i32 = extract_element t7, Constant:i32<1>
  t13: ch,glue = CopyToReg t0, Register:i32 $v0, t9
  t11: i32 = extract_element t7, Constant:i32<0>
  t15: ch,glue = CopyToReg t13, Register:i32 $v1, t11, t13:1
  t16: ch = MipsISD::Ret t15, Register:i32 $v0, Register:i32 $v1, t15:1
We see that the return node takes two registers for one i64 value. This is because i64 is not a legal type on a 32-bit target: the value is split into two 32-bit halves, and the calling convention returns them in the registers $v0 and $v1.
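The split itself is plain bit manipulation. Here is a minimal sketch in standalone C++; `splitI64` and `joinI64` are hypothetical helper names, not LLVM functions, and which half ends up in which register depends on the target, so the sketch only labels the halves as low and high.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: split a 64-bit value into its (low, high) 32-bit
// halves, the way an i64 must be broken up to fit two i32 registers.
std::pair<std::uint32_t, std::uint32_t> splitI64(std::uint64_t Value) {
  auto Lo = static_cast<std::uint32_t>(Value & 0xFFFFFFFFu);
  auto Hi = static_cast<std::uint32_t>(Value >> 32);
  return {Lo, Hi};
}

// Inverse operation: rebuild the 64-bit value from its two halves.
std::uint64_t joinI64(std::uint32_t Lo, std::uint32_t Hi) {
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo;
}
```

The two extract_element nodes in the DAG above play the role of `splitI64`: each one pulls one 32-bit half out of the i64 result before it is copied into a physical register.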
The LowerReturn method is responsible for lowering the return instruction. It iterates over the return values, creates a CopyToReg node for each value, and then adds the final return node to the DAG.
See the virtual method in TargetLowering. This method must be implemented by targets.
}
  /// This hook must be implemented to lower outgoing return values, described
  /// by the Outs array, into the specified DAG. The implementation should
  /// return the resulting token chain value.
  virtual SDValue LowerReturn(SDValue /*Chain*/, CallingConv::ID /*CallConv*/,
                              bool /*isVarArg*/,
                              const SmallVectorImpl<ISD::OutputArg> & /*Outs*/,
                              const SmallVectorImpl<SDValue> & /*OutVals*/,
                              const SDLoc & /*dl*/,
                              SelectionDAG & /*DAG*/) const {
    llvm_unreachable("Not Implemented");
  }
/// Return true if result of the specified node is used by a return node
To begin, create the NovaISelLowering.cpp file.
//===- NovaISelLowering.cpp - Nova DAG Lowering Implementation -----------===//
#include "NovaISelLowering.h"
#include "MCTargetDesc/NovaMCTargetDesc.h"
#include "NovaSubtarget.h"
using namespace llvm;
#define DEBUG_TYPE "nova-isel"
We have to declare legal types for the target. This is done in the NovaTargetLowering
constructor.
#define DEBUG_TYPE "nova-isel"
NovaTargetLowering::NovaTargetLowering(const TargetMachine &TM, const NovaSubtarget &STI) : TargetLowering(TM) { addRegisterClass(MVT::i32, &Nova::GPR32RegClass);
computeRegisterProperties(STI.getRegisterInfo());}
Now implement the LowerReturn
method.
}
SDValue
NovaTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
                                bool isVarArg,
                                const SmallVectorImpl<ISD::OutputArg> &Outs,
                                const SmallVectorImpl<SDValue> &OutVals,
                                const SDLoc &dl, SelectionDAG &DAG) const {
Classes used for lowering arguments and return values
These types carry calling-convention information.
1. ISD::ArgFlagsTy
This is a struct of bit-fields that describes a single argument. It is used to determine how the argument should be passed to the function.
ISD::ArgFlagsTy
namespace ISD {
struct ArgFlagsTy {
private:
  unsigned IsZExt : 1;  ///< Zero extended
  unsigned IsSExt : 1;  ///< Sign extended
  unsigned IsNoExt : 1; ///< No extension
  unsigned IsInReg : 1; ///< Passed in register
  unsigned IsSRet : 1;  ///< Hidden struct-ret ptr
  unsigned IsByVal : 1; ///< Struct passed by value
  unsigned IsByRef : 1; ///< Passed in memory
2. ISD::InputArg
This struct contains the flags and type information about a single incoming (formal) argument or incoming return value virtual register.
/// of the caller) return value virtual register.
///
struct InputArg {
  ArgFlagsTy Flags;
  MVT VT = MVT::Other;
  EVT ArgVT;
  bool Used = false;

  /// Index original Function's argument.
  unsigned OrigArgIndex;
  /// Sentinel value for implicit machine-level input arguments.
  static const unsigned NoArgIndex = UINT_MAX;

  /// Offset in bytes of current input value relative to the beginning of
  /// original argument. E.g. if argument was splitted into four 32 bit
  /// registers, we got 4 InputArgs with PartOffsets 0, 4, 8 and 12.
  unsigned PartOffset;

  InputArg() = default;
3. ISD::OutputArg
Same as ISD::InputArg, but for outgoing arguments and outgoing return values. It is used to determine how the value should be passed.
/// of the caller) return value virtual register.
///
struct OutputArg {
  ArgFlagsTy Flags;
  MVT VT;
  EVT ArgVT;

  /// IsFixed - Is this a "fixed" value, ie not passed through a vararg "...".
  bool IsFixed = false;

  /// Index original Function's argument.
  unsigned OrigArgIndex;

  /// Offset in bytes of current output value relative to the beginning of
  /// original argument. E.g. if argument was splitted into four 32 bit
  /// registers, we got 4 OutputArgs with PartOffsets 0, 4, 8 and 12.
  unsigned PartOffset;
  OutputArg() = default;
  OutputArg(ArgFlagsTy flags, MVT vt, EVT argvt, bool isfixed, unsigned origIdx,
The Outs vector contains the return values that we have to place into registers according to the calling convention.
The vector is filled in by the generic return lowering code in SelectionDAGBuilder.cpp, which splits the return value of any LLVM type (like i17) into legal types (like i32, f32) before calling LowerReturn.
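To get a feel for how many entries end up in the Outs vector, here is a simplified model for integer return types on a target with 32-bit registers. It is an assumption-laden sketch, not LLVM’s actual code: `numRegisterParts` is a hypothetical name, and the real computation in TargetLowering also covers vectors and floating-point types.

```cpp
#include <cassert>

// Simplified model (an assumption, not LLVM's implementation): the number of
// registers of width RegBits needed to hold an integer that is Bits wide.
unsigned numRegisterParts(unsigned Bits, unsigned RegBits = 32) {
  assert(Bits > 0 && RegBits > 0 && "widths must be non-zero");
  return (Bits + RegBits - 1) / RegBits; // integer division, rounded up
}
```

Under this model an i17 or i32 return value yields a single part, while an i64 yields two, which is exactly the case that a check on the size of Outs can use to reject values that need more than one register.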
Let’s just support single register return values for now.
                                const SmallVectorImpl<SDValue> &OutVals,
                                const SDLoc &dl, SelectionDAG &DAG) const {
  // Handle only integer return values
  // we need to copy the value to the v0 register.
  if (Outs.size() > 1) {
    report_fatal_error(
        "Multiple return values not supported\n"
        "This could be because the return type is a struct or a large integer "
        "that got split into multiple registers",
        false);
  }
report_fatal_error
We use this function here to report a user error. In the current LLVM version, the report_fatal_error function is deprecated and replaced by reportFatalUsageError.
If we have no return values, just emit a return node.
}
if (Outs.size() == 0) { return DAG.getNode(NovaISD::Ret, dl, MVT::Other, Chain); }
Otherwise, we iterate over the values given in Outs and emit a CopyToReg node for each value. These nodes must be glued together, and the last one is glued to the final NovaISD::Ret node.
Note that this only supports integer values that fit in a single 32-bit register.
}
  SDValue Glue;
  SmallVector<SDValue, 3> RetOps(1, Chain);
  for (unsigned i = 0, e = Outs.size(); i != e; ++i) {
    const ISD::OutputArg &Out = Outs[i];
    const SDValue &OutVal = OutVals[i];
    if (!Out.ArgVT.isScalarInteger() ||
        Out.ArgVT.getScalarSizeInBits() > 32) {
      report_fatal_error("Only i32 return values are supported", false);
    }
    Chain = DAG.getCopyToReg(Chain, dl, Nova::V0, OutVal, Glue);
    Glue = Chain.getValue(1);
    RetOps.push_back(DAG.getRegister(Nova::V0, Out.VT));
  }
  RetOps[0] = Chain;
  RetOps.push_back(Glue);

  return DAG.getNode(NovaISD::Ret, dl, MVT::Other, RetOps);
}
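The operand order of the return node (chain first, then the registers, then the glue) is easy to lose in the real code. The toy model below rebuilds the same list with plain strings. Everything in it is hypothetical (`buildRetOps`, the "entry" chain name, the textual node format); it only mirrors the bookkeeping and does not create any DAG nodes.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace toy {
// Toy model: build the operand list of the return node as strings.
// RetRegs holds the register name used for each return value.
std::vector<std::string> buildRetOps(const std::vector<std::string> &RetRegs) {
  std::string Chain = "entry"; // stands in for the incoming chain
  std::string Glue;            // empty until the first copy is emitted
  std::vector<std::string> RetOps(1, Chain);
  for (const std::string &Reg : RetRegs) {
    // Each copy consumes the previous chain and, if present, the previous glue.
    std::string GlueArg = Glue.empty() ? std::string() : ", " + Glue;
    Chain = "CopyToReg(" + Chain + ", " + Reg + GlueArg + ")";
    Glue = Chain + ":1"; // result 1 of the copy is its glue output
    RetOps.push_back(Reg);
  }
  RetOps[0] = Chain; // the return node hangs off the last copy
  if (!Glue.empty())
    RetOps.push_back(Glue); // glue always goes last
  return RetOps;
}
} // namespace toy
```

For a single register "$v0" the model produces three operands: the chain of the copy, the register itself, and the glue of the copy, matching the shape of the Ret node in the DAG dump shown earlier.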
Add dummy implementations for the LowerCall
and other required methods.
}
SDValue NovaTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI, SmallVectorImpl<SDValue> &InVals) const { return SDValue(); }
bool NovaTargetLowering::CanLowerReturn(CallingConv::ID CallConv, MachineFunction &MF, bool IsVarArg, const SmallVectorImpl<ISD::OutputArg> &Outs, LLVMContext &Context, const Type *RetTy) const{ return true;}
const char *NovaTargetLowering::getTargetNodeName(unsigned Opcode) const { switch (Opcode) { case NovaISD::Ret: return "NovaISD::Ret"; default: return "Unknown NovaISD::Node"; }}
Finally, tell CMakeLists.txt
to build the new file.
MCTargetDesc/NovaMCAsmInfo.cpp NovaInstrInfo.cpp NovaISelLowering.cpp
Instruction Selection pass
The lowering code above is driven by the instruction selection pass that comes after some
IR optimizations in the llc
pipeline.
Let’s create the pass for our target. The logic mainly comes from the SelectionDAGISel
class.
#ifndef LLVM_LIB_TARGET_NOVA_NOVAISELDAGTODAG_H
#define LLVM_LIB_TARGET_NOVA_NOVAISELDAGTODAG_H

#include "NovaSubtarget.h"
#include "NovaTargetMachine.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/SelectionDAGISel.h"

namespace llvm {
class NovaDAGToDAGISel final : public SelectionDAGISel {
  const NovaSubtarget *Subtarget;

public:
  explicit NovaDAGToDAGISel(NovaTargetMachine &TM, CodeGenOptLevel OptLevel)
      : SelectionDAGISel(TM, OptLevel) {}

  bool runOnMachineFunction(MachineFunction &MF) override;

private:
#include "NovaGenDAGISel.inc"

  void Select(SDNode *Node) override;
};
} // namespace llvm

#endif
Select() is called for each node in the DAG. Here we can put our custom selection code and then call the TableGen-generated SelectCode() to select the node based on the patterns in the .td files.
#include "NovaISelDAGToDAG.h"
#include "NovaSubtarget.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/SelectionDAGISel.h"
#include "llvm/Pass.h"
#include "llvm/Support/CodeGen.h"
using namespace llvm;
#define DEBUG_TYPE "nova-isel"
namespace {class NovaDAGToDAGISelLegacy : public SelectionDAGISelLegacy {public: static char ID; NovaDAGToDAGISelLegacy(NovaTargetMachine &TM, CodeGenOptLevel OptLevel) : SelectionDAGISelLegacy( ID, std::make_unique<NovaDAGToDAGISel>(TM, OptLevel)) {}};} // namespace
char NovaDAGToDAGISelLegacy::ID = 0;
INITIALIZE_PASS(NovaDAGToDAGISelLegacy, DEBUG_TYPE, "nova-isel", false, false);
FunctionPass *llvm::createNovaISelDagLegacy(NovaTargetMachine &TM, CodeGenOptLevel OptLevel) { return new NovaDAGToDAGISelLegacy(TM, OptLevel);}
bool NovaDAGToDAGISel::runOnMachineFunction(MachineFunction &MF) { Subtarget = &static_cast<const NovaSubtarget &>(MF.getSubtarget<NovaSubtarget>()); return SelectionDAGISel::runOnMachineFunction(MF);}
void NovaDAGToDAGISel::Select(SDNode *Node) {
  // Implement the selection logic here.
  // This is where you would match the SelectionDAG nodes to the target
  // instructions. For example, you might want to match a specific node type and
  // then create a corresponding machine instruction.

  // Example: if (Node->getOpcode() == ISD::ADD) { ... }
  // This is just a placeholder for the actual implementation.
  SelectCode(Node);
}
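The division of labour in Select(), hand-written cases first and the generated matcher as the fallback, can be modelled in a few lines of standalone C++. This is only a sketch under stated assumptions: `toy::select`, `toy::selectCode`, and the "custom_node" and "CustomInstr" names are invented for illustration, and the real generated matcher is a table-driven interpreter rather than a map lookup.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

namespace toy {
// Stand-in for the TableGen-generated matcher: a lookup table from DAG node
// names to machine instruction names, one entry per pattern.
std::optional<std::string> selectCode(const std::string &Node) {
  static const std::map<std::string, std::string> Patterns = {
      {"NovaISD::Ret", "PseudoRet"},
  };
  auto It = Patterns.find(Node);
  if (It == Patterns.end())
    return std::nullopt; // no pattern matched
  return It->second;
}

// Stand-in for Select(): custom C++ handling first, patterns second.
std::optional<std::string> select(const std::string &Node) {
  assert(!Node.empty() && "node name must not be empty");
  if (Node == "custom_node") // hypothetical node that needs manual selection
    return std::string("CustomInstr");
  return selectCode(Node);
}
} // namespace toy
```

In the toy, a node without a pattern yields an empty optional. In LLVM the generated matcher instead aborts with a "Cannot select" error, which is the message you will see when a pattern is missing.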
Legacy passes like this one need to be initialized by registering them in the PassRegistry.
We declare such initializer functions in the Nova.h file.
#include "llvm/Support/CodeGen.h"
namespace llvm { class FunctionPass; class NovaTargetMachine;
FunctionPass *createNovaISelDagLegacy(NovaTargetMachine &TM, CodeGenOptLevel OptLevel);
void initializeNovaDAGToDAGISelLegacyPass(PassRegistry &);
} // namespace llvm

#endif
Finish with required includes.
#include "MCTargetDesc/NovaMCTargetDesc.h"
#include "llvm/Pass.h"
#include "llvm/Support/CodeGen.h"
Plug into the pipeline
We now set up the pass pipeline to use the new NovaISelDAGToDAG
pass.
Register the target machine in the target registry.
extern "C" void LLVMInitializeNovaTarget() {
  // TODO: Add initialize target
  RegisterTargetMachine<NovaTargetMachine> X(getTheNovaTarget());

  initializeNovaDAGToDAGISelLegacyPass(*PassRegistry::getPassRegistry());
}
Targets construct their pipeline by using the TargetPassConfig
class.
}
namespace {class NovaPassConfig : public TargetPassConfig {public: NovaPassConfig(NovaTargetMachine &TM, PassManagerBase &PM) : TargetPassConfig(TM, PM) {}
NovaTargetMachine &getNovaTargetMachine() const { return getTM<NovaTargetMachine>(); } bool addInstSelector() override { addPass(createNovaISelDagLegacy(getNovaTargetMachine(), getOptLevel())); return false; } void addPreEmitPass() override {}};} // namespace
TargetPassConfig *NovaTargetMachine::createPassConfig(PassManagerBase &PM) { return new NovaPassConfig(*this, PM);}
Great! We are almost there - the last piece of the backend is the instruction printer.
Instruction Printer
To write the machine instructions to the assembly file, we have to implement
our AsmPrinter pass. This uses another class called MCInstPrinter
to print the
instructions.
This pass writes the MachineInstr
to the output file. It is responsible for
converting the MachineInstr
to the target-specific assembly syntax.
When we write the instructions in the NovaInstrInfo.td file, we also define the assembly string format for each of them. TableGen will generate the printing method using that format.
tablegen(LLVM NovaGenInstrInfo.inc -gen-instr-info)
tablegen(LLVM NovaGenDAGISel.inc -gen-dag-isel)
tablegen(LLVM NovaGenAsmWriter.inc -gen-asm-writer)
#ifndef LLVM_LIB_TARGET_NOVA_MCTARGETDESC_NOVAMCINSTPRINTER_H
#define LLVM_LIB_TARGET_NOVA_MCTARGETDESC_NOVAMCINSTPRINTER_H

#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCRegister.h"
namespace llvm {class NovaInstPrinter : public MCInstPrinter {public: NovaInstPrinter(const MCAsmInfo &MAI, const MCInstrInfo &MII, const MCRegisterInfo &MRI) : MCInstPrinter(MAI, MII, MRI) {}
void printInst(const MCInst *MI, uint64_t Address, StringRef Annot, const MCSubtargetInfo &STI, raw_ostream &O) override;
bool printAliasInstr(const MCInst *MI, uint64_t Address, raw_ostream &OS);
void printInstruction(const MCInst *MI, uint64_t Address, raw_ostream &O);
void printOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O);
void printRegName(raw_ostream &OS, MCRegister RegNo) override;
const char *getRegisterName(MCRegister Reg);
std::pair<const char*, uint64_t> getMnemonic(const MCInst &MI) const override;};} // end namespace llvm
#endif
The tablegen code is included in the implementation file like so:
#include "NovaMCInstPrinter.h"
#include "NovaInstrInfo.h"
#include "llvm/MC/MCInst.h"

#define DEBUG_TYPE "nova-mcinst-printer"

using namespace llvm;

#define PRINT_ALIAS_INSTR
#include "NovaGenAsmWriter.inc"
To print instructions, we use the generated printInstruction method.
Sometimes we need to print aliases of the instruction, which is handled by printAliasInstr.
#include "NovaGenAsmWriter.inc"
void NovaInstPrinter::printInst(const MCInst *MI, uint64_t Address,
                                StringRef Annot, const MCSubtargetInfo &STI,
                                raw_ostream &O) {
  // check if we have an alias
  if (!printAliasInstr(MI, Address, O)) {
    printInstruction(MI, Address, O);
  }
  printAnnotation(O, Annot);
}
void NovaInstPrinter::printRegName(raw_ostream &OS, MCRegister Reg) {
Registers in MIPS assembly are printed as $v0, $v1, etc. This is done by the printRegName method.
}
void NovaInstPrinter::printRegName(raw_ostream &OS, MCRegister Reg) { OS << "$" << StringRef(getRegisterName(Reg)).lower();}
Printing registers is just a special case of printing operands. MCOperand
represents several types of operands:
class raw_ostream;
/// Instances of this class represent operands of the MCInst class.
/// This is a simple discriminated union.
class MCOperand {
  enum MachineOperandType : unsigned char {
    kInvalid,      ///< Uninitialized.
    kRegister,     ///< Register operand.
    kImmediate,    ///< Immediate operand.
    kSFPImmediate, ///< Single-floating-point immediate operand.
    kDFPImmediate, ///< Double-Floating-point immediate operand.
    kExpr,         ///< Relocatable immediate operand.
    kInst          ///< Sub-instruction operand.
  };
  MachineOperandType Kind = kInvalid;
Handle this on a case-by-case basis in the printOperand
method.
}
void NovaInstPrinter::printOperand(const MCInst *MI, unsigned OpNo, raw_ostream &O) { const MCOperand &Op = MI->getOperand(OpNo); if (Op.isReg()) { printRegName(O, Op.getReg()); return; }
if (Op.isImm()) { O << Op.getImm(); return; }
assert(Op.isExpr() && "unknown operand type"); Op.getExpr()->print(O, &MAI, true);}
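The same case-by-case dispatch over a discriminated union can be tried out without LLVM by using std::variant. The sketch below is a toy: the `toy::Reg`, `toy::Imm` and `toy::Operand` types are assumptions made for illustration and cover only the register and immediate kinds.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <variant>

namespace toy {
struct Reg { std::string Name; };   // register operand, e.g. "V0"
struct Imm { std::int64_t Value; }; // immediate operand
using Operand = std::variant<Reg, Imm>;

// Mirrors printRegName: a "$" prefix followed by the lower-cased name.
std::string printRegName(const Reg &R) {
  std::string Out = "$";
  for (unsigned char C : R.Name)
    Out += static_cast<char>(std::tolower(C));
  return Out;
}

// Mirrors printOperand: check each operand kind in turn.
std::string printOperand(const Operand &Op) {
  if (const Reg *R = std::get_if<Reg>(&Op))
    return printRegName(*R);
  const Imm *I = std::get_if<Imm>(&Op);
  assert(I && "unknown operand kind");
  return std::to_string(I->Value);
}
} // namespace toy
```

With these definitions, a register operand named "V0" prints as `$v0` and an immediate prints as its decimal value, which is the behaviour the printOperand method above implements for MCOperand.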
Now hook the printer into our target by returning it from createNovaMCInstPrinter.
}
static MCInstPrinter* createNovaMCInstPrinter(const Triple &T, unsigned SyntaxVariant, const MCAsmInfo &MAI, const MCInstrInfo &MII, const MCRegisterInfo &MRI) { return new NovaInstPrinter(MAI, MII, MRI);}
extern "C" void LLVMInitializeNovaTargetMC() {
Install the instance in the Target
POD class.
TargetRegistry::RegisterMCInstrInfo(*T, createNovaMCInstrInfo); TargetRegistry::RegisterMCAsmInfo(*T, createNovaMCAsmInfo); TargetRegistry::RegisterMCInstPrinter(*T, createNovaMCInstPrinter);}
Include the new header.
#include "MCTargetDesc/NovaMCAsmInfo.h"
#include "llvm/MC/MCDwarf.h"
#include "MCTargetDesc/NovaMCInstPrinter.h"
#include "llvm/MC/MCRegisterInfo.h"
Add the new files to CMakeLists.txt.
NovaISelLowering.cpp NovaISelDAGToDAG.cpp MCTargetDesc/NovaMCInstPrinter.cpp
ASM Printer
The class above is used by the assembly printer to print the instructions.
#include "Nova.h"
#include "NovaSubtarget.h"
#include "NovaTargetInfo.h"
#include "NovaTargetMachine.h"
#include "MCTargetDesc/NovaMCInstPrinter.h"
#include "llvm/CodeGen/AsmPrinter.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCSymbol.h"
#include "llvm/MC/TargetRegistry.h"
#define DEBUG_TYPE "nova-asm-printer"
using namespace llvm;
namespace {
class NovaAsmPrinter : public AsmPrinter {
public:
  NovaAsmPrinter(TargetMachine &TM, std::unique_ptr<MCStreamer> Streamer)
      : AsmPrinter(TM, std::move(Streamer)) {}

  StringRef getPassName() const override { return "Nova Assembly Printer"; }

  void emitInstruction(const MachineInstr *MI) override;

  // Lower the MachineInstr to MCInst
  void lowerInstruction(const MachineInstr &MI, MCInst &Inst);

  // bool lowerPseudoInstExpansion(const MachineInstr *MI, MCInst &Inst);
private:
  MCOperand lowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym);
};
MCOperand NovaAsmPrinter::lowerSymbolOperand(const MachineOperand &MO, MCSymbol *Sym) { auto &Ctx = OutContext; const MCExpr *Expr = MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, Ctx); assert(MO.isMBB() && "Only basic block symbols are supported"); return MCOperand::createExpr(Expr);}
void NovaAsmPrinter::lowerInstruction(const MachineInstr &MI, MCInst &Inst) {
  // This function should convert the MachineInstr to MCInst
  // The implementation will depend on the specific instruction set
  // and how you want to represent it in the MCInst format.
  // For now, we just copy the opcode and the operands.

  Inst.setOpcode(MI.getOpcode());
  for (const auto &Op : MI.operands()) {
    MCOperand MCOp;
    switch (Op.getType()) {
    case MachineOperand::MO_Register:
      MCOp = MCOperand::createReg(Op.getReg());
      break;
    case MachineOperand::MO_Immediate:
      MCOp = MCOperand::createImm(Op.getImm());
      break;
    case MachineOperand::MO_MachineBasicBlock:
      MCOp = lowerSymbolOperand(Op, Op.getMBB()->getSymbol());
      break;
    // Add other operand types as needed
    default:
      llvm_unreachable("Unsupported operand type");
    }
    Inst.addOperand(MCOp);
  }
}
} // end anonymous namespace
void NovaAsmPrinter::emitInstruction(const MachineInstr *MI) {
  // Lower the instruction to MCInst
  MCInst Inst;
  lowerInstruction(*MI, Inst);
  EmitToStreamer(*OutStreamer, Inst);
}
extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeNovaAsmPrinter() { RegisterAsmPrinter<NovaAsmPrinter> X(getTheNovaTarget());}
Add to cmake.
NovaISelDAGToDAG.cpp MCTargetDesc/NovaMCInstPrinter.cpp NovaAsmPrinter.cpp
LINK_COMPONENTS
Compiling
And we are done! We can compile this code to assembly now.
define void @voidTest() {
  ret void
}
Run llc
on the file.
llc -mtriple=mipsnova test.ll -o -
	.text
	.globl	voidTest                # -- Begin function voidTest
	.type	voidTest,@function
voidTest:                               # @voidTest
# %bb.0:
	ret
.Lfunc_end0:
	.size	voidTest, .Lfunc_end0-voidTest
                                        # -- End function
	.section	".note.GNU-stack","",@progbits
Congrats, you just completed your first LLVM backend!