Path: blob/main/cranelift/codegen/src/machinst/abi.rs
//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires it to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...            |
//!                              | caller frames           |
//!                              |          ...            |
//!                              +=========================+
//!                              |          ...            |
//!                              | stack args              |
//! Canonical Frame Address -->  | (accessed via FP)       |
//!                              +-------------------------+
//! SP at function entry ----->  | return address          |
//!                              +-------------------------+
//! FP after prologue -------->  | FP (pushed by prologue) |
//!                              +-------------------------+  -----
//!                              |          ...            |    |
//!                              | clobbered callee-saves  |    |
//! unwind-frame base -------->  | (pushed by prologue)    |    |
//!                              +-------------------------+  -----    |
//!                              |          ...            |    |      |
//!                              | spill slots             |    |      |
//!                              | (accessed via SP)       |  fixed  active
//!                              |          ...            |  frame   size
//!                              | stack slots             | storage   |
//!                              | (accessed via SP)       |  size     |
//!                              | (alloc'd by prologue)   |    |      |
//!                              +-------------------------+  -----    |
//!                              | [alignment as needed]   |           |
//!                              |          ...            |           |
//!                              | args for largest call   |           |
//! SP ----------------------->  | (alloc'd by prologue)   |           |
//!                              +=========================+  -----
//!
//!   (low address)
//! ```
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.

use crate::CodegenError;
use crate::entity::SecondaryMap;
use crate::ir::types::*;
use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTag, Signature};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use alloc::boxed::Box;
use regalloc2::{MachineEnv, PReg, PRegSet};
use rustc_hash::FxHashMap;
use smallvec::smallvec;
use std::collections::HashMap;
use std::marker::PhantomData;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst.
/// The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the arg is returned through; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine.
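The multi-part case above can be illustrated with a minimal sketch (plain integers standing in for the cranelift slot types, which are not used here): a 128-bit value is carried as two word-sized parts on a 64-bit machine, one per `ABIArgSlot`.

```rust
// Hypothetical stand-in for the two-slot case of `ABIArg::Slots`: split a
// 128-bit value into two 64-bit register-sized parts.
fn split_into_parts(v: u128) -> (u64, u64) {
    let lo = v as u64; // low 64 bits: first register part
    let hi = (v >> 64) as u64; // high 64 bits: second register part
    (lo, hi)
}
```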
The184/// number of "parts" should correspond to the number of registers used to store185/// this type according to the machine backend.186///187/// As an invariant, the `purpose` for every part must match. As a further188/// invariant, a `StructArg` part cannot appear with any other part.189#[derive(Clone, Debug)]190pub enum ABIArg {191/// Storage slots (registers or stack locations) for each part of the192/// argument value. The number of slots must equal the number of register193/// parts used to store a value of this type.194Slots {195/// Slots, one per register part.196slots: ABIArgSlotVec,197/// Purpose of this arg.198purpose: ir::ArgumentPurpose,199},200/// Structure argument. We reserve stack space for it, but the CLIF-level201/// semantics are a little weird: the value passed to the call instruction,202/// and received in the corresponding block param, is a *pointer*. On the203/// caller side, we memcpy the data from the passed-in pointer to the stack204/// area; on the callee side, we compute a pointer to this stack area and205/// provide that as the argument's value.206StructArg {207/// Offset of this arg relative to base of stack args.208offset: i64,209/// Size of this arg on the stack.210size: u64,211/// Purpose of this arg.212purpose: ir::ArgumentPurpose,213},214/// Implicit argument. Similar to a StructArg, except that we have the215/// target type, not a pointer type, at the CLIF-level. 
This argument is216/// still being passed via reference implicitly.217ImplicitPtrArg {218/// Register or stack slot holding a pointer to the buffer.219pointer: ABIArgSlot,220/// Offset of the argument buffer.221offset: i64,222/// Type of the implicit argument.223ty: Type,224/// Purpose of this arg.225purpose: ir::ArgumentPurpose,226},227}228229impl ABIArg {230/// Create an ABIArg from one register.231pub fn reg(232reg: RealReg,233ty: ir::Type,234extension: ir::ArgumentExtension,235purpose: ir::ArgumentPurpose,236) -> ABIArg {237ABIArg::Slots {238slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],239purpose,240}241}242243/// Create an ABIArg from one stack slot.244pub fn stack(245offset: i64,246ty: ir::Type,247extension: ir::ArgumentExtension,248purpose: ir::ArgumentPurpose,249) -> ABIArg {250ABIArg::Slots {251slots: smallvec![ABIArgSlot::Stack {252offset,253ty,254extension,255}],256purpose,257}258}259}260261/// Are we computing information about arguments or return values? Much of the262/// handling is factored out into common routines; this enum allows us to263/// distinguish which case we're handling.264#[derive(Clone, Copy, Debug, PartialEq, Eq)]265pub enum ArgsOrRets {266/// Arguments.267Args,268/// Return values.269Rets,270}271272/// Abstract location for a machine-specific ABI impl to translate into the273/// appropriate addressing mode.274#[derive(Clone, Copy, Debug, PartialEq, Eq)]275pub enum StackAMode {276/// Offset into the current frame's argument area.277IncomingArg(i64, u32),278/// Offset within the stack slots in the current frame.279Slot(i64),280/// Offset into the callee frame's argument area.281OutgoingArg(i64),282}283284impl StackAMode {285fn offset_by(&self, offset: u32) -> Self {286match self {287StackAMode::IncomingArg(off, size) => {288StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)289}290StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),291StackAMode::OutgoingArg(off) => 
            {
                StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
            }
        }
    }
}

/// Trait implemented by machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack.
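The overflow concern behind this limit can be sketched with the same `checked_add` pattern that `StackAMode::offset_by` uses above (plain integers here, not the real frame types): `checked_add` yields `None` instead of wrapping, so a miscomputed offset panics loudly rather than silently corrupting the frame layout.

```rust
// Sketch of the checked-offset pattern: add a u32 delta to an i64 frame
// offset, panicking on overflow instead of wrapping around.
fn offset_by(off: i64, delta: u32) -> i64 {
    off.checked_add(i64::from(delta)).unwrap()
}
```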
    /// We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word, that is, 32/64 for a 32/64-bit
    /// architecture.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns the word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns the word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns the required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to
    /// registers and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack space used (rounded up as alignment requires), and,
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits:
            u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`.
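The check that `gen_stack_lower_bound_trap` and the limit-register machinery combine to implement can be sketched in plain arithmetic (hypothetical function, no real registers): compute the prospective SP and trap if it falls below the limit, since the stack grows downward.

```rust
// Arithmetic stand-in for the stack-limit check: the new SP must stay at or
// above the limit register's value, or we report stack overflow.
fn check_stack(sp: u64, frame_size: u64, limit: u64) -> Result<u64, &'static str> {
    match sp.checked_sub(frame_size) {
        Some(new_sp) if new_sp >= limit => Ok(new_sp),
        _ => Err("StackOverflow"),
    }
}
```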
    /// This is relevant when an addition with a large
    /// immediate needs its own temporary; it cannot use the same fixed
    /// temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame. The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        function_calls: FunctionCalls,
        incoming_args_size: u32,
        tail_args_size: u32,
        stackslots_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack
    /// call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a memcpy invocation. Used to set up struct
    /// args.
    /// Takes `src`, `dst` as read-only inputs and passes a temporary
    /// allocator.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(
        call_conv_of_callee: isa::CallConv,
        is_exception: bool,
    ) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;

    /// Get a temporary register that is available to use after a call
    /// completes and that does not interfere with register-carried
    /// return values.
    /// This is used to move stack-carried return
    /// values directly into spillslots if needed.
    fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;

    /// Get the exception payload registers, if any, for a calling
    /// convention.
    ///
    /// Note that the argument here is the calling convention of the *callee*.
    /// This might differ from the caller but the exceptional payloads that are
    /// available are defined by the callee, not the caller.
    fn exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg] {
        let _ = callee_conv;
        &[]
    }
}

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
    /// Information for a try-call, if this is one.
    /// We combine
    /// handling of calls and try-calls as much as possible to share
    /// argument/return logic; they mostly differ in the metadata that
    /// they emit, which this information feeds into.
    pub try_call_info: Option<TryCallInfo>,
}

/// Out-of-line information present on `try_call` instructions only:
/// information that is used to generate exception-handling tables and
/// link up to destination blocks properly.
#[derive(Clone, Debug)]
pub struct TryCallInfo {
    /// The target to jump to on a normal return.
    pub continuation: MachLabel,
    /// Exception tags to catch and corresponding destination labels.
    pub exception_handlers: Box<[TryCallHandler]>,
}

/// Information about an individual handler at a try-call site.
#[derive(Clone, Debug)]
pub enum TryCallHandler {
    /// If the tag matches (given the current context), recover at the
    /// label.
    Tag(ExceptionTag, MachLabel),
    /// Recover at the label unconditionally.
    Default(MachLabel),
    /// Set the dynamic context for interpreting tags at this point in
    /// the handler list.
    Context(Reg),
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc with the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
            try_call_info: None,
        }
    }
}

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Currently both return values and arguments are stored in a contiguous
    /// vector in
    /// `SigSet::abi_args`.
    ///
    /// ```plain
    ///              +----------------------------------------------+
    ///              | return values                                |
    ///              |     ...                                      |
    /// rets_end --> +----------------------------------------------+
    ///              | arguments                                    |
    ///              |     ...                                      |
    /// args_end --> +----------------------------------------------+
    /// ```
    ///
    /// Note we only store two offsets, as rets_end == args_start and
    /// rets_start == prev.args_end.
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets are
    /// relative to SP on entry to function.
    ///
    /// This is an index into `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the
    /// return-area pointer.
    ///
    /// This is an index into `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values.
    /// We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> u32 {
        self.sized_stack_arg_space
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> u32 {
        self.sized_stack_ret_space
    }

    /// Get the calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls).
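The interning scheme this deduplication relies on can be sketched with a plain `HashMap` (hypothetical miniature; `String` keys stand in for `ir::Signature`): the first lookup for a key allocates an id, and later lookups for an equal key return the same id.

```rust
use std::collections::HashMap;

// Miniature interner: equal keys share one id, as `SigSet` shares one `Sig`
// per distinct `ir::Signature`.
struct Interner {
    table: HashMap<String, u32>,
}

impl Interner {
    fn new() -> Self {
        Interner { table: HashMap::new() }
    }

    fn intern(&mut self, key: &str) -> u32 {
        if let Some(&id) = self.table.get(key) {
            return id; // already interned: reuse the id
        }
        let id = self.table.len() as u32; // next fresh id
        self.table.insert(key.to_string(), id);
        id
    }
}
```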
/// Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
/// hash map itself but params and returns vecs in each signature) that we want
/// to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given
    /// `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig
    {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
    }

    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of the args (rets -> args) inserted to
        // compute the offsets in `SigSet::args()` and `SigSet::rets()`.
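The offset scheme that this ordering preserves can be sketched standalone (a hypothetical miniature of the `SigSet::args`/`SigSet::rets` slicing): each signature records only `rets_end` and `args_end`, and its rets begin where the previous signature's args ended.

```rust
// Miniature of the shared `abi_args` allocation: per-signature end offsets
// into one vector, with rets stored first, then args.
struct MiniSig {
    rets_end: usize,
    args_end: usize,
}

// Rets of signature `i` span from the previous signature's args_end (or 0)
// up to this signature's rets_end.
fn rets_range(sigs: &[MiniSig], i: usize) -> std::ops::Range<usize> {
    let start = if i == 0 { 0 } else { sigs[i - 1].args_end };
    start..sigs[i].rets_end
}

// Args of signature `i` span from its rets_end up to its args_end.
fn args_range(sigs: &[MiniSig], i: usize) -> std::ops::Range<usize> {
    sigs[i].rets_end..sigs[i].args_end
}
```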
        // Therefore, the order of the two `compute_arg_locs` calls below
        // must not change.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            need_stack_return_area,
        );

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store the offsets.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end =
            usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store the offsets.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl std::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// Word size in bytes, so this struct can be
    /// monomorphic/independent of `ABIMachineSpec`.
    pub word_bytes: u32,

    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below. Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack. This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it. Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer. This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from `gen_prologue_frame_setup`. In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area. The same holds at the start of code emitted
    /// by `gen_epilogue_frame_restore`.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from `gen_clobber_save` and
    /// `gen_clobber_restore`.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// The size of all stackslots.
    pub stackslots_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise.
    /// After `gen_clobber_save` and
    /// before `gen_clobber_restore`, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI. These registers will be saved and
    /// restored by `gen_clobber_save` and `gen_clobber_restore`.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,

    /// The function's call pattern classification.
    pub function_calls: FunctionCalls,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }

    /// The size of the frame from FP to SP while the frame is active
    /// (not during prologue setup or epilogue tear-down).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }

    /// Get the offset of a spill slot from SP.
    pub fn spillslot_offset(&self, spillslot: SpillSlot) -> i64 {
        // Offset from the beginning of the spillslot area.
        let islot = spillslot.index() as i64;
        let spill_off = islot * self.word_bytes as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;

        sp_off
    }

    /// Get the offset from SP up to FP.
    pub fn sp_to_fp(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size +
            self.clobber_size
    }
}

/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially, the number of bytes originating in the caller's frame where stack arguments will
    /// live. After lowering, this number may be larger than the size expected by the function being
    /// compiled, as tail calls potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present.
    /// Also note that because the instructions here execute in the
    /// prologue, this happens after legalization/register allocation/etc., so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully use only caller-saved
    /// registers, keeping nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

/// Round `val` up to the alignment implied by `mask` (`align - 1` for a
/// power-of-two alignment), returning `None` on overflow.
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::Cold
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
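The rounding helper used for stack slot layout can be exercised on its own; this standalone snippet redefines `checked_round_up` so it runs by itself, and is an illustration rather than part of the backend:

```rust
// Standalone check of the rounding helper used for stack slot layout:
// `mask` must be `align - 1` for a power-of-two `align`, and an offset
// near u32::MAX yields `None` instead of wrapping. (Redefined here so
// the snippet runs on its own.)
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

fn main() {
    assert_eq!(checked_round_up(0, 7), Some(0)); // already 8-byte aligned
    assert_eq!(checked_round_up(1, 7), Some(8)); // round up to next multiple of 8
    assert_eq!(checked_round_up(9, 15), Some(16)); // 16-byte alignment
    assert_eq!(checked_round_up(u32::MAX - 2, 7), None); // would overflow
    println!("ok");
}
```

The overflow case is exactly what the `ok_or(CodegenError::ImplLimitExceeded)` callers below guard against.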
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = std::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            // The end offset is the start offset increased by the size.
            end_offset = start_offset
                .checked_add(data.size)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
            sized_stackslots.push(start_offset);
        }

        // Compute dynamic stackslot locations and total stackslot size.
        let mut dynamic_stackslots = PrimaryMap::new();
        for (stackslot, data) in f.dynamic_stack_slots.iter() {
            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());

            // This computation is similar to the sized stackslots above.
            let unaligned_start_offset = end_offset;

            let mask = M::word_bytes() - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
            })?;

            end_offset = start_offset
                .checked_add(isa.dynamic_vector_bytes(ty))
                .ok_or(CodegenError::ImplLimitExceeded)?;

            dynamic_stackslots.push(start_offset);
        }

        // The total size of the stackslots needs to be word-aligned.
        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
            .ok_or(CodegenError::ImplLimitExceeded)?;

        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
        for (dyn_ty, _data) in
            f.dfg.dynamic_types.iter() {
            let ty = f
                .get_concrete_dynamic_ty(dyn_ty)
                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
            let size = isa.dynamic_vector_bytes(ty);
            dynamic_type_sizes.insert(ty, size);
        }

        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit = f
            .stack_limit
            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));

        let tail_args_size = sigs[sig].sized_stack_arg_space;

        Ok(Self {
            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
            sig,
            dynamic_stackslots,
            dynamic_type_sizes,
            sized_stackslots,
            stackslots_size,
            outgoing_args_size: 0,
            tail_args_size,
            reg_args: vec![],
            frame_layout: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            isa_flags: isa_flags.clone(),
            stack_limit,
            _mach: PhantomData,
        })
    }

    /// Inserts instructions necessary for checking the stack limit into the
    /// prologue.
    ///
    /// This function will generate the instructions necessary to perform a stack
    /// check at the header of a function. The stack check is intended to trap
    /// if the stack pointer goes below a particular threshold, preventing stack
    /// overflow in wasm or other code. The `stack_limit` argument here is the
    /// register which holds the threshold below which we're supposed to trap.
    /// The function being compiled is known to allocate `stack_size` bytes, and
    /// we'll push instructions onto `insts`.
    ///
    /// Note that the instructions generated here are special because this is
    /// happening so late in the pipeline (e.g. after register allocation). This
    /// means that we need to do manual register allocation here and also be
    /// careful to not clobber any callee-saved or argument registers.
    /// For now
    /// this routine makes do with the `spilltmp_reg` as one temporary
    /// register, plus a second temporary, `tmp2`, which is caller-saved. This
    /// should be fine for us since no spills should happen in this sequence of
    /// instructions, so our register won't get accidentally clobbered.
    ///
    /// No values can be live after the prologue, but in this case that's ok
    /// because we just need to perform a stack check before progressing with
    /// the rest of the function.
    fn insert_stack_check(
        &self,
        stack_limit: Reg,
        stack_size: u32,
        insts: &mut SmallInstVec<M::I>,
    ) {
        // With no explicit stack allocated we can just emit the simple check of
        // the stack registers against the stack limit register, and trap if
        // it's out of bounds.
        if stack_size == 0 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
            return;
        }

        // Note that the 32k stack size here is pretty special. See the
        // documentation in x86/abi.rs for why this is here. The general idea is
        // that we're protecting against overflow in the addition that happens
        // below.
        if stack_size >= 32 * 1024 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
        }

        // Add the `stack_size` to `stack_limit`, placing the result in
        // `scratch`.
        //
        // Note though that `stack_limit`'s register may be the same as
        // `scratch`.
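The check being assembled here can be modeled in plain integer arithmetic. This is a hypothetical model (the `would_overflow_stack` helper is not part of the backend, and it uses `checked_add` where the emitted code relies on the extra lower-bound trap above to make the addition safe):

```rust
// Model (not real codegen) of the prologue stack check: trap if SP,
// after allocating `stack_size` bytes, would drop below `stack_limit`.
// The emitted code computes `stack_limit + stack_size` into a scratch
// register and compares SP against it; here a wrapped threshold is made
// explicit with `checked_add`.
fn would_overflow_stack(sp: u32, stack_limit: u32, stack_size: u32) -> bool {
    if sp < stack_limit {
        return true; // already out of bounds: trap
    }
    match stack_limit.checked_add(stack_size) {
        Some(threshold) => sp < threshold, // not enough room left for this frame
        None => true,                      // threshold wrapped: treat as overflow
    }
}

fn main() {
    assert!(!would_overflow_stack(1000, 100, 500)); // 1000 >= 100 + 500
    assert!(would_overflow_stack(1000, 100, 950)); // 1000 < 100 + 950
    assert!(would_overflow_stack(50, 100, 0)); // already below the limit
    println!("ok");
}
```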
        // If our stack size doesn't fit into an immediate, this
        // means we need a second scratch register for loading the stack size
        // into a register.
        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
        insts.extend(M::gen_add_imm(
            self.call_conv,
            scratch,
            stack_limit,
            stack_size,
        ));
        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
    }
}

/// Generates the instructions necessary for the `gv` to be materialized into a
/// register.
///
/// This function will return a register that will contain the result of
/// evaluating `gv`. It will also return any instructions necessary to calculate
/// the value of the register.
///
/// Note that global values are typically lowered to instructions via the
/// standard legalization pass. Unfortunately, though, prologue generation happens
/// so late in the pipeline that we can't use these legalization passes to
/// generate the instructions for `gv`. As a result we duplicate some lowering
/// of `gv` here and support only some global values. This is similar to what
/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
/// in the future too!
///
/// Also note that this function will make use of `writable_spilltmp_reg()` as a
/// temporary register to store values in if necessary.
/// Currently, after we write
/// to this register, there are guaranteed to be no spilled values between the
/// write and its uses, because we're not participating in register allocation anyway!
fn gen_stack_limit<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
) -> (Reg, SmallInstVec<M::I>) {
    let mut insts = smallvec![];
    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
    (reg, insts)
}

fn generate_gv<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
    insts: &mut SmallInstVec<M::I>,
) -> Reg {
    match f.global_values[gv] {
        // Return the direct register the vmcontext is in.
        ir::GlobalValueData::VMContext => {
            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
                .expect("no vmcontext parameter found")
        }
        // Load our base value into a register, then load from that register
        // into a temporary register.
        ir::GlobalValueData::Load {
            base,
            offset,
            global_type: _,
            flags: _,
        } => {
            let base = generate_gv::<M>(f, sigs, sig, base, insts);
            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
            insts.push(M::gen_load_base_offset(
                into_reg,
                base,
                offset.into(),
                M::word_type(),
            ));
            into_reg.to_reg()
        }
        ref other => panic!("global value for stack limit not supported: {other}"),
    }
}

/// Returns true if the signature needs to be legalized.
fn missing_struct_return(sig: &ir::Signature) -> bool {
    sig.uses_special_param(ArgumentPurpose::StructReturn)
        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
}

fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
    // Keep in sync with Callee::new
    let mut sig = sig.clone();
    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
        panic!("Explicit StructReturn return value not allowed: \
{sig:?}")
    }
    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
        if !sig.returns.is_empty() {
            panic!("No return values are allowed when using StructReturn: {sig:?}");
        }
        sig.returns.insert(0, sig.params[struct_ret_index]);
    }
    sig
}

/// ### Pre-Regalloc Functions
///
/// These methods of `Callee` may only be called before regalloc.
impl<M: ABIMachineSpec> Callee<M> {
    /// Access the (possibly legalized) signature.
    pub fn signature(&self) -> &ir::Signature {
        debug_assert!(
            !missing_struct_return(&self.ir_sig),
            "`Callee::ir_sig` is always legalized"
        );
        &self.ir_sig
    }

    /// Initialize. This is called after the Callee is constructed because it
    /// may allocate a temp vreg, which can only be allocated once the lowering
    /// context exists.
    pub fn init_retval_area(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> CodegenResult<()> {
        if sigs[self.sig].stack_ret_arg.is_some() {
            let ret_area_ptr = vregs.alloc(M::word_type())?;
            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
        }
        Ok(())
    }

    /// Get the return area pointer register, if any.
    pub fn ret_area_ptr(&self) -> Option<Reg> {
        self.ret_area_ptr
    }

    /// Accumulate outgoing arguments.
    ///
    /// This ensures that at least `size` bytes are allocated in the prologue to
    /// be available for use in function calls to hold arguments and/or return
    /// values.
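The accumulate-and-take-the-maximum behavior of this method can be sketched with a toy type (the `OutgoingArea` struct is hypothetical, not part of the backend):

```rust
// Toy model of the accumulate-* methods: each call site reports the
// stack space it needs, and the frame keeps the running maximum so the
// prologue can allocate enough for the largest single call.
#[derive(Default)]
struct OutgoingArea {
    size: u32,
}

impl OutgoingArea {
    fn accumulate(&mut self, size: u32) {
        if size > self.size {
            self.size = size;
        }
    }
}

fn main() {
    let mut area = OutgoingArea::default();
    area.accumulate(16); // call site A
    area.accumulate(64); // call site B needs the most space
    area.accumulate(32); // call site C
    assert_eq!(area.size, 64);
    println!("ok");
}
```

Because only the maximum survives, the outgoing-argument area is shared by all calls in the function rather than being pushed and popped around each one.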
    /// If this function is called multiple times, the maximum of all
    /// `size` values will be available.
    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
        if size > self.outgoing_args_size {
            self.outgoing_args_size = size;
        }
    }

    /// Accumulate the incoming argument area size requirements for a tail call,
    /// as it could be larger than the incoming arguments of the function
    /// currently being compiled.
    pub fn accumulate_tail_args_size(&mut self, size: u32) {
        if size > self.tail_args_size {
            self.tail_args_size = size;
        }
    }

    /// Whether forward-edge control-flow integrity is enabled for this
    /// function.
    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
        self.isa_flags.is_forward_edge_cfi_enabled()
    }

    /// Get the calling convention implemented by this ABI object.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    pub fn machine_env(&self) -> &MachineEnv {
        M::get_machine_env(&self.flags, self.call_conv)
    }

    /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
        &self.sized_stackslots
    }

    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
        &self.dynamic_stackslots
    }

    /// Generate an instruction which copies an argument to a destination
    /// register.
    pub fn gen_copy_arg_to_regs(
        &mut self,
        sigs: &SigSet,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> SmallInstVec<M::I> {
        let mut insts = smallvec![];
        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
            match slot {
                &ABIArgSlot::Reg { reg, .. } => {
                    // Add a preg -> def pair to the eventual `args`
                    // instruction.
                    // Extension mode doesn't matter
                    // (we're copying out, not in; we ignore high bits
                    // by convention).
                    let arg = ArgPair {
                        vreg: *into_reg,
                        preg: reg.into(),
                    };
                    self.reg_args.push(arg);
                }
                &ABIArgSlot::Stack {
                    offset,
                    ty,
                    extension,
                    ..
                } => {
                    // However, we have to respect the extension mode for stack
                    // slots, or else we grab the wrong bytes on big-endian.
                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                    let ty =
                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
                            M::word_type()
                        } else {
                            ty
                        };
                    insts.push(M::gen_load_stack(
                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                        *into_reg,
                        ty,
                    ));
                }
            }
        };

        match &sigs.args(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    copy_arg_slot_to_reg(&slot, &into_reg);
                }
            }
            &ABIArg::StructArg { offset, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // Buffer address is implicitly defined by the ABI.
                insts.push(M::gen_get_stack_addr(
                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                    into_reg,
                ));
            }
            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // We need to dereference the pointer.
                let base = match &pointer {
                    &ABIArgSlot::Reg { reg, ty, .. } => {
                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
                        self.reg_args.push(ArgPair {
                            vreg: Writable::from_reg(tmp),
                            preg: reg.into(),
                        });
                        tmp
                    }
                    &ABIArgSlot::Stack { offset, ty, ..
                    } => {
                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
                            .only_reg()
                            .unwrap();
                        insts.push(M::gen_load_stack(
                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                            addr_reg,
                            ty,
                        ));
                        addr_reg.to_reg()
                    }
                };
                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
            }
        }
        insts
    }

    /// Generate instructions which copy a source register to a return value slot.
    pub fn gen_copy_regs_to_retval(
        &self,
        sigs: &SigSet,
        idx: usize,
        from_regs: ValueRegs<Reg>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
        let mut reg_pairs = smallvec![];
        let mut ret = smallvec![];
        let word_bits = M::word_bits() as u8;
        match &sigs.rets(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(from_regs.len(), slots.len());
                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                    match slot {
                        &ABIArgSlot::Reg {
                            reg, ty, extension, ..
                        } => {
                            let from_bits = ty_bits(ty) as u8;
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            let vreg = match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    dst.to_reg()
                                }
                                _ => {
                                    // No move needed; regalloc2 will emit it using the constraint
                                    // added by the RetPair.
                                    from_reg
                                }
                            };
                            reg_pairs.push(RetPair {
                                vreg,
                                preg: Reg::from(reg),
                            });
                        }
                        &ABIArgSlot::Stack {
                            offset,
                            ty,
                            extension,
                            ..
                        } => {
                            let mut ty = ty;
                            let from_bits = ty_bits(ty) as u8;
                            // A machine ABI implementation should ensure that stack frames
                            // have "reasonable" size.
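The `Uext`/`Sext` widening that `gen_extend` performs on narrow values can be illustrated with a standalone helper (the `extend_to_word` function is a hypothetical model for a 64-bit word, not backend code):

```rust
// Illustration of what widening a narrow value to the machine word
// means: Uext zero-fills the high bits, Sext replicates the sign bit.
fn extend_to_word(value: u64, from_bits: u8, signed: bool) -> u64 {
    // Shift the payload to the top of the word, then shift back down,
    // arithmetically for sign extension and logically for zero extension.
    let shift = 64 - u32::from(from_bits);
    if signed {
        (((value << shift) as i64) >> shift) as u64
    } else {
        (value << shift) >> shift
    }
}

fn main() {
    // 0xFF as an unsigned 8-bit value stays 0xFF...
    assert_eq!(extend_to_word(0xFF, 8, false), 0xFF);
    // ...but as a signed 8-bit value it is -1, so all high bits are set.
    assert_eq!(extend_to_word(0xFF, 8, true), u64::MAX);
    // A positive signed value is unchanged.
    assert_eq!(extend_to_word(0x7F, 8, true), 0x7F);
    println!("ok");
}
```

This is why the extension mode must be respected for stack slots on big-endian targets: the stored word's high bytes are meaningful, and zero- vs sign-filling them changes which bytes a narrow load would pick up.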
                            // All current ABIs for machinst
                            // backends (aarch64 and x64) enforce a 128MB limit.
                            let off = i32::try_from(offset).expect(
                                "Argument stack offset greater than 2GB; should hit impl limit first",
                            );
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            // Trash the from_reg; it should be its last use.
                            match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    assert_eq!(M::word_reg_class(), from_reg.class());
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    // Store the extended version.
                                    ty = M::word_type();
                                }
                                _ => {}
                            };
                            ret.push(M::gen_store_base_offset(
                                self.ret_area_ptr.unwrap(),
                                off,
                                from_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            ABIArg::StructArg { .. } => {
                panic!("StructArg in return position is unsupported");
            }
            ABIArg::ImplicitPtrArg { .. } => {
                panic!("ImplicitPtrArg in return position is unsupported");
            }
        }
        (reg_pairs, ret)
    }

    /// Generate any setup instruction needed to save values to the
    /// return-value area.
    /// This is usually used when there are multiple return
    /// values or an otherwise large return value that must be passed on the
    /// stack; typically the ABI specifies an extra hidden argument that is a
    /// pointer to that memory.
    pub fn gen_retval_area_setup(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> Option<M::I> {
        if let Some(i) = sigs[self.sig].stack_ret_arg {
            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
            let insts =
                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
            insts.into_iter().next().map(|inst| {
                trace!(
                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
                    inst,
                    ret_area_ptr.to_reg()
                );
                inst
            })
        } else {
            trace!("gen_retval_area_setup: not needed");
            None
        }
    }

    /// Generate a return instruction.
    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
        M::gen_rets(rets)
    }

    /// Set up argument values `args` for a call with signature `sig`.
    /// This will return a series of instructions to be emitted to set
    /// up all arguments, as well as a `CallArgList` list representing
    /// the arguments passed in registers.
The latter need to be added1780/// as constraints to the actual call instruction.1781pub fn gen_call_args(1782&self,1783sigs: &SigSet,1784sig: Sig,1785args: &[ValueRegs<Reg>],1786is_tail_call: bool,1787flags: &settings::Flags,1788vregs: &mut VRegAllocator<M::I>,1789) -> (CallArgList, SmallInstVec<M::I>) {1790let mut uses: CallArgList = smallvec![];1791let mut insts = smallvec![];17921793assert_eq!(args.len(), sigs.num_args(sig));17941795let call_conv = sigs[sig].call_conv;1796let stack_arg_space = sigs[sig].sized_stack_arg_space;1797let stack_arg = |offset| {1798if is_tail_call {1799StackAMode::IncomingArg(offset, stack_arg_space)1800} else {1801StackAMode::OutgoingArg(offset)1802}1803};18041805let word_ty = M::word_type();1806let word_rc = M::word_reg_class();1807let word_bits = M::word_bits() as usize;18081809if is_tail_call {1810debug_assert_eq!(1811self.call_conv,1812isa::CallConv::Tail,1813"Can only do `return_call`s from within a `tail` calling convention function"1814);1815}18161817// Helper to process a single argument slot (register or stack slot).1818// This will either add the register to the `uses` list or write the1819// value to the stack slot in the outgoing argument area (or for tail1820// calls, the incoming argument area).1821let mut process_arg_slot = |insts: &mut SmallInstVec<M::I>, slot, vreg, ty| {1822match &slot {1823&ABIArgSlot::Reg { reg, .. } => {1824uses.push(CallArgPair {1825vreg,1826preg: reg.into(),1827});1828}1829&ABIArgSlot::Stack { offset, .. } => {1830insts.push(M::gen_store_stack(stack_arg(offset), vreg, ty));1831}1832};1833};18341835// First pass: Handle `StructArg` arguments. These need to be copied1836// into their associated stack buffers. This should happen before any1837// of the other arguments are processed, as the `memcpy` call might1838// clobber registers used by other arguments.1839for (idx, from_regs) in args.iter().enumerate() {1840match &sigs.args(sig)[idx] {1841&ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. 
} => {}1842&ABIArg::StructArg { offset, size, .. } => {1843let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();1844insts.push(M::gen_get_stack_addr(1845stack_arg(offset),1846Writable::from_reg(tmp),1847));1848insts.extend(M::gen_memcpy(1849isa::CallConv::for_libcall(flags, call_conv),1850tmp,1851from_regs.only_reg().unwrap(),1852size as usize,1853|ty| {1854Writable::from_reg(1855vregs.alloc_with_deferred_error(ty).only_reg().unwrap(),1856)1857},1858));1859}1860}1861}18621863// Second pass: Handle everything except `StructArg` arguments.1864for (idx, from_regs) in args.iter().enumerate() {1865match sigs.args(sig)[idx] {1866ABIArg::Slots { ref slots, .. } => {1867assert_eq!(from_regs.len(), slots.len());1868for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {1869// Load argument slot value from `from_reg`, and perform any zero-1870// or sign-extension that is required by the ABI.1871let (ty, extension) = match *slot {1872ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),1873ABIArgSlot::Stack { ty, extension, .. 
} => (ty, extension),1874};1875let ext = M::get_ext_mode(call_conv, extension);1876let (vreg, ty) = if ext != ir::ArgumentExtension::None1877&& ty_bits(ty) < word_bits1878{1879assert_eq!(word_rc, from_reg.class());1880let signed = match ext {1881ir::ArgumentExtension::Uext => false,1882ir::ArgumentExtension::Sext => true,1883_ => unreachable!(),1884};1885let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();1886insts.push(M::gen_extend(1887Writable::from_reg(tmp),1888*from_reg,1889signed,1890ty_bits(ty) as u8,1891word_bits as u8,1892));1893(tmp, word_ty)1894} else {1895(*from_reg, ty)1896};1897process_arg_slot(&mut insts, *slot, vreg, ty);1898}1899}1900ABIArg::ImplicitPtrArg {1901offset,1902pointer,1903ty,1904..1905} => {1906let vreg = from_regs.only_reg().unwrap();1907let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();1908insts.push(M::gen_get_stack_addr(1909stack_arg(offset),1910Writable::from_reg(tmp),1911));1912insts.push(M::gen_store_base_offset(tmp, 0, vreg, ty));1913process_arg_slot(&mut insts, pointer, tmp, word_ty);1914}1915ABIArg::StructArg { .. } => {}1916}1917}19181919// Finally, set the stack-return pointer to the return argument area.1920// For tail calls, this means forwarding the incoming stack-return pointer.1921if let Some(ret_arg) = sigs.get_ret_arg(sig) {1922let ret_area = if is_tail_call {1923self.ret_area_ptr.expect(1924"if the tail callee has a return pointer, then the tail caller must as well",1925)1926} else {1927let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();1928let amode = StackAMode::OutgoingArg(stack_arg_space.into());1929insts.push(M::gen_get_stack_addr(amode, Writable::from_reg(tmp)));1930tmp1931};1932match ret_arg {1933// The return pointer must occupy a single slot.1934ABIArg::Slots { slots, .. 
} => {1935assert_eq!(slots.len(), 1);1936process_arg_slot(&mut insts, slots[0], ret_area, word_ty);1937}1938_ => unreachable!(),1939}1940}19411942(uses, insts)1943}19441945/// Set up return values `outputs` for a call with signature `sig`.1946/// This does not emit (or return) any instructions, but returns a1947/// `CallRetList` representing the return value constraints. This1948/// needs to be added to the actual call instruction.1949///1950/// If `try_call_payloads` is non-zero, it is expected to hold1951/// exception payload registers for try_call instructions. These1952/// will be added as needed to the `CallRetList` as well.1953pub fn gen_call_rets(1954&self,1955sigs: &SigSet,1956sig: Sig,1957outputs: &[ValueRegs<Reg>],1958try_call_payloads: Option<&[Writable<Reg>]>,1959vregs: &mut VRegAllocator<M::I>,1960) -> CallRetList {1961let callee_conv = sigs[sig].call_conv;1962let stack_arg_space = sigs[sig].sized_stack_arg_space;19631964let word_ty = M::word_type();1965let word_bits = M::word_bits() as usize;19661967let mut defs: CallRetList = smallvec![];1968let mut outputs = outputs.into_iter();1969let num_rets = sigs.num_rets(sig);1970for idx in 0..num_rets {1971let ret = sigs.rets(sig)[idx].clone();1972match ret {1973ABIArg::Slots {1974ref slots, purpose, ..1975} => {1976// We do not use the returned copy of the return buffer pointer,1977// so skip any StructReturn returns that may be present.1978if purpose == ArgumentPurpose::StructReturn {1979continue;1980}1981let retval_regs = outputs.next().unwrap();1982assert_eq!(retval_regs.len(), slots.len());1983for (slot, retval_reg) in slots.iter().zip(retval_regs.regs().iter()) {1984// We do not perform any extension because we're copying out, not in,1985// and we ignore high bits in our own registers by convention. 
However,1986// we still need to use the proper extended type to access stack slots1987// (this is critical on big-endian systems).1988let (ty, extension) = match *slot {1989ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),1990ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),1991};1992let ext = M::get_ext_mode(callee_conv, extension);1993let ty = if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {1994word_ty1995} else {1996ty1997};19981999match slot {2000&ABIArgSlot::Reg { reg, .. } => {2001defs.push(CallRetPair {2002vreg: Writable::from_reg(*retval_reg),2003location: RetLocation::Reg(reg.into(), ty),2004});2005}2006&ABIArgSlot::Stack { offset, .. } => {2007let amode =2008StackAMode::OutgoingArg(offset + i64::from(stack_arg_space));2009defs.push(CallRetPair {2010vreg: Writable::from_reg(*retval_reg),2011location: RetLocation::Stack(amode, ty),2012});2013}2014}2015}2016}2017ABIArg::StructArg { .. } => {2018panic!("StructArg not supported in return position");2019}2020ABIArg::ImplicitPtrArg { .. } => {2021panic!("ImplicitPtrArg not supported in return position");2022}2023}2024}2025assert!(outputs.next().is_none());20262027if let Some(try_call_payloads) = try_call_payloads {2028// Let `M` say where the payload values are going to end up and then2029// double-check it's the same size as the calling convention's2030// reported number of exception types.2031let pregs = M::exception_payload_regs(callee_conv);2032assert_eq!(2033callee_conv.exception_payload_types(M::word_type()).len(),2034pregs.len()2035);20362037// We need to update `defs` to contain the exception2038// payload regs as well. 
We have two sources of info that2039// we join:2040//2041// - The machine-specific ABI implementation `M`, which2042// tells us the particular registers that payload values2043// must be in2044// - The passed-in lowering context, which gives us the2045// vregs we must define.2046//2047// Note that payload values may need to end up in the same2048// physical registers as ordinary return values; this is2049// not a conflict, because we either get one or the2050// other. For regalloc's purposes, we define both starting2051// here at the callsite, but we can share one def in the2052// `defs` list and alias one vreg to another. Thus we2053// handle the two cases below for each payload register:2054// overlaps a return value (and we alias to it) or not2055// (and we add a def).2056for (i, &preg) in pregs.iter().enumerate() {2057let vreg = try_call_payloads[i];2058if let Some(existing) = defs.iter().find(|def| match def.location {2059RetLocation::Reg(r, _) => r == preg,2060_ => false,2061}) {2062vregs.set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());2063} else {2064defs.push(CallRetPair {2065vreg,2066location: RetLocation::Reg(preg, word_ty),2067});2068}2069}2070}20712072defs2073}20742075/// Populate a `CallInfo` for a call with signature `sig`.2076///2077/// `dest` is the target-specific call destination value2078/// `uses` is the `CallArgList` describing argument constraints2079/// `defs` is the `CallRetList` describing return constraints2080/// `try_call_info` describes exception targets for try_call instructions2081///2082/// The clobber list is computed here from the above data.2083pub fn gen_call_info<T>(2084&self,2085sigs: &SigSet,2086sig: Sig,2087dest: T,2088uses: CallArgList,2089defs: CallRetList,2090try_call_info: Option<TryCallInfo>,2091) -> CallInfo<T> {2092let caller_conv = self.call_conv;2093let callee_conv = sigs[sig].call_conv;2094let stack_arg_space = sigs[sig].sized_stack_arg_space;20952096let clobbers = {2097// Get clobbers: all caller-saves. 
These may include return value2098// regs, which we will remove from the clobber set below.2099let mut clobbers =2100<M>::get_regs_clobbered_by_call(callee_conv, try_call_info.is_some());21012102// Remove retval regs from clobbers.2103for def in &defs {2104if let RetLocation::Reg(preg, _) = def.location {2105clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));2106}2107}21082109clobbers2110};21112112// Any adjustment to SP to account for required outgoing arguments/stack return values must2113// be done inside of the call pseudo-op, to ensure that SP is always in a consistent2114// state for all other instructions. For example, if a tail-call abi function is called2115// here, the reclamation of the outgoing argument area must be done inside of the call2116// pseudo-op's emission to ensure that SP is consistent at all other points in the lowered2117// function. (Except the prologue and epilogue, but those are fairly special parts of the2118// function that establish the SP invariants that are relied on elsewhere and are generated2119// after the register allocator has run and thus cannot have register allocator-inserted2120// references to SP offsets.)21212122let callee_pop_size = if callee_conv == isa::CallConv::Tail {2123// The tail calling convention has callees pop stack arguments.2124stack_arg_space2125} else {212602127};21282129CallInfo {2130dest,2131uses,2132defs,2133clobbers,2134callee_conv,2135caller_conv,2136callee_pop_size,2137try_call_info,2138}2139}21402141/// Produce an instruction that computes a sized stackslot address.2142pub fn sized_stackslot_addr(2143&self,2144slot: StackSlot,2145offset: u32,2146into_reg: Writable<Reg>,2147) -> M::I {2148// Offset from beginning of stackslot area.2149let stack_off = self.sized_stackslots[slot] as i64;2150let sp_off: i64 = stack_off + (offset as i64);2151M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)2152}21532154/// Produce an instruction that computes a dynamic stackslot address.2155pub fn 
dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {2156let stack_off = self.dynamic_stackslots[slot] as i64;2157M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)2158}21592160/// Get an `args` pseudo-inst, if any, that should appear at the2161/// very top of the function body prior to regalloc.2162pub fn take_args(&mut self) -> Option<M::I> {2163if self.reg_args.len() > 0 {2164// Very first instruction is an `args` pseudo-inst that2165// establishes live-ranges for in-register arguments and2166// constrains them at the start of the function to the2167// locations defined by the ABI.2168Some(M::gen_args(std::mem::take(&mut self.reg_args)))2169} else {2170None2171}2172}2173}21742175/// ### Post-Regalloc Functions2176///2177/// These methods of `Callee` may only be called after2178/// regalloc.2179impl<M: ABIMachineSpec> Callee<M> {2180/// Compute the final frame layout, post-regalloc.2181///2182/// This must be called before gen_prologue or gen_epilogue.2183pub fn compute_frame_layout(2184&mut self,2185sigs: &SigSet,2186spillslots: usize,2187clobbered: Vec<Writable<RealReg>>,2188function_calls: FunctionCalls,2189) {2190let bytes = M::word_bytes();2191let total_stacksize = self.stackslots_size + bytes * spillslots as u32;2192let mask = M::stack_align(self.call_conv) - 1;2193let total_stacksize = (total_stacksize + mask) & !mask; // 16-align the stack.2194self.frame_layout = Some(M::compute_frame_layout(2195self.call_conv,2196&self.flags,2197self.signature(),2198&clobbered,2199function_calls,2200self.stack_args_size(sigs),2201self.tail_args_size,2202self.stackslots_size,2203total_stacksize,2204self.outgoing_args_size,2205));2206}22072208/// Generate a prologue, post-regalloc.2209///2210/// This should include any stack frame or other setup necessary to use the2211/// other methods (`load_arg`, `store_retval`, and spillslot accesses.)2212pub fn gen_prologue(&self) -> SmallInstVec<M::I> {2213let frame_layout = 
self.frame_layout();2214let mut insts = smallvec![];22152216// Set up frame.2217insts.extend(M::gen_prologue_frame_setup(2218self.call_conv,2219&self.flags,2220&self.isa_flags,2221&frame_layout,2222));22232224// The stack limit check needs to cover all the stack adjustments we2225// might make, up to the next stack limit check in any function we2226// call. Since this happens after frame setup, the current function's2227// setup area needs to be accounted for in the caller's stack limit2228// check, but we need to account for any setup area that our callees2229// might need. Note that s390x may also use the outgoing args area for2230// backtrace support even in leaf functions, so that should be accounted2231// for unconditionally.2232let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)2233+ frame_layout.clobber_size2234+ frame_layout.fixed_frame_storage_size2235+ frame_layout.outgoing_args_size2236+ if frame_layout.function_calls == FunctionCalls::None {223702238} else {2239frame_layout.setup_area_size2240};22412242// Leaf functions with zero stack don't need a stack check if one's2243// specified, otherwise always insert the stack check.2244if total_stacksize > 0 || frame_layout.function_calls != FunctionCalls::None {2245if let Some((reg, stack_limit_load)) = &self.stack_limit {2246insts.extend(stack_limit_load.clone());2247self.insert_stack_check(*reg, total_stacksize, &mut insts);2248}22492250if self.flags.enable_probestack() {2251let guard_size = 1 << self.flags.probestack_size_log2();2252match self.flags.probestack_strategy() {2253ProbestackStrategy::Inline => M::gen_inline_probestack(2254&mut insts,2255self.call_conv,2256total_stacksize,2257guard_size,2258),2259ProbestackStrategy::Outline => {2260if total_stacksize >= guard_size {2261M::gen_probestack(&mut insts, total_stacksize);2262}2263}2264}2265}2266}22672268// Save clobbered 
registers.2269insts.extend(M::gen_clobber_save(2270self.call_conv,2271&self.flags,2272&frame_layout,2273));22742275insts2276}22772278/// Generate an epilogue, post-regalloc.2279///2280/// Note that this must generate the actual return instruction (rather than2281/// emitting this in the lowering logic), because the epilogue code comes2282/// before the return and the two are likely closely related.2283pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {2284let frame_layout = self.frame_layout();2285let mut insts = smallvec![];22862287// Restore clobbered registers.2288insts.extend(M::gen_clobber_restore(2289self.call_conv,2290&self.flags,2291&frame_layout,2292));22932294// Tear down frame.2295insts.extend(M::gen_epilogue_frame_restore(2296self.call_conv,2297&self.flags,2298&self.isa_flags,2299&frame_layout,2300));23012302// And return.2303insts.extend(M::gen_return(2304self.call_conv,2305&self.isa_flags,2306&frame_layout,2307));23082309trace!("Epilogue: {:?}", insts);2310insts2311}23122313/// Return a reference to the computed frame layout information. This2314/// function will panic if it's called before [`Self::compute_frame_layout`].2315pub fn frame_layout(&self) -> &FrameLayout {2316self.frame_layout2317.as_ref()2318.expect("frame layout not computed before prologue generation")2319}23202321/// Returns the full frame size for the given function, after prologue2322/// emission has run. 
This comprises the spill slots and stack-storage2323/// slots as well as storage for clobbered callee-save registers, but2324/// not arguments arguments pushed at callsites within this function,2325/// or other ephemeral pushes.2326pub fn frame_size(&self) -> u32 {2327let frame_layout = self.frame_layout();2328frame_layout.clobber_size + frame_layout.fixed_frame_storage_size2329}23302331/// Returns offset from the slot base in the current frame to the caller's SP.2332pub fn slot_base_to_caller_sp_offset(&self) -> u32 {2333let frame_layout = self.frame_layout();2334frame_layout.clobber_size2335+ frame_layout.fixed_frame_storage_size2336+ frame_layout.setup_area_size2337}23382339/// Returns the size of arguments expected on the stack.2340pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {2341sigs[self.sig].sized_stack_arg_space2342}23432344/// Get the spill-slot size.2345pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {2346let max = if self.dynamic_type_sizes.len() == 0 {2347162348} else {2349*self2350.dynamic_type_sizes2351.iter()2352.max_by(|x, y| x.1.cmp(&y.1))2353.map(|(_k, v)| v)2354.unwrap()2355};2356M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)2357}23582359/// Get the spill slot offset relative to the fixed allocation area start.2360pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {2361self.frame_layout().spillslot_offset(slot)2362}23632364/// Generate a spill.2365pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {2366let ty = M::I::canonical_type_for_rc(from_reg.class());2367debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);23682369let sp_off = self.get_spillslot_offset(to_slot);2370trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");23712372let from = StackAMode::Slot(sp_off);2373<M>::gen_store_stack(from, Reg::from(from_reg), ty)2374}23752376/// Generate a reload (fill).2377pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {2378let ty = 
M::I::canonical_type_for_rc(to_reg.to_reg().class());2379debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);23802381let sp_off = self.get_spillslot_offset(from_slot);2382trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");23832384let from = StackAMode::Slot(sp_off);2385<M>::gen_load_stack(from, to_reg.map(Reg::from), ty)2386}2387}23882389/// An input argument to a call instruction: the vreg that is used,2390/// and the preg it is constrained to (per the ABI).2391#[derive(Clone, Debug)]2392pub struct CallArgPair {2393/// The virtual register to use for the argument.2394pub vreg: Reg,2395/// The real register into which the arg goes.2396pub preg: Reg,2397}23982399/// An output return value from a call instruction: the vreg that is2400/// defined, and the preg or stack location it is constrained to (per2401/// the ABI).2402#[derive(Clone, Debug)]2403pub struct CallRetPair {2404/// The virtual register to define from this return value.2405pub vreg: Writable<Reg>,2406/// The real register from which the return value is read.2407pub location: RetLocation,2408}24092410/// A location to load a return-value from after a call completes.2411#[derive(Clone, Debug, PartialEq, Eq)]2412pub enum RetLocation {2413/// A physical register.2414Reg(Reg, Type),2415/// A stack location, identified by a `StackAMode`.2416Stack(StackAMode, Type),2417}24182419pub type CallArgList = SmallVec<[CallArgPair; 8]>;2420pub type CallRetList = SmallVec<[CallRetPair; 8]>;24212422impl<T> CallInfo<T> {2423/// Emit loads for any stack-carried return values using the call2424/// info and allocations.2425pub fn emit_retval_loads<2426M: ABIMachineSpec,2427EmitFn: FnMut(M::I),2428IslandFn: Fn(u32) -> Option<M::I>,2429>(2430&self,2431stackslots_size: u32,2432mut emit: EmitFn,2433emit_island: IslandFn,2434) {2435// Count stack-ret locations and emit an island to account for2436// this space usage.2437let mut space_needed = 0;2438for CallRetPair { location, .. 
} in &self.defs {2439if let RetLocation::Stack(..) = location {2440// Assume up to ten instructions, semi-arbitrarily:2441// load from stack, store to spillslot, codegen of2442// large offsets on RISC ISAs.2443space_needed += 10 * M::I::worst_case_size();2444}2445}2446if space_needed > 0 {2447if let Some(island_inst) = emit_island(space_needed) {2448emit(island_inst);2449}2450}24512452let temp = M::retval_temp_reg(self.callee_conv);2453// The temporary must be noted as clobbered.2454debug_assert!(2455M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())2456.contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))2457);24582459for CallRetPair { vreg, location } in &self.defs {2460match location {2461RetLocation::Reg(preg, ..) => {2462// The temporary must not also be an actual return2463// value register.2464debug_assert!(*preg != temp.to_reg());2465}2466RetLocation::Stack(amode, ty) => {2467if let Some(spillslot) = vreg.to_reg().to_spillslot() {2468// `temp` is an integer register of machine word2469// width, but `ty` may be floating-point/vector,2470// which (i) may not be loadable directly into an2471// int reg, and (ii) may be wider than a machine2472// word. For simplicity, and because there are not2473// always easy choices for volatile float/vec regs2474// (see e.g. 
x86-64, where fastcall clobbers only2475// xmm0-xmm5, but tail uses xmm0-xmm7 for2476// returns), we use the integer temp register in2477// steps.2478let parts = (ty.bytes() + M::word_bytes() - 1) / M::word_bytes();2479let one_part_load_ty =2480Type::int_with_byte_size(M::word_bytes().min(ty.bytes()) as u16)2481.unwrap();2482for part in 0..parts {2483emit(M::gen_load_stack(2484amode.offset_by(part * M::word_bytes()),2485temp,2486one_part_load_ty,2487));2488emit(M::gen_store_stack(2489StackAMode::Slot(2490i64::from(stackslots_size)2491+ i64::from(M::word_bytes())2492* ((spillslot.index() as i64) + (part as i64)),2493),2494temp.to_reg(),2495M::word_type(),2496));2497}2498} else {2499assert_ne!(*vreg, temp);2500emit(M::gen_load_stack(*amode, *vreg, *ty));2501}2502}2503}2504}2505}2506}25072508impl TryCallInfo {2509pub(crate) fn exception_handlers(2510&self,2511layout: &FrameLayout,2512) -> impl Iterator<Item = MachExceptionHandler> {2513self.exception_handlers.iter().map(|handler| match handler {2514TryCallHandler::Tag(tag, label) => MachExceptionHandler::Tag(*tag, *label),2515TryCallHandler::Default(label) => MachExceptionHandler::Default(*label),2516TryCallHandler::Context(reg) => {2517let loc = if let Some(spillslot) = reg.to_spillslot() {2518// The spillslot offset is relative to the "fixed2519// storage area", which comes after outgoing args.2520let offset = layout.spillslot_offset(spillslot) + i64::from(layout.outgoing_args_size);2521ExceptionContextLoc::SPOffset(u32::try_from(offset).expect("SP offset cannot be negative or larger than 4GiB"))2522} else if let Some(realreg) = reg.to_real_reg() {2523ExceptionContextLoc::GPR(realreg.hw_enc())2524} else {2525panic!("Virtual register present in try-call handler clause after register allocation");2526};2527MachExceptionHandler::Context(loc)2528}2529})2530}25312532pub(crate) fn pretty_print_dests(&self) -> String {2533self.exception_handlers2534.iter()2535.map(|handler| match handler {2536TryCallHandler::Tag(tag, 
label) => format!("{tag:?}: {label:?}"),2537TryCallHandler::Default(label) => format!("default: {label:?}"),2538TryCallHandler::Context(loc) => format!("context {loc:?}"),2539})2540.collect::<Vec<_>>()2541.join(", ")2542}25432544pub(crate) fn collect_operands(&mut self, collector: &mut impl OperandVisitor) {2545for handler in &mut self.exception_handlers {2546match handler {2547TryCallHandler::Context(ctx) => {2548collector.any_late_use(ctx);2549}2550TryCallHandler::Tag(_, _) | TryCallHandler::Default(_) => {}2551}2552}2553}2554}25552556#[cfg(test)]2557mod tests {2558use super::SigData;25592560#[test]2561fn sig_data_size() {2562// The size of `SigData` is performance sensitive, so make sure2563// we don't regress it unintentionally.2564assert_eq!(std::mem::size_of::<SigData>(), 24);2565}2566}256725682569