// cranelift/codegen/src/machinst/abi.rs
//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires it to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...              |
//!                              | caller frames             |
//!                              |          ...              |
//!                              +===========================+
//!                              |          ...              |
//!                              | stack args                |
//! Canonical Frame Address -->  | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+          -----
//!                              |          ...              |            |
//!                              | clobbered callee-saves    |            |
//! unwind-frame base -------->  | (pushed by prologue)      |            |
//!                              +---------------------------+  -----     |
//!                              |          ...              |    |       |
//!                              | spill slots               |    |       |
//!                              | (accessed via SP)         |  fixed   active
//!                              |          ...              |  frame    size
//!                              | stack slots               | storage    |
//!                              | (accessed via SP)         |   size     |
//!                              | (alloc'd by prologue)     |    |       |
//!                              +---------------------------+  -----     |
//!                              | [alignment as needed]     |            |
//!                              |          ...              |            |
//!                              | args for largest call     |            |
//! SP ----------------------->  | (alloc'd by prologue)     |            |
//!                              +===========================+          -----
//!
//!   (low address)
//! ```
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.

use crate::CodegenError;
use crate::FxHashMap;
use crate::HashMap;
use crate::entity::SecondaryMap;
use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTag, Signature};
use crate::ir::{StackSlotKey, types::*};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use alloc::boxed::Box;
use core::marker::PhantomData;
use regalloc2::{MachineEnv, PReg, PRegSet};
use smallvec::smallvec;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the arg is returned through; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

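// To make the `ArgPair` pseudoinst plumbing above concrete, here is a sketch
// (not a literal VCode dump; preg names are backend-specific): a function
// taking two register arguments lowers to a single "args" pseudoinst at
// entry, roughly
//
//     args { vreg: v0, preg: x0 }, { vreg: v1, preg: x1 }
//
// where each `ArgPair` becomes an operand constraint that regalloc2 uses to
// pin the vreg to its ABI-assigned preg at that one program point.
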
/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: ABIArgSlotVec,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Implicit argument. Similar to a StructArg, except that we have the
    /// target type, not a pointer type, at the CLIF-level. This argument is
    /// still being passed via reference implicitly.
    ImplicitPtrArg {
        /// Register or stack slot holding a pointer to the buffer.
        pointer: ABIArgSlot,
        /// Offset of the argument buffer.
        offset: i64,
        /// Type of the implicit argument.
        ty: Type,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

impl ABIArg {
    /// Create an ABIArg from one register.
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}

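// For example (illustrative only; the actual registers are chosen by the
// machine backend): an `i128` argument on a 64-bit machine is an
// `ABIArg::Slots` with two register parts, conceptually
//
//     ABIArg::Slots {
//         slots: smallvec![
//             ABIArgSlot::Reg { reg: r0, ty: I64, extension: ArgumentExtension::None },
//             ABIArgSlot::Reg { reg: r1, ty: I64, extension: ArgumentExtension::None },
//         ],
//         purpose: ir::ArgumentPurpose::Normal,
//     }
//
// where `r0`/`r1` stand in for backend-specific `RealReg`s. This is the rare
// multi-slot case mentioned above; almost all values use a single slot.
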
/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StackAMode {
    /// Offset into the current frame's argument area.
    IncomingArg(i64, u32),
    /// Offset within the stack slots in the current frame.
    Slot(i64),
    /// Offset into the callee frame's argument area.
    OutgoingArg(i64),
}

impl StackAMode {
    fn offset_by(&self, offset: u32) -> Self {
        match self {
            StackAMode::IncomingArg(off, size) => {
                StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)
            }
            StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),
            StackAMode::OutgoingArg(off) => {
                StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
            }
        }
    }
}

/// Trait implemented by a machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    /// Push an `ABIArg` for a formal parameter; must not be called after
    /// `push_non_formal`.
    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    /// Push an `ABIArg` for a non-formal (synthetic) argument, such as a
    /// return-area pointer; all formals must have been pushed first.
    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    /// The args pushed so far for the current signature.
    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    /// Mutable view of the args pushed so far for the current signature.
    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}

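// A quick orientation for `StackAMode` above (a sketch): `IncomingArg(off,
// size)` addresses the caller-provided argument area of the current frame,
// `Slot(off)` addresses this frame's own stack/spill-slot storage, and
// `OutgoingArg(off)` addresses the argument area of a call we are about to
// make. `offset_by` only shifts the offset, with a checked add; for example:
//
//     assert_eq!(StackAMode::Slot(16).offset_by(8), StackAMode::Slot(24));
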
/// Trait implemented by a machine-specific backend to provide information
/// about register assignments and to allow generating the specific
/// instructions for stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack. We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word; that is, 32/64 for 32/64-bit
    /// architectures.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns the word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns the word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns the required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to
    /// registers and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack space used (rounded up as alignment requires), and,
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

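    // An illustrative (non-normative) outcome of `compute_arg_locs`: on a
    // hypothetical 64-bit backend with two integer argument registers r0/r1,
    // params `(i32, i64, i64)` might be pushed as
    //
    //     ABIArg::reg(r0, I32, ...)     // first param, in a register
    //     ABIArg::reg(r1, I64, ...)     // second param, in a register
    //     ABIArg::stack(0, I64, ...)    // third param, first stack slot
    //
    // returning `(8, None)`: 8 bytes of stack-arg space and no synthetic
    // ret-area-pointer arg (since `add_ret_area_ptr` was false).
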
    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load to the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store from the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame. The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        function_calls: FunctionCalls,
        incoming_args_size: u32,
        tail_args_size: u32,
        stackslots_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

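    // Taken together, the hooks above yield a prologue shaped roughly like
    // this on x86-64 (schematic only; the exact sequence is the backend's
    // choice):
    //
    //     push rbp                ; gen_prologue_frame_setup
    //     mov  rbp, rsp
    //     ...                     ; gen_clobber_save: save clobbered
    //                             ; callee-saves, allocate frame storage
    //
    // with the epilogue mirroring it via gen_clobber_restore,
    // gen_epilogue_frame_restore, and gen_return.
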
    /// Generate a memcpy invocation. Used to set up struct
    /// args. Takes `src`, `dst` as read-only inputs and passes a temporary
    /// allocator.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(
        call_conv_of_callee: isa::CallConv,
        is_exception: bool,
    ) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;

    /// Get a temporary register that is available to use after a call
    /// completes and that does not interfere with register-carried
    /// return values. This is used to move stack-carried return
    /// values directly into spillslots if needed.
    fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;

    /// Get the exception payload registers, if any, for a calling
    /// convention.
    ///
    /// Note that the argument here is the calling convention of the *callee*.
    /// This might differ from the caller but the exceptional payloads that are
    /// available are defined by the callee, not the caller.
    fn exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg] {
        let _ = callee_conv;
        &[]
    }
}

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
    /// Information for a try-call, if this is one. We combine
    /// handling of calls and try-calls as much as possible to share
    /// argument/return logic; they mostly differ in the metadata that
    /// they emit, which this information feeds into.
    pub try_call_info: Option<TryCallInfo>,
    /// Whether this call is patchable.
    pub patchable: bool,
}

/// Out-of-line information present on `try_call` instructions only:
/// information that is used to generate exception-handling tables and
/// link up to destination blocks properly.
#[derive(Clone, Debug)]
pub struct TryCallInfo {
    /// The target to jump to on a normal return.
    pub continuation: MachLabel,
    /// Exception tags to catch and corresponding destination labels.
    pub exception_handlers: Box<[TryCallHandler]>,
}

/// Information about an individual handler at a try-call site.
#[derive(Clone, Debug)]
pub enum TryCallHandler {
    /// If the tag matches (given the current context), recover at the
    /// label.
    Tag(ExceptionTag, MachLabel),
    /// Recover at the label unconditionally.
    Default(MachLabel),
    /// Set the dynamic context for interpreting tags at this point in
    /// the handler list.
    Context(Reg),
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc with the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
            try_call_info: None,
            patchable: false,
        }
    }
}

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Currently both return values and arguments are stored contiguously in
    /// the `SigSet::abi_args` vector.
    ///
    /// ```plain
    ///                  +----------------------------------------------+
    ///                  | return values                                |
    ///                  | ...                                          |
    ///     rets_end --> +----------------------------------------------+
    ///                  | arguments                                    |
    ///                  | ...                                          |
    ///     args_end --> +----------------------------------------------+
    ///
    /// ```
    ///
    /// Note that we only store two offsets, as rets_end == args_start and
    /// rets_start == prev.args_end.
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets
    /// are relative to SP on entry to function.
    ///
    /// This is an index into `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the
    /// return-area pointer.
    ///
    /// This is an index into `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32
    /// to reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values. We're storing the size in
    /// u32 to reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> u32 {
        self.sized_stack_arg_space
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> u32 {
        self.sized_stack_ret_space
    }

    /// Get calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

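// Illustration of the rets-then-args storage scheme described above (a
// sketch): if signature A interns 2 rets and 3 args, and signature B then
// interns 1 ret and 2 args, the shared `SigSet::abi_args` vector holds
//
//     [ A rets | A args | B rets | B args ]
//     0        2        5        6        8
//
// so A has rets_end = 2 and args_end = 5, while B has rets_end = 6 and
// args_end = 8; B's rets begin at A's args_end (see `SigSet::rets` below).
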
/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
/// hash map itself but params and returns vecs in each signature) that we want
/// to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
    }

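    // Typical usage during lowering (a sketch): intern everything up front,
    // then look up interned ids cheaply while emitting calls:
    //
    //     let sigs = SigSet::new::<M>(&func, &flags)?;
    //     let sig = sigs.abi_sig_for_sig_ref(sig_ref);
    //     let n = sigs.num_args(sig);
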
    /// Compute a `SigData` for the given `ir::Signature`, pushing its
    /// argument and return-value locations into this `SigSet`'s shared
    /// `abi_args` storage.
    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned.
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of the args (rets -> args) inserted to
        // compute the offsets in `SigSet::args()` and `SigSet::rets()`.
        // Therefore, we cannot change the order of the two `compute_arg_locs`
        // calls below.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something
        // reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            need_stack_return_area,
        );

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` for how we store
        // the offsets.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end = usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` for how we store
        // the offsets.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl core::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// Word size in bytes, so this struct can be
    /// monomorphic/independent of `ABIMachineSpec`.
    pub word_bytes: u32,

    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below. Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack. This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it. Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer. This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from gen_prologue_frame_setup. In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area. The same holds at the start of code emitted
    /// by gen_epilogue_frame_restore.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from gen_clobber_save and
    /// gen_clobber_restore.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// The size of all stackslots.
    pub stackslots_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise. After gen_clobber_save and
    /// before gen_clobber_restore, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI. These registers will be saved and
    /// restored by gen_clobber_save and gen_clobber_restore.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,

    /// The function's call pattern classification.
    pub function_calls: FunctionCalls,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }

    /// The distance from FP down to SP while the frame is active (not during
    /// prologue setup or epilogue tear-down).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from the SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }

    /// Get the offset of a spill slot from SP.
    pub fn spillslot_offset(&self, spillslot: SpillSlot) -> i64 {
        // Offset from beginning of spillslot area.
        let islot = spillslot.index() as i64;
        let spill_off = islot * self.word_bytes as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;

        sp_off
    }

    /// Get the offset from SP up to FP.
    pub fn sp_to_fp(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }
}

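// A worked example for the sizes above (illustrative numbers only): with
// outgoing_args_size = 16, fixed_frame_storage_size = 24, and
// clobber_size = 32, both `active_size()` and `sp_to_fp()` compute
// 16 + 24 + 32 = 72 bytes between SP and FP while the frame is active,
// matching the stack-layout diagram at the top of this file.
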
/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Descriptors for sized stackslots.
    sized_stackslot_keys: SecondaryMap<StackSlot, Option<StackSlotKey>>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially the number of bytes originating in the caller's frame where
    /// stack arguments will live. After lowering, this number may be larger
    /// than the size expected by the function being compiled, as tail calls
    /// potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue this happens after legalization/register allocation/etc so we
    /// need to be extremely careful with each instruction. The instructions are
    /// manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

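// For example, with an alignment of 8 the mask is 7, and
// `checked_round_up(13, 7)` computes `Some((13 + 7) & !7)`, i.e. `Some(16)`;
// `None` is returned only if the addition overflows `u32`.
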
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch
                || call_conv == isa::CallConv::PreserveAll,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();
        let mut sized_stackslot_keys = SecondaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = core::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            // The end offset is the start offset increased by the size.
            end_offset = start_offset
                .checked_add(data.size)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
            sized_stackslots.push(start_offset);
            sized_stackslot_keys[stackslot] = data.key;
        }

        // Compute dynamic stackslot locations and total stackslot size.
        let mut dynamic_stackslots = PrimaryMap::new();
        for (stackslot, data) in f.dynamic_stack_slots.iter() {
            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());

            // This computation is similar to the stackslots above.
            let unaligned_start_offset = end_offset;

            let mask = M::word_bytes() - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
            })?;

            end_offset = start_offset
                .checked_add(isa.dynamic_vector_bytes(ty))
                .ok_or(CodegenError::ImplLimitExceeded)?;

            dynamic_stackslots.push(start_offset);
        }

        // The size of the stackslots needs to be word-aligned.
        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
            .ok_or(CodegenError::ImplLimitExceeded)?;

        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
        for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
            let ty = f
                .get_concrete_dynamic_ty(dyn_ty)
                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
            let size = isa.dynamic_vector_bytes(ty);
            dynamic_type_sizes.insert(ty, size);
        }

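        // To make the sized-slot layout computed above concrete (illustrative
        // numbers): with `word_bytes() == 8` and default alignment, two slots
        // of sizes 3 and 8 land at offsets 0 and 8, leaving end_offset == 16,
        // and `stackslots_size` rounds that total up to a word multiple
        // (already 16 here).
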
        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit = f
            .stack_limit
            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));

        let tail_args_size = sigs[sig].sized_stack_arg_space;

        Ok(Self {
            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
            sig,
            dynamic_stackslots,
            dynamic_type_sizes,
            sized_stackslots,
            sized_stackslot_keys,
            stackslots_size,
            outgoing_args_size: 0,
            tail_args_size,
            reg_args: vec![],
            frame_layout: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            isa_flags: isa_flags.clone(),
            stack_limit,
            _mach: PhantomData,
        })
    }

    /// Inserts instructions necessary for checking the stack limit into the
    /// prologue.
    ///
    /// This function will generate instructions necessary for performing a
    /// stack check at the header of a function. The stack check is intended to
    /// trap if the stack pointer goes below a particular threshold, preventing
    /// stack overflow in wasm or other code. The `stack_limit` argument here is
    /// the register which holds the threshold below which we're supposed to
    /// trap. This function is known to allocate `stack_size` bytes and we'll
    /// push instructions onto `insts`.
    ///
    /// Note that the instructions generated here are special because this is
    /// happening so late in the pipeline (e.g. after register allocation). This
    /// means that we need to do manual register allocation here and also be
    /// careful to not clobber any callee-saved or argument registers. For now
    /// this routine makes do with the `spilltmp_reg` as one temporary
    /// register, and a second register of `tmp2` which is caller-saved. This
    /// should be fine for us since no spills should happen in this sequence of
    /// instructions, so our register won't get accidentally clobbered.
    ///
    /// No values can be live after the prologue, but in this case that's ok
    /// because we just need to perform a stack check before progressing with
    /// the rest of the function.
    fn insert_stack_check(
        &self,
        stack_limit: Reg,
        stack_size: u32,
        insts: &mut SmallInstVec<M::I>,
    ) {
        // With no explicit stack allocated we can just emit the simple check of
        // the stack registers against the stack limit register, and trap if
        // it's out of bounds.
        if stack_size == 0 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
            return;
        }

        // Note that the 32k stack size here is pretty special. See the
        // documentation in x86/abi.rs for why this is here. The general idea is
        // that we're protecting against overflow in the addition that happens
        // below.
        if stack_size >= 32 * 1024 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
        }

        // Add the `stack_size` to `stack_limit`, placing the result in
        // `scratch`.
        //
        // Note though that `stack_limit`'s register may be the same as
        // `scratch`. If our stack size doesn't fit into an immediate this
        // means we need a second scratch register for loading the stack size
        // into a register.
        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
        insts.extend(M::gen_add_imm(
            self.call_conv,
            scratch,
            stack_limit,
            stack_size,
        ));
        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
    }
}

/// Generates the instructions necessary for the `gv` to be materialized into a
/// register.
///
/// This function will return a register that will contain the result of
/// evaluating `gv`. It will also return any instructions necessary to calculate
/// the value of the register.
///
/// Note that global values are typically lowered to instructions via the
/// standard legalization pass. Unfortunately though prologue generation happens
/// so late in the pipeline that we can't use these legalization passes to
/// generate the instructions for `gv`. As a result we duplicate some lowering
/// of `gv` here and support only some global values. This is similar to what
/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
/// in the future too!
///
/// Also note that this function will make use of `writable_spilltmp_reg()` as a
/// temporary register to store values in if necessary. Currently after we write
/// to this register there's guaranteed to be no spilled values between where
/// it's used, because we're not participating in register allocation anyway!
fn gen_stack_limit<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
) -> (Reg, SmallInstVec<M::I>) {
    let mut insts = smallvec![];
    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
    (reg, insts)
}

fn generate_gv<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
    insts: &mut SmallInstVec<M::I>,
) -> Reg {
    match f.global_values[gv] {
        // Return the direct register the vmcontext is in.
        ir::GlobalValueData::VMContext => {
            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
                .expect("no vmcontext parameter found")
        }
        // Load our base value into a register, then load from that register
        // into a temporary register.
        ir::GlobalValueData::Load {
            base,
            offset,
            global_type: _,
            flags: _,
        } => {
            let base = generate_gv::<M>(f, sigs, sig, base, insts);
            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
            insts.push(M::gen_load_base_offset(
                into_reg,
                base,
                offset.into(),
                M::word_type(),
            ));
            into_reg.to_reg()
        }
        ref other => panic!("global value for stack limit not supported: {other}"),
    }
}

/// Returns true if the signature needs to be legalized.
fn missing_struct_return(sig: &ir::Signature) -> bool {
    sig.uses_special_param(ArgumentPurpose::StructReturn)
        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
}

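// For example (a sketch): a legacy signature `(sret i64, i32) -> ()` has an
// sret param but no sret return, so `missing_struct_return` reports true;
// the function below legalizes it to `(sret i64, i32) -> (sret i64)` by
// re-inserting the sret param as return 0, the form `Callee::ir_sig` always
// holds.
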
fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
    // Keep in sync with Callee::new.
    let mut sig = sig.clone();
    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
        panic!("Explicit StructReturn return value not allowed: {sig:?}")
    }
    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
        if !sig.returns.is_empty() {
            panic!("No return values are allowed when using StructReturn: {sig:?}");
        }
        sig.returns.insert(0, sig.params[struct_ret_index]);
    }
    sig
}

/// ### Pre-Regalloc Functions
///
/// These methods of `Callee` may only be called before regalloc.
impl<M: ABIMachineSpec> Callee<M> {
    /// Access the (possibly legalized) signature.
    pub fn signature(&self) -> &ir::Signature {
        debug_assert!(
            !missing_struct_return(&self.ir_sig),
            "`Callee::ir_sig` is always legalized"
        );
        &self.ir_sig
    }

    /// Initialize. This is called after the Callee is constructed because it
    /// may allocate a temp vreg, which can only be allocated once the lowering
    /// context exists.
    pub fn init_retval_area(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> CodegenResult<()> {
        if sigs[self.sig].stack_ret_arg.is_some() {
            let ret_area_ptr = vregs.alloc(M::word_type())?;
            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
        }
        Ok(())
    }

    /// Get the return area pointer register, if any.
    pub fn ret_area_ptr(&self) -> Option<Reg> {
        self.ret_area_ptr
    }

    /// Accumulate outgoing arguments.
    ///
    /// This ensures that at least `size` bytes are allocated in the prologue to
    /// be available for use in function calls to hold arguments and/or return
    /// values. If this function is called multiple times, the maximum of all
    /// `size` values will be available.
    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
        if size > self.outgoing_args_size {
            self.outgoing_args_size = size;
        }
    }

    /// Accumulate the incoming argument area size requirements for a tail call,
    /// as it could be larger than the incoming arguments of the function
    /// currently being compiled.
    pub fn accumulate_tail_args_size(&mut self, size: u32) {
        if size > self.tail_args_size {
            self.tail_args_size = size;
        }
    }

    /// Whether forward-edge control-flow integrity is enabled for this
    /// function, per the ISA flags.
    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
        self.isa_flags.is_forward_edge_cfi_enabled()
    }

    /// Get the calling convention implemented by this ABI object.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    pub fn machine_env(&self) -> &MachineEnv {
        M::get_machine_env(&self.flags, self.call_conv)
    }

    /// The offsets of all sized stack slots (not spill slots) for debuginfo
    /// purposes.
    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
        &self.sized_stackslots
    }

    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo
    /// purposes.
    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
        &self.dynamic_stackslots
    }

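    // For example, if lowering encounters two calls needing 16 and then 32
    // bytes of outgoing argument space, `accumulate_outgoing_args_size(16)`
    // followed by `accumulate_outgoing_args_size(32)` (above) leaves
    // `outgoing_args_size` at the maximum, 32, which the prologue allocates
    // once for the whole function.
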
    /// Generate an instruction which copies an argument to a destination
    /// register.
    pub fn gen_copy_arg_to_regs(
        &mut self,
        sigs: &SigSet,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> SmallInstVec<M::I> {
        let mut insts = smallvec![];
        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
            match slot {
                &ABIArgSlot::Reg { reg, .. } => {
                    // Add a preg -> def pair to the eventual `args`
                    // instruction. Extension mode doesn't matter
                    // (we're copying out, not in; we ignore high bits
                    // by convention).
                    let arg = ArgPair {
                        vreg: *into_reg,
                        preg: reg.into(),
                    };
                    self.reg_args.push(arg);
                }
                &ABIArgSlot::Stack {
                    offset,
                    ty,
                    extension,
                    ..
                } => {
                    // However, we have to respect the extension mode for stack
                    // slots, or else we grab the wrong bytes on big-endian.
                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                    let ty =
                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
                            M::word_type()
                        } else {
                            ty
                        };
                    insts.push(M::gen_load_stack(
                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                        *into_reg,
                        ty,
                    ));
                }
            }
        };

        match &sigs.args(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    copy_arg_slot_to_reg(&slot, &into_reg);
                }
            }
            &ABIArg::StructArg { offset, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // Buffer address is implicitly defined by the ABI.
                insts.push(M::gen_get_stack_addr(
                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                    into_reg,
                ));
            }
            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // We need to dereference the pointer.
                let base = match &pointer {
                    &ABIArgSlot::Reg { reg, ty, .. } => {
                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
                        self.reg_args.push(ArgPair {
                            vreg: Writable::from_reg(tmp),
                            preg: reg.into(),
                        });
                        tmp
                    }
                    &ABIArgSlot::Stack { offset, ty, .. } => {
                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
                            .only_reg()
                            .unwrap();
                        insts.push(M::gen_load_stack(
                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                            addr_reg,
                            ty,
                        ));
                        addr_reg.to_reg()
                    }
                };
                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
            }
        }
        insts
    }

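    // A note on extension in the return path below (sketch): a return value
    // narrower than a machine word whose ABI attribute is Uext or Sext is
    // first widened to word width with an explicit `gen_extend`; full-width
    // register values need no extra move, since regalloc2 satisfies the
    // `RetPair` constraint directly.
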
                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                    let ty =
                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
                            M::word_type()
                        } else {
                            ty
                        };
                    insts.push(M::gen_load_stack(
                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                        *into_reg,
                        ty,
                    ));
                }
            }
        };

        match &sigs.args(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    copy_arg_slot_to_reg(&slot, &into_reg);
                }
            }
            &ABIArg::StructArg { offset, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // Buffer address is implicitly defined by the ABI.
                insts.push(M::gen_get_stack_addr(
                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                    into_reg,
                ));
            }
            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // We need to dereference the pointer.
                let base = match &pointer {
                    &ABIArgSlot::Reg { reg, ty, .. } => {
                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
                        self.reg_args.push(ArgPair {
                            vreg: Writable::from_reg(tmp),
                            preg: reg.into(),
                        });
                        tmp
                    }
                    &ABIArgSlot::Stack { offset, ty, .. } => {
                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
                            .only_reg()
                            .unwrap();
                        insts.push(M::gen_load_stack(
                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                            addr_reg,
                            ty,
                        ));
                        addr_reg.to_reg()
                    }
                };
                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
            }
        }
        insts
    }

    /// Generate an instruction which copies a source register to a return value slot.
    pub fn gen_copy_regs_to_retval(
        &self,
        sigs: &SigSet,
        idx: usize,
        from_regs: ValueRegs<Reg>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
        let mut reg_pairs = smallvec![];
        let mut ret = smallvec![];
        let word_bits = M::word_bits() as u8;
        match &sigs.rets(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(from_regs.len(), slots.len());
                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                    match slot {
                        &ABIArgSlot::Reg {
                            reg, ty, extension, ..
                        } => {
                            let from_bits = ty_bits(ty) as u8;
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            let vreg = match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    dst.to_reg()
                                }
                                _ => {
                                    // No move needed, regalloc2 will emit it using the constraint
                                    // added by the RetPair.
                                    from_reg
                                }
                            };
                            reg_pairs.push(RetPair {
                                vreg,
                                preg: Reg::from(reg),
                            });
                        }
                        &ABIArgSlot::Stack {
                            offset,
                            ty,
                            extension,
                            ..
                        } => {
                            let mut ty = ty;
                            let from_bits = ty_bits(ty) as u8;
                            // A machine ABI implementation should ensure that stack frames
                            // have "reasonable" size. All current ABIs for machinst
                            // backends (aarch64 and x64) enforce a 128MB limit.
                            let off = i32::try_from(offset).expect(
                                "Argument stack offset greater than 2GB; should hit impl limit first",
                            );
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            // Trash the from_reg; it should be its last use.
                            let mut from_reg = from_reg;
                            match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    assert_eq!(M::word_reg_class(), from_reg.class());
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    // Store the extended version.
                                    ty = M::word_type();
                                    from_reg = dst.to_reg();
                                }
                                _ => {}
                            };
                            ret.push(M::gen_store_base_offset(
                                self.ret_area_ptr.unwrap(),
                                off,
                                from_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            ABIArg::StructArg { .. } => {
                panic!("StructArg in return position is unsupported");
            }
            ABIArg::ImplicitPtrArg { .. } => {
                panic!("ImplicitPtrArg in return position is unsupported");
            }
        }
        (reg_pairs, ret)
    }
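
    // In short, the return-area flow is: the caller reserves the area and
    // passes its address as a hidden argument; `init_retval_area` above
    // captures that pointer into `ret_area_ptr`; the stores generated here
    // go through that pointer; and the caller reloads the values via
    // `gen_call_rets` below.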

    /// Generate any setup instruction needed to save values to the
    /// return-value area. This is usually used when there are multiple return
    /// values or an otherwise large return value that must be passed on the
    /// stack; typically the ABI specifies an extra hidden argument that is a
    /// pointer to that memory.
    pub fn gen_retval_area_setup(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> Option<M::I> {
        if let Some(i) = sigs[self.sig].stack_ret_arg {
            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
            let insts =
                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
            insts.into_iter().next().map(|inst| {
                trace!(
                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
                    inst,
                    ret_area_ptr.to_reg()
                );
                inst
            })
        } else {
            trace!("gen_retval_area_setup: not needed");
            None
        }
    }

    /// Generate a return instruction.
    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
        M::gen_rets(rets)
    }

    /// Set up argument values `args` for a call with signature `sig`.
    /// This will return a series of instructions to be emitted to set
    /// up all arguments, as well as a `CallArgList` list representing
    /// the arguments passed in registers. The latter need to be added
    /// as constraints to the actual call instruction.
    pub fn gen_call_args(
        &self,
        sigs: &SigSet,
        sig: Sig,
        args: &[ValueRegs<Reg>],
        is_tail_call: bool,
        flags: &settings::Flags,
        vregs: &mut VRegAllocator<M::I>,
    ) -> (CallArgList, SmallInstVec<M::I>) {
        let mut uses: CallArgList = smallvec![];
        let mut insts = smallvec![];

        assert_eq!(args.len(), sigs.num_args(sig));

        let call_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;
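        // For ordinary calls, stack arguments are written into this
        // function's outgoing-argument area at the bottom of its frame; for
        // tail calls they overwrite this function's own incoming-argument
        // area instead, since the callee takes over the caller's frame.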
        let stack_arg = |offset| {
            if is_tail_call {
                StackAMode::IncomingArg(offset, stack_arg_space)
            } else {
                StackAMode::OutgoingArg(offset)
            }
        };

        let word_ty = M::word_type();
        let word_rc = M::word_reg_class();
        let word_bits = M::word_bits() as usize;

        if is_tail_call {
            debug_assert_eq!(
                self.call_conv,
                isa::CallConv::Tail,
                "Can only do `return_call`s from within a `tail` calling convention function"
            );
        }

        // Helper to process a single argument slot (register or stack slot).
        // This will either add the register to the `uses` list or write the
        // value to the stack slot in the outgoing argument area (or for tail
        // calls, the incoming argument area).
        let mut process_arg_slot = |insts: &mut SmallInstVec<M::I>, slot, vreg, ty| {
            match &slot {
                &ABIArgSlot::Reg { reg, .. } => {
                    uses.push(CallArgPair {
                        vreg,
                        preg: reg.into(),
                    });
                }
                &ABIArgSlot::Stack { offset, .. } => {
                    insts.push(M::gen_store_stack(stack_arg(offset), vreg, ty));
                }
            };
        };

        // First pass: Handle `StructArg` arguments. These need to be copied
        // into their associated stack buffers. This should happen before any
        // of the other arguments are processed, as the `memcpy` call might
        // clobber registers used by other arguments.
        for (idx, from_regs) in args.iter().enumerate() {
            match &sigs.args(sig)[idx] {
                &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
                &ABIArg::StructArg { offset, size, .. } => {
                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                    insts.push(M::gen_get_stack_addr(
                        stack_arg(offset),
                        Writable::from_reg(tmp),
                    ));
                    insts.extend(M::gen_memcpy(
                        isa::CallConv::for_libcall(flags, call_conv),
                        tmp,
                        from_regs.only_reg().unwrap(),
                        size as usize,
                        |ty| {
                            Writable::from_reg(
                                vregs.alloc_with_deferred_error(ty).only_reg().unwrap(),
                            )
                        },
                    ));
                }
            }
        }

        // Second pass: Handle everything except `StructArg` arguments.
        for (idx, from_regs) in args.iter().enumerate() {
            match sigs.args(sig)[idx] {
                ABIArg::Slots { ref slots, .. } => {
                    assert_eq!(from_regs.len(), slots.len());
                    for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                        // Load argument slot value from `from_reg`, and perform any zero-
                        // or sign-extension that is required by the ABI.
                        let (ty, extension) = match *slot {
                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
                        };
                        let ext = M::get_ext_mode(call_conv, extension);
                        let (vreg, ty) = if ext != ir::ArgumentExtension::None
                            && ty_bits(ty) < word_bits
                        {
                            assert_eq!(word_rc, from_reg.class());
                            let signed = match ext {
                                ir::ArgumentExtension::Uext => false,
                                ir::ArgumentExtension::Sext => true,
                                _ => unreachable!(),
                            };
                            let tmp =
                                vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                            insts.push(M::gen_extend(
                                Writable::from_reg(tmp),
                                *from_reg,
                                signed,
                                ty_bits(ty) as u8,
                                word_bits as u8,
                            ));
                            (tmp, word_ty)
                        } else {
                            (*from_reg, ty)
                        };
                        process_arg_slot(&mut insts, *slot, vreg, ty);
                    }
                }
                ABIArg::ImplicitPtrArg {
                    offset,
                    pointer,
                    ty,
                    ..
                } => {
                    let vreg = from_regs.only_reg().unwrap();
                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                    insts.push(M::gen_get_stack_addr(
                        stack_arg(offset),
                        Writable::from_reg(tmp),
                    ));
                    insts.push(M::gen_store_base_offset(tmp, 0, vreg, ty));
                    process_arg_slot(&mut insts, pointer, tmp, word_ty);
                }
                ABIArg::StructArg { .. } => {}
            }
        }

        // Finally, set the stack-return pointer to the return argument area.
        // For tail calls, this means forwarding the incoming stack-return pointer.
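        // The return area (if any) for an ordinary call lives in our
        // outgoing-argument area, immediately above the stack arguments
        // themselves; hence the `stack_arg_space` offset used below.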
        if let Some(ret_arg) = sigs.get_ret_arg(sig) {
            let ret_area = if is_tail_call {
                self.ret_area_ptr.expect(
                    "if the tail callee has a return pointer, then the tail caller must as well",
                )
            } else {
                let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                let amode = StackAMode::OutgoingArg(stack_arg_space.into());
                insts.push(M::gen_get_stack_addr(amode, Writable::from_reg(tmp)));
                tmp
            };
            match ret_arg {
                // The return pointer must occupy a single slot.
                ABIArg::Slots { slots, .. } => {
                    assert_eq!(slots.len(), 1);
                    process_arg_slot(&mut insts, slots[0], ret_area, word_ty);
                }
                _ => unreachable!(),
            }
        }

        (uses, insts)
    }

    /// Set up return values `outputs` for a call with signature `sig`.
    /// This does not emit (or return) any instructions, but returns a
    /// `CallRetList` representing the return value constraints. This
    /// needs to be added to the actual call instruction.
    ///
    /// If `try_call_payloads` is `Some`, it is expected to hold
    /// exception payload registers for try_call instructions. These
    /// will be added as needed to the `CallRetList` as well.
    pub fn gen_call_rets(
        &self,
        sigs: &SigSet,
        sig: Sig,
        outputs: &[ValueRegs<Reg>],
        try_call_payloads: Option<&[Writable<Reg>]>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> CallRetList {
        let callee_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;

        let word_ty = M::word_type();
        let word_bits = M::word_bits() as usize;

        let mut defs: CallRetList = smallvec![];
        let mut outputs = outputs.into_iter();
        let num_rets = sigs.num_rets(sig);
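        // Note that `outputs` has one entry per IR-level return value, while
        // `sigs.rets(sig)` may also contain a synthesized `StructReturn`
        // slot (skipped below); hence the manual pairing via `outputs.next()`
        // rather than a simple `zip`.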
        for idx in 0..num_rets {
            let ret = sigs.rets(sig)[idx].clone();
            match ret {
                ABIArg::Slots {
                    ref slots, purpose, ..
                } => {
                    // We do not use the returned copy of the return buffer pointer,
                    // so skip any StructReturn returns that may be present.
                    if purpose == ArgumentPurpose::StructReturn {
                        continue;
                    }
                    let retval_regs = outputs.next().unwrap();
                    assert_eq!(retval_regs.len(), slots.len());
                    for (slot, retval_reg) in slots.iter().zip(retval_regs.regs().iter()) {
                        // We do not perform any extension because we're copying out, not in,
                        // and we ignore high bits in our own registers by convention. However,
                        // we still need to use the proper extended type to access stack slots
                        // (this is critical on big-endian systems).
                        let (ty, extension) = match *slot {
                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
                        };
                        let ext = M::get_ext_mode(callee_conv, extension);
                        let ty = if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits
                        {
                            word_ty
                        } else {
                            ty
                        };

                        match slot {
                            &ABIArgSlot::Reg { reg, .. } => {
                                defs.push(CallRetPair {
                                    vreg: Writable::from_reg(*retval_reg),
                                    location: RetLocation::Reg(reg.into(), ty),
                                });
                            }
                            &ABIArgSlot::Stack { offset, .. } => {
                                let amode =
                                    StackAMode::OutgoingArg(offset + i64::from(stack_arg_space));
                                defs.push(CallRetPair {
                                    vreg: Writable::from_reg(*retval_reg),
                                    location: RetLocation::Stack(amode, ty),
                                });
                            }
                        }
                    }
                }
                ABIArg::StructArg { .. } => {
                    panic!("StructArg not supported in return position");
                }
                ABIArg::ImplicitPtrArg { .. } => {
                    panic!("ImplicitPtrArg not supported in return position");
                }
            }
        }
        assert!(outputs.next().is_none());

        if let Some(try_call_payloads) = try_call_payloads {
            // Let `M` say where the payload values are going to end up and then
            // double-check it's the same size as the calling convention's
            // reported number of exception types.
            let pregs = M::exception_payload_regs(callee_conv);
            assert_eq!(
                callee_conv.exception_payload_types(M::word_type()).len(),
                pregs.len()
            );

            // We need to update `defs` to contain the exception
            // payload regs as well. We have two sources of info that
            // we join:
            //
            // - The machine-specific ABI implementation `M`, which
            //   tells us the particular registers that payload values
            //   must be in
            // - The passed-in lowering context, which gives us the
            //   vregs we must define.
            //
            // Note that payload values may need to end up in the same
            // physical registers as ordinary return values; this is
            // not a conflict, because we either get one or the
            // other. For regalloc's purposes, we define both starting
            // here at the callsite, but we can share one def in the
            // `defs` list and alias one vreg to another. Thus we
            // handle the two cases below for each payload register:
            // overlaps a return value (and we alias to it) or not
            // (and we add a def).
            for (i, &preg) in pregs.iter().enumerate() {
                let vreg = try_call_payloads[i];
                if let Some(existing) = defs.iter().find(|def| match def.location {
                    RetLocation::Reg(r, _) => r == preg,
                    _ => false,
                }) {
                    vregs.set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());
                } else {
                    defs.push(CallRetPair {
                        vreg,
                        location: RetLocation::Reg(preg, word_ty),
                    });
                }
            }
        }

        defs
    }

    /// Populate a `CallInfo` for a call with signature `sig`.
    ///
    /// `dest` is the target-specific call destination value
    /// `uses` is the `CallArgList` describing argument constraints
    /// `defs` is the `CallRetList` describing return constraints
    /// `try_call_info` describes exception targets for try_call instructions
    /// `patchable` describes whether this callsite should emit metadata
    /// for patching to enable/disable it.
    ///
    /// The clobber list is computed here from the above data.
    pub fn gen_call_info<T>(
        &self,
        sigs: &SigSet,
        sig: Sig,
        dest: T,
        uses: CallArgList,
        defs: CallRetList,
        try_call_info: Option<TryCallInfo>,
        patchable: bool,
    ) -> CallInfo<T> {
        let caller_conv = self.call_conv;
        let callee_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;

        let clobbers = {
            // Get clobbers: all caller-saves. These may include return value
            // regs, which we will remove from the clobber set below.
            let mut clobbers =
                <M>::get_regs_clobbered_by_call(callee_conv, try_call_info.is_some());

            // Remove retval regs from clobbers.
            for def in &defs {
                if let RetLocation::Reg(preg, _) = def.location {
                    clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));
                }
            }

            clobbers
        };

        // Any adjustment to SP to account for required outgoing arguments/stack return values must
        // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
        // state for all other instructions. For example, if a tail-call ABI function is called
        // here, the reclamation of the outgoing argument area must be done inside of the call
        // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
        // function. (Except the prologue and epilogue, but those are fairly special parts of the
        // function that establish the SP invariants that are relied on elsewhere and are generated
        // after the register allocator has run and thus cannot have register allocator-inserted
        // references to SP offsets.)
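
        // For example: with the `tail` convention the callee, not the caller,
        // pops the stack-argument area on return, so we record its size here
        // and the call pseudo-op compensates for the SP change it observes.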
        let callee_pop_size = if callee_conv == isa::CallConv::Tail {
            // The tail calling convention has callees pop stack arguments.
            stack_arg_space
        } else {
            0
        };

        CallInfo {
            dest,
            uses,
            defs,
            clobbers,
            callee_conv,
            caller_conv,
            callee_pop_size,
            try_call_info,
            patchable,
        }
    }

    /// Get the raw offset of a sized stackslot in the slot region.
    pub fn sized_stackslot_offset(&self, slot: StackSlot) -> u32 {
        self.sized_stackslots[slot]
    }

    /// Produce an instruction that computes a sized stackslot address.
    pub fn sized_stackslot_addr(
        &self,
        slot: StackSlot,
        offset: u32,
        into_reg: Writable<Reg>,
    ) -> M::I {
        // Offset from beginning of stackslot area.
        let stack_off = self.sized_stackslots[slot] as i64;
        let sp_off: i64 = stack_off + (offset as i64);
        M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
    }

    /// Produce an instruction that computes a dynamic stackslot address.
    pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
        let stack_off = self.dynamic_stackslots[slot] as i64;
        M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
    }

    /// Get an `args` pseudo-inst, if any, that should appear at the
    /// very top of the function body prior to regalloc.
    pub fn take_args(&mut self) -> Option<M::I> {
        if self.reg_args.len() > 0 {
            // Very first instruction is an `args` pseudo-inst that
            // establishes live-ranges for in-register arguments and
            // constrains them at the start of the function to the
            // locations defined by the ABI.
            Some(M::gen_args(core::mem::take(&mut self.reg_args)))
        } else {
            None
        }
    }
}

/// ### Post-Regalloc Functions
///
/// These methods of `Callee` may only be called after
/// regalloc.
impl<M: ABIMachineSpec> Callee<M> {
    /// Compute the final frame layout, post-regalloc.
    ///
    /// This must be called before gen_prologue or gen_epilogue.
    pub fn compute_frame_layout(
        &mut self,
        sigs: &SigSet,
        spillslots: usize,
        clobbered: Vec<Writable<RealReg>>,
        function_calls: FunctionCalls,
    ) {
        let bytes = M::word_bytes();
        let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
        let mask = M::stack_align(self.call_conv) - 1;
        let total_stacksize = (total_stacksize + mask) & !mask; // Round up to the ABI stack alignment.
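        // (This assumes `stack_align` returns a power of two. E.g., with
        // 16-byte alignment, `mask` is 15 and a raw size of 40 bytes rounds
        // up to `(40 + 15) & !15 = 48`.)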
        self.frame_layout = Some(M::compute_frame_layout(
            self.call_conv,
            &self.flags,
            self.signature(),
            &clobbered,
            function_calls,
            self.stack_args_size(sigs),
            self.tail_args_size,
            self.stackslots_size,
            total_stacksize,
            self.outgoing_args_size,
        ));
    }

    /// Generate a prologue, post-regalloc.
    ///
    /// This should include any stack frame or other setup necessary to use the
    /// other methods (`load_arg`, `store_retval`, and spillslot accesses).
    pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
        let frame_layout = self.frame_layout();
        let mut insts = smallvec![];

        // Set up frame.
        insts.extend(M::gen_prologue_frame_setup(
            self.call_conv,
            &self.flags,
            &self.isa_flags,
            &frame_layout,
        ));

        // The stack limit check needs to cover all the stack adjustments we
        // might make, up to the next stack limit check in any function we
        // call. Since this happens after frame setup, the current function's
        // setup area needs to be accounted for in the caller's stack limit
        // check, but we need to account for any setup area that our callees
        // might need. Note that s390x may also use the outgoing args area for
        // backtrace support even in leaf functions, so that should be accounted
        // for unconditionally.
        let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
            + frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.outgoing_args_size
            + if frame_layout.function_calls == FunctionCalls::None {
                0
            } else {
                frame_layout.setup_area_size
            };

        // Leaf functions with zero stack usage don't need a stack check even
        // if one is specified; otherwise, always insert the stack check.
        if total_stacksize > 0 || frame_layout.function_calls != FunctionCalls::None {
            if let Some((reg, stack_limit_load)) = &self.stack_limit {
                insts.extend(stack_limit_load.clone());
                self.insert_stack_check(*reg, total_stacksize, &mut insts);
            }

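            // A frame smaller than one guard page cannot skip past the
            // guard, so the outline probe is only needed once
            // `total_stacksize` reaches `guard_size`; the inline strategy
            // instead emits probes for each page as needed.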
            if self.flags.enable_probestack() {
                let guard_size = 1 << self.flags.probestack_size_log2();
                match self.flags.probestack_strategy() {
                    ProbestackStrategy::Inline => M::gen_inline_probestack(
                        &mut insts,
                        self.call_conv,
                        total_stacksize,
                        guard_size,
                    ),
                    ProbestackStrategy::Outline => {
                        if total_stacksize >= guard_size {
                            M::gen_probestack(&mut insts, total_stacksize);
                        }
                    }
                }
            }
        }

        // Save clobbered registers.
        insts.extend(M::gen_clobber_save(
            self.call_conv,
            &self.flags,
            &frame_layout,
        ));

        insts
    }

    /// Generate an epilogue, post-regalloc.
    ///
    /// Note that this must generate the actual return instruction (rather than
    /// emitting this in the lowering logic), because the epilogue code comes
    /// before the return and the two are likely closely related.
    pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
        let frame_layout = self.frame_layout();
        let mut insts = smallvec![];

        // Restore clobbered registers.
        insts.extend(M::gen_clobber_restore(
            self.call_conv,
            &self.flags,
            &frame_layout,
        ));

        // Tear down frame.
        insts.extend(M::gen_epilogue_frame_restore(
            self.call_conv,
            &self.flags,
            &self.isa_flags,
            &frame_layout,
        ));

        // And return.
        insts.extend(M::gen_return(
            self.call_conv,
            &self.isa_flags,
            &frame_layout,
        ));

        trace!("Epilogue: {:?}", insts);
        insts
    }

    /// Return a reference to the computed frame layout information. This
    /// function will panic if it's called before [`Self::compute_frame_layout`].
    pub fn frame_layout(&self) -> &FrameLayout {
        self.frame_layout
            .as_ref()
            .expect("frame layout not computed before prologue generation")
    }

    /// Returns the offset from SP to FP for the given function, after
    /// the prologue has set up the frame. This comprises the spill
    /// slots and stack-storage slots as well as storage for clobbered
    /// callee-save registers and outgoing arguments at callsites
    /// (space for which is reserved during frame setup).
    pub fn sp_to_fp_offset(&self) -> u32 {
        let frame_layout = self.frame_layout();
        frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.outgoing_args_size
    }

    /// Returns offset from the slot base in the current frame to the caller's SP.
    pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
        // Note: this looks very similar to `frame_size()` above, but
        // it differs in both endpoints: it measures from the bottom
        // of stackslots, excluding outgoing args; and it includes the
        // setup area (FP/LR) size and any extra tail-args space.
        let frame_layout = self.frame_layout();
        frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.setup_area_size
            + (frame_layout.tail_args_size - frame_layout.incoming_args_size)
    }

    /// Returns the size of arguments expected on the stack.
    pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
        sigs[self.sig].sized_stack_arg_space
    }

    /// Get the spill-slot size.
    pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
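        // With no dynamic types in the function, default to a 16-byte upper
        // bound, which is presumably sized to the largest fixed-width
        // (128-bit) vector type.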
        let max = if self.dynamic_type_sizes.len() == 0 {
            16
        } else {
            *self
                .dynamic_type_sizes
                .iter()
                .max_by(|x, y| x.1.cmp(&y.1))
                .map(|(_k, v)| v)
                .unwrap()
        };
        M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
    }

    /// Get the spill slot offset relative to the fixed allocation area start.
    pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
        self.frame_layout().spillslot_offset(slot)
    }

    /// Generate a spill.
    pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
        let ty = M::I::canonical_type_for_rc(from_reg.class());
        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);

        let sp_off = self.get_spillslot_offset(to_slot);
        trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");

        let from = StackAMode::Slot(sp_off);
        <M>::gen_store_stack(from, Reg::from(from_reg), ty)
    }

    /// Generate a reload (fill).
    pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
        let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);

        let sp_off = self.get_spillslot_offset(from_slot);
        trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");

        let from = StackAMode::Slot(sp_off);
        <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
    }

    /// Provide metadata to be emitted alongside machine code.
    ///
    /// This metadata describes the frame layout sufficiently to find
    /// stack slots, so that runtimes and unwinders can observe state
    /// set up by compiled code in stackslots allocated for that
    /// purpose.
    pub fn frame_slot_metadata(&self) -> MachBufferFrameLayout {
        let frame_to_fp_offset = self.sp_to_fp_offset();
        let mut stackslots = SecondaryMap::with_capacity(self.sized_stackslots.len());
        let storage_area_base = self.frame_layout().outgoing_args_size;
        for (slot, storage_area_offset) in &self.sized_stackslots {
            stackslots[slot] = MachBufferStackSlot {
                offset: storage_area_base.checked_add(*storage_area_offset).unwrap(),
                key: self.sized_stackslot_keys[slot],
            };
        }
        MachBufferFrameLayout {
            frame_to_fp_offset,
            stackslots,
        }
    }
}

/// An input argument to a call instruction: the vreg that is used,
/// and the preg it is constrained to (per the ABI).
#[derive(Clone, Debug)]
pub struct CallArgPair {
    /// The virtual register to use for the argument.
    pub vreg: Reg,
    /// The real register into which the arg goes.
    pub preg: Reg,
}

/// An output return value from a call instruction: the vreg that is
/// defined, and the preg or stack location it is constrained to (per
/// the ABI).
#[derive(Clone, Debug)]
pub struct CallRetPair {
    /// The virtual register to define from this return value.
    pub vreg: Writable<Reg>,
    /// The location (physical register or stack) from which the return
    /// value is read.
    pub location: RetLocation,
}

/// A location to load a return-value from after a call completes.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RetLocation {
    /// A physical register.
    Reg(Reg, Type),
    /// A stack location, identified by a `StackAMode`.
    Stack(StackAMode, Type),
}

pub type CallArgList = SmallVec<[CallArgPair; 8]>;
pub type CallRetList = SmallVec<[CallRetPair; 8]>;

impl<T> CallInfo<T> {
    /// Emit loads for any stack-carried return values using the call
    /// info and allocations.
    pub fn emit_retval_loads<
        M: ABIMachineSpec,
        EmitFn: FnMut(M::I),
        IslandFn: Fn(u32) -> Option<M::I>,
    >(
        &self,
        stackslots_size: u32,
        mut emit: EmitFn,
        emit_island: IslandFn,
    ) {
        // Count stack-ret locations and emit an island to account for
        // this space usage.
        let mut space_needed = 0;
        for CallRetPair { location, .. } in &self.defs {
            if let RetLocation::Stack(..) = location {
                // Assume up to ten instructions, semi-arbitrarily:
                // load from stack, store to spillslot, codegen of
                // large offsets on RISC ISAs.
                space_needed += 10 * M::I::worst_case_size();
            }
        }
        if space_needed > 0 {
            if let Some(island_inst) = emit_island(space_needed) {
                emit(island_inst);
            }
        }

        let temp = M::retval_temp_reg(self.callee_conv);
        // The temporary must be noted as clobbered unless there are
        // no returns (hence it isn't needed). The latter can only be
        // the case statically for an ABI when the ABI doesn't allow
        // any returns at all (e.g., preserve-all ABI).
        debug_assert!(
            self.defs.is_empty()
                || M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
                    .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
        );

        for CallRetPair { vreg, location } in &self.defs {
            match location {
                RetLocation::Reg(preg, ..) => {
                    // The temporary must not also be an actual return
                    // value register.
                    debug_assert!(*preg != temp.to_reg());
                }
                RetLocation::Stack(amode, ty) => {
                    if let Some(spillslot) = vreg.to_reg().to_spillslot() {
                        // `temp` is an integer register of machine word
                        // width, but `ty` may be floating-point/vector,
                        // which (i) may not be loadable directly into an
                        // int reg, and (ii) may be wider than a machine
                        // word. For simplicity, and because there are not
                        // always easy choices for volatile float/vec regs
                        // (see e.g. x86-64, where fastcall clobbers only
                        // xmm0-xmm5, but tail uses xmm0-xmm7 for
                        // returns), we use the integer temp register in
                        // steps.
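                        // For example, on a 64-bit machine an I128 value is
                        // moved in `parts = 2` word-sized load/store steps,
                        // while an I32 uses a single 4-byte load, since
                        // `one_part_load_ty` is capped at the value's size.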
                        let parts = (ty.bytes() + M::word_bytes() - 1) / M::word_bytes();
                        let one_part_load_ty =
                            Type::int_with_byte_size(M::word_bytes().min(ty.bytes()) as u16)
                                .unwrap();
                        for part in 0..parts {
                            emit(M::gen_load_stack(
                                amode.offset_by(part * M::word_bytes()),
                                temp,
                                one_part_load_ty,
                            ));
                            emit(M::gen_store_stack(
                                StackAMode::Slot(
                                    i64::from(stackslots_size)
                                        + i64::from(M::word_bytes())
                                            * ((spillslot.index() as i64) + (part as i64)),
                                ),
                                temp.to_reg(),
                                M::word_type(),
                            ));
                        }
                    } else {
                        assert_ne!(*vreg, temp);
                        emit(M::gen_load_stack(*amode, *vreg, *ty));
                    }
                }
            }
        }
    }
}

impl TryCallInfo {
    pub(crate) fn exception_handlers(
        &self,
        layout: &FrameLayout,
    ) -> impl Iterator<Item = MachExceptionHandler> {
        self.exception_handlers.iter().map(|handler| match handler {
            TryCallHandler::Tag(tag, label) => MachExceptionHandler::Tag(*tag, *label),
            TryCallHandler::Default(label) => MachExceptionHandler::Default(*label),
            TryCallHandler::Context(reg) => {
                let loc = if let Some(spillslot) = reg.to_spillslot() {
                    // The spillslot offset is relative to the "fixed
                    // storage area", which comes after outgoing args.
                    let offset = layout.spillslot_offset(spillslot)
                        + i64::from(layout.outgoing_args_size);
                    ExceptionContextLoc::SPOffset(
                        u32::try_from(offset)
                            .expect("SP offset cannot be negative or larger than 4GiB"),
                    )
                } else if let Some(realreg) = reg.to_real_reg() {
                    ExceptionContextLoc::GPR(realreg.hw_enc())
                } else {
                    panic!("Virtual register present in try-call handler clause after register allocation");
                };
                MachExceptionHandler::Context(loc)
            }
        })
    }

    pub(crate) fn pretty_print_dests(&self) -> String {
        self.exception_handlers
            .iter()
            .map(|handler| match handler {
                TryCallHandler::Tag(tag, label) => format!("{tag:?}: {label:?}"),
                TryCallHandler::Default(label) => format!("default: {label:?}"),
                TryCallHandler::Context(loc) => format!("context {loc:?}"),
            })
            .collect::<Vec<_>>()
            .join(", ")
    }

    pub(crate) fn collect_operands(&mut self, collector: &mut impl OperandVisitor) {
        for handler in &mut self.exception_handlers {
            match handler {
                TryCallHandler::Context(ctx) => {
                    collector.any_late_use(ctx);
                }
                TryCallHandler::Tag(_, _) | TryCallHandler::Default(_) => {}
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::SigData;

    #[test]
    fn sig_data_size() {
        // The size of `SigData` is performance sensitive, so make sure
        // we don't regress it unintentionally.
        assert_eq!(core::mem::size_of::<SigData>(), 24);
    }
}