Path: blob/main/cranelift/codegen/src/machinst/buffer.rs
1693 views
//! In-memory representation of compiled machine code, with labels and fixups to1//! refer to those labels. Handles constant-pool island insertion and also2//! veneer insertion for out-of-range jumps.3//!4//! This code exists to solve three problems:5//!6//! - Branch targets for forward branches are not known until later, when we7//! emit code in a single pass through the instruction structs.8//!9//! - On many architectures, address references or offsets have limited range.10//! For example, on AArch64, conditional branches can only target code +/- 1MB11//! from the branch itself.12//!13//! - The lowering of control flow from the CFG-with-edges produced by14//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty15//! edge blocks when the register allocator does not need to insert any16//! spills/reloads/moves in edge blocks, results in many suboptimal branch17//! patterns. The lowering also pays no attention to block order, and so18//! two-target conditional forms (cond-br followed by uncond-br) can often by19//! avoided because one of the targets is the fallthrough. There are several20//! cases here where we can simplify to use fewer branches.21//!22//! This "buffer" implements a single-pass code emission strategy (with a later23//! "fixup" pass, but only through recorded fixups, not all instructions). The24//! basic idea is:25//!26//! - Emit branches as they are, including two-target (cond/uncond) compound27//! forms, but with zero offsets and optimistically assuming the target will be28//! in range. Record the "fixup" for later. Targets are denoted instead by29//! symbolic "labels" that are then bound to certain offsets in the buffer as30//! we emit code. (Nominally, there is a label at the start of every basic31//! block.)32//!33//! - As we do this, track the offset in the buffer at which the first label34//! reference "goes out of range". We call this the "deadline". If we reach the35//! deadline and we still have not bound the label to which an unresolved branch36//! refers, we have a problem!37//!38//! - To solve this problem, we emit "islands" full of "veneers". An island is39//! simply a chunk of code inserted in the middle of the code actually produced40//! by the emitter (e.g., vcode iterating over instruction structs). The emitter41//! has some awareness of this: it either asks for an island between blocks, so42//! it is not accidentally executed, or else it emits a branch around the island43//! when all other options fail (see `Inst::EmitIsland` meta-instruction).44//!45//! - A "veneer" is an instruction (or sequence of instructions) in an "island"46//! that implements a longer-range reference to a label. The idea is that, for47//! example, a branch with a limited range can branch to a "veneer" instead,48//! which is simply a branch in a form that can use a longer-range reference. On49//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional50//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a51//! conditional branch's label reference can be fixed up with a "veneer" to52//! achieve a longer range.53//!54//! - To implement all of this, we require the backend to provide a `LabelUse`55//! type that implements a trait. This is nominally an enum that records one of56//! several kinds of references to an offset in code -- basically, a relocation57//! type -- and will usually correspond to different instruction formats. The58//! `LabelUse` implementation specifies the maximum range, how to patch in the59//! actual label location when known, and how to generate a veneer to extend the60//! range.61//!62//! That satisfies label references, but we still may have suboptimal branch63//! patterns. To clean up the branches, we do a simple "peephole"-style64//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)65//! informs the buffer of branches in the code and, in the case of conditionals,66//! the code that would have been emitted to invert this branch's condition. We67//! track the "latest branches": these are branches that are contiguous up to68//! the current offset. (If any code is emitted after a branch, that branch or69//! run of contiguous branches is no longer "latest".) The latest branches are70//! those that we can edit by simply truncating the buffer and doing something71//! else instead.72//!73//! To optimize branches, we implement several simple rules, and try to apply74//! them to the "latest branches" when possible:75//!76//! - A branch with a label target, when that label is bound to the ending77//! offset of the branch (the fallthrough location), can be removed altogether,78//! because the branch would have no effect).79//!80//! - An unconditional branch that starts at a label location, and branches to81//! another label, results in a "label alias": all references to the label bound82//! *to* this branch instruction are instead resolved to the *target* of the83//! branch instruction. This effectively removes empty blocks that just84//! unconditionally branch to the next block. We call this "branch threading".85//!86//! - A conditional followed by an unconditional, when the conditional branches87//! to the unconditional's fallthrough, results in (i) the truncation of the88//! unconditional, (ii) the inversion of the condition's condition, and (iii)89//! replacement of the conditional's target (using the original target of the90//! unconditional). This is a fancy way of saying "we can flip a two-target91//! conditional branch's taken/not-taken targets if it works better with our92//! fallthrough". To make this work, the emitter actually gives the buffer93//! *both* forms of every conditional branch: the true form is emitted into the94//! buffer, and the "inverted" machine-code bytes are provided as part of the95//! branch-fixup metadata.96//!97//! - An unconditional B preceded by another unconditional P, when B's label(s) have98//! been redirected to target(B), can be removed entirely. This is an extension99//! of the branch-threading optimization, and is valid because if we know there100//! will be no fallthrough into this branch instruction (the prior instruction101//! is an unconditional jump), and if we know we have successfully redirected102//! all labels, then this branch instruction is unreachable. Note that this103//! works because the redirection happens before the label is ever resolved104//! (fixups happen at island emission time, at which point latest-branches are105//! cleared, or at the end of emission), so we are sure to catch and redirect106//! all possible paths to this instruction.107//!108//! # Branch-optimization Correctness109//!110//! The branch-optimization mechanism depends on a few data structures with111//! invariants, which are always held outside the scope of top-level public112//! methods:113//!114//! - The latest-branches list. Each entry describes a span of the buffer115//! (start/end offsets), the label target, the corresponding fixup-list entry116//! index, and the bytes (must be the same length) for the inverted form, if117//! conditional. The list of labels that are bound to the start-offset of this118//! branch is *complete* (if any label has a resolved offset equal to `start`119//! and is not an alias, it must appear in this list) and *precise* (no label120//! in this list can be bound to another offset). No label in this list should121//! be an alias. No two branch ranges can overlap, and branches are in122//! ascending-offset order.123//!124//! - The labels-at-tail list. This contains all MachLabels that have been bound125//! to (whose resolved offsets are equal to) the tail offset of the buffer.126//! No label in this list should be an alias.127//!128//! - The label_offsets array, containing the bound offset of a label or129//! UNKNOWN. No label can be bound at an offset greater than the current130//! buffer tail.131//!132//! - The label_aliases array, containing another label to which a label is133//! bound or UNKNOWN. A label's resolved offset is the resolved offset134//! of the label it is aliased to, if this is set.135//!136//! We argue below, at each method, how the invariants in these data structures137//! are maintained (grep for "Post-invariant").138//!139//! Given these invariants, we argue why each optimization preserves execution140//! semantics below (grep for "Preserves execution semantics").141//!142//! # Avoiding Quadratic Behavior143//!144//! There are two cases where we've had to take some care to avoid145//! quadratic worst-case behavior:146//!147//! - The "labels at this branch" list can grow unboundedly if the148//! code generator binds many labels at one location. If the count149//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we150//! simply abort an optimization early in a way that is always correct151//! but is conservative.152//!153//! - The fixup list can interact with island emission to create154//! "quadratic island behavior". In a little more detail, one can hit155//! this behavior by having some pending fixups (forward label156//! references) with long-range label-use kinds, and some others157//! with shorter-range references that nonetheless still are pending158//! long enough to trigger island generation. In such a case, we159//! process the fixup list, generate veneers to extend some forward160//! references' ranges, but leave the other (longer-range) ones161//! alone. The way this was implemented put them back on a list and162//! resulted in quadratic behavior.163//!164//! To avoid this fixups are split into two lists: one "pending" list and one165//! final list. The pending list is kept around for handling fixups related to166//! branches so it can be edited/truncated. When an island is reached, which167//! starts processing fixups, all pending fixups are flushed into the final168//! list. The final list is a `BinaryHeap` which enables fixup processing to169//! only process those which are required during island emission, deferring170//! all longer-range fixups to later.171172use crate::binemit::{Addend, CodeOffset, Reloc};173use crate::ir::function::FunctionParameters;174use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};175use crate::isa::unwind::UnwindInst;176use crate::machinst::{177BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,178};179use crate::trace;180use crate::{MachInstEmitState, ir};181use crate::{VCodeConstantData, timing};182use core::ops::Range;183use cranelift_control::ControlPlane;184use cranelift_entity::{PrimaryMap, entity_impl};185use smallvec::SmallVec;186use std::cmp::Ordering;187use std::collections::BinaryHeap;188use std::mem;189use std::string::String;190use std::vec::Vec;191192#[cfg(feature = "enable-serde")]193use serde::{Deserialize, Serialize};194195#[cfg(feature = "enable-serde")]196pub trait CompilePhase {197type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;198type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;199}200201#[cfg(not(feature = "enable-serde"))]202pub trait CompilePhase {203type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;204type SourceLocType: core::fmt::Debug + PartialEq + Clone;205}206207/// Status of a compiled artifact that needs patching before being used.208#[derive(Clone, Debug, PartialEq)]209#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]210pub struct Stencil;211212/// Status of a compiled artifact ready to use.213#[derive(Clone, Debug, PartialEq)]214pub struct Final;215216impl CompilePhase for Stencil {217type MachSrcLocType = MachSrcLoc<Stencil>;218type SourceLocType = RelSourceLoc;219}220221impl CompilePhase for Final {222type MachSrcLocType = MachSrcLoc<Final>;223type SourceLocType = SourceLoc;224}225226#[derive(Clone, Copy, Debug, PartialEq, Eq)]227enum ForceVeneers {228Yes,229No,230}231232/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink233/// in bulk.234///235/// This struct uses `SmallVec`s to support small-ish function bodies without236/// any heap allocation. As such, it will be several kilobytes large. This is237/// likely fine as long as it is stack-allocated for function emission then238/// thrown away; but beware if many buffer objects are retained persistently.239pub struct MachBuffer<I: VCodeInst> {240/// The buffer contents, as raw bytes.241data: SmallVec<[u8; 1024]>,242/// The required alignment of this buffer.243min_alignment: u32,244/// Any relocations referring to this code. Note that only *external*245/// relocations are tracked here; references to labels within the buffer are246/// resolved before emission.247relocs: SmallVec<[MachReloc; 16]>,248/// Any trap records referring to this code.249traps: SmallVec<[MachTrap; 16]>,250/// Any call site records referring to this code.251call_sites: SmallVec<[MachCallSite; 16]>,252/// Any exception-handler records referred to at call sites.253exception_handlers: SmallVec<[MachExceptionHandler; 16]>,254/// Any source location mappings referring to this code.255srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,256/// Any user stack maps for this code.257///258/// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted259/// by code offset, and each stack map covers `span` bytes on the stack.260user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,261/// Any unwind info at a given location.262unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,263/// The current source location in progress (after `start_srcloc()` and264/// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.265cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,266/// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.267label_offsets: SmallVec<[CodeOffset; 16]>,268/// Label aliases: when one label points to an unconditional jump, and that269/// jump points to another label, we can redirect references to the first270/// label immediately to the second.271///272/// Invariant: we don't have label-alias cycles. We ensure this by,273/// before setting label A to alias label B, resolving B's alias274/// target (iteratively until a non-aliased label); if B is already275/// aliased to A, then we cannot alias A back to B.276label_aliases: SmallVec<[MachLabel; 16]>,277/// Constants that must be emitted at some point.278pending_constants: SmallVec<[VCodeConstant; 16]>,279/// Byte size of all constants in `pending_constants`.280pending_constants_size: CodeOffset,281/// Traps that must be emitted at some point.282pending_traps: SmallVec<[MachLabelTrap; 16]>,283/// Fixups that haven't yet been flushed into `fixup_records` below and may284/// be related to branches that are chomped. These all get added to285/// `fixup_records` during island emission.286pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,287/// The nearest upcoming deadline for entries in `pending_fixup_records`.288pending_fixup_deadline: CodeOffset,289/// Fixups that must be performed after all code is emitted.290fixup_records: BinaryHeap<MachLabelFixup<I>>,291/// Latest branches, to facilitate in-place editing for better fallthrough292/// behavior and empty-block removal.293latest_branches: SmallVec<[MachBranch; 4]>,294/// All labels at the current offset (emission tail). This is lazily295/// cleared: it is actually accurate as long as the current offset is296/// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should297/// be considered as empty.298///299/// For correctness, this *must* be complete (i.e., the vector must contain300/// all labels whose offsets are resolved to the current tail), because we301/// rely on it to update labels when we truncate branches.302labels_at_tail: SmallVec<[MachLabel; 4]>,303/// The last offset at which `labels_at_tail` is valid. It is conceptually304/// always describing the tail of the buffer, but we do not clear305/// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it306/// when the offset has grown past this (`labels_at_tail_off`) point.307/// Always <= `cur_offset()`.308labels_at_tail_off: CodeOffset,309/// Metadata about all constants that this function has access to.310///311/// This records the size/alignment of all constants (not the actual data)312/// along with the last available label generated for the constant. This map313/// is consulted when constants are referred to and the label assigned to a314/// constant may change over time as well.315constants: PrimaryMap<VCodeConstant, MachBufferConstant>,316/// All recorded usages of constants as pairs of the constant and where the317/// constant needs to be placed within `self.data`. Note that the same318/// constant may appear in this array multiple times if it was emitted319/// multiple times.320used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,321/// Indicates when a patchable region is currently open, to guard that it's322/// not possible to nest patchable regions.323open_patchable: bool,324}325326impl MachBufferFinalized<Stencil> {327/// Get a finalized machine buffer by applying the function's base source location.328pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {329MachBufferFinalized {330data: self.data,331relocs: self.relocs,332traps: self.traps,333call_sites: self.call_sites,334exception_handlers: self.exception_handlers,335srclocs: self336.srclocs337.into_iter()338.map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))339.collect(),340user_stack_maps: self.user_stack_maps,341unwind_info: self.unwind_info,342alignment: self.alignment,343}344}345}346347/// A `MachBuffer` once emission is completed: holds generated code and records,348/// without fixups. This allows the type to be independent of the backend.349#[derive(PartialEq, Debug, Clone)]350#[cfg_attr(351feature = "enable-serde",352derive(serde_derive::Serialize, serde_derive::Deserialize)353)]354pub struct MachBufferFinalized<T: CompilePhase> {355/// The buffer contents, as raw bytes.356pub(crate) data: SmallVec<[u8; 1024]>,357/// Any relocations referring to this code. Note that only *external*358/// relocations are tracked here; references to labels within the buffer are359/// resolved before emission.360pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,361/// Any trap records referring to this code.362pub(crate) traps: SmallVec<[MachTrap; 16]>,363/// Any call site records referring to this code.364pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,365/// Any exception-handler records referred to at call sites.366pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,367/// Any source location mappings referring to this code.368pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,369/// Any user stack maps for this code.370///371/// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted372/// by code offset, and each stack map covers `span` bytes on the stack.373pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,374/// Any unwind info at a given location.375pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,376/// The required alignment of this buffer.377pub alignment: u32,378}379380const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;381const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);382383/// Threshold on max length of `labels_at_this_branch` list to avoid384/// unbounded quadratic behavior (see comment below at use-site).385const LABEL_LIST_THRESHOLD: usize = 100;386387/// A label refers to some offset in a `MachBuffer`. It may not be resolved at388/// the point at which it is used by emitted code; the buffer records "fixups"389/// for references to the label, and will come back and patch the code390/// appropriately when the label's location is eventually known.391#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]392pub struct MachLabel(u32);393entity_impl!(MachLabel);394395impl MachLabel {396/// Get a label for a block. (The first N MachLabels are always reserved for397/// the N blocks in the vcode.)398pub fn from_block(bindex: BlockIndex) -> MachLabel {399MachLabel(bindex.index() as u32)400}401402/// Creates a string representing this label, for convenience.403pub fn to_string(&self) -> String {404format!("label{}", self.0)405}406}407408impl Default for MachLabel {409fn default() -> Self {410UNKNOWN_LABEL411}412}413414/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is415/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming416/// the [`OpenPatchRegion`] token in the process.417pub struct OpenPatchRegion(usize);418419/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example420/// of where you might want to use this is for patching instructions that mention constants that421/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable422/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token423/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,424/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction425/// bytes, and the constants uses can be updated directly.426pub struct PatchRegion {427range: Range<usize>,428}429430impl PatchRegion {431/// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.432pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {433&mut buffer.data[self.range]434}435}436437impl<I: VCodeInst> MachBuffer<I> {438/// Create a new section, known to start at `start_offset` and with a size limited to439/// `length_limit`.440pub fn new() -> MachBuffer<I> {441MachBuffer {442data: SmallVec::new(),443min_alignment: I::function_alignment().minimum,444relocs: SmallVec::new(),445traps: SmallVec::new(),446call_sites: SmallVec::new(),447exception_handlers: SmallVec::new(),448srclocs: SmallVec::new(),449user_stack_maps: SmallVec::new(),450unwind_info: SmallVec::new(),451cur_srcloc: None,452label_offsets: SmallVec::new(),453label_aliases: SmallVec::new(),454pending_constants: SmallVec::new(),455pending_constants_size: 0,456pending_traps: SmallVec::new(),457pending_fixup_records: SmallVec::new(),458pending_fixup_deadline: u32::MAX,459fixup_records: Default::default(),460latest_branches: SmallVec::new(),461labels_at_tail: SmallVec::new(),462labels_at_tail_off: 0,463constants: Default::default(),464used_constants: Default::default(),465open_patchable: false,466}467}468469/// Current offset from start of buffer.470pub fn cur_offset(&self) -> CodeOffset {471self.data.len() as CodeOffset472}473474/// Add a byte.475pub fn put1(&mut self, value: u8) {476self.data.push(value);477478// Post-invariant: conceptual-labels_at_tail contains a complete and479// precise list of labels bound at `cur_offset()`. We have advanced480// `cur_offset()`, hence if it had been equal to `labels_at_tail_off`481// before, it is not anymore (and it cannot become equal, because482// `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is483// conceptually empty (even though it is only lazily cleared). No labels484// can be bound at this new offset (by invariant on `label_offsets`).485// Hence the invariant holds.486}487488/// Add 2 bytes.489pub fn put2(&mut self, value: u16) {490let bytes = value.to_le_bytes();491self.data.extend_from_slice(&bytes[..]);492493// Post-invariant: as for `put1()`.494}495496/// Add 4 bytes.497pub fn put4(&mut self, value: u32) {498let bytes = value.to_le_bytes();499self.data.extend_from_slice(&bytes[..]);500501// Post-invariant: as for `put1()`.502}503504/// Add 8 bytes.505pub fn put8(&mut self, value: u64) {506let bytes = value.to_le_bytes();507self.data.extend_from_slice(&bytes[..]);508509// Post-invariant: as for `put1()`.510}511512/// Add a slice of bytes.513pub fn put_data(&mut self, data: &[u8]) {514self.data.extend_from_slice(data);515516// Post-invariant: as for `put1()`.517}518519/// Reserve appended space and return a mutable slice referring to it.520pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {521let off = self.data.len();522let new_len = self.data.len() + len;523self.data.resize(new_len, 0);524&mut self.data[off..]525526// Post-invariant: as for `put1()`.527}528529/// Align up to the given alignment.530pub fn align_to(&mut self, align_to: CodeOffset) {531trace!("MachBuffer: align to {}", align_to);532assert!(533align_to.is_power_of_two(),534"{align_to} is not a power of two"535);536while self.cur_offset() & (align_to - 1) != 0 {537self.put1(0);538}539540// Post-invariant: as for `put1()`.541}542543/// Begin a region of patchable code. There is one requirement for the544/// code that is emitted: It must not introduce any instructions that545/// could be chomped (branches are an example of this). In other words,546/// you must not call [`MachBuffer::add_cond_branch`] or547/// [`MachBuffer::add_uncond_branch`] between calls to this method and548/// [`MachBuffer::end_patchable`].549pub fn start_patchable(&mut self) -> OpenPatchRegion {550assert!(!self.open_patchable, "Patchable regions may not be nested");551self.open_patchable = true;552OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())553}554555/// End a region of patchable code, yielding a [`PatchRegion`] value that556/// can be consumed later to produce a one-off mutable slice to the557/// associated region of the data buffer.558pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {559// No need to assert the state of `open_patchable` here, as we take560// ownership of the only `OpenPatchable` value.561self.open_patchable = false;562let end = usize::try_from(self.cur_offset()).unwrap();563PatchRegion { range: open.0..end }564}565566/// Allocate a `Label` to refer to some offset. May not be bound to a fixed567/// offset yet.568pub fn get_label(&mut self) -> MachLabel {569let l = self.label_offsets.len() as u32;570self.label_offsets.push(UNKNOWN_LABEL_OFFSET);571self.label_aliases.push(UNKNOWN_LABEL);572trace!("MachBuffer: new label -> {:?}", MachLabel(l));573MachLabel(l)574575// Post-invariant: the only mutation is to add a new label; it has no576// bound offset yet, so it trivially satisfies all invariants.577}578579/// Reserve the first N MachLabels for blocks.580pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {581trace!("MachBuffer: first {} labels are for blocks", blocks);582debug_assert!(self.label_offsets.is_empty());583self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);584self.label_aliases.resize(blocks, UNKNOWN_LABEL);585586// Post-invariant: as for `get_label()`.587}588589/// Registers metadata in this `MachBuffer` about the `constants` provided.590///591/// This will record the size/alignment of all constants which will prepare592/// them for emission later on.593pub fn register_constants(&mut self, constants: &VCodeConstants) {594for (c, val) in constants.iter() {595self.register_constant(&c, val);596}597}598599/// Similar to [`MachBuffer::register_constants`] but registers a600/// single constant metadata. This function is useful in601/// situations where not all constants are known at the time of602/// emission.603pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {604let c2 = self.constants.push(MachBufferConstant {605upcoming_label: None,606align: data.alignment(),607size: data.as_slice().len(),608});609assert_eq!(*constant, c2);610}611612/// Completes constant emission by iterating over `self.used_constants` and613/// filling in the "holes" with the constant values provided by `constants`.614///615/// Returns the alignment required for this entire buffer. Alignment starts616/// at the ISA's minimum function alignment and can be increased due to617/// constant requirements.618fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {619let mut alignment = self.min_alignment;620for (constant, offset) in mem::take(&mut self.used_constants) {621let constant = constants.get(constant);622let data = constant.as_slice();623self.data[offset as usize..][..data.len()].copy_from_slice(data);624alignment = constant.alignment().max(alignment);625}626alignment627}628629/// Returns a label that can be used to refer to the `constant` provided.630///631/// This will automatically defer a new constant to be emitted for632/// `constant` if it has not been previously emitted. Note that this633/// function may return a different label for the same constant at634/// different points in time. The label is valid to use only from the635/// current location; the MachBuffer takes care to emit the same constant636/// multiple times if needed so the constant is always in range.637pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {638let MachBufferConstant {639align,640size,641upcoming_label,642} = self.constants[constant];643if let Some(label) = upcoming_label {644return label;645}646647let label = self.get_label();648trace!(649"defer constant: eventually emit {size} bytes aligned \650to {align} at label {label:?}",651);652self.pending_constants.push(constant);653self.pending_constants_size += size as u32;654self.constants[constant].upcoming_label = Some(label);655label656}657658/// Bind a label to the current offset. A label can only be bound once.659pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {660trace!(661"MachBuffer: bind label {:?} at offset {}",662label,663self.cur_offset()664);665debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);666debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);667let offset = self.cur_offset();668self.label_offsets[label.0 as usize] = offset;669self.lazily_clear_labels_at_tail();670self.labels_at_tail.push(label);671672// Invariants hold: bound offset of label is <= cur_offset (in fact it673// is equal). If the `labels_at_tail` list was complete and precise674// before, it is still, because we have bound this label to the current675// offset and added it to the list (which contains all labels at the676// current offset).677678self.optimize_branches(ctrl_plane);679680// Post-invariant: by `optimize_branches()` (see argument there).681}682683/// Lazily clear `labels_at_tail` if the tail offset has moved beyond the684/// offset that it applies to.685fn lazily_clear_labels_at_tail(&mut self) {686let offset = self.cur_offset();687if offset > self.labels_at_tail_off {688self.labels_at_tail_off = offset;689self.labels_at_tail.clear();690}691692// Post-invariant: either labels_at_tail_off was at cur_offset, and693// state is untouched, or was less than cur_offset, in which case the694// labels_at_tail list was conceptually empty, and is now actually695// empty.696}697698/// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.699pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {700let mut iters = 0;701while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {702label = self.label_aliases[label.0 as usize];703// To protect against an infinite loop (despite our assurances to704// ourselves that the invariants make this impossible), assert out705// after 1M iterations. The number of basic blocks is limited706// in most contexts anyway so this should be impossible to hit with707// a legitimate input.708iters += 1;709assert!(iters < 1_000_000, "Unexpected cycle in label aliases");710}711self.label_offsets[label.0 as usize]712713// Post-invariant: no mutations.714}715716/// Emit a reference to the given label with the given reference type (i.e.,717/// branch-instruction format) at the current offset. This is like a718/// relocation, but handled internally.719///720/// This can be called before the branch is actually emitted; fixups will721/// not happen until an island is emitted or the buffer is finished.722pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {723trace!(724"MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",725offset, label, kind726);727728// Add the fixup, and update the worst-case island size based on a729// veneer for this label use.730let fixup = MachLabelFixup {731label,732offset,733kind,734};735self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());736self.pending_fixup_records.push(fixup);737738// Post-invariant: no mutations to branches/labels data structures.739}740741/// Inform the buffer of an unconditional branch at the given offset,742/// targeting the given label. May be used to optimize branches.743/// The last added label-use must correspond to this branch.744/// This must be called when the current offset is equal to `start`; i.e.,745/// before actually emitting the branch. This implies that for a branch that746/// uses a label and is eligible for optimizations by the MachBuffer, the747/// proper sequence is:748///749/// - Call `use_label_at_offset()` to emit the fixup record.750/// - Call `add_uncond_branch()` to make note of the branch.751/// - Emit the bytes for the branch's machine code.752///753/// Additional requirement: no labels may be bound between `start` and `end`754/// (exclusive on both ends).755pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {756debug_assert!(757!self.open_patchable,758"Branch instruction inserted within a patchable region"759);760assert!(self.cur_offset() == start);761debug_assert!(end > start);762assert!(!self.pending_fixup_records.is_empty());763let fixup = self.pending_fixup_records.len() - 1;764self.lazily_clear_labels_at_tail();765self.latest_branches.push(MachBranch {766start,767end,768target,769fixup,770inverted: None,771labels_at_this_branch: self.labels_at_tail.clone(),772});773774// Post-invariant: we asserted branch start is current tail; the list of775// labels at branch is cloned from list of labels at current tail.776}777778/// Inform the buffer of a conditional branch at the given offset,779/// targeting the given label. May be used to optimize branches.780/// The last added label-use must correspond to this branch.781///782/// Additional requirement: no labels may be bound between `start` and `end`783/// (exclusive on both ends).784pub fn add_cond_branch(785&mut self,786start: CodeOffset,787end: CodeOffset,788target: MachLabel,789inverted: &[u8],790) {791debug_assert!(792!self.open_patchable,793"Branch instruction inserted within a patchable region"794);795assert!(self.cur_offset() == start);796debug_assert!(end > start);797assert!(!self.pending_fixup_records.is_empty());798debug_assert!(799inverted.len() == (end - start) as usize,800"branch length = {}, but inverted length = {}",801end - start,802inverted.len()803);804let fixup = self.pending_fixup_records.len() - 1;805let inverted = Some(SmallVec::from(inverted));806self.lazily_clear_labels_at_tail();807self.latest_branches.push(MachBranch {808start,809end,810target,811fixup,812inverted,813labels_at_this_branch: self.labels_at_tail.clone(),814});815816// Post-invariant: we asserted branch start is current tail; labels at817// branch list is cloned from list of labels at current tail.818}819820fn truncate_last_branch(&mut self) {821debug_assert!(822!self.open_patchable,823"Branch instruction truncated within a patchable region"824);825826self.lazily_clear_labels_at_tail();827// Invariants hold at this point.828829let b = self.latest_branches.pop().unwrap();830assert!(b.end == self.cur_offset());831832// State:833// [PRE CODE]834// Offset b.start, b.labels_at_this_branch:835// [BRANCH CODE]836// cur_off, self.labels_at_tail -->837// (end of buffer)838self.data.truncate(b.start as usize);839self.pending_fixup_records.truncate(b.fixup);840while let Some(last_srcloc) = self.srclocs.last_mut() {841if last_srcloc.end <= b.start {842break;843}844if last_srcloc.start < b.start {845last_srcloc.end = b.start;846break;847}848self.srclocs.pop();849}850// State:851// [PRE CODE]852// cur_off, Offset b.start, b.labels_at_this_branch:853// (end of buffer)854//855// self.labels_at_tail --> (past end of buffer)856let cur_off = self.cur_offset();857self.labels_at_tail_off = cur_off;858// State:859// [PRE CODE]860// cur_off, Offset b.start, b.labels_at_this_branch,861// self.labels_at_tail:862// (end of buffer)863//864// resolve_label_offset(l) for l in labels_at_tail:865// (past end of buffer)866867trace!(868"truncate_last_branch: truncated {:?}; off now {}",869b, cur_off870);871872// Fix up resolved label offsets for labels at tail.873for &l in &self.labels_at_tail {874self.label_offsets[l.0 as usize] = cur_off;875}876// Old labels_at_this_branch are now at cur_off.877self.labels_at_tail.extend(b.labels_at_this_branch);878879// Post-invariant: this operation is defined to truncate the buffer,880// which moves cur_off backward, and to move labels at the end of the881// buffer back to the start-of-branch offset.882//883// latest_branches satisfies all invariants:884// - it has no branches past the end of the buffer (branches are in885// order, we removed the last one, and we truncated the buffer to just886// before the start of that branch)887// - no labels were moved to lower offsets than the (new) cur_off, so888// the labels_at_this_branch list for any other branch need not change.889//890// labels_at_tail satisfies all invariants:891// - all labels that were at the tail after the truncated branch are892// moved backward to just before the branch, which becomes the new tail;893// thus every element in the list should remain (ensured by `.extend()`894// above).895// - all labels that refer to the new tail, which is the start-offset of896// the truncated branch, must be present. The `labels_at_this_branch`897// list in the truncated branch's record is a complete and precise list898// of exactly these labels; we append these to labels_at_tail.899// - labels_at_tail_off is at cur_off after truncation occurs, so the900// list is valid (not to be lazily cleared).901//902// The stated operation was performed:903// - For each label at the end of the buffer prior to this method, it904// now resolves to the new (truncated) end of the buffer: it must have905// been in `labels_at_tail` (this list is precise and complete, and906// the tail was at the end of the truncated branch on entry), and we907// iterate over this list and set `label_offsets` to the new tail.908// None of these labels could have been an alias (by invariant), so909// `label_offsets` is authoritative for each.910// - No other labels will be past the end of the buffer, because of the911// requirement that no labels be bound to the middle of branch ranges912// (see comments to `add_{cond,uncond}_branch()`).913// - The buffer is truncated to just before the last branch, and the914// fixup record referring to that last branch is removed.915}916917/// Performs various optimizations on branches pointing at the current label.918pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {919if ctrl_plane.get_decision() {920return;921}922923self.lazily_clear_labels_at_tail();924// Invariants valid at this point.925926trace!(927"enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",928self.latest_branches, self.labels_at_tail, self.pending_fixup_records929);930931// We continue to munch on branches at the tail of the buffer until no932// more rules apply. Note that the loop only continues if a branch is933// actually truncated (or if labels are redirected away from a branch),934// so this always makes progress.935while let Some(b) = self.latest_branches.last() {936let cur_off = self.cur_offset();937trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);938// If there has been any code emission since the end of the last branch or939// label definition, then there's nothing we can edit (because we940// don't move code once placed, only back up and overwrite), so941// clear the records and finish.942if b.end < cur_off {943break;944}945946// If the "labels at this branch" list on this branch is947// longer than a threshold, don't do any simplification,948// and let the branch remain to separate those labels from949// the current tail. This avoids quadratic behavior (see950// #3468): otherwise, if a long string of "goto next;951// next:" patterns are emitted, all of the labels will952// coalesce into a long list of aliases for the current953// buffer tail. We must track all aliases of the current954// tail for correctness, but we are also allowed to skip955// optimization (removal) of any branch, so we take the956// escape hatch here and let it stand. In effect this957// "spreads" the many thousands of labels in the958// pathological case among an actual (harmless but959// suboptimal) instruction once per N labels.960if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {961break;962}963964// Invariant: we are looking at a branch that ends at the tail of965// the buffer.966967// For any branch, conditional or unconditional:968// - If the target is a label at the current offset, then remove969// the conditional branch, and reset all labels that targeted970// the current offset (end of branch) to the truncated971// end-of-code.972//973// Preserves execution semantics: a branch to its own fallthrough974// address is equivalent to a no-op; in both cases, nextPC is the975// fallthrough.976if self.resolve_label_offset(b.target) == cur_off {977trace!("branch with target == cur off; truncating");978self.truncate_last_branch();979continue;980}981982// If latest is an unconditional branch:983//984// - If the branch's target is not its own start address, then for985// each label at the start of branch, make the label an alias of the986// branch target, and remove the label from the "labels at this987// branch" list.988//989// - Preserves execution semantics: an unconditional branch's990// only effect is to set PC to a new PC; this change simply991// collapses one step in the step-semantics.992//993// - Post-invariant: the labels that were bound to the start of994// this branch become aliases, so they must not be present in any995// labels-at-this-branch list or the labels-at-tail list. The996// labels are removed form the latest-branch record's997// labels-at-this-branch list, and are never placed in the998// labels-at-tail list. Furthermore, it is correct that they are999// not in either list, because they are now aliases, and labels1000// that are aliases remain aliases forever.1001//1002// - If there is a prior unconditional branch that ends just before1003// this one begins, and this branch has no labels bound to its1004// start, then we can truncate this branch, because it is entirely1005// unreachable (we have redirected all labels that make it1006// reachable otherwise). Do so and continue around the loop.1007//1008// - Preserves execution semantics: the branch is unreachable,1009// because execution can only flow into an instruction from the1010// prior instruction's fallthrough or from a branch bound to that1011// instruction's start offset. Unconditional branches have no1012// fallthrough, so if the prior instruction is an unconditional1013// branch, no fallthrough entry can happen. The1014// labels-at-this-branch list is complete (by invariant), so if it1015// is empty, then the instruction is entirely unreachable. Thus,1016// it can be removed.1017//1018// - Post-invariant: ensured by truncate_last_branch().1019//1020// - If there is a prior conditional branch whose target label1021// resolves to the current offset (branches around the1022// unconditional branch), then remove the unconditional branch,1023// and make the target of the unconditional the target of the1024// conditional instead.1025//1026// - Preserves execution semantics: previously we had:1027//1028// L1:1029// cond_br L21030// br L31031// L2:1032// (end of buffer)1033//1034// by removing the last branch, we have:1035//1036// L1:1037// cond_br L21038// L2:1039// (end of buffer)1040//1041// we then fix up the records for the conditional branch to1042// have:1043//1044// L1:1045// cond_br.inverted L31046// L2:1047//1048// In the original code, control flow reaches L2 when the1049// conditional branch's predicate is true, and L3 otherwise. In1050// the optimized code, the same is true.1051//1052// - Post-invariant: all edits to latest_branches and1053// labels_at_tail are performed by `truncate_last_branch()`,1054// which maintains the invariants at each step.10551056if b.is_uncond() {1057// Set any label equal to current branch's start as an alias of1058// the branch's target, if the target is not the branch itself1059// (i.e., an infinite loop).1060//1061// We cannot perform this aliasing if the target of this branch1062// ultimately aliases back here; if so, we need to keep this1063// branch, so break out of this loop entirely (and clear the1064// latest-branches list below).1065//1066// Note that this check is what prevents cycles from forming in1067// `self.label_aliases`. To see why, consider an arbitrary start1068// state:1069//1070// label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to1071// Ln, which is not aliased.1072//1073// We would create a cycle if we assigned label_aliases[Ln]1074// = L1. Note that the below assignment is the only write1075// to label_aliases.1076//1077// By our other invariants, we have that Ln (`l` below)1078// resolves to the offset `b.start`, because it is in the1079// set `b.labels_at_this_branch`.1080//1081// If L1 were already aliased, through some arbitrarily deep1082// chain, to Ln, then it must also resolve to this offset1083// `b.start`.1084//1085// By checking the resolution of `L1` against this offset,1086// and aborting this branch-simplification if they are1087// equal, we prevent the below assignment from ever creating1088// a cycle.1089if self.resolve_label_offset(b.target) != b.start {1090let redirected = b.labels_at_this_branch.len();1091for &l in &b.labels_at_this_branch {1092trace!(1093" -> label at start of branch {:?} redirected to target {:?}",1094l, b.target1095);1096self.label_aliases[l.0 as usize] = b.target;1097// NOTE: we continue to ensure the invariant that labels1098// pointing to tail of buffer are in `labels_at_tail`1099// because we already ensured above that the last branch1100// cannot have a target of `cur_off`; so we never have1101// to put the label into `labels_at_tail` when moving it1102// here.1103}1104// Maintain invariant: all branches have been redirected1105// and are no longer pointing at the start of this branch.1106let mut_b = self.latest_branches.last_mut().unwrap();1107mut_b.labels_at_this_branch.clear();11081109if redirected > 0 {1110trace!(" -> after label redirects, restarting loop");1111continue;1112}1113} else {1114break;1115}11161117let b = self.latest_branches.last().unwrap();11181119// Examine any immediately preceding branch.1120if self.latest_branches.len() > 1 {1121let prev_b = &self.latest_branches[self.latest_branches.len() - 2];1122trace!(" -> more than one branch; prev_b = {:?}", prev_b);1123// This uncond is immediately after another uncond; we1124// should have already redirected labels to this uncond away1125// (but check to be sure); so we can truncate this uncond.1126if prev_b.is_uncond()1127&& prev_b.end == b.start1128&& b.labels_at_this_branch.is_empty()1129{1130trace!(" -> uncond follows another uncond; truncating");1131self.truncate_last_branch();1132continue;1133}11341135// This uncond is immediately after a conditional, and the1136// conditional's target is the end of this uncond, and we've1137// already redirected labels to this uncond away; so we can1138// truncate this uncond, flip the sense of the conditional, and1139// set the conditional's target (in `latest_branches` and in1140// `fixup_records`) to the uncond's target.1141if prev_b.is_cond()1142&& prev_b.end == b.start1143&& self.resolve_label_offset(prev_b.target) == cur_off1144{1145trace!(1146" -> uncond follows a conditional, and conditional's target resolves to current offset"1147);1148// Save the target of the uncond (this becomes the1149// target of the cond), and truncate the uncond.1150let target = b.target;1151let data = prev_b.inverted.clone().unwrap();1152self.truncate_last_branch();11531154// Mutate the code and cond branch.1155let off_before_edit = self.cur_offset();1156let prev_b = self.latest_branches.last_mut().unwrap();1157let not_inverted = SmallVec::from(1158&self.data[(prev_b.start as usize)..(prev_b.end as usize)],1159);11601161// Low-level edit: replaces bytes of branch with1162// inverted form. cur_off remains the same afterward, so1163// we do not need to modify label data structures.1164self.data.truncate(prev_b.start as usize);1165self.data.extend_from_slice(&data[..]);11661167// Save the original code as the inversion of the1168// inverted branch, in case we later edit this branch1169// again.1170prev_b.inverted = Some(not_inverted);1171self.pending_fixup_records[prev_b.fixup].label = target;1172trace!(" -> reassigning target of condbr to {:?}", target);1173prev_b.target = target;1174debug_assert_eq!(off_before_edit, self.cur_offset());1175continue;1176}1177}1178}11791180// If we couldn't do anything with the last branch, then break.1181break;1182}11831184self.purge_latest_branches();11851186trace!(1187"leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",1188self.latest_branches, self.labels_at_tail, self.pending_fixup_records1189);1190}11911192fn purge_latest_branches(&mut self) {1193// All of our branch simplification rules work only if a branch ends at1194// the tail of the buffer, with no following code; and branches are in1195// order in latest_branches; so if the last entry ends prior to1196// cur_offset, then clear all entries.1197let cur_off = self.cur_offset();1198if let Some(l) = self.latest_branches.last() {1199if l.end < cur_off {1200trace!("purge_latest_branches: removing branch {:?}", l);1201self.latest_branches.clear();1202}1203}12041205// Post-invariant: no invariant requires any branch to appear in1206// `latest_branches`; it is always optional. The list-clear above thus1207// preserves all semantics.1208}12091210/// Emit a trap at some point in the future with the specified code and1211/// stack map.1212///1213/// This function returns a [`MachLabel`] which will be the future address1214/// of the trap. Jumps should refer to this label, likely by using the1215/// [`MachBuffer::use_label_at_offset`] method, to get a relocation1216/// patched in once the address of the trap is known.1217///1218/// This will batch all traps into the end of the function.1219pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {1220let label = self.get_label();1221self.pending_traps.push(MachLabelTrap {1222label,1223code,1224loc: self.cur_srcloc.map(|(_start, loc)| loc),1225});1226label1227}12281229/// Is an island needed within the next N bytes?1230pub fn island_needed(&self, distance: CodeOffset) -> bool {1231let deadline = match self.fixup_records.peek() {1232Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),1233None => self.pending_fixup_deadline,1234};1235deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline1236}12371238/// Returns the maximal offset that islands can reach if `distance` more1239/// bytes are appended.1240///1241/// This is used to determine if veneers need insertions since jumps that1242/// can't reach past this point must get a veneer of some form.1243fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {1244// Assume that all fixups will require veneers and that the veneers are1245// the worst-case size for each platform. This is an over-generalization1246// to avoid iterating over the `fixup_records` list or maintaining1247// information about it as we go along.1248let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())1249as u32)1250* (I::LabelUse::worst_case_veneer_size())1251+ self.pending_constants_size1252+ (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;1253self.cur_offset()1254.saturating_add(distance)1255.saturating_add(island_worst_case_size)1256}12571258/// Emit all pending constants and required pending veneers.1259///1260/// Should only be called if `island_needed()` returns true, i.e., if we1261/// actually reach a deadline. It's not necessarily a problem to do so1262/// otherwise but it may result in unnecessary work during emission.1263pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {1264self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);1265}12661267/// Same as `emit_island`, but an internal API with a `force_veneers`1268/// argument to force all veneers to always get emitted for debugging.1269fn emit_island_maybe_forced(1270&mut self,1271force_veneers: ForceVeneers,1272distance: CodeOffset,1273ctrl_plane: &mut ControlPlane,1274) {1275// We're going to purge fixups, so no latest-branch editing can happen1276// anymore.1277self.latest_branches.clear();12781279// End the current location tracking since anything emitted during this1280// function shouldn't be attributed to whatever the current source1281// location is.1282//1283// Note that the current source location, if it's set right now, will be1284// restored at the end of this island emission.1285let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);1286if cur_loc.is_some() {1287self.end_srcloc();1288}12891290let forced_threshold = self.worst_case_end_of_island(distance);12911292// First flush out all traps/constants so we have more labels in case1293// fixups are applied against these labels.1294//1295// Note that traps are placed first since this typically happens at the1296// end of the function and for disassemblers we try to keep all the code1297// contiguously together.1298for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {1299// If this trap has source information associated with it then1300// emit this information for the trap instruction going out now too.1301if let Some(loc) = loc {1302self.start_srcloc(loc);1303}1304self.align_to(I::LabelUse::ALIGN);1305self.bind_label(label, ctrl_plane);1306self.add_trap(code);1307self.put_data(I::TRAP_OPCODE);1308if loc.is_some() {1309self.end_srcloc();1310}1311}13121313for constant in mem::take(&mut self.pending_constants) {1314let MachBufferConstant { align, size, .. } = self.constants[constant];1315let label = self.constants[constant].upcoming_label.take().unwrap();1316self.align_to(align);1317self.bind_label(label, ctrl_plane);1318self.used_constants.push((constant, self.cur_offset()));1319self.get_appended_space(size);1320}13211322// Either handle all pending fixups because they're ready or move them1323// onto the `BinaryHeap` tracking all pending fixups if they aren't1324// ready.1325assert!(self.latest_branches.is_empty());1326for fixup in mem::take(&mut self.pending_fixup_records) {1327if self.should_apply_fixup(&fixup, forced_threshold) {1328self.handle_fixup(fixup, force_veneers, forced_threshold);1329} else {1330self.fixup_records.push(fixup);1331}1332}1333self.pending_fixup_deadline = u32::MAX;1334while let Some(fixup) = self.fixup_records.peek() {1335trace!("emit_island: fixup {:?}", fixup);13361337// If this fixup shouldn't be applied, that means its label isn't1338// defined yet and there'll be remaining space to apply a veneer if1339// necessary in the future after this island. In that situation1340// because `fixup_records` is sorted by deadline this loop can1341// exit.1342if !self.should_apply_fixup(fixup, forced_threshold) {1343break;1344}13451346let fixup = self.fixup_records.pop().unwrap();1347self.handle_fixup(fixup, force_veneers, forced_threshold);1348}13491350if let Some(loc) = cur_loc {1351self.start_srcloc(loc);1352}1353}13541355fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {1356let label_offset = self.resolve_label_offset(fixup.label);1357label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold1358}13591360fn handle_fixup(1361&mut self,1362fixup: MachLabelFixup<I>,1363force_veneers: ForceVeneers,1364forced_threshold: CodeOffset,1365) {1366let MachLabelFixup {1367label,1368offset,1369kind,1370} = fixup;1371let start = offset as usize;1372let end = (offset + kind.patch_size()) as usize;1373let label_offset = self.resolve_label_offset(label);13741375if label_offset != UNKNOWN_LABEL_OFFSET {1376// If the offset of the label for this fixup is known then1377// we're going to do something here-and-now. We're either going1378// to patch the original offset because it's an in-bounds jump,1379// or we're going to generate a veneer, patch the fixup to jump1380// to the veneer, and then keep going.1381//1382// If the label comes after the original fixup, then we should1383// be guaranteed that the jump is in-bounds. Otherwise there's1384// a bug somewhere because this method wasn't called soon1385// enough. All forward-jumps are tracked and should get veneers1386// before their deadline comes and they're unable to jump1387// further.1388//1389// Otherwise if the label is before the fixup, then that's a1390// backwards jump. If it's past the maximum negative range1391// then we'll emit a veneer that to jump forward to which can1392// then jump backwards.1393let veneer_required = if label_offset >= offset {1394assert!((label_offset - offset) <= kind.max_pos_range());1395false1396} else {1397(offset - label_offset) > kind.max_neg_range()1398};1399trace!(1400" -> label_offset = {}, known, required = {} (pos {} neg {})",1401label_offset,1402veneer_required,1403kind.max_pos_range(),1404kind.max_neg_range()1405);14061407if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {1408self.emit_veneer(label, offset, kind);1409} else {1410let slice = &mut self.data[start..end];1411trace!(1412"patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"1413);1414kind.patch(slice, offset, label_offset);1415}1416} else {1417// If the offset of this label is not known at this time then1418// that means that a veneer is required because after this1419// island the target can't be in range of the original target.1420assert!(forced_threshold - offset > kind.max_pos_range());1421self.emit_veneer(label, offset, kind);1422}1423}14241425/// Emits a "veneer" the `kind` code at `offset` to jump to `label`.1426///1427/// This will generate extra machine code, using `kind`, to get a1428/// larger-jump-kind than `kind` allows. The code at `offset` is then1429/// patched to jump to our new code, and then the new code is enqueued for1430/// a fixup to get processed at some later time.1431fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {1432// If this `kind` doesn't support a veneer then that's a bug in the1433// backend because we need to implement support for such a veneer.1434assert!(1435kind.supports_veneer(),1436"jump beyond the range of {kind:?} but a veneer isn't supported",1437);14381439// Allocate space for a veneer in the island.1440self.align_to(I::LabelUse::ALIGN);1441let veneer_offset = self.cur_offset();1442trace!("making a veneer at {}", veneer_offset);1443let start = offset as usize;1444let end = (offset + kind.patch_size()) as usize;1445let slice = &mut self.data[start..end];1446// Patch the original label use to refer to the veneer.1447trace!(1448"patching original at offset {} to veneer offset {}",1449offset, veneer_offset1450);1451kind.patch(slice, offset, veneer_offset);1452// Generate the veneer.1453let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);1454let (veneer_fixup_off, veneer_label_use) =1455kind.generate_veneer(veneer_slice, veneer_offset);1456trace!(1457"generated veneer; fixup offset {}, label_use {:?}",1458veneer_fixup_off, veneer_label_use1459);1460// Register a new use of `label` with our new veneer fixup and1461// offset. This'll recalculate deadlines accordingly and1462// enqueue this fixup to get processed at some later1463// time.1464self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);1465}14661467fn finish_emission_maybe_forcing_veneers(1468&mut self,1469force_veneers: ForceVeneers,1470ctrl_plane: &mut ControlPlane,1471) {1472while !self.pending_constants.is_empty()1473|| !self.pending_traps.is_empty()1474|| !self.fixup_records.is_empty()1475|| !self.pending_fixup_records.is_empty()1476{1477// `emit_island()` will emit any pending veneers and constants, and1478// as a side-effect, will also take care of any fixups with resolved1479// labels eagerly.1480self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);1481}14821483// Ensure that all labels have been fixed up after the last island is emitted. This is a1484// full (release-mode) assert because an unresolved label means the emitted code is1485// incorrect.1486assert!(self.fixup_records.is_empty());1487assert!(self.pending_fixup_records.is_empty());1488}14891490/// Finish any deferred emissions and/or fixups.1491pub fn finish(1492mut self,1493constants: &VCodeConstants,1494ctrl_plane: &mut ControlPlane,1495) -> MachBufferFinalized<Stencil> {1496let _tt = timing::vcode_emit_finish();14971498self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);14991500let alignment = self.finish_constants(constants);15011502// Resolve all labels to their offsets.1503let finalized_relocs = self1504.relocs1505.iter()1506.map(|reloc| FinalizedMachReloc {1507offset: reloc.offset,1508kind: reloc.kind,1509addend: reloc.addend,1510target: match &reloc.target {1511RelocTarget::ExternalName(name) => {1512FinalizedRelocTarget::ExternalName(name.clone())1513}1514RelocTarget::Label(label) => {1515FinalizedRelocTarget::Func(self.resolve_label_offset(*label))1516}1517},1518})1519.collect();15201521let finalized_exception_handlers = self1522.exception_handlers1523.iter()1524.map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))1525.collect();15261527let mut srclocs = self.srclocs;1528srclocs.sort_by_key(|entry| entry.start);15291530MachBufferFinalized {1531data: self.data,1532relocs: finalized_relocs,1533traps: self.traps,1534call_sites: self.call_sites,1535exception_handlers: finalized_exception_handlers,1536srclocs,1537user_stack_maps: self.user_stack_maps,1538unwind_info: self.unwind_info,1539alignment,1540}1541}15421543/// Add an external relocation at the given offset.1544pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(1545&mut self,1546offset: CodeOffset,1547kind: Reloc,1548target: &T,1549addend: Addend,1550) {1551let target: RelocTarget = target.clone().into();1552// FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally1553// generate a label-use statement to track whether an island is possibly1554// needed to escape this function to actually get to the external name.1555// This is most likely to come up on AArch64 where calls between1556// functions use a 26-bit signed offset which gives +/- 64MB. This means1557// that if a function is 128MB in size and there's a call in the middle1558// it's impossible to reach the actual target. Also, while it's1559// technically possible to jump to the start of a function and then jump1560// further, island insertion below always inserts islands after1561// previously appended code so for Cranelift's own implementation this1562// is also a problem for 64MB functions on AArch64 which start with a1563// call instruction, those won't be able to escape.1564//1565// Ideally what needs to happen here is that a `LabelUse` is1566// transparently generated (or call-sites of this function are audited1567// to generate a `LabelUse` instead) and tracked internally. The actual1568// relocation would then change over time if and when a veneer is1569// inserted, where the relocation here would be patched by this1570// `MachBuffer` to jump to the veneer. The problem, though, is that all1571// this still needs to end up, in the case of a singular function,1572// generating a final relocation pointing either to this particular1573// relocation or to the veneer inserted. Additionally1574// `MachBuffer` needs the concept of a label which will never be1575// resolved, so `emit_island` doesn't trip over not actually ever1576// knowning what some labels are. Currently the loop in1577// `finish_emission_maybe_forcing_veneers` would otherwise infinitely1578// loop.1579//1580// For now this means that because relocs aren't tracked at all that1581// AArch64 functions have a rough size limits of 64MB. For now that's1582// somewhat reasonable and the failure mode is a panic in `MachBuffer`1583// when a relocation can't otherwise be resolved later, so it shouldn't1584// actually result in any memory unsafety or anything like that.1585self.relocs.push(MachReloc {1586offset,1587kind,1588target,1589addend,1590});1591}15921593/// Add an external relocation at the current offset.1594pub fn add_reloc<T: Into<RelocTarget> + Clone>(1595&mut self,1596kind: Reloc,1597target: &T,1598addend: Addend,1599) {1600self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);1601}16021603/// Add a trap record at the current offset.1604pub fn add_trap(&mut self, code: TrapCode) {1605self.traps.push(MachTrap {1606offset: self.data.len() as CodeOffset,1607code,1608});1609}16101611/// Add a call-site record at the current offset.1612pub fn add_call_site(&mut self) {1613self.add_try_call_site(None, core::iter::empty());1614}16151616/// Add a call-site record at the current offset with exception1617/// handlers.1618pub fn add_try_call_site(1619&mut self,1620frame_offset: Option<u32>,1621exception_handlers: impl Iterator<Item = MachExceptionHandler>,1622) {1623let start = u32::try_from(self.exception_handlers.len()).unwrap();1624self.exception_handlers.extend(exception_handlers);1625let end = u32::try_from(self.exception_handlers.len()).unwrap();1626let exception_handler_range = start..end;16271628self.call_sites.push(MachCallSite {1629ret_addr: self.data.len() as CodeOffset,1630frame_offset,1631exception_handler_range,1632});1633}16341635/// Add an unwind record at the current offset.1636pub fn add_unwind(&mut self, unwind: UnwindInst) {1637self.unwind_info.push((self.cur_offset(), unwind));1638}16391640/// Set the `SourceLoc` for code from this offset until the offset at the1641/// next call to `end_srcloc()`.1642/// Returns the current [CodeOffset] and [RelSourceLoc].1643pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {1644let cur = (self.cur_offset(), loc);1645self.cur_srcloc = Some(cur);1646cur1647}16481649/// Mark the end of the `SourceLoc` segment started at the last1650/// `start_srcloc()` call.1651pub fn end_srcloc(&mut self) {1652let (start, loc) = self1653.cur_srcloc1654.take()1655.expect("end_srcloc() called without start_srcloc()");1656let end = self.cur_offset();1657// Skip zero-length extends.1658debug_assert!(end >= start);1659if end > start {1660self.srclocs.push(MachSrcLoc { start, end, loc });1661}1662}16631664/// Push a user stack map onto this buffer.1665///1666/// The stack map is associated with the given `return_addr` code1667/// offset. This must be the PC for the instruction just *after* this stack1668/// map's associated instruction. For example in the sequence `call $foo;1669/// add r8, rax`, the `return_addr` must be the offset of the start of the1670/// `add` instruction.1671///1672/// Stack maps must be pushed in sorted `return_addr` order.1673pub fn push_user_stack_map(1674&mut self,1675emit_state: &I::State,1676return_addr: CodeOffset,1677mut stack_map: ir::UserStackMap,1678) {1679let span = emit_state.frame_layout().active_size();1680trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");16811682debug_assert!(1683self.user_stack_maps1684.last()1685.map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),1686"pushed stack maps out of order: {} is not less than {}",1687self.user_stack_maps.last().unwrap().0,1688return_addr,1689);16901691stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());1692self.user_stack_maps.push((return_addr, span, stack_map));1693}16941695/// Increase the alignment of the buffer to the given alignment if bigger1696/// than the current alignment.1697pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {1698self.min_alignment = self.min_alignment.max(16991u32.checked_shl(u32::from(align_to))1700.expect("log2_min_function_alignment too large"),1701);1702}1703}17041705impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {1706fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {1707for b in iter {1708self.put1(b);1709}1710}1711}17121713impl<T: CompilePhase> MachBufferFinalized<T> {1714/// Get a list of source location mapping tuples in sorted-by-start-offset order.1715pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {1716&self.srclocs[..]1717}17181719/// Get the total required size for the code.1720pub fn total_size(&self) -> CodeOffset {1721self.data.len() as CodeOffset1722}17231724/// Return the code in this mach buffer as a hex string for testing purposes.1725pub fn stringify_code_bytes(&self) -> String {1726// This is pretty lame, but whatever ..1727use std::fmt::Write;1728let mut s = String::with_capacity(self.data.len() * 2);1729for b in &self.data {1730write!(&mut s, "{b:02X}").unwrap();1731}1732s1733}17341735/// Get the code bytes.1736pub fn data(&self) -> &[u8] {1737// N.B.: we emit every section into the .text section as far as1738// the `CodeSink` is concerned; we do not bother to segregate1739// the contents into the actual program text, the jumptable and the1740// rodata (constant pool). This allows us to generate code assuming1741// that these will not be relocated relative to each other, and avoids1742// having to designate each section as belonging in one of the three1743// fixed categories defined by `CodeSink`. If this becomes a problem1744// later (e.g. because of memory permissions or similar), we can1745// add this designation and segregate the output; take care, however,1746// to add the appropriate relocations in this case.17471748&self.data[..]1749}17501751/// Get the list of external relocations for this code.1752pub fn relocs(&self) -> &[FinalizedMachReloc] {1753&self.relocs[..]1754}17551756/// Get the list of trap records for this code.1757pub fn traps(&self) -> &[MachTrap] {1758&self.traps[..]1759}17601761/// Get the user stack map metadata for this code.1762pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {1763&self.user_stack_maps1764}17651766/// Take this buffer's user strack map metadata.1767pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {1768mem::take(&mut self.user_stack_maps)1769}17701771/// Get the list of call sites for this code, along with1772/// associated exception handlers.1773///1774/// Each item yielded by the returned iterator is a struct with:1775///1776/// - The call site metadata record, with a `ret_addr` field1777/// directly accessible and denoting the offset of the return1778/// address into this buffer's code.1779/// - The slice of pairs of exception tags and code offsets1780/// denoting exception-handler entry points associated with this1781/// call site.1782pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {1783self.call_sites.iter().map(|call_site| {1784let handler_range = call_site.exception_handler_range.clone();1785let handler_range = usize::try_from(handler_range.start).unwrap()1786..usize::try_from(handler_range.end).unwrap();1787FinalizedMachCallSite {1788ret_addr: call_site.ret_addr,1789frame_offset: call_site.frame_offset,1790exception_handlers: &self.exception_handlers[handler_range],1791}1792})1793}1794}17951796/// An item in the exception-handler list for a callsite, with label1797/// references. Items are interpreted in left-to-right order and the1798/// first match wins.1799#[derive(Clone, Copy, Debug, PartialEq, Eq)]1800pub enum MachExceptionHandler {1801/// A specific tag (in the current dynamic context) should be1802/// handled by the code at the given offset.1803Tag(ExceptionTag, MachLabel),1804/// All exceptions should be handled by the code at the given1805/// offset.1806Default(MachLabel),1807/// The dynamic context for interpreting tags is updated to the1808/// value stored in the given machine location (in this frame's1809/// context).1810Context(ExceptionContextLoc),1811}18121813impl MachExceptionHandler {1814fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {1815match self {1816Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),1817Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),1818Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),1819}1820}1821}18221823/// An item in the exception-handler list for a callsite, with final1824/// (lowered) code offsets. Items are interpreted in left-to-right1825/// order and the first match wins.1826#[derive(Clone, Copy, Debug, PartialEq, Eq)]1827#[cfg_attr(1828feature = "enable-serde",1829derive(serde_derive::Serialize, serde_derive::Deserialize)1830)]1831pub enum FinalizedMachExceptionHandler {1832/// A specific tag (in the current dynamic context) should be1833/// handled by the code at the given offset.1834Tag(ExceptionTag, CodeOffset),1835/// All exceptions should be handled by the code at the given1836/// offset.1837Default(CodeOffset),1838/// The dynamic context for interpreting tags is updated to the1839/// value stored in the given machine location (in this frame's1840/// context).1841Context(ExceptionContextLoc),1842}18431844/// A location for a dynamic exception context value.1845#[derive(Clone, Copy, Debug, PartialEq, Eq)]1846#[cfg_attr(1847feature = "enable-serde",1848derive(serde_derive::Serialize, serde_derive::Deserialize)1849)]1850pub enum ExceptionContextLoc {1851/// An offset from SP at the callsite.1852SPOffset(u32),1853/// A GPR at the callsite. The physical register number for the1854/// GPR register file on the target architecture is used.1855GPR(u8),1856}18571858/// Metadata about a constant.1859struct MachBufferConstant {1860/// A label which has not yet been bound which can be used for this1861/// constant.1862///1863/// This is lazily created when a label is requested for a constant and is1864/// cleared when a constant is emitted.1865upcoming_label: Option<MachLabel>,1866/// Required alignment.1867align: CodeOffset,1868/// The byte size of this constant.1869size: usize,1870}18711872/// A trap that is deferred to the next time an island is emitted for either1873/// traps, constants, or fixups.1874struct MachLabelTrap {1875/// This label will refer to the trap's offset.1876label: MachLabel,1877/// The code associated with this trap.1878code: TrapCode,1879/// An optional source location to assign for this trap.1880loc: Option<RelSourceLoc>,1881}18821883/// A fixup to perform on the buffer once code is emitted. Fixups always refer1884/// to labels and patch the code based on label offsets. Hence, they are like1885/// relocations, but internal to one buffer.1886#[derive(Debug)]1887struct MachLabelFixup<I: VCodeInst> {1888/// The label whose offset controls this fixup.1889label: MachLabel,1890/// The offset to fix up / patch to refer to this label.1891offset: CodeOffset,1892/// The kind of fixup. This is architecture-specific; each architecture may have,1893/// e.g., several types of branch instructions, each with differently-sized1894/// offset fields and different places within the instruction to place the1895/// bits.1896kind: I::LabelUse,1897}18981899impl<I: VCodeInst> MachLabelFixup<I> {1900fn deadline(&self) -> CodeOffset {1901self.offset.saturating_add(self.kind.max_pos_range())1902}1903}19041905impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {1906fn eq(&self, other: &Self) -> bool {1907self.deadline() == other.deadline()1908}1909}19101911impl<I: VCodeInst> Eq for MachLabelFixup<I> {}19121913impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {1914fn partial_cmp(&self, other: &Self) -> Option<Ordering> {1915Some(self.cmp(other))1916}1917}19181919impl<I: VCodeInst> Ord for MachLabelFixup<I> {1920fn cmp(&self, other: &Self) -> Ordering {1921other.deadline().cmp(&self.deadline())1922}1923}19241925/// A relocation resulting from a compilation.1926#[derive(Clone, Debug, PartialEq)]1927#[cfg_attr(1928feature = "enable-serde",1929derive(serde_derive::Serialize, serde_derive::Deserialize)1930)]1931pub struct MachRelocBase<T> {1932/// The offset at which the relocation applies, *relative to the1933/// containing section*.1934pub offset: CodeOffset,1935/// The kind of relocation.1936pub kind: Reloc,1937/// The external symbol / name to which this relocation refers.1938pub target: T,1939/// The addend to add to the symbol value.1940pub addend: i64,1941}19421943type MachReloc = MachRelocBase<RelocTarget>;19441945/// A relocation resulting from a compilation.1946pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;19471948/// A Relocation target1949#[derive(Debug, Clone, PartialEq, Eq, Hash)]1950pub enum RelocTarget {1951/// Points to an [ExternalName] outside the current function.1952ExternalName(ExternalName),1953/// Points to a [MachLabel] inside this function.1954/// This is different from [MachLabelFixup] in that both the relocation and the1955/// label will be emitted and are only resolved at link time.1956///1957/// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.1958Label(MachLabel),1959}19601961impl From<ExternalName> for RelocTarget {1962fn from(name: ExternalName) -> Self {1963Self::ExternalName(name)1964}1965}19661967impl From<MachLabel> for RelocTarget {1968fn from(label: MachLabel) -> Self {1969Self::Label(label)1970}1971}19721973/// A Relocation target1974#[derive(Debug, Clone, PartialEq, Eq, Hash)]1975#[cfg_attr(1976feature = "enable-serde",1977derive(serde_derive::Serialize, serde_derive::Deserialize)1978)]1979pub enum FinalizedRelocTarget {1980/// Points to an [ExternalName] outside the current function.1981ExternalName(ExternalName),1982/// Points to a [CodeOffset] from the start of the current function.1983Func(CodeOffset),1984}19851986impl FinalizedRelocTarget {1987/// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the1988/// output.1989pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {1990match self {1991FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),1992FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),1993}1994}1995}19961997/// A trap record resulting from a compilation.1998#[derive(Clone, Debug, PartialEq)]1999#[cfg_attr(2000feature = "enable-serde",2001derive(serde_derive::Serialize, serde_derive::Deserialize)2002)]2003pub struct MachTrap {2004/// The offset at which the trap instruction occurs, *relative to the2005/// containing section*.2006pub offset: CodeOffset,2007/// The trap code.2008pub code: TrapCode,2009}20102011/// A call site record resulting from a compilation.2012#[derive(Clone, Debug, PartialEq)]2013#[cfg_attr(2014feature = "enable-serde",2015derive(serde_derive::Serialize, serde_derive::Deserialize)2016)]2017pub struct MachCallSite {2018/// The offset of the call's return address, *relative to the2019/// start of the buffer*.2020pub ret_addr: CodeOffset,20212022/// The offset from the FP at this callsite down to the SP when2023/// the call occurs, if known. In other words, the size of the2024/// stack frame up to the saved FP slot. Useful to recover the2025/// start of the stack frame and to look up dynamic contexts2026/// stored in [`ExceptionContextLoc::SPOffset`].2027///2028/// If `None`, the compiler backend did not specify a frame2029/// offset. The runtime in use with the compiled code may require2030/// the frame offset if exception handlers are present or dynamic2031/// context is used, but that is not Cranelift's concern: the2032/// frame offset is optional at this level.2033pub frame_offset: Option<u32>,20342035/// Range in `exception_handlers` corresponding to the exception2036/// handlers for this callsite.2037exception_handler_range: Range<u32>,2038}20392040/// A call site record resulting from a compilation.2041#[derive(Clone, Debug, PartialEq)]2042pub struct FinalizedMachCallSite<'a> {2043/// The offset of the call's return address, *relative to the2044/// start of the buffer*.2045pub ret_addr: CodeOffset,20462047/// The offset from the FP at this callsite down to the SP when2048/// the call occurs, if known. In other words, the size of the2049/// stack frame up to the saved FP slot. Useful to recover the2050/// start of the stack frame and to look up dynamic contexts2051/// stored in [`ExceptionContextLoc::SPOffset`].2052///2053/// If `None`, the compiler backend did not specify a frame2054/// offset. The runtime in use with the compiled code may require2055/// the frame offset if exception handlers are present or dynamic2056/// context is used, but that is not Cranelift's concern: the2057/// frame offset is optional at this level.2058pub frame_offset: Option<u32>,20592060/// Exception handlers at this callsite, with target offsets2061/// *relative to the start of the buffer*.2062pub exception_handlers: &'a [FinalizedMachExceptionHandler],2063}20642065/// A source-location mapping resulting from a compilation.2066#[derive(PartialEq, Debug, Clone)]2067#[cfg_attr(2068feature = "enable-serde",2069derive(serde_derive::Serialize, serde_derive::Deserialize)2070)]2071pub struct MachSrcLoc<T: CompilePhase> {2072/// The start of the region of code corresponding to a source location.2073/// This is relative to the start of the function, not to the start of the2074/// section.2075pub start: CodeOffset,2076/// The end of the region of code corresponding to a source location.2077/// This is relative to the start of the function, not to the start of the2078/// section.2079pub end: CodeOffset,2080/// The source location.2081pub loc: T::SourceLocType,2082}20832084impl MachSrcLoc<Stencil> {2085fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {2086MachSrcLoc {2087start: self.start,2088end: self.end,2089loc: self.loc.expand(base_srcloc),2090}2091}2092}20932094/// Record of branch instruction in the buffer, to facilitate editing.2095#[derive(Clone, Debug)]2096struct MachBranch {2097start: CodeOffset,2098end: CodeOffset,2099target: MachLabel,2100fixup: usize,2101inverted: Option<SmallVec<[u8; 8]>>,2102/// All labels pointing to the start of this branch. For correctness, this2103/// *must* be complete (i.e., must contain all labels whose resolved offsets2104/// are at the start of this branch): we rely on being able to redirect all2105/// labels that could jump to this branch before removing it, if it is2106/// otherwise unreachable.2107labels_at_this_branch: SmallVec<[MachLabel; 4]>,2108}21092110impl MachBranch {2111fn is_cond(&self) -> bool {2112self.inverted.is_some()2113}2114fn is_uncond(&self) -> bool {2115self.inverted.is_none()2116}2117}21182119/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.2120///2121/// Note that `MachBuffer` was primarily written for intra-function references2122/// of jumps between basic blocks, but it's also quite usable for entire text2123/// sections and resolving references between functions themselves. This2124/// builder interprets "blocks" as labeled functions for the purposes of2125/// resolving labels internally in the buffer.2126pub struct MachTextSectionBuilder<I: VCodeInst> {2127buf: MachBuffer<I>,2128next_func: usize,2129force_veneers: ForceVeneers,2130}21312132impl<I: VCodeInst> MachTextSectionBuilder<I> {2133/// Creates a new text section builder which will have `num_funcs` functions2134/// pushed into it.2135pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {2136let mut buf = MachBuffer::new();2137buf.reserve_labels_for_blocks(num_funcs);2138MachTextSectionBuilder {2139buf,2140next_func: 0,2141force_veneers: ForceVeneers::No,2142}2143}2144}21452146impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {2147fn append(2148&mut self,2149labeled: bool,2150func: &[u8],2151align: u32,2152ctrl_plane: &mut ControlPlane,2153) -> u64 {2154// Conditionally emit an island if it's necessary to resolve jumps2155// between functions which are too far away.2156let size = func.len() as u32;2157if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {2158self.buf2159.emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);2160}21612162self.buf.align_to(align);2163let pos = self.buf.cur_offset();2164if labeled {2165self.buf.bind_label(2166MachLabel::from_block(BlockIndex::new(self.next_func)),2167ctrl_plane,2168);2169self.next_func += 1;2170}2171self.buf.put_data(func);2172u64::from(pos)2173}21742175fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {2176crate::trace!(2177"Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"2178);2179let label = MachLabel::from_block(BlockIndex::new(target));2180let offset = u32::try_from(offset).unwrap();2181match I::LabelUse::from_reloc(reloc, addend) {2182Some(label_use) => {2183self.buf.use_label_at_offset(offset, label, label_use);2184true2185}2186None => false,2187}2188}21892190fn force_veneers(&mut self) {2191self.force_veneers = ForceVeneers::Yes;2192}21932194fn write(&mut self, offset: u64, data: &[u8]) {2195self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);2196}21972198fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {2199// Double-check all functions were pushed.2200assert_eq!(self.next_func, self.buf.label_offsets.len());22012202// Finish up any veneers, if necessary.2203self.buf2204.finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);22052206// We don't need the data any more, so return it to the caller.2207mem::take(&mut self.buf.data).into_vec()2208}2209}22102211// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.2212#[cfg(all(test, feature = "arm64"))]2213mod test {2214use cranelift_entity::EntityRef as _;22152216use super::*;2217use crate::ir::UserExternalNameRef;2218use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};2219use crate::isa::aarch64::inst::{OperandSize, xreg};2220use crate::machinst::{MachInstEmit, MachInstEmitState};2221use crate::settings;22222223fn label(n: u32) -> MachLabel {2224MachLabel::from_block(BlockIndex::new(n as usize))2225}2226fn target(n: u32) -> BranchTarget {2227BranchTarget::Label(label(n))2228}22292230#[test]2231fn test_elide_jump_to_next() {2232let info = EmitInfo::new(settings::Flags::new(settings::builder()));2233let mut buf = MachBuffer::new();2234let mut state = <Inst as MachInstEmit>::State::default();2235let constants = Default::default();22362237buf.reserve_labels_for_blocks(2);2238buf.bind_label(label(0), state.ctrl_plane_mut());2239let inst = Inst::Jump { dest: target(1) };2240inst.emit(&mut buf, &info, &mut state);2241buf.bind_label(label(1), state.ctrl_plane_mut());2242let buf = buf.finish(&constants, state.ctrl_plane_mut());2243assert_eq!(0, buf.total_size());2244}22452246#[test]2247fn test_elide_trivial_jump_blocks() {2248let info = EmitInfo::new(settings::Flags::new(settings::builder()));2249let mut buf = MachBuffer::new();2250let mut state = <Inst as MachInstEmit>::State::default();2251let constants = Default::default();22522253buf.reserve_labels_for_blocks(4);22542255buf.bind_label(label(0), state.ctrl_plane_mut());2256let inst = Inst::CondBr {2257kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),2258taken: target(1),2259not_taken: target(2),2260};2261inst.emit(&mut buf, &info, &mut state);22622263buf.bind_label(label(1), state.ctrl_plane_mut());2264let inst = Inst::Jump { dest: target(3) };2265inst.emit(&mut buf, &info, &mut state);22662267buf.bind_label(label(2), state.ctrl_plane_mut());2268let inst = Inst::Jump { dest: target(3) };2269inst.emit(&mut buf, &info, &mut state);22702271buf.bind_label(label(3), state.ctrl_plane_mut());22722273let buf = buf.finish(&constants, state.ctrl_plane_mut());2274assert_eq!(0, buf.total_size());2275}22762277#[test]2278fn test_flip_cond() {2279let info = EmitInfo::new(settings::Flags::new(settings::builder()));2280let mut buf = MachBuffer::new();2281let mut state = <Inst as MachInstEmit>::State::default();2282let constants = Default::default();22832284buf.reserve_labels_for_blocks(4);22852286buf.bind_label(label(0), state.ctrl_plane_mut());2287let inst = Inst::CondBr {2288kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),2289taken: target(1),2290not_taken: target(2),2291};2292inst.emit(&mut buf, &info, &mut state);22932294buf.bind_label(label(1), state.ctrl_plane_mut());2295let inst = Inst::Nop4;2296inst.emit(&mut buf, &info, &mut state);22972298buf.bind_label(label(2), state.ctrl_plane_mut());2299let inst = Inst::Udf {2300trap_code: TrapCode::STACK_OVERFLOW,2301};2302inst.emit(&mut buf, &info, &mut state);23032304buf.bind_label(label(3), state.ctrl_plane_mut());23052306let buf = buf.finish(&constants, state.ctrl_plane_mut());23072308let mut buf2 = MachBuffer::new();2309let mut state = Default::default();2310let inst = Inst::TrapIf {2311kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),2312trap_code: TrapCode::STACK_OVERFLOW,2313};2314inst.emit(&mut buf2, &info, &mut state);2315let inst = Inst::Nop4;2316inst.emit(&mut buf2, &info, &mut state);23172318let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());23192320assert_eq!(buf.data, buf2.data);2321}23222323#[test]2324fn test_island() {2325let info = EmitInfo::new(settings::Flags::new(settings::builder()));2326let mut buf = MachBuffer::new();2327let mut state = <Inst as MachInstEmit>::State::default();2328let constants = Default::default();23292330buf.reserve_labels_for_blocks(4);23312332buf.bind_label(label(0), state.ctrl_plane_mut());2333let inst = Inst::CondBr {2334kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),2335taken: target(2),2336not_taken: target(3),2337};2338inst.emit(&mut buf, &info, &mut state);23392340buf.bind_label(label(1), state.ctrl_plane_mut());2341while buf.cur_offset() < 2000000 {2342if buf.island_needed(0) {2343buf.emit_island(0, state.ctrl_plane_mut());2344}2345let inst = Inst::Nop4;2346inst.emit(&mut buf, &info, &mut state);2347}23482349buf.bind_label(label(2), state.ctrl_plane_mut());2350let inst = Inst::Nop4;2351inst.emit(&mut buf, &info, &mut state);23522353buf.bind_label(label(3), state.ctrl_plane_mut());2354let inst = Inst::Nop4;2355inst.emit(&mut buf, &info, &mut state);23562357let buf = buf.finish(&constants, state.ctrl_plane_mut());23582359assert_eq!(2000000 + 8, buf.total_size());23602361let mut buf2 = MachBuffer::new();2362let mut state = Default::default();2363let inst = Inst::CondBr {2364kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),23652366// This conditionally taken branch has a 19-bit constant, shifted2367// to the left by two, giving us a 21-bit range in total. Half of2368// this range positive so the we should be around 1 << 20 bytes2369// away for our jump target.2370//2371// There are two pending fixups by the time we reach this point,2372// one for this 19-bit jump and one for the unconditional 26-bit2373// jump below. A 19-bit veneer is 4 bytes large and the 26-bit2374// veneer is 20 bytes large, which means that pessimistically2375// assuming we'll need two veneers. Currently each veneer is2376// pessimistically assumed to be the maximal size which means we2377// need 40 bytes of extra space, meaning that the actual island2378// should come 40-bytes before the deadline.2379taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),23802381// This branch is in-range so no veneers should be needed, it should2382// go directly to the target.2383not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),2384};2385inst.emit(&mut buf2, &info, &mut state);23862387let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());23882389assert_eq!(&buf.data[0..8], &buf2.data[..]);2390}23912392#[test]2393fn test_island_backward() {2394let info = EmitInfo::new(settings::Flags::new(settings::builder()));2395let mut buf = MachBuffer::new();2396let mut state = <Inst as MachInstEmit>::State::default();2397let constants = Default::default();23982399buf.reserve_labels_for_blocks(4);24002401buf.bind_label(label(0), state.ctrl_plane_mut());2402let inst = Inst::Nop4;2403inst.emit(&mut buf, &info, &mut state);24042405buf.bind_label(label(1), state.ctrl_plane_mut());2406let inst = Inst::Nop4;2407inst.emit(&mut buf, &info, &mut state);24082409buf.bind_label(label(2), state.ctrl_plane_mut());2410while buf.cur_offset() < 2000000 {2411let inst = Inst::Nop4;2412inst.emit(&mut buf, &info, &mut state);2413}24142415buf.bind_label(label(3), state.ctrl_plane_mut());2416let inst = Inst::CondBr {2417kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),2418taken: target(0),2419not_taken: target(1),2420};2421inst.emit(&mut buf, &info, &mut state);24222423let buf = buf.finish(&constants, state.ctrl_plane_mut());24242425assert_eq!(2000000 + 12, buf.total_size());24262427let mut buf2 = MachBuffer::new();2428let mut state = Default::default();2429let inst = Inst::CondBr {2430kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),2431taken: BranchTarget::ResolvedOffset(8),2432not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),2433};2434inst.emit(&mut buf2, &info, &mut state);2435let inst = Inst::Jump {2436dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),2437};2438inst.emit(&mut buf2, &info, &mut state);24392440let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());24412442assert_eq!(&buf.data[2000000..], &buf2.data[..]);2443}24442445#[test]2446fn test_multiple_redirect() {2447// label0:2448// cbz x0, label12449// b label22450// label1:2451// b label32452// label2:2453// nop2454// nop2455// b label02456// label3:2457// b label42458// label4:2459// b label52460// label5:2461// b label72462// label6:2463// nop2464// label7:2465// ret2466//2467// -- should become:2468//2469// label0:2470// cbz x0, label72471// label2:2472// nop2473// nop2474// b label02475// label6:2476// nop2477// label7:2478// ret24792480let info = EmitInfo::new(settings::Flags::new(settings::builder()));2481let mut buf = MachBuffer::new();2482let mut state = <Inst as MachInstEmit>::State::default();2483let constants = Default::default();24842485buf.reserve_labels_for_blocks(8);24862487buf.bind_label(label(0), state.ctrl_plane_mut());2488let inst = Inst::CondBr {2489kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),2490taken: target(1),2491not_taken: target(2),2492};2493inst.emit(&mut buf, &info, &mut state);24942495buf.bind_label(label(1), state.ctrl_plane_mut());2496let inst = Inst::Jump { dest: target(3) };2497inst.emit(&mut buf, &info, &mut state);24982499buf.bind_label(label(2), state.ctrl_plane_mut());2500let inst = Inst::Nop4;2501inst.emit(&mut buf, &info, &mut state);2502inst.emit(&mut buf, &info, &mut state);2503let inst = Inst::Jump { dest: target(0) };2504inst.emit(&mut buf, &info, &mut state);25052506buf.bind_label(label(3), state.ctrl_plane_mut());2507let inst = Inst::Jump { dest: target(4) };2508inst.emit(&mut buf, &info, &mut state);25092510buf.bind_label(label(4), state.ctrl_plane_mut());2511let inst = Inst::Jump { dest: target(5) };2512inst.emit(&mut buf, &info, &mut state);25132514buf.bind_label(label(5), state.ctrl_plane_mut());2515let inst = Inst::Jump { dest: target(7) };2516inst.emit(&mut buf, &info, &mut state);25172518buf.bind_label(label(6), state.ctrl_plane_mut());2519let inst = Inst::Nop4;2520inst.emit(&mut buf, &info, &mut state);25212522buf.bind_label(label(7), state.ctrl_plane_mut());2523let inst = Inst::Ret {};2524inst.emit(&mut buf, &info, &mut state);25252526let buf = buf.finish(&constants, state.ctrl_plane_mut());25272528let golden_data = vec![25290xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x1425300x1f, 0x20, 0x03, 0xd5, // nop25310x1f, 0x20, 0x03, 0xd5, // nop25320xfd, 0xff, 0xff, 0x17, // b 025330x1f, 0x20, 0x03, 0xd5, // nop25340xc0, 0x03, 0x5f, 0xd6, // ret2535];25362537assert_eq!(&golden_data[..], &buf.data[..]);2538}25392540#[test]2541fn test_handle_branch_cycle() {2542// label0:2543// b label12544// label1:2545// b label22546// label2:2547// b label32548// label3:2549// b label42550// label4:2551// b label1 // note: not label0 (to make it interesting).2552//2553// -- should become:2554//2555// label0, label1, ..., label4:2556// b label02557let info = EmitInfo::new(settings::Flags::new(settings::builder()));2558let mut buf = MachBuffer::new();2559let mut state = <Inst as MachInstEmit>::State::default();2560let constants = Default::default();25612562buf.reserve_labels_for_blocks(5);25632564buf.bind_label(label(0), state.ctrl_plane_mut());2565let inst = Inst::Jump { dest: target(1) };2566inst.emit(&mut buf, &info, &mut state);25672568buf.bind_label(label(1), state.ctrl_plane_mut());2569let inst = Inst::Jump { dest: target(2) };2570inst.emit(&mut buf, &info, &mut state);25712572buf.bind_label(label(2), state.ctrl_plane_mut());2573let inst = Inst::Jump { dest: target(3) };2574inst.emit(&mut buf, &info, &mut state);25752576buf.bind_label(label(3), state.ctrl_plane_mut());2577let inst = Inst::Jump { dest: target(4) };2578inst.emit(&mut buf, &info, &mut state);25792580buf.bind_label(label(4), state.ctrl_plane_mut());2581let inst = Inst::Jump { dest: target(1) };2582inst.emit(&mut buf, &info, &mut state);25832584let buf = buf.finish(&constants, state.ctrl_plane_mut());25852586let golden_data = vec![25870x00, 0x00, 0x00, 0x14, // b 02588];25892590assert_eq!(&golden_data[..], &buf.data[..]);2591}25922593#[test]2594fn metadata_records() {2595let mut buf = MachBuffer::<Inst>::new();2596let ctrl_plane = &mut Default::default();2597let constants = Default::default();25982599buf.reserve_labels_for_blocks(3);26002601buf.bind_label(label(0), ctrl_plane);2602buf.put1(1);2603buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);2604buf.put1(2);2605buf.add_trap(TrapCode::INTEGER_OVERFLOW);2606buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);2607buf.add_try_call_site(2608Some(0x10),2609[2610MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),2611MachExceptionHandler::Default(label(1)),2612]2613.into_iter(),2614);2615buf.add_reloc(2616Reloc::Abs4,2617&ExternalName::User(UserExternalNameRef::new(0)),26180,2619);2620buf.put1(3);2621buf.add_reloc(2622Reloc::Abs8,2623&ExternalName::User(UserExternalNameRef::new(1)),26241,2625);2626buf.put1(4);2627buf.bind_label(label(1), ctrl_plane);2628buf.put1(0xff);2629buf.bind_label(label(2), ctrl_plane);2630buf.put1(0xff);26312632let buf = buf.finish(&constants, ctrl_plane);26332634assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);2635assert_eq!(2636buf.traps()2637.iter()2638.map(|trap| (trap.offset, trap.code))2639.collect::<Vec<_>>(),2640vec![2641(1, TrapCode::HEAP_OUT_OF_BOUNDS),2642(2, TrapCode::INTEGER_OVERFLOW),2643(2, TrapCode::INTEGER_DIVISION_BY_ZERO)2644]2645);2646let call_sites: Vec<_> = buf.call_sites().collect();2647assert_eq!(call_sites[0].ret_addr, 2);2648assert_eq!(call_sites[0].frame_offset, Some(0x10));2649assert_eq!(2650call_sites[0].exception_handlers,2651&[2652FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),2653FinalizedMachExceptionHandler::Default(4)2654],2655);2656assert_eq!(2657buf.relocs()2658.iter()2659.map(|reloc| (reloc.offset, reloc.kind))2660.collect::<Vec<_>>(),2661vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]2662);2663}2664}266526662667