Path: cranelift/codegen/src/machinst/buffer.rs
//! In-memory representation of compiled machine code, with labels and fixups to
//! refer to those labels. Handles constant-pool island insertion and also
//! veneer insertion for out-of-range jumps.
//!
//! This code exists to solve three problems:
//!
//! - Branch targets for forward branches are not known until later, when we
//!   emit code in a single pass through the instruction structs.
//!
//! - On many architectures, address references or offsets have limited range.
//!   For example, on AArch64, conditional branches can only target code +/- 1MB
//!   from the branch itself.
//!
//! - The lowering of control flow from the CFG-with-edges produced by
//!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
//!   edge blocks when the register allocator does not need to insert any
//!   spills/reloads/moves in edge blocks, results in many suboptimal branch
//!   patterns. The lowering also pays no attention to block order, and so
//!   two-target conditional forms (cond-br followed by uncond-br) can often be
//!   avoided because one of the targets is the fallthrough. There are several
//!   cases here where we can simplify to use fewer branches.
//!
//! This "buffer" implements a single-pass code emission strategy (with a later
//! "fixup" pass, but only through recorded fixups, not all instructions). The
//! basic idea is:
//!
//! - Emit branches as they are, including two-target (cond/uncond) compound
//!   forms, but with zero offsets and optimistically assuming the target will be
//!   in range. Record the "fixup" for later. Targets are denoted instead by
//!   symbolic "labels" that are then bound to certain offsets in the buffer as
//!   we emit code. (Nominally, there is a label at the start of every basic
//!   block.)
//!
//! - As we do this, track the offset in the buffer at which the first label
//!   reference "goes out of range". We call this the "deadline". If we reach the
//!   deadline and we still have not bound the label to which an unresolved branch
//!   refers, we have a problem!
//!
//! - To solve this problem, we emit "islands" full of "veneers". An island is
//!   simply a chunk of code inserted in the middle of the code actually produced
//!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
//!   has some awareness of this: it either asks for an island between blocks, so
//!   it is not accidentally executed, or else it emits a branch around the island
//!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
//!
//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
//!   that implements a longer-range reference to a label. The idea is that, for
//!   example, a branch with a limited range can branch to a "veneer" instead,
//!   which is simply a branch in a form that can use a longer-range reference. On
//!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
//!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
//!   conditional branch's label reference can be fixed up with a "veneer" to
//!   achieve a longer range.
//!
//! - To implement all of this, we require the backend to provide a `LabelUse`
//!   type that implements a trait. This is nominally an enum that records one of
//!   several kinds of references to an offset in code -- basically, a relocation
//!   type -- and will usually correspond to different instruction formats. The
//!   `LabelUse` implementation specifies the maximum range, how to patch in the
//!   actual label location when known, and how to generate a veneer to extend the
//!   range.
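//!
//!   As an illustrative sketch (simplified mnemonics, not exact AArch64
//!   encodings): a conditional branch whose `LabelUse` has a +/- 1 MB range
//!   is patched to target a nearby veneer, which performs the full-range
//!   jump:
//!
//!   ```text
//!   cond_br veneer       ; +/- 1 MB reach: patched to the nearby island
//!   ...
//!   veneer: b target     ; +/- 128 MB reach: extends the reference
//!   ```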
//!
//! That satisfies label references, but we still may have suboptimal branch
//! patterns. To clean up the branches, we do a simple "peephole"-style
//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
//! informs the buffer of branches in the code and, in the case of conditionals,
//! the code that would have been emitted to invert this branch's condition. We
//! track the "latest branches": these are branches that are contiguous up to
//! the current offset. (If any code is emitted after a branch, that branch or
//! run of contiguous branches is no longer "latest".) The latest branches are
//! those that we can edit by simply truncating the buffer and doing something
//! else instead.
//!
//! To optimize branches, we implement several simple rules, and try to apply
//! them to the "latest branches" when possible:
//!
//! - A branch with a label target, when that label is bound to the ending
//!   offset of the branch (the fallthrough location), can be removed altogether,
//!   because the branch would have no effect.
//!
//! - An unconditional branch that starts at a label location, and branches to
//!   another label, results in a "label alias": all references to the label bound
//!   *to* this branch instruction are instead resolved to the *target* of the
//!   branch instruction. This effectively removes empty blocks that just
//!   unconditionally branch to the next block. We call this "branch threading"
//!   (see the sketch after this list).
//!
//! - A conditional followed by an unconditional, when the conditional branches
//!   to the unconditional's fallthrough, results in (i) the truncation of the
//!   unconditional, (ii) the inversion of the conditional's condition, and (iii)
//!   replacement of the conditional's target (using the original target of the
//!   unconditional). This is a fancy way of saying "we can flip a two-target
//!   conditional branch's taken/not-taken targets if it works better with our
//!   fallthrough". To make this work, the emitter actually gives the buffer
//!   *both* forms of every conditional branch: the true form is emitted into the
//!   buffer, and the "inverted" machine-code bytes are provided as part of the
//!   branch-fixup metadata.
//!
//! - An unconditional B preceded by another unconditional P, when B's label(s)
//!   have been redirected to target(B), can be removed entirely. This is an
//!   extension of the branch-threading optimization, and is valid because if we
//!   know there will be no fallthrough into this branch instruction (the prior
//!   instruction is an unconditional jump), and if we know we have successfully
//!   redirected all labels, then this branch instruction is unreachable. Note
//!   that this works because the redirection happens before the label is ever
//!   resolved (fixups happen at island emission time, at which point
//!   latest-branches are cleared, or at the end of emission), so we are sure to
//!   catch and redirect all possible paths to this instruction.
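//!
//! For example (an illustrative sketch, not from the original comments; `b`
//! is an unconditional branch), suppose we emit:
//!
//! ```text
//!   b label3     ; prior unconditional branch
//! label1:
//!   b label2     ; starts at label1's offset, targets label2
//! ```
//!
//! Branch threading makes `label1` an alias of `label2`; the second branch
//! then has no labels bound to its start and no fallthrough into it, so the
//! last rule above removes it entirely.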
//!
//! # Branch-optimization Correctness
//!
//! The branch-optimization mechanism depends on a few data structures with
//! invariants, which are always held outside the scope of top-level public
//! methods:
//!
//! - The latest-branches list. Each entry describes a span of the buffer
//!   (start/end offsets), the label target, the corresponding fixup-list entry
//!   index, and the bytes (must be the same length) for the inverted form, if
//!   conditional. The list of labels that are bound to the start-offset of this
//!   branch is *complete* (if any label has a resolved offset equal to `start`
//!   and is not an alias, it must appear in this list) and *precise* (no label
//!   in this list can be bound to another offset). No label in this list should
//!   be an alias. No two branch ranges can overlap, and branches are in
//!   ascending-offset order.
//!
//! - The labels-at-tail list. This contains all MachLabels that have been bound
//!   to (whose resolved offsets are equal to) the tail offset of the buffer.
//!   No label in this list should be an alias.
//!
//! - The label_offsets array, containing the bound offset of a label or
//!   UNKNOWN. No label can be bound at an offset greater than the current
//!   buffer tail.
//!
//! - The label_aliases array, containing another label to which a label is
//!   aliased, or UNKNOWN. A label's resolved offset is the resolved offset
//!   of the label it is aliased to, if this is set.
//!
//! We argue below, at each method, how the invariants in these data structures
//! are maintained (grep for "Post-invariant").
//!
//! Given these invariants, we argue why each optimization preserves execution
//! semantics below (grep for "Preserves execution semantics").
//!
//! # Avoiding Quadratic Behavior
//!
//! There are two cases where we've had to take some care to avoid
//! quadratic worst-case behavior:
//!
//! - The "labels at this branch" list can grow unboundedly if the
//!   code generator binds many labels at one location. If the count
//!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
//!   simply abort an optimization early in a way that is always correct
//!   but is conservative.
//!
//! - The fixup list can interact with island emission to create
//!   "quadratic island behavior". In a little more detail, one can hit
//!   this behavior by having some pending fixups (forward label
//!   references) with long-range label-use kinds, and some others
//!   with shorter-range references that nonetheless still are pending
//!   long enough to trigger island generation. In such a case, we
//!   process the fixup list, generate veneers to extend some forward
//!   references' ranges, but leave the other (longer-range) ones
//!   alone. The way this was implemented put them back on a list and
//!   resulted in quadratic behavior.
//!
//! To avoid this, fixups are split into two lists: one "pending" list and one
//! final list. The pending list is kept around for handling fixups related to
//! branches so it can be edited/truncated. When an island is reached, which
//! starts processing fixups, all pending fixups are flushed into the final
//! list. The final list is a `BinaryHeap` which enables fixup processing to
//! only process those which are required during island emission, deferring
//! all longer-range fixups to later.
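//!
//! As a minimal sketch of that heap ordering (the real `MachLabelFixup`
//! ordering achieves the same effect), a `BinaryHeap` of reversed deadlines
//! pops the most urgent fixup first:
//!
//! ```ignore
//! use alloc::collections::BinaryHeap;
//! use core::cmp::Reverse;
//!
//! let mut fixups = BinaryHeap::new();
//! fixups.push(Reverse(0x10_0000u32)); // deadline at offset 1 MiB
//! fixups.push(Reverse(0x1000));       // deadline at offset 4 KiB
//! // The earliest deadline is processed first; later ones are deferred.
//! assert_eq!(fixups.pop(), Some(Reverse(0x1000)));
//! ```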

use crate::binemit::{Addend, CodeOffset, Reloc};
use crate::ir::function::FunctionParameters;
use crate::ir::{DebugTag, ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
use crate::isa::unwind::UnwindInst;
use crate::machinst::{
    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
};
use crate::trace;
use crate::{MachInstEmitState, ir};
use crate::{VCodeConstantData, timing};
use alloc::collections::BinaryHeap;
use alloc::string::String;
use alloc::vec::Vec;
use core::cmp::Ordering;
use core::mem;
use core::ops::Range;
use cranelift_control::ControlPlane;
use cranelift_entity::{PrimaryMap, SecondaryMap, entity_impl};
use smallvec::SmallVec;

#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "enable-serde")]
pub trait CompilePhase {
    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
}

#[cfg(not(feature = "enable-serde"))]
pub trait CompilePhase {
    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
}

/// Status of a compiled artifact that needs patching before being used.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub struct Stencil;

/// Status of a compiled artifact ready to use.
#[derive(Clone, Debug, PartialEq)]
pub struct Final;

impl CompilePhase for Stencil {
    type MachSrcLocType = MachSrcLoc<Stencil>;
    type SourceLocType = RelSourceLoc;
}

impl CompilePhase for Final {
    type MachSrcLocType = MachSrcLoc<Final>;
    type SourceLocType = SourceLoc;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ForceVeneers {
    Yes,
    No,
}

/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
/// in bulk.
///
/// This struct uses `SmallVec`s to support small-ish function bodies without
/// any heap allocation. As such, it will be several kilobytes large. This is
/// likely fine as long as it is stack-allocated for function emission then
/// thrown away; but beware if many buffer objects are retained persistently.
pub struct MachBuffer<I: VCodeInst> {
    /// The buffer contents, as raw bytes.
    data: SmallVec<[u8; 1024]>,
    /// The required alignment of this buffer.
    min_alignment: u32,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any patchable call site locations.
    patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
    /// Any debug tags referring to this code.
    debug_tags: Vec<MachDebugTags>,
    /// Pool of debug tags referenced by `MachDebugTags` entries.
    debug_tag_pool: Vec<DebugTag>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Any unwind info at a given location.
    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The current source location in progress (after `start_srcloc()` and
    /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
    label_offsets: SmallVec<[CodeOffset; 16]>,
    /// Label aliases: when one label points to an unconditional jump, and that
    /// jump points to another label, we can redirect references to the first
    /// label immediately to the second.
    ///
    /// Invariant: we don't have label-alias cycles. We ensure this by,
    /// before setting label A to alias label B, resolving B's alias
    /// target (iteratively until a non-aliased label); if B is already
    /// aliased to A, then we cannot alias A back to B.
    label_aliases: SmallVec<[MachLabel; 16]>,
    /// Constants that must be emitted at some point.
    pending_constants: SmallVec<[VCodeConstant; 16]>,
    /// Byte size of all constants in `pending_constants`.
    pending_constants_size: CodeOffset,
    /// Traps that must be emitted at some point.
    pending_traps: SmallVec<[MachLabelTrap; 16]>,
    /// Fixups that haven't yet been flushed into `fixup_records` below and may
    /// be related to branches that are chomped. These all get added to
    /// `fixup_records` during island emission.
    pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
    /// The nearest upcoming deadline for entries in `pending_fixup_records`.
    pending_fixup_deadline: CodeOffset,
    /// Fixups that must be performed after all code is emitted.
    fixup_records: BinaryHeap<MachLabelFixup<I>>,
    /// Latest branches, to facilitate in-place editing for better fallthrough
    /// behavior and empty-block removal.
    latest_branches: SmallVec<[MachBranch; 4]>,
    /// All labels at the current offset (emission tail). This is lazily
    /// cleared: it is actually accurate as long as the current offset is
    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
    /// be considered as empty.
    ///
    /// For correctness, this *must* be complete (i.e., the vector must contain
    /// all labels whose offsets are resolved to the current tail), because we
    /// rely on it to update labels when we truncate branches.
    labels_at_tail: SmallVec<[MachLabel; 4]>,
    /// The last offset at which `labels_at_tail` is valid. It is conceptually
    /// always describing the tail of the buffer, but we do not clear
    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
    /// when the offset has grown past this (`labels_at_tail_off`) point.
    /// Always <= `cur_offset()`.
    labels_at_tail_off: CodeOffset,
    /// Metadata about all constants that this function has access to.
    ///
    /// This records the size/alignment of all constants (not the actual data)
    /// along with the last available label generated for the constant. This map
    /// is consulted when constants are referred to and the label assigned to a
    /// constant may change over time as well.
    constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
    /// All recorded usages of constants as pairs of the constant and where the
    /// constant needs to be placed within `self.data`. Note that the same
    /// constant may appear in this array multiple times if it was emitted
    /// multiple times.
    used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
    /// Indicates when a patchable region is currently open, to guard that it's
    /// not possible to nest patchable regions.
    open_patchable: bool,
    /// Stack frame layout metadata. If provided for a MachBuffer
    /// containing a function body, this allows interpretation of
    /// runtime state given a view of an active stack frame.
    frame_layout: Option<MachBufferFrameLayout>,
}

impl MachBufferFinalized<Stencil> {
    /// Get a finalized machine buffer by applying the function's base source location.
    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            patchable_call_sites: self.patchable_call_sites,
            exception_handlers: self.exception_handlers,
            srclocs: self
                .srclocs
                .into_iter()
                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
                .collect(),
            debug_tags: self.debug_tags,
            debug_tag_pool: self.debug_tag_pool,
            user_stack_maps: self.user_stack_maps,
            unwind_info: self.unwind_info,
            alignment: self.alignment,
            frame_layout: self.frame_layout,
            nop_units: self.nop_units,
        }
    }
}

/// A `MachBuffer` once emission is completed: holds generated code and records,
/// without fixups. This allows the type to be independent of the backend.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachBufferFinalized<T: CompilePhase> {
    /// The buffer contents, as raw bytes.
    pub(crate) data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
    /// Any trap records referring to this code.
    pub(crate) traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any patchable call site locations referring to this code.
    pub(crate) patchable_call_sites: SmallVec<[MachPatchableCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
    /// Any debug tags referring to this code.
    pub(crate) debug_tags: Vec<MachDebugTags>,
    /// Pool of debug tags referenced by `MachDebugTags` entries.
    pub(crate) debug_tag_pool: Vec<DebugTag>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Stack frame layout metadata. If provided for a MachBuffer
    /// containing a function body, this allows interpretation of
    /// runtime state given a view of an active stack frame.
    pub(crate) frame_layout: Option<MachBufferFrameLayout>,
    /// Any unwind info at a given location.
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The required alignment of this buffer.
    pub alignment: u32,
    /// The means by which to NOP out patchable call sites.
    ///
    /// This allows a consumer of a `MachBufferFinalized` to disable
    /// patchable call sites (which are enabled by default) without
    /// specific knowledge of the target ISA.
    ///
    /// Each entry is one form of nop, and these are required to be
    /// sorted in ascending-size order.
    pub nop_units: Vec<Vec<u8>>,
}

const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);

/// Threshold on max length of `labels_at_this_branch` list to avoid
/// unbounded quadratic behavior (see comment below at use-site).
const LABEL_LIST_THRESHOLD: usize = 100;

/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
/// the point at which it is used by emitted code; the buffer records "fixups"
/// for references to the label, and will come back and patch the code
/// appropriately when the label's location is eventually known.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MachLabel(u32);
entity_impl!(MachLabel);

impl MachLabel {
    /// Get a label for a block. (The first N MachLabels are always reserved for
    /// the N blocks in the vcode.)
    pub fn from_block(bindex: BlockIndex) -> MachLabel {
        MachLabel(bindex.index() as u32)
    }

    /// Creates a string representing this label, for convenience.
    pub fn to_string(&self) -> String {
        format!("label{}", self.0)
    }
}

impl Default for MachLabel {
    fn default() -> Self {
        UNKNOWN_LABEL
    }
}

/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
/// the [`OpenPatchRegion`] token in the process.
pub struct OpenPatchRegion(usize);

/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
/// of where you might want to use this is for patching instructions that mention constants that
/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
/// bytes, and the constants' uses can be updated directly.
pub struct PatchRegion {
    range: Range<usize>,
}

impl PatchRegion {
    /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
    pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
        &mut buffer.data[self.range]
    }
}

impl<I: VCodeInst> MachBuffer<I> {
    /// Create a new, empty `MachBuffer`.
    pub fn new() -> MachBuffer<I> {
        MachBuffer {
            data: SmallVec::new(),
            min_alignment: I::function_alignment().minimum,
            relocs: SmallVec::new(),
            traps: SmallVec::new(),
            call_sites: SmallVec::new(),
            patchable_call_sites: SmallVec::new(),
            exception_handlers: SmallVec::new(),
            srclocs: SmallVec::new(),
            debug_tags: vec![],
            debug_tag_pool: vec![],
            user_stack_maps: SmallVec::new(),
            unwind_info: SmallVec::new(),
            cur_srcloc: None,
            label_offsets: SmallVec::new(),
            label_aliases: SmallVec::new(),
            pending_constants: SmallVec::new(),
            pending_constants_size: 0,
            pending_traps: SmallVec::new(),
            pending_fixup_records: SmallVec::new(),
            pending_fixup_deadline: u32::MAX,
            fixup_records: Default::default(),
            latest_branches: SmallVec::new(),
            labels_at_tail: SmallVec::new(),
            labels_at_tail_off: 0,
            constants: Default::default(),
            used_constants: Default::default(),
            open_patchable: false,
            frame_layout: None,
        }
    }

    /// Current offset from start of buffer.
    pub fn cur_offset(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Add a byte.
    pub fn put1(&mut self, value: u8) {
        self.data.push(value);

        // Post-invariant: conceptual-labels_at_tail contains a complete and
        // precise list of labels bound at `cur_offset()`. We have advanced
        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
        // before, it is not anymore (and it cannot become equal, because
        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
        // conceptually empty (even though it is only lazily cleared). No labels
        // can be bound at this new offset (by invariant on `label_offsets`).
        // Hence the invariant holds.
    }

    /// Add 2 bytes.
    pub fn put2(&mut self, value: u16) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 4 bytes.
    pub fn put4(&mut self, value: u32) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 8 bytes.
    pub fn put8(&mut self, value: u64) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add a slice of bytes.
    pub fn put_data(&mut self, data: &[u8]) {
        self.data.extend_from_slice(data);

        // Post-invariant: as for `put1()`.
    }

    /// Reserve appended space and return a mutable slice referring to it.
    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
        let off = self.data.len();
        let new_len = self.data.len() + len;
        self.data.resize(new_len, 0);
        &mut self.data[off..]

        // Post-invariant: as for `put1()`.
    }

    /// Align up to the given alignment.
    pub fn align_to(&mut self, align_to: CodeOffset) {
        trace!("MachBuffer: align to {}", align_to);
        assert!(
            align_to.is_power_of_two(),
            "{align_to} is not a power of two"
        );
        while self.cur_offset() & (align_to - 1) != 0 {
            self.put1(0);
        }

        // Post-invariant: as for `put1()`.
    }

    /// Begin a region of patchable code. There is one requirement for the
    /// code that is emitted: it must not introduce any instructions that
    /// could be chomped (branches are an example of this). In other words,
    /// you must not call [`MachBuffer::add_cond_branch`] or
    /// [`MachBuffer::add_uncond_branch`] between calls to this method and
    /// [`MachBuffer::end_patchable`].
    pub fn start_patchable(&mut self) -> OpenPatchRegion {
        assert!(!self.open_patchable, "Patchable regions may not be nested");
        self.open_patchable = true;
        OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
    }

    /// End a region of patchable code, yielding a [`PatchRegion`] value that
    /// can be consumed later to produce a one-off mutable slice to the
    /// associated region of the data buffer.
    pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
        // No need to assert the state of `open_patchable` here, as we take
        // ownership of the only `OpenPatchRegion` value.
        self.open_patchable = false;
        let end = usize::try_from(self.cur_offset()).unwrap();
        PatchRegion { range: open.0..end }
    }
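
    // Illustrative sketch (added for exposition; not part of the original
    // API): how the patchable-region methods above compose. A 4-byte
    // placeholder stands in for a real instruction encoding whose constant
    // becomes known only later.
    #[cfg(test)]
    #[allow(dead_code)]
    fn patchable_region_example(&mut self) {
        let open = self.start_patchable();
        self.put4(0); // placeholder bytes, overwritten below
        let region = self.end_patchable(open);
        let slice = region.patch(self);
        slice.copy_from_slice(&0x1234_5678u32.to_le_bytes());
    }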

    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
    /// offset yet.
    pub fn get_label(&mut self) -> MachLabel {
        let l = self.label_offsets.len() as u32;
        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
        self.label_aliases.push(UNKNOWN_LABEL);
        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
        MachLabel(l)

        // Post-invariant: the only mutation is to add a new label; it has no
        // bound offset yet, so it trivially satisfies all invariants.
    }

    /// Reserve the first N MachLabels for blocks.
    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
        trace!("MachBuffer: first {} labels are for blocks", blocks);
        debug_assert!(self.label_offsets.is_empty());
        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
        self.label_aliases.resize(blocks, UNKNOWN_LABEL);

        // Post-invariant: as for `get_label()`.
    }

    /// Registers metadata in this `MachBuffer` about the `constants` provided.
    ///
    /// This will record the size/alignment of all constants which will prepare
    /// them for emission later on.
    pub fn register_constants(&mut self, constants: &VCodeConstants) {
        for (c, val) in constants.iter() {
            self.register_constant(&c, val);
        }
    }

    /// Similar to [`MachBuffer::register_constants`] but registers a
    /// single constant's metadata. This function is useful in
    /// situations where not all constants are known at the time of
    /// emission.
    pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
        let c2 = self.constants.push(MachBufferConstant {
            upcoming_label: None,
            align: data.alignment(),
            size: data.as_slice().len(),
        });
        assert_eq!(*constant, c2);
    }

    /// Completes constant emission by iterating over `self.used_constants` and
    /// filling in the "holes" with the constant values provided by `constants`.
    ///
    /// Returns the alignment required for this entire buffer. Alignment starts
    /// at the ISA's minimum function alignment and can be increased due to
    /// constant requirements.
    fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
        let mut alignment = self.min_alignment;
        for (constant, offset) in mem::take(&mut self.used_constants) {
            let constant = constants.get(constant);
            let data = constant.as_slice();
            self.data[offset as usize..][..data.len()].copy_from_slice(data);
            alignment = constant.alignment().max(alignment);
        }
        alignment
    }

    /// Returns a label that can be used to refer to the `constant` provided.
    ///
    /// This will automatically defer a new constant to be emitted for
    /// `constant` if it has not been previously emitted. Note that this
    /// function may return a different label for the same constant at
    /// different points in time. The label is valid to use only from the
    /// current location; the MachBuffer takes care to emit the same constant
    /// multiple times if needed so the constant is always in range.
    pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
        let MachBufferConstant {
            align,
            size,
            upcoming_label,
        } = self.constants[constant];
        if let Some(label) = upcoming_label {
            return label;
        }

        let label = self.get_label();
        trace!(
            "defer constant: eventually emit {size} bytes aligned \
             to {align} at label {label:?}",
        );
        self.pending_constants.push(constant);
        self.pending_constants_size += size as u32;
        self.constants[constant].upcoming_label = Some(label);
        label
    }
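
    // Illustrative flow (exposition only): a backend materializing a constant
    // typically requests a label for it and emits a PC-relative load against
    // that label; the constant bytes themselves are written out at the next
    // island or at the end of the function:
    //
    //     let label = buf.get_label_for_constant(constant);
    //     // ...emit a load referencing `label` via `use_label_at_offset()`...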

    /// Bind a label to the current offset. A label can only be bound once.
    pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
        trace!(
            "MachBuffer: bind label {:?} at offset {}",
            label,
            self.cur_offset()
        );
        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
        let offset = self.cur_offset();
        self.label_offsets[label.0 as usize] = offset;
        self.lazily_clear_labels_at_tail();
        self.labels_at_tail.push(label);

        // Invariants hold: bound offset of label is <= cur_offset (in fact it
        // is equal). If the `labels_at_tail` list was complete and precise
        // before, it is still, because we have bound this label to the current
        // offset and added it to the list (which contains all labels at the
        // current offset).

        self.optimize_branches(ctrl_plane);

        // Post-invariant: by `optimize_branches()` (see argument there).
    }

    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
    /// offset that it applies to.
    fn lazily_clear_labels_at_tail(&mut self) {
        let offset = self.cur_offset();
        if offset > self.labels_at_tail_off {
            self.labels_at_tail_off = offset;
            self.labels_at_tail.clear();
        }

        // Post-invariant: either labels_at_tail_off was at cur_offset, and
        // state is untouched, or it was less than cur_offset, in which case the
        // labels_at_tail list was conceptually empty, and is now actually
        // empty.
    }

    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
        let mut iters = 0;
        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
            label = self.label_aliases[label.0 as usize];
            // To protect against an infinite loop (despite our assurances to
            // ourselves that the invariants make this impossible), assert out
            // after 1M iterations. The number of basic blocks is limited
            // in most contexts anyway so this should be impossible to hit with
            // a legitimate input.
            iters += 1;
            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
        }
        self.label_offsets[label.0 as usize]

        // Post-invariant: no mutations.
    }

    /// Emit a reference to the given label with the given reference type (i.e.,
    /// branch-instruction format) at the current offset. This is like a
    /// relocation, but handled internally.
    ///
    /// This can be called before the branch is actually emitted; fixups will
    /// not happen until an island is emitted or the buffer is finished.
    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
        trace!(
            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
            offset, label, kind
        );

        // Add the fixup, and update the worst-case island size based on a
        // veneer for this label use.
        let fixup = MachLabelFixup {
            label,
            offset,
            kind,
        };
        self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
        self.pending_fixup_records.push(fixup);

        // Post-invariant: no mutations to branches/labels data structures.
    }
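
    // Illustrative flow (exposition only): a forward reference is recorded
    // before its target is known, and resolved when the label is bound:
    //
    //     let label = buf.get_label();                  // target unknown
    //     let off = buf.cur_offset();
    //     buf.use_label_at_offset(off, label, kind);    // record the fixup
    //     // ...emit the referencing instruction's bytes, then more code...
    //     buf.bind_label(label, ctrl_plane);            // now resolvable
    //
    // where `kind` is a backend-specific `I::LabelUse` describing the
    // instruction format and its addressing range.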

    /// Inform the buffer of an unconditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    /// This must be called when the current offset is equal to `start`; i.e.,
    /// before actually emitting the branch. This implies that for a branch that
    /// uses a label and is eligible for optimizations by the MachBuffer, the
    /// proper sequence is:
    ///
    /// - Call `use_label_at_offset()` to emit the fixup record.
    /// - Call `add_uncond_branch()` to make note of the branch.
    /// - Emit the bytes for the branch's machine code.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        let fixup = self.pending_fixup_records.len() - 1;
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted: None,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; the list of
        // labels at branch is cloned from list of labels at current tail.
    }

    /// Inform the buffer of a conditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
    pub fn add_cond_branch(
        &mut self,
        start: CodeOffset,
        end: CodeOffset,
        target: MachLabel,
        inverted: &[u8],
    ) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        debug_assert!(
            inverted.len() == (end - start) as usize,
            "branch length = {}, but inverted length = {}",
            end - start,
            inverted.len()
        );
        let fixup = self.pending_fixup_records.len() - 1;
        let inverted = Some(SmallVec::from(inverted));
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; labels at
        // branch list is cloned from list of labels at current tail.
    }

    fn truncate_last_branch(&mut self) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction truncated within a patchable region"
        );

        self.lazily_clear_labels_at_tail();
        // Invariants hold at this point.

        let b = self.latest_branches.pop().unwrap();
        assert!(b.end == self.cur_offset());

        // State:
        //    [PRE CODE]
        //  Offset b.start, b.labels_at_this_branch:
        //    [BRANCH CODE]
        //  cur_off, self.labels_at_tail -->
        //    (end of buffer)
        self.data.truncate(b.start as usize);
        self.pending_fixup_records.truncate(b.fixup);

        // Trim srclocs and debug tags now past the end of the buffer.
        while let Some(last_srcloc) = self.srclocs.last_mut() {
            if last_srcloc.end <= b.start {
                break;
            }
            if last_srcloc.start < b.start {
                last_srcloc.end = b.start;
                break;
            }
            self.srclocs.pop();
        }
        while let Some(last_debug_tag) = self.debug_tags.last() {
            if last_debug_tag.offset <= b.start {
                break;
            }
            self.debug_tags.pop();
        }

        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch:
        //    (end of buffer)
        //
        //  self.labels_at_tail --> (past end of buffer)
        let cur_off = self.cur_offset();
        self.labels_at_tail_off = cur_off;
        // State:
        //    [PRE CODE]
        //  cur_off, Offset b.start, b.labels_at_this_branch,
        //  self.labels_at_tail:
        //    (end of buffer)
        //
        //  resolve_label_offset(l) for l in labels_at_tail:
        //    (past end of buffer)

        trace!(
            "truncate_last_branch: truncated {:?}; off now {}",
            b, cur_off
        );

        // Fix up resolved label offsets for labels at tail.
        for &l in &self.labels_at_tail {
            self.label_offsets[l.0 as usize] = cur_off;
        }
        // Old labels_at_this_branch are now at cur_off.
        self.labels_at_tail.extend(b.labels_at_this_branch);

        // Post-invariant: this operation is defined to truncate the buffer,
        // which moves cur_off backward, and to move labels at the end of the
        // buffer back to the start-of-branch offset.
        //
        // latest_branches satisfies all invariants:
        // - it has no branches past the end of the buffer (branches are in
        //   order, we removed the last one, and we truncated the buffer to just
        //   before the start of that branch)
        // - no labels were moved to lower offsets than the (new) cur_off, so
        //   the labels_at_this_branch list for any other branch need not change.
        //
        // labels_at_tail satisfies all invariants:
        // - all labels that were at the tail after the truncated branch are
        //   moved backward to just before the branch, which becomes the new tail;
        //   thus every element in the list should remain (ensured by `.extend()`
        //   above).
        // - all labels that refer to the new tail, which is the start-offset of
        //   the truncated branch, must be present. The `labels_at_this_branch`
        //   list in the truncated branch's record is a complete and precise list
        //   of exactly these labels; we append these to labels_at_tail.
        // - labels_at_tail_off is at cur_off after truncation occurs, so the
        //   list is valid (not to be lazily cleared).
        //
        // The stated operation was performed:
        // - For each label at the end of the buffer prior to this method, it
        //   now resolves to the new (truncated) end of the buffer: it must have
        //   been in `labels_at_tail` (this list is precise and complete, and
        //   the tail was at the end of the truncated branch on entry), and we
        //   iterate over this list and set `label_offsets` to the new tail.
        //   None of these labels could have been an alias (by invariant), so
        //   `label_offsets` is authoritative for each.
        // - No other labels will be past the end of the buffer, because of the
        //   requirement that no labels be bound to the middle of branch ranges
        //   (see comments to `add_{cond,uncond}_branch()`).
        // - The buffer is truncated to just before the last branch, and the
        //   fixup record referring to that last branch is removed.
    }

    /// Performs various optimizations on branches pointing at the current label.
    pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
        if ctrl_plane.get_decision() {
            return;
        }

        self.lazily_clear_labels_at_tail();
        // Invariants valid at this point.

        trace!(
            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );

        // We continue to munch on branches at the tail of the buffer until no
        // more rules apply. Note that the loop only continues if a branch is
        // actually truncated (or if labels are redirected away from a branch),
        // so this always makes progress.
        while let Some(b) = self.latest_branches.last() {
            let cur_off = self.cur_offset();
            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
            // If there has been any code emission since the end of the last branch or
            // label definition, then there's nothing we can edit (because we
            // don't move code once placed, only back up and overwrite), so
            // clear the records and finish.
            if b.end < cur_off {
                break;
            }

            // If the "labels at this branch" list on this branch is
            // longer than a threshold, don't do any simplification,
            // and let the branch remain to separate those labels from
            // the current tail. This avoids quadratic behavior (see
            // #3468): otherwise, if a long string of "goto next;
            // next:" patterns are emitted, all of the labels will
            // coalesce into a long list of aliases for the current
            // buffer tail. We must track all aliases of the current
            // tail for correctness, but we are also allowed to skip
            // optimization (removal) of any branch, so we take the
            // escape hatch here and let it stand. In effect this
            // "spreads" the many thousands of labels in the
            // pathological case among an actual (harmless but
            // suboptimal) instruction once per N labels.
            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
                break;
            }

            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.

            // For any branch, conditional or unconditional:
            // - If the target is a label at the current offset, then remove
            //   the branch, and reset all labels that targeted
            //   the current offset (end of branch) to the truncated
            //   end-of-code.
            //
            //   Preserves execution semantics: a branch to its own fallthrough
            //   address is equivalent to a no-op; in both cases, nextPC is the
            //   fallthrough.
            if self.resolve_label_offset(b.target) == cur_off {
                trace!("branch with target == cur off; truncating");
                self.truncate_last_branch();
                continue;
            }

            // If the latest branch is an unconditional branch:
            //
            // - If the branch's target is not its own start address, then for
            //   each label at the start of branch, make the label an alias of the
            //   branch target, and remove the label from the "labels at this
            //   branch" list.
            //
            //   - Preserves execution semantics: an unconditional branch's
            //     only effect is to set PC to a new PC; this change simply
            //     collapses one step in the step-semantics.
            //
            //   - Post-invariant: the labels that were bound to the start of
            //     this branch become aliases, so they must not be present in any
            //     labels-at-this-branch list or the labels-at-tail list. The
            //     labels are removed from the latest-branch record's
            //     labels-at-this-branch list, and are never placed in the
            //     labels-at-tail list. Furthermore, it is correct that they are
            //     not in either list, because they are now aliases, and labels
            //     that are aliases remain aliases forever.
            //
            // - If there is a prior unconditional branch that ends just before
            //   this one begins, and this branch has no labels bound to its
            //   start, then we can truncate this branch, because it is entirely
            //   unreachable (we have redirected all labels that make it
            //   reachable otherwise). Do so and continue around the loop.
            //
            //   - Preserves execution semantics: the branch is unreachable,
            //     because execution can only flow into an instruction from the
            //     prior instruction's fallthrough or from a branch bound to that
            //     instruction's start offset. Unconditional branches have no
            //     fallthrough, so if the prior instruction is an unconditional
            //     branch, no fallthrough entry can happen. The
            //     labels-at-this-branch list is complete (by invariant), so if it
            //     is empty, then the instruction is entirely unreachable. Thus,
            //     it can be removed.
            //
            //   - Post-invariant: ensured by truncate_last_branch().
            //
            // - If there is a prior conditional branch whose target label
            //   resolves to the current offset (branches around the
            //   unconditional branch), then remove the unconditional branch,
            //   and make the target of the unconditional the target of the
            //   conditional instead.
            //
            //   - Preserves execution semantics: previously we had:
            //
            //         L1:
            //            cond_br L2
            //            br L3
            //         L2:
            //            (end of buffer)
            //
            //     by removing the last branch, we have:
            //
            //         L1:
            //            cond_br L2
            //         L2:
            //            (end of buffer)
            //
            //     we then fix up the records for the conditional branch to
            //     have:
            //
            //         L1:
            //            cond_br.inverted L3
            //         L2:
            //
            //     In the original code, control flow reaches L2 when the
            //     conditional branch's predicate is true, and L3 otherwise. In
            //     the optimized code, the same is true.
            //
            //   - Post-invariant: all edits to latest_branches and
            //     labels_at_tail are performed by `truncate_last_branch()`,
            //     which maintains the invariants at each step.

            if b.is_uncond() {
                // Set any label equal to current branch's start as an alias of
                // the branch's target, if the target is not the branch itself
                // (i.e., an infinite loop).
                //
                // We cannot perform this aliasing if the target of this branch
                // ultimately aliases back here; if so, we need to keep this
                // branch, so break out of this loop entirely (and clear the
                // latest-branches list below).
                //
                // Note that this check is what prevents cycles from forming in
                // `self.label_aliases`. To see why, consider an arbitrary start
                // state:
                //
                //     label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
                //     Ln, which is not aliased.
                //
                // We would create a cycle if we assigned label_aliases[Ln]
Note that the below assignment is the only write1122// to label_aliases.1123//1124// By our other invariants, we have that Ln (`l` below)1125// resolves to the offset `b.start`, because it is in the1126// set `b.labels_at_this_branch`.1127//1128// If L1 were already aliased, through some arbitrarily deep1129// chain, to Ln, then it must also resolve to this offset1130// `b.start`.1131//1132// By checking the resolution of `L1` against this offset,1133// and aborting this branch-simplification if they are1134// equal, we prevent the below assignment from ever creating1135// a cycle.1136if self.resolve_label_offset(b.target) != b.start {1137let redirected = b.labels_at_this_branch.len();1138for &l in &b.labels_at_this_branch {1139trace!(1140" -> label at start of branch {:?} redirected to target {:?}",1141l, b.target1142);1143self.label_aliases[l.0 as usize] = b.target;1144// NOTE: we continue to ensure the invariant that labels1145// pointing to tail of buffer are in `labels_at_tail`1146// because we already ensured above that the last branch1147// cannot have a target of `cur_off`; so we never have1148// to put the label into `labels_at_tail` when moving it1149// here.1150}1151// Maintain invariant: all branches have been redirected1152// and are no longer pointing at the start of this branch.1153let mut_b = self.latest_branches.last_mut().unwrap();1154mut_b.labels_at_this_branch.clear();11551156if redirected > 0 {1157trace!(" -> after label redirects, restarting loop");1158continue;1159}1160} else {1161break;1162}11631164let b = self.latest_branches.last().unwrap();11651166// Examine any immediately preceding branch.1167if self.latest_branches.len() > 1 {1168let prev_b = &self.latest_branches[self.latest_branches.len() - 2];1169trace!(" -> more than one branch; prev_b = {:?}", prev_b);1170// This uncond is immediately after another uncond; we1171// should have already redirected labels to this uncond away1172// (but check to be sure); so we can truncate this uncond.1173if prev_b.is_uncond()1174&& prev_b.end == b.start1175&& b.labels_at_this_branch.is_empty()1176{1177trace!(" -> uncond follows another uncond; truncating");1178self.truncate_last_branch();1179continue;1180}11811182// This uncond is immediately after a conditional, and the1183// conditional's target is the end of this uncond, and we've1184// already redirected labels to this uncond away; so we can1185// truncate this uncond, flip the sense of the conditional, and1186// set the conditional's target (in `latest_branches` and in1187// `fixup_records`) to the uncond's target.1188if prev_b.is_cond()1189&& prev_b.end == b.start1190&& self.resolve_label_offset(prev_b.target) == cur_off1191{1192trace!(1193" -> uncond follows a conditional, and conditional's target resolves to current offset"1194);1195// Save the target of the uncond (this becomes the1196// target of the cond), and truncate the uncond.1197let target = b.target;1198let data = prev_b.inverted.clone().unwrap();1199self.truncate_last_branch();12001201// Mutate the code and cond branch.1202let off_before_edit = self.cur_offset();1203let prev_b = self.latest_branches.last_mut().unwrap();1204let not_inverted = SmallVec::from(1205&self.data[(prev_b.start as usize)..(prev_b.end as usize)],1206);12071208// Low-level edit: replaces bytes of branch with1209// inverted form. 
                        // we do not need to modify label data structures.
                        self.data.truncate(prev_b.start as usize);
                        self.data.extend_from_slice(&data[..]);

                        // Save the original code as the inversion of the
                        // inverted branch, in case we later edit this branch
                        // again.
                        prev_b.inverted = Some(not_inverted);
                        self.pending_fixup_records[prev_b.fixup].label = target;
                        trace!(" -> reassigning target of condbr to {:?}", target);
                        prev_b.target = target;
                        debug_assert_eq!(off_before_edit, self.cur_offset());
                        continue;
                    }
                }
            }

            // If we couldn't do anything with the last branch, then break.
            break;
        }

        self.purge_latest_branches();

        trace!(
            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );
    }

    fn purge_latest_branches(&mut self) {
        // All of our branch simplification rules work only if a branch ends at
        // the tail of the buffer, with no following code; and branches are in
        // order in latest_branches; so if the last entry ends prior to
        // cur_offset, then clear all entries.
        let cur_off = self.cur_offset();
        if let Some(l) = self.latest_branches.last() {
            if l.end < cur_off {
                trace!("purge_latest_branches: removing branch {:?}", l);
                self.latest_branches.clear();
            }
        }

        // Post-invariant: no invariant requires any branch to appear in
        // `latest_branches`; it is always optional. The list-clear above thus
        // preserves all semantics.
    }

    /// Emit a trap at some point in the future with the specified code and
    /// stack map.
    ///
    /// This function returns a [`MachLabel`] which will be the future address
    /// of the trap. Jumps should refer to this label, likely by using the
    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
    /// patched in once the address of the trap is known.
    ///
    /// This will batch all traps into the end of the function.
    pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
        let label = self.get_label();
        self.pending_traps.push(MachLabelTrap {
            label,
            code,
            loc: self.cur_srcloc.map(|(_start, loc)| loc),
        });
        label
    }

    /// Is an island needed within the next N bytes?
    pub fn island_needed(&self, distance: CodeOffset) -> bool {
        let deadline = match self.fixup_records.peek() {
            Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
            None => self.pending_fixup_deadline,
        };
        deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
    }
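
    // Typical use (exposition only; not part of the original comments): an
    // emitter checks, before appending a chunk of code, whether doing so
    // could push a pending fixup past its deadline, and emits an island
    // first if so:
    //
    //     if buf.island_needed(worst_case_chunk_size) {
    //         buf.emit_island(worst_case_chunk_size, ctrl_plane);
    //     }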

    /// Returns the maximal offset that islands can reach if `distance` more
    /// bytes are appended.
    ///
    /// This is used to determine if veneers need insertions since jumps that
    /// can't reach past this point must get a veneer of some form.
    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
        // Assume that all fixups will require veneers and that the veneers are
        // the worst-case size for each platform. This is an over-generalization
        // to avoid iterating over the `fixup_records` list or maintaining
        // information about it as we go along.
        let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
            as u32)
            * (I::LabelUse::worst_case_veneer_size())
            + self.pending_constants_size
            + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
        self.cur_offset()
            .saturating_add(distance)
            .saturating_add(island_worst_case_size)
    }

    /// Emit all pending constants and required pending veneers.
    ///
    /// Should only be called if `island_needed()` returns true, i.e., if we
    /// actually reach a deadline. It's not necessarily a problem to do so
    /// otherwise but it may result in unnecessary work during emission.
    pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
        self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
    }

    /// Same as `emit_island`, but an internal API with a `force_veneers`
    /// argument to force all veneers to always get emitted for debugging.
    fn emit_island_maybe_forced(
        &mut self,
        force_veneers: ForceVeneers,
        distance: CodeOffset,
        ctrl_plane: &mut ControlPlane,
    ) {
        // We're going to purge fixups, so no latest-branch editing can happen
        // anymore.
        self.latest_branches.clear();

        // End the current location tracking since anything emitted during this
        // function shouldn't be attributed to whatever the current source
        // location is.
        //
        // Note that the current source location, if it's set right now, will be
        // restored at the end of this island emission.
        let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
        if cur_loc.is_some() {
            self.end_srcloc();
        }

        let forced_threshold = self.worst_case_end_of_island(distance);

        // First flush out all traps/constants so we have more labels in case
        // fixups are applied against these labels.
        //
        // Note that traps are placed first since this typically happens at the
        // end of the function and for disassemblers we try to keep all the code
        // contiguously together.
        for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
            // If this trap has source information associated with it then
            // emit this information for the trap instruction going out now too.
            if let Some(loc) = loc {
                self.start_srcloc(loc);
            }
            self.align_to(I::LabelUse::ALIGN);
            self.bind_label(label, ctrl_plane);
            self.add_trap(code);
            self.put_data(I::TRAP_OPCODE);
            if loc.is_some() {
                self.end_srcloc();
            }
        }

        for constant in mem::take(&mut self.pending_constants) {
            let MachBufferConstant { align, size, .. } = self.constants[constant];
            let label = self.constants[constant].upcoming_label.take().unwrap();
            self.align_to(align);
            self.bind_label(label, ctrl_plane);
            self.used_constants.push((constant, self.cur_offset()));
            self.get_appended_space(size);
        }

        // Either handle all pending fixups because they're ready or move them
        // onto the `BinaryHeap` tracking all pending fixups if they aren't
        // ready.
        assert!(self.latest_branches.is_empty());
        for fixup in mem::take(&mut self.pending_fixup_records) {
            if self.should_apply_fixup(&fixup, forced_threshold) {
                self.handle_fixup(fixup, force_veneers, forced_threshold);
            } else {
                self.fixup_records.push(fixup);
            }
        }
        self.pending_fixup_deadline = u32::MAX;
        while let Some(fixup) = self.fixup_records.peek() {
            trace!("emit_island: fixup {:?}", fixup);

            // If this fixup shouldn't be applied, that means its label isn't
            // defined yet and there'll be remaining space to apply a veneer if
            // necessary in the future after this island. In that situation,
            // because `fixup_records` is sorted by deadline, this loop can
            // exit.
            if !self.should_apply_fixup(fixup, forced_threshold) {
                break;
            }

            let fixup = self.fixup_records.pop().unwrap();
            self.handle_fixup(fixup, force_veneers, forced_threshold);
        }

        if let Some(loc) = cur_loc {
            self.start_srcloc(loc);
        }
    }

    fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
        let label_offset = self.resolve_label_offset(fixup.label);
        label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
    }
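
    // Worked example (exposition only): a fixup at offset 0x1000 whose kind
    // has a `max_pos_range()` of 1 MiB has a deadline near 0x1000 + 0x10_0000.
    // If the label is already bound, the fixup is applied now; otherwise it
    // is applied (via a veneer) only if the worst-case end of the current
    // island (`forced_threshold`) would overrun that deadline, and is
    // deferred to the `fixup_records` heap otherwise.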
    fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
        let label_offset = self.resolve_label_offset(fixup.label);
        label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
    }

    fn handle_fixup(
        &mut self,
        fixup: MachLabelFixup<I>,
        force_veneers: ForceVeneers,
        forced_threshold: CodeOffset,
    ) {
        let MachLabelFixup {
            label,
            offset,
            kind,
        } = fixup;
        let start = offset as usize;
        let end = (offset + kind.patch_size()) as usize;
        let label_offset = self.resolve_label_offset(label);

        if label_offset != UNKNOWN_LABEL_OFFSET {
            // If the offset of the label for this fixup is known then
            // we're going to do something here-and-now. We're either going
            // to patch the original offset because it's an in-bounds jump,
            // or we're going to generate a veneer, patch the fixup to jump
            // to the veneer, and then keep going.
            //
            // If the label comes after the original fixup, then we should
            // be guaranteed that the jump is in-bounds. Otherwise there's
            // a bug somewhere because this method wasn't called soon
            // enough. All forward-jumps are tracked and should get veneers
            // before their deadline comes and they're unable to jump
            // further.
            //
            // Otherwise, if the label is before the fixup, then that's a
            // backwards jump. If it's past the maximum negative range
            // then we'll emit a veneer to jump forward to, which can
            // then jump backwards.
            let veneer_required = if label_offset >= offset {
                assert!((label_offset - offset) <= kind.max_pos_range());
                false
            } else {
                (offset - label_offset) > kind.max_neg_range()
            };
            trace!(
                " -> label_offset = {}, known, required = {} (pos {} neg {})",
                label_offset,
                veneer_required,
                kind.max_pos_range(),
                kind.max_neg_range()
            );

            if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
                self.emit_veneer(label, offset, kind);
            } else {
                let slice = &mut self.data[start..end];
                trace!(
                    "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
                );
                kind.patch(slice, offset, label_offset);
            }
        } else {
            // If the offset of this label is not known at this time then
            // that means that a veneer is required because after this
            // island the target can't be in range of the original target.
            assert!(forced_threshold - offset > kind.max_pos_range());
            self.emit_veneer(label, offset, kind);
        }
    }

    /// Emits a "veneer" for the `kind` code at `offset` to jump to `label`.
    ///
    /// This will generate extra machine code, using `kind`, to get a
    /// larger-jump-kind than `kind` allows. The code at `offset` is then
    /// patched to jump to our new code, and then the new code is enqueued for
    /// a fixup to get processed at some later time.
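    ///
    /// In other words, the resolution chain is: the original short-range
    /// instruction is patched to target the veneer, and the veneer carries a
    /// fresh, longer-range `LabelUse` (returned by `generate_veneer`) that is
    /// re-registered via `use_label_at_offset` and resolved like any other
    /// fixup.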
    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
        // If this `kind` doesn't support a veneer then that's a bug in the
        // backend because we need to implement support for such a veneer.
        assert!(
            kind.supports_veneer(),
            "jump beyond the range of {kind:?} but a veneer isn't supported",
        );

        // Allocate space for a veneer in the island.
        self.align_to(I::LabelUse::ALIGN);
        let veneer_offset = self.cur_offset();
        trace!("making a veneer at {}", veneer_offset);
        let start = offset as usize;
        let end = (offset + kind.patch_size()) as usize;
        let slice = &mut self.data[start..end];
        // Patch the original label use to refer to the veneer.
        trace!(
            "patching original at offset {} to veneer offset {}",
            offset, veneer_offset
        );
        kind.patch(slice, offset, veneer_offset);
        // Generate the veneer.
        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
        let (veneer_fixup_off, veneer_label_use) =
            kind.generate_veneer(veneer_slice, veneer_offset);
        trace!(
            "generated veneer; fixup offset {}, label_use {:?}",
            veneer_fixup_off, veneer_label_use
        );
        // Register a new use of `label` with our new veneer fixup and
        // offset. This'll recalculate deadlines accordingly and
        // enqueue this fixup to get processed at some later
        // time.
        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
    }

    fn finish_emission_maybe_forcing_veneers(
        &mut self,
        force_veneers: ForceVeneers,
        ctrl_plane: &mut ControlPlane,
    ) {
        while !self.pending_constants.is_empty()
            || !self.pending_traps.is_empty()
            || !self.fixup_records.is_empty()
            || !self.pending_fixup_records.is_empty()
        {
            // `emit_island()` will emit any pending veneers and constants, and
            // as a side-effect, will also take care of any fixups with resolved
            // labels eagerly.
            self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
        }

        // Ensure that all labels have been fixed up after the last island is
        // emitted. This is a full (release-mode) assert because an unresolved
        // label means the emitted code is incorrect.
        assert!(self.fixup_records.is_empty());
        assert!(self.pending_fixup_records.is_empty());
    }

    /// Finish any deferred emissions and/or fixups.
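    ///
    /// Consumes the buffer and yields the finalized artifact. A typical call,
    /// as exercised by the tests below:
    ///
    /// ```ignore
    /// let finalized = buf.finish(&constants, ctrl_plane);
    /// ```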
    pub fn finish(
        mut self,
        constants: &VCodeConstants,
        ctrl_plane: &mut ControlPlane,
    ) -> MachBufferFinalized<Stencil> {
        let _tt = timing::vcode_emit_finish();

        self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);

        let alignment = self.finish_constants(constants);

        // Resolve all labels to their offsets.
        let finalized_relocs = self
            .relocs
            .iter()
            .map(|reloc| FinalizedMachReloc {
                offset: reloc.offset,
                kind: reloc.kind,
                addend: reloc.addend,
                target: match &reloc.target {
                    RelocTarget::ExternalName(name) => {
                        FinalizedRelocTarget::ExternalName(name.clone())
                    }
                    RelocTarget::Label(label) => {
                        FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
                    }
                },
            })
            .collect();

        let finalized_exception_handlers = self
            .exception_handlers
            .iter()
            .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
            .collect();

        let mut srclocs = self.srclocs;
        srclocs.sort_by_key(|entry| entry.start);

        MachBufferFinalized {
            data: self.data,
            relocs: finalized_relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            patchable_call_sites: self.patchable_call_sites,
            exception_handlers: finalized_exception_handlers,
            srclocs,
            debug_tags: self.debug_tags,
            debug_tag_pool: self.debug_tag_pool,
            user_stack_maps: self.user_stack_maps,
            unwind_info: self.unwind_info,
            alignment,
            frame_layout: self.frame_layout,
            nop_units: I::gen_nop_units(),
        }
    }

    /// Add an external relocation at the given offset.
    pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
        &mut self,
        offset: CodeOffset,
        kind: Reloc,
        target: &T,
        addend: Addend,
    ) {
        let target: RelocTarget = target.clone().into();
        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
        // generate a label-use statement to track whether an island is possibly
        // needed to escape this function to actually get to the external name.
        // This is most likely to come up on AArch64 where calls between
        // functions use a 26-bit signed offset which gives +/- 64MB. This means
        // that if a function is 128MB in size and there's a call in the middle,
        // it's impossible to reach the actual target. Also, while it's
        // technically possible to jump to the start of a function and then jump
        // further, island insertion below always inserts islands after
        // previously appended code, so for Cranelift's own implementation this
        // is also a problem for 64MB functions on AArch64 which start with a
        // call instruction: those won't be able to escape.
        //
        // Ideally what needs to happen here is that a `LabelUse` is
        // transparently generated (or call-sites of this function are audited
        // to generate a `LabelUse` instead) and tracked internally. The actual
        // relocation would then change over time if and when a veneer is
        // inserted, where the relocation here would be patched by this
        // `MachBuffer` to jump to the veneer. The problem, though, is that all
        // this still needs to end up, in the case of a singular function,
        // generating a final relocation pointing either to this particular
        // relocation or to the veneer inserted. Additionally,
        // `MachBuffer` needs the concept of a label which will never be
        // resolved, so `emit_island` doesn't trip over not actually ever
        // knowing what some labels are. Currently the loop in
        // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
        // loop.
        //
        // For now this means that, because relocs aren't tracked at all,
        // AArch64 functions have a rough size limit of 64MB. For now that's
        // somewhat reasonable and the failure mode is a panic in `MachBuffer`
        // when a relocation can't otherwise be resolved later, so it shouldn't
        // actually result in any memory unsafety or anything like that.
        self.relocs.push(MachReloc {
            offset,
            kind,
            target,
            addend,
        });
    }

    /// Add an external relocation at the current offset.
    pub fn add_reloc<T: Into<RelocTarget> + Clone>(
        &mut self,
        kind: Reloc,
        target: &T,
        addend: Addend,
    ) {
        self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
    }

    /// Add a trap record at the current offset.
    pub fn add_trap(&mut self, code: TrapCode) {
        self.traps.push(MachTrap {
            offset: self.data.len() as CodeOffset,
            code,
        });
    }

    /// Add a call-site record at the current offset.
    pub fn add_call_site(&mut self) {
        self.add_try_call_site(None, core::iter::empty());
    }

    /// Add a call-site record at the current offset with exception
    /// handlers.
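    ///
    /// For example (mirroring the `metadata_records` test below; the labels
    /// here are illustrative), a call site with a known frame offset, one
    /// tag-specific handler, and a catch-all:
    ///
    /// ```ignore
    /// buf.add_try_call_site(
    ///     Some(0x10),
    ///     [
    ///         MachExceptionHandler::Tag(ExceptionTag::new(42), handler_label),
    ///         MachExceptionHandler::Default(fallback_label),
    ///     ]
    ///     .into_iter(),
    /// );
    /// ```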
    pub fn add_try_call_site(
        &mut self,
        frame_offset: Option<u32>,
        exception_handlers: impl Iterator<Item = MachExceptionHandler>,
    ) {
        let start = u32::try_from(self.exception_handlers.len()).unwrap();
        self.exception_handlers.extend(exception_handlers);
        let end = u32::try_from(self.exception_handlers.len()).unwrap();
        let exception_handler_range = start..end;

        self.call_sites.push(MachCallSite {
            ret_addr: self.data.len() as CodeOffset,
            frame_offset,
            exception_handler_range,
        });
    }

    /// Add a patchable call record at the current offset. The actual
    /// call is expected to have been emitted; the VCodeInst trait
    /// specifies how to NOP it out, and we carry that information to
    /// the finalized MachBuffer.
    pub fn add_patchable_call_site(&mut self, len: u32) {
        self.patchable_call_sites.push(MachPatchableCallSite {
            ret_addr: self.cur_offset(),
            len,
        });
    }

    /// Add an unwind record at the current offset.
    pub fn add_unwind(&mut self, unwind: UnwindInst) {
        self.unwind_info.push((self.cur_offset(), unwind));
    }

    /// Set the `SourceLoc` for code from this offset until the offset at the
    /// next call to `end_srcloc()`.
    /// Returns the current [CodeOffset] and [RelSourceLoc].
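    ///
    /// `start_srcloc` and `end_srcloc` are used as a bracketing pair; a
    /// minimal sketch:
    ///
    /// ```ignore
    /// buf.start_srcloc(loc);
    /// // ... emit instructions attributed to `loc` ...
    /// buf.end_srcloc();
    /// ```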
    pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
        let cur = (self.cur_offset(), loc);
        self.cur_srcloc = Some(cur);
        cur
    }

    /// Mark the end of the `SourceLoc` segment started at the last
    /// `start_srcloc()` call.
    pub fn end_srcloc(&mut self) {
        let (start, loc) = self
            .cur_srcloc
            .take()
            .expect("end_srcloc() called without start_srcloc()");
        let end = self.cur_offset();
        // Skip zero-length extends.
        debug_assert!(end >= start);
        if end > start {
            self.srclocs.push(MachSrcLoc { start, end, loc });
        }
    }

    /// Push a user stack map onto this buffer.
    ///
    /// The stack map is associated with the given `return_addr` code
    /// offset. This must be the PC for the instruction just *after* this stack
    /// map's associated instruction. For example, in the sequence `call $foo;
    /// add r8, rax`, the `return_addr` must be the offset of the start of the
    /// `add` instruction.
    ///
    /// Stack maps must be pushed in sorted `return_addr` order.
    pub fn push_user_stack_map(
        &mut self,
        emit_state: &I::State,
        return_addr: CodeOffset,
        mut stack_map: ir::UserStackMap,
    ) {
        let span = emit_state.frame_layout().active_size();
        trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");

        debug_assert!(
            self.user_stack_maps
                .last()
                .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
            "pushed stack maps out of order: {} is not less than {}",
            self.user_stack_maps.last().unwrap().0,
            return_addr,
        );

        stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
        self.user_stack_maps.push((return_addr, span, stack_map));
    }

    /// Push a debug tag associated with the current buffer offset.
    pub fn push_debug_tags(&mut self, pos: MachDebugTagPos, tags: &[DebugTag]) {
        trace!("debug tags at offset {}: {tags:?}", self.cur_offset());
        let start = u32::try_from(self.debug_tag_pool.len()).unwrap();
        self.debug_tag_pool.extend(tags.iter().cloned());
        let end = u32::try_from(self.debug_tag_pool.len()).unwrap();
        self.debug_tags.push(MachDebugTags {
            offset: self.cur_offset(),
            pos,
            range: start..end,
        });
    }

    /// Increase the alignment of the buffer to the given alignment if bigger
    /// than the current alignment.
    pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
        self.min_alignment = self.min_alignment.max(
            1u32.checked_shl(u32::from(align_to))
                .expect("log2_min_function_alignment too large"),
        );
    }

    /// Set the frame layout metadata.
    pub fn set_frame_layout(&mut self, frame_layout: MachBufferFrameLayout) {
        debug_assert!(self.frame_layout.is_none());
        self.frame_layout = Some(frame_layout);
    }
}

impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
    fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
        for b in iter {
            self.put1(b);
        }
    }
}

impl<T: CompilePhase> MachBufferFinalized<T> {
    /// Get a list of source location mapping tuples in sorted-by-start-offset order.
    pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
        &self.srclocs[..]
    }

    /// Get all debug tags, sorted by associated offset.
    pub fn debug_tags(&self) -> impl Iterator<Item = MachBufferDebugTagList<'_>> {
        self.debug_tags.iter().map(|tags| {
            let start = usize::try_from(tags.range.start).unwrap();
            let end = usize::try_from(tags.range.end).unwrap();
            MachBufferDebugTagList {
                offset: tags.offset,
                pos: tags.pos,
                tags: &self.debug_tag_pool[start..end],
            }
        })
    }

    /// Get the total required size for the code.
    pub fn total_size(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Return the code in this mach buffer as a hex string for testing purposes.
    pub fn stringify_code_bytes(&self) -> String {
        // This is pretty lame, but whatever ..
        use core::fmt::Write;
        let mut s = String::with_capacity(self.data.len() * 2);
        for b in &self.data {
            write!(&mut s, "{b:02X}").unwrap();
        }
        s
    }

    /// Get the code bytes.
    pub fn data(&self) -> &[u8] {
        // N.B.: we emit every section into the .text section as far as
        // the `CodeSink` is concerned; we do not bother to segregate
        // the contents into the actual program text, the jumptable and the
        // rodata (constant pool). This allows us to generate code assuming
        // that these will not be relocated relative to each other, and avoids
        // having to designate each section as belonging in one of the three
        // fixed categories defined by `CodeSink`. If this becomes a problem
        // later (e.g. because of memory permissions or similar), we can
        // add this designation and segregate the output; take care, however,
        // to add the appropriate relocations in this case.

        &self.data[..]
    }

    /// Get a mutable slice of the code bytes, allowing patching
    /// post-passes.
    pub fn data_mut(&mut self) -> &mut [u8] {
        &mut self.data[..]
    }

    /// Get the list of external relocations for this code.
    pub fn relocs(&self) -> &[FinalizedMachReloc] {
        &self.relocs[..]
    }

    /// Get the list of trap records for this code.
    pub fn traps(&self) -> &[MachTrap] {
        &self.traps[..]
    }

    /// Get the user stack map metadata for this code.
    pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
        &self.user_stack_maps
    }

    /// Take this buffer's user stack map metadata.
    pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
        mem::take(&mut self.user_stack_maps)
    }

    /// Get the list of call sites for this code, along with
    /// associated exception handlers.
    ///
    /// Each item yielded by the returned iterator is a struct with:
    ///
    /// - The call site metadata record, with a `ret_addr` field
    ///   directly accessible and denoting the offset of the return
    ///   address into this buffer's code.
    /// - The slice of pairs of exception tags and code offsets
    ///   denoting exception-handler entry points associated with this
    ///   call site.
    pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
        self.call_sites.iter().map(|call_site| {
            let handler_range = call_site.exception_handler_range.clone();
            let handler_range = usize::try_from(handler_range.start).unwrap()
                ..usize::try_from(handler_range.end).unwrap();
            FinalizedMachCallSite {
                ret_addr: call_site.ret_addr,
                frame_offset: call_site.frame_offset,
                exception_handlers: &self.exception_handlers[handler_range],
            }
        })
    }

    /// Get the frame layout, if known.
    pub fn frame_layout(&self) -> Option<&MachBufferFrameLayout> {
        self.frame_layout.as_ref()
    }

    /// Get the list of patchable call sites for this code.
    ///
    /// Each location in the buffer contains the bytes for a call
    /// instruction to the specified target. If the call is to be
    /// patched out, the bytes in the region should be replaced with
    /// those given in the `MachBufferFinalized::nop_units` array, repeated
    /// as many times as necessary. (The length of the patchable
    /// region is guaranteed to be an integer multiple of that NOP
    /// unit size.)
    pub fn patchable_call_sites(&self) -> impl Iterator<Item = &MachPatchableCallSite> + '_ {
        self.patchable_call_sites.iter()
    }
}

/// An item in the exception-handler list for a callsite, with label
/// references. Items are interpreted in left-to-right order and the
/// first match wins.
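///
/// For example, the list `[Tag(t, L1), Default(L2)]` sends an exception
/// carrying tag `t` to the code at `L1` and any other exception to `L2`.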
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MachExceptionHandler {
    /// A specific tag (in the current dynamic context) should be
    /// handled by the code at the given offset.
    Tag(ExceptionTag, MachLabel),
    /// All exceptions should be handled by the code at the given
    /// offset.
    Default(MachLabel),
    /// The dynamic context for interpreting tags is updated to the
    /// value stored in the given machine location (in this frame's
    /// context).
    Context(ExceptionContextLoc),
}

impl MachExceptionHandler {
    fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
        match self {
            Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
            Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
            Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
        }
    }
}

/// An item in the exception-handler list for a callsite, with final
/// (lowered) code offsets. Items are interpreted in left-to-right
/// order and the first match wins.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum FinalizedMachExceptionHandler {
    /// A specific tag (in the current dynamic context) should be
    /// handled by the code at the given offset.
    Tag(ExceptionTag, CodeOffset),
    /// All exceptions should be handled by the code at the given
    /// offset.
    Default(CodeOffset),
    /// The dynamic context for interpreting tags is updated to the
    /// value stored in the given machine location (in this frame's
    /// context).
    Context(ExceptionContextLoc),
}

/// A location for a dynamic exception context value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum ExceptionContextLoc {
    /// An offset from SP at the callsite.
    SPOffset(u32),
    /// A GPR at the callsite. The physical register number for the
    /// GPR register file on the target architecture is used.
    GPR(u8),
}

/// Metadata about a constant.
struct MachBufferConstant {
    /// A label which has not yet been bound which can be used for this
    /// constant.
    ///
    /// This is lazily created when a label is requested for a constant and is
    /// cleared when a constant is emitted.
    upcoming_label: Option<MachLabel>,
    /// Required alignment.
    align: CodeOffset,
    /// The byte size of this constant.
    size: usize,
}

/// A trap that is deferred to the next time an island is emitted for either
/// traps, constants, or fixups.
struct MachLabelTrap {
    /// This label will refer to the trap's offset.
    label: MachLabel,
    /// The code associated with this trap.
    code: TrapCode,
    /// An optional source location to assign for this trap.
    loc: Option<RelSourceLoc>,
}

/// A fixup to perform on the buffer once code is emitted. Fixups always refer
/// to labels and patch the code based on label offsets. Hence, they are like
/// relocations, but internal to one buffer.
#[derive(Debug)]
struct MachLabelFixup<I: VCodeInst> {
    /// The label whose offset controls this fixup.
    label: MachLabel,
    /// The offset to fix up / patch to refer to this label.
    offset: CodeOffset,
    /// The kind of fixup. This is architecture-specific; each architecture may have,
    /// e.g., several types of branch instructions, each with differently-sized
    /// offset fields and different places within the instruction to place the
    /// bits.
    kind: I::LabelUse,
}

impl<I: VCodeInst> MachLabelFixup<I> {
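    /// The last buffer offset at which this fixup's branch can still reach
    /// its label: the fixup's own offset plus the maximum positive range of
    /// its `kind`. Past this point the fixup can only be resolved through a
    /// veneer.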
    fn deadline(&self) -> CodeOffset {
        self.offset.saturating_add(self.kind.max_pos_range())
    }
}

impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
    fn eq(&self, other: &Self) -> bool {
        self.deadline() == other.deadline()
    }
}

impl<I: VCodeInst> Eq for MachLabelFixup<I> {}

impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

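// Note the reversed comparison below: `BinaryHeap` is a max-heap, so ordering
// fixups by *descending* deadline makes the heap pop the fixup with the
// *earliest* deadline first, which is the order `emit_island` consumes them
// in.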
impl<I: VCodeInst> Ord for MachLabelFixup<I> {
    fn cmp(&self, other: &Self) -> Ordering {
        other.deadline().cmp(&self.deadline())
    }
}

/// A relocation resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachRelocBase<T> {
    /// The offset at which the relocation applies, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The kind of relocation.
    pub kind: Reloc,
    /// The external symbol / name to which this relocation refers.
    pub target: T,
    /// The addend to add to the symbol value.
    pub addend: i64,
}

type MachReloc = MachRelocBase<RelocTarget>;

/// A relocation resulting from a compilation.
pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;

/// A relocation target.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum RelocTarget {
    /// Points to an [ExternalName] outside the current function.
    ExternalName(ExternalName),
    /// Points to a [MachLabel] inside this function.
    /// This is different from [MachLabelFixup] in that both the relocation and the
    /// label will be emitted and are only resolved at link time.
    ///
    /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
    Label(MachLabel),
}

impl From<ExternalName> for RelocTarget {
    fn from(name: ExternalName) -> Self {
        Self::ExternalName(name)
    }
}

impl From<MachLabel> for RelocTarget {
    fn from(label: MachLabel) -> Self {
        Self::Label(label)
    }
}

/// A relocation target.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum FinalizedRelocTarget {
    /// Points to an [ExternalName] outside the current function.
    ExternalName(ExternalName),
    /// Points to a [CodeOffset] from the start of the current function.
    Func(CodeOffset),
}

impl FinalizedRelocTarget {
    /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
    /// output.
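    ///
    /// E.g., a `Func` target at offset 12 renders as `func+12`, while an
    /// `ExternalName` target renders through `name.display(params)`.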
    pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
        match self {
            FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
            FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
        }
    }
}

/// A trap record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachTrap {
    /// The offset at which the trap instruction occurs, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The trap code.
    pub code: TrapCode,
}

/// A call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachCallSite {
    /// The offset of the call's return address, *relative to the
    /// start of the buffer*.
    pub ret_addr: CodeOffset,

    /// The offset from the FP at this callsite down to the SP when
    /// the call occurs, if known. In other words, the size of the
    /// stack frame up to the saved FP slot. Useful to recover the
    /// start of the stack frame and to look up dynamic contexts
    /// stored in [`ExceptionContextLoc::SPOffset`].
    ///
    /// If `None`, the compiler backend did not specify a frame
    /// offset. The runtime in use with the compiled code may require
    /// the frame offset if exception handlers are present or dynamic
    /// context is used, but that is not Cranelift's concern: the
    /// frame offset is optional at this level.
    pub frame_offset: Option<u32>,

    /// Range in `exception_handlers` corresponding to the exception
    /// handlers for this callsite.
    exception_handler_range: Range<u32>,
}

/// A call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
pub struct FinalizedMachCallSite<'a> {
    /// The offset of the call's return address, *relative to the
    /// start of the buffer*.
    pub ret_addr: CodeOffset,

    /// The offset from the FP at this callsite down to the SP when
    /// the call occurs, if known. In other words, the size of the
    /// stack frame up to the saved FP slot. Useful to recover the
    /// start of the stack frame and to look up dynamic contexts
    /// stored in [`ExceptionContextLoc::SPOffset`].
    ///
    /// If `None`, the compiler backend did not specify a frame
    /// offset. The runtime in use with the compiled code may require
    /// the frame offset if exception handlers are present or dynamic
    /// context is used, but that is not Cranelift's concern: the
    /// frame offset is optional at this level.
    pub frame_offset: Option<u32>,

    /// Exception handlers at this callsite, with target offsets
    /// *relative to the start of the buffer*.
    pub exception_handlers: &'a [FinalizedMachExceptionHandler],
}

/// A patchable call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachPatchableCallSite {
    /// The offset of the call's return address (i.e., the address
    /// after the end of the patchable region), *relative to the start
    /// of the buffer*.
    pub ret_addr: CodeOffset,

    /// The length of the region to be patched by NOP bytes.
    pub len: u32,
}

/// A source-location mapping resulting from a compilation.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachSrcLoc<T: CompilePhase> {
    /// The start of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub start: CodeOffset,
    /// The end of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub end: CodeOffset,
    /// The source location.
    pub loc: T::SourceLocType,
}

impl MachSrcLoc<Stencil> {
    fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
        MachSrcLoc {
            start: self.start,
            end: self.end,
            loc: self.loc.expand(base_srcloc),
        }
    }
}

/// Record of branch instruction in the buffer, to facilitate editing.
#[derive(Clone, Debug)]
struct MachBranch {
    start: CodeOffset,
    end: CodeOffset,
    target: MachLabel,
    fixup: usize,
    inverted: Option<SmallVec<[u8; 8]>>,
    /// All labels pointing to the start of this branch. For correctness, this
    /// *must* be complete (i.e., must contain all labels whose resolved offsets
    /// are at the start of this branch): we rely on being able to redirect all
    /// labels that could jump to this branch before removing it, if it is
    /// otherwise unreachable.
    labels_at_this_branch: SmallVec<[MachLabel; 4]>,
}

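// A recorded branch is treated as conditional iff bytes for an inverted-sense
// encoding were captured in `inverted`; having those bytes available is what
// lets the buffer's branch-editing logic flip a conditional branch in place
// when the other target becomes the fallthrough (see `test_flip_cond` below).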
impl MachBranch {
    fn is_cond(&self) -> bool {
        self.inverted.is_some()
    }
    fn is_uncond(&self) -> bool {
        self.inverted.is_none()
    }
}

/// Stack-frame layout information carried through to machine
/// code. This provides sufficient information to interpret an active
/// stack frame from a running function, if provided.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachBufferFrameLayout {
    /// Offset from bottom of frame to FP (near top of frame). This
    /// allows reading the frame given only FP.
    pub frame_to_fp_offset: u32,
    /// Offset from bottom of frame for each StackSlot.
    pub stackslots: SecondaryMap<ir::StackSlot, MachBufferStackSlot>,
}

/// Descriptor for a single stack slot in the compiled function.
#[derive(Clone, Debug, PartialEq, Default)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachBufferStackSlot {
    /// Offset from the bottom of the stack frame.
    pub offset: u32,

    /// User-provided key to describe this stack slot.
    pub key: Option<ir::StackSlotKey>,
}

/// Debug tags: a sequence of references to a stack slot, or a
/// user-defined value, at a particular PC.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub(crate) struct MachDebugTags {
    /// Offset at which this tag applies.
    pub offset: CodeOffset,

    /// Position on the attached instruction. This indicates whether
    /// the tags attach to the prior instruction (i.e., as a return
    /// point from a call) or the current instruction (i.e., as a PC
    /// seen during a trap).
    pub pos: MachDebugTagPos,

    /// The range in the tag pool.
    pub range: Range<u32>,
}

/// Debug tag position on an instruction.
///
/// We need to distinguish position on an instruction, and not just
/// use offsets, because of the following case:
///
/// ```plain
/// <tag1, tag2> call ...
/// <tag3, tag4> trapping_store ...
/// ```
///
/// If the stack is walked and interpreted with debug tags while
/// within the call, the PC seen will be the return point, i.e. the
/// address after the call. If the stack is walked and interpreted
/// with debug tags upon a trap of the following instruction, it will
/// be the PC of that instruction -- which is the same PC!
/// Thus, to disambiguate which tags we want, we attach a "pre/post" flag to
/// every group of tags at an offset; and when we look up tags, we
/// look them up for an offset and "position" at that offset.
///
/// Thus there are logically two positions at every offset -- so the
/// above will be emitted as
///
/// ```plain
/// 0: call ...
/// 4, post: <tag1, tag2>
/// 4, pre: <tag3, tag4>
/// 4: trapping_store ...
/// ```
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum MachDebugTagPos {
    /// Tags attached after the instruction that ends at this offset.
    ///
    /// This is used to attach tags to a call, because the PC we see
    /// when walking the stack is the *return point*.
    Post,
    /// Tags attached before the instruction that starts at this offset.
    ///
    /// This is used to attach tags to every other kind of
    /// instruction, because the PC we see when processing a trap of
    /// that instruction is the PC of that instruction, not the
    /// following one.
    Pre,
}

/// Iterator item for visiting debug tags.
pub struct MachBufferDebugTagList<'a> {
    /// Offset at which this tag applies.
    pub offset: CodeOffset,

    /// Position at this offset ("post", attaching to prior
    /// instruction, or "pre", attaching to next instruction).
    pub pos: MachDebugTagPos,

    /// The underlying tags.
    pub tags: &'a [DebugTag],
}

/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
///
/// Note that `MachBuffer` was primarily written for intra-function references
/// of jumps between basic blocks, but it's also quite usable for entire text
/// sections and resolving references between functions themselves. This
/// builder interprets "blocks" as labeled functions for the purposes of
/// resolving labels internally in the buffer.
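///
/// A hypothetical use: append functions in order, then route inter-function
/// relocations by function index (the names, byte slices, and offsets here
/// are illustrative):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(2);
/// let off_a = builder.append(true, &func_a_bytes, 4, ctrl_plane);
/// let off_b = builder.append(true, &func_b_bytes, 4, ctrl_plane);
/// // Resolve a call relocation inside `func_a` to function index 1.
/// let handled = builder.resolve_reloc(off_a + call_reloc_offset, reloc, addend, 1);
/// let text = builder.finish(ctrl_plane);
/// ```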
pub struct MachTextSectionBuilder<I: VCodeInst> {
    buf: MachBuffer<I>,
    next_func: usize,
    force_veneers: ForceVeneers,
}

impl<I: VCodeInst> MachTextSectionBuilder<I> {
    /// Creates a new text section builder which will have `num_funcs` functions
    /// pushed into it.
    pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
        let mut buf = MachBuffer::new();
        buf.reserve_labels_for_blocks(num_funcs);
        MachTextSectionBuilder {
            buf,
            next_func: 0,
            force_veneers: ForceVeneers::No,
        }
    }
}

impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
    fn append(
        &mut self,
        labeled: bool,
        func: &[u8],
        align: u32,
        ctrl_plane: &mut ControlPlane,
    ) -> u64 {
        // Conditionally emit an island if it's necessary to resolve jumps
        // between functions which are too far away.
        let size = func.len() as u32;
        if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
            self.buf
                .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
        }

        self.buf.align_to(align);
        let pos = self.buf.cur_offset();
        if labeled {
            self.buf.bind_label(
                MachLabel::from_block(BlockIndex::new(self.next_func)),
                ctrl_plane,
            );
            self.next_func += 1;
        }
        self.buf.put_data(func);
        u64::from(pos)
    }

    fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
        crate::trace!(
            "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
        );
        let label = MachLabel::from_block(BlockIndex::new(target));
        let offset = u32::try_from(offset).unwrap();
        match I::LabelUse::from_reloc(reloc, addend) {
            Some(label_use) => {
                self.buf.use_label_at_offset(offset, label, label_use);
                true
            }
            None => false,
        }
    }

    fn force_veneers(&mut self) {
        self.force_veneers = ForceVeneers::Yes;
    }

    fn write(&mut self, offset: u64, data: &[u8]) {
        self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
    }

    fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
        // Double-check all functions were pushed.
        assert_eq!(self.next_func, self.buf.label_offsets.len());

        // Finish up any veneers, if necessary.
        self.buf
            .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);

        // We don't need the data any more, so return it to the caller.
        mem::take(&mut self.buf.data).into_vec()
    }
}

// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
#[cfg(all(test, feature = "arm64"))]
mod test {
    use cranelift_entity::EntityRef as _;

    use super::*;
    use crate::ir::UserExternalNameRef;
    use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
    use crate::isa::aarch64::inst::{OperandSize, xreg};
    use crate::machinst::{MachInstEmit, MachInstEmitState};
    use crate::settings;

    fn label(n: u32) -> MachLabel {
        MachLabel::from_block(BlockIndex::new(n as usize))
    }
    fn target(n: u32) -> BranchTarget {
        BranchTarget::Label(label(n))
    }

    #[test]
    fn test_elide_jump_to_next() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(2);
        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);
        buf.bind_label(label(1), state.ctrl_plane_mut());
        let buf = buf.finish(&constants, state.ctrl_plane_mut());
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_elide_trivial_jump_blocks() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());

        let buf = buf.finish(&constants, state.ctrl_plane_mut());
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_flip_cond() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Udf {
            trap_code: TrapCode::STACK_OVERFLOW,
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::TrapIf {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            trap_code: TrapCode::STACK_OVERFLOW,
        };
        inst.emit(&mut buf2, &info, &mut state);
        let inst = Inst::Nop4;
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(buf.data, buf2.data);
    }

    #[test]
    fn test_island() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(2),
            not_taken: target(3),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        while buf.cur_offset() < 2000000 {
            if buf.island_needed(0) {
                buf.emit_island(0, state.ctrl_plane_mut());
            }
            let inst = Inst::Nop4;
            inst.emit(&mut buf, &info, &mut state);
        }

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(2000000 + 8, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),

            // This conditionally taken branch has a 19-bit constant, shifted
            // to the left by two, giving us a 21-bit range in total. Half of
            // this range is positive, so we should be around 1 << 20 bytes
            // away from our jump target.
            //
            // There are two pending fixups by the time we reach this point,
            // one for this 19-bit jump and one for the unconditional 26-bit
            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
            // veneer is 20 bytes large, which means we pessimistically assume
            // we'll need two veneers. Currently each veneer is pessimistically
            // assumed to be the maximal size, which means we need 40 bytes of
            // extra space, meaning that the actual island should come 40 bytes
            // before the deadline.
            taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),

            // This branch is in-range so no veneers should be needed, it should
            // go directly to the target.
            not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
        };
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(&buf.data[0..8], &buf2.data[..]);
    }

    #[test]
    fn test_island_backward() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        while buf.cur_offset() < 2000000 {
            let inst = Inst::Nop4;
            inst.emit(&mut buf, &info, &mut state);
        }

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(0),
            not_taken: target(1),
        };
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(2000000 + 12, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
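        // The conditional branch's limited (~1 MB) range cannot span the
        // ~2 MB back to label0, so we expect its taken edge to bounce through
        // the unconditional branch emitted just after it (at +8), which has
        // the range to jump all the way back; the not-taken edge is encoded
        // as an unconditional branch and can reach label1 directly.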
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: BranchTarget::ResolvedOffset(8),
            not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
        };
        inst.emit(&mut buf2, &info, &mut state);
        let inst = Inst::Jump {
            dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
        };
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(&buf.data[2000000..], &buf2.data[..]);
    }

    #[test]
    fn test_multiple_redirect() {
        // label0:
        //   cbz x0, label1
        //   b label2
        // label1:
        //   b label3
        // label2:
        //   nop
        //   nop
        //   b label0
        // label3:
        //   b label4
        // label4:
        //   b label5
        // label5:
        //   b label7
        // label6:
        //   nop
        // label7:
        //   ret
        //
        // -- should become:
        //
        // label0:
        //   cbz x0, label7
        // label2:
        //   nop
        //   nop
        //   b label0
        // label6:
        //   nop
        // label7:
        //   ret

        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(8);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);
        inst.emit(&mut buf, &info, &mut state);
        let inst = Inst::Jump { dest: target(0) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(5) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(5), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(7) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(6), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(7), state.ctrl_plane_mut());
        let inst = Inst::Ret {};
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
            0x1f, 0x20, 0x03, 0xd5, // nop
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xfd, 0xff, 0xff, 0x17, // b 0
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xc0, 0x03, 0x5f, 0xd6, // ret
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn test_handle_branch_cycle() {
        // label0:
        //   b label1
        // label1:
        //   b label2
        // label2:
        //   b label3
        // label3:
        //   b label4
        // label4:
        //   b label1 // note: not label0 (to make it interesting).
        //
        // -- should become:
        //
        // label0, label1, ..., label4:
        //   b label0
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(5);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(2) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0x00, 0x00, 0x00, 0x14, // b 0
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn metadata_records() {
        let mut buf = MachBuffer::<Inst>::new();
        let ctrl_plane = &mut Default::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(3);

        buf.bind_label(label(0), ctrl_plane);
        buf.put1(1);
        buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
        buf.put1(2);
        buf.add_trap(TrapCode::INTEGER_OVERFLOW);
        buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
        buf.add_try_call_site(
            Some(0x10),
            [
                MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
                MachExceptionHandler::Default(label(1)),
            ]
            .into_iter(),
        );
        buf.add_reloc(
            Reloc::Abs4,
            &ExternalName::User(UserExternalNameRef::new(0)),
            0,
        );
        buf.put1(3);
        buf.add_reloc(
            Reloc::Abs8,
            &ExternalName::User(UserExternalNameRef::new(1)),
            1,
        );
        buf.put1(4);
        buf.bind_label(label(1), ctrl_plane);
        buf.put1(0xff);
        buf.bind_label(label(2), ctrl_plane);
        buf.put1(0xff);

        let buf = buf.finish(&constants, ctrl_plane);

        assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
        assert_eq!(
            buf.traps()
                .iter()
                .map(|trap| (trap.offset, trap.code))
                .collect::<Vec<_>>(),
            vec![
                (1, TrapCode::HEAP_OUT_OF_BOUNDS),
                (2, TrapCode::INTEGER_OVERFLOW),
                (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
            ]
        );
        let call_sites: Vec<_> = buf.call_sites().collect();
        assert_eq!(call_sites[0].ret_addr, 2);
        assert_eq!(call_sites[0].frame_offset, Some(0x10));
        assert_eq!(
            call_sites[0].exception_handlers,
            &[
                FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
                FinalizedMachExceptionHandler::Default(4)
            ],
        );
        assert_eq!(
            buf.relocs()
                .iter()
                .map(|reloc| (reloc.offset, reloc.kind))
                .collect::<Vec<_>>(),
            vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
        );
    }
}