//! In-memory representation of compiled machine code, with labels and fixups to
//! refer to those labels. Handles constant-pool island insertion and also
//! veneer insertion for out-of-range jumps.
//!
//! This code exists to solve three problems:
//!
//! - Branch targets for forward branches are not known until later, when we
//! emit code in a single pass through the instruction structs.
//!
//! - On many architectures, address references or offsets have limited range.
//! For example, on AArch64, conditional branches can only target code +/- 1MB
//! from the branch itself.
//!
//! - The lowering of control flow from the CFG-with-edges produced by
//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
//! edge blocks when the register allocator does not need to insert any
//! spills/reloads/moves in edge blocks, results in many suboptimal branch
//! patterns. The lowering also pays no attention to block order, and so
//! two-target conditional forms (cond-br followed by uncond-br) can often be
//! avoided because one of the targets is the fallthrough. There are several
//! cases here where we can simplify to use fewer branches.
//!
//! This "buffer" implements a single-pass code emission strategy (with a later
//! "fixup" pass, but only through recorded fixups, not all instructions). The
//! basic idea is:
//!
//! - Emit branches as they are, including two-target (cond/uncond) compound
//! forms, but with zero offsets and optimistically assuming the target will be
//! in range. Record the "fixup" for later. Targets are denoted instead by
//! symbolic "labels" that are then bound to certain offsets in the buffer as
//! we emit code. (Nominally, there is a label at the start of every basic
//! block.)
//!
//! - As we do this, track the offset in the buffer at which the first label
//! reference "goes out of range". We call this the "deadline". If we reach the
//! deadline and we still have not bound the label to which an unresolved branch
//! refers, we have a problem!
//!
//! - To solve this problem, we emit "islands" full of "veneers". An island is
//! simply a chunk of code inserted in the middle of the code actually produced
//! by the emitter (e.g., vcode iterating over instruction structs). The emitter
//! has some awareness of this: it either asks for an island between blocks, so
//! it is not accidentally executed, or else it emits a branch around the island
//! when all other options fail (see `Inst::EmitIsland` meta-instruction).
//!
//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
//! that implements a longer-range reference to a label. The idea is that, for
//! example, a branch with a limited range can branch to a "veneer" instead,
//! which is simply a branch in a form that can use a longer-range reference. On
//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
//! conditional branch's label reference can be fixed up with a "veneer" to
//! achieve a longer range.
//!
//! - To implement all of this, we require the backend to provide a `LabelUse`
//! type that implements a trait. This is nominally an enum that records one of
//! several kinds of references to an offset in code -- basically, a relocation
//! type -- and will usually correspond to different instruction formats. The
//! `LabelUse` implementation specifies the maximum range, how to patch in the
//! actual label location when known, and how to generate a veneer to extend the
//! range.
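//!
//! As a rough illustration only (the real trait is `MachInstLabelUse`, defined
//! elsewhere in this crate; the names and signatures below are simplified
//! stand-ins, not the actual trait), a backend's `LabelUse` type is shaped
//! roughly like this:
//!
//! ```ignore
//! // Hypothetical, simplified label-use kind for an AArch64-like ISA.
//! enum ExampleLabelUse {
//!     /// 19-bit PC-relative field, as in a conditional branch (+/- 1 MB).
//!     Branch19,
//!     /// 26-bit PC-relative field, as in an unconditional branch (+/- 128 MB).
//!     Branch26,
//! }
//!
//! impl ExampleLabelUse {
//!     /// Maximum forward distance this kind of reference can reach.
//!     fn max_pos_range(&self) -> CodeOffset {
//!         match self {
//!             ExampleLabelUse::Branch19 => (1 << 20) - 4,
//!             ExampleLabelUse::Branch26 => (1 << 27) - 4,
//!         }
//!     }
//!     // The real trait also specifies how to patch the already-emitted bytes
//!     // once the label's offset is known, and how to emit a longer-range
//!     // veneer when the target is out of range.
//! }
//! ```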
//!
//! That satisfies label references, but we still may have suboptimal branch
//! patterns. To clean up the branches, we do a simple "peephole"-style
//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
//! informs the buffer of branches in the code and, in the case of conditionals,
//! the code that would have been emitted to invert this branch's condition. We
//! track the "latest branches": these are branches that are contiguous up to
//! the current offset. (If any code is emitted after a branch, that branch or
//! run of contiguous branches is no longer "latest".) The latest branches are
//! those that we can edit by simply truncating the buffer and doing something
//! else instead.
//!
//! To optimize branches, we implement several simple rules, and try to apply
//! them to the "latest branches" when possible:
//!
//! - A branch with a label target, when that label is bound to the ending
//! offset of the branch (the fallthrough location), can be removed altogether,
//! because the branch would have no effect.
//!
//! - An unconditional branch that starts at a label location, and branches to
//! another label, results in a "label alias": all references to the label bound
//! *to* this branch instruction are instead resolved to the *target* of the
//! branch instruction. This effectively removes empty blocks that just
//! unconditionally branch to the next block. We call this "branch threading".
//!
//! - A conditional followed by an unconditional, when the conditional branches
//! to the unconditional's fallthrough, results in (i) the truncation of the
//! unconditional, (ii) the inversion of the condition's condition, and (iii)
//! replacement of the conditional's target (using the original target of the
//! unconditional). This is a fancy way of saying "we can flip a two-target
//! conditional branch's taken/not-taken targets if it works better with our
//! fallthrough". To make this work, the emitter actually gives the buffer
//! *both* forms of every conditional branch: the true form is emitted into the
//! buffer, and the "inverted" machine-code bytes are provided as part of the
//! branch-fixup metadata.
//!
//! - An unconditional branch B preceded by another unconditional branch P, when
//! B's label(s) have been redirected to target(B), can be removed entirely. This
//! is an extension of the branch-threading optimization, and is valid because
//! if we know there will be no fallthrough into this branch instruction (the
//! prior instruction is an unconditional jump), and if we know we have
//! successfully redirected all labels, then this branch instruction is
//! unreachable. Note that this works because the redirection happens before the
//! label is ever resolved (fixups happen at island emission time, at which point
//! latest-branches are cleared, or at the end of emission), so we are sure to
//! catch and redirect all possible paths to this instruction.
//!
//! # Branch-optimization Correctness
//!
//! The branch-optimization mechanism depends on a few data structures with
//! invariants, which are always held outside the scope of top-level public
//! methods:
//!
//! - The latest-branches list. Each entry describes a span of the buffer
//! (start/end offsets), the label target, the corresponding fixup-list entry
//! index, and the bytes (must be the same length) for the inverted form, if
//! conditional. The list of labels that are bound to the start-offset of this
//! branch is *complete* (if any label has a resolved offset equal to `start`
//! and is not an alias, it must appear in this list) and *precise* (no label
//! in this list can be bound to another offset). No label in this list should
//! be an alias. No two branch ranges can overlap, and branches are in
//! ascending-offset order.
//!
//! - The labels-at-tail list. This contains all MachLabels that have been bound
//! to (whose resolved offsets are equal to) the tail offset of the buffer.
//! No label in this list should be an alias.
//!
//! - The label_offsets array, containing the bound offset of a label or
//! UNKNOWN. No label can be bound at an offset greater than the current
//! buffer tail.
//!
//! - The label_aliases array, containing another label to which a label is
//! bound or UNKNOWN. A label's resolved offset is the resolved offset
//! of the label it is aliased to, if this is set.
//!
//! We argue below, at each method, how the invariants in these data structures
//! are maintained (grep for "Post-invariant").
//!
//! Given these invariants, we argue why each optimization preserves execution
//! semantics below (grep for "Preserves execution semantics").
//!
//! # Avoiding Quadratic Behavior
//!
//! There are two cases where we've had to take some care to avoid
//! quadratic worst-case behavior:
//!
//! - The "labels at this branch" list can grow unboundedly if the
//! code generator binds many labels at one location. If the count
//! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
//! simply abort an optimization early in a way that is always correct
//! but is conservative.
//!
//! - The fixup list can interact with island emission to create
//! "quadratic island behavior". In a little more detail, one can hit
//! this behavior by having some pending fixups (forward label
//! references) with long-range label-use kinds, and some others
//! with shorter-range references that nonetheless still are pending
//! long enough to trigger island generation. In such a case, we
//! process the fixup list, generate veneers to extend some forward
//! references' ranges, but leave the other (longer-range) ones
//! alone. The way this was implemented put them back on a list and
//! resulted in quadratic behavior.
//!
//! To avoid this, fixups are split into two lists: one "pending" list and one
//! final list. The pending list is kept around for handling fixups related to
//! branches so it can be edited/truncated. When an island is reached, which
//! starts processing fixups, all pending fixups are flushed into the final
//! list. The final list is a `BinaryHeap` which enables fixup processing to
//! only process those which are required during island emission, deferring
//! all longer-range fixups to later.

use crate::binemit::{Addend, CodeOffset, Reloc};
use crate::ir::function::FunctionParameters;
use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
use crate::isa::unwind::UnwindInst;
use crate::machinst::{
    BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
};
use crate::trace;
use crate::{MachInstEmitState, ir};
use crate::{VCodeConstantData, timing};
use core::ops::Range;
use cranelift_control::ControlPlane;
use cranelift_entity::{PrimaryMap, entity_impl};
use smallvec::SmallVec;
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::mem;
use std::string::String;
use std::vec::Vec;

#[cfg(feature = "enable-serde")]
use serde::{Deserialize, Serialize};

#[cfg(feature = "enable-serde")]
pub trait CompilePhase {
    type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
}

#[cfg(not(feature = "enable-serde"))]
pub trait CompilePhase {
    type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
    type SourceLocType: core::fmt::Debug + PartialEq + Clone;
}

/// Status of a compiled artifact that needs patching before being used.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub struct Stencil;

/// Status of a compiled artifact ready to use.
#[derive(Clone, Debug, PartialEq)]
pub struct Final;

impl CompilePhase for Stencil {
    type MachSrcLocType = MachSrcLoc<Stencil>;
    type SourceLocType = RelSourceLoc;
}

impl CompilePhase for Final {
    type MachSrcLocType = MachSrcLoc<Final>;
    type SourceLocType = SourceLoc;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ForceVeneers {
    Yes,
    No,
}

/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
/// in bulk.
///
/// This struct uses `SmallVec`s to support small-ish function bodies without
/// any heap allocation. As such, it will be several kilobytes large. This is
/// likely fine as long as it is stack-allocated for function emission then
/// thrown away; but beware if many buffer objects are retained persistently.
pub struct MachBuffer<I: VCodeInst> {
    /// The buffer contents, as raw bytes.
    data: SmallVec<[u8; 1024]>,
    /// The required alignment of this buffer.
    min_alignment: u32,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    relocs: SmallVec<[MachReloc; 16]>,
    /// Any trap records referring to this code.
    traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Any unwind info at a given location.
    unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The current source location in progress (after `start_srcloc()` and
    /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
    cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
    /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
    label_offsets: SmallVec<[CodeOffset; 16]>,
    /// Label aliases: when one label points to an unconditional jump, and that
    /// jump points to another label, we can redirect references to the first
    /// label immediately to the second.
    ///
    /// Invariant: we don't have label-alias cycles. We ensure this by,
    /// before setting label A to alias label B, resolving B's alias
    /// target (iteratively until a non-aliased label); if B is already
    /// aliased to A, then we cannot alias A back to B.
    label_aliases: SmallVec<[MachLabel; 16]>,
    /// Constants that must be emitted at some point.
    pending_constants: SmallVec<[VCodeConstant; 16]>,
    /// Byte size of all constants in `pending_constants`.
    pending_constants_size: CodeOffset,
    /// Traps that must be emitted at some point.
    pending_traps: SmallVec<[MachLabelTrap; 16]>,
    /// Fixups that haven't yet been flushed into `fixup_records` below and may
    /// be related to branches that are chomped. These all get added to
    /// `fixup_records` during island emission.
    pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
    /// The nearest upcoming deadline for entries in `pending_fixup_records`.
    pending_fixup_deadline: CodeOffset,
    /// Fixups that must be performed after all code is emitted.
    fixup_records: BinaryHeap<MachLabelFixup<I>>,
    /// Latest branches, to facilitate in-place editing for better fallthrough
    /// behavior and empty-block removal.
    latest_branches: SmallVec<[MachBranch; 4]>,
    /// All labels at the current offset (emission tail). This is lazily
    /// cleared: it is actually accurate as long as the current offset is
    /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
    /// be considered as empty.
    ///
    /// For correctness, this *must* be complete (i.e., the vector must contain
    /// all labels whose offsets are resolved to the current tail), because we
    /// rely on it to update labels when we truncate branches.
    labels_at_tail: SmallVec<[MachLabel; 4]>,
    /// The last offset at which `labels_at_tail` is valid. It is conceptually
    /// always describing the tail of the buffer, but we do not clear
    /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
    /// when the offset has grown past this (`labels_at_tail_off`) point.
    /// Always <= `cur_offset()`.
    labels_at_tail_off: CodeOffset,
    /// Metadata about all constants that this function has access to.
    ///
    /// This records the size/alignment of all constants (not the actual data)
    /// along with the last available label generated for the constant. This map
    /// is consulted when constants are referred to and the label assigned to a
    /// constant may change over time as well.
    constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
    /// All recorded usages of constants as pairs of the constant and where the
    /// constant needs to be placed within `self.data`. Note that the same
    /// constant may appear in this array multiple times if it was emitted
    /// multiple times.
    used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
    /// Indicates when a patchable region is currently open, to guard that it's
    /// not possible to nest patchable regions.
    open_patchable: bool,
}

impl MachBufferFinalized<Stencil> {
    /// Get a finalized machine buffer by applying the function's base source location.
    pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
        MachBufferFinalized {
            data: self.data,
            relocs: self.relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            exception_handlers: self.exception_handlers,
            srclocs: self
                .srclocs
                .into_iter()
                .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
                .collect(),
            user_stack_maps: self.user_stack_maps,
            unwind_info: self.unwind_info,
            alignment: self.alignment,
        }
    }
}

/// A `MachBuffer` once emission is completed: holds generated code and records,
/// without fixups. This allows the type to be independent of the backend.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachBufferFinalized<T: CompilePhase> {
    /// The buffer contents, as raw bytes.
    pub(crate) data: SmallVec<[u8; 1024]>,
    /// Any relocations referring to this code. Note that only *external*
    /// relocations are tracked here; references to labels within the buffer are
    /// resolved before emission.
    pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
    /// Any trap records referring to this code.
    pub(crate) traps: SmallVec<[MachTrap; 16]>,
    /// Any call site records referring to this code.
    pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
    /// Any exception-handler records referred to at call sites.
    pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
    /// Any source location mappings referring to this code.
    pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
    /// Any user stack maps for this code.
    ///
    /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
    /// by code offset, and each stack map covers `span` bytes on the stack.
    pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
    /// Any unwind info at a given location.
    pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
    /// The required alignment of this buffer.
    pub alignment: u32,
}

const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);

/// Threshold on max length of `labels_at_this_branch` list to avoid
/// unbounded quadratic behavior (see comment below at use-site).
const LABEL_LIST_THRESHOLD: usize = 100;

/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
/// the point at which it is used by emitted code; the buffer records "fixups"
/// for references to the label, and will come back and patch the code
/// appropriately when the label's location is eventually known.
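///
/// As a rough sketch of the typical forward-reference flow (illustrative only;
/// `buf`, `kind`, and `ctrl_plane` stand in for a concrete `MachBuffer`, a
/// backend-specific `I::LabelUse` value, and a `ControlPlane`):
///
/// ```ignore
/// let label = buf.get_label();                // not yet bound to an offset
/// let off = buf.cur_offset();
/// buf.use_label_at_offset(off, label, kind);  // record a fixup at `off`
/// // ... emit the instruction bytes that contain the (zeroed) offset field ...
///
/// // Later, when emission reaches the target position:
/// buf.bind_label(label, ctrl_plane);          // fixup patched at island/finish time
/// ```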
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MachLabel(u32);
entity_impl!(MachLabel);

impl MachLabel {
    /// Get a label for a block. (The first N MachLabels are always reserved for
    /// the N blocks in the vcode.)
    pub fn from_block(bindex: BlockIndex) -> MachLabel {
        MachLabel(bindex.index() as u32)
    }

    /// Creates a string representing this label, for convenience.
    pub fn to_string(&self) -> String {
        format!("label{}", self.0)
    }
}

impl Default for MachLabel {
    fn default() -> Self {
        UNKNOWN_LABEL
    }
}

/// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
/// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
/// the [`OpenPatchRegion`] token in the process.
pub struct OpenPatchRegion(usize);

/// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
/// of where you might want to use this is for patching instructions that mention constants that
/// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
/// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
/// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
/// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
/// bytes, and the constant uses can be updated directly.
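///
/// A rough usage sketch (illustrative only; the four placeholder bytes and the
/// later `value: u32` stand in for whatever the instruction encoding needs):
///
/// ```ignore
/// let open = buf.start_patchable();
/// buf.put4(0);                               // placeholder constant in the encoding
/// let region = buf.end_patchable(open);
/// // ... later, once the real value is known ...
/// let bytes = region.patch(&mut buf);
/// bytes.copy_from_slice(&value.to_le_bytes());
/// ```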
pub struct PatchRegion {
    range: Range<usize>,
}

impl PatchRegion {
    /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
    pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
        &mut buffer.data[self.range]
    }
}

impl<I: VCodeInst> MachBuffer<I> {
    /// Create a new, empty `MachBuffer`.
    pub fn new() -> MachBuffer<I> {
        MachBuffer {
            data: SmallVec::new(),
            min_alignment: I::function_alignment().minimum,
            relocs: SmallVec::new(),
            traps: SmallVec::new(),
            call_sites: SmallVec::new(),
            exception_handlers: SmallVec::new(),
            srclocs: SmallVec::new(),
            user_stack_maps: SmallVec::new(),
            unwind_info: SmallVec::new(),
            cur_srcloc: None,
            label_offsets: SmallVec::new(),
            label_aliases: SmallVec::new(),
            pending_constants: SmallVec::new(),
            pending_constants_size: 0,
            pending_traps: SmallVec::new(),
            pending_fixup_records: SmallVec::new(),
            pending_fixup_deadline: u32::MAX,
            fixup_records: Default::default(),
            latest_branches: SmallVec::new(),
            labels_at_tail: SmallVec::new(),
            labels_at_tail_off: 0,
            constants: Default::default(),
            used_constants: Default::default(),
            open_patchable: false,
        }
    }

    /// Current offset from start of buffer.
    pub fn cur_offset(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Add a byte.
    pub fn put1(&mut self, value: u8) {
        self.data.push(value);

        // Post-invariant: conceptual-labels_at_tail contains a complete and
        // precise list of labels bound at `cur_offset()`. We have advanced
        // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
        // before, it is not anymore (and it cannot become equal, because
        // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
        // conceptually empty (even though it is only lazily cleared). No labels
        // can be bound at this new offset (by invariant on `label_offsets`).
        // Hence the invariant holds.
    }

    /// Add 2 bytes.
    pub fn put2(&mut self, value: u16) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 4 bytes.
    pub fn put4(&mut self, value: u32) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add 8 bytes.
    pub fn put8(&mut self, value: u64) {
        let bytes = value.to_le_bytes();
        self.data.extend_from_slice(&bytes[..]);

        // Post-invariant: as for `put1()`.
    }

    /// Add a slice of bytes.
    pub fn put_data(&mut self, data: &[u8]) {
        self.data.extend_from_slice(data);

        // Post-invariant: as for `put1()`.
    }

    /// Reserve appended space and return a mutable slice referring to it.
    pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
        let off = self.data.len();
        let new_len = self.data.len() + len;
        self.data.resize(new_len, 0);
        &mut self.data[off..]

        // Post-invariant: as for `put1()`.
    }

    /// Align up to the given alignment.
    pub fn align_to(&mut self, align_to: CodeOffset) {
        trace!("MachBuffer: align to {}", align_to);
        assert!(
            align_to.is_power_of_two(),
            "{align_to} is not a power of two"
        );
        while self.cur_offset() & (align_to - 1) != 0 {
            self.put1(0);
        }

        // Post-invariant: as for `put1()`.
    }

    /// Begin a region of patchable code. There is one requirement for the
    /// code that is emitted: It must not introduce any instructions that
    /// could be chomped (branches are an example of this). In other words,
    /// you must not call [`MachBuffer::add_cond_branch`] or
    /// [`MachBuffer::add_uncond_branch`] between calls to this method and
    /// [`MachBuffer::end_patchable`].
    pub fn start_patchable(&mut self) -> OpenPatchRegion {
        assert!(!self.open_patchable, "Patchable regions may not be nested");
        self.open_patchable = true;
        OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
    }

    /// End a region of patchable code, yielding a [`PatchRegion`] value that
    /// can be consumed later to produce a one-off mutable slice to the
    /// associated region of the data buffer.
    pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
        // No need to assert the state of `open_patchable` here, as we take
        // ownership of the only `OpenPatchRegion` value.
        self.open_patchable = false;
        let end = usize::try_from(self.cur_offset()).unwrap();
        PatchRegion { range: open.0..end }
    }

    /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
    /// offset yet.
    pub fn get_label(&mut self) -> MachLabel {
        let l = self.label_offsets.len() as u32;
        self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
        self.label_aliases.push(UNKNOWN_LABEL);
        trace!("MachBuffer: new label -> {:?}", MachLabel(l));
        MachLabel(l)

        // Post-invariant: the only mutation is to add a new label; it has no
        // bound offset yet, so it trivially satisfies all invariants.
    }

    /// Reserve the first N MachLabels for blocks.
    pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
        trace!("MachBuffer: first {} labels are for blocks", blocks);
        debug_assert!(self.label_offsets.is_empty());
        self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
        self.label_aliases.resize(blocks, UNKNOWN_LABEL);

        // Post-invariant: as for `get_label()`.
    }

    /// Registers metadata in this `MachBuffer` about the `constants` provided.
    ///
    /// This will record the size/alignment of all constants which will prepare
    /// them for emission later on.
    pub fn register_constants(&mut self, constants: &VCodeConstants) {
        for (c, val) in constants.iter() {
            self.register_constant(&c, val);
        }
    }

    /// Similar to [`MachBuffer::register_constants`] but registers a
    /// single constant's metadata. This function is useful in
    /// situations where not all constants are known at the time of
    /// emission.
    pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
        let c2 = self.constants.push(MachBufferConstant {
            upcoming_label: None,
            align: data.alignment(),
            size: data.as_slice().len(),
        });
        assert_eq!(*constant, c2);
    }

    /// Completes constant emission by iterating over `self.used_constants` and
    /// filling in the "holes" with the constant values provided by `constants`.
    ///
    /// Returns the alignment required for this entire buffer. Alignment starts
    /// at the ISA's minimum function alignment and can be increased due to
    /// constant requirements.
    fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
        let mut alignment = self.min_alignment;
        for (constant, offset) in mem::take(&mut self.used_constants) {
            let constant = constants.get(constant);
            let data = constant.as_slice();
            self.data[offset as usize..][..data.len()].copy_from_slice(data);
            alignment = constant.alignment().max(alignment);
        }
        alignment
    }

    /// Returns a label that can be used to refer to the `constant` provided.
    ///
    /// This will automatically defer a new constant to be emitted for
    /// `constant` if it has not been previously emitted. Note that this
    /// function may return a different label for the same constant at
    /// different points in time. The label is valid to use only from the
    /// current location; the MachBuffer takes care to emit the same constant
    /// multiple times if needed so the constant is always in range.
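    ///
    /// A rough usage sketch (illustrative only; `kind` stands in for the
    /// backend's PC-relative-load `I::LabelUse`, and the load emission itself
    /// is elided):
    ///
    /// ```ignore
    /// // Re-request the label at each point of use: after an island, the same
    /// // constant may have been assigned a fresh label.
    /// let label = buf.get_label_for_constant(constant);
    /// let off = buf.cur_offset();
    /// buf.use_label_at_offset(off, label, kind);
    /// // ... emit the PC-relative load that refers to `label` ...
    /// ```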
    pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
        let MachBufferConstant {
            align,
            size,
            upcoming_label,
        } = self.constants[constant];
        if let Some(label) = upcoming_label {
            return label;
        }

        let label = self.get_label();
        trace!(
            "defer constant: eventually emit {size} bytes aligned \
             to {align} at label {label:?}",
        );
        self.pending_constants.push(constant);
        self.pending_constants_size += size as u32;
        self.constants[constant].upcoming_label = Some(label);
        label
    }

    /// Bind a label to the current offset. A label can only be bound once.
    pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
        trace!(
            "MachBuffer: bind label {:?} at offset {}",
            label,
            self.cur_offset()
        );
        debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
        debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
        let offset = self.cur_offset();
        self.label_offsets[label.0 as usize] = offset;
        self.lazily_clear_labels_at_tail();
        self.labels_at_tail.push(label);

        // Invariants hold: bound offset of label is <= cur_offset (in fact it
        // is equal). If the `labels_at_tail` list was complete and precise
        // before, it is still, because we have bound this label to the current
        // offset and added it to the list (which contains all labels at the
        // current offset).

        self.optimize_branches(ctrl_plane);

        // Post-invariant: by `optimize_branches()` (see argument there).
    }

    /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
    /// offset that it applies to.
    fn lazily_clear_labels_at_tail(&mut self) {
        let offset = self.cur_offset();
        if offset > self.labels_at_tail_off {
            self.labels_at_tail_off = offset;
            self.labels_at_tail.clear();
        }

        // Post-invariant: either labels_at_tail_off was at cur_offset, and
        // state is untouched, or was less than cur_offset, in which case the
        // labels_at_tail list was conceptually empty, and is now actually
        // empty.
    }

    /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
    pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
        let mut iters = 0;
        while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
            label = self.label_aliases[label.0 as usize];
            // To protect against an infinite loop (despite our assurances to
            // ourselves that the invariants make this impossible), assert out
            // after 1M iterations. The number of basic blocks is limited
            // in most contexts anyway so this should be impossible to hit with
            // a legitimate input.
            iters += 1;
            assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
        }
        self.label_offsets[label.0 as usize]

        // Post-invariant: no mutations.
    }

    /// Emit a reference to the given label with the given reference type (i.e.,
    /// branch-instruction format) at the current offset. This is like a
    /// relocation, but handled internally.
    ///
    /// This can be called before the branch is actually emitted; fixups will
    /// not happen until an island is emitted or the buffer is finished.
    pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
        trace!(
            "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
            offset, label, kind
        );

        // Add the fixup, and update the worst-case island size based on a
        // veneer for this label use.
        let fixup = MachLabelFixup {
            label,
            offset,
            kind,
        };
        self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
        self.pending_fixup_records.push(fixup);

        // Post-invariant: no mutations to branches/labels data structures.
    }

    /// Inform the buffer of an unconditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    /// This must be called when the current offset is equal to `start`; i.e.,
    /// before actually emitting the branch. This implies that for a branch that
    /// uses a label and is eligible for optimizations by the MachBuffer, the
    /// proper sequence is:
    ///
    /// - Call `use_label_at_offset()` to emit the fixup record.
    /// - Call `add_uncond_branch()` to make note of the branch.
    /// - Emit the bytes for the branch's machine code.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
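    ///
    /// A rough sketch of that sequence (illustrative only; the 4-byte branch
    /// size, `branch_bytes`, and `kind` are backend-specific stand-ins):
    ///
    /// ```ignore
    /// let start = buf.cur_offset();
    /// buf.use_label_at_offset(start, target, kind); // fixup for the target field
    /// buf.add_uncond_branch(start, start + 4, target);
    /// buf.put_data(&branch_bytes);                  // the branch's machine code
    /// ```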
    pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        let fixup = self.pending_fixup_records.len() - 1;
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted: None,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; the list of
        // labels at branch is cloned from list of labels at current tail.
    }

    /// Inform the buffer of a conditional branch at the given offset,
    /// targeting the given label. May be used to optimize branches.
    /// The last added label-use must correspond to this branch.
    ///
    /// Additional requirement: no labels may be bound between `start` and `end`
    /// (exclusive on both ends).
    pub fn add_cond_branch(
        &mut self,
        start: CodeOffset,
        end: CodeOffset,
        target: MachLabel,
        inverted: &[u8],
    ) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction inserted within a patchable region"
        );
        assert!(self.cur_offset() == start);
        debug_assert!(end > start);
        assert!(!self.pending_fixup_records.is_empty());
        debug_assert!(
            inverted.len() == (end - start) as usize,
            "branch length = {}, but inverted length = {}",
            end - start,
            inverted.len()
        );
        let fixup = self.pending_fixup_records.len() - 1;
        let inverted = Some(SmallVec::from(inverted));
        self.lazily_clear_labels_at_tail();
        self.latest_branches.push(MachBranch {
            start,
            end,
            target,
            fixup,
            inverted,
            labels_at_this_branch: self.labels_at_tail.clone(),
        });

        // Post-invariant: we asserted branch start is current tail; labels at
        // branch list is cloned from list of labels at current tail.
    }

    fn truncate_last_branch(&mut self) {
        debug_assert!(
            !self.open_patchable,
            "Branch instruction truncated within a patchable region"
        );

        self.lazily_clear_labels_at_tail();
        // Invariants hold at this point.

        let b = self.latest_branches.pop().unwrap();
        assert!(b.end == self.cur_offset());

        // State:
        //   [PRE CODE]
        //   Offset b.start, b.labels_at_this_branch:
        //   [BRANCH CODE]
        //   cur_off, self.labels_at_tail -->
        //   (end of buffer)
        self.data.truncate(b.start as usize);
        self.pending_fixup_records.truncate(b.fixup);
        while let Some(last_srcloc) = self.srclocs.last_mut() {
            if last_srcloc.end <= b.start {
                break;
            }
            if last_srcloc.start < b.start {
                last_srcloc.end = b.start;
                break;
            }
            self.srclocs.pop();
        }
        // State:
        //   [PRE CODE]
        //   cur_off, Offset b.start, b.labels_at_this_branch:
        //   (end of buffer)
        //
        //   self.labels_at_tail --> (past end of buffer)
        let cur_off = self.cur_offset();
        self.labels_at_tail_off = cur_off;
        // State:
        //   [PRE CODE]
        //   cur_off, Offset b.start, b.labels_at_this_branch,
        //   self.labels_at_tail:
        //   (end of buffer)
        //
        //   resolve_label_offset(l) for l in labels_at_tail:
        //   (past end of buffer)

        trace!(
            "truncate_last_branch: truncated {:?}; off now {}",
            b, cur_off
        );

        // Fix up resolved label offsets for labels at tail.
        for &l in &self.labels_at_tail {
            self.label_offsets[l.0 as usize] = cur_off;
        }
        // Old labels_at_this_branch are now at cur_off.
        self.labels_at_tail.extend(b.labels_at_this_branch);

        // Post-invariant: this operation is defined to truncate the buffer,
        // which moves cur_off backward, and to move labels at the end of the
        // buffer back to the start-of-branch offset.
        //
        // latest_branches satisfies all invariants:
        // - it has no branches past the end of the buffer (branches are in
        // order, we removed the last one, and we truncated the buffer to just
        // before the start of that branch)
        // - no labels were moved to lower offsets than the (new) cur_off, so
        // the labels_at_this_branch list for any other branch need not change.
        //
        // labels_at_tail satisfies all invariants:
        // - all labels that were at the tail after the truncated branch are
        // moved backward to just before the branch, which becomes the new tail;
        // thus every element in the list should remain (ensured by `.extend()`
        // above).
        // - all labels that refer to the new tail, which is the start-offset of
        // the truncated branch, must be present. The `labels_at_this_branch`
        // list in the truncated branch's record is a complete and precise list
        // of exactly these labels; we append these to labels_at_tail.
        // - labels_at_tail_off is at cur_off after truncation occurs, so the
        // list is valid (not to be lazily cleared).
        //
        // The stated operation was performed:
        // - For each label at the end of the buffer prior to this method, it
        // now resolves to the new (truncated) end of the buffer: it must have
        // been in `labels_at_tail` (this list is precise and complete, and
        // the tail was at the end of the truncated branch on entry), and we
        // iterate over this list and set `label_offsets` to the new tail.
        // None of these labels could have been an alias (by invariant), so
        // `label_offsets` is authoritative for each.
        // - No other labels will be past the end of the buffer, because of the
        // requirement that no labels be bound to the middle of branch ranges
        // (see comments to `add_{cond,uncond}_branch()`).
        // - The buffer is truncated to just before the last branch, and the
        // fixup record referring to that last branch is removed.
    }

    /// Performs various optimizations on branches pointing at the current label.
    pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
        if ctrl_plane.get_decision() {
            return;
        }

        self.lazily_clear_labels_at_tail();
        // Invariants valid at this point.

        trace!(
            "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );

        // We continue to munch on branches at the tail of the buffer until no
        // more rules apply. Note that the loop only continues if a branch is
        // actually truncated (or if labels are redirected away from a branch),
        // so this always makes progress.
        while let Some(b) = self.latest_branches.last() {
            let cur_off = self.cur_offset();
            trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
            // If there has been any code emission since the end of the last branch or
            // label definition, then there's nothing we can edit (because we
            // don't move code once placed, only back up and overwrite), so
            // clear the records and finish.
            if b.end < cur_off {
                break;
            }

            // If the "labels at this branch" list on this branch is
            // longer than a threshold, don't do any simplification,
            // and let the branch remain to separate those labels from
            // the current tail. This avoids quadratic behavior (see
            // #3468): otherwise, if a long string of "goto next;
            // next:" patterns are emitted, all of the labels will
            // coalesce into a long list of aliases for the current
            // buffer tail. We must track all aliases of the current
            // tail for correctness, but we are also allowed to skip
            // optimization (removal) of any branch, so we take the
            // escape hatch here and let it stand. In effect this
            // "spreads" the many thousands of labels in the
            // pathological case among an actual (harmless but
            // suboptimal) instruction once per N labels.
            if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
                break;
            }

            // Invariant: we are looking at a branch that ends at the tail of
            // the buffer.

            // For any branch, conditional or unconditional:
            // - If the target is a label at the current offset, then remove
            // the conditional branch, and reset all labels that targeted
            // the current offset (end of branch) to the truncated
            // end-of-code.
            //
            // Preserves execution semantics: a branch to its own fallthrough
            // address is equivalent to a no-op; in both cases, nextPC is the
            // fallthrough.
            if self.resolve_label_offset(b.target) == cur_off {
                trace!("branch with target == cur off; truncating");
                self.truncate_last_branch();
                continue;
            }

            // If latest is an unconditional branch:
            //
            // - If the branch's target is not its own start address, then for
            // each label at the start of branch, make the label an alias of the
            // branch target, and remove the label from the "labels at this
            // branch" list.
            //
            // - Preserves execution semantics: an unconditional branch's
            // only effect is to set PC to a new PC; this change simply
            // collapses one step in the step-semantics.
            //
            // - Post-invariant: the labels that were bound to the start of
            // this branch become aliases, so they must not be present in any
            // labels-at-this-branch list or the labels-at-tail list. The
            // labels are removed from the latest-branch record's
            // labels-at-this-branch list, and are never placed in the
            // labels-at-tail list. Furthermore, it is correct that they are
            // not in either list, because they are now aliases, and labels
            // that are aliases remain aliases forever.
            //
            // - If there is a prior unconditional branch that ends just before
            // this one begins, and this branch has no labels bound to its
            // start, then we can truncate this branch, because it is entirely
            // unreachable (we have redirected all labels that make it
            // reachable otherwise). Do so and continue around the loop.
            //
            // - Preserves execution semantics: the branch is unreachable,
            // because execution can only flow into an instruction from the
            // prior instruction's fallthrough or from a branch bound to that
            // instruction's start offset. Unconditional branches have no
            // fallthrough, so if the prior instruction is an unconditional
            // branch, no fallthrough entry can happen. The
            // labels-at-this-branch list is complete (by invariant), so if it
            // is empty, then the instruction is entirely unreachable. Thus,
            // it can be removed.
            //
            // - Post-invariant: ensured by truncate_last_branch().
            //
            // - If there is a prior conditional branch whose target label
            // resolves to the current offset (branches around the
            // unconditional branch), then remove the unconditional branch,
            // and make the target of the unconditional the target of the
            // conditional instead.
            //
            // - Preserves execution semantics: previously we had:
            //
            //   L1:
            //     cond_br L2
            //     br L3
            //   L2:
            //     (end of buffer)
            //
            // by removing the last branch, we have:
            //
            //   L1:
            //     cond_br L2
            //   L2:
            //     (end of buffer)
            //
            // we then fix up the records for the conditional branch to
            // have:
            //
            //   L1:
            //     cond_br.inverted L3
            //   L2:
            //
            // In the original code, control flow reaches L2 when the
            // conditional branch's predicate is true, and L3 otherwise. In
            // the optimized code, the same is true.
            //
            // - Post-invariant: all edits to latest_branches and
            // labels_at_tail are performed by `truncate_last_branch()`,
            // which maintains the invariants at each step.

            if b.is_uncond() {
                // Set any label equal to current branch's start as an alias of
                // the branch's target, if the target is not the branch itself
                // (i.e., an infinite loop).
                //
                // We cannot perform this aliasing if the target of this branch
                // ultimately aliases back here; if so, we need to keep this
                // branch, so break out of this loop entirely (and clear the
                // latest-branches list below).
                //
                // Note that this check is what prevents cycles from forming in
                // `self.label_aliases`. To see why, consider an arbitrary start
                // state:
                //
                // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
                // Ln, which is not aliased.
                //
                // We would create a cycle if we assigned label_aliases[Ln]
                // = L1. Note that the below assignment is the only write
                // to label_aliases.
                //
                // By our other invariants, we have that Ln (`l` below)
                // resolves to the offset `b.start`, because it is in the
                // set `b.labels_at_this_branch`.
                //
                // If L1 were already aliased, through some arbitrarily deep
                // chain, to Ln, then it must also resolve to this offset
                // `b.start`.
                //
                // By checking the resolution of `L1` against this offset,
                // and aborting this branch-simplification if they are
                // equal, we prevent the below assignment from ever creating
                // a cycle.
                if self.resolve_label_offset(b.target) != b.start {
                    let redirected = b.labels_at_this_branch.len();
                    for &l in &b.labels_at_this_branch {
                        trace!(
                            " -> label at start of branch {:?} redirected to target {:?}",
                            l, b.target
                        );
                        self.label_aliases[l.0 as usize] = b.target;
                        // NOTE: we continue to ensure the invariant that labels
                        // pointing to tail of buffer are in `labels_at_tail`
                        // because we already ensured above that the last branch
                        // cannot have a target of `cur_off`; so we never have
                        // to put the label into `labels_at_tail` when moving it
                        // here.
                    }
                    // Maintain invariant: all branches have been redirected
                    // and are no longer pointing at the start of this branch.
                    let mut_b = self.latest_branches.last_mut().unwrap();
                    mut_b.labels_at_this_branch.clear();

                    if redirected > 0 {
                        trace!(" -> after label redirects, restarting loop");
                        continue;
                    }
                } else {
                    break;
                }

                let b = self.latest_branches.last().unwrap();

                // Examine any immediately preceding branch.
                if self.latest_branches.len() > 1 {
                    let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
                    trace!(" -> more than one branch; prev_b = {:?}", prev_b);
                    // This uncond is immediately after another uncond; we
                    // should have already redirected labels to this uncond away
                    // (but check to be sure); so we can truncate this uncond.
                    if prev_b.is_uncond()
                        && prev_b.end == b.start
                        && b.labels_at_this_branch.is_empty()
                    {
                        trace!(" -> uncond follows another uncond; truncating");
                        self.truncate_last_branch();
                        continue;
                    }

                    // This uncond is immediately after a conditional, and the
                    // conditional's target is the end of this uncond, and we've
                    // already redirected labels to this uncond away; so we can
                    // truncate this uncond, flip the sense of the conditional, and
                    // set the conditional's target (in `latest_branches` and in
                    // `fixup_records`) to the uncond's target.
                    if prev_b.is_cond()
                        && prev_b.end == b.start
                        && self.resolve_label_offset(prev_b.target) == cur_off
                    {
                        trace!(
                            " -> uncond follows a conditional, and conditional's target resolves to current offset"
                        );
                        // Save the target of the uncond (this becomes the
                        // target of the cond), and truncate the uncond.
                        let target = b.target;
                        let data = prev_b.inverted.clone().unwrap();
                        self.truncate_last_branch();

                        // Mutate the code and cond branch.
                        let off_before_edit = self.cur_offset();
                        let prev_b = self.latest_branches.last_mut().unwrap();
                        let not_inverted = SmallVec::from(
                            &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
                        );

                        // Low-level edit: replaces bytes of branch with
                        // inverted form. cur_off remains the same afterward, so
                        // we do not need to modify label data structures.
                        self.data.truncate(prev_b.start as usize);
                        self.data.extend_from_slice(&data[..]);

                        // Save the original code as the inversion of the
                        // inverted branch, in case we later edit this branch
                        // again.
                        prev_b.inverted = Some(not_inverted);
                        self.pending_fixup_records[prev_b.fixup].label = target;
                        trace!(" -> reassigning target of condbr to {:?}", target);
                        prev_b.target = target;
                        debug_assert_eq!(off_before_edit, self.cur_offset());
                        continue;
                    }
                }
            }

            // If we couldn't do anything with the last branch, then break.
            break;
        }

        self.purge_latest_branches();

        trace!(
            "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
            self.latest_branches, self.labels_at_tail, self.pending_fixup_records
        );
    }

    fn purge_latest_branches(&mut self) {
        // All of our branch simplification rules work only if a branch ends at
        // the tail of the buffer, with no following code; and branches are in
        // order in latest_branches; so if the last entry ends prior to
        // cur_offset, then clear all entries.
        let cur_off = self.cur_offset();
        if let Some(l) = self.latest_branches.last() {
            if l.end < cur_off {
                trace!("purge_latest_branches: removing branch {:?}", l);
                self.latest_branches.clear();
            }
        }

        // Post-invariant: no invariant requires any branch to appear in
        // `latest_branches`; it is always optional. The list-clear above thus
        // preserves all semantics.
    }

    /// Emit a trap at some point in the future with the specified code and
    /// stack map.
    ///
    /// This function returns a [`MachLabel`] which will be the future address
    /// of the trap. Jumps should refer to this label, likely by using the
    /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
    /// patched in once the address of the trap is known.
    ///
    /// This will batch all traps into the end of the function.
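    ///
    /// A rough usage sketch (illustrative only; `code` is some `TrapCode` and
    /// `kind` stands in for the backend's conditional-branch `I::LabelUse`):
    ///
    /// ```ignore
    /// let trap = buf.defer_trap(code);
    /// let off = buf.cur_offset();
    /// buf.use_label_at_offset(off, trap, kind);
    /// // ... emit a conditional branch to `trap`, taken on the trapping condition ...
    /// ```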
    pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
        let label = self.get_label();
        self.pending_traps.push(MachLabelTrap {
            label,
            code,
            loc: self.cur_srcloc.map(|(_start, loc)| loc),
        });
        label
    }

    /// Is an island needed within the next N bytes?
    pub fn island_needed(&self, distance: CodeOffset) -> bool {
        let deadline = match self.fixup_records.peek() {
            Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
            None => self.pending_fixup_deadline,
        };
        deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
    }

    /// Returns the maximal offset that islands can reach if `distance` more
    /// bytes are appended.
    ///
    /// This is used to determine if veneers need insertions since jumps that
    /// can't reach past this point must get a veneer of some form.
    fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
        // Assume that all fixups will require veneers and that the veneers are
        // the worst-case size for each platform. This is an over-generalization
        // to avoid iterating over the `fixup_records` list or maintaining
        // information about it as we go along.
        let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
            as u32)
            * (I::LabelUse::worst_case_veneer_size())
            + self.pending_constants_size
            + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
        self.cur_offset()
            .saturating_add(distance)
            .saturating_add(island_worst_case_size)
    }

    /// Emit all pending constants and required pending veneers.
    ///
    /// Should only be called if `island_needed()` returns true, i.e., if we
    /// actually reach a deadline. It's not necessarily a problem to do so
    /// otherwise but it may result in unnecessary work during emission.
pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1265
self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1266
}
1267
1268
/// Same as `emit_island`, but an internal API with a `force_veneers`
1269
/// argument to force all veneers to always get emitted for debugging.
1270
fn emit_island_maybe_forced(
1271
&mut self,
1272
force_veneers: ForceVeneers,
1273
distance: CodeOffset,
1274
ctrl_plane: &mut ControlPlane,
1275
) {
1276
// We're going to purge fixups, so no latest-branch editing can happen
1277
// anymore.
1278
self.latest_branches.clear();
1279
1280
// End the current location tracking since anything emitted during this
1281
// function shouldn't be attributed to whatever the current source
1282
// location is.
1283
//
1284
// Note that the current source location, if it's set right now, will be
1285
// restored at the end of this island emission.
1286
let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1287
if cur_loc.is_some() {
1288
self.end_srcloc();
1289
}
1290
1291
let forced_threshold = self.worst_case_end_of_island(distance);
1292
1293
// First flush out all traps/constants so we have more labels in case
1294
// fixups are applied against these labels.
1295
//
1296
// Note that traps are placed first since this typically happens at the
1297
// end of the function and for disassemblers we try to keep all the code
1298
// contiguously together.
1299
for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1300
// If this trap has source information associated with it then
1301
// emit this information for the trap instruction going out now too.
1302
if let Some(loc) = loc {
1303
self.start_srcloc(loc);
1304
}
1305
self.align_to(I::LabelUse::ALIGN);
1306
self.bind_label(label, ctrl_plane);
1307
self.add_trap(code);
1308
self.put_data(I::TRAP_OPCODE);
1309
if loc.is_some() {
1310
self.end_srcloc();
1311
}
1312
}
1313
1314
for constant in mem::take(&mut self.pending_constants) {
1315
let MachBufferConstant { align, size, .. } = self.constants[constant];
1316
let label = self.constants[constant].upcoming_label.take().unwrap();
1317
self.align_to(align);
1318
self.bind_label(label, ctrl_plane);
1319
self.used_constants.push((constant, self.cur_offset()));
1320
self.get_appended_space(size);
1321
}
1322
1323
// Either handle all pending fixups because they're ready or move them
1324
// onto the `BinaryHeap` tracking all pending fixups if they aren't
1325
// ready.
1326
assert!(self.latest_branches.is_empty());
1327
for fixup in mem::take(&mut self.pending_fixup_records) {
1328
if self.should_apply_fixup(&fixup, forced_threshold) {
1329
self.handle_fixup(fixup, force_veneers, forced_threshold);
1330
} else {
1331
self.fixup_records.push(fixup);
1332
}
1333
}
1334
self.pending_fixup_deadline = u32::MAX;
1335
while let Some(fixup) = self.fixup_records.peek() {
1336
trace!("emit_island: fixup {:?}", fixup);
1337
1338
// If this fixup shouldn't be applied, that means its label isn't
1339
// defined yet and there'll be remaining space to apply a veneer if
1340
// necessary in the future after this island. In that situation
1341
// because `fixup_records` is sorted by deadline this loop can
1342
// exit.
1343
if !self.should_apply_fixup(fixup, forced_threshold) {
1344
break;
1345
}
1346
1347
let fixup = self.fixup_records.pop().unwrap();
1348
self.handle_fixup(fixup, force_veneers, forced_threshold);
1349
}
1350
1351
if let Some(loc) = cur_loc {
1352
self.start_srcloc(loc);
1353
}
1354
}
1355
1356
fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1357
let label_offset = self.resolve_label_offset(fixup.label);
1358
label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1359
}
1360
1361
fn handle_fixup(
1362
&mut self,
1363
fixup: MachLabelFixup<I>,
1364
force_veneers: ForceVeneers,
1365
forced_threshold: CodeOffset,
1366
) {
1367
let MachLabelFixup {
1368
label,
1369
offset,
1370
kind,
1371
} = fixup;
1372
let start = offset as usize;
1373
let end = (offset + kind.patch_size()) as usize;
1374
let label_offset = self.resolve_label_offset(label);
1375
1376
if label_offset != UNKNOWN_LABEL_OFFSET {
1377
// If the offset of the label for this fixup is known then
1378
// we're going to do something here-and-now. We're either going
1379
// to patch the original offset because it's an in-bounds jump,
1380
// or we're going to generate a veneer, patch the fixup to jump
1381
// to the veneer, and then keep going.
1382
//
1383
// If the label comes after the original fixup, then we should
1384
// be guaranteed that the jump is in-bounds. Otherwise there's
1385
// a bug somewhere because this method wasn't called soon
1386
// enough. All forward-jumps are tracked and should get veneers
1387
// before their deadline comes and they're unable to jump
1388
// further.
1389
//
1390
// Otherwise if the label is before the fixup, then that's a
1391
// backwards jump. If it's past the maximum negative range
1392
            // then we'll emit a veneer to jump forward to, which can then
            // jump backwards.
            let veneer_required = if label_offset >= offset {
                assert!((label_offset - offset) <= kind.max_pos_range());
                false
            } else {
                (offset - label_offset) > kind.max_neg_range()
            };
            trace!(
                " -> label_offset = {}, known, required = {} (pos {} neg {})",
                label_offset,
                veneer_required,
                kind.max_pos_range(),
                kind.max_neg_range()
            );

            if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
                self.emit_veneer(label, offset, kind);
            } else {
                let slice = &mut self.data[start..end];
                trace!(
                    "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
                );
                kind.patch(slice, offset, label_offset);
            }
        } else {
            // If the offset of this label is not known at this time then
            // that means that a veneer is required because after this
            // island the target can't be in range of the original target.
            assert!(forced_threshold - offset > kind.max_pos_range());
            self.emit_veneer(label, offset, kind);
        }
    }

    /// Emits a "veneer" for the `kind` code at `offset` to jump to `label`.
    ///
    /// This will generate extra machine code, using `kind`, to get a
    /// larger-jump-kind than `kind` allows. The code at `offset` is then
    /// patched to jump to our new code, and then the new code is enqueued for
    /// a fixup to get processed at some later time.
    fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
        // If this `kind` doesn't support a veneer then that's a bug in the
        // backend because we need to implement support for such a veneer.
        assert!(
            kind.supports_veneer(),
            "jump beyond the range of {kind:?} but a veneer isn't supported",
        );

        // Allocate space for a veneer in the island.
        self.align_to(I::LabelUse::ALIGN);
        let veneer_offset = self.cur_offset();
        trace!("making a veneer at {}", veneer_offset);
        let start = offset as usize;
        let end = (offset + kind.patch_size()) as usize;
        let slice = &mut self.data[start..end];
        // Patch the original label use to refer to the veneer.
        trace!(
            "patching original at offset {} to veneer offset {}",
            offset, veneer_offset
        );
        kind.patch(slice, offset, veneer_offset);
        // Generate the veneer.
        let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
        let (veneer_fixup_off, veneer_label_use) =
            kind.generate_veneer(veneer_slice, veneer_offset);
        trace!(
            "generated veneer; fixup offset {}, label_use {:?}",
            veneer_fixup_off, veneer_label_use
        );
        // Register a new use of `label` with our new veneer fixup and
        // offset. This'll recalculate deadlines accordingly and
        // enqueue this fixup to get processed at some later
        // time.
        self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
    }

    fn finish_emission_maybe_forcing_veneers(
        &mut self,
        force_veneers: ForceVeneers,
        ctrl_plane: &mut ControlPlane,
    ) {
        while !self.pending_constants.is_empty()
            || !self.pending_traps.is_empty()
            || !self.fixup_records.is_empty()
            || !self.pending_fixup_records.is_empty()
        {
            // `emit_island()` will emit any pending veneers and constants, and
            // as a side-effect, will also take care of any fixups with resolved
            // labels eagerly.
            self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
        }

        // Ensure that all labels have been fixed up after the last island is emitted. This is a
        // full (release-mode) assert because an unresolved label means the emitted code is
        // incorrect.
        assert!(self.fixup_records.is_empty());
        assert!(self.pending_fixup_records.is_empty());
    }

    /// Finish any deferred emissions and/or fixups.
    pub fn finish(
        mut self,
        constants: &VCodeConstants,
        ctrl_plane: &mut ControlPlane,
    ) -> MachBufferFinalized<Stencil> {
        let _tt = timing::vcode_emit_finish();

        self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);

        let alignment = self.finish_constants(constants);

        // Resolve all labels to their offsets.
        let finalized_relocs = self
            .relocs
            .iter()
            .map(|reloc| FinalizedMachReloc {
                offset: reloc.offset,
                kind: reloc.kind,
                addend: reloc.addend,
                target: match &reloc.target {
                    RelocTarget::ExternalName(name) => {
                        FinalizedRelocTarget::ExternalName(name.clone())
                    }
                    RelocTarget::Label(label) => {
                        FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
                    }
                },
            })
            .collect();

        let finalized_exception_handlers = self
            .exception_handlers
            .iter()
            .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
            .collect();

        let mut srclocs = self.srclocs;
        srclocs.sort_by_key(|entry| entry.start);

        MachBufferFinalized {
            data: self.data,
            relocs: finalized_relocs,
            traps: self.traps,
            call_sites: self.call_sites,
            exception_handlers: finalized_exception_handlers,
            srclocs,
            user_stack_maps: self.user_stack_maps,
            unwind_info: self.unwind_info,
            alignment,
        }
    }

    /// Add an external relocation at the given offset.
    pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
        &mut self,
        offset: CodeOffset,
        kind: Reloc,
        target: &T,
        addend: Addend,
    ) {
        let target: RelocTarget = target.clone().into();
        // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
        // generate a label-use statement to track whether an island is possibly
        // needed to escape this function to actually get to the external name.
        // This is most likely to come up on AArch64 where calls between
        // functions use a 26-bit signed offset which gives +/- 64MB. This means
        // that if a function is 128MB in size and there's a call in the middle
        // it's impossible to reach the actual target. Also, while it's
        // technically possible to jump to the start of a function and then jump
        // further, island insertion below always inserts islands after
        // previously appended code so for Cranelift's own implementation this
        // is also a problem for 64MB functions on AArch64 which start with a
        // call instruction, those won't be able to escape.
        //
        // Ideally what needs to happen here is that a `LabelUse` is
        // transparently generated (or call-sites of this function are audited
        // to generate a `LabelUse` instead) and tracked internally. The actual
        // relocation would then change over time if and when a veneer is
        // inserted, where the relocation here would be patched by this
        // `MachBuffer` to jump to the veneer. The problem, though, is that all
        // this still needs to end up, in the case of a singular function,
        // generating a final relocation pointing either to this particular
        // relocation or to the veneer inserted. Additionally
        // `MachBuffer` needs the concept of a label which will never be
        // resolved, so `emit_island` doesn't trip over not actually ever
        // knowing what some labels are. Currently the loop in
        // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
        // loop.
        //
        // For now this means that, because relocs aren't tracked at all,
        // AArch64 functions have a rough size limit of 64MB. For now that's
1583
// somewhat reasonable and the failure mode is a panic in `MachBuffer`
1584
// when a relocation can't otherwise be resolved later, so it shouldn't
1585
// actually result in any memory unsafety or anything like that.
1586
self.relocs.push(MachReloc {
1587
offset,
1588
kind,
1589
target,
1590
addend,
1591
});
1592
}
1593
1594
/// Add an external relocation at the current offset.
1595
pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1596
&mut self,
1597
kind: Reloc,
1598
target: &T,
1599
addend: Addend,
1600
) {
1601
self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1602
}
1603
1604
/// Add a trap record at the current offset.
1605
pub fn add_trap(&mut self, code: TrapCode) {
1606
self.traps.push(MachTrap {
1607
offset: self.data.len() as CodeOffset,
1608
code,
1609
});
1610
}
1611
1612
/// Add a call-site record at the current offset.
1613
pub fn add_call_site(&mut self) {
1614
self.add_try_call_site(None, core::iter::empty());
1615
}
1616
1617
/// Add a call-site record at the current offset with exception
1618
/// handlers.
1619
pub fn add_try_call_site(
1620
&mut self,
1621
frame_offset: Option<u32>,
1622
exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1623
) {
1624
let start = u32::try_from(self.exception_handlers.len()).unwrap();
1625
self.exception_handlers.extend(exception_handlers);
1626
let end = u32::try_from(self.exception_handlers.len()).unwrap();
1627
let exception_handler_range = start..end;
1628
1629
self.call_sites.push(MachCallSite {
1630
ret_addr: self.data.len() as CodeOffset,
1631
frame_offset,
1632
exception_handler_range,
1633
});
1634
}
1635
1636
/// Add an unwind record at the current offset.
1637
pub fn add_unwind(&mut self, unwind: UnwindInst) {
1638
self.unwind_info.push((self.cur_offset(), unwind));
1639
}
1640
1641
/// Set the `SourceLoc` for code from this offset until the offset at the
1642
/// next call to `end_srcloc()`.
1643
/// Returns the current [CodeOffset] and [RelSourceLoc].
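    ///
    /// A minimal sketch of the intended pairing (the emission in the middle
    /// is illustrative):
    ///
    /// ```ignore
    /// buffer.start_srcloc(rel_srcloc);
    /// // ... emit the instruction(s) attributed to `rel_srcloc` ...
    /// buffer.end_srcloc();
    /// ```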
    pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
        let cur = (self.cur_offset(), loc);
        self.cur_srcloc = Some(cur);
        cur
    }

    /// Mark the end of the `SourceLoc` segment started at the last
    /// `start_srcloc()` call.
    pub fn end_srcloc(&mut self) {
        let (start, loc) = self
            .cur_srcloc
            .take()
            .expect("end_srcloc() called without start_srcloc()");
        let end = self.cur_offset();
        // Skip zero-length extends.
        debug_assert!(end >= start);
        if end > start {
            self.srclocs.push(MachSrcLoc { start, end, loc });
        }
    }

    /// Push a user stack map onto this buffer.
    ///
    /// The stack map is associated with the given `return_addr` code
    /// offset. This must be the PC for the instruction just *after* this stack
    /// map's associated instruction. For example in the sequence `call $foo;
    /// add r8, rax`, the `return_addr` must be the offset of the start of the
    /// `add` instruction.
    ///
    /// Stack maps must be pushed in sorted `return_addr` order.
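    ///
    /// A rough sketch of recording a stack map right after a call has been
    /// emitted (the `emit_state` and `stack_map` values are illustrative):
    ///
    /// ```ignore
    /// // The current offset is the PC of the instruction following the call,
    /// // i.e. the return address.
    /// let return_addr = buffer.cur_offset();
    /// buffer.push_user_stack_map(&emit_state, return_addr, stack_map);
    /// ```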
    pub fn push_user_stack_map(
        &mut self,
        emit_state: &I::State,
        return_addr: CodeOffset,
        mut stack_map: ir::UserStackMap,
    ) {
        let span = emit_state.frame_layout().active_size();
        trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");

        debug_assert!(
            self.user_stack_maps
                .last()
                .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
            "pushed stack maps out of order: {} is not less than {}",
            self.user_stack_maps.last().unwrap().0,
            return_addr,
        );

        stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
        self.user_stack_maps.push((return_addr, span, stack_map));
    }

    /// Increase the alignment of the buffer to the given alignment if bigger
    /// than the current alignment.
    pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
        self.min_alignment = self.min_alignment.max(
            1u32.checked_shl(u32::from(align_to))
                .expect("log2_min_function_alignment too large"),
        );
    }
}

impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
    fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
        for b in iter {
            self.put1(b);
        }
    }
}

impl<T: CompilePhase> MachBufferFinalized<T> {
    /// Get a list of source location mapping tuples in sorted-by-start-offset order.
    pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
        &self.srclocs[..]
    }

    /// Get the total required size for the code.
    pub fn total_size(&self) -> CodeOffset {
        self.data.len() as CodeOffset
    }

    /// Return the code in this mach buffer as a hex string for testing purposes.
    pub fn stringify_code_bytes(&self) -> String {
        // This is pretty lame, but whatever ..
        use std::fmt::Write;
        let mut s = String::with_capacity(self.data.len() * 2);
        for b in &self.data {
            write!(&mut s, "{b:02X}").unwrap();
        }
        s
    }

    /// Get the code bytes.
    pub fn data(&self) -> &[u8] {
        // N.B.: we emit every section into the .text section as far as
        // the `CodeSink` is concerned; we do not bother to segregate
        // the contents into the actual program text, the jumptable and the
        // rodata (constant pool). This allows us to generate code assuming
        // that these will not be relocated relative to each other, and avoids
        // having to designate each section as belonging in one of the three
        // fixed categories defined by `CodeSink`. If this becomes a problem
        // later (e.g. because of memory permissions or similar), we can
        // add this designation and segregate the output; take care, however,
        // to add the appropriate relocations in this case.

        &self.data[..]
    }

    /// Get the list of external relocations for this code.
    pub fn relocs(&self) -> &[FinalizedMachReloc] {
        &self.relocs[..]
    }

    /// Get the list of trap records for this code.
    pub fn traps(&self) -> &[MachTrap] {
        &self.traps[..]
    }

    /// Get the user stack map metadata for this code.
    pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
        &self.user_stack_maps
    }

    /// Take this buffer's user stack map metadata.
    pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
        mem::take(&mut self.user_stack_maps)
    }

    /// Get the list of call sites for this code, along with
    /// associated exception handlers.
    ///
    /// Each item yielded by the returned iterator is a struct with:
    ///
    /// - The call site metadata record, with a `ret_addr` field
    ///   directly accessible and denoting the offset of the return
    ///   address into this buffer's code.
    /// - The slice of pairs of exception tags and code offsets
    ///   denoting exception-handler entry points associated with this
    ///   call site.
    pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
        self.call_sites.iter().map(|call_site| {
            let handler_range = call_site.exception_handler_range.clone();
            let handler_range = usize::try_from(handler_range.start).unwrap()
                ..usize::try_from(handler_range.end).unwrap();
            FinalizedMachCallSite {
                ret_addr: call_site.ret_addr,
                frame_offset: call_site.frame_offset,
                exception_handlers: &self.exception_handlers[handler_range],
            }
        })
    }
}

/// An item in the exception-handler list for a callsite, with label
/// references. Items are interpreted in left-to-right order and the
/// first match wins.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MachExceptionHandler {
    /// A specific tag (in the current dynamic context) should be
    /// handled by the code at the given offset.
    Tag(ExceptionTag, MachLabel),
    /// All exceptions should be handled by the code at the given
    /// offset.
    Default(MachLabel),
    /// The dynamic context for interpreting tags is updated to the
    /// value stored in the given machine location (in this frame's
    /// context).
    Context(ExceptionContextLoc),
}

impl MachExceptionHandler {
    fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
        match self {
            Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
            Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
            Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
        }
    }
}

/// An item in the exception-handler list for a callsite, with final
/// (lowered) code offsets. Items are interpreted in left-to-right
/// order and the first match wins.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum FinalizedMachExceptionHandler {
    /// A specific tag (in the current dynamic context) should be
    /// handled by the code at the given offset.
    Tag(ExceptionTag, CodeOffset),
    /// All exceptions should be handled by the code at the given
    /// offset.
    Default(CodeOffset),
    /// The dynamic context for interpreting tags is updated to the
    /// value stored in the given machine location (in this frame's
    /// context).
    Context(ExceptionContextLoc),
}

/// A location for a dynamic exception context value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum ExceptionContextLoc {
    /// An offset from SP at the callsite.
    SPOffset(u32),
    /// A GPR at the callsite. The physical register number for the
    /// GPR register file on the target architecture is used.
    GPR(u8),
}

/// Metadata about a constant.
struct MachBufferConstant {
    /// A label which has not yet been bound which can be used for this
    /// constant.
    ///
    /// This is lazily created when a label is requested for a constant and is
    /// cleared when a constant is emitted.
    upcoming_label: Option<MachLabel>,
    /// Required alignment.
    align: CodeOffset,
    /// The byte size of this constant.
    size: usize,
}

/// A trap that is deferred to the next time an island is emitted for either
/// traps, constants, or fixups.
struct MachLabelTrap {
    /// This label will refer to the trap's offset.
    label: MachLabel,
    /// The code associated with this trap.
    code: TrapCode,
    /// An optional source location to assign for this trap.
    loc: Option<RelSourceLoc>,
}

/// A fixup to perform on the buffer once code is emitted. Fixups always refer
/// to labels and patch the code based on label offsets. Hence, they are like
/// relocations, but internal to one buffer.
#[derive(Debug)]
struct MachLabelFixup<I: VCodeInst> {
    /// The label whose offset controls this fixup.
    label: MachLabel,
    /// The offset to fix up / patch to refer to this label.
    offset: CodeOffset,
    /// The kind of fixup. This is architecture-specific; each architecture may have,
    /// e.g., several types of branch instructions, each with differently-sized
    /// offset fields and different places within the instruction to place the
    /// bits.
    kind: I::LabelUse,
}

impl<I: VCodeInst> MachLabelFixup<I> {
    fn deadline(&self) -> CodeOffset {
        self.offset.saturating_add(self.kind.max_pos_range())
    }
}

impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
    fn eq(&self, other: &Self) -> bool {
        self.deadline() == other.deadline()
    }
}

impl<I: VCodeInst> Eq for MachLabelFixup<I> {}

impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
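
// `fixup_records` is kept in a `BinaryHeap`, which is a max-heap, so the
// ordering below is deliberately reversed: comparing `other` against `self`
// makes the heap yield the fixup with the *smallest* deadline first, which is
// what the island-emission code relies on.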
impl<I: VCodeInst> Ord for MachLabelFixup<I> {
    fn cmp(&self, other: &Self) -> Ordering {
        other.deadline().cmp(&self.deadline())
    }
}

/// A relocation resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachRelocBase<T> {
    /// The offset at which the relocation applies, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The kind of relocation.
    pub kind: Reloc,
    /// The external symbol / name to which this relocation refers.
    pub target: T,
    /// The addend to add to the symbol value.
    pub addend: i64,
}

type MachReloc = MachRelocBase<RelocTarget>;

/// A relocation resulting from a compilation.
pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;

/// A relocation target.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum RelocTarget {
    /// Points to an [ExternalName] outside the current function.
    ExternalName(ExternalName),
    /// Points to a [MachLabel] inside this function.
    /// This is different from [MachLabelFixup] in that both the relocation and the
    /// label will be emitted and are only resolved at link time.
    ///
    /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
    Label(MachLabel),
}

impl From<ExternalName> for RelocTarget {
    fn from(name: ExternalName) -> Self {
        Self::ExternalName(name)
    }
}

impl From<MachLabel> for RelocTarget {
    fn from(label: MachLabel) -> Self {
        Self::Label(label)
    }
}

/// A relocation target.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub enum FinalizedRelocTarget {
    /// Points to an [ExternalName] outside the current function.
    ExternalName(ExternalName),
    /// Points to a [CodeOffset] from the start of the current function.
    Func(CodeOffset),
}

impl FinalizedRelocTarget {
    /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
    /// output.
    pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
        match self {
            FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
            FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
        }
    }
}

/// A trap record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachTrap {
    /// The offset at which the trap instruction occurs, *relative to the
    /// containing section*.
    pub offset: CodeOffset,
    /// The trap code.
    pub code: TrapCode,
}

/// A call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachCallSite {
    /// The offset of the call's return address, *relative to the
    /// start of the buffer*.
    pub ret_addr: CodeOffset,

    /// The offset from the FP at this callsite down to the SP when
    /// the call occurs, if known. In other words, the size of the
    /// stack frame up to the saved FP slot. Useful to recover the
    /// start of the stack frame and to look up dynamic contexts
    /// stored in [`ExceptionContextLoc::SPOffset`].
    ///
    /// If `None`, the compiler backend did not specify a frame
    /// offset. The runtime in use with the compiled code may require
    /// the frame offset if exception handlers are present or dynamic
    /// context is used, but that is not Cranelift's concern: the
    /// frame offset is optional at this level.
    pub frame_offset: Option<u32>,

    /// Range in `exception_handlers` corresponding to the exception
    /// handlers for this callsite.
    exception_handler_range: Range<u32>,
}

/// A call site record resulting from a compilation.
#[derive(Clone, Debug, PartialEq)]
pub struct FinalizedMachCallSite<'a> {
    /// The offset of the call's return address, *relative to the
    /// start of the buffer*.
    pub ret_addr: CodeOffset,

    /// The offset from the FP at this callsite down to the SP when
    /// the call occurs, if known. In other words, the size of the
    /// stack frame up to the saved FP slot. Useful to recover the
    /// start of the stack frame and to look up dynamic contexts
    /// stored in [`ExceptionContextLoc::SPOffset`].
    ///
    /// If `None`, the compiler backend did not specify a frame
    /// offset. The runtime in use with the compiled code may require
    /// the frame offset if exception handlers are present or dynamic
    /// context is used, but that is not Cranelift's concern: the
    /// frame offset is optional at this level.
    pub frame_offset: Option<u32>,

    /// Exception handlers at this callsite, with target offsets
    /// *relative to the start of the buffer*.
    pub exception_handlers: &'a [FinalizedMachExceptionHandler],
}

/// A source-location mapping resulting from a compilation.
#[derive(PartialEq, Debug, Clone)]
#[cfg_attr(
    feature = "enable-serde",
    derive(serde_derive::Serialize, serde_derive::Deserialize)
)]
pub struct MachSrcLoc<T: CompilePhase> {
    /// The start of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub start: CodeOffset,
    /// The end of the region of code corresponding to a source location.
    /// This is relative to the start of the function, not to the start of the
    /// section.
    pub end: CodeOffset,
    /// The source location.
    pub loc: T::SourceLocType,
}

impl MachSrcLoc<Stencil> {
    fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
        MachSrcLoc {
            start: self.start,
            end: self.end,
            loc: self.loc.expand(base_srcloc),
        }
    }
}

/// Record of a branch instruction in the buffer, to facilitate editing.
#[derive(Clone, Debug)]
struct MachBranch {
    start: CodeOffset,
    end: CodeOffset,
    target: MachLabel,
    fixup: usize,
    inverted: Option<SmallVec<[u8; 8]>>,
    /// All labels pointing to the start of this branch. For correctness, this
    /// *must* be complete (i.e., must contain all labels whose resolved offsets
    /// are at the start of this branch): we rely on being able to redirect all
    /// labels that could jump to this branch before removing it, if it is
    /// otherwise unreachable.
    labels_at_this_branch: SmallVec<[MachLabel; 4]>,
}

impl MachBranch {
    fn is_cond(&self) -> bool {
        self.inverted.is_some()
    }
    fn is_uncond(&self) -> bool {
        self.inverted.is_none()
    }
}

/// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
///
/// Note that `MachBuffer` was primarily written for intra-function references
/// of jumps between basic blocks, but it's also quite usable for entire text
/// sections and resolving references between functions themselves. This
/// builder interprets "blocks" as labeled functions for the purposes of
/// resolving labels internally in the buffer.
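///
/// A rough sketch of how a caller might drive this builder (the function
/// data, alignment, and the choice of instruction type `Inst` here are
/// illustrative):
///
/// ```ignore
/// let mut builder = MachTextSectionBuilder::<Inst>::new(funcs.len());
/// for (body, align) in funcs {
///     // `labeled = true` binds the next function's label at this offset so
///     // later relocations can be resolved to it.
///     let offset = builder.append(true, &body, align, ctrl_plane);
/// }
/// let text = builder.finish(ctrl_plane);
/// ```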
pub struct MachTextSectionBuilder<I: VCodeInst> {
    buf: MachBuffer<I>,
    next_func: usize,
    force_veneers: ForceVeneers,
}

impl<I: VCodeInst> MachTextSectionBuilder<I> {
    /// Creates a new text section builder which will have `num_funcs` functions
    /// pushed into it.
    pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
        let mut buf = MachBuffer::new();
        buf.reserve_labels_for_blocks(num_funcs);
        MachTextSectionBuilder {
            buf,
            next_func: 0,
            force_veneers: ForceVeneers::No,
        }
    }
}

impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
    fn append(
        &mut self,
        labeled: bool,
        func: &[u8],
        align: u32,
        ctrl_plane: &mut ControlPlane,
    ) -> u64 {
        // Conditionally emit an island if it's necessary to resolve jumps
        // between functions which are too far away.
        let size = func.len() as u32;
        if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
            self.buf
                .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
        }

        self.buf.align_to(align);
        let pos = self.buf.cur_offset();
        if labeled {
            self.buf.bind_label(
                MachLabel::from_block(BlockIndex::new(self.next_func)),
                ctrl_plane,
            );
            self.next_func += 1;
        }
        self.buf.put_data(func);
        u64::from(pos)
    }

    fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
        crate::trace!(
            "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
        );
        let label = MachLabel::from_block(BlockIndex::new(target));
        let offset = u32::try_from(offset).unwrap();
        match I::LabelUse::from_reloc(reloc, addend) {
            Some(label_use) => {
                self.buf.use_label_at_offset(offset, label, label_use);
                true
            }
            None => false,
        }
    }

    fn force_veneers(&mut self) {
        self.force_veneers = ForceVeneers::Yes;
    }

    fn write(&mut self, offset: u64, data: &[u8]) {
        self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
    }

    fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
        // Double-check all functions were pushed.
        assert_eq!(self.next_func, self.buf.label_offsets.len());

        // Finish up any veneers, if necessary.
        self.buf
            .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);

        // We don't need the data any more, so return it to the caller.
        mem::take(&mut self.buf.data).into_vec()
    }
}

// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
#[cfg(all(test, feature = "arm64"))]
mod test {
    use cranelift_entity::EntityRef as _;

    use super::*;
    use crate::ir::UserExternalNameRef;
    use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
    use crate::isa::aarch64::inst::{OperandSize, xreg};
    use crate::machinst::{MachInstEmit, MachInstEmitState};
    use crate::settings;

    fn label(n: u32) -> MachLabel {
        MachLabel::from_block(BlockIndex::new(n as usize))
    }
    fn target(n: u32) -> BranchTarget {
        BranchTarget::Label(label(n))
    }

    #[test]
    fn test_elide_jump_to_next() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(2);
        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);
        buf.bind_label(label(1), state.ctrl_plane_mut());
        let buf = buf.finish(&constants, state.ctrl_plane_mut());
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_elide_trivial_jump_blocks() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());

        let buf = buf.finish(&constants, state.ctrl_plane_mut());
        assert_eq!(0, buf.total_size());
    }

    #[test]
    fn test_flip_cond() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Udf {
            trap_code: TrapCode::STACK_OVERFLOW,
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::TrapIf {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            trap_code: TrapCode::STACK_OVERFLOW,
        };
        inst.emit(&mut buf2, &info, &mut state);
        let inst = Inst::Nop4;
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(buf.data, buf2.data);
    }

    #[test]
    fn test_island() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(2),
            not_taken: target(3),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        while buf.cur_offset() < 2000000 {
            if buf.island_needed(0) {
                buf.emit_island(0, state.ctrl_plane_mut());
            }
            let inst = Inst::Nop4;
            inst.emit(&mut buf, &info, &mut state);
        }

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(2000000 + 8, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),

            // This conditionally taken branch has a 19-bit constant, shifted
            // to the left by two, giving us a 21-bit range in total. Half of
            // this range is positive, so our jump target should be around
            // 1 << 20 bytes away.
            //
            // There are two pending fixups by the time we reach this point,
            // one for this 19-bit jump and one for the unconditional 26-bit
            // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
            // veneer is 20 bytes large, so pessimistically we'll need two
            // veneers. Currently each veneer is pessimistically assumed to be
            // the maximal size, which means we need 40 bytes of extra space,
            // so the actual island should come 40 bytes before the deadline.
            taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),

            // This branch is in-range so no veneers should be needed, it should
            // go directly to the target.
            not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
        };
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(&buf.data[0..8], &buf2.data[..]);
    }

    #[test]
    fn test_island_backward() {
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(4);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        while buf.cur_offset() < 2000000 {
            let inst = Inst::Nop4;
            inst.emit(&mut buf, &info, &mut state);
        }

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: target(0),
            not_taken: target(1),
        };
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(2000000 + 12, buf.total_size());

        let mut buf2 = MachBuffer::new();
        let mut state = Default::default();
        let inst = Inst::CondBr {
            kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
            taken: BranchTarget::ResolvedOffset(8),
            not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
        };
        inst.emit(&mut buf2, &info, &mut state);
        let inst = Inst::Jump {
            dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
        };
        inst.emit(&mut buf2, &info, &mut state);

        let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());

        assert_eq!(&buf.data[2000000..], &buf2.data[..]);
    }

    #[test]
    fn test_multiple_redirect() {
        // label0:
        //   cbz x0, label1
        //   b label2
        // label1:
        //   b label3
        // label2:
        //   nop
        //   nop
        //   b label0
        // label3:
        //   b label4
        // label4:
        //   b label5
        // label5:
        //   b label7
        // label6:
        //   nop
        // label7:
        //   ret
        //
        // -- should become:
        //
        // label0:
        //   cbz x0, label7
        // label2:
        //   nop
        //   nop
        //   b label0
        // label6:
        //   nop
        // label7:
        //   ret

        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(8);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::CondBr {
            kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
            taken: target(1),
            not_taken: target(2),
        };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);
        inst.emit(&mut buf, &info, &mut state);
        let inst = Inst::Jump { dest: target(0) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(5) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(5), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(7) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(6), state.ctrl_plane_mut());
        let inst = Inst::Nop4;
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(7), state.ctrl_plane_mut());
        let inst = Inst::Ret {};
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
            0x1f, 0x20, 0x03, 0xd5, // nop
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xfd, 0xff, 0xff, 0x17, // b 0
            0x1f, 0x20, 0x03, 0xd5, // nop
            0xc0, 0x03, 0x5f, 0xd6, // ret
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn test_handle_branch_cycle() {
        // label0:
        //   b label1
        // label1:
        //   b label2
        // label2:
        //   b label3
        // label3:
        //   b label4
        // label4:
        //   b label1 // note: not label0 (to make it interesting).
        //
        // -- should become:
        //
        // label0, label1, ..., label4:
        //   b label0
        let info = EmitInfo::new(settings::Flags::new(settings::builder()));
        let mut buf = MachBuffer::new();
        let mut state = <Inst as MachInstEmit>::State::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(5);

        buf.bind_label(label(0), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(1), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(2) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(2), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(3) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(3), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(4) };
        inst.emit(&mut buf, &info, &mut state);

        buf.bind_label(label(4), state.ctrl_plane_mut());
        let inst = Inst::Jump { dest: target(1) };
        inst.emit(&mut buf, &info, &mut state);

        let buf = buf.finish(&constants, state.ctrl_plane_mut());

        let golden_data = vec![
            0x00, 0x00, 0x00, 0x14, // b 0
        ];

        assert_eq!(&golden_data[..], &buf.data[..]);
    }

    #[test]
    fn metadata_records() {
        let mut buf = MachBuffer::<Inst>::new();
        let ctrl_plane = &mut Default::default();
        let constants = Default::default();

        buf.reserve_labels_for_blocks(3);

        buf.bind_label(label(0), ctrl_plane);
        buf.put1(1);
        buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
        buf.put1(2);
        buf.add_trap(TrapCode::INTEGER_OVERFLOW);
        buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
        buf.add_try_call_site(
            Some(0x10),
            [
                MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
                MachExceptionHandler::Default(label(1)),
            ]
            .into_iter(),
        );
        buf.add_reloc(
            Reloc::Abs4,
            &ExternalName::User(UserExternalNameRef::new(0)),
            0,
        );
        buf.put1(3);
        buf.add_reloc(
            Reloc::Abs8,
            &ExternalName::User(UserExternalNameRef::new(1)),
            1,
        );
        buf.put1(4);
        buf.bind_label(label(1), ctrl_plane);
        buf.put1(0xff);
        buf.bind_label(label(2), ctrl_plane);
        buf.put1(0xff);

        let buf = buf.finish(&constants, ctrl_plane);

        assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
        assert_eq!(
            buf.traps()
                .iter()
                .map(|trap| (trap.offset, trap.code))
                .collect::<Vec<_>>(),
            vec![
                (1, TrapCode::HEAP_OUT_OF_BOUNDS),
                (2, TrapCode::INTEGER_OVERFLOW),
                (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
            ]
        );
        let call_sites: Vec<_> = buf.call_sites().collect();
        assert_eq!(call_sites[0].ret_addr, 2);
        assert_eq!(call_sites[0].frame_offset, Some(0x10));
        assert_eq!(
            call_sites[0].exception_handlers,
            &[
                FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
                FinalizedMachExceptionHandler::Default(4)
            ],
        );
        assert_eq!(
            buf.relocs()
                .iter()
                .map(|reloc| (reloc.offset, reloc.kind))
                .collect::<Vec<_>>(),
            vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
        );
    }
}