GitHub Repository: bytecodealliance/wasmtime
Path: blob/main/cranelift/codegen/src/machinst/abi.rs
//! Implementation of a vanilla ABI, shared between several machines. The
//! implementation here assumes that arguments will be passed in registers
//! first, then additional args on the stack; that the stack grows downward,
//! contains a standard frame (return address and frame pointer), and the
//! compiler is otherwise free to allocate space below that with its choice of
//! layout; and that the machine has some notion of caller- and callee-save
//! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
//! mold and thus both of these backends use this shared implementation.
//!
//! See the documentation in specific machine backends for the "instantiation"
//! of this generic ABI, i.e., which registers are caller/callee-save, arguments
//! and return values, and any other special requirements.
//!
//! For now the implementation here assumes a 64-bit machine, but we intend to
//! make this 32/64-bit-generic shortly.
//!
//! # Vanilla ABI
//!
//! First, arguments and return values are passed in registers up to a certain
//! fixed count, after which they overflow onto the stack. Multiple return
//! values either fit in registers, or are returned in a separate return-value
//! area on the stack, given by a hidden extra parameter.
//!
//! Note that the exact stack layout is up to us. We settled on the
//! below design based on several requirements. In particular, we need
//! to be able to generate instructions (or instruction sequences) to
//! access arguments, stack slots, and spill slots before we know how
//! many spill slots or clobber-saves there will be, because of our
//! pass structure. We also prefer positive offsets to negative
//! offsets because of an asymmetry in some machines' addressing modes
//! (e.g., on AArch64, positive offsets have a larger possible range
//! without a long-form sequence to synthesize an arbitrary
//! offset). We also need clobber-save registers to be "near" the
//! frame pointer: Windows unwind information requires it to be within
//! 240 bytes of RBP. Finally, it is not allowed to access memory
//! below the current SP value.
//!
//! We assume that a prologue first pushes the frame pointer (and
//! return address above that, if the machine does not do that in
//! hardware). We set FP to point to this two-word frame record. We
//! store all other frame slots below this two-word frame record, as
//! well as enough space for arguments to the largest possible
//! function call. The stack pointer then remains at this position
//! for the duration of the function, allowing us to address all
//! frame storage at positive offsets from SP.
//!
//! Note that if we ever support dynamic stack-space allocation (for
//! `alloca`), we will need a way to reference spill slots and stack
//! slots relative to a dynamic SP, because we will no longer be able
//! to know a static offset from SP to the slots at any particular
//! program point. Probably the best solution at that point will be to
//! revert to using the frame pointer as the reference for all slots,
//! to allow generating spill/reload and stackslot accesses before we
//! know how large the clobber-saves will be.
//!
//! # Stack Layout
//!
//! The stack looks like:
//!
//! ```plain
//!   (high address)
//!                              |          ...              |
//!                              | caller frames             |
//!                              |          ...              |
//!                              +===========================+
//!                              |          ...              |
//!                              | stack args                |
//! Canonical Frame Address -->  | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+           -----
//!                              |          ...              |             |
//!                              | clobbered callee-saves    |             |
//! unwind-frame base -------->  | (pushed by prologue)      |             |
//!                              +---------------------------+   -----     |
//!                              |          ...              |     |       |
//!                              | spill slots               |     |       |
//!                              | (accessed via SP)         |   fixed   active
//!                              |          ...              |   frame    size
//!                              | stack slots               |  storage    |
//!                              | (accessed via SP)         |   size      |
//!                              | (alloc'd by prologue)     |     |       |
//!                              +---------------------------+   -----     |
//!                              | [alignment as needed]     |             |
//!                              |          ...              |             |
//!                              | args for largest call     |             |
//! SP ----------------------->  | (alloc'd by prologue)     |             |
//!                              +===========================+           -----
//!
//!   (low address)
//! ```
//!
//! # Multi-value Returns
//!
//! We support multi-value returns by using multiple return-value
//! registers. In some cases this is an extension of the base system
//! ABI. See each platform's `abi.rs` implementation for details.
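//!
//! # Addressing Example
//!
//! An illustrative sketch with made-up sizes (not taken from any particular
//! backend): because SP stays fixed after the prologue, every frame slot has
//! a static, positive SP-relative offset. For a frame with 32 bytes of
//! outgoing-call args and 64 bytes of stack slots on a 64-bit target, spill
//! slot `i` would be addressed at
//!
//! ```plain
//! SP + outgoing_args_size + stackslots_size + i * word_bytes
//!    = SP + 32 + 64 + 8 * i
//! ```
//!
//! `FrameLayout::spillslot_offset` below computes the
//! `stackslots_size + i * word_bytes` portion, relative to the bottom of the
//! fixed-frame storage area.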

use crate::CodegenError;
use crate::FxHashMap;
use crate::HashMap;
use crate::entity::SecondaryMap;
use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTag, Signature};
use crate::ir::{StackSlotKey, types::*};
use crate::isa::TargetIsa;
use crate::settings::ProbestackStrategy;
use crate::{ir, isa};
use crate::{machinst::*, trace};
use alloc::boxed::Box;
use core::marker::PhantomData;
use regalloc2::{MachineEnv, PReg, PRegSet};
use smallvec::smallvec;

/// A small vector of instructions (with some reasonable size); appropriate for
/// a small fixed sequence implementing one operation.
pub type SmallInstVec<I> = SmallVec<[I; 4]>;

/// A type used by backends to track argument-binding info in the "args"
/// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
#[derive(Clone, Debug)]
pub struct ArgPair {
    /// The vreg that is defined by this args pseudoinst.
    pub vreg: Writable<Reg>,
    /// The preg that the arg arrives in; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A type used by backends to track return register binding info in the "ret"
/// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
#[derive(Clone, Debug)]
pub struct RetPair {
    /// The vreg that is returned by this pseudoinst.
    pub vreg: Reg,
    /// The preg that the arg is returned through; this constrains the vreg's
    /// placement at the pseudoinst.
    pub preg: Reg,
}

/// A location for (part of) an argument or return value. These "storage slots"
/// are specified for each register-sized part of an argument.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ABIArgSlot {
    /// In a real register.
    Reg {
        /// Register that holds this arg.
        reg: RealReg,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
    /// Arguments only: on stack, at given offset from SP at entry.
    Stack {
        /// Offset of this arg relative to the base of stack args.
        offset: i64,
        /// Value type of this arg.
        ty: ir::Type,
        /// Should this arg be zero- or sign-extended?
        extension: ir::ArgumentExtension,
    },
}

impl ABIArgSlot {
    /// The type of the value that will be stored in this slot.
    pub fn get_type(&self) -> ir::Type {
        match self {
            ABIArgSlot::Reg { ty, .. } => *ty,
            ABIArgSlot::Stack { ty, .. } => *ty,
        }
    }
}

/// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
/// 100% of values use one slot. Only `i128`s need multiple slots, and they are
/// super rare (and never happen with Wasm).
pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;

/// An ABIArg is composed of one or more parts. This allows for a CLIF-level
/// Value to be passed with its parts in more than one location at the ABI
/// level. For example, a 128-bit integer may be passed in two 64-bit registers,
/// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
/// number of "parts" should correspond to the number of registers used to store
/// this type according to the machine backend.
///
/// As an invariant, the `purpose` for every part must match. As a further
/// invariant, a `StructArg` part cannot appear with any other part.
#[derive(Clone, Debug)]
pub enum ABIArg {
    /// Storage slots (registers or stack locations) for each part of the
    /// argument value. The number of slots must equal the number of register
    /// parts used to store a value of this type.
    Slots {
        /// Slots, one per register part.
        slots: ABIArgSlotVec,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Structure argument. We reserve stack space for it, but the CLIF-level
    /// semantics are a little weird: the value passed to the call instruction,
    /// and received in the corresponding block param, is a *pointer*. On the
    /// caller side, we memcpy the data from the passed-in pointer to the stack
    /// area; on the callee side, we compute a pointer to this stack area and
    /// provide that as the argument's value.
    StructArg {
        /// Offset of this arg relative to base of stack args.
        offset: i64,
        /// Size of this arg on the stack.
        size: u64,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
    /// Implicit argument. Similar to a StructArg, except that we have the
    /// target type, not a pointer type, at the CLIF level. This argument is
    /// still implicitly passed by reference.
    ImplicitPtrArg {
        /// Register or stack slot holding a pointer to the buffer.
        pointer: ABIArgSlot,
        /// Offset of the argument buffer.
        offset: i64,
        /// Type of the implicit argument.
        ty: Type,
        /// Purpose of this arg.
        purpose: ir::ArgumentPurpose,
    },
}

impl ABIArg {
    /// Create an ABIArg from one register.
    pub fn reg(
        reg: RealReg,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
            purpose,
        }
    }

    /// Create an ABIArg from one stack slot.
    pub fn stack(
        offset: i64,
        ty: ir::Type,
        extension: ir::ArgumentExtension,
        purpose: ir::ArgumentPurpose,
    ) -> ABIArg {
        ABIArg::Slots {
            slots: smallvec![ABIArgSlot::Stack {
                offset,
                ty,
                extension,
            }],
            purpose,
        }
    }
}
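
// An illustrative sketch (hypothetical register value; see the `ABIArg` docs
// above): a 128-bit integer on a 64-bit machine is a single CLIF-level value
// whose parts occupy two slots, e.g. one register and one stack slot:
//
//     ABIArg::Slots {
//         slots: smallvec![
//             ABIArgSlot::Reg { reg: some_real_reg, ty: I64, extension: ir::ArgumentExtension::None },
//             ABIArgSlot::Stack { offset: 0, ty: I64, extension: ir::ArgumentExtension::None },
//         ],
//         purpose: ir::ArgumentPurpose::Normal,
//     }
//
// Both parts share the same `purpose`, per the invariant documented on
// `ABIArg`.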

/// Are we computing information about arguments or return values? Much of the
/// handling is factored out into common routines; this enum allows us to
/// distinguish which case we're handling.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ArgsOrRets {
    /// Arguments.
    Args,
    /// Return values.
    Rets,
}

/// Abstract location for a machine-specific ABI impl to translate into the
/// appropriate addressing mode.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StackAMode {
    /// Offset into the current frame's argument area.
    IncomingArg(i64, u32),
    /// Offset within the stack slots in the current frame.
    Slot(i64),
    /// Offset into the callee frame's argument area.
    OutgoingArg(i64),
}

impl StackAMode {
    fn offset_by(&self, offset: u32) -> Self {
        match self {
            StackAMode::IncomingArg(off, size) => {
                StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)
            }
            StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),
            StackAMode::OutgoingArg(off) => {
                StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
            }
        }
    }
}
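
// For example, shifting an outgoing-argument location up by 8 bytes:
//
//     let amode = StackAMode::OutgoingArg(16);
//     assert_eq!(amode.offset_by(8), StackAMode::OutgoingArg(24));
//
// The `checked_add(...).unwrap()` calls above mean that an offset overflowing
// `i64` panics rather than silently wrapping.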

/// Trait implemented by machine-specific backend to represent ISA flags.
pub trait IsaFlags: Clone {
    /// Get a flag indicating whether forward-edge CFI is enabled.
    fn is_forward_edge_cfi_enabled(&self) -> bool {
        false
    }
}

/// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
/// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
/// `ABIArg`s in `SigSet` and exposes just the args for the current
/// `compute_arg_locs` call.
pub struct ArgsAccumulator<'a> {
    sig_set_abi_args: &'a mut Vec<ABIArg>,
    start: usize,
    non_formal_flag: bool,
}

impl<'a> ArgsAccumulator<'a> {
    fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
        let start = sig_set_abi_args.len();
        ArgsAccumulator {
            sig_set_abi_args,
            start,
            non_formal_flag: false,
        }
    }

    #[inline]
    pub fn push(&mut self, arg: ABIArg) {
        debug_assert!(!self.non_formal_flag);
        self.sig_set_abi_args.push(arg)
    }

    #[inline]
    pub fn push_non_formal(&mut self, arg: ABIArg) {
        self.non_formal_flag = true;
        self.sig_set_abi_args.push(arg)
    }

    #[inline]
    pub fn args(&self) -> &[ABIArg] {
        &self.sig_set_abi_args[self.start..]
    }

    #[inline]
    pub fn args_mut(&mut self) -> &mut [ABIArg] {
        &mut self.sig_set_abi_args[self.start..]
    }
}
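
// A sketch of the intended protocol, as seen from a backend's
// `compute_arg_locs` (details vary per backend; only the accumulator calls
// are shown, with `..` standing in for real arguments):
//
//     fn compute_arg_locs(.., mut args: ArgsAccumulator) -> .. {
//         // Formal parameters first, one `ABIArg` per CLIF parameter...
//         args.push(ABIArg::reg(..));
//         args.push(ABIArg::stack(..));
//         // ...then any synthetic trailing arg, such as the return-area
//         // pointer, so the first N lowered args line up with the N CLIF
//         // parameters.
//         args.push_non_formal(ABIArg::reg(..));
//     }
//
// `push` debug-asserts that no non-formal arg has been pushed yet, which
// enforces this ordering.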

/// Trait implemented by machine-specific backend to provide information about
/// register assignments and to allow generating the specific instructions for
/// stack loads/saves, prologues/epilogues, etc.
pub trait ABIMachineSpec {
    /// The instruction type.
    type I: VCodeInst;

    /// The ISA flags type.
    type F: IsaFlags;

    /// This is the limit for the size of argument and return-value areas on the
    /// stack. We place a reasonable limit here to avoid integer overflow issues
    /// with 32-bit arithmetic.
    const STACK_ARG_RET_SIZE_LIMIT: u32;

    /// Returns the number of bits in a word, that is, 32/64 for 32/64-bit
    /// architectures.
    fn word_bits() -> u32;

    /// Returns the number of bytes in a word.
    fn word_bytes() -> u32 {
        Self::word_bits() / 8
    }

    /// Returns word-size integer type.
    fn word_type() -> Type {
        match Self::word_bits() {
            32 => I32,
            64 => I64,
            _ => unreachable!(),
        }
    }

    /// Returns word register class.
    fn word_reg_class() -> RegClass {
        RegClass::Int
    }

    /// Returns required stack alignment in bytes.
    fn stack_align(call_conv: isa::CallConv) -> u32;

    /// Process a list of parameters or return values and allocate them to registers
    /// and stack slots.
    ///
    /// The argument locations should be pushed onto the given `ArgsAccumulator`
    /// in order. Any extra arguments added (such as return area pointers)
    /// should come at the end of the list so that the first N lowered
    /// parameters align with the N clif parameters.
    ///
    /// Returns the stack-space used (rounded up as alignment requires), and
    /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
    /// that was added.
    fn compute_arg_locs(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        params: &[ir::AbiParam],
        args_or_rets: ArgsOrRets,
        add_ret_area_ptr: bool,
        args: ArgsAccumulator,
    ) -> CodegenResult<(u32, Option<usize>)>;

    /// Generate a load from the stack.
    fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;

    /// Generate a store to the stack.
    fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate a move.
    fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;

    /// Generate an integer-extend operation.
    fn gen_extend(
        to_reg: Writable<Reg>,
        from_reg: Reg,
        is_signed: bool,
        from_bits: u8,
        to_bits: u8,
    ) -> Self::I;

    /// Generate an "args" pseudo-instruction to capture input args in
    /// registers.
    fn gen_args(args: Vec<ArgPair>) -> Self::I;

    /// Generate a "rets" pseudo-instruction that moves vregs to return
    /// registers.
    fn gen_rets(rets: Vec<RetPair>) -> Self::I;

    /// Generate an add-with-immediate. Note that even if this uses a scratch
    /// register, it must satisfy two requirements:
    ///
    /// - The add-imm sequence must only clobber caller-save registers that are
    ///   not used for arguments, because it will be placed in the prologue
    ///   before the clobbered callee-save registers are saved.
    ///
    /// - The add-imm sequence must work correctly when `from_reg` and/or
    ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
    fn gen_add_imm(
        call_conv: isa::CallConv,
        into_reg: Writable<Reg>,
        from_reg: Reg,
        imm: u32,
    ) -> SmallInstVec<Self::I>;

    /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
    /// the stack pointer is less than the given limit register (assuming the
    /// stack grows downward).
    fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;

    /// Generate an instruction to compute an address of a stack slot (FP- or
    /// SP-based offset).
    fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;

    /// Get a fixed register to use to compute a stack limit. This is needed for
    /// certain sequences generated after the register allocator has already
    /// run. This must satisfy two requirements:
    ///
    /// - It must be a caller-save register that is not used for arguments,
    ///   because it will be clobbered in the prologue before the clobbered
    ///   callee-save registers are saved.
    ///
    /// - It must be safe to pass as an argument and/or destination to
    ///   `gen_add_imm()`. This is relevant when an addition with a large
    ///   immediate needs its own temporary; it cannot use the same fixed
    ///   temporary as this one.
    fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;

    /// Generate a load from the given [base+offset] address.
    fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;

    /// Generate a store to the given [base+offset] address.
    fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;

    /// Adjust the stack pointer up or down.
    fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;

    /// Compute a FrameLayout structure containing a sorted list of all clobbered
    /// registers that are callee-saved according to the ABI, as well as the sizes
    /// of all parts of the stack frame. The result is used to emit the prologue
    /// and epilogue routines.
    fn compute_frame_layout(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        sig: &Signature,
        regs: &[Writable<RealReg>],
        function_calls: FunctionCalls,
        incoming_args_size: u32,
        tail_args_size: u32,
        stackslots_size: u32,
        fixed_frame_storage_size: u32,
        outgoing_args_size: u32,
    ) -> FrameLayout;

    /// Generate the usual frame-setup sequence for this architecture: e.g.,
    /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
    /// AArch64.
    fn gen_prologue_frame_setup(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate the usual frame-restore sequence for this architecture.
    fn gen_epilogue_frame_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a return instruction.
    fn gen_return(
        call_conv: isa::CallConv,
        isa_flags: &Self::F,
        frame_layout: &FrameLayout,
    ) -> SmallInstVec<Self::I>;

    /// Generate a probestack call.
    fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);

    /// Generate an inline stack probe.
    fn gen_inline_probestack(
        insts: &mut SmallInstVec<Self::I>,
        call_conv: isa::CallConv,
        frame_size: u32,
        guard_size: u32,
    );

    /// Generate a clobber-save sequence. The implementation here should return
    /// a sequence of instructions that "push" or otherwise save to the stack all
    /// registers written/modified by the function body that are callee-saved.
    /// The sequence of instructions should adjust the stack pointer downward,
    /// and should align as necessary according to ABI requirements.
    fn gen_clobber_save(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a clobber-restore sequence. This sequence should perform the
    /// opposite of the clobber-save sequence generated above, assuming that SP
    /// going into the sequence is at the same point that it was left when the
    /// clobber-save sequence finished.
    fn gen_clobber_restore(
        call_conv: isa::CallConv,
        flags: &settings::Flags,
        frame_layout: &FrameLayout,
    ) -> SmallVec<[Self::I; 16]>;

    /// Generate a memcpy invocation. Used to set up struct
    /// args. Takes `src` and `dst` as read-only inputs and passes a temporary
    /// allocator.
    fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
        call_conv: isa::CallConv,
        dst: Reg,
        src: Reg,
        size: usize,
        alloc_tmp: F,
    ) -> SmallVec<[Self::I; 8]>;

    /// Get the number of spillslots required for the given register-class.
    fn get_number_of_spillslots_for_value(
        rc: RegClass,
        target_vector_bytes: u32,
        isa_flags: &Self::F,
    ) -> u32;

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;

    /// Get all caller-save registers, that is, registers that we expect
    /// not to be saved across a call to a callee with the given ABI.
    fn get_regs_clobbered_by_call(
        call_conv_of_callee: isa::CallConv,
        is_exception: bool,
    ) -> PRegSet;

    /// Get the needed extension mode, given the mode attached to the argument
    /// in the signature and the calling convention. The input (the attribute in
    /// the signature) specifies what extension type should be done *if* the ABI
    /// requires extension to the full register; this method's return value
    /// indicates whether the extension actually *will* be done.
    fn get_ext_mode(
        call_conv: isa::CallConv,
        specified: ir::ArgumentExtension,
    ) -> ir::ArgumentExtension;

    /// Get a temporary register that is available to use after a call
    /// completes and that does not interfere with register-carried
    /// return values. This is used to move stack-carried return
    /// values directly into spillslots if needed.
    fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;

    /// Get the exception payload registers, if any, for a calling
    /// convention.
    ///
    /// Note that the argument here is the calling convention of the *callee*.
    /// This might differ from the caller's, but the exceptional payloads that
    /// are available are defined by the callee, not the caller.
    fn exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg] {
        let _ = callee_conv;
        &[]
    }
}
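
// A minimal sketch of how a backend instantiates this trait (hypothetical
// `MyBackend*` types; see the `abi.rs` in each `isa/*` backend for the real
// implementations):
//
//     impl ABIMachineSpec for MyBackendMachineDeps {
//         type I = MyBackendInst;
//         type F = MyBackendIsaFlags;
//
//         const STACK_ARG_RET_SIZE_LIMIT: u32 = 128 * 1024 * 1024;
//
//         fn word_bits() -> u32 {
//             64
//         }
//
//         fn stack_align(_call_conv: isa::CallConv) -> u32 {
//             16
//         }
//
//         // ...plus the instruction-emission and register-assignment
//         // methods above, which are inherently machine-specific.
//     }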

/// Out-of-line data for calls, to keep the size of `Inst` down.
#[derive(Clone, Debug)]
pub struct CallInfo<T> {
    /// Receiver of this call.
    pub dest: T,
    /// Register uses of this call.
    pub uses: CallArgList,
    /// Register defs of this call.
    pub defs: CallRetList,
    /// Registers clobbered by this call, as per its calling convention.
    pub clobbers: PRegSet,
    /// The calling convention of the callee.
    pub callee_conv: isa::CallConv,
    /// The calling convention of the caller.
    pub caller_conv: isa::CallConv,
    /// The number of bytes that the callee will pop from the stack for the
    /// caller, if any. (Used for popping stack arguments with the `tail`
    /// calling convention.)
    pub callee_pop_size: u32,
    /// Information for a try-call, if this is one. We combine
    /// handling of calls and try-calls as much as possible to share
    /// argument/return logic; they mostly differ in the metadata that
    /// they emit, which this information feeds into.
    pub try_call_info: Option<TryCallInfo>,
    /// Whether this call is patchable.
    pub patchable: bool,
}

/// Out-of-line information present on `try_call` instructions only:
/// information that is used to generate exception-handling tables and
/// link up to destination blocks properly.
#[derive(Clone, Debug)]
pub struct TryCallInfo {
    /// The target to jump to on a normal return.
    pub continuation: MachLabel,
    /// Exception tags to catch and corresponding destination labels.
    pub exception_handlers: Box<[TryCallHandler]>,
}

/// Information about an individual handler at a try-call site.
#[derive(Clone, Debug)]
pub enum TryCallHandler {
    /// If the tag matches (given the current context), recover at the
    /// label.
    Tag(ExceptionTag, MachLabel),
    /// Recover at the label unconditionally.
    Default(MachLabel),
    /// Set the dynamic context for interpreting tags at this point in
    /// the handler list.
    Context(Reg),
}

impl<T> CallInfo<T> {
    /// Creates an empty set of info with no clobbers/uses/etc. for the
    /// specified ABI.
    pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
        CallInfo {
            dest,
            uses: smallvec![],
            defs: smallvec![],
            clobbers: PRegSet::empty(),
            caller_conv: call_conv,
            callee_conv: call_conv,
            callee_pop_size: 0,
            try_call_info: None,
            patchable: false,
        }
    }
}
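
// For example, a lowering rule might start from an empty record and fill in
// the register lists as arguments are processed (a sketch; `dest` here is
// whatever destination type `T` the backend uses):
//
//     let mut info = CallInfo::empty(dest, isa::CallConv::SystemV);
//     assert!(info.uses.is_empty() && info.try_call_info.is_none());
//     // ...then populate `info.uses`, `info.defs`, and `info.clobbers`.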

/// The id of an ABI signature within the `SigSet`.
#[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Sig(u32);
cranelift_entity::entity_impl!(Sig);

impl Sig {
    fn prev(self) -> Option<Sig> {
        self.0.checked_sub(1).map(Sig)
    }
}

/// ABI information shared between body (callee) and caller.
#[derive(Clone, Debug)]
pub struct SigData {
    /// Currently both return values and arguments are stored in a contiguous
    /// vector in `SigSet::abi_args`.
    ///
    /// ```plain
    ///              +----------------------------------------------+
    ///              | return values                                |
    ///              | ...                                          |
    /// rets_end --> +----------------------------------------------+
    ///              | arguments                                    |
    ///              | ...                                          |
    /// args_end --> +----------------------------------------------+
    /// ```
    ///
    /// Note we only store two offsets as rets_end == args_start, and
    /// rets_start == prev.args_end.
    ///
    /// Argument location ending offset (regs or stack slots). Stack offsets
    /// are relative to SP on entry to function.
    ///
    /// This is an index into the `SigSet::abi_args`.
    args_end: u32,

    /// Return-value location ending offset. Stack offsets are relative to the
    /// return-area pointer.
    ///
    /// This is an index into the `SigSet::abi_args`.
    rets_end: u32,

    /// Space on stack used to store arguments. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_arg_space: u32,

    /// Space on stack used to store return values. We're storing the size in u32 to
    /// reduce the size of the struct.
    sized_stack_ret_space: u32,

    /// Index in `args` of the stack-return-value-area argument.
    stack_ret_arg: Option<u16>,

    /// Calling convention used.
    call_conv: isa::CallConv,
}

impl SigData {
    /// Get total stack space required for arguments.
    pub fn sized_stack_arg_space(&self) -> u32 {
        self.sized_stack_arg_space
    }

    /// Get total stack space required for return values.
    pub fn sized_stack_ret_space(&self) -> u32 {
        self.sized_stack_ret_space
    }

    /// Get calling convention used.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// The index of the stack-return-value-area argument, if any.
    pub fn stack_ret_arg(&self) -> Option<u16> {
        self.stack_ret_arg
    }
}

/// A (mostly) deduplicated set of ABI signatures.
///
/// We say "mostly" because we do not dedupe between signatures interned via
/// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
/// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
/// this final bit of deduplication would require filling out the
/// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
/// hash map itself but params and returns vecs in each signature) that we want
/// to avoid.
///
/// In general, prefer using the `ir::SigRef`-taking methods to the
/// `ir::Signature`-taking methods when you can get away with it, as they don't
/// require cloning non-copy types that will trigger heap allocations.
///
/// This type can be indexed by `Sig` to access its associated `SigData`.
pub struct SigSet {
    /// Interned `ir::Signature`s that we already have an ABI signature for.
    ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,

    /// Interned `ir::SigRef`s that we already have an ABI signature for.
    ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,

    /// A single, shared allocation for all `ABIArg`s used by all
    /// `SigData`s. Each `SigData` references its args/rets via indices into
    /// this allocation.
    abi_args: Vec<ABIArg>,

    /// The actual ABI signatures, keyed by `Sig`.
    sigs: PrimaryMap<Sig, SigData>,
}

impl SigSet {
    /// Construct a new `SigSet`, interning all of the signatures used by the
    /// given function.
    pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
    where
        M: ABIMachineSpec,
    {
        let arg_estimate = func.dfg.signatures.len() * 6;

        let mut sigs = SigSet {
            ir_signature_to_abi_sig: FxHashMap::default(),
            ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
            abi_args: Vec::with_capacity(arg_estimate),
            sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
        };

        sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
        for sig_ref in func.dfg.signatures.keys() {
            sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
        }

        Ok(sigs)
    }

    /// Have we already interned an ABI signature for the given `ir::Signature`?
    pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
        self.ir_signature_to_abi_sig.contains_key(signature)
    }

    /// Construct and intern an ABI signature for the given `ir::Signature`.
    pub fn make_abi_sig_from_ir_signature<M>(
        &mut self,
        signature: ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        // Because the `HashMap` entry API requires taking ownership of the
        // lookup key -- and we want to avoid unnecessary clones of
        // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
        // have a single, get-or-create-style method for interning
        // `ir::Signature`s into ABI signatures. So at least (debug) assert that
        // we aren't creating duplicate ABI signatures for the same
        // `ir::Signature`.
        debug_assert!(!self.have_abi_sig_for_signature(&signature));

        let sig_data = self.from_func_sig::<M>(&signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_signature_to_abi_sig.insert(signature, sig);
        Ok(sig)
    }

    fn make_abi_sig_from_ir_sig_ref<M>(
        &mut self,
        sig_ref: ir::SigRef,
        dfg: &ir::DataFlowGraph,
        flags: &settings::Flags,
    ) -> CodegenResult<Sig>
    where
        M: ABIMachineSpec,
    {
        if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
            return Ok(sig);
        }
        let signature = &dfg.signatures[sig_ref];
        let sig_data = self.from_func_sig::<M>(signature, flags)?;
        let sig = self.sigs.push(sig_data);
        self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
        Ok(sig)
    }

    /// Get the already-interned ABI signature id for the given `ir::SigRef`.
    pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
        self.ir_sig_ref_to_abi_sig[sig_ref]
            .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
    }

    /// Get the already-interned ABI signature id for the given `ir::Signature`.
    pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
        self.ir_signature_to_abi_sig
            .get(signature)
            .copied()
            .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
    }

    /// Construct a `SigData` for the given `ir::Signature`.
    pub fn from_func_sig<M: ABIMachineSpec>(
        &mut self,
        sig: &ir::Signature,
        flags: &settings::Flags,
    ) -> CodegenResult<SigData> {
        // Keep in sync with ensure_struct_return_ptr_is_returned.
        if sig.uses_special_return(ArgumentPurpose::StructReturn) {
            panic!("Explicit StructReturn return value not allowed: {sig:?}")
        }
        let tmp;
        let returns = if let Some(struct_ret_index) =
            sig.special_param_index(ArgumentPurpose::StructReturn)
        {
            if !sig.returns.is_empty() {
                panic!("No return values are allowed when using StructReturn: {sig:?}");
            }
            tmp = [sig.params[struct_ret_index]];
            &tmp
        } else {
            sig.returns.as_slice()
        };

        // Compute args and retvals from signature. Handle retvals first,
        // because we may need to add a return-area arg to the args.

        // NOTE: We rely on the order of the args (rets -> args) inserted to
        // compute the offsets in `SigSet::args()` and `SigSet::rets()`.
        // Therefore, we cannot swap the order of the two `compute_arg_locs`
        // calls.
        let (sized_stack_ret_space, _) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &returns,
            ArgsOrRets::Rets,
            /* extra ret-area ptr = */ false,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        if !flags.enable_multi_ret_implicit_sret() {
            assert_eq!(sized_stack_ret_space, 0);
        }
        let rets_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the return size to something reasonable.
        if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        let need_stack_return_area = sized_stack_ret_space > 0;
        if need_stack_return_area {
            assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
        }

        let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
            sig.call_conv,
            flags,
            &sig.params,
            ArgsOrRets::Args,
            need_stack_return_area,
            ArgsAccumulator::new(&mut self.abi_args),
        )?;
        let args_end = u32::try_from(self.abi_args.len()).unwrap();

        // To avoid overflow issues, limit the arg size to something reasonable.
        if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
            return Err(CodegenError::ImplLimitExceeded);
        }

        trace!(
            "ABISig: sig {:?} => args end = {} rets end = {}
             arg stack = {} ret stack = {} stack_ret_arg = {:?}",
            sig,
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            need_stack_return_area,
        );

        let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
        Ok(SigData {
            args_end,
            rets_end,
            sized_stack_arg_space,
            sized_stack_ret_space,
            stack_ret_arg,
            call_conv: sig.call_conv,
        })
    }

    /// Get this signature's ABI arguments.
    pub fn args(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store
        // the offsets.
        let start = usize::try_from(sig_data.rets_end).unwrap();
        let end = usize::try_from(sig_data.args_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass the implicit pointer
    /// to the return-value area on the stack, if required.
    pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
        let sig_data = &self.sigs[sig];
        if let Some(i) = sig_data.stack_ret_arg {
            Some(self.args(sig)[usize::from(i)].clone())
        } else {
            None
        }
    }

    /// Get information specifying how to pass one argument.
    pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
        self.args(sig)[idx].clone()
    }

    /// Get this signature's ABI returns.
    pub fn rets(&self, sig: Sig) -> &[ABIArg] {
        let sig_data = &self.sigs[sig];
        // Please see the comments in `SigSet::from_func_sig` on how we store
        // the offsets.
        let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
        let end = usize::try_from(sig_data.rets_end).unwrap();
        &self.abi_args[start..end]
    }

    /// Get information specifying how to pass one return value.
    pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
        self.rets(sig)[idx].clone()
    }

    /// Get the number of arguments expected.
    pub fn num_args(&self, sig: Sig) -> usize {
        let len = self.args(sig).len();
        if self.sigs[sig].stack_ret_arg.is_some() {
            len - 1
        } else {
            len
        }
    }

    /// Get the number of return values expected.
    pub fn num_rets(&self, sig: Sig) -> usize {
        self.rets(sig).len()
    }
}
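
// A worked example of the shared `abi_args` layout (see the diagram on
// `SigData`): suppose sig A is interned first with 1 return and 2 args, then
// sig B with 2 returns and 1 arg. `abi_args` then holds
//
//     index:   0        1        2        3        4        5
//     entry:  [A ret0] [A arg0] [A arg1] [B ret0] [B ret1] [B arg0]
//
// with A: { rets_end: 1, args_end: 3 } and B: { rets_end: 5, args_end: 6 }.
// `rets(B)` slices `[A.args_end .. B.rets_end]` = `[3..5]` and `args(B)`
// slices `[B.rets_end .. B.args_end]` = `[5..6]`; recovering A's `args_end`
// is exactly what `Sig::prev` above is for.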

// NB: we do _not_ implement `IndexMut` because these signatures are
// deduplicated and shared!
impl core::ops::Index<Sig> for SigSet {
    type Output = SigData;

    fn index(&self, sig: Sig) -> &Self::Output {
        &self.sigs[sig]
    }
}

/// Structure describing the layout of a function's stack frame.
#[derive(Clone, Debug, Default)]
pub struct FrameLayout {
    /// Word size in bytes, so this struct can be
    /// monomorphic/independent of `ABIMachineSpec`.
    pub word_bytes: u32,

    /// N.B. The areas whose sizes are given in this structure fully
    /// cover the current function's stack frame, from high to low
    /// stack addresses in the sequence below. Each size contains
    /// any alignment padding that may be required by the ABI.

    /// Size of incoming arguments on the stack. This is not technically
    /// part of this function's frame, but code in the function will still
    /// need to access it. Depending on the ABI, we may need to set up a
    /// frame pointer to do so; we also may need to pop this area from the
    /// stack upon return.
    pub incoming_args_size: u32,

    /// The size of the incoming argument area, taking into account any
    /// potential increase in size required for tail calls present in the
    /// function. In the case that no tail calls are present, this value
    /// will be the same as [`Self::incoming_args_size`].
    pub tail_args_size: u32,

    /// Size of the "setup area", typically holding the return address
    /// and/or the saved frame pointer. This may be written either during
    /// the call itself (e.g. a pushed return address) or by code emitted
    /// from gen_prologue_frame_setup. In any case, after that code has
    /// completed execution, the stack pointer is expected to point to the
    /// bottom of this area. The same holds at the start of code emitted
    /// by gen_epilogue_frame_restore.
    pub setup_area_size: u32,

    /// Size of the area used to save callee-saved clobbered registers.
    /// This area is accessed by code emitted from gen_clobber_save and
    /// gen_clobber_restore.
    pub clobber_size: u32,

    /// Storage allocated for the fixed part of the stack frame.
    /// This contains stack slots and spill slots.
    pub fixed_frame_storage_size: u32,

    /// The size of all stackslots.
    pub stackslots_size: u32,

    /// Stack size to be reserved for outgoing arguments, if used by
    /// the current ABI, or 0 otherwise. After gen_clobber_save and
    /// before gen_clobber_restore, the stack pointer points to the
    /// bottom of this area.
    pub outgoing_args_size: u32,

    /// Sorted list of callee-saved registers that are clobbered
    /// according to the ABI. These registers will be saved and
    /// restored by gen_clobber_save and gen_clobber_restore.
    pub clobbered_callee_saves: Vec<Writable<RealReg>>,

    /// The function's call pattern classification.
    pub function_calls: FunctionCalls,
}

impl FrameLayout {
    /// Split the clobbered callee-save registers into integer-class and
    /// float-class groups.
    ///
    /// This method does not currently support vector-class callee-save
    /// registers because no current backend has them.
    pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
        let (ints, floats) = self.clobbered_callee_saves.split_at(
            self.clobbered_callee_saves
                .partition_point(|r| r.to_reg().class() == RegClass::Int),
        );
        debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
        (ints, floats)
    }

    /// The distance from FP to SP while the frame is active (not during
    /// prologue setup or epilogue teardown).
    pub fn active_size(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }

    /// Get the offset from SP to the sized stack slots area.
    pub fn sp_to_sized_stack_slots(&self) -> u32 {
        self.outgoing_args_size
    }

    /// Get the offset of a spill slot from SP.
    pub fn spillslot_offset(&self, spillslot: SpillSlot) -> i64 {
        // Offset from beginning of spillslot area.
        let islot = spillslot.index() as i64;
        let spill_off = islot * self.word_bytes as i64;
        let sp_off = self.stackslots_size as i64 + spill_off;

        sp_off
    }

    /// Get the offset from SP up to FP.
    pub fn sp_to_fp(&self) -> u32 {
        self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
    }
}
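
// A worked example with made-up sizes: say `word_bytes = 8`,
// `outgoing_args_size = 32`, `stackslots_size = 48`,
// `fixed_frame_storage_size = 64` (48 bytes of stack slots plus 16 bytes of
// spill slots), and `clobber_size = 16`. Then:
//
//     sp_to_fp()                 = 32 + 64 + 16 = 112
//     sp_to_sized_stack_slots()  = 32
//     spillslot_offset(slot 1)   = 48 + 1 * 8   = 56
//
// i.e. spill slot 1 lives 56 bytes above the bottom of the fixed-frame
// storage area, which itself sits `outgoing_args_size` bytes above SP.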

/// ABI object for a function body.
pub struct Callee<M: ABIMachineSpec> {
    /// CLIF-level signature, possibly normalized.
    ir_sig: ir::Signature,
    /// Signature: arg and retval regs.
    sig: Sig,
    /// Defined dynamic types.
    dynamic_type_sizes: HashMap<Type, u32>,
    /// Offsets to each dynamic stackslot.
    dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
    /// Offsets to each sized stackslot.
    sized_stackslots: PrimaryMap<StackSlot, u32>,
    /// Descriptors for sized stackslots.
    sized_stackslot_keys: SecondaryMap<StackSlot, Option<StackSlotKey>>,
    /// Total stack size of all stackslots.
    stackslots_size: u32,
    /// Stack size to be reserved for outgoing arguments.
    outgoing_args_size: u32,
    /// Initially the number of bytes originating in the caller's frame where
    /// stack arguments will live. After lowering, this number may be larger
    /// than the size expected by the function being compiled, as tail calls
    /// potentially require more space for stack arguments.
    tail_args_size: u32,
    /// Register-argument defs, to be provided to the `args`
    /// pseudo-inst, and pregs to constrain them to.
    reg_args: Vec<ArgPair>,
    /// Finalized frame layout for this function.
    frame_layout: Option<FrameLayout>,
    /// The register holding the return-area pointer, if needed.
    ret_area_ptr: Option<Reg>,
    /// Calling convention this function expects.
    call_conv: isa::CallConv,
    /// The settings controlling this function's compilation.
    flags: settings::Flags,
    /// The ISA-specific flag values controlling this function's compilation.
    isa_flags: M::F,
    /// If this function has a stack limit specified, then `Reg` is where the
    /// stack limit will be located after the instructions specified have been
    /// executed.
    ///
    /// Note that this is intended for insertion into the prologue, if
    /// present. Also note that because the instructions here execute in the
    /// prologue, this happens after legalization/register allocation/etc., so
    /// we need to be extremely careful with each instruction. The instructions
    /// are manually register-allocated and carefully only use caller-saved
    /// registers and keep nothing live after this sequence of instructions.
    stack_limit: Option<(Reg, SmallInstVec<M::I>)>,

    _mach: PhantomData<M>,
}

fn get_special_purpose_param_register(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    purpose: ir::ArgumentPurpose,
) -> Option<Reg> {
    let idx = f.signature.special_param_index(purpose)?;
    match &sigs.args(sig)[idx] {
        &ABIArg::Slots { ref slots, .. } => match &slots[0] {
            &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
            _ => None,
        },
        _ => None,
    }
}

/// Round `val` up to the alignment described by `mask` (one less than a
/// power-of-two alignment), returning `None` on overflow.
fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
    Some(val.checked_add(mask)? & !mask)
}
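
// For example, rounding up to an 8-byte boundary uses `mask = 7`:
//
//     assert_eq!(checked_round_up(13, 7), Some(16));
//     assert_eq!(checked_round_up(16, 7), Some(16));
//     assert_eq!(checked_round_up(u32::MAX, 7), None); // the add overflows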

impl<M: ABIMachineSpec> Callee<M> {
    /// Create a new body ABI instance.
    pub fn new(
        f: &ir::Function,
        isa: &dyn TargetIsa,
        isa_flags: &M::F,
        sigs: &SigSet,
    ) -> CodegenResult<Self> {
        trace!("ABI: func signature {:?}", f.signature);

        let flags = isa.flags().clone();
        let sig = sigs.abi_sig_for_signature(&f.signature);

        let call_conv = f.signature.call_conv;
        // Only these calling conventions are supported.
        debug_assert!(
            call_conv == isa::CallConv::SystemV
                || call_conv == isa::CallConv::Tail
                || call_conv == isa::CallConv::Fast
                || call_conv == isa::CallConv::WindowsFastcall
                || call_conv == isa::CallConv::AppleAarch64
                || call_conv == isa::CallConv::Winch
                || call_conv == isa::CallConv::PreserveAll,
            "Unsupported calling convention: {call_conv:?}"
        );

        // Compute sized stackslot locations and total stackslot size.
        let mut end_offset: u32 = 0;
        let mut sized_stackslots = PrimaryMap::new();
        let mut sized_stackslot_keys = SecondaryMap::new();

        for (stackslot, data) in f.sized_stack_slots.iter() {
            // We start our computation possibly unaligned where the previous
            // stackslot left off.
            let unaligned_start_offset = end_offset;

            // The start of the stackslot must be aligned.
            //
            // We always at least machine-word-align slots, but also
            // satisfy the user's requested alignment.
            debug_assert!(data.align_shift < 32);
            let align = core::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
            let mask = align - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            // The end offset is the start offset increased by the size.
            end_offset = start_offset
                .checked_add(data.size)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
            sized_stackslots.push(start_offset);
            sized_stackslot_keys[stackslot] = data.key;
        }

        // Compute dynamic stackslot locations and total stackslot size.
        let mut dynamic_stackslots = PrimaryMap::new();
        for (stackslot, data) in f.dynamic_stack_slots.iter() {
            debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());

            // This computation is similar to the sized stackslots above.
            let unaligned_start_offset = end_offset;

            let mask = M::word_bytes() - 1;
            let start_offset = checked_round_up(unaligned_start_offset, mask)
                .ok_or(CodegenError::ImplLimitExceeded)?;

            let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
                CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
            })?;

            end_offset = start_offset
                .checked_add(isa.dynamic_vector_bytes(ty))
                .ok_or(CodegenError::ImplLimitExceeded)?;

            dynamic_stackslots.push(start_offset);
        }

        // The total size of the stackslots needs to be word-aligned.
        let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
            .ok_or(CodegenError::ImplLimitExceeded)?;

        let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
        for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
            let ty = f
                .get_concrete_dynamic_ty(dyn_ty)
                .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
            let size = isa.dynamic_vector_bytes(ty);
            dynamic_type_sizes.insert(ty, size);
        }

        // Figure out what instructions, if any, will be needed to check the
        // stack limit. This can either be specified as a special-purpose
        // argument or as a global value which often calculates the stack limit
        // from the arguments.
        let stack_limit = f
            .stack_limit
            .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));

        let tail_args_size = sigs[sig].sized_stack_arg_space;

        Ok(Self {
            ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
            sig,
            dynamic_stackslots,
            dynamic_type_sizes,
            sized_stackslots,
            sized_stackslot_keys,
            stackslots_size,
            outgoing_args_size: 0,
            tail_args_size,
            reg_args: vec![],
            frame_layout: None,
            ret_area_ptr: None,
            call_conv,
            flags,
            isa_flags: isa_flags.clone(),
            stack_limit,
            _mach: PhantomData,
        })
    }

    /// Inserts instructions necessary for checking the stack limit into the
    /// prologue.
    ///
    /// This function will generate instructions necessary to perform a stack
    /// check at the header of a function. The stack check is intended to trap
    /// if the stack pointer goes below a particular threshold, preventing stack
    /// overflow in wasm or other code. The `stack_limit` argument here is the
    /// register which holds the threshold below which we're supposed to trap.
    /// This function is known to allocate `stack_size` bytes and we'll push
    /// instructions onto `insts`.
    ///
    /// Note that the instructions generated here are special because this is
    /// happening so late in the pipeline (e.g. after register allocation). This
    /// means that we need to do manual register allocation here and also be
    /// careful to not clobber any callee-saved or argument registers. For now
    /// this routine makes do with the `spilltmp_reg` as one temporary
    /// register, and a second register of `tmp2` which is caller-saved. This
    /// should be fine for us since no spills should happen in this sequence of
    /// instructions, so our register won't get accidentally clobbered.
    ///
    /// No values can be live after the prologue, but in this case that's ok
    /// because we just need to perform a stack check before progressing with
    /// the rest of the function.
    fn insert_stack_check(
        &self,
        stack_limit: Reg,
        stack_size: u32,
        insts: &mut SmallInstVec<M::I>,
    ) {
        // With no explicit stack allocated we can just emit the simple check of
        // the stack registers against the stack limit register, and trap if
        // it's out of bounds.
        if stack_size == 0 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
            return;
        }

        // Note that the 32k stack size here is pretty special. See the
        // documentation in x86/abi.rs for why this is here. The general idea is
        // that we're protecting against overflow in the addition that happens
        // below.
        if stack_size >= 32 * 1024 {
            insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
        }

        // Add the `stack_size` to `stack_limit`, placing the result in
        // `scratch`.
        //
        // Note though that `stack_limit`'s register may be the same as
        // `scratch`. If our stack size doesn't fit into an immediate this
        // means we need a second scratch register for loading the stack size
        // into a register.
        let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
        insts.extend(M::gen_add_imm(
            self.call_conv,
            scratch,
            stack_limit,
            stack_size,
        ));
        insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
    }
}
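
// The stack-limit check emitted by `insert_stack_check` above therefore takes
// one of two shapes (a pseudocode sketch, not any particular ISA):
//
//     // stack_size == 0:
//     trap_if sp < stack_limit
//
//     // stack_size > 0:
//     trap_if sp < stack_limit        // pre-check, only if stack_size >= 32 KiB
//     scratch := stack_limit + stack_size
//     trap_if sp < scratch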

/// Generates the instructions necessary for the `gv` to be materialized into a
/// register.
///
/// This function will return a register that will contain the result of
/// evaluating `gv`. It will also return any instructions necessary to calculate
/// the value of the register.
///
/// Note that global values are typically lowered to instructions via the
/// standard legalization pass. Unfortunately though prologue generation happens
/// so late in the pipeline that we can't use these legalization passes to
/// generate the instructions for `gv`. As a result we duplicate some lowering
/// of `gv` here and support only some global values. This is similar to what
/// the x86 backend does for now, and hopefully this can be somewhat cleaned up
/// in the future too!
///
/// Also note that this function will make use of `writable_spilltmp_reg()` as a
/// temporary register to store values in if necessary. Currently after we write
/// to this register there's guaranteed to be no spilled values between where
/// it's written and where it's used, because we're not participating in
/// register allocation anyway!
fn gen_stack_limit<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
) -> (Reg, SmallInstVec<M::I>) {
    let mut insts = smallvec![];
    let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
    (reg, insts)
}

fn generate_gv<M: ABIMachineSpec>(
    f: &ir::Function,
    sigs: &SigSet,
    sig: Sig,
    gv: ir::GlobalValue,
    insts: &mut SmallInstVec<M::I>,
) -> Reg {
    match f.global_values[gv] {
        // Return the direct register the vmcontext is in.
        ir::GlobalValueData::VMContext => {
            get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
                .expect("no vmcontext parameter found")
        }
        // Load our base value into a register, then load from that register
        // into a temporary register.
        ir::GlobalValueData::Load {
            base,
            offset,
            global_type: _,
            flags: _,
        } => {
            let base = generate_gv::<M>(f, sigs, sig, base, insts);
            let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
            insts.push(M::gen_load_base_offset(
                into_reg,
                base,
                offset.into(),
                M::word_type(),
            ));
            return into_reg.to_reg();
        }
        ref other => panic!("global value for stack limit not supported: {other}"),
    }
}

/// Returns true if the signature needs to be legalized.
fn missing_struct_return(sig: &ir::Signature) -> bool {
    sig.uses_special_param(ArgumentPurpose::StructReturn)
        && !sig.uses_special_return(ArgumentPurpose::StructReturn)
}

fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
    // Keep in sync with Callee::new.
    let mut sig = sig.clone();
    if sig.uses_special_return(ArgumentPurpose::StructReturn) {
        panic!("Explicit StructReturn return value not allowed: {sig:?}")
    }
    if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
        if !sig.returns.is_empty() {
            panic!("No return values are allowed when using StructReturn: {sig:?}");
        }
        sig.returns.insert(0, sig.params[struct_ret_index]);
    }
    sig
}
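
// For example (a sketch in CLIF-like signature notation): legalizing
//
//     fn(sret i64, i32) -> ()
//
// yields
//
//     fn(sret i64, i32) -> (sret i64)
//
// i.e. the struct-return pointer parameter is now also returned, which is
// the condition `missing_struct_return` above checks for.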

/// ### Pre-Regalloc Functions
///
/// These methods of `Callee` may only be called before regalloc.
impl<M: ABIMachineSpec> Callee<M> {
    /// Access the (possibly legalized) signature.
    pub fn signature(&self) -> &ir::Signature {
        debug_assert!(
            !missing_struct_return(&self.ir_sig),
            "`Callee::ir_sig` is always legalized"
        );
        &self.ir_sig
    }

    /// Initialize. This is called after the Callee is constructed because it
    /// may allocate a temp vreg, which can only be allocated once the lowering
    /// context exists.
    pub fn init_retval_area(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> CodegenResult<()> {
        if sigs[self.sig].stack_ret_arg.is_some() {
            let ret_area_ptr = vregs.alloc(M::word_type())?;
            self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
        }
        Ok(())
    }

    /// Get the return area pointer register, if any.
    pub fn ret_area_ptr(&self) -> Option<Reg> {
        self.ret_area_ptr
    }

    /// Accumulate outgoing arguments.
    ///
    /// This ensures that at least `size` bytes are allocated in the prologue to
    /// be available for use in function calls to hold arguments and/or return
    /// values. If this function is called multiple times, the maximum of all
    /// `size` values will be available.
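    ///
    /// # Example
    ///
    /// A minimal sketch (the sizes are hypothetical) of how successive calls
    /// keep the running maximum:
    ///
    /// ```ignore
    /// callee.accumulate_outgoing_args_size(16); // a call needing 16 bytes
    /// callee.accumulate_outgoing_args_size(32); // a larger call: area grows to 32
    /// callee.accumulate_outgoing_args_size(8);  // no effect: 32 remains the max
    /// ```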
    pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
        if size > self.outgoing_args_size {
            self.outgoing_args_size = size;
        }
    }

    /// Accumulate the incoming argument area size requirements for a tail call,
    /// as it could be larger than the incoming arguments of the function
    /// currently being compiled.
    pub fn accumulate_tail_args_size(&mut self, size: u32) {
        if size > self.tail_args_size {
            self.tail_args_size = size;
        }
    }

    pub fn is_forward_edge_cfi_enabled(&self) -> bool {
        self.isa_flags.is_forward_edge_cfi_enabled()
    }

    /// Get the calling convention implemented by this ABI object.
    pub fn call_conv(&self) -> isa::CallConv {
        self.call_conv
    }

    /// Get the ABI-dependent MachineEnv for managing register allocation.
    pub fn machine_env(&self) -> &MachineEnv {
        M::get_machine_env(&self.flags, self.call_conv)
    }

    /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
    pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
        &self.sized_stackslots
    }

    /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
    pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
        &self.dynamic_stackslots
    }

    /// Generate instructions that copy an argument into its destination
    /// registers.
    pub fn gen_copy_arg_to_regs(
        &mut self,
        sigs: &SigSet,
        idx: usize,
        into_regs: ValueRegs<Writable<Reg>>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> SmallInstVec<M::I> {
        let mut insts = smallvec![];
        let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
            match slot {
                &ABIArgSlot::Reg { reg, .. } => {
                    // Add a preg -> def pair to the eventual `args`
                    // instruction. Extension mode doesn't matter
                    // (we're copying out, not in; we ignore high bits
                    // by convention).
                    let arg = ArgPair {
                        vreg: *into_reg,
                        preg: reg.into(),
                    };
                    self.reg_args.push(arg);
                }
                &ABIArgSlot::Stack {
                    offset,
                    ty,
                    extension,
                    ..
                } => {
                    // However, we have to respect the extension mode for stack
                    // slots, or else we grab the wrong bytes on big-endian.
                    let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                    let ty =
                        if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
                            M::word_type()
                        } else {
                            ty
                        };
                    insts.push(M::gen_load_stack(
                        StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                        *into_reg,
                        ty,
                    ));
                }
            }
        };

        match &sigs.args(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(into_regs.len(), slots.len());
                for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
                    copy_arg_slot_to_reg(&slot, &into_reg);
                }
            }
            &ABIArg::StructArg { offset, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // Buffer address is implicitly defined by the ABI.
                insts.push(M::gen_get_stack_addr(
                    StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                    into_reg,
                ));
            }
            &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
                let into_reg = into_regs.only_reg().unwrap();
                // We need to dereference the pointer.
                let base = match &pointer {
                    &ABIArgSlot::Reg { reg, ty, .. } => {
                        let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
                        self.reg_args.push(ArgPair {
                            vreg: Writable::from_reg(tmp),
                            preg: reg.into(),
                        });
                        tmp
                    }
                    &ABIArgSlot::Stack { offset, ty, .. } => {
                        let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
                            .only_reg()
                            .unwrap();
                        insts.push(M::gen_load_stack(
                            StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
                            addr_reg,
                            ty,
                        ));
                        addr_reg.to_reg()
                    }
                };
                insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
            }
        }
        insts
    }

    /// Generate instructions that copy source registers to a return-value
    /// slot.
    pub fn gen_copy_regs_to_retval(
        &self,
        sigs: &SigSet,
        idx: usize,
        from_regs: ValueRegs<Reg>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
        let mut reg_pairs = smallvec![];
        let mut ret = smallvec![];
        let word_bits = M::word_bits() as u8;
        match &sigs.rets(self.sig)[idx] {
            &ABIArg::Slots { ref slots, .. } => {
                assert_eq!(from_regs.len(), slots.len());
                for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                    match slot {
                        &ABIArgSlot::Reg {
                            reg, ty, extension, ..
                        } => {
                            let from_bits = ty_bits(ty) as u8;
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            let vreg = match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    dst.to_reg()
                                }
                                _ => {
                                    // No move needed; regalloc2 will emit it using the constraint
                                    // added by the RetPair.
                                    from_reg
                                }
                            };
                            reg_pairs.push(RetPair {
                                vreg,
                                preg: Reg::from(reg),
                            });
                        }
                        &ABIArgSlot::Stack {
                            offset,
                            ty,
                            extension,
                            ..
                        } => {
                            let mut ty = ty;
                            let from_bits = ty_bits(ty) as u8;
                            // A machine ABI implementation should ensure that stack frames
                            // have "reasonable" size. All current ABIs for machinst
                            // backends (aarch64 and x64) enforce a 128MB limit.
                            let off = i32::try_from(offset).expect(
                                "Argument stack offset greater than 2GB; should hit impl limit first",
                            );
                            let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
                            // Trash the from_reg; it should be its last use.
                            match (ext, from_bits) {
                                (ir::ArgumentExtension::Uext, n)
                                | (ir::ArgumentExtension::Sext, n)
                                    if n < word_bits =>
                                {
                                    assert_eq!(M::word_reg_class(), from_reg.class());
                                    let signed = ext == ir::ArgumentExtension::Sext;
                                    let dst =
                                        writable_value_regs(vregs.alloc_with_deferred_error(ty))
                                            .only_reg()
                                            .unwrap();
                                    ret.push(M::gen_extend(
                                        dst, from_reg, signed, from_bits,
                                        /* to_bits = */ word_bits,
                                    ));
                                    // Store the extended version.
                                    ty = M::word_type();
                                }
                                _ => {}
                            };
                            ret.push(M::gen_store_base_offset(
                                self.ret_area_ptr.unwrap(),
                                off,
                                from_reg,
                                ty,
                            ));
                        }
                    }
                }
            }
            ABIArg::StructArg { .. } => {
                panic!("StructArg in return position is unsupported");
            }
            ABIArg::ImplicitPtrArg { .. } => {
                panic!("ImplicitPtrArg in return position is unsupported");
            }
        }
        (reg_pairs, ret)
    }

    /// Generate any setup instruction needed to save values to the
    /// return-value area. This is usually used when there are multiple return
    /// values or an otherwise large return value that must be passed on the
    /// stack; typically the ABI specifies an extra hidden argument that is a
    /// pointer to that memory.
    pub fn gen_retval_area_setup(
        &mut self,
        sigs: &SigSet,
        vregs: &mut VRegAllocator<M::I>,
    ) -> Option<M::I> {
        if let Some(i) = sigs[self.sig].stack_ret_arg {
            let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
            let insts =
                self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
            insts.into_iter().next().map(|inst| {
                trace!(
                    "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
                    inst,
                    ret_area_ptr.to_reg()
                );
                inst
            })
        } else {
            trace!("gen_retval_area_setup: not needed");
            None
        }
    }

    /// Generate a return instruction.
    pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
        M::gen_rets(rets)
    }

    /// Set up argument values `args` for a call with signature `sig`.
    /// This will return a series of instructions to be emitted to set
    /// up all arguments, as well as a `CallArgList` list representing
    /// the arguments passed in registers. The latter need to be added
    /// as constraints to the actual call instruction.
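    ///
    /// ```ignore
    /// // A hedged sketch of a typical call-lowering sequence; the names
    /// // (`callee`, `ctx`, `args`) are illustrative, not an exact call site
    /// // in any backend:
    /// let (uses, arg_insts) = callee.gen_call_args(sigs, sig, &args, false, flags, vregs);
    /// for inst in arg_insts {
    ///     ctx.emit(inst);
    /// }
    /// // `uses` is then attached to the call instruction as its register
    /// // argument constraints.
    /// ```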
    pub fn gen_call_args(
        &self,
        sigs: &SigSet,
        sig: Sig,
        args: &[ValueRegs<Reg>],
        is_tail_call: bool,
        flags: &settings::Flags,
        vregs: &mut VRegAllocator<M::I>,
    ) -> (CallArgList, SmallInstVec<M::I>) {
        let mut uses: CallArgList = smallvec![];
        let mut insts = smallvec![];

        assert_eq!(args.len(), sigs.num_args(sig));

        let call_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;
        let stack_arg = |offset| {
            if is_tail_call {
                StackAMode::IncomingArg(offset, stack_arg_space)
            } else {
                StackAMode::OutgoingArg(offset)
            }
        };

        let word_ty = M::word_type();
        let word_rc = M::word_reg_class();
        let word_bits = M::word_bits() as usize;

        if is_tail_call {
            debug_assert_eq!(
                self.call_conv,
                isa::CallConv::Tail,
                "Can only do `return_call`s from within a `tail` calling convention function"
            );
        }

        // Helper to process a single argument slot (register or stack slot).
        // This will either add the register to the `uses` list or write the
        // value to the stack slot in the outgoing argument area (or for tail
        // calls, the incoming argument area).
        let mut process_arg_slot = |insts: &mut SmallInstVec<M::I>, slot, vreg, ty| {
            match &slot {
                &ABIArgSlot::Reg { reg, .. } => {
                    uses.push(CallArgPair {
                        vreg,
                        preg: reg.into(),
                    });
                }
                &ABIArgSlot::Stack { offset, .. } => {
                    insts.push(M::gen_store_stack(stack_arg(offset), vreg, ty));
                }
            };
        };

        // First pass: Handle `StructArg` arguments. These need to be copied
        // into their associated stack buffers. This should happen before any
        // of the other arguments are processed, as the `memcpy` call might
        // clobber registers used by other arguments.
        for (idx, from_regs) in args.iter().enumerate() {
            match &sigs.args(sig)[idx] {
                &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
                &ABIArg::StructArg { offset, size, .. } => {
                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                    insts.push(M::gen_get_stack_addr(
                        stack_arg(offset),
                        Writable::from_reg(tmp),
                    ));
                    insts.extend(M::gen_memcpy(
                        isa::CallConv::for_libcall(flags, call_conv),
                        tmp,
                        from_regs.only_reg().unwrap(),
                        size as usize,
                        |ty| {
                            Writable::from_reg(
                                vregs.alloc_with_deferred_error(ty).only_reg().unwrap(),
                            )
                        },
                    ));
                }
            }
        }

        // Second pass: Handle everything except `StructArg` arguments.
        for (idx, from_regs) in args.iter().enumerate() {
            match sigs.args(sig)[idx] {
                ABIArg::Slots { ref slots, .. } => {
                    assert_eq!(from_regs.len(), slots.len());
                    for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
                        // Load argument slot value from `from_reg`, and perform any zero-
                        // or sign-extension that is required by the ABI.
                        let (ty, extension) = match *slot {
                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
                        };
                        let ext = M::get_ext_mode(call_conv, extension);
                        let (vreg, ty) = if ext != ir::ArgumentExtension::None
                            && ty_bits(ty) < word_bits
                        {
                            assert_eq!(word_rc, from_reg.class());
                            let signed = match ext {
                                ir::ArgumentExtension::Uext => false,
                                ir::ArgumentExtension::Sext => true,
                                _ => unreachable!(),
                            };
                            let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                            insts.push(M::gen_extend(
                                Writable::from_reg(tmp),
                                *from_reg,
                                signed,
                                ty_bits(ty) as u8,
                                word_bits as u8,
                            ));
                            (tmp, word_ty)
                        } else {
                            (*from_reg, ty)
                        };
                        process_arg_slot(&mut insts, *slot, vreg, ty);
                    }
                }
                ABIArg::ImplicitPtrArg {
                    offset,
                    pointer,
                    ty,
                    ..
                } => {
                    let vreg = from_regs.only_reg().unwrap();
                    let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                    insts.push(M::gen_get_stack_addr(
                        stack_arg(offset),
                        Writable::from_reg(tmp),
                    ));
                    insts.push(M::gen_store_base_offset(tmp, 0, vreg, ty));
                    process_arg_slot(&mut insts, pointer, tmp, word_ty);
                }
                ABIArg::StructArg { .. } => {}
            }
        }

        // Finally, set the stack-return pointer to the return argument area.
        // For tail calls, this means forwarding the incoming stack-return pointer.
        if let Some(ret_arg) = sigs.get_ret_arg(sig) {
            let ret_area = if is_tail_call {
                self.ret_area_ptr.expect(
                    "if the tail callee has a return pointer, then the tail caller must as well",
                )
            } else {
                let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
                let amode = StackAMode::OutgoingArg(stack_arg_space.into());
                insts.push(M::gen_get_stack_addr(amode, Writable::from_reg(tmp)));
                tmp
            };
            match ret_arg {
                // The return pointer must occupy a single slot.
                ABIArg::Slots { slots, .. } => {
                    assert_eq!(slots.len(), 1);
                    process_arg_slot(&mut insts, slots[0], ret_area, word_ty);
                }
                _ => unreachable!(),
            }
        }

        (uses, insts)
    }

    /// Set up return values `outputs` for a call with signature `sig`.
    /// This does not emit (or return) any instructions, but returns a
    /// `CallRetList` representing the return value constraints. This
    /// needs to be added to the actual call instruction.
    ///
    /// If `try_call_payloads` is present, it is expected to hold
    /// exception payload registers for try_call instructions. These
    /// will be added as needed to the `CallRetList` as well.
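    ///
    /// ```ignore
    /// // A hedged sketch (hypothetical names), pairing with `gen_call_args`:
    /// let defs = callee.gen_call_rets(sigs, sig, &outputs, None, vregs);
    /// // `defs` is attached to the call instruction as its return-value
    /// // constraints; stack-carried returns are loaded after the call.
    /// ```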
    pub fn gen_call_rets(
        &self,
        sigs: &SigSet,
        sig: Sig,
        outputs: &[ValueRegs<Reg>],
        try_call_payloads: Option<&[Writable<Reg>]>,
        vregs: &mut VRegAllocator<M::I>,
    ) -> CallRetList {
        let callee_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;

        let word_ty = M::word_type();
        let word_bits = M::word_bits() as usize;

        let mut defs: CallRetList = smallvec![];
        let mut outputs = outputs.into_iter();
        let num_rets = sigs.num_rets(sig);
        for idx in 0..num_rets {
            let ret = sigs.rets(sig)[idx].clone();
            match ret {
                ABIArg::Slots {
                    ref slots, purpose, ..
                } => {
                    // We do not use the returned copy of the return buffer pointer,
                    // so skip any StructReturn returns that may be present.
                    if purpose == ArgumentPurpose::StructReturn {
                        continue;
                    }
                    let retval_regs = outputs.next().unwrap();
                    assert_eq!(retval_regs.len(), slots.len());
                    for (slot, retval_reg) in slots.iter().zip(retval_regs.regs().iter()) {
                        // We do not perform any extension because we're copying out, not in,
                        // and we ignore high bits in our own registers by convention. However,
                        // we still need to use the proper extended type to access stack slots
                        // (this is critical on big-endian systems).
                        let (ty, extension) = match *slot {
                            ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
                            ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
                        };
                        let ext = M::get_ext_mode(callee_conv, extension);
                        let ty = if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
                            word_ty
                        } else {
                            ty
                        };

                        match slot {
                            &ABIArgSlot::Reg { reg, .. } => {
                                defs.push(CallRetPair {
                                    vreg: Writable::from_reg(*retval_reg),
                                    location: RetLocation::Reg(reg.into(), ty),
                                });
                            }
                            &ABIArgSlot::Stack { offset, .. } => {
                                let amode =
                                    StackAMode::OutgoingArg(offset + i64::from(stack_arg_space));
                                defs.push(CallRetPair {
                                    vreg: Writable::from_reg(*retval_reg),
                                    location: RetLocation::Stack(amode, ty),
                                });
                            }
                        }
                    }
                }
                ABIArg::StructArg { .. } => {
                    panic!("StructArg not supported in return position");
                }
                ABIArg::ImplicitPtrArg { .. } => {
                    panic!("ImplicitPtrArg not supported in return position");
                }
            }
        }
        assert!(outputs.next().is_none());

        if let Some(try_call_payloads) = try_call_payloads {
            // Let `M` say where the payload values are going to end up and then
            // double-check it's the same size as the calling convention's
            // reported number of exception types.
            let pregs = M::exception_payload_regs(callee_conv);
            assert_eq!(
                callee_conv.exception_payload_types(M::word_type()).len(),
                pregs.len()
            );

            // We need to update `defs` to contain the exception
            // payload regs as well. We have two sources of info that
            // we join:
            //
            // - The machine-specific ABI implementation `M`, which
            //   tells us the particular registers that payload values
            //   must be in
            // - The passed-in lowering context, which gives us the
            //   vregs we must define.
            //
            // Note that payload values may need to end up in the same
            // physical registers as ordinary return values; this is
            // not a conflict, because we either get one or the
            // other. For regalloc's purposes, we define both starting
            // here at the callsite, but we can share one def in the
            // `defs` list and alias one vreg to another. Thus we
            // handle the two cases below for each payload register:
            // overlaps a return value (and we alias to it) or not
            // (and we add a def).
            for (i, &preg) in pregs.iter().enumerate() {
                let vreg = try_call_payloads[i];
                if let Some(existing) = defs.iter().find(|def| match def.location {
                    RetLocation::Reg(r, _) => r == preg,
                    _ => false,
                }) {
                    vregs.set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());
                } else {
                    defs.push(CallRetPair {
                        vreg,
                        location: RetLocation::Reg(preg, word_ty),
                    });
                }
            }
        }

        defs
    }

    /// Populate a `CallInfo` for a call with signature `sig`.
    ///
    /// - `dest` is the target-specific call destination value.
    /// - `uses` is the `CallArgList` describing argument constraints.
    /// - `defs` is the `CallRetList` describing return constraints.
    /// - `try_call_info` describes exception targets for try_call instructions.
    /// - `patchable` describes whether this callsite should emit metadata
    ///   for patching to enable/disable it.
    ///
    /// The clobber list is computed here from the above data.
    pub fn gen_call_info<T>(
        &self,
        sigs: &SigSet,
        sig: Sig,
        dest: T,
        uses: CallArgList,
        defs: CallRetList,
        try_call_info: Option<TryCallInfo>,
        patchable: bool,
    ) -> CallInfo<T> {
        let caller_conv = self.call_conv;
        let callee_conv = sigs[sig].call_conv;
        let stack_arg_space = sigs[sig].sized_stack_arg_space;

        let clobbers = {
            // Get clobbers: all caller-saves. These may include return value
            // regs, which we will remove from the clobber set below.
            let mut clobbers =
                <M>::get_regs_clobbered_by_call(callee_conv, try_call_info.is_some());

            // Remove retval regs from clobbers.
            for def in &defs {
                if let RetLocation::Reg(preg, _) = def.location {
                    clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));
                }
            }

            clobbers
        };

        // Any adjustment to SP to account for required outgoing arguments/stack return values must
        // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
        // state for all other instructions. For example, if a tail-call abi function is called
        // here, the reclamation of the outgoing argument area must be done inside of the call
        // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
        // function. (Except the prologue and epilogue, but those are fairly special parts of the
        // function that establish the SP invariants that are relied on elsewhere and are generated
        // after the register allocator has run and thus cannot have register allocator-inserted
        // references to SP offsets.)

        let callee_pop_size = if callee_conv == isa::CallConv::Tail {
            // The tail calling convention has callees pop stack arguments.
            stack_arg_space
        } else {
            0
        };

        CallInfo {
            dest,
            uses,
            defs,
            clobbers,
            callee_conv,
            caller_conv,
            callee_pop_size,
            try_call_info,
            patchable,
        }
    }

    /// Get the raw offset of a sized stackslot in the slot region.
    pub fn sized_stackslot_offset(&self, slot: StackSlot) -> u32 {
        self.sized_stackslots[slot]
    }

    /// Produce an instruction that computes a sized stackslot address.
    pub fn sized_stackslot_addr(
        &self,
        slot: StackSlot,
        offset: u32,
        into_reg: Writable<Reg>,
    ) -> M::I {
        // Offset from beginning of stackslot area.
        let stack_off = self.sized_stackslots[slot] as i64;
        let sp_off: i64 = stack_off + (offset as i64);
        M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
    }

    /// Produce an instruction that computes a dynamic stackslot address.
    pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
        let stack_off = self.dynamic_stackslots[slot] as i64;
        M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
    }

    /// Get an `args` pseudo-inst, if any, that should appear at the
    /// very top of the function body prior to regalloc.
    pub fn take_args(&mut self) -> Option<M::I> {
        if self.reg_args.len() > 0 {
            // Very first instruction is an `args` pseudo-inst that
            // establishes live-ranges for in-register arguments and
            // constrains them at the start of the function to the
            // locations defined by the ABI.
            Some(M::gen_args(core::mem::take(&mut self.reg_args)))
        } else {
            None
        }
    }
}

/// ### Post-Regalloc Functions
///
/// These methods of `Callee` may only be called after
/// regalloc.
impl<M: ABIMachineSpec> Callee<M> {
    /// Compute the final frame layout, post-regalloc.
    ///
    /// This must be called before gen_prologue or gen_epilogue.
    pub fn compute_frame_layout(
        &mut self,
        sigs: &SigSet,
        spillslots: usize,
        clobbered: Vec<Writable<RealReg>>,
        function_calls: FunctionCalls,
    ) {
        let bytes = M::word_bytes();
        let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
        let mask = M::stack_align(self.call_conv) - 1;
        let total_stacksize = (total_stacksize + mask) & !mask; // Align to the ABI stack alignment.
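        // E.g., with a 16-byte stack alignment: mask = 15, so a raw size of 40
        // rounds up to (40 + 15) & !15 = 48. (The sizes here are hypothetical.)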
        self.frame_layout = Some(M::compute_frame_layout(
            self.call_conv,
            &self.flags,
            self.signature(),
            &clobbered,
            function_calls,
            self.stack_args_size(sigs),
            self.tail_args_size,
            self.stackslots_size,
            total_stacksize,
            self.outgoing_args_size,
        ));
    }

    /// Generate a prologue, post-regalloc.
    ///
    /// This should include any stack frame or other setup necessary to use the
    /// other methods (`load_arg`, `store_retval`, and spillslot accesses).
    pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
        let frame_layout = self.frame_layout();
        let mut insts = smallvec![];

        // Set up frame.
        insts.extend(M::gen_prologue_frame_setup(
            self.call_conv,
            &self.flags,
            &self.isa_flags,
            &frame_layout,
        ));

        // The stack limit check needs to cover all the stack adjustments we
        // might make, up to the next stack limit check in any function we
        // call. Since this happens after frame setup, the current function's
        // setup area needs to be accounted for in the caller's stack limit
        // check, but we need to account for any setup area that our callees
        // might need. Note that s390x may also use the outgoing args area for
        // backtrace support even in leaf functions, so that should be accounted
        // for unconditionally.
        let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
            + frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.outgoing_args_size
            + if frame_layout.function_calls == FunctionCalls::None {
                0
            } else {
                frame_layout.setup_area_size
            };

        // Leaf functions with zero stack don't need a stack check even if one
        // is specified; otherwise, always insert the stack check.
        if total_stacksize > 0 || frame_layout.function_calls != FunctionCalls::None {
            if let Some((reg, stack_limit_load)) = &self.stack_limit {
                insts.extend(stack_limit_load.clone());
                self.insert_stack_check(*reg, total_stacksize, &mut insts);
            }

            if self.flags.enable_probestack() {
                let guard_size = 1 << self.flags.probestack_size_log2();
                match self.flags.probestack_strategy() {
                    ProbestackStrategy::Inline => M::gen_inline_probestack(
                        &mut insts,
                        self.call_conv,
                        total_stacksize,
                        guard_size,
                    ),
                    ProbestackStrategy::Outline => {
                        if total_stacksize >= guard_size {
                            M::gen_probestack(&mut insts, total_stacksize);
                        }
                    }
                }
            }
        }

        // Save clobbered registers.
        insts.extend(M::gen_clobber_save(
            self.call_conv,
            &self.flags,
            &frame_layout,
        ));

        insts
    }

    /// Generate an epilogue, post-regalloc.
    ///
    /// Note that this must generate the actual return instruction (rather than
    /// emitting this in the lowering logic), because the epilogue code comes
    /// before the return and the two are likely closely related.
    pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
        let frame_layout = self.frame_layout();
        let mut insts = smallvec![];

        // Restore clobbered registers.
        insts.extend(M::gen_clobber_restore(
            self.call_conv,
            &self.flags,
            &frame_layout,
        ));

        // Tear down frame.
        insts.extend(M::gen_epilogue_frame_restore(
            self.call_conv,
            &self.flags,
            &self.isa_flags,
            &frame_layout,
        ));

        // And return.
        insts.extend(M::gen_return(
            self.call_conv,
            &self.isa_flags,
            &frame_layout,
        ));

        trace!("Epilogue: {:?}", insts);
        insts
    }

    /// Return a reference to the computed frame layout information. This
    /// function will panic if it's called before [`Self::compute_frame_layout`].
    pub fn frame_layout(&self) -> &FrameLayout {
        self.frame_layout
            .as_ref()
            .expect("frame layout not computed before prologue generation")
    }

    /// Returns the offset from SP to FP for the given function, after
    /// the prologue has set up the frame. This comprises the spill
    /// slots and stack-storage slots as well as storage for clobbered
    /// callee-save registers and outgoing arguments at callsites
    /// (space for which is reserved during frame setup).
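    ///
    /// For illustration (hypothetical sizes): with 32 bytes of clobber saves,
    /// 64 bytes of fixed frame storage, and 16 bytes of outgoing-argument
    /// space, SP sits 32 + 64 + 16 = 112 bytes below FP.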
    pub fn sp_to_fp_offset(&self) -> u32 {
        let frame_layout = self.frame_layout();
        frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.outgoing_args_size
    }

    /// Returns the offset from the slot base in the current frame to the caller's SP.
    pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
        // Note: this looks very similar to `frame_size()` above, but
        // it differs in both endpoints: it measures from the bottom
        // of stackslots, excluding outgoing args; and it includes the
        // setup area (FP/LR) size and any extra tail-args space.
        let frame_layout = self.frame_layout();
        frame_layout.clobber_size
            + frame_layout.fixed_frame_storage_size
            + frame_layout.setup_area_size
            + (frame_layout.tail_args_size - frame_layout.incoming_args_size)
    }

    /// Returns the size of arguments expected on the stack.
    pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
        sigs[self.sig].sized_stack_arg_space
    }

    /// Get the spill-slot size.
    pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
        let max = if self.dynamic_type_sizes.is_empty() {
            16
        } else {
            *self
                .dynamic_type_sizes
                .iter()
                .max_by(|x, y| x.1.cmp(&y.1))
                .map(|(_k, v)| v)
                .unwrap()
        };
        M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
    }

    /// Get the spill slot offset relative to the fixed allocation area start.
    pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
        self.frame_layout().spillslot_offset(slot)
    }

    /// Generate a spill.
    pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
        let ty = M::I::canonical_type_for_rc(from_reg.class());
        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);

        let sp_off = self.get_spillslot_offset(to_slot);
        trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");

        let from = StackAMode::Slot(sp_off);
        <M>::gen_store_stack(from, Reg::from(from_reg), ty)
    }

    /// Generate a reload (fill).
    pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
        let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
        debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);

        let sp_off = self.get_spillslot_offset(from_slot);
        trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");

        let from = StackAMode::Slot(sp_off);
        <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
    }

    /// Provide metadata to be emitted alongside machine code.
    ///
    /// This metadata describes the frame layout sufficiently to find
    /// stack slots, so that runtimes and unwinders can observe state
    /// set up by compiled code in stackslots allocated for that
    /// purpose.
    pub fn frame_slot_metadata(&self) -> MachBufferFrameLayout {
        let frame_to_fp_offset = self.sp_to_fp_offset();
        let mut stackslots = SecondaryMap::with_capacity(self.sized_stackslots.len());
        let storage_area_base = self.frame_layout().outgoing_args_size;
        for (slot, storage_area_offset) in &self.sized_stackslots {
            stackslots[slot] = MachBufferStackSlot {
                offset: storage_area_base.checked_add(*storage_area_offset).unwrap(),
                key: self.sized_stackslot_keys[slot],
            };
        }
        MachBufferFrameLayout {
            frame_to_fp_offset,
            stackslots,
        }
    }
}

/// An input argument to a call instruction: the vreg that is used,
/// and the preg it is constrained to (per the ABI).
#[derive(Clone, Debug)]
pub struct CallArgPair {
    /// The virtual register to use for the argument.
    pub vreg: Reg,
    /// The real register into which the arg goes.
    pub preg: Reg,
}

/// An output return value from a call instruction: the vreg that is
/// defined, and the preg or stack location it is constrained to (per
/// the ABI).
#[derive(Clone, Debug)]
pub struct CallRetPair {
    /// The virtual register to define from this return value.
    pub vreg: Writable<Reg>,
    /// The real register or stack location from which the return value is read.
    pub location: RetLocation,
}

/// A location to load a return-value from after a call completes.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RetLocation {
    /// A physical register.
    Reg(Reg, Type),
    /// A stack location, identified by a `StackAMode`.
    Stack(StackAMode, Type),
}

pub type CallArgList = SmallVec<[CallArgPair; 8]>;
pub type CallRetList = SmallVec<[CallRetPair; 8]>;

impl<T> CallInfo<T> {
    /// Emit loads for any stack-carried return values using the call
    /// info and allocations.
    pub fn emit_retval_loads<
        M: ABIMachineSpec,
        EmitFn: FnMut(M::I),
        IslandFn: Fn(u32) -> Option<M::I>,
    >(
        &self,
        stackslots_size: u32,
        mut emit: EmitFn,
        emit_island: IslandFn,
    ) {
        // Count stack-ret locations and emit an island to account for
        // this space usage.
        let mut space_needed = 0;
        for CallRetPair { location, .. } in &self.defs {
            if let RetLocation::Stack(..) = location {
                // Assume up to ten instructions, semi-arbitrarily:
                // load from stack, store to spillslot, codegen of
                // large offsets on RISC ISAs.
                space_needed += 10 * M::I::worst_case_size();
            }
        }
        if space_needed > 0 {
            if let Some(island_inst) = emit_island(space_needed) {
                emit(island_inst);
            }
        }

        let temp = M::retval_temp_reg(self.callee_conv);
        // The temporary must be noted as clobbered unless there are
        // no returns (hence it isn't needed). The latter can only be
        // the case statically for an ABI when the ABI doesn't allow
        // any returns at all (e.g., preserve-all ABI).
        debug_assert!(
            self.defs.is_empty()
                || M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
                    .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
        );

        for CallRetPair { vreg, location } in &self.defs {
            match location {
                RetLocation::Reg(preg, ..) => {
                    // The temporary must not also be an actual return
                    // value register.
                    debug_assert!(*preg != temp.to_reg());
                }
                RetLocation::Stack(amode, ty) => {
                    if let Some(spillslot) = vreg.to_reg().to_spillslot() {
                        // `temp` is an integer register of machine word
                        // width, but `ty` may be floating-point/vector,
                        // which (i) may not be loadable directly into an
                        // int reg, and (ii) may be wider than a machine
                        // word. For simplicity, and because there are not
                        // always easy choices for volatile float/vec regs
                        // (see e.g. x86-64, where fastcall clobbers only
                        // xmm0-xmm5, but tail uses xmm0-xmm7 for
                        // returns), we use the integer temp register in
                        // steps.
                        let parts = (ty.bytes() + M::word_bytes() - 1) / M::word_bytes();
                        let one_part_load_ty =
                            Type::int_with_byte_size(M::word_bytes().min(ty.bytes()) as u16)
                                .unwrap();
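                        // For illustration (a hypothetical case): a 16-byte
                        // vector return on a 64-bit target gives parts = 2 and
                        // one_part_load_ty = I64, so the value moves through
                        // `temp` in two word-sized load/store steps.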
                        for part in 0..parts {
                            emit(M::gen_load_stack(
                                amode.offset_by(part * M::word_bytes()),
                                temp,
                                one_part_load_ty,
                            ));
                            emit(M::gen_store_stack(
                                StackAMode::Slot(
                                    i64::from(stackslots_size)
                                        + i64::from(M::word_bytes())
                                            * ((spillslot.index() as i64) + (part as i64)),
                                ),
                                temp.to_reg(),
                                M::word_type(),
                            ));
                        }
                    } else {
                        assert_ne!(*vreg, temp);
                        emit(M::gen_load_stack(*amode, *vreg, *ty));
                    }
                }
            }
        }
    }
}

impl TryCallInfo {
    pub(crate) fn exception_handlers(
        &self,
        layout: &FrameLayout,
    ) -> impl Iterator<Item = MachExceptionHandler> {
        self.exception_handlers.iter().map(|handler| match handler {
            TryCallHandler::Tag(tag, label) => MachExceptionHandler::Tag(*tag, *label),
            TryCallHandler::Default(label) => MachExceptionHandler::Default(*label),
            TryCallHandler::Context(reg) => {
                let loc = if let Some(spillslot) = reg.to_spillslot() {
                    // The spillslot offset is relative to the "fixed
                    // storage area", which comes after outgoing args.
                    let offset = layout.spillslot_offset(spillslot)
                        + i64::from(layout.outgoing_args_size);
                    ExceptionContextLoc::SPOffset(
                        u32::try_from(offset)
                            .expect("SP offset cannot be negative or larger than 4GiB"),
                    )
                } else if let Some(realreg) = reg.to_real_reg() {
                    ExceptionContextLoc::GPR(realreg.hw_enc())
                } else {
                    panic!("Virtual register present in try-call handler clause after register allocation");
                };
                MachExceptionHandler::Context(loc)
            }
        })
    }

    pub(crate) fn pretty_print_dests(&self) -> String {
        self.exception_handlers
            .iter()
            .map(|handler| match handler {
                TryCallHandler::Tag(tag, label) => format!("{tag:?}: {label:?}"),
                TryCallHandler::Default(label) => format!("default: {label:?}"),
                TryCallHandler::Context(loc) => format!("context {loc:?}"),
            })
            .collect::<Vec<_>>()
            .join(", ")
    }

    pub(crate) fn collect_operands(&mut self, collector: &mut impl OperandVisitor) {
        for handler in &mut self.exception_handlers {
            match handler {
                TryCallHandler::Context(ctx) => {
                    collector.any_late_use(ctx);
                }
                TryCallHandler::Tag(_, _) | TryCallHandler::Default(_) => {}
            }
        }
    }
}

#[cfg(test)]
mod tests {
    use super::SigData;

    #[test]
    fn sig_data_size() {
        // The size of `SigData` is performance sensitive, so make sure
        // we don't regress it unintentionally.
        assert_eq!(core::mem::size_of::<SigData>(), 24);
    }
}