//! Identification and creation of fused adapter modules in Wasmtime.
//!
//! A major piece of the component model is the ability for core wasm modules to
//! talk to each other through the use of lifted and lowered functions. For
//! example one core wasm module can export a function which is lifted. Another
//! component could import that lifted function, lower it, and pass it as the
//! import to another core wasm module. This is what Wasmtime calls "adapter
//! fusion" where two core wasm functions are coming together through the
//! component model.
//!
//! There are a few ingredients during adapter fusion:
//!
//! * A core wasm function which is "lifted".
//! * A "lift type" which is the type that the component model function had in
//!   the original component
//! * A "lower type" which is the type that the component model function has
//!   in the destination component (the one that uses `canon lower`)
//! * Configuration options for both the lift and the lower operations such as
//!   memories, reallocs, etc.
//!
//! With these ingredients combined Wasmtime must produce a function which
//! connects the two components through the options specified. The fused adapter
//! performs tasks such as validation of passed values, copying data between
//! linear memories, etc.
//!
//! Wasmtime's current implementation of fused adapters is designed to reduce
//! complexity elsewhere as much as possible while also being suitable for being
//! used as a polyfill for the component model in JS environments as well. To
//! that end Wasmtime implements a fused adapter with another wasm module that
//! it itself generates on the fly. The usage of WebAssembly for fused adapters
//! has a number of advantages:
//!
//! * There is no need to create a raw Cranelift-based compiler. This is where
//!   the majority of "unsafety" lives in Wasmtime so reducing the need to lean
//!   on this or audit another compiler is predicted to weed out a whole class
//!   of bugs in the fused adapter compiler.
//!
//! * As mentioned above generation of WebAssembly modules means that this is
//!   suitable for use in JS environments. For example a hypothetical tool which
//!   polyfills a component onto the web today would need to do something for
//!   adapter modules, and ideally the adapters themselves are speedy. While
//!   this could all be written in JS the adapting process is quite nontrivial
//!   so sharing code with Wasmtime would be ideal.
//!
//! * Using WebAssembly insulates the implementation from bugs to a certain
//!   degree. While logic bugs are still possible it should be much more
//!   difficult to have segfaults or things like that. With adapters exclusively
//!   executing inside a WebAssembly sandbox like everything else the failure
//!   modes to the host at least should be minimized.
//!
//! * Integration into the runtime is relatively simple, the adapter modules are
//!   just another kind of wasm module to instantiate and wire up at runtime.
//!   The goal is that the `GlobalInitializer` list that is processed at runtime
//!   will have all of its `Adapter`-using variants erased by the time it makes
//!   its way all the way up to Wasmtime. This means that the support in
//!   Wasmtime prior to adapter modules is actually the same as the support
//!   after adapter modules are added, keeping the runtime fiddly bits quite
//!   minimal.
//!
//! This isn't to say that this approach is without its disadvantages of
//! course. For now though this seems to be a reasonable set of tradeoffs for
//! the development stage of the component model proposal.
//!
//! ## Creating adapter modules
//!
//! With WebAssembly itself being used to implement fused adapters, Wasmtime
//! still has the question of how to organize the adapter functions into actual
//! wasm modules.
//!
//! The first thing you might reach for is to put all the adapters into the same
//! wasm module. This cannot be done, however, because some adapters may depend
//! on other adapters (transitively) to be created. This means that if
//! everything were in the same module there would be no way to instantiate the
//! module. An example of this dependency is an adapter (A) used to create a
//! core wasm instance (M) whose exported memory is then referenced by another
//! adapter (B). In this situation the adapter B cannot be in the same module
//! as adapter A because B needs the memory of M but M is created with A which
//! would otherwise create a circular dependency.
//!
//! The second possibility of organizing adapter modules would be to place each
//! fused adapter into its own module. Each `canon lower` would effectively
//! become a core wasm module instantiation at that point. While this works it's
//! currently believed to be a bit too fine-grained. For example it would mean
//! that importing a dozen lowered functions into a module could possibly result
//! in up to a dozen different adapter modules. While this possibility could
//! work it has been ruled out as "probably too expensive at runtime".
//!
//! Thus the purpose and existence of this module is now evident -- this module
//! exists to identify what exactly goes into which adapter module. This will
//! evaluate the `GlobalInitializer` lists coming out of the `inline` pass and
//! insert `InstantiateModule` entries for where adapter modules should be
//! created.
//!
//! ## Partitioning adapter modules
//!
//! Currently this module does not attempt to be really all that fancy about
//! grouping adapters into adapter modules. The main idea is that most items
//! within an adapter module are likely to be close together since they're
//! theoretically going to be used for an instantiation of a core wasm module
//! just after the fused adapter was declared. With that in mind the current
//! algorithm is a one-pass approach to partitioning everything into adapter
//! modules.
//!
//! Adapters were identified in-order as part of the inlining phase of
//! translation where we're guaranteed that once an adapter is identified
//! it can't depend on anything identified later. The pass implemented here is
//! to visit all transitive dependencies of an adapter. If one of the
//! dependencies of an adapter is an adapter in the current adapter module
//! being built then the current module is finished and a new adapter module is
//! started. This should quickly partition adapters into contiguous chunks of
//! their index space which can be in adapter modules together.
//!
//! There are probably more general algorithms for this but for now this should
//! be fast enough as it's "just" a linear pass. As we get more components over
//! time this may want to be revisited if too many adapter modules are being
//! created.

use crate::EntityType;
use crate::component::translate::*;
use crate::fact;
use std::collections::HashSet;

/// Metadata information about a fused adapter.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct Adapter {
    /// The type used when the original core wasm function was lifted.
    ///
    /// Note that this could be different than `lower_ty` (but still matches
    /// according to subtyping rules).
    pub lift_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lifted.
    pub lift_options: AdapterOptions,
    /// The type used when the function was lowered back into a core wasm
    /// function.
    ///
    /// Note that this could be different than `lift_ty` (but still matches
    /// according to subtyping rules).
    pub lower_ty: TypeFuncIndex,
    /// Canonical ABI options used when the function was lowered.
    pub lower_options: AdapterOptions,
    /// The original core wasm function which was lifted.
    pub func: dfg::CoreDef,
}

/// The data model for objects that are not unboxed in locals.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub enum DataModel {
    /// Data is stored in GC objects.
    Gc {},

    /// Data is stored in a linear memory.
    LinearMemory {
        /// An optional memory definition supplied.
        memory: Option<dfg::CoreExport<MemoryIndex>>,
        /// If `memory` is specified, whether it's a 64-bit memory.
        memory64: bool,
        /// An optional definition of `realloc` to use.
        realloc: Option<dfg::CoreDef>,
    },
}

/// Configuration options which can be specified as part of the canonical ABI
/// in the component model.
#[derive(Debug, Clone, Hash, Eq, PartialEq)]
pub struct AdapterOptions {
    /// The Wasmtime-assigned component instance index where the options were
    /// originally specified.
    pub instance: RuntimeComponentInstanceIndex,
    /// The ancestors (i.e. chain of instantiating instances) of the instance
    /// specified in the `instance` field.
    pub ancestors: Vec<RuntimeComponentInstanceIndex>,
    /// How strings are encoded.
    pub string_encoding: StringEncoding,
    /// The async callback function used by these options, if specified.
    pub callback: Option<dfg::CoreDef>,
    /// An optional definition of a `post-return` to use.
    pub post_return: Option<dfg::CoreDef>,
    /// Whether to use the async ABI for lifting or lowering.
    pub async_: bool,
    /// Whether or not this intrinsic can consume a task cancellation
    /// notification.
    pub cancellable: bool,
    /// The core function type that is being lifted from / lowered to.
    pub core_type: ModuleInternedTypeIndex,
    /// The data model used by this adapter: linear memory or GC objects.
    pub data_model: DataModel,
}

impl<'data> Translator<'_, 'data> {
    /// This is the entrypoint of functionality within this module which
    /// performs all the work of identifying adapter usages and organizing
    /// everything into adapter modules.
    ///
    /// This will mutate the provided `component` in-place and fill out the dfg
    /// metadata for adapter modules.
    pub(super) fn partition_adapter_modules(&mut self, component: &mut dfg::ComponentDfg) {
        // Visit each adapter, in order of its original definition, during the
        // partitioning. This allows for the guarantee that dependencies are
        // visited in a topological fashion ideally.
        let mut state = PartitionAdapterModules::default();
        for (id, adapter) in component.adapters.iter() {
            state.adapter(component, id, adapter);
        }
        state.finish_adapter_module();

        // Now that all adapters have been partitioned into modules this loop
        // generates a core wasm module for each adapter module, translates
        // the module using standard core wasm translation, and then fills out
        // the dfg metadata for each adapter.
        for (module_id, adapter_module) in state.adapter_modules.iter() {
            let mut module = fact::Module::new(self.types.types(), self.tunables);
            let mut names = Vec::with_capacity(adapter_module.adapters.len());
            for adapter in adapter_module.adapters.iter() {
                let name = format!("adapter{}", adapter.as_u32());
                module.adapt(&name, &component.adapters[*adapter]);
                names.push(name);
            }
            let wasm = module.encode();
            let imports = module.imports().to_vec();

            // Extend the lifetime of the owned `wasm: Vec<u8>` on the stack to
            // a higher scope defined by our original caller. That allows
            // transforming `wasm` into `&'data [u8]` which is much easier to
            // work with here.
            let wasm = &*self.scope_vec.push(wasm);
            if log::log_enabled!(log::Level::Trace) {
                match wasmprinter::print_bytes(wasm) {
                    Ok(s) => log::trace!("generated adapter module:\n{s}"),
                    Err(e) => log::trace!("failed to print adapter module: {e}"),
                }
            }

            // With the wasm binary this is then pushed through general
            // translation, validation, etc. Note that multi-memory is
            // specifically enabled here since the adapter module is highly
            // likely to use that if anything is actually indirected through
            // memory.
            self.validator.reset();
            let static_module_index = self.static_modules.next_key();
            let translation = ModuleEnvironment::new(
                self.tunables,
                &mut self.validator,
                self.types.module_types_builder(),
                static_module_index,
            )
            .translate(Parser::new(0), wasm)
            .expect("invalid adapter module generated");

            // Record, for each adapter in this adapter module, the module that
            // the adapter was placed within as well as the function index of
            // the adapter in the wasm module generated. Note that adapters are
            // partitioned in-order so we're guaranteed to push the adapters
            // in-order here as well. (with an assert to double-check)
            for (adapter, name) in adapter_module.adapters.iter().zip(&names) {
                let index = translation.module.exports[name];
                let i = component.adapter_partitionings.push((module_id, index));
                assert_eq!(i, *adapter);
            }

            // Finally the metadata necessary to instantiate this adapter
            // module is also recorded in the dfg. This metadata will be used
            // to generate `GlobalInitializer` entries during the final
            // linearization phase.
            assert_eq!(imports.len(), translation.module.imports().len());
            let args = imports
                .iter()
                .zip(translation.module.imports())
                .map(|(arg, (_, _, ty))| fact_import_to_core_def(component, arg, ty))
                .collect::<Vec<_>>();
            let static_module_index2 = self.static_modules.push(translation);
            assert_eq!(static_module_index, static_module_index2);
            let id = component.adapter_modules.push((static_module_index, args));
            assert_eq!(id, module_id);
        }
    }
}

fn fact_import_to_core_def(
    dfg: &mut dfg::ComponentDfg,
    import: &fact::Import,
    ty: EntityType,
) -> dfg::CoreDef {
    fn unwrap_memory(def: &dfg::CoreDef) -> dfg::CoreExport<MemoryIndex> {
        match def {
            dfg::CoreDef::Export(e) => e.clone().map_index(|i| match i {
                EntityIndex::Memory(i) => i,
                _ => unreachable!(),
            }),
            _ => unreachable!(),
        }
    }

    let mut simple_intrinsic = |trampoline: dfg::Trampoline| {
        let signature = ty.unwrap_func();
        let index = dfg
            .trampolines
            .push((signature.unwrap_module_type_index(), trampoline));
        dfg::CoreDef::Trampoline(index)
    };
    match import {
        fact::Import::CoreDef(def) => def.clone(),
        fact::Import::Transcode {
            op,
            from,
            from64,
            to,
            to64,
        } => {
            let from = dfg.memories.push(unwrap_memory(from));
            let to = dfg.memories.push(unwrap_memory(to));
            let signature = ty.unwrap_func();
            let index = dfg.trampolines.push((
                signature.unwrap_module_type_index(),
                dfg::Trampoline::Transcoder {
                    op: *op,
                    from,
                    from64: *from64,
                    to,
                    to64: *to64,
                },
            ));
            dfg::CoreDef::Trampoline(index)
        }
        fact::Import::ResourceTransferOwn => simple_intrinsic(dfg::Trampoline::ResourceTransferOwn),
        fact::Import::ResourceTransferBorrow => {
            simple_intrinsic(dfg::Trampoline::ResourceTransferBorrow)
        }
        fact::Import::PrepareCall { memory } => simple_intrinsic(dfg::Trampoline::PrepareCall {
            memory: memory.as_ref().map(|v| dfg.memories.push(unwrap_memory(v))),
        }),
        fact::Import::SyncStartCall { callback } => {
            simple_intrinsic(dfg::Trampoline::SyncStartCall {
                callback: callback.clone().map(|v| dfg.callbacks.push(v)),
            })
        }
        fact::Import::AsyncStartCall {
            callback,
            post_return,
        } => simple_intrinsic(dfg::Trampoline::AsyncStartCall {
            callback: callback.clone().map(|v| dfg.callbacks.push(v)),
            post_return: post_return.clone().map(|v| dfg.post_returns.push(v)),
        }),
        fact::Import::FutureTransfer => simple_intrinsic(dfg::Trampoline::FutureTransfer),
        fact::Import::StreamTransfer => simple_intrinsic(dfg::Trampoline::StreamTransfer),
        fact::Import::ErrorContextTransfer => {
            simple_intrinsic(dfg::Trampoline::ErrorContextTransfer)
        }
        fact::Import::Trap => simple_intrinsic(dfg::Trampoline::Trap),
        fact::Import::EnterSyncCall => simple_intrinsic(dfg::Trampoline::EnterSyncCall),
        fact::Import::ExitSyncCall => simple_intrinsic(dfg::Trampoline::ExitSyncCall),
    }
}

#[derive(Default)]
struct PartitionAdapterModules {
    /// The next adapter module that's being created. This may be empty.
    next_module: AdapterModuleInProgress,

    /// The set of items which are known to be defined which the adapter module
    /// in progress is allowed to depend on.
    defined_items: HashSet<Def>,

    /// Finished adapter modules that won't be added to.
    ///
    /// In theory items could be added to preexisting modules here but to keep
    /// this pass linear this is never modified after insertion.
    adapter_modules: PrimaryMap<dfg::AdapterModuleId, AdapterModuleInProgress>,
}

#[derive(Default)]
struct AdapterModuleInProgress {
    /// The adapters which have been placed into this module.
    adapters: Vec<dfg::AdapterId>,
}

/// Items that adapters can depend on.
///
/// Note that this is somewhat of a flat list and is intended to mostly model
/// core wasm instances which are side-effectful unlike other host items like
/// lowerings or always-trapping functions.
#[derive(Copy, Clone, Hash, Eq, PartialEq)]
enum Def {
    Adapter(dfg::AdapterId),
    Instance(dfg::InstanceId),
}

impl PartitionAdapterModules {
    fn adapter(&mut self, dfg: &dfg::ComponentDfg, id: dfg::AdapterId, adapter: &Adapter) {
        // Visit all dependencies of this adapter and if anything depends on
        // the current adapter module in progress then a new adapter module is
        // started.
        self.adapter_options(dfg, &adapter.lift_options);
        self.adapter_options(dfg, &adapter.lower_options);
        self.core_def(dfg, &adapter.func);

        // With all dependencies visited this adapter is added to the next
        // module.
        //
        // This will either get added to the preexisting module if this adapter
        // didn't depend on anything in that module itself or it will be added
        // to a fresh module if this adapter depended on something that the
        // current adapter module created.
        log::debug!("adding {id:?} to adapter module");
        self.next_module.adapters.push(id);
    }

    fn adapter_options(&mut self, dfg: &dfg::ComponentDfg, options: &AdapterOptions) {
        if let Some(def) = &options.callback {
            self.core_def(dfg, def);
        }
        if let Some(def) = &options.post_return {
            self.core_def(dfg, def);
        }
        match &options.data_model {
            DataModel::Gc {} => {
                // Nothing to do here yet.
            }
            DataModel::LinearMemory {
                memory,
                memory64: _,
                realloc,
            } => {
                if let Some(memory) = memory {
                    self.core_export(dfg, memory);
                }
                if let Some(def) = realloc {
                    self.core_def(dfg, def);
                }
            }
        }
    }

    fn core_def(&mut self, dfg: &dfg::ComponentDfg, def: &dfg::CoreDef) {
        match def {
            dfg::CoreDef::Export(e) => self.core_export(dfg, e),
            dfg::CoreDef::Adapter(id) => {
                // If this adapter is already defined then we can safely depend
                // on it with no consequences.
                if self.defined_items.contains(&Def::Adapter(*id)) {
                    log::debug!("using existing adapter {id:?} ");
                    return;
                }

                log::debug!("splitting module needing {id:?} ");

                // .. otherwise we found a case of an adapter depending on an
                // adapter-module-in-progress meaning that the current adapter
                // module must be completed and then a new one is started.
                self.finish_adapter_module();
                assert!(self.defined_items.contains(&Def::Adapter(*id)));
            }

            // These items can't transitively depend on an adapter
            dfg::CoreDef::Trampoline(_)
            | dfg::CoreDef::InstanceFlags(_)
            | dfg::CoreDef::UnsafeIntrinsic(..)
            | dfg::CoreDef::TaskMayBlock => {}
        }
    }

    fn core_export<T>(&mut self, dfg: &dfg::ComponentDfg, export: &dfg::CoreExport<T>) {
        // When an adapter depends on an exported item it actually depends on
        // the instance of that exported item. The caveat here is that the
        // adapter not only depends on that particular instance, but also all
        // prior instances to that instance as well because instance
        // instantiation order is fixed and cannot change.
        //
        // To model this the instance index space is looped over here and while
        // an instance hasn't been visited it's visited. Note that if an
        // instance has already been visited then all prior instances have
        // already been visited so there's no need to continue.
        let mut instance = export.instance;
        while self.defined_items.insert(Def::Instance(instance)) {
            self.instance(dfg, instance);
            if instance.as_u32() == 0 {
                break;
            }
            instance = dfg::InstanceId::from_u32(instance.as_u32() - 1);
        }
    }

    fn instance(&mut self, dfg: &dfg::ComponentDfg, instance: dfg::InstanceId) {
        log::debug!("visiting instance {instance:?}");

        // ... otherwise if this is the first time the instance has been seen
        // then the instance's own arguments are recursively visited to find
        // transitive dependencies on adapters.
        match &dfg.instances[instance] {
            dfg::Instance::Static(_, args) => {
                for arg in args.iter() {
                    self.core_def(dfg, arg);
                }
            }
            dfg::Instance::Import(_, args) => {
                for (_, values) in args {
                    for (_, def) in values {
                        self.core_def(dfg, def);
                    }
                }
            }
        }
    }

    fn finish_adapter_module(&mut self) {
        if self.next_module.adapters.is_empty() {
            return;
        }

        // Reset the state of the current module-in-progress and then flag all
        // pending adapters as now defined since the current module is being
        // committed.
        let module = mem::take(&mut self.next_module);
        for adapter in module.adapters.iter() {
            let inserted = self.defined_items.insert(Def::Adapter(*adapter));
            assert!(inserted);
        }
        let idx = self.adapter_modules.push(module);
        log::debug!("finishing adapter module {idx:?}");
    }
}