Path: blob/main/crates/wasi-nn/examples/classification-component-onnx/README.md
# ONNX Backend Classification Component Example
This example demonstrates how to use the `wasi-nn` crate to run a classification with the ONNX Runtime backend from a WebAssembly component.
It supports CPU and GPU (NVIDIA CUDA) execution targets.
Note: the GPU execution target currently supports only NVIDIA CUDA (`onnx-cuda`) as the execution provider (EP).
## Build
In this directory, run the following command to build the WebAssembly component:
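A minimal sketch, assuming the component is built for the `wasm32-wasip2` target (an assumption; check this example's `Cargo.toml` for the exact target and tooling it expects):

```sh
cargo build --release --target wasm32-wasip2
```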
## Running the Example
From the Wasmtime root directory, run the following commands to build the Wasmtime CLI and then run the WebAssembly component.
### Building Wasmtime
For CPU-only execution:
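A sketch of the build command, assuming the CLI's `wasi-nn` feature and an `onnx` feature on the backend crate (these feature names are assumptions; check the workspace `Cargo.toml`):

```sh
cargo build --release --features wasi-nn,wasmtime-wasi-nn/onnx
```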
For GPU (NVIDIA CUDA) support:
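A sketch using the `wasmtime-wasi-nn/onnx-cuda` feature mentioned under the prerequisites below (the `wasi-nn` CLI feature name is an assumption):

```sh
cargo build --release --features wasi-nn,wasmtime-wasi-nn/onnx-cuda
```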
### Running with Different Execution Targets
The execution target is controlled by passing a single argument to the WASM module.
Arguments:

- No argument or `cpu` - Use CPU execution
- `gpu` or `cuda` - Use GPU/CUDA execution
CPU Execution (default):
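A sketch of the run command, assuming the CLI's `-S nn` flag to enable wasi-nn and a preopened directory for the model and image files; the component path is illustrative only:

```sh
# Paths are illustrative; point --dir and the .wasm path at your actual files.
./target/release/wasmtime run -S nn --dir . \
    ./target/wasm32-wasip2/release/classification-component-onnx.wasm
```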
GPU (CUDA) Execution:
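The same sketch with the `gpu` argument appended after the component path:

```sh
# Paths are illustrative; adjust to your build output.
./target/release/wasmtime run -S nn --dir . \
    ./target/wasm32-wasip2/release/classification-component-onnx.wasm gpu
```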
## Expected Output
You should get output listing the classification results. When using the GPU target, the first line of output indicates the selected execution target. You can monitor GPU usage with `watch -n 1 nvidia-smi`.
To see trace logs from `wasmtime_wasi_nn` or `ort`, run Wasmtime with `WASMTIME_LOG` enabled, e.g.:
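A sketch, reusing the illustrative paths from the run commands above:

```sh
WASMTIME_LOG=wasmtime_wasi_nn=trace,ort=trace ./target/release/wasmtime run -S nn --dir . \
    ./target/wasm32-wasip2/release/classification-component-onnx.wasm gpu
```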
## Prerequisites for GPU (CUDA) Support
- NVIDIA GPU with CUDA support
- CUDA Toolkit 12.x with cuDNN 9.x
- Wasmtime built with the `wasmtime-wasi-nn/onnx-cuda` feature
## ONNX Runtime's Fallback Behavior
If the GPU execution provider is requested (by passing `gpu`) but the machine has no GPU or the necessary CUDA drivers are missing, ONNX Runtime silently falls back to the CPU execution provider. The application continues to run, but inference happens on the CPU.
To verify whether fallback is happening, you can enable ONNX Runtime logging:
1. Build Wasmtime with the additional `wasmtime-wasi-nn/ort-tracing` feature.
2. Run Wasmtime with `WASMTIME_LOG` enabled to see `ort` warnings (a combined sketch follows at the end of this section).

You should see a warning like:
`No execution providers from session options registered successfully; may fall back to CPU.`
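Putting the two steps together, a sketch might look like this (feature names beyond those named above, and the paths, are assumptions):

```sh
# 1. Build with ONNX Runtime tracing enabled (CUDA backend shown).
cargo build --release --features wasi-nn,wasmtime-wasi-nn/onnx-cuda,wasmtime-wasi-nn/ort-tracing

# 2. Run with `ort` warnings visible; paths are illustrative.
WASMTIME_LOG=ort=warn ./target/release/wasmtime run -S nn --dir . \
    ./target/wasm32-wasip2/release/classification-component-onnx.wasm gpu
```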