Local AI Model Runner

Upload an ONNX model or enter a model download URL, then run inference locally in the browser, trying WebNN NPU/GPU backends first.

Category

Dev Tools

How to Use

  1. Upload an ONNX model or enter a downloadable model URL
  2. Choose a preferred backend such as WebNN NPU, WebNN GPU, or WASM
  3. Configure input tensor shape, data type, and inference options
  4. Run inference and inspect latency, output tensors, and backend fallback messages
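
As a rough illustration of steps 3 and 4, the sketch below parses a shape string such as 1x3x224x224, allocates a matching input buffer, and summarizes an output tensor. The helper names (parseShape, makeInput, summarize) are hypothetical and float32 is an assumed data type; this is not the tool's actual code.

```typescript
// Hypothetical helpers for illustration only; float32 is an assumed data type.

// Parse "1x3x224x224" (or "1,3,224,224") into a list of positive integer dimensions.
function parseShape(text: string): number[] {
  const dims = text.trim().split(/[x,\s]+/i).map(Number);
  for (const d of dims) {
    if (!Number.isInteger(d) || d <= 0) {
      throw new RangeError(`Invalid dimension in shape "${text}"`);
    }
  }
  return dims;
}

// Allocate a zero-filled float32 buffer with one element per tensor entry.
function makeInput(shape: number[]): Float32Array {
  const count = shape.reduce((acc, d) => acc * d, 1);
  return new Float32Array(count);
}

interface TensorSummary {
  length: number;
  min: number;
  max: number;
  argmax: number; // index of the largest value, e.g. the top class of a classifier
}

// Reduce an output tensor to a few numbers that are easy to inspect.
function summarize(values: Float32Array): TensorSummary {
  if (values.length === 0) throw new RangeError("Cannot summarize an empty tensor");
  let min = values[0];
  let max = values[0];
  let argmax = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] < min) min = values[i];
    if (values[i] > max) {
      max = values[i];
      argmax = i;
    }
  }
  return { length: values.length, min, max, argmax };
}
```

For the shape 1x3x224x224 the buffer holds 1 * 3 * 224 * 224 = 150528 values. A real input would be filled with preprocessed pixel data rather than zeros.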

Examples

  • Test a small classifier

    Input: Upload model.onnx | input tensor 1x3x224x224

    Output: Shows inference time and output tensor summary

  • Validate NPU availability

    Input: Select WebNN NPU first

    Output: If the browser cannot create an NPU context, fallback or error information is shown
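
The NPU check in the second example could look roughly like the sketch below. It assumes the browser exposes the WebNN entry point navigator.ml.createContext() and accepts a deviceType option as described in WebNN drafts; browser support varies and the option may be ignored or changed. The names MLLike, Device, Selection, and selectDevice are hypothetical, and falling back to "wasm" is an assumption based on the backends listed under How to Use.

```typescript
// Minimal structural type for the part of WebNN used here (an assumption, not a full typing).
interface MLLike {
  createContext(options?: { deviceType?: string }): Promise<unknown>;
}

type Device = "npu" | "gpu" | "cpu";

interface Selection {
  device: Device | "wasm"; // "wasm" means no WebNN context could be created
  messages: string[];      // one fallback message per device that failed
}

// Try each WebNN device in order and report why earlier ones were skipped.
async function selectDevice(order: Device[], ml?: MLLike): Promise<Selection> {
  // navigator.ml is looked up dynamically because it is missing from many environments.
  const api = ml ?? ((globalThis as any).navigator?.ml as MLLike | undefined);
  const messages: string[] = [];
  if (!api) {
    messages.push("WebNN (navigator.ml) is not exposed in this environment");
    return { device: "wasm", messages };
  }
  for (const device of order) {
    try {
      await api.createContext({ deviceType: device });
      return { device, messages };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      messages.push(`${device}: ${reason}`);
    }
  }
  return { device: "wasm", messages };
}
```

Calling selectDevice(["npu", "gpu"]) in a browser returns the first device for which a context could be created, plus the messages collected along the way. A successful context does not guarantee that every operator in a given model is supported on that device.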

FAQ

Are models or inputs uploaded to a server?
No. Model loading and inference happen in the browser. A model at a remote URL is downloaded by the browser directly into the current page environment.
Can every ONNX model run?
No. Support depends on operators, model size, input format, browser memory, and selected backend capability.
Is this production-ready?
Not as is. It is meant for prototyping and compatibility checks. Production use needs fixed model versions, input validation, error handling, and performance baselines.
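
One way to start on a performance baseline is to time repeated runs after a short warm-up and record the median. The sketch below is a minimal example under that assumption; measureLatency and LatencyStats are hypothetical names, and the run callback stands in for whatever function performs a single inference.

```typescript
interface LatencyStats {
  runs: number;
  minMs: number;
  medianMs: number;
  maxMs: number;
}

// Time `run` several times after discarding warm-up iterations.
async function measureLatency(
  run: () => Promise<unknown> | unknown,
  iterations = 20,
  warmup = 3,
): Promise<LatencyStats> {
  if (iterations < 1) throw new RangeError("iterations must be at least 1");
  // Warm-up runs absorb one-time costs such as compilation and cache fills.
  for (let i = 0; i < warmup; i++) await run();
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    await run();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  const medianMs =
    samples.length % 2 === 1 ? samples[mid] : (samples[mid - 1] + samples[mid]) / 2;
  return {
    runs: samples.length,
    minMs: samples[0],
    medianMs,
    maxMs: samples[samples.length - 1],
  };
}
```

The median is less sensitive than the mean to occasional slow runs, which makes it a reasonable number to compare across browsers and backends.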
