Local AI Model Runner
Upload an ONNX model or enter a model download URL, then run inference locally in the browser, preferring WebNN NPU or GPU backends
How to Use
- Upload an ONNX model or enter a downloadable model URL
- Choose a preferred backend such as WebNN NPU, WebNN GPU, or WASM
- Configure input tensor shape, data type, and inference options
- Run inference and inspect latency, output tensors, and backend fallback messages
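The configuration and fallback steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the backend identifiers (`webnn-npu`, `webnn-gpu`, `wasm`), the fallback order, and the helper names `parseShape`, `makeInput`, and `pickBackend` are all assumptions made for the example. Availability is passed in as plain booleans; how a page probes each backend depends on the runtime it uses.

```typescript
// Hypothetical backend identifiers, listed in the fallback order assumed here.
type Backend = "webnn-npu" | "webnn-gpu" | "wasm";
const FALLBACK_ORDER: Backend[] = ["webnn-npu", "webnn-gpu", "wasm"];

type DType = "float32" | "int32" | "uint8";

// Parse a shape string such as "1x3x224x224" into [1, 3, 224, 224].
function parseShape(text: string): number[] {
  const dims = text.trim().split(/[x,\s]+/i).map(Number);
  if (dims.some((d) => !Number.isInteger(d) || d <= 0)) {
    throw new Error(`Invalid tensor shape: "${text}"`);
  }
  return dims;
}

// Allocate a zero-filled input buffer matching the shape and data type.
function makeInput(
  shape: number[],
  dtype: DType
): Float32Array | Int32Array | Uint8Array {
  const size = shape.reduce((a, b) => a * b, 1);
  switch (dtype) {
    case "float32":
      return new Float32Array(size);
    case "int32":
      return new Int32Array(size);
    case "uint8":
      return new Uint8Array(size);
  }
}

// Starting from the preferred backend, pick the first available one and
// record a message for every backend that had to be skipped.
function pickBackend(
  preferred: Backend,
  available: Record<Backend, boolean>
): { backend: Backend | null; messages: string[] } {
  const messages: string[] = [];
  const start = FALLBACK_ORDER.indexOf(preferred);
  for (const candidate of FALLBACK_ORDER.slice(start)) {
    if (available[candidate]) {
      return { backend: candidate, messages };
    }
    messages.push(`${candidate} unavailable, trying next backend`);
  }
  return { backend: null, messages };
}
```

For example, `pickBackend("webnn-npu", { "webnn-npu": false, "webnn-gpu": true, wasm: true })` selects `webnn-gpu` and returns one fallback message. Returning the messages rather than logging them lets the page decide how to display them.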
Examples
- Test a small classifier
Input: Upload model.onnx | input tensor 1x3x224x224
Output: Shows inference time and output tensor summary
- Validate NPU availability
Input: Select WebNN NPU first
Output: If the browser cannot create an NPU context, fallback or error information is shown
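The page does not specify what the "inference time and output tensor summary" contain. As one plausible reading, the sketch below computes basic statistics and the top-k entries of an output tensor, plus the mean and median of a set of latency samples. The names `summarizeOutput` and `summarizeLatency` and the chosen fields are assumptions for illustration only.

```typescript
interface TensorSummary {
  length: number;
  min: number;
  max: number;
  mean: number;
  topK: { index: number; value: number }[];
}

// Summarize a flat output tensor: value range, mean, and the k largest entries.
function summarizeOutput(data: ArrayLike<number>, k = 5): TensorSummary {
  if (data.length === 0) {
    throw new Error("Output tensor is empty");
  }
  const indexed = Array.from(data, (value, index) => ({ index, value }));
  let min = Infinity;
  let max = -Infinity;
  let sum = 0;
  for (const { value } of indexed) {
    if (value < min) min = value;
    if (value > max) max = value;
    sum += value;
  }
  // Sort descending by value so the highest scores come first.
  indexed.sort((a, b) => b.value - a.value);
  return {
    length: data.length,
    min,
    max,
    mean: sum / data.length,
    topK: indexed.slice(0, k),
  };
}

// Summarize latency samples (in milliseconds) collected over several runs.
function summarizeLatency(samplesMs: number[]): {
  runs: number;
  meanMs: number;
  medianMs: number;
} {
  if (samplesMs.length === 0) {
    throw new Error("No latency samples");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const medianMs =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const meanMs = sorted.reduce((a, b) => a + b, 0) / sorted.length;
  return { runs: sorted.length, meanMs, medianMs };
}
```

For a classifier, the `topK` indices would correspond to class IDs. The median is reported alongside the mean because the first run on a backend often includes one-time setup cost and can skew the average.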
FAQ
- Are models or inputs uploaded to a server?
- No. Model loading and inference happen in the browser. A model at a remote URL is downloaded by the browser into the current page environment.
- Can every ONNX model run?
- No. Support depends on operators, model size, input format, browser memory, and selected backend capability.
- Is this production-ready?
- It is for prototyping and compatibility checks. Production use needs fixed models, input validation, error handling, and performance baselines.
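As an illustration of the input validation mentioned above, the sketch below checks an input against a fixed specification before inference. It is not part of this tool: `InputSpec`, `validateInput`, and the three supported data types are hypothetical names chosen for the example.

```typescript
type DType = "float32" | "int32" | "uint8";

// Hypothetical description of the input a fixed production model expects.
interface InputSpec {
  name: string;
  shape: number[];
  dtype: DType;
}

const INT_RANGES: Record<"int32" | "uint8", [number, number]> = {
  int32: [-2147483648, 2147483647],
  uint8: [0, 255],
};

// Return a list of problems; an empty list means the input passed all checks.
function validateInput(
  spec: InputSpec,
  shape: number[],
  data: ArrayLike<number>
): string[] {
  const errors: string[] = [];

  const sameShape =
    shape.length === spec.shape.length &&
    shape.every((d, i) => d === spec.shape[i]);
  if (!sameShape) {
    errors.push(
      `${spec.name}: expected shape [${spec.shape.join(", ")}], got [${shape.join(", ")}]`
    );
  }

  const expectedCount = spec.shape.reduce((a, b) => a * b, 1);
  if (data.length !== expectedCount) {
    errors.push(
      `${spec.name}: expected ${expectedCount} values, got ${data.length}`
    );
  }

  // Integer types get a range check; float32 only needs finite values.
  const range = spec.dtype === "float32" ? null : INT_RANGES[spec.dtype];
  for (let i = 0; i < data.length; i++) {
    const v = data[i];
    if (!Number.isFinite(v)) {
      errors.push(`${spec.name}: non-finite value at index ${i}`);
      break; // report only the first bad value
    }
    if (range && (!Number.isInteger(v) || v < range[0] || v > range[1])) {
      errors.push(`${spec.name}: value ${v} at index ${i} is not a valid ${spec.dtype}`);
      break;
    }
  }
  return errors;
}
```

Collecting errors in a list instead of throwing on the first one lets a caller show every problem at once, which tends to be more useful when inputs come from users.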
Related tools
- UUID Generator
Batch generate UUID v4 with one-click copy
- Timestamp Converter
Convert between timestamps and date/time formats
- Regex Tester
Test and debug regular expressions in real time
- Hash Generator
Compute MD5, SHA1, SHA256 hash values
- GPU Stress Test
FurMark-like browser stress test powered by Three.js WebGPU with real-time stability metrics
- GPU Benchmark
Run a fixed WebGPU scene for 60 seconds and sum per-second FPS as the final score
- WebNN NPU Compute Measurement
Estimate effective NPU compute available to the browser by running a WebNN NPU-only matrix multiplication workload
- QR Code Generator
Generate custom QR codes with advanced features like colors, logo, and batch generation