ONNX Runtime is a performance-focused, complete scoring engine for Open Neural Network Exchange (ONNX) models, with an open, extensible architecture that lets it keep pace with the latest developments in AI and deep learning. ONNX Runtime stays up to date with the ONNX standard, provides a complete implementation of all ONNX operators, and supports all ONNX releases (1.2+) with both forward and backward compatibility. Please refer to this page for ONNX opset compatibility details.
ONNX is an interoperable format for machine learning models supported by various ML and DNN frameworks and tools. The universal format makes it easier to interoperate between frameworks and maximize the reach of hardware optimization investments.
Setup
Usage
More Info
ONNX Runtime provides comprehensive support of the ONNX spec and can be used to run all models based on ONNX v1.2.1 and higher. See version compatibility details here.
Traditional ML support
In addition to DNN models, ONNX Runtime fully supports the ONNX-ML profile of the ONNX spec for traditional ML scenarios.
For the full set of operators and types supported, please see the operator documentation.
Note: Some operators not supported in the current ONNX version may be available as Contrib Operators.
ONNX Runtime supports both CPU and GPU. Using various graph optimizations and accelerators, ONNX Runtime can provide lower latency compared to other runtimes for faster end-to-end customer experiences and minimized machine utilization costs.
Currently ONNX Runtime supports the following accelerators:
Not all variations are included in the official release builds, but any of them can be built from source by following these instructions. Find Dockerfiles here.
We are continuously working to integrate new execution providers for further improvements in latency and efficiency. If you are interested in contributing a new execution provider, please see this page.
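As a rough illustration of how an application might choose among execution providers, the sketch below keeps only the providers that the installed build reports and falls back to CPU. It assumes a recent `onnxruntime` Python package in which `get_available_providers()` and the `providers` argument of `InferenceSession` are available. The preference order and the helper names `pick_providers` and `open_session` are hypothetical, chosen only for this example.

```python
from typing import List, Sequence

# Assumed preference order (illustrative only): GPU-backed providers first,
# with the CPU provider kept as the final fallback.
PREFERRED = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = PREFERRED) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    # Assumption: the CPU provider is present in every build, so it is
    # always appended as the last resort.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def open_session(model_path: str):
    """Create an InferenceSession using the providers this build offers."""
    # Imported lazily so pick_providers can be used without the package.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

With a model file on disk (the path is up to the caller), `open_session(path)` returns a session whose `run(None, {input_name: array})` method performs inference, where `input_name` can be read from `session.get_inputs()[0].name`.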
API documentation and package installation
ONNX Runtime is currently available for Linux, Windows, and Mac with Python, C#, C++, and C APIs. If you have specific scenarios that are not supported, please share your suggestions and scenario details via GitHub Issues.
Quick Start: The ONNX-Ecosystem Docker container image is available on Docker Hub and includes ONNX Runtime (CPU, Python), dependencies, tools to convert from various frameworks, and Jupyter notebooks to help you get started.
Additional Dockerfiles for some features can be found here.
| | CPU (MLAS+Eigen) | CPU (MKL-ML) | GPU (CUDA) |
|---|---|---|---|
| Python | pypi: onnxruntime; Windows (x64), Linux (x64), Mac OS X (x64) | -- | pypi: onnxruntime-gpu; Windows (x64), Linux (x64) |
| C# | Nuget: Microsoft.ML.OnnxRuntime; Windows (x64, x86), Linux (x64, x86), Mac OS X (x64) | Nuget: Microsoft.ML.OnnxRuntime.MKLML; Windows (x64), Linux (x64), Mac OS X (x64) | Nuget: Microsoft.ML.OnnxRuntime.Gpu; Windows (x64), Linux (x64) |
| C/C++ wrapper | Nuget: Microsoft.ML.OnnxRuntime, .zip, .tgz; Windows (x64, x86), Linux (x64, x86), Mac OS X (x64) | Nuget: Microsoft.ML.OnnxRuntime.MKLML; Windows (x64), Linux (x64), Mac OS X (x64) | Nuget: Microsoft.ML.OnnxRuntime.Gpu, .zip, .tgz; Windows (x64), Linux (x64) |
The libgomp1 package can be installed with:
apt-get install libgomp1
If using pip to download the Python binaries, run pip install --upgrade pip prior to downloading.
The en_US.UTF-8 locale is required. To configure it, run:
locale-gen en_US.UTF-8
update-locale LANG=en_US.UTF-8
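One way to confirm that the required locale is actually usable on a machine is to try activating it from Python. The following is a minimal sketch using only the standard `locale` module. The helper name `locale_available` is hypothetical, and the check only tells you whether the C library accepts the locale name, not whether ONNX Runtime itself is configured correctly.

```python
import locale


def locale_available(name: str = "en_US.UTF-8") -> bool:
    """Return True if the C library can activate the given locale."""
    # Calling setlocale without a value queries the current setting,
    # which is restored afterwards so the check has no lasting effect.
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    if locale_available():
        print("en_US.UTF-8 is available")
    else:
        print("en_US.UTF-8 is missing; generate it before running models")
```

If the function returns False, the locale has likely not been generated yet on this system.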
If additional build flavors are needed, please find instructions on building from source at Build ONNX Runtime. For production scenarios, it's strongly recommended to build from an official release branch.
Dockerfiles are available here to help you get started.
ONNX Runtime can be deployed to the cloud for model inferencing using Azure Machine Learning Services. See detailed instructions and sample notebooks.
ONNX Runtime Server (beta) is a hosted application for serving ONNX models using ONNX Runtime, providing a REST API for prediction. Usage details can be found here, and image installation instructions are here.
ONNX Runtime is open and extensible, supporting a broad set of configurations and execution providers for model acceleration. For performance tuning guidance, please see this page.
Inference only
Inference with model conversion
Inference and deploy through AzureML
Inferencing using ONNX Model Zoo models:
Convert existing model for Inferencing:
Train a model with PyTorch and run inferencing:
GPU: Inferencing with TensorRT Execution Provider (AKS)
Inference and Deploy with Azure IoT Edge
Other
We welcome contributions! Please see the contribution guidelines.
For any feedback or to report a bug, please file a GitHub Issue.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.