Pulse · microsoft/onnxruntime · GitHub

July 31, 2025 – August 7, 2025

Overview

75 Active pull requests

1,221 Active issues

41 Pull requests merged by 26 people

Bump tmp from 0.2.1 to 0.2.4 in /onnxruntime/test/wasm
#25671 merged Aug 7, 2025
[webgpu] support float16 type for Einsum operator
#25443 merged Aug 7, 2025
Update number of mel bins for whisper model
#25675 merged Aug 7, 2025
Allow DML EP to be used with any CPU EP
#25664 merged Aug 7, 2025
Cherry-pick MiGraphX EP fixes from upstream for rel-1.23.0
#25659 merged Aug 7, 2025
ORT perf test support for plugin EP
#25374 merged Aug 6, 2025
Enable BrowserStack testing stage
#25668 merged Aug 6, 2025
Remove training packages from onnxruntime-ios-packaging-pipeline
#25451 merged Aug 6, 2025
Fix the is_leaf check in TreeEnsemble
#25410 merged Aug 6, 2025
upgrade dawn to 794b6fadc4171f7b853a77ffdf0948fbec431f41
#25461 merged Aug 6, 2025
Update python bindings to be able to use a shared allocator and/or IDataTransfer registered by a plugin EP in the Environment
#25346 merged Aug 6, 2025
[NV TRT RTX EP] Cumulative TRT RTX EP merge
#25656 merged Aug 6, 2025
[build] fix build with delay load hook
#25657 merged Aug 5, 2025
[build] fix WebAssembly build on macOS/arm64
#25653 merged Aug 5, 2025
[web] fix support for subgroup
#25649 merged Aug 4, 2025
Add patch for WebGPU on Android to handle fp16 in uniforms
#25349 merged Aug 4, 2025
Cherry-pick PR #25626 to 1.23.0 release branch
#25640 merged Aug 4, 2025
Enable 2bit CPU matmul fallback
#25582 merged Aug 4, 2025
[QNN EP] Add support for GatherNd Op in QNN EP
#25635 merged Aug 4, 2025
Update qMoE spec to support block quantization
#25641 merged Aug 4, 2025
[MIGraphX EP] Fix CreateExecutionProviderFactory with correct struct and change vendor_id
#25625 merged Aug 4, 2025
[CANN] Fix CANN build error
#25627 merged Aug 4, 2025
Update MoE and qMoE spec
#25619 merged Aug 2, 2025
[build] use cross-compile to build macOS x86_64 target for WebGPU
#25617 merged Aug 2, 2025
Add support for QMoE in CPU
#25558 merged Aug 2, 2025
Move moving weights to memory to the end of Graph::Resolve()
#25626 merged Aug 2, 2025
Add CUDA implementation of GatherBlockQuantized operator
#25575 merged Aug 1, 2025
Cherry-picks for ORT 1.23.0
#25620 merged Aug 1, 2025
[build] fix macOS x86_64 cross-compile warning
#25615 merged Aug 1, 2025
update CANN docs
#25624 merged Aug 1, 2025
[webgpu] Apply Flash Attention if sliding window exceeds KV cache length
#25594 merged Aug 1, 2025
Update macOS target version from 13.3 to 13.4
#25616 merged Aug 1, 2025
[QNN-EP] Resolve VTCM buffer sharing bugs
#25622 merged Aug 1, 2025
[QNN EP] Add ONNX ScatterElements support
#24811 merged Aug 1, 2025
[QNN EP] Bug fix: multiple consumer for cast result in name conflicts
#25584 merged Aug 1, 2025
Optimize layout for SubgroupMatrixLoad on Intel
#25384 merged Aug 1, 2025
[QNN EP] Lower Gemm with 2d bias to FC + ElementwiseAdd when targeting HTP.
#25605 merged Aug 1, 2025
[build] disable CodeQL for NPM Packaging Pipeline
#25614 merged Aug 1, 2025
Cache opSupportLimits to improve the performance and update tracing e…
#25589 merged Jul 31, 2025
Refactor Java Test Pipeline
#25608 merged Jul 31, 2025
[QNN EP] Add Unit tests for LPBQ Fusions
#25592 merged Jul 31, 2025

34 Pull requests opened by 31 people

Reduce CMake's CUDA_ARCHITECTURES
#25618 opened Jul 31, 2025
fix(ort): add automatic patching for nvidia cudnn library path
#25628 opened Aug 1, 2025
[VitisAI] bugfix model_clone optimization
#25629 opened Aug 1, 2025
[webgpu] support GatherND operator
#25632 opened Aug 1, 2025
rel-1.22.2 cherry-pick 1
#25633 opened Aug 1, 2025
Depthwise conv 3x3 s1
#25637 opened Aug 2, 2025
Add more tests to GatherBlockQuantized operator
#25639 opened Aug 2, 2025
Optimizations and fixes in QMoE CPU kernel
#25642 opened Aug 3, 2025
Bump ruff from 0.12.4 to 0.12.7
#25643 opened Aug 4, 2025
Support int4 and uint4 for reshape on opset 21+.
#25645 opened Aug 4, 2025
DequantizeLinear should support non-zero zero_point when input type is int32
#25646 opened Aug 4, 2025
Add comprehensive ThresholdedRelu custom operator example and fix common implementation issues
#25650 opened Aug 4, 2025
Skip node output dump for MemcpyToHost
#25651 opened Aug 5, 2025
Properly remove in-memory references
#25652 opened Aug 5, 2025
Bugfix vitisai ep model clone with 23979 25320
#25654 opened Aug 5, 2025
safeint.h: quelch gcc's -Wreturn-type
#25655 opened Aug 5, 2025
Update semver.h to fix compilation error under linux
#25658 opened Aug 5, 2025
Remove incorrect function calls
#25662 opened Aug 6, 2025
fixed matmul broadcasting
#25663 opened Aug 6, 2025
Python binding for listing custom operators.
#25665 opened Aug 6, 2025
[QNN-EP] Add CastLoneQFusion to transform Cast and QNode into Convert
#25667 opened Aug 6, 2025
Replace vmlaq_f32 with vfmaq_f32 (fused multiply-add)
#25669 opened Aug 6, 2025
Bump transformers from 4.50.0 to 4.53.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements
#25672 opened Aug 6, 2025
Relax WeightBiasQuantization constraint for larger QDQ node group
#25673 opened Aug 6, 2025
[webgpu] support bool for binary operators
#25674 opened Aug 6, 2025
Add vendorid check to GetSupportedDevicesImpl for MIGraphx EP
#25677 opened Aug 6, 2025
[WIP] Test integration with ONNX 1.19
#25678 opened Aug 7, 2025
[WebNN] Remove NHWC preferred layout
#25679 opened Aug 7, 2025
FP16 inference performance improvement on CPU
#25680 opened Aug 7, 2025
[MIGraphX EP][BUG] Fix pybind compilation with MIGraphX after merging #25346
#25683 opened Aug 7, 2025
Skeleton for Attention(23) on CUDA
#25684 opened Aug 7, 2025
2-bit TMAC matmul
#25686 opened Aug 7, 2025
Update QAIRT to 2.37.0
#25688 opened Aug 7, 2025
[WIP] Move provider tests to `onnxruntime_provider_test` and enable use of plugin EPs
#25689 opened Aug 7, 2025

1,207 Issues closed by 31 people

[WebGPU] `Kernel "[GroupQueryAttention] /model/layers.0/attn/GroupQueryAttention" failed. Error: Input "key" is expected to have 3, 4, or 5 dimensions".`
#22987 closed Aug 7, 2025
Microsoft.AI.MachineLearning cannot be used in UWP app on on Windows 10 ARM64
#4686 closed Aug 7, 2025
Debugging capability of onnxruntime in Visual Studio 2019 incapacitated
#4812 closed Aug 7, 2025
[WinML] [C++/WinRT] Clarify how to share Ort::Env environments with WinRT/WinML instances
#4971 closed Aug 7, 2025
C Sharp API for openvino doesn't run on GPU
#5011 closed Aug 7, 2025
onxruntime-gpu installation issues
#5020 closed Aug 7, 2025
program stucks when multi processes
#5093 closed Aug 7, 2025
Exception thrown from Dispose method (When missing dependency)
#5250 closed Aug 7, 2025
DLRM model failure to execute on GPU
#5295 closed Aug 7, 2025
Running quantized models on GPU
#5359 closed Aug 7, 2025
Can Session::Run be const?
#5558 closed Aug 7, 2025
ML.NET issue while Using yolov4 onnx model
#5593 closed Aug 7, 2025
Passing Non-Const pointer to Session::Run() using CPP Api
#5597 closed Aug 7, 2025
How to reduce memory used?
#5711 closed Aug 7, 2025
openvino build failed nuget
#5749 closed Aug 7, 2025
How to loading a pytorch model with input shape of (None, 32) using the C# inference ?
#5781 closed Aug 7, 2025
Any support for double type tensor when loading pytorch onnx model ?
#5782 closed Aug 7, 2025
memory keep increasing with dynamic input shape of network
#5796 closed Aug 7, 2025
Memory usage with Cuda ExecutionProvider
#5801 closed Aug 7, 2025
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : The node is not placed on any Execution Provider. OneHot(11) (node while/cond_5/one_hot).
#5825 closed Aug 7, 2025
Non-zero status code returned while running Div node
#5830 closed Aug 7, 2025
Performance comparison
#5834 closed Aug 7, 2025
IOBindings in C++ API are missing a way to SynchronizeInputs.
#5857 closed Aug 7, 2025
How to compile with vs2019, with the platform tool set "Visual Studio 2015 - Windows XP (v140_xp)"， i want use it in xp system
#5859 closed Aug 7, 2025
Quantized model much slower than full precision model
#5865 closed Aug 7, 2025
Performance issue with operator Where on CPU
#5896 closed Aug 7, 2025
Performance issue with operators SVMRegressor and SVMClassifier for RBF kernel on CPU
#5898 closed Aug 7, 2025
failed:/onnxruntime_src/onnxruntime/core/graph/model_load_utils.h:47 void onnxruntime::model_load_utils::ValidateOpsetForDomain
#5905 closed Aug 7, 2025
Support GCN
#5910 closed Aug 7, 2025
EyeLike with dynamic shape results in error
#5917 closed Aug 7, 2025
Can't train mnist in parallel
#5918 closed Aug 7, 2025
could not open "tensorrt_provider_factory.h", "mkldnn_provider_factory.h"
#5925 closed Aug 7, 2025
Dynamic shape got wrong output
#5928 closed Aug 7, 2025
Issue with Multi-GPU and GPU memory limit
#5939 closed Aug 7, 2025
can not get expected speed in onnxruntime
#5953 closed Aug 7, 2025
Error using onnx model containing Bidirectional layer with MatMulAddFusion
#5955 closed Aug 7, 2025
No opset import for domain 'com.microsoft'
#5971 closed Aug 7, 2025
"undefined symbol" error occured, when I use ort.SessionOptions.register_custom_ops_library
#5984 closed Aug 7, 2025
Under TRT EP, custom op cannot fall back to CUDA EP
#6002 closed Aug 7, 2025
Inconsistent inference time between C Python API [Megatron-LM]
#6025 closed Aug 7, 2025
Onnx Batch Processing
#6044 closed Aug 7, 2025
How to extract the size of a map type in c++?
#6077 closed Aug 7, 2025
could the checkpoint of bert convert to onnx model? I have a bug that 'BertForPreTraining' object has no attribute 'layers, output'
#6089 closed Aug 7, 2025
how to implement execution provider (EP) that allow onnx run on my hardware?
#6110 closed Aug 7, 2025
32bit vs 64bit when compiling or something else?
#6144 closed Aug 7, 2025
GPU memory consumption keeps increasing with multithreading in Java
#6181 closed Aug 7, 2025
Not support rtx 3000 series
#6213 closed Aug 7, 2025
sample c++ program just print "hello" does not start
#6243 closed Aug 7, 2025
Cannot create OnnxTensor with UINT8 type.
#6261 closed Aug 7, 2025
Referencing Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntime.GPU in a c# project.
#6264 closed Aug 7, 2025
Could onnxruntime be compiled into wasm using emsdk?
#6275 closed Aug 7, 2025
Performance shaking
#6301 closed Aug 7, 2025
[Bug] Wrong implementation in LpPool
#6302 closed Aug 7, 2025
Memory corruption when using OnnxRuntime with OpenVINO on the Intel MyriadX and Raspberry Pi 4B
#6304 closed Aug 7, 2025
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Error while inferencing DLRM onnx model
#6319 closed Aug 7, 2025
Error: Running double precision model exported from pyTorch
#6320 closed Aug 7, 2025
The output for GPT is NAN when fp16=True
#6328 closed Aug 7, 2025
ROCm build seems broken: `error: ‘ncclComm_t’ does not name a type`
#6358 closed Aug 7, 2025
Implementation of ONNX Functions
#6360 closed Aug 7, 2025
Incorrect TypeInferenceError on UNDEFINED tensor type
#6370 closed Aug 7, 2025
[C-Api] Dynamic Shape Error: Non-zero status code returned while running Sigmoid node.
#6372 closed Aug 7, 2025
onnxruntime v1.6.0 on Jetson Nano - Illegal Instruction (core dumped)
#6375 closed Aug 7, 2025
How to extract dimension of inputs in onnxruntime/core/providers/cpu/math/matmul.cc
#6396 closed Aug 7, 2025
Which executor to build when using: Intel® Deep Learning Boost (Intel® DL Boost)
#6400 closed Aug 7, 2025
[question] Configure GPU arena with Python bindings
#6411 closed Aug 7, 2025
Onnxruntime error when Relu-layer follows Dense-layer without activation and biases
#6423 closed Aug 7, 2025
Reshape `requested_shape` forced to have leading dimension 1 when it should be -1
#6424 closed Aug 7, 2025
Build failure in `orttraining_pybind_state.cc` when building with `--enable_training` and `--build_wheel`
#6536 closed Aug 7, 2025
NaN in AveragePooling
#6543 closed Aug 7, 2025
Loss of accuracy when GPT-2 based model is exported to ONNX
#6549 closed Aug 7, 2025
Custom Op Registration and Implementation
#6564 closed Aug 7, 2025
Inference error using migraohx-onnxruntime
#6605 closed Aug 7, 2025
/onnxruntime/core/mlas/lib/quantize.cpp:50:62: error: ‘vminnmq_f32’ was not declared in this scope
#6638 closed Aug 7, 2025
Failed to add Microsoft.AI.MachineLearning NuGet package to .NET Framework 4.6.1 projects
#6662 closed Aug 7, 2025
INT8 quantized model is very slow
#6732 closed Aug 7, 2025
Shape inference error for Range node
#6737 closed Aug 7, 2025
onnxruntime-gpu (cudaexecutionprovider) usage of cudnn autotuner
#6744 closed Aug 7, 2025
Unable to compile on Linux with CUDA
#6749 closed Aug 7, 2025
Onnxruntime inference with Integrated GPU Failed
#6755 closed Aug 7, 2025
Onnxruntime.gpu is as slower than cpu mode
#6799 closed Aug 7, 2025
Multiple input and multiple output models that create tensors in loops can cause serious crashes
#6821 closed Aug 7, 2025
ONNXRuntime Inference with Finetuned BERT Model outputting odd results
#6830 closed Aug 7, 2025
Unable to build onnxruntime with "--build_wheel" and "--enable_pybind" options
#6841 closed Aug 7, 2025
[JAVA Bindings + Android arm64-v8a] ONNXRuntime build documentation
#6923 closed Aug 7, 2025
dynamic shape input is much slower than fixed shape input in gpu
#6978 closed Aug 7, 2025
CUDA header requested but missing in DNNL part of ORT 1.7.1
#7005 closed Aug 7, 2025
Build fail for docker on MacOS. -NO GPU.
#7052 closed Aug 7, 2025
Large Memory Allocations When Loading RandomForestRegressor Model
#7067 closed Aug 7, 2025
Non-zero status code returned while running BatchNormalization node
#7095 closed Aug 7, 2025
Memory and timing issue with onnxruntime python API with TensorFlow model
#7106 closed Aug 7, 2025
Compile error in header onnxruntime_cxx_api.h when update ONNX runtime from 1.5.2 to 1.7.1
#7142 closed Aug 7, 2025
Batch inference
#7178 closed Aug 7, 2025
Segmentation fault when running onnxruntime inside docker with cpuset restrictions
#7207 closed Aug 7, 2025
Significant difference in the performance of pytorch and exported onnx models
#7212 closed Aug 7, 2025
TensorrtExecutionProvider slower than CUDAExecutionProvider: Transformers
#7230 closed Aug 7, 2025
The speed of running the onnx model is 6x slower than the pytorch model on Jetson TX2
#7233 closed Aug 7, 2025
[Python API + ARM64] Running ResNet50 on ARM board using ACL Error and Performance Issue
#7234 closed Aug 7, 2025
ACL (32bit) Execution Provider fails on gemm node
#7255 closed Aug 7, 2025
onnxruntime gpu version can't installed, how to fix it?
#7272 closed Aug 7, 2025
Run a model containing CustomOp with TensorRT provider fails
#7314 closed Aug 7, 2025
C# console app crash upon appending OpenVino execution provider
#7330 closed Aug 7, 2025
Cannot save Tensorrt .engine model in v1.7.1
#7339 closed Aug 7, 2025
openvino continued package by pyinstaller external dll issue
#7346 closed Aug 7, 2025
Resize Operator rounds-down instead of round-to-even for int32/uint8
#7368 closed Aug 7, 2025
How to compile the framework that can run in Windows XP？
#7444 closed Aug 7, 2025
Please, update the docs. Provider parameter "cuda_mem_limit" was renamed to "gpu_mem_limit" in nightly build.
#7457 closed Aug 7, 2025
How to release gpu memory without exiting the process？
#7463 closed Aug 7, 2025
Running inference using GPU or TensorRT on Jetson
#7484 closed Aug 7, 2025
Problem compiling ONNX RT with CUDA and TensorRT on Windows
#7562 closed Aug 7, 2025
Use of torch InstanceNorm2d and dynamic tensor size causes crash
#7572 closed Aug 7, 2025
onnxruntime build is not compatible with onnx build. Protobuf loaded twice.
#7597 closed Aug 7, 2025
Large GPU memory usage with EXHAUSTIVE cuDNN search
#7612 closed Aug 7, 2025
Enable CUDA provider option configuration in Java
#7613 closed Aug 7, 2025
Build fails with --use_rknpu
#7614 closed Aug 7, 2025
Publish the providers with the release build
#7628 closed Aug 7, 2025
int8 quantization on GPU support? (transformers)
#7634 closed Aug 7, 2025
Does onnxruntime support bert with relative position embedding
#7713 closed Aug 7, 2025
quantize model can‘t run on gpu ?
#7745 closed Aug 7, 2025
TensorRT execution provider SEGFAULT
#7757 closed Aug 7, 2025
CUDA kernel not found in registries for Op type: Pad
#7779 closed Aug 7, 2025
ACL and ArmNN v21.02 EP has problem with GEMM
#7784 closed Aug 7, 2025
get error when using a model with custom op
#7788 closed Aug 7, 2025
Force fallback to CPU execution for Gather, Unsqueeze, Concat nodes - onnxruntime-gpu 1.7.0, opset 12 and 13
#7792 closed Aug 7, 2025
How to get sparse tensor input in custom op?
#7838 closed Aug 7, 2025
Non-zero status code returned while running FusedConv node. Name:'fused ' onnxruntime::OpKernelContext::Input Missing Input: input
#7853 closed Aug 7, 2025
Build failure in onnxruntime/test/featurizers_ops/truncated_svdtransformer_test.cc
#7878 closed Aug 7, 2025
onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 15 by 64
#7888 closed Aug 7, 2025
undefined reference to `onnx::optimization::GetAvailablePasses() on Nvidia Jetson NX
#7970 closed Aug 7, 2025
Runtime exception during initialization of SparkML model (One falsenode is pointing either to itself, either to another tree.)
#8008 closed Aug 7, 2025
Running multiple input node onnx model using onnxrntime C/C++ API
#8019 closed Aug 7, 2025
Memory leak in free-dimention model in C++
#8053 closed Aug 7, 2025
CUDAExecutionProvider does not handle Clip on float16 tensor.
#8070 closed Aug 7, 2025
Why ReduceSum get shape 0 for an empty input?
#8146 closed Aug 7, 2025
System memory leak on cuda GPU backend.
#8147 closed Aug 7, 2025
Does ONNX Runtime and its execution providers support FP16 inference?
#8173 closed Aug 7, 2025
Reflect padding output seems incorrect when padding size larger than input dimension
#8265 closed Aug 7, 2025
ai.onnxruntime.OrtException: Error code - ORT_FAIL - message: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
#8283 closed Aug 7, 2025
Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed
#8313 closed Aug 7, 2025
Inference Speed is slow on GPU
#8316 closed Aug 7, 2025
After 8bit quantization, the GPU inference speed is very slow
#8330 closed Aug 7, 2025
GPUs operate slower than CPUs
#8362 closed Aug 7, 2025
error using C# tensorRT EP builded from source
#8367 closed Aug 7, 2025
Why cuda provider allocator must be threadlocal?
#8378 closed Aug 7, 2025
Implement Split for double or float64 data type
#8382 closed Aug 7, 2025
ERROR running model inference:Non-zero status code returned while running Cast node
#8424 closed Aug 7, 2025
Found regression on ORT 1.8.1
#8513 closed Aug 7, 2025
Does the onnxruntime.quantization.quantize_dynamic support GPU quantization?
#8524 closed Aug 7, 2025
gpu memory can not release.
#8544 closed Aug 7, 2025
Build failure of onnxruntime Docker container with Vitis-AI
#8596 closed Aug 7, 2025
PrepareForCompute Non concat axis dimensions must match: Axis 0 has mismatched dimensions of 1 and 0
#8685 closed Aug 7, 2025
error with torch.sum or torch.tensor.mean operator on GPU
#8742 closed Aug 7, 2025
Symbolic shape inference error for loop node & seq(tensor)
#8755 closed Aug 7, 2025
onnxruntime Jetson tx2 cuda
#8771 closed Aug 7, 2025
AttributeError: module 'onnxruntime' has no attribute 'set_default_logger_severity'
#8789 closed Aug 7, 2025
IsNaN and Split have no double implementations
#8791 closed Aug 7, 2025
Readily available Python wheels for ARM?
#8874 closed Aug 7, 2025
cannot import name ‘get_all_providers‘
#8907 closed Aug 7, 2025
Runetime Error: Decoder with dynamic axes does not work with Encoder output
#8910 closed Aug 7, 2025
How to get the value of tensors in subgraph?
#8929 closed Aug 7, 2025
The model run time become longer when i update the onnxruntime from version 1.7 to version 1.8
#8938 closed Aug 7, 2025
ONNX inference result are different to pytorch model
#8977 closed Aug 7, 2025
Type error when runs an control flow model in ORT
#8999 closed Aug 7, 2025
[E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Transpose node. Name:'model/unet3d_segmentation/conv3d_12/Conv3D__165' Status Message: CUDA error cudaErrorInvalidConfiguration:invalid configuration argument
#9083 closed Aug 7, 2025
cross compile but onnx-ml.pb.cc error
#9093 closed Aug 7, 2025
how to input 'None' in cpp-version
#9121 closed Aug 7, 2025
InferenceSession.run in python is inconsistent in terms of performance
#9208 closed Aug 7, 2025
Can't load Cuda Provider on Linux due symbol lookup error
#9309 closed Aug 7, 2025
ONNXRuntime CPU - Memory spiking continuously (Memory leak)
#9313 closed Aug 7, 2025
error: '_Frees_ptr_opt_' has not been declared
#9332 closed Aug 7, 2025
QLinearConv per-channel result is wrong and it's seem overflow when input is big for my model
#9365 closed Aug 7, 2025
ORT execution fails when a gradient builder is not registered for module-local functions
#9375 closed Aug 7, 2025
Relu getting dropped during quantization
#9425 closed Aug 7, 2025
OnnxRuntime Build Failure in Docker
#9530 closed Aug 7, 2025
YAMNet model running on CudaExecutionProvider is 3x slower than running on tensorflow
#9657 closed Aug 7, 2025
Gap in inference time between onnxruntime and torch vanishes when increasing the batch size
#9660 closed Aug 7, 2025
libonnxruntime.so crash
#9684 closed Aug 7, 2025
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Conv(1) node with name 'Conv_0'
#9685 closed Aug 7, 2025
yolov5 with the compiled onnxruntime by self，but is so slow, not with the GPU
#9689 closed Aug 7, 2025
Unable to load shared library 'onnxruntime' on MacOS (DllNotFoundException)
#9707 closed Aug 7, 2025
Support for int64 with webgl backend of the web runtime
#9724 closed Aug 7, 2025
ouput of onnx model with custom op in the loop structrue is confusing
#9742 closed Aug 7, 2025
How to build for multiple execution provider?
#9756 closed Aug 7, 2025
Inference is slower when running inside Docker
#9767 closed Aug 7, 2025
[ONNXRuntimeError] : 1 : FAIL : Fatal error: test_custom is not a registered function/op
#9831 closed Aug 7, 2025
non-NEON Compatibility
#9849 closed Aug 7, 2025
Yolov5 ORT train failed with onnxruntime backend
#9936 closed Aug 7, 2025
Support for pip wheel tensorrt
#9986 closed Aug 7, 2025
question about warnup long time
#10017 closed Aug 7, 2025
Importing onnxruntime on AWS Lambdas with ARM64 processor causes crash
#10038 closed Aug 7, 2025
how to forward with a batch images, oncetime?
#10071 closed Aug 7, 2025
when my models input size is 3808, then i forward with yolov5, the memry is break.
#10074 closed Aug 7, 2025
Same Pad_Head value in ORT for SAME_UPPER/SAME_LOWER if get negative odd pad value
#10086 closed Aug 7, 2025
onnxruntime latest version segment fault
#10113 closed Aug 7, 2025
ORTModule import error : with onnxruntime
#10127 closed Aug 7, 2025
BatchNorm fails on CUDA EP with zero length sequences
#10128 closed Aug 7, 2025
Do you have any plan to add 'Round' Operator for gradient builder registry for orttrainer?
#10138 closed Aug 7, 2025
Performance question about some nodes generated by dynamic quantization
#10153 closed Aug 7, 2025
Sigmoid fails and output all zeros
#10154 closed Aug 7, 2025
Why does onnxruntime run slower on C++?
#10155 closed Aug 7, 2025
`InferenceSession` initialization hangs
#10166 closed Aug 7, 2025
TensorRT EP failed to set INT8 dynamic range.
#10206 closed Aug 7, 2025
how to use docker and onnxruntime deploy onnx model on GPU?
#10257 closed Aug 7, 2025
Inconsistent inference timing on CPU
#10270 closed Aug 7, 2025
Inference: Time in GPU is similar in CPU. GPU not speed up
#10271 closed Aug 7, 2025
multiple InferenceSession slowdown inference speed
#10273 closed Aug 7, 2025
DnnlExecutionProvider is not visible in python API
#10275 closed Aug 7, 2025
add QLinearMatMul do not quantize per channel flag to quantize_static extra options
#10283 closed Aug 7, 2025
onnxruntime inference is around 5 times slower than pytorch when using GPU
#10303 closed Aug 7, 2025
Bug: pthread sent an error! undefined:undefined: ortWasmThreaded is not defined
#10311 closed Aug 7, 2025
Quantized int8 onnx GPT2 model returns different tokens whether using past_key_values or not for the same sentence
#10322 closed Aug 7, 2025
Onnxruntime multithread options [C++ CPU]
#10330 closed Aug 7, 2025
Issues when trying to use Onnxruntime and Tensorrt execution provider in a java application
#10352 closed Aug 7, 2025
build onnxruntime error linux
#10364 closed Aug 7, 2025
Error happened while building onnxruntime
#10378 closed Aug 7, 2025
Question about hidden states in onnx DistilGPT2
#10382 closed Aug 7, 2025
Is TensorRT execution provider caching is thread-safe
#10412 closed Aug 7, 2025
Loading a Keras model with custom layers into Microsoft.ML
#10419 closed Aug 7, 2025
ort-web Error: invalid input shape. when using webgl backend and there is a torch.nn.BatchNorm1d layer in the network
#10437 closed Aug 7, 2025
cast BatchNorm2d to int32
#10440 closed Aug 7, 2025
TensorRT input: 717 has no shape specified.
#10443 closed Aug 7, 2025
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 closed Aug 7, 2025
C++ OnnxRuntime-GPU Slower than Python OnnxRuntime-GPU/C++ OnnxRuntime-CPU
#10492 closed Aug 7, 2025
slower after graph optimization!
#10538 closed Aug 7, 2025
onnxruntime and onnxruntime-gpu produce different output for ReduceL1 operator
#10542 closed Aug 7, 2025
Run maskrcnn onnx from pytorch and inference on c++ with gpu sometimes will error
#10543 closed Aug 7, 2025
Exception in DirectML on second inference run
#10546 closed Aug 7, 2025
Unit Tests failure while building on Windows with CUDA EP
#10561 closed Aug 7, 2025
Building Error
#10600 closed Aug 7, 2025
OpenVINO Execution provider's CPU Utility is low
#10601 closed Aug 7, 2025
How to use OpenVINO GetAvailableDevices?
#10602 closed Aug 7, 2025
why it take 200 seconds to run onnxruntime.InferenceSession
#10608 closed Aug 7, 2025
Building OnnxRuntime v1.10.0 with CUDAExecutionProvider for sm_75 GPU fails in CUDA10.2 environment
#10610 closed Aug 7, 2025
C + + onnxruntime GPU is ten times slower than CPU
#10611 closed Aug 7, 2025
Optimization for T5 transformer models.
#10613 closed Aug 7, 2025
[E:onnxruntime:, sequential_executor.cc:346 Execute] Non-zero status code returned while running Add node. Name:'Add_1363' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:505 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 9 by 505
#10618 closed Aug 7, 2025
about providers and providers_options in InferenceSession
#10620 closed Aug 7, 2025
How to use mimalloc in Linux?
#10629 closed Aug 7, 2025
CPU & CUDA execution provider produce different value
#10636 closed Aug 7, 2025
No libonnxruntime_providers_cuda.so generated?
#10639 closed Aug 7, 2025
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION
#10657 closed Aug 7, 2025
Get wrong result when use webgl backend
#10673 closed Aug 7, 2025
onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs initializer_node_arg != nullptr was false.
#10677 closed Aug 7, 2025
Need help on the following from wiki listed roadmap.
#10689 closed Aug 7, 2025
Output shape is mismatched with ONNX SPEC about Resize_tf_crop_and_size with scale input
#10727 closed Aug 7, 2025
gpu onnxruntime lib
#10731 closed Aug 7, 2025
Onnx model consumes huge CPU memory
#10742 closed Aug 7, 2025
inference qdq model failed with TRT EP.
#10743 closed Aug 7, 2025
build on windows cup is fine，but cuda not
#10745 closed Aug 7, 2025
Is there a version of onnxruntime that is compatible with windows 7?
#10749 closed Aug 7, 2025
can build on windows with Geforce 1060 card, cuda 11.0 cudnn 8.0.2 successfully?
#10763 closed Aug 7, 2025
very slow in inference
#10764 closed Aug 7, 2025
ONNX models give slower inference in Python Multiprocessing
#10786 closed Aug 7, 2025
Inference time of onnxruntime gpu increases at very high batch sizes
#10789 closed Aug 7, 2025
Transformer optimizer outputs confusing error
#10838 closed Aug 7, 2025
C++ is 10x slower compared with Python, CPU only
#10849 closed Aug 7, 2025
Windows 32 bit performance much slower than 64bit?
#10855 closed Aug 7, 2025
Different inference results from python and C#
#10863 closed Aug 7, 2025
Does WebGL fail when network inputs are not dimensions in powers of two?
#10873 closed Aug 7, 2025
TensorRT conversion support on Huggingface transformers quantized models.
#10888 closed Aug 7, 2025
onnxruntime/capi/onnxruntime_inference_collection.py", line 370, in _create_inference_session sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model) onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from onnx_data/cpm_large_opt.onnx failed:Protobuf parsing failed.
#10892 closed Aug 7, 2025
python3 -m onnxruntime_tools.transformers.optimizer when opt_level=1 comes error for BERT
#10893 closed Aug 7, 2025
1 : Fail : Non-zero status code returned while running FusedConv node.
#10894 closed Aug 7, 2025
After using onnxruntime.transformers.optimizer to optimize onnx, the optimized model fail to tensorrt
#10905 closed Aug 7, 2025
TensorRT Execution [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization
#10914 closed Aug 7, 2025
slow fp16 performance
#10919 closed Aug 7, 2025
onnxruntime TensorRT Related Questions
#10930 closed Aug 7, 2025
MinGW support (MSYS2)
#10976 closed Aug 7, 2025
docker can't clone git repository for ARM64
#10991 closed Aug 7, 2025
Xor with broadcasting computes error
#11000 closed Aug 7, 2025
Inconsistent behavior between CPU and GPU on ReLU operator when input is NaN
#11010 closed Aug 7, 2025
0xc00007b error, could not startup exe at all with onnxruntime1.7 win-x64 cpu on win10
#11016 closed Aug 7, 2025
Failed to build onnxruntime-vitisai docker container due to missing NO_PUBKEY
#11017 closed Aug 7, 2025
Huggingface Transformers Shape Inference Issue
#11019 closed Aug 7, 2025
kalid-onnxruntime Fatal error: Gemm is not a registered function/op
#11021 closed Aug 7, 2025
Updating state of the network
#11026 closed Aug 7, 2025
Can't constant fold SequenceEmpty node
#11041 closed Aug 7, 2025
Cuda EP parallelization issues for batches
#11047 closed Aug 7, 2025
C++ API, "tried creating tensor with negative value in shape" error when 'permute' and 'reshape' functions are used
#11069 closed Aug 7, 2025
Inference session creation freezes
#11087 closed Aug 7, 2025
compile with cuda error:Couldn't find CUDA library root.
#11090 closed Aug 7, 2025
Performance reduction due to copying of output OrtValues to numpy arrays
#11099 closed Aug 7, 2025
Using DnnlExecutionProvider for inference is much slower than using CPUExecutionProvider.
#11122 closed Aug 7, 2025
Different detection output values for C++ and Python with onnxruntime
#11123 closed Aug 7, 2025
docker container linux run onnxruntime infer core dumped
#11135 closed Aug 7, 2025
[question] yolov5-onnx-float16 not improve on GPU
#11151 closed Aug 7, 2025
How to use Flask with onnxruntime
#11156 closed Aug 7, 2025
Instruction level profiling in onnxruntime
#11159 closed Aug 7, 2025
No c++ header files for building custom op
#11169 closed Aug 7, 2025
A normal output of convolution layer multiplies infinity will result in NaN
#11173 closed Aug 7, 2025
Build from source issue on Windows
#11178 closed Aug 7, 2025
onnxruntime-web is 11-17x times slower than native inference
#11181 closed Aug 7, 2025
Custom Op does not support dynamic input/output number
#11186 closed Aug 7, 2025
Saving GPT2LMHeadModel_ConfigurableOneStepSearch error.
#11198 closed Aug 7, 2025
How to compress the sparse matrix in onnx model
#11200 closed Aug 7, 2025
Inference time for quantized onnx models, TensorRT> CUDA> CPU. Is this expected?
#11201 closed Aug 7, 2025
auto_set_affinity can't be set to true for parallel executor
#11205 closed Aug 7, 2025
[web] ~100 seconds to load model/InferenceSession
#11217 closed Aug 7, 2025
NonZero shape inference behavior with scalar input mismatches ONNX and PyTorch
#11232 closed Aug 7, 2025
Unhandled exception at 0x00007FFABE6A9538 (cudnn_cnn_infer64_8.dll) in Onnx.exe
#11235 closed Aug 7, 2025
[React Native .ort Model Loading Error] "Error: Can't load a model: No content provider: ..."
#11239 closed Aug 7, 2025
I want use gpu on my jetson nx2 platform with c++, how should i do?
#11240 closed Aug 7, 2025
Non-zero status code returned while running Slice node. Name:'Slice_24' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array
#11257 closed Aug 7, 2025
Unsupported If operator in gradient builder for Hugging Face Transformers RoBERTa model
#11268 closed Aug 7, 2025
optimize_model : new model types
#11270 closed Aug 7, 2025
The onnx model of IMDN is slower than the original pytorch model and output many warnings
#11274 closed Aug 7, 2025
pulled master 1.12 quantization get unexpected result
#11277 closed Aug 7, 2025
LSTM export ONNX:Non-zero status code returned while running ScatterElements node. Name:'ScatterElements_880'
#11278 closed Aug 7, 2025
Why is gpt2-xl (based on transformer-xl) onnx slower than the original pytorch
#11293 closed Aug 7, 2025
is the effect of onnx on Bert affected by python version?
#11295 closed Aug 7, 2025
TVM EP and TensorRT EP do not support dynamic inputs
#11333 closed Aug 7, 2025
MacOS M1 binary compilation and possibility to fine tune a model in C++
#11343 closed Aug 7, 2025
CUDAExecutionProvider optimized model adds incompatible node resulting in Failed to find kernel for MemcpyToHost
#11348 closed Aug 7, 2025
Lower performance on Inceptionv3/4 model with TensorRT EP than TensorRT directly
#11356 closed Aug 7, 2025
CUDAExecutionProvider not releasing memory after terminate session
#11362 closed Aug 7, 2025
ONNX Runtime compatibility for Jetson AGX Xavier
#11378 closed Aug 7, 2025
About running onnxruntime in singularity container
#11397 closed Aug 7, 2025
Benchmark code using torch.onnx.export
#11399 closed Aug 7, 2025
About building onnxruntime singularity container with DockerFile
#11409 closed Aug 7, 2025
Static quantization+per_channel is wrong for MobileNetV3
#11415 closed Aug 7, 2025
Can I quantize TreeEnsembleClassifier op?
#11436 closed Aug 7, 2025
Onnx T5 fp16 conversion without past_key_values
#11438 closed Aug 7, 2025
How to run a double input onnx model
#11453 closed Aug 7, 2025
InferenceSession giving different results than the original sklearn SVC model
#11490 closed Aug 7, 2025
C#, How to access the different output layer of inference (semantic segmentation)
#11502 closed Aug 7, 2025
[Documentation Request]
#11505 closed Aug 7, 2025
onnxruntime error
#11509 closed Aug 7, 2025
[Documentation Request] tensorAt for Csharp?
#11510 closed Aug 7, 2025
About Convolution Implementation
#11517 closed Aug 7, 2025
How to release a session properly?
#11529 closed Aug 7, 2025
Fail to convert model with reusable blocks
#11530 closed Aug 7, 2025
CPUExecutionProvider outputs wrong value for a quantized model
#11532 closed Aug 7, 2025
TensorRT EP session creation fails with invalid weights type of Int8 when ORT_TENSORRT_INT8_ENABLE set to 1
#11535 closed Aug 7, 2025
Using a model with float input types causes space issue
#11541 closed Aug 7, 2025
how to use c sharp to call libonnxruntime.dll? i build the onnxruntime dynamic dll, did it can be encapsulated c++ dll in order to c sharp called
#11550 closed Aug 7, 2025
can c sharp call onnxruntime c++ dll don't use c# third lib？ i create onnxruntime c++ project,but i want to call the dll with c sharp
#11551 closed Aug 7, 2025
T5-Large Export Results in ProtoBuf Error due to 2GB External Data when using padded inputs
#11558 closed Aug 7, 2025
CUDA failure 100: no CUDA-capable device is detected ; error when inferencing on a GPUVM
#11561 closed Aug 7, 2025
Specify CPUs to use for parallel inference when external CPU pinning is used
#11563 closed Aug 7, 2025
[js/web] Inference is Broken in Safari when Cross Origin Isolation is active
#11567 closed Aug 7, 2025
Header mismatch C/C++ - mac
#11570 closed Aug 7, 2025
The effect of turning optimization on and off on quantized model performance
#11576 closed Aug 7, 2025
ONNXRUNTIME + OpenVINO on ARM64
#11582 closed Aug 7, 2025
Can onnxruntime be built from source with any cuda version? Is it unrelated to the onnxruntime version?
#11584 closed Aug 7, 2025
cpu and gpu results is not the same
#11590 closed Aug 7, 2025
CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; error when inferencing on a GPUVM
#11592 closed Aug 7, 2025
issues with pybind11 repository while installing
#11595 closed Aug 7, 2025
Bad performance for QDQ model with openvino EP
#11604 closed Aug 7, 2025
Unable to build onnxruntime_v1.10.0 C++ api with --enable_memory_profile --enable_cuda_profiling flags
#11607 closed Aug 7, 2025
Shape inference fails
#11614 closed Aug 7, 2025
building ——libonnxruntime_providers_cuda.so Error running link command: No such file or directory
#11621 closed Aug 7, 2025
how to set providers with onnx runtime-gpu1.70 ?
#11624 closed Aug 7, 2025
using multithread to call onnxruntime inference,
#11628 closed Aug 7, 2025
which tags should i download of onnxruntime-gpu 1.6 for c#
#11646 closed Aug 7, 2025
build for c#
#11648 closed Aug 7, 2025
output shape can not be specified in com.microsoft::GridSample op
#11652 closed Aug 7, 2025
Installing ORTModule torch extension reports TypeError
#11663 closed Aug 7, 2025
when set inter_op_num=0 with ORT_PARALLEL model the performance is very bad than inter_op_num=1?
#11668 closed Aug 7, 2025
How to implement a new operator inference function?
#11678 closed Aug 7, 2025
[web] `ort.InferenceSession.create` silently hangs/fails on iOS/iPad browsers if COEP/COOP headers are set
#11679 closed Aug 7, 2025
which onnxruntime-gpu version is compatible for CUDA 11.1 ?
#11685 closed Aug 7, 2025
Real-ESRGAN slow onnxruntime inference compared to Pytorch one
#11688 closed Aug 7, 2025
Linux CI pipelines can't test unreleased versions of ONNX
#11693 closed Aug 7, 2025
Dynamic quantization of Albert model
#11701 closed Aug 7, 2025
Low level profiling for onnxrt Conv kernel(default backend)
#11702 closed Aug 7, 2025
CUDA EP spending lots of time idling
#11706 closed Aug 7, 2025
Race condition when setting do_copy_in_default_stream to false
#11713 closed Aug 7, 2025
Reading back multidimensional output in C++
#11718 closed Aug 7, 2025
how to get the remaining GPU memory to get the batch size?
#11735 closed Aug 7, 2025
ssd_mobilenet_v1 infer error for TensorRT Execution Provider
#11736 closed Aug 7, 2025
build rknpu backend error
#11738 closed Aug 7, 2025
Pip installed Transformer Benchmark cannot run on TF
#11751 closed Aug 7, 2025
Converted ONNX model works in Python but not in C++
#11761 closed Aug 7, 2025
Failed to build onnxruntime on Apple Silicon
#11805 closed Aug 7, 2025
I do not get any performance improvement after using TensorRT provider for object detection model
#11806 closed Aug 7, 2025
When I use onnxruntime to run onnx model on GPU, it sucks up too much video memory. Is that normal?
#11809 closed Aug 7, 2025
Issue importing onnxruntime
#11815 closed Aug 7, 2025
[Bug] Mixing negative and positive paddings causes segfault/uninitialized memory values produced in reflected pad
#11828 closed Aug 7, 2025
in cmake/CMakeList.txt all avx related option all set off, do we need do anything to use avx features?
#11833 closed Aug 7, 2025
[ONNXRuntimeError] FuseReluClip failure
#11836 closed Aug 7, 2025
Incompatible dimensions for matrix multiplication Error in StarNet model when doing InferenceSession
#11846 closed Aug 7, 2025
What's the meaning of the hole of tracing file
#11850 closed Aug 7, 2025
How to use batch run？
#11852 closed Aug 7, 2025
Use NPU in NXP iMX8MP?
#11854 closed Aug 7, 2025
What is the meaning of src_arg_index and dst_arg_index in EdgeEndToMatch structure?
#11856 closed Aug 7, 2025
Wrong output shape due to MergeShape failure
#11870 closed Aug 7, 2025
Not clear quantization pipeline for tensorrt ep
#11873 closed Aug 7, 2025
Pytorch -> Onnx custom Yolov5 model works in python but not in JS
#11874 closed Aug 7, 2025
[ONNXRuntimeError] Load model from *** failed: Unsuported type proto value case
#11889 closed Aug 7, 2025
Quantize specific ops per-tensor while per_channel=True
#11890 closed Aug 7, 2025
onnx and onnxruntime disagree on input with no known rank
#11891 closed Aug 7, 2025
Bug: MatMul fails for input shapes of [0, k] and [k, ]
#11895 closed Aug 7, 2025
Immense GPU memory consumption
#11903 closed Aug 7, 2025
ConvTranspose with auto_pad attribute
#11927 closed Aug 7, 2025
how to get inference time with c# onnxruntime-gpu-1.6.0
#11946 closed Aug 7, 2025
execute dnnl provider error
#11947 closed Aug 7, 2025
windows11+onnxruntime1.8.0+vs2019 inferencing crash
#11950 closed Aug 7, 2025
Multi thread of single session Python vs C++ (end with core dumped)
#11951 closed Aug 7, 2025
Inference_GPT2-OneStepSearch_OnnxRuntime_CPU.ipynb Error
#11959 closed Aug 7, 2025
Question about quantize Gemm OP
#11961 closed Aug 7, 2025
Got segmentation fault error when using 'InferenceSession' API
#11964 closed Aug 7, 2025
how to configure global/shared threadpool with multithread, in c# API?
#11966 closed Aug 7, 2025
set gpu option failed
#11967 closed Aug 7, 2025
quant onnx model slower than pytorch with mish6 activation, however faster with relu6
#11975 closed Aug 7, 2025
inference time is not stable
#11983 closed Aug 7, 2025
Any interest in hosting the Rust bindings
#11992 closed Aug 7, 2025
inference is different on linux and windows
#11993 closed Aug 7, 2025
Inconsistent result to NumPy and PyTorch when consecutively casting a float tensor to int32 and then to bool
#11994 closed Aug 7, 2025
failed to initialize a session in the GPU environment
#11996 closed Aug 7, 2025
The test time of sess.run does not match the time of profile
#11997 closed Aug 7, 2025
build C#api with cuda 11.0 /cudnn 8.0
#11999 closed Aug 7, 2025
Issue with NeMo MTEncDecModel model in ONNX IOBinding
#12003 closed Aug 7, 2025
how to build onnxruntime from source with dnnl?
#12011 closed Aug 7, 2025
create op
#12017 closed Aug 7, 2025
Resize with mode linear is missing output elements
#12019 closed Aug 7, 2025
Microsoft.ML.OnnxRuntime.Tests.InferenceTest.TestPreTrainedModels should get opset version from the model file
#12040 closed Aug 7, 2025
Builds C# bindings and creates nuget package
#12042 closed Aug 7, 2025
GlobalAveragePool on large size of ones miscalculates
#12043 closed Aug 7, 2025
Using onnxruntime server for model deployment
#12044 closed Aug 7, 2025
Support pasts as inputs in gpt2 beam search operator
#12047 closed Aug 7, 2025
Build wasm static library bug because of missing `testdata` folder.
#12048 closed Aug 7, 2025
Performance in parallel session Run()
#12049 closed Aug 7, 2025
Builds C# bindings and creates nuget package for vs2019 install
#12061 closed Aug 7, 2025
ONNXRuntimeError for "Where" node when the input is too long
#12065 closed Aug 7, 2025
Performance issue with beam search in onnxruntime
#12078 closed Aug 7, 2025
Support for cmake's FetchContent()
#12081 closed Aug 7, 2025
TensorRT Provider Vs TensorRT Native
#12083 closed Aug 7, 2025
Resize with mode linear always produces 0.5 on GPU regardless of the input
#12091 closed Aug 7, 2025
Resize with `nearest` mode have inconsistent results compared to PyTorch and TVM
#12098 closed Aug 7, 2025
onnxruntime tensorrt sometimes takes a very long time
#12120 closed Aug 7, 2025
How do I call the same model in CUDA with many various inputs?
#12126 closed Aug 7, 2025
Error in symbolic_shape_infer.py: assert name in self.sympy_data_ or ...
#12127 closed Aug 7, 2025
Inference time vs torch w/regard to batch_size and BatchNorm
#12130 closed Aug 7, 2025
When will Attention OP extra_add_qk input support automatic broadcast
#12149 closed Aug 7, 2025
Query regarding timings under ONNXRT profiler
#12150 closed Aug 7, 2025
Hi Does ONNX Runtime support FP16 and INT8 inference on Intel OneDNN ExecutionProvider?
#12160 closed Aug 7, 2025
Eager mode generator support non-tensor return types
#12163 closed Aug 7, 2025
symbolic_shape_infer.py not working with models quantized with 🤗 Optimum for TensorRT
#12173 closed Aug 7, 2025
upgrading pip and wheels kills CUDAExecutionProvider
#12185 closed Aug 7, 2025
why first session.run is too slower than after
#12197 closed Aug 7, 2025
Performance issue of ConvInteger
#12206 closed Aug 7, 2025
How to release memory after Inference session run in Python
#12207 closed Aug 7, 2025
Regarding the dynamism for custom op in ONNXRT
#12211 closed Aug 7, 2025
Quantized Model Running Slow Using Cuda as EP
#12229 closed Aug 7, 2025
Exported beam search model consumes a lot of more memory
#12246 closed Aug 7, 2025
Mismatch in the order of the column names in the benchmarking script for transformer models
#12265 closed Aug 7, 2025
LoadLibrary failed with error 126 (DirectML)
#12269 closed Aug 7, 2025
TRT EP failed to create model session with CUDA custom op
#12282 closed Aug 7, 2025
Since ORT 1.12 ort.InferenceSession throws error when the last provider is not capable
#12287 closed Aug 7, 2025
Resize op can't work well under Cubic mode with ORT 1.12.
#12302 closed Aug 7, 2025
Details regarding ONNXRuntime inference with OpenVino Backend
#12305 closed Aug 7, 2025
Why the performance of onednn is worse than the common version
#12315 closed Aug 7, 2025
ONNXRT default CPU EP vs Openvino EP Performance
#12316 closed Aug 7, 2025
onnx graph partition optimize
#12318 closed Aug 7, 2025
Wrong native library directory name for M1 Mac in the Java package
#12324 closed Aug 7, 2025
MetaCommand exception from DirectML EP
#12328 closed Aug 7, 2025
window10 ort with openvino backend error
#12334 closed Aug 7, 2025
unsafe exception code in C++ API, wrongly declaring exceptions, incomplete constructors
#12338 closed Aug 7, 2025
Unable to build Onnxruntime 1.12.0 with OpenVINO 2020.3 on Windows 10
#12342 closed Aug 7, 2025
Quantized ONNX model output
#12346 closed Aug 7, 2025
Performance gains by ONNX inconsistent
#12348 closed Aug 7, 2025
Integer quantization fails on Transformer-based vision model
#12362 closed Aug 7, 2025
Setting Openvino EP to run on one core with one thread
#12365 closed Aug 7, 2025
Unable to build tensorrt docker image
#12373 closed Aug 7, 2025
Accept dictionary of tensor as input (python api)
#12380 closed Aug 7, 2025
Fail to build onnxRT with oneDNN using official build command
#12382 closed Aug 7, 2025
Segmentation fault
#12386 closed Aug 7, 2025
While loading the onnx file with InferenceSession getting session ID 11 error
#12402 closed Aug 7, 2025
Failed to build with ACL(and ARMnn)
#12407 closed Aug 7, 2025
Can't build with OpenVINO 2022.1 ("onnxruntime_providers_shared" does not exist)
#12411 closed Aug 7, 2025
`Env(OrtLoggingLevel, const char* logid, OrtLoggingFunction, ...` fails to pass `logid` param to log function
#12414 closed Aug 7, 2025
CUDA support for longer-input models like BigBird
#12463 closed Aug 7, 2025
I found that the OnnxRuntime used almost all of the instruction sets for the convolutional computations and I wanted to optimize for that
#12479 closed Aug 7, 2025
How to exit abnormally in the Python Operator (PyOp)
#12481 closed Aug 7, 2025
QDQ + Add nodes are not fused into QLinearAdd when the graph is optimized
#12487 closed Aug 7, 2025
performance is poor when onnxruntime C++ run in intel cpu
#12489 closed Aug 7, 2025
LSTM Y output is inconsistent with TF inference result when seq_len is effective
#12492 closed Aug 7, 2025
Clarify NMS sorting strategy
#12493 closed Aug 7, 2025
Attributes in nested function calls are zeroed out
#12506 closed Aug 7, 2025
Computing loss within onnxruntime inference (GPT2 model)
#12526 closed Aug 7, 2025
java deploy in k8s Failed to load library libonnxruntime_providers_cuda.so with error
#12540 closed Aug 7, 2025
engine decryption does not work in TensorRT EP
#12551 closed Aug 7, 2025
Add execution provider selection for quantize_static
#12573 closed Aug 7, 2025
Document beamsearch
#12584 closed Aug 7, 2025
Name:'MatMul_32007' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
#12594 closed Aug 7, 2025
Run the onnx model converted from seq2seq and report an error
#12608 closed Aug 7, 2025
Where is the definition of session.Run() in onnxruntime C++ api
#12623 closed Aug 7, 2025
cuda_provider_options.h include non existing file?
#12636 closed Aug 7, 2025
when the model support dynamic batch ，the input shape [ -1,-1,80], how can warm up? because of the dynamic batch , I do know the warmup batchsize number ,can it use min_batchsize and max _batchsize to warmup?
#12637 closed Aug 7, 2025
The quantization model reduces the accuracy compared to the TRT
#12638 closed Aug 7, 2025
Failed to create TensorrtExecutionProvider using onnxruntime-gpu
#12639 closed Aug 7, 2025
Confusing exception about supported types
#12648 closed Aug 7, 2025
get kill signal when quantize the ONNX model using quantize_static
#12652 closed Aug 7, 2025
Enable Global Shared Threadpool and Memory Allocator For C#
#12654 closed Aug 7, 2025
Non-zero status code returned while running TopK node. (ssdlite320_mobilenet_v3_large)
#12669 closed Aug 7, 2025
Wrong Results for FP16 Models in CUDAExecutionProvider and TensorRTExecutionProvider
#12726 closed Aug 7, 2025
`static inline Ort::Env onnx_env{nullptr}` easily leads to nullptr deref on app exit
#12736 closed Aug 7, 2025
SystemError : 13 for transformers optimizer
#12745 closed Aug 7, 2025
BatchNormalization produces all zeros for 1D input
#12754 closed Aug 7, 2025
How to set the priority of ONNX in GPU?
#12760 closed Aug 7, 2025
onnxruntime-linux-x64-gpu-1.12.1
#12766 closed Aug 7, 2025
Asynchronous Inference
#12768 closed Aug 7, 2025
I want to use tensorrt as the back-end of onnx
#12781 closed Aug 7, 2025
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : CUDA error executing cudaSetDevice(GetDeviceId())
#12785 closed Aug 7, 2025
cast op not support multithread
#12786 closed Aug 7, 2025
How to set cpu_num to a specific value?
#12819 closed Aug 7, 2025
AttentionPastState_dynamic test fails during building with CUDA EP from source
#12820 closed Aug 7, 2025
Memory management
#12824 closed Aug 7, 2025
error: package directory 'onnxruntime/backend' does not exist [Build]
#12922 closed Aug 7, 2025
[Web] Failed to compile shader on WebGL
#12927 closed Aug 7, 2025
Disabling optimization produces incorrect results on CUDAExecutionProvider in 1.12
#12946 closed Aug 7, 2025
[Performance] Dynamic model input prediction is slow
#12955 closed Aug 7, 2025
Why is there not ParallelExecutionPlan like SequentialExecutionPlan in the ParallelExecutor of onnxruntime?
#13036 closed Aug 7, 2025
onnxruntime calculate gradients but no need for training
#13057 closed Aug 7, 2025
onnxruntime-gpu, cudaoptions, result is different
#13061 closed Aug 7, 2025
onnxruntime-node crash the electron app[Web]
#13086 closed Aug 7, 2025
what's the differences between onnxruntime with openvino backend VS openvino directly?
#13087 closed Aug 7, 2025
[Performance] a problem for Ort::IoBinding
#13090 closed Aug 7, 2025
[Performance] ONNX Runtime GPT2 Model Running Significantly Slower than PyTorch
#13105 closed Aug 7, 2025
[Test issue] Updated Ignore
#13109 closed Aug 7, 2025
[Performance] Multithreading performance tails off after 3 threads, possible memory issue
#13138 closed Aug 7, 2025
Failed to create CUDAExecutionProvider
#13139 closed Aug 7, 2025
Onnxruntime fails on GPU loading inference with int8 models
#13168 closed Aug 7, 2025
Multilingual-MiniLM-L12-H384 ONNX inference in NodeJS
#13171 closed Aug 7, 2025
GPU inference result not stable
#13178 closed Aug 7, 2025
[Performance] inference time much slower (1529ms vs. 20 ms) on GPU vs CPU.
#13199 closed Aug 7, 2025
[Performance] Performance issue on Linux vs Windows for BERT model.
#13224 closed Aug 7, 2025
Contrib IRFFT operator output dimensions calculation
#13236 closed Aug 7, 2025
Onnx create session takes a long time.
#13240 closed Aug 7, 2025
Inference time spikes in UNET onnx
#13258 closed Aug 7, 2025
[Performance] Too Slow when i do inference
#13265 closed Aug 7, 2025
[Mobile] .Net target Arm64
#13295 closed Aug 7, 2025
[ONNXRuntimeError] : 1 : FAIL : This is an invalid model. Error: the graph is not acyclic.
#13322 closed Aug 7, 2025
onnx Pad operator with negative pads value outputs 'nan'
#13332 closed Aug 7, 2025
[Build] Upgrade to latest protobuf
#13335 closed Aug 7, 2025
[Performance] Comparing ONNX CPU execution profiles of two FasterRCNN checkpoints
#13341 closed Aug 7, 2025
[Build] ONNX Runtime Build Error ZCU102 (DPUCZDX8G)
#13351 closed Aug 7, 2025
quantize_dynamic results in initializer error
#13358 closed Aug 7, 2025
[Performance] CNN Inference has latency spikes with TensorRT EP
#13366 closed Aug 7, 2025
Onnxruntime crashes if setting cpu affinity fails in Ort::Session constructor
#13367 closed Aug 7, 2025
Using GPU in c++
#13380 closed Aug 7, 2025
Can't run qdq model with TRT EP
#13381 closed Aug 7, 2025
Whether the .trt model can be loaded
#13394 closed Aug 7, 2025
Does ORT support quantize
#13413 closed Aug 7, 2025
ONNX Runtime Inference on GPU: Failed to create CUDAExecutionProvider
#13414 closed Aug 7, 2025
Consecutive casting leads to wrong result
#13418 closed Aug 7, 2025
Parameters are optimized out even if it is a needed return value
#13425 closed Aug 7, 2025
[Web] Is it possible to use both webgl backend and wasm backend in onnxruntime-web
#13435 closed Aug 7, 2025
run_with_iobinding is not outputting the expected result for batched input data for T5 model running on ort CUDA EP
#13463 closed Aug 7, 2025
GPU Arena blocked session->Run()
#13464 closed Aug 7, 2025
Consecutive call to Ort::Session::Run() crashes
#13476 closed Aug 7, 2025
Does onnxruntime-gpu support calling CUDA code or a custom kernel function to preprocess images?
#13491 closed Aug 7, 2025
[Performance]
#13492 closed Aug 7, 2025
ORT fails on Slice() when indices are of different integer types
#13497 closed Aug 7, 2025
Init provider bridge failed when put onnxruntime folder under path which contains other Unicode character
#13499 closed Aug 7, 2025
[Performance]
#13500 closed Aug 7, 2025
[Performance] C# Gpu memory allocation
#13504 closed Aug 7, 2025
Removing the semantic segmentation's bounding box
#13513 closed Aug 7, 2025
How to transfer the Ort::Value obtained to cuda code for post-processing, such as a .cu file?
#13528 closed Aug 7, 2025
[Training] Whether onnxruntime training can be used in Megatron.
#13532 closed Aug 7, 2025
How can I load a model larger than 2G in memory
#13543 closed Aug 7, 2025
Zero Result with DirectML Execution Provider
#13545 closed Aug 7, 2025
Inference speed: Swintransformer torch vs onnxruntime-gpu
#13550 closed Aug 7, 2025
[Build]
#13554 closed Aug 7, 2025
ORT fails on CPU looking for LayerNormalization node, for mixed-precision ONNX
#13556 closed Aug 7, 2025
[TVM] Exception during initialization
#13572 closed Aug 7, 2025
unable to build onnxruntime for openvino execution provider to get nuget packages
#13577 closed Aug 7, 2025
Does Microsoft.ML.OnnxRuntime have a dependency on System.CodeDom.dll ?
#13604 closed Aug 7, 2025
[Build]
#13606 closed Aug 7, 2025
[C++] Model output image different in C++ ORT vs. Python ORT & PyTorch
#13614 closed Aug 7, 2025
[Performance] Operators assigned to CPU instead of CUDA
#13615 closed Aug 7, 2025
Dimension Padding problem in reduction_ops.cc
#13654 closed Aug 7, 2025
[Performance] onnxruntime session uses 5x more system memory if torch is imported
#13662 closed Aug 7, 2025
GPT2 Static Quantization Failed. Non-zero status code returned while running Reshape node. Name:'past_0_ReduceMax_Reshape'
#13667 closed Aug 7, 2025
Help in running onnxruntime with SNPE as execution provider
#13693 closed Aug 7, 2025
GPU with device_id=0 is always occupied no matter what device_id is specified when run the inference
#13697 closed Aug 7, 2025
onnxruntime-gpu get warning "Serializing optimized model with Graph Optimization level greater than ORT_ENABLE_EXTENDED and the NchwcTransformer enabled".
#13709 closed Aug 7, 2025
[DML] reproducible bug on DML provider
#13714 closed Aug 7, 2025
[Build] Avoid NEON when building on Raspberry Pi 4
#13718 closed Aug 7, 2025
[Web] Uncaught (in promise) TypeError: cannot resolve operator 'Erf' with opsets: ai.onnx v15
#13729 closed Aug 7, 2025
[Web] NPM package include ts files in the output
#13736 closed Aug 7, 2025
[Web]
#13749 closed Aug 7, 2025
Cannot run inference on Integrated Graphics with OpenVino EP using C Sharp API
#13772 closed Aug 7, 2025
QDQ not instrumenting inputs if first operator is a SUM
#13794 closed Aug 7, 2025
how bring my hardware backend to onnxruntime framework
#13797 closed Aug 7, 2025
[WebGL] cannot resolve operator 'DynamicQuantizeLinear' with opsets: ai.onnx v16, ...
#13800 closed Aug 7, 2025
[CPUExecutionProvider] PyTorch/Numpy operations following InferenceSession.run() are 50x slower compared to using dummy inputs
#13808 closed Aug 7, 2025
hello,how to improve [Performance] in batch inference with multicore cpu
#13820 closed Aug 7, 2025
Dynamic quantization is useless on AMD cpus(AMD EPYC 7K62 48-Core Processor)
#13872 closed Aug 7, 2025
SSDLite 320: RuntimeException on CUDA. TopK index assert was false.
#13876 closed Aug 7, 2025
Segmentation Faults when using TensorRT on Jetson Orin Dev Kit
#13877 closed Aug 7, 2025
Model run with `TensorrtExecutionProvider` outputs different results compared to `CPUExecutionProvider` / `CUDAExecutionProvider` when the ONNX `Loop` operator is used
#13894 closed Aug 7, 2025
[Web] dynamic batch size doesn't work when use webgl provider
#13909 closed Aug 7, 2025
[Web] ort-wasm-simd.wasm can't be loaded in Electron renderer (using webpack)
#13933 closed Aug 7, 2025
[Build] Incomplete type used in nested name specifier, Ubuntu
#13942 closed Aug 7, 2025
Do I need to convert data to device for TensorRTExecutionProvider?
#13952 closed Aug 7, 2025
CUDA provider gives different result with respect to CPU
#13962 closed Aug 7, 2025
bug: onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_4 node.
#13973 closed Aug 7, 2025
Non-zero status code returned while running Resize node
#13975 closed Aug 7, 2025
Java Problematic Frame [libonnxruntime.dylib+0x8212be] onnxruntime::DataTypeImpl::ToString(onnxruntime::DataTypeImpl const*)+0xe
#13976 closed Aug 7, 2025
[Performance] [webgl]bad performance of webgl
#13986 closed Aug 7, 2025
[windows7] Unable to load DLL 'onnxruntime.dll': The specified module could not be found.
#14003 closed Aug 7, 2025
[Performance] CUDA EP with Strange Inference Time
#14016 closed Aug 7, 2025
[Performance] the speed and cpu utilization with SetIntraOpNumThreads(1) and SetIntraOpNumThreads(2)
#14018 closed Aug 7, 2025
[Performance] onnx vs pt memory usage
#14029 closed Aug 7, 2025
[Performance] High memory use by CUDAProvider in Jetson Xavier NX(JetPack 4.4)
#14038 closed Aug 7, 2025
java onnxruntime_providers_cuda.dll
#14047 closed Aug 7, 2025
There is a vulnerability in torch:1.12.0,upgrade recommended
#14059 closed Aug 7, 2025
[Build] impossible to build onnxruntime with vs2022
#14086 closed Aug 7, 2025
[Build] core/framework/fence.h not found while build upon CANN
#14121 closed Aug 7, 2025
300% slower on MYRIAD_FP16 when using CustomVision fp16 model
#14125 closed Aug 7, 2025
[Training] Does the current training code support RNN model like seq2seq and Transformer and GNN model?
#14139 closed Aug 7, 2025
[Build] Dockerfile.arm64 - No module named 'packaging' error
#14140 closed Aug 7, 2025
CUDNN error executing cudnnConvolutionForward
#14186 closed Aug 7, 2025
ONNXRuntime outputs numerically incorrect results for mixed precision models.
#14189 closed Aug 7, 2025
Infer shape incorrect for Split with opset 15
#14200 closed Aug 7, 2025
ConvTranspose2d onnxruntime and pytorch forward results are inconsistent
#14208 closed Aug 7, 2025
`onnxruntime.quantization` does not support `.onnx` files produced by `tf2onnx.convert.from_function` with the `large_model` option set to `True`
#14213 closed Aug 7, 2025
No module named 'onnxruntime.transformers.io_binding_helper'
#14230 closed Aug 7, 2025
Valgrind: Source and destination overlap in memcpy_chk
#14254 closed Aug 7, 2025
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_3 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_3_0'
#14280 closed Aug 7, 2025
[Build] Docker arm64 build fails.
#14283 closed Aug 7, 2025
ONNX Runtime support for the graph optimization of bigbird_pegasus model
#14295 closed Aug 7, 2025
TensorRT EP same inference Time of INT 8 and FP 16
#14315 closed Aug 7, 2025
STFT op has the wrong expected shape
#14316 closed Aug 7, 2025
Program will stuck when creating 'Ort::Session'
#14317 closed Aug 7, 2025
[Performance] ONNXruntime CPU is slower than Pytorch Tracing to Torchscript on CPU
#14326 closed Aug 7, 2025
RemoveNode Should be unreachable if CanRemoveNodeAndMergeEdges is in sync with the logic
#14360 closed Aug 7, 2025
[Bug] Attention and QAttention don't work properly in some cases
#14363 closed Aug 7, 2025
Add some custom QlinearXXX Ops
#14365 closed Aug 7, 2025
[Build] Error in building with Tensorrt EP
#14394 closed Aug 7, 2025
Import Error " cannot import name 'get_all_providers' "
#14395 closed Aug 7, 2025
[Training] The gradient builder has not been registered: ReduceMin
#14412 closed Aug 7, 2025
Free allocated data of Ort::Value in C++
#14420 closed Aug 7, 2025
Pad operator not quantizable?
#14422 closed Aug 7, 2025
Different Python exceptions on OOM with `run_with_iobinding` and `run`
#14438 closed Aug 7, 2025
Modifying QlinearADD
#14441 closed Aug 7, 2025
[ONNXRuntimeError] Unsupported OrtValue type with CUDA EP
#14457 closed Aug 7, 2025
[Performance] There is some confusion with onnx + oneDNN or onnx + OpenVINO
#14468 closed Aug 7, 2025
[Build]
#14471 closed Aug 7, 2025
[Performance] cuda_options.arena_extend_strategy = 1 does not free memory
#14474 closed Aug 7, 2025
[Performance] DirectML cost more memory than CPU when process the Win32(X86) program (official demo).
#14479 closed Aug 7, 2025
[Performance] CPU Usage is too high
#14490 closed Aug 7, 2025
[Performance] cuDNN lib mismatch led to an underutilization of GPU
#14498 closed Aug 7, 2025
missing headers and pkgconfig files in binary packages distribution (from github releases) (linux)
#14503 closed Aug 7, 2025
[Web] Runtime error using `onnxruntime-node` with webpack
#14505 closed Aug 7, 2025
[Performance] Find out why the GPU memory allocated with `CUDAExecutionProvider` is much larger than the ONNX size
#14526 closed Aug 7, 2025
Non-zero status code returned while running DnnlCustomOp2 node
#14543 closed Aug 7, 2025
Check and modify the weights of a layer of an onnx model at runtime
#14545 closed Aug 7, 2025
[Performance] DirectML Dynamic Axes very slow
#14550 closed Aug 7, 2025
[BUG] FusedConv node error
#14561 closed Aug 7, 2025
fp32 model with autocast to fp16: Shape mismatch attempting to re-use buffer
#14582 closed Aug 7, 2025
[Build] cuda dll wrap up
#14585 closed Aug 7, 2025
different results with onnxruntime-gpu-1.10
#14587 closed Aug 7, 2025
[Web] currently non-1 steps is not supported for Slice
#14588 closed Aug 7, 2025
Destroying an inference session without exiting the python process
#14590 closed Aug 7, 2025
C# - CUDA Nuget BUG : DefaultLogger Attempt to use DefaultLogger but none has been registered.
#14593 closed Aug 7, 2025
Onnxruntime Arm NN Ep build error.
#14611 closed Aug 7, 2025
[Performance]
#14615 closed Aug 7, 2025
[Build] cpp_field.h(189,47): error C2059: syntax error: ')'
#14627 closed Aug 7, 2025
[Performance] Memory grows after reloading model
#14641 closed Aug 7, 2025
[Build] Building for C++ On Jetson Nano CUDA 10.2
#14644 closed Aug 7, 2025
TensorRT Execution Build Fails on Jetson Jetpack 4.6.1
#14658 closed Aug 7, 2025
DEEPFACE LIVE Issue with onnxruntime_pybind_state.
#14667 closed Aug 7, 2025
[Build]
#14674 closed Aug 7, 2025
Custom Operater Output Tensor Shape Error
#14683 closed Aug 7, 2025
[Web] Inference speed halves if you open DevTools after loading an inference session, even if you close DevTools afterwards
#14692 closed Aug 7, 2025
`CleanUnusedInitializersAndNodeArgs` warnings are printed only with subgraphs
#14694 closed Aug 7, 2025
[Performance]why is the inference latency of onnx QDQ quantized model converted from tflite quantized model (or from tensorflow Quantization-Aware training (QAT) model) as same as normal onnx float32 model?
#14707 closed Aug 7, 2025
A runtime can run on cuda device 0 but fail on cuda device 1
#14710 closed Aug 7, 2025
Non-zero status code returned while running Reshape node. Name:'Reshape_7411' The input tensor cannot be reshaped to the requested shape. Input shape:{51}, requested shape:{}
#14712 closed Aug 7, 2025
How to inference with multiple batches and multiple inputs.
#14713 closed Aug 7, 2025
Crash in JavaGPU on Windows
#14714 closed Aug 7, 2025
The Microsoft.ML.OnnxRuntime.Gpu nuget on Visual studio latest version 1.14.0 has a bug when running with the tensorrt on run time.
#14730 closed Aug 7, 2025
clog_vlog_fatal[Build]
#14740 closed Aug 7, 2025
[Performance] How to create multiple tensors with consecutive addresses when the cuda memory is not occupied?
#14742 closed Aug 7, 2025
Memory Leak
#14745 closed Aug 7, 2025
[Build] macOS: cross compiling arm64 on intel fails
#14746 closed Aug 7, 2025
[Performance] Can oneDNN EP accelerate the inference time of onnxruntime on x86 machines?
#14749 closed Aug 7, 2025
Basic Optimizer adds non-standard ONNX ops
#14752 closed Aug 7, 2025
Basic Optimizer adds non-standard ONNX ops for roi_align
#14753 closed Aug 7, 2025
Basic Optimizer adds non-standard ONNX ops for input tensor
#14754 closed Aug 7, 2025
[Build] cmake install when --use_xnnpack is broken
#14757 closed Aug 7, 2025
Failed to build CUDA docker image[Build]
#14765 closed Aug 7, 2025
`onnx.checker.check_model` raises `Bad node spec` for custom nodes created from ORT `optimize_model`
#14768 closed Aug 7, 2025
Dependency Problem (java onnxruntime)
#14787 closed Aug 7, 2025
[Build] Can't access OrtSessionOptionsAppendExecutionProvider_Dnnl while using oneDNN
#14799 closed Aug 7, 2025
[Build] Dockerfile.arm64 build fails
#14801 closed Aug 7, 2025
[Build] Unable to load TensorRT Execution Provider
#14802 closed Aug 7, 2025
Read access violation under OnnxRuntimeCpuSessionBuilder::Initialize during WinML operator tests for function operators
#14810 closed Aug 7, 2025
[Web] how to reduce wasm file size
#14817 closed Aug 7, 2025
onnxruntime with CUDA not releasing about 400 MB memory after the session and environment is destroyed
#14819 closed Aug 7, 2025
working model with Resize node becomes invalid after using convert_float_to_float16
#14827 closed Aug 7, 2025
How do I pass a list of tensors in onnxruntime-web?
#14829 closed Aug 7, 2025
DML EP cannot load some quantized onnx files.
#14835 closed Aug 7, 2025
[Performance] Performance degradation while using dynamic axes
#14863 closed Aug 7, 2025
UndefinedBehaviorSanitizer reports problem in onnxruntime_global_thread_pools_test
#14882 closed Aug 7, 2025
[Build] Error APPX1101 - Payload contains two or more files with the same destination path 'microsoft.ai.machinelearning.dll'
#14915 closed Aug 7, 2025
[Performance]
#14919 closed Aug 7, 2025
Is there a Python way to get the max supported ONNX IR version from ORT package?
#14932 closed Aug 7, 2025
[Performance] 3-100x regression when opset 16 or 17 is used (CUDA EP)
#14956 closed Aug 7, 2025
[Performance] Can not release memory in gpu.
#14957 closed Aug 7, 2025
Reuse output tensors memory that was allocated by first call to Ort::Session.Run(...)
#14960 closed Aug 7, 2025
Compatibility between Onnx and Blazor Webassembly
#14962 closed Aug 7, 2025
Running T5 export ONNX example leads to shape inference error
#14963 closed Aug 7, 2025
Microsoft.ML.OnnxRuntime.Gpu not working in MAUI project
#14974 closed Aug 7, 2025
[Build] Failed to build in docker container
#14983 closed Aug 7, 2025
conv throws safeint exception
#14985 closed Aug 7, 2025
Static linkage of onnx_runtime and providers library
#14986 closed Aug 7, 2025
[Build] static assertion fails when building from source with GCC 13.0.1
#14991 closed Aug 7, 2025
[Performance] inference problems with io_binding: unexpected shape or unexpected data type
#14998 closed Aug 7, 2025
[Performance] TensorRT provider produces (slightly) differently named engine files for the same model between runs
#14999 closed Aug 7, 2025
CUDA Graph Error - CUDA failure 900: operation not permitted when stream is capturing
#15002 closed Aug 7, 2025
ONNX does not support Dirichlet distribution?
#15016 closed Aug 7, 2025
[Build] Problems with FP16 Layernorm
#15021 closed Aug 7, 2025
[Build] api-ms-win-core-heap-l2-1-0.dll missing on windows server 2012 R2
#15025 closed Aug 7, 2025
onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126
#15035 closed Aug 7, 2025
accuracy reduced with multithreaded GPU prediction
#15038 closed Aug 7, 2025
mT5 convert to ONNX and GPU inference problems
#15042 closed Aug 7, 2025
[Build] Cannot specify compile definitions for target "onnx" which is not built by this project.
#15051 closed Aug 7, 2025
[bug] error while loading shared libraries: libonnxruntime.so.1.8.1: cannot open shared object file: No such file or directory
#15053 closed Aug 7, 2025
[Performance] Inference doubles VRAM (DirectML)
#15074 closed Aug 7, 2025
[Web] Memory spike in ORT-web leading to app crash
#15086 closed Aug 7, 2025
onnxruntime: bfc_arena.cc:361 void* onnxruntime::BFCArena::FindChunkPtr(onnxruntime::BFCArena::BinNum, size_t, size_t) !chunk->in_use() was false.
#15087 closed Aug 7, 2025
The dimension of incides to ScatterND op is wrong during inference.
#15095 closed Aug 7, 2025
[Performance] onnxruntime allocates lots of cuda memory on T4
#15098 closed Aug 7, 2025
fail build with gcc 12.x in onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc
#15111 closed Aug 7, 2025
How to reduce GPU memory usage when inference
#15127 closed Aug 7, 2025
descriptor_table_tensorboard_2fcompat_2fproto_2fattr_5fvalue_2eproto not declared (TRT 8.5.0)
#15131 closed Aug 7, 2025
how to inference with fp16 precise in python code?
#15134 closed Aug 7, 2025
NOT_IMPLEMENTED GridSample(16) on onnxruntime 1.14.1
#15137 closed Aug 7, 2025
onnxruntime::utils::ConstantNodeProtoToTensorProto Unsupported attribute value type of 9 in 'Constant' node 'Constant_35'
#15149 closed Aug 7, 2025
Type Error: Type 'tensor(int64)' of input parameter (relative_position) of operator (Min) in node (Min_2286) is invalid.
#15167 closed Aug 7, 2025
inference speed is very slow when using fp16 while using fp 32 is normal
#15170 closed Aug 7, 2025
A bug occurs when the program terminates
#15174 closed Aug 7, 2025
[Performance] Why is the Conv + Max-Pool model faster than the Conv model using GraphOptimizationLevel::ORT::ENABLE_ALL?
#15180 closed Aug 7, 2025
[Performance] GPT NEO: better performance of python GPT NEO than its onnx runtime version in C++?
#15191 closed Aug 7, 2025
[Build] segfault when run unitest (ctest)
#15224 closed Aug 7, 2025
[Build] fail to build on Windows ARM64
#15252 closed Aug 7, 2025
[Performance] How to debug/reduce GPU utilization?
#15254 closed Aug 7, 2025
[Performance]
#15265 closed Aug 7, 2025
ONNX model with FBNetv3 architecture Conversion to TensorRT Problem
#15269 closed Aug 7, 2025
[Build] ONNX Java Runtime - Handle UnsatisfiedLinkError
#15281 closed Aug 7, 2025
[Documentation Request] Estimating (or Checking) Allocated Memory
#15326 closed Aug 7, 2025
[Performance] Timings feedback
#15328 closed Aug 7, 2025
[Performance] Gemm op is slower after quantization
#15332 closed Aug 7, 2025
[Mobile] onnxruntime-c and onnxruntime-extensions-c pod conflict with DocumentReader pod
#15333 closed Aug 7, 2025
[Performance] ONNXRUNTIME sometime DEAD in python multiprocessing
#15345 closed Aug 7, 2025
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2235646909
#15349 closed Aug 7, 2025
[Web] custom ops
#15374 closed Aug 7, 2025
[Performance] Running Large Language Models for dynamic input size is poor performance. (DirectML)
#15394 closed Aug 7, 2025
Opset Coverage - Binary Size Tradeoff
#15397 closed Aug 7, 2025
[Build] C++ API calling fail: error C2280: 'Ort::Value::Value(const Ort::Value &)' : attempt to reference a deleted function
#15418 closed Aug 7, 2025
Mask-RCNN network is giving significantly different result with DirectML EP
#15459 closed Aug 7, 2025
Error Unrecognized attribute: layout for operator DynamicQuantizeLSTM
#15465 closed Aug 7, 2025
Please provide informative message on dlopen failures -- python API
#15476 closed Aug 7, 2025
[Performance] WebAssembly 1x1 Conv almost 4x slower than native
#15483 closed Aug 7, 2025
[Performance] Model converted to mixed precision results in higher latency
#15490 closed Aug 7, 2025
Inference slows down on gpu.
#15491 closed Aug 7, 2025
[Bug?] Casting int8-->float
#15492 closed Aug 7, 2025
InferenceSession fails with segmentation fault when fp16 model is loaded with CPUExecutionProvider
#15494 closed Aug 7, 2025
[ErrorCode:Fail] Load model from [...]\latin_ipa_forward.onnx failed:invalid vector subscript
#15495 closed Aug 7, 2025
[Build] Openvino debug build fails on VS2019
#15496 closed Aug 7, 2025
[Web] probability is not returned: `error code = 1`
#15511 closed Aug 7, 2025
SimplifiedLayerNormalization loading error for converted FP16 databricks/dolly-v2-3b model
#15531 closed Aug 7, 2025
[Performance] FP16 model can not get acceleration on GPU with ONNXRuntime-GPU
#15534 closed Aug 7, 2025
Get results from Mask RCNN model with C++
#15541 closed Aug 7, 2025
fatal error: gsl/gsl: No such file or directory
#15554 closed Aug 7, 2025
[Build] 1.14.0-dev-20230120-0204-3d6cea14f4 (This build breaks model on Intel)
#15567 closed Aug 7, 2025
[Performance] CUDA fp16 didn't get speed up
#15585 closed Aug 7, 2025
Error with custom spconv class in onnx runtime
#15594 closed Aug 7, 2025
[Build] Java Nightly build
#15600 closed Aug 7, 2025
[Build] the Linux build config
#15621 closed Aug 7, 2025
Can't use onnxruntime with DirectML built from source
#15628 closed Aug 7, 2025
[Performance] CNN model exported by PyTorch runs slower than Tensorflow 1.0
#15647 closed Aug 7, 2025
onnxRuntimeException and DefaultLogger issues in AWS Lambda runtime
#15650 closed Aug 7, 2025
ONNXRuntime in Docker
#15652 closed Aug 7, 2025
ONNX with FloatTensorType when inferred from C++ returns different label everytime
#15665 closed Aug 7, 2025
[Build] Compile Error if path too long
#15674 closed Aug 7, 2025
[CANN]EP: CANN cannot complete inference on Atlas200DK
#15677 closed Aug 7, 2025
[Performance] Can't get GPU speed-up when exe program is located inside the path with chinese character
#15678 closed Aug 7, 2025
[ErrorCode:InvalidArgument] Invalid Feed Input Name:image
#15692 closed Aug 7, 2025
[Performance] Can we set model weight precision when converting keras model into onnx model?
#15695 closed Aug 7, 2025
[Build] Onnxruntime-gpu for Jetpack 5.1.1 on Jetson Orin Nano Developer Kit
#15732 closed Aug 7, 2025
GraphOptimization (ORT_ENABLE_ALL) is slower using ONNXRuntime-GPU
#15743 closed Aug 7, 2025
Load onnx failed(segmentation fault) with version 1.14.1 (2)
#15745 closed Aug 7, 2025
Inference using the CUDA EP returns nan
#15752 closed Aug 7, 2025
[Build]
#15786 closed Aug 7, 2025
How to set CalibrationDataReader when my datatype is time series?
#15836 closed Aug 7, 2025
[Build]
#15863 closed Aug 7, 2025
[Training] Training Onnx format Models
#15867 closed Aug 7, 2025
Failed top create CUDAExecutionProvider
#15873 closed Aug 7, 2025
[RunTimeError]Infer error shape in runtime and mismatch with onnx spec about Split opset 18
#15882 closed Aug 7, 2025
[Performance] `CUDAExecutionProvider` uses 3x the memory of `CPUExecutionProvider`
#15886 closed Aug 7, 2025
symbolic_shape_infer.py failure
#15898 closed Aug 7, 2025
Linking executable with static libraries --> error LNK2038: mismatch detected
#15928 closed Aug 7, 2025
Atlas200DK uses EP: CANN to infer resnet50 and reports "CANN errorEE9999: Inner Error!"
#15947 closed Aug 7, 2025
[Performance] Redundant ReorderOutput / ReorderInput operators in Conv+Maxpool layers when graph optimization level is ALL
#15964 closed Aug 7, 2025
float16 result not match with numpy or torch
#15977 closed Aug 7, 2025
[Predict] Prediction from ONNX is same for all images
#16001 closed Aug 7, 2025
The result of Col2Im operator not close with Torch result on fp16 dtype
#16007 closed Aug 7, 2025
[Performance] QUInt8 vs a basic ONNX
#16009 closed Aug 7, 2025
RunOptions.only_execute_path_to_fetches not working
#16013 closed Aug 7, 2025
Cannot open include file: numpy/arrayobject.h
#16027 closed Aug 7, 2025
[Web] The onnxruntime-web example is loading wasm file twice if set to local path
#16028 closed Aug 7, 2025
[Web] [WebGPU] Uncaught (in promise) DOMException: Unable to instantiate a Device in Firefox Nightly/Linux
#16029 closed Aug 7, 2025
inference time decreasing when increasing batch size to a certain point and them the inference time increasing again.
#16030 closed Aug 7, 2025
can we customize memory allocation functions(like malloc/free) for inference in C api?
#16032 closed Aug 7, 2025
[Performance] How to solve the problem of releasing GPU memory in onnxruntime
#16033 closed Aug 7, 2025
[Performance] Huge gap between nn.Conv1d() and nn.Conv2d() - models exported by PyTorch
#16047 closed Aug 7, 2025
Unexpected inference output from QLinearConv
#16105 closed Aug 7, 2025
Memory leak in cpuinfo_x86_linux_init
#16117 closed Aug 7, 2025
Segmentation Fault when optimizing Stable Diffusion models
#16140 closed Aug 7, 2025
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] with swin-t
#16143 closed Aug 7, 2025
Segmentation fault while loading CUDA Provider
#16146 closed Aug 7, 2025
[Performance] ONNX Runtime doesn't parallelize operations in CPU models
#16158 closed Aug 7, 2025
The prediction results from STFT has changed with a notable shift towards larger difference from PyTorch in ORT==1.15.0
#16163 closed Aug 7, 2025
[MacOS] Unable to load libonnxruntime.dylib because binaries are not signed.
#16168 closed Aug 7, 2025
[Build] line 2812, in <module> sys.exit(main())
#16179 closed Aug 7, 2025
no acceleration onnx on e5 2680v3
#16185 closed Aug 7, 2025
[Performance] setIntraOpNumThreads doesn't offer enough parallelization in JAVA-API
#16192 closed Aug 7, 2025
DmlExecutionProvider bound to PyTorch tensor stops running
#16197 closed Aug 7, 2025
NullReferenceException when creating an object of class SessionOptions | Unity
#16205 closed Aug 7, 2025
[quantization] Problem with QDQ of Pow/Sqrt/Div
#16219 closed Aug 7, 2025
why the input doesn't place in cuda ?
#16225 closed Aug 7, 2025
[Training][api:C++][feature request] Support Model Forward Output and Backward Gradient Extraction in ONNX runtime training
#16232 closed Aug 7, 2025
TensorrtExecutionProvider::GetSupportedList graph_build.Resolve().IsOK() was false.
#16234 closed Aug 7, 2025
Not returning anything for out-of-vocabulary text while batch inference using Tf-IDF ONNX Vectorizer model
#16251 closed Aug 7, 2025
Inconsistent generation of vectors by TF-IDF ONNX Vectorizer Model
#16252 closed Aug 7, 2025
[OOM] Unable to convert 30B Model
#16254 closed Aug 7, 2025
[Performance] Evaluation behavior with external arrays (C API)
#16255 closed Aug 7, 2025
onnx use more memory than pytorch for some model
#16264 closed Aug 7, 2025
[Web/Build] Failed to consume onnxruntime-common because of JS parser not up-to-date
#16265 closed Aug 7, 2025
how to trace the error "assert node is not None" when use the onnxruntime.transformers.optimizer
#16268 closed Aug 7, 2025
ROCm EP: Errors when trying to infer, which GPUs are supported?
#16271 closed Aug 7, 2025
[Accuracy/Performance]
#16275 closed Aug 7, 2025
Does OnnxruntimeV1.14 still support the Python Operator, and which the highest version supports this feature?
#16277 closed Aug 7, 2025
Seg faults when creating InferenceSession for SAM backbone
#16300 closed Aug 7, 2025
[Mobile] Error: Non string type of a tensor data is not allowed
#16301 closed Aug 7, 2025
[ onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation unknown file: error: C++ exception with description "Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation" thrown in the test body.
#16305 closed Aug 7, 2025
issue running onnxruntime with pytest
#16306 closed Aug 7, 2025
How to catch exception OOM.
#16307 closed Aug 7, 2025
How to edit Clip Operator in OnnxRuntime?
#16315 closed Aug 7, 2025
get error when using libonnxruntime with dnnl EP
#16320 closed Aug 7, 2025
InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from d2net.onnx failed:This is an invalid model. Type Error: Type 'tensor(bool)' of input parameter (onnx::Min_109) of operator (Min) in node (Min_65) is invalid.
#16321 closed Aug 7, 2025
Increase - decrease the maximum number of events during inference profiling.
#16334 closed Aug 7, 2025
[Build] Fails to parse FP16 LayerNormalization in opset>=18
#16341 closed Aug 7, 2025
[Build] Disable ORT_ENABLE_STREAM build error
#16345 closed Aug 7, 2025
MaxPool: When Ceil_mode=1, MaxPool Generates Big Values.
#16350 closed Aug 7, 2025
AveragePool: When Ceil_mode=1, AveragePool Generates Nan or 0 Values.
#16351 closed Aug 7, 2025
[Training]
#16354 closed Aug 7, 2025
multi-GPU inferencing
#16382 closed Aug 7, 2025
Operator Pad reflect mode does not yield correct results
#16401 closed Aug 7, 2025
[Web] Web ~40x slower than native
#16412 closed Aug 7, 2025
[Performance] DML dynamic axes performance regression.
#16424 closed Aug 7, 2025
C++ Runtime does not recognize supposedly correct input.
#16430 closed Aug 7, 2025
Normalizer does not work as expected
#16451 closed Aug 7, 2025
[Mobile] Unable to load models in Xamarin iOS
#16463 closed Aug 7, 2025
[Performance] net.set_providers(['DmlExecutionProvider'], [{'device_id': 0}]) could get stuck forever (directml EP)
#16473 closed Aug 7, 2025
m2m 100 418M
#16480 closed Aug 7, 2025
Automatic deallocation (?) of the Ort::Sessions, memory leak?
#16497 closed Aug 7, 2025
[Performance] A model with a large TreeEnsembleClassifier node takes too long to be loaded
#16511 closed Aug 7, 2025
Setting `CUBLAS_WORKSPACE_CONFIG=":4096:8"` leads to `CUBLAS_STATUS_ALLOC_FAILED`
#16512 closed Aug 7, 2025
ONNXRuntimeError: Training mode does not support BN opset 14 (or higher) yet.
#16867 closed Aug 7, 2025
[Build] INVALID_ARGUMENT : Invalid rank for input: input Got: 4 Expected: 2 Please fix either the inputs or the model.
#16557 closed Aug 7, 2025
[Build] libonnxruntime_providers_dnnl.so: undefined symbol: omp_get_max_threads
#16561 closed Aug 7, 2025
Large model >2GB save_to_ort
#16573 closed Aug 7, 2025
[Build] fatal error: too many errors emitted, stopping now [-ferror-limit=]
#16576 closed Aug 7, 2025
[Build] Cannot build onnxruntime
#16583 closed Aug 7, 2025
Conv3d precision error between pytorch and onnx
#16589 closed Aug 7, 2025
[Training] Define a custom training with some ONNX models
#16597 closed Aug 7, 2025
[Performance] Performance degradation observed w.r.t DNNL-EP in v1.15.1 compared to v1.13.1
#16609 closed Aug 7, 2025
[Build] No C++ library is generated after compilation completed
#16610 closed Aug 7, 2025
[Performance] Computation time of iteratively applying neural network in a single ONNX model using CUDA Execution Provider dominated by Memcpy
#16625 closed Aug 7, 2025
[Build] Dependency on OMP/MPI Runtime
#16631 closed Aug 7, 2025
[Performance]
#16637 closed Aug 7, 2025
The input tensor cannot be reshaped to the requested shape after adding Gather output to model's output
#16670 closed Aug 7, 2025
Access violation reading location when I use CreateArenaCfgV2 and CUDA
#16686 closed Aug 7, 2025
One path in the graph requests feature X(>Y) but input tensor has Y features
#16695 closed Aug 7, 2025
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 closed Aug 7, 2025
CUDA Custom Op CUDA failure
#16748 closed Aug 7, 2025
clean build v1.15.1 fails three fp16 tests due to `difference between... exceeds threshold"
#16775 closed Aug 7, 2025
[Performance] FP16 models incur large cast latency when run on CPUs without FP16 support
#16778 closed Aug 7, 2025
Incorrect Output from Java Model
#16781 closed Aug 7, 2025
Segmentation Fault when using TensorRT execution provider
#16790 closed Aug 7, 2025
[Performance] using onnxruntime with ray and also fix for memory footprint too high
#16793 closed Aug 7, 2025
[Performance] [Web] Using the `onnxruntime-web` package (`wasm` backend) with Node.js is 1.6x to 2x faster than in browsers and Deno?
#16798 closed Aug 7, 2025
[Training] Proposal: Implement back propagation algorithm for C#
#16809 closed Aug 7, 2025
[Performance]
#16817 closed Aug 7, 2025
[Mobile] Failure to load whisper model .ort with react-native, regular and quantized versions
#16819 closed Aug 7, 2025
onnxruntime_providers_cuda.dll cannot be loaded due to "Can't find dependent libraries" under Windows 10 environment using Java
#16821 closed Aug 7, 2025
op.SequenceEmpty(dtype=xxx) cannot be set to float16.
#16846 closed Aug 7, 2025
[Performance]high latency variance
#16876 closed Aug 7, 2025
[Performance] Convolution layer issue profiling
#16926 closed Aug 7, 2025
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 closed Aug 7, 2025
Crash at winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::GetResults during inferencing
#16988 closed Aug 7, 2025
Native assemblies aren't copied when Onnx is a transitive dependency and using netstandard
#17010 closed Aug 7, 2025
Why onnxruntime extracts only 483MB json file?
#17013 closed Aug 7, 2025
Why some models are not profiling the input weights, bias etc and the node index in the json file properly?
#17022 closed Aug 7, 2025
An error occurred when I used the TensorrtExecutionProvider in onnx runtime
#17047 closed Aug 7, 2025
[Web] Cannot Convert to RGB when using Tensor.fromImage(image,{tensorFormat:'RGB'})
#17094 closed Aug 7, 2025
[Performance] Pytorch Model converted to ONNX with CUDAProvider run slower 3x time than Using Pytorch with GPU
#17116 closed Aug 7, 2025
How to release onnxruntime gpu memory
#17142 closed Aug 7, 2025
[TOOLS]: Using transformers.optimizer optimize large model, segmentation fault (core dumped)
#17212 closed Aug 7, 2025
Onnx model inference Fatal error: ai.onnx.contib:bev_pool_v2(-1) is not a registered function/op
#17214 closed Aug 7, 2025
[C#] Invalid input name error
#17244 closed Aug 7, 2025
AssertionError on num_heads > 0 for bert with specific optimization config
#17254 closed Aug 7, 2025
windows10 x86 x64 inference time varies greatly
#17256 closed Aug 7, 2025
[Performance] Operators assigned to CPU instead of CUDA, CPU thread management problem
#17268 closed Aug 7, 2025
[Web] Error: no available backend found. ERR: [wasm] TypeError: Failed to parse URL from
#17274 closed Aug 7, 2025
[Build] Error: cpuid.h: No such file or directory when cross-compiling ORT 1.15.1 with NNAPI for arm64
#17283 closed Aug 7, 2025
Freeing heap block containing an active critical section
#17345 closed Aug 7, 2025
[Performance] 3X slower inference on onnxruntime than pytorch(huggingface)
#17366 closed Aug 7, 2025
[Performance] Memcpy leads to AllocationError for argmax
#17371 closed Aug 7, 2025
[web/js] need for more methods on tensor object
#17372 closed Aug 7, 2025
[Performance] Quantized model inference on CPU slower/same as FP32
#17389 closed Aug 7, 2025
Default `tensorFormat` should RGBA for HTMLImageElement variant
#17395 closed Aug 7, 2025
[Build] windows dll compilation error with versions above 1.14.0
#17404 closed Aug 7, 2025
[Web] Add binary/where broadcast case when FXC issue got fixed in tint
#17405 closed Aug 7, 2025
[Performance] Data size of Batch Normalization using cuDNN in inference.
#17406 closed Aug 7, 2025
Yolov8 Static Quantization
#17410 closed Aug 7, 2025
CUDA Stream and Synchronous in custom operato
#17412 closed Aug 7, 2025
[Performance] How much memory it needs to load a 3.4 GB model to GPU through DirectML?
#17413 closed Aug 7, 2025
valgrind memcpy_chk overlap onnxruntime1.15.1
#17431 closed Aug 7, 2025
Extract node info
#17444 closed Aug 7, 2025
[Bug] FP16 conversion yields an unusable model
#17447 closed Aug 7, 2025
[Mobile iOS] Run fp16 onnx model on CoreML EP
#17448 closed Aug 7, 2025
C++ API, Memory Leak instantiating Ort::Sessions
#17451 closed Aug 7, 2025
Failure with OpenvinoEP within ORT
#17499 closed Aug 7, 2025
Resize of doesn't work well while the coordinate_transformation_mode is 'align_corners'.
#17564 closed Aug 7, 2025
Inference speed of Quantized model not increased after static Quantization[Performance]
#17634 closed Aug 7, 2025
DML EP One session but called in different threads. [Performance]
#17686 closed Aug 7, 2025
SkipLayerNormFusion -- High Output Difference Between PyTorch and ONNX Runtime with Extended Optimizations
#17689 closed Aug 7, 2025
[Web]
#17700 closed Aug 7, 2025
[Mobile | iOS] I got "Unknown exception" error.
#17731 closed Aug 7, 2025
[Web] Custom build packages
#17743 closed Aug 7, 2025
[web] following-up work items for supporting uniform buffers
#17860 closed Aug 7, 2025
[Web] Declaration is not emitted in onnxruntime-node package
#17979 closed Aug 7, 2025
[Build] Why does TensorRT EP need the full version of protobuf?
#18040 closed Aug 7, 2025
[Web] Which node.js version is supposed to be supported?
#18078 closed Aug 7, 2025
Microsoft.ML.OnnxRuntime.OpenVino Encountered unknown exception in Initialize
#18152 closed Aug 7, 2025
ORT bug in Col2Im CPU 3D cases
#18156 closed Aug 7, 2025
[Mobile|Android] Fatal error: ai.onnx.contrib:SentencepieceTokenizer(-1) is not a registered function/op
#18226 closed Aug 7, 2025
The onnx.helper make_function command strips type information leading to inference errors
#18264 closed Aug 7, 2025
[Web] onnxruntime-web and onnxruntime-node return different results for LSTM model
#18335 closed Aug 7, 2025
[Performance] the speed with SetIntraOpNumThreads(1),SetIntraOpNumThreads(4),SetInterOpNumThreads(1),SetInterOpNumThreads(4)
#18385 closed Aug 7, 2025
[Performance] Does `com.microsoft.Attention` use FlashAttention-2?
#18474 closed Aug 7, 2025
Add ORT Extensions to Java and build with Gradle
#18503 closed Aug 7, 2025
Model Run Session wasting time[Performance]
#18510 closed Aug 7, 2025
Is there any way to convert a qdqmodel to qlinearmodel use ort?
#18511 closed Aug 7, 2025
[Training] qat
#18534 closed Aug 7, 2025
[Build] manylinux_2_28 support
#18537 closed Aug 7, 2025
[Build] TRT EP cannot be built without CUDA EP
#18542 closed Aug 7, 2025
Call Session class method name Run failed,don't know why
#18548 closed Aug 7, 2025
Does the computation order affect the computation result?
#18564 closed Aug 7, 2025
[Web] How could I get the shape of the output tensor?
#18568 closed Aug 7, 2025
[Build]
#18570 closed Aug 7, 2025
# Issue with Rounding Behavior in onnxruntime's Quantizelinear Layer
#18576 closed Aug 7, 2025
Session Run throws an access violation exception when I recreate the session
#18578 closed Aug 7, 2025
[Node.js] Support for loading models with external data in `onnxruntime-node`
#18586 closed Aug 7, 2025
Cuda EP does not compute reduce with empty set correctly?
#18588 closed Aug 7, 2025
[Mobile] Model with large input size cause Segmentation Fault while session->run()
#18595 closed Aug 7, 2025
Session initialization stuck/crash in DMLCreateDevice while using DirectML EP
#18599 closed Aug 7, 2025
Profiling multithreaded runs
#18600 closed Aug 7, 2025
Segmentation Fault when some of node outputs is empty
#18601 closed Aug 7, 2025
What is the recommended setup for running multiple models/sessions in parallel in C++?
#18610 closed Aug 7, 2025
DirectML Resize Node error.
#18613 closed Aug 7, 2025
[Build]
#18617 closed Aug 7, 2025
Could not find an implementation for SkipGroupNorm(1) node with name 'SkipGroupNorm_0'
#18623 closed Aug 7, 2025
Crash in ResizeHelper::Initialize executing a model on ARM64
#18628 closed Aug 7, 2025
ONNXRuntime Segmentation Fault Crash on Inference (iOS and Mac)
#18632 closed Aug 7, 2025
[Performance] dynamic batch infer cost time question
#18639 closed Aug 7, 2025
ORT memory error with the graph from linspace
#18648 closed Aug 7, 2025
Are there any benchmark tools for onnx mobile like Tensorflow Lite?
#18664 closed Aug 7, 2025
Different results of consecutive runs for same input
#18672 closed Aug 7, 2025
Strange condition size_t channel_rindex = is_nchw ? 2 : 2;
#18674 closed Aug 7, 2025
Missprinted condition: head_size != num_heads * head_size
#18675 closed Aug 7, 2025
Parallel inference of multiple models in different threads
#18806 closed Aug 7, 2025
Onnxruntime using OpenVINO as execution provider encountered Exception during initialization problem on model candy.onnx
#18825 closed Aug 7, 2025
[Performance] Java API lacks functionality to control allocator settings.
#18845 closed Aug 7, 2025
[dynamo_export] starts_.size() == ends_.size() + 1 was false. No matching 'start' entry.
#18863 closed Aug 7, 2025
[dynamo_export] MLFloat16 data type is not supported with ScatterElements opset 18 when reduction is 'max'.
#18864 closed Aug 7, 2025
[Web] Non-zero status code returned while running Slice node `webgpu`
#18892 closed Aug 7, 2025
compute_range not available
#18893 closed Aug 7, 2025
the resout of onnx and trt engine is different?why?
#18902 closed Aug 7, 2025
SafeIntOnOverflow() Integer overflow error when inferencing on too many samples with Python
#18905 closed Aug 7, 2025
error 126 Onnx in ComfyUI[Performance] O
#18925 closed Aug 7, 2025
ai.onnxruntime.OrtException: Unsupported type - FLOAT16
#18926 closed Aug 7, 2025
How to use multiple inputs of different types in C++ session
#18932 closed Aug 7, 2025
[Web] onnxruntime-web is not work in nodejs
#18933 closed Aug 7, 2025
[Web] no available backend found. ERR: [wasm] TypeError: _ is not a function, [cpu] Error: previous call to 'initializeWebAssembly()' failed., [xnnpack] Error: previous call to 'initializeWebAssembly()' failed
#18938 closed Aug 7, 2025
How to set `trt_profile_min_shapes` for inputs with name containing colons?
#18939 closed Aug 7, 2025
OP (Conv) inference results mismatch with PyTorch
#18946 closed Aug 7, 2025
[Build] How to build onnxruntime with openvino statically?
#18950 closed Aug 7, 2025
[Performance] 2x Regression in 1st Inference time cost
#18957 closed Aug 7, 2025
High Output Difference between ONNX model with different optimizer settings
#18959 closed Aug 7, 2025
[Build] ModuleNotFoundError: No module named 'onnxruntime'
#18966 closed Aug 7, 2025
Error with finding onnxruntime_binding.node on Windows 10 on a bootcamp Macbook
#18971 closed Aug 7, 2025
How to observe arena allocator memory request metrics
#18972 closed Aug 7, 2025
Could not load library cudnn_cnn_infer64_8.dll. Error code 127
#18973 closed Aug 7, 2025
[Build] Failure with OneDNN on Intel MacOS
#18976 closed Aug 7, 2025
Cannot quantize yolov5 float to int8 onnx model
#18987 closed Aug 7, 2025
Encounter unknown exception in initialize using Openvino EP
#19004 closed Aug 7, 2025
ONNX Runtime inference on string input
#19006 closed Aug 7, 2025
[Error: Exception in HostFunction: <unknown>] while running ort models in react-native
#19021 closed Aug 7, 2025
[Performance] It is not possible to use a discrete graphics card with DML.
#19025 closed Aug 7, 2025
[Build] deploying the EfficientAD anomaly detection algorithm, an error occurred while executing the "Run" command
#19030 closed Aug 7, 2025
Freeing tensor data created via CreateTensor
#19034 closed Aug 7, 2025
[Build] Linux x86_64 STATIC Build
#19035 closed Aug 7, 2025
cudaMemcpyAsync throws exception in GPUDataTransfer
#19076 closed Aug 7, 2025
[Training] On device training doesn't work with INT8 Models
#19078 closed Aug 7, 2025
[Performance] The CUDA Stream cannot be set through Python API
#19094 closed Aug 7, 2025
Longformer `convert_to_onnx.py` not working due to missing imports
#19149 closed Aug 7, 2025
[Performance] Why run first inference so slow, although run one time in initialzation?
#19177 closed Aug 7, 2025
ORT 1.17.0 Release Candidates available for testing
#19236 closed Aug 7, 2025
`shape_inference.quant_pre_process` causes `AttributeError: module 'onnx.helper' has no attribute 'make_sequence_value_info'`
#19323 closed Aug 7, 2025
[Training] How to update running_mean and running_var of BatchNormalization during training
#19370 closed Aug 7, 2025
[Performance] In ONNX Runtime, the CPU consumption does not scale linearly with the number of threads
#19384 closed Aug 7, 2025
Backwards convolution layers in CUDA provider should heed
#19391 closed Aug 7, 2025
InferenceSession.run does not validate rank of scalar inputs
#19434 closed Aug 7, 2025
[Web] Memory Access Out of Bounds Error When Using ONNX Runtime Web Inference in NPM Package (wasm)
#19443 closed Aug 7, 2025
[Performance] CPU inference much slower from GPU runtime
#19451 closed Aug 7, 2025
[On-device Training] Yolo custom loss
#19464 closed Aug 7, 2025
[Performance]
#19479 closed Aug 7, 2025
Errors about using c# and TensorRT
#19489 closed Aug 7, 2025
Accuracy drops a lot when using fp16 with TensorRT EP
#19492 closed Aug 7, 2025
quantize_dynamic : nodes_to_quantize(Gemm) is ignored
#19503 closed Aug 7, 2025
ONNX Runtime OpenVINO EP is way behind
#19688 closed Aug 7, 2025
Observed TDR on a low-end system
#19724 closed Aug 7, 2025
Inconsistent Prediction Outputs for Onnx Model
#19834 closed Aug 7, 2025
import InferenceSesseion and capi._pybind_state.
#19836 closed Aug 7, 2025
[Performance] onnxruntime 1.17.1 version doesnt support CUDA 12.4
#19839 closed Aug 7, 2025
[Performance] Accuracy dropped heavily using onnxruntime to inference a model quantized by QAT
#19850 closed Aug 7, 2025
Inference speed problem even if using a high-end Hardware.
#19865 closed Aug 7, 2025
[iOS] Output of type sequence<map<int64,float32>> causes crash on iOS
#19867 closed Aug 7, 2025
[Build] Where is official build for Unity?
#19964 closed Aug 7, 2025
[BUG] [OpenVino EP] Only first result in session is correct.
#19975 closed Aug 7, 2025
Onnx Runtime EntryPointNotFoundException: OrtGetApiBase in Unity Application.
#20048 closed Aug 7, 2025
Layer not supported in one provider (Tensorrt) not working with second provider (CUDA) in an inference problem.
#20058 closed Aug 7, 2025
[Performance] Inference failed or unsupported using quantize_dynamic
#20060 closed Aug 7, 2025
openvino with int8
#20072 closed Aug 7, 2025
Unpredictable onnxruntime-node crash when using Electron
#20084 closed Aug 7, 2025
In Aquatic mode links text “PyTorch and Hugging face” is not clearly visible: A11y_WCP URLs - ONNX Runtime_Home_Learn more about how to use ONNX Runtime with_usability
#20150 closed Aug 7, 2025
`convert_float_to_float16` results in `failed in shape inference <class 'Exception'>`
#20189 closed Aug 7, 2025
Whether CUDA12.4 and cudnn9 matches onnxruntime-win-x64-cuda12-1.17.1
#20223 closed Aug 7, 2025
[Training] Can we use ORTModule for inference?
#20281 closed Aug 7, 2025
C API Seg Fault from OrtGetApiBase()->GetApi(ORT_API_VERSION);
#20283 closed Aug 7, 2025
[Performance] ScatterND / GridSample operators are on CPU instead of GPU / CUDA
#20297 closed Aug 7, 2025
DirectML returning empty result with ObjectDetection (Mobilinet V2 FPN Keras)
#20386 closed Aug 7, 2025
[Build] Cmake install debug and release configuration
#20387 closed Aug 7, 2025
[Performance] Profiling on CUDA shows confusing values
#20398 closed Aug 7, 2025
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 closed Aug 7, 2025
onnxruntime 1.17.3 is missing from cuda 12 artifacts feed
#20409 closed Aug 7, 2025
Dockerfile does not work
#20458 closed Aug 7, 2025
[Build] cross-compiling onnxruntime for arm32 and onnxruntime_ENABLE_CPUINFO not working.
#20461 closed Aug 7, 2025
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 closed Aug 7, 2025
[Build] cmake duplicate target "memory" between abseil and xnnpack
#20469 closed Aug 7, 2025
[Build] Error when load pf16 model
#20570 closed Aug 7, 2025
DirectML Exception 80070057 "The parameter is incorrect"
#20575 closed Aug 7, 2025
On Windows, load-testing onnxruntime from Java makes CPU usage spike quickly and stay at 100%
#20593 closed Aug 7, 2025
Missing dll cudnn_ops_infer64_8.dll does not generate a python error
#20605 closed Aug 7, 2025
[BUG] Running operations over concat output rewrites it's values
#20606 closed Aug 7, 2025
[Discussion] ORT GPU binaries do not contain DML
#20638 closed Aug 7, 2025
[Build] TVM EP Build
#20665 closed Aug 7, 2025
LayerNormalization doesnt' work as expected on Mac
#20676 closed Aug 7, 2025
User-provided session logging function is not used for every log
#20680 closed Aug 7, 2025
Windows ARM64 & X64 CLIP Image Encoder different results
#20722 closed Aug 7, 2025
[Build] quantization unittest failed when run all tests
#20821 closed Aug 7, 2025
[.NET] Update tensor implementations to new Tensor<T> type
#20874 closed Aug 7, 2025
Java CreateTensor with NIO ByteBuffer for reuse purpose
#20882 closed Aug 7, 2025
[Build] how to buid on openharmony?
#20895 closed Aug 7, 2025
Stateful/Memory models
#20943 closed Aug 7, 2025
[Performance] Severe performance penalty with transformer model and DirectML
#20983 closed Aug 7, 2025
onnxruntime shape mismatch during quantization of yolov8 models
#21048 closed Aug 7, 2025
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 closed Aug 7, 2025
Symbolic Shape infer fails on onnx file without much logs
#21120 closed Aug 7, 2025
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 closed Aug 7, 2025
ORT 1.18.1 Release Candidates available for testing
#21173 closed Aug 7, 2025
[Performance] Mapfile support for certain external data files is not working
#21195 closed Aug 7, 2025
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 closed Aug 7, 2025
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 closed Aug 7, 2025
Not able to load onnx model multilingual-e5-large
#21321 closed Aug 7, 2025
TensorRT EP's inference results are abnormal.
#21457 closed Aug 7, 2025
[Build] Unable to build with --use_dml
#21568 closed Aug 7, 2025
Memory leak in NPU inference after each one session.run
#21587 closed Aug 7, 2025
[Performance]
#21635 closed Aug 7, 2025
Quantized SeaLLM v2 Model Outputs Same as Input
#21636 closed Aug 7, 2025
Same Model Hash Code Issue from different models
#21672 closed Aug 7, 2025
[Bug]: Onnxruntime.CPU memory leaks
#21723 closed Aug 7, 2025
Inferencing FP16 model using onnxruntime
#21737 closed Aug 7, 2025
[Web] requested dist/*.mjs files for cdnjs
#21785 closed Aug 7, 2025
run_async not running asynchronously
#21791 closed Aug 7, 2025
[Bug] [onnxruntime-node] Error: no available backend found. ERR: [wasm] backend not found.
#21813 closed Aug 7, 2025
Error when trying to run vision model onnx
#21869 closed Aug 7, 2025
[Build] “onnxruntime_cxx_api.h”: No such file or directory
#21891 closed Aug 7, 2025
Snapdragon X processor is unsupported
#21947 closed Aug 7, 2025
[Mobile] IOS library crashes in Release configuration
#21960 closed Aug 7, 2025
[Web] Uncaught WebGPU validation error on Snapdragon SM8450 but works on SM8250
#21970 closed Aug 7, 2025
[Build] onnxruntime-openvino library does not have python3.12 support
#22015 closed Aug 7, 2025
onnxruntime-gpu(1.18.0) can not be install
#22028 closed Aug 7, 2025
[Training] Implicit dependency of Python training API on 'torch' package
#22070 closed Aug 7, 2025
GetElementType is not implemented after updating onnxruntime
#22075 closed Aug 7, 2025
[Web] Error when using Web Workers on Next.js
#22113 closed Aug 7, 2025
[Question or BUG] ONNX Runtime CUDA Sessions in Unity Produce Empty Outputs When Running Multiple Models Sequentially on a Single Graphic Card
#22146 closed Aug 7, 2025
Warnings displayed as errors during TensorRT optimization.
#22164 closed Aug 7, 2025
trt_weight_stripped_engine_enable does not work for all networks/size ranges.
#22165 closed Aug 7, 2025
trt_weight_stripped_engine_enable does not work together with trt_dump_ep_context_model
#22179 closed Aug 7, 2025
Filenames in OrtTensorRTProviderOptionsV2 should be std::filesystem::path or at least const ORTCHAR_T*
#22182 closed Aug 7, 2025
[Performance] fp16 support and performance
#22242 closed Aug 7, 2025
Upcoming ORT 1.20 Release Overview
#22274 closed Aug 7, 2025
[Performance] High CUDA memory usage with ONNX Runtime and inconsistent memory release
#22297 closed Aug 7, 2025
Build failure on Windows 10 using OpenVino 2024.3 & 2024.4 both.
#22314 closed Aug 7, 2025
`quant_pre_process SymbolicShapeInference` causes AttributeError: 'NoneType' object has no attribute 'HasField' when the model has a Constant node.
#22422 closed Aug 7, 2025
The EP_CTX_BLOB seems to have both WRITE and EXECUTABLE permissions enabled
#22437 closed Aug 7, 2025
External data is not loaded with custom allocator
#22468 closed Aug 7, 2025
[Performance] C++ api: destroy the execution provider if the `Ort::Session` is destroyed
#22511 closed Aug 7, 2025
DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU
#22532 closed Aug 7, 2025
Caused by: java.lang.UnsatisfiedLinkError: /tmp/onnxruntime-java18295816951647233732/libonnxruntime.so: Error relocating /tmp/onnxruntime-java18295816951647233732/libonnxruntime.so: __vsnprintf_chk: symbol not found
#22539 closed Aug 7, 2025
[DO NOT UNPIN] ORT Nightly Package Name Change
#22541 closed Aug 7, 2025
Negative output for sigmoid
#22557 closed Aug 7, 2025
[Performance] Model runtime spiky with TensorRT Execution Provider
#22664 closed Aug 7, 2025
Exception during initialization: safeint.h:17 static void SafeIntExceptionHandler<onnxruntime::OnnxRuntimeException>::SafeIntOnOverflow() Integer overflow - caused by int64 index of -1?
#22694 closed Aug 7, 2025
FP16 ONNX model outputs NaN after the first successful execution
#22723 closed Aug 7, 2025
CUDA providers failed to build against 12.6 with error error #221-D
#22728 closed Aug 7, 2025
why force max_length <= kMaxSequenceLength in beam_search_parameters.cc ?
#22735 closed Aug 7, 2025
[TensorRT EP] How can I disable generating cache when using trt execution provider
#22822 closed Aug 7, 2025
[Dev] "./onnxruntime_test_all --help" gives segmentation fault
#22838 closed Aug 7, 2025
how to release gpu memory when use onnxruntime with fastapi
#22899 closed Aug 7, 2025
[Performance] Binary operators using SSE on AVX systems
#22905 closed Aug 7, 2025
[Mobile] Error: Can't load a model: Error Code - ORT_INVALID_PROTOBUF
#22927 closed Aug 7, 2025
[Training] RuntimeError: gradient_builder_base.h:123 onnxruntime::training::ArgDef onnxruntime::training::GradientBuilderBase::O(size_t, bool) const i < node_->OutputDefs().size() was false
#22955 closed Aug 7, 2025
Remove Python :: 3.7 Python :: 3.8 Python :: 3.9 from pypi metadata
#22993 closed Aug 7, 2025
To reduce the compiled binary size of ONNX Runtime at x86_64 linux with "create_reduced_build_config.py", but got a Failed to find kernel for com.microsoft.nchwc.Conv(1)
#23018 closed Aug 7, 2025
[Build] Dotnet packages on nuget are not built with Release optimizations
#23053 closed Aug 7, 2025
[Web] ORT format model not working on WebGPU EP + Wasm Static lib
#23072 closed Aug 7, 2025
[Build] onnxruntime_gpu PiPy on a slow host
#23079 closed Aug 7, 2025
Cannot resolve operator 'LSTM' with webgl backend
#23083 closed Aug 7, 2025
[Bug][CUDAExecutionProvider] INVALID_ARGUMENT : unsupported conv activation mode "Sigmoid"
#23114 closed Aug 7, 2025
Understanding max_mem option of OrtArenaCfg class
#23121 closed Aug 7, 2025
[Bug] Inconsistent Results After ONNX Runtime Optimization
#23133 closed Aug 7, 2025
Inconsistent Results After ONNX Runtime Optimization
#23142 closed Aug 7, 2025
[Build] Better support for vcpkg
#23158 closed Aug 7, 2025
ONNX 1.17.0 integration remaining work: fix QNN EP test failures
#23163 closed Aug 7, 2025
Inconsistent Results After ONNX Runtime Optimization
#23199 closed Aug 7, 2025
[Inference Error] The onnx inference result is inconsistent with the numpy inference result
#23202 closed Aug 7, 2025
[Build] how to build onnxruntime with openvino EP for android
#23222 closed Aug 7, 2025
[Build] Xcode unit tests fail with libc++abi: terminating due to uncaught exception of type onnxruntime::OnnxRuntimeException:
#23259 closed Aug 7, 2025
[Build] Building for Mac Catalyst Fails When Installed Via Cocoapods
#23307 closed Aug 7, 2025
[Performance] Max operator became 4.5X slower after Fixing NaN propagation for float16 min and max operators.
#23337 closed Aug 7, 2025
memory.enable_memory_arena_shrinkage is not working in python
#23339 closed Aug 7, 2025
Issue loading custom ONNX model with complex-valued operations in ONNX Runtime (C++)
#23341 closed Aug 7, 2025
Memory creeping up
#23348 closed Aug 7, 2025
No speedup from float16 with directml compared to cuda
#23359 closed Aug 7, 2025
[Build] Possibly unintentional or misconfigured dependencies for QNN EP in onnxruntime_python.cmake
#23360 closed Aug 7, 2025
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 closed Aug 7, 2025
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 closed Aug 7, 2025
[Performance] Round node shows huge performance drop on Windows
#23430 closed Aug 7, 2025
debug result is ok, release get NaN output
#23440 closed Aug 7, 2025
[QUESTION]: onnxruntime with onednn backend
#23543 closed Aug 7, 2025
[Performance] Speed-up TensorRT engine compilation
#23546 closed Aug 7, 2025
Custom operators is not a registered function/op (python)
#23566 closed Aug 7, 2025
[Performance] ORT-WebGPU Average Pooling is working too long in edge case
#23614 closed Aug 7, 2025
TensorRT Provider "Attribute reduction is not supported"
#23618 closed Aug 7, 2025
session.disable_fallback() has no effect, it always fallback to cpu
#23647 closed Aug 7, 2025
[Build] CMake Error at onnxruntime_unittests.cmake:1026 (find_path): Could not find onnx_SOURCE_DIR using the following files: onnx/onnx-ml.proto3, onnx/onnx-ml.proto Call Stack (most recent call first): CMakeLists.txt:1789 (include)
#23684 closed Aug 7, 2025
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: serving_default_input_1:0
#23730 closed Aug 7, 2025
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 closed Aug 7, 2025
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 closed Aug 7, 2025
[Mobile] Onnxruntime react-native issue: [java.lang.ClassCastException: java.lang.String[][] cannot be cast to java.lang.String[]]
#23782 closed Aug 7, 2025
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 closed Aug 7, 2025
[Performance] Why does inference occupy so much memory?
#23867 closed Aug 7, 2025
The Pad operator has a calculation error in the "reflect" mode.
#23878 closed Aug 7, 2025
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 closed Aug 7, 2025
TensorRT Support for Multiple Profiles
#23965 closed Aug 7, 2025
[Build] Unsupported AVX512-FP16 Instructions in MLAS (vcvtneeph2ps, vcvtneoph2ps)
#24025 closed Aug 7, 2025
Application is getting crashed while creating session for the onnxruntime-qnn with QnnCpu backend option.
#24082 closed Aug 7, 2025
ImportError: Unable to import dependency onnxruntime
#24120 closed Aug 7, 2025
onnxruntime-mobile implementation on custom execution provider
#24135 closed Aug 7, 2025
segmentation fault while using onnxruntime==1.21.0
#24144 closed Aug 7, 2025
[Feature Request] A model with dynamic input and dynamic output will have a memory leak after inference with Openvino.
#24162 closed Aug 7, 2025
Python Session.run_async Causes Program Exit
#24200 closed Aug 7, 2025
OpenVINO EP not able to use CPU device
#24208 closed Aug 7, 2025
Questions about using AMD VitisAI EP, how can i run my model on AMD NPU?
#24214 closed Aug 7, 2025
[Build] OpenVINO ep for macOS
#24273 closed Aug 7, 2025
[Build] Building v1.21.0: unsupported instruction 'vpdpbusds'
#24275 closed Aug 7, 2025
SIGSEGV when calling OrtSession.run()
#24288 closed Aug 7, 2025
[Build] Onnxruntime v1.21.0 fails to build with GCC-13
#24290 closed Aug 7, 2025
quantize onnx models to INT8
#24374 closed Aug 7, 2025
[Performance] [QNN EP] Performance gap between onnxruntime QNN EP and Genie from QNN SDK.
#24417 closed Aug 7, 2025
[Build] Python build fails because onnxruntime/capi/build_and_package_info.py is missing
#24570 closed Aug 7, 2025
[MLAS] Plan to add RISC-V Vector (RVV) support to MLAS
#24596 closed Aug 7, 2025
nuget package 1.21.2 causes conflicts in Solutions targeting .NET Framework 4.8
#24599 closed Aug 7, 2025
[Mobile] Objective-C API for register onnxruntime-extensions as a custom ops library
#24613 closed Aug 7, 2025
[DO NOT UNPIN] ORT 1.22.0 Release Candidates available for testing
#24671 closed Aug 7, 2025
Scale in resize node becomes an identity node not a parameter inside resize node
#24824 closed Aug 7, 2025
Import error in pytest with onnxruntime-directml 1.22.0
#24907 closed Aug 7, 2025
[Web] Clarification on wasm/simd vs wasm/simd/threaded default in onnxruntime-web v1.19.0+
#25666 closed Aug 7, 2025
[Feature Request] Support for ScatterElements op for QNN-EP
#22962 closed Aug 7, 2025
TreeEnsemble can incorrectly decide a root branch is a leaf.
#24679 closed Aug 6, 2025
[Build] CMake Error related to onnxruntime_unittests.cmake
#24972 closed Aug 6, 2025
onnxruntime custom OP Failure
#25644 closed Aug 5, 2025
onnxruntime with the CPUExecutionProvider errors out while processing the ReverseSequence operator
#24920 closed Aug 4, 2025
[Performance] ORT takes ~11GB memory for quantizing a model of size ~1GB
#24954 closed Aug 4, 2025
[Documentation]
#24958 closed Aug 4, 2025
How to use kv_cache more reasonably in the exported onnx model?
#24873 closed Aug 2, 2025
Llama3.2-1B ONNX Graph generated by olive auto-opt fails to run on DirectML execution provider
#24937 closed Aug 2, 2025
[Build] error: array index 7 is past the end of the array (that has type '__m256[4]')
#23180 closed Aug 1, 2025
olive: a weird behavior of a model converted to ONNX format
#25600 closed Aug 1, 2025

14 Issues opened by 14 people

ORT ABI support in onnx perf test
#25685 opened Aug 7, 2025
[Build] Build failed on Qualcomm WOS platform
#25682 opened Aug 7, 2025
[Build] Pybind11 3.0 support
#25681 opened Aug 7, 2025
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running ReduceSum node. Name:'MaskReduceSum_0' Status Message: CUDNN failure 5000: CUDNN_STATUS_EXECUTION_FAILED
#25676 opened Aug 6, 2025
'onnxruntime_providers_cuda.dll' crashes with access violation during TLS callback initialization (v1.22.1, CUDA 12.9, Windows)
#25670 opened Aug 6, 2025
[Documentation] Comparison with PyTorch code is identical to Comparison with OpenVINO section
#25661 opened Aug 6, 2025
[Feature Request] Integration with ONNX 1.19.0
#25648 opened Aug 4, 2025
IExecutionProvider::FusedNodeAndGraph set intermediate unused results as model outputs
#25647 opened Aug 4, 2025
[Performance] Quantized ONNX models cannot be efficiently used with speculative decoding?
#25636 opened Aug 1, 2025
Output mismatch since version 1.21+
#25634 opened Aug 1, 2025
Inference fails with 4 bit quantization
#25631 opened Aug 1, 2025
[Feature Request] Nvidia TensorRT RTX runtime in C#
#25630 opened Aug 1, 2025
[Feature Request] Support linear_tree=True in Lightgbm
#25623 opened Aug 1, 2025
[Bug] [Performance] Cannot write_calibration_table for per channel quantization calibration
#25621 opened Aug 1, 2025

84 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[MIGraphX EP] Syncing AMD changes upstream
#25583 commented on Aug 7, 2025 • 51 new comments
[webgpu] support And operator
#25440 commented on Aug 6, 2025 • 5 new comments
Add support for bitnets to ORT WebGPU EP
#25587 commented on Aug 3, 2025 • 2 new comments
Add reduceSum support for uint32 and uint64
#25597 commented on Aug 6, 2025 • 1 new comment
[ARM CPU] SVE support for Elementwise kernels
#25238 commented on Aug 3, 2025 • 1 new comment
Fix antialias downsample on CUDA EP
#25265 commented on Aug 4, 2025 • 1 new comment
[MLAS] Add 8-bit weights ARM64 Gemm implementation
#25110 commented on Aug 5, 2025 • 0 new comments
Avoid traversing entire arrays when extracting shape from objects in java
#24833 commented on Aug 4, 2025 • 0 new comments
Using separate cuda streams for one session
#23319 commented on Aug 7, 2025 • 0 new comments
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on Aug 7, 2025 • 0 new comments
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on Aug 7, 2025 • 0 new comments
Awful performance with LASER model when using TensorRT provider
#8315 commented on Aug 7, 2025 • 0 new comments
multiple tests fail on Windows due to `ORT_ENABLE_STREAM` define logic error
#20180 commented on Aug 7, 2025 • 0 new comments
Drop support for Python 3.5
#5961 commented on Aug 7, 2025 • 0 new comments
Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package
#12084 commented on Aug 7, 2025 • 0 new comments
GPU Memory allocation with multiple cuda stream
#12920 commented on Aug 7, 2025 • 0 new comments
[Performance] Dynamic Shape performance
#13198 commented on Aug 7, 2025 • 0 new comments
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on Aug 7, 2025 • 0 new comments
[Performance] running on xavier gpu but cpu usage high
#14676 commented on Aug 7, 2025 • 0 new comments
Does ortvalue_from_numpy support directml?
#15421 commented on Aug 7, 2025 • 0 new comments
perf_view shows nothing after json load
#15927 commented on Aug 7, 2025 • 0 new comments
[TEST FAILED] Several tests fails while running onnxruntime_test_all on armv7 based device
#16387 commented on Aug 7, 2025 • 0 new comments
Unable to use LSTM with mask of dynamic shape with TensorrtExecutionProvider
#16885 commented on Aug 7, 2025 • 0 new comments
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on Aug 7, 2025 • 0 new comments
[webgpu] And int64 to cast
#25610 commented on Aug 2, 2025 • 0 new comments
Refactor code to prevent internal structure from leaking outside Graph class
#25586 commented on Jul 31, 2025 • 0 new comments
[webgpu] Optimize dp4 prefill shader for Qualcomm
#25578 commented on Aug 4, 2025 • 0 new comments
[Web] Avoid unnecessary data copy for pre-allocated tensors
#25571 commented on Aug 4, 2025 • 0 new comments
[CUDA EP] Add hardswish op and add bf16 support for hardsigmoid
#25562 commented on Aug 4, 2025 • 0 new comments
[webgpu] Add more GEMM test
#25556 commented on Aug 1, 2025 • 0 new comments
2bit matmul implementation
#25542 commented on Aug 6, 2025 • 0 new comments
[Not-For-Review] support enableGraphCapture in tests
#25535 commented on Aug 5, 2025 • 0 new comments
Retrieve Device and Command buffer for DML
#25533 commented on Aug 5, 2025 • 0 new comments
POWER : Implement MlasGemmQuantKernel using VSX builtins for M = 1
#25490 commented on Aug 5, 2025 • 0 new comments
Compile API: disable optimizations by default
#25474 commented on Aug 2, 2025 • 0 new comments
Compile API: output EPContext binary data to write function
#25471 commented on Aug 2, 2025 • 0 new comments
Compile API: output model and initializer stream write functions
#25455 commented on Aug 2, 2025 • 0 new comments
[EP ABI] Get EP compiled model compatibility
#25331 commented on Aug 2, 2025 • 0 new comments
Fix Sign and Clip operation on int64 tensors
#25280 commented on Aug 4, 2025 • 0 new comments
[Mlas] optimize MlasConv using thread partition opt
#25255 commented on Aug 5, 2025 • 0 new comments
[WIP] Add some device discovery support for non-Windows platforms
#25228 commented on Aug 2, 2025 • 0 new comments
Update index.md
#25119 commented on Aug 5, 2025 • 0 new comments
Type mismatch error when loading a Float16 model
#25522 commented on Aug 4, 2025 • 0 new comments
[Build] cmake cannot find KLEIDIAI - Windows 11 ARM
#24865 commented on Aug 3, 2025 • 0 new comments
onnxruntime errors out due to the wrong process of GatherElements operator with the CPUExecutionProvider: Out of range value in index tensor
#24917 commented on Aug 3, 2025 • 0 new comments
[BUG] Non-zero status code returned while running Resize node. in Direct ML backend
#24928 commented on Aug 3, 2025 • 0 new comments
Incorrect Use of CUDA Constants in MIGraphXExecutionProvider::CreatePreferredAllocators (Should Use HIP)
#25268 commented on Aug 3, 2025 • 0 new comments
pip install keras and pytorch comes with .onnxruntime_pybind11_state error from rembg python package
#25289 commented on Aug 3, 2025 • 0 new comments
[Performance] How to used pinned memory in onnxruntime.
#20947 commented on Aug 2, 2025 • 0 new comments
[Build] can't build CUDA (+ vino and directML) for latest v1.22 on windows
#25081 commented on Aug 2, 2025 • 0 new comments
Initializers use wrong allocator
#25108 commented on Aug 2, 2025 • 0 new comments
[Build] CMake configurations files for bin release 1.22.0 are broken
#25242 commented on Aug 2, 2025 • 0 new comments
[DirectML EP] Error when validating attributes of `Slice` operator
#25252 commented on Aug 2, 2025 • 0 new comments
[Build] CMake configurations files for bin release 1.22.0 are broken for Linux
#25279 commented on Aug 2, 2025 • 0 new comments
[Build] Build script for MacOS fails for targets older than 13.4 because tests can not be built
#24277 commented on Aug 2, 2025 • 0 new comments
Does BatchNormalization support 2D shape of `X` input
#25230 commented on Aug 1, 2025 • 0 new comments
Segmentation Fault running model
#25613 commented on Aug 1, 2025 • 0 new comments
[Build] CCCL API migration issue.
#24774 commented on Aug 1, 2025 • 0 new comments
[Build] How to build ONNX Runtime as a dynamic framework (.dylib/.framework) for iOS?
#25256 commented on Aug 1, 2025 • 0 new comments
about infer ocr Memory exception
#25258 commented on Aug 1, 2025 • 0 new comments
[Bug] CUDAExecutionProvider fails to load due to missing libcudnn.so.9 in LD_LIBRARY_PATH when using onnxruntime-gpu==1.22.0
#25609 commented on Aug 1, 2025 • 0 new comments
[Feature Request] Improve Telemetry Disablement
#25573 commented on Aug 1, 2025 • 0 new comments
[Feature Request] Cast Float16 model to Float32 [Web]
#17230 commented on Aug 1, 2025 • 0 new comments
RunAsync C# API crashes without any error
#19140 commented on Aug 7, 2025 • 0 new comments
[Web] `Error: [WebGPU] Kernel "[Conv] /text_encoder/encoder/layers.0/feed_forward/conv_2/Conv" failed. Error: FILTER_IN_CHANNEL should be equal to DATA_CHANNEL`
#21108 commented on Aug 7, 2025 • 0 new comments
[Performance] Increased memory usage when loading from bytes
#21165 commented on Aug 7, 2025 • 0 new comments
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on Aug 7, 2025 • 0 new comments
[Build] Mismatched library directory in linux-x64 package: lib and lib64
#22267 commented on Aug 7, 2025 • 0 new comments
[Performance] GPU op placement control when some ops must be on the CPU
#23154 commented on Aug 7, 2025 • 0 new comments
[Build] 1.20.2 Microsoft.ML.OnnxRuntime.Managed nuget package needs Microsoft.ML.OnnxRuntime 1.20.2 which is not available
#23640 commented on Aug 7, 2025 • 0 new comments
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 commented on Aug 7, 2025 • 0 new comments
GetShape crashes on Linux
#25295 commented on Aug 7, 2025 • 0 new comments
[Build] Cannot read property 'install' of null with onnxruntime-react-native imported
#19510 commented on Aug 7, 2025 • 0 new comments
OpenVino Runtime Exception. Unexpected: CPU plug-in doesn't support If operation with dynamic rank. Operation name: input.15
#23757 commented on Aug 7, 2025 • 0 new comments
[Performance] Openvino 2x slower than with OpenCV on an Intel HD Graphics 620 / 630
#25266 commented on Aug 6, 2025 • 0 new comments
[CANN] When using onnxruntime-cann for inference, it failed to utilize the NPU for inference
#22229 commented on Aug 6, 2025 • 0 new comments
OpenVINO EP fails to run models with in-memory external data
#25304 commented on Aug 6, 2025 • 0 new comments
[WebGPU] Subgroups feature is not enabled for ort-web WebGPU EP
#25595 commented on Aug 6, 2025 • 0 new comments
[Bug] Auto EP selection rejects the combination of DML EP with other EPs like OpenVINO EP
#25504 commented on Aug 5, 2025 • 0 new comments
GetEpDevices() does not Detect Intel NPU via OpenVINO EP
#25557 commented on Aug 5, 2025 • 0 new comments
onnxruntime-gpu fails to find libnvrtc.so.12 when CUDA is not installed globally
#24719 commented on Aug 5, 2025 • 0 new comments
mutex issue on Mac only for release 1.21.X only
#24579 commented on Aug 5, 2025 • 0 new comments
Cannot use Microsoft.ML.OnnxRuntime NuGet package 1.22.1 with Microsoft.SemanticKernel.Connectors.Onnx
#25287 commented on Aug 4, 2025 • 0 new comments
Incorrect cubic resizing with antialias on CUDA
#25264 commented on Aug 4, 2025 • 0 new comments