Insights: microsoft/onnxruntime
Overview
41 Pull requests merged by 26 people
-
Bump tmp from 0.2.1 to 0.2.4 in /onnxruntime/test/wasm
#25671 merged
Aug 7, 2025 -
[webgpu] support float16 type for Einsum operator
#25443 merged
Aug 7, 2025 -
Update number of mel bins for whisper model
#25675 merged
Aug 7, 2025 -
Allow DML EP to be used with any CPU EP
#25664 merged
Aug 7, 2025 -
Cherry-pick MiGraphX EP fixes from upstream for rel-1.23.0
#25659 merged
Aug 7, 2025 -
ORT perf test support for plugin EP
#25374 merged
Aug 6, 2025 -
Enable BrowserStack testing stage
#25668 merged
Aug 6, 2025 -
Remove training packages from onnxruntime-ios-packaging-pipeline
#25451 merged
Aug 6, 2025 -
Fix the is_leaf check in TreeEnsemble
#25410 merged
Aug 6, 2025 -
upgrade dawn to 794b6fadc4171f7b853a77ffdf0948fbec431f41
#25461 merged
Aug 6, 2025 -
[NV TRT RTX EP] Cumulative TRT RTX EP merge
#25656 merged
Aug 6, 2025 -
[build] fix build with delay load hook
#25657 merged
Aug 5, 2025 -
[build] fix WebAssembly build on macOS/arm64
#25653 merged
Aug 5, 2025 -
[web] fix support for subgroup
#25649 merged
Aug 4, 2025 -
Add patch for WebGPU on Android to handle fp16 in uniforms
#25349 merged
Aug 4, 2025 -
Cherry-pick PR #25626 to 1.23.0 release branch
#25640 merged
Aug 4, 2025 -
Enable 2bit CPU matmul fallback
#25582 merged
Aug 4, 2025 -
[QNN EP] Add support for GatherNd Op in QNN EP
#25635 merged
Aug 4, 2025 -
Update qMoE spec to support block quantization
#25641 merged
Aug 4, 2025 -
[MIGraphX EP] Fix CreateExecutionProviderFactory with correct struct and change vendor_id
#25625 merged
Aug 4, 2025 -
[CANN] Fix CANN build error
#25627 merged
Aug 4, 2025 -
Update MoE and qMoE spec
#25619 merged
Aug 2, 2025 -
[build] use cross-compile to build macOS x86_64 target for WebGPU
#25617 merged
Aug 2, 2025 -
Add support for QMoE in CPU
#25558 merged
Aug 2, 2025 -
Move moving weights to memory to the end of Graph::Resolve()
#25626 merged
Aug 2, 2025 -
Add CUDA implementation of GatherBlockQuantized operator
#25575 merged
Aug 1, 2025 -
Cherry-picks for ORT 1.23.0
#25620 merged
Aug 1, 2025 -
[build] fix macOS x86_64 cross-compile warning
#25615 merged
Aug 1, 2025 -
update CANN docs
#25624 merged
Aug 1, 2025 -
[webgpu] Apply Flash Attention if sliding window exceeds KV cache length
#25594 merged
Aug 1, 2025 -
Update macOS target version from 13.3 to 13.4
#25616 merged
Aug 1, 2025 -
[QNN-EP] Resolve VTCM buffer sharing bugs
#25622 merged
Aug 1, 2025 -
[QNN EP] Add ONNX ScatterElements support
#24811 merged
Aug 1, 2025 -
[QNN EP] Bug fix: multiple consumer for cast result in name conflicts
#25584 merged
Aug 1, 2025 -
Optimize layout for SubgroupMatrixLoad on Intel
#25384 merged
Aug 1, 2025 -
[QNN EP] Lower Gemm with 2d bias to FC + ElementwiseAdd when targeting HTP.
#25605 merged
Aug 1, 2025 -
[build] disable CodeQL for NPM Packaging Pipeline
#25614 merged
Aug 1, 2025 -
Cache opSupportLimits to improve the performance and update tracing e…
#25589 merged
Jul 31, 2025 -
Refactor Java Test Pipeline
#25608 merged
Jul 31, 2025 -
[QNN EP] Add Unit tests for LPBQ Fusions
#25592 merged
Jul 31, 2025
34 Pull requests opened by 31 people
-
Reduce CMake's CUDA_ARCHITECTURES
#25618 opened
Jul 31, 2025 -
fix(ort): add automatic patching for nvidia cudnn library path
#25628 opened
Aug 1, 2025 -
[VitisAI] bugfix model_clone optimization
#25629 opened
Aug 1, 2025 -
[webgpu] support GatherND operator
#25632 opened
Aug 1, 2025 -
rel-1.22.2 cherry-pick 1
#25633 opened
Aug 1, 2025 -
Depthwise conv 3x3 s1
#25637 opened
Aug 2, 2025 -
Add more tests to GatherBlockQuantized operator
#25639 opened
Aug 2, 2025 -
Optimizations and fixes in QMoE CPU kernel
#25642 opened
Aug 3, 2025 -
Bump ruff from 0.12.4 to 0.12.7
#25643 opened
Aug 4, 2025 -
Support int4 and uint4 for reshape on opset 21+.
#25645 opened
Aug 4, 2025 -
DequantizeLinear should support non-zero zero_point when input type is int32
#25646 opened
Aug 4, 2025 -
Add comprehensive ThresholdedRelu custom operator example and fix common implementation issues
#25650 opened
Aug 4, 2025 -
Skip node output dump for MemcpyToHost
#25651 opened
Aug 5, 2025 -
Properly remove in-memory references
#25652 opened
Aug 5, 2025 -
Bugfix vitisai ep model clone with 23979 25320
#25654 opened
Aug 5, 2025 -
safeint.h: quelch gcc's -Wreturn-type
#25655 opened
Aug 5, 2025 -
Update semver.h to fix compilation error under linux
#25658 opened
Aug 5, 2025 -
Remove incorrect function calls
#25662 opened
Aug 6, 2025 -
fixed matmul broadcasting
#25663 opened
Aug 6, 2025 -
Python binding for listing custom operators.
#25665 opened
Aug 6, 2025 -
[QNN-EP] Add CastLoneQFusion to transform Cast and QNode into Convert
#25667 opened
Aug 6, 2025 -
Replace vmlaq_f32 with vfmaq_f32 (fused multiply-add)
#25669 opened
Aug 6, 2025 -
Relax WeightBiasQuantization constraint for larger QDQ node group
#25673 opened
Aug 6, 2025 -
[webgpu] support bool for binary operators
#25674 opened
Aug 6, 2025 -
Add vendorid check to GetSupportedDevicesImpl for MIGraphx EP
#25677 opened
Aug 6, 2025 -
[WIP] Test integration with ONNX 1.19
#25678 opened
Aug 7, 2025 -
[WebNN] Remove NHWC preferred layout
#25679 opened
Aug 7, 2025 -
FP16 inference performance improvement on CPU
#25680 opened
Aug 7, 2025 -
[MIGraphX EP][BUG] Fix pybind compilation with MIGraphX after merging #25346
#25683 opened
Aug 7, 2025 -
Skeleton for Attention(23) on CUDA
#25684 opened
Aug 7, 2025 -
2-bit TMAC matmul
#25686 opened
Aug 7, 2025 -
Update QAIRT to 2.37.0
#25688 opened
Aug 7, 2025 -
[WIP] Move provider tests to `onnxruntime_provider_test` and enable use of plugin EPs
#25689 opened
Aug 7, 2025
1,207 Issues closed by 31 people
-
Microsoft.AI.MachineLearning cannot be used in UWP app on on Windows 10 ARM64
#4686 closed
Aug 7, 2025 -
Debugging capability of onnxruntime in Visual Studio 2019 incapacitated
#4812 closed
Aug 7, 2025 -
[WinML] [C++/WinRT] Clarify how to share Ort::Env environments with WinRT/WinML instances
#4971 closed
Aug 7, 2025 -
C Sharp API for openvino doesn't run on GPU
#5011 closed
Aug 7, 2025 -
onxruntime-gpu installation issues
#5020 closed
Aug 7, 2025 -
program stucks when multi processes
#5093 closed
Aug 7, 2025 -
Exception thrown from Dispose method (When missing dependency)
#5250 closed
Aug 7, 2025 -
DLRM model failure to execute on GPU
#5295 closed
Aug 7, 2025 -
Running quantized models on GPU
#5359 closed
Aug 7, 2025 -
Can Session::Run be const?
#5558 closed
Aug 7, 2025 -
ML.NET issue while Using yolov4 onnx model
#5593 closed
Aug 7, 2025 -
Passing Non-Const pointer to Session::Run() using CPP Api
#5597 closed
Aug 7, 2025 -
How to reduce memory used?
#5711 closed
Aug 7, 2025 -
openvino build failed nuget
#5749 closed
Aug 7, 2025 -
How to loading a pytorch model with input shape of (None, 32) using the C# inference ?
#5781 closed
Aug 7, 2025 -
Any support for double type tensor when loading pytorch onnx model ?
#5782 closed
Aug 7, 2025 -
memory keep increasing with dynamic input shape of network
#5796 closed
Aug 7, 2025 -
Memory usage with Cuda ExecutionProvider
#5801 closed
Aug 7, 2025 -
Non-zero status code returned while running Div node
#5830 closed
Aug 7, 2025 -
Performance comparison
#5834 closed
Aug 7, 2025 -
IOBindings in C++ API are missing a way to SynchronizeInputs.
#5857 closed
Aug 7, 2025 -
Quantized model much slower than full precision model
#5865 closed
Aug 7, 2025 -
Performance issue with operator Where on CPU
#5896 closed
Aug 7, 2025 -
Performance issue with operators SVMRegressor and SVMClassifier for RBF kernel on CPU
#5898 closed
Aug 7, 2025 -
Support GCN
#5910 closed
Aug 7, 2025 -
EyeLike with dynamic shape results in error
#5917 closed
Aug 7, 2025 -
Can't train mnist in parallel
#5918 closed
Aug 7, 2025 -
could not open "tensorrt_provider_factory.h", "mkldnn_provider_factory.h"
#5925 closed
Aug 7, 2025 -
Dynamic shape got wrong output
#5928 closed
Aug 7, 2025 -
Issue with Multi-GPU and GPU memory limit
#5939 closed
Aug 7, 2025 -
can not get expected speed in onnxruntime
#5953 closed
Aug 7, 2025 -
Error using onnx model containing Bidirectional layer with MatMulAddFusion
#5955 closed
Aug 7, 2025 -
No opset import for domain 'com.microsoft'
#5971 closed
Aug 7, 2025 -
"undefined symbol" error occured, when I use ort.SessionOptions.register_custom_ops_library
#5984 closed
Aug 7, 2025 -
Under TRT EP, custom op cannot fall back to CUDA EP
#6002 closed
Aug 7, 2025 -
Inconsistent inference time between C Python API [Megatron-LM]
#6025 closed
Aug 7, 2025 -
Onnx Batch Processing
#6044 closed
Aug 7, 2025 -
How to extract the size of a map type in c++?
#6077 closed
Aug 7, 2025 -
how to implement execution provider (EP) that allow onnx run on my hardware?
#6110 closed
Aug 7, 2025 -
32bit vs 64bit when compiling or something else?
#6144 closed
Aug 7, 2025 -
GPU memory consumption keeps increasing with multithreading in Java
#6181 closed
Aug 7, 2025 -
Not support rtx 3000 series
#6213 closed
Aug 7, 2025 -
sample c++ program just print "hello" does not start
#6243 closed
Aug 7, 2025 -
Cannot create OnnxTensor with UINT8 type.
#6261 closed
Aug 7, 2025 -
Referencing Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntime.GPU in a c# project.
#6264 closed
Aug 7, 2025 -
Could onnxruntime be compiled into wasm using emsdk?
#6275 closed
Aug 7, 2025 -
Performance shaking
#6301 closed
Aug 7, 2025 -
[Bug] Wrong implementation in LpPool
#6302 closed
Aug 7, 2025 -
Memory corruption when using OnnxRuntime with OpenVINO on the Intel MyriadX and Raspberry Pi 4B
#6304 closed
Aug 7, 2025 -
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Error while inferencing DLRM onnx model
#6319 closed
Aug 7, 2025 -
Error: Running double precision model exported from pyTorch
#6320 closed
Aug 7, 2025 -
The output for GPT is NAN when fp16=True
#6328 closed
Aug 7, 2025 -
ROCm build seems broken: `error: ‘ncclComm_t’ does not name a type`
#6358 closed
Aug 7, 2025 -
Implementation of ONNX Functions
#6360 closed
Aug 7, 2025 -
Incorrect TypeInferenceError on UNDEFINED tensor type
#6370 closed
Aug 7, 2025 -
[C-Api] Dynamic Shape Error: Non-zero status code returned while running Sigmoid node.
#6372 closed
Aug 7, 2025 -
onnxruntime v1.6.0 on Jetson Nano - Illegal Instruction (core dumped)
#6375 closed
Aug 7, 2025 -
How to extract dimension of inputs in onnxruntime/core/providers/cpu/math/matmul.cc
#6396 closed
Aug 7, 2025 -
Which executor to build when using: Intel® Deep Learning Boost (Intel® DL Boost)
#6400 closed
Aug 7, 2025 -
[question] Configure GPU arena with Python bindings
#6411 closed
Aug 7, 2025 -
Onnxruntime error when Relu-layer follows Dense-layer without activation and biases
#6423 closed
Aug 7, 2025 -
Reshape `requested_shape` forced to have leading dimension 1 when it should be -1
#6424 closed
Aug 7, 2025 -
NaN in AveragePooling
#6543 closed
Aug 7, 2025 -
Loss of accuracy when GPT-2 based model is exported to ONNX
#6549 closed
Aug 7, 2025 -
Custom Op Registration and Implementation
#6564 closed
Aug 7, 2025 -
Inference error using migraohx-onnxruntime
#6605 closed
Aug 7, 2025 -
/onnxruntime/core/mlas/lib/quantize.cpp:50:62: error: ‘vminnmq_f32’ was not declared in this scope
#6638 closed
Aug 7, 2025 -
Failed to add Microsoft.AI.MachineLearning NuGet package to .NET Framework 4.6.1 projects
#6662 closed
Aug 7, 2025 -
INT8 quantized model is very slow
#6732 closed
Aug 7, 2025 -
Shape inference error for Range node
#6737 closed
Aug 7, 2025 -
onnxruntime-gpu (cudaexecutionprovider) usage of cudnn autotuner
#6744 closed
Aug 7, 2025 -
Unable to compile on Linux with CUDA
#6749 closed
Aug 7, 2025 -
Onnxruntime inference with Integrated GPU Failed
#6755 closed
Aug 7, 2025 -
Onnxruntime.gpu is as slower than cpu mode
#6799 closed
Aug 7, 2025 -
Multiple input and multiple output models that create tensors in loops can cause serious crashes
#6821 closed
Aug 7, 2025 -
ONNXRuntime Inference with Finetuned BERT Model outputting odd results
#6830 closed
Aug 7, 2025 -
Unable to build onnxruntime with "--build_wheel" and "--enable_pybind" options
#6841 closed
Aug 7, 2025 -
[JAVA Bindings + Android arm64-v8a] ONNXRuntime build documentation
#6923 closed
Aug 7, 2025 -
dynamic shape input is much slower than fixed shape input in gpu
#6978 closed
Aug 7, 2025 -
CUDA header requested but missing in DNNL part of ORT 1.7.1
#7005 closed
Aug 7, 2025 -
Build fail for docker on MacOS. -NO GPU.
#7052 closed
Aug 7, 2025 -
Large Memory Allocations When Loading RandomForestRegressor Model
#7067 closed
Aug 7, 2025 -
Non-zero status code returned while running BatchNormalization node
#7095 closed
Aug 7, 2025 -
Memory and timing issue with onnxruntime python API with TensorFlow model
#7106 closed
Aug 7, 2025 -
Compile error in header onnxruntime_cxx_api.h when update ONNX runtime from 1.5.2 to 1.7.1
#7142 closed
Aug 7, 2025 -
Batch inference
#7178 closed
Aug 7, 2025 -
Segmentation fault when running onnxruntime inside docker with cpuset restrictions
#7207 closed
Aug 7, 2025 -
Significant difference in the performance of pytorch and exported onnx models
#7212 closed
Aug 7, 2025 -
TensorrtExecutionProvider slower than CUDAExecutionProvider: Transformers
#7230 closed
Aug 7, 2025 -
The speed of running the onnx model is 6x slower than the pytorch model on Jetson TX2
#7233 closed
Aug 7, 2025 -
[Python API + ARM64] Running ResNet50 on ARM board using ACL Error and Performance Issue
#7234 closed
Aug 7, 2025 -
ACL (32bit) Execution Provider fails on gemm node
#7255 closed
Aug 7, 2025 -
onnxruntime gpu version can't installed, how to fix it?
#7272 closed
Aug 7, 2025 -
Run a model containing CustomOp with TensorRT provider fails
#7314 closed
Aug 7, 2025 -
C# console app crash upon appending OpenVino execution provider
#7330 closed
Aug 7, 2025 -
Cannot save Tensorrt .engine model in v1.7.1
#7339 closed
Aug 7, 2025 -
openvino continued package by pyinstaller external dll issue
#7346 closed
Aug 7, 2025 -
Resize Operator rounds-down instead of round-to-even for int32/uint8
#7368 closed
Aug 7, 2025 -
How to compile the framework that can run in Windows XP?
#7444 closed
Aug 7, 2025 -
How to release gpu memory without exiting the process?
#7463 closed
Aug 7, 2025 -
Running inference using GPU or TensorRT on Jetson
#7484 closed
Aug 7, 2025 -
Problem compiling ONNX RT with CUDA and TensorRT on Windows
#7562 closed
Aug 7, 2025 -
Use of torch InstanceNorm2d and dynamic tensor size causes crash
#7572 closed
Aug 7, 2025 -
onnxruntime build is not compatible with onnx build. Protobuf loaded twice.
#7597 closed
Aug 7, 2025 -
Large GPU memory usage with EXHAUSTIVE cuDNN search
#7612 closed
Aug 7, 2025 -
Enable CUDA provider option configuration in Java
#7613 closed
Aug 7, 2025 -
Build fails with --use_rknpu
#7614 closed
Aug 7, 2025 -
Publish the providers with the release build
#7628 closed
Aug 7, 2025 -
int8 quantization on GPU support? (transformers)
#7634 closed
Aug 7, 2025 -
Does onnxruntime support bert with relative position embedding
#7713 closed
Aug 7, 2025 -
quantize model can‘t run on gpu ?
#7745 closed
Aug 7, 2025 -
TensorRT execution provider SEGFAULT
#7757 closed
Aug 7, 2025 -
CUDA kernel not found in registries for Op type: Pad
#7779 closed
Aug 7, 2025 -
ACL and ArmNN v21.02 EP has problem with GEMM
#7784 closed
Aug 7, 2025 -
get error when using a model with custom op
#7788 closed
Aug 7, 2025 -
How to get sparse tensor input in custom op?
#7838 closed
Aug 7, 2025 -
Build failure in onnxruntime/test/featurizers_ops/truncated_svdtransformer_test.cc
#7878 closed
Aug 7, 2025 -
undefined reference to `onnx::optimization::GetAvailablePasses() on Nvidia Jetson NX
#7970 closed
Aug 7, 2025 -
Running multiple input node onnx model using onnxrntime C/C++ API
#8019 closed
Aug 7, 2025 -
Memory leak in free-dimention model in C++
#8053 closed
Aug 7, 2025 -
CUDAExecutionProvider does not handle Clip on float16 tensor.
#8070 closed
Aug 7, 2025 -
Why ReduceSum get shape 0 for an empty input?
#8146 closed
Aug 7, 2025 -
System memory leak on cuda GPU backend.
#8147 closed
Aug 7, 2025 -
Does ONNX Runtime and its execution providers support FP16 inference?
#8173 closed
Aug 7, 2025 -
Reflect padding output seems incorrect when padding size larger than input dimension
#8265 closed
Aug 7, 2025 -
Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed
#8313 closed
Aug 7, 2025 -
Inference Speed is slow on GPU
#8316 closed
Aug 7, 2025 -
After 8bit quantization, the GPU inference speed is very slow
#8330 closed
Aug 7, 2025 -
GPUs operate slower than CPUs
#8362 closed
Aug 7, 2025 -
error using C# tensorRT EP builded from source
#8367 closed
Aug 7, 2025 -
Why cuda provider allocator must be threadlocal?
#8378 closed
Aug 7, 2025 -
Implement Split for double or float64 data type
#8382 closed
Aug 7, 2025 -
ERROR running model inference:Non-zero status code returned while running Cast node
#8424 closed
Aug 7, 2025 -
Found regression on ORT 1.8.1
#8513 closed
Aug 7, 2025 -
Does the onnxruntime.quantization.quantize_dynamic support GPU quantization?
#8524 closed
Aug 7, 2025 -
gpu memory can not release.
#8544 closed
Aug 7, 2025 -
Build failure of onnxruntime Docker container with Vitis-AI
#8596 closed
Aug 7, 2025 -
PrepareForCompute Non concat axis dimensions must match: Axis 0 has mismatched dimensions of 1 and 0
#8685 closed
Aug 7, 2025 -
error with torch.sum or torch.tensor.mean operator on GPU
#8742 closed
Aug 7, 2025 -
Symbolic shape inference error for loop node & seq(tensor)
#8755 closed
Aug 7, 2025 -
onnxruntime Jetson tx2 cuda
#8771 closed
Aug 7, 2025 -
AttributeError: module 'onnxruntime' has no attribute 'set_default_logger_severity'
#8789 closed
Aug 7, 2025 -
IsNaN and Split have no double implementations
#8791 closed
Aug 7, 2025 -
Readily available Python wheels for ARM?
#8874 closed
Aug 7, 2025 -
cannot import name ‘get_all_providers‘
#8907 closed
Aug 7, 2025 -
Runetime Error: Decoder with dynamic axes does not work with Encoder output
#8910 closed
Aug 7, 2025 -
How to get the value of tensors in subgraph?
#8929 closed
Aug 7, 2025 -
The model run time become longer when i update the onnxruntime from version 1.7 to version 1.8
#8938 closed
Aug 7, 2025 -
ONNX inference result are different to pytorch model
#8977 closed
Aug 7, 2025 -
Type error when runs an control flow model in ORT
#8999 closed
Aug 7, 2025 -
cross compile but onnx-ml.pb.cc error
#9093 closed
Aug 7, 2025 -
how to input 'None' in cpp-version
#9121 closed
Aug 7, 2025 -
InferenceSession.run in python is inconsistent in terms of performance
#9208 closed
Aug 7, 2025 -
Can't load Cuda Provider on Linux due symbol lookup error
#9309 closed
Aug 7, 2025 -
ONNXRuntime CPU - Memory spiking continuously (Memory leak)
#9313 closed
Aug 7, 2025 -
error: '_Frees_ptr_opt_' has not been declared
#9332 closed
Aug 7, 2025 -
QLinearConv per-channel result is wrong and it's seem overflow when input is big for my model
#9365 closed
Aug 7, 2025 -
ORT execution fails when a gradient builder is not registered for module-local functions
#9375 closed
Aug 7, 2025 -
Relu getting dropped during quantization
#9425 closed
Aug 7, 2025 -
OnnxRuntime Build Failure in Docker
#9530 closed
Aug 7, 2025 -
YAMNet model running on CudaExecutionProvider is 3x slower than running on tensorflow
#9657 closed
Aug 7, 2025 -
Gap in inference time between onnxruntime and torch vanishes when increasing the batch size
#9660 closed
Aug 7, 2025 -
libonnxruntime.so crash
#9684 closed
Aug 7, 2025 -
yolov5 with the compiled onnxruntime by self,but is so slow, not with the GPU
#9689 closed
Aug 7, 2025 -
Unable to load shared library 'onnxruntime' on MacOS (DllNotFoundException)
#9707 closed
Aug 7, 2025 -
Support for int64 with webgl backend of the web runtime
#9724 closed
Aug 7, 2025 -
ouput of onnx model with custom op in the loop structrue is confusing
#9742 closed
Aug 7, 2025 -
How to build for multiple execution provider?
#9756 closed
Aug 7, 2025 -
Inference is slower when running inside Docker
#9767 closed
Aug 7, 2025 -
[ONNXRuntimeError] : 1 : FAIL : Fatal error: test_custom is not a registered function/op
#9831 closed
Aug 7, 2025 -
non-NEON Compatibility
#9849 closed
Aug 7, 2025 -
Yolov5 ORT train failed with onnxruntime backend
#9936 closed
Aug 7, 2025 -
Support for pip wheel tensorrt
#9986 closed
Aug 7, 2025 -
question about warnup long time
#10017 closed
Aug 7, 2025 -
Importing onnxruntime on AWS Lambdas with ARM64 processor causes crash
#10038 closed
Aug 7, 2025 -
how to forward with a batch images, oncetime?
#10071 closed
Aug 7, 2025 -
when my models input size is 3808, then i forward with yolov5, the memry is break.
#10074 closed
Aug 7, 2025 -
Same Pad_Head value in ORT for SAME_UPPER/SAME_LOWER if get negative odd pad value
#10086 closed
Aug 7, 2025 -
onnxruntime latest version segment fault
#10113 closed
Aug 7, 2025 -
ORTModule import error : with onnxruntime
#10127 closed
Aug 7, 2025 -
BatchNorm fails on CUDA EP with zero length sequences
#10128 closed
Aug 7, 2025 -
Do you have any plan to add 'Round' Operator for gradient builder registry for orttrainer?
#10138 closed
Aug 7, 2025 -
Performance question about some nodes generated by dynamic quantization
#10153 closed
Aug 7, 2025 -
Sigmoid fails and output all zeros
#10154 closed
Aug 7, 2025 -
Why does onnxruntime run slower on C++?
#10155 closed
Aug 7, 2025 -
`InferenceSession` initialization hangs
#10166 closed
Aug 7, 2025 -
TensorRT EP failed to set INT8 dynamic range.
#10206 closed
Aug 7, 2025 -
how to use docker and onnxruntime deploy onnx model on GPU?
#10257 closed
Aug 7, 2025 -
Inconsistent inference timing on CPU
#10270 closed
Aug 7, 2025 -
Inference: Time in GPU is similar in CPU. GPU not speed up
#10271 closed
Aug 7, 2025 -
multiple InferenceSession slowdown inference speed
#10273 closed
Aug 7, 2025 -
DnnlExecutionProvider is not visible in python API
#10275 closed
Aug 7, 2025 -
add QLinearMatMul do not quantize per channel flag to quantize_static extra options
#10283 closed
Aug 7, 2025 -
onnxruntime inference is around 5 times slower than pytorch when using GPU
#10303 closed
Aug 7, 2025 -
Bug: pthread sent an error! undefined:undefined: ortWasmThreaded is not defined
#10311 closed
Aug 7, 2025 -
Onnxruntime multithread options [C++ CPU]
#10330 closed
Aug 7, 2025 -
Issues when trying to use Onnxruntime and Tensorrt execution provider in a java application
#10352 closed
Aug 7, 2025 -
build onnxruntime error linux
#10364 closed
Aug 7, 2025 -
Error happened while building onnxruntime
#10378 closed
Aug 7, 2025 -
Question about hidden states in onnx DistilGPT2
#10382 closed
Aug 7, 2025 -
Is TensorRT execution provider caching is thread-safe
#10412 closed
Aug 7, 2025 -
Loading a Keras model with custom layers into Microsoft.ML
#10419 closed
Aug 7, 2025 -
cast BatchNorm2d to int32
#10440 closed
Aug 7, 2025 -
TensorRT input: 717 has no shape specified.
#10443 closed
Aug 7, 2025 -
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 closed
Aug 7, 2025 -
C++ OnnxRuntime-GPU Slower than Python OnnxRuntime-GPU/C++ OnnxRuntime-CPU
#10492 closed
Aug 7, 2025 -
slower after graph optimization!
#10538 closed
Aug 7, 2025 -
onnxruntime and onnxruntime-gpu produce different output for ReduceL1 operator
#10542 closed
Aug 7, 2025 -
Run maskrcnn onnx from pytorch and inference on c++ with gpu sometimes will error
#10543 closed
Aug 7, 2025 -
Exception in DirectML on second inference run
#10546 closed
Aug 7, 2025 -
Unit Tests failure while building on Windows with CUDA EP
#10561 closed
Aug 7, 2025 -
Building Error
#10600 closed
Aug 7, 2025 -
OpenVINO Execution provider's CPU Utility is low
#10601 closed
Aug 7, 2025 -
How to use OpenVINO GetAvailableDevices?
#10602 closed
Aug 7, 2025 -
why it take 200 seconds to run onnxruntime.InferenceSession
#10608 closed
Aug 7, 2025 -
Building OnnxRuntime v1.10.0 with CUDAExecutionProvider for sm_75 GPU fails in CUDA10.2 environment
#10610 closed
Aug 7, 2025 -
C + + onnxruntime GPU is ten times slower than CPU
#10611 closed
Aug 7, 2025 -
Optimization for T5 transformer models.
#10613 closed
Aug 7, 2025 -
about providers and providers_options in InferenceSession
#10620 closed
Aug 7, 2025 -
How to use mimalloc in Linux?
#10629 closed
Aug 7, 2025 -
CPU & CUDA execution provider produce different value
#10636 closed
Aug 7, 2025 -
No libonnxruntime_providers_cuda.so generated?
#10639 closed
Aug 7, 2025 -
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION
#10657 closed
Aug 7, 2025 -
Get wrong result when use webgl backend
#10673 closed
Aug 7, 2025 -
onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs initializer_node_arg != nullptr was false.
#10677 closed
Aug 7, 2025 -
Need help on the following from wiki listed roadmap.
#10689 closed
Aug 7, 2025 -
Output shape is mismatched with ONNX SPEC about Resize_tf_crop_and_size with scale input
#10727 closed
Aug 7, 2025 -
gpu onnxruntime lib
#10731 closed
Aug 7, 2025 -
Onnx model consumes huge CPU memory
#10742 closed
Aug 7, 2025 -
inference qdq model failed with TRT EP.
#10743 closed
Aug 7, 2025 -
build on windows cup is fine,but cuda not
#10745 closed
Aug 7, 2025 -
Is there a version of onnxruntime that is compatible with windows 7?
#10749 closed
Aug 7, 2025 -
can build on windows with Geforce 1060 card, cuda 11.0 cudnn 8.0.2 successfully?
#10763 closed
Aug 7, 2025 -
very slow in inference
#10764 closed
Aug 7, 2025 -
ONNX models give slower inference in Python Multiprocessing
#10786 closed
Aug 7, 2025 -
Inference time of onnxruntime gpu increases at very high batch sizes
#10789 closed
Aug 7, 2025 -
Transformer optimizer outputs confusing error
#10838 closed
Aug 7, 2025 -
C++ is 10x slower compared with Python, CPU only
#10849 closed
Aug 7, 2025 -
Windows 32 bit performance much slower than 64bit?
#10855 closed
Aug 7, 2025 -
Different inference results from python and C#
#10863 closed
Aug 7, 2025 -
Does WebGL fail when network inputs are not dimensions in powers of two?
#10873 closed
Aug 7, 2025 -
TensorRT conversion support on Huggingface transformers quantized models.
#10888 closed
Aug 7, 2025 -
python3 -m onnxruntime_tools.transformers.optimizer when opt_level=1 comes error for BERT
#10893 closed
Aug 7, 2025 -
1 : Fail : Non-zero status code returned while running FusedConv node.
#10894 closed
Aug 7, 2025 -
After using onnxruntime.transformers.optimizer to optimize onnx, the optimized model fail to tensorrt
#10905 closed
Aug 7, 2025 -
TensorRT Execution [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization
#10914 closed
Aug 7, 2025 -
slow fp16 performance
#10919 closed
Aug 7, 2025 -
onnxruntime TensorRT Related Questions
#10930 closed
Aug 7, 2025 -
MinGW support (MSYS2)
#10976 closed
Aug 7, 2025 -
docker can't clone git repository for ARM64
#10991 closed
Aug 7, 2025 -
Xor with broadcasting computes error
#11000 closed
Aug 7, 2025 -
Inconsistent behavior between CPU and GPU on ReLU operator when input is NaN
#11010 closed
Aug 7, 2025 -
0xc00007b error, could not startup exe at all with onnxruntime1.7 win-x64 cpu on win10
#11016 closed
Aug 7, 2025 -
Failed to build onnxruntime-vitisai docker container due to missing NO_PUBKEY
#11017 closed
Aug 7, 2025 -
Huggingface Transformers Shape Inference Issue
#11019 closed
Aug 7, 2025 -
kalid-onnxruntime Fatal error: Gemm is not a registered function/op
#11021 closed
Aug 7, 2025 -
Updating state of the network
#11026 closed
Aug 7, 2025 -
Can't constant fold SequenceEmpty node
#11041 closed
Aug 7, 2025 -
Cuda EP parallelization issues for batches
#11047 closed
Aug 7, 2025 -
Inference session creation freezes
#11087 closed
Aug 7, 2025 -
compile with cuda error:Couldn't find CUDA library root.
#11090 closed
Aug 7, 2025 -
Performance reduction due to copying of output OrtValues to numpy arrays
#11099 closed
Aug 7, 2025 -
Using DnnlExecutionProvider for inference is much slower than using CPUExecutionProvider.
#11122 closed
Aug 7, 2025 -
Different detection output values for C++ and Python with onnxruntime
#11123 closed
Aug 7, 2025 -
docker container linux run onnxruntime infer core dumped
#11135 closed
Aug 7, 2025 -
[question] yolov5-onnx-float16 not improve on GPU
#11151 closed
Aug 7, 2025 -
How to use Flask with onnxruntime
#11156 closed
Aug 7, 2025 -
Instruction level profiling in onnxruntime
#11159 closed
Aug 7, 2025 -
No c++ header files for building custom op
#11169 closed
Aug 7, 2025 -
A normal output of convolution layer multiplies infinity will result in NaN
#11173 closed
Aug 7, 2025 -
Build from source issue on Windows
#11178 closed
Aug 7, 2025 -
onnxruntime-web is 11-17x times slower than native inference
#11181 closed
Aug 7, 2025 -
Custom Op does not support dynamic input/output number
#11186 closed
Aug 7, 2025 -
Saving GPT2LMHeadModel_ConfigurableOneStepSearch error.
#11198 closed
Aug 7, 2025 -
How to compress the sparse matrix in onnx model
#11200 closed
Aug 7, 2025 -
Inference time for qunatized onnx models, TensorRT> CUDA> CPU. Is this expected?
#11201 closed
Aug 7, 2025 -
auto_set_affinity can't be set to true for parallel executor
#11205 closed
Aug 7, 2025 -
[web] ~100 seconds to load model/InferenceSession
#11217 closed
Aug 7, 2025 -
NonZero shape inference behavior with scalar input mismatches ONNX and PyTorch
#11232 closed
Aug 7, 2025 -
Unhandled exception at 0x00007FFABE6A9538 (cudnn_cnn_infer64_8.dll) in Onnx.exe
#11235 closed
Aug 7, 2025 -
[React Native .ort Model Loading Error] "Error: Can't load a model: No content provider: ..."
#11239 closed
Aug 7, 2025 -
I want use gpu on my jetson nx2 platform with c++, how should i do?
#11240 closed
Aug 7, 2025 -
Unsupported If operator in gradient builder for Hugging Face Transformers RoBERTa model
#11268 closed
Aug 7, 2025 -
optimize_model : new model types
#11270 closed
Aug 7, 2025 -
The onnx model of IMDN is slower than the original pytorch model and output many warnings
#11274 closed
Aug 7, 2025 -
pulled master 1.12 quantization get unexpected result
#11277 closed
Aug 7, 2025 -
Why gpt2-xl (based transformer-xl) onnx slower than the originer pytorch
#11293 closed
Aug 7, 2025 -
is the effect of onnx on Bert affected by python version?
#11295 closed
Aug 7, 2025 -
TVM EP and TensorRT EP do not support dynamic inputs
#11333 closed
Aug 7, 2025 -
MacOS M1 binary compilation and possibility to fine tune a model in C++
#11343 closed
Aug 7, 2025 -
Lower performance on Inceptionv3/4 model with TensorRT EP than TensorRT directly
#11356 closed
Aug 7, 2025 -
CUDAExecutionProvider not releasing memory after terminate session
#11362 closed
Aug 7, 2025 -
ONNX Runtime compatibility for Jetson AGX Xavier
#11378 closed
Aug 7, 2025 -
About running onnxruntime in singularity container
#11397 closed
Aug 7, 2025 -
Benchmark code using torch.onnx.export
#11399 closed
Aug 7, 2025 -
About building onnxruntime singularity container with DockerFile
#11409 closed
Aug 7, 2025 -
Static quantization+per_channel is wrong for MobileNetV3
#11415 closed
Aug 7, 2025 -
Can I quantize TreeEnsembleClassifier op?
#11436 closed
Aug 7, 2025 -
Onnx T5 fp16 conversion without past_key_values
#11438 closed
Aug 7, 2025 -
How to run a double input onnx model
#11453 closed
Aug 7, 2025 -
InferenceSession giving different results than the original sklearn SVC model
#11490 closed
Aug 7, 2025 -
C#, How to access the different output layer of inference (semantic segmentation)
#11502 closed
Aug 7, 2025 -
[Documentation Request]
#11505 closed
Aug 7, 2025 -
onnxruntime error
#11509 closed
Aug 7, 2025 -
[Documentation Request] tensorAt for Csharp?
#11510 closed
Aug 7, 2025 -
About Convolution Implementation
#11517 closed
Aug 7, 2025 -
How to release a session properly?
#11529 closed
Aug 7, 2025 -
Fail to convert model with reusable blocks
#11530 closed
Aug 7, 2025 -
CPUExecutionProvider outputs wrong value for a quantized model
#11532 closed
Aug 7, 2025 -
Using a model with float input types causes space issue
#11541 closed
Aug 7, 2025 -
T5-Large Export Results in ProtoBuf Error due to 2GB External Data when using padded inputs
#11558 closed
Aug 7, 2025 -
CUDA failure 100: no CUDA-capable device is detected ; error when inferencing on a GPUVM
#11561 closed
Aug 7, 2025 -
Specify CPUs to use for parallel inference when external CPU pinning is used
#11563 closed
Aug 7, 2025 -
[js/web] Inference is Broken in Safari when Cross Origin Isolation is active
#11567 closed
Aug 7, 2025 -
Header mismatch C/C++ - mac
#11570 closed
Aug 7, 2025 -
The effect of turning optimization on and off on quantized model performance
#11576 closed
Aug 7, 2025 -
ONNXRUNTIME + OpenVINO on ARM64
#11582 closed
Aug 7, 2025 -
cpu and gpu results is not the same
#11590 closed
Aug 7, 2025 -
CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; error when inferencing on a GPUVM
#11592 closed
Aug 7, 2025 -
issues with pybind11 repository while installing
#11595 closed
Aug 7, 2025 -
Bad performance for QDQ model with openvino EP
#11604 closed
Aug 7, 2025 -
Unable to build onnxruntime_v1.10.0 C++ api with --enable_memory_profile --enable_cuda_profiling flags
#11607 closed
Aug 7, 2025 -
Shape inference fails
#11614 closed
Aug 7, 2025 -
building ——libonnxruntime_providers_cuda.so Error running link command: No such file or directory
#11621 closed
Aug 7, 2025 -
how to set providers with onnx runtime-gpu1.70 ?
#11624 closed
Aug 7, 2025 -
using multithread to call onnxruntime inference,
#11628 closed
Aug 7, 2025 -
which tags should i download of onnxruntime-gpu 1.6 for c#
#11646 closed
Aug 7, 2025 -
build for c#
#11648 closed
Aug 7, 2025 -
output shape can not be specified in com.microsoft::GridSample op
#11652 closed
Aug 7, 2025 -
Installing ORTModule torch extension reports TypeError
#11663 closed
Aug 7, 2025 -
when set inter_op_num=0 with ORT_PARALLEL model the performance is very bad than inter_op_num=1?
#11668 closed
Aug 7, 2025 -
How to implement a new operator inference function?
#11678 closed
Aug 7, 2025 -
which onnxruntime-gpu version is compatible for CUDA 11.1 ?
#11685 closed
Aug 7, 2025 -
Real-ESRGAN slow onnxruntime inference compared to Pytorch one
#11688 closed
Aug 7, 2025 -
Linux CI pipelines can't test unreleased versions of ONNX
#11693 closed
Aug 7, 2025 -
Dynamic quantization of Albert model
#11701 closed
Aug 7, 2025 -
Low level profiling for onnxrt Conv kernel(default backend)
#11702 closed
Aug 7, 2025 -
CUDA EP spending lots of time idling
#11706 closed
Aug 7, 2025 -
Race condition when setting do_copy_in_default_stream to false
#11713 closed
Aug 7, 2025 -
Reading back multidimensional output in C++
#11718 closed
Aug 7, 2025 -
how to get the remaining GPU memory to get the batch size?
#11735 closed
Aug 7, 2025 -
ssd_mobilenet_v1 infer error for TensorRT Execution Provider
#11736 closed
Aug 7, 2025 -
build rknpu backend error
#11738 closed
Aug 7, 2025 -
Pip installed Transformer Benchmark cannot run on TF
#11751 closed
Aug 7, 2025 -
Converted ONNX model works in Python but not in C++
#11761 closed
Aug 7, 2025 -
Failed to build onnxruntime on Apple Silicon
#11805 closed
Aug 7, 2025 -
I do not get any performance improvement after using TensorRT provider for object detection model
#11806 closed
Aug 7, 2025 -
When I use onnxruntime to run onnx model on GPU, it sucks up too much video memory. Is that normal?
#11809 closed
Aug 7, 2025 -
Issue importing onnxruntime
#11815 closed
Aug 7, 2025 -
in cmake/CMakeList.txt all avx related option all set off, do we need do anything to use avx features?
#11833 closed
Aug 7, 2025 -
[ONNXRuntimeError] FuseReluClip failure
#11836 closed
Aug 7, 2025 -
Incompatible dimensions for matrix multiplication Error in StarNet model when doing InferenceSession
#11846 closed
Aug 7, 2025 -
What's the meaning of the hole of tracing file
#11850 closed
Aug 7, 2025 -
How to use batch run?
#11852 closed
Aug 7, 2025 -
Use NPU in NXP iMX8MP?
#11854 closed
Aug 7, 2025 -
What is the meaning of src_arg_index and dst_arg_index in EdgeEndToMatch structure?
#11856 closed
Aug 7, 2025 -
Wrong output shape due to MergeShape failure
#11870 closed
Aug 7, 2025 -
Not clear quantization pipeline for tensorrt ep
#11873 closed
Aug 7, 2025 -
Pytorch -> Onnx custom Yolov5 model works in python but not in JS
#11874 closed
Aug 7, 2025 -
[ONNXRuntimeError] Load model from *** failed: Unsuported type proto value case
#11889 closed
Aug 7, 2025 -
Quantize specific ops per-tensor while per_channel=True
#11890 closed
Aug 7, 2025 -
onnx and onnxruntime disagree on input with no known rank
#11891 closed
Aug 7, 2025 -
Bug: MatMul fails for input shapes of [0, k] and [k, ]
#11895 closed
Aug 7, 2025 -
Immense GPU memory consumption
#11903 closed
Aug 7, 2025 -
ConvTranspose with auto_pad attribute
#11927 closed
Aug 7, 2025 -
how to get inference time with c# onnxruntime-gpu-1.6.0
#11946 closed
Aug 7, 2025 -
execute dnnl provider error
#11947 closed
Aug 7, 2025 -
windows11+onnxruntime1.8.0+vs2019 inferencing crash
#11950 closed
Aug 7, 2025 -
Multi thread of single session Python vs C++ (end with core dumped)
#11951 closed
Aug 7, 2025 -
Inference_GPT2-OneStepSearch_OnnxRuntime_CPU.ipynb Error
#11959 closed
Aug 7, 2025 -
Question about quantize Gemm OP
#11961 closed
Aug 7, 2025 -
Got segmentation fault error when using 'InferenceSession' API
#11964 closed
Aug 7, 2025 -
how to configure global/shared threadpool with multithread, in c#API?
#11966 closed
Aug 7, 2025 -
set gpu option failed
#11967 closed
Aug 7, 2025 -
quant onnx model slower than pytorch with mish6 activation, however faster with relu6
#11975 closed
Aug 7, 2025 -
inference time is not stable
#11983 closed
Aug 7, 2025 -
Any interest in hosting the Rust bindings
#11992 closed
Aug 7, 2025 -
inference is different on linux and windows
#11993 closed
Aug 7, 2025 -
failed to initialize a session in the GPU environment
#11996 closed
Aug 7, 2025 -
The test time of sess.run does not match the time of profile
#11997 closed
Aug 7, 2025 -
build C#api with cuda 11.0 /cudnn 8.0
#11999 closed
Aug 7, 2025 -
Issue with NeMo MTEncDecModel model in ONNX IOBinding
#12003 closed
Aug 7, 2025 -
how to build onnxruntime from source with dnnl?
#12011 closed
Aug 7, 2025 -
create op
#12017 closed
Aug 7, 2025 -
Resize with mode linear is missing output elements
#12019 closed
Aug 7, 2025 -
Builds C# bindings and creates nuget package
#12042 closed
Aug 7, 2025 -
GlobalAveragePool on large size of ones miscalculates
#12043 closed
Aug 7, 2025 -
Using onnxruntime server for model deployment
#12044 closed
Aug 7, 2025 -
Support pasts as inputs in gpt2 beam search operator
#12047 closed
Aug 7, 2025 -
Build wasm static library bug because of missing `testdata` folder.
#12048 closed
Aug 7, 2025 -
Performance in parallel session Run()
#12049 closed
Aug 7, 2025 -
Builds C# bindings and creates nuget package for vs2019 install
#12061 closed
Aug 7, 2025 -
ONNXRuntimeError for "Where" node when the input is too long
#12065 closed
Aug 7, 2025 -
Performance issue with beam search in onnxruntime
#12078 closed
Aug 7, 2025 -
Support for cmake's FetchContent()
#12081 closed
Aug 7, 2025 -
TensorRT Provider Vs TensorRT Native
#12083 closed
Aug 7, 2025 -
Resize with mode linear always produces 0.5 on GPU regardless of the input
#12091 closed
Aug 7, 2025 -
Resize with `nearest` mode have inconsistent results compared to PyTorch and TVM
#12098 closed
Aug 7, 2025 -
onnxruntime tensorrt sometimes costs very long time
#12120 closed
Aug 7, 2025 -
How do I call the same model in CUDA with many various inputs?
#12126 closed
Aug 7, 2025 -
Error in symbolic_shape_infer.py: assert name in self.sympy_data_ or ...
#12127 closed
Aug 7, 2025 -
Inference time vs torch w/regard to batch_size and BatchNorm
#12130 closed
Aug 7, 2025 -
When will Attention OP extra_add_qk input support automatic broadcast
#12149 closed
Aug 7, 2025 -
Query regarding timings under ONNXRT profiler
#12150 closed
Aug 7, 2025 -
Hi Does ONNX Runtime support FP16 and INT8 inference on Intel OneDNN ExecutionProvider?
#12160 closed
Aug 7, 2025 -
Eager mode generator support non-tensor return types
#12163 closed
Aug 7, 2025 -
symbolic_shape_infer.py not working with models quantized with 🤗 Optimum for TensorRT
#12173 closed
Aug 7, 2025 -
upgrading pip and wheels kills CUDAExecutionProvider
#12185 closed
Aug 7, 2025 -
why first session.run is too slower than after
#12197 closed
Aug 7, 2025 -
Performance issue of ConvInteger
#12206 closed
Aug 7, 2025 -
How to release memory after Inference session run in Python
#12207 closed
Aug 7, 2025 -
Regarding the dynamism for custom op in ONNXRT
#12211 closed
Aug 7, 2025 -
Quantized Model Running Slow Using Cuda as EP
#12229 closed
Aug 7, 2025 -
Exported beam search model consumes a lot of more memory
#12246 closed
Aug 7, 2025 -
Mismatch in the order of the column names in the benchmarking script for transformer models
#12265 closed
Aug 7, 2025 -
LoadLibrary failed with error 126 (DirectML)
#12269 closed
Aug 7, 2025 -
TRT EP failed to create model session with CUDA custom op
#12282 closed
Aug 7, 2025 -
Since ORT 1.12 ort.InferenceSession throws error when the last provider is not capable
#12287 closed
Aug 7, 2025 -
Resize op can't work well under Cubic mode with ORT 1.12.
#12302 closed
Aug 7, 2025 -
Details regarding ONNXRuntime inference with OpenVino Backend
#12305 closed
Aug 7, 2025 -
Why the performance of onednn is worse than the common version
#12315 closed
Aug 7, 2025 -
ONNXRT default CPU EP vs Openvino EP Performance
#12316 closed
Aug 7, 2025 -
onnx graph partition optimize
#12318 closed
Aug 7, 2025 -
Wrong native library directory name for M1 Mac in the Java package
#12324 closed
Aug 7, 2025 -
MetaCommand exception from DirectML EP
#12328 closed
Aug 7, 2025 -
window10 ort with openvino backend error
#12334 closed
Aug 7, 2025 -
unsafe exception code in C++ API, wrongly declaring exceptions, incomplete constructors
#12338 closed
Aug 7, 2025 -
Unable to build Onnxruntime 1.12.0 with OpenVINO 2020.3 on Windows 10
#12342 closed
Aug 7, 2025 -
Quantized ONNX model output
#12346 closed
Aug 7, 2025 -
Performance gains by ONNX inconsistent
#12348 closed
Aug 7, 2025 -
Integer quantization fails on Transformer-based vision model
#12362 closed
Aug 7, 2025 -
Setting Openvino EP to run on one core with one thread
#12365 closed
Aug 7, 2025 -
Unable to build tensorrt docker image
#12373 closed
Aug 7, 2025 -
Accept dictionary of tensor as input (python api)
#12380 closed
Aug 7, 2025 -
Fail to build onnxRT with oneDNN using official build command
#12382 closed
Aug 7, 2025 -
Segmentation fault
#12386 closed
Aug 7, 2025 -
While loading the onnx file with InferenceSession getting session ID 11 error
#12402 closed
Aug 7, 2025 -
Failed to build with ACL(and ARMnn)
#12407 closed
Aug 7, 2025 -
Can't build with OpenVINO 2022.1 ("onnxruntime_providers_shared" does not exist)
#12411 closed
Aug 7, 2025 -
CUDA support for longer-input models like BigBird
#12463 closed
Aug 7, 2025 -
How to exit abnormally in the Python Operator (PyOp)
#12481 closed
Aug 7, 2025 -
QDQ + Add nodes are not fused into QLinearAdd when the graph is optimized
#12487 closed
Aug 7, 2025 -
performance is poor when onnxruntime C++ run in intel cpu
#12489 closed
Aug 7, 2025 -
LSTM Y output is inconsistent with TF inference result when seq_len is effective
#12492 closed
Aug 7, 2025 -
Clarify NMS sorting strategy
#12493 closed
Aug 7, 2025 -
Attributes in nested function calls are zeroed out
#12506 closed
Aug 7, 2025 -
Computing loss within onnxruntime inference (GPT2 model)
#12526 closed
Aug 7, 2025 -
java deploy in k8s Failed to load library libonnxruntime_providers_cuda.so with error
#12540 closed
Aug 7, 2025 -
engine decryption does not work in TensorRT EP
#12551 closed
Aug 7, 2025 -
Add execution provider selection for quantize_static
#12573 closed
Aug 7, 2025 -
Document beamsearch
#12584 closed
Aug 7, 2025 -
Name:'MatMul_32007' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
#12594 closed
Aug 7, 2025 -
Run the onnx model converted from seq2seq and report an error
#12608 closed
Aug 7, 2025 -
Where is the definition of session.Run() in onnxruntime C++ api
#12623 closed
Aug 7, 2025 -
cuda_provider_options.h include non existing file?
#12636 closed
Aug 7, 2025 -
The quantization model reduces the accuracy compared to the TRT
#12638 closed
Aug 7, 2025 -
Failed to create TensorrtExecutionProvider using onnxruntime-gpu
#12639 closed
Aug 7, 2025 -
Confusing exception about supported types
#12648 closed
Aug 7, 2025 -
get kill signal when quantize the ONNX model using quantize_static
#12652 closed
Aug 7, 2025 -
Enable Global Shared Threadpool and Memory Allocator For C#
#12654 closed
Aug 7, 2025 -
Non-zero status code returned while running TopK node. (ssdlite320_mobilenet_v3_large)
#12669 closed
Aug 7, 2025 -
Wrong Results for FP16 Models in CUDAExecutionProvider and TensorRTExecutionProvider
#12726 closed
Aug 7, 2025 -
`static inline Ort::Env onnx_env{nullptr}` easily leads to nullptr deref on app exit
#12736 closed
Aug 7, 2025 -
SystemError : 13 for transformers optimizer
#12745 closed
Aug 7, 2025 -
BatchNormalization produces all zeros for 1D input
#12754 closed
Aug 7, 2025 -
How to set the priority of ONNX in GPU?
#12760 closed
Aug 7, 2025 -
onnxruntime-linux-x64-gpu-1.12.1
#12766 closed
Aug 7, 2025 -
Asynchronous Inference
#12768 closed
Aug 7, 2025 -
I want to use tensorrt as the back-end of onnx
#12781 closed
Aug 7, 2025 -
cast op not support multithread
#12786 closed
Aug 7, 2025 -
How to set cpu_num to a specific value?
#12819 closed
Aug 7, 2025 -
AttentionPastState_dynamic test fails during building with CUDA EP from source
#12820 closed
Aug 7, 2025 -
Memory management
#12824 closed
Aug 7, 2025 -
error: package directory 'onnxruntime/backend' does not exist [Build]
#12922 closed
Aug 7, 2025 -
[Web] Failed to compile shader on WebGL
#12927 closed
Aug 7, 2025 -
Disabling optimization produces incorrect results on CUDAExecutionProvider in 1.12
#12946 closed
Aug 7, 2025 -
[Performance] Dynamic model input prediction is slow
#12955 closed
Aug 7, 2025 -
onnxruntime calculate gradients but no need for training
#13057 closed
Aug 7, 2025 -
onnxruntime-gpu, cudaoptions, result is different
#13061 closed
Aug 7, 2025 -
onnxruntime-node crash the electron app[Web]
#13086 closed
Aug 7, 2025 -
what's the differences between onnxruntime with openvino backend VS openvino directly?
#13087 closed
Aug 7, 2025 -
[Performance] a problem for Ort::IoBinding
#13090 closed
Aug 7, 2025 -
[Performance] ONNX Runtime GPT2 Model Running Significantly Slower than PyTorch
#13105 closed
Aug 7, 2025 -
[Test issue] Updated Ignore
#13109 closed
Aug 7, 2025 -
[Performance] Multithreading performance tails off after 3 threads, possible memory issue
#13138 closed
Aug 7, 2025 -
Failed to create CUDAExecutionProvider
#13139 closed
Aug 7, 2025 -
Onnxruntime fails on GPU loading inference with int8 models
#13168 closed
Aug 7, 2025 -
Multilingual-MiniLM-L12-H384 ONNX inference in NodeJS
#13171 closed
Aug 7, 2025 -
GPU inference result not stable
#13178 closed
Aug 7, 2025 -
[Performance] inference time much slower (1529ms vs. 20 ms) on GPU vs CPU.
#13199 closed
Aug 7, 2025 -
[Performance] Performance issue on Linux vs Windows for BERT model.
#13224 closed
Aug 7, 2025 -
Contrib IRFFT operator output dimensions calculation
#13236 closed
Aug 7, 2025 -
Onnx create session takes a long time.
#13240 closed
Aug 7, 2025 -
Inference time spikes in UNET onnx
#13258 closed
Aug 7, 2025 -
[Performance] Too Slow when i do inference
#13265 closed
Aug 7, 2025 -
[Mobile] .Net target Arm64
#13295 closed
Aug 7, 2025 -
[ONNXRuntimeError] : 1 : FAIL : This is an invalid model. Error: the graph is not acyclic.
#13322 closed
Aug 7, 2025 -
onnx Pad operator with negative pads value outputs 'nan'
#13332 closed
Aug 7, 2025 -
[Build] Upgrade to latest protobuf
#13335 closed
Aug 7, 2025 -
[Performance] Comparing ONNX CPU execution profiles of two FasterRCNN checkpoints
#13341 closed
Aug 7, 2025 -
[Build] ONNX Runtime Build Error ZCU102 (DPUCZDX8G)
#13351 closed
Aug 7, 2025 -
quantize_dynamic results in initializer error
#13358 closed
Aug 7, 2025 -
[Performance] CNN Inference has latency spikes with TensorRT EP
#13366 closed
Aug 7, 2025 -
Onnxruntime crashes if setting cpu affinity fails in Ort::Session constructor
#13367 closed
Aug 7, 2025 -
Using GPU in c++
#13380 closed
Aug 7, 2025 -
Can't run qdq model with TRT EP
#13381 closed
Aug 7, 2025 -
Whether the .trt model can be loaded
#13394 closed
Aug 7, 2025 -
Does ORT support quantize
#13413 closed
Aug 7, 2025 -
ONNX Runtime Inference on GPU: Failed to create CUDAExecutionProvider
#13414 closed
Aug 7, 2025 -
Consecutive casting leads to wrong result
#13418 closed
Aug 7, 2025 -
Parameters are optimized out even if it is a needed return value
#13425 closed
Aug 7, 2025 -
[Web] Is it possible to use both webgl backend and wasm backend in onnxruntime-web
#13435 closed
Aug 7, 2025 -
GPU Arena blocked session->Run()
#13464 closed
Aug 7, 2025 -
Consecutive call to Ort::Session::Run() crashes
#13476 closed
Aug 7, 2025 -
does onnxruntime-gpu support calling CUDA code or a custom kernel function to preprocess Image?
#13491 closed
Aug 7, 2025 -
[Performance]
#13492 closed
Aug 7, 2025 -
ORT fails on Slice() when indices are of different integer types
#13497 closed
Aug 7, 2025 -
[Performance]
#13500 closed
Aug 7, 2025 -
[Performance] C# Gpu memory allocation
#13504 closed
Aug 7, 2025 -
Removing the semantic segmentation's bounding box
#13513 closed
Aug 7, 2025 -
How to transfer the Ort::Value obtained to cuda code for post-processing, such as a .cu file?
#13528 closed
Aug 7, 2025 -
[Training] Whether onnxruntime training can be used in Megatron.
#13532 closed
Aug 7, 2025 -
How can I load a model larger than 2G in memory
#13543 closed
Aug 7, 2025 -
Zero Result with DirectML Execution Provider
#13545 closed
Aug 7, 2025 -
Inference speed: Swintransformer torch vs onnxruntime-gpu
#13550 closed
Aug 7, 2025 -
[Build]
#13554 closed
Aug 7, 2025 -
ORT fails on CPU looking for LayerNormalization node, for mixed-precision ONNX
#13556 closed
Aug 7, 2025 -
[TVM] Exception during initialization
#13572 closed
Aug 7, 2025 -
unable to build onnxruntime for openvino execution provider to get nuget packages
#13577 closed
Aug 7, 2025 -
Does Microsoft.ML.OnnxRuntime have a dependency on System.CodeDom.dll ?
#13604 closed
Aug 7, 2025 -
[Build]
#13606 closed
Aug 7, 2025 -
[C++] Model output image different in C++ ORT vs. Python ORT & PyTorch
#13614 closed
Aug 7, 2025 -
[Performance] Operators assigned to CPU instead of CUDA
#13615 closed
Aug 7, 2025 -
Dimension Padding problem in reduction_ops.cc
#13654 closed
Aug 7, 2025 -
[Performance] onnxruntime session uses 5x more system memory if torch is imported
#13662 closed
Aug 7, 2025 -
Help in running onnxruntime with SNPE as execution provider
#13693 closed
Aug 7, 2025 -
GPU with device_id=0 is always occupied no matter what device_id is specified when run the inference
#13697 closed
Aug 7, 2025 -
[DML] reproducible bug on DML provider
#13714 closed
Aug 7, 2025 -
[Build] Avoid NEON when building on Raspberry Pi 4
#13718 closed
Aug 7, 2025 -
[Web] Uncaught (in promise) TypeError: cannot resolve operator 'Erf' with opsets: ai.onnx v15
#13729 closed
Aug 7, 2025 -
[Web] NPM package include ts files in the output
#13736 closed
Aug 7, 2025 -
[Web]
#13749 closed
Aug 7, 2025 -
Cannot run inference on Integrated Graphics with OpenVino EP using C Sharp API
#13772 closed
Aug 7, 2025 -
QDQ not instrumenting inputs if first operator is a SUM
#13794 closed
Aug 7, 2025 -
how bring my hardware backend to onnxruntime framework
#13797 closed
Aug 7, 2025 -
[WebGL] cannot resolve operator 'DynamicQuantizeLinear' with opsets: ai.onnx v16, ...
#13800 closed
Aug 7, 2025 -
hello,how to improve [Performance] in batch inference with multicore cpu
#13820 closed
Aug 7, 2025 -
Dynamic quantization is useless on AMD cpus(AMD EPYC 7K62 48-Core Processor)
#13872 closed
Aug 7, 2025 -
SSDLite 320: RuntimeException on CUDA. TopK index assert was false.
#13876 closed
Aug 7, 2025 -
Segmentation Faults when using TensorRT on Jetson Orin Dev Kit
#13877 closed
Aug 7, 2025 -
[Web] dynamic batch size doesn't work when use webgl provider
#13909 closed
Aug 7, 2025 -
[Web] ort-wasm-simd.wasm can't be loaded in Electron renderer (using webpack)
#13933 closed
Aug 7, 2025 -
[Build] Incomplete type used in nested name specifier, Ubuntu
#13942 closed
Aug 7, 2025 -
Do I need to convert data to device for TensorRTExecutionProvider?
#13952 closed
Aug 7, 2025 -
CUDA provider gives different result with respect to CPU
#13962 closed
Aug 7, 2025 -
Non-zero status code returned while running Resize node
#13975 closed
Aug 7, 2025 -
[Performance] [webgl]bad performance of webgl
#13986 closed
Aug 7, 2025 -
[windows7] Unable to load DLL 'onnxruntime.dll': The specified module could not be found.
#14003 closed
Aug 7, 2025 -
[Performance] CUDA EP with Strange Inference Time
#14016 closed
Aug 7, 2025 -
[Performance] the speed and cpu utilization with SetIntraOpNumThreads(1) and SetIntraOpNumThreads(2)
#14018 closed
Aug 7, 2025 -
[Performance] onnx vs pt memory usage
#14029 closed
Aug 7, 2025 -
[Performance] High memory use by CUDAProvider in Jetson Xavier NX(JetPack 4.4)
#14038 closed
Aug 7, 2025 -
java onnxruntime_providers_cuda.dll
#14047 closed
Aug 7, 2025 -
There is a vulnerability in torch:1.12.0,upgrade recommended
#14059 closed
Aug 7, 2025 -
[Build] impossible to build onnxruntime with vs2022
#14086 closed
Aug 7, 2025 -
[Build] core/framework/fence.h not found while build upon CANN
#14121 closed
Aug 7, 2025 -
300% slower on MYRIAD_FP16 when using CustomVision fp16 model
#14125 closed
Aug 7, 2025 -
[Training] Does the current training code support RNN model like seq2seq and Transformer and GNN model?
#14139 closed
Aug 7, 2025 -
[Build] Dockerfile.arm64 - No module named 'packaging' error
#14140 closed
Aug 7, 2025 -
CUDNN error executing cudnnConvolutionForward
#14186 closed
Aug 7, 2025 -
ONNXRuntime outputs numerically incorrect results for mixed precision models.
#14189 closed
Aug 7, 2025 -
Infer shape incorrect for Split with opset 15
#14200 closed
Aug 7, 2025 -
ConvTranspose2d onnxruntime and pytorch forward results are inconsistent
#14208 closed
Aug 7, 2025 -
No module named 'onnxruntime.transformers.io_binding_helper'
#14230 closed
Aug 7, 2025 -
Valgrind: Source and destination overlap in memcpy_chk
#14254 closed
Aug 7, 2025 -
[Build] Docker arm64 build fails.
#14283 closed
Aug 7, 2025 -
ONNX Runtime support for the graph optimization of bigbird_pegasus model
#14295 closed
Aug 7, 2025 -
TensorRT EP same inference Time of INT 8 and FP 16
#14315 closed
Aug 7, 2025 -
STFT op has the wrong expected shape
#14316 closed
Aug 7, 2025 -
Program will stuck when creating 'Ort::Session'
#14317 closed
Aug 7, 2025 -
[Performance] ONNXruntime CPU is slower than Pytorch Tracing to Torchscript on CPU
#14326 closed
Aug 7, 2025 -
RemoveNode Should be unreachable if CanRemoveNodeAndMergeEdges is in sync with the logic
#14360 closed
Aug 7, 2025 -
[Bug] Attention and QAttention don't work properly in some cases
#14363 closed
Aug 7, 2025 -
Add some custom QlinearXXX Ops
#14365 closed
Aug 7, 2025 -
[Build] Error in building with Tensorrt EP
#14394 closed
Aug 7, 2025 -
Import Error " cannot import name 'get_all_providers' "
#14395 closed
Aug 7, 2025 -
[Training] The gradient builder has not been registered: ReduceMin
#14412 closed
Aug 7, 2025 -
Free allocated data of Ort::Value in C++
#14420 closed
Aug 7, 2025 -
Pad operator not quantizable?
#14422 closed
Aug 7, 2025 -
Different Python exceptions on OOM with `run_with_iobinding` and `run`
#14438 closed
Aug 7, 2025 -
Modifying QlinearADD
#14441 closed
Aug 7, 2025 -
[ONNXRuntimeError] Unsupported OrtValue type with CUDA EP
#14457 closed
Aug 7, 2025 -
[Performance] There is some confusion with onnx + oneDNN or onnx + OpenVINO
#14468 closed
Aug 7, 2025 -
[Build]
#14471 closed
Aug 7, 2025 -
[Performance] cuda_options.arena_extend_strategy = 1 does not free memory
#14474 closed
Aug 7, 2025 -
[Performance] DirectML cost more memory than CPU when process the Win32(X86) program (official demo).
#14479 closed
Aug 7, 2025 -
[Performance] CPU Usage is too high
#14490 closed
Aug 7, 2025 -
[Performance] cuDNN lib mismatch led to an underutilization of GPU
#14498 closed
Aug 7, 2025 -
missing headers and pkgconfig files in binary packages distribution (from github releases) (linux)
#14503 closed
Aug 7, 2025 -
[Web] Runtime error using `onnxruntime-node` with webpack
#14505 closed
Aug 7, 2025 -
Non-zero status code returned while running DnnlCustomOp2 node
#14543 closed
Aug 7, 2025 -
Check and modify the weights of a layer of an onnx model at runtime
#14545 closed
Aug 7, 2025 -
[Performance] DirectML Dynamic Axes very slow
#14550 closed
Aug 7, 2025 -
[BUG] FusedConv node error
#14561 closed
Aug 7, 2025 -
fp32 model with autocast to fp16: Shape mismatch attempting to re-use buffer
#14582 closed
Aug 7, 2025 -
[Build] cuda dll wrap up
#14585 closed
Aug 7, 2025 -
different results with onnxruntime-gpu-1.10
#14587 closed
Aug 7, 2025 -
[Web] currently non-1 steps is not supported for Slice
#14588 closed
Aug 7, 2025 -
Destroying an inference session without exiting the python process
#14590 closed
Aug 7, 2025 -
C# - CUDA Nuget BUG : DefaultLogger Attempt to use DefaultLogger but none has been registered.
#14593 closed
Aug 7, 2025 -
Onnxruntime Arm NN Ep build error.
#14611 closed
Aug 7, 2025 -
[Performance]
#14615 closed
Aug 7, 2025 -
[Build] cpp_field.h(189,47): error C2059: syntax error: ")"
#14627 closed
Aug 7, 2025 -
[Performance] Memory grows after reloading model
#14641 closed
Aug 7, 2025 -
[Build] Building for C++ On Jetson Nano CUDA 10.2
#14644 closed
Aug 7, 2025 -
TensorRT Execution Build Fails on Jetson Jetpack 4.6.1
#14658 closed
Aug 7, 2025 -
DEEPFACE LIVE Issue with onnxruntime_pybind_state.
#14667 closed
Aug 7, 2025 -
[Build]
#14674 closed
Aug 7, 2025 -
Custom Operator Output Tensor Shape Error
#14683 closed
Aug 7, 2025 -
`CleanUnusedInitializersAndNodeArgs` warnings are printed only with subgraphs
#14694 closed
Aug 7, 2025 -
A runtime can run on cuda device 0 but fail on cuda device 1
#14710 closed
Aug 7, 2025 -
How to inference with multiple batches and multiple inputs.
#14713 closed
Aug 7, 2025 -
Crash in JavaGPU on Windows
#14714 closed
Aug 7, 2025 -
clog_vlog_fatal[Build]
#14740 closed
Aug 7, 2025 -
Memory Leak
#14745 closed
Aug 7, 2025 -
[Build] macOS: cross compiling arm64 on intel fails
#14746 closed
Aug 7, 2025 -
[Performance] Can oneDNN EP accelerate the inference time of onnxruntime on x86 machines?
#14749 closed
Aug 7, 2025 -
Basic Optimizer adds non-standard ONNX ops
#14752 closed
Aug 7, 2025 -
Basic Optimizer adds non-standard ONNX ops for roi_align
#14753 closed
Aug 7, 2025 -
Basic Optimizer adds non-standard ONNX ops for input tensor
#14754 closed
Aug 7, 2025 -
[Build] cmake install when --use_xnnpack is broken
#14757 closed
Aug 7, 2025 -
Failed to build CUDA docker image[Build]
#14765 closed
Aug 7, 2025 -
`onnx.checker.check_model` raises `Bad node spec` for custom nodes created from ORT `optimize_model`
#14768 closed
Aug 7, 2025 -
Dependency Problem (java onnxruntime)
#14787 closed
Aug 7, 2025 -
[Build] Can't access OrtSessionOptionsAppendExecutionProvider_Dnnl while using oneDNN
#14799 closed
Aug 7, 2025 -
[Build] Dockerfile.arm64 build fails
#14801 closed
Aug 7, 2025 -
[Build] Unable to load TensorRT Execution Provider
#14802 closed
Aug 7, 2025 -
[Web] how to reduce wasm file size
#14817 closed
Aug 7, 2025 -
onnxruntime with CUDA not releasing about 400 MB memory after the session and environment is destroyed
#14819 closed
Aug 7, 2025 -
working model with Resize node becomes invalid after using convert_float_to_float16
#14827 closed
Aug 7, 2025 -
How do I pass a list of tensors in onnxruntime-web?
#14829 closed
Aug 7, 2025 -
DML EP cannot load some quantized onnx files.
#14835 closed
Aug 7, 2025 -
[Performance] Performance degradation while using dynamic axes
#14863 closed
Aug 7, 2025 -
UndefinedBehaviorSanitizer reports problem in onnxruntime_global_thread_pools_test
#14882 closed
Aug 7, 2025 -
[Performance]
#14919 closed
Aug 7, 2025 -
Is there a Python way to get the max supported ONNX IR version from ORT package?
#14932 closed
Aug 7, 2025 -
[Performance] 3-100x regression when opset 16 or 17 is used (CUDA EP)
#14956 closed
Aug 7, 2025 -
[Performance] Can not release memory in gpu.
#14957 closed
Aug 7, 2025 -
Reuse output tensors memory that was allocated by first call to Ort::Session.Run(...)
#14960 closed
Aug 7, 2025 -
Compatibility between Onnx and Blazor Webassembly
#14962 closed
Aug 7, 2025 -
Running T5 export ONNX example leads to shape inference error
#14963 closed
Aug 7, 2025 -
Microsoft.ML.OnnxRuntime.Gpu not working in MAUI project
#14974 closed
Aug 7, 2025 -
[Build] Failed to build in docker container
#14983 closed
Aug 7, 2025 -
conv throws safeint exception
#14985 closed
Aug 7, 2025 -
Static linkage of onnx_runtime and providers library
#14986 closed
Aug 7, 2025 -
[Build] static assertion fails when building from source with GCC 13.0.1
#14991 closed
Aug 7, 2025 -
[Performance] inference problems with io_binding: unexpected shape or unexpected data type
#14998 closed
Aug 7, 2025 -
CUDA Graph Error - CUDA failure 900: operation not permitted when stream is capturing
#15002 closed
Aug 7, 2025 -
ONNX does not support Dirichlet distribution?
#15016 closed
Aug 7, 2025 -
[Build] Problems with FP16 Layernorm
#15021 closed
Aug 7, 2025 -
[Build] api-ms-win-core-heap-l2-1-0.dll missing on windows server 2012 R2
#15025 closed
Aug 7, 2025 -
onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126
#15035 closed
Aug 7, 2025 -
accuracy reduced with multithreaded GPU prediction
#15038 closed
Aug 7, 2025 -
mT5 convert to ONNX and GPU inference problems
#15042 closed
Aug 7, 2025 -
[Build] Cannot specify compile definitions for target "onnx" which is not built by this project.
#15051 closed
Aug 7, 2025 -
[Performance] Inference doubles VRAM (DirectML)
#15074 closed
Aug 7, 2025 -
[Web] Memory spike in ORT-web leading to app crash
#15086 closed
Aug 7, 2025 -
The dimension of indices to ScatterND op is wrong during inference.
#15095 closed
Aug 7, 2025 -
[Performance] onnxruntime allocates lots of cuda memory on T4
#15098 closed
Aug 7, 2025 -
fail build with gcc 12.x in onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc
#15111 closed
Aug 7, 2025 -
How to reduce GPU memory usage when inference
#15127 closed
Aug 7, 2025 -
descriptor_table_tensorboard_2fcompat_2fproto_2fattr_5fvalue_2eproto not declared (TRT 8.5.0)
#15131 closed
Aug 7, 2025 -
how to inference with fp16 precise in python code?
#15134 closed
Aug 7, 2025 -
NOT_IMPLEMENTED GridSample(16) on onnxruntime 1.14.1
#15137 closed
Aug 7, 2025 -
inference speed is very slow when using fp16 while using fp 32 is normal
#15170 closed
Aug 7, 2025 -
A bug occurs when the program terminates
#15174 closed
Aug 7, 2025 -
[Performance] GPT NEO: better performance of python GPT NEO than its onnx runtime version in C++?
#15191 closed
Aug 7, 2025 -
[Build] segfault when running unit tests (ctest)
#15224 closed
Aug 7, 2025 -
[Build] fail to build on Windows ARM64
#15252 closed
Aug 7, 2025 -
[Performance] How to debug/reduce GPU utilization?
#15254 closed
Aug 7, 2025 -
[Performance]
#15265 closed
Aug 7, 2025 -
ONNX model with FBNetv3 architecture Conversion to TensorRT Problem
#15269 closed
Aug 7, 2025 -
[Build] ONNX Java Runtime - Handle UnsatisfiedLinkError
#15281 closed
Aug 7, 2025 -
[Documentation Request] Estimating (or Checking) Allocated Memory
#15326 closed
Aug 7, 2025 -
[Performance] Timings feedback
#15328 closed
Aug 7, 2025 -
[Performance] Gemm op is slower after quantization
#15332 closed
Aug 7, 2025 -
[Mobile] onnxruntime-c and onnxruntime-extensions-c pod conflict with DocumentReader pod
#15333 closed
Aug 7, 2025 -
[Performance] ONNXRUNTIME sometime DEAD in python multiprocessing
#15345 closed
Aug 7, 2025 -
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2235646909
#15349 closed
Aug 7, 2025 -
[Web] custom ops
#15374 closed
Aug 7, 2025 -
[Performance] Running Large Language Models for dynamic input size is poor performance. (DirectML)
#15394 closed
Aug 7, 2025 -
Opset Coverage - Binary Size Tradeoff
#15397 closed
Aug 7, 2025 -
Mask-RCNN network is giving significantly different result with DirectML EP
#15459 closed
Aug 7, 2025 -
Error Unrecognized attribute: layout for operator DynamicQuantizeLSTM
#15465 closed
Aug 7, 2025 -
Please provide informative message on dlopen failures -- python API
#15476 closed
Aug 7, 2025 -
[Performance] WebAssembly 1x1 Conv almost 4x slower than native
#15483 closed
Aug 7, 2025 -
[Performance] Model converted to mixed precision results in higher latency
#15490 closed
Aug 7, 2025 -
Inference slows down on gpu.
#15491 closed
Aug 7, 2025 -
[Bug?] Casting int8-->float
#15492 closed
Aug 7, 2025 -
InferenceSession fails with segmentation fault when fp16 model is loaded with CPUExecutionProvider
#15494 closed
Aug 7, 2025 -
[ErrorCode:Fail] Load model from [...]\latin_ipa_forward.onnx failed:invalid vector subscript
#15495 closed
Aug 7, 2025 -
[Build] Openvino debug build fails on VS2019
#15496 closed
Aug 7, 2025 -
[Web] probability is not returned: `error code = 1`
#15511 closed
Aug 7, 2025 -
SimplifiedLayerNormalization loading error for converted FP16 databricks/dolly-v2-3b model
#15531 closed
Aug 7, 2025 -
[Performance] FP16 model can not get acceleration on GPU with ONNXRuntime-GPU
#15534 closed
Aug 7, 2025 -
Get results from Mask RCNN model with C++
#15541 closed
Aug 7, 2025 -
fatal error: gsl/gsl: No such file or directory
#15554 closed
Aug 7, 2025 -
[Build] 1.14.0-dev-20230120-0204-3d6cea14f4 (This build breaks model on Intel)
#15567 closed
Aug 7, 2025 -
[Performance] CUDA fp16 didn't get speed up
#15585 closed
Aug 7, 2025 -
Error with custom spconv class in onnx runtime
#15594 closed
Aug 7, 2025 -
[Build] Java Nightly build
#15600 closed
Aug 7, 2025 -
[Build] the Linux build config
#15621 closed
Aug 7, 2025 -
Can't use onnxruntime with DirectML built from source
#15628 closed
Aug 7, 2025 -
[Performance] CNN model exported by PyTorch runs slower than Tensorflow 1.0
#15647 closed
Aug 7, 2025 -
onnxRuntimeException and DefaultLogger issues in AWS Lambda runtime
#15650 closed
Aug 7, 2025 -
ONNXRuntime in Docker
#15652 closed
Aug 7, 2025 -
ONNX with FloatTensorType when inferred from C++ returns different label everytime
#15665 closed
Aug 7, 2025 -
[Build] Compile Error if path too long
#15674 closed
Aug 7, 2025 -
[CANN]EP: CANN cannot complete inference on Atlas200DK
#15677 closed
Aug 7, 2025 -
[Performance] Can't get GPU speed-up when exe program is located inside the path with chinese character
#15678 closed
Aug 7, 2025 -
[ErrorCode:InvalidArgument] Invalid Feed Input Name:image
#15692 closed
Aug 7, 2025 -
[Performance] Can we set model weight precision when converting keras model into onnx model?
#15695 closed
Aug 7, 2025 -
[Build] Onnxruntime-gpu for Jetpack 5.1.1 on Jetson Orin Nano Developer Kit
#15732 closed
Aug 7, 2025 -
GraphOptimization (ORT_ENABLE_ALL) is slower using ONNXRuntime-GPU
#15743 closed
Aug 7, 2025 -
Load onnx failed(segmentation fault) with version 1.14.1 (2)
#15745 closed
Aug 7, 2025 -
Inference using the CUDA EP returns nan
#15752 closed
Aug 7, 2025 -
[Build]
#15786 closed
Aug 7, 2025 -
How to set CalibrationDataReader when my datatype is time series?
#15836 closed
Aug 7, 2025 -
[Build]
#15863 closed
Aug 7, 2025 -
[Training] Training Onnx format Models
#15867 closed
Aug 7, 2025 -
Failed to create CUDAExecutionProvider
#15873 closed
Aug 7, 2025 -
[RunTimeError]Infer error shape in runtime and mismatch with onnx spec about Split opset 18
#15882 closed
Aug 7, 2025 -
[Performance] `CUDAExecutionProvider` uses 3x the memory of `CPUExecutionProvider`
#15886 closed
Aug 7, 2025 -
symbolic_shape_infer.py failure
#15898 closed
Aug 7, 2025 -
Linking executable with static libraries --> error LNK2038: mismatch detected
#15928 closed
Aug 7, 2025 -
Atlas200DK uses EP: CANN to infer resnet50 and reports "CANN errorEE9999: Inner Error!"
#15947 closed
Aug 7, 2025 -
float16 result not match with numpy or torch
#15977 closed
Aug 7, 2025 -
[Predict] Prediction from ONNX is same for all images
#16001 closed
Aug 7, 2025 -
The result of Col2Im operator not close with Torch result on fp16 dtype
#16007 closed
Aug 7, 2025 -
[Performance] QUInt8 vs a basic ONNX
#16009 closed
Aug 7, 2025 -
RunOptions.only_execute_path_to_fetches not working
#16013 closed
Aug 7, 2025 -
Cannot open include file: numpy/arrayobject.h
#16027 closed
Aug 7, 2025 -
[Web] The onnxruntime-web example is loading wasm file twice if set to local path
#16028 closed
Aug 7, 2025 -
can we customize memory allocation functions(like malloc/free) for inference in C api?
#16032 closed
Aug 7, 2025 -
[Performance] How to solve the problem of releasing GPU memory in onnxruntime
#16033 closed
Aug 7, 2025 -
[Performance] Huge gap between nn.Conv1d() and nn.Conv2d() - models exported by PyTorch
#16047 closed
Aug 7, 2025 -
Unexpected inference output from QLinearConv
#16105 closed
Aug 7, 2025 -
Memory leak in cpuinfo_x86_linux_init
#16117 closed
Aug 7, 2025 -
Segmentation Fault when optimizing Stable Diffusion models
#16140 closed
Aug 7, 2025 -
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] with swin-t
#16143 closed
Aug 7, 2025 -
Segmentation fault while loading CUDA Provider
#16146 closed
Aug 7, 2025 -
[Performance] ONNX Runtime doesn't parallelize operations in CPU models
#16158 closed
Aug 7, 2025 -
[MacOS] Unable to load libonnxruntime.dylib because binaries are not signed.
#16168 closed
Aug 7, 2025 -
[Build] line 2812, in <module> sys.exit(main())
#16179 closed
Aug 7, 2025 -
no acceleration onnx on e5 2680v3
#16185 closed
Aug 7, 2025 -
[Performance] setIntraOpNumThreads doesn't offer enough parallelization in JAVA-API
#16192 closed
Aug 7, 2025 -
DmlExecutionProvider bound to PyTorch tensor stops running
#16197 closed
Aug 7, 2025 -
NullReferenceException when creating an object of class SessionOptions | Unity
#16205 closed
Aug 7, 2025 -
[quantization] Problem with QDQ of Pow/Sqrt/Div
#16219 closed
Aug 7, 2025 -
why the input doesn't place in cuda ?
#16225 closed
Aug 7, 2025 -
TensorrtExecutionProvider::GetSupportedList graph_build.Resolve().IsOK() was false.
#16234 closed
Aug 7, 2025 -
Inconsistent generation of vectors by TF-IDF ONNX Vectorizer Model
#16252 closed
Aug 7, 2025 -
[OOM] Unable to convert 30B Model
#16254 closed
Aug 7, 2025 -
[Performance] Evaluation behavior with external arrays (C API)
#16255 closed
Aug 7, 2025 -
onnx use more memory than pytorch for some model
#16264 closed
Aug 7, 2025 -
[Web/Build] Failed to consume onnxruntime-common because of JS parser not up-to-date
#16265 closed
Aug 7, 2025 -
how to trace the error "assert node is not None" when use the onnxruntime.transformers.optimizer
#16268 closed
Aug 7, 2025 -
ROCm EP: Errors when trying to infer, which GPUs are supported?
#16271 closed
Aug 7, 2025 -
[Accuracy/Performance]
#16275 closed
Aug 7, 2025 -
Seg faults when creating InferenceSession for SAM backbone
#16300 closed
Aug 7, 2025 -
[Mobile] Error: Non string type of a tensor data is not allowed
#16301 closed
Aug 7, 2025 -
issue running onnxruntime with pytest
#16306 closed
Aug 7, 2025 -
How to catch exception OOM.
#16307 closed
Aug 7, 2025 -
How to edit Clip Operator in OnnxRuntime?
#16315 closed
Aug 7, 2025 -
get error when using libonnxruntime with dnnl EP
#16320 closed
Aug 7, 2025 -
Increase - decrease the maximum number of events during inference profiling.
#16334 closed
Aug 7, 2025 -
[Build] Fails to parse FP16 LayerNormalization in opset>=18
#16341 closed
Aug 7, 2025 -
[Build] Disable ORT_ENABLE_STREAM build error
#16345 closed
Aug 7, 2025 -
MaxPool: When Ceil_mode=1, MaxPool Generates Big Values.
#16350 closed
Aug 7, 2025 -
AveragePool: When Ceil_mode=1, AveragePool Generates Nan or 0 Values.
#16351 closed
Aug 7, 2025 -
[Training]
#16354 closed
Aug 7, 2025 -
multi-GPU inferencing
#16382 closed
Aug 7, 2025 -
Operator Pad reflect mode does not yield correct results
#16401 closed
Aug 7, 2025 -
[Web] Web ~40x slower than native
#16412 closed
Aug 7, 2025 -
[Performance] DML dynamic axes performance regression.
#16424 closed
Aug 7, 2025 -
C++ Runtime does not recognize supposedly correct input.
#16430 closed
Aug 7, 2025 -
Normalizer does not work as expected
#16451 closed
Aug 7, 2025 -
[Mobile] Unable to load models in Xamarin iOS
#16463 closed
Aug 7, 2025 -
m2m 100 418M
#16480 closed
Aug 7, 2025 -
Automatic deallocation (?) of the Ort::Sessions, memory leak?
#16497 closed
Aug 7, 2025 -
[Performance] A model with a large TreeEnsembleClassifier node takes too long to be loaded
#16511 closed
Aug 7, 2025 -
Setting `CUBLAS_WORKSPACE_CONFIG=":4096:8"` leads to `CUBLAS_STATUS_ALLOC_FAILED`
#16512 closed
Aug 7, 2025 -
ONNXRuntimeError: Training mode does not support BN opset 14 (or higher) yet.
#16867 closed
Aug 7, 2025 -
[Build] libonnxruntime_providers_dnnl.so: undefined symbol: omp_get_max_threads
#16561 closed
Aug 7, 2025 -
Large model >2GB save_to_ort
#16573 closed
Aug 7, 2025 -
[Build] fatal error: too many errors emitted, stopping now [-ferror-limit=]
#16576 closed
Aug 7, 2025 -
[Build] Cannot build onnxruntime
#16583 closed
Aug 7, 2025 -
Conv3d precision error between pytorch and onnx
#16589 closed
Aug 7, 2025 -
[Training] Define a custom training with some ONNX models
#16597 closed
Aug 7, 2025 -
[Performance] Performance degradation observed w.r.t DNNL-EP in v1.15.1 compared to v1.13.1
#16609 closed
Aug 7, 2025 -
[Build] No C++ library is generated after compilation completed
#16610 closed
Aug 7, 2025 -
[Build] Dependency on OMP/MPI Runtime
#16631 closed
Aug 7, 2025 -
[Performance]
#16637 closed
Aug 7, 2025 -
The input tensor cannot be reshaped to the requested shape after adding Gather output to model's output
#16670 closed
Aug 7, 2025 -
Access violation reading location when I use CreateArenaCfgV2 and CUDA
#16686 closed
Aug 7, 2025 -
One path in the graph requests feature X(>Y) but input tensor has Y features
#16695 closed
Aug 7, 2025 -
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 closed
Aug 7, 2025 -
CUDA Custom Op CUDA failure
#16748 closed
Aug 7, 2025 -
clean build v1.15.1 fails three fp16 tests due to `difference between... exceeds threshold"
#16775 closed
Aug 7, 2025 -
[Performance] FP16 models incur large cast latency when run on CPUs without FP16 support
#16778 closed
Aug 7, 2025 -
Incorrect Output from Java Model
#16781 closed
Aug 7, 2025 -
Segmentation Fault when using TensorRT execution provider
#16790 closed
Aug 7, 2025 -
[Performance] using onnxruntime with ray and also fix for memory footprint too high
#16793 closed
Aug 7, 2025 -
[Training] Proposal: Implement back propagation algorithm for C#
#16809 closed
Aug 7, 2025 -
[Performance]
#16817 closed
Aug 7, 2025 -
[Mobile] Failure to load whisper model .ort with react-native, regular and quantized versions
#16819 closed
Aug 7, 2025 -
op.SequenceEmpty(dtype=xxx) cannot be set to float16.
#16846 closed
Aug 7, 2025 -
[Performance]high latency variance
#16876 closed
Aug 7, 2025 -
[Performance] Convolution layer issue profiling
#16926 closed
Aug 7, 2025 -
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 closed
Aug 7, 2025 -
Native assemblies aren't copied when Onnx is a transitive dependency and using netstandard
#17010 closed
Aug 7, 2025 -
Why onnxruntime extracts only 483MB json file?
#17013 closed
Aug 7, 2025 -
An error occurred when I used the TensorrtExecutionProvider in onnx runtime
#17047 closed
Aug 7, 2025 -
[Web] Cannot Convert to RGB when using Tensor.fromImage(image,{tensorFormat:'RGB'})
#17094 closed
Aug 7, 2025 -
How to release onnxruntime gpu memory
#17142 closed
Aug 7, 2025 -
[TOOLS]:Using transformers.optimizer optimize large model, segmentation fault (core dumped)
#17212 closed
Aug 7, 2025 -
Onnx model inference Fatal error: ai.onnx.contib:bev_pool_v2(-1) is not a registered function/op
#17214 closed
Aug 7, 2025 -
[C#] Invalid input name error
#17244 closed
Aug 7, 2025 -
AssertionError on num_heads > 0 for bert with specific optimization config
#17254 closed
Aug 7, 2025 -
windows10 x86 x64 inference time varies greatly
#17256 closed
Aug 7, 2025 -
[Performance] Operators assigned to CPU instead of CUDA, CPU thread management problem
#17268 closed
Aug 7, 2025 -
[Web] Error: no available backend found. ERR: [wasm] TypeError: Failed to parse URL from
#17274 closed
Aug 7, 2025 -
[Build] Error: cpuid.h: No such file or directory when cross-compiling ORT 1.15.1 with NNAPI for arm64
#17283 closed
Aug 7, 2025 -
Freeing heap block containing an active critical section
#17345 closed
Aug 7, 2025 -
[Performance] 3X slower inference on onnxruntime than pytorch(huggingface)
#17366 closed
Aug 7, 2025 -
[Performance] Memcpy leads to AllocationError for argmax
#17371 closed
Aug 7, 2025 -
[web/js] need for more methods on tensor object
#17372 closed
Aug 7, 2025 -
[Performance] Quantized model inference on CPU slower/same as FP32
#17389 closed
Aug 7, 2025 -
Default `tensorFormat` should RGBA for HTMLImageElement variant
#17395 closed
Aug 7, 2025 -
[Build] windows dll compilation error with versions above 1.14.0
#17404 closed
Aug 7, 2025 -
[Web] Add binary/where broadcast case when FXC issue got fixed in tint
#17405 closed
Aug 7, 2025 -
[Performance] Data size of Batch Normalization using cuDNN in inference.
#17406 closed
Aug 7, 2025 -
Yolov8 Static Quantization
#17410 closed
Aug 7, 2025 -
CUDA Stream and Synchronous in custom operator
#17412 closed
Aug 7, 2025 -
[Performance] How much memory it needs to load a 3.4 GB model to GPU through DirectML?
#17413 closed
Aug 7, 2025 -
valgrind memcpy_chk overlap onnxruntime1.15.1
#17431 closed
Aug 7, 2025 -
Extract node info
#17444 closed
Aug 7, 2025 -
[Bug] FP16 conversion yields an unusable model
#17447 closed
Aug 7, 2025 -
[Mobile iOS] Run fp16 onnx model on CoreML EP
#17448 closed
Aug 7, 2025 -
C++ API, Memory Leak instantiating Ort::Sessions
#17451 closed
Aug 7, 2025 -
Failure with OpenvinoEP within ORT
#17499 closed
Aug 7, 2025 -
Resize of doesn't work well while the coordinate_transformation_mode is 'align_corners'.
#17564 closed
Aug 7, 2025 -
Inference speed of Quantized model not increased after static Quantization[Performance]
#17634 closed
Aug 7, 2025 -
DML EP One session but called in different threads. [Performance]
#17686 closed
Aug 7, 2025 -
[Web]
#17700 closed
Aug 7, 2025 -
[Mobile | iOS] I got "Unknown exception" error.
#17731 closed
Aug 7, 2025 -
[Web] Custom build packages
#17743 closed
Aug 7, 2025 -
[web] following-up work items for supporting uniform buffers
#17860 closed
Aug 7, 2025 -
[Web] Declaration is not emitted in onnxruntime-node package
#17979 closed
Aug 7, 2025 -
[Build] Why does TensorRT EP need the full version of protobuf?
#18040 closed
Aug 7, 2025 -
[Web] Which node.js version is supposed to be supported?
#18078 closed
Aug 7, 2025 -
Microsoft.ML.OnnxRuntime.OpenVino Encountered unknown exception in Initialize
#18152 closed
Aug 7, 2025 -
ORT bug in Col2Im CPU 3D cases
#18156 closed
Aug 7, 2025 -
[Mobile|Android] Fatal error: ai.onnx.contrib:SentencepieceTokenizer(-1) is not a registered function/op
#18226 closed
Aug 7, 2025 -
The onnx.helper make_function command strips type information leading to inference errors
#18264 closed
Aug 7, 2025 -
[Web] onnxruntime-web and onnxruntime-node return different results for LSTM model
#18335 closed
Aug 7, 2025 -
[Performance] Does `com.microsoft.Attention` use FlashAttention-2?
#18474 closed
Aug 7, 2025 -
Add ORT Extensions to Java and build with Gradle
#18503 closed
Aug 7, 2025 -
Model Run Session wasting time[Performance]
#18510 closed
Aug 7, 2025 -
Is there any way to convert a qdqmodel to qlinearmodel use ort?
#18511 closed
Aug 7, 2025 -
[Training] qat
#18534 closed
Aug 7, 2025 -
[Build] manylinux_2_28 support
#18537 closed
Aug 7, 2025 -
[Build] TRT EP cannot be built without CUDA EP
#18542 closed
Aug 7, 2025 -
Call Session class method name Run failed,don't know why
#18548 closed
Aug 7, 2025 -
Does the computation order affect the computation result?
#18564 closed
Aug 7, 2025 -
[Web] How could I get the shape of the output tensor?
#18568 closed
Aug 7, 2025 -
[Build]
#18570 closed
Aug 7, 2025 -
# Issue with Rounding Behavior in onnxruntime's Quantizelinear Layer
#18576 closed
Aug 7, 2025 -
Session Run throws an access violation exception when I recreate the session
#18578 closed
Aug 7, 2025 -
[Node.js] Support for loading models with external data in `onnxruntime-node`
#18586 closed
Aug 7, 2025 -
Cuda EP does not compute reduce with empty set correctly?
#18588 closed
Aug 7, 2025 -
[Mobile] Model with large input size cause Segmentation Fault while session->run()
#18595 closed
Aug 7, 2025 -
Session initialization stuck/crash in DMLCreateDevice while using DirectML EP
#18599 closed
Aug 7, 2025 -
Profiling multithreaded runs
#18600 closed
Aug 7, 2025 -
Segmentation Fault when some of node outputs is empty
#18601 closed
Aug 7, 2025 -
What is the recommended setup for running multiple models/sessions in parallel in C++?
#18610 closed
Aug 7, 2025 -
DirectML Resize Node error.
#18613 closed
Aug 7, 2025 -
[Build]
#18617 closed
Aug 7, 2025 -
Could not find an implementation for SkipGroupNorm(1) node with name 'SkipGroupNorm_0'
#18623 closed
Aug 7, 2025 -
Crash in ResizeHelper::Initialize executing a model on ARM64
#18628 closed
Aug 7, 2025 -
ONNXRuntime Segmentation Fault Crash on Inference (iOS and Mac)
#18632 closed
Aug 7, 2025 -
[Performance] dynamic batch infer cost time question
#18639 closed
Aug 7, 2025 -
ORT memory error with the graph from linspace
#18648 closed
Aug 7, 2025 -
Are there any benchmark tools for onnx mobile like Tensorflow Lite?
#18664 closed
Aug 7, 2025 -
Different results of consecutive runs for same input
#18672 closed
Aug 7, 2025 -
Strange condition size_t channel_rindex = is_nchw ? 2 : 2;
#18674 closed
Aug 7, 2025 -
Misprinted condition: head_size != num_heads * head_size
#18675 closed
Aug 7, 2025 -
Parallel inference of multiple models in different threads
#18806 closed
Aug 7, 2025 -
[Performance] Java API lacks functionality to control allocator settings.
#18845 closed
Aug 7, 2025 -
[dynamo_export] starts_.size() == ends_.size() + 1 was false. No matching 'start' entry.
#18863 closed
Aug 7, 2025 -
[Web] Non-zero status code returned while running Slice node `webgpu`
#18892 closed
Aug 7, 2025 -
compute_range not available
#18893 closed
Aug 7, 2025 -
The result of onnx and trt engine is different? Why?
#18902 closed
Aug 7, 2025 -
SafeIntOnOverflow() Integer overflow error when inferencing on too many samples with Python
#18905 closed
Aug 7, 2025 -
error 126 Onnx in ComfyUI[Performance] O
#18925 closed
Aug 7, 2025 -
ai.onnxruntime.OrtException: Unsupported type - FLOAT16
#18926 closed
Aug 7, 2025 -
How to use multiple inputs of different types in C++ session
#18932 closed
Aug 7, 2025 -
[Web] onnxruntime-web is not work in nodejs
#18933 closed
Aug 7, 2025 -
How to set `trt_profile_min_shapes` for inputs with name containing colons?
#18939 closed
Aug 7, 2025 -
OP (Conv) inference results mismatch with PyTorch
#18946 closed
Aug 7, 2025 -
[Build] How to build onnxruntime with openvino statically?
#18950 closed
Aug 7, 2025 -
[Performance] 2x Regression in 1st Inference time cost
#18957 closed
Aug 7, 2025 -
High Output Difference between ONNX model with different optimizer settings
#18959 closed
Aug 7, 2025 -
[Build] ModuleNotFoundError: No module named 'onnxruntime'
#18966 closed
Aug 7, 2025 -
Error with finding onnxruntime_binding.node on Windows 10 on a bootcamp Macbook
#18971 closed
Aug 7, 2025 -
How to observe arena allocator memory request metrics
#18972 closed
Aug 7, 2025 -
Could not load library cudnn_cnn_infer64_8.dll. Error code 127
#18973 closed
Aug 7, 2025 -
[Build] Failure with OneDNN on Intel MacOS
#18976 closed
Aug 7, 2025 -
Cannot quantize yolov5 float to int8 onnx model
#18987 closed
Aug 7, 2025 -
Encounter unknown exception in initialize using Openvino EP
#19004 closed
Aug 7, 2025 -
ONNX Runtime inference on string input
#19006 closed
Aug 7, 2025 -
[Error: Exception in HostFunction: <unknown>] while running ort models in react-native
#19021 closed
Aug 7, 2025 -
[Performance] It is not possible to use a discrete graphics card with DML.
#19025 closed
Aug 7, 2025 -
Freeing tensor data created via CreateTensor
#19034 closed
Aug 7, 2025 -
[Build] Linux x86_64 STATIC Build
#19035 closed
Aug 7, 2025 -
cudaMemcpyAsync throws exception in GPUDataTransfer
#19076 closed
Aug 7, 2025 -
[Training] On device training doesn't work with INT8 Models
#19078 closed
Aug 7, 2025 -
[Performance] The CUDA Stream cannot be set through Python API
#19094 closed
Aug 7, 2025 -
Longformer `convert_to_onnx.py` not working due to missing imports
#19149 closed
Aug 7, 2025 -
[Performance] Why run first inference so slow, although run one time in initialization?
#19177 closed
Aug 7, 2025 -
ORT 1.17.0 Release Candidates available for testing
#19236 closed
Aug 7, 2025 -
[Training] How to update running_mean and running_var of BatchNormalization during training
#19370 closed
Aug 7, 2025 -
[Performance] In ONNX Runtime, the CPU consumption does not scale linearly with the number of threads
#19384 closed
Aug 7, 2025 -
Backwards convolution layers in CUDA provider should heed
#19391 closed
Aug 7, 2025 -
InferenceSession.run does not validate rank of scalar inputs
#19434 closed
Aug 7, 2025 -
[Web] Memory Access Out of Bounds Error When Using ONNX Runtime Web Inference in NPM Package (wasm)
#19443 closed
Aug 7, 2025 -
[Performance] CPU inference much slower from GPU runtime
#19451 closed
Aug 7, 2025 -
[On-device Training] Yolo custom loss
#19464 closed
Aug 7, 2025 -
[Performance]
#19479 closed
Aug 7, 2025 -
Errors about using c# and TensorRT
#19489 closed
Aug 7, 2025 -
Accuracy drops a lot when using fp16 with TensorRT EP
#19492 closed
Aug 7, 2025 -
quantize_dynamic : nodes_to_quantize(Gemm) is ignored
#19503 closed
Aug 7, 2025 -
ONNX Runtime OpenVINO EP is way behind
#19688 closed
Aug 7, 2025 -
Observed TDR on a low-end system
#19724 closed
Aug 7, 2025 -
Inconsistent Prediction Outputs for Onnx Model
#19834 closed
Aug 7, 2025 -
import InferenceSession and capi._pybind_state.
#19836 closed
Aug 7, 2025 -
[Performance] onnxruntime 1.17.1 version doesnt support CUDA 12.4
#19839 closed
Aug 7, 2025 -
[Performance] Accuracy dropped heavily using onnxruntime to inference a model quantized by QAT
#19850 closed
Aug 7, 2025 -
Inference speed problem even if using a high-end Hardware.
#19865 closed
Aug 7, 2025 -
[iOS] Output of type sequence<map<int64,float32>> causes crash on iOS
#19867 closed
Aug 7, 2025 -
[Build] Where is official build for Unity?
#19964 closed
Aug 7, 2025 -
[BUG] [OpenVino EP] Only first result in session is correct.
#19975 closed
Aug 7, 2025 -
Onnx Runtime EntryPointNotFoundException: OrtGetApiBase in Unity Application.
#20048 closed
Aug 7, 2025 -
[Performance] Inference failed or unsupported using quantize_dynamic
#20060 closed
Aug 7, 2025 -
openvino with int8
#20072 closed
Aug 7, 2025 -
Unpredictable onnxruntime-node crash when using Electron
#20084 closed
Aug 7, 2025 -
`convert_float_to_float16` results in `failed in shape inference <class 'Exception'>`
#20189 closed
Aug 7, 2025 -
Whether CUDA12.4 and cudnn9 matches onnxruntime-win-x64-cuda12-1.17.1
#20223 closed
Aug 7, 2025 -
[Training] Can we use ORTModule for inference?
#20281 closed
Aug 7, 2025 -
C API Seg Fault from OrtGetApiBase()->GetApi(ORT_API_VERSION);
#20283 closed
Aug 7, 2025 -
[Performance] ScatterND / GridSample operators are on CPU instead of GPU / CUDA
#20297 closed
Aug 7, 2025 -
DirectML returning empty result with ObjectDetection (Mobilinet V2 FPN Keras)
#20386 closed
Aug 7, 2025 -
[Build] Cmake install debug and release configuration
#20387 closed
Aug 7, 2025 -
[Performance] Profiling on CUDA shows confusing values
#20398 closed
Aug 7, 2025 -
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 closed
Aug 7, 2025 -
onnxruntime 1.17.3 is missing from cuda 12 artifacts feed
#20409 closed
Aug 7, 2025 -
Dockerfile does not work
#20458 closed
Aug 7, 2025 -
[Build] cross-compiling onnxruntime for arm32 and onnxruntime_ENABLE_CPUINFO not working.
#20461 closed
Aug 7, 2025 -
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 closed
Aug 7, 2025 -
[Build] cmake duplicate target "memory" between abseil and xnnpack
#20469 closed
Aug 7, 2025 -
[Build] Error when load pf16 model
#20570 closed
Aug 7, 2025 -
DirectML Exception 80070057 "The parameter is incorrect"
#20575 closed
Aug 7, 2025 -
Windows: stress-testing onnxruntime from Java makes CPU usage spike quickly and stay at 100%
#20593 closed
Aug 7, 2025 -
Missing dll cudnn_ops_infer64_8.dll does not generate a python error
#20605 closed
Aug 7, 2025 -
[BUG] Running operations over concat output rewrites it's values
#20606 closed
Aug 7, 2025 -
[Discussion] ORT GPU binaries do not contain DML
#20638 closed
Aug 7, 2025 -
[Build] TVM EP Build
#20665 closed
Aug 7, 2025 -
LayerNormalization doesn't work as expected on Mac
#20676 closed
Aug 7, 2025 -
User-provided session logging function is not used for every log
#20680 closed
Aug 7, 2025 -
Windows ARM64 & X64 CLIP Image Encoder different results
#20722 closed
Aug 7, 2025 -
[Build] quantization unittest failed when run all tests
#20821 closed
Aug 7, 2025 -
[.NET] Update tensor implementations to new Tensor<T> type
#20874 closed
Aug 7, 2025 -
Java CreateTensor with NIO ByteBuffer for reuse purpose
#20882 closed
Aug 7, 2025 -
[Build] how to build on openharmony?
#20895 closed
Aug 7, 2025 -
Stateful/Memory models
#20943 closed
Aug 7, 2025 -
[Performance] Severe performance penalty with transformer model and DirectML
#20983 closed
Aug 7, 2025 -
onnxruntime shape mismatch during quantization of yolov8 models
#21048 closed
Aug 7, 2025 -
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 closed
Aug 7, 2025 -
Symbolic Shape infer fails on onnx file without much logs
#21120 closed
Aug 7, 2025 -
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 closed
Aug 7, 2025 -
ORT 1.18.1 Release Candidates available for testing
#21173 closed
Aug 7, 2025 -
[Performance] Mapfile support for certain external data files is not working
#21195 closed
Aug 7, 2025 -
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 closed
Aug 7, 2025 -
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 closed
Aug 7, 2025 -
Not able to load onnx model multilingual-e5-large
#21321 closed
Aug 7, 2025 -
TensorRT EP's inference results are abnormal.
#21457 closed
Aug 7, 2025 -
[Build] Unable to build with --use_dml
#21568 closed
Aug 7, 2025 -
Memory leak in NPU inference after each one session.run
#21587 closed
Aug 7, 2025 -
[Performance]
#21635 closed
Aug 7, 2025 -
Quantized SeaLLM v2 Model Outputs Same as Input
#21636 closed
Aug 7, 2025 -
Same Model Hash Code Issue from different models
#21672 closed
Aug 7, 2025 -
[Bug]: Onnxruntime.CPU memory leaks
#21723 closed
Aug 7, 2025 -
Inferencing FP16 model using onnxruntime
#21737 closed
Aug 7, 2025 -
[Web] requested dist/*.mjs files for cdnjs
#21785 closed
Aug 7, 2025 -
run_async not running asynchronously
#21791 closed
Aug 7, 2025 -
[Bug] [onnxruntime-node] Error: no available backend found. ERR: [wasm] backend not found.
#21813 closed
Aug 7, 2025 -
Error when trying to run vision model onnx
#21869 closed
Aug 7, 2025 -
[Build] “onnxruntime_cxx_api.h”: No such file or directory
#21891 closed
Aug 7, 2025 -
Snapdragon X processor is unsupported
#21947 closed
Aug 7, 2025 -
[Mobile] IOS library crashes in Release configuration
#21960 closed
Aug 7, 2025 -
[Web] Uncaught WebGPU validation error on Snapdragon SM8450 but works on SM8250
#21970 closed
Aug 7, 2025 -
[Build] onnxruntime-openvino library does not have python3.12 support
#22015 closed
Aug 7, 2025 -
onnxruntime-gpu(1.18.0) can not be install
#22028 closed
Aug 7, 2025 -
[Training] Implicit dependency of Python training API on 'torch' package
#22070 closed
Aug 7, 2025 -
GetElementType is not implemented after updating onnxruntime
#22075 closed
Aug 7, 2025 -
[Web] Error when using Web Workers on Next.js
#22113 closed
Aug 7, 2025 -
Warnings displayed as errors during TensorRT optimization.
#22164 closed
Aug 7, 2025 -
trt_weight_stripped_engine_enable does not work for all networks/size ranges.
#22165 closed
Aug 7, 2025 -
trt_weight_stripped_engine_enable does not work together with trt_dump_ep_context_model
#22179 closed
Aug 7, 2025 -
Filenames in OrtTensorRTProviderOptionsV2 should be std::filesystem::path or at least const ORTCHAR_T*
#22182 closed
Aug 7, 2025 -
[Performance] fp16 support and performance
#22242 closed
Aug 7, 2025 -
Upcoming ORT 1.20 Release Overview
#22274 closed
Aug 7, 2025 -
[Performance] High CUDA memory usage with ONNX Runtime and inconsistent memory release
#22297 closed
Aug 7, 2025 -
Build failure on Windows 10 using OpenVino 2024.3 & 2024.4 both.
#22314 closed
Aug 7, 2025 -
The EP_CTX_BLOB seems to have both WRITE and EXECUTABLE permissions enabled
#22437 closed
Aug 7, 2025 -
External data is not loaded with custom allocator
#22468 closed
Aug 7, 2025 -
[Performance] C++ api: destroy the execution provider if the `Ort::Session` is destroyed
#22511 closed
Aug 7, 2025 -
DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU
#22532 closed
Aug 7, 2025 -
[DO NOT UNPIN] ORT Nightly Package Name Change
#22541 closed
Aug 7, 2025 -
Negative output for sigmoid
#22557 closed
Aug 7, 2025 -
[Performance] Model runtime spiky with TensorRT Execution Provider
#22664 closed
Aug 7, 2025 -
FP16 ONNX model outputs NaN after the first successful execution
#22723 closed
Aug 7, 2025 -
CUDA providers failed to build against 12.6 with error error #221-D
#22728 closed
Aug 7, 2025 -
why force max_length <= kMaxSequenceLength in beam_search_parameters.cc ?
#22735 closed
Aug 7, 2025 -
[TensorRT EP] How can I disable generating cache when using trt execution provider
#22822 closed
Aug 7, 2025 -
[Dev] "./onnxruntime_test_all --help" gives segmentation fault
#22838 closed
Aug 7, 2025 -
how to release gpu memory when use onnxruntime with fastapi
#22899 closed
Aug 7, 2025 -
[Performance] Binary operators using SSE on AVX systems
#22905 closed
Aug 7, 2025 -
[Mobile] Error: Can't load a model: Error Code - ORT_INVALID_PROTOBUF
#22927 closed
Aug 7, 2025 -
Remove Python :: 3.7 Python :: 3.8 Python :: 3.9 from pypi metadata
#22993 closed
Aug 7, 2025 -
[Build] Dotnet packages on nuget are not built with Release optimizations
#23053 closed
Aug 7, 2025 -
[Web] ORT format model not working on WebGPU EP + Wasm Static lib
#23072 closed
Aug 7, 2025 -
[Build] onnxruntime_gpu PiPy on a slow host
#23079 closed
Aug 7, 2025 -
Cannot resolve operator 'LSTM' with webgl backend
#23083 closed
Aug 7, 2025 -
[Bug][CUDAExecutionProvider] INVALID_ARGUMENT : unsupported conv activation mode "Sigmoid"
#23114 closed
Aug 7, 2025 -
Understanding max_mem option of OrtArenaCfg class
#23121 closed
Aug 7, 2025 -
[Bug] Inconsistent Results After ONNX Runtime Optimization
#23133 closed
Aug 7, 2025 -
Inconsistent Results After ONNX Runtime Optimization
#23142 closed
Aug 7, 2025 -
[Build] Better support for vcpkg
#23158 closed
Aug 7, 2025 -
ONNX 1.17.0 integration remaining work: fix QNN EP test failures
#23163 closed
Aug 7, 2025 -
Inconsistent Results After ONNX Runtime Optimization
#23199 closed
Aug 7, 2025 -
[Inference Error] The onnx inference result is inconsistent with the numpy inference result
#23202 closed
Aug 7, 2025 -
[Build] how to build onnxruntime with openvino EP for android
#23222 closed
Aug 7, 2025 -
[Build] Building for Mac Catalyst Fails When Installed Via Cocoapods
#23307 closed
Aug 7, 2025 -
memory.enable_memory_arena_shrinkage is not working in python
#23339 closed
Aug 7, 2025 -
Issue loading custom ONNX model with complex-valued operations in ONNX Runtime (C++)
#23341 closed
Aug 7, 2025 -
Memory creeping up
#23348 closed
Aug 7, 2025 -
No speedup from float16 with directml compared to cuda
#23359 closed
Aug 7, 2025 -
[Build] Possibly unintentional or misconfigured dependencies for QNN EP in onnxruntime_python.cmake
#23360 closed
Aug 7, 2025 -
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 closed
Aug 7, 2025 -
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 closed
Aug 7, 2025 -
[Performance] Round node shows huge performance drop on Windows
#23430 closed
Aug 7, 2025 -
debug result is ok, release get NaN output
#23440 closed
Aug 7, 2025 -
[QUESTION]: onnxruntime with onednn backend
#23543 closed
Aug 7, 2025 -
[Performance] Speed-up TensorRT engine compilation
#23546 closed
Aug 7, 2025 -
Custom operators is not a registered function/op (python)
#23566 closed
Aug 7, 2025 -
[Performance] ORT-WebGPU Average Pooling is working too long in edge case
#23614 closed
Aug 7, 2025 -
TensorRT Provider "Attribute reduction is not supported"
#23618 closed
Aug 7, 2025 -
session.disable_fallback() has no effect, it always fallback to cpu
#23647 closed
Aug 7, 2025 -
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 closed
Aug 7, 2025 -
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 closed
Aug 7, 2025 -
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 closed
Aug 7, 2025 -
[Performance] Why does inference occupy so much memory?
#23867 closed
Aug 7, 2025 -
The Pad operator has a calculation error in the "reflect" mode.
#23878 closed
Aug 7, 2025 -
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 closed
Aug 7, 2025 -
TensorRT Support for Multiple Profiles
#23965 closed
Aug 7, 2025 -
[Build] Unsupported AVX512-FP16 Instructions in MLAS (vcvtneeph2ps, vcvtneoph2ps)
#24025 closed
Aug 7, 2025 -
ImportError: Unable to import dependency onnxruntime
#24120 closed
Aug 7, 2025 -
onnxruntime-mobile implementation on custom execution provider
#24135 closed
Aug 7, 2025 -
segmentation fault while using onnxruntime==1.21.0
#24144 closed
Aug 7, 2025 -
Python Session.run_async Causes Program Exit
#24200 closed
Aug 7, 2025 -
OpenVINO EP not able to use CPU device
#24208 closed
Aug 7, 2025 -
Questions about using AMD VitisAI EP, how can i run my model on AMD NPU?
#24214 closed
Aug 7, 2025 -
[Build] OpenVINO ep for macOS
#24273 closed
Aug 7, 2025 -
[Build] Building v1.21.0: unsupported instruction 'vpdpbusds'
#24275 closed
Aug 7, 2025 -
SIGSEGV when calling OrtSession.run()
#24288 closed
Aug 7, 2025 -
[Build] Onnxruntime v1.21.0 fails to build with GCC-13
#24290 closed
Aug 7, 2025 -
quantize onnx models to INT8
#24374 closed
Aug 7, 2025 -
[Performance] [QNN EP] Performance gap between onnxruntime QNN EP and Genie from QNN SDK.
#24417 closed
Aug 7, 2025 -
[Build] Python build fails because onnxruntime/capi/build_and_package_info.py is missing
#24570 closed
Aug 7, 2025 -
[MLAS] Plan to add RISC-V Vector (RVV) support to MLAS
#24596 closed
Aug 7, 2025 -
nuget package 1.21.2 causes conflicts in Solutions targeting .NET Framework 4.8
#24599 closed
Aug 7, 2025 -
[Mobile] Objective-C API for register onnxruntime-extensions as a custom ops library
#24613 closed
Aug 7, 2025 -
[DO NOT UNPIN] ORT 1.22.0 Release Candidates available for testing
#24671 closed
Aug 7, 2025 -
Scale in resize node becomes an identity node not a parameter inside resize node
#24824 closed
Aug 7, 2025 -
Import error in pytest with onnxruntime-directml 1.22.0
#24907 closed
Aug 7, 2025 -
[Web] Clarification on wasm/simd vs wasm/simd/threaded default in onnxruntime-web v1.19.0+
#25666 closed
Aug 7, 2025 -
[Feature Request] Support for ScatterElements op for QNN-EP
#22962 closed
Aug 7, 2025 -
TreeEnsemble can incorrectly decide a root branch is a leaf.
#24679 closed
Aug 6, 2025 -
[Build] CMake Error related to onnxruntime_unittests.cmake
#24972 closed
Aug 6, 2025 -
onnxruntime custom OP Failure
#25644 closed
Aug 5, 2025 -
onnxruntime with the CPUExecutionProvider errors out while processing the ReverseSequence operator
#24920 closed
Aug 4, 2025 -
[Performance] ORT takes ~11GB memory for quantizing a model of size ~1GB
#24954 closed
Aug 4, 2025 -
[Documentation]
#24958 closed
Aug 4, 2025 -
How to use kv_cache more reasonably in the exported onnx model?
#24873 closed
Aug 2, 2025 -
Llama3.2-1B ONNX Graph generated by olive auto-opt fails to run on DirectML execution provider
#24937 closed
Aug 2, 2025 -
[Build] error: array index 7 is past the end of the array (that has type '__m256[4]')
#23180 closed
Aug 1, 2025 -
olive: a weird behavior of a model converted to ONNX format
#25600 closed
Aug 1, 2025
14 Issues opened by 14 people
-
ORT ABI support in onnx perf test
#25685 opened
Aug 7, 2025 -
[Build] Build failed on Qualcomm WOS platform
#25682 opened
Aug 7, 2025 -
[Build] Pybind11 3.0 support
#25681 opened
Aug 7, 2025 -
[Documentation] Comparison with PyTorch code is identical to Comparison with OpenVINO section
#25661 opened
Aug 6, 2025 -
[Feature Request] Integration with ONNX 1.19.0
#25648 opened
Aug 4, 2025 -
IExecutionProvider::FusedNodeAndGraph set intermediate unused results as model outputs
#25647 opened
Aug 4, 2025 -
[Performance] Quantized ONNX models cannot be efficiently used with speculative decoding?
#25636 opened
Aug 1, 2025 -
Output mismatch since version 1.21+
#25634 opened
Aug 1, 2025 -
Inference fails with 4 bit quantization
#25631 opened
Aug 1, 2025 -
[Feature Request] Nvidia TensorRT RTX runtime in C#
#25630 opened
Aug 1, 2025 -
[Feature Request] Support linear_tree=True in Lightgbm
#25623 opened
Aug 1, 2025 -
[Bug] [Performance] Cannot write_calibration_table for per channel quantization calibration
#25621 opened
Aug 1, 2025
84 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[MIGraphX EP] Syncing AMD changes upstream
#25583 commented on
Aug 7, 2025 • 51 new comments -
[webgpu] support And operator
#25440 commented on
Aug 6, 2025 • 5 new comments -
Add support for bitnets to ORT WebGPU EP
#25587 commented on
Aug 3, 2025 • 2 new comments -
Add reduceSum support for uint32 and uint64
#25597 commented on
Aug 6, 2025 • 1 new comment -
[ARM CPU] SVE support for Elementwise kernels
#25238 commented on
Aug 3, 2025 • 1 new comment -
Fix antialias downsample on CUDA EP
#25265 commented on
Aug 4, 2025 • 1 new comment -
[MLAS] Add 8-bit weights ARM64 Gemm implementation
#25110 commented on
Aug 5, 2025 • 0 new comments -
Avoid traversing entire arrays when extracting shape from objects in java
#24833 commented on
Aug 4, 2025 • 0 new comments -
Using separate cuda streams for one session
#23319 commented on
Aug 7, 2025 • 0 new comments -
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on
Aug 7, 2025 • 0 new comments -
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on
Aug 7, 2025 • 0 new comments -
Awful performance with LASER model when using TensorRT provider
#8315 commented on
Aug 7, 2025 • 0 new comments -
multiple tests fail on Windows due to `ORT_ENABLE_STREAM` define logic error
#20180 commented on
Aug 7, 2025 • 0 new comments -
Drop support for Python 3.5
#5961 commented on
Aug 7, 2025 • 0 new comments -
Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package
#12084 commented on
Aug 7, 2025 • 0 new comments -
GPU Memory allocation with multiple cuda stream
#12920 commented on
Aug 7, 2025 • 0 new comments -
[Performance] Dynamic Shape performance
#13198 commented on
Aug 7, 2025 • 0 new comments -
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on
Aug 7, 2025 • 0 new comments -
[Performance] running on xavier gpu but cpu usage high
#14676 commented on
Aug 7, 2025 • 0 new comments -
Does ortvalue_from_numpy support directml?
#15421 commented on
Aug 7, 2025 • 0 new comments -
perf_view shows nothing after json load
#15927 commented on
Aug 7, 2025 • 0 new comments -
[TEST FAILED] Several tests fails while running onnxruntime_test_all on armv7 based device
#16387 commented on
Aug 7, 2025 • 0 new comments -
Unable to use LSTM with mask of dynamic shape with TensorrtExecutionProvider
#16885 commented on
Aug 7, 2025 • 0 new comments -
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on
Aug 7, 2025 • 0 new comments -
[webgpu] And int64 to cast
#25610 commented on
Aug 2, 2025 • 0 new comments -
Refactor code to prevent internal structure from leaking outside Graph class
#25586 commented on
Jul 31, 2025 • 0 new comments -
[webgpu] Optimize dp4 prefill shader for Qualcomm
#25578 commented on
Aug 4, 2025 • 0 new comments -
[Web] Avoid unnecessary data copy for pre-allocated tensors
#25571 commented on
Aug 4, 2025 • 0 new comments -
[CUDA EP] Add hardswish op and add bf16 support for hardsigmoid
#25562 commented on
Aug 4, 2025 • 0 new comments -
[webgpu] Add more GEMM test
#25556 commented on
Aug 1, 2025 • 0 new comments -
2bit matmul implementation
#25542 commented on
Aug 6, 2025 • 0 new comments -
[Not-For-Review] support enableGraphCapture in tests
#25535 commented on
Aug 5, 2025 • 0 new comments -
Retrieve Device and Command buffer for DML
#25533 commented on
Aug 5, 2025 • 0 new comments -
POWER : Implement MlasGemmQuantKernel using VSX builtins for M = 1
#25490 commented on
Aug 5, 2025 • 0 new comments -
Compile API: disable optimizations by default
#25474 commented on
Aug 2, 2025 • 0 new comments -
Compile API: output EPContext binary data to write function
#25471 commented on
Aug 2, 2025 • 0 new comments -
Compile API: output model and initializer stream write functions
#25455 commented on
Aug 2, 2025 • 0 new comments -
[EP ABI] Get EP compiled model compatibility
#25331 commented on
Aug 2, 2025 • 0 new comments -
Fix Sign and Clip operation on int64 tensors
#25280 commented on
Aug 4, 2025 • 0 new comments -
[Mlas] optimize MlasConv using thread partition opt
#25255 commented on
Aug 5, 2025 • 0 new comments -
[WIP] Add some device discovery support for non-Windows platforms
#25228 commented on
Aug 2, 2025 • 0 new comments -
Update index.md
#25119 commented on
Aug 5, 2025 • 0 new comments -
Type mismatch error when loading a Float16 model
#25522 commented on
Aug 4, 2025 • 0 new comments -
[Build] cmake cannot find KLEIDIAI - Windows 11 ARM
#24865 commented on
Aug 3, 2025 • 0 new comments -
onnxruntime errors out due to the wrong process of GatherElements operator with the CPUExecutionProvider: Out of range value in index tensor
#24917 commented on
Aug 3, 2025 • 0 new comments -
[BUG] Non-zero status code returned while running Resize node. in Direct ML backend
#24928 commented on
Aug 3, 2025 • 0 new comments -
Incorrect Use of CUDA Constants in MIGraphXExecutionProvider::CreatePreferredAllocators (Should Use HIP)
#25268 commented on
Aug 3, 2025 • 0 new comments -
pip install keras and pytorch comes with .onnxruntime_pybind11_state error from rembg python package
#25289 commented on
Aug 3, 2025 • 0 new comments -
[Performance] How to used pinned memory in onnxruntime.
#20947 commented on
Aug 2, 2025 • 0 new comments -
[Build] can't build CUDA (+ vino and directML) for latest v1.22 on windows
#25081 commented on
Aug 2, 2025 • 0 new comments -
Initializers use wrong allocator
#25108 commented on
Aug 2, 2025 • 0 new comments -
[Build] CMake configurations files for bin release 1.22.0 are broken
#25242 commented on
Aug 2, 2025 • 0 new comments -
[DirectML EP] Error when validating attributes of `Slice` operator
#25252 commented on
Aug 2, 2025 • 0 new comments -
[Build] CMake configurations files for bin release 1.22.0 are broken for Linux
#25279 commented on
Aug 2, 2025 • 0 new comments -
[Build] Build script for MacOS fails for targets older than 13.4 because tests can not be built
#24277 commented on
Aug 2, 2025 • 0 new comments -
Does BatchNormalization support 2D shape of `X` input
#25230 commented on
Aug 1, 2025 • 0 new comments -
Segmentation Fault running model
#25613 commented on
Aug 1, 2025 • 0 new comments -
[Build] CCCL API migration issue.
#24774 commented on
Aug 1, 2025 • 0 new comments -
[Build] How to build ONNX Runtime as a dynamic framework (.dylib/.framework) for iOS?
#25256 commented on
Aug 1, 2025 • 0 new comments -
about infer ocr Memory exception
#25258 commented on
Aug 1, 2025 • 0 new comments -
[Bug] CUDAExecutionProvider fails to load due to missing libcudnn.so.9 in LD_LIBRARY_PATH when using onnxruntime-gpu==1.22.0
#25609 commented on
Aug 1, 2025 • 0 new comments -
[Feature Request] Improve Telemetry Disablement
#25573 commented on
Aug 1, 2025 • 0 new comments -
[Feature Request] Cast Float16 model to Float32 [Web]
#17230 commented on
Aug 1, 2025 • 0 new comments -
RunAsync C# API crashes without any error
#19140 commented on
Aug 7, 2025 • 0 new comments -
[Web] `Error: [WebGPU] Kernel "[Conv] /text_encoder/encoder/layers.0/feed_forward/conv_2/Conv" failed. Error: FILTER_IN_CHANNEL should be equal to DATA_CHANNEL`
#21108 commented on
Aug 7, 2025 • 0 new comments -
[Performance] Increased memory usage when loading from bytes
#21165 commented on
Aug 7, 2025 • 0 new comments -
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on
Aug 7, 2025 • 0 new comments -
[Build] Mismatched library directory in linux-x64 package: lib and lib64
#22267 commented on
Aug 7, 2025 • 0 new comments -
[Performance] GPU op placement control when some ops must be on the CPU
#23154 commented on
Aug 7, 2025 • 0 new comments -
[Build] 1.20.2 Microsoft.ML.OnnxRuntime.Managed nuget package needs Microsoft.ML.OnnxRuntime 1.20.2 which is not available
#23640 commented on
Aug 7, 2025 • 0 new comments -
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 commented on
Aug 7, 2025 • 0 new comments -
GetShape crashes on Linux
#25295 commented on
Aug 7, 2025 • 0 new comments -
[Build] Cannot read property 'install' of null with onnxruntime-react-native imported
#19510 commented on
Aug 7, 2025 • 0 new comments -
OpenVino Runtime Exception. Unexpected: CPU plug-in doesn't support If operation with dynamic rank. Operation name: input.15
#23757 commented on
Aug 7, 2025 • 0 new comments -
[Performance] Openvino 2x slower than with OpenCV on an Intel HD Graphics 620 / 630
#25266 commented on
Aug 6, 2025 • 0 new comments -
[CANN] When using onnxruntime-cann for inference, it failed to utilize the NPU for inference
#22229 commented on
Aug 6, 2025 • 0 new comments -
OpenVINO EP fails to run models with in-memory external data
#25304 commented on
Aug 6, 2025 • 0 new comments -
[WebGPU] Subgroups feature is not enabled for ort-web WebGPU EP
#25595 commented on
Aug 6, 2025 • 0 new comments -
[Bug] Auto EP selection rejects the combination of DML EP with other EPs like OpenVINO EP
#25504 commented on
Aug 5, 2025 • 0 new comments -
GetEpDevices() does not Detect Intel NPU via OpenVINO EP
#25557 commented on
Aug 5, 2025 • 0 new comments -
onnxruntime-gpu fails to find libnvrtc.so.12 when CUDA is not installed globally
#24719 commented on
Aug 5, 2025 • 0 new comments -
mutex issue on Mac only for release 1.21.X only
#24579 commented on
Aug 5, 2025 • 0 new comments -
Cannot use Microsoft.ML.OnnxRuntime NuGet package 1.22.1 with Microsoft.SemanticKernel.Connectors.Onnx
#25287 commented on
Aug 4, 2025 • 0 new comments -
Incorrect cubic resizing with antialias on CUDA
#25264 commented on
Aug 4, 2025 • 0 new comments