ONNX Models Segmentation Fault
I’m using YOLOX for inference, and I’m getting an unexplainable ONNX Runtime error on roughly 15% of requests:
2025-01-30 14:30:46,845 E [78] azmlinfsrv - Encountered Exception: Traceback (most recent call last):
  File "/opt/miniconda/envs/amlenv/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 132, in invoke_run
    run_output = self._wrapped_user_run(**run_parameters, request_headers=dict(request.headers))
  File "/opt/miniconda/envs/amlenv/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 156, in <lambda>
    self._wrapped_user_run = lambda request_headers, **kwargs: self._user_run(**kwargs)
  File "/var/azureml-app/endpoint/score.py", line 88, in run
    overlay, boxes, labels, scores = inference_fun(data)
  File "/var/azureml-app/endpoint/yolox_inference.py", line 483, in detect
    output = self.session.run(None, ort_inputs)
  File "/opt/miniconda/envs/amlenv/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Conv node. Name:'/backbone/C3_n3/conv3/conv/Conv' Status Message: /onnxruntime_src/onnxruntime/core/common/safeint.h:17 static void SafeIntExceptionHandler<onnxruntime::OnnxRuntimeException>::SafeIntOnOverflow()
Just in case, I checked that the onnx and onnxruntime versions match between my training component environment and the inference environment.
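For reference, a quick check along these lines, run in both environments, is enough to compare them side by side (a minimal sketch):

import onnx
import onnxruntime

# Print package versions and available execution providers so the training
# and inference environments can be compared side by side.
print("onnx:", onnx.__version__)
print("onnxruntime:", onnxruntime.__version__)
print("providers:", onnxruntime.get_available_providers())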
And even if you re-run the exact same request (same image), the issue persists: it fails about 15% of the time, but the payload is always the same!
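To convince myself the failure really was non-deterministic, I hammered the endpoint with the identical payload and counted failures. This is a rough sketch; the scoring URI, key, and payload file are placeholders for whatever your deployment uses:

import requests

SCORING_URI = "https://<endpoint>.<region>.inference.ml.azure.com/score"  # placeholder
API_KEY = "<key>"  # placeholder
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# Send the exact same body 100 times and count non-200 responses.
with open("payload.json", "rb") as f:  # placeholder payload
    body = f.read()

failures = sum(
    requests.post(SCORING_URI, data=body, headers=HEADERS).status_code != 200
    for _ in range(100)
)
print(f"{failures}/100 identical requests failed")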
The culprit was the Conv node flagged in the traceback. The solution was disabling ONNX Runtime’s brute-force search for the best cuDNN convolution algorithm, via the CUDA execution provider’s cudnn_conv_algo_search option (which defaults to EXHAUSTIVE):
if "CUDAExecutionProvider" in onnxruntime.get_available_providers():
providers=[("CUDAExecutionProvider", {"cudnn_conv_algo_search": "DEFAULT"})]
else:
providers = ["CPUExecutionProvider"]
session = onnxruntime.InferenceSession(onnx_file, providers=providers)
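For context, cudnn_conv_algo_search accepts three values: EXHAUSTIVE (the default, which benchmarks candidate algorithms by brute force), HEURISTIC (a cheaper heuristic lookup), and DEFAULT (always use cuDNN’s default algorithm). If DEFAULT costs noticeable latency, HEURISTIC is worth trying before going back to EXHAUSTIVE.

To verify the fix, a quick soak test helps; this is a minimal sketch, assuming a single image input (the model path and the 640x640 YOLOX-style input shape are illustrative):

import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession(
    "model.onnx",  # illustrative path
    providers=[("CUDAExecutionProvider", {"cudnn_conv_algo_search": "DEFAULT"})],
)

# Feed the same dummy tensor repeatedly; before the fix this intermittently
# raised RUNTIME_EXCEPTION from the Conv node.
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
for _ in range(100):
    session.run(None, {input_name: dummy})
print("100 consecutive runs completed without a runtime exception")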