Summary: | repeatable drm:amdgpu_job_timedout with vulkan toy | ||
---|---|---|---|
Product: | Mesa | Reporter: | Dave Gilbert <freedesktop> |
Component: | Drivers/Vulkan/radeon | Assignee: | mesa-dev |
Status: | RESOLVED NOTOURBUG | QA Contact: | mesa-dev |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
Description
Dave Gilbert
2018-05-28 16:11:12 UTC
I think there's a fair chance that it's actually getting stuck in the while loop in my ray.comp (which may well be a screwup on my part); but even so taking out everything in a non-rebootable way is a bit of a mess!

adding:

diff --git a/ray.comp b/ray.comp
index e75039f..0611d56 100644
--- a/ray.comp
+++ b/ray.comp
@@ -52,13 +52,14 @@ void main() {
   // sure that none of rx/ry/rz are greater than a pixel
   ray = ray / length(ray);
+  int limit = 0;
   float result = 0.0;
   bool hitx = false;
   bool hity = false;
   bool hitz = false;
   bool hitedge = false;
   float lighting = 0.0;
-  while (result <= 255.4 && !hitedge &&
+  while (result <= 255.4 && !hitedge && limit < 256 &&
          !(hitx=hitend(pvp.x, ray.x, vsize.x)) &&
          !(hity=hitend(pvp.y, ray.y, vsize.y)) &&
          !(hitz=hitend(pvp.z, ray.y, vsize.z))) {
@@ -76,6 +77,7 @@ void main() {
       result+= float(value/8.0);
     }
     pvp += ray;
+    limit++;
   }
   if (result > 255.0) result=255.0;

seems to stop it triggering.

or actually the correct fix to my ray.comp is:

--- a/ray.comp
+++ b/ray.comp
@@ -61,7 +61,7 @@ void main() {
   while (result <= 255.4 && !hitedge &&
          !(hitx=hitend(pvp.x, ray.x, vsize.x)) &&
          !(hity=hitend(pvp.y, ray.y, vsize.y)) &&
-         !(hitz=hitend(pvp.z, ray.y, vsize.z))) {
+         !(hitz=hitend(pvp.z, ray.z, vsize.z))) {

that will stop the ray tracer running off into the distance for ever.

GPU reset and recovery can be enabled using amdgpu.gpu_recovery=1.

Otherwise a shader with an endless loop will just keep running forever.

(In reply to Christian König from comment #3)
> GPU reset and recovery can be enabled using amdgpu.gpu_recovery=1.
>
> Otherwise a shader with an endless loop will just keep running forever.

Hmm; that's a dangerous situation.
1) Recovery doesn't work - I tried it, and it did at least make the machine rebootable, but the GPU state was very broken after the reset attempt.
2) An unprivileged user being able to make the system need & fail to reboot by default is a security issue; couldn't this be triggered by something like WebGL?
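For readers following the two patches above, here is a minimal CPU-side sketch in C of why passing ray.y in place of ray.z can keep the loop running, and how the `limit` counter bounds it. This is not the shader from the report: the body of `hitend()` is never shown in this thread, so the version below is a hypothetical stand-in, and `march_z()` with its `buggy` flag is likewise invented for illustration. It only models the z axis, assuming a ray that travels along z with no y component.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shader's hitend(); the real one is not
 * shown in the report. Returns true once `pos` has left the range
 * [0, size) in the direction of travel given by `dir`. */
static bool hitend(float pos, float dir, float size) {
    return (dir > 0.0f && pos >= size) || (dir < 0.0f && pos < 0.0f);
}

/* Marches a point along z and returns the number of steps taken.
 * With buggy == true the exit test is given ray_y instead of ray_z,
 * mirroring the typo fixed by the second patch. `limit` plays the role
 * of the iteration cap added by the first patch. */
static int march_z(float ray_y, float ray_z, float vsize_z,
                   bool buggy, int limit) {
    float pz = 0.0f;
    int steps = 0;
    float dir = buggy ? ray_y : ray_z;
    while (steps < limit && !hitend(pz, dir, vsize_z)) {
        pz += ray_z; /* the position still advances along z */
        steps++;
    }
    return steps;
}
```

Under these assumptions, `march_z(0.0f, 1.0f, 16.0f, true, 256)` only stops because of the cap, while the same call with `buggy` set to `false` leaves the 16-unit volume on its own. The cap is a safety net; the ray.z correction is what removes the runaway condition.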
Yes, that is a known issue, but this particular bug report is not the way to cover it, so closing it again.

(In reply to Nicolai Hähnle from comment #5)
> Yes, that is a known issue, but this particular bug report is not the way to
> cover it, so closing it again.

What is the way to cover it then? There have been several bug reports about unrecoverable GPU hangs leading to full system hangs, and all of them were swept off the table just like this one.

(In reply to Nicolai Hähnle from comment #5)
> Yes, that is a known issue, but this particular bug report is not the way to
> cover it, so closing it again.

OK, fair enough - which one should I be following?
(For reference, I just tried it on an Intel box; according to dmesg it times out in the same way, but it does successfully manage a reset, which I guess is better; it's still fairly grim locking the GUI for the timeout period).

As Nicolai said, this is a known issue. Definitely unrelated to RADV. Please don't re-open, thanks!