Summary: | [gen4] GPU crash | ||
---|---|---|---|
Product: | Mesa | Reporter: | Alejandro Gonzalez <agonzalez> |
Component: | Drivers/DRI/i965 | Assignee: | Ian Romanick <idr> |
Status: | RESOLVED DUPLICATE | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | blocker | ||
Priority: | medium | CC: | intel-gfx-bugs |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
/sys/class/drm/card0/error
dmsg output
glxinfo output
lsb_release -a output
lshw output
Xorg -version output |
The video died

Could you add more information on the system (which distro, which version of Mesa and X server) and also on the use case or program that was executed?

Created attachment 107775 [details]
dmsg output
Created attachment 107776 [details]
glxinfo output
Created attachment 107777 [details]
lsb_release -a output
Created attachment 107778 [details]
lshw output
Created attachment 107779 [details]
Xorg -version output
I executed an "apt-get dist-upgrade" and then the video died. Now, when I boot the machine and X starts, the monitor goes dead (both X and the text console). The services on the machine are running well (the machine is a PostgreSQL node for QA testing), but I can only access the console over ssh. I attached the output of these commands (after rebooting the machine): lshw, Xorg -version, lsb_release -a. If you need the output of another command, just tell me the command line.

I suspect this may be another duplicate of bug 80568, fixed (worked around) by this commit:

commit c4fd0c9052dd391d6f2e9bb8e6da209dfc7ef35b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Sat Jan 17 23:21:15 2015 -0800

i965: Work around mysterious Gen4 GPU hangs with minimal state changes.

Gen4 hardware appears to GPU hang frequently when using Chromium, and also when running 'glmark2 -b ideas'. Most of the error states contain 3DPRIMITIVE commands in quick succession, with very few state packets between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER.

I trimmed an apitrace of the glmark2 hang down to two draw calls with a glUniformMatrix4fv call between the two. Either draw by itself works fine, but together, they hang the GPU. Removing the glUniform call makes the hangs disappear. In the hardware state, this translates to removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets.

Flushing before emitting CONSTANT_BUFFER packets also appears to make the hangs disappear. I observed a slowdown in glxgears by doing it all the time, so I've chosen to only do it when BRW_NEW_BATCH and BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or already flushed the whole pipeline).

I'd much rather understand the problem, but at this point, I don't see how we'd ever be able to track it down further. We have no real tools, and the hardware people moved on years ago. I've analyzed 20+ error states and read every scrap of documentation I could find.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>

It's in git, and backports are in Mesa 10.4.x for x > 3. Please try upgrading to >10.4.3. If it's resolved by such an upgrade, please mark as a duplicate of bug 80568.

No reply. Marking as duplicate.

*** This bug has been marked as a duplicate of bug 80568 ***
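For readers following the commit message above, the flush condition it describes can be modeled in a few lines. This is only a toy sketch in Python, not the driver code: the bit values assigned to BRW_NEW_BATCH and BRW_NEW_PSP, the function names, and the list standing in for a batch buffer are all assumptions made for illustration.

```python
# Hypothetical dirty-state bits. The names come from the commit message;
# the numeric values here are made up and do not match Mesa's definitions.
BRW_NEW_BATCH = 1 << 0
BRW_NEW_PSP = 1 << 1


def needs_flush_before_constant_buffer(dirty: int) -> bool:
    """Return True when neither BRW_NEW_BATCH nor BRW_NEW_PSP is set.

    Per the commit message, those flags mean the pipeline was already
    flushed or CS_URB_STATE changed, so an extra flush is skipped.
    """
    return (dirty & (BRW_NEW_BATCH | BRW_NEW_PSP)) == 0


def emit_constant_buffer(dirty: int, batch: list) -> None:
    """Append a (hypothetical) flush, if needed, then CONSTANT_BUFFER."""
    if needs_flush_before_constant_buffer(dirty):
        batch.append("FLUSH")
    batch.append("CONSTANT_BUFFER")
```

With no flags set, the model emits a flush before CONSTANT_BUFFER; with either flag set, it emits CONSTANT_BUFFER alone, which is the trade-off the commit chose to avoid the glxgears slowdown.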
Created attachment 107557 [details]
/sys/class/drm/card0/error

The video died.

[ 255.808017] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 255.808018] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 255.808018] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 255.808019] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 255.808020] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 255.816085] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x5b1d000 ctx 0) at 0x5b1d6c0
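When triaging several reports like this one, it can help to pull the hang details out of the kernel log mechanically. The following is a minimal sketch, assuming the message always has the shape shown in the last log line above; the function name parse_hang and the reading of the first hex value as the buffer object's start address are assumptions, not something the log itself states.

```python
import re

# Matches the "ring hung inside bo" message as it appears in this report.
HANG_RE = re.compile(
    r"\*ERROR\* (\w+) ring hung inside bo "
    r"\((0x[0-9a-fA-F]+) ctx (\d+)\) at (0x[0-9a-fA-F]+)"
)


def parse_hang(line: str):
    """Return a dict describing the hang, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    ring, bo, ctx, addr = match.groups()
    bo_addr = int(bo, 16)
    hang_addr = int(addr, 16)
    return {
        "ring": ring,
        "bo": bo_addr,
        "ctx": int(ctx),
        "addr": hang_addr,
        # Assumed interpretation: distance of the hang address from the bo start.
        "offset": hang_addr - bo_addr,
    }
```

Applied to the line from this report, the sketch reports the render ring and an offset of 0x6c0 into the buffer object, under the interpretation stated above.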