Bugzilla – Bug 75295
Frequent hang and render glitches on Ubuntu 14.04
Last modified: 2015-02-12 21:43:07 UTC
After the latest kernel upgrade, my system has many graphical glitches, and is locking up frequently. The dmesg output has errors like these:
[ 1951.568672] Watchdog: segfault at 0 ip 00007fe00773a32e sp 00007fdff869f680 error 6 in chrome[7fe003cbe000+5dd9000]
[ 1959.241676] [drm] stuck on render ring
[ 1959.241685] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 1959.241686] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 1959.241687] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 1959.241688] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 1959.241689] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 1959.244266] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x3dc32000 ctx 17) at 0x3dc32c48
[ 3964.330034] perf samples too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 4849.028446] [drm] stuck on render ring
[ 4849.028492] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x565c2000 ctx 17) at 0x565c2c48
[ 4861.093551] Watchdog: segfault at 0 ip 00007fc00d36f32e sp 00007fbffe2d4680 error 6 in chrome[7fc0098f3000+5dd9000]
[ 4863.020198] [drm] stuck on render ring
[ 4863.020255] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x3dc32000 ctx 17) at 0x3dc32c48
[ 4893.028245] [drm] stuck on render ring
[ 4893.028295] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x4cbed000 ctx 10) at 0x4cbedc98
[ 4899.041855] [drm] stuck on render ring
[ 4899.041900] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xea3d000 ctx 10) at 0xea3dc98
[ 4899.041903] [drm:i915_context_is_banned] *ERROR* context hanging too fast, declaring banned!
[ 5833.173837] warning: `VBoxHeadless' uses 32-bit capabilities (legacy support in use)
[ 5833.326542] device vboxnet0 entered promiscuous mode
[ 6429.476175] [drm] stuck on render ring
[ 6488.455986] [drm] stuck on render ring
[ 6547.507818] [drm] stuck on render ring
[ 6615.490047] [drm] stuck on render ring
I am not sure if the xserver-xorg-video-intel driver was also updated at the same time. This is the version in use:
rdhruva@ubuntu:~$ apt-cache policy xserver-xorg-video-intel
*** 2:2.99.910-0ubuntu1 0
500 http://us.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
The corresponding ubuntu bug is: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1282867. That bug has a lot of information, including dmidecode output.
Created attachment 94469
Attaching as directed by dmesg.
Created attachment 94470
The versions might be incorrect, I am not sure about that.
Created attachment 94471
Latest Xorg log
Any clue as to what OpenGL applications are running at the time of the hangs? Also what version of mesa is installed (i.e. the output of glxinfo)?
I don't remember the exact list of applications, but I use KDE 4.12.2. I also have Chrome and Hexchat open. No games, video playback, or other graphics-intensive applications were running.
Created attachment 94481
Output of glxinfo
I have switched to UXA for now, and that seems to have solved all the problems: I don't see any display glitches, and no lock-ups. The "stuck on render ring" messages are also gone.
Let me know if I can provide any more debugging information.
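For anyone wanting to try the same UXA workaround, here is a minimal sketch of the X configuration involved. It assumes the xf86-video-intel driver and its "AccelMethod" option; the file name and the Identifier string are arbitrary choices for illustration, not something taken from this report. The snippet is written to the current directory on purpose, so nothing under /etc is touched until you copy it yourself.

```shell
#!/bin/sh
# Write the snippet to a scratch file by default (file name is an assumption).
# Copying it into the X configuration directory needs root and is left to the reader.
OUT="${OUT:-./20-intel-uxa.conf}"

cat > "$OUT" <<'EOF'
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "uxa"
EndSection
EOF

echo "wrote $OUT"
```

After copying the file into place (for example as a Device section in /etc/X11/xorg.conf, or as a file under /etc/X11/xorg.conf.d/ if your X server reads that directory), restart X and check the Xorg log to confirm which acceleration method is in use.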
I've been seeing the same glitches, starting when the updated mesa packages were pushed out on 2/20: http://ubuntuforums.org/showthread.php?t=2206883
At exactly the same time the glitches appeared, the primus bridge quit working for Bumblebee: https://github.com/amonakov/primus/issues/133
Backleveling mesa to the packages in the saucy repository causes the graphical glitches to disappear and everything looks normal again.
Can you try to bisect through the mesa git history to find this regression?
Are there any instructions on how to do this for Ubuntu?
First hit on Google for kernel bisecting ;-)
Sorry, I was not aware that this is a kernel thing. I thought this bisection was required in the mesa source package.
It's a mesa thing; I was confused because the bugzilla update somehow ended up in my kernel bugs folder.
For bisecting mesa you can simply build from sources without any need to install anything. You only need to set LIBGL_DRIVERS_PATH to the i965_dri.so binary built by mesa, e.g.
Mesa built from git has the git version tag embedded in the GL version string, so you can check that you are running the right thing.
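A rough sketch of that check, under a few assumptions: the build tree lives in ~/build/mesa with the driver in its lib/ directory, glxinfo is installed, and the version string carries a "(git-<hash>)" suffix. The MESA_BUILD variable and the mesa_git_tag helper are names made up for this example.

```shell
#!/bin/sh
# Assumed location of the mesa build tree; adjust to your checkout.
MESA_BUILD="${MESA_BUILD:-$HOME/build/mesa}"

# Hypothetical helper: extract the hash from a string such as
# "OpenGL version string: 3.0 Mesa 10.2.0-devel (git-079bff5)".
mesa_git_tag() {
    sed -n 's/.*(git-\([0-9a-f]*\).*/\1/p'
}

if command -v glxinfo >/dev/null 2>&1 && [ -f "$MESA_BUILD/lib/i965_dri.so" ]; then
    # Point the GL loader at the freshly built driver for this one command only.
    LIBGL_DRIVERS_PATH="$MESA_BUILD/lib" glxinfo \
        | grep "OpenGL version string" | mesa_git_tag
else
    echo "glxinfo or $MESA_BUILD/lib/i965_dri.so not found; nothing to check" >&2
fi
```

If the printed hash matches the commit you checked out, the application is really using the driver you just built rather than the system one.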
Following the instructions to build mesa, I think I was able to get it running successfully:
rdhruva@ubuntu:~/build/mesa$ LIBGL_DRIVERS_PATH=./lib LD_PRELOAD=./lib/libGL.so.1 glxinfo | grep -i version
server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.2.0-devel (git-079bff5)
OpenGL core profile shading language version string: 3.30
OpenGL version string: 3.0 Mesa 10.2.0-devel (git-079bff5)
OpenGL shading language version string: 1.30
Now when I try "glxgears", everything is fine. I am unable to determine if this is because the latest git checkout fixed the problem, or if the problem is not surfacing because I am running X in the UXA acceleration mode (instead of SNA).
Looking at the logs, can you suggest a good way to reproduce this problem? Thanks!
You need to check out the same version of mesa you have currently installed, to make sure you can reproduce the issue correctly when building from sources. Then the same for the last known working version.
Only once that's confirmed should you start the bisect.
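Once both endpoints are confirmed, the bisect itself could be started roughly like this. This is only a sketch: start_bisect is a hypothetical wrapper, and the good and bad revisions are whatever you verified in the previous step (this thread does not name exact tags).

```shell
#!/bin/sh
# Hypothetical helper: begin a bisect between a known-good and a known-bad revision.
# Usage: start_bisect <repo-dir> <good-rev> <bad-rev>
start_bisect() {
    repo=$1
    good=$2
    bad=$3
    # "git bisect start <bad> <good>" marks both endpoints and checks out
    # a revision roughly halfway between them.
    git -C "$repo" bisect start "$bad" "$good"
}

# After each step: rebuild mesa, run the test case against the new build, then
#   git bisect good    # glitches absent
#   git bisect bad     # glitches present
# and repeat until git names the first bad commit. Finish with:
#   git bisect reset
```

Each step halves the remaining range, so even a few hundred commits between the two versions need fewer than ten rebuilds.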
I updated to the latest packages in Ubuntu, and this bug still exists.
Daniel, are you sure this is a problem in i965? dmesg seems to indicate i915:
[ 70.826529] [drm] stuck on render ring
[ 70.826536] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 70.826538] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 70.826538] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 70.826539] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 70.826540] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 70.829103] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x4ddb4000 ctx 2) at 0x4ddb4d50
[ 76.828152] [drm] stuck on render ring
[ 76.828189] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x9162000 ctx 2) at 0x9162c64
[ 76.828191] [drm:i915_context_is_banned] *ERROR* context hanging too fast, declaring banned!
The kernel is just the messenger here reporting that someone hung the GPU. The details about who and how are all in the error state.
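For reference, the error state is the file named in the dmesg output above (/sys/class/drm/card0/error). A small sketch for saving a copy to attach; save_error_state and the default output file name are made up for this example, and reading the sysfs file may require root.

```shell
#!/bin/sh
# Hypothetical helper: copy the GPU error state to a regular file for attaching.
# Usage: save_error_state [source] [destination]
save_error_state() {
    src=${1:-/sys/class/drm/card0/error}
    dst=${2:-./gpu-error-state.txt}
    if [ -r "$src" ]; then
        cat "$src" > "$dst"
    else
        echo "$src is not readable (may need root, or no i915 device present)" >&2
        return 1
    fi
}
```

Run it soon after a hang, for example as `save_error_state` from a root shell, since the recorded state does not survive a reboot.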
@Chris: Does that mean a bisect is not required? Is the attached error information enough to debug the issue?
Hello: what can I provide to remove the NEEDINFO status? I am confused whether the git bisect is still required: Chris' message seems to imply that the problem might be completely visible in the attached error file.
If a bisect is indeed required, I am not sure if it's for i965 or i915 (the dmesg errors reference i915).
This is still happening to me with all the updates applied. Can I provide any more information to help debug this issue?
Chris's comment was just about your statement in comment #17 that this is i915 related: the kernel driver is called i915, but the mesa driver for your hardware is i965. And, as Chris said, the kernel is just the messenger.
In short, the bisect of mesa is still required, nothing changed.
Daniel: I updated my git repo and checked out version 10.1 (which is what my install currently has). I am still unable to reproduce the problem from the git checkout, because the moment I start X with "AccelMethod uxa", the problem goes away.
Is there a better way of testing this one library against an X server that was not started with UXA? Starting X with SNA and testing there is not really an option, because everything is unusable then.
The latest "mesa" updates in Ubuntu fixed all the problems for me. The relevant patch I see is http://permalink.gmane.org/gmane.linux.debian.devel.x/115099, but I am not sure.
Closing as fixed per comment #24.