Summary: | [ivb/hsw] batch overwritten with garbage | ||
---|---|---|---|
Product: | Mesa | Reporter: | Mario Golfetto <mariogolf2> |
Component: | Drivers/DRI/i965 | Assignee: | Kenneth Graunke <kenneth> |
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | adriancz, anarsoul, bb, emir72h, famelis, ilanco, intel-gfx-bugs, jnocturna, mtijink.bugs, nicolas.belouin, robert2505, ryllu800proar, saintdev, sibrus, t.kijas, webstrand, zecoucou |
Version: | 10.1 | ||
Hardware: | Other | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 77449 | ||
Attachments: |
dump in /sys/class/drm/card0/error
dump in /sys/class/drm/card0/error (20140411)
cat /sys/class/drm/card0/error > error.txt |
Try downgrading to mesa-9 or mesa-10.0.

*** Bug 76300 has been marked as a duplicate of this bug. ***

I was able to reproduce this today - a sensible looking batch, but the first bunch of DWords are smashed to 0xFFFFFFFF. Also on Haswell. Not sure what's going on.

Mario, what version of Mesa are you using?

On 10/04/2014 01:17, bugzilla-daemon@freedesktop.org had the honor and the audacity to write:
> Comment #3 <https://bugs.freedesktop.org/show_bug.cgi?id=77207#c3> on
> bug 77207 <https://bugs.freedesktop.org/show_bug.cgi?id=77207> from
> Kenneth Graunke <mailto:kenneth@whitecape.org>
>
> I was able to reproduce this today - a sensible looking batch, but the first
> bunch of DWords are smashed to 0xFFFFFFFF. Also on Haswell. Not sure what's
> going on.
>
> Mario, what version of Mesa are you using?

Hi, these are the Mesa package versions on my box (updated today):

ii libegl1-mesa:amd64 10.1.0-5
ii libegl1-mesa-drivers:amd64 10.1.0-5
ii libgl1-mesa-dri:amd64 10.1.0-5
ii libgl1-mesa-glx:amd64 10.1.0-5
ii libglapi-mesa:amd64 10.1.0-5
ii libgles2-mesa:amd64 10.1.0-5
ii libglu1-mesa:amd64 9.0.0-2
ii libopenvg1-mesa:amd64 10.1.0-5
ii libwayland-egl1-mesa:amd64 10.1.0-5

Today I upgraded my system and it is working better. I'm still looking for new details.

Regards,
Mario

Created attachment 97229 [details]
dump in /sys/class/drm/card0/error (20140411)
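
The symptom Kenneth describes above (a sensible-looking batch whose first DWords are all 0xFFFFFFFF) can be checked mechanically once the batch contents have been extracted from an error state. The following is only a minimal sketch: the function names, the `min_run` threshold, and the sample values are hypothetical, and it does not parse the /sys/class/drm/card0/error format itself.

```python
def count_leading_smashed(dwords, marker=0xFFFFFFFF):
    """Count how many DWords at the start of the batch equal the marker."""
    count = 0
    for dword in dwords:
        if dword != marker:
            break
        count += 1
    return count


def looks_overwritten(dwords, min_run=4):
    """Heuristic (assumed threshold): flag a batch whose first
    min_run DWords are all 0xFFFFFFFF."""
    return count_leading_smashed(dwords) >= min_run


# Hypothetical sample data, not taken from any attached error state.
smashed_batch = [0xFFFFFFFF] * 8 + [0x12345678, 0x00000001]
healthy_batch = [0x12345678, 0x00000001, 0x00000000, 0x00000000]

print(looks_overwritten(smashed_batch))
print(looks_overwritten(healthy_batch))
```

A real batch could legitimately start with other values, so a positive result is only a hint to look at the dump more closely.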
Our current theory is that Mesa is allocating insufficient memory for the MCS buffers, and the GPU is running off the end of those and trampling whatever happens to come after them in memory. In this case, that's the batchbuffer. In MCS speak, 0xff means "this part of the buffer is clear."

*** Bug 77376 has been marked as a duplicate of this bug. ***

For what it's worth, these problems should go away if you upgrade to the upcoming X server stable releases: either 1.15.1 or 1.14.6. (Those should be coming out any day now.) KWin is erroneously getting an 8x multisampled visual/fbconfig due to a server-side GLX bug, and it really doesn't want one.

That said, I think winsys multisampling is still broken, so this is a real bug. Upgrading the X server is probably the best path for users right now, while we figure out what's going on.

Another easier workaround in the meantime is to ask KWin to use EGL:

KWIN_OPENGL_INTERFACE=egl kwin --replace &

*** Bug 76763 has been marked as a duplicate of this bug. ***

*** Bug 77109 has been marked as a duplicate of this bug. ***

*** Bug 76063 has been marked as a duplicate of this bug. ***

*** Bug 77256 has been marked as a duplicate of this bug. ***

Kenneth, could you point me to the exact xserver commit(s) which should fix the bug?

*** Bug 77392 has been marked as a duplicate of this bug. ***

(In reply to comment #14)
> Kenneth, could you point me to exact xserver commit(s) which should fix the
> bug?

Sure. It's "glx: Clear new FBConfig attributes to 0 by default.":
http://cgit.freedesktop.org/xorg/xserver/commit/?id=96a28e9c914d7ae9b269f73a27b99cbd3c465ac8

*** Bug 77429 has been marked as a duplicate of this bug. ***

*** Bug 76039 has been marked as a duplicate of this bug. ***

Eric posted a Mesa patch which fixes the corruption/hangs:
http://lists.freedesktop.org/archives/mesa-dev/2014-April/057818.html

Apparently, it was indeed a problem with our multisample control buffer handling. With that patch, window system multisampling works reliably (at least for me.)

Thank you all for the excellent data, and sorry for the trouble!

That said, KWin really should not be using 8x multisampling - it adds a lot of unnecessary overhead. I'd still strongly recommend upgrading to X server 1.15.1 or 1.14.6, which fix the GLX bug which caused KWin to do this. Or, you can continue working around that bug by using KWIN_OPENGL_INTERFACE=egl.

Any of the above should fix this issue.

commit 7ae870211ddc40ef6ed209a322c3a721214bb737
Author: Eric Anholt <eric@anholt.net>
Date: Mon Apr 14 16:52:43 2014 -0700

    i965: Fix buffer overruns in MSAA MCS buffer clearing.

*** Bug 76907 has been marked as a duplicate of this bug. ***

*** Bug 76704 has been marked as a duplicate of this bug. ***

*** Bug 76574 has been marked as a duplicate of this bug. ***

*** Bug 76491 has been marked as a duplicate of this bug. ***

Neither KWIN_OPENGL_INTERFACE=egl nor KWIN_OPENGL_INTERFACE=egl kwin --replace & works for me.

Downgrading to Mesa 8 makes it work.

I am sorry for reopening this, but my original bug report (about graphical corruption - white stripes, #77256) has been marked as a duplicate of this bug.

(In reply to comment #25)
> KWIN_OPENGL_INTERFACE=egl or KWIN_OPENGL_INTERFACE=egl kwin --replace & do
> not work to me.
>
> Downgrade to mesa 8 makes it work.
>
> I am sorry for reopen it, but my original bug report (about graphical
> corruptions - white stripes #77256 ) has been marked as duplicate of this
> bug.

Perhaps your KWin doesn't have EGL support, so that workaround fails.

The proper solution is to upgrade to Mesa 10.0.5, 10.1.1, or 10.2-rc1. That will fix the actual driver bug. If you upgrade to one of those releases, and still experience the problem, please reopen the bug.

I also highly recommend upgrading to X server 1.15.1 or 1.14.6. It isn't strictly necessary, but without it, things will be slow.

*** Bug 77256 has been marked as a duplicate of this bug. ***

*** Bug 78362 has been marked as a duplicate of this bug. ***

*** Bug 78531 has been marked as a duplicate of this bug. ***

Hi Kenneth,

I upgraded my Ubuntu trusty to today's state of the xorg-edgers PPA:

xserver-xorg-core 2:1.15.1-0ubuntu2
xserver-xorg-video-intel 2:2.99.911+git20140507.18416b51-0ubuntu0ricotz~trusty
libegl1-mesa:amd64 10.3.0~git20140514.8a9f5ecd-0ubuntu0sarvatt~trusty
libgl1-mesa-dri:amd64 10.3.0~git20140514.8a9f5ecd-0ubuntu0sarvatt~trusty

$ glxinfo | grep Mesa
client glx vendor string: Mesa Project and SGI
OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.3.0-devel
OpenGL version string: 3.0 Mesa 10.3.0-devel

I'm still experiencing problems. The garbage screen output has disappeared, but the screen is stuck for 5-20 secs every 1-3 minutes. The kernel ring buffer says:

[ 333.738942] [drm] stuck on render ring
[ 333.738951] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 333.738952] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 333.738953] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 333.738954] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 333.738955] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 339.736311] [drm] stuck on render ring
[ 339.736354] [drm:i915_context_is_banned] *ERROR* context hanging too fast, declaring banned!
[ 348.720483] [drm] stuck on render ring
[ 354.729941] [drm] stuck on render ring
[ 360.727429] [drm] stuck on render ring
[ 360.727558] [drm:i915_context_is_banned] *ERROR* context hanging too fast, declaring banned!
[ 439.102608] init: tty1 main process ended, respawning
[ 468.674648] [drm] stuck on render ring
[ 647.707294] [drm] stuck on render ring
[ 716.739191] [drm] stuck on render ring

So I'm still having trouble, even with xserver 1.15.1 and a recent mesa version.

Regards,
Benjamin

Created attachment 99107 [details]
cat /sys/class/drm/card0/error > error.txt
Hi Benjamin,

The batch buffer in your error state does not contain garbage - instead, it looks like you're hanging on a PIPE_CONTROL after a 3DPRIMITIVE. Plus, you don't have the graphical corruption described here. And you definitely have the fix for this bug.

So, I think you're hitting a different GPU hang unrelated to this report. I've gone ahead and created bug #78751 to track your issue.

Closing this one again.

--Ken |
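
For readers trying to work out whether their own Mesa already carries the fix, the release numbers given earlier in this thread (10.0.5, 10.1.1, 10.2-rc1) can be turned into a small check. This is a rough sketch under stated assumptions: `has_mcs_fix` is a hypothetical helper, it only looks at the leading version numbers, it treats every release before 10.0 as not containing the fix commit, and it cannot know about distribution backports or git snapshots taken before the commit landed.

```python
import re

# From the thread: the fix shipped in 10.0.5 on the 10.0 branch,
# in 10.1.1 on the 10.1 branch, and in 10.2-rc1 and later.
FIRST_FIXED_PATCH = {(10, 0): 5, (10, 1): 1}


def has_mcs_fix(version):
    """Return True if a Mesa version string names a release with the fix.

    Only the leading "major.minor[.patch]" part is inspected, so suffixes
    such as "-5", "-rc1" or "-devel" are ignored (an assumption).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if not match:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    if (major, minor) >= (10, 2):
        return True
    first_fixed = FIRST_FIXED_PATCH.get((major, minor))
    return first_fixed is not None and patch >= first_fixed


for candidate in ("10.1.0-5", "10.1.1", "10.0.5", "10.3.0-devel"):
    print(candidate, has_mcs_fix(candidate))
```

The version strings in the loop are the ones that appear in this report (the Debian package version and the glxinfo output) plus the two fixed stable releases.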
Created attachment 97097 [details]
dump in /sys/class/drm/card0/error

System: Debian Jessie (testing) 64bit
kernel: Debian 3.13.7-1 x86_64 GNU/Linux
DE: KDE 4.11.3
CPU: Intel i5-4440
mainboard: Asus Z87-plus
monitor: 1980*1200@60Hz via DVI + CRT 1024*768@85Hz via VGA/DSUB
no discrete graphics card installed

Yesterday I turned on the standard graphical effects in KDE for simple testing, and then I shut down the system.

Today, after some hours of work, I opened Iceweasel, Icedove, VirtualBox & one VM, and I joined a Google Hangout for a video chat.

During the video chat:
1) I pressed ALT+TAB to switch to another window
2) I saw the graphical effect "Scambiatore circolare" (this is the Italian name: I'm sorry, but I can't find the exact English name of this effect!)
3) I didn't see all the windows of the open programs
4) the system froze for a moment (more or less one second).

This is the first time I have seen this, and it happened with the DVI monitor. Before today there was no problem with a 19" TFT monitor plus the CRT.

This is in my dmesg:

[20044.219079] [drm] stuck on render ring
[20044.219082] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[20044.219083] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[20044.219084] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[20044.219085] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[20044.219085] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[20044.221558] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0xad9e000 ctx 1) at 0xad9e004

Attached /sys/class/drm/card0/error dump file (compressed).

Bye,
Mario
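
When a report contains many "stuck on render ring" lines, it can help to see how far apart the hangs are. The sketch below assumes dmesg output in the bracketed-timestamp form quoted in this report; the helper names are hypothetical, and the sample lines are copied from the kernel log excerpts in this thread.

```python
import re

# Matches e.g. "[ 333.738942] [drm] stuck on render ring"
HANG_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[drm\] stuck on render ring")


def hang_timestamps(dmesg_text):
    """Return the timestamps (seconds since boot) of each reported hang."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def hang_intervals(stamps):
    """Return the gaps in seconds between consecutive hangs."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


sample = """\
[ 333.738942] [drm] stuck on render ring
[ 333.738951] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 339.736311] [drm] stuck on render ring
[ 339.736354] [drm:i915_context_is_banned] *ERROR* context hanging too fast, declaring banned!
[ 348.720483] [drm] stuck on render ring
"""

stamps = hang_timestamps(sample)
for gap in hang_intervals(stamps):
    print("%.1f s between hangs" % gap)
```

On the sample lines this reports gaps of roughly 6 and 9 seconds. The interval alone does not identify the cause; the error state dump is still needed for that.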