| Summary: | [SNB regression vsync] WAIT_FOR_EVENT hangs | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Martin Jørgensen <mkj> |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | normal | | |
| Priority: | medium | CC: | bugs, jens, mthode |
| Version: | unspecified | | |
| Hardware: | x86-64 (AMD64) | | |
| OS: | Linux (All) | | |
| Whiteboard: | | | |
| i915 platform: | | i915 features: | |
Attachments: |
Please attach /sys/kernel/debug/dri/0/i915_error_state and also Xorg.0.log.

Created attachment 82713 [details]
error state dump just after second GPU hang

Created attachment 82714 [details]
Xorg.0.log file just after second GPU hang

Can you please try running with i915.i915_enable_rc6=0 on the kernel command line?

It didn't seem to help. A little dmesg grep:

[ 1.709971] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
[ 145.524766] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 145.524770] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
[ 145.533684] [drm:kick_ring] *ERROR* Kicking stuck wait on render ring

I have the error state for this hang if you need it.

Yes, can you attach the error-state as well. I expect it to be the same, but there is no harm in double checking. Can you also please attach lspci -vvv -s 0:0:2?

Created attachment 82737 [details]
error state after hang. rc6 is off.
Created attachment 82738 [details]
output from: lspci -vvv -s 0:0:2
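
The hang signature quoted in the dmesg excerpt above (a hangcheck line followed by one or more "Kicking stuck wait" lines) can be pulled out of a full dmesg dump mechanically. The following is only a minimal sketch: the function name `summarize_hangs` and the `SAMPLE` text are illustrative assumptions, and the patterns are copied from the messages quoted in this report, so other kernel versions may word them differently.

```python
import re

# Patterns copied from the dmesg lines quoted in this report (assumption:
# other kernel versions may phrase these messages differently).
HANG_RE = re.compile(r"\[drm:i915_hangcheck_hung\] \*ERROR\* Hangcheck timer elapsed")
KICK_RE = re.compile(r"\[drm:kick_ring\] \*ERROR\* Kicking stuck wait on (\w+) ring")
TIME_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def summarize_hangs(dmesg_text):
    """Return (hang_timestamps, kicks_per_ring) for a dmesg dump."""
    hangs = []
    kicks = {}
    for line in dmesg_text.splitlines():
        stamp = TIME_RE.match(line)
        if HANG_RE.search(line):
            # Record the kernel timestamp of each hangcheck report, if present.
            hangs.append(float(stamp.group(1)) if stamp else None)
        kick = KICK_RE.search(line)
        if kick:
            # Count how often each ring had to be kicked.
            ring = kick.group(1)
            kicks[ring] = kicks.get(ring, 0) + 1
    return hangs, kicks


# Hypothetical sample built from lines quoted in this report.
SAMPLE = """\
[ 1.709971] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
[ 145.524766] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 145.533684] [drm:kick_ring] *ERROR* Kicking stuck wait on render ring
"""

if __name__ == "__main__":
    print(summarize_hangs(SAMPLE))
```

Feeding it the output of `dmesg` saved to a file gives a quick count of hangs and kicks per ring, which may help when comparing kernels or driver versions.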

What WM do you use? Is the hang only associated with vlc/games, or general desktop usage? (Trying to work out if every attempt to vsync fails or if it is sporadic.) Otherwise the cmd looks valid and I don't see anything special about your machine - though I have to admit to not having used vsync on pipe B myself, but given the bugs that were fixed involving pipe B I think others are using it successfully...

I'm running Enlightenment E17, without composition. The hangs only occur with vlc/mplayer/games. All my daily applications run without "special effects" and work fine. I will try testing with E17 composition (fancy AIGLX stuff) on pipe 0 and on pipe 1 with my external monitor.

I have some DisplayPort->HDMI adapters between my monitor and my laptop. Maybe some of them disturb the timing/ddc/edid-whatever?

(In reply to comment #11)
> I have some DisplayPort->HDMI adapters between my monitor and my laptop.
> Maybe some of them disturb the timing/ddc/edid-whatever?

The messages involved here are all internal to the GPU (actually between the display engine and the GPU...) so should not be affected by external configuration.

I've done the testing. It seems not possible to make the GPU hang when using pipe 0 (LVDS), with or without composition, no matter the application.

Using pipe 1 (HDMI2), I have to turn composition on. Otherwise the GPU hangs consistently, no matter the application.

(In reply to comment #13)
> I've done the testing. It seems not possible to make the GPU hang when using
> pipe 0 (LVDS), with or without composition, no matter the application.
>
> Using pipe 1 (HDMI2), I have to turn composition on. Otherwise the GPU hangs
> consistently, no matter the application.

This just in! I managed to get a hang running E17 + composition with VLC fullscreen running some movie.

(In reply to comment #14)
> (In reply to comment #13)
> > I've done the testing. It seems not possible to make the GPU hang when using
> > pipe 0 (LVDS), with or without composition, no matter the application.
> >
> > Using pipe 1 (HDMI2), I have to turn composition on. Otherwise the GPU hangs
> > consistently, no matter the application.
>
> This just in! I managed to get a hang running E17 + composition with VLC
> fullscreen running some movie.

On pipe 1 :)

Created attachment 82749 [details]
Read-after-write patch

Created attachment 82750 [details] [review]
Even more paranoid read-after-write

Created attachment 82751 [details] [review]
One more variant

Neither of the patches fixed the problem.

glxgears generally hangs if I resize the window too fast. If I resize the window slowly enough no hangs occur. No issues when using composition.

I have error_state and Xorg.0.log for attachment 82749 [details] and 82750 if needed.

I also get these new messages in dmesg after I apply either of the 2 patches:

[ 51.077013] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 255
[ 51.101981] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 255
[ 51.102002] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 255
[ 51.102009] stereo mode not supported
[ 51.102013] stereo mode not supported

Attachment 82751 [details] (One more variant) fails to patch.

Created attachment 82762 [details]
failed output of patch (One more variant, 82751)

(In reply to comment #19)
> Neither of the patches fixed the problem.
>
> glxgears generally hangs if I resize the window too fast. If I resize the
> window slowly enough no hangs occur. No issues when using composition.
>
> I have error_state and Xorg.0.log for attachment 82749 [details] and 82750
> if needed.
>
> I also get these new messages in dmesg after I apply either of the 2 patches:
>
> [ 51.077013] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid,
> remainder is 255
> [ 51.101981] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid,
> remainder is 255
> [ 51.102002] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid,
> remainder is 255
> [ 51.102009] stereo mode not supported
> [ 51.102013] stereo mode not supported
>
> Attachment 82751 [details] (One more variant) fails to patch.

Never mind the "new messages" part. It seems I get these errors with the stock driver as well.

*** Bug 67856 has been marked as a duplicate of this bug. ***

Created attachment 83787 [details]
error state dump - first hang

*** Bug 69099 has been marked as a duplicate of this bug. ***

I can't reproduce this bug anymore after the last X stack upgrade in Gentoo. I have:
kernel 3.11.3
mesa 9.1.6
xorg-server 1.14.3
xf86-video-intel 2.21.15
libdrm 2.4.46

I've recently also upgraded my Gentoo system to the same package versions as Jens Pranaitis, except the kernel which is 3.10.7-r1. I'm currently running xmonad, and the moment I run glxgears the GPU hangs big time. I still need to compile xf86-video-intel with uxa and disable sna to avoid hangs.

You do realise that UXA only works because it doesn't support vsync? You can also turn off vsync for SNA with either
Option "SwapbuffersWait" "false"
or
Option "VSync" "false"

At this moment in time, the most likely reason is that you have an early version of SNB prior to the retrofitted vsync support. :|

No, I did not realize that. I will try that instead. Which version has the retrofitted vsync? >2.21.15?

I was referring to the GPU. (They didn't add vsync back into the design until very, very late.)

*sigh* I guess I'm too sleepy. For some reason I read SNB as SNA. I believe my graphics card is HD 3000 (GT2). Is that recent enough for vsync?

The question is which stepping of the GPU - as it was never clear which stepping received the fix, which stepping actually went to market first and how to query the stepping from userspace...

I have noticed that when switching monitors with xrandr, if I shut them all off before turning them up again I don't hit this bug.

#doesn't hit
xrandr --output LVDS1 --off
xrandr --output HDMI1 --auto
xrandr --output LVDS1 --off --output VGA1 --auto --output HDMI1 --auto --right-of VGA1

instead of this

#does hit
xrandr --output HDMI1 --auto
xrandr --output LVDS1 --off
xrandr --output LVDS1 --off --output VGA1 --auto --output HDMI1 --auto --right-of VGA1

I experience a similar issue with 3.7+ kernels that seems to have the same underlying cause.
This is a Lenovo X220t, (early) SNB graphics.
I use Metacity without compositing, in a dual-monitor setup (internal LVDS + external VGA1).

Launching glxgears on VGA1 causes lots of "Kicking stuck wait on render ring" messages in dmesg and the framerate (of the entire screen) drops to something like 0.1 fps.
Glxgears on LVDS is completely fine (60fps). Launching it on LVDS and then dragging the window to VGA1, however, also causes the problem.
The lockups start the moment the center of the glxgears window crosses monitors, so I suspected a vsync issue, and found this bug report.

Adding 'Option "SwapbuffersWait" "false"' works around the problem.
As does enabling compositing, or switching to UXA.

(In reply to comment #33)
> I experience a similar issue with 3.7+ kernels that seems to have the same
> underlying cause.
> This is a Lenovo X220t, (early) SNB graphics.
> I use Metacity without compositing, in a dual-monitor setup (internal LVDS +
> external VGA1).
>
> Launching glxgears on VGA1 causes lots of "Kicking stuck wait on render
> ring" messages in dmesg and the framerate (of the entire screen) drops to
> something like 0.1 fps.
> Glxgears on LVDS is completely fine (60fps). Launching it on LVDS and then
> dragging the window to VGA1, however, also causes the problem.
> The lockups start the moment the center of the glxgears window crosses
> monitors, so I suspected a vsync issue, and found this bug report.
>
> Adding 'Option "SwapbuffersWait" "false"' works around the problem.
> As does enabling compositing, or switching to UXA.

Please make sure that you're on the latest version of the intel DDX. Early versions of the snb vsync support had bugs with dual-head configurations. If you still experience hangs then please file a new bug report - for us it's much easier to mark duplicates than to untangle multiple bugs in the same report. And there are countless reasons to hang a gpu ;-)

Presuming fixed with latest ddx version.
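
The xrandr ordering reported earlier in this thread (switch outputs off before bringing any others up) can be scripted so the order is never mixed up by hand. This is only a sketch under assumptions: the helper name `xrandr_off_first` is hypothetical, it generalizes one reporter's observation rather than a confirmed fix, and the output names (LVDS1, VGA1, HDMI1) are the ones from that reporter's machine. It builds the command lines without running them.

```python
import shlex


def xrandr_off_first(disable, enable):
    """Build xrandr argv lists: disable outputs first, then enable the rest.

    `disable` is a list of output names to switch off.
    `enable` is a list of (output, extra_args) pairs to bring up with --auto.
    """
    commands = []
    for output in disable:
        # All "off" commands are emitted before any output is enabled.
        commands.append(["xrandr", "--output", output, "--off"])
    for output, extra in enable:
        commands.append(["xrandr", "--output", output, "--auto", *extra])
    return commands


if __name__ == "__main__":
    # Output names are assumptions taken from the reporter's setup.
    plan = xrandr_off_first(
        ["LVDS1"],
        [("VGA1", []), ("HDMI1", ["--right-of", "VGA1"])],
    )
    for argv in plan:
        print(shlex.join(argv))
```

To apply the plan for real, each argv list could be passed to `subprocess.run(argv, check=True)`; printing first makes it easy to review the order before touching the display configuration.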

Using kernel 3.12 and ddx 2.99.906, I'm still able to provoke a hang when resizing the glxgears window intensively, but it seems to be a lot harder to make the GPU hang now. Fullscreen applications, vlc, and GL apps don't seem to hang anymore, but I haven't tested it much. I'll open a new one if it gets severe.

The glxgears bug could be the infamous blorp death bug, which is fixed in the 9.2.4 release of mesa iirc. So please check that you have that; if not, please file a new bug with the error state attached.

Closing verified+fixed.
Created attachment 82694 [details]
grep'ed dmesg

After I upgraded from the 3.7 kernel to 3.8+ kernels, my GPU has started hanging. Sometimes it recovers, sometimes it doesn't. Sometimes it recovers after a single "kick", sometimes after 4-5 kicks.

I upgraded the Xorg intel driver from 2.20.13 to 2.21.12 because I wasn't even able to resize accelerated windows without hangs. But it still hangs sometimes when running something accelerated (VLC, some game).

I also get 2 new errors in my dmesg after I upgraded from kernel 3.7.

From kernel 3.8 I get:
[drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5

From kernel 3.9 and 3.10 I get:
[drm] Wrong MCH_SSKPD value: 0x16040307
[drm] This can cause pipe underruns and display issues.
[drm] Please upgrade your BIOS to fix this.

I'm running up-to-date Gentoo + a few keyworded packages on a Thinkpad T420.