Bug 71328

Summary: [HSW] Stuck scanline wait on render
Product: xorg
Component: Driver/intel
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Javran Cheng <javran.c>
Assignee: Chris Wilson <chris>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: christian.roeder, javran.c
Whiteboard:
i915 platform:
i915 features:

Attachments (description, flags):
i915_error_state (flags: none)
i915 error from Ferry (905 version - 906 and git no longer 'detect' the error unfortunately but it's still there) (flags: none)
my very basic xorg.conf (flags: none)

Description Javran Cheng 2013-11-07 01:56:23 UTC
Created attachment 88799 [details]
i915_error_state

Whenever I start playing video with mplayer, my screen freezes, and switching to a tty with Ctrl+Alt+F1 takes a long time. The output of `dmesg` contains several lines like:

[  112.471206] hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[  115.453374] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  121.438200] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  125.461444] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  129.464673] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  133.467901] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  137.471130] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  141.474358] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
[  141.505914] bbswitch: enabling discrete graphics


I've googled a lot, but nothing I found solves my problem.

I've uploaded `/sys/kernel/debug/dri/0/i915_error_state`; I hope it helps.

I'm ready to provide any log necessary.

kernel:
3.12.0-gentoo #1 SMP PREEMPT Wed Nov 6 18:07:56 EST 2013 x86_64 Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz GenuineIntel GNU/Linux

x11-libs/libva-intel-driver 2.99.905-r1
Comment 1 Javran Cheng 2013-11-07 02:03:03 UTC
Sorry, that should be `x11-drivers/xf86-video-intel` instead of `x11-libs/libva-intel-driver`.
Comment 2 Daniel Vetter 2013-11-07 09:12:28 UTC
Either I have a case of not enough coffee, or the dump looks funny - it seems to have advanced into the ring already by the time we've dumped ...

I guess video playback stutters awfully, with multi-second freezes?
Comment 3 Daniel Vetter 2013-11-07 09:14:28 UTC
Also please grab the latest xf86-video-intel from git and retest with that, just to make sure it's not fixed already.
Comment 4 Chris Wilson 2013-11-07 09:16:59 UTC
The dump is funny because the kick is before the capture. *sigh*
Comment 5 Chris Wilson 2013-11-07 13:13:52 UTC
commit 68cef6cd281572fcfb76a341dc45b7c8e5baffe6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 7 13:09:25 2013 +0000

    sna/gen7: Request secure batches for Haswell vsync
    
    Since commit 8ff8eb2b38dc705f5c86f524c1cd74a811a7b04c
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Mon Sep 9 16:23:04 2013 +0100
    
        sna/hsw: Scanline waits require both DERRMR and forcewake
    
    we have been emitting LRI to enable vsync on the render ring. This
    requires a privileged batch buffer, and whilst we were checking for
    kernel support, we forgot to actually tell the kernel to submit the
    batch with the right privileges.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71328
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
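
A minimal sketch of the kind of mistake the commit message describes, written in Python for illustration only. The helper `execbuffer_flags` and its parameters are hypothetical and are not taken from the driver; the two flag values are assumed to match `I915_EXEC_RENDER` and `I915_EXEC_SECURE` in the kernel's i915_drm.h. The point is that detecting kernel support for secure batches is not enough: the flag also has to be set on the submission itself.

```python
# Illustrative only: this is not the actual xf86-video-intel code.
# Flag values are assumed to match the kernel's i915_drm.h.
I915_EXEC_RENDER = 1 << 0
I915_EXEC_SECURE = 1 << 9


def execbuffer_flags(kernel_has_secure_batches: bool,
                     batch_needs_privilege: bool,
                     request_secure: bool = True) -> int:
    """Return execbuffer flags for a render-ring batch (hypothetical helper)."""
    flags = I915_EXEC_RENDER
    if batch_needs_privilege:
        # Step 1: the capability check that the driver already performed.
        if not kernel_has_secure_batches:
            raise RuntimeError("kernel cannot run privileged batches")
        # Step 2: the part that was forgotten, modelled here by
        # request_secure=False: actually asking for a privileged submission.
        if request_secure:
            flags |= I915_EXEC_SECURE
    return flags


if __name__ == "__main__":
    buggy = execbuffer_flags(True, True, request_secure=False)
    fixed = execbuffer_flags(True, True)
    print(f"without the request: {buggy:#x}, with the request: {fixed:#x}")
```

With `request_secure=False` the capability check still passes, but the batch goes out without the privileged bit, which is the situation the commit message says the fix addresses.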
Comment 6 Christian Roeder 2013-11-15 22:54:58 UTC
I just compiled the intel driver from master (6e9a8c5ae2883ca21d117ac672dd8a55b3429dc1), which should contain the commit you mentioned in your comment, but I still get

[drm:ring_stuck] *ERROR* Kicking stuck wait on render ring

when using an external display attached via DVI on a docking station or via DisplayPort. The machine is a Lenovo X220 with Sandy Bridge.

Is there any debugging information I should collect?

I am also not sure if the issue is related to

https://bugzilla.kernel.org/show_bug.cgi?id=62311
Comment 7 Chris Wilson 2013-11-16 08:09:02 UTC
The /sys/class/drm/card0/error
Comment 8 Christian Roeder 2013-11-16 22:32:18 UTC
(In reply to comment #7)
> The /sys/class/drm/card0/error

Unfortunately, there is no error recorded:
$ cat /sys/class/drm/card0/error 
no error state collected

I checked right after the hang occurred.

I also noticed that it only happens if I attach an external display either via DP/DVI-Adapter or via the DVI output on the docking station, and run chromium. Other software in user space does not seem to trigger it. Also, it does not happen if just using the internal display.

I run kernel 3.12.0 from Arch Linux, but it also happened with 3.11.* before.

Is there any more info I should provide?
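
The check described in this comment can be scripted. The following is a minimal Python sketch, assuming the file path and the "no error state collected" text quoted above; the function names are hypothetical, and reading the file may require root on some systems.

```python
from pathlib import Path

# Text reported when nothing has been captured (quoted from this thread).
NO_ERROR_SENTINEL = "no error state collected"


def has_error_state(text: str) -> bool:
    """Return True if the error-file contents look like a real GPU dump."""
    stripped = text.strip()
    return bool(stripped) and stripped.lower() != NO_ERROR_SENTINEL


def read_error_state(path: str = "/sys/class/drm/card0/error"):
    """Return the dump text, or None if unreadable or nothing was collected."""
    try:
        text = Path(path).read_text(errors="replace")
    except OSError:
        # Missing file or insufficient permissions.
        return None
    return text if has_error_state(text) else None


if __name__ == "__main__":
    dump = read_error_state()
    print("error state captured" if dump else "no error state available")
```

Running it right after a hang gives a quick yes/no on whether there is a dump worth attaching to the report.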
Comment 9 Chris Wilson 2013-11-17 00:03:03 UTC
Maybe you need to switch to a more recent kernel for the GPU dump to be captured on the kick. Without /sys/class/drm/card0/error I can't diagnose the problem - and importantly double check that your system is behaving how I expect.
Comment 10 Ferry 2013-11-18 08:32:01 UTC
Hi,

I have the same issues, although I actually want to use XBMC.

Unfortunately, in XBMC the video player behaves much like mplayer (fine for the first couple of seconds, after that only 1 frame every 3-8 seconds), but XBMC does not log any errors whatsoever.

With mplayer I'm seeing the same errors as described here, also with the 906 version of the driver. I haven't tested that with XBMC yet but will do so this evening. Is there anything I can do in addition to posting the error state (which is probably only generated by mplayer, not by XBMC's media player)? If so, I can do that as well.

Do note that if I attach a monitor to the DVI output, all is well (Samsung 20" 1600x1200). If I attach my FHD Panasonic plasma from 2009, the problem occurs (whether with a DVI->HDMI cable or HDMI->HDMI). I also noticed there's no audio whatsoever with the DVI->HDMI cable. There are 3 HDMI audio outputs; I presume these correspond to DVI, HDMI and DisplayPort (in that order).

I also noticed there are some small commits after 906. I can see if I can test with those; there's bound to be a git ebuild for the driver around somewhere (probably in the x11 overlay, which I can snatch it from).

Thanks
Comment 11 Ferry 2013-11-20 10:32:00 UTC
Hi,

sorry it took a bit longer - I'm quite sick at the moment.

Anyway - things seem to have become worse. I'm now running the latest git version, 2.99.906-21-gb14228f. XBMC already didn't log any errors (not with 906, 905, etc. either), but mplayer did output errors earlier.

It does no longer.

However, the issue itself hasn't changed. Although I must say I had it working properly with 906 once. Think it was a timing thing. Turned on the PC first and the television somewhat later. It might have to do with the time the television came 'online'. Haven't been able to reproduce it, having the television on before powering on the computer or after always results in the same now, about 1 frame every 5-6 secs.

I'd provide logs, but nothing is output anymore. I'll provide the dump I caught when it was still on 905.

Nov  4 20:56:07 www kernel: [   58.732555] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:11 www kernel: [   62.734225] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:18 www kernel: [   69.741146] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:22 www kernel: [   73.738820] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:26 www kernel: [   77.740491] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:30 www kernel: [   81.742158] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:34 www kernel: [   85.743838] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:42 www kernel: [   93.747174] [drm:ring_stuck] *ERROR* Kicking stuck wait on render ring
Nov  4 20:56:42 www kernel: [   93.747190] [drm:i915_hangcheck_elapsed] *ERROR* no progress on render ring
Nov  4 20:56:42 www kernel: [   93.747195] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
Nov  4 20:56:42 www kernel: [   93.768363] [drm:i915_set_reset_status] *ERROR* render ring hung inside bo (0x898000 ctx 0) at 0x89802c
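
To put a number on how often the ring is being kicked, the kernel timestamps in a log like the one above can be compared. This is a minimal Python sketch, assuming the message format shown in this thread; the function name `kick_intervals` is hypothetical, and the sample text is copied from the first two log lines above.

```python
import re
from typing import List

# Kernel timestamp such as "[   58.732555]" followed by the ring_stuck message.
KICK_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:ring_stuck\] \*ERROR\* "
    r"Kicking stuck wait on render ring"
)


def kick_intervals(log_text: str) -> List[float]:
    """Return the seconds elapsed between consecutive 'Kicking stuck wait' lines."""
    times = [float(m.group(1)) for m in KICK_RE.finditer(log_text)]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    sample = (
        "Nov  4 20:56:07 www kernel: [   58.732555] [drm:ring_stuck] "
        "*ERROR* Kicking stuck wait on render ring\n"
        "Nov  4 20:56:11 www kernel: [   62.734225] [drm:ring_stuck] "
        "*ERROR* Kicking stuck wait on render ring\n"
    )
    print(kick_intervals(sample))
```

The same function works on plain `dmesg` output, since it only relies on the bracketed kernel timestamp and not on the syslog prefix.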
Comment 12 Ferry 2013-11-20 10:33:53 UTC
Created attachment 89520 [details]
i915 error from Ferry (905 version - 906 and git no longer 'detect' the error unfortunately but it's still there)
Comment 13 Chris Wilson 2013-11-20 10:37:18 UTC
The reason that .906 doesn't generate this error is that it contains the fix. You are seeing something else then.
Comment 14 Ferry 2013-11-20 10:46:32 UTC
The something else only occurs on my Panasonic plasma TV. Not that I have an extensive range of devices to test with here. I might try the television in the bedroom, but that'll take quite some work due to the current mounting.

It runs fine on the DVI monitor.

I forgot to mention that when it worked I was using a DVI->HDMI cable. I'm not sure that mattered, because I can't reproduce it there either, and I went back to an HDMI cable (and mplayer with -vo gl) so I have audio. I can't seem to get audio working over the DVI->HDMI cable. I have a friend with a similar setup where I'll test the cable just to make sure that's not it (I doubt it; all the required pins on the DVI side seem to be there).

What can I do? Without errors (and not being a hardware dev) it's quite hard for me to diagnose as well. I can provide SSH access or similar to the machine, it doesn't contain any private info yet, just a bunch of movies and the distro. Not sure on how well you can reproduce the issues with xv like that tho'.
Comment 15 Chris Wilson 2013-11-20 10:57:45 UTC
Describe the issue, or at least confirm if it is a similar screen freeze. Describe the actual setup, how is the second display configured? Is it an extended desktop? Which is the primary, what image?

Attaching your Xorg.0.log and dmesg (preferably with drm.debug=6) in the failing config is always vital. Once you have an accurate description of the problem it would be best to file a new bug so that it is no longer confused with the broken HSW vsync.
Comment 16 Ferry 2013-11-20 11:30:10 UTC
Hi,

the logging is more useful for the new bug report I suppose (have to gather it too).

Setup:

3.12 with ZFS modules
Gentoo ~amd64
Gigabyte GA-H81M-HD3 mainboard
Core i3 4130
4GB RAM

Issue:

Video playback runs at about 1 frame per 5-8 seconds (estimated), which is far from watchable. This happens with both XBMC and mplayer (except mplayer with -vo gl, which does work). I also didn't notice the issue on my Samsung SyncMaster 204B monitor (20" 1600x1200, dated), which I used for the initial installation. The plasma TV is supposed to be the primary (and only) monitor.

When using mplayer (without -vo gl) I got the "Kicking stuck wait on render ring" messages. XBMC gave no errors whatsoever, not in Xorg.0.log either (but I haven't run with drm.debug=6 yet).

I said it's in XV, but that's probably incorrect, as it did work originally on my Samsung monitor. More accurately, it occurs with XV and not with GL video output in mplayer. This is true for both DVI->HDMI (although that worked properly once - can't reproduce) and HDMI->HDMI (never seen this working, but as stated I can't reproduce it on DVI->HDMI either, so that might not mean anything).

There is no monitor configuration. I believe that hasn't been required for a long time thanks to KMS. The X11 config is very basic (I just ran X11 -config or -configure to create a base template; I've been doing this for years and it seems to be fine with the monitor, but not the TV).

I will retest everything with current git and 3.12, and might test with the monitor and perhaps the TV in the bedroom as well. I doubt changing cables will help, considering the issue exists on DVI->HDMI too (that is already a separate cable, not a converter plug). I'll run with drm.debug=6, of course.

Is there anything else you'd like me to do whilst I'm at it? :).
Comment 17 Ferry 2013-11-20 11:31:02 UTC
Created attachment 89525 [details]
my very basic xorg.conf
Comment 18 Ferry 2013-11-20 11:32:19 UTC
It might be important to add it runs fine for the first ~6-10 seconds or so, it stalls after that.
Comment 19 Daniel Vetter 2013-11-20 14:49:45 UTC
Yeah, please make a new bug report so that we don't get lost in massive confusion. And if you indeed see this new/leftover issue only after video playback has worked for a few seconds, carefully testing/confirming that with different configurations would be good.
Comment 20 Ferry 2013-11-20 20:23:37 UTC
Hi,

it seems audio related, at least now with the newer versions (906 and up where the *ERROR* Kicking stuck wait on render ring messages are gone).

And oddly so. I've tried the Samsung LCD television in the bedroom with XBMC and both the DVI->HDMI cable and the HDMI->HDMI cable, and in both cases I get audio and stutter-free video.

On the Samsung monitor (no audio thus) all runs well with DVI.

On the Panasonic plasma TV, XBMC always has stuttering video and no audio on the HDMI ports. Audio does work when using the onboard sound card's analog output, and XBMC does list the TV in the audio output options on the correct devices (DVI #03/HDMI #07). I also don't hear anything when there's no movie playing and there should be sounds while navigating through the menus (these are there on the Samsung LCD).

With mplayer in -vo gl mode, audio works fine on the HDMI (07) output, but there is no audio on the DVI output (03). Video works on both outputs in that case, but mplayer stutters too when using the default vo (it falls back to xv; there are issues loading va-api).

Kinda confused on where to report this (here/alsa/xbmc/mplayer/ffmpeg/...) ;). Want me to create it here or elsewhere?
Comment 21 Ferry 2013-11-20 20:26:23 UTC
Oh, I should have mentioned that video playback with XBMC on the Panasonic is also fine when using the onboard analog audio. The video stuttering only happens when it's using one of the HDMI outputs.
Comment 22 Daniel Vetter 2013-11-21 08:46:02 UTC
Retesting with latest drm-intel-nightly from http://cgit.freedesktop.org/~danvet/drm-intel/ might be worth a shot, we've recently fixed a few things for audio-over-hdmi.

Otherwise I'd report this on the kernel Bugzilla (bugs.kernel.org) against the alsa sound drivers. Since it works with other sound modules there's a good chance it's a kernel issue.
Comment 23 Timo Jyrinki 2014-01-02 08:38:44 UTC
I can confirm that cherry-picking 68cef6cd281572fcfb76a341dc45b7c8e5baffe6 on top of Ubuntu's 2.99.904 fixes the described XVideo problem on my Haswell. Reporting this to a downstream bug report too.
