Bug 65526 - [snb] death in blorp during video playback
Summary: [snb] death in blorp during video playback
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Ian Romanick
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-06-08 01:00 UTC by Tom Horsley
Modified: 2017-02-10 22:39 UTC
CC List: 0 users

See Also:
i915 platform:
i915 features:


Description Tom Horsley 2013-06-08 01:00:05 UTC
I have no idea if I'm reporting this on the right component or not, but the dmesg error mentions a dri file, so I picked dri.

I'm using Fedora 18, and since the first 3.8 kernel showed up (and now 3.9 as well), my X display keeps disappearing and the text console shows up instead. If I press Ctrl-Alt-F2 then Ctrl-Alt-F1, I'm back in X, and it doesn't look like X ever knew it was gone.

That one is hard to reproduce, but then I found a new symptom that is very similar. When I play a full-screen video with mplayer using the -vo gl_nosw video output option, everything freezes, 99% of the time within 20 minutes. After a few seconds the sound starts up again, but the video remains frozen until I do the same Ctrl-Alt-F2, Ctrl-Alt-F1 trick, at which point the movie is playing just fine again (the audio is even in sync).

Because the movie playing is a reliable way to reproduce this, I finally found some info in the dmesg output:

[ 1221.110322] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[ 1221.110332] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state

I'm guessing it is the recovery from that hang that leaves the video pointing at the "wrong" frame buffer and the console switch is getting me back to the right one.
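
For reference, the two hang messages quoted above can be filtered out of a kernel log with a short shell helper. This is only a sketch: the function name `find_gpu_hangs` is made up for illustration, and it assumes the i915 messages use exactly the wording shown in this report (other kernel versions may word them differently).

```shell
# Hypothetical helper (not part of any driver tooling): print only the
# i915 hang-related lines from kernel log text read on standard input.
# The two patterns are taken from the messages quoted in this report.
find_gpu_hangs() {
    grep -E 'GPU hung|i915_error_state'
}

# Typical use on a live system (reading the kernel log may require root):
#   dmesg | find_gpu_hangs
```

If the filter prints nothing, no hang with this wording was logged since boot.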

All the files for this (including the error state it mentioned in dmesg) can be found attached to the fedora bug I initially filed:

https://bugzilla.redhat.com/show_bug.cgi?id=958326
Comment 1 Chris Wilson 2013-06-08 06:30:44 UTC
There was a known death in blorp on snb that was fixed just recently. Can you make sure you have the latest Mesa (9.1.3 required)?
Comment 2 Tom Horsley 2013-06-08 11:55:03 UTC
I can't tell what version I have :-(. I use whatever is in the fedora repos and the version numbers on the fedora rpms don't appear to have any obvious correlation with the version numbers on the releases described on the mesa web site. Here's the fedora rpm list on my system:

[root@zooty ~]# rpm -q -a | fgrep -i mesa
mesa-libGLU-devel-9.0.0-1.fc18.x86_64
mesa-libgbm-9.2-0.7.20130528.fc18.i686
mesa-libGLES-9.2-0.7.20130528.fc18.x86_64
mesa-libGL-devel-9.2-0.7.20130528.fc18.x86_64
mesa-libEGL-devel-9.2-0.7.20130528.fc18.x86_64
mesa-libglapi-9.2-0.7.20130528.fc18.i686
mesa-libGLU-9.0.0-1.fc18.x86_64
mesa-filesystem-9.2-0.7.20130528.fc18.i686
mesa-libgbm-9.2-0.7.20130528.fc18.x86_64
mesa-dri-drivers-9.2-0.7.20130528.fc18.x86_64
mesa-dri-drivers-9.2-0.7.20130528.fc18.i686
mesa-libEGL-9.2-0.7.20130528.fc18.x86_64
mesa-libGL-9.2-0.7.20130528.fc18.x86_64
mesa-libxatracker-9.2-0.7.20130528.fc18.x86_64
mesa-libEGL-9.2-0.7.20130528.fc18.i686
mesa-filesystem-9.2-0.7.20130528.fc18.x86_64
mesa-libglapi-9.2-0.7.20130528.fc18.x86_64
mesa-libGL-9.2-0.7.20130528.fc18.i686
mesa-libGLU-9.0.0-1.fc18.i686

Perhaps that 20130528 string that appears in a lot of them matches the April 30, 2013 mesa release (which was 9.1.2 and is the latest release mentioned on the front page of the mesa3d.org web site)?
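
As a rough aid for reading package names like the ones listed above, the eight-digit field can be pulled out and printed as a date. This is a sketch under assumptions: `snapshot_date` is a hypothetical helper, and it assumes the field directly before `.fcNN` is a YYYYMMDD snapshot date, which the names alone do not prove.

```shell
# Hypothetical helper: print the eight-digit field that precedes ".fcNN"
# in a Fedora package name as YYYY-MM-DD. Prints nothing when the name
# has no such field (for example mesa-libGLU-9.0.0-1.fc18.x86_64).
# Assumption: that field is a snapshot date.
snapshot_date() {
    printf '%s\n' "$1" |
        sed -n 's/.*\.\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\.fc[0-9]*\..*/\1-\2-\3/p'
}

snapshot_date "mesa-libGL-9.2-0.7.20130528.fc18.x86_64"   # prints 2013-05-28

# The running driver can also be asked directly, if glxinfo is installed:
#   glxinfo | grep "OpenGL version string"
```

The glxinfo line reports the version of the Mesa build actually in use, which may be easier to compare against upstream release numbers than the package name.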
Comment 3 Ian Romanick 2013-06-08 21:12:23 UTC
(In reply to comment #2)
> mesa-libGL-9.2-0.7.20130528.fc18.x86_64

9.2 suggests to me that these are snapshots from some random place on Mesa master from around May 28th, 2013.... so who knows what's in there.
Comment 4 Tom Horsley 2013-06-08 21:42:27 UTC
Yeah, I finally downloaded the Fedora source rpm, and the docs/relnotes directory does contain a 9.1.3.html file as well as a 9.2.html file, so it looks as if the source Fedora uses is some kind of git snapshot taken on May 28th, 2013 (guessing from the date string).

Which implies it does indeed have all the fixes from 9.1.3 (though who knows what regressions since then :-).
Comment 5 Tom Horsley 2013-06-09 21:13:23 UTC
Just as a silly experiment to verify that the problem is specific to the Intel video driver, I stuck an Nvidia card in the system, and it seems able to play the movie all the way through with no problems. So at this point, with the old Fedora 16 pre-3.8 kernel working fine on Intel video, and the new Fedora 18 3.9 kernel working fine with Nvidia but failing with Intel, I'm pretty positive this is an Intel driver regression that showed up in the 3.8 kernel timeframe.
Comment 6 Tom Horsley 2013-06-14 17:20:07 UTC
Fedora 19 just got a new mesa update with all the rpms having 20130610 in their version (so I guess it is a new snapshot from June 10th). Doesn't seem to help - the hangs still happen when playing movies full screen.
Comment 7 Tom Horsley 2013-06-30 13:03:42 UTC
I've added info to the redhat bugzilla mentioned at the start of this bug. Apparently the original problem I noticed (finding the system displaying the console rather than the X GUI) is correlated with heavy load on the cpu.

For the first time in a few weeks, I had some big video files to transcode last night, and I found it this morning displaying the last console output instead of the normal X GUI.

I'm now suspecting the driver thinks something has timed out and does some kind of reset, but really the system was just very very busy and whatever it was waiting for was just being delayed a bit more than normal.

(This feeling is reinforced by browsing the Intel video mailing list, which seems to consist primarily of dueling patches arguing about what should be considered a timeout and what should decide that progress is actually being made and the video system isn't hung after all. I guess it is too complicated to actually fix the real source of the hangs so you no longer need to worry about trying to detect them :-).
Comment 8 Daniel Vetter 2013-10-28 18:15:08 UTC
Please test Ken's snb blorp fixes from

http://cgit.freedesktop.org/~kwg/mesa/log/?h=snbfixes
Comment 9 Annie 2017-02-10 22:39:14 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

