Bug 58899

Summary: [HSW] GPU hung when start X in Ubuntu, with specific kernel command
Product: DRI
Reporter: Du Yan <yanx.du>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED NOTOURBUG
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: medium
CC: haihao.xiang, ouping.zhang, yakui.zhao, yi.sun
Version: unspecified
Keywords: regression
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:

Attachments:
dmesg info (flags: none)
errorstatus (flags: none)

Description Du Yan 2012-12-31 06:19:22 UTC
Created attachment 72327 [details]
dmesg info

Environment:
--------------
platform: HSW
Kernel: (drm-intel-nightly)071f0a705c081ed6b1d61b6fa8be969a5b15ddf8
Libva: (staging)2e11d2273b2974a7d1959cbcaf8db5b8e9aedd9e
Intel-driver: (staging)16dc6293002995da07148279f3846a2a9749ded3


Bug Info:
--------------
After booting the system and playing a stream with mplayer, a GPU hang is reported in dmesg:
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung


Steps:
--------------
1. Boot to System
2. xinit&
3. mplayer -nosound -fps 30 -va vaapi -vo vaapi BA3_SVA_C.264
4. check the dmesg
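
Step 4 can be automated with a small script. The following is a minimal sketch, not part of the original report: it assumes the hang always shows up with the `i915_hangcheck_hung` message quoted above, and the function name `find_gpu_hangs` is hypothetical.

```python
import re

# Matches the hangcheck message quoted in this report.
# Assumption: other kernel versions may word the message differently.
HANG_PATTERN = re.compile(r"\[drm:i915_hangcheck_hung\].*GPU hung")


def find_gpu_hangs(dmesg_text):
    """Return the lines of dmesg_text that report an i915 GPU hang."""
    return [line for line in dmesg_text.splitlines()
            if HANG_PATTERN.search(line)]


# Example using the message from the bug description.
sample_log = (
    "[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung\n"
    "some unrelated kernel message\n"
)
hangs = find_gpu_hangs(sample_log)
print(len(hangs), "hang message(s) found")
```

To use it on a live system, pass the captured output of `dmesg` to `find_gpu_hangs` instead of `sample_log`.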
Comment 1 Du Yan 2012-12-31 06:20:42 UTC
Created attachment 72328 [details]
errorstatus
Comment 2 Chris Wilson 2012-12-31 09:50:16 UTC
How very 845g.
Comment 3 Ouping Zhang 2013-01-04 01:55:55 UTC
Do you mean QA needs to verify this issue on the 845g platform?
BTW, this issue can only be reproduced on HSW.
(In reply to comment #2)
> How very 845g.
Comment 4 haihao 2013-01-04 03:36:29 UTC
Could you double check that the issue is caused by mplayer -vo vaapi? I can't reproduce this issue on my HSW.
Comment 5 Ouping Zhang 2013-01-04 04:18:37 UTC
It is a Kernel bug. 
(In reply to comment #4)
> Could you double check that the issue is caused by mplayer -vo vaapi? I can't
> reproduce this issue on my HSW.
Comment 6 Ouping Zhang 2013-01-04 04:34:37 UTC
HSW:Shark Bay Desktop Beta SDP (Flathead Creek): B0 stepping (id=0x40660, rev 02), Lynx Point 02 (B0 stepping) and Host bridge id=0x0d04 (rev 02) 4Cores/4Thread, CPU 2.0GHz, 0xD26 2000MHz
Comment 7 Gordon Jin 2013-01-04 04:51:00 UTC
Kernel: (drm-intel-nightly)cdb96764a45f87e4614df1b16d68b2ccb9806f57 good
Kernel: (drm-intel-nightly)071f0a705c081ed6b1d61b6fa8be969a5b15ddf8 bad
Comment 8 Ouping Zhang 2013-01-04 09:03:29 UTC
Libdrm:(master)libdrm-2.4.40-1-g7d42b49c0cf19dbb4531cd84efae51f95db2eea1
Mesa:(master)bb284669f85a32900bfec648d68ba4c4300772f4
Xserver:(master)xorg-server-1.13.0-135-g011f8458805e443ac9130865d2840a929a00cabf
Xf86_video_intel:(master)2.20.13-3-g66eb0adffa63ef8ece7621ba90dc96af91549612
Cairo:		(master)62b795fe52c73ad58101c101aa77449f4b61a576
   
Kernel: (drm-intel-nightly)071f0a705c081ed6b1d61b6fa8be969a5b15ddf8
Libva: (staging)2e11d2273b2974a7d1959cbcaf8db5b8e9aedd9e
Intel-driver: (staging)16dc6293002995da07148279f3846a2a9749ded3

HSW:Shark Bay Desktop Beta SDP (Flathead Creek): B0 stepping (id=0x40660, rev 02), Lynx Point 02 (B0 stepping) and Host bridge id=0x0d04 (rev 02) 4Cores/4Thread, CPU 2.0GHz, 0xD26 2000MHz
Comment 9 Chris Wilson 2013-01-04 12:15:57 UTC
Was it the same symptomatic incoherency? With the regression suggestion, is it then bisectable? The symptom suggested a missing w/a or sync-flush, the regression tag suggests potentially otherwise.
Comment 10 Ouping Zhang 2013-01-05 05:30:54 UTC
We can find a good commit.
Kernel: (drm-intel-nightly)cdb96764a45f87e4614df1b16d68b2ccb9806f57 good
Kernel: (drm-intel-nightly)071f0a705c081ed6b1d61b6fa8be969a5b15ddf8 bad
But the good commit "cdb96764a45f87e4614df1b16d68b2ccb9806f57" can no longer be found in the drm-intel-nightly branch.
With the latest kernel, the issue can still be reproduced. But if "drm.debug=0xe" is added to the kernel command line, the issue can't be reproduced.

(In reply to comment #9)
> Was it the same symptomatic incoherency? With the regression suggestion, is
> it then bisectable? The symptom suggested a missing w/a or sync-flush, the
> regression tag suggests potentially otherwise.
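
For reference, the `drm.debug=0xe` value mentioned in comment 10 is a bitmask of DRM debug categories. The sketch below decodes it, assuming the category bits used by kernels of that era (CORE 0x01, DRIVER 0x02, KMS 0x04, PRIME 0x08); that mapping is an assumption not stated in this report, and the helper name `decode_drm_debug` is hypothetical.

```python
# Assumed drm.debug category bits for 3.7-era kernels (not taken from this report).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items())
            if value & bit]


enabled = decode_drm_debug(0xe)
print(enabled)
```

Under this assumption, 0xe enables driver, KMS and PRIME debug output but not core DRM messages. The extra logging changes timing, which may be one reason a hang stops reproducing with it enabled.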
Comment 11 Gordon Jin 2013-01-06 05:52:59 UTC
Yi, can we locate the specific kernel branch, then try bisecting?
Comment 12 Yi Sun 2013-01-06 08:37:28 UTC
(In reply to comment #11)
> Yi, can we locate the specific kernel branch, then try bisecting?

After a quick investigation, I am a little confused.
First, this issue only happens on Ubuntu; I couldn't reproduce it on Fedora.
Second, on Ubuntu, I found that the issue is related to the order of the kernel command-line parameters.
For example:
a. BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc7+ root=UUID=452d1dd9-ee6a-451b-8330-8d4a5595f851 ro quiet splash text drm.debug=0xe
b. BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc7+ root=UUID=452d1dd9-ee6a-451b-8330-8d4a5595f851 drm.debug=0xe ro quiet splash text

Boot system with command b can trigger this issue, but command b can't
Comment 13 Yi Sun 2013-01-06 08:41:27 UTC
sorry typo:
Boot system with command a can trigger this issue, but command b can't
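
The two command lines from comment 12 carry the same parameters and differ only in where `drm.debug=0xe` appears. A minimal sketch to confirm that, purely illustrative; the helper names are hypothetical, and it says nothing about why the order would matter:

```python
CMDLINE_A = ("BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc7+ "
             "root=UUID=452d1dd9-ee6a-451b-8330-8d4a5595f851 "
             "ro quiet splash text drm.debug=0xe")
CMDLINE_B = ("BOOT_IMAGE=/boot/vmlinuz-3.7.0-rc7+ "
             "root=UUID=452d1dd9-ee6a-451b-8330-8d4a5595f851 "
             "drm.debug=0xe ro quiet splash text")


def same_params_different_order(a, b):
    """True if both command lines hold the same parameters in a different order."""
    params_a, params_b = a.split(), b.split()
    return sorted(params_a) == sorted(params_b) and params_a != params_b


def param_position(cmdline, param):
    """Return the zero-based position of param on the command line."""
    return cmdline.split().index(param)


print(same_params_different_order(CMDLINE_A, CMDLINE_B))
print(param_position(CMDLINE_A, "drm.debug=0xe"),
      param_position(CMDLINE_B, "drm.debug=0xe"))
```

In command line a the debug parameter comes last, after `quiet splash text`; in command line b it comes right after `root=`.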
Comment 14 Gordon Jin 2013-01-07 06:46:32 UTC
Is this HSW specific?
Comment 15 Yi Sun 2013-01-17 05:12:43 UTC
I think this issue is due to some special setting in their environment. I tried to reproduce it on a freshly installed Ubuntu 12.04 64-bit system, using the same kernel command line as before, but failed. I don't think this can be called a DRM issue exactly.
Comment 16 Chris Wilson 2013-02-08 16:48:43 UTC
So we don't know how to reproduce the hang, nor whether it is a regression.
Comment 17 Gordon Jin 2013-02-18 05:59:45 UTC
Let's close it, unless Media QA can defend it.
Comment 18 Jari Tahvanainen 2016-11-03 12:27:49 UTC
Closing verified+notourbug. No activity ~4 years.
