Bug 51142

Summary: [GM965/GL960 regression] oops in switch_context
Product: DRI
Reporter: max <manikulin>
Component: DRM/Intel
Assignee: Ben Widawsky <ben>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: ben, chris, daniel, jbarnes
Version: XOrg git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments (description, flags):
dmesg, none
Disable contexts when HW doesn't support it., none

Description max 2012-06-15 23:19:29 UTC
Created attachment 63097 [details]
dmesg

I have experienced 4 hangs with the kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-experimental/2012-06-15-quantal/

One was during boot, when the splash screen should have been displayed. The three
others happened when I tried to launch the X server:
startx /usr/bin/ck-launch-session xterm
In three cases only the power button helped: I could not switch
to a console, and the magic SysRq keys did not work. The screen was blank.

Once the problem was less severe and my script (launched with a timeout)
was able to capture the logs. Dmesg:

[  587.488223] [drm:intel_update_fbc], fbc disabled per module param
[  588.087685] BUG: unable to handle kernel NULL pointer dereference at 00000010
[  588.087783] IP: [<f896baed>] i915_switch_context+0x5d/0xe0 [i915]
[  588.087838] *pdpt = 000000003472c001 *pde = 0000000000000000 
[  588.087890] Oops: 0000 [#1] SMP 
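
For triage, the "IP:" line above can be split into its parts mechanically. The following is only a minimal sketch: the regular expression and the helper name parse_oops_ip are illustrative assumptions written for the line format shown in this dmesg excerpt, not part of any kernel tool.

```python
import re

# Assumed pattern, written to match the "IP:" line format in the dmesg above:
#   IP: [<address>] symbol+0xoffset/0xsize [module]
OOPS_IP = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]+)>\] "
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_oops_ip(line):
    """Hypothetical helper: return the fields of an oops "IP:" line, or None."""
    m = OOPS_IP.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group("addr"), 16),   # faulting instruction address
        "symbol": m.group("symbol"),           # function containing it
        "offset": int(m.group("offset"), 16),  # byte offset into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # None for built-in code
    }


line = "[  588.087783] IP: [<f896baed>] i915_switch_context+0x5d/0xe0 [i915]"
info = parse_oops_ip(line)
print(info["symbol"], hex(info["offset"]), hex(info["size"]), info["module"])
```

Here the fault is 0x5d bytes into i915_switch_context (0xe0 bytes long) in the i915 module. Together with the fault address 00000010, this is consistent with a field being read through a NULL structure pointer, although the log alone does not say which pointer.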

i915_error_state and Xorg.0.log do not contain errors. Usually "apport"
puts its files in /var/crash, but there are no fresh dumps there now. So the other
crashes were due to kernel hangs.

I do not see the problem with older kernels, e.g.
drm-intel-experimental/2012-06-13

lspci
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 03)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (secondary) (rev 03)

ASUS F80L laptop

uname -i
i386

Ubuntu 12.04 + xorg-edgers ppa + drm-intel-experimental kernel

xserver-xorg-video-intel
Version: 2:2.19.0+git20120613.0db789e1-0ubuntu0sarvatt~precise
xserver-xorg-core
Version: 2:1.12.2+git20120605+server-1.12-branch.aaf48906-0ubuntu0ricotz~precise
libgl1-mesa-dri
Version: 8.1~git20120529.f92b2e5e-0ubuntu0sarvatt~precise
libdrm-intel1
Version: 2.4.35+git20120613.d1fcfb17-0ubuntu0sarvatt~precise
Comment 1 Ben Widawsky 2012-06-16 12:30:12 UTC
Created attachment 63116 [details] [review]
Disable contexts when HW doesn't support it.

Please give this patch a shot.
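
The idea in the patch title can be modeled roughly as follows. This is a sketch of the general guard pattern only, not the attached patch: the Device class, its fields, and both functions are hypothetical names, and the real driver is C code whose details are not shown in this report.

```python
class Device:
    """Hypothetical stand-in for per-device driver state."""

    def __init__(self, has_hw_contexts):
        self.has_hw_contexts = has_hw_contexts
        self.contexts_disabled = False
        self.default_context = None  # stays None when contexts are unsupported


def context_init(dev):
    # Assumed behaviour: on hardware without context support, record that
    # contexts are disabled instead of leaving a half-initialised state.
    if not dev.has_hw_contexts:
        dev.contexts_disabled = True
        return
    dev.default_context = {"id": 0}


def switch_context(dev):
    # The guard: return early rather than dereferencing a context
    # object that was never created.
    if dev.contexts_disabled:
        return None
    return dev.default_context["id"]


old_hw = Device(has_hw_contexts=False)
context_init(old_hw)
new_hw = Device(has_hw_contexts=True)
context_init(new_hw)
print(switch_context(old_hw), switch_context(new_hw))
```

Without the early return in switch_context, the old-hardware case would index into None, which is the Python analogue of the NULL pointer dereference in the report.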
Comment 2 max 2012-06-16 19:01:20 UTC
The problem is that I have no idea where the sources of the
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-experimental/
kernels reside. I cannot associate these kernels with either the Ubuntu
git repositories, people.freedesktop.org/~danvet/drm-intel,
or any other repository at freedesktop.
Comment 3 max 2012-06-17 09:53:42 UTC
With drm-intel-experimental/2012-06-17-quantal/ 
even [Alt+SysRq+b] does not help.
Comment 4 Ben Widawsky 2012-06-17 12:34:29 UTC
I've pushed the fix here for you to try. You'll need to work with #ubuntu, or build your own kernel to test it.

I'd really like to know your result. It does fix an issue I see on my ILK, so I suspect this is the problem.

http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=for-max
Comment 5 Daniel Vetter 2012-06-18 02:26:43 UTC
Fix is merged to drm-intel-next-queued (which is the branch the drm-intel-experimental ppa kernel is built from):

commit e158c5aa1776372cd751e2c395300a3a6ff0bc9c
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Sun Jun 17 09:37:24 2012 -0700

    drm/i915: disable contexts on old HW

Thanks for reporting this bug, please reopen if you still have issues.
Comment 6 max 2012-06-18 08:37:29 UTC
I have checked that the problem is reproducible at

commit 8e96d9c4d9843f00ebeb4a9b33596d96602ea101
Author: Ben Widawsky
Date:   Mon Jun 4 14:42:56 2012 -0700
    drm/i915: reset the GPU on context fini

but I cannot reproduce it with the following commit

commit 9a70e1d4e200333d39b1e8407cbcbdaf515b33d8
Author: Ben Widawsky
Date:   Sat Jun 16 05:21:13 2012 -0700
    drm/i915: disable contexts on old HW

Thank you for the fix.
