Bug 21488

Summary: [GM45] [UXA] [KMS] lockup while using opera
Product: xorg Reporter: Arkadiusz Miskiewicz <arekm>
Component: Driver/intel Assignee: Eric Anholt <eric>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: critical    
Priority: medium CC: bgamari, freedesktop, gronslet, kedgedev, mnemo, sa
Version: git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description                  Flags
Proposed fix for this hang   none
Output of intel_gpu_dump     none

Description Arkadiusz Miskiewicz 2009-04-30 02:19:08 UTC
System environment: 
-- chipset: GM45
-- system architecture: 64-bit
-- xf86-video-intel: 417f3784b7fae8de3559c7607a2de60661a6a448 (master from 2 days ago)
-- xserver: 1.6.1 (final release)
-- mesa: git master state as of 20090418
-- libdrm: 2.4.9
-- kernel: linus git master + anholt drm-intel-next as of 2 days ago
-- Linux distribution: PLD/Linux
-- Machine or mobo model: Thinkpad T400 with switchable graphics
-- Display connector: LVDS

Reproducing steps:

Reproducing reliably is very hard. The lockup usually happens while browsing in Opera (lwn.net the last time), about once every 2-3 days. X is frozen, but the mouse cursor still moves. Killing X doesn't help; a reboot is required to get back to a working state.

0x00000032c4aca327 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0  0x00000032c4aca327 in ioctl () from /lib64/libc.so.6
#1  0x00007f1b938ea243 in drmIoctl (fd=7, request=25688, arg=0x0) at xf86drm.c:187
#2  0x00007f1b938ea546 in drmCommandNone (fd=7, drmCommandIndex=<value optimized out>) at xf86drm.c:2313
#3  0x00007f1b93470af6 in I830BlockHandler (i=0, blockData=<value optimized out>, pTimeout=0x7fff9e4b9f48, pReadmask=0x7d1ea0) at i830_driver.c:2223
#4  0x000000000052d4b8 in AnimCurScreenBlockHandler (screenNum=0, blockData=0x0, pTimeout=0x7fff9e4b9f48, pReadmask=0x7d1ea0) at animcur.c:222
#5  0x00000000004f93fe in compBlockHandler (i=0, blockData=0x0, pTimeout=0x7fff9e4b9f48, pReadmask=0x7d1ea0) at compinit.c:158
#6  0x000000000044b170 in BlockHandler (pTimeout=0x7fff9e4b9f48, pReadmask=0x7d1ea0) at dixutils.c:384
#7  0x00000000004e7661 in WaitForSomething (pClientsReady=0x3bda950) at WaitFor.c:215
#8  0x00000000004474f0 in Dispatch () at dispatch.c:367
#9  0x000000000042d63d in main (argc=7, argv=0x7fff9e4ba128, envp=<value optimized out>) at main.c:397
(gdb)
Comment 1 Arkadiusz Miskiewicz 2009-04-30 02:22:55 UTC
Bugzilla doesn't allow attaching text/plain files larger than 1 MB, so the intel_gpu_dump output is here:

http://carme.pld-linux.org/~arekm/intel_gpu_dump-bug-21488.txt
Comment 2 Eric Anholt 2009-05-06 15:37:40 UTC
Very nice, this looks like a hang I have hit as well and was waiting to reproduce (it takes a lot longer than 2-3 days for me, and tends to happen only when I'm on the train with no other machines around): the batchbuffer hung where it's bound at the last page of the aperture.  If you had more dumps confirming this, that would be great.  In the meantime, I should whip up a patch tomorrow or so to try.
Comment 3 Jesse Barnes 2009-05-11 11:21:56 UTC
Adjusting severity: crashes & hangs should be marked critical.
Comment 4 Eric Anholt 2009-05-12 15:31:31 UTC
Created attachment 25806
Proposed fix for this hang

Posted a patch for review:
http://lists.freedesktop.org/archives/intel-gfx/2009-May/002374.html

Haven't actually tested it.
Comment 5 Eric Anholt 2009-05-12 15:58:17 UTC
*** Bug 21249 has been marked as a duplicate of this bug. ***
Comment 6 Eric Anholt 2009-05-12 16:01:09 UTC
*** Bug 21382 has been marked as a duplicate of this bug. ***
Comment 7 Eric Anholt 2009-05-12 16:16:19 UTC
*** Bug 21240 has been marked as a duplicate of this bug. ***
Comment 8 Eric Anholt 2009-05-15 16:16:00 UTC
Pulled into 2.6.30:

commit 13f4c435ebf2a7c150ffa714f3b23b8e4e8cb42f
Author: Eric Anholt <eric@anholt.net>
Date:   Tue May 12 15:27:36 2009 -0700

    drm/i915: Don't allow binding objects into the last page of the aperture.
    
    This should avoid a class of bugs where the hardware prefetches past the
    end of the object, and walks into unallocated memory when the object is
    bound to the last page of the aperture.
    
    fd.o bug #21488
    
    Signed-off-by: Eric Anholt <eric@anholt.net>
Comment 9 Eric Anholt 2009-05-15 17:10:44 UTC
*** Bug 21621 has been marked as a duplicate of this bug. ***
Comment 10 Milan Bouchet-Valat 2009-05-21 05:12:11 UTC
*** Bug 21414 has been marked as a duplicate of this bug. ***
Comment 11 Robert Huitl 2009-07-30 05:50:20 UTC
I still get GPU lockups, about once a day. Most of the time I can still move the mouse, sometimes it freezes. My configuration:

- GFX hardware: Intel Corporation Mobile 915GM/GMS/910GML Express Graphics Controller (rev 03) (Thinkpad X41)
- Gentoo, 32 bit kernel + userland
- xorg-server-1.6.2-r1
- mesa-7.5-r1
- xf86-video-intel-2.7.99.902-r1
- Kernel 2.6.30.2, KMS enabled, additional patches applied:
   i915: Save/restore cursor state on suspend/resume.
   i915: add ignore lvds quirk info for AOpen Mini PC
   i915: apply G45 vblank count code to all G4x chips and fix max_frame_count
   i915: avoid non-atomic sysrq execution
   i915: Skip lvds with Aopen i945GTt-VFA
   i915: Hook connector to encoder during load detection (fixes tv/vga detect)
   i915: initialize fence registers to zero when loading GEM
   i915: Set SSC frequency for 8xx chips correctly

There are no suspicious messages in dmesg, syslog, Xorg.0.log or ~/.xsession-errors when the freeze occurs. The Xorg backtrace looks like the one in bug 21249:

#0  0xffffe424 in __kernel_vsyscall ()
#1  0xb7ac0719 in ioctl () from /lib/libc.so.6
#2  0xb797fb68 in drm_intel_gem_bo_map_gtt () from /usr/lib/libdrm_intel.so.1
#3  0xb7910f31 in ?? () from /usr/lib/xorg/modules/drivers//intel_drv.so
#4  0x083c5858 in ?? ()
#5  0x00000000 in ?? ()

I attached the output of intel_gpu_dump.
Comment 12 Robert Huitl 2009-07-30 05:51:25 UTC
Created attachment 28191
Output of intel_gpu_dump
Comment 13 Eric Anholt 2009-08-03 09:47:21 UTC
Robert, open your own bug for your own issue.  You don't have this bug.
