Bug 27070

Summary: [i915] Page table errors with empty ringbuffer
Product: DRI
Reporter: Geir Ove Myhr <gomyhr>
Component: DRM/Intel
Assignee: Chris Wilson <chris>
Status: CLOSED FIXED
QA Contact:
Severity: normal    
Priority: medium    
Version: DRI git   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description: Flags
i915_error_state from intel-drm-next kernel: none
intel_error_decode output on the previously attached i915_error_state: none
dmesg output: none
Xorg.0.log: none

Description Geir Ove Myhr 2010-03-14 09:13:51 UTC
We have several automatically reported GPU errors on 915GM on Ubuntu 10.04 with PGTBL_ER: 0x00000010 and a ringbuffer which is all zero.
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/525517
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/532225
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/532381
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/533364
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/533876
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/535613

Kim Tyler from the first report says that he doesn't notice anything wrong except for the error messages in dmesg and the automatic bug report. Kim has also tested with the drm-intel-next kernel, and the i915_error_state from that test will be attached.

Architecture: i386
DistroRelease: Ubuntu 10.04
Frequency: Once a day.
HibernationDevice: RESUME=UUID=feb1300b-9c1a-42f4-a3f1-ea81c5d2b518
InstallationMedia: Ubuntu 10.04 "Lucid Lynx" - Alpha i386 (20100304)
MachineType: ASUSTeK Computer INC. 900
Package: xserver-xorg-video-intel 2:2.9.1-1ubuntu13
PackageArchitecture: i386
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-16-generic root=UUID=4a909b82-8b3d-48b0-ad9a-806db385f535 ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-16.25-generic
Uname: Linux 2.6.32-16-generic i686
dmi.bios.date: 07/07/2008
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0802
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: 900
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: x.xx
dmi.chassis.asset.tag: 0x00000000
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTek Computer INC.
dmi.chassis.version: x.x
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0802:bd07/07/2008:svnASUSTeKComputerINC.:pn900:pvr0704:rvnASUSTeKComputerINC.:rn900:rvrx.xx:cvnASUSTekComputerINC.:ct10:cvrx.x:
dmi.product.name: 900
dmi.product.version: 0704
dmi.sys.vendor: ASUSTeK Computer INC.
glxinfo: Error: [Errno 2] No such file or directory
system:
 distro: Ubuntu
 codename: lucid
 architecture: i686
 kernel: 2.6.32-16-generic
Comment 1 Geir Ove Myhr 2010-03-14 09:14:55 UTC
Created attachment 34034 [details]
i915_error_state from intel-drm-next kernel
Comment 2 Geir Ove Myhr 2010-03-14 09:18:18 UTC
Created attachment 34035 [details]
intel_error_decode output on the previously attached i915_error_state
Comment 3 Geir Ove Myhr 2010-03-14 09:19:13 UTC
Created attachment 34036 [details]
dmesg output
Comment 4 Geir Ove Myhr 2010-03-14 09:19:38 UTC
Created attachment 34037 [details]
Xorg.0.log
Comment 5 Geir Ove Myhr 2010-03-14 10:25:05 UTC
One thing I noticed: in the automatically reported intel_gpu_dump outputs the ringbuffer starts at 0x00000000, while in the intel_error_decode output it starts at 0x007bf000. In both cases the head is at ACTHD: 0x00000000, which doesn't make much sense in the latter case.
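
As a rough illustration of why a head at 0x00000000 looks odd when the ring starts at 0x007bf000, the sketch below checks whether an address falls inside a ring's address range. The helper name and the 128 KiB ring size are assumptions made for illustration, not values taken from the attached dumps, and the check ignores the case where the active head points into a batch buffer outside the ring.

```python
# Assumed ring size for illustration only; the real size is not stated in this report.
ASSUMED_RING_SIZE = 128 * 1024


def head_in_ring(head, ring_start, ring_size):
    """Return True if `head` lies inside [ring_start, ring_start + ring_size)."""
    return ring_start <= head < ring_start + ring_size


# Ring at 0x00000000 (intel_gpu_dump case): a head of 0 is inside the ring.
print(head_in_ring(0x00000000, 0x00000000, ASSUMED_RING_SIZE))  # True

# Ring at 0x007bf000 (intel_error_decode case): a head of 0 is outside the ring.
print(head_in_ring(0x00000000, 0x007BF000, ASSUMED_RING_SIZE))  # False
```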
Comment 6 Geir Ove Myhr 2010-03-14 10:27:56 UTC
I should have been more specific in Kim's description of the symptoms. This is what he writes:

Please note that the PC display is not hanging. It continues on and displays a screen either as tty7 or tty8. If the display is tty7, there are framebuffer error messages above the tty login prompt on tty1 ; if the display is on tty8, there are framebuffer error messages on the tty7, with tty7 not showing any other text/graphics.
Comment 7 Geir Ove Myhr 2010-03-14 11:57:14 UTC
I have marked two more downstream bug reports as duplicates of this one:
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/533139
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/535454

In these bugs there were some slight differences: 
- In the first one the CPU has started writing to the ringbuffer, but the GPU hasn't started reading yet (ACTHD: 0x00000000).
- In the second bug report the GPU has also started reading, and it has PGTBL_ER: 0x00000001 instead of PGTBL_ER: 0x00000010.
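
The two PGTBL_ER values seen in these reports, 0x00000010 and 0x00000001, each have a single bit set, in different positions. A minimal sketch for listing the set bits of such a register value follows. The helper name is hypothetical, and the sketch deliberately does not name what each bit means, since that depends on the chipset documentation.

```python
def set_bits(value):
    """Return the indices of the bits set in a 32-bit register value."""
    return [bit for bit in range(32) if value & (1 << bit)]


print(set_bits(0x00000010))  # [4]
print(set_bits(0x00000001))  # [0]
```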
Comment 8 Chris Wilson 2010-06-06 06:20:34 UTC
The pair of patches to ensure that the framebuffers are correctly aligned are now upstream, as of 2.6.35-rc2:

commit fd2e8ea597222b8f38ae8948776a61ea7958232e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 9 14:14:36 2010 +0000

    drm/i915: Increase fb alignment to 64k
    
    An untiled framebuffer must be aligned to 64k. This is normally handled
    by intel_pin_and_fence_fb_obj(), but the intelfb_create() likes to be
    different and do the pinning itself. However, it aligns the buffer
    object incorrectly for pre-i965 chipsets causing a PGTBL_ERR when it is
    installed onto the output.
    
    Fixes:
      KMS error message while initializing modesetting -
      render error detected: EIR: 0x10 [i915]
      http://bugs.freedesktop.org/show_bug.cgi?id=22936
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>

commit ac0c6b5ad3b3b513e1057806d4b7627fcc0ecc27
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu May 27 13:18:18 2010 +0100

    drm/i915: Rebind bo if currently bound with incorrect alignment.
    
    Whilst pinning the buffer, check that that its current alignment
    matches the requested alignment. If it does not, rebind.
    
    This should clear up any final render errors whilst resuming,
    for reference:
    
      Bug 27070 - [i915] Page table errors with empty ringbuffer
      https://bugs.freedesktop.org/show_bug.cgi?id=27070
    
      Bug 15502 -  render error detected, EIR: 0x00000010
      https://bugzilla.kernel.org/show_bug.cgi?id=15502
    
      Bug 13844 -  i915 error: "render error detected"
      https://bugzilla.kernel.org/show_bug.cgi?id=13844
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Signed-off-by: Eric Anholt <eric@anholt.net>
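
The two commit messages above describe a 64k alignment requirement for untiled framebuffers and a rebind when a bound object does not match the requested alignment. The sketch below illustrates only the arithmetic behind that idea. It is not the kernel code: the function names and the example offsets are hypothetical.

```python
FB_ALIGNMENT = 64 * 1024  # 64k, as stated in the first commit message


def is_aligned(offset, alignment):
    """True if `offset` is a multiple of `alignment` (a power of two)."""
    return (offset & (alignment - 1)) == 0


def needs_rebind(current_offset, requested_alignment):
    """An object bound at a misaligned offset has to be rebound."""
    return not is_aligned(current_offset, requested_alignment)


def align_up(offset, alignment):
    """Round `offset` up to the next multiple of `alignment` (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


# Hypothetical offset that is 4k-aligned but not 64k-aligned.
print(needs_rebind(0x00011000, FB_ALIGNMENT))   # True
print(hex(align_up(0x00011000, FB_ALIGNMENT)))  # 0x20000

# Hypothetical offset that already satisfies the 64k requirement.
print(needs_rebind(0x00020000, FB_ALIGNMENT))   # False
```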
