Bug 25510 - Blank X except for mouse cursor
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg 6.7.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assigned To: Jesse Barnes
Reported: 2009-12-08 03:02 UTC by Reinhard Karcher
Modified: 2010-01-12 01:30 UTC (History)

Attachments
gpu dump while X running with blank screen (44.65 KB, application/x-bzip)
2009-12-09 05:45 UTC, Reinhard Karcher

Description Reinhard Karcher 2009-12-08 03:02:12 UTC
On Debian linux-image-2.6.32-trunk-amd64 with xserver-xorg-video-intel 2:2.9.1-1, the screen is blank except for the mouse cursor. The cursor follows the movement of the mouse.
Everything works well on linux-image-2.6.31-1-amd64 and older kernels.
On the starting VT there are error messages:
../../../libdrm/intel/intel_bufmgr_gem.c:825: Error setting to CPU domain 96: Input/output error
../../../libdrm/intel/intel_bufmgr_gem.c:899: Error setting domain 773: Input/output error

From lspci:
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)                                                 
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)                                                        

From an old Xorg.log:
(--) PCI:*(0:0:2:0) 8086:2a02:10cf:13fe Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller rev 3, Mem @ 0xfc000000/1048576, 0xe0000000/268435456, I/O @ 0x00001800/8

(II) LoadModule: "dri"
(II) Loading /usr/lib/xorg/modules/extensions//libdri.so
(II) Module dri: vendor="X.Org Foundation"
        compiled for 1.6.5, module version = 1.0.0
        ABI class: X.Org Server Extension, version 2.0
(II) Loading extension XFree86-DRI
(II) LoadModule: "dri2"
(II) Loading /usr/lib/xorg/modules/extensions//libdri2.so
(II) Module dri2: vendor="X.Org Foundation"
        compiled for 1.6.5, module version = 1.1.0
        ABI class: X.Org Server Extension, version 2.0
(II) Loading extension DRI2
(==) Matched intel for the autoconfigured driver
(==) Assigned the driver to the xf86ConfigLayout
(II) LoadModule: "intel"
(II) Loading /usr/lib/xorg/modules/drivers//intel_drv.so
(II) Module intel: vendor="X.Org Foundation"
        compiled for 1.6.5, module version = 2.9.1
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 5.0
(II) intel: Driver for Intel Integrated Graphics Chipsets: i810,
        i810-dc100, i810e, i815, i830M, 845G, 852GM/855GM, 865G, 915G,
        E7221 (i915), 915GM, 945G, 945GM, 945GME, Pineview GM, Pineview G,
        965G, G35, 965Q, 946GZ, 965GM, 965GME/GLE, G33, Q35, Q33, GM45,
        4 Series, G45/G43, Q45/Q43, G41, B43, Clarkdale, Arrandale
(II) Primary Device is: PCI 00@00:02:0
(II) resource ranges after probing:
        [0] -1  0       0xffffffff - 0xffffffff (0x1) MX[B]
        [1] -1  0       0x000f0000 - 0x000fffff (0x10000) MX[B]
        [2] -1  0       0x000c0000 - 0x000effff (0x30000) MX[B]
        [3] -1  0       0x00000000 - 0x0009ffff (0xa0000) MX[B]
        [4] -1  0       0x0000ffff - 0x0000ffff (0x1) IX[B]
        [5] -1  0       0x00000000 - 0x00000000 (0x1) IX[B]
(II) Loading sub module "vgahw"
(II) LoadModule: "vgahw"
(II) Loading /usr/lib/xorg/modules//libvgahw.so
(II) Module vgahw: vendor="X.Org Foundation"
        compiled for 1.6.5, module version = 0.1.0
        ABI class: X.Org Video Driver, version 5.0
(II) Loading sub module "ramdac"
(II) LoadModule: "ramdac"
(II) Module "ramdac" already built-in
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: Searching for BusID pci:0000:00:02.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: drmOpenMinor returns 8
drmOpenByBusid: drmGetBusid reports pci:0000:00:02.0
(II) intel(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
(==) intel(0): Depth 24, (--) framebuffer bpp 32
(==) intel(0): RGB weight 888
(==) intel(0): Default visual is TrueColor
(II) intel(0): Integrated Graphics Chipset: Intel(R) 965GM
(--) intel(0): Chipset: "965GM"
(--) intel(0): Linear framebuffer at 0xE0000000
(--) intel(0): IO registers at addr 0xFC000000 size 1048576
(WW) intel(0): libpciaccess reported 0 rom size, guessing 64kB
(II) intel(0): the SDVO device with slave addr 70 is found on DVO 1 port
(II) intel(0): 2 display pipes available.


Reinhard
Comment 1 Chris Wilson 2009-12-09 03:31:59 UTC
You have a wedged GPU. Please follow the instructions on http://intellinuxgraphics.org/how_to_report_bug.html to grab a gpu dump. Thanks.
Comment 2 Reinhard Karcher 2009-12-09 05:45:12 UTC
Created attachment 31882
gpu dump while X running with blank screen

Gpu dump as requested.
Comment 4 mathieu.taillefumier 2009-12-26 07:55:09 UTC
I have had the same bug since the 2.6.32-rc1 kernel, and a linear search through Linus's tree narrowed down which patch seems to trigger it. The problem is triggered by one of the patches between commit 11670d3c93210793562748d83502ecbef4034765 (Xorg works with this commit included) and commit 44040f107e64d689ccd3211ac62c6bc44f3f0775 (Xorg stops working starting with this one). These two commits are two successive merges from the 2.6.32 merge window. I can boot either kernel on demand if needed.

I can also confirm that this bug is only triggered when the computer has 4 GB of memory (maybe more), not less. The dmesg output is identical for both kernels.

Comment 5 Reinhard Karcher 2009-12-26 10:50:48 UTC
I can confirm that the bug disappears if I boot with the parameter mem=3500M.
Comment 6 mathieu.taillefumier 2010-01-05 12:56:01 UTC
I finished a git bisect session and may have found the patch responsible for this bug. The end result is:

# bad: [17d857be649a21ca90008c6dc425d849fa83db5c] Linux 2.6.32-rc1
git bisect bad 17d857be649a21ca90008c6dc425d849fa83db5c
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect good 74fca6a42863ffacaf7ba6f1936a9f228950f657
# bad: [5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3] Staging: vt6656: Integrate vt6656 into build system.
git bisect bad 5d1fe0c98f2aef99fb57aaf6dd25e793c186cea3
# good: [13af7a6ea502fcdd4c0e3d7de6e332b102309491] netxen: update copyright
git bisect good 13af7a6ea502fcdd4c0e3d7de6e332b102309491
# good: [fdf82dc2e2d43cf135b5fd352dea523642bb553a] V4L/DVB (12549): v4l2: video device: Add FM TX controls default configurations
git bisect good fdf82dc2e2d43cf135b5fd352dea523642bb553a
# good: [a9bbd210a44102cc50b30a5f3d111dbf5f2f9cd4] Merge branch 'docs-next' of git://git.lwn.net/linux-2.6
git bisect good a9bbd210a44102cc50b30a5f3d111dbf5f2f9cd4
# good: [18240904960a39e582ced8ba8ececb10b8c22dd3] Merge branch 'for-linus3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
git bisect good 18240904960a39e582ced8ba8ececb10b8c22dd3
# bad: [ada3fa15057205b7d3f727bba5cd26b5912e350f] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
git bisect bad ada3fa15057205b7d3f727bba5cd26b5912e350f
# good: [ea47689e74a1637fac4f5fc44890f3662c976849] V4L/DVB (12720): em28xx-cards: Add vendor/product id for Kworld DVD Maker 2
git bisect good ea47689e74a1637fac4f5fc44890f3662c976849
# good: [5579fd7e6aed8860ea0c8e3f11897493153b10ad] Merge branch 'for-next' into for-linus
git bisect good 5579fd7e6aed8860ea0c8e3f11897493153b10ad
# bad: [1aaf2e59135fd67321f47c11c64a54aac27014e9] Merge branch 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 1aaf2e59135fd67321f47c11c64a54aac27014e9
# bad: [5e8d6b8bf94f1ffcb7e3c31b73284c20f297f191] agp: fix uninorth build
git bisect bad 5e8d6b8bf94f1ffcb7e3c31b73284c20f297f191
# bad: [91b8e3056bf9107b688eb076c9b804171364db71] intel-agp: Move repeated sglist free into separate function
git bisect bad 91b8e3056bf9107b688eb076c9b804171364db71
# bad: [176616814d700f19914d8509d9f65dec51a6ebf7] intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU
git bisect bad 176616814d700f19914d8509d9f65dec51a6ebf7
# good: [ff663cf8705bea101d5f73cf471855c85242575e] agp: Add generic support for graphics dma remapping
git bisect good ff663cf8705bea101d5f73cf471855c85242575e

and the final answer from git is:

176616814d700f19914d8509d9f65dec51a6ebf7 is first bad commit
commit 176616814d700f19914d8509d9f65dec51a6ebf7
Author: Zhenyu Wang <zhenyu.z.wang@intel.com>
Date:   Mon Jul 27 12:59:57 2009 +0100

    intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU
    
    When graphics dma remapping engine is active, we must fill
    gart table with dma address from dmar engine, as now graphics
    device access to graphics memory must go through dma remapping
    table to get real physical address.
    
    Add this support to all drivers which use intel_i915_insert_entries()
    
    Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

:040000 040000 2e225bb921aba5b816886d0d8ada8fd81e00e7a7 03d1ce91543e61975802c3a26df1ae5b75902b51 M      drivers
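The bisect log above follows a binary search over the history between the known-good and known-bad kernels. A minimal sketch of that idea is shown here, assuming a strictly linear history and a hypothetical `is_bad` test that stands in for "build, boot, and check whether X comes up blank". Real `git bisect` also handles merges and non-linear history, which this sketch does not attempt; the commit names are placeholders, not real commits.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical linear history of 16 commits.
history = [f"c{i}" for i in range(16)]
tested = []


def is_bad(commit: str) -> bool:
    # Stand-in for a manual boot test; pretend c11 introduced the regression.
    tested.append(commit)
    return int(commit[1:]) >= 11


culprit = first_bad(history, is_bad)
print(culprit, len(tested))
```

Each test halves the remaining range, so far fewer kernels need to be built and booted than with the linear search mentioned in comment 4.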

Comment 7 Wang Zhenyu 2010-01-07 17:52:58 UTC
This should be fixed in Linus's master now.
commit e6be8d9d17bd44061116f601fe2609b3ace7aa69
Author: Zhenyu Wang <zhenyu.z.wang@intel.com>
Date:   Tue Jan 5 11:25:05 2010 +0800

    drm: remove address mask param for drm_pci_alloc()
    
    drm_pci_alloc() has input of address mask for setting pci dma
    mask on the device, which should be properly setup by drm driver.
    And leave it as a param for drm_pci_alloc() would cause confusion
    or mistake would corrupt the correct dma mask setting, as seen on
    intel hw which set wrong dma mask for hw status page. So remove
    it from drm_pci_alloc() function.
    
    Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
Comment 8 Mike Hommey 2010-01-08 06:27:45 UTC
(In reply to comment #7)
> This should be fixed in Linus's master now.
> commit e6be8d9d17bd44061116f601fe2609b3ace7aa69
> Author: Zhenyu Wang <zhenyu.z.wang@intel.com>
> Date:   Tue Jan 5 11:25:05 2010 +0800
> 
>     drm: remove address mask param for drm_pci_alloc()
> 
>     drm_pci_alloc() has input of address mask for setting pci dma
>     mask on the device, which should be properly setup by drm driver.
>     And leave it as a param for drm_pci_alloc() would cause confusion
>     or mistake would corrupt the correct dma mask setting, as seen on
>     intel hw which set wrong dma mask for hw status page. So remove
>     it from drm_pci_alloc() function.
> 
>     Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
>     Signed-off-by: Dave Airlie <airlied@redhat.com>

This doesn't seem to be solving the issue for me. (cherry-picked and applied on 2.6.32 as currently in debian unstable)
Comment 9 Reinhard Karcher 2010-01-08 06:42:48 UTC
My original bug disappeared after patching the Debian linux source 2.6.32-trunk,
but X with KMS is unusable. X without KMS works.

The kernel from git (2.6.33-rc3 plus the newest patches) is OK.
Comment 10 Reinhard Karcher 2010-01-08 07:05:07 UTC
Actually, I didn't apply the patch to linux-source-2.6.32-trunk, but extended the masks in drivers/gpu/drm/i915/i915_{dma,gem}.c from 32 bits to 64 bits.
Comment 11 termite 2010-01-11 14:25:56 UTC
Fixed for me in the latest git.
Comment 12 Ben Hutchings 2010-01-11 18:23:24 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > This should be fixed in Linus's master now.
> > commit e6be8d9d17bd44061116f601fe2609b3ace7aa69
> > Author: Zhenyu Wang <zhenyu.z.wang@intel.com>
> > Date:   Tue Jan 5 11:25:05 2010 +0800
> > 
> >     drm: remove address mask param for drm_pci_alloc()
> > 
> >     drm_pci_alloc() has input of address mask for setting pci dma
> >     mask on the device, which should be properly setup by drm driver.
> >     And leave it as a param for drm_pci_alloc() would cause confusion
> >     or mistake would corrupt the correct dma mask setting, as seen on
> >     intel hw which set wrong dma mask for hw status page. So remove
> >     it from drm_pci_alloc() function.
> > 
> >     Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
> >     Signed-off-by: Dave Airlie <airlied@redhat.com>
> 
> This doesn't seem to be solving the issue for me. (cherry-picked and applied on
> 2.6.32 as currently in debian unstable)

You might also need this one:

commit fc61901373987ad61851ed001fe971f3ee8d96a3
Author: David Woodhouse <dwmw2@infradead.org>
Date:   Wed Dec 2 11:00:05 2009 +0000

    agp/intel-agp: Clear entire GTT on startup

(In reply to comment #10)
> Actually, I didn't apply the patch to linux-source-2.6.32-trunk, but extended
> the masks in drivers/gpu/drm/i915/i915_{dma,gem}.c from 32 bits to 64 bits.

Based on what intel-agp does, I think the correct mask is 36-bit.
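To make the mask widths discussed here concrete, the following sketch mirrors the arithmetic of the kernel's `DMA_BIT_MASK(n)` macro in Python and checks which physical addresses a device limited to a given mask can reach. The two sample addresses are illustrative assumptions, not values taken from the reporters' machines: one lies just under the `mem=3500M` limit from comment 5, the other stands for a page above the 4 GiB boundary, which is where some RAM commonly ends up on machines with 4 GB installed.

```python
def dma_bit_mask(bits: int) -> int:
    """Same arithmetic as the kernel's DMA_BIT_MASK(n): the n low bits set."""
    if not 1 <= bits <= 64:
        raise ValueError("bits must be in 1..64")
    return (1 << bits) - 1


def reachable(phys_addr: int, bits: int) -> bool:
    """True if a device restricted to `bits` address bits can reach phys_addr."""
    return phys_addr & ~dma_bit_mask(bits) == 0


MiB = 1 << 20
GiB = 1 << 30

# Illustrative addresses only (assumptions, see the paragraph above).
below_limit = 3500 * MiB - 4096   # last 4 KiB page when booted with mem=3500M
above_4g = 4 * GiB + 256 * MiB    # a page located above the 4 GiB boundary

for bits in (32, 36, 64):
    print(bits, reachable(below_limit, bits), reachable(above_4g, bits))
```

Under these assumptions, a 32-bit mask covers every page when memory is capped at 3500 MiB but not a page above 4 GiB, while both the 36-bit and 64-bit masks cover it. That is consistent with the reports in this thread, though it does not by itself establish which mask the hardware actually supports.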
Comment 13 Mike Hommey 2010-01-12 01:30:04 UTC
(In reply to comment #12)
> You might also need this one:
> 
> commit fc61901373987ad61851ed001fe971f3ee8d96a3

Confirmed. With both patches applied, it works. (Now, if that could make it to 2.6.32-6 ;) )