Bug 83743 - [gm45] initialisation failure
Summary: [gm45] initialisation failure
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-09-11 00:06 UTC by Dawid Gajownik
Modified: 2017-07-24 22:51 UTC

i915 platform: GM45
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (676.90 KB, text/plain)
2014-09-11 00:06 UTC, Dawid Gajownik
dmesg (64.18 KB, text/plain)
2014-09-11 00:07 UTC, Dawid Gajownik
Xorg.0.log (7.23 KB, text/plain)
2014-09-11 00:08 UTC, Dawid Gajownik
/sys/class/drm/card0/error (676.90 KB, text/plain)
2014-09-13 18:32 UTC, Dawid Gajownik
dmesg (64.11 KB, text/plain)
2014-09-13 18:32 UTC, Dawid Gajownik
Xorg.0.log (7.21 KB, text/plain)
2014-09-13 18:32 UTC, Dawid Gajownik

Description Dawid Gajownik 2014-09-11 00:06:59 UTC
Created attachment 106094
/sys/class/drm/card0/error

Hi Team,

Sometimes (I think only after a cold boot) the video card fails to initialize during system startup:

[    1.610542] [drm] Initialized drm 1.1.0 20060810
[    1.635836] [drm] Memory usable by graphics device = 2048M
[    1.635840] [drm] Replacing VGA console driver
[    1.636458] Console: switching to colour dummy device 80x25
[    1.643179] i915 0000:00:02.0: irq 45 for MSI/MSI-X
[    1.643190] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    1.643191] [drm] Driver supports precise vblank timestamp query.
[    1.643353] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    1.646486] [drm] GPU HANG: ecode -1:0x00000000, reason: Command parser error, iir 0x00008000, action: continue
[    1.646489] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[    1.646491] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[    1.646492] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[    1.646493] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[    1.646495] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[    1.646496] i915: render error detected, EIR: 0x00000010
[    1.646501] i915:   IPEIR: 0x00000000
[    1.646503] i915:   IPEHR: 0x00000000
[    1.646505] i915:   INSTDONE_0: 0xffffffff
[    1.646507] i915:   INSTDONE_1: 0xbfffffff
[    1.646508] i915:   INSTDONE_2: 0x00000000
[    1.646510] i915:   INSTDONE_3: 0x00000000
[    1.646512] i915:   INSTPS: 0x8001e037
[    1.646514] i915:   ACTHD: 0x0007c154
[    1.646516] i915: page table error
[    1.646517] i915:   PGTBL_ER: 0x00000002
[    1.646521] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
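
For triage, the hang line and the register dump in the excerpt above can be pulled out of a saved dmesg mechanically. The following is only a minimal sketch in Python, assuming the log format is exactly as shown above; the name `parse_i915_hang` and the shape of the returned values are illustrative and not part of any i915 tooling.

```python
import re

# Matches the "[drm] GPU HANG: ..." line as it appears in the excerpt above.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), reason: (?P<reason>[^,]+), "
    r"iir (?P<iir>0x[0-9a-fA-F]+), action: (?P<action>\w+)"
)
# Matches register lines such as "i915:   ACTHD: 0x0007c154" and
# "i915: render error detected, EIR: 0x00000010".
REG_RE = re.compile(
    r"i915:.*?\b(?P<name>[A-Z][A-Z0-9_]*): (?P<value>0x[0-9a-fA-F]+)"
)


def parse_i915_hang(dmesg_text):
    """Return (hang, registers) extracted from dmesg text.

    hang is a dict with the GPU HANG fields as strings, or None if no such
    line is present; registers maps register names to integer values.
    """
    hang = None
    registers = {}
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            hang = match.groupdict()
            continue
        match = REG_RE.search(line)
        if match:
            registers[match.group("name")] = int(match.group("value"), 16)
    return hang, registers
```

Calling `parse_i915_hang(open("dmesg.txt").read())` on a saved log (the file name is hypothetical) gives the hang reason and the EIR, ACTHD and PGTBL_ER values in one place, which makes it easier to compare this report with similar ones.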

I have to reboot the system, and after that it works as expected. I saw similar bug reports, but I'm not sure whether this is a duplicate or a new bug.

Fedora 20
kernel-3.16.2-200.fc20.x86_64
libdrm-2.4.54-1.fc20.x86_64
mesa-libGL-10.1.5-1.20140607.fc20.x86_64
xorg-x11-server-Xorg-1.14.4-11.fc20.x86_64

# lspci -vnn | grep VGA -A 15
00:02.0 VGA compatible controller [0300]: Intel Corporation 4 Series Chipset Integrated Graphics Controller [8086:2e12] (rev 03) (prog-if 00 [VGA controller])
	Subsystem: Fujitsu Technology Solutions Device [1734:114c]
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at fc000000 (64-bit, non-prefetchable) [size=4M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 1c70 [size=8]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Kernel driver in use: i915
	Kernel modules: i915

00:02.1 Display controller [0380]: Intel Corporation 4 Series Chipset Integrated Graphics Controller [8086:2e13] (rev 03)
	Subsystem: Fujitsu Technology Solutions Device [1734:114c]
Comment 1 Dawid Gajownik 2014-09-11 00:07:49 UTC
Created attachment 106095
dmesg
Comment 2 Dawid Gajownik 2014-09-11 00:08:36 UTC
Created attachment 106096
Xorg.0.log
Comment 3 Chris Wilson 2014-09-11 06:15:24 UTC
That's a different presentation of the failure to init gm45. There the update to HEAD was delayed until after our checks. In drm-intel-nightly, there is

commit 95468892fdfeef6d1004b524e35957629efdbe00
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 7 15:39:54 2014 +0100

    drm/i915: Reset the HEAD pointer for the ring after writing START
    
    Ville found an old w/a documented for g4x that suggested that we need to
    reset the HEAD after writing START. This is a useful fixup for some of
    the g4x ring initialisation woes, but as usual, not all.
    
    v2: Do the rewrite unconditionally anyway
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=76554
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

which should be the right fix.
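
The ordering described in that commit message (write START, then rewrite HEAD) can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: `FakeRing`, `init_ring` and the "stale HEAD survives a START write" behaviour are hypothetical stand-ins for illustration, not the driver's actual code or a description of the real hardware.

```python
class FakeRing:
    """Toy stand-in for a ring buffer's START and HEAD registers.

    Hypothetical model: writing START does not clear a stale HEAD, which is
    the kind of situation the rewrite in the commit above guards against.
    """

    def __init__(self, stale_head):
        self.start = 0
        self.head = stale_head  # value left over from before initialisation

    def write_start(self, address):
        self.start = address  # HEAD is deliberately left untouched here

    def write_head(self, value):
        self.head = value


def init_ring(ring, start_address, reset_head=True):
    """Program START, optionally rewrite HEAD to 0, and check HEAD.

    Returns True if HEAD reads back as 0 after initialisation.
    """
    ring.write_start(start_address)
    if reset_head:
        # Mirrors "v2: Do the rewrite unconditionally anyway".
        ring.write_head(0)
    return ring.head == 0
```

In this model, `init_ring(FakeRing(stale_head=0x40), 0x1000)` passes the check, while the same call with `reset_head=False` fails it because the stale HEAD value is still in place.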
Comment 4 Jani Nikula 2014-09-11 16:40:13 UTC
Please re-test with current drm-intel-nightly branch from http://cgit.freedesktop.org/drm-intel and report back.
Comment 5 Dawid Gajownik 2014-09-13 18:31:07 UTC
I'm sorry to say that I'm still able to reproduce the problem with the kernel below:

$ uname -a
Linux rumcajs.zvid.net 3.17.0-rc4+ #1 SMP Sat Sep 13 16:23:56 CEST 2014 x86_64 x86_64 x86_64 GNU/Linux
$ git log -1
commit 43df30da20447e2856b2761215ff274886a9f931
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Sep 12 17:35:41 2014 +0200

    drm-intel-nightly: 2014y-09m-12d-15h-35m-20s UTC integration manifest

On Monday I start my journey to a polar station (by ship, so I will not have Internet access for ~40 days). I apologize in advance for slow replies.
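
When re-testing against a branch like the one above, it can help to confirm that the checked-out tree actually contains the commit named in comment 3. A minimal sketch in Python, assuming `git` is on the PATH; the helper name `contains_commit` and its `run` parameter (added so the function can be exercised without a repository) are illustrative.

```python
import subprocess

# Commit hash quoted in comment 3.
FIX_COMMIT = "95468892fdfeef6d1004b524e35957629efdbe00"


def contains_commit(repo_dir, commit, ref="HEAD", run=subprocess.run):
    """Return True if `commit` is an ancestor of `ref` in repo_dir.

    Relies on `git merge-base --is-ancestor`, which exits with status 0 when
    the first commit is an ancestor of the second, 1 when it is not, and
    another status on error (for example an unknown commit).
    """
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", commit, ref],
        capture_output=True,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(f"git failed with status {result.returncode}")
```

For example, `contains_commit("/path/to/drm-intel", FIX_COMMIT)` (the path is hypothetical) reports whether the build under test includes that change, which separates "fix not present" from "fix present but problem persists".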
Comment 6 Dawid Gajownik 2014-09-13 18:32:14 UTC
Created attachment 106222
/sys/class/drm/card0/error
Comment 7 Dawid Gajownik 2014-09-13 18:32:34 UTC
Created attachment 106224
dmesg
Comment 8 Dawid Gajownik 2014-09-13 18:32:55 UTC
Created attachment 106225
Xorg.0.log
Comment 9 Chris Wilson 2014-09-14 06:26:59 UTC
(In reply to comment #5)
> I'm sorry to say that I'm still able to reproduce the problem with the kernel
> below:
> 
Don't worry! It's not the same bug. /o\
Comment 10 Jani Nikula 2015-10-23 10:07:53 UTC
Timeout, closing. The problem reported originally was fixed, so resolving fixed.

Please file new bugs if problems persist with the latest kernels.

