Bug 57072 - [965gm] Hang with 3.7.0-rc5, presume dup of 56916
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
Reported: 2012-11-13 15:56 UTC by Zdenek Kabelac
Modified: 2012-12-30 10:38 UTC

Attachments
error dump from intel kernel driver (761.55 KB, text/plain)
2012-11-13 15:56 UTC, Zdenek Kabelac

Description Zdenek Kabelac 2012-11-13 15:56:45 UTC
Created attachment 70011
error dump from intel kernel driver

Looks like GPU hangs have started again with the recent 3.7-rc kernel and the latest git driver. Since I do not have a reliable test case, I'm attaching some error dumps.
During the hang I was just moving forward (right arrow button) in a Firefox window (actually entering another bug in a different Bugzilla).

My hw: T61, GMA965, 4GB RAM

kernel: 3.7.0-rc5
intel driver - commit: 66eb0adffa63ef8ece7621ba90dc96af91549612 (Nov 12)

EOF dmesg:
[48651.841902] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[48651.847664] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[48651.868557] i2c i2c-5: i2c_outb: 0x70, timeout at bit #7
[48651.872388] i2c i2c-5: NAK from device addr 0x38 msg #0
[48651.935429] [drm] GMBUS [i915 gmbus vga] timed out, falling back to bit banging on pin 2
[48651.945162] i2c i2c-2: i2c_outb: 0xa0, timeout at bit #7
[48651.948913] i2c i2c-2: NAK from device addr 0x50 msg #0
[48651.961854] i2c i2c-5: i2c_outb: 0x70, timeout at bit #7
[48651.965617] i2c i2c-5: NAK from device addr 0x38 msg #0
[48651.981872] i2c i2c-2: i2c_outb: 0xa0, timeout at bit #7
[48651.985627] i2c i2c-2: NAK from device addr 0x50 msg #0
[48652.362028] [drm:i915_reset] *ERROR* Failed to reset chip.


EOF xsession-errors:
(EE) [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/X (xorg_backtrace+0x36) [0x46a456]
(EE) 1: /usr/bin/X (mieqEnqueue+0x26b) [0x57f95b]
(EE) 2: /usr/bin/X (0x400000+0x4bf22) [0x44bf22]
(EE) 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7fe707ec3000+0x6184) [0x7fe707ec9184]
... (detected GPU hang)

For more info ask.
Comment 1 Zdenek Kabelac 2012-11-13 16:14:01 UTC
Argh, my bad: I forgot to switch from UXA back to SNA after testing another problem earlier.

So the hang happened with the UXA driver, which is known to be problematic.
Let's hope it's not related to SNA.
Comment 2 Chris Wilson 2012-11-13 16:39:13 UTC
Hmm, we are tracking an apparent regression in 3.7 with gen4, see bug 56916, and the error-state is consistent with that bug (and less likely to be a spontaneous regression in UXA itself, except that there are a few very rare issues in UXA/gen4).
Comment 3 Zdenek Kabelac 2012-11-13 17:55:04 UTC
Could be related. However, in my case there was probably no Mesa involved, unless Firefox is using it somehow; nothing 3D seemed to be running (I'm using 2D GNOME).

Anyway, I've always had GPU hangs with UXA, while with SNA (even if the rendering is sometimes buggy) I do not experience them.
Comment 4 Chris Wilson 2012-11-21 11:14:54 UTC
Let's first clear the outstanding regression with 3.7 and see what remains.
Comment 5 Chris Wilson 2012-11-21 17:42:00 UTC
I have a kernel branch you can test to exclude the 3.7 regression (I hope at least!):

http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=for-imre
Comment 6 Chris Wilson 2012-12-30 10:38:27 UTC
xf86-video-intel commit 736b89504a32239a0c7dfb5961c1b8292dd744bd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Dec 30 10:32:18 2012 +0000

    uxa: Align surface allocations to even tile rows
    
    Align surface sizes to an even number of tile rows to cater for sampler
    prefetch. If we read beyond the last page we may catch the PTE in a
    state of flux and trigger a GPU hang. Also detected by enabling invalid
    PTE access checking.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=56916
    References: https://bugs.freedesktop.org/show_bug.cgi?id=55984
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
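
For readers unfamiliar with the fix described in the commit message, the idea of "aligning to an even number of tile rows" can be sketched roughly as below. This is only an illustration and not the code from the xf86-video-intel commit: the names tile_rows, align_up and aligned_surface_height are hypothetical, and the 2-row alignment for linear surfaces is an assumption. The tile heights used (8 rows for X-tiling, 32 rows for Y-tiling) are the usual gen4 values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiling enum for this sketch (not the driver's own type). */
enum tiling_mode { TILING_NONE, TILING_X, TILING_Y };

/* Rows per tile: 8 for X-tiled, 32 for Y-tiled surfaces on gen4.
 * Treating a linear surface as 1-row "tiles" is an assumption. */
static uint32_t tile_rows(enum tiling_mode tiling)
{
    switch (tiling) {
    case TILING_X: return 8;
    case TILING_Y: return 32;
    default:       return 1;
    }
}

/* Round value up to the next multiple of alignment. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Pad the height to an even number of tile rows, so that a sampler
 * prefetching one tile row past the last used row still lands inside
 * the allocation instead of beyond its last page. */
static uint32_t aligned_surface_height(uint32_t height, enum tiling_mode tiling)
{
    return align_up(height, 2 * tile_rows(tiling));
}
```

With these assumptions, a 17-row X-tiled surface would be padded to 32 rows (four tile rows) rather than 24 (three tile rows), leaving the extra tile row as slack for the prefetch.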

