Summary: | [GM45] relocation error in 3.14 -- causes hang with -intel-2.99.917 | |
---|---|---|---
Product: | DRI | Reporter: | andreas.sturmlechner
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal | |
Priority: | medium | CC: | intel-gfx-bugs
Version: | XOrg git | |
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Created attachment 111415 [details]
20141228-0045_3.4.105-gentoo-stop_dmesg-ON.log
Yes, it's a bug in the kernel relocation routines.

In case there is a kernel fix, just how big are my chances to get it backported to 3.4? ;)

Does it happen with the latest drm-intel-nightly branch from cgit.freedesktop.org/drm-intel? If it does, it is still an upstream issue. If it doesn't, a bisect could lead you to the fix commit.

As said, it works with 3.19-rc1+, so actually I would need to find out at which point in the past it was fixed. If there was no prominent fix you could point me to from memory, I will start going back the last few majors. I know why I keep my .configs around...

I guess Chris might know offhand, otherwise I guess you get to do the long, painful bisect. :/

Oh. I think I know what it might actually have been:

commit 983d308cb8f602d1920a8c40196eb2ab6cc07bd2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Jan 26 10:47:10 2015 +0000

    agp/intel: Serialise after GTT updates

That could explain a few of these similar bugs.

Some things I can tell now:

Kernel versions that hang with >=xf86-video-intel-2.99.917: 3.4.106, 3.10.53
What works: 3.14.33, 3.17.4

Trying to apply the patch over 3.14.33 (only to check for backportability) breaks the build, and the same happens for 3.4.106:

drivers/char/agp/intel-gtt.c: In function ‘i810_write_entry’:
drivers/char/agp/intel-gtt.c:331:2: error: implicit declaration of function ‘writel_relaxed’

If I remove the `writel_relaxed` hunks to make the patch succeed, 3.4.106 still hangs. But it seems that patch isn't the real fix anyway - it must be something between 3.10 and 3.14.
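The test results in the comment above bracket the fix between the newest hanging kernel and the oldest working one. A minimal sketch of that narrowing in Python, assuming plain dotted version strings and a single fix after which every kernel works; the `results` table and both helper names are illustrative and not part of any kernel tooling:

```python
def parse_version(text):
    """Turn '3.10.53' into (3, 10, 53) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def fix_window(results):
    """Return (newest hanging version, oldest working version).

    `results` maps a kernel version string to True if that kernel hangs.
    Assumes a single fix, i.e. every version after it works.
    """
    bad = [v for v, hangs in results.items() if hangs]
    good = [v for v, hangs in results.items() if not hangs]
    return max(bad, key=parse_version), min(good, key=parse_version)


# Results reported in this thread, with xf86-video-intel >= 2.99.917.
results = {
    "3.4.106": True,
    "3.10.53": True,
    "3.14.33": False,
    "3.17.4": False,
}

last_bad, first_good = fix_window(results)
print(f"fix lies after {last_bad} and at or before {first_good}")
```

Both endpoints are stable point releases, so the corresponding mainline range to bisect would be roughly 3.10 to 3.14, which matches the conclusion in the comment.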
Created attachment 111414 [details]
20141228-0203_3.4.105-gentoo-stop_i915errdecode-ON.log.tar.xz

For other reasons I am bound to use kernel 3.4.105, which works without any trouble, including with xf86-video-intel versions up to 2.99.916. Upgrading xf86-video-intel to 2.99.917 breaks this setup; however, it works in combination with latest linux-3.19-rc1+ (which I'm currently testing for fixing the reason I'm stuck with 3.4.x).

All using X.Org X Server 1.16.3; the regression is reproducible:

[ 19.540530] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[ 19.541519] render error detected, EIR: 0x00000010
[ 19.541519] IPEIR: 0x00000000
[ 19.541519] IPEHR: 0x00000000
[ 19.541519] INSTDONE: 0xffffffff
[ 19.541519] INSTPS: 0x4001e020
[ 19.541519] INSTDONE1: 0xbfffffff
[ 19.541519] ACTHD: 0x7ffff000
[ 19.541519] page table error
[ 19.541519] PGTBL_ER: 0x00100000
[ 19.541519] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
[ 25.532145] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[ 25.532154] render error detected, EIR: 0x00000010
[ 25.532158] IPEIR: 0x00000000
[ 25.532162] IPEHR: 0x00000000
[ 25.532165] INSTDONE: 0xffffffff
[ 25.532169] INSTPS: 0x4001e020
[ 25.532172] INSTDONE1: 0xbfffffff
[ 25.532175] ACTHD: 0x7ffff000
[ 25.532179] page table error
[ 25.532182] PGTBL_ER: 0x00100000
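When comparing hangs across kernels it can help to pull the register values out of the dmesg text instead of reading them by eye. The following is a rough sketch in Python, assuming the `NAME: 0x...` layout seen in the log above; the regular expression and the `parse_reports` helper are illustrative and not part of any i915 tool:

```python
import re

# Matches 'NAME: 0x...' pairs where NAME is upper case, e.g. 'PGTBL_ER: 0x00100000'.
# Lines such as 'EIR stuck: 0x00000010' are skipped because 'stuck' is lower case.
REGISTER_RE = re.compile(r"\b([A-Z_0-9]+):\s+(0x[0-9a-fA-F]+)")


def parse_reports(dmesg_text):
    """Return one {register: value} dict per 'render error detected' report.

    Assumption: each report starts with its EIR line, as in the log above.
    """
    reports = []
    for name, value in REGISTER_RE.findall(dmesg_text):
        if name == "EIR":
            reports.append({})  # a new error report begins here
        if reports:
            reports[-1][name] = int(value, 16)
    return reports


# Excerpt of the log from this report.
sample = """\
[ 19.541519] render error detected, EIR: 0x00000010
[ 19.541519] ACTHD: 0x7ffff000
[ 19.541519] page table error
[ 19.541519] PGTBL_ER: 0x00100000
[ 19.541519] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
[ 25.532154] render error detected, EIR: 0x00000010
[ 25.532175] ACTHD: 0x7ffff000
[ 25.532179] page table error
[ 25.532182] PGTBL_ER: 0x00100000
"""

reports = parse_reports(sample)
for index, report in enumerate(reports):
    print(index, {name: hex(value) for name, value in report.items()})
print("identical state in both reports:", reports[0] == reports[1])
```

For this log the two reports carry the same values, i.e. the register state did not change between the first page table error and the hangcheck six seconds later.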