Summary: | [legacy/ums] kernel panic on Debian 2.6.35-rc5 | |
---|---|---|---
Product: | DRI | Reporter: | Martin Sillence <martin>
Component: | DRM/Intel | Assignee: | Xorg Project Team <xorg-team>
Status: | CLOSED FIXED | QA Contact: |
Severity: | major | |
Priority: | medium | CC: | kurt
Version: | XOrg git | |
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |

Description
Martin Sillence
2010-07-19 12:01:15 UTC
Created attachment 37184 [details]
kernel log
kernel log from boot to panic
Probably worth including: I'm using the _latest_ intel driver. Context:
http://ikibiki.org/blog/2010/07/04/We_need_you_redux/

> Cyril Brulebois <kibi@debian.org> (12/07/2010):
>> It would be nice to know how it goes with the packages I built (for
>> i386 + amd64) and uploaded there:
>> http://people.debian.org/~kibi/packages/xserver-xorg-video-intel/
>
> I've put a new version there: 2.12.0-1+ickle2

Hardware: 965GM on a Sony SZ680
Previous bug: 28204

Regards,
M

Wow, must be an old userspace to use the IRQ_EMIT ioctl... Looks like we're failing to grab the hw lock, probably because some aspect of the master structures isn't set up at this point.

(gdb) list *i915_irq_emit+0x18a
0x50fa is in i915_irq_emit (drivers/gpu/drm/i915/i915_irq.c:1073).
1068		if (!dev_priv || !dev_priv->render_ring.virtual_start) {
1069			DRM_ERROR("called with no initialization\n");
1070			return -EINVAL;
1071		}
1072
1073		RING_LOCK_TEST_WITH_RETURN(dev, file_priv);
1074
1075		mutex_lock(&dev->struct_mutex);
1076		result = i915_emit_irq(dev);
1077		mutex_unlock(&dev->struct_mutex);

Chris's legacy branch was implicated in this bug.

Created attachment 37185 [details] [review]
Check master exists before dereferencing.

Hmm, master is dereferenced even earlier, and we need to understand when and why master might be NULL in the first place.

Please note this is a regression: this version of xorg-intel works fine with 2.6.34-1-amd64.

I'm seeing about the same thing using a 2.6.35-rc5 and -rc6 kernel. I'm using the "ickle2" version mentioned before. I'll attach my dmesg shortly.

Created attachment 37353 [details]
kernel log when KMS is disabled.
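
To make the idea behind the "Check master exists before dereferencing" attachment concrete, here is a minimal userspace sketch in C. It is not the attached patch and not the kernel's real code: the `sketch_*` structures and functions are hypothetical stand-ins, and the only thing it shows is a guard that returns -EINVAL instead of following a NULL master pointer. As noted in the thread, master is dereferenced even earlier in the real path, so a guard at this one spot may not be sufficient.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM master and file structures.
 * The names and fields are illustrative, not the kernel's definitions. */
struct sketch_master {
    int lock_held;              /* non-zero when the hw lock is held */
};

struct sketch_file {
    struct sketch_master *master;   /* may be NULL if never set up */
};

/* Lock test that refuses to dereference a missing master. */
static int sketch_lock_test(const struct sketch_file *file_priv)
{
    if (file_priv == NULL || file_priv->master == NULL)
        return -EINVAL;         /* nothing to test against: fail cleanly */
    if (!file_priv->master->lock_held)
        return -EINVAL;         /* caller does not hold the lock */
    return 0;
}

/* Models an ioctl entry point that bails out before touching the ring. */
static int sketch_irq_emit(const struct sketch_file *file_priv)
{
    int ret = sketch_lock_test(file_priv);
    if (ret != 0)
        return ret;
    /* A real driver would emit the interrupt request here. */
    return 0;
}
```

Calling `sketch_irq_emit()` with a file whose `master` is NULL returns -EINVAL rather than crashing, which is the behaviour one would want from a kernel entry point reached by a misbehaving userspace.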
The patch in the bug report does not fix my issue. The call trace still looks the same.

I'm not terribly happy that a broken userspace can cause a kernel BUG(), but I've pushed a fix to my legacy branch. Maybe we can use this as evidence to remove the broken kernel API...

A bisect for at least my problem results in:

$ git bisect good
8187a2b70e34c727a06617441f74f202b6fefaf9 is the first bad commit
commit 8187a2b70e34c727a06617441f74f202b6fefaf9
Author: Zou Nan hai <nanhai.zou@intel.com>
Date:   Fri May 21 09:08:55 2010 +0800

    drm/i915: introduce intel_ring_buffer structure (V2)

    Introduces a more complete intel_ring_buffer structure with callbacks
    for setup and management of a particular ringbuffer, and converts the
    render ring buffer consumers to use it.

    Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
    Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com>
    [anholt: Fixed up whitespace fail and rebased against prep patches]
    Signed-off-by: Eric Anholt <eric@anholt.net>

:040000 040000 b90a540c84c2ffa50b8b0bb7292749cef96e75d3 22c06e081bc722df129f2d0dc937950d5f164c5c M	drivers
:040000 040000 6ac1363503569458bf035132b01f206c256701cb 757099565b205b0908a8b903db5c9b00d2c6e142 M	include

As I understand it, this bug seems to contain at least 2 issues:
- X doing some wrong API call
- The kernel having a problem with it.

I think the first issue is solved? (The bug being set to fixed state) But I think the second problem should get fixed too, and I'm not sure if anything is happening with that. Is there a bug in the kernel bug tracker about it? Should I create one?

If you can demonstrate that the old userspace, say either 2.6 or 2.9, fails with the current kernel then it needs an immediate fix. If however, it is just one more broken piece in a broken API, the sooner we can kill it with fire the better.

In short, it is not at the top of my list of kernel OOPs to fix. :(

(In reply to comment #14)
> If you can demonstrate that the old userspace, say either 2.6 or 2.9, fails
> with the current kernel then it needs an immediate fix. If however, it is just
> one more broken piece in a broken API, the sooner we can kill it with fire the
> better.
>
> In short, it is not at the top of my list of kernel OOPs to fix. :(

Maybe, but that doesn't mean the bug is fixed. Please don't mark it as such.

The bug in legacy is fixed.

Created attachment 37415 [details]
xorg log with latest version
There is now an "ickle3" version which contains commit 352016d2da69bfc998a642132ab722940899ad2e.
With that version on a 2.6.32 kernel (Debian version 2.6.32-17), I can get to the login screen, but the screen turns black and the PC locks up somewhere after logging in. I've attached my xorg log file. This is with UMS.
This looks like a step back since it used to work with this kernel.
I've now also booted it using drm.debug=0x06, and the kernel log ended with:

[ 162.381742] [drm:i915_wait_irq], irq_nr=433 breadcrumb=433
[ 162.381804] [drm:i915_batchbuffer], i915 batchbuffer, start e9000 used 152 cliprects 0
[ 162.381817] [drm:i915_emit_irq],
[ 163.237945] [drm:i915_emit_irq],
[ 163.237960] [drm:i915_wait_irq], irq_nr=436 breadcrumb=436
[ 163.238029] [drm:i915_batchbuffer], i915 batchbuffer, start ed000 used 56 cliprects 0
[ 163.238042] [drm:i915_emit_irq],
[ 163.716533] [drm:i915_emit_irq],
[ 163.716549] [drm:i915_wait_irq], irq_nr=439 breadcrumb=439
[ 163.716623] [drm:i915_batchbuffer], i915 batchbuffer, start e9000 used 416 cliprects 0
[ 163.716637] [drm:i915_emit_irq],
[ 163.717262] [drm:i915_emit_irq],
[ 163.717271] [drm:i915_wait_irq], irq_nr=442 breadcrumb=439
[ 163.737309] [drm:i915_wait_irq], irq_nr=442 breadcrumb=439
[ 163.757321] [drm:i915_wait_irq], irq_nr=442 breadcrumb=439
[ 163.777235] [drm:i915_wait_irq], irq_nr=442 breadcrumb=439
[ 163.781312] [drm:i915_batchbuffer], i915 batchbuffer, start ed000 used 8 cliprects 0
[ 163.781331] [drm:i915_emit_irq],
[ 165.836145] [drm:i915_get_vblank_counter], trying to get vblank count for disabled pipe 1

Should be fixed with:

commit e8616b6ced6137085e6657cc63bc2fe3900b8616
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jan 20 09:57:11 2011 +0000

    drm/i915: Initialise ring vfuncs for old DRI paths

    We weren't setting up the vfunc table when initialising the old DRI
    ringbuffer, leading to such OOPSes as:
    ...

commit 5a9a8d1a99c617df82339456fbdd30d6ed3a856b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Jan 23 13:03:24 2011 +0000

    drm/i915: Handle the no-interrupts case for UMS by polling

    If the driver calls into the kernel to wait for a breadcrumb to pass,
    but hasn't enabled interrupts, fallback to polling the breadcrumb value.

    Reported-by: Chris Clayton <chris2553@googlemail.com>
    Tested-by: Chris Clayton <chris2553@googlemail.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
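
The polling fallback described in the second commit message can be pictured with a small userspace sketch in C. This is an illustration under stated assumptions, not the code from that commit: `sketch_dev`, `sketch_read_breadcrumb` and `sketch_poll_breadcrumb` are hypothetical names, the `step` field exists only to simulate hardware progress, and the comparison ignores sequence-number wraparound. The numbers mirror the log above, where the waiter wants irq_nr=442 while the breadcrumb sits at 439.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device model; not a kernel structure. */
struct sketch_dev {
    uint32_t breadcrumb;    /* last sequence number the hardware wrote */
    uint32_t step;          /* simulation only: progress made per read */
};

/* Simulates reading the breadcrumb; each read lets the "hardware" advance. */
static uint32_t sketch_read_breadcrumb(struct sketch_dev *dev)
{
    dev->breadcrumb += dev->step;
    return dev->breadcrumb;
}

/* Polls until the breadcrumb reaches irq_nr or the poll budget runs out.
 * Returns 0 on success and -EBUSY on timeout. Wraparound of the sequence
 * number is deliberately not handled in this sketch. */
static int sketch_poll_breadcrumb(struct sketch_dev *dev, uint32_t irq_nr,
                                  unsigned int max_polls)
{
    unsigned int i;

    for (i = 0; i < max_polls; i++) {
        if (sketch_read_breadcrumb(dev) >= irq_nr)
            return 0;
    }
    return -EBUSY;
}
```

With `breadcrumb = 439` and `step = 0` the call times out, which matches the repeated `irq_nr=442 breadcrumb=439` lines; with `step = 1` it succeeds on the third poll. The poll budget stands in for whatever timeout a real driver would apply.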