Created attachment 109972 [details]
dmesg

==System Environment==
--------------------------
Regression: yes
good commit: bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9
bad commit: cb4975b7c365f6d8e6d17cf4a24d846b3d27e6b7
Non-working platforms: ILK i386

Bug detailed description:
-----------------------------
Clean boot the system and run xinit; X crashes. It happens on ILK i386 machines with the drm-intel-fixes, drm-intel-next-queued and drm-intel-nightly kernels. It works well on 64-bit machines.

output:
X.Org X Server 1.16.2
Release Date: 2014-11-10
X Protocol Version 11, Revision 0
Build Operating System: Linux 3.11.10-301.fc20.i686+PAE i686
Current Operating System: Linux x-e6510 3.18.0-rc5_drm-intel-fixes_cb4975_20141124+ #1866 SMP Mon Nov 24 12:26:01 CST 2014 i686
Kernel command line: BOOT_IMAGE=kernels//nightly_parents/2014_11_24/drm-intel-fixes/cb4975b7c365f6d8e6d17cf4a24d846b3d27e6b7/bzImage_i386 root=/dev/sda3 drm.debug=0xe hostname=x-e6510 modules_path=kernels//nightly_parents/2014_11_24/drm-intel-fixes/cb4975b7c365f6d8e6d17cf4a24d846b3d27e6b7/modules_i386/lib/modules/3.18.0-rc5_drm-intel-fixes_cb4975_20141124+
Build Date: 24 November 2014 02:33:40PM

Current version of pixman: 0.33.1
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/opt/X11R7/var/log/Xorg.0.log", Time: Tue Nov 25 09:23:21 2014
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/opt/X11R7/share/X11/xorg.conf.d"
[root@x-e6510 opt]#
Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.399667] CPU: 3 PID: 3998 Comm: X Not tainted 3.18.0-rc5_drm-intel-fixes_cb4975_20141124+ #1866

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.399804] Hardware name: Dell Inc. Latitude E6510/0JKDHD, BIOS A05 08/10/2010

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.399917] task: c2165f00 ti: f2012000 task.ti: f2012000

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.400581] Stack:

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.401080] Call Trace:

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.403818] Code: 5f 8b 54 24 18 89 42 0c 31 c0 8b 52 04 80 3b 00 74 1e 80 7b 15 00 74 18 0f b6 4b 14 89 d0 31 d2 0f af 4b 0c c1 e0 06 8d 44 01 ff <f7> f1 83 c0 02 8b 54 24 18 89 42 10 c6 02 01 59 5b 5e 5f 5d c3

Message from syslogd@x-e6510 at Nov 25 09:23:23 ...
 kernel:[ 553.408580] EIP: [<f8561c12>] ilk_compute_wm_level+0x11e/0x133 [i915] SS:ESP 0068:f2013abc

Call trace:
[ 553.400002] EIP: 0060:[<f8561c12>] EFLAGS: 00013202 CPU: 3
[ 553.400117] EIP is at ilk_compute_wm_level+0x11e/0x133 [i915]
[ 553.400206] EAX: 0000017f EBX: f2013b30 ECX: 00000000 EDX: 00000000
[ 553.400303] ESI: 00000004 EDI: 0000000d EBP: 00000007 ESP: f2013abc
[ 553.400400] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 553.400485] CR0: 80050033 CR2: b69bc000 CR3: 02828000 CR4: 000007d0
[ 553.400581] Stack:
[ 553.400615] 000d000d f62a4000 c2998000 f63d7800 c2998000 f856470e f2013b98 00000000
[ 553.400770] 000011a3 000001ac 00100200 f63d7800 00000002 c2998000 00000000 00000000
[ 553.400925] 00000000 00000000 00000000 00000000 00000001 66660000 00000f8f 00000f8f
[ 553.401080] Call Trace:
[ 553.401148] [<f856470e>] ? ilk_update_wm+0x24d/0xa07 [i915]
[ 553.401244] [<c12dd7e0>] ? number.isra.2+0x155/0x249
[ 553.401354] [<f8565177>] ? intel_update_watermarks+0x11/0x12 [i915]
[ 553.401492] [<f85a19c6>] ? ironlake_crtc_enable+0x343/0x9cc [i915]
[ 553.401625] [<f857a92d>] ? i915_gem_object_pin_to_display_plane+0x8a/0x140 [i915]
[ 553.401781] [<f859edc2>] ? __intel_set_mode+0xf7c/0x1070 [i915]
[ 553.401879] [<c10536bf>] ? up+0x9/0x2a
[ 553.401944] [<c105d08b>] ? console_unlock+0x34b/0x37f
[ 553.402028] [<c10a4f0f>] ? irq_work_queue+0x8/0x5b
[ 553.402143] [<f85a450f>] ? intel_set_mode+0x11/0x25 [i915]
[ 553.402267] [<f85a4fbe>] ? intel_crtc_set_config+0x674/0x957 [i915]
[ 553.402369] [<c16fb04c>] ? printk+0x16/0x1a
[ 553.402459] [<f805236d>] ? drm_mode_set_config_internal+0x39/0x97 [drm]
[ 553.402580] [<f8055621>] ? drm_mode_setcrtc+0x373/0x405 [drm]
[ 553.402689] [<f80552ae>] ? drm_mode_setplane+0x188/0x188 [drm]
[ 553.402793] [<f804bc8c>] ? drm_ioctl+0x233/0x35c [drm]
[ 553.402894] [<f80552ae>] ? drm_mode_setplane+0x188/0x188 [drm]
[ 553.402992] [<c104e21a>] ? pick_next_task_fair+0xd1/0x3ee
[ 553.403091] [<f804ba59>] ? drm_copy_field+0x47/0x47 [drm]
[ 553.403181] [<c10ee855>] ? do_vfs_ioctl+0x3fa/0x444
[ 553.403262] [<c1700072>] ? __schedule+0x56a/0x6d5
[ 553.403340] [<c10e3214>] ? __sb_end_write+0x1e/0x4c
[ 553.403419] [<c10e2291>] ? vfs_write+0x14f/0x165
[ 553.403495] [<c10ee8e1>] ? SyS_ioctl+0x42/0x6d
[ 553.403572] [<c1088b2c>] ? __audit_syscall_entry+0x9f/0xbd
[ 553.403661] [<c17025e8>] ? sysenter_do_call+0x12/0x12
[ 553.403744] [<c1700000>] ? __schedule+0x4f8/0x6d5
[ 553.403818] Code: 5f 8b 54 24 18 89 42 0c 31 c0 8b 52 04 80 3b 00 74 1e 80 7b 15 00 74 18 0f b6 4b 14 89 d0 31 d2 0f af 4b 0c c1 e0 06 8d 44 01 ff <f7> f1 83 c0 02 8b 54 24 18 89 42 10 c6 02 01 59 5b 5e 5f 5d c3
[ 553.408580] EIP: [<f8561c12>] ilk_compute_wm_level+0x11e/0x133 [i915] SS:ESP 0068:f2013abc

Reproduce steps:
----------------------------
1. Clean boot the system
2. xinit
Created attachment 109973 [details] Xorg.0.log
Oh this looks bad, division by 0 hooray. Please supply the bisect result.
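For anyone wondering where the division by 0 comes from: it can be read straight off the oops. The byte marked `<f7>` in the Code: line, together with the `f1` that follows it, is an x86 "group 3" instruction, and the register dump shows ECX: 00000000. Below is a minimal Python sketch of that decoding. The helper name `decode_f7` is made up for illustration, and it only handles the register-operand form of opcode 0xF7.

```python
# Illustrative decoder for the marked byte pair in the oops "Code:" line.
# Only the x86 0xF7 "group 3" opcode with a register operand is handled.

# The /digit (ModRM reg field) selects the operation; /1 is an
# undocumented alias of TEST.
GROUP3_OPS = ["test", "test", "not", "neg", "mul", "imul", "div", "idiv"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_f7(opcode, modrm):
    """Return (mnemonic, register) for an 0xF7 instruction with mod == 3."""
    if opcode != 0xF7:
        raise ValueError("only opcode 0xF7 is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3:
        raise ValueError("memory operands are out of scope for this sketch")
    return GROUP3_OPS[reg], REG32[rm]


# Bytes at the faulting EIP, shown as "<f7> f1" in the oops.
mnemonic, operand = decode_f7(0xF7, 0xF1)

# Register values copied from the oops register dump.
registers = {"eax": 0x0000017F, "ecx": 0x00000000, "edx": 0x00000000}

print(f"{mnemonic} %{operand}  (divisor = {registers[operand]:#x})")
```

The pair decodes to `div %ecx`, so with ECX at zero the CPU raises a divide error inside ilk_compute_wm_level.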
And smells like watermark fun for Ville.
Note that you may not hit the failure during X start like QA did, but you should hit it using "xset dpms force off; xset dpms force on".
This looks somewhat familiar. I think it might be the zeroed mode problem I also hit with my ILK a while back. I'll have to see if I can reproduce it still.
(In reply to Daniel Vetter from comment #2)
> Oh this looks bad, division by 0 hooray.
>
> Please supply the bisect result.

Between the good commit and the bad commit, bug 85277 causes a boot failure, and I am not sure whether it affects the bisect. I will give it a try.
Bisect shows 83f45fc360c8e16a330474860ebda872d1384c8c is the first bad commit. Reverting it makes this issue go away.

commit 83f45fc360c8e16a330474860ebda872d1384c8c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
AuthorDate: Wed Aug 6 09:10:18 2014 +0200
Commit: Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Wed Aug 6 10:41:13 2014 +0200

    drm: Don't grab an fb reference for the idr

    The current refcounting scheme is that the fb lookup idr also holds a
    reference. This works out nicely because thus far we've always
    explicitly cleaned up idr entries for framebuffers:
    - Userspace fbs get removed in the rmfb ioctl or when the drm file
      gets closed.
    - Kernel fbs (for fbdev emulation) get cleaned up by the driver code
      at module unload time.

    But now i915 also reconstructs the bios fbs for a smooth transition.
    And that fb is purely transitional and should get removed immediately
    once all crtcs stop using it. Of course if the i915 fbdev code decides
    to reuse it as the main fbdev fb then it shouldn't be cleaned up, but
    in that case the fbdev code will grab its own reference.

    The problem is now that we also want to register that takeover fb in
    the idr, so that userspace can do a smooth transition (animated maybe
    even!) itself. But currently we have no one who will clean up the idr
    reference once that fb isn't useful any more, and so essentially leak
    it.

    Fix this by no longer holding a full fb reference for the idr, but
    instead just have a weak reference using kref_get_unless_zero. But
    that requires us to synchronize and clean up with the idr and fb_lock
    in drm_framebuffer_free, so add that. It's a bit ugly that we have to
    unconditionally grab the fb_lock, but without that someone might creep
    through a race.

    This leak was caught by the fb leak check in drm_mode_config_cleanup.
    Originally the leak was introduced in

    commit 46f297fb83d4f9a6f6891964beb184664341a28b
    Author: Jesse Barnes <jbarnes@virtuousgeek.org>
    Date: Fri Mar 7 08:57:48 2014 -0800

        drm/i915: add plane_config fetching infrastructure v2

    Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77511
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
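The "weak reference" scheme described in that commit message can be modelled in a few lines. This is only a toy Python model, not the kernel code: `Framebuffer`, `FramebufferTable`, `lookup` and `put` are hypothetical names, a plain dict stands in for the idr, and a `threading.Lock` plays the role of fb_lock. The real code drops the count atomically and takes the lock only in the free path; here everything is done under the lock for brevity.

```python
import threading


class Framebuffer:
    def __init__(self, fb_id):
        self.fb_id = fb_id
        self.refcount = 1  # the reference held by whoever created the fb


class FramebufferTable:
    """Toy stand-in for the fb lookup idr; it holds no reference itself."""

    def __init__(self):
        self._lock = threading.Lock()  # plays the role of fb_lock
        self._by_id = {}

    def register(self, fb):
        # Registering does NOT bump the refcount: the entry is weak.
        with self._lock:
            self._by_id[fb.fb_id] = fb

    def lookup(self, fb_id):
        """Return the fb with an extra reference, or None if it is gone."""
        with self._lock:
            fb = self._by_id.get(fb_id)
            if fb is None or fb.refcount == 0:  # "get unless zero"
                return None
            fb.refcount += 1
            return fb

    def put(self, fb):
        """Drop one reference; the final put also removes the table entry."""
        with self._lock:
            fb.refcount -= 1
            if fb.refcount == 0:
                self._by_id.pop(fb.fb_id, None)


table = FramebufferTable()
takeover_fb = Framebuffer(fb_id=34)
table.register(takeover_fb)

user_ref = table.lookup(34)   # a lookup takes its own reference
table.put(user_ref)           # ...and gives it back
table.put(takeover_fb)        # last real user goes away, entry is cleaned up
print(table.lookup(34))
```

Because the table never owns a reference, the takeover fb disappears as soon as its last real user drops it, which is the leak fix the commit describes.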
The bisect is misleading, as that commit fixes the leak of the framebuffer (thus preventing the subsequent PIN_BIAS failure, which in turn had been preventing the displays from being switched off/on). Does it still blow up in the wm code if you do xset dpms force off; xset dpms force on?
My ILK was hitting this already for some time but got magically fixed recently. Reverse bisect points at:

commit c211a47c2c28562f8a3fff9e027be1a3ed9e154a
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Nov 24 11:12:42 2014 +0100

    drm/i915: Disallow pin ioctl completely for kms drivers

Oh and that machine is running 64-bit Gentoo.
I dumped the setcrtc struct coming into the kernel and it looks like this:

[ 52.334815] [drm:drm_ioctl] setcrtc asize=104 usize=104 drv_size=104
[ 52.334823] 00000000: a7b709b0 00007fff 00000001 00000008
[ 52.334828] 00000010: 00000022 00000000 00000000 00000000
[ 52.334832] 00000020: 00000001 00000000 00000000 00000000
[ 52.334837] 00000030: 00000000 00000000 00000000 00000000
[ 52.334842] 00000040: 00000000 00000000 00000000 00000000
[ 52.334846] 00000050: 00000000 00000000 00000000 00000000
[ 52.334850] 00000060: 00000000 00000000

So X really is trying to set a zeroed mode :P

This is the third setcrtc coming in when starting X; the previous two were requests to disable both crtcs (and they were totally zeroed apart from the crtc id).

I suppose the next question is how X came up with that zeroed mode. Did the kernel tell it that such a mode was already in use and it's trying to set it again, or did it extract it from somewhere else...

Obviously we should add some checks to the kernel as well to make sure we don't explode when encountering such an invalid mode...
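To make the dump easier to read, here is a rough Python sketch that unpacks the 104 bytes into named fields. The field layout is my recollection of struct drm_mode_crtc and struct drm_mode_modeinfo from the DRM uapi header and should be checked against your tree. It also assumes the dump prints native 32-bit words on a little-endian machine. The function name `decode_setcrtc` is made up for illustration.

```python
import struct

# 32-bit words copied from the dump above, in address order.
DUMP_WORDS = """
a7b709b0 00007fff 00000001 00000008
00000022 00000000 00000000 00000000
00000001 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000
"""

# Assumed layout of struct drm_mode_crtc up to the embedded mode:
# u64 set_connectors_ptr; u32 count_connectors, crtc_id, fb_id, x, y,
# gamma_size, mode_valid.
CRTC_HEADER = struct.Struct("<Q7I")
# Assumed layout of struct drm_mode_modeinfo: u32 clock; ten u16 timing
# fields; u32 vrefresh, flags, type; char name[32].
MODEINFO = struct.Struct("<I10H3I32s")


def decode_setcrtc(words_text):
    """Turn the hex words of a setcrtc dump into a dict of fields."""
    words = [int(w, 16) for w in words_text.split()]
    raw = struct.pack(f"<{len(words)}I", *words)
    if len(raw) != CRTC_HEADER.size + MODEINFO.size:
        raise ValueError(f"unexpected dump size: {len(raw)} bytes")
    ptr, count, crtc_id, fb_id, x, y, gamma, mode_valid = (
        CRTC_HEADER.unpack_from(raw, 0))
    mode = MODEINFO.unpack_from(raw, CRTC_HEADER.size)
    # Every numeric mode field is zero and the name is all NUL bytes.
    mode_is_zeroed = (all(v == 0 for v in mode[:-1])
                      and mode[-1].strip(b"\0") == b"")
    return {
        "set_connectors_ptr": ptr,
        "count_connectors": count,
        "crtc_id": crtc_id,
        "fb_id": fb_id,
        "x": x,
        "y": y,
        "gamma_size": gamma,
        "mode_valid": mode_valid,
        "mode_is_zeroed": mode_is_zeroed,
    }


fields = decode_setcrtc(DUMP_WORDS)
for name, value in fields.items():
    print(f"{name:>20}: {value}")
```

Under those layout assumptions the request reads as crtc 8, fb 0x22, one connector, mode_valid set to 1, and every field of the mode itself zero.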
The simplest way would be to compile xf86-video-intel with --enable-debug=full and read back the reasons for the last modeset before the crash.
*** Bug 87330 has been marked as a duplicate of this bug. ***
*** Bug 87279 has been marked as a duplicate of this bug. ***
Regresses here and there, time for a revert?
(In reply to Jani Nikula from comment #14)
> Regresses here and there, time for a revert?

Revert of what? You have already applied the "fix" from Daniel to ban the pin ioctl completely. This bug is about userspace being able to oops the kernel by feeding it a malicious mode. Bug 87279 is about corruption in the ddx using sw fallbacks, and bug 87330 has not been analysed to find the actual root cause.
Some kernel fixes that prevent the kernel from blowing up with the zeroed mode: http://lists.freedesktop.org/archives/dri-devel/2014-December/074160.html
(In reply to Chris Wilson from comment #15)
> (In reply to Jani Nikula from comment #14)
> > Regresses here and there, time for a revert?
>
> Revert of what? You have already applied the "fix" from Daniel to ban the
> pin ioctl completely. This bug is about userspace being able to oops the
> kernel by feeding it a malicious mode. Bug 87279 is about corruption in the
> ddx using sw fallbacks, and bug 87330 has not been analysed to find the
> actual root cause.

drm-i915-Disallow-pin-ioctl-completely-for-kms-drive seems to fix the https://bugs.freedesktop.org/show_bug.cgi?id=87279 bug as well, which has been marked as a duplicate of this bug.
drm-i915-Disallow-pin-ioctl-completely-for-kms-drive applies cleanly to 3.18.1, but the kernel then fails to compile. Thus, I can't check if this patch fixes my bug, which has been marked as a duplicate of this one.
(In reply to Heinz from comment #18)
> drm-i915-Disallow-pin-ioctl-completely-for-kms-drive applies cleanly to
> 3.18.1, but the kernel then fails to compile. Thus, I can't check if this
> patch fixes my bug, which has been marked as a duplicate of this one.

Can't confirm. 3.18.1 builds fine here with the patch. Check your configuration, maybe you enabled something experimental.

I'm writing this from 3.18.1:
Linux ChakraPC 3.18.1-1-CHAKRA #1 SMP PREEMPT Thu Dec 18 10:53:51 EET 2014 x86_64 GNU/Linux
Here's the compiler output where it fails on an otherwise bog standard vanilla 3.18.1, which clearly indicates that it's not some weird .config options which are the cause. GCC is: gcc version 4.9.2 20141101 (Red Hat 4.9.2-1) (GCC)

drivers/gpu/drm/i915/i915_gem.c:4282:1: error: redefinition of ‘i915_gem_pin_ioctl’
 i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 ^
drivers/gpu/drm/i915/i915_gem.c:4189:1: note: previous definition of ‘i915_gem_pin_ioctl’ was here
 i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 ^
drivers/gpu/drm/i915/i915_gem.c:4335:1: error: redefinition of ‘i915_gem_unpin_ioctl’
 i915_gem_unpin_ioctl(struct drm_device *dev, void *data,
 ^
drivers/gpu/drm/i915/i915_gem.c:4245:1: note: previous definition of ‘i915_gem_unpin_ioctl’ was here
 i915_gem_unpin_ioctl(struct drm_device *dev, void *data,
 ^
scripts/Makefile.build:257: recipe for target 'drivers/gpu/drm/i915/i915_gem.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_gem.o] Error 1
scripts/Makefile.build:402: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:402: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:402: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:937: recipe for target 'drivers' failed
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....
(In reply to Heinz from comment #20)
> Here's the compiler output where it fails on an otherwise bog standard
> vanilla 3.18.1, which clearly indicates that it's not some weird .config
> options which are the cause. GCC is: gcc version 4.9.2 20141101 (Red Hat
> 4.9.2-1) (GCC)

Chakra's gcc is 4.9.1.

Here's our build process:
https://gitorious.org/chakra-packages/core/source/399186728de12a5117ca508de724e9f4da1df0c6:linux/PKGBUILD

The gist of it is:
1) make mrproper
2) Apply patches (3.18.1 patch first)
3) make prepare
4) Load configuration
5) make ${MAKEFLAGS} LOCALVERSION= bzImage modules
6) Package everything
(In reply to Heinz from comment #20)
> Here's the compiler output where it fails on an otherwise bog standard
> vanilla 3.18.1, which clearly indicates that it's not some weird .config
> options which are the cause. GCC is: gcc version 4.9.2 20141101 (Red Hat
> 4.9.2-1) (GCC)

Can you build the drm-intel-nightly kernel
http://cgit.freedesktop.org/drm-intel?h=drm-intel-nightly
to see if the same problem exists there?
Current drm-intel git clone from today builds flawlessly and boots fine into X.
(In reply to Heinz from comment #23)
> Current drm-intel git clone from today builds flawlessly and boots fine into
> X.

Thanks for letting me know. I warned the Chakra devs against using this patch, as there could be some unforeseen consequences, it seems.
(In reply to Heinz from comment #23)
> Current drm-intel git clone from today builds flawlessly and boots fine into
> X.

Hmm, one of our devs told me that if you get redefinition errors (as in your previous post), it usually means that the patch was not applied correctly.

For sanity, maybe try this patch directly from cgit's page:
http://cgit.freedesktop.org/drm-intel/patch/?id=83f45fc360c8e16a330474860ebda872d1384c8c
(In reply to Ugis Germanis from comment #25)
> (In reply to Heinz from comment #23)
> > Current drm-intel git clone from today builds flawlessly and boots fine into
> > X.
>
> Hmm, One of our devs told that if you get redefinition errors (as in your
> previous post) it usually means that the patch is not correctly applied
>
> For sanity maybe try this patch directly from cgit's page
> http://cgit.freedesktop.org/drm-intel/patch/?id=83f45fc360c8e16a330474860ebda872d1384c8c

I'm sorry, I posted the wrong patch (I need sleep). This is the correct patch:
http://cgit.freedesktop.org/drm-intel/patch/?id=d472fcc8379c062bd56a3876fc6ef22258f14a91
Thanks! Applies cleanly and boots fine.
Presumed fixed by commits

commit 05acaec334fcc1132d1e48c5042e044651e0b75b
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Dec 17 13:56:22 2014 +0200

    drm: Reorganize probed mode validation

commit abc0b1447d4974963548777a5ba4a4457c82c426
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Dec 17 13:56:23 2014 +0200

    drm: Perform basic sanity checks on probed modes

commit 23e1ce89af5404c7a35dbd008ca85fb6adb16aad
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Dec 17 13:56:24 2014 +0200

    drm: Do basic sanity checks for user modes

in drm-next, queued for 3.20. Also included in drm-intel-nightly.

Please reopen if the problem persists.
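For readers who have not looked at those commits, the idea behind a "basic sanity check" on a user-supplied mode can be sketched as follows. This is not the kernel code from the commits above, only a Python illustration of the kind of test that would reject the zeroed mode seen in this bug. The `Mode` class and `basic_mode_problem` are hypothetical names, and the exact set of conditions the kernel checks may differ.

```python
from dataclasses import dataclass


@dataclass
class Mode:
    """Hypothetical, trimmed-down stand-in for a display mode."""
    clock: int = 0          # pixel clock in kHz
    hdisplay: int = 0
    hsync_start: int = 0
    hsync_end: int = 0
    htotal: int = 0
    vdisplay: int = 0
    vsync_start: int = 0
    vsync_end: int = 0
    vtotal: int = 0


def basic_mode_problem(mode):
    """Return a short reason if the mode is unusable, else None."""
    if mode.clock == 0:
        return "clock is zero"
    h = [mode.hdisplay, mode.hsync_start, mode.hsync_end, mode.htotal]
    # Timings must be non-zero and non-decreasing along the scanline.
    if mode.hdisplay == 0 or h != sorted(h):
        return "horizontal timings are zero or out of order"
    v = [mode.vdisplay, mode.vsync_start, mode.vsync_end, mode.vtotal]
    if mode.vdisplay == 0 or v != sorted(v):
        return "vertical timings are zero or out of order"
    return None


zeroed = Mode()  # what X handed to setcrtc in this bug
xga = Mode(clock=65000,
           hdisplay=1024, hsync_start=1048, hsync_end=1184, htotal=1344,
           vdisplay=768, vsync_start=771, vsync_end=777, vtotal=806)

print(basic_mode_problem(zeroed))
print(basic_mode_problem(xga))
```

Rejecting such a mode at the ioctl boundary means the watermark code never sees a zero horizontal size, so the divide in ilk_compute_wm_level cannot be reached with a zero divisor through this path.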
It works well. Verified.
Closing old verified.