Bug 67931 - [Bisected] xinit causes call trace and system hang
Summary: [Bisected] xinit causes call trace and system hang
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: high major
Assignee: Ben Widawsky
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-09 03:07 UTC by lu hua
Modified: 2017-10-06 14:44 UTC
CC List: 0 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg (4.80 KB, text/plain)
2013-08-09 03:07 UTC, lu hua

Description lu hua 2013-08-09 03:07:12 UTC
Created attachment 83871
dmesg

System Environment:
--------------------------
Platform:  PNV/ILK/SNB/IVB/HSW
Kernel:    (drm-intel-nightly)254329f9a08cc3b0d5e4a877c6ff13cf9ba4fae7

Bug detailed description:
-----------------------------
Running xinit produces a call trace and the system hangs. It happens on the -nightly and -queued kernels. It works well on the -fixes kernel.

Bisect shows: 4695ec93e3484243574f68072a27d1781d41a5a5 is the first bad commit.
commit 4695ec93e3484243574f68072a27d1781d41a5a5
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 31 17:00:17 2013 -0700

    drm/i915: create vmas at execbuf

    In order to transition more of our code over to using a VMA instead of
    an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
    until now, we've only had a VMA when actually binding an object.

    The previous patch helped handle the distinction on bound vs. unbound.
    This patch will help us catch leaks, and other issues before we actually
    shuffle a bunch of stuff around.

    The subsequent patch to fix up the rest of execbuf should be mostly just
    moving code around, and this is the major functional change.

    v2: Release table_lock earlier so vma allocation needn't be atomic.
    (Chris)

    Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
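
For readers unfamiliar with how a "first bad commit" is found, the search can be sketched roughly as below. This is only an illustration of the binary search idea, assuming a linear history ordered oldest to newest in which every commit after the first bad one is also bad; real git bisect works on the full commit graph. The function name, the placeholder history and the xinit_hangs() predicate are hypothetical, not taken from the tester's setup.

```python
from typing import Callable, List, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is known
    bad, and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo]


# Hypothetical history: placeholder names around the abbreviated bad commit.
history = ["good-1", "good-2", "good-3", "4695ec93e348", "later-1", "later-2"]
bad_from = history.index("4695ec93e348")
tested: List[str] = []


def xinit_hangs(commit: str) -> bool:
    """Stand-in for 'boot this kernel, run xinit, see whether it hangs'."""
    tested.append(commit)
    return history.index(commit) >= bad_from


result = first_bad_commit(history, xinit_hangs)
print(result, "found after", len(tested), "test boots")
```

Each test halves the remaining range, which is why a bisect over a day's worth of commits needs only a handful of boots.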

dmesg:
[   73.571553] BUG: unable to handle kernel NULL pointer dereference at 00000018
[   73.571603] IP: [<f80a38c0>] drm_mm_remove_node+0x47/0x9e [drm]
[   73.571642] *pde = 00000000
[   73.571661] Oops: 0000 [#1] SMP
[   73.571683] Modules linked in: netconsole configfs ipv6 dm_mod snd_hda_codec_hdmi snd_hda_codec_realtek dcdbas pcspkr serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_hwdep lpc_ich snd_pcm mfd_core snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table [last unloaded: netconsole]
[   73.571916] CPU: 0 PID: 3760 Comm: X Not tainted 3.11.0-rc2_drm-intel-next-queued_4695ec_20130808_+ #6612
[   73.571962] Hardware name: Dell Inc. OptiPlex 990/0DXWW6, BIOS A02 02/26/2011
[   73.571997] task: f5b26ce0 ti: c31ce000 task.ti: c31ce000
[   73.572024] EIP: 0060:[<f80a38c0>] EFLAGS: 00213246 CPU: 0
[   73.572055] EIP is at drm_mm_remove_node+0x47/0x9e [drm]
[   73.572082] EAX: c3280980 EBX: 00000000 ECX: 00000000 EDX: 00000000
[   73.572113] ESI: 00000000 EDI: 00000000 EBP: c357bb80 ESP: c31cfd1c
[   73.572144]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   73.572170] CR0: 80050033 CR2: 00000018 CR3: 030e4000 CR4: 000407d0
[   73.572201] Stack:
[   73.572213]  c3280980 f4e8d254 c3280980 c357bb80 f8180f7a c21a8500 f818444b ffffffe4
[   73.572272]  00000000 00300000 f52e3c00 f4e8c000 7fe00000 00001000 00000000 00300000
[   73.572332]  00000000 c30a0400 c21a8500 c357b970 f4e8cc2c f8185e87 00000000 00000000
[   73.572391] Call Trace:
[   73.572413]  [<f8180f7a>] ? i915_gem_vma_destroy+0x37/0x3f [i915]
[   73.572449]  [<f818444b>] ? i915_gem_object_pin+0x3c5/0x509 [i915]
[   73.572485]  [<f8185e87>] ? i915_gem_execbuffer_reserve_object.isra.12+0x70/0x192 [i915]
[   73.572529]  [<f8186191>] ? i915_gem_execbuffer_reserve+0x1e8/0x2fb [i915]
[   73.572568]  [<f8186b00>] ? i915_gem_do_execbuffer.isra.18+0x4a0/0xd5f [i915]
[   73.572609]  [<f8181550>] ? i915_gem_obj_bound_any+0x28/0x43 [i915]
[   73.572645]  [<f8187930>] ? i915_gem_execbuffer2+0x12e/0x1c2 [i915]
[   73.572680]  [<f8187802>] ? i915_gem_execbuffer+0x443/0x443 [i915]
[   73.572713]  [<f809cc1e>] ? drm_ioctl+0x23d/0x323 [drm]
[   73.572744]  [<f8187802>] ? i915_gem_execbuffer+0x443/0x443 [i915]
[   73.572777]  [<c02a065a>] ? handle_pte_fault+0x5a6/0x5e3
[   73.572806]  [<f809c9e1>] ? drm_copy_field+0x47/0x47 [drm]
[   73.572835]  [<c02c04fc>] ? vfs_ioctl+0x18/0x21
[   73.572858]  [<c02c0ec8>] ? do_vfs_ioctl+0x3ec/0x42c
[   73.572885]  [<c08778c9>] ? __do_page_fault+0x400/0x43b
[   73.572911]  [<c087787d>] ? __do_page_fault+0x3b4/0x43b
[   73.572938]  [<c0236d5a>] ? __set_current_blocked+0x24/0x35
[   73.572966]  [<c02c0f51>] ? SyS_ioctl+0x49/0x74
[   73.572990]  [<c087945a>] ? sysenter_do_call+0x12/0x22
[   73.573017]  [<c0870000>] ? create_subvol+0x20f/0x59c
[   73.573042] Code: 1c 8b 18 74 24 01 fe 3b 73 18 75 02 0f 0b 8b 70 08 8b 58 0c 89 5e 04 89 33 c7 40 08 00 01 10 00 c7 40 0c 00 02 20 00 eb 09 01 fe <3b> 73 18 74 02 0f 0b 8b 72 10 8d 6a 08 f7 c6 01 00 00 00 75 0a
[   73.573350] EIP: [<f80a38c0>] drm_mm_remove_node+0x47/0x9e [drm] SS:ESP 0068:c31cfd1c
[   73.573398] CR2: 0000000000000018
[   73.578083] ---[ end trace 95ad56d39717da34 ]---
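
One way to read the "Code:" line is to decode the instruction at the marked byte and check it against the register dump. The sketch below is a hand-written decoder for this single encoding only (opcode 3b with a base register plus 8-bit displacement), assuming 32-bit operand size as on the i386 kernel in the trace. It is not a general disassembler, and the helper names are made up for illustration. It does not say which structure member lives at that offset.

```python
from typing import List, Tuple

# Tail of the "Code:" line from the oops; <..> marks the faulting byte.
CODE_TAIL = "eb 09 01 fe <3b> 73 18 74 02 0f 0b"

# Register values copied from the same oops.
REGS = {"eax": 0xC3280980, "ebx": 0x0, "ecx": 0x0, "edx": 0x0,
        "esi": 0x0, "edi": 0x0, "ebp": 0xC357BB80, "esp": 0xC31CFD1C}
CR2 = 0x18

# x86 32-bit register numbering used by the ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code: str) -> List[int]:
    """Return the bytes starting at the one marked with <..>."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_cmp_disp8(insn: List[int]) -> Tuple[str, str, int]:
    """Decode only 'CMP r32, r/m32' (opcode 3b) with mod=01, no SIB byte."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    if opcode != 0x3B:
        raise ValueError("not CMP r32, r/m32")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the base + disp8 form without SIB is handled")
    if disp >= 0x80:  # the 8-bit displacement is sign-extended
        disp -= 0x100
    return REG32[reg], REG32[rm], disp


reg, base, disp = decode_cmp_disp8(faulting_bytes(CODE_TAIL))
fault_addr = (REGS[base] + disp) & 0xFFFFFFFF
print(f"cmp {disp:#x}(%{base}),%{reg} reads address {fault_addr:#x}; CR2 is {CR2:#x}")
```

The computed address matches CR2, which is consistent with the base pointer in EBX being NULL when drm_mm_remove_node read a field 0x18 bytes into it.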

BTW, I can't find the bisected commit on the latest -queued branch. This issue doesn't happen on the latest commit (6d2b888569d).

Reproduce steps:
----------------------------
1. xinit
Comment 1 Chris Wilson 2013-08-09 08:21:23 UTC
We've already dropped that commit in order to rework it for this very problem.
Comment 2 Daniel Vetter 2013-08-09 08:35:53 UTC
(In reply to comment #0)
> BTW, I can't find the bisect commit on latest -queued branch. This issue
> doesn't happen on the latest commit(6d2b888569d).

Yeah, Chris reported that -nightly is completely broken, so I've dropped the patch again. Still, great work getting the bug report with the bisect result done by the next day.

Ben is working on a fixed version of the patch, assigning to him.
Comment 3 Ben Widawsky 2013-08-13 04:21:19 UTC
Patch has been dropped. New one resubmitted.
Comment 4 lu hua 2013-08-19 01:27:12 UTC
Verified. Fixed.
Comment 5 Elizabeth 2017-10-06 14:44:16 UTC
Closing old verified.

