Bug 88987

Summary: [BSW PPGTT Bisected]igt/gem_ring_sync_loop causes system hang
Product: DRI
Reporter: lu hua <huax.lu>
Component: DRM/Intel
Assignee: Nick Hoath <nicholas.hoath>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: high
CC: intel-gfx-bugs
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description Flags
dmesg none

Description lu hua 2015-02-05 07:52:50 UTC
Created attachment 113177
dmesg

==System Environment==
--------------------------
Regression: yes, caused by ppgtt

Non-working platforms: BSW

==kernel==
--------------------------
drm-intel-nightly/9583cb470cd82997b5f23d4acab852563f5c0047
commit 9583cb470cd82997b5f23d4acab852563f5c0047
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Feb 4 15:22:12 2015 +0100

    drm-intel-nightly: 2015y-02m-04d-14h-21m-52s UTC integration manifest

==Bug detailed description==
-----------------------------
Running the test hangs the system, and pressing Ctrl+C does not make the test exit. With i915.enable_ppgtt=0, it works well.
IGT-Version: 1.9-g87edb51 (x86_64) (Linux: 3.19.0-rc7_drm-intel-nightly_9583cb_20150205+ x86_64)

^C^C^C^C^C^C^C^C

dmesg:
[   34.751917] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle
[   62.753026] [drm] stuck on render ring
[   62.753038] ------------[ cut here ]------------
[   62.753090] WARNING: CPU: 1 PID: 1021 at drivers/gpu/drm/i915/i915_irq.c:2612 i915_handle_error+0x58/0x5bd [i915]()
[   62.753095] WARN_ON(mutex_is_locked(&dev_priv->dev->struct_mutex))
[   62.753098] Modules linked in: ipv6 bnep bluetooth rfkill dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek snd_hda_codec_generic pcspkr serio_raw snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep i2c_i801 snd_pcm lpc_ich mfd_core snd_timer snd soundcore battery ac acpi_cpufreq i915 button video drm_kms_helper drm cfbfillrect cfbimgblt cfbcopyarea
[   62.753166] CPU: 1 PID: 1021 Comm: kworker/u8:3 Tainted: G        W      3.19.0-rc7_drm-intel-nightly_9583cb_20150205+ #78
[   62.753205] Workqueue: i915-hangcheck i915_hangcheck_elapsed [i915]
[   62.753210]  0000000000000000 0000000000000009 ffffffff8179a69b ffff88007b9efcc8
[   62.753217]  ffffffff8103bdec 0000000000000246 ffffffffa00b6edb 0000000000000006
[   62.753223]  ffff88017a42bb40 ffff880175418ad0 ffff8801759e7000 ffff8801755dac00
[   62.753231] Call Trace:
[   62.753246]  [<ffffffff8179a69b>] ? dump_stack+0x40/0x50
[   62.753260]  [<ffffffff8103bdec>] ? warn_slowpath_common+0x98/0xb0
[   62.753294]  [<ffffffffa00b6edb>] ? i915_handle_error+0x58/0x5bd [i915]
[   62.753301]  [<ffffffff8103be9c>] ? warn_slowpath_fmt+0x45/0x4a
[   62.753335]  [<ffffffffa00b6edb>] ? i915_handle_error+0x58/0x5bd [i915]
[   62.753343]  [<ffffffff81797406>] ? printk+0x48/0x4d
[   62.753378]  [<ffffffffa00b77b7>] ? i915_hangcheck_elapsed+0x339/0x3d5 [i915]
[   62.753389]  [<ffffffff8104d128>] ? process_one_work+0x1ad/0x31a
[   62.753397]  [<ffffffff8104d4ef>] ? worker_thread+0x235/0x330
[   62.753404]  [<ffffffff8104d2ba>] ? process_scheduled_works+0x25/0x25
[   62.753411]  [<ffffffff81050dee>] ? kthread+0xc5/0xcd
[   62.753418]  [<ffffffff81050d29>] ? kthread_freezable_should_stop+0x40/0x40
[   62.753425]  [<ffffffff8179ffec>] ? ret_from_fork+0x7c/0xb0
[   62.753431]  [<ffffffff81050d29>] ? kthread_freezable_should_stop+0x40/0x40
[   62.753437] ---[ end trace 05682ea912c21a80 ]---
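
The hang signature in the log above can be picked out mechanically. The following is a minimal sketch in Python, assuming the dmesg output has been saved as text. The function name `find_hang_markers` and the marker list are illustrative assumptions drawn from this excerpt only; they are not part of IGT or the i915 driver.

```python
import re

# Substrings taken from the dmesg excerpt in this report. The list is an
# assumption for illustration, not an exhaustive set of i915 hang messages.
HANG_MARKERS = (
    "Hangcheck timer elapsed",
    "stuck on render ring",
)

# Matches the leading "[   62.753026] " kernel timestamp and captures the rest.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_hang_markers(dmesg_text):
    """Return (timestamp_seconds, message) for each line containing a marker."""
    hits = []
    for line in dmesg_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        timestamp, message = float(match.group(1)), match.group(2)
        if any(marker in message for marker in HANG_MARKERS):
            hits.append((timestamp, message))
    return hits


if __name__ == "__main__":
    sample = """\
[   34.751917] [drm:i915_hangcheck_elapsed [i915]] *ERROR* Hangcheck timer elapsed... render ring idle
[   62.753026] [drm] stuck on render ring
[   62.753038] ------------[ cut here ]------------
"""
    for timestamp, message in find_hang_markers(sample):
        print(f"{timestamp:.6f}: {message}")
```

Run against the excerpt, this would flag the hangcheck error and the "stuck on render ring" line while skipping the WARN_ON backtrace.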

==Reproduce steps==
---------------------------- 
1. ./gem_ring_sync_loop
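
Because the test does not exit on its own here, a reproduction script may want a time limit around it. The following is a rough sketch in Python using the standard `subprocess` module. The helper name `run_with_timeout` and the 60-second limit mentioned below are assumptions for illustration, not part of IGT.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run a command and classify the outcome as 'pass', 'fail', or 'timeout'."""
    try:
        # subprocess.run kills the child when the timeout expires, then
        # raises TimeoutExpired.
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "pass" if completed.returncode == 0 else "fail"
```

A call such as `run_with_timeout(["./gem_ring_sync_loop"], 60)` would report `"timeout"` if the test is still running after 60 seconds. Note that if the process is stuck inside the kernel, as the ignored Ctrl+C in this report suggests, the kill may not take effect and the helper itself may block.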
Comment 1 Jani Nikula 2015-02-10 08:19:48 UTC
Probably dupe of bug 88652.
Comment 2 lu hua 2015-02-10 09:01:02 UTC
(In reply to Jani Nikula from comment #1)
> Probably dupe of bug 88652.

Yes. I tested commit 21076372afe711072b9a447f22a098691dd0b2cb: the test still does not exit and the GPU hangs, but the system does not hang, so I reported bug 89060 to track the GPU hang.
Bisect shows: 6d3d8274bc45de4babb62d64562d92af984dd238 is the first bad commit.
commit 6d3d8274bc45de4babb62d64562d92af984dd238
Author:     Nick Hoath <nicholas.hoath@intel.com>
AuthorDate: Thu Jan 15 13:10:39 2015 +0000
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Tue Jan 27 09:50:53 2015 +0100

    drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request

    Move all remaining elements that were unique to execlists queue items
    in to the associated request.

    Issue: VIZ-4274

    v2: Rebase. Fixed issue of overzealous freeing of request.
    v3: Removed re-addition of cleanup work queue (found by Daniel Vetter)
    v4: Rebase.
    v5: Actual removal of intel_ctx_submit_request. Update both tail and postfix
    pointer in __i915_add_request (found by Thomas Daniel)
    v6: Removed unrelated changes

    Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
    Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
    [danvet: Reformat comment with strange linebreaks.]
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

*** This bug has been marked as a duplicate of bug 88652 ***
Comment 3 lu hua 2015-03-13 06:47:57 UTC
Verified. Fixed.
Comment 4 Elizabeth 2017-10-06 14:31:32 UTC
Closing old verified.
