Bug 88817 - [BSW ppgtt bisected]igt/gem_reset_stats causes system hang
Summary: [BSW ppgtt bisected]igt/gem_reset_stats causes system hang
Status: CLOSED DUPLICATE of bug 88652
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: highest critical
Assignee: Nick Hoath
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-27 02:13 UTC by lu hua
Modified: 2017-10-06 14:31 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (21.86 KB, text/plain)
2015-01-27 02:13 UTC, lu hua

Description lu hua 2015-01-27 02:13:26 UTC
Created attachment 112866
dmesg

==System Environment==
--------------------------
Regression:  yes
good commit: 985850bd145655d10dfcd5a03a3fc38540794ca7
bad commit: 518c2e29d566427433b4ebf51f5863b69f8baf4a

Non-working platforms: BSW

==kernel==
--------------------------
drm-intel-nightly/67e9eb08a3b967b1ac6b7ec4588a93a2cb030cae
commit 67e9eb08a3b967b1ac6b7ec4588a93a2cb030cae
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Jan 23 17:04:55 2015 +0100

    drm-intel-nightly: 2015y-01m-23d-16h-04m-34s UTC integration manifest

==Bug detailed description==
-----------------------------


output:
IGT-Version: 1.9-g032f30c (x86_64) (Linux: 3.19.0-rc5_drm-intel-nightly_67e9eb_20150126+ x86_64)
Segmentation fault

dmesg:
[  153.932969] CPU: 0 PID: 4456 Comm: gem_reset_stats Not tainted 3.19.0-rc5_drm-intel-nightly_67e9eb_20150126+ #938
[  153.938635] task: ffff880179c14800 ti: ffff880002814000 task.ti: ffff880002814000
[  153.944313] RIP: 0010:[<ffffffff8110dc7e>]  [<ffffffff8110dc7e>] kmem_cache_alloc_trace+0xce/0x104
[  153.950084] RSP: 0018:ffff880002817c38  EFLAGS: 00010286
[  153.955806] RAX: 0000000000000000 RBX: ffff880002fde700 RCX: 000000000000413a
[  153.961550] RDX: 0000000000004139 RSI: 00000000000000d0 RDI: ffff88017b001900
[  153.967290] RBP: ff880179b08d8000 R08: 0000000000015520 R09: 0000000000000000
[  153.972999] R10: ffff880178ac9300 R11: ffff8f908b8ca098 R12: 00000000000000d0
[  153.978703] R13: 0000000000000078 R14: ffffffff81130482 R15: ffff88017b001900
[  153.984390] FS:  00007f129d48f8c0(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000
[  153.990119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  153.995815] CR2: 00007fb445799148 CR3: 0000000075c3a000 CR4: 00000000001007f0
[  154.001525] Stack:
[  154.007175]  ffff880178b10af0 ffff880002fde700 ffff880178b10af0 ffff880002fde700
[  154.012935]  ffffffffa009276d ffffffffa00934b3 ffff880002817df4 ffffffff81130482
[  154.018685]  ffff880178b10af0 ffffffffa01216ef ffff880002fde700 ffff880178b10af0
[  154.024395] Call Trace:
[  154.030051]  [<ffffffffa009276d>] ? i915_wedged_get+0x11/0x11 [i915]
[  154.035746]  [<ffffffffa00934b3>] ? i915_ring_missed_irq_set+0x3f/0x3f [i915]
[  154.041420]  [<ffffffff81130482>] ? simple_attr_open+0x33/0x9e
[  154.047080]  [<ffffffffa009282a>] ? i915_ring_missed_irq_fops_open+0x1a/0x1a [i915]
[  154.052762]  [<ffffffff81112001>] ? do_dentry_open+0x184/0x2a6
[  154.058414]  [<ffffffff8111deec>] ? do_last+0x942/0xb75
[  154.064010]  [<ffffffff8111c301>] ? __inode_permission+0x53/0x6e
[  154.069612]  [<ffffffff8111e251>] ? link_path_walk+0x51/0x74a
[  154.075172]  [<ffffffff8111f6bc>] ? path_openat+0x20f/0x560
[  154.080770]  [<ffffffff81120597>] ? do_filp_open+0x2b/0x6f
[  154.086253]  [<ffffffff81129ba1>] ? __alloc_fd+0x58/0xe3
[  154.091686]  [<ffffffff811131d6>] ? do_sys_open+0x14b/0x1cf
[  154.097105]  [<ffffffff8179f452>] ? system_call_fastpath+0x12/0x17
[  154.102508] Code: 7e 08 45 89 e1 49 89 d8 4c 89 e9 48 89 ea 4c 89 fe 41 ff 16 49 83 c6 10 49 83 3e 00 eb 30 eb 32 49 63 47 20 4d 8b 07 48 8d 4a 01 <48> 8b 5c 05 00 48 89 e8 65 49 0f c7 08 0f 94 c0 84 c0 75 89 e9
[  154.108791] RIP  [<ffffffff8110dc7e>] kmem_cache_alloc_trace+0xce/0x104
[  154.114424]  RSP <ffff880002817c38>
[  154.120036] ---[ end trace 773048983473776d ]---
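
For comparing this trace against the ones in related reports, it can help to reduce it to a list of symbols. The following is a minimal sketch, not part of the original report: the regular expression and the `parse_frames` helper are assumptions written to match the call-trace line format shown above, and they deliberately skip the `RIP` lines.

```python
import re

# Matches call-trace lines of the form seen in the dmesg above, e.g.
#   [  154.030051]  [<ffffffffa009276d>] ? i915_wedged_get+0x11/0x11 [i915]
FRAME_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+"                  # dmesg timestamp
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"                  # address found on the stack
    r"(?P<unreliable>\?\s+)?"                        # '?' = frame the unwinder could not verify
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"   # symbol+offset/size
    r"(?:\s+\[(?P<module>\w+)\])?"                   # optional module name, e.g. [i915]
)


def parse_frames(dmesg_text):
    """Return one dict per call-trace frame found in dmesg_text."""
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # registers, RIP lines, stack dump, etc.
        frames.append({
            "time": float(match.group("time")),
            "address": match.group("addr"),
            "symbol": match.group("symbol"),
            "module": match.group("module"),  # None for core kernel symbols
            "unreliable": bool(match.group("unreliable")),
        })
    return frames
```

Every frame in the trace above carries the `?` marker, so the symbol list should be read as a hint about the code path (a debugfs open via `simple_attr_open`) rather than an exact backtrace.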

==Reproduce steps==
---------------------------- 
1. ./gem_reset_stats --run-subtest ban-blt
Comment 1 lu hua 2015-01-28 06:22:40 UTC
It also happens on BDW.
Comment 2 lu hua 2015-01-28 06:38:26 UTC
Some gem_reset_stats subtests also cause a system hang.
Running ./gem_reset_stats on BSW gives:
IGT-Version: 1.9-gebd8b32 (x86_64) (Linux: 3.19.0-rc6_drm-intel-nightly_70438b_20150128+ x86_64)
Subtest params: SUCCESS (0.008s)
Subtest params-ctx-render: SUCCESS (0.020s)
Subtest reset-stats-render: SUCCESS (5.807s)
Subtest reset-stats-ctx-render: SUCCESS (6.006s)

Running ./gem_reset_stats --run-subtest ban-bsd on BDW and BSW also causes a system hang.
Comment 3 Ding Heng 2015-02-03 05:32:56 UTC
Adding SNB to this bug. The following subtests produce similar output in dmesg:

igt/gem_reset_stats/ban-blt
igt/gem_reset_stats/ban-bsd
igt/gem_reset_stats/ban-ctx-render
igt/gem_reset_stats/ban-render
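
The four subtests listed above can be run one at a time with a timeout, so a stuck test is reported separately from a failing one. This is a rough sketch under stated assumptions: the binary path `./gem_reset_stats` and the `--run-subtest` flag come from the reproduce steps, while `build_command`, `run_subtest`, the 120-second timeout and the pass/fail mapping are hypothetical helpers and choices, not part of IGT. A userspace timeout cannot recover from a full system hang like the one reported here.

```python
import subprocess

# Subtests named in this bug report.
SUBTESTS = ["ban-blt", "ban-bsd", "ban-ctx-render", "ban-render"]


def build_command(subtest, binary="./gem_reset_stats"):
    """Build the argv used in the reproduce steps for one subtest."""
    return [binary, "--run-subtest", subtest]


def run_subtest(subtest, timeout_s=120, runner=subprocess.run):
    """Run one subtest and classify the outcome.

    Any non-zero exit status (including death by signal, such as the
    segmentation fault in the description) is treated as "fail"; that
    mapping is a simplification made for this sketch.
    """
    try:
        result = runner(build_command(subtest), timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    except OSError:
        return "not-run"  # binary missing or not executable
    return "pass" if result.returncode == 0 else "fail"


if __name__ == "__main__":
    for name in SUBTESTS:
        print(name, run_subtest(name))
```

The `runner` parameter exists only so the classification logic can be exercised without the real test binary.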
Comment 4 lu hua 2015-02-03 06:08:51 UTC
Removed SNB; it is a separate bug. Removed BDW; it is tracked in bug 88688.
With i915.enable_ppgtt=0 on the drm-intel-nightly kernel (98592c_20150122), it works well.
Testing on the latest drm-intel-nightly kernel (8b4216_20150203) shows a new bug. We will file a new bug to track this on the latest drm-intel-nightly kernel.
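
When comparing results with and without i915.enable_ppgtt=0, it is worth confirming which setting a given boot actually used. The sketch below is illustrative only: `module_params` and `ppgtt_disabled` are hypothetical helpers that parse a kernel command-line string such as the contents of /proc/cmdline. They do not see options set through modprobe configuration.

```python
def module_params(cmdline, module="i915"):
    """Collect 'module.key=value' options from a kernel command line."""
    params = {}
    prefix = module + "."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


def ppgtt_disabled(cmdline):
    """True if the command line explicitly sets i915.enable_ppgtt=0."""
    return module_params(cmdline).get("enable_ppgtt") == "0"


if __name__ == "__main__":
    with open("/proc/cmdline") as handle:
        print("ppgtt disabled on command line:", ppgtt_disabled(handle.read()))
```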
Comment 5 lu hua 2015-02-03 06:39:36 UTC
(In reply to lu hua from comment #4)

> Test on the latest drm-intel-nightly(8b4216_20150203) kernel, it has a new
> bug. We will file a new bug to track this on the latest drm-intel-nightly
> kernel.

On the latest drm-intel-nightly kernel, the test does not exit; reported as bug 88933.
Comment 6 ye.tian 2015-02-07 08:53:47 UTC
Bisecting shows the first bad commit is 180253c7.
Its parent commit 84ebd315 is good.

commit 180253c71a9d55352cbad44a02424958320cb618 84ebd31548132e1d9e9014fa08986765b5a20237
Author:     Nick Hoath <nicholas.hoath@intel.com>
AuthorDate: Thu Jan 15 13:10:39 2015 +0000
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Tue Jan 20 12:36:54 2015 +0100

    drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request

    Move all remaining elements that were unique to execlists queue items
    in to the associated request.

    Issue: VIZ-4274

    v2: Rebase. Fixed issue of overzealous freeing of request.
    v3: Removed re-addition of cleanup work queue (found by Daniel Vetter)
    v4: Rebase.
    v5: Actual removal of intel_ctx_submit_request. Update both tail and postfix
    pointer in __i915_add_request (found by Thomas Daniel)
    v6: Removed unrelated changes
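
For readers unfamiliar with bisection, the idea behind the result above can be sketched as a binary search between a known-good and a known-bad commit. This is a simplified illustration, not the procedure actually used for this report: `first_bad` and `is_bad` are hypothetical names, and the sketch assumes a linear history ordered oldest to newest, whereas git bisect also handles merges.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: newest known good, hi: oldest known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

Each call to `is_bad` stands for one build, boot and test cycle, so the number of cycles grows with the logarithm of the commit range rather than its length.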
Comment 7 ye.tian 2015-02-07 09:06:15 UTC
Bug #88987, Bug 388845, Bug #89000 and Bug #87729 may be the same issue as this bug.
This issue does not exist on the latest drm-intel-fixes branch.
Comment 8 Jani Nikula 2015-02-10 07:54:25 UTC
Please retest with current drm-intel-nightly that has

commit f82107950e9bda3779610e37bdfdccae6fc16f87
Author: Nick Hoath <nicholas.hoath@intel.com>
Date:   Thu Jan 29 16:55:07 2015 +0000

    drm/i915: Fix a use-after-free in intel_execlists_retire_requests
Comment 9 Jani Nikula 2015-02-10 08:19:07 UTC

*** This bug has been marked as a duplicate of bug 88652 ***
Comment 10 lu hua 2015-03-13 06:55:52 UTC
Verified. Fixed.
Comment 11 Elizabeth 2017-10-06 14:31:51 UTC
Closing old verified bug.

