Bug 88840

Summary: [BDW ppgtt Bisected]igt/gem_reloc_vs_gpu/faulting-reloc-interruptible-hang causes system hang
Product: DRI
Component: DRM/Intel
Status: CLOSED DUPLICATE
Severity: critical
Priority: high
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Reporter: lu hua <huax.lu>
Assignee: Nick Hoath <nicholas.hoath>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: intel-gfx-bugs
Attachments:
Description Flags
dmesg none

Description lu hua 2015-01-28 01:59:59 UTC
Created attachment 112899 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes
good commit: 6f34cc393f6407fbec91ff6d4fd1e29fe86b59d5
bad commit: 37ec63030b14ad91f3ee9e03385a8904d0aff741

non-working platforms: BDW

==kernel==
--------------------------
drm-intel-nightly/2c2cd37eb3b97bb8846ac3bf75dcb8b4948922d0
commit 2c2cd37eb3b97bb8846ac3bf75dcb8b4948922d0
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Mon Jan 26 11:39:01 2015 +0200

    drm-intel-nightly: 2015y-01m-26d-09h-38m-30s UTC integration manifest

==Bug detailed description==
-----------------------------
Running this subtest hangs the system on BDW with both the -nightly and -next-queued kernels.

output:
IGT-Version: 1.9-gebd8b32 (x86_64) (Linux: 3.19.0-rc6_drm-intel-nightly_2c2cd3_20150127+ x86_64)
Segmentation fault

dmesg:
[ 1734.800735] CPU: 0 PID: 4980 Comm: gem_reloc_vs_gp Not tainted 3.19.0-rc6_drm-intel-nightly_2c2cd3_20150127+ #982
[ 1734.802447] task: ffff8800a7c46800 ti: ffff880144558000 task.ti: ffff880144558000
[ 1734.804172] RIP: 0010:[<ffffffff8110da14>]  [<ffffffff8110da14>] __kmalloc+0x116/0x151
[ 1734.805919] RSP: 0018:ffff88014455bd68  EFLAGS: 00010282
[ 1734.807662] RAX: 0000000000000000 RBX: ffff88014455be18 RCX: 00000000000039fc
[ 1734.809425] RDX: 00000000000039fb RSI: 0000000000081200 RDI: 0000000000015520
[ 1734.811185] RBP: ff8800023b188000 R08: ffff88014ec15520 R09: 0000000040406469
[ 1734.812934] R10: 0000000000000040 R11: 0000000000000246 R12: ffff88014a003900
[ 1734.814676] R13: 00000000000812d0 R14: 0000000000000070 R15: ffffffffa009f234
[ 1734.816424] FS:  00007f1fa7cf58c0(0000) GS:ffff88014ec00000(0000) knlGS:0000000000000000
[ 1734.818194] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1734.819970] CR2: 00007f1fa7d13000 CR3: 00000001497ca000 CR4: 00000000003407f0
[ 1734.821771] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1734.823570] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1734.825355] Stack:
[ 1734.827127]  0000000000000246 ffff88014455be18 ffff88014455be18 ffff880148c2f000
[ 1734.828947]  ffff8800a7744200 0000000040406469 ffff8800a7744200 ffffffffa009f234
[ 1734.830777]  ffffffffa009f1b3 ffff88014455be18 00000000fffffff2 ffffffffa01200c0
[ 1734.832614] Call Trace:
[ 1734.834433]  [<ffffffffa009f234>] ? i915_gem_execbuffer2+0x81/0x209 [i915]
[ 1734.836282]  [<ffffffffa009f1b3>] ? i915_gem_execbuffer+0x350/0x350 [i915]
[ 1734.838109]  [<ffffffffa001070a>] ? drm_ioctl+0x279/0x3bc [drm]
[ 1734.839923]  [<ffffffffa009f1b3>] ? i915_gem_execbuffer+0x350/0x350 [i915]
[ 1734.841743]  [<ffffffff81122409>] ? do_vfs_ioctl+0x412/0x459
[ 1734.843545]  [<ffffffff81122499>] ? SyS_ioctl+0x49/0x78
[ 1734.845327]  [<ffffffff8179f792>] ? system_call_fastpath+0x12/0x17
[ 1734.847090] Code: d8 4c 89 f1 48 89 ea 4c 89 fe 41 ff 14 24 49 83 c4 10 49 83 3c 24 00 eb 39 49 89 ec eb 38 49 63 44 24 20 49 8b 3c 24 48 8d 4a 01 <48> 8b 5c 05 00 48 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 0f 85 7a
[ 1734.849064] RIP  [<ffffffff8110da14>] __kmalloc+0x116/0x151
[ 1734.850895]  RSP <ffff88014455bd68>
[ 1734.852698] [drm:intel_backlight_device_update_status] updating intel_backlight, brightness=937/937
[ 1734.868381] ---[ end trace d970df286b32461c ]---
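
For anyone comparing this trace against the one in another report, the call-trace frames above can be pulled out of the dmesg text mechanically. The following is only a minimal sketch, assuming the frame layout shown in this log (`[<address>] ? symbol+0xoffset/0xsize [module]`); the names `FRAME_RE` and `parse_frames` are hypothetical helpers, not part of IGT or the kernel tooling.

```python
import re

# Matches frames shaped like the ones in this report, for example:
#   [ 1734.834433]  [<ffffffffa009f234>] ? i915_gem_execbuffer2+0x81/0x209 [i915]
# The layout is an assumption taken from this log; other kernels print frames differently.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(dmesg_text):
    """Return one dict per stack frame found in dmesg_text (hypothetical helper)."""
    frames = []
    for line in dmesg_text.splitlines():
        # RIP lines use the same symbol+offset notation but are not stack frames.
        if "RIP" in line:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            continue
        frames.append({
            "address": int(match.group("addr"), 16),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
            "module": match.group("module"),  # None for core kernel symbols
            "unreliable": match.group("unreliable") is not None,  # '?' marker
        })
    return frames


if __name__ == "__main__":
    sample = (
        "[ 1734.834433]  [<ffffffffa009f234>] ? i915_gem_execbuffer2+0x81/0x209 [i915]\n"
        "[ 1734.841743]  [<ffffffff81122409>] ? do_vfs_ioctl+0x412/0x459\n"
    )
    for frame in parse_frames(sample):
        print(frame["symbol"], hex(frame["offset"]), frame["module"])
```

Reducing each frame to symbol, offset, and module makes it easier to diff two traces whose timestamps and raw addresses differ.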

==Reproduce steps==
---------------------------- 
1. ./gem_reloc_vs_gpu --run-subtest faulting-reloc-interruptible-hang
Comment 1 Chris Wilson 2015-01-28 09:02:49 UTC
Out of curiosity, does i915.enable_execlists=0 or i915.enable_ppgtt=1 prevent this?
Comment 2 lu hua 2015-01-29 05:36:04 UTC
With i915.enable_execlists=0 added, the system no longer hangs, although the subtest still fails.

output:
IGT-Version: 1.9-gebd8b32 (x86_64) (Linux: 3.19.0-rc6_drm-intel-nightly_70438b_20150128+ x86_64)
(gem_reloc_vs_gpu:4767) CRITICAL: Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
(gem_reloc_vs_gpu:4767) CRITICAL: Failed assertion: test == 0xdeadbeef
(gem_reloc_vs_gpu:4767) CRITICAL: mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Subtest faulting-reloc-interruptible-hang: FAIL (120.040s)
Comment 3 lu hua 2015-02-09 06:36:07 UTC
With i915.enable_ppgtt=0 added, it works well.
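
When repeating the runs from comments 2 and 3, it can help to record which i915 options a given boot actually used. A minimal sketch, assuming the options were passed on the kernel command line in the `i915.name=value` form used in this report; `i915_options` is a hypothetical helper name.

```python
def i915_options(cmdline):
    """Extract i915.<name>=<value> options from a kernel command line string."""
    options = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            name, value = token[len("i915."):].split("=", 1)
            options[name] = value
    return options


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as handle:
        print(i915_options(handle.read()))
```

This only reflects options given at boot; values changed at module load time or left at their defaults would not appear.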
Comment 4 lu hua 2015-02-10 05:33:59 UTC
Bisect shows: 6d3d8274bc45de4babb62d64562d92af984dd238 is the first bad commit.
commit 6d3d8274bc45de4babb62d64562d92af984dd238
Author:     Nick Hoath <nicholas.hoath@intel.com>
AuthorDate: Thu Jan 15 13:10:39 2015 +0000
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Tue Jan 27 09:50:53 2015 +0100

    drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request

    Move all remaining elements that were unique to execlists queue items
    in to the associated request.

    Issue: VIZ-4274

    v2: Rebase. Fixed issue of overzealous freeing of request.
    v3: Removed re-addition of cleanup work queue (found by Daniel Vetter)
    v4: Rebase.
    v5: Actual removal of intel_ctx_submit_request. Update both tail and postfix
    pointer in __i915_add_request (found by Thomas Daniel)
    v6: Removed unrelated changes

    Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
    Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
    [danvet: Reformat comment with strange linebreaks.]
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Comment 5 Jani Nikula 2015-02-10 08:18:22 UTC

*** This bug has been marked as a duplicate of bug 88652 ***
Comment 6 Elizabeth 2017-10-06 14:31:47 UTC
Closing old verified.
