Bug 98716 - gem_ringfill@basic-default-hang incomplete on ivb
Summary: gem_ringfill@basic-default-hang incomplete on ivb
Status: CLOSED NOTABUG
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other Linux (All)
Importance: highest blocker
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-11-14 10:21 UTC by Sagar Kamble
Modified: 2017-01-31 02:44 UTC
CC List: 1 user

See Also:
i915 platform: IVB
i915 features: GEM/Other


Attachments
dmesg log during the test (1.86 MB, text/plain)
2016-11-14 10:21 UTC, Sagar Kamble
attachment-306-0.html (1.92 KB, text/html)
2017-01-30 17:59 UTC, Sagar Kamble

Description Sagar Kamble 2016-11-14 10:21:56 UTC
Created attachment 127965
dmesg log during the test

The dmesg log shows the following kernel BUG and timeout messages, which are the likely causes of the test being incomplete.

[  234.747310] ------------[ cut here ]------------
[  234.747331] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:127!
[  234.747350] invalid opcode: 0000 [#1] PREEMPT SMP
[  234.747364] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm lpc_ich mei r8169 mii [last unloaded: i915]
[  234.747464] CPU: 1 PID: 8182 Comm: gem_ringfill Tainted: G     U          4.9.0-rc4-CI-Trybot_279+ #1
[  234.747489] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[  234.747511] task: ffff88010384a740 task.stack: ffffc90003d78000
[  234.747528] RIP: 0010:[<ffffffffa03015e6>]  [<ffffffffa03015e6>] i915_gem_request_retire+0x26/0x380 [i915]
[  234.747569] RSP: 0018:ffffc90003d7bb38  EFLAGS: 00010293
[  234.747584] RAX: 00000000005b133d RBX: ffff8801137225c0 RCX: 00000000005b133b
[  234.747603] RDX: ffff88010f7f8008 RSI: ffffffff81c5da73 RDI: ffff88010384af68
[  234.747623] RBP: ffffc90003d7bb50 R08: 0000000000000000 R09: 0000000000000000
[  234.747642] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801137225c0
[  234.747675] R13: ffff88010f7f8008 R14: 0000000000000000 R15: 00000000000001d8
[  234.747718] FS:  00007f9883095740(0000) GS:ffff88011fa40000(0000) knlGS:0000000000000000
[  234.747751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  234.747767] CR2: 000055f1f1dc6768 CR3: 000000010e065000 CR4: 00000000001406e0
[  234.747788] Stack:
[  234.747797]  ffff8801137225c0 ffff880113724780 ffff88010f7f8008 ffffc90003d7bb78
[  234.747825]  ffffffffa0301d89 0000000000000000 ffff88011814ac88 00000000000003f0
[  234.747852]  ffffc90003d7bbb8 ffffffffa0315bcd ffff880113724780 ffff8801146c0b80
[  234.747880] Call Trace:
[  234.747901]  [<ffffffffa0301d89>] i915_gem_request_retire_upto+0x49/0x90 [i915]
[  234.747933]  [<ffffffffa0315bcd>] intel_ring_begin+0x15d/0x2d0 [i915]
[  234.747962]  [<ffffffffa0315d6b>] intel_ring_alloc_request_extras+0x2b/0x40 [i915]
[  234.747993]  [<ffffffffa030378c>] i915_gem_request_alloc+0x2bc/0x3a0 [i915]
[  234.748023]  [<ffffffffa02ecb03>] i915_gem_do_execbuffer.isra.15+0x783/0x1a10 [i915]
[  234.748062]  [<ffffffff811a6a2e>] ? __might_fault+0x3e/0x90
[  234.748099]  [<ffffffffa02ee180>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[  234.748120]  [<ffffffff81552fd6>] drm_ioctl+0x1f6/0x480
[  234.748146]  [<ffffffffa02ee0c0>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[  234.748178]  [<ffffffff810d6e42>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  234.748208]  [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[  234.748238]  [<ffffffff81202d8e>] do_vfs_ioctl+0x8e/0x690
[  234.748264]  [<ffffffff81817c6c>] ? retint_kernel+0x2d/0x2d
[  234.748281]  [<ffffffff810d6e42>] ? trace_hardirqs_on_caller+0x122/0x1b0
[  234.748301]  [<ffffffff812033cc>] SyS_ioctl+0x3c/0x70
[  234.748317]  [<ffffffff8181726e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  234.748335] Code: 0f 0b 0f 1f 00 55 48 89 e5 41 55 41 54 53 8b 0d 51 70 bb e1 49 89 fc 85 c9 0f 85 30 02 00 00 41 8b 84 24 80 01 00 00 85 c0 75 02 <0f> 0b 49 8b 94 24 a8 00 00 00 48 8b 8a e0 01 00 00 8b 89 c0 00 
[  234.748507] RIP  [<ffffffffa03015e6>] i915_gem_request_retire+0x26/0x380 [i915]
[  234.748540]  RSP <ffffc90003d7bb38>
[  234.748568] ---[ end trace b118ccc440f01b32 ]---


[  528.065681] [drm:i915_hotplug_work_func [i915]] Connector DP-1 (pin 5) received hotplug event.
[  528.066371] [drm:intel_dp_detect [i915]] [CONNECTOR:52:DP-1]
[  528.069544] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.073225] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.076646] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.080050] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.083675] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.087035] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.090602] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.093941] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.097289] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.100641] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.104213] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.107771] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.111329] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.114682] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.118234] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.121596] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.125152] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.128492] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.131839] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.135191] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.138745] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.142093] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.145645] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.149218] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.152778] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.156334] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.159685] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.163225] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.166558] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.170113] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.173681] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.177246] [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x7145003f
[  528.178073] [drm:drm_dp_dpcd_access] Too many retries, giving up. First error: -110
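
A log this size (1.86 MB) is easier to triage with a small script than by eye. The following is a minimal sketch, assuming the attached dmesg log has been saved locally as plain text. The function name `summarize_dmesg` and the pattern labels are illustrative only, and the regular expressions are derived solely from the lines quoted above.

```python
import re
from collections import Counter

# Patterns derived from the excerpt above; the labels are illustrative.
PATTERNS = {
    "kernel_bug": re.compile(r"kernel BUG at (\S+?):(\d+)!"),
    "dp_aux_timeout": re.compile(r"dp_aux_ch timeout status (0x[0-9a-fA-F]+)"),
    "dpcd_giving_up": re.compile(r"Too many retries, giving up"),
}


def summarize_dmesg(lines):
    """Count interesting messages and collect (file, line) of each kernel BUG."""
    counts = Counter()
    bug_sites = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if not match:
                continue
            counts[name] += 1
            if name == "kernel_bug":
                # Record where the BUG fired, e.g. ("path/to/file.c", 127).
                bug_sites.append((match.group(1), int(match.group(2))))
    return counts, bug_sites
```

To use it, pass any iterable of lines, for example an open file object for the saved attachment (the local filename is up to the reader). The function returns the per-pattern counts and the list of BUG sites.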
Comment 1 Chris Wilson 2016-11-14 11:01:00 UTC
Can't see how it can escape i915_wait_request() with timeout >= 0 and !completed.
Comment 2 yann 2016-11-14 14:28:32 UTC
updating priority => basic

BTW, with our CI we are not seeing this issue (checked results from Linux: 4.9.0-rc1-CI-CI_DRM_1736+ x86_64 up to Linux: 4.9.0-rc4-CI-CI_DRM_1835+ x86_64) on IVB
Comment 3 Sagar Kamble 2016-11-14 14:44:28 UTC
Actually this patchset has passed this test on public ML.
http://benchsrv.fi.intel.com/archive/results/CI_IGT_test/Patchwork_2979/fi-ivb-3770/igt%40gem_ringfill%40basic-default-hang.html

I have sent the 1st patch of the series to trybot again to see if the issue happens, as that is the only one that can impact current i915 behavior.
Comment 4 Chris Wilson 2016-11-14 15:15:43 UTC
(In reply to yann from comment #2)
> updating priority => basic
> 
> BTW, with our CI we are not seeing this issue (checked results from Linux:
> 4.9.0-rc1-CI-CI_DRM_1736+ x86_64 up to Linux: 4.9.0-rc4-CI-CI_DRM_1835+
> x86_64) on IVB

Note that we have only recently enabled GEM_BUG_ON() for CI. Prior to this, we have been seeing sporadic failures on ivb for GPU resets. This could be the same issue; certainly something is very odd for us to trigger this particular BUG_ON.
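
The reasoning in comment 1 and the note above can be illustrated with a toy model. This is a sketch under stated assumptions and not the i915 code: `ToyRequest`, `ToyEngine`, `wait_request`, `retire` and the `debug_checks` flag are all hypothetical names, and the real driver's seqno handling is more involved. The model shows only the invariant being discussed: a wait that returns a non-negative value implies the request completed, so a completion assertion on retire should never fire after a successful wait. With the check disabled, retiring an incomplete request goes unnoticed.

```python
from dataclasses import dataclass


@dataclass
class ToyRequest:
    seqno: int  # hypothetical: sequence number the request completes at


class ToyEngine:
    def __init__(self):
        self.hw_seqno = 0  # last seqno the toy "hardware" reports as finished


def is_completed(engine, request):
    return engine.hw_seqno >= request.seqno


def wait_request(engine, request, ticks, advance):
    """Poll until completion, calling advance(engine) between polls.

    Returns the remaining ticks (>= 0) on completion, or -1 on timeout.
    """
    remaining = ticks
    while True:
        if is_completed(engine, request):
            return remaining
        if remaining == 0:
            return -1
        advance(engine)
        remaining -= 1


def retire(engine, request, debug_checks=True):
    """Retire a request; with debug checks on, assert that it has completed."""
    if debug_checks:
        assert is_completed(engine, request), "retiring an incomplete request"
    return request.seqno
```

In this model, `retire` can only trip its assertion if it is reached without a successful wait, which is why hitting the equivalent check in the driver looks "very odd". Passing `debug_checks=False` mimics a build where the assertion is compiled out and the same condition passes silently.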
Comment 5 Dorota Czaplejewicz 2017-01-30 17:58:52 UTC
I tried to reproduce this on IVB-3770 with a single DVI output attached.

Kernel 396d17a6: drm-tip: 2017y-01m-25d-11h-07m-11s (CI build 2107) *passes* this test.
Comment 6 Sagar Kamble 2017-01-30 17:59:32 UTC
Created attachment 129234
attachment-306-0.html

Laptop charging issues.
Limited email access till 31/1 AM.

Thanks
Sagar
Comment 7 Sagar Kamble 2017-01-31 02:44:47 UTC
Verified by Dorota

