Bug 72740 - [SNB/IVB/HSW/BYT/BDW]igt/gem_reset_stats causes *ERROR* render ring hung inside bo (0x592000 ctx 2) at 0x5928a4
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: high major
Assignee: Daniel Vetter
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-12-16 03:21 UTC by lu hua
Modified: 2017-10-06 14:41 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (89.37 KB, text/plain)
2013-12-16 03:21 UTC, lu hua
error log (2.09 MB, text/plain)
2013-12-16 03:22 UTC, lu hua
dmesg on nightly e9c4f4 (58.32 KB, text/plain)
2013-12-23 07:11 UTC, lu hua

Description lu hua 2013-12-16 03:21:36 UTC
Created attachment 90816
dmesg

System Environment:
--------------------------
Platform: SNB/IVB/HSW/BYT
kernel:   (drm-intel-nightly)22dd82bce1840fe48d683a29cccc443281b10248

Bug detailed description:
-------------------------
It happens on SNB/IVB/HSW/BYT with the -queued, -fixes and -nightly kernels. It's a new case.

Run: ./gem_reset_stats --run-subtest ban

output:
IGT-Version: 1.5-g62e1cbc (i686) (Linux: 3.13.0-rc3_drm-intel-nightly_22dd82_20131215+ i686)
Subtest ban: SUCCESS


Reproduce steps:
----------------------------
1. ./gem_reset_stats --run-subtest ban
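
Because the subtest itself reports SUCCESS, the failure is only visible in the kernel log. A minimal Python sketch of a check over a saved dmesg dump could look like the following. The helper name `find_reset_errors` and the substring-matching approach are illustrative assumptions, not part of igt; the two substrings are copied from the messages quoted in this report.

```python
# Substrings copied from the messages quoted in this bug report.
HANG_MSG = "render ring hung inside bo"
BAN_MSG = "context hanging too fast, declaring banned"


def find_reset_errors(dmesg_text):
    """Return dmesg lines that carry '*ERROR*' plus one of the known messages.

    Hypothetical helper for illustration; it only does substring matching.
    """
    hits = []
    for line in dmesg_text.splitlines():
        if "*ERROR*" not in line:
            continue
        if HANG_MSG in line or BAN_MSG in line:
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        "*ERROR* render ring hung inside bo (0x592000 ctx 2) at 0x5928a4",
        "*ERROR* context hanging too fast, declaring banned!",
        "Subtest ban: SUCCESS",
    ])
    for hit in find_reset_errors(sample):
        print(hit)
```

An empty result would mean neither message is present in the dump, regardless of what the subtest printed.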
Comment 1 lu hua 2013-12-16 03:22:57 UTC
Created attachment 90817
error log
Comment 2 Daniel Vetter 2013-12-16 09:13:35 UTC
Mika, do we need additional tricks in the testcase to tell the kernel that hangs are expected here? I thought we had that integrated by now ...

If we set the stop_rings stuff _after_ having emitted the special batch, that shouldn't interfere with your testcase.
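
The ordering suggested here can be sketched as follows. This is a Python illustration only, not the igt implementation: `submit_hang_batch` is a hypothetical callback standing in for whatever emits the special batch, and the debugfs path and the `0xf` mask mentioned in the comments are assumptions about the stop_rings interface.

```python
def inject_hang_then_stop_rings(submit_hang_batch, ring_stop_path, mask="0xf"):
    """Emit the hanging batch first, then write the stop mask.

    submit_hang_batch: hypothetical callable that submits the special batch.
    ring_stop_path:    file to write the mask to; assumed to be something
                       like /sys/kernel/debug/dri/0/i915_ring_stop.
    mask:              assumed value meaning "stop all rings".
    """
    # Step 1: emit the special batch before touching stop_rings, so that
    # stopping the rings cannot interfere with submitting it.
    submit_hang_batch()
    # Step 2: only now request that the rings be stopped.
    with open(ring_stop_path, "w") as f:
        f.write(mask)
```

The point of the sketch is purely the order of the two steps; reversing them is the case the comment above warns about.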
Comment 3 Daniel Vetter 2013-12-16 09:14:12 UTC
We might also need a kernel patch to tune down the "hanging too fast" message a bit.
Comment 4 lu hua 2013-12-19 03:07:01 UTC
It also happens on Broadwell.
Comment 5 lu hua 2013-12-23 07:11:49 UTC
Created attachment 91142
dmesg on nightly e9c4f4

Tested on the latest nightly kernel: the test doesn't exit.
Call Trace:
[   59.916047]  [<f81c745d>] ? i915_reg_read_ioctl+0x49/0x49 [i915]
[   59.916079]  [<f80847ce>] ? drm_ioctl+0x222/0x30c [drm]
[   59.916110]  [<f81c745d>] ? i915_reg_read_ioctl+0x49/0x49 [i915]
[   59.916140]  [<c02a7cbe>] ? page_add_new_anon_rmap+0x44/0x9e
[   59.916170]  [<f80845ac>] ? drm_core_reclaim_buffers+0x52/0x52 [drm]
[   59.916201]  [<c02c39c9>] ? do_vfs_ioctl+0x3f6/0x43d
[   59.916226]  [<c02b3ce9>] ? kmem_cache_free+0xb7/0xbe
[   59.916250]  [<c02b3ce9>] ? kmem_cache_free+0xb7/0xbe
[   59.916275]  [<c02c6363>] ? dentry_kill+0x14d/0x162
[   59.916299]  [<c02c6363>] ? dentry_kill+0x14d/0x162
[   59.916322]  [<c02c6363>] ? dentry_kill+0x14d/0x162
[   59.916345]  [<c02c6653>] ? dput+0xc9/0xd0
[   59.916366]  [<c02b8200>] ? __fput+0x172/0x191
[   59.916388]  [<c02c3a59>] ? SyS_ioctl+0x49/0x74
[   59.916411]  [<c089a13a>] ? sysenter_do_call+0x12/0x22
[   59.916435] Code: 00 00 e8 f0 93 06 c8 84 c0 74 19 8b 87 e0 19 00 00 bf 02 00 00 00 25 ff ff ff 7f 40 99 f7 ff 89 43 08 eb 07 c7 43 08 00 00 00 00 <8b> 45 1c 89 43 0c 8b 45 18 89 43 10 8b 04 24 e8 3e d2 6c c8 eb
[   59.916657] EIP: [<f81c74fc>] i915_get_reset_stats_ioctl+0x9f/0xc2 [i915] SS:ESP 0068:f43ebe60
[   59.916706] CR2: 000000000000001c
[   59.916731] ---[ end trace 4dce54b5a276aded ]---
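
The oops above can be read a little further. The byte marked `<8b>` in the "Code:" line is where the fault occurred, and CR2 holds the faulting address. The Python sketch below decodes just that one instruction, assuming 32-bit code (the kernel here is i686) and handling only the single encoding needed; `decode_faulting_mov` is a hypothetical helper, not a general disassembler.

```python
import re

# Excerpt of the "Code:" line above; <..> marks the faulting instruction byte.
CODE_EXCERPT = "c7 43 08 00 00 00 00 <8b> 45 1c 89 43 0c 8b 45 18"
CR2 = 0x1C  # faulting address reported in the oops

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_faulting_mov(code):
    """Decode a 32-bit 'mov r32, [base+disp8]' at the <..> marker.

    Handles only opcode 0x8b with ModRM mod=01 and no SIB byte,
    which is all this particular oops needs.
    """
    m = re.search(r"<([0-9a-f]{2})>((?: [0-9a-f]{2})+)", code)
    if m is None:
        raise ValueError("no faulting-byte marker found")
    opcode = int(m.group(1), 16)
    rest = [int(b, 16) for b in m.group(2).split()]
    if len(rest) < 2:
        raise ValueError("not enough bytes after the marker")
    modrm, disp8 = rest[0], rest[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("unsupported encoding for this sketch")
    if disp8 >= 0x80:  # disp8 is a signed byte
        disp8 -= 0x100
    return REG32[reg], REG32[rm], disp8


if __name__ == "__main__":
    dest, base, disp = decode_faulting_mov(CODE_EXCERPT)
    print(f"mov {dest}, [{base}+{disp:#x}]")
    print(f"implied {base} value: {CR2 - disp:#x}")
```

The instruction is a load from `ebp+0x1c`, and CR2 is `0x1c`, so the base register held 0 at the time of the fault. That is consistent with a NULL pointer dereference at field offset 0x1c inside i915_get_reset_stats_ioctl; which pointer was NULL cannot be determined from the log alone.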
Comment 6 Guang Yang 2013-12-23 07:18:12 UTC
This case can't exit successfully and blocks QA's nightly testing on many platforms, so I'm raising the priority.
Comment 7 Daniel Vetter 2014-01-10 08:26:57 UTC
The original issue (of printing "*ERROR* render ring hung" in dmesg) should have been fixed with

commit 2dd312cbb80be1d8c8a199248095db85eb85155d
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Nov 20 16:58:16 2013 +0200

    tests/gem_reset_stats: stop rings after injecting hang

The hangs are new bugs, likely due to topic/ppgtt. Please file new reports for those and only reopen this bug here if you still see the dmesg noise, but the test otherwise succeeds.
Comment 8 lu hua 2014-01-13 06:49:16 UTC
The error still exists.
Comment 9 Daniel Vetter 2014-01-14 11:01:56 UTC
Please test

http://patchwork.freedesktop.org/patch/17914/
Comment 10 Daniel Vetter 2014-01-14 11:49:14 UTC
Fix merged:

commit 59ec90faa34a6a0b46a33f9adde974e867909993
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jan 14 11:40:54 2014 +0100

    drm/i915: Tune down reset_stat output from ERROR to debug
Comment 11 lu hua 2014-01-15 07:51:51 UTC
The "*ERROR* render ring hung inside bo (0x592000 ctx 2) at 0x5928a4" message has gone away.
The "*ERROR* context hanging too fast, declaring banned!" message still always appears.
Closing this bug.
Comment 12 Elizabeth 2017-10-06 14:41:22 UTC
Closing old verified.

