Bug 98460 - [SNB] [regression] [igt] gem_busy@extended-parallel-render Failed assertion: !"GPU hung"
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) All
Importance: highest critical
Assignee: maria guadalupe
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords: bisect_pending, regression
Depends on:
Blocks:
 
Reported: 2016-10-27 20:12 UTC by maria guadalupe
Modified: 2016-12-05 17:20 UTC

See Also:
i915 platform: SNB
i915 features: GPU hang


Attachments
kern.log (155.80 KB, text/plain)
2016-10-27 20:12 UTC, maria guadalupe
gpu_hang (71.01 KB, text/plain)
2016-10-27 20:13 UTC, maria guadalupe
gpu_crash (71.01 KB, text/plain)
2016-10-27 20:14 UTC, maria guadalupe
kmsg (108.22 KB, text/plain)
2016-10-27 22:05 UTC, maria guadalupe

Description maria guadalupe 2016-10-27 20:12:53 UTC
Created attachment 127567
kern.log

Bug description:
===========================================
The gem_busy subtest extended-parallel-render causes a GPU hang.

This is a test regression; please see the following information.

WW43 setup (pass)
=============
kernel commit 15dfed2b90e84e7c277f81842fc3f19355293061
igt commit 54f8a3f
drm commit a44c9c3
cairo commit db8a7f1

WW44 setup (fail)
==========
kernel commit 17dc529acb9a6a4328b419048e32df586b90646b
igt commit 93437cb
drm commit 9e24d0c
cairo commit db8a7f1
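
To see which components moved between the two setups, the commit IDs listed above can be compared directly. The following is a minimal sketch; the dictionary layout and the `changed_components` helper are illustrative names chosen here, not part of IGT or any QA tooling.

```python
# Commit IDs copied from the WW43 (pass) and WW44 (fail) setups above.
ww43 = {
    "kernel": "15dfed2b90e84e7c277f81842fc3f19355293061",
    "igt": "54f8a3f",
    "drm": "a44c9c3",
    "cairo": "db8a7f1",
}
ww44 = {
    "kernel": "17dc529acb9a6a4328b419048e32df586b90646b",
    "igt": "93437cb",
    "drm": "9e24d0c",
    "cairo": "db8a7f1",
}


def changed_components(good, bad):
    """Return {component: (good_commit, bad_commit)} for components that differ."""
    common = sorted(good.keys() & bad.keys())
    return {name: (good[name], bad[name]) for name in common if good[name] != bad[name]}


for name, (good_commit, bad_commit) in changed_components(ww43, ww44).items():
    print(f"{name}: {good_commit} -> {bad_commit}")
```

The kernel, igt, and drm commits all differ between the two weeks, while cairo is unchanged, so any of the first three could in principle carry the regression.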

test output
=================================================
IGT-Version: 1.16-g93437cb (x86_64) (Linux: 4.9.0-rc1-nightly+ x86_64)
(gem_busy:2214) igt-core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(gem_busy:2214) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_busy:2214) drmtest-DEBUG: Test requirement passed: drmSetMaster(fd) == 0
(gem_busy:2214) igt-aux-CRITICAL: Test assertion failure function sig_abort, file igt_aux.c:401:
(gem_busy:2214) igt-aux-CRITICAL: Failed assertion: !"GPU hung"
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [sig_abort+0x3a]
  #2 [killpg+0x40]
  #3 [has_extended_busy_ioctl+0x166]
  #4 [__real_main596+0x2b3]
  #5 [main+0x23]
  #6 [__libc_start_main+0xf0]
  #7 [_start+0x29]
  #8 [<unknown>+0x29]
Test gem_busy failed.
**** DEBUG ****
(gem_busy:2214) igt-core-DEBUG: Test requirement passed: !igt_run_in_simulation()
(gem_busy:2214) drmtest-DEBUG: Test requirement passed: !(fd<0)
(gem_busy:2214) drmtest-DEBUG: Test requirement passed: drmSetMaster(fd) == 0
(gem_busy:2214) igt-aux-CRITICAL: Test assertion failure function sig_abort, file igt_aux.c:401:
(gem_busy:2214) igt-aux-CRITICAL: Failed assertion: !"GPU hung"
****  END  ****
Subtest extended-parallel-render: FAIL
(gem_busy:2214) igt-aux-CRITICAL: Test assertion failure function sig_abort, file igt_aux.c:401:
(gem_busy:2214) igt-aux-CRITICAL: Failed assertion: !"GPU hung"
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [sig_abort+0x3a]
  #2 [killpg+0x40]
  #3 [has_extended_busy_ioctl+0x166]
  #4 [__real_main596+0xf53]
  #5 [main+0x23]
  #6 [__libc_start_main+0xf0]
  #7 [_start+0x29]
  #8 [<unknown>+0x29]
Test gem_busy failed.
**** DEBUG ****
(gem_busy:2214) igt-aux-CRITICAL: Test assertion failure function sig_abort, file igt_aux.c:401:
(gem_busy:2214) igt-aux-CRITICAL: Failed assertion: !"GPU hung"
****  END  ****
(gem_busy:2214) ioctl-wrappers-DEBUG: Test requirement passed: has_ban_period
(gem_busy:2214) igt-gt-DEBUG: Test requirement passed: has_gpu_reset(fd)
(gem_busy:2214) ioctl-wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ioctl_wrappers.c:604:
(gem_busy:2214) ioctl-wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(gem_busy:2214) ioctl-wrappers-CRITICAL: error: -5 != 0
Stack trace:
  #0 [__igt_fail_assert+0x101]
  #1 [gem_execbuf+0x44]
  #2 [has_extended_busy_ioctl+0x151]
  #3 [__real_main596+0xd04]
  #4 [main+0x23]
  #5 [__libc_start_main+0xf0]
  #6 [_start+0x29]
  #7 [<unknown>+0x29]
Test gem_busy failed.
**** DEBUG ****
(gem_busy:2214) ioctl-wrappers-DEBUG: Test requirement passed: has_ban_period
(gem_busy:2214) igt-gt-DEBUG: Test requirement passed: has_gpu_reset(fd)
(gem_busy:2214) ioctl-wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ioctl_wrappers.c:604:
(gem_busy:2214) ioctl-wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(gem_busy:2214) ioctl-wrappers-CRITICAL: error: -5 != 0
****  END  ****
(gem_busy:2214) igt-core-DEBUG: Exiting with status code 99


relevant dmesg output
=================================================
Oct 27 13:06:14 SNB-1 kernel: [  166.813633] [drm] GPU HANG: ecode 6:0:0xe77ffeff, in gem_busy [1272], reason: Hang on render ring, action: reset
Oct 27 13:06:14 SNB-1 kernel: [  166.813634] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Oct 27 13:06:14 SNB-1 kernel: [  166.813635] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Oct 27 13:06:14 SNB-1 kernel: [  166.813636] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Oct 27 13:06:14 SNB-1 kernel: [  166.813636] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Oct 27 13:06:14 SNB-1 kernel: [  166.813637] [drm] GPU crash dump saved to /sys/class/drm/card0/error
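
For triage across many logs, the fields of the `GPU HANG` line can be pulled out with a regular expression. This is only a sketch: `parse_gpu_hang` and the field names are hypothetical, and reading the first ecode field as the GPU generation is an assumption (6 is at least consistent with SNB).

```python
import re

# Pattern for the "[drm] GPU HANG: ..." line as it appears in the dmesg excerpt above.
# Group names are illustrative; "gen" and "ring" are assumed meanings of the
# first two ecode fields.
GPU_HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return a dict of hang fields, or None if the line is not a GPU HANG report."""
    match = GPU_HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("gen", "ring", "pid"):
        info[key] = int(info[key])
    return info


sample = (
    "Oct 27 13:06:14 SNB-1 kernel: [  166.813633] [drm] GPU HANG: "
    "ecode 6:0:0xe77ffeff, in gem_busy [1272], "
    "reason: Hang on render ring, action: reset"
)
print(parse_gpu_hang(sample))
```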

 Software information
============================================
Kernel version                  : 4.9.0-rc1-nightly+
Linux distribution              : Ubuntu 16.04.1 LTS
Architecture                    : 64-bit
Gfx stack code                  : 4283756084
xf86-video-intel version        : 2.99.917
Xorg-Xserver version            : 1.18.99.901 (1.19.0 RC 1)
DRM version                     : 2.4.71
VAAPI version                   : Intel i965 driver for Intel(R) Sandybridge Desktop - 1.7.3.pre1 (1.7.2-140-g852cea1)
Cairo version                   : 1.15.2
Intel GPU Tools version         : Tag [intel-gpu-tools-1.16-96-g93437cb] / Commit [93437cb]
Kernel driver in use            : i915
Bios revision                   : 4.6
doesn't have this firmware


 Hardware information
============================================
Platform                        : SNB
Motherboard model               : OptiPlex990
Motherboard type                : 06D7TR Mini Tower
Motherboard manufacturer        : DellInc.
CPU family                      : Core i7
CPU information                 : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
GPU Card                        : Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Memory ram                      : 16 GB
Maximum memory ram allowed      : 32 GB
CPU thread                      : 8
CPU core                        : 4
Signature                       : Type 0, Family 6, Model 42, Stepping 7
Hard drive capacity             : 111GiB (120GB)

 Kernel parameters
============================================
 quiet drm.debug=0xe resume=/dev/sda3

Attachments
==================================================
kern.log
gpu_hang.log
gpu_crash.log
Comment 1 maria guadalupe 2016-10-27 20:13:33 UTC
Created attachment 127568
gpu_hang
Comment 2 maria guadalupe 2016-10-27 20:14:09 UTC
Created attachment 127569
gpu_crash
Comment 3 Chris Wilson 2016-10-27 20:22:41 UTC
Can you please make sure your kernel is up to date and allows /dev/kmsg access?
Comment 4 maria guadalupe 2016-10-27 22:03:51 UTC
(In reply to Chris Wilson from comment #3)
> Can you please make sure your kernel is up to date and allows /dev/kmsg access?

I tried with the latest kernel and the test is still failing. I have attached the kmsg output.
Comment 5 maria guadalupe 2016-10-27 22:05:09 UTC
Created attachment 127572
kmsg
Comment 6 Chris Wilson 2016-10-28 07:41:31 UTC
14,1280,255241527,-;[IGT] gem_busy: executing
7,1281,255241602,-;[drm:i915_gem_open [i915]] 
7,1282,255241959,-;[drm:i915_gem_open [i915]] 
6,1283,268864341,-;[drm] GPU HANG: ecode 6:0:0xe77ffeff, in gem_busy [1280], reason: Hang on render ring, action: reset
6,1284,268864342,-;[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
6,1285,268864342,-;[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
6,1286,268864342,-;[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
6,1287,268864343,-;[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
6,1288,268864343,-;[drm] GPU crash dump saved to /sys/class/drm/card0/error
7,1289,268864371,-;[drm:i915_reset_and_wakeup [i915]] resetting chip
5,1290,268864378,-;drm/i915: Resetting chip after gpu hang
7,1291,268864397,-;[drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x22
7,1292,268864419,-;[drm:intel_print_rc6_info [i915]] Enabling RC6 states: RC6 on RC6p off RC6pp off
7,1293,268868451,-;[drm:intel_guc_setup [i915]] GuC fw status: path (null), fetch NONE, load NONE
7,1294,281854085,-;[drm:i915_reset_and_wakeup [i915]] resetting chip
5,1295,281854091,-;drm/i915: Resetting chip after gpu hang
7,1296,281854111,-;[drm:i915_gem_reset [i915]] resetting render ring to restart from tail of request 0x23
14,1304,297808259,-;[IGT] gem_busy: exiting, ret=99

Either we are not getting all of the [IGT] breadcrumbs, or it dies before we even start the test.
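
The breadcrumb check above can be sketched in code. Each /dev/kmsg record has the form `priority,sequence,timestamp_usec,flags;message`, where the priority value combines the syslog facility and level. The snippet below parses a few of the records quoted above, lists the `[IGT]` breadcrumbs, and prints the time elapsed since the first record. `KmsgRecord` and `parse_kmsg` are hypothetical names used only for this illustration.

```python
from typing import NamedTuple


class KmsgRecord(NamedTuple):
    facility: int
    level: int
    seq: int
    usec: int
    message: str


def parse_kmsg(line):
    """Split one /dev/kmsg record into its header fields and message text."""
    header, _, message = line.partition(";")
    prio, seq, usec = header.split(",")[:3]
    prio = int(prio)
    # The low 3 bits of the priority are the log level; the rest is the facility.
    return KmsgRecord(prio >> 3, prio & 7, int(seq), int(usec), message)


# A subset of the records quoted in this comment.
log = """\
14,1280,255241527,-;[IGT] gem_busy: executing
6,1283,268864341,-;[drm] GPU HANG: ecode 6:0:0xe77ffeff, in gem_busy [1280], reason: Hang on render ring, action: reset
7,1294,281854085,-;[drm:i915_reset_and_wakeup [i915]] resetting chip
14,1304,297808259,-;[IGT] gem_busy: exiting, ret=99
"""

records = [parse_kmsg(line) for line in log.splitlines()]
breadcrumbs = [r.message for r in records if r.message.startswith("[IGT]")]

start = records[0].usec
for r in records:
    print(f"+{(r.usec - start) / 1e6:8.3f}s  {r.message[:50]}")
print("IGT breadcrumbs:", breadcrumbs)
```

Only the "executing" and "exiting" breadcrumbs show up, with no subtest marker between them, and the first hang is reported roughly 13.6 seconds after "executing".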
Comment 7 Jari Tahvanainen 2016-11-08 04:37:19 UTC
Priority set to highest because this is a regression with no workaround.
Comment 8 Nobody 2016-11-29 14:57:36 UTC
Lupita, could you re-test with the latest configuration?
Comment 9 maria guadalupe 2016-11-29 19:14:43 UTC
(In reply to ricardo.vega from comment #8)
> Lupita, could you re-test with the latest configuration?

This issue does not occur with the following configuration:

 Software information
============================================
Kernel version                  : 4.9.0-rc7drm-intel-qa-ww49-commit-477cc06+
Linux distribution              : Ubuntu 16.04.1 LTS
Architecture                    : 64-bit
Gfx stack code                  : 4152731784
xf86-video-intel version        : 2.99.917
Xorg-Xserver version            : 1.18.3
DRM version                     : 2.4.73
Cairo version                   : 1.15.2
Intel GPU Tools version         : Tag [intel-gpu-tools-1.16-166-gd072258] / Commit [d072258]
Kernel driver in use            : i915
Bios revision                   : 4.6

 Kernel parameters
============================================
 quiet drm.debug=0xe resume=/dev/sd

result
============================================
IGT-Version: 1.16-gd072258 (x86_64) (Linux: 4.9.0-rc7drm-intel-qa-ww49-commit-477cc06+ x86_64)
Subtest extended-parallel-render: SUCCESS (0.040s)

