Bug 88482 - [BSW]igt/gem_evict_alignment/minor-hang fails
Summary: [BSW]igt/gem_evict_alignment/minor-hang fails
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium major
Assignee: Rami
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 90063
Depends on:
Blocks:
 
Reported: 2015-01-16 06:05 UTC by Ding Heng
Modified: 2016-02-22 11:33 UTC
CC List: 3 users

See Also:
i915 platform: BSW/CHT
i915 features: GPU hang


Attachments
dmesg for this case (123.33 KB, text/plain)
2015-01-16 06:05 UTC, Ding Heng
dmesg info (123.15 KB, text/plain)
2015-04-08 03:27 UTC, ye.tian
i915_error_state info (2.71 MB, text/plain)
2015-04-08 03:28 UTC, ye.tian
dmesg info (123.58 KB, text/plain)
2015-04-20 07:16 UTC, ye.tian
dmesg (191.77 KB, text/plain)
2015-11-20 11:24 UTC, Rami

Description Ding Heng 2015-01-16 06:05:20 UTC
Created attachment 112328 [details]
dmesg for this case

==System Environment==
--------------------------
Regression: No, this is a new case.

Non-working platforms: BSW
==kernel==
--------------------------
origin/drm-intel-nightly:95cce4b4c5f3ecaf9c1c01d42f670da2748fcffb(fails)

==Bug detailed description==
-----------------------------

dmesg -r | egrep "<[1-6]>" |grep drm                            
<6>[  160.817604] [drm] stuck on render ring
<6>[  160.847315] [drm] GPU HANG: ecode 8:0:0xe757fffe, reason: Ring hung, action: reset
<6>[  160.847329] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[  160.847332] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[  160.847348] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[  160.847353] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[  160.847356] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5>[  160.849206] drm/i915: Resetting chip after gpu hang
<4>[  162.820074] WARNING: CPU: 0 PID: 891 at drivers/gpu/drm/i915/intel_pm.c:3874 valleyview_set_rps+0x58/0x153 [i915]()
<4>[  162.820080] Modules linked in: ipv6 dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support pcspkr serio_raw snd_hda_codec_realtek snd_hda_codec_generic i2c_i801 lpc_ich mfd_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore battery ac acpi_cpufreq joydev i915 button video drm_kms_helper drm cfbfillrect cfbimgblt cfbcopyarea
<4>[  162.820132] CPU: 0 PID: 891 Comm: kworker/0:1 Tainted: G        W      3.19.0-rc4_drm-intel-nightly_95cce4_20150116+ #498

==Reproduce steps==
---------------------------- 
1. ./gem_evict_alignment --run-subtest minor-hang
Comment 1 Ding Heng 2015-01-16 07:13:13 UTC
The memory on my DUT seems insufficient. I don't know whether that has anything to do with the cause of this issue. Here is the output when running the case:

./gem_evict_alignment --run-subtest minor-hang
IGT-Version: 1.9-g3214a27 (x86_64) (Linux: 3.19.0-rc4_drm-intel-nightly_95cce4_20150116+ x86_64)
Test requirement not met in function intel_require_memory, file intel_os.c:244:
Test requirement: !(total <= required)
Estimated that we need 3222798336 bytes for the test, but only have 1890582528 bytes available (RAM)
Subtest minor-hang: SKIP (0.034s)
Comment 2 ye.tian 2015-04-08 03:25:23 UTC
Tested it on the latest nightly kernel; it causes a GPU hang and a call trace.
Please see the dmesg and i915 error info.

output:
-----------------
root@x-bsw08:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_evict_alignment --run-subtest minor-hang
IGT-Version: 1.10-ga6c3b32 (x86_64) (Linux: 4.0.0-rc6_drm-intel-nightly_333cf6_20150403+ x86_64)



dmesg info:
-----------

[  193.823487] [drm] stuck on render ring
[  193.845102] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
[  193.845109] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  193.845112] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  193.845115] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  193.845118] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  193.845121] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  193.845332] [drm:i915_reset_and_wakeup] resetting chip
[  193.878459] drm/i915: Resetting chip after gpu hang
[  193.878479] [drm:gen8_init_common_ring] Execlists enabled for render ring
[  193.878487] [drm:gen8_init_common_ring] Execlists enabled for bsd ring
[  193.878494] [drm:gen8_init_common_ring] Execlists enabled for blitter ring
[  193.878499] [drm:gen8_init_common_ring] Execlists enabled for video enhancement ring
[  195.824320] [drm:cherryview_enable_rps] GT fifo had a previous error 1080000
[  195.826383] [drm:cherryview_enable_rps] GPLL enabled? yes
[  195.826388] [drm:cherryview_enable_rps] GPU status: 0x00004010
[  195.826393] [drm:cherryview_enable_rps] current GPU freq: 640 MHz (64)
[  195.826397] [drm:cherryview_enable_rps] setting GPU freq to 400 MHz (40)
[  199.834211] [drm] stuck on render ring
[  199.856054] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
[  199.856153] [drm:i915_reset_and_wakeup] resetting chip
[  199.861094] drm/i915: Resetting chip after gpu hang
[  199.861197] [drm:gen8_init_common_ring] Execlists enabled for render ring
[  199.861205] [drm:gen8_init_common_ring] Execlists enabled for bsd ring
[  199.861211] [drm:gen8_init_common_ring] Execlists enabled for blitter ring
[  199.861216] [drm:gen8_init_common_ring] Execlists enabled for video enhancement ring
[  201.826936] [drm:cherryview_enable_rps] GT fifo had a previous error 1080000
[  201.828550] [drm:cherryview_enable_rps] GPLL enabled? yes
[  201.828555] [drm:cherryview_enable_rps] GPU status: 0x00004010
[  201.828560] [drm:cherryview_enable_rps] current GPU freq: 640 MHz (64)
[  201.828564] [drm:cherryview_enable_rps] setting GPU freq to 400 MHz (40)
[  205.836983] [drm] stuck on render ring
[  205.857800] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
[  205.857937] [drm:i915_reset_and_wakeup] resetting chip
[  265.242533] INFO: rcu_sched self-detected stall on CPU { 2}  (t=60000 jiffies g=3276 c=3275 q=712)
[  265.242711] Task dump for CPU 2:
[  265.242715] gem_evict_align R  running task        0  4251   4229 0x0000000c
[  265.242722]  ffffffff81bc84c0 0000000000000002 ffffffff81076b90 0000000000000000
[  265.242728]  ffffffff81bc84c0 ffff88017fd14000 ffff880002973b88 0000000000000000
[  265.242734]  ffffffff810792b0 000000017fd133c0 00000000000002c8 00000f4240000000
[  265.242740] Call Trace:
[  265.242744]  <IRQ>  [<ffffffff81076b90>] ? rcu_dump_cpu_stacks+0x63/0x83
[  265.242763]  [<ffffffff810792b0>] ? rcu_check_callbacks+0x202/0x5e9
[  265.242770]  [<ffffffff81081046>] ? timekeeping_update.constprop.7+0xc5/0xef
[  265.242778]  [<ffffffff81087b61>] ? tick_sched_handle+0x3a/0x3a
[  265.242784]  [<ffffffff8107cb2c>] ? update_process_times+0x24/0x47
[  265.242790]  [<ffffffff81087b56>] ? tick_sched_handle+0x2f/0x3a
[  265.242796]  [<ffffffff81087b90>] ? tick_sched_timer+0x2f/0x55
[  265.242802]  [<ffffffff8107cec1>] ? __run_hrtimer+0x70/0x14d
[  265.242808]  [<ffffffff8107d623>] ? hrtimer_interrupt+0xcb/0x1b4
[  265.242815]  [<ffffffff81026b54>] ? smp_apic_timer_interrupt+0x34/0x43
[  265.242823]  [<ffffffff8179b76a>] ? apic_timer_interrupt+0x6a/0x70
[  265.242826]  <EOI>  [<ffffffffa009385b>] ? i915_gem_execbuffer_reserve+0x66/0x2e3 [i915]
[  265.242895]  [<ffffffffa009381a>] ? i915_gem_execbuffer_reserve+0x25/0x2e3 [i915]
[  265.242925]  [<ffffffffa00940a7>] ? i915_gem_do_execbuffer.isra.12+0x5cf/0xd6d [i915]
[  265.242933]  [<ffffffff81104696>] ? alloc_pages_current+0xad/0xca
[  265.242940]  [<ffffffff810e9e32>] ? kmalloc_order+0x10/0x3d
[  265.242945]  [<ffffffff810e9e7b>] ? kmalloc_order_trace+0x1c/0x7e
[  265.242952]  [<ffffffff81043ead>] ? dequeue_signal+0x9f/0x119
[  265.242982]  [<ffffffffa00957ff>] ? i915_gem_execbuffer2+0x16e/0x205 [i915]
[  265.242999]  [<ffffffffa00047a9>] ? drm_ioctl+0x322/0x38d [drm]
[  265.243004]  [<ffffffff81046112>] ? __set_current_blocked+0x2d/0x40
[  265.243034]  [<ffffffffa0095691>] ? i915_gem_execbuffer+0x339/0x339 [i915]
[  265.243041]  [<ffffffff8107db14>] ? hrtimer_nanosleep+0x8a/0x110
[  265.243048]  [<ffffffff8111d9a1>] ? do_vfs_ioctl+0x360/0x424
[  265.243054]  [<ffffffff81046bd9>] ? restore_altstack+0xf/0x21
[  265.243060]  [<ffffffff8111daae>] ? SyS_ioctl+0x49/0x77
[  265.243065]  [<ffffffff8179afbd>] ? stub_rt_sigreturn+0x6d/0xb0
[  265.243071]  [<ffffffff8179a972>] ? system_call_fastpath+0x12/0x17
[  445.323989] INFO: rcu_sched self-detected stall on CPU { 2}  (t=240003 jiffies g=3276 c=3275 q=6349)
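
A minimal sketch, not part of the original report, of how the repeated hang/reset cycle in the dmesg excerpt above could be summarized. It assumes the log lines have their wrapped whitespace rejoined; the names `HANG_RE` and `hang_events` are hypothetical helpers, and the sample lines are copied from this comment.

```python
import re

# Sample lines copied from the dmesg excerpt in this comment (wrapping rejoined).
DMESG = """\
[  193.823487] [drm] stuck on render ring
[  193.845102] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
[  199.834211] [drm] stuck on render ring
[  199.856054] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
[  205.836983] [drm] stuck on render ring
[  205.857800] [drm] GPU HANG: ecode 8:0:0xe757fffe, in gem_evict_align [4253], reason: Ring hung, action: reset
"""

# Hypothetical pattern: kernel timestamp, then the "GPU HANG: ecode ..." field.
HANG_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (\S+?),")


def hang_events(text):
    """Return a list of (timestamp_seconds, ecode) for each GPU HANG line."""
    events = []
    for line in text.splitlines():
        match = HANG_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


events = hang_events(DMESG)
# Seconds between consecutive hang reports.
intervals = [round(b[0] - a[0], 3) for a, b in zip(events, events[1:])]

for stamp, ecode in events:
    print(f"{stamp:10.3f}s  ecode {ecode}")
print("intervals between hangs (s):", intervals)
```

Run against the excerpt, this reports three hangs with the same ecode, roughly six seconds apart, before the rcu_sched stall appears.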
Comment 3 ye.tian 2015-04-08 03:27:49 UTC
Created attachment 114939 [details]
dmesg info
Comment 4 ye.tian 2015-04-08 03:28:31 UTC
Created attachment 114940 [details]
i915_error_state info
Comment 5 ye.tian 2015-04-20 07:10:05 UTC
Tested on BSW with the latest nightly kernel (d60065) and the latest igt (1.10-gbeddb3b); this case still fails.

output:
----------------
root@x-bsw01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./gem_evict_alignment  --run-subtest minor-hang
IGT-Version: 1.10-gbeddb3b (x86_64) (Linux: 4.0.0_drm-intel-nightly_d60065_20150417+ x86_64)
Test assertion failure function copy, file gem_evict_alignment.c:130:
Failed assertion: ret == error
error: 0 != 28
Stack trace:
  #0 [__igt_fail_assert+0xf1]
  #1 [major_evictions.constprop.0+0x0]
  #2 [minor_evictions.constprop.1+0x14e]
  #3 [__real_main194+0x1bc]
  #4 [main+0x21]
  #5 [__libc_start_main+0xf5]
  #6 [_start+0x29]
  #7 [<unknown>+0x29]
Subtest minor-hang failed.
**** DEBUG ****
Checking 3072 surfaces of size 1048576 bytes (total 3222798336) against RAM
Test requirement passed: !(total <= required)
Test requirement passed: !igt_run_in_simulation()
Test assertion failure function copy, file gem_evict_alignment.c:130:
Failed assertion: ret == error
error: 0 != 28
****  END  ****
Subtest minor-hang: FAIL (2.951s)

real    0m6.263s
user    0m0.018s
sys     0m3.516s
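
A small sketch, not part of the original report, for decoding the value 28 in the failed assertion above. It assumes the number is a Linux errno and that the output "0 != 28" means `ret` was 0 while the expected `error` was 28; that reading of the log is an assumption and is not confirmed in this thread.

```python
import errno
import os

# Value taken from the "error: 0 != 28" line in the test output above.
code = 28

# Map the number to its symbolic name and the platform's message text.
name = errno.errorcode.get(code, "unknown")
message = os.strerror(code)

print(f"errno {code}: {name} ({message})")
```

On Linux, errno 28 is ENOSPC, which would suggest the test expected the copy to fail for lack of space and instead saw success.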
Comment 6 ye.tian 2015-04-20 07:16:40 UTC
Created attachment 115205 [details]
dmesg info
Comment 7 ye.tian 2015-04-20 07:25:21 UTC
Fail and Skip subcases
---------------
minor-normal                 FAIL
major-normal                 SKIP
minor-interruptible          FAIL
major-interruptible          SKIP
minor-hang                   FAIL
major-hang                   SKIP
Comment 8 lu hua 2015-06-23 03:26:58 UTC
*** Bug 90063 has been marked as a duplicate of this bug. ***
Comment 9 cprigent 2015-10-06 08:13:35 UTC
Reproduced on BSW:
Platform: Braswell M
CPU : Intel(R) CPU  @ 1.52 GHz (family: 6, model: 76 stepping: 3)
SoC : BSW C0
CRB : BRASWELL RVP Fab2
Mandatory Reworks : All
Feature Reworks: F28, F32,F33 & F37
Optional reworks : O-01a
Software
BIOS : SKLSE2R1.R00.X093.B02.1507222151
ME FW : 11.0.0.1157
Ksc (EC FW): 1.15
Linux distribution: Ubuntu 14.04 LTS 64 bits
Kernel: drm-intel-nightly 78a01ed08ac09d84cb47db59dd10fe9de1ee6c4a 4.3.0-rc2 from git://anongit.freedesktop.org/drm-intel
cairo: (HEAD, origin/master, origin/HEAD, master) f6c46d9473e40d4a3363c96e1fc7fffc81ed12e7 from git://git.freedesktop.org/git/cairo
drm: (HEAD, origin/master, origin/HEAD, master) c3301d013444b7b5d02c58307e188e292d8cf18a from git://git.freedesktop.org/git/mesa/drm
intel-driver: (HEAD, origin/master, origin/HEAD, master) 29f4234504fd99299997a3fc2f01393fb77030b7 from git://git.freedesktop.org/git/vaapi/intel-driver
libva: (HEAD, origin/master, origin/HEAD, master) fdd6ee00c916f530e4d0aa1b250633643999dcf1 from git://git.freedesktop.org/git/vaapi/libva
mesa: (HEAD, origin/master, origin/HEAD) 30e84530a097278c7cf01c0491dba5866510c4c5 from git://git.freedesktop.org/git/mesa/mesa
xf86-video-intel: (HEAD, origin/master, origin/HEAD, master) 300319e2044cb1050e9cbc49c9985b995eaca5fe from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
xserver: (HEAD, origin/master, origin/HEAD, master) bcb60a49c5e74aa11d0256874659afddea91e53d from git://git.freedesktop.org/git/xorg/xserver
intel-gpu-tools: (HEAD, origin/master, origin/HEAD, master) 88cbb41ade5a66f96b7cd3844ce86f43d192afa0 from git://git.freedesktop.org/git/xorg/app/intel-gpu-tools 

Kernel:
commit 78a01ed08ac09d84cb47db59dd10fe9de1ee6c4a
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Mon Aug 31 18:43:56 2015 +0300
drm-intel-nightly: 2015y-08m-31d-15h-42m-59s UTC integration manifest
Comment 10 cprigent 2015-10-20 06:41:07 UTC
Reproduced on BSW:

Platform: Braswell M
CPU : Intel(R) Celeron N3060 1.60GHz @ 1.6 GHz (family: 6, model: 76 stepping: 4)
SoC : BSW D0
QDF : K6XC
CRB : BRASWELL RVP Fab2
Mandatory Reworks : All 
Feature Reworks: F28, F32, F33, F35, F37
Optional reworks : O-01a; O-02, O-03
BIOS : BRAS.X64.B084.R00.1508310642
TXE FW : 2.0.0.2073
Ksc : 1.08

Linux distribution: Ubuntu 14.04 LTS 64 bits
kernel 4.3.0-rc5-drm-intel-nightly+ 819f710081d7ea116b9b44a9264061d2c030f009 from git://anongit.freedesktop.org/drm-intel
Mesa - 11.0.3 from http://cgit.freedesktop.org/mesa/mesa/
xf86-video-intel - 2.99.917 from http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/
Libdrm - 2.4.65 from http://cgit.freedesktop.org/mesa/drm/
Libva - 1.6.1 from http://cgit.freedesktop.org/libva/
vaapi intel-driver - 1.6.1 from http://cgit.freedesktop.org/vaapi/intel-driver
Cairo - 1.14.2 from http://cgit.freedesktop.org/cairo
Xorg Xserver - 1.17.2 from http://cgit.freedesktop.org/xorg/xserver

Kernel commit 819f710081d7ea116b9b44a9264061d2c030f009
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Oct 14 19:05:17 2015 +0200
drm-intel-nightly: 2015y-10m-14d-17h-04m-36s UTC integration manifest
Comment 11 cprigent 2015-11-08 12:48:27 UTC
Reproduced on BSW-M with the latest setup:

Platform: Braswell M
CPU : Intel(R) Celeron N3060 1.60GHz @ 1.6 GHz (family: 6, model: 76 stepping: 4)
SoC : BSW D0
QDF : K6XC
CRB : BRASWELL RVP Fab2
Mandatory Reworks : All 
Feature Reworks: F28, F32, F33, F35, F37
Optional reworks : O-01a; O-02, O-03
Software
BIOS : SKLSE2R1.R00.X097.B02.1509020030
ME FW : 11.0.0.1173
Ksc (EC FW): 1.19
Linux distribution: Ubuntu 14.04 LTS 64 bits
kernel 4.3.0-rc7-drm-intel-nightly (86ba603) from git://anongit.freedesktop.org/drm-intel
  commit 86ba603f327626055fe1436112b3786eaaaf7fb1
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date:   Sat Oct 31 09:27:21 2015 +0100
  drm-intel-nightly: 2015y-10m-31d-08h-26m-39s UTC integration manifest
Mesa 11.0.4 from http://cgit.freedesktop.org/mesa/mesa/
xf86-video-intel - 2.99.917 from http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/
Libdrm - 2.4.65 from http://cgit.freedesktop.org/mesa/drm/
Libva - 1.6.1 from http://cgit.freedesktop.org/libva/
vaapi intel-driver - 1.6.1 from http://cgit.freedesktop.org/vaapi/intel-driver
Cairo - 1.14.2 from http://cgit.freedesktop.org/cairo
Xorg Xserver - 1.17.2 from http://cgit.freedesktop.org/xorg/xserver
Comment 12 cprigent 2015-11-17 17:31:29 UTC
Bug scrub
Rami,
Could you check with the latest setup and provide igt and kernel logs?
Thanks
Comment 13 Rami 2015-11-20 11:24:08 UTC
Created attachment 119979 [details]
dmesg

With the latest setup:
Hardware:
Platform: Braswell M 
CPU : Intel(R) Celeron N3060 1.60GHz @ 1.6 GHz (family: 6, model: 76 stepping: 4)
SoC : BSW C0
QDF : K6XC
CRB : BRASWELL RVP Fab2
Mandatory Reworks : All
Feature Reworks: F28, F32, F33, F35, F37
Optional reworks : O-01a; O-02, O-03 

Software:
Linux distribution: Ubuntu 15.04 LTS 64 bits 
BIOS : BRAS.X64.B084.R00.1508310642
TXE FW : 2.0.0.2073
Ksc : 1.08
kernel  drm-intel-nightly: 2015y-11m-12d-15h-35m-53s UTC integration manifest
commit 4c2531304c0a2f36f6b2cce2add5b5b2bd3fd893
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Thu Nov 12 17:36:12 2015 +0200
cairo: (HEAD, tag: 1.14.2) 93422b3cb5e0ef8104b8194c8873124ce2f5ea2d from git://git.freedesktop.org/git/cairo
drm: (HEAD, tag: libdrm-2.4.65, tag: 2.4.65) c3496167637e35cf8a52d5e7e53a412e79d80db0 from git://git.freedesktop.org/git/mesa/drm
intel-driver: (HEAD, tag: 1.6.1, origin/v1.6-branch) 35858c69166b845c59ca32e19a3dbb0b758df209 from git://git.freedesktop.org/git/vaapi/intel-driver
libva: (HEAD, tag: libva-1.6.1, origin/v1.6-branch) 613eb962b45fbbd1526d751e88e0d8897af6c0e0 from git://git.freedesktop.org/git/vaapi/libva
mesa: (HEAD, tag: mesa-11.0.5) ee57c22141c42d9b511a7dfa5971c4428cd1c6e7 from git://git.freedesktop.org/git/mesa/mesa
xf86-video-intel: (HEAD, tag: 2.99.917) baec802b21387d04aebb10ac29e719a1800c5aa0 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
xserver: (HEAD, tag: xorg-server-1.17.2) 2123f7682d522619f101b05fb75efa75dabbe371 from git://git.freedesktop.org/git/xorg/xserver

* Tools *
intel-gpu-tools: (HEAD, origin/master, origin/HEAD, master) e42936d86b52c6804da41755df7155cafded5eb2 from git://git.freedesktop.org/git/xorg/app/intel-gpu-tools

Results:
=======
./gem_evict_alignment --run-subtest minor-hang
IGT-Version: 1.12-ge42936d (x86_64) (Linux: 4.3.0-nightly+ x86_64)
Test requirement not met in function intel_require_memory, file intel_os.c:244:
Test requirement: !(total <= required)
Estimated that we need 6,445,596,672 bytes for the test, but only have 3,921,674,240 bytes available (RAM)
Subtest minor-hang: SKIP (0.060s)
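
A rough sketch, not part of the original report, of the arithmetic behind the memory estimates quoted in this thread. The surface count (3072) and size (1048576 bytes) come from the debug output in comment 5. The 512-byte per-surface overhead is an assumption inferred from the logged totals, and the count of 6144 for this run is likewise inferred from the doubled estimate. `required_bytes` and `would_skip` are hypothetical names, not igt functions.

```python
SURFACE_SIZE = 1048576          # bytes per surface, from the comment 5 debug output
PER_SURFACE_OVERHEAD = 512      # assumption: inferred so the totals match the logs


def required_bytes(count, size=SURFACE_SIZE, overhead=PER_SURFACE_OVERHEAD):
    """Estimated bytes needed for `count` surfaces under the assumed overhead."""
    return count * (size + overhead)


def would_skip(required, available):
    """Reading of the logged requirement !(total <= required), taking
    `total` to be the available RAM: skip when it does not exceed the estimate."""
    return available <= required


# Estimates quoted in comments 1 and 5 (3072 surfaces) and in this comment
# (6144 surfaces is an inferred count, twice the earlier one).
for count, available in [(3072, 1890582528), (6144, 3921674240)]:
    need = required_bytes(count)
    print(f"{count} surfaces: need {need:,} bytes, have {available:,} bytes,"
          f" skip={would_skip(need, available)}")
```

Under these assumptions the estimates reproduce the logged figures of 3222798336 and 6,445,596,672 bytes, and both machines fall below the requirement, which matches the SKIP results.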
Comment 14 Chris Wilson 2016-01-28 10:20:44 UTC
So we never intentionally fixed the underlying bug, but now the test case should just skip on this machine. Let's assume we fixed the GPU reset.
Comment 15 cprigent 2016-02-22 11:33:05 UTC
gem_evict_alignment: the memory precondition is not met (256 terabytes). This is tracked by https://bugs.freedesktop.org/show_bug.cgi?id=93849, so closing this bug.

