Bug 90057

Summary: [SKL]igt/gem_ppgtt/blt-vs-render-ctx0 sporadically doesn't exit testing
Product: DRI Reporter: lu hua <huax.lu>
Component: DRM/Intel Assignee: cprigent <christophe.prigent>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: christophe.prigent, intel-gfx-bugs
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: SKL i915 features: GEM/PPGTT
Attachments:
Description Flags
dmesg none
drm/i915: fill scratch page none
drm/i915: Workaround to avoid lite restore with HEAD==TAIL none
dmesg(SKL) none

Description lu hua 2015-04-17 01:56:31 UTC
Created attachment 115137 [details]
dmesg

==System Environment==
--------------------------
Regression: not sure

Non-working platforms: BDW

==kernel==
--------------------------
drm-intel-nightly/d600654ab94b325f253e267422dcf60302120ea0
commit d600654ab94b325f253e267422dcf60302120ea0
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Apr 16 17:54:10 2015 +0200

    drm-intel-nightly: 2015y-04m-16d-15h-53m-28s UTC integration manifest

==Bug detailed description==
-----------------------------
The test sporadically does not exit. It works on the 1st cycle and fails on the 2nd cycle.

output:
root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./gem_ppgtt --run-subtest blt-vs-render-ctx0
IGT-Version: 1.10-gbeddb3b (x86_64) (Linux: 4.0.0_drm-intel-nightly_d60065_20150417+ x86_64)
Subtest blt-vs-render-ctx0: SUCCESS (30.853s)

real    0m30.882s
user    0m15.809s
sys     0m53.543s
root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./gem_ppgtt --run-subtest blt-vs-render-ctx0
IGT-Version: 1.10-gbeddb3b (x86_64) (Linux: 4.0.0_drm-intel-nightly_d60065_20150417+ x86_64)

^C^C^C^C^C^C


==Reproduce steps==
---------------------------- 
1. time ./gem_ppgtt --run-subtest blt-vs-render-ctx0
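
Because the failure is sporadic, the reproduce step can be run in a loop with a per-run timeout so that a hang is recorded rather than blocking the terminal. The following is a minimal sketch and not part of the original report: the function name run_repeatedly, the default iteration count and the 120 s timeout are assumptions (the successful run above took about 31 s). A process stuck inside the kernel may not die when killed, so the sketch only waits briefly after the kill and then stops looping.

```python
import subprocess


def run_repeatedly(cmd, iterations=5, timeout_s=120):
    """Run cmd up to `iterations` times and return one result per run.

    Each result is "pass" (exit code 0), "fail" (non-zero exit code)
    or "timeout" (the run did not exit within timeout_s seconds).
    The loop stops after the first timeout, since the GPU state is
    suspect after a hang.
    """
    results = []
    for _ in range(iterations):
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        try:
            returncode = proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            proc.kill()  # best effort; a task stuck in the kernel may ignore it
            try:
                proc.wait(timeout=1)  # reap the child if the kill worked
            except subprocess.TimeoutExpired:
                pass  # still stuck; give up on it
            results.append("timeout")
            break
        results.append("pass" if returncode == 0 else "fail")
    return results


# Hypothetical usage from the IGT tests directory:
#   results = run_repeatedly(
#       ["./gem_ppgtt", "--run-subtest", "blt-vs-render-ctx0"], iterations=20)
#   print(results.count("pass"), "of", len(results), "runs passed")
```

A result list that ends in "timeout" corresponds to the "doesn't exit testing" symptom shown in the output above.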
Comment 1 Mika Kuoppala 2015-04-20 08:56:16 UTC
Created attachment 115212 [details] [review]
drm/i915: fill scratch page
Comment 2 Mika Kuoppala 2015-04-20 12:22:20 UTC
Created attachment 115222 [details] [review]
drm/i915: Workaround to avoid lite restore with HEAD==TAIL
Comment 3 Mika Kuoppala 2015-04-20 12:24:36 UTC
The first patch should release the client, and so fix the 'doesn't exit testing' part.

The second patch should fix the problem that the test hangs in the first place.

Please test both, separately.
Comment 4 ye.tian 2015-04-21 09:02:54 UTC
Tested 2 rounds on the latest kernel with the first patch: this case causes a system hang on the fifth and sixth runs. If the serial console is enabled, the system cannot boot up.

Tested 2 rounds on the latest kernel with the second patch: the problem appears on the third and eighth runs.

Tested 2 rounds with both patches: this case causes a system hang on the third and eighth runs.

BTW, this problem does not exist on BDW GT3.
Comment 5 lu hua 2015-04-23 08:35:35 UTC
Created attachment 115288 [details]
dmesg(SKL)

Also reproduced on SKL. Fail rate: 2/5.
Left running for 20 minutes, it does not exit and reports the following warning:
[ 1219.072060] WARNING: CPU: 2 PID: 5177 at drivers/gpu/drm/i915/i915_gem.c:1999 i915_gem_object_put_pages_gtt+0x47/0x14d [i915]()
[ 1219.072062] WARN_ON(ret != -EIO)
[ 1219.072081] Modules linked in: dm_mod snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel pcspkr snd_hda_controller snd_hda_codec i2c_i801 snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore joydev ppdev wmi battery parport_pc parport ac acpi_cpufreq i915 button video drm_kms_helper drm
[ 1219.072087] CPU: 2 PID: 5177 Comm: gem_ppgtt Not tainted 4.0.0_drm-intel-nightly_b9fe35_20150421+ #368
[ 1219.072089] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.86C.B067.R00.1412310711 12/31/2014
[ 1219.072094]  0000000000000000 0000000000000009 ffffffff81795847 ffff88014546b568
[ 1219.072097]  ffffffff8103bd5a 0000000000000000 ffffffffa009e2e5 0000000000000001
[ 1219.072100]  ffff8801443e32c0 0000000000000000 ffffffffa010eb90 ffff88014546b648
[ 1219.072101] Call Trace:
[ 1219.072111]  [<ffffffff81795847>] ? dump_stack+0x40/0x50
[ 1219.072119]  [<ffffffff8103bd5a>] ? warn_slowpath_common+0x98/0xb0
[ 1219.072149]  [<ffffffffa009e2e5>] ? i915_gem_object_put_pages_gtt+0x47/0x14d [i915]
[ 1219.072156]  [<ffffffff8103bdb7>] ? warn_slowpath_fmt+0x45/0x4a
[ 1219.072183]  [<ffffffffa009e018>] ? i915_gem_object_set_to_cpu_domain+0x27/0x133 [i915]
[ 1219.072208]  [<ffffffffa009e2e5>] ? i915_gem_object_put_pages_gtt+0x47/0x14d [i915]
[ 1219.072238]  [<ffffffffa009fb80>] ? i915_gem_object_put_pages+0x77/0xcf [i915]
[ 1219.072266]  [<ffffffffa00a2b34>] ? i915_gem_shrink+0x177/0x1dc [i915]
[ 1219.072292]  [<ffffffffa00a2bf9>] ? i915_gem_shrinker_scan+0x60/0x81 [i915]
[ 1219.072299]  [<ffffffff810de3b3>] ? shrink_slab.part.57.constprop.67+0x1a5/0x2b5
[ 1219.072303]  [<ffffffff810e038b>] ? shrink_zone+0x67/0x92
[ 1219.072308]  [<ffffffff810e0771>] ? do_try_to_free_pages+0x20d/0x241
[ 1219.072312]  [<ffffffff810e0871>] ? try_to_free_pages+0xcc/0x108
[ 1219.072318]  [<ffffffff810d7b46>] ? __alloc_pages_nodemask+0x48e/0x6fc
[ 1219.072324]  [<ffffffff81104767>] ? alloc_pages_current+0xad/0xca
[ 1219.072329]  [<ffffffff810d4a5b>] ? __get_free_pages+0x6/0x33
[ 1219.072335]  [<ffffffff81340e49>] ? __sg_alloc_table+0x73/0x13d
[ 1219.072339]  [<ffffffff81340f28>] ? sg_kfree+0x15/0x15
[ 1219.072344]  [<ffffffff813410f9>] ? sg_alloc_table+0x1e/0x46
[ 1219.072370]  [<ffffffffa009b9b2>] ? i915_gem_object_get_pages_gtt+0x6e/0x36e [i915]
[ 1219.072397]  [<ffffffffa009c792>] ? i915_gem_object_get_pages+0x61/0xb5 [i915]
[ 1219.072424]  [<ffffffffa00a07f0>] ? i915_gem_object_do_pin+0x36e/0x77f [i915]
[ 1219.072451]  [<ffffffffa0094fc1>] ? i915_gem_execbuffer_reserve_vma.isra.12+0x5d/0x103 [i915]
[ 1219.072476]  [<ffffffffa00952b3>] ? i915_gem_execbuffer_reserve+0x24c/0x2e3 [i915]
[ 1219.072503]  [<ffffffffa0095916>] ? i915_gem_do_execbuffer.isra.13+0x5cc/0xd88 [i915]
[ 1219.072510]  [<ffffffff8179985f>] ? __mutex_unlock_slowpath+0x13/0x2f
[ 1219.072537]  [<ffffffffa00a138a>] ? i915_gem_pwrite_ioctl+0x75a/0x7e0 [i915]
[ 1219.072543]  [<ffffffff8110948a>] ? __kmalloc+0x65/0x13d
[ 1219.072568]  [<ffffffffa0097085>] ? i915_gem_execbuffer2+0x16e/0x205 [i915]
[ 1219.072583]  [<ffffffffa00047ae>] ? drm_ioctl+0x322/0x38d [drm]
[ 1219.072588]  [<ffffffff81123250>] ? file_update_time+0x25/0xc1
[ 1219.072612]  [<ffffffffa0096f17>] ? i915_gem_execbuffer+0x339/0x339 [i915]
[ 1219.072617]  [<ffffffff8105ede9>] ? set_next_entity+0x32/0x55
[ 1219.072621]  [<ffffffff81060a23>] ? pick_next_task_fair+0xe5/0x3dc
[ 1219.072627]  [<ffffffff8111daa6>] ? do_vfs_ioctl+0x360/0x424
[ 1219.072634]  [<ffffffff81032a63>] ? __do_page_fault+0x345/0x3d8
[ 1219.072639]  [<ffffffff8111dbb3>] ? SyS_ioctl+0x49/0x7a
[ 1219.072643]  [<ffffffff8179cb62>] ? page_fault+0x22/0x30
[ 1219.072647]  [<ffffffff8179b0f2>] ? system_call_fastpath+0x12/0x17
[ 1219.072650] ---[ end trace af3e923f6c738f7b ]---
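
To see which driver functions appear in a trace like the one above, the frames can be filtered by module. The following is a rough sketch and not part of the original report: the names FRAME_RE and module_frames are made up for illustration, and the regular expression only targets the frame format shown in this dmesg. Frames prefixed with "?" are ones the kernel unwinder could not verify, so the filtered list is a hint about the path (here execbuffer, page allocation, then the shrinker), not a guaranteed call chain.

```python
import re

# Matches frames such as:
#   [<ffffffffa009e2e5>] ? i915_gem_object_put_pages_gtt+0x47/0x14d [i915]
# Group 1 is the symbol name, group 2 the module name (absent for core kernel).
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+\??\s*(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def module_frames(dmesg_text, module="i915"):
    """Return the symbol names of all trace frames belonging to `module`,
    in the order they appear in the text."""
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(2) == module:
            frames.append(match.group(1))
    return frames
```

Feeding the attached dmesg text to module_frames gives the i915 symbols in trace order, which makes it easier to compare traces between runs.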
Comment 6 Mika Kuoppala 2015-07-16 08:07:45 UTC
Could you retest with the latest nightly, please?
Comment 7 cprigent 2015-08-03 21:55:58 UTC
Assigned to me. I will try with the latest nightly.
Comment 8 Humberto Israel Perez Rodriguez 2016-01-07 17:36:27 UTC
The following test passes on BDW after 20 iterations with the following configuration:

./gem_ppgtt --run-subtest blt-vs-render-ctx0


kernel drm-intel-testing:

commit 91587c722c28c4116dedbfbf08aa874377bc76f8
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Dec 4 17:35:54 2015 +0100

    drm-intel-nightly: 2015y-12m-04d-16h-35m-07s UTC integration manifest


kernel version : 4.4.0-rc3
git url        : git://anongit.freedesktop.org/drm-intel
git branch     : drm-intel-testing
git describe   : drm-intel-next-2015-11-20-rebased-13721-g91587c7

igt tools :
branch : intel-gpu-tools-1.13
commit : 2db78a4995a8ee298ae0cd68879baf80407a0e5e

cairo version: 1.15.2 / commit :  db8a7f1 
drm version :  libdrm-2.4.66  / commit : b38a4b2 
intel-driver : 1.6.2 / commit: 683edee
libva version : libva-1.6.2 / commit : 304bc13
mesa version : mesa-11.0.8 / commit : 261daab 
xf86-video-intel version : 2.99.917  / commit : baec802 
xserver version :xorg-server-1.18.0 / commit :7921764
Comment 9 Chris Wilson 2016-01-28 10:32:34 UTC
Marking as resolved, and ignoring the spurious GPU hang from skl...
Comment 10 cprigent 2016-09-30 14:27:31 UTC
Tested in a loop on SKL, I don't see any problem. Checked with:

Platform SKL Skull Canyon: NUC6i7KYK
CPU: Intel® Core(TM) i7-6770HQ CPU @ 2.6GHz (family 6, model 94, stepping 3)
GPU: Intel® Iris(TM) Pro Graphics 580 - Intel Corporation Sky Lake Integrated Graphics (rev 09)
Motherboard version: H90766-405
Memory: 2 x 4GB card Kingston KVR21S15S8/4

Software
Bios: KYSKLi70.86A.0041.2016.0817.1130 from https://downloadcenter.intel.com/downloads/eula/26210/BIOS-Update-KYSKLi70-86A-?httpDown=https%3A%2F%2Fdownloadmirror.intel.com%2F26210%2Feng%2FKY0041.bio
Linux distribution: Ubuntu 16.04 64 bits
DMC 1.26 from https://01.org/sites/default/files/downloads/intelr-graphics-linux/skldmcver126.tar_1.bz2
GUC 6.1 from https://01.org/sites/default/files/downloads/intelr-graphics-linux/sklgucver61.tar.bz2
Kernel: 4.8.0-rc8 aab15c2 from http://cgit.freedesktop.org/drm-intel/
   commit aab15c274da587bcab19376d2caa9d6626440335
   Author: Jani Nikula <jani.nikula@intel.com>
   Date:   Mon Sep 26 15:11:53 2016 +0300
   drm-intel-nightly: 2016y-09m-26d-12h-11m-33s UTC integration manifest
libdrm-2.4.70-14 0659558 from git://anongit.freedesktop.org/mesa/drm
mesa: mesa-12.0.0 8b06176 from git://anongit.freedesktop.org/mesa/mesa
cairo 1.15.2 db8a7f1 from git://anongit.freedesktop.org/cairo
xorg-server-1.18.99.901-14 ba199cb from git://git.freedesktop.org/git/xorg/xserver
xf86-video-intel 2.99.917-708 8f33f80 from git://git.freedesktop.org/git/xorg/driver/xf86-video-intel
libva-1.7.2-38 3b7e499 from git://git.freedesktop.org/git/vaapi/libva 
vaapi-intel-driver: 1.7.2-101 302cf63 from git://git.freedesktop.org/git/vaapi/intel-driver
IGT: intel-gpu-tools-1.16-30 32b2021 from http://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools.git
External screens: ASUS PB287Q (DP), DELL P2715Qt (HDMI)
