Bug 97618

Summary: W600 (Cape Verde PRO): reproducible hang on piglit test spec@ext_texture_lod_bias@lodbias in drmCommandWrite()
Product: Mesa Reporter: Dan Kegel <dank>
Component: Drivers/Gallium/radeonsi Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact: Default DRI bug account <dri-devel>
Severity: normal    
Priority: medium    
Version: 11.2   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   

Description Dan Kegel 2016-09-06 22:07:16 UTC
On my dual-Xeon machine with an AMD FirePro W600 graphics card and stock Ubuntu 16.04,
running piglit's spec@ext_texture_lod_bias@lodbias test hangs in drmCommandWrite().

To reproduce:

sudo apt-get install -y time libwaffle-dev python3-dev python3-nose python3-six python3-numpy python3-matplotlib python3-scipy libgles2-mesa-dev libgl1-mesa-dev
git clone git://anongit.freedesktop.org/git/piglit
cd piglit
cmake .
make -j4
./piglit run tests/sanity results/sanity
./piglit summary console results/sanity
./piglit run -1 -v --dmesg --sync -t ext_texture_lod_bias tests/quick results/quick

and it looks like it's hung on an ioctl:

$ lspci -nn | grep ATI
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [FirePro W600] [1002:6828]
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] [1002:aab0]
$ cat /etc/issue
Ubuntu 16.04.1 LTS \n \l
$ uname -a
Linux rbb-ubu1604-1 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ apt-cache policy xserver-xorg-video-radeon
xserver-xorg-video-radeon:
  Installed: 1:7.7.0-1
$ dpkg-query -S  /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
libgl1-mesa-dri:amd64: /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so

$ apt-cache policy libgl1-mesa-dri
libgl1-mesa-dri:
  Installed: 11.2.0-1ubuntu2.1
$ sudo apt-get install libgl1-mesa-dri-dbg
E: Unable to locate package libgl1-mesa-dri-dbg
$ ps augxw | grep piglit
buildbot 46308  3.2  0.5 1267912 115796 pts/1  Sl+  11:52   2:18 python2 piglit run -1 -v --dmesg --sync -t texture tests/quick results/quick
buildbot 62511  0.7  0.2 181476 42584 pts/1    Sl+  12:34   0:12 src/piglit/bin/lodbias -auto
buildbot@rbb-ubu1604-1:~/.local/share/xorg$ sudo strace -p 62511
strace: Process 62511 attached
ioctl(6, DRM_IOCTL_RADEON_GEM_WAIT_IDLE

$ sudo gdb
(gdb) attach 62511
(gdb) bt
#0  0x00007fe2827c0687 in ioctl () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007fe281c0e0f8 in drmIoctl () from /usr/lib/x86_64-linux-gnu/libdrm.so.2
#2  0x00007fe281c10dbb in drmCommandWrite () from /usr/lib/x86_64-linux-gnu/libdrm.so.2
#3  0x00007fe27cff6e2c in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#4  0x00007fe27cff83d7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#5  0x00007fe27d01c452 in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#6  0x00007fe27cb4163d in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#7  0x00007fe27cab8cd8 in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#8  0x00007fe27cb45105 in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#9  0x00007fe27caba19f in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#10 0x00007fe27caba322 in ?? () from /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
#11 0x00007fe282dabbf0 in piglit_read_pixels_float (x=110, y=70, width=1, height=1, format=6407, pixels=0x7fff34347e70)
    at src/piglit/tests/util/piglit-util-gl.c:1055
#12 0x00007fe282dac010 in piglit_probe_pixel_rgb (x=110, y=70, expected=0x7fff34347f10)
    at src/piglit/tests/util/piglit-util-gl.c:1149
#13 0x000000000040177f in probe_cell (testname=0x4025f8 "multitex", cellx=2, celly=1, expected=0x7fff34347f10)
    at src/piglit/tests/texturing/lodbias.c:85
#14 0x0000000000401da2 in test_multitex_combo (bias1=-9, level1=2, bias2=-13, level2=1) at src/piglit/tests/texturing/lodbias.c:192
#15 0x0000000000401ed8 in test_multitex (bias1=-9, bias2=-13) at src/piglit/tests/texturing/lodbias.c:221
#16 0x0000000000401fd4 in piglit_display () at src/piglit/tests/texturing/lodbias.c:246
#17 0x00007fe282dd2275 in process_next_event (x11_fw=0x1ab7c20) at src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:137
#18 0x00007fe282dd2335 in enter_event_loop (winsys_fw=0x1ab7c20) at src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
#19 0x00007fe282dd176e in run_test (gl_fw=0x1ab7c20, argc=1, argv=0x7fff343482b8)
    at src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
#20 0x00007fe282db5d90 in piglit_gl_test_run (argc=1, argv=0x7fff343482b8, config=0x7fff34348170)
    at src/piglit/tests/util/piglit-framework-gl.c:199
#21 0x00000000004016cd in main (argc=1, argv=0x7fff343482b8) at src/piglit/tests/texturing/lodbias.c:55


$ sudo cat /var/log/kern.log
...
[ 3304.897193] [TTM] Illegal buffer object size
[ 3304.897239] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (0, 2, 4096, -22)
[ 3304.897326] [TTM] Illegal buffer object size
[ 3304.897351] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (0, 2, 4096, -22)
[ 3315.592187] perf interrupt took too long (5070 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 3347.940586] [TTM] Illegal buffer object size
[ 3347.940638] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (0, 2, 4096, -22)
[ 3347.940678] [TTM] Illegal buffer object size
[ 3347.940701] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (0, 2, 4096, -22)
[ 3349.047906] [TTM] Failed to find memory space for buffer 0xffff880044987868 eviction
[ 3349.047913] [TTM] No space for ffff880044987868 (262144 pages, 1048576K, 1024M)
[ 3349.047917] [TTM]   placement[0]=0x00060002 (1)
[ 3349.047919] [TTM]     has_type: 1
[ 3349.047921] [TTM]     use_type: 1
[ 3349.047923] [TTM]     flags: 0x0000000A
[ 3349.047925] [TTM]     gpu_offset: 0x80000000
[ 3349.047927] [TTM]     size: 524288
[ 3349.047929] [TTM]     available_caching: 0x00070000
[ 3349.047931] [TTM]     default_caching: 0x00010000
[ 3349.048630] [TTM] Failed to find memory space for buffer 0xffff880044987868 eviction
[ 3349.048632] [TTM] No space for ffff880044987868 (262144 pages, 1048576K, 1024M)
[ 3349.048634] [TTM]   placement[0]=0x00060002 (1)
[ 3349.048635] [TTM]     has_type: 1
[ 3349.048636] [TTM]     use_type: 1
[ 3349.048637] [TTM]     flags: 0x0000000A
[ 3349.048638] [TTM]     gpu_offset: 0x80000000
[ 3349.048639] [TTM]     size: 524288
[ 3349.048640] [TTM]     available_caching: 0x00070000
[ 3349.048641] [TTM]     default_caching: 0x00010000
[ 3349.052755] [TTM] Failed to find memory space for buffer 0xffff880044987868 eviction
[ 3349.052758] [TTM] No space for ffff880044987868 (262144 pages, 1048576K, 1024M)
[ 3349.052759] [TTM]   placement[0]=0x00060002 (1)
[ 3349.052760] [TTM]     has_type: 1
[ 3349.052761] [TTM]     use_type: 1
[ 3349.052762] [TTM]     flags: 0x0000000A
[ 3349.052763] [TTM]     gpu_offset: 0x80000000
[ 3349.052764] [TTM]     size: 524288
[ 3349.052766] [TTM]     available_caching: 0x00070000
[ 3349.052767] [TTM]     default_caching: 0x00010000
[ 3349.052769] [TTM] Failed to find memory space for buffer 0xffff880044987868 eviction
[ 3349.052771] [TTM] No space for ffff880044987868 (262144 pages, 1048576K, 1024M)
[ 3349.052772] [TTM]   placement[0]=0x00060002 (1)
[ 3349.052773] [TTM]     has_type: 1
[ 3349.052774] [TTM]     use_type: 1
[ 3349.052775] [TTM]     flags: 0x0000000A
[ 3349.052776] [TTM]     gpu_offset: 0x80000000
[ 3349.052777] [TTM]     size: 524288
[ 3349.052778] [TTM]     available_caching: 0x00070000
[ 3349.052779] [TTM]     default_caching: 0x00010000
[ 3349.052840] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to parse relocation -12!
[ 5641.849863] DMAR: DRHD: handling fault status reg 2
[ 5641.849876] DMAR: INTR-REMAP: Request device [[00:00.0] fault index 18
[ 5641.849876] INTR-REMAP:[fault reason 38] Blocked an interrupt request due to source-id verification failure
[ 6000.668710] INFO: task Xorg:6978 blocked for more than 120 seconds.
[ 6000.668718]       Not tainted 4.4.0-36-generic #55-Ubuntu
[ 6000.668720] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6000.668723] Xorg            D ffff8800c8c53938     0  6978   6977 0x00000000
[ 6000.668730]  ffff8800c8c53938 0000000000000000 ffff880512e97080 ffff88050e008c80
[ 6000.668735]  ffff8800c8c54000 ffff8800c8c53a88 ffff88003553c000 ffff8800c8c53a20
[ 6000.668739]  ffff88003553d498 ffff8800c8c53950 ffffffff81829ec5 7fffffffffffffff
[ 6000.668743] Call Trace:
[ 6000.668756]  [<ffffffff81829ec5>] schedule+0x35/0x80
[ 6000.668760]  [<ffffffff8182cfe5>] schedule_timeout+0x1b5/0x270
[ 6000.668807]  [<ffffffffc0217ed2>] ? radeon_fence_process+0x12/0x30 [radeon]
[ 6000.668830]  [<ffffffffc02181a4>] radeon_fence_wait_seq_timeout.constprop.8+0x234/0x320 [radeon]
[ 6000.668835]  [<ffffffff810c3cb0>] ? wake_atomic_t_function+0x60/0x60
[ 6000.668853]  [<ffffffffc021874f>] radeon_fence_wait_empty+0x7f/0xb0 [radeon]
[ 6000.668878]  [<ffffffffc0262189>] radeon_pm_compute_clocks+0x5f9/0x870 [radeon]
[ 6000.668895]  [<ffffffffc0208da7>] atombios_crtc_dpms+0x67/0xf0 [radeon]
[ 6000.668912]  [<ffffffffc020a399>] atombios_crtc_disable+0x39/0x350 [radeon]
[ 6000.668939]  [<ffffffffc027f6e9>] ? atombios_get_encoder_mode+0x119/0x1c0 [radeon]
[ 6000.668966]  [<ffffffffc0281b50>] ? radeon_atom_encoder_disable+0xf0/0x170 [radeon]
[ 6000.668975]  [<ffffffffc01afbc6>] __drm_helper_disable_unused_functions+0xa6/0xe0 [drm_kms_helper]
[ 6000.668981]  [<ffffffffc01b00c3>] drm_crtc_helper_set_config+0x103/0xba0 [drm_kms_helper]
[ 6000.669005]  [<ffffffffc02ba07f>] ? ni_dpm_vblank_too_short+0x1f/0x30 [radeon]
[ 6000.669025]  [<ffffffffc02257b4>] radeon_crtc_set_config+0x44/0x110 [radeon]
[ 6000.669050]  [<ffffffffc0041e32>] drm_mode_set_config_internal+0x62/0x100 [drm]
[ 6000.669065]  [<ffffffffc004648c>] drm_mode_setcrtc+0x3cc/0x4f0 [drm]
[ 6000.669076]  [<ffffffffc0037742>] drm_ioctl+0x152/0x540 [drm]
[ 6000.669091]  [<ffffffffc00460c0>] ? drm_mode_setplane+0x1b0/0x1b0 [drm]
[ 6000.669107]  [<ffffffffc01fc04c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
[ 6000.669110]  [<ffffffff81220c1f>] do_vfs_ioctl+0x29f/0x490
[ 6000.669113]  [<ffffffff8106b544>] ? __do_page_fault+0x1b4/0x400
[ 6000.669115]  [<ffffffff81220e89>] SyS_ioctl+0x79/0x90
[ 6000.669118]  [<ffffffff8182dfb2>] entry_SYSCALL_64_fastpath+0x16/0x71
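
For triaging a log like the one above, here is a minimal sketch that pulls the hung-task reports out of kernel log text. It assumes the "INFO: task NAME:PID blocked for more than N seconds." format shown in this log; the name `find_hung_tasks` is made up for illustration and is not part of any tool mentioned here.

```python
import re

# Matches lines such as:
# [ 6000.668710] INFO: task Xorg:6978 blocked for more than 120 seconds.
# Assumption: the kernel uses this exact wording, as in the log above.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a (timestamp, command, pid, seconds) tuple per hung-task report."""
    hits = []
    for line in log_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append(
                (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
            )
    return hits
```

Feeding it the contents of /var/log/kern.log (for example via `open("/var/log/kern.log").read()`, which usually needs root or adm group access) should list each blocked task with the timestamp at which the kernel reported it.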
Comment 1 Dan Kegel 2016-09-06 22:12:41 UTC
Sorry, there was some unrelated output at the start of that /var/log/kern.log
excerpt; ignore everything before timestamp 6000.
Comment 2 Nicolai Hähnle 2016-09-12 16:43:17 UTC
Hi Dan, thanks for the report. I cannot reproduce this with a Verde and recent drivers from Git. Can you reproduce this with more recent drivers, e.g. installing the padoka PPA?
Comment 3 Dan Kegel 2016-09-12 22:06:04 UTC
Maybe I reduced the test case too far?
The next run of that one test didn't hang for me, but the next run of my larger set,
time python2 piglit run -1 -v --dmesg --sync -t texture tests/quick results/quick
did still hang on that test.
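
Since the hang does not show up on every run, here is a rough sketch for repeating a single test binary under a timeout and counting how often it hangs. The helper names `run_once` and `count_hangs` are hypothetical, and the timeout value is a guess rather than anything piglit defines.

```python
import subprocess


def run_once(cmd, timeout_s):
    """Run cmd; return 'hang' if it exceeds timeout_s, else 'exit N'."""
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return "hang"
    return f"exit {proc.returncode}"


def count_hangs(cmd, runs, timeout_s):
    """Run cmd `runs` times and count how many runs timed out."""
    return sum(run_once(cmd, timeout_s) == "hang" for _ in range(runs))
```

For example, `count_hangs(["bin/lodbias", "-auto"], 20, 60)` run from the piglit build directory would try the test 20 times with a 60 second limit each (the relative path is an assumption about the build layout). One caveat: a process stuck in uninterruptible sleep inside the kernel may not die on SIGKILL, in which case this helper can itself block.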

I'll try the ppa next.
Comment 4 Dan Kegel 2016-09-13 14:29:56 UTC
Installing from the padoka PPA did indeed solve my problem.
No warnings in /var/log/kern.log, either!
So fixed as of libgl1-mesa-dri 12.1~git1600912162600.546bc07~x~padoka0
et al.

(There are still 78 fails vs. 5984 passes, but IIRC that's fewer fails than before.)
