Created attachment 94270: dmesg

System Environment:
--------------------------
Platform: Sandybridge
kernel: (drm-intel-nightly)1be8f2b4dd6d3db00af24d4891c82d2650bd282d

Bug detailed description:
---------------------------
Running ./kms_flip --run-subtest flip-vs-modeset-vs-hang never exits. This happens on Sandybridge with both the -queued and -nightly kernels.

The latest known good commit: c461562e84d180fb691af57f93a42bd9cc7eb69c
The latest known bad commit: 4c0e552882114d1edb588242d45035246ab078a0

output:
IGT-Version: 1.5-g9597836 (i686) (Linux: 3.13.0_drm-intel-nightly_1be8f2_20140218+ i686)
Using monotonic timestamps
Beginning flip-vs-modeset-vs-hang on crtc 3, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
...Test assertion failure function exec_nop, file kms_flip.c:693:
Last errno: 5, Input/output error
Failed assertion: drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf) == 0
Subtest flip-vs-modeset-vs-hang: FAIL
Test assertion failure function gem_quiescent_gpu, file drmtest.c:156:
Last errno: 5, Input/output error
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0

Reproduce steps:
----------------------------
1. ./kms_flip --run-subtest flip-vs-modeset-vs-hang
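The second failed assertion in the report prints the ioctl request with its macro fully expanded, which makes it hard to read. As a rough sketch of how that expression packs into one request number, the shifts below are copied from the log; the helper name ioc_encode and the 64-byte argument size are assumptions for illustration and are not taken from the IGT or kernel sources.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the expanded expression in the log:
 * direction from bit 30, argument size from bit 16, type character
 * from bit 8, command number from bit 0. */
static uint32_t ioc_encode(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    return (dir  << (((0 + 8) + 8) + 14)) |
           (type << (0 + 8)) |
           (nr   << 0) |
           (size << ((0 + 8) + 8));
}
```

With dir = 1U, type = 'd', nr = 0x40 + 0x29 and an assumed size of 64, this evaluates to 0x40406469. It is the same request as the DRM_IOCTL_I915_GEM_EXECBUFFER2 call that fails in exec_nop, so both assertions report the same -EIO from the same ioctl.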
The following cases also have this issue:
igt/kms_flip/flip-vs-modeset-vs-hang-interruptible
igt/kms_flip/flip-vs-panning-vs-hang
igt/kms_flip/flip-vs-panning-vs-hang-interruptible
igt/kms_flip/rcs-wf_vblank-vs-dpms-interruptible
igt/kms_flip/plain-flip-ts-check
This sounds like either the gpu reset failed (and the gpu is gone for good) or we have a spurious -EIO somewhere. Do other hang tests (anything which contains "hang" somewhere in the test/subtest name) also fail like this on snb?

Mika and I are aware that gpu reset seems to be a bit busted currently on snb :(
This is probably just the same old default context ban issue. There's a kernel patch on the list to prevent the ban, but I'm not sure if we can apply it, since Mika tells me it's got something to do with dmesg spam handling in piglit.

Another way to work around the problem is to add a ~10 second sleep between the iterations of these subtests. That will prevent the context ban, but it will slow down the test significantly.

The reason the test actually gets stuck while trying to terminate is that it does something signal-unsafe (printf is the likely culprit) from the signal handler, and then gets stuck in some glibc malloc futex.
The failure is intermittent: sometimes the test times out and sometimes it aborts. It failed to exit in 3 of 5 runs; the remaining runs aborted.

output:
IGT-Version: 1.5-g072d358 (i686) (Linux: 3.14.0-rc5_drm-intel-nightly_2bbdb4_20140304+ i686)
Using monotonic timestamps
Beginning flip-vs-panning-vs-hang on crtc 3, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
...Test assertion failure function run_test_step, file kms_flip.c:933:
Last errno: 5, Input/output error
Failed assertion: hang failed to exercise page flip hang recovery
Subtest flip-vs-panning-vs-hang: FAIL
Test assertion failure function gem_quiescent_gpu, file drmtest.c:156:
Last errno: 5, Input/output error
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
kms_flip: drmtest.c:1113: igt_fail: Assertion `!test_with_subtests || in_fixture' failed.
Aborted (core dumped)
Can anyone please dig out the default context ban prevention patch for Lu to test?
Created attachment 95112: [PATCH] drm/i915: Don't ban default context when stop_rings!=0

Here you go.
(In reply to comment #6)
> Created attachment 95112
> [PATCH] drm/i915: Don't ban default context when stop_rings!=0
>
> Here you go.

Fixed by this patch.

output:
IGT-Version: 1.5-g072d358 (i686) (Linux: 3.14.0-rc5_prts_558104_20140305 i686)
Using monotonic timestamps
Beginning flip-vs-modeset-vs-hang on crtc 3, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
....
flip-vs-modeset-vs-hang on crtc 3, connector 7: PASSED
Beginning flip-vs-modeset-vs-hang on crtc 5, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
....
flip-vs-modeset-vs-hang on crtc 5, connector 7: PASSED
Subtest flip-vs-modeset-vs-hang: SUCCESS
Fix merged into dinq:

commit ccc7bed05e27a654db1e9e248ce5fb291c12add1
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Feb 21 16:26:47 2014 +0200

    drm/i915: Don't ban default context when stop_rings!=0
Created attachment 95278: dmesg

The output shows a pass, but a call trace appears in dmesg.

IGT-Version: 1.5-gcdf74b6 (i686) (Linux: 3.13.0_drm-intel-next-queued_eb162c_20140307+ i686)
Using monotonic timestamps
Beginning flip-vs-modeset-vs-hang on crtc 3, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
....
flip-vs-modeset-vs-hang on crtc 3, connector 7: PASSED
Beginning flip-vs-modeset-vs-hang on crtc 5, connector 7
1280x1024 60 1280 1328 1440 1688 1024 1025 1028 1066 0x5 0x48 108000
....
flip-vs-modeset-vs-hang on crtc 5, connector 7: PASSED
Subtest flip-vs-modeset-vs-hang: SUCCESS

[ 28.739294] ------------[ cut here ]------------
[ 28.739309] WARNING: CPU: 0 PID: 736 at drivers/gpu/drm/i915/intel_uncore.c:994 intel_gpu_reset+0x125/0x426 [i915]()
[ 28.739310] Modules linked in: dm_mod snd_hda_codec_hdmi snd_hda_codec_realtek dcdbas pcspkr serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc lpc_ich snd_timer mfd_core snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm
[ 28.739322] CPU: 0 PID: 736 Comm: kworker/0:1 Not tainted 3.13.0_drm-intel-next-queued_eb162c_20140307+ #534
[ 28.739323] Hardware name: Dell Inc. OptiPlex 990/0DXWW6, BIOS A02 02/26/2011
[ 28.739329] Workqueue: events i915_error_work_func [i915]
[ 28.739330] 00000007 c08910a9 00000000 c022b7a3 f81d0c41 f4d20000 f5145000 f4d20064
[ 28.739333] 00000000 c022b7c3 00000009 00000000 f81d0c41 00000293 f5145000 f5145000
[ 28.739336] f5145014 f4d20000 f81808f5 0000000f f4d21aa8 f5145000 f5e11c80 00000000
[ 28.739338] Call Trace:
[ 28.739342] [<c08910a9>] ? dump_stack+0x3e/0x4e
[ 28.739345] [<c022b7a3>] ? warn_slowpath_common+0x61/0x74
[ 28.739352] [<f81d0c41>] ? intel_gpu_reset+0x125/0x426 [i915]
[ 28.739354] [<c022b7c3>] ? warn_slowpath_null+0xd/0x10
[ 28.739363] [<f81d0c41>] ? intel_gpu_reset+0x125/0x426 [i915]
[ 28.739372] [<f81808f5>] ? i915_reset+0x3b/0x11e [i915]
[ 28.739378] [<f81855f1>] ? i915_error_work_func+0xa5/0xf5 [i915]
[ 28.739382] [<c023a32e>] ? process_one_work+0x16b/0x278
[ 28.739384] [<c023a800>] ? worker_thread+0x19b/0x27b
[ 28.739385] [<c023a665>] ? rescuer_thread+0x20d/0x20d
[ 28.739387] [<c023e396>] ? kthread+0xa1/0xa6
[ 28.739390] [<c0899c77>] ? ret_from_kernel_thread+0x1b/0x28
[ 28.739392] [<c023e2f5>] ? kthread_freezable_should_stop+0x3b/0x3b
[ 28.739393] ---[ end trace ec54be492b20dfeb ]---
Closing old verified.