https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-snb1/igt@drv_selftest@mock_breadcrumbs.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-kbl1/igt@drv_selftest@mock_breadcrumbs.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-glkb4/igt@drv_selftest@mock_breadcrumbs.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-glk2/igt@drv_selftest@mock_breadcrumbs.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-apl5/igt@drv_selftest@mock_breadcrumbs.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4055/shard-hsw1/igt@drv_selftest@mock_breadcrumbs.html

[ 53.726304] Setting dangerous option mock_selftests - tainting kernel
[ 54.693605] Timed out waiting for 0 remaining waiters
[ 54.913754] i915/intel_breadcrumbs_mock_selftests: igt_wakeup failed with error 10000
[ 54.930617] ------------[ cut here ]------------
[ 54.930620] breadcrumbs returned 10000, conflicting with selftest's magic values!
[ 54.930682] WARNING: CPU: 3 PID: 1320 at drivers/gpu/drm/i915/selftests/i915_selftest.c:149 __run_selftests+0x13a/0x1b0 [i915]
[ 54.930684] Modules linked in: i915(+) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btbcm x86_pkg_temp_thermal intel_powerclamp btintel coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec bluetooth snd_hwdep snd_hda_core ghash_clmulni_intel e1000e snd_pcm ecdh_generic mei_me mei prime_numbers [last unloaded: i915]
[ 54.930713] CPU: 3 PID: 1320 Comm: drv_selftest Tainted: G U 4.17.0-rc1-CI-CI_DRM_4055+ #1
[ 54.930714] Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
[ 54.930752] RIP: 0010:__run_selftests+0x13a/0x1b0 [i915]
[ 54.930753] RSP: 0018:ffffc900002c3c60 EFLAGS: 00010282
[ 54.930756] RAX: 0000000000000000 RBX: ffffffffa0758370 RCX: 0000000000000001
[ 54.930757] RDX: 0000000080000001 RSI: ffffffff820c21ce RDI: 00000000ffffffff
[ 54.930758] RBP: ffffffffa0758448 R08: 0000000000000001 R09: 0000000000000001
[ 54.930759] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 54.930760] R13: ffff880271370040 R14: ffff880271370040 R15: ffffffffa076d450
[ 54.930762] FS: 00007f02609b3980(0000) GS:ffff88027ed80000(0000) knlGS:0000000000000000
[ 54.930763] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.930764] CR2: 000055e2776f6140 CR3: 0000000271158004 CR4: 00000000003606e0
[ 54.930765] Call Trace:
[ 54.930767] ? 0xffffffffa07cc000
[ 54.930801] i915_mock_selftests+0x27/0x50 [i915]
[ 54.930834] i915_init+0x7/0x68 [i915]
[ 54.930836] ? 0xffffffffa07cc000
[ 54.930838] do_one_initcall+0x9f/0x370
[ 54.930842] ? rcu_read_lock_sched_held+0x6f/0x80
[ 54.930844] ? kmem_cache_alloc_trace+0x264/0x2d0
[ 54.930848] do_init_module+0x56/0x1ea
[ 54.930850] load_module+0x2431/0x2e00
[ 54.930853] ? show_coresize+0x20/0x20
[ 54.930862] ? __se_sys_finit_module+0x95/0xe0
[ 54.930864] __se_sys_finit_module+0x95/0xe0
[ 54.930870] do_syscall_64+0x4f/0x180
[ 54.930873] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 54.930874] RIP: 0033:0x7f0260062839
[ 54.930876] RSP: 002b:00007ffc8935b2b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 54.930878] RAX: ffffffffffffffda RBX: 000055aad6e4dcf0 RCX: 00007f0260062839
[ 54.930879] RDX: 0000000000000000 RSI: 000055aad6e522c0 RDI: 0000000000000004
[ 54.930880] RBP: 000055aad6e522c0 R08: 0000000000000004 R09: 0000000000000000
[ 54.930881] R10: 00007ffc8935b420 R11: 0000000000000246 R12: 0000000000000000
[ 54.930882] R13: 000055aad6e51af0 R14: 0000000000000000 R15: 000000000000003f
[ 54.930888] Code: 74 4e 4c 89 e7 ff 53 10 83 f8 fc 74 4b 83 f8 00 74 b9 7f 05 83 f8 e7 75 4f 48 8b 73 08 89 c2 48 c7 c7 98 6a 6e a0 e8 b6 37 a2 e0 <0f> 0b b8 ff ff ff ff eb 34 48 8b 53 08 48 c7 c6 44 d2 6b a0 48
[ 54.930980] WARNING: CPU: 3 PID: 1320 at drivers/gpu/drm/i915/selftests/i915_selftest.c:149 __run_selftests+0x13a/0x1b0 [i915]
[ 54.930981] irq event stamp: 711186
[ 54.930984] hardirqs last enabled at (711185): [<ffffffff810f7bcf>] console_unlock+0x47f/0x650
[ 54.930986] hardirqs last disabled at (711186): [<ffffffff81a0111c>] error_entry+0x7c/0x100
[ 54.930987] softirqs last enabled at (710182): [<ffffffff81c003a1>] __do_softirq+0x3a1/0x4aa
[ 54.930989] softirqs last disabled at (710161): [<ffffffff8108b0f4>] irq_exit+0xa4/0xb0
[ 54.930991] ---[ end trace c21e682f6ab1bf11 ]---
This comes with the new 4.17.0-rc1, so it is likely a core regression.
For reference,

commit d224985a5e312ab05b624143a3fd9bb91b53e52a
Author: Peter Zijlstra <peterz@infradead.org>
Date: Thu Mar 15 11:41:39 2018 +0100

    sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API

    The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. Unlike wake_up_atomic_t(), wake_up_var() will issue the wakeup even if the variable is not 0. No change in functionality.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@intel.com>
    Cc: David Airlie <airlied@linux.ie>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

did impact upon mock_breadcrumbs. Whether that alone is the cause...
Testing that commit confirms that it is the one that introduces the error in the test (as opposed to a later issue with sched/wait.c). Still, the bug is more likely in wait_var than in that patch (afaict, since it is a very simple replacement).
*** Bug 106184 has been marked as a duplicate of this bug. ***
commit 77cbe925bf77bd3159f49c4db0ea89a2045d9071
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 17 18:06:38 2018 +0100

    drm/i915/selftests: Fix error checking for wait_var_timeout

    The old wait_on_atomic_t used a custom callback to perform the schedule(), which used my return semantics of reporting an error code on timeout. wait_var_event_timeout() uses the schedule() return semantics of reporting the remaining jiffies (1 if it timed out with 0 jiffies remaining!) and 0 on failure. This semantic mismatch lead to us falsely claiming a time out occurred.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085
    Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180417170638.20550-1-chris@chris-wilson.co.uk
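To illustrate the mismatch the commit message describes, here is a minimal userspace sketch in C. It is not the i915 selftest code or the actual patch: model_wait_timeout(), check_old_style() and check_new_style() are hypothetical names, and the model only mimics the return convention stated above (remaining jiffies on success, at least 1, and 0 on timeout). The value 10000 used in the note below is borrowed from the log purely for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the return convention described in the
 * commit message for wait_var_event_timeout(): 0 if the condition is still
 * false when the timeout expires, otherwise the remaining jiffies, reported
 * as at least 1 even if none are left.
 */
static long model_wait_timeout(bool condition_met, long remaining_jiffies)
{
	if (!condition_met)
		return 0;
	return remaining_jiffies > 0 ? remaining_jiffies : 1;
}

/*
 * Check written for the old callback semantics, where any nonzero return
 * was an error code. Applied to the new return value, a successful wait is
 * misread as a failure and the remaining jiffies leak out as the "error".
 */
static int check_old_style(long ret)
{
	return ret ? (int)ret : 0;
}

/*
 * Check matching the new semantics: only 0 means the wait timed out.
 */
static int check_new_style(long ret)
{
	return ret ? 0 : -ETIMEDOUT;
}
```

Under these assumptions, check_old_style(model_wait_timeout(true, 10000)) yields 10000, a positive number treated as an error even though the wait succeeded, while check_new_style() yields 0 for the same wait and -ETIMEDOUT only when the model returns 0.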
Thanks! It seems to be doing the trick. Closing :)