Summary: | [BAT] igt@kms_* incomplete softdog | ||
---|---|---|---|
Product: | DRI | Reporter: | Jani Saarinen <jani.saarinen> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | critical | ||
Priority: | high | CC: | intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | CFL | i915 features: | display/atomic |
Description
Jani Saarinen
2017-09-28 08:54:36 UTC
The error begun with [ 421.502977] [IGT] kms_pipe_crc_basic: starting subtest nonblocking-crc-pipe-B-frame-sequence igt@kms_pipe_crc_basic@nonblocking-crc-pipe-c is an innocent victim. I consider this bug to be about all softdog incompletes on CFL. I have to file the bug on something, so don't stare at the tests so far implicated. Reproduced also on igt@kms_flip@basic-flip-vs-dpm. https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3165/fi-cfl-s/igt@kms_flip@basic-flip-vs-dpms.html This from dmesg appear frequently just before the softdog times out: <3>[ 354.277367] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:46:pipe B] flip_done timed out <7>[ 354.277608] [drm:verify_connector_state.isra.72 [i915]] [CONNECTOR:48:eDP-1] <7>[ 354.277718] [drm:intel_atomic_commit_tail [i915]] [CRTC:46:pipe B] <7>[ 354.277912] [drm:verify_single_dpll_state.isra.73 [i915]] DPLL 0 <14>[ 354.289994] [IGT] kms_flip: exiting, ret=99 <3>[ 364.517139] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:46:pipe B] flip_done timed out <3>[ 374.757352] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:46:pipe B] flip_done timed out <6>[ 386.728770] [drm] GPU HANG: ecode 9:1:0xfffffffe, reason: Hang on bcs0, vcs0, vecs0, action: reset <5>[ 386.729156] i915 0000:00:02.0: Resetting bcs0 after gpu hang <7>[ 386.729960] [drm:gen8_init_common_ring [i915]] Execlists enabled for bcs0 <5>[ 386.730083] i915 0000:00:02.0: Resetting vcs0 after gpu hang <7>[ 386.730312] [drm:gen8_init_common_ring [i915]] Execlists enabled for vcs0 <5>[ 386.730320] i915 0000:00:02.0: Resetting vecs0 after gpu hang <7>[ 386.730458] [drm:gen8_init_common_ring [i915]] Execlists enabled for vecs0 <5>[ 390.693538] i915 0000:00:02.0: Resetting rcs0 after gpu hang <7>[ 390.693959] [drm:gen8_init_common_ring [i915]] Execlists enabled for rcs0 <7>[ 390.694170] [drm:init_workarounds_ring [i915]] rcs0: Number of context specific w/a: 16 <3>[ 392.806459] [drm:i915_gem_idle_work_handler [i915]] *ERROR* Timeout waiting for engines to idle <7>[ 392.808157] [drm:missed_breadcrumb [i915]] rcs0 missed breadcrumb at intel_engine_disarm_breadcrumbs+0xb4/0xe0 [i915], irq posted? yes, current seqno=295740, last=295740 <7>[ 392.833212] [drm:intel_atomic_check [i915]] New cdclk calculated to be logical 337500 kHz, actual 337500 kHz <3>[ 402.917171] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:46:pipe B] flip_done timed out <2>[ 412.592339] softdog: Initiating panic Also: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3171/fi-cfl-s/igt@drv_module_reload@basic-reload-inject.html and: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3179/fi-cfl-s/igt@drv_module_reload@basic-no-display.html Note: before softdog is triggered dmesg: <4>[ 536.788670] DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) <4>[ 536.788683] ------------[ cut here ]------------ <4>[ 536.788723] WARNING: CPU: 0 PID: 4959 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60 <4>[ 536.788748] Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet x86_pkg_temp_thermal mii intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel] <4>[ 536.788878] CPU: 0 PID: 4959 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3171+ #1 <4>[ 536.788905] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 <4>[ 536.788939] task: ffff8804146b5440 task.stack: ffffc9000054c000 <4>[ 536.788959] RIP: 0010:mutex_destroy+0x4e/0x60 <4>[ 536.788974] RSP: 0018:ffffc9000054fd58 EFLAGS: 00010292 <4>[ 536.788992] RAX: 000000000000002a RBX: ffff88044d8be108 RCX: 0000000000000000 <4>[ 536.789013] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0 <4>[ 536.789033] RBP: ffffc9000054fd60 R08: 0000000000000001 R09: 0000000000000001 <4>[ 536.789054] R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff8803f5d12e98 <4>[ 536.789074] R13: ffffffffa02c1d20 R14: ffffffffa02c1d90 R15: ffff880459b29308 <4>[ 536.789095] FS: 00007f3c5d91f8c0(0000) GS:ffff88045d200000(0000) knlGS:0000000000000000 <4>[ 536.789119] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 536.789137] CR2: 000055967e4cf090 CR3: 00000004579f7006 CR4: 00000000003606f0 <4>[ 536.789157] Call Trace: <4>[ 536.789172] drm_fb_helper_fini+0xd9/0x130 <4>[ 536.789264] intel_fbdev_destroy+0x12/0x60 [i915] <4>[ 536.789349] intel_fbdev_fini+0x28/0x30 [i915] <4>[ 536.789432] intel_modeset_cleanup+0x45/0xa0 [i915] <4>[ 536.789492] i915_driver_unload+0x92/0x180 [i915] <4>[ 536.789554] i915_pci_remove+0x19/0x30 [i915] <4>[ 536.789607] pci_device_remove+0x39/0xb0 <4>[ 536.789625] device_release_driver_internal+0x15d/0x220 <4>[ 536.789653] driver_detach+0x40/0x80 <4>[ 536.789673] bus_remove_driver+0x58/0xd0 <4>[ 536.789694] driver_unregister+0x2c/0x40 <4>[ 536.789715] pci_unregister_driver+0x36/0xb0 <4>[ 536.789799] i915_exit+0x1a/0x8b [i915] <4>[ 536.789821] SyS_delete_module+0x18c/0x1e0 <4>[ 536.789842] entry_SYSCALL_64_fastpath+0x1c/0xb1 <4>[ 536.789858] RIP: 0033:0x7f3c5be37287 <4>[ 536.789871] RSP: 002b:00007ffdf2f1f208 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 <4>[ 536.789896] RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f3c5be37287 <4>[ 536.789917] RDX: 0000000000000001 RSI: 0000000000000800 RDI: 00005641d2708e48 <4>[ 536.789938] RBP: ffffc9000054ff88 R08: 0000000000000000 R09: 0000000000000080 <4>[ 536.789958] R10: 00007f3c5d91f8c0 R11: 0000000000000246 R12: 0000000000000000 <4>[ 536.789978] R13: 00007ffdf2f1f3f0 R14: 0000000000000000 R15: 0000000000000000 <4>[ 536.790003] ? __this_cpu_preempt_check+0x13/0x20 <4>[ 536.790024] Code: 00 00 5b 5d c3 e8 93 b9 3a 00 85 c0 74 ec 8b 05 e1 53 c3 01 85 c0 75 e2 48 c7 c6 86 a6 c7 81 48 c7 c7 8b 8d c6 81 e8 03 ae 01 00 <0f> ff eb cb 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 b8 <4>[ 536.790210] ---[ end trace 824a7d8ce15a77b4 ]--- <7>[ 536.829138] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 1 <7>[ 536.843505] [drm:drm_do_probe_ddc_edid] drm: skipping non-existent adapter i915 gmbus dpb <7>[ 536.895952] [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 1 <7>[ 536.910417] [drm:drm_dp_dual_mode_detect] DP dual mode HDMI ID: (err -6) <7>[ 536.910442] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:62:HDMI-A-1] disconnected <4>[ 536.910472] general protection fault: 0000 [#1] PREEMPT SMP <4>[ 536.910488] Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet x86_pkg_temp_thermal mii intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel] <4>[ 536.910601] CPU: 4 PID: 224 Comm: kworker/4:3 Tainted: G U W 4.14.0-rc3-CI-CI_DRM_3171+ #1 <4>[ 536.910625] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 <4>[ 536.910718] Workqueue: events intel_dp_modeset_retry_work_fn [i915] <4>[ 536.910738] task: ffff8804573a8040 task.stack: ffffc9000121c000 <4>[ 536.910757] RIP: 0010:drm_setup_crtcs+0x143/0xbf0 <4>[ 536.910771] RSP: 0018:ffffc9000121fd20 EFLAGS: 00010202 <4>[ 536.910787] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000031 RCX: 0000000000000000 <4>[ 536.910806] RDX: 0000000000000003 RSI: 0000000000000780 RDI: ffffc9000121fc98 <4>[ 536.910824] RBP: ffffc9000121fdb8 R08: ffff8804533442f0 R09: ffff88044dae8e88 <4>[ 536.910842] R10: ffffc9000121fb78 R11: 0000000000000000 R12: 0000000000000004 <4>[ 536.910861] R13: ffff88044d8bdfa8 R14: 0000000000000780 R15: 0000000000000438 <4>[ 536.910880] FS: 0000000000000000(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000 <4>[ 536.910901] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 536.910917] CR2: 00007f0b3594e000 CR3: 0000000456b07004 CR4: 00000000003606e0 <4>[ 536.910936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>[ 536.910954] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 <4>[ 536.910971] Call Trace: <4>[ 536.910990] drm_fb_helper_hotplug_event.part.18+0x7e/0xc0 <4>[ 536.911007] drm_fb_helper_hotplug_event+0x1a/0x20 <4>[ 536.911078] intel_fbdev_output_poll_changed+0x1a/0x20 [i915] <4>[ 536.911096] drm_kms_helper_hotplug_event+0x27/0x30 <4>[ 536.911162] intel_dp_modeset_retry_work_fn+0x77/0x80 [i915] <4>[ 536.911181] process_one_work+0x233/0x660 <4>[ 536.911199] worker_thread+0x206/0x3b0 <4>[ 536.911214] kthread+0x152/0x190 <4>[ 536.911226] ? process_one_work+0x660/0x660 <4>[ 536.911240] ? kthread_create_on_node+0x40/0x40 <4>[ 536.911257] ret_from_fork+0x27/0x40 <4>[ 536.911272] Code: 06 00 00 45 8b 45 20 31 db 45 31 e4 45 85 c0 0f 8e 91 06 00 00 44 8b 75 94 44 8b 7d 90 49 8b 45 28 49 63 d4 44 89 f6 41 83 c4 01 <48> 8b 04 d0 44 89 fa 48 8b 38 48 8b 87 a8 01 00 00 ff 50 20 01 <1>[ 536.911432] RIP: drm_setup_crtcs+0x143/0xbf0 RSP: ffffc9000121fd20 <4>[ 536.911472] ---[ end trace 824a7d8ce15a77b5 ]--- <2>[ 618.454192] softdog: Initiating panic <0>[ 618.454213] Kernel panic - not syncing: Software Watchdog Timer expired <0>[ 618.454308] Kernel Offset: disabled The drv_module_reload_basic hangs should be gone now with commit 886c6b8692ba5f71b578097524b3b082e2e02119 (drm-intel/for-linux-next, drm-intel/drm-intel-next-queued) Author: Manasi Navare <manasi.d.navare@intel.com> Date: Thu Oct 26 14:52:00 2017 -0700 drm/i915: Cancel the modeset retry work during modeset cleanup There hasn't been any softdog incomplete on CFL-s since https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3179/fi-cfl-s/igt@drv_module_reload@basic-no-display.html, so I am not sure the patch has anything to do with it. Anyways it is fine with me to close this. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.