This bug is just for recording for posterity what we found out:

On snb CI shards (gt1, but I managed to kill my gt2 a few times too) the system can seemingly hard-hang when running the above testcases.

This was tested on igt commit

commit c8811338e8a7723b5e99a303361ed97c092fc270 (HEAD -> master, fdo/master)
Author: Kelvin Gardiner <kelvin.gardiner@intel.com>
Date:   Tue Jun 27 14:04:51 2017 -0700

    intel-ci: Add fast-feedback-simulation.testlist

Kernel integration manifest is roughly

drm-intel drm-intel-fixes 781cc76e0c2469cb7ac12ba238a4ea006978e321
    drm/i915: Avoid the gpu reset vs. modeset deadlock
drm-upstream drm-fixes 46828dc77961d9286e55671c4dd3b6c9effadf1a
    Merge branch 'linux-4.13' of git://github.com/skeggsb/linux into drm-fixes
drm-intel drm-intel-next-fixes 04941829b0049d2446c7042ab9686dd057d809a6
    drm/i915: Hold RPM wakelock while initializing OA buffer
drm-intel drm-intel-next-queued 4e34935fcf691b2f553fdc34502d649bf979a06f
    drm/i915/cnl: Setup PAT Index.
drm-upstream drm-next 0c697fafc66830ca7d5dc19123a1d0641deaa1f6
    Backmerge tag 'v4.13-rc5' into drm-next
sound-upstream for-next c9480d055e306a855f8a8d2b3b097773cd0d5ad0
    sound: emu8000: constify emu8000_ops
sound-upstream for-linus a8e800fe0f68bc28ce309914f47e432742b865ed
    ALSA: usb-audio: Apply sample rate quirk to Sennheiser headset
drm-intel topic/core-for-CI 01cbe29aa8f8d7ffca23cf6e147a17529fae680e
    e1000e: fix buffer overrun while the I219 is processing DMA transactions
drm-misc drm-misc-next b9c55b6e2cc4369b0688961fa5de0e057f3ec0c4
    drm/vc4: Continue the switch to drm_*_put() helpers
drm-misc drm-misc-next-fixes 1ed134e6526b1b513a14fba938f6d96aa1c7f3dd
    drm/vc4: Fix VBLANK handling in crtc->enable() path
drm-misc drm-misc-fixes a0ffc51e20e90e0c1c2491de2b4b03f48b6caaba
    drm/atomic: If the atomic check fails, return its value first

I'll attach a netconsole log of a typical death, but the tl;dr is that we stall for a few minutes (with not even the NMI watchdog being able to do anything) until eventually the system recovers, the batch completes, and the dpms/modeset-off goes through.
Created attachment 133552 [details]
netconsole capture right around the stall

Includes the following debug patch applied on top:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index decf5da63950..15582af42be7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10485,6 +10485,8 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			return ret;
 	}
 
+	printk("after clock compute\n");
+
 	if (crtc_state->color_mgmt_changed) {
 		ret = intel_color_check(crtc, crtc_state);
 		if (ret)
@@ -12025,6 +12027,8 @@ static int intel_atomic_check(struct drm_device *dev,
 	if (ret)
 		return ret;
 
+	printk("after check_modeset\n");
+
 	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, crtc_state, i) {
 		struct intel_crtc_state *pipe_config =
 			to_intel_crtc_state(crtc_state);
@@ -12089,7 +12093,11 @@ static int intel_atomic_check(struct drm_device *dev,
 		return ret;
 
 	intel_fbc_choose_crtc(dev_priv, state);
-	return calc_watermark_data(state);
+	ret = calc_watermark_data(state);
+
+	printk("end of atomic_check\n");
+
+	return ret;
 }
 
 static int intel_atomic_prepare_commit(struct drm_device *dev,
@@ -12343,7 +12351,9 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	unsigned crtc_vblank_mask = 0;
 	int i;
 
+	printk("before wait\n");
 	intel_atomic_commit_fence_wait(intel_state);
+	printk("after wait\n");
 
 	drm_atomic_helper_wait_for_dependencies(state);
 
@@ -12573,6 +12583,8 @@ static int intel_atomic_commit(struct drm_device *dev,
 		return ret;
 	}
 
+	printk("after atomic prepare commit\n");
+
 	/*
 	 * The intel_legacy_cursor_update() fast path takes care
 	 * of avoiding the vblank waits for simple cursor
commit f978cc027cd02a6c43b54b69fab2b538bbe05330 (HEAD -> master, fdo/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Aug 16 14:39:15 2017 +0100

    lib/dummyload: Pad with a few nops so that we do not completely hog the system

Fingers crossed.
Hello Daniel, has this been verified? Can we close this bug? Thanks.
Closing; please re-open if this still occurs.