On fi-skl-6770hq (newly added), gem_exec_suspend@basic-s3 gets a dmesg warning. E.g. for the last incomplete run, CI_DRM_1640 (15-Sep-2016), dmesg has the following:

...
[ 380.357615] Suspending console(s) (use no_console_suspend to debug)
[ 380.362507] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 380.365508] sd 0:0:0:0: [sda] Stopping disk
[ 380.549425] Broke affinity for irq 123
[ 380.564050] Broke affinity for irq 123
[ 380.583338] Broke affinity for irq 121
[ 380.583341] Broke affinity for irq 123
[ 380.602692] Broke affinity for irq 8
[ 380.602706] Broke affinity for irq 9
[ 380.602718] Broke affinity for irq 120
[ 380.602722] Broke affinity for irq 121
[ 380.602725] Broke affinity for irq 123
[ 380.617164] cache: parent cpu1 should not be sleeping
[ 380.622504] cache: parent cpu2 should not be sleeping
[ 380.627852] cache: parent cpu3 should not be sleeping
[ 380.633436] cache: parent cpu4 should not be sleeping
[ 380.639013] cache: parent cpu5 should not be sleeping
[ 380.644518] cache: parent cpu6 should not be sleeping
[ 380.650026] cache: parent cpu7 should not be sleeping
[ 380.970873] sd 0:0:0:0: [sda] Starting disk
[ 381.275808] ata1.00: supports DRM functions and may not be fully accessible
[ 381.283883] ata1.00: supports DRM functions and may not be fully accessible
[ 382.630079] [drm:intel_dp_link_training_clock_recovery] *ERROR* failed to enable link training
[ 382.906189] [drm:intel_dp_link_training_channel_equalization] *ERROR* failed to start channel equalization
Created attachment 126565 [details] Dmesg before run
Created attachment 126566 [details] Dmesg during run
Try reverting

commit c92bd2fa33038d18858beca3cc65d53c99ce95f6
Author: Navare, Manasi D <manasi.d.navare@intel.com>
Date:   Thu Sep 1 15:08:15 2016 -0700

    drm/i915: Make DP link training channel equalization DP 1.2 Spec compliant

and

commit 13b1996e842aa4164c4d838908bc6dd76c3bd2b2
Author: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Date:   Wed Sep 7 11:28:01 2016 -0700

    drm/dp/i915: Make clock recovery in the link training compliant with DP Spec 1.2
How about we do a bisect before we start reverting changes? Blind dart-throwing may or may not work out, but I'd rather see a data-based decision being made.
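A bisect of this kind needs a repeatable good/bad check for each kernel under test. The following is a minimal sketch of such a check, assuming the dmesg from a gem_exec_suspend@basic-s3 run has already been captured to a text file. The function names and the idea of driving it from `git bisect run` are illustrative assumptions, not something used in this report; the two error strings are the ones from the dmesg excerpt above.

```python
# Error strings copied from the dmesg excerpt in this report.
LINK_TRAINING_ERRORS = (
    "failed to enable link training",
    "failed to start channel equalization",
)


def link_training_errors(dmesg_text):
    """Return the dmesg lines that carry a DRM *ERROR* about link training."""
    hits = []
    for line in dmesg_text.splitlines():
        if "*ERROR*" in line and any(msg in line for msg in LINK_TRAINING_ERRORS):
            hits.append(line.strip())
    return hits


def bisect_exit_code(dmesg_text):
    """Map a captured log to a git-bisect-run style verdict.

    0 marks the revision as good, 1 marks it as bad.
    """
    return 1 if link_training_errors(dmesg_text) else 0


def main(argv):
    """Hypothetical entry point: argv[1] is the path to a captured dmesg file."""
    with open(argv[1]) as log:
        return bisect_exit_code(log.read())
```

A wrapper script could end with `sys.exit(main(sys.argv))` and be passed to `git bisect run`, which treats exit code 0 as good and 1 as bad. Building, booting and running the test on each revision still has to happen outside this check.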
Could you also please tell us which ports physically have devices connected to them and what types (DP / HDMI, etc.)? I can see what dmesg is reporting; I want to verify that what's in there matches the physical hardware configuration.
Before the link training failure in the second log I noticed the following:

[ 380.984744] rtc_cmos 00:04: System wakeup disabled by ACPI

It seems to me that any failures seen after seeing this message would be expected, particularly after suspending. If the system, or part of the system, is still powered down I wouldn't expect anything to necessarily work right.

I'm further confounded by the fact that this failure apparently happened on a new system. If this is a new system, then there is no baseline to show that this test case ever worked on the platform in question, so I don't understand why folks are calling this a regression. A regression implies that there was a known functional test case in the first place that now no longer works.

Can we please re-test this on the platform in question and see that it's a repeatable thing? Additionally, may I suggest making sure that the EFI firmware on that system is up-to-date?
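To make this kind of ordering check repeatable across logs, the messages logged shortly before the first DRM error can be listed automatically. This is only a sketch: it assumes the standard "[ seconds.micros] message" dmesg prefix seen in the attached logs, and the function names and the 2-second window are arbitrary illustrative choices.

```python
import re

# Matches the "[  380.984744] message" prefix used in the attached dmesg logs.
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_dmesg(text):
    """Yield (timestamp_seconds, message) for each well-formed dmesg line."""
    for line in text.splitlines():
        match = DMESG_LINE.match(line.strip())
        if match:
            yield float(match.group(1)), match.group(2)


def messages_before_first_error(text, window_seconds=2.0):
    """Return the messages logged within window_seconds before the first *ERROR*.

    Returns an empty list when the log contains no *ERROR* line.
    """
    events = list(parse_dmesg(text))
    first_error = next((t for t, msg in events if "*ERROR*" in msg), None)
    if first_error is None:
        return []
    return [
        (t, msg)
        for t, msg in events
        if first_error - window_seconds <= t < first_error
    ]
```

Running this over the "Dmesg during run" attachment would show whether the rtc_cmos message and the disk restart fall inside the window before the link training error. It shows only ordering, not cause.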
Hardware: Intel NUC6i7KYK

Displays connected (not connected)
HDMI 4k, mDP 4k (USB-C)

Where in this bug does it say that this is a regression? This is a new system, as written in the bug and as you say too. But this test has failed every time since the system was added, and other SKL systems do not fail on it.

Yann, any updates on your side?
As I said in email, the HDMI 4k monitor may not be a valid configuration depending on whether or not there's proper LSPCON support in the kernel in question. The dmesg output implies that there's some sort of a failure on suspend with the storage subsystem. The failure to resume I indicated is likely the root cause of the link training failure.
(In reply to Jim Bride from comment #4)
> How about we do a bisect before we start reverting changes? Blind
> dart-throwing may or may not work out, but I'd rather see a data-based
> decision being made.

I wasn't suggesting that the commits be reverted from the tree, but rather that we try without them to see if that makes a difference. It is a debug step to try before resorting to bisecting, and it is based on what has changed recently that could cause link training errors rather than on blind guessing.
I looked at these error messages in the driver, and the error "failed to enable link training" is thrown even before the voltage swing retry loop starts. It looks like this happens because the driver is unable to do any AUX writes, which could be the case if the clocks are not set up or if the system has not resumed correctly before link training.
(In reply to Jani Saarinen from comment #7)
> Hardware: Intel NUC6i7KYK
>
> Displays connected (not connected)
> HDMI 4k, mDP 4k (USB-C)
>
> Where is this bug it says this is regression. This is new system as written
> in bug and you say too. But this started to fail all the time when was added
> and other SKL systems do not fail on this.
>
> Yann, any updates on your side

We used same system and display configuration (HDMI 4k + mDP 4k) and confirmed that we are seeing same behavior (dmesg_warn) and issue with kernel:

url: git://anongit.freedesktop.org/drm-intel
branch: drm-intel-nightly
tag: drm-intel-qa-testing-2016-09-19
tree: 97fb7fb559d224b8ea64f9388482700c20d0f52a
parent: 8879a16a66841bed49d5aeea2f61d49905310f8d
commit: 0e34cb5b35f0f837219495c402073141481b1b90
summary: 'drm-intel-nightly: 2016y-09m-19d-15h-38m-53s UTC integration manifest'
author: Jani Nikula <jani.nikula@intel.com>
authored_date: Mon Sep 19 15:39:27 2016 +0000
committer: Jani Nikula <jani.nikula@intel.com>
committed_date: Mon Sep 19 15:39:27 2016 +0000
title: drm-intel-qa
source_dir: /opt/git/misc/drm-intel.git
output_dir: /home/shared/out/kernels/drm-intel-qa/WW38.1_4.8.0-rc7_0e34cb5
config_file: /home/shared/configs/config-drm-nightly
kernel_version: 4.8.0-rc7
Created attachment 126739 [details] Manasi_testing_with_CR_Channel_EQ_reverted_on_boot This is the dmesg log on booting Skull Canyon system with HDMI(LSPCON) port connected to the 4K monitor after removing the Channel EQ and Clock Recovery Commits as requested by Jani.
I tested on the same Skull Canyon system after removing the CR and Channel EQ commits as requested by Jani. There is no display, and I see a bunch of retry errors. I suspect this is caused by a lack of LSPCON support. I am working on testing it again after cherry-picking the LSPCON patches from the mailing list.

Manasi
Hi Manasi, any updates on this?
I was working with some folks from VPG, and they said that they have seen some issues with the connectors in the case of on-chip LSPCON, which caused some link training issues for them. I tested the same with SKL and an external LSPCON board and I don't see any errors. Is there a way to use a SKL with an external LSPCON board on your end to run this BAT test? Also, there are some LSPCON patches that are not merged yet, which you will need for any testing with LSPCON.
Folks,

The last failure described in this bug occurred at 2016-10-25 17:21:50:

CI_DRM_1757/fi-skl-6770hq/igt@gem_exec_suspend@basic-s3
[ 195.052268] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training

Presumably Imre's LSPCON patches, committed on 2016-10-26, made the issue go away. I propose that this be marked as resolved.

Jani Saarinen, do you agree or disagree?
Yes, that is my assumption too. But if you give me a couple of days, I can test it here on the same Skull Canyon system and see if this is fixed.
I agree. I am OK with waiting a few days and then closing this if no issues are seen.
Not seen lately