Bug 110629 - Intel i915 graphics driver issues with external monitor when laptop in docking station (opensuse bug 1132926)
Summary: Intel i915 graphics driver issues with external monitor when laptop in dockin...
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL: https://bugzilla.opensuse.org/show_bu...
Whiteboard: Triaged, ReadyForDev
Keywords: regression
Depends on:
Blocks:
 
Reported: 2019-05-06 18:12 UTC by otrebor
Modified: 2019-11-06 09:17 UTC
CC List: 6 users

See Also:
i915 platform: BDW
i915 features: display/watermark


Attachments
test with kernel 5.1.0 on openSUSE 15.0 (390 bytes, text/plain)
2019-05-08 13:04 UTC, otrebor
test with kernel 5.1.0 on openSUSE 15.0 with debug options (gzipped) (299.46 KB, application/gzip)
2019-05-08 13:06 UTC, otrebor
test with kernel 5.2.0-rc3 on openSUSE 15.0 with debug options (168.32 KB, application/gzip)
2019-06-04 21:15 UTC, otrebor

Description otrebor 2019-05-06 18:12:20 UTC
I have a HP EliteBook 840 G2 running openSUSE Leap 15.0 with all patches applied.
The laptop runs fine when it is in normal portable mode (off dock), also with an external monitor. When I dock it into its docking station at home, I experience weird graphic device behaviour. It is a matter of luck if it is able to initialize and configure the attached external monitor the way it is expected. Sometimes it hangs in an endless config loop when trying to initialize the monitor.
The external monitor is a SAMSUNG U28D590D connected via DisplayPort to port #1 of the docking station.
However, if I undock the laptop and plug the very same display cable to the onboard display port then the display and the laptop run as expected. It really is an issue I can only experience when putting the device into the docking station.
For comparison: I had the chance to try another device, a HP EliteBook 840 G3, with the very same docking station and monitor. It was running Windows 7 and I did not see any strange behaviour. Additionally, I was previously running openSUSE 42.3 on the very same device configuration and did not have any issues. The odd behaviour only started after upgrading to Leap 15.0 and got worse with each patch cycle, to the point of being completely unusable as of today. That is why I am writing this bug report.
One thing I occasionally see is this error message in the logs:
kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun

I am attaching output from dmesg, journalctl, and xrandr in different configurations. Let me know if you need more.
Thanks for the support.

Please also refer to opensuse bug #1132926 here for more details:
https://bugzilla.opensuse.org/show_bug.cgi?id=1132926

This might be related to this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=95298
However, my environment uses the Intel i915 graphics driver, as I have no Radeon hardware.

Please also see the opensuse bug first.
Let me know if you need more info.
Thanks for the support.
Comment 1 Lakshmi 2019-05-07 07:06:07 UTC
Can you please try to reproduce this issue with kernel 5.1 or drm-tip (https://cgit.freedesktop.org/drm-tip)? If it persists, please attach the dmesg from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M.
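
For context on the parameters requested here: `drm.debug` is a bitmask of DRM debug categories, and `log_buf_len=4M` enlarges the kernel log buffer so the extra output from boot is not overwritten. The sketch below decodes such a mask in plain user-space C. The category table reflects my reading of the kernel's `include/drm/drm_print.h` around the 5.x series and should be treated as an assumption; `drm_debug_bits` and `drm_debug_describe` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One DRM debug category: its bit in drm.debug and a short label.
 * Values are assumed from include/drm/drm_print.h (5.x era). */
struct drm_debug_bit {
	unsigned int mask;
	const char *name;
};

const struct drm_debug_bit drm_debug_bits[] = {
	{ 0x01,  "CORE"   },
	{ 0x02,  "DRIVER" },
	{ 0x04,  "KMS"    },
	{ 0x08,  "PRIME"  },
	{ 0x10,  "ATOMIC" },
	{ 0x20,  "VBL"    },
	{ 0x40,  "STATE"  },
	{ 0x80,  "LEASE"  },
	{ 0x100, "DP"     },
};

/* Hypothetical helper: write the enabled categories into out,
 * separated by '|'. An empty string means no category is set. */
void drm_debug_describe(unsigned int value, char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;

	assert(out != NULL && outlen > 0);
	out[0] = '\0';

	for (i = 0; i < sizeof(drm_debug_bits) / sizeof(drm_debug_bits[0]); i++) {
		int n;

		if (!(value & drm_debug_bits[i].mask))
			continue;

		n = snprintf(out + used, outlen - used, "%s%s",
			     used ? "|" : "", drm_debug_bits[i].name);
		if (n < 0 || (size_t)n >= outlen - used)
			break; /* buffer too small: stop appending */
		used += (size_t)n;
	}
}
```

With this table, `drm_debug_describe(0x1e, buf, sizeof(buf))` leaves `DRIVER|KMS|PRIME|ATOMIC` in `buf`, so the requested value enables driver, modeset, PRIME and atomic messages while leaving out the core and vblank categories.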
Comment 2 otrebor 2019-05-07 14:41:27 UTC
I have to be careful not to trash my system as this is a much needed production system, and I have no other hw to test.

However, I did this already with a 5.0.10 kernel (packaged by suse) with little success. I can repeat this with the most recent kernels provided by suse's packaging system (now at 5.0.11), as I don't have a full build environment for the kernel. Would that also work?
Comment 3 Takashi Iwai 2019-05-07 15:08:40 UTC
You can try the kernel in OBS Kernel:HEAD repo, which already contains 5.1 final.
  https://download.opensuse.org/repositories/Kernel:/HEAD/standard/
Comment 4 otrebor 2019-05-07 16:24:05 UTC
Super, that's cool :-)
Allow a couple of days for this, as the device is in production use.
Comment 5 otrebor 2019-05-08 13:04:55 UTC
Created attachment 144196
test with kernel 5.1.0 on openSUSE 15.0
Comment 6 otrebor 2019-05-08 13:06:09 UTC
Created attachment 144197
test with kernel 5.1.0 on openSUSE 15.0 with debug options (gzipped)
Comment 7 otrebor 2019-05-08 13:06:26 UTC
ok, done that.
see attached logs
Comment 8 otrebor 2019-05-20 08:40:12 UTC
Any news yet?

When I've tested this with the 5.1 kernel, the behaviour was worse than before.
It went into an endless loop with monitor detection. No chance to influence anything from the on-screen setup. The flickering cycles were just too fast.
The only solution to stop this was to pull the cable of the monitor out.

Can I be of help?
Comment 9 Lakshmi 2019-05-31 08:29:41 UTC
Some patches that fix underruns were recently merged into drm-tip.
I recommend that you verify the issue with the latest drm-tip (https://cgit.freedesktop.org/drm-tip).

If the issue persists with drm-tip, further investigation is needed.
Comment 10 otrebor 2019-06-04 21:15:58 UTC
Created attachment 144453
test with kernel 5.2.0-rc3 on openSUSE 15.0 with debug options

Thanks for coming back

I tested with the very latest kernel available from openSUSE, which is 5.2.0-rc3.

I don't know which version of the drm-tip went in there. Maybe @tiwai can tell more about this?

However, the results are negative.
The behaviour is again different with every patch cycle. Trying to configure the monitor is weird at best. Sometimes, when I lowered the resolution to match the built-in panel (1920x1080), the screen would occasionally show a picture. Most of the time it did not. The GUI was showing the monitor as disabled when in fact it was displaying, and the other way around. Trying to find a stable configuration proved to be impossible.
Comment 11 Lakshmi 2019-06-05 06:02:44 UTC
@Ville, any suggestion here?
Comment 12 Norman Golisz 2019-06-05 14:56:02 UTC
I've got the very same symptoms with my Thinkpad T430, 3 external monitors (Display Port) and docking station. In my case it's OpenBSD 6.5-current after my OS vendor updated the DRM code in April from Linux 4.4 to 4.19.34.

Part of my dmesg:

inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4000" rev 0x09
drm0 at inteldrm0
inteldrm0: msi
inteldrm0: 1920x1080, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
[drm] *ERROR* Link Training Unsuccessful

The last message is repeated 63 times.
Comment 13 Ville Syrjala 2019-06-10 16:49:17 UTC
(In reply to Norman Golisz from comment #12)
> I've got the very same symptoms with my Thinkpad T430, 3 external monitors
> (Display Port) and docking station. In my case it's OpenBSD 6.5-current
> after my OS vendor updated the DRM code in April from Linux 4.4 to 4.19.34.
> 
> Part of my dmesg:
> 
> inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4000" rev 0x09
> drm0 at inteldrm0
> inteldrm0: msi
> inteldrm0: 1920x1080, 32bpp
> wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
> [drm] *ERROR* Link Training Unsuccessful
> 
> The last message is repeated 63 times.

That is a different issue.

otrebor's logs show that the MST link is getting retrained, which among other things causes the FIFO underrun. I have zero faith that the current MST link retraining can do its job properly. We need to rework that code to behave more like the SST code. I.e., move the retraining to the hotplug work, make it grab the appropriate locks for all the active streams going over the same link, make sure it's not racing with ongoing commits, etc. We probably need to hoist the locking higher up into the generic hotplug code so that we can handle the multiple streams properly. Occasionally I've also pondered whether we shouldn't just throw out the special case link retraining paths and just force a full modeset for the pipe(s) in question.

In the meantime we could try to just get rid of the MST retraining code. But if the link has truly failed this will just guarantee that you get no picture:

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 3aef12041672..ab2da1c8e1d9 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4722,15 +4722,6 @@ intel_dp_check_mst_status(struct intel_dp *intel_dp)
                bret = intel_dp_get_sink_irq_esi(intel_dp, esi);
 go_again:
                if (bret == true) {
-
-                       /* check link status - esi[10] = 0x200c */
-                       if (intel_dp->active_mst_links > 0 &&
-                           !drm_dp_channel_eq_ok(&esi[10], intel_dp->lane_count)) {
-                               DRM_DEBUG_KMS("channel EQ not ok, retraining\n");
-                               intel_dp_start_link_train(intel_dp);
-                               intel_dp_stop_link_train(intel_dp);
-                       }
-
                        DRM_DEBUG_KMS("got esi %3ph\n", esi);
                        ret = drm_dp_mst_hpd_irq(&intel_dp->mst_mgr, esi, &handled);
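
For readers unfamiliar with the check that the removed lines perform, here is a minimal user-space sketch of a DisplayPort channel-equalization test over the lane status bytes. It follows my understanding of what the kernel helper `drm_dp_channel_eq_ok()` verifies, but it is not the kernel code: the macro names and `channel_eq_ok` are hypothetical, and the three-byte layout (lanes 0-1, lanes 2-3, lane align status) is an assumption based on the DPCD lane status registers that `esi[10]` onward mirror.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-lane status bits; each lane occupies one nibble. */
#define LANE_CR_DONE          (1u << 0) /* clock recovery done */
#define LANE_CHANNEL_EQ_DONE  (1u << 1) /* channel equalization done */
#define LANE_SYMBOL_LOCKED    (1u << 2) /* symbol lock achieved */
#define LANE_ALL_OK \
	(LANE_CR_DONE | LANE_CHANNEL_EQ_DONE | LANE_SYMBOL_LOCKED)

/* Bit 0 of the lane align status byte. */
#define INTERLANE_ALIGN_DONE  (1u << 0)

/*
 * status[0]: lanes 0 and 1, status[1]: lanes 2 and 3,
 * status[2]: lane align status.
 * Returns true only if inter-lane alignment is done and every
 * active lane reports CR done, EQ done and symbol lock.
 */
bool channel_eq_ok(const uint8_t status[3], int lane_count)
{
	int lane;

	assert(lane_count == 1 || lane_count == 2 || lane_count == 4);

	if (!(status[2] & INTERLANE_ALIGN_DONE))
		return false;

	for (lane = 0; lane < lane_count; lane++) {
		/* Even lanes use the low nibble, odd lanes the high nibble. */
		uint8_t nibble = (status[lane / 2] >> ((lane % 2) * 4)) & 0xf;

		if ((nibble & LANE_ALL_OK) != LANE_ALL_OK)
			return false;
	}

	return true;
}
```

A sink reporting `{ 0x77, 0x77, 0x01 }` passes for four lanes, while a single lane losing symbol lock (for example `0x37` in the first byte with two active lanes) fails the check, which is the condition that triggers the retraining path in the driver code shown in the diff.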
Comment 14 Lakshmi 2019-06-25 13:14:59 UTC
@otrebor, can you please apply the above diff and report the results?
Comment 15 otrebor 2019-07-03 16:17:48 UTC
I do not have a build environment for either the kernel or the drivers, sorry.
I also lack experience in how to do that.

For the previous tests I took a test kernel from here:
https://download.opensuse.org/repositories/Kernel:/HEAD/standard/
If that one contains this fix, then testing it is easy.

Another private repo may work as well, as long as I get the kernel and the modules as an openSUSE rpm package. Is that possible?
Comment 16 Jani Saarinen 2019-11-06 09:17:13 UTC
Stan, do you have any comments here?

