Bug 99334 - [SKL] External screen not recognized on boot, disconnects during operation
Summary: [SKL] External screen not recognized on boot, disconnects during operation
Status: CLOSED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-09 19:20 UTC by post+fdo
Modified: 2018-03-08 17:45 UTC

See Also:
i915 platform: SKL
i915 features: display/HDMI


Attachments
dmesg captured with drm-tip (341.82 KB, text/plain)
2017-01-09 19:20 UTC, post+fdo
lspci -vv for the GPU (2.38 KB, text/plain)
2017-01-09 19:21 UTC, post+fdo
xrandr -q --verbose after a disconnect (11.00 KB, text/plain)
2017-01-09 19:21 UTC, post+fdo
xrandr -q after a disconnect (2.85 KB, text/plain)
2017-01-09 19:21 UTC, post+fdo

Description post+fdo 2017-01-09 19:20:45 UTC
Created attachment 128835
dmesg captured with drm-tip

I am having trouble with an external monitor connected via HDMI.  This is with an Acer Aspire VN7-592G.  The machine has an HDMI port and a Thunderbolt 3 port; all of this is with a screen directly connected to the HDMI port (and nothing connected to Thunderbolt).  I am on Debian testing and using KDE.  The trouble may not be specific to HDMI at all, that's just the only connector I happen to be able to try right now.

The issue is as follows:  The external screen is not recognized properly when it is plugged in during boot.  "xrandr" shows no external screen after logging in.  I have to unplug and re-plug the screen for it to be detected.  Sometimes, I have to do this multiple times.  Only then will it show up in xrandr and be recognized by the screen setup of my DE.  I then configure my system to use only the external screen, i.e., the internal one is disabled.
  Furthermore, there are some things that may trigger the screen to suddenly disconnect (so that the screen setup tool automatically re-enables the internal screen):  Starting a game (tried this with neverball [both in windowed and full-screen mode] and a proprietary game on Steam [full-screen]), starting Steam (at the moment the Steam main window pops up), restarting kwin, logging out.  In each case (potentially after logging back in), I can unplug and replug the screen (potentially multiple times) and then everything works again.  If I try this often enough (well, I tried with neverball as that is fastest to start), at some point, it will stop making trouble -- the external screen will stay connected with neverball started -- and from then on, starting Steam or other, proprietary games also generally works.  Somehow, this fix is "sticky" until I reboot.  Note that this does not depend on a modeset happening; I saw the disconnect with neverball in windowed mode and with the Steam main window.
  On drm-tip, when the external screen disconnects, it can also happen that the X session dies.  I was unable to find any backtrace or anything in the X logs, they just end with some modesetting stuff.  When I wanted to reproduce this to capture the logs, it did not happen again.  The X session dying never happened with the Debian 4.8 kernel or with a vanilla 4.9.1 kernel.
  Needless to say, this gets rather frustrating as it becomes hard to play a game on the external screen.  The screen missing after boot is less of a problem, I don't frequently reboot anyway.

I am honestly quite lost about how to even start diagnosing this.  I report this bug against the Kernel component as I saw this behavior both with the modesetting Xorg driver and with the intel Xorg driver -- and I can't imagine this being the fault of OpenGL, but what do I know?

Attached you can find:

* dmesg-4.10:  This is a dmesg (via journalctl as "dmesg" would only output parts of the logs) from drm-tip (9ea6a075).  I booted the machine, logged in, unplugged and re-plugged the screen, then started neverball.  This triggered a disconnect.
* xrandr-strange-verbose:  The output of "xrandr -q --verbose" after neverball was started and the screen got disconnected on the Debian 4.8 kernel with the modesetting Xorg driver.  This looks rather strange, there are some additional modelines being printed after all screens have been listed; these modes seem not to belong to any screen?  Notice that these are *not* modes of HDMI-2, which is disconnected.  Furthermore, these modes are shown in "verbose" form even without the "--verbose" flag, as the attachment xrandr-strange (output of "xrandr -q" captured on another occasion) shows.
* lspci: What lspci -vv has to say about my GPU.

Please let me know if you need any further information.  I would be very grateful for any assistance.
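
For cross-checking what xrandr reports, the kernel's own view of each connector can be read from sysfs, independent of X and the desktop environment. A minimal sketch, assuming the usual per-connector `status` files under /sys/class/drm (directories named like card0-HDMI-A-1); the function name `connector_states` is made up for illustration:

```python
from pathlib import Path


def connector_states(drm_root="/sys/class/drm"):
    """Return {connector_name: status} for every DRM connector found.

    Connector directories are named like "card0-HDMI-A-1" and contain a
    "status" file holding "connected", "disconnected" or "unknown".
    """
    states = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        try:
            states[status_file.parent.name] = status_file.read_text().strip()
        except OSError:
            # Connector vanished or is unreadable; skip it.
            continue
    return states


if __name__ == "__main__":
    for name, status in connector_states().items():
        print(f"{name}: {status}")
```

Running this right after boot and again after each re-plug would show whether the kernel itself misses the screen or whether only the X side does.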
Comment 1 post+fdo 2017-01-09 19:21:13 UTC
Created attachment 128836
lspci -vv for the GPU
Comment 2 post+fdo 2017-01-09 19:21:31 UTC
Created attachment 128837
xrandr -q --verbose after a disconnect
Comment 3 post+fdo 2017-01-09 19:21:46 UTC
Created attachment 128838
xrandr -q after a disconnect
Comment 4 post+fdo 2017-01-12 13:30:11 UTC
I did some brief tests with a Thunderbolt 3 -> HDMI adapter I borrowed from a friend, and these tests indicate that the problem *only* arises with the HDMI port on the laptop. If I use HDMI through the Thunderbolt port, there seems to be no problem.

(Btw, wouldn't the tag "display/HDMI" be more accurate than "display/DP"?)
Comment 5 Ville Syrjala 2017-01-12 14:09:45 UTC
Jan 09 20:09:34 r-schwarzschild kernel: [drm:intel_hdmi_detect [i915]] [CONNECTOR:59:HDMI-A-1]
Jan 09 20:09:34 r-schwarzschild kernel: [drm:drm_dp_dual_mode_detect [drm_kms_helper]] DP dual mode HDMI ID: DP-HDMI ADAPTOR\004 (err 0)
Jan 09 20:09:34 r-schwarzschild kernel: [drm:drm_dp_dual_mode_detect [drm_kms_helper]] DP dual mode adaptor ID: a0 (err 0)
Jan 09 20:09:34 r-schwarzschild kernel: [drm:intel_hdmi_set_edid [i915]] DP dual mode adaptor (type 2 HDMI) detected (max TMDS clock: 300000 kHz)
Jan 09 20:09:34 r-schwarzschild kernel: [drm:i915_hotplug_work_func [i915]] [CONNECTOR:59:HDMI-A-1] status updated from disconnected to connected

So there we were able to read the EDID.

[CONNECTOR:59:HDMI-A-1]
Jan 09 20:09:42 r-schwarzschild kernel: [drm:intel_hdmi_detect [i915]] [CONNECTOR:59:HDMI-A-1]
Jan 09 20:09:42 r-schwarzschild kernel: [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpb] NAK for addr: 0050 w(1)
Jan 09 20:09:42 r-schwarzschild kernel: [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpb] NAK on first message, retry
Jan 09 20:09:42 r-schwarzschild kernel: [drm:gmbus_xfer [i915]] GMBUS [i915 gmbus dpb] NAK for addr: 0050 w(1)
Jan 09 20:09:42 r-schwarzschild kernel: [drm:drm_do_probe_ddc_edid [drm]] drm: skipping non-existent adapter i915 gmbus dpb
Jan 09 20:09:42 r-schwarzschild kernel: [drm:drm_dp_dual_mode_detect [drm_kms_helper]] DP dual mode HDMI ID: DP-HDMI ADAPTOR\004 (err 0)
Jan 09 20:09:42 r-schwarzschild kernel: [drm:drm_dp_dual_mode_detect [drm_kms_helper]] DP dual mode adaptor ID: a0 (err 0)
Jan 09 20:09:42 r-schwarzschild kernel: [drm:intel_hdmi_set_edid [i915]] DP dual mode adaptor (type 2 HDMI) detected (max TMDS clock: 300000 kHz)
Jan 09 20:09:42 r-schwarzschild kernel: [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:59:HDMI-A-1] status updated from connected to disconnected
Jan 09 20:09:42 r-schwarzschild kernel: [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:59:HDMI-A-1] disconnected

but here, a bit later, we are again unable to read it. It's quite weird, because the i2c bus itself seems to be fine: we can still detect the DP dual mode chip present on the bus. My only theory is that the dual mode chip is somehow going crazy and preventing the i2c access for the EDID read from reaching the actual display.
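
Whether the kernel currently holds a usable EDID for the connector can also be checked from userspace. A minimal sketch, assuming the connector's `edid` file in sysfs (for example /sys/class/drm/card0-HDMI-A-1/edid, where the card index is a guess) and using only the fixed 8-byte header and the checksum of the 128-byte EDID base block; the function name is illustrative:

```python
# Fixed pattern that starts every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_base_block_ok(edid: bytes) -> bool:
    """Check header and checksum of the 128-byte EDID base block."""
    if len(edid) < 128:
        # Too short to be an EDID, e.g. an empty file when no EDID was read.
        return False
    block = edid[:128]
    # All 128 bytes of a valid block sum to 0 modulo 256.
    return block[:8] == EDID_HEADER and sum(block) % 256 == 0


# Example use (path is an assumption, adjust card and connector name):
#   with open("/sys/class/drm/card0-HDMI-A-1/edid", "rb") as f:
#       print(edid_base_block_ok(f.read()))
```

This only tells whether an EDID was read and looks intact; it says nothing about why a read failed.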
Comment 6 yann 2017-01-12 14:11:23 UTC
(In reply to post+fdo from comment #4)
> I did some brief tests with a Thunderbolt 3 -> HDMI adapter I borrowed from a
> friend, and these tests indicate that the problem *only* arises with the
> HDMI port on the laptop. If I use HDMI through the Thunderbolt port, there
> seems to be no problem.

thanks for this additional test and info

> 
> (Btw, wouldn't the tag "display/HDMI" be more accurate than "display/DP"?)

The reason I put it under DP is what Ville pointed out.
Comment 7 post+fdo 2017-02-11 13:59:45 UTC
> I did some brief tests with a Thunderbolt 3 -> HDMI adapter I borrowed from a
> friend,

Actually I now learned that this was not using Thunderbolt 3 but "just" the "alternate data mode" of the USB-C connector.

Is there any other information I could provide, anything else I could do to help solve this issue?
Comment 8 post+fdo 2017-03-31 10:02:46 UTC
> If I use HDMI through the Thunderbolt port, there seems to be no problem.

I may have been a little over-optimistic here; while one adapter (from Apple) works fine, I am having huge trouble with a USB-C docking station from Dell.  About one in ten times, when I connect it, the entire machine just freezes and I have to hard-reboot. Frequently, not even SysRq keys will react.

Also, it occasionally (like twice a week) happens that an HDMI screen is not recognized even if I re-plug it 10 times.  In this case I have to reboot to get it to connect.

Essentially, I now find myself rebooting the laptop 3 or 4 times a week because of issues with the external screen. It's like the 90s all over again...
Comment 9 Elizabeth 2017-06-29 17:35:20 UTC
(In reply to post+fdo from comment #8)
> > If I use HDMI through the Thunderbolt port, there seems to be no problem.
> 
> I may have been a little over-optimistic here; while one adapter (from
> Apple) works fine, I am having huge trouble with a USB-C docking station
> from Dell.  About one in ten times, when I connect it, the entire machine
> just freezes and I have to hard-reboot. Frequently, not even SysRq keys will
> react.
> 
> Also, it occasionally (like twice a week) happens that an HDMI screen is not
> recognized even if I re-plug it 10 times.  In this case I have to reboot to
> get it to connect.
> 
> Essentially, I now find myself rebooting the laptop 3 or 4 times a week
> because of issues with the external screen. It's like the 90s all over
> again...

Hello,
Sorry for the delay. Some features for docking stations were included in kernel 4.12. Could you please try to reproduce the bug with this version and share the results? Thank you.
Comment 10 Elizabeth 2017-09-12 18:47:28 UTC
(In reply to Elizabeth from comment #9)
> (In reply to post+fdo from comment #8)
> > > If I use HDMI through the Thunderbolt port, there seems to be no problem.
> > ...
> ...
Hello again, 
Have you been able to retest with the latest drm-tip or vanilla mainline? At the moment we don't have the specific model of docking station that you're using to try and reproduce it, so it would be really helpful if you could give us some feedback if this is still a problem.
Thank you.
Comment 11 post+fdo 2017-09-15 09:42:10 UTC
Thanks for following up on this!

I finally updated to 4.12 a week ago.  I can already say that the HDMI problems described in the first post persist:  I often have to plug in a screen, then unplug it, then plug it in again.  Also, starting games or restarting my window manager frequently disconnects the external screen.

As far as the docking station is concerned, I have been using it for 2 days now (i.e., 2 connects and 2 disconnects) and that worked fine.  I will watch it for some more time to see if the crashes return.

The docking station is a Dell WD15.
Comment 12 Elizabeth 2017-09-15 14:59:06 UTC
Thanks for the update. Would you mind sharing a new dmesg from the latest kernel, so we can check whether it shows the same as in comment #5?
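
To compare a new log against the pattern in comment #5, it may help to pull out just the connector status changes and the GMBUS NAKs. A minimal sketch, assuming the debug message formats quoted in comment #5; the regular expressions and the `summarize` helper are illustrative and not part of any i915 tooling:

```python
import re

# "[CONNECTOR:59:HDMI-A-1] status updated from connected to disconnected"
STATUS_RE = re.compile(
    r"\[CONNECTOR:(?P<id>\d+):(?P<name>[^\]]+)\] status updated from "
    r"(?P<old>\w+) to (?P<new>\w+)"
)
# "GMBUS [i915 gmbus dpb] NAK for addr: 0050 w(1)"
NAK_RE = re.compile(
    r"GMBUS \[(?P<bus>[^\]]+)\] NAK for addr: (?P<addr>[0-9a-fA-F]+)"
)


def summarize(lines):
    """Return status transitions and GMBUS NAKs in the order they appear."""
    events = []
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            events.append(("status", m["name"], m["old"], m["new"]))
            continue
        m = NAK_RE.search(line)
        if m:
            # The address is logged in hex; 0x50 is the EDID address on DDC.
            events.append(("nak", m["bus"], int(m["addr"], 16)))
    return events
```

With the kernel log saved to a file, `summarize(open("dmesg.txt"))` gives a short timeline; a NAK at address 0x50 shortly before a connected-to-disconnected transition would match what comment #5 describes.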
Comment 13 post+fdo 2017-09-26 08:40:55 UTC
After almost two weeks of using the Dell WD15 docking station flawlessly on the 4.12 kernel, I now had the first full-freeze again.  I just connected the docking station, and then my machine froze.  I had to use SysRq to reboot.

This is after I upgraded to 4.13.3 yesterday.
Comment 14 post+fdo 2017-10-27 08:21:49 UTC
Sorry, I am not using that laptop any more.  The next owner is going to use Windows on it, after all the trouble I had (not just this, also broken UEFI and other issues).
Comment 15 Elizabeth 2018-01-22 17:48:03 UTC
I'm closing this ticket as INVALID, since nobody else has reported seeing this issue and the original device will no longer be used in this way. It was probably an issue specific to that device or to this particular configuration. Thanks.

