Bug 105812

Summary: Multimonitor does not work correctly with Modesetting driver on Intel hardware with xorg-server 1.19.99.902
Product: xorg Reporter: Michael Marley <michael>
Component: Driver/modesetting Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED MOVED QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: medium CC: answer2002, bastian.beischer, dvlohp, mat.dot.sch, michael, mike.javorski, peter
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Xorg.0.log from a reproduction of the issue
none
Output of dmesg with drm.debug=0x1e while reproducing the issue
none
Complete Xorg.0.log from a reproduction of the issue on 1.19.99.904
none
Full dmesg output from a reproduction of the issue with 1.19.99.904
none

Description Michael Marley 2018-03-29 20:37:27 UTC
Created attachment 138436 [details]
Xorg.0.log from a reproduction of the issue

After upgrading from xserver 1.19.6 (the stock provided version) to 1.19.99.902 (a custom build) on a Kubuntu 18.04 x86_64 system, multimonitor support has ceased to work correctly.  With the older version it worked as expected.  With the newer version, even though the monitors are set to an extended desktop, the second monitor appears black until KDE starts and then becomes a clone of the first monitor, except that the mouse cursor does not show on the second monitor at all.  Attempting to change the display configuration using the KDE display configuration utility has no effect.

Both monitors are Dell U2312HM monitors connected to the computer by DisplayPort.  The computer is a Lenovo ThinkCentre m900, which has an "Intel(R) HD Graphics 530 (Skylake GT2)" graphics device.  The system is currently running Linux 4.16-rc7, though the kernel version does not have any apparent effect on the problem, and nothing useful is printed in the kernel log.

I have attached the Xorg.0.log file, which contains the line "[   228.979] (EE) modeset(0): failed to set mode: No space left on device" which I believe is related to the problem.

If I can provide any other data, please let me know.
Comment 1 Michael Marley 2018-04-03 13:01:29 UTC
I just tried again with 1.19.99.903 and now the behavior is different, but still buggy.  The display goes into extended desktop mode as it should, but the secondary monitor is blank except for the mouse cursor, and the primary monitor flickers rapidly and intensely whenever the display contents change or move.  The following messages are printed repeatedly in Xorg.0.log:

	Information	[ 87410.817] (WW) modeset(0): flip queue failed: Cannot allocate memory
	Information	[ 87410.817] (WW) modeset(0): Page flip failed: Cannot allocate memory
	Information	[ 87410.817] (EE) modeset(0): present flip failed
Comment 2 Michael Marley 2018-04-11 14:33:04 UTC
This still happens on 1.19.99.904 in exactly the same manner it did on 1.19.99.903.
Comment 3 Olivier Fourdan 2018-04-11 15:13:27 UTC
I think it would be interesting to capture the kernel messages (dmesg) when the issue occurs with drm.debug=0x1e (you can also adjust the drm debug level at runtime using /sys/module/drm/parameters/debug).
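For reference, a minimal sketch of the runtime adjustment mentioned above. It assumes root privileges and that the sysfs parameter accepts a hex string; the helper names `set_drm_debug` and `get_drm_debug` are hypothetical and not part of any existing tool.

```python
from pathlib import Path

# Path taken from the comment above; writing to it normally requires root.
DRM_DEBUG_PARAM = Path("/sys/module/drm/parameters/debug")


def set_drm_debug(mask, param=DRM_DEBUG_PARAM):
    """Write a drm debug bitmask (for example 0x1e) to the parameter file."""
    if mask < 0:
        raise ValueError("debug mask must be non-negative")
    param.write_text(f"{mask:#x}\n")


def get_drm_debug(param=DRM_DEBUG_PARAM):
    """Read the current mask back; base 0 accepts decimal or 0x-prefixed hex."""
    return int(param.read_text().strip(), 0)
```

Calling `set_drm_debug(0x1e)` before reproducing the issue, and `set_drm_debug(0)` afterwards, would keep the extra logging limited to the reproduction window.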
Comment 4 Michael Marley 2018-04-11 16:57:57 UTC
Created attachment 138765 [details]
Output of dmesg with drm.debug=0x1e while reproducing the issue
Comment 5 Olivier Fourdan 2018-04-12 07:16:22 UTC
Humm...

OK, I can't spot any drm error in there, so maybe it is not a drm issue (or I missed it in the logs).

Back to basics...

  * Message "Page flip failed" comes from ms_do_pageflip().
  * Message "flip queue failed" comes from queue_flip_on_crtc(), called from ms_do_pageflip().

The reported error (ENOMEM) is the same, so basically, it means queue_flip_on_crtc() fails because ms_flush_drm_events() fails.
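The step from the log text to an errno name can be checked mechanically. The following is a small sketch, not part of the X server; `errno_names_for` is a hypothetical helper, and the exact message strings depend on the C library in use.

```python
import errno
import os


def errno_names_for(message):
    """Return the errno names whose strerror() text equals `message`."""
    return sorted(
        name
        for code, name in errno.errorcode.items()
        if os.strerror(code) == message
    )
```

On a glibc-based system, `errno_names_for("Cannot allocate memory")` would be expected to include `ENOMEM`, and `errno_names_for("No space left on device")` to include `ENOSPC`, which matches the two messages quoted in this report.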

ms_flush_drm_events() fails either because xserver_poll() fails or drmHandleEvent() fails.

poll() can fail with ENOMEM if there was no space to allocate the file descriptor tables (man 2 poll).

drmHandleEvent() wouldn't return ENOMEM: it returns either 0 (success) or -1 (failure). So indeed, that cannot be a drm failure, and it must be poll() that failed to allocate memory.
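To make the two failure paths above concrete, here is a simplified Python model of the flush step. It is only an illustration under stated assumptions, not the modesetting driver's code: `flush_events` and `handle_event` are hypothetical names, `handle_event` stands in for drmHandleEvent(), and the 0/1 return convention for "nothing pending" and "event handled" is assumed.

```python
import os
import select


def flush_events(fd, handle_event):
    """Model of the flush step: poll the fd, then handle a pending event.

    Returns a negative value on failure, 0 if nothing was pending,
    and 1 if an event was handled.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    try:
        # Zero timeout: only check whether something is already queued.
        ready = poller.poll(0)
    except OSError as exc:
        # The path the analysis points at: poll() itself failed,
        # for example with ENOMEM.
        return -exc.errno
    if not ready:
        return 0
    # Stand-in for drmHandleEvent(): 0 on success, -1 on failure.
    if handle_event(fd) < 0:
        return -1
    return 1
```

With a pipe from `os.pipe()` as the file descriptor, the function can be exercised without any DRM device: it reports nothing pending until a byte is written to the pipe.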
Comment 6 Olivier Fourdan 2018-04-12 09:33:07 UTC
Unfortunately, attachment 138436 [details] is from RC2 and attachment 138765 [details] is from RC4 with different symptoms, and I fail to find the relevant error there :/

Can we try once again:

1. Make sure the kernel log buffer is large enough, e.g. by booting with "log_buf_len=1M"
2. Set drm.debug to the max, 0xff
3. Reboot and reproduce the issue
4. Provide both complete Xorg logs and dmesg from that run
Comment 7 Michael Marley 2018-04-12 13:01:25 UTC
Created attachment 138785 [details]
Complete Xorg.0.log from a reproduction of the issue on 1.19.99.904
Comment 8 Michael Marley 2018-04-12 13:03:24 UTC
Created attachment 138786 [details]
Full dmesg output from a reproduction of the issue with 1.19.99.904

OK, here's the requested data.

Curiously, the problem sometimes doesn't happen when I have all the debugging output turned on (silly heisenbugs…), but I did capture the logs from a reproduction that yielded the same behavior as described previously.
Comment 9 Michael Marley 2018-04-26 20:13:21 UTC
Still happening with 1.19.99.905 with exactly the same symptoms and log output as 1.19.99.904.
Comment 10 Michael Marley 2018-05-11 12:41:21 UTC
This still happens (again with the exact same behavior and log output) in 1.20.  Is there anything else I can provide or something I can do to push this along?  This bug makes the X server unusable on the affected system.
Comment 11 Victor NOEL 2018-05-18 08:36:16 UTC
I confirm this happens to me too since the latest 1.20.

Same as the others, if I can do something to help find the source of the problem, I will be happy to.
Comment 12 Michel Dänzer 2018-05-22 09:20:26 UTC
*** Bug 106593 has been marked as a duplicate of this bug. ***
Comment 13 bozont 2018-06-13 11:17:46 UTC
I can also confirm this, even with the current latest version (1.20.0-7). I had to roll back to 1.19.6-2 to use my vertical monitor. The logs show the same messages that other people have reported. I have a Dell Latitude E5470 (i5 6200u, HD Graphics 520) running Arch Linux. All of my colleagues using linux/xorg can also confirm this issue (including on a Lenovo T470p).

If any further logs/dumps are needed for the resolution, I can provide them.
Comment 14 Peter Wu 2018-08-14 08:54:46 UTC
@bozont Are you actually using the modesetting driver for the Intel hardware?
If you are using the intel module instead, a potential fix for the broken multimonitor support in 1.20 is posted in bug 100086.
Comment 15 Michael Marley 2018-08-14 10:08:22 UTC
I am actually using the modesetting driver, but I seem to have stumbled on a workaround for this bug.  If I disable IOMMU access to the GPU using "intel_iommu=on,igfx_off" in the kernel arguments, it works correctly.  I figure there is still a bug somewhere though, because previous versions of Xorg did not require that workaround.
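A quick way to check whether the workaround is active on a running system is to look at the kernel command line. This is a minimal sketch: `parse_cmdline` and `igfx_iommu_disabled` are hypothetical helpers, and the parser ignores quoted values containing spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def igfx_iommu_disabled(cmdline):
    """True if the intel_iommu= parameter includes the igfx_off option."""
    value = parse_cmdline(cmdline).get("intel_iommu")
    return value is not None and "igfx_off" in value.split(",")
```

On Linux the current command line can be read from /proc/cmdline, for example `igfx_iommu_disabled(open("/proc/cmdline").read())`.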
Comment 16 GitLab Migration User 2018-12-13 18:12:27 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/xserver/issues/62.
