Bug 100086 - xorg server 1.19.2: Crash with PRIME and multiple displays
Status: REOPENED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Duplicates: 106891 107253
Depends on:
Blocks:
 
Reported: 2017-03-06 17:42 UTC by bastian.beischer
Modified: 2019-11-08 17:10 UTC

See Also:
i915 platform:
i915 features:


Attachments
xorg.log v0 (40.07 KB, text/plain)
2017-03-07 08:43 UTC, bastian.beischer
xorg.log from 1.19.1 (32.80 KB, text/plain)
2017-03-07 09:09 UTC, bastian.beischer

Description bastian.beischer 2017-03-06 17:42:07 UTC
Arch Linux just updated xorg-server to 1.19.2.

Since the update I'm seeing problems with my second display. I'm using a Lenovo W520 laptop which features an Intel integrated GPU and a dedicated NVIDIA GF106GLM [Quadro 2000M] GPU.

The external display port is attached to the NVIDIA GPU, but I'm using the Intel GPU as the primary one. I'm using PRIME and I'm configuring the Intel GPU to offload to the NVIDIA GPU for display on the external display:

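# Use the Intel GPU as the image source for the outputs on the nouveau GPU.
xrandr --setprovideroutputsource nouveau Intel
# Look up the connected internal panel and the external DisplayPort output.
primary=$(xrandr | grep -e 'LVDS' | grep -v -e 'disconnected' | awk '{print $1}')
office_con=$(xrandr | grep -e 'DP-1' | grep -v -e 'disconnected' | awk '{print $1}')
# Only place the external display if it is actually connected.
if [[ -n ${office_con} ]]; then
    xrandr --output "${office_con}" --auto --right-of "${primary}"
fi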
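
One caveat with the grep pipeline above: it selects outputs by substring, so 'DP-1' would also match a name such as 'eDP-1'. A minimal sketch of a stricter lookup, assuming the usual xrandr query format in which the second field of an output line is "connected" or "disconnected". The function name connected_output and the sample text are hypothetical and not taken from the reporter's machine:

```shell
# connected_output PREFIX: read `xrandr` query output on stdin and print the
# first output whose name starts with PREFIX and whose state is "connected".
connected_output() {
    awk -v prefix="$1" '$2 == "connected" && index($1, prefix) == 1 { print $1; exit }'
}

# Illustrative sample only (not from the attached logs).
sample='Screen 0: minimum 8 x 8, current 1920 x 1080, maximum 32767 x 32767
LVDS-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 193mm
eDP-1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 193mm
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 510mm x 290mm'

printf '%s\n' "$sample" | connected_output LVDS   # prints LVDS-1
printf '%s\n' "$sample" | connected_output DP-1   # prints nothing
```

On a live session the sample would be replaced by the real query, for example `xrandr | connected_output DP-1`.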

Furthermore, I'm using KDE Plasma 5.9 as my desktop environment. After logging in via SDDM, the picture on the external screen does not update; it just shows the black Plasma splash screen, even though Plasma has fully started on the left screen. I can move the mouse over the picture on the second screen and I can see the cursor move.

When I disable the second screen in the KDE System Settings (or using xrandr) and then re-enable it, the X server freezes and I have to forcibly shut down my laptop.

Downgrading to xorg-server 1.19.1-5 fixes the problem. The Arch package 1.19.1-5 contained the following two patches on top of 1.19.1:

https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/xorg-server&id=63f2ddee51705b0055041fdf67895d7383cd07cc
https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/xorg-server&id=fdb75aee720eedd503a7a0ce819e45a6e13a0705
Comment 1 Michel Dänzer 2017-03-07 06:35:55 UTC
Please attach the corresponding Xorg log file.
Comment 2 bastian.beischer 2017-03-07 08:43:09 UTC
Created attachment 130105 [details]
xorg.log v0

xorg log file for session in which

a) picture on external screen is frozen (except for mouse) after login (timestamp ~221)
b) external screen is disabled (timestamp ~318)
c) external screen is enabled again and picture is again frozen (timestamp ~375)
Comment 3 bastian.beischer 2017-03-07 08:46:21 UTC
The log file I attached differs slightly in symptoms from the original report: There's no crash of the X server, but the picture on the external screen is frozen (except for the mouse pointer) just like in the original report.

Yesterday my X server crashed when I disabled and reenabled the external screen, but I can't reproduce that at the moment.
Comment 4 Jonas Lundberg 2017-03-07 08:58:50 UTC
I second this bug, same behavior on gnome 3.22.2.
Downgrading the xorg-server from 1.19.2-1 to 1.19.1-5 fixes it.
Comment 5 bastian.beischer 2017-03-07 09:09:50 UTC
Created attachment 130106 [details]
xorg.log from 1.19.1

Here's the xorg.log for a session which does not show the bug (from xorg 1.19.1). Unfortunately it looks identical to xorg.log_v0.
Comment 6 Michel Dänzer 2017-03-07 09:36:24 UTC
Does the problem also occur with the modesetting driver instead of the intel driver?
Comment 7 bastian.beischer 2017-03-07 10:23:04 UTC
No, it seems to be working with the modesetting driver. (I switched both GPUs to the modesetting driver. I tried to do it only for the Intel GPU, but I think I failed; at least xrandr --listproviders names them both "modesetting".)
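
To check which driver each provider ended up with, the names can be extracted from the xrandr --listproviders output. A minimal sketch, assuming each provider line ends in a "name:<driver>" field as xrandr prints it. The function name provider_names and the sample text are hypothetical:

```shell
# provider_names: read `xrandr --listproviders` output on stdin and print
# one provider name per line.
provider_names() {
    sed -n 's/^Provider [0-9]*:.* name:\(.*\)$/\1/p'
}

# Illustrative sample only (ids and capabilities are made up).
sample='Providers: number : 2
Provider 0: id: 0x66 cap: 0xb, Source Output, Sink Output, Sink Offload crtcs: 4 outputs: 5 associated providers: 1 name:Intel
Provider 1: id: 0x3f cap: 0x7, Source Output, Sink Output, Source Offload crtcs: 2 outputs: 3 associated providers: 1 name:nouveau'

printf '%s\n' "$sample" | provider_names
```

On a live session this would be `xrandr --listproviders | provider_names`; two lines reading "modesetting" would indicate that both GPUs use the modesetting driver.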
Comment 8 Michel Dänzer 2017-03-08 01:55:45 UTC
Looks like an intel SNA bug. sna_accel_post_damage/migrate_dirty_tracking need to be fixed to properly handle the root window pixmap not being the screen pixmap (while a Present client is flipping).
Comment 9 Chris Wilson 2017-03-09 11:18:39 UTC
You've got to be kidding me.

The screen->pixmap_dirty_list does not track the dirty pixmaps anymore due to

commit b5b292896f647c85f03f53b20b2f03c0e94de428
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Feb 1 18:35:56 2017 +0900

    prime: Sync shared pixmap from root window instead of screen pixmap
Comment 10 Michel Dänzer 2017-03-16 01:57:26 UTC
Thanks for the report, fixed in the 1.19.3 release.

commit 1097bc9c184db4c722d5a8d2c5a4c0da9cdc70f5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Mar 9 11:25:34 2017 +0000

    Revert "prime: Sync shared pixmap from root window instead of screen pixmap"
Comment 11 bastian.beischer 2018-05-16 15:32:19 UTC
I believe that 1097bc9c184db4c722d5a8d2c5a4c0da9cdc70f5 wasn't committed to the branch which was used for the 1.20.0 release?

I'm seeing problems in 1.20.0 again.
Comment 12 Michel Dänzer 2018-05-16 17:18:00 UTC
(In reply to bastian.beischer from comment #11)
> I'm seeing problems in 1.20.0 again.

Yes, the change remains in 1.20's ABI, xf86-video-intel needs to be adapted to it.
Comment 13 bastian.beischer 2018-06-07 09:44:08 UTC
Do we have any updates here?

I should add that the symptoms are different from the original report: I don't observe a crash anymore. Instead, the second display is blank except for the mouse cursor, which is visible and moves when the mouse moves. That makes me wonder whether there is any connection to:

https://bugs.freedesktop.org/show_bug.cgi?id=105812

I also tried to switch both GPU drivers to modesetting, but I had no luck getting the external output to work at all (which was not a problem with X server 1.19). That might be a different bug, since I'm getting EDID-related errors in xrandr.

All I can say with certainty at the moment is that there are definitely issues with NVIDIA (reverse) prime setups with X server 1.20.
Comment 14 Michel Dänzer 2018-06-11 16:13:43 UTC
*** Bug 106891 has been marked as a duplicate of this bug. ***
Comment 15 Michel Dänzer 2018-07-17 08:00:19 UTC
*** Bug 107253 has been marked as a duplicate of this bug. ***
Comment 16 Peter Wu 2018-08-14 08:43:59 UTC
Hi, after being handicapped for a few months, I decided to give it a go. The current intel module (built for 1.20) has a type confusion issue (assuming Pixmap, got Window) and crashes under ASAN, so that was an easy way to detect the cause.

Proposed patches:
[PATCH xf86-video-intel] SNA: fix PRIME output support since xserver 1.20
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173523.html

Optional (it does not affect normal operation, only server exit):
[PATCH xserver] randr: fix RRCrtcDetachScanoutPixmap crash on server exit
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173523.html
Comment 17 Peter Wu 2018-08-14 08:47:34 UTC
Oops, the most important patch had the wrong link, it should be:

[PATCH xf86-video-intel] SNA: fix PRIME output support since xserver 1.20
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173522.html
Comment 18 Jorge Luis Martinez Gomez 2018-08-15 21:51:39 UTC
Awesome to see a patch. Thank you Peter Wu. I'll try it out as soon as I get back to work on Monday. I didn't get the crash, so I can't test that, but I do get the blank screen with the cursor.
Comment 19 Jorge Luis Martinez Gomez 2018-08-17 15:41:59 UTC
I just tested the SNA patch Peter Wu shared and can confirm it works. :)
Comment 20 Yurii Kolesnykov 2019-09-28 13:38:08 UTC
This patch ships in the Arch Linux xf86-video-intel package[0] and seems to work. I maintain an alternative package in the AUR, xf86-video-intel-git[1], which is very similar to the official one but uses the latest master instead of sticking with a fixed commit.

I now see that the patch needs to be rebased[2]. Peter, could you please rebase it and submit it again? Maybe it will get more attention now.

[0]https://archlinux.org/packages/extra/x86_64/xf86-video-intel/
[1]https://aur.archlinux.org/packages/xf86-video-intel-git/
[2]https://aur.archlinux.org/packages/xf86-video-intel-git/#comment-709686
Comment 21 Peter Wu 2019-09-29 15:07:35 UTC
Ville has applied a different patch that repeats much of my earlier patch:
https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=581ddc5d2f55efa2cf5ec76a802fb781ee142b01

However, it appears that it misses one crucial detail, which will most likely result in crashes (assertion failed). I'll try to test the latest git without the patch and prepare an updated patch if needed.
Comment 22 Jorge Luis Martinez Gomez 2019-10-25 00:50:26 UTC
@Peter Wu

Well, I got the blank screen with the cursor again. I filed a regression bug report on Arch Linux here:

https://bugs.archlinux.org/task/64238

There, loqs provided a patch that fixed the issue.
Comment 23 Peter Wu 2019-11-08 17:10:46 UTC
I can confirm that 1:2.99.917+893+gbff5eca4-1 on Arch Linux resulted in a black screen with just a cursor being displayed. Not great if you only have a few minutes before giving a presentation to a full room of people... I rebased the patch and reached the exact same diff as the patch in the linked issue tracker.

I will do an ASAN test while the European Intel developers are enjoying their weekend, and submit a patch on Monday.

