Summary: | xorg server 1.19.2: Crash with PRIME and multiple displays
---|---
Product: | xorg
Component: | Driver/intel
Status: | RESOLVED FIXED
Severity: | major
Priority: | medium
Version: | unspecified
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Reporter: | bastian.beischer
Assignee: | Chris Wilson <chris>
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: | andyrtr, bastian.beischer, dongeryduo, jol, jonas.h.lundberg, michel, peter, root, v_bachvarov, ville.syrjala
Description
bastian.beischer
2017-03-06 17:42:07 UTC
Please attach the corresponding Xorg log file.

Created attachment 130105
xorg.log v0
xorg log file for session in which
a) picture on external screen is frozen (except for mouse) after login (timestamp ~221)
b) external screen is disabled (timestamp ~318)
c) external screen is enabled again and picture is again frozen (timestamp ~375)
The log file I attached differs slightly in symptoms from the original report: there's no crash of the X server, but the picture on the external screen is frozen (except for the mouse pointer), just like in the original report. Yesterday my X server crashed when I disabled and re-enabled the external screen, but I can't reproduce that at the moment.

I second this bug; same behavior on GNOME 3.22.2. Downgrading xorg-server from 1.19.2-1 to 1.19.1-5 fixes it.

Created attachment 130106
xorg.log from 1.19.1
Here's the xorg.log for a session which does not show the bug (from xorg 1.19.1). Unfortunately it looks identical to xorg.log_v0.

Does the problem also occur with the modesetting driver instead of the intel driver?

No, it seems to be working with the modesetting driver (I switched both GPUs to the modesetting driver; I tried to do it only for the Intel GPU but I think I failed, at least xrandr --listproviders names them both "modesetting").

Looks like an intel SNA bug. sna_accel_post_damage/migrate_dirty_tracking need to be fixed to properly handle the root window pixmap not being the screen pixmap (while a Present client is flipping).

You've got to be kidding me. The screen->pixmap_dirty_list does not track the dirty pixmaps anymore due to

commit b5b292896f647c85f03f53b20b2f03c0e94de428
Author: Michel Dänzer <michel.daenzer@amd.com>
Date: Wed Feb 1 18:35:56 2017 +0900

    prime: Sync shared pixmap from root window instead of screen pixmap

Thanks for the report, fixed in the 1.19.3 release.

commit 1097bc9c184db4c722d5a8d2c5a4c0da9cdc70f5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Mar 9 11:25:34 2017 +0000

    Revert "prime: Sync shared pixmap from root window instead of screen pixmap"

I believe that 1097bc9c184db4c722d5a8d2c5a4c0da9cdc70f5 wasn't committed to the branch which was used for the 1.20.0 release? I'm seeing problems in 1.20.0 again.

(In reply to bastian.beischer from comment #11)
> I'm seeing problems in 1.20.0 again.
Yes, the change remains in 1.20's ABI, xf86-video-intel needs to be adapted to it.

Do we have any updates here? I should add that the symptoms are different from the original report: I don't observe a crash anymore, but instead the second display is blank except for the mouse cursor, which is visible and moves if the mouse moves. That makes me wonder whether there's any connection to: https://bugs.freedesktop.org/show_bug.cgi?id=105812
I also tried to switch both GPU drivers to modesetting, but I had no luck getting the external output to work at all (which was no problem in X server 1.19 either). That might be a different bug since I'm getting EDID related errors in xrandr. All I can say with certainty at the moment is that there are definitely issues with NVIDIA (reverse) prime setups with X server 1.20.

*** Bug 106891 has been marked as a duplicate of this bug. ***

*** Bug 107253 has been marked as a duplicate of this bug. ***

Hi, after being handicapped for a few months, I decided to give it a go. The current intel module (built for 1.20) has a type confusion issue (assuming Pixmap, got Window) and crashes under ASAN, so that was an easy way to detect the cause.
Proposed patches:
[PATCH xf86-video-intel] SNA: fix PRIME output support since xserver 1.20
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173523.html
Optional (it does not affect normal operation, only server exit):
[PATCH xserver] randr: fix RRCrtcDetachScanoutPixmap crash on server exit
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173523.html

Oops, the most important patch had the wrong link, it should be:
[PATCH xf86-video-intel] SNA: fix PRIME output support since xserver 1.20
https://lists.freedesktop.org/archives/intel-gfx/2018-August/173522.html

Awesome to see a patch. Thank you Peter Wu. I'll try it out as soon as I get back to work on Monday. I didn't get the crash, so I can't test that, but I do get the blank screen with the cursor.

I just tested the SNA patch Peter Wu shared and can confirm it works. :)

This patch is shipping in the Arch Linux xf86-video-intel package[0] and seems to work. I maintain an alternative package in the AUR, xf86-video-intel-git, which is very similar to the official one but uses the latest master instead of sticking with some commit. And now I see that the patch needs to be rebased[2]. Peter, could you please rebase it and submit it again? Maybe it will get more attention now.
[0] https://archlinux.org/packages/extra/x86_64/xf86-video-intel/
[1] https://aur.archlinux.org/packages/xf86-video-intel-git/
[2] https://aur.archlinux.org/packages/xf86-video-intel-git/#comment-709686

Ville has applied a different patch that repeats much of my earlier patch: https://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=581ddc5d2f55efa2cf5ec76a802fb781ee142b01
However, it appears that it misses one crucial detail which most likely will result in crashes (assertion failed). I'll try to test the latest git without patch and prepare an updated patch if needed.

@Peter Wu Well, I got the blank screen with the cursor, again. I made a regression bug report on Arch Linux here: https://bugs.archlinux.org/task/64238
There, loqs provided a patch that fixed the issue.

I can confirm 1:2.99.917+893+gbff5eca4-1 on Arch Linux resulted in a black screen with just a cursor being displayed. Not great if you only have a few minutes before doing a presentation for a full room of people... I rebased the patch and reached the exact same diff as in the patch in the linked issue tracker. I will do an ASAN test while the European Intel developers are enjoying their weekend, and submit a patch on Monday.

The crash can still be reproduced with Intel + modesetting. On Arch Linux with xf86-video-intel 1:2.99.917+893+gbff5eca4-1 and xorg-server 1.20.5-4 it resulted in an instant segfault on connecting an external monitor. That instant occurrence is likely due to the "autobind GPUs to the screen" patch. With pristine xorg-server 1.20.5 + a glvnd build patch, and xf86-video-intel 2.99.917-893-gbff5eca4 from git, the following ASAN trace is observable after:
    xrandr --setprovideroutputsource modesetting Intel
    xrandr --output HDMI-1-1 --mode 2560x1440 # should not crash
I'll submit the updated patch to the list.
==369074==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6120001ad918 at pc 0x7f33f0a08153 bp 0x7ffd89f50630 sp 0x7ffd89f50620
READ of size 8 at 0x6120001ad918 thread T0
    #0 0x7f33f0a08152 in to_sna_from_pixmap ../../../src/sna/sna.h:521
    #1 0x7f33f0a08152 in sna_pixmap_move_to_gpu ../../../src/sna/sna_accel.c:4222
    #2 0x7f33f0a57f3f in sna_accel_post_damage ../../../src/sna/sna_accel.c:17773
    #3 0x7f33f0a5c561 in sna_accel_block ../../../src/sna/sna_accel.c:18414
    #4 0x7f33f0acce2e in sna_block_handler ../../../src/sna/sna_driver.c:777
    #5 0x55bc9c56e97c in BlockHandler ../xorg-server-1.20.5/dix/dixutils.c:388
    #6 0x55bc9c80ecc0 in WaitForSomething ../xorg-server-1.20.5/os/WaitFor.c:201
    #7 0x55bc9c55edb7 in Dispatch ../xorg-server-1.20.5/dix/dispatch.c:421
    #8 0x55bc9c56cd9c in dix_main ../xorg-server-1.20.5/dix/main.c:276
    #9 0x7f33f4c21152 in __libc_start_main (/usr/lib/libc.so.6+0x27152)
    #10 0x55bc9c4b264d in _start (/tmp/nv/xprefix2/bin/Xorg.bin+0xdd64d)
0x6120001ad918 is located 56 bytes to the right of 288-byte region [0x6120001ad7c0,0x6120001ad8e0)
allocated by thread T0 here:
    #0 0x7f33f5432aca in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x55bc9c5bedbf in _dixAllocateScreenObjectWithPrivates ../xorg-server-1.20.5/dix/privates.c:709
    #2 0x55bc9c5df890 in CreateRootWindow ../xorg-server-1.20.5/dix/window.c:571
    #3 0x55bc9c56cb12 in dix_main ../xorg-server-1.20.5/dix/main.c:220
    #4 0x7f33f4c21152 in __libc_start_main (/usr/lib/libc.so.6+0x27152)

*** Bug 111976 has been marked as a duplicate of this bug. ***

Fixed the black screen issue (nouveau) and crash (modesetting) in xf86-video-intel 2.99.917-895-gcb6bff95.
Sometimes there is still a crash on Xorg exit when an external screen is attached, but at least it does not happen while in use (hopefully!). I can live with that issue, but for completeness the trace can be found below.
Thread 1 "Xorg.bin" received signal SIGSEGV, Segmentation fault.
0x00005612376305cc in PixmapStopDirtyTracking (src=0x0, slave_dst=0x6110000a9140) at ../xorg-server-1.20.5/dix/pixmap.c:251
251         ScreenPtr screen = src->pScreen;
#0  0x00005612376305cc in PixmapStopDirtyTracking (src=0x0, slave_dst=0x6110000a9140) at ../xorg-server-1.20.5/dix/pixmap.c:251
#1  0x0000561237692a2c in RRCrtcDetachScanoutPixmap (crtc=crtc@entry=0x617000004a00) at ../xorg-server-1.20.5/randr/rrcrtc.c:413
#2  0x0000561237692dcd in RRCrtcDestroyResource (value=0x617000004a00, pid=<optimized out>) at ../xorg-server-1.20.5/randr/rrcrtc.c:900
#3  0x0000561237644021 in doFreeResource (res=0x60300000b4a0, skip=skip@entry=0) at ../xorg-server-1.20.5/dix/resource.c:880
#4  0x0000561237647698 in FreeClientResources (client=0x60e000000040) at ../xorg-server-1.20.5/dix/resource.c:1146
#5  0x0000561237647698 in FreeClientResources (client=0x60e000000040) at ../xorg-server-1.20.5/dix/resource.c:1109
#6  0x00005612376478e5 in FreeAllResources () at ../xorg-server-1.20.5/dix/resource.c:1161
#7  0x00005612375e1e19 in dix_main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../xorg-server-1.20.5/dix/main.c:292
#8  0x00007f9b9ba12153 in __libc_start_main () at /usr/lib/libc.so.6
#9  0x000056123752764e in _start ()
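
For anyone reading the AddressSanitizer report above, here is a small cross-check of its numbers. This is only a sketch and not part of any patch: the helper name describe_overflow is made up for illustration, and the addresses are copied from the report. The reading that the 288-byte object is the root window rests on the allocation stack (CreateRootWindow) and on the earlier "assuming Pixmap, got Window" comment.

```python
# Values copied from the AddressSanitizer report in this thread.
FAULT_ADDR = 0x6120001AD918    # "READ of size 8 at 0x6120001ad918"
REGION_START = 0x6120001AD7C0  # start of the region allocated via CreateRootWindow
REGION_END = 0x6120001AD8E0    # end of the region (half-open interval, exclusive)
READ_SIZE = 8                  # size of the faulting read in bytes


def describe_overflow(addr, start, end, size):
    """Hypothetical helper: relate a faulting read to a heap region.

    Returns (region_size, offset_from_start, bytes_past_end, in_bounds).
    """
    region_size = end - start
    offset_from_start = addr - start
    bytes_past_end = addr - end
    # The read is valid only if all `size` bytes lie inside [start, end).
    in_bounds = start <= addr and addr + size <= end
    return region_size, offset_from_start, bytes_past_end, in_bounds


region_size, offset, past_end, ok = describe_overflow(
    FAULT_ADDR, REGION_START, REGION_END, READ_SIZE
)
print(f"region size: {region_size} bytes")
print(f"read offset from region start: {offset} bytes")
print(f"distance past region end: {past_end} bytes")
print(f"read within bounds: {ok}")
```

The result matches the report's "56 bytes to the right of 288-byte region": the driver reads 8 bytes at offset 344 of an object that is only 288 bytes long, which is what one would expect if a field offset meant for a Pixmap is applied to the smaller Window allocation.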