| Field | Value |
|---|---|
| Summary | [Regression] Graphical corruption after resuming from suspend (w/ dual monitor configuration) |
| Product | DRI |
| Component | DRM/Radeon |
| Status | RESOLVED FIXED |
| Severity | normal |
| Priority | medium |
| Version | unspecified |
| Hardware | x86-64 (AMD64) |
| OS | Linux (All) |
| Reporter | Furkan <falaca> |
| Assignee | Default DRI bug account <dri-devel> |
| CC | maraeo |

Description
Furkan
2015-04-11 02:28:15 UTC
Created attachment 115014 [details]
dmesg
Since this is a regression can you narrow down the component (kernel, mesa, ddx) and bisect?

Based on my debugging so far, it seems that mesa is the likely culprit.

Step #1: Fresh install of Ubuntu 14.04 (installed on a new, clean partition):
Linux 3.13 kernel -> Manually upgraded to Linux 4.0 mainline
Xorg 1.15.1
Mesa 10.1.3
xf86-video-ati 7.3
libdrm 2.4.56
Status: The bug is not present, and confirms that the issue is not with the kernel.
----------
Step #2: Installed Ubuntu 14.04.2 hardware enablement stack:
Linux 4.0 (mainline)
Xorg 1.16.0
Mesa 10.3.2
xf86-video-ati 7.4
libdrm 2.4.56
Status: The bug is present.
----------
Step #3: I reverted to the original 14.04 packages, but compiled xf86-video-ati 7.4 from git.
Status: The bug is not present, and confirms (?) that the issue is not with ddx.
----------
Step #4: I had trouble getting Ubuntu to work with Mesa compiled from git (whenever I try to log in, I just get kicked back to the lightdm greeter), and I couldn't upgrade Mesa from the Ubuntu repo without also upgrading Xorg, so I upgraded Mesa from Oibaf PPA:
Linux 4.0 (mainline)
Xorg 1.15.1
Mesa 10.6 (oibaf-ppa)
xf86-video-ati 7.4 (git) (also tested 7.5.99 from oibaf-ppa)
libdrm 2.4.60 (oibaf-ppa)
Status: The bug is present.

So it seems likely that the bug was introduced somewhere between Mesa 10.1.3 and 10.3.2. If I can figure out how to get Ubuntu to play nice with mainline Mesa compiled from git (maybe if I figure out how to apply the Ubuntu patches), I can do a bisect, but that's where I'm stuck as of now.

I have found the bad commit. I will attach my bisect log.

I bisected between 10.1-branchpoint and 10.2-branchpoint, and here's the final result:

4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
Author: Marek Olšák <marek.olsak@amd.com>
Date: Mon Feb 10 01:25:54 2014 +0100

    r600g,radeonsi: set correct initial domain for shared resources

:040000 040000 eafa3cdc6eea908c6ba8861f3d063f6a3161217b 7938f0ed0cdf8c677af35f1b2e67739dc210bda8 M src

Created attachment 115071 [details]
Mesa bisect log
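
The component elimination from steps #1 to #4 earlier in this thread can be sketched as a small script. This is only an illustration and was not run as part of the report: the `CONFIGS` layout and the `suspects` helper are hypothetical, and step #3 is assumed to have kept the 4.0 kernel and libdrm 2.4.56, which the report does not state explicitly. A component stays a suspect only if none of the versions it had in a bad configuration also appeared in a good one.

```python
# Hypothetical sketch of the elimination in steps #1 to #4.
# Versions come from the report; the kernel and libdrm versions for
# step #3 are assumptions (the report only says "original 14.04 packages").
CONFIGS = [
    # (bug present?, {component: version})
    (False, {"kernel": "4.0", "xorg": "1.15.1", "mesa": "10.1.3",
             "ddx": "7.3", "libdrm": "2.4.56"}),   # step 1
    (True,  {"kernel": "4.0", "xorg": "1.16.0", "mesa": "10.3.2",
             "ddx": "7.4", "libdrm": "2.4.56"}),   # step 2
    (False, {"kernel": "4.0", "xorg": "1.15.1", "mesa": "10.1.3",
             "ddx": "7.4", "libdrm": "2.4.56"}),   # step 3 (partly assumed)
    (True,  {"kernel": "4.0", "xorg": "1.15.1", "mesa": "10.6",
             "ddx": "7.4", "libdrm": "2.4.60"}),   # step 4
]


def suspects(configs):
    """Return components whose bad-config versions never appear in a good config."""
    # Collect every version of each component that was seen working.
    good = {}
    for bug_present, versions in configs:
        if not bug_present:
            for component, version in versions.items():
                good.setdefault(component, set()).add(version)

    result = []
    for component in configs[0][1]:
        bad_versions = {v[component] for bug_present, v in configs if bug_present}
        # If any bad version also worked elsewhere, that component alone
        # cannot explain the regression, so it is cleared.
        if bad_versions and bad_versions.isdisjoint(good.get(component, set())):
            result.append(component)
    return result


if __name__ == "__main__":
    print(suspects(CONFIGS))
```

With these inputs only Mesa remains, which matches the conclusion that the regression was introduced between Mesa 10.1.3 and 10.3.2.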
(In reply to falaca from comment #4)
> 4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
> commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
> Author: Marek Olšák <marek.olsak@amd.com>
> Date: Mon Feb 10 01:25:54 2014 +0100
>
> r600g,radeonsi: set correct initial domain for shared resources

Weird. Marek, any ideas?

Created attachment 115102 [details]
Different "checkerboard" corruption
This might be totally unrelated, but since I don't know enough to make that judgement, I thought I should share just in case (otherwise I could make a separate bug report for it).
Basically, I get intermittent checkerboard patterns which appear on my screen, as seen in the 2 screenshots I'm attaching (some areas blacked out by me for privacy). I don't know how to reproduce the patterns - they appear intermittent and seem unrelated to suspend/resume. They either look like "noise", like in the first screenshot, or they are remnants from a previous window that was open, like in the second screenshot.
Both of those screenshots are from the portrait display (unlike the behaviour from the video I posted in the original bug report, which only happens on the landscape display). I can't remember if I've seen this happen on the landscape display so far. I can keep collecting screenshots to see if it's confined to specific areas of the screen.
For what it's worth, I have used Catalyst 14.12 for a couple of months with this card, and didn't observe this type of behaviour.
(In reply to falaca from comment #0)
> I can only reproduce this when I have 2 displays connected. My primary
> screen is set to 2560x1440, and the secondary screen in portrait mode is set
> to 1200x1920 on the left-hand side. I have the landscape monitor centered
> with respect to the portrait one, so y = 240 in ~/.config/monitors.xml.
>
> I cannot observe the bug when both screens are aligned at the top, i.e.,
> with y=0 in ~/.config/monitors.xml.

Have you tried moving the landscape monitor to y = 0 and back to y = 240 after suspend/resume, while the session is up? Does that fix the problem, or does it stay corrupted?

(In reply to Michel Dänzer from comment #8)
> (In reply to falaca from comment #0)
> > I can only reproduce this when I have 2 displays connected. My primary
> > screen is set to 2560x1440, and the secondary screen in portrait mode is set
> > to 1200x1920 on the left-hand side. I have the landscape monitor centered
> > with respect to the portrait one, so y = 240 in ~/.config/monitors.xml.
> >
> > I cannot observe the bug when both screens are aligned at the top, i.e.,
> > with y=0 in ~/.config/monitors.xml.
>
> Have you tried moving the landscape monitor to y = 0 and back to y = 240
> after suspend/resume, while the session is up? Does that fix the problem, or
> does it stay corrupted?

I just tried right now, and it doesn't make a difference. But you know what, it turns out that it still *does* happen when y=0, it's just that it's a little less noticeable to me, e.g., I'm having trouble seeing it when maximizing a window, but I'm still seeing it happen in the menus. This is purely based on my eyesight, so it's hardly scientific, but I could make more videos if desired.

I tried to move my landscape screen further up (above the portrait one, but still overlapping, so I presume that would be y=0 for the landscape screen, but y= positive for the portrait screen). That resulted in X becoming unusable. My landscape screen turned white, and restarting X didn't make things much better - I just got an unusable tiled pattern: https://www.dropbox.com/s/sfrxv4owqchyq75/tiledpattern.jpg?dl=0

I rebooted and tried with linux 3.16, and also with Arch Linux + Gnome 3 + linux 3.19 (or maybe it was 4.0). Same result (white screen). So unfortunately I wasn't able to test out what would happen in that scenario. Is there anybody else who can test this configuration (dual monitors with a portrait display)? It seems like it doesn't take much effort to break something.

I wanted to add that I built Mesa 10.1 from git and installed it on Ubuntu 15.04. Along with Xorg 1.17.1 and the latest DDX compiled from git, I can't observe the bug.

Is there anything else that I can do to help this along? I tried cloning the master branch and just reverting Marek's commit (the one that I narrowed the bug down to with my git bisect), but of course that didn't work since there is other newer code which now depends on that. I also tried disabling hyperz (since I believe 10.2 turned hyperz on by default), and that had no effect.

(In reply to Michel Dänzer from comment #6)
> (In reply to falaca from comment #4)
> > 4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
> > commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
> > Author: Marek Olšák <marek.olsak@amd.com>
> > Date: Mon Feb 10 01:25:54 2014 +0100
> >
> > r600g,radeonsi: set correct initial domain for shared resources
>
> Weird. Marek, any ideas?

Sorry, no. The commit just obtains the initial domain from the kernel, so that it can use it for command submission. The idea of the commit is that the driver shouldn't move imported buffers to a domain that is different from the domain where the buffer was originally created.

Is there any sort of debugging trace that I can collect, to objectively compare the difference in behaviour before and after a suspend?

Good news! I saw this today: http://lists.x.org/archives/xorg-driver-ati/2015-April/027345.html

So I built and installed Michel's xf86-video-ati repo and enabled TearFree in xorg.conf. The corruption is now gone - so I suppose it was simply some manifestation of tearing, but only after a suspend/resume cycle.

To test it, I installed the module, enabled TearFree, then did a suspend/resume cycle, and I couldn't observe any tearing in the global menus. So then I commented out the TearFree option in xorg.conf and restarted lightdm, and immediately started seeing the really obvious tearing in the menus like in the video I posted. As a final check, I uncommented the TearFree option and restarted lightdm again, and the tearing was gone.

Thanks Michel! And I hope the TearFree feature will eventually be extended to support rotated displays as well!
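
For anyone reproducing the workaround, the TearFree toggle described above goes in the Device section of xorg.conf. A minimal sketch follows, assuming the option name and syntax given in the radeon driver's man page; the Identifier string is hypothetical, and any existing sections in xorg.conf would stay as they are.

```
Section "Device"
    # Hypothetical identifier; use whatever the existing config names the card.
    Identifier "Radeon"
    Driver     "radeon"
    # Comment this line out to get the tearing back for comparison.
    Option     "TearFree" "on"
EndSection
```

The display manager (lightdm in this report) needs a restart after each edit for the change to take effect.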