Summary: | EXA corruption is back with xorg-server 1.7.0 on RS780
---|---
Product: | DRI
Reporter: | Mikko C. <mikko.cal>
Component: | DRM/Radeon
Assignee: | Xorg Project Team <xorg-team>
Status: | RESOLVED FIXED
QA Contact: | Xorg Project Team <xorg-team>
Severity: | normal
Priority: | medium
CC: | desintegr, jlp.bugs, oldium.pro, orzel, phercek, smoothhound, Tanktalus, tomas.linhart, victor.noel, xorg-driver-ati, zecmerquise
Version: | DRI git
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Created attachment 30032 [details]
corruption in chrome
If it matters, I'm not using Composite.

Does Option "EXAOptimizeMigration" "off" work around the problem?

Created attachment 30035 [details]
corruption in chrome
Not really. It might be slightly better, but I'm not so sure. See this screenshot and the following. The corruption looks exactly the same as without "EXAOptimizeMigration" "off".
Created attachment 30036 [details]
corruption in Yakuake
this also looks the same as it did before.
Please attach the log file from testing the option.

Created attachment 30063 [details]
Xorg.0.log
Does Option "EXANoDownloadFromScreen" work around the problem? If not, I'm not sure what the problem could be... Please also attach a log file from a working run with xserver 1.6.4, and try isolating the problem if you can - ideally with git bisect, but even just trying the xserver 1.7 RCs could be a start.

Created attachment 30066 [details]
corruption in Konversation

(In reply to comment #8)
> Does Option "EXANoDownloadFromScreen" work around the problem?

Not really, it seems to happen less often, but it's definitely still there. See screenshots.

Created attachment 30067 [details]
corruption in chrome
Created attachment 30068 [details]
Xorg.0.log with EXANoDownloadFromScreen
New Xorg.0.log
I'll get a 1.6.4 Xorg.0.log when I have time to do the downgrade. I'm not sure I'll have the time to do the bisect in the next few weeks though.
Created attachment 30072 [details]
Xorg.0.log with 1.6.4
(In reply to comment #9)
> > Does Option "EXANoDownloadFromScreen" work around the problem?
>
> Not really, it seems to happen less often, but it's definitely still there.

What if you add in Option "EXANoUploadToScreen" as well?

Option "EXANoUploadToScreen" does not help unfortunately. But I've noticed another thing: with 1.7.0 my dmesg contains lots of these errors:
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 21
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 2
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 25
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 4
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 12
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 29
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 14
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 11
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 28
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 0
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 12
[drm:radeon_cp_indirect] *ERROR* sending pending buffer 25
So it might be related.

More info. A user on IRC reported that he was having the same "drm:radeon_cp_indirect" errors with radeonHD. He also reported that, at the time, reverting
510cbd43cd4e34bd459e8f74ab2855714b4ca95d EXA: Defragment offscreen memory.
fixed the issue for him.

Michel, if you say this is a bug in radeon, can you please modify the bug report and cc accordingly? Thanks

this happens also with radeonhd. I get the same corruption and message
[drm:radeon_cp_indirect] *ERROR* sending pending buffer
Kernel 2.6.32-rc5 + last patches from drm-next, mesa master, ddx master.

Latest radeon driver + Xorg 1.7.1 = this problem for me (Sapphire Radeon 3870HD, dual monitor, 2.6.30-gentoo-r5).

Hello, is this linked to bug #24771 ?

(In reply to comment #19)
> Hello, is this linked to bug #24771 ?

Possibly, but until it's confirmed it's better to track them separately.
Yes, it looks like exactly the same corruption/bug.
Enabling KMS makes the error go away:
[drm:radeon_cp_indirect] *ERROR* sending pending buffer
but there are tons of other corruption issues with KMS enabled, so it's not really an option for me.

This looks to be related to xserver exa changes in xserver 1.7.x. We are probably ending up with a missing *Done() call in the exa code which results in the command buffer not getting sent when it's supposed to. I would suggest bisecting the xserver.

If someone wants to start bisecting the xserver I suggest reading my comment #15 in this report and start from commit 510cbd43cd4e34bd459e8f74ab2855714b4ca95d, EXA: Defragment offscreen memory.

(In reply to comment #23)
> If someone wants to start bisecting the xserver I suggest reading my comment
> #15 in this report and start from commit
> 510cbd43cd4e34bd459e8f74ab2855714b4ca95d, EXA: Defragment offscreen memory.

I've tried to compile xorg-server from 1.7 branch but the first compilable version (with libs that I have) was 20daa145c437c3ba67970146f6182849f87a1b43, and this one didn't want to start up. I need to recompile more things probably, but this is no-go for me as I need to use the computer...

Created attachment 31477 [details] [review]
Patch that reverts EXA defragmentation

As compiling the whole xorg-server is a no-go, I've tried to revert the first patch that Mikko suggested. There were some conflicts, but I've tried to be reasonable while resolving them. I don't understand the code, so it might be wrong :-) At least it works for me currently, I will write if I encounter corruption again. So please test it yourself too :-)

I can confirm the attached patch fixes both the corruption and the [drm:radeon_cp_indirect] *ERROR* sending pending buffer. It applies cleanly to xorg-server 1.7.1

Thanks, that patch seems to have fixed it here, as well.
I've not noticed any corruption so far today with your patch applied, and normally oowriter would trigger it so bad as to make it nearly impossible to use.

Hmm, interesting. I guess this leaves a few basic possibilities as to what the problem could be:

1. The R6/7xx EXA code might not like something about the way ExaOffscreenDefragment() calls its Copy hooks. Can't see anything offhand that could be problematic though.
2. The ExaOffscreenDefragment() call in exaOffscreenAlloc() might happen at times the driver can't handle it. It should be easy to test this by disabling just that call.
3. Some kind of ordering or other issue between the ExaOffscreenDefragment() call in ExaWakeupHandler() and the R6/7xx EXA code. Again should be easily testable by disabling just that call.

Would be great if you guys could try ruling out 2. or 3. Of course I might be missing something, I'm open for other ideas.

Just a note - my patch actually reverts also (partially) the next one - "EXA: Allocate from the end of free offscreen memory rather than from the start." The real_size is now most probably wrongly calculated.

Ok, I take that back, somewhat. I'm still getting corruptions, but not from scrolling anymore. The patch has improved X to the point of being usable, but not 100%. And the ERRORs in dmesg have disappeared, even when I do get corruption. And the corruption I do get is harder to reproduce.

Created attachment 31504 [details] [review]
patch to fix corruption

(In reply to comment #28)
> 2. The ExaOffscreenDefragment() call in exaOffscreenAlloc() might happen at
> times the driver can't handle it. It should be easy to test this by disabling
> just that call.

I did as you suggested and the corruption is gone with this patch, it applies to xorg 1.7.1. I also tried your 3rd suggestion but that did not fix it.

(In reply to comment #31)
> I did as you suggested and the corruption is gone with this patch, it applies
> to xorg 1.7.1.

This makes 2. very likely to be the problem.
My only concern is that your patch still updates pExaScr->lastDefragment, can you try disabling that as well and confirm that it still fixes the problem?

> I also tried your 3rd suggestion but that did not fix it.

That's good to know.

(In reply to comment #32)
> This makes 2. very likely to be the problem. My only concern is that your patch
> still updates pExaScr->lastDefragment, can you try disabling that as well and
> confirm that it still fixes the problem?

Yep, I tried removing that line too and it still works. No errors in dmesg or corruption.

Does the defragment code deal with the fact that r6xx+ usually requires a temp surface for overlapping copies?

(In reply to comment #34)
> Does the defragment code deal with the fact that r6xx+ usually requires a temp
> surface for overlapping copies?

As r600_exa.c doesn't set the EXA_SUPPORTS_OFFSCREEN_OVERLAPS flag, there should be no overlapping copies from the defragmentation.

I suspect the problem is that exaOffscreenAlloc() may trigger defragmentation in the middle of whatever. I'll probably just remove that and only keep the defragmentation at regular intervals.

Created attachment 31531 [details] [review]
Proposed fix

This is the fix I'm planning to submit for xserver Git. Please test to make sure it doesn't cause any regressions.

(In reply to comment #36)
> This is the fix I'm planning to submit for xserver Git. Please test to make
> sure it doesn't cause any regressions.

So far I can't notice any regressions, using 1.7.1.
Is the fix going to be backported to 1.7 branch?

Created attachment 31597 [details]
Corruption in konqueror with Oldrich's patch
With Oldrich's patch, I'm still getting corruption. I've compiled with Michel's patch instead, and as soon as I get the chance to restart X, I will test it out.
I'm attaching the corruption from konqueror. All I have to do to get rid of it is select another tab and then come back, which makes it seem like it's an upload/download type of issue.
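
As a rough illustration of the suspicion voiced earlier in this thread (that a defragmentation triggered from inside exaOffscreenAlloc() can land in the middle of another operation, while defragmentation at regular intervals does not), here is a toy C model. It is not xserver code: every name in it (`toy_screen`, `toy_alloc`, `toy_wakeup`, `run_scenario`, and so on) is hypothetical, and the assumption that an allocation arrives between a driver's prepare and done steps is made up purely for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names come from the xserver source. */

typedef enum {
    DEFRAG_FROM_ALLOC,    /* the allocator may defragment immediately */
    DEFRAG_PERIODIC_ONLY  /* only the periodic wakeup path defragments */
} defrag_policy;

typedef struct {
    bool op_in_flight;    /* true between the toy prepare and done steps */
    bool fragmented;      /* the heap would benefit from defragmentation */
    int unsafe_defrags;   /* defragmentations issued while an op was in flight */
    int safe_defrags;     /* defragmentations issued while the driver was idle */
    defrag_policy policy;
} toy_screen;

/* Defragmentation issues its own copies, so record whether it ran at a
 * moment when another operation was still open. */
void toy_defragment(toy_screen *s)
{
    if (!s->fragmented)
        return;
    if (s->op_in_flight)
        s->unsafe_defrags++;
    else
        s->safe_defrags++;
    s->fragmented = false;
}

void toy_prepare(toy_screen *s) { s->op_in_flight = true; }
void toy_done(toy_screen *s)    { s->op_in_flight = false; }

/* Allocation path: defragments only under the DEFRAG_FROM_ALLOC policy. */
void toy_alloc(toy_screen *s)
{
    if (s->policy == DEFRAG_FROM_ALLOC)
        toy_defragment(s);
}

/* Periodic path: in this model it only runs once no operation is open. */
void toy_wakeup(toy_screen *s)
{
    assert(!s->op_in_flight);
    toy_defragment(s);
}

/* Assumed scenario: an allocation is requested in the middle of an
 * operation, and the periodic handler runs afterwards. */
toy_screen run_scenario(defrag_policy policy)
{
    toy_screen s = { .policy = policy, .fragmented = true };
    toy_prepare(&s);
    toy_alloc(&s);
    toy_done(&s);
    toy_wakeup(&s);
    return s;
}
```

Called from a `main`, `run_scenario(DEFRAG_FROM_ALLOC)` records one defragmentation issued mid-operation, while `run_scenario(DEFRAG_PERIODIC_ONLY)` records none and a single idle-time defragmentation instead. The model says nothing about why the real driver misbehaves; it only shows why moving the call out of the allocation path removes that class of timing.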
Fix landed on master and nominated for server-1.7-branch.

Hmm, this bug is marked as resolved/fixed, but I still can reproduce the issue readily with xorg-server-1.7.7 and xorg-server-1.8.2 on FreeBSD (so no KMS).

Committed fix is only a half of what was proposed by Oldrich in comment #25. I applied the second part too and now the situation seems to be very much improved.

(In reply to comment #40)
> Committed fix is only a half of what was proposed by Oldrich in comment #25.
> I applied the second part too and now the situation seems to be very much
> improved.

Oldrich's patch completely removed EXA offscreen memory defragmentation again, which is undesirable as performance will degrade over time due to the fragmentation. Your problem could be due to a bug in the driver or maybe an ordering issue between the EXA Block/WakeupHandlers and those of the driver (if any).

Well, not sure what's going on, but the problem is still reproducible for me even after applying Oldrich's full patch.
I'll follow up on my issue in bug #24771.

(In reply to comment #42)
> Well, not sure what's going on, but the problem is still reproducible for me
> even after applying Oldrich's full patch.
> I'll follow up on my issue in bug #24771.

Just a FYI - I see the corruption, but it seems to be of a somewhat different kind and caused by Bug 27627.
Created attachment 30031 [details]
corruption in Konversation

After it was fixed in some 1.6.X release, it's back in 1.7.0.
Using radeon rs780 with 2.6.32-rc1. All is fine when going back to 1.6.4.
Two screenshots will follow.