Description
Bryce Harrington
2008-11-05 14:19:50 UTC
Does Option "AccelDFS" "off" or Option "SWcursor" work around the problem? Can you attach a picture of the corruption?

Hello,

This one is harder to reproduce consistently. I will try to provoke it by stressing the driver (open many windows, use up all GPU texture memory, fancy Compiz stuff, etc.). I'll attach a screenshot as soon as I've managed to reproduce it. If I can find some recipe which triggers it, I can test the options you are suggesting and see if any of them help.

Regards,
Øyvind

I think there is a display watermark issue that causes problems with the cursor from time to time. Unfortunately, I've had no luck so far finding any info on how to properly program the display watermarks.

(In reply to comment #3)
> I think there is a display watermark issue that causes problems with the cursor
> from time to time. Unfortunately, I've had no luck so far finding any info on
> how to properly program the display watermarks.
>

Well, the first attempt at reproducing this failed. I tried hard: running lots of glxgears (causing massive flickering), opening many windows (slowing down Compiz), running remote applications with X-forwarding which have different cursor themes, and lastly playing a few videos using Textured Video through XVideo. My poor laptop isn't made for that kind of load, but it did not cause mouse cursor corruption. So I think I am on the wrong track as to how to reproduce this, sorry :(. But like you say, it's very "from time to time", but I fail to recognize any pattern.

(In reply to comment #4)
> But like you say, it's very "from time to time", but
> I fail to recognize any pattern.
>

Right. That's the other problem; I'm not able to reproduce it regularly.

OK, I managed to reproduce it, and I'm attaching a screenshot (kind of). It's not a screen dump, but a picture taken with my cell phone. Here's why: the corruption itself does *not* appear in screen dumps. So whatever framebuffer is captured does not include the corrupted pixels (I used PrintScreen to make the dump). But they *are* there, and the corruption is persistent once it appears. It probably has something to do with the fact that the cursor itself is hardware-accelerated?

I thought about how I could reproduce it, and I went into the cursor theme control panel in Gnome. I started switching themes back and forth, and quite consistently I can make the corruption show itself after a while. It might take a few tries. Please see the attached picture taken with my cell phone. Hope this is of some help, at least. Perhaps you will be able to reproduce it yourself?

Regards,
Øyvind

Created attachment 20119 [details]
mousecursor-corruption-radeon-exa.jpg
Do you still get the corruption with the DRI off (Option "DRI" "False")?

(In reply to comment #8)
> Do you still get the corruption with the DRI off (Option "DRI" "False")?
>

Hmm, no, it doesn't look like I do.

However, a few (perhaps significant) variables change when I do this:
* Metacity instead of Compiz (fallback), no compositing
* EXA + no-DRI seems to be a very no-go combination performance-wise :). I'm not able to switch very fast between cursor themes because X is sluggish. But this is perhaps not significant for reproducing. It might just be enough that the theme is actually altered, no matter how fast.

Created attachment 20120 [details] [review]
disable upload to screen

(In reply to comment #9)
> Hmm, no, it doesn't look like I do.
>
> However, a few (perhaps significant) variables change when I do this:
> * Metacity instead of Compiz (fallback), no compositing
> * EXA + no-DRI seems to be a very no-go combination performance-wise :). I'm
> not able to switch very fast between cursor themes because X is sluggish. But
> this is perhaps not significant for reproducing. It might just be enough that
> the theme is actually altered, no matter how fast.
>

It's not supposed to be performant :)

I think there may be an issue with the Upload to screen hook. See if you still get corruption with the DRI enabled with metacity or enable DRI and try the attached patch.

(In reply to comment #10)
<snip>
> I think there may be an issue with the Upload to screen hook. See if you still
> get corruption with the DRI enabled with metacity or enable DRI and try the
> attached patch.
>

Another data point:
Still happens with DRI+Metacity, so Compiz or not seems to not matter (for once ;).

Going to try patch next ..

(In reply to comment #11)
> (In reply to comment #10)
> <snip>
> > I think there may be an issue with the Upload to screen hook. See if you still
> > get corruption with the DRI enabled with metacity or enable DRI and try the
> > attached patch.
> >
>
> Another data point:
> Still happens with DRI+Metacity, so Compiz or not seems to not matter (for once
> ;).
>
> Going to try patch next ..
>

OK, tried the patch. I updated my git tree first (using git pull; I have never really used git before, but assumed this was correct).

[Last commit to the driver I tested (git log):
commit 902eaf768142c6c7dcc487e10775027b84cd1f9a
Author: Alex Deucher <alexdeucher@gmail.com>
Date: Thu Nov 6 15:46:43 2008 -0500
Check for LVDS on all IGP chips - fixes bug 18395
]

Applied patch, compiled and installed to prefix /usr.

The problem certainly still happens; if anything, it is worse now. The cursor was immediately corrupted upon desktop login, but that might just be because I didn't use the default theme (because I had changed it before I logged out and restarted X). It seems small and transparent cursor themes trigger this most easily through the cursor theme control panel. And I can definitely say that the patch did not resolve the issue, that's something at least ..

So, does disabling AccelDFS or RenderAccel work around this or not?

P.S. No need to patch the driver to disable UploadToScreen, there's Option "EXANoUploadToScreen" for that. :)

(In reply to comment #13)
> So, does disabling AccelDFS or RenderAccel work around this or not?
>
> P.S. No need to patch the driver to disable UploadToScreen, there's Option
> "EXANoUploadToScreen" for that. :)
>

I'll try these options with latest radeon driver in Ubuntu (6.9.0+git20081003.f9826a56-0ubuntu2.1) and report back.

(In reply to comment #14)
> (In reply to comment #13)
> > So, does disabling AccelDFS or RenderAccel work around this or not?
> >
> > P.S. No need to patch the driver to disable UploadToScreen, there's Option
> > "EXANoUploadToScreen" for that. :)
> >
>
> I'll try these options with latest radeon driver in Ubuntu
> (6.9.0+git20081003.f9826a56-0ubuntu2.1) and report back.
>

Hello again,

I am not able to reproduce the mouse-cursor corruption problem when AccelDFS is set to False.
Regards,
Øyvind

I get this too on a X1400 (probably on the same laptop, a T60). Interestingly, I do not think I had this before updating to 1.5.3 (from 1.4.something).

It may be related to some bad font corruption I see sometimes (especially oo.org impress presentation mode showed this well....) Also several small sized icons in the "k-menu" show it. Theory: this happens only with small pixmaps (16x16 or so).

radeonhd does not suffer from this bug, I didn't check whether it accels dfs though...

d.

(In reply to comment #16)
> I get this too on a X1400 (probably on the same laptop, a T60). Interestingly,

Same gfx card, not the same laptop model, I've got a Z61m.

> I do not think I had this before updating to 1.5.3 (from 1.4.something).
>
> It may be related to some bad font corruption I see sometimes (especially
> oo.org impress presentation mode showed this well....) Also several small sized
> icons in the "k-menu" show it. Theory: this happens only with small pixmaps
> (16x16 or so).
>
> radeonhd does not suffer from this bug, I didn't check whether it accels dfs
> though...

I've also tested recent radeonhd-snapshots, and it does not have any corruption bugs that I have observed (it's also slower).

(In reply to comment #17)
> (In reply to comment #16)
> > radeonhd does not suffer from this bug, I didn't check whether it accels dfs
> > though...
> I've also tested recent radeonhd-snapshots, and it does not have any corruption
> bugs that I have observed (it's also slower).
>

You have to enable the DRI on radeonhd to use EXA composite and UTS/DFS:
Option "DRI" "TRUE"
I'd expect you'd see the same issue as they mostly share the same accel code.

Created attachment 21381 [details] [review]
limit accel DFS to 32x32 or larger

Does this patch help (make sure accelDFS is enabled)?

(In reply to comment #19)
> Does this patch help (make sure accelDFS is enabled)?

The size checks should probably be done earlier, to save superfluous function calls if they fail.
In particular, I suspect that calling RADEONCPGetBuffer() but then still failing could cause the stability issues some people on IRC encountered with this patch.

Created attachment 21387 [details] [review]
updated patch

(In reply to comment #20)
> The size checks should probably be done earlier, to save superfluous function
> calls if they fail. In particular, I suspect that calling RADEONCPGetBuffer()
> but then still failing could cause the stability issues some people on IRC
> encountered with this patch.
>

yeah, I was actually just thinking the same thing.

(In reply to comment #21)
> Created an attachment (id=21387) [details]
> updated patch
>
> (In reply to comment #20)
> > The size checks should probably be done earlier, to save superfluous function
> > calls if they fail. In particular, I suspect that calling RADEONCPGetBuffer()
> > but then still failing could cause the stability issues some people on IRC
> > encountered with this patch.
> >
>
> yeah, I was actually just thinking the same thing.
>

Hi, tried the latest patch (id=21387), and here's what I've observed so far:

+ I cannot reproduce mouse cursor corruption problem (cursor changes also take longer before they are visible, like a 2 second lag)
+ Solves this bug as well: https://bugs.freedesktop.org/show_bug.cgi?id=18398

- Compiz is noticeably slower, and animations tend to be more jerky (for instance when switching desktop)

All testing so far done with only "AccelMethod" "EXA" in xorg.conf.

Øyvind

(In reply to comment #18)
> (In reply to comment #17)
> > (In reply to comment #16)
> > > radeonhd does not suffer from this bug, I didn't check whether it accels dfs
> > > though...
> > I've also tested recent radeonhd-snapshots, and it does not have any corruption
> > bugs that I have observed (it's also slower).
> >
>
> You have to enable the DRI on radeonhd to use EXA composite and UTS/DFS:
> Option "DRI" "TRUE"
> I'd expect you'd see the same issue as they mostly share the same accel code.
>

Yes, I always enable DRI explicitly when testing radeonhd, but still, it's slower, at least with Compiz.

(In reply to comment #22)
<snip>
> + Solves this bug as well: https://bugs.freedesktop.org/show_bug.cgi?id=18398

The bug referenced above is marked as duplicate of this:
https://bugs.freedesktop.org/show_bug.cgi?id=18399

The bitmap font corruption is no longer reproducible, even without this DFS patch. I do not know what has fixed that one (I suddenly realised it had stopped happening one day). I also run latest DRM snapshot (drm.ko, radeon.ko).

However, the problem with GTK widget corruption (bug 18398) does need this patch to vanish.

Øyvind

(In reply to comment #18)
> (In reply to comment #17)
> > (In reply to comment #16)
> > > radeonhd does not suffer from this bug, I didn't check whether it accels dfs
> > > though...
> > I've also tested recent radeonhd-snapshots, and it does not have any corruption
> > bugs that I have observed (it's also slower).
> >
>
> You have to enable the DRI on radeonhd to use EXA composite and UTS/DFS:
> Option "DRI" "TRUE"
> I'd expect you'd see the same issue as they mostly share the same accel code.
>

I can re-confirm that latest radeonhd-snapshot with full DRI+EXA does not have corruption problems with:
- Mouse cursor
- Certain GTK widgets (or small icons in KDE)
- Bitmap/unaccelerated fonts

(But as mentioned, it's generally slower than radeon when it comes to acceleration and Compiz. It's also apparently more unstable, as it just froze on me when I tested it with many windows).
(In reply to comment #22)
> Hi, tried the latest patch (id=21387), and here's what I've observed so far:
>
> + I cannot reproduce mouse cursor corruption problem (cursor changes also take
> longer before they are visible, like a 2 second lag)
> + Solves this bug as well: https://bugs.freedesktop.org/show_bug.cgi?id=18398
>
> - Compiz is noticeably slower, and animations tend to be more jerky (for
> instance when switching desktop)
>
> All testing so far done with only "AccelMethod" "EXA" in xorg.conf.

Can you try reducing the w/h cut-offs (currently 32) in the patch to see at what limit you get corruption?

(In reply to comment #26)
> (In reply to comment #22)
> > Hi, tried the latest patch (id=21387), and here's what I've observed so far:
> >
> > + I cannot reproduce mouse cursor corruption problem (cursor changes also take
> > longer before they are visible, like a 2 second lag)
> > + Solves this bug as well: https://bugs.freedesktop.org/show_bug.cgi?id=18398
> >
> > - Compiz is noticeably slower, and animations tend to be more jerky (for
> > instance when switching desktop)
> >
> > All testing so far done with only "AccelMethod" "EXA" in xorg.conf.
>
> Can you try reducing the w/h cut-offs (currently 32) in the patch to see at
> what limit you get corruption?
>

I'm on it ..

(In reply to comment #27)
> (In reply to comment #26)
> > (In reply to comment #22)
> > > Hi, tried the latest patch (id=21387), and here's what I've observed so far:
> > >
> > > + I cannot reproduce mouse cursor corruption problem (cursor changes also take
> > > longer before they are visible, like a 2 second lag)
> > > + Solves this bug as well: https://bugs.freedesktop.org/show_bug.cgi?id=18398
> > >
> > > - Compiz is noticeably slower, and animations tend to be more jerky (for
> > > instance when switching desktop)
> > >
> > > All testing so far done with only "AccelMethod" "EXA" in xorg.conf.
> >
> > Can you try reducing the w/h cut-offs (currently 32) in the patch to see at
> > what limit you get corruption?
> >
>
> I'm on it ..
>

Some data points:

* w/h > 16 => GTK checkbox corruption occurs in Firefox (very easy to reproduce, so I use it as test case)
* w/h > 24 => No GTK checkbox corruption in Firefox (so no corruption).

Another observation:
Emacs 22 scrolling gets very slow (to the point of total "aquarium-effect" when scrolling up or down) when the patch is enabled, even for a w/h cutoff as small as 16. Scrolling is back to normal/fast without the patch. Just noticed it when editing src/radeon_exa_funcs.c .. So I guess disabling DFS for smaller pixmaps certainly has drawbacks.

Øyvind

Created attachment 21411 [details] [review]
flush 3d and 3d caches when using blitchunk

does this patch help?

(In reply to comment #29)
> Created an attachment (id=21411) [details]
> flush 3d and 3d caches when using blitchunk
>
> does this patch help?
>

Unfortunately no, it does not help for the GTK checkbox corruption or mouse cursor corruption problem.

Regards,
Øyvind

*** Bug 19745 has been marked as a duplicate of this bug. ***

Can somebody update the subject of this bug? It's not about "Occasional mouse cursor corruption" it's small pixmap corruption (very often), right?

For me attachment #21387 [details] [review] works, but not the patch for RADEONBlitChunk.

(In reply to comment #32)
> Can somebody update the subject of this bug? It's not about "Occasional mouse
> cursor corruption" it's small pixmap corruption (very often), right?
>

I agree. It must be the same problem as bug 18398.

I've taken the liberty of updating the bug title, if nobody minds.

Created attachment 22772 [details] [review]
make sure the engine is idle before sw access

Does this patch help?

(In reply to comment #35)
> Created an attachment (id=22772) [details]
> make sure the engine is idle before sw access
>
> Does this patch help?
>

Sorry to be the bearer of bad news; the patch makes a difference alright, but it's for the worse. It's the very same problem, only three times as aggressive. Garbage underneath animated mouse cursor (alternating along with animation cursor frames), and checkboxes in Firefox are almost always corrupted now. I'll attach an example.

Created attachment 22773 [details]
Screenshot showing checkbox corruption when using patch in attachment 22772 [details] [review]

Created attachment 22774 [details] [review]
make sure the engine is idle before sw access

how about this patch?

(In reply to comment #38)
> Created an attachment (id=22774) [details]
> make sure the engine is idle before sw access
>
> how about this patch?
>

Now you're on the right track. My infamous checkbox-test in Firefox/GMail shows there is improvement (it's better than I've ever seen it before with EXA). There are still rendering errors in a few check-boxes, but they are fewer now. I see idle-looping going on, should I try to increase the number of iterations before giving up and returning in RADEONWaitforIdlePoll()? Or perhaps that does not make sense and is unwise .. Obviously racy little slime bug this. I noticed you put back the drm busy waiting code in the latest patch (in the DFS func), that at least seemed to help. Also, I see the only XXX: in radeon_exa_funcs.c is located above that part :).

(In reply to comment #39)
> (In reply to comment #38)
> > Created an attachment (id=22774) [details]
> > make sure the engine is idle before sw access
> >
> > how about this patch?
> >
>
> Now you're on the right track. My infamous checkbox-test in Firefox/GMail
> shows there is improvement (it's better than I've ever seen it before with
> EXA). There are still rendering errors in a few check-boxes, but they are fewer
> now. I see idle-looping going on, should I try to increase the number of
> iterations before giving up and returning in RADEONWaitforIdlePoll()?
> Or perhaps that does not make sense and is unwise .. Obviously racy little slime
> bug this. I noticed you put back the drm busy waiting code in the latest patch
> (in the DFS func), that at least seemed to help. Also, I see the only XXX: in
> radeon_exa_funcs.c is located above that part :).
>

Forgive my stupid and ignorant suggestions; bumping to 2000000 iterations didn't do squat to help the situation it seems ;). Well, I figured it couldn't hurt to try ..

(In reply to comment #39)
> Now you're on the right track. My infamous checkbox-test in Firefox/GMail
> shows there is improvement (it's better than I've ever seen it before with
> EXA). There are still rendering errors in a few check-boxes, but they are fewer
> now. I see idle-looping going on, should I try to increase the number of
> iterations before giving up and returning in RADEONWaitforIdlePoll()? Or
> perhaps that does not make sense and is unwise .. Obviously racy little slime
> bug this. I noticed you put back the drm busy waiting code in the latest patch
> (in the DFS func), that at least seemed to help. Also, I see the only XXX: in
> radeon_exa_funcs.c is located above that part :).
>

I think it's just adding latency. We really need to wait on a timestamp written by the CP after rendering is done to be sure the hw is done and caches are flushed.

(In reply to comment #39)
> (In reply to comment #38)
> > Created an attachment (id=22774) [details]
> > make sure the engine is idle before sw access
> >
> > how about this patch?
> >
>
> Now you're on the right track. My infamous checkbox-test in Firefox/GMail
> shows there is improvement (it's better than I've ever seen it before with
> EXA). There are still rendering errors in a few check-boxes, but they are fewer
> now. I see idle-looping going on, should I try to increase the number of
> iterations before giving up and returning in RADEONWaitforIdlePoll()? Or
> perhaps that does not make sense and is unwise ..
> Obviously racy little slime bug this. I noticed you put back the drm busy
> waiting code in the latest patch (in the DFS func), that at least seemed to
> help. Also, I see the only XXX: in radeon_exa_funcs.c is located above that
> part :).

Same here (there's a lot of improvements, but still errors).

(In reply to comment #41)
> (In reply to comment #39)
> > Now you're on the right track. My infamous checkbox-test in Firefox/GMail
> > shows there is improvement (it's better than I've ever seen it before with
> > EXA). There are still rendering errors in a few check-boxes, but they are fewer
> > now. I see idle-looping going on, should I try to increase the number of
> > iterations before giving up and returning in RADEONWaitforIdlePoll()? Or
> > perhaps that does not make sense and is unwise .. Obviously racy little slime
> > bug this. I noticed you put back the drm busy waiting code in the latest patch
> > (in the DFS func), that at least seemed to help. Also, I see the only XXX: in
> > radeon_exa_funcs.c is located above that part :).
> >
>
> I think it's just adding latency. We really need to wait on a timestamp
> written by the CP after rendering is done to be sure the hw is done and caches
> are flushed.
>

Like you commented in the code. I won't pretend to know much at all about graphics drivers, but the CP facilitates a command queue/buffer (of register updates?) with the GPU as consumer and the CPU/driver software as producer? Just curious, really .. I'm a programmer by profession, but know very little about graphics drivers. It's interesting to look at, but still looks hellishly complex :)

Created attachment 22960 [details] [review]
Minimal patch.

I tried to minimize the changes of this patch #22774 and I found something very weird.

On my laptop the only place where waiting is needed is in RADEONDownloadFromScreenCP, but I decided to go further and see how many iterations were required.
It turns out that 99.99% of the time only 1 iteration is needed (i == 0); on very rare occasions (once per session) 2 iterations are needed. So I tried to minimize even more and I found out that doing a single read seems to fix the issue almost completely.

I'm attaching the patch, hopefully this would help track down the issue.

(In reply to comment #44)
> I'm attaching the patch, hopefully this would help track down the issue.

Interesting; I think this could indicate that radeon_do_wait_for_idle()/radeon_do_pixcache_flush() in the DRM are missing something for properly waiting for the cache flush to be finished.

(In reply to comment #45)
> (In reply to comment #44)
> > I'm attaching the patch, hopefully this would help track down the issue.
>
> Interesting; I think this could indicate that
> radeon_do_wait_for_idle()/radeon_do_pixcache_flush() in the DRM are missing
> something for properly waiting for the cache flush to be finished.

Is it possible that the cache is idle but the memory controller is not yet (that is, it would need to wait for MC_IDLE)? Or another idea: is writing DSTCACHE_CTLSTAT for flushing the cache and immediately reading it back guaranteed to give the right answer (busy if there's something to flush), or is there some delay needed?

Don't know why, but there is some improvement to this issue in the latest radeon driver snapshot and drm 2.4.5. The checkbox corruption still occurs, but it's not as nasty as it used to be. To put it simply: fewer checkboxes are corrupted.

I have similar problems with xserver 1.5.3 (and 1.6.0) with latest radeon and mesa from git. However it doesn't occur if only one DVI output of my radeon r580 is enabled. When I enable both, I can reproduce this always as described in http://bugs.freedesktop.org/show_bug.cgi?id=16865

(In reply to comment #48)
> http://bugs.freedesktop.org/show_bug.cgi?id=16865

That's a hardware cursor issue, whereas this report is about rendering corruption.

*** Bug 21218 has been marked as a duplicate of this bug.
***

Just confirming that the problem is still there in Xserver 1.6.0 and radeon driver 6.12.1 (Ubuntu Jaunty beta).

Created attachment 25349 [details] [review]
use dma engine rather than blitter for DFS

This patch fixed the DFS corruption for me.

(In reply to comment #52)
> Created an attachment (id=25349) [details]
> use dma engine rather than blitter for DFS
>
> This patch fixed the DFS corruption for me.

That hangs the whole system for me.

Created attachment 25355 [details] [review]
take 2

R1xx only supports dma via descriptor tables, so stick with blits for now on.

Felipe, as to your lockup, can you try after rebooting or does it always lock up?

(In reply to comment #54)
> Created an attachment (id=25355) [details]
> take 2
>
> R1xx only supports dma via descriptor tables, so stick with blits for now on.
>
> Felipe, as to your lockup, can you try after rebooting or does it always
> lock up?

Always. It works for a while, and suddenly it hangs completely.

Created attachment 25356 [details] [review]
always sync 2D/DMA before DMA

Does this patch help your lockups?

I'm having a GNU autoconf build problem on Ubuntu Jaunty for the latest git snapshot from git://anongit.freedesktop.org/xorg/driver/xf86-video-ati:

$ ./autogen.sh --prefix=/usr
...
...
checking dependency style of gcc... (cached) gcc3
./configure: line 11929: syntax error near unexpected token `XINERAMA,'
./configure: line 11929: `XORG_DRIVER_CHECK_EXT(XINERAMA, xineramaproto)'

I think I have most development packages installed (at least all -dev packages that Ubuntu uses when building its version of the radeon driver). I haven't got time to dive into autoconf problems right now; any hints as to what might be wrong? :) I'm very interested in testing out your patches.

(In reply to comment #57)
> I'm having a GNU autoconf build problem on Ubuntu Jaunty for the latest git
> snapshot from git://anongit.freedesktop.org/xorg/driver/xf86-video-ati:
>
> $ ./autogen.sh --prefix=/usr
> ...
> ...
> checking dependency style of gcc... (cached) gcc3
> ./configure: line 11929: syntax error near unexpected token `XINERAMA,'
> ./configure: line 11929: `XORG_DRIVER_CHECK_EXT(XINERAMA, xineramaproto)'
>
>
> I think I have most development packages installed (at least all -dev packages
> that Ubuntu uses when building its version of the radeon driver). I haven't got
> time to dive into autoconf problems right now; any hints as to what might be
> wrong? :) I'm very interested in testing out your patches.
>

I figured this out (installed a whole slew of xorg development packages). Will report back once I have some results.

Created attachment 25384 [details]
Corrupted screenshot from DFS-DMA patch
I was unable to take a proper screenshot with the DFS-DMA patch. It was totally corrupted by the time Gimp got hold of it. This picture shows how it looks.
(In reply to comment #54)
> Created an attachment (id=25355) [details]
> take 2
>
> R1xx only supports dma via descriptor tables, so stick with blits for now on.
>

This patch does not cause hangs here. It seems to reduce corruption a little, though. But .. I was not able to take screenshots with this patch (they were completely corrupted, see https://bugs.freedesktop.org/attachment.cgi?id=25384)

(In reply to comment #56)
> Created an attachment (id=25356) [details]
> always sync 2D/DMA before DMA
>
> Does this patch help your lockups?
>

This patch still has some corruption, and it completely locks up the machine when attempting to grab a screenshot (even Alt+SysRQ was hosed).

Created attachment 25538 [details]
Screenshot from Firefox showing some corrupted glyphs
Seems like this bug might be affecting font-rendering as well as widgets in Ubuntu Jaunty. But in the font-case, the glyphs seem to be cached somewhere, so it doesn't help to force a redraw to correct the bad letters :(
Hello,

the bug still persists in the latest git version. Any news?

Thanks

*** Bug 22397 has been marked as a duplicate of this bug. ***

On which cards does this happen exactly? Because it doesn't happen on my integrated radeon 3200 or my old laptop-X300, but it happens on my 4770.

(In reply to comment #65)
> On which cards does this happen exactly? Because it doesn't happen on my
> integrated radeon 3200 or my old laptop-X300, but it happens on my 4770.

It happens on my R520, but actually I've moved to Fedora 11 and I don't see the problem, but probably because of all the KMS stuff.

What about that BUG? It's really annoying. Will there soon be a new driver release which addresses that problem or do I have to wait for kernel 2.6.31 and KMS?

Still nothing new here?

Created attachment 28899 [details] [review]
patch for pre-r6xx chips

This seems to fix the DFS issues for me on my problematic r5xx card. Basically for small xfers we do the blit twice to make sure it's hit memory.

Very nice. Your latest patch for pre-R6xx cards works around the corruption issues I've been having on my ATI X1400 mobile.

I did some experimentation with the small pixmap limits in your patch. It also works for me if I reduce this to <= 24x24 (w < 25 || h < 25).

Patch tested against latest snapshot of master branch at http://cgit.freedesktop.org/xorg/driver/xf86-video-ati/ (commit a2968896884545f5c8f3f16c398c1ee4534ad7a8)

Thank you!

(In reply to comment #69)
> Created an attachment (id=28899) [details]
> patch for pre-r6xx chips
>
> This seems to fix the DFS issues for me on my problematic r5xx card. Basically
> for small xfers we do the blit twice to make sure it's hit memory.

Hmm, I don't like the idea. Surely there must be some way to correctly determine the blit has really completed?
Otherwise, isn't actually the size (so something like width * height * bpp) rather than width or height relevant for this?

this is written in some unspecified docs, however 2 has never made sense to me.

1. After BITBLT_MULTI copy data from frame buffer to system memory, flush entire 2D pixel cache and wait for 2D engine idle and clean to ensure the copied data to arrive into bus controller.
2. Copy the last pixel back to frame buffer from system memory to flush data out of bus controller to system physical memory.
3. Wait for flush process to be completed. This is done by either waiting for engine idle or using timestamp write back mechanism.

(In reply to comment #71)
> (In reply to comment #69)
> > Created an attachment (id=28899) [details]
> > patch for pre-r6xx chips
> >
> > This seems to fix the DFS issues for me on my problematic r5xx card. Basically
> > for small xfers we do the blit twice to make sure it's hit memory.
>
> Hmm, I don't like the idea. Surely there must be some way to correctly determine
> the blit has really completed?
> Otherwise, isn't actually the size (so something like width * height * bpp)
> rather than width or height relevant for this?
>

Yes, the size is relevant. There's apparently some threshold for the write combiner in the bus interface.

*** Bug 23732 has been marked as a duplicate of this bug. ***

I can confirm that latest radeonhd git is even worse than radeon in this regard, at least on Radeon HD 4770. In fact, radeon is quite usable, and only displays corruption "here and there", while radeonhd suffers from extreme corruption of the whole screen a minute or so after logging in. I can provide camera snapshots if you want. Screenshotting corrupts the image, so I'm forced to take camera snapshots instead.

Created attachment 29300 [details]
Xorg log - radeonhd
Xorg log with radeonhd 1.2.5-git27cfbaa3 and kernel 2.6.31-rc9
Created attachment 29301 [details]
Xorg log - radeon
Xorg log, same kernel, radeon git
Created attachment 29800 [details]
Xorg log, screen corruption with HD4770 and EXA
Same problems here with my HD4770. Kernel 2.6.30.6-1 (Arch), latest git of mesa and radeonhd, xorg 1.6.3.901. Well, it's not as bad as reported from Vedran (more or less usable apart from random glyph corruptions), but corruptions are still there.
(They're far worse on screenshots, so I can't provide usable ones)
I did some more testing here. I have XFX 4770 512MB, and a pretty standard AM2 configuration (Athlon 7850 on ASRock 770/SB700 board).

1) Recent changes in xf86-video-ati, I believe, made this less frequent. For example, System/Preferences/Appearance in GNOME shows themes correctly. They used to be corrupted before (looking like a chess board). Xv works in Totem and mplayer, but YouTube videos get corrupted. On mode switch, the whole screen gets corrupted, but goes back to normal after changing the desktop background. I believe this trick didn't work before. However...

2) Upgrading to mesa-git and drm-git from airlied's repository, and enabling DRI, makes all kinds of corruption that appeared before go away. Even YouTube works normally.

OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: Mesa DRI R600 (RV740 94B3) 20090101 TCL DRI2
OpenGL version string: 1.4 Mesa 7.7-devel

A bit off-topic - three things break:
first and foremost, Xv displays only a blank screen, even though xvinfo looks normal.
second, I get
do_wait: drmWaitVBlank returned -1, IRQs don't seem to be working correctly.
Try adjusting the vblank_mode configuration parameter.
when running glxgears
third, dmesg is filled with
[drm:r600_cs_packet_parse_vline] *ERROR* unknown crtc reloc
[drm:r600_packet0_check] *ERROR* No reloc for ib[691]=0x6538

Hope some of this helps...

(In reply to comment #79)
> A bit off-topic - three things break:
> first and foremost, Xv displays only a blank screen, even though xvinfo looks normal.

This is due to issue 3 below.

> second, I get
> do_wait: drmWaitVBlank returned -1, IRQs don't seem to be working correctly.
> Try adjusting the vblank_mode configuration parameter.
> when running glxgears

IRQs are not supported yet on r6xx/r7xx.

> third, dmesg is filled with
> [drm:r600_cs_packet_parse_vline] *ERROR* unknown crtc reloc
> [drm:r600_packet0_check] *ERROR* No reloc for ib[691]=0x6538

This is a separate issue and is why Xv isn't working. Please file a different bug for this.
> --- Comment #72 from Dave Airlie <airlied@freedesktop.org> 2009-08-26 04:31:27 PST ---
> this is written in some unspecified docs, however 2 has never made sense to me.
>
> 1. After BITBLT_MULTI copy data from frame buffer to system memory, flush
> entire 2D pixel cache and
> wait for 2D engine idle and clean to ensure the copied data to arrive into
> bus controller.
> 2. Copy the last pixel back to frame buffer from system memory to flush data
> out of bus controller to system
> physical memory.
> 3. Wait for flush process to be completed. This is done by either waiting for
> engine idle or using timestamp
> write back mechanism.
Just my 2 cents. Number 2 makes sense to me: When you write from dev A to
dev B and want to ensure that all the writes have arrived in B, you need
to read from B to A. Because PCI specifies that a read flushes all write
caches (in the other direction) on its path, this flushes all outstanding
writes to B.
Usually A=CPU/main memory and B=device and one uses a read of a
side-effect free reg, but it should work the other way round, too.
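
To illustrate the read-back argument above, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `PostedWriteBus` and `copy_and_flush` are hypothetical names invented for this example, not part of the radeon driver or any PCI API, and the model ignores direction and bridges entirely. It only shows the ordering rule being relied on: writes may sit in a buffer after the writer has moved on, and a subsequent read cannot complete until they have been delivered.

```python
from collections import deque


class PostedWriteBus:
    """Toy model (hypothetical) of a bus that posts writes to a target."""

    def __init__(self):
        self.pending = deque()  # writes accepted by the bus, not yet delivered
        self.target = {}        # what the target (device B) has actually received

    def write(self, addr, value):
        # A posted write returns immediately; delivery happens later.
        self.pending.append((addr, value))

    def read(self, addr):
        # Modelled ordering rule: a read may not pass earlier posted writes,
        # so every pending write is delivered before the read completes.
        while self.pending:
            a, v = self.pending.popleft()
            self.target[a] = v
        return self.target.get(addr)


def copy_and_flush(bus, data, flush_addr=0):
    """Write a buffer to the target, then read one location back to flush."""
    for addr, value in enumerate(data):
        bus.write(addr, value)
    delivered_before = len(bus.target)  # nothing is guaranteed to have arrived yet
    bus.read(flush_addr)                # the read-back forces the writes out
    return delivered_before, len(bus.target)
```

Calling `copy_and_flush(PostedWriteBus(), [10, 20, 30])` in this model reports that no writes had reached the target before the read-back and all three had reached it afterwards, which is the effect step 2 of the quoted procedure appears to be after.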
*** Bug 18400 has been marked as a duplicate of this bug. ***

*** Bug 18399 has been marked as a duplicate of this bug. ***

Confirming the bug on my x1400 mobile (lenovo z61m laptop) with a stock Ubuntu 9.10 setup.

Correction of my previous statement. On Fedora 12, with:

mesa-libGL-7.6-0.13.fc12.x86_64
mesa-dri-drivers-7.6-0.13.fc12.x86_64
mesa-libGLU-7.6-0.13.fc12.x86_64
mesa-dri-drivers-experimental-7.6-0.13.fc12.x86_64

I do see corruption even when DRI is enabled:

OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: Mesa DRI R600 (RV740 94B3) 20090101 TCL DRI2
OpenGL version string: 1.4 Mesa 7.7-devel

However, I need to actually use Compiz (or probably some other 3D app) to make it start occurring; it will happen in less than half an hour. When I don't use a 3D app, and DRI is enabled, corruption will not occur for hours (and probably never, but I can't test that hypothesis :-D ).

*** Bug 22312 has been marked as a duplicate of this bug. ***

Created attachment 31921 [details]
screenshot of corrupted pixmaps on Lenovo z61m
FWIW, I'm also facing pixmap corruption on a Lenovo z61m notebook with XAA.
This is on Debian unstable with
xorg 1:7.4+4 and xserver-xorg-video-radeon 1:6.12.3-1
(In reply to comment #87)
> FWIW, I'm also facing pixmap corruption on a Lenovo z61m notebook with XAA.

But this report is EXA specific, it can't really be related.

Something new about this? Can somebody tell, if it still occurs with kms? I cannot try it myself! Thanks!

(In reply to comment #89)
> Can somebody tell, if it still occurs with kms? I cannot try it myself!

It looks like this bug is gone with kms. At least for me. I tried 2.6.33.rc5 and git mesa, libdrm, and xf86 driver on rv670 agp. No bitmap corruption but sometimes kernel crashes during boot. If it manages to boot then it works fine later on. Maybe it is because of agp portion of the drivers (my agp card works well on windows and older linux). I'll try rc6 and possibly fill in a bug report if it is still there.

I have reproduced this with 2.6.33-rc5 and rv280 agp. Sometimes corruption appears in mouse cursor and fonts. I have seen whole desktop corruption too but I think that is different DFS bug. Of course my hw setup is probably one of the easiest on which to reproduce this problem.

(In reply to comment #89)
> > Can somebody tell, if it still occurs with kms? I cannot try it myself!
>
> It looks like this bug is gone with kms. At least for me. I tried 2.6.33.rc5
> and git mesa, libdrm, and xf86 driver on rv670 agp. No bitmap corruption but
> sometimes kernel crashes during boot. If it manages to boot then it works fine
> later on. Maybe it is because of agp portion of the drivers (my agp card works
> well on windows and older linux). I'll try rc6 and possibly fill in a bug
> report if it is still there.
I think Ubuntu finally got rid of this with the new -ati import:

xserver-xorg-video-ati (1:6.12.99+git20100126.e5933fd7-0ubuntu1)

  * New upstream git snapshot 20100126 (master) up to commit e5933fd7, includes:
    o [3a30210d] RS4xx: fix 200M freezes on VT switch if CRTC is disabled (LP: #333377, #494672)
    o Speedups for r600
    o Fixes to various dpms / incorrect resolution issues
    o Fixes to low memory EXA; fix NoAccel to work with KMS

Right Bryce?

(In reply to comment #92)
> I think Ubuntu finally got rid of this with the new -ati import:
> xserver-xorg-video-ati (1:6.12.99+git20100126.e5933fd7-0ubuntu1)

I'm tracking git master of the radeon driver on Ubuntu Karmic (using UMS, not KMS), and the corruption issues at least for my card [X1400] are still there. Currently running a snapshot from 2010-02-04 (up to commit 8d63d70f).

*** Bug 26523 has been marked as a duplicate of this bug. ***

*** Bug 26595 has been marked as a duplicate of this bug. ***

Maybe we should move hd4770 (RV740) issues to a separate bug as I don't think other chips experience the same issue. RV740 seems not to render to GTT correctly afaik. You get a checkerboard pattern or stripes with half correct - half garbage.

The easiest way for me to reproduce is using xmag under UMS. Just take a mag and you get a checkerboard. Using some screenshot app will also show this, just as in http://bugs.freedesktop.org/attachment.cgi?id=33228

Under KMS I can make screen corruption (x pixmaps?) happen with some opengl game with lots of textures (set to high/ultra) or also after a minute or two with googleearth. I guess it starts when kms starts to evict & bring back buffers from vram to gtt.

I guess we're not programming/setting up rv740 correctly. Just to mark 2 other issues with rv740: it returns only half the occlusion query samples in ogl and pixmaps w<32 are disabled altogether in dfs for this chip.
Andre

Hi,

I just want to report much much stronger corruption all over the place with my HD4770:

01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Radeon HD 4770 [RV740] [1002:94b3]
Subsystem: ATI Technologies Inc Device [1002:0d00]

As I'm currently using Kubuntu 9.10 Karmic with mainline Kernel 2.6.32 I wanted to finally give radeon KMS a shot but to no avail as it just displays total garbage on my big Dell 30" (2560x1600 native).

So I'm stuck with UMS but at least it "works". I can see the first corruption when I log in using KDM when animations are displayed after providing the credentials. Then when starting to work the desktop seems fine at first. The next things that are corrupted are the KDE4 main menu and, when browsing with dolphin, any graphic sooner or later becomes corrupted. This goes on as I continue to do things.

In contrast to that glxgears/-heads don't show any corruption. So I tried to enable desktop effects with the result that all window decorations disappear immediately and windows cannot be raised/lowered/moved/resized, shortcuts like ALT+F4 don't work any longer but the applications themselves continue to work.

I'll attach some screens for further details.

The rv740 bug is a separate issue and has a fix available here for UMS and KMS:
http://marc.info/?l=dri-devel&m=126661490126109&w=2
Which will hopefully get into 2.6.33, if not then 2.6.33.1

Created attachment 33438 [details]
corruption on 2560x1600 KDE4
Created attachment 33439 [details]
corruption on 2560x1600 KDE4 #2
Created attachment 33440 [details]
corruption on 2560x1600 KDE4 #3
(In reply to comment #98)
> The rv740 bug is a separate issue and has a fix available here for UMS and KMS:
> http://marc.info/?l=dri-devel&m=126661490126109&w=2
> Which will hopefully get into 2.6.33, if not then 2.6.33.1

Ah OK. That sounds good. Thanx and sorry to disturb this thread.

@Ancoron: Your desktop looks exactly like the one I had without the patch Alex D. told you to apply. You can find it in drm-linus:
http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=shortlog;h=refs/heads/drm-linus

WOW!!!

I just compiled my first kernel (snapshot 7d404c7b5f4c004712bc15ed6e6edd6779842126, airlied/drm-2.6.git) and well, no corruptions for me yet. And the speed of the KWIN desktop effects is astonishing. I am impressed! Now I'm going to like this machine... ;)

But one thing is exactly the same as before: KMS. Still displaying only garbage. But as long as UMS works fine I don't care.

Thanx again for all your work!

Ancoron

The corruption issues disappear with KMS, at least in the graphics stack included in Ubuntu Lucid. I use plain uncomposited desktop (Compiz too slow with KMS and my X1400 card). With UMS, the corruption is still the same in Ubuntu Lucid.

Well, that's some good news. I think since in Ubuntu we're shipping KMS on as the default for this hardware, I'll close out the bug in launchpad and consider KMS the fix. I'll leave this fdo bug open so the -ati guys can decide if they want to continue troubleshooting the UMS issue.

I'm using Debian
vanilla upstream kernel 2.6.34
X.Org X Server 1.7.7
radeon: module version = 6.13.1
ATI Radeon HD 4290 (ChipID = 0x9714)
(**) RADEON(0): Option "AccelMethod" "EXA"
(II) AIGLX: Loaded and initialized /usr/lib/dri/r600_dri.so
(II) RADEON(0): Output VGA-0 using initial mode 1920x1200
(II) RADEON(0): Output DVI-0 using initial mode 1920x1200
(II) [KMS] Kernel modesetting enabled.
Yes, it's using KMS so this photo is a touch ironic ;)
http://www.flickr.com/photos/96141280@N00/4824397821/sizes/l/

This exact problem should be fixed since some time ago.

It indeed was fixed a long time ago.