Summary: | [sna snb/ivb] corruption with chromium
---|---
Product: | xorg
Reporter: | Vedran Rodic <vrodic>
Component: | Driver/intel
Assignee: | Chris Wilson <chris>
Status: | RESOLVED FIXED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal
Priority: | medium
CC: | anarsoul, bay, czajernia, devtty5, joe, mail, steven, xlinuxro
Version: | git
Hardware: | Other
OS: | All

Created attachment 82531 [details]
SNA-issue-4
Created attachment 82532 [details]
SNA-issue-3
SNA-Issue-4 shows a problem at the bottom left (the status bar of PHPStorm is broken by the text above it). SNA-Issue-3 shows text corruption in the bottom-centre part of the screen when entering text in a text box.

(In reply to comment #2)
> Created attachment 82532 [details]
> SNA-issue-3

This looks like the kernel bug fixed by

commit daa13e1ca587bc773c1aae415ed1af6554117bd4
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jun 28 16:54:08 2013 +0100

    drm/i915: Only clear write-domains after a successful wait-seqno

Can you describe the first issue more clearly? Is it only with that application?

The first issue is the only one I can reproduce easily. I'm not sure if it is PHPStorm specific, but it might be. Basically it's seen only when I scroll the treeview of that application. Initial rendering of the status bar is fine.

Can you check which commit of the DDX you have? There was a recent bug with scrolling, fixed by

commit 34c9b759fbab8d548108e954d55de38c6f5bec31
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 16 19:39:37 2013 +0100

    sna: Note that borderClip region may be more than a singular box

Hmm, I think I just fixed a further bug from

commit 34c9b759fbab8d548108e954d55de38c6f5bec31
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 16 19:39:37 2013 +0100

    sna: Note that borderClip region may be more than a singular box

with:

commit a764a6e69b23f644957cf3e4e98868464f458758
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jul 17 10:51:56 2013 +0100

    sna: Fix typo in computing box intersection

Do you mind attaching your Xorg.0.log so that I can check which version you are running?

I tested with a version compiled at 8:50 CET, so it had 34c9b759fbab8d548108e954d55de38c6f5bec31. But right now I tested with current a764a6e69b23f644957cf3e4e98868464f458758, and the problem in PHPStorm is gone.

It looks like the first issue in this bug is still present.
I have kernel version a0ab62339af858b63eba0205a583a5a503536da6 (from the drm-intel-nightly Ubuntu mainline builds), and I still saw a problem very similar to the previous sna-issue-3 screenshot. My DDX is e386ba86ea487a2db62d80a0e60f176e052d6406, so I don't have the single latest commit made since then. I'll attach an image of the new issue; it looks slightly different, but that is probably just a difference in the random garbage.

Created attachment 82715 [details]
sna-issue-3-new
Could be an unwanted side-effect of

commit 6921abd81017c9ed7f3b2413784068fbc609a0ea
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jul 18 16:21:27 2013 +0100

    sna: Add a fast path for the most common fallback for CPU-CPU blits

in which case please test with current master. Thanks.

Retested. Sorry, still an issue.

* scratches head

Not sure then. Please can you attach Xorg.0.log to confirm the configuration details? If you can identify any pattern (i.e. reproduction steps) leading up to the corruption, that will be very useful. Thanks.

This is what I use to reproduce the problem from a fresh reboot (drm-intel-nightly 68c6cd3f1312965698b2af5bb08e15807ce9ae2d, DDX 7b1a5024df96070bab70744ad7e7d78a41fb0f72 - current):
- Open Google Chrome (Version 28.0.1500.71)
- Go to http://support.humblebundle.com/customer/portal/articles/754604-torchlight-changelog
- Try selecting text in the last three bullet points
- If the right edge of the white bounding box that surrounds the main content of the page is obscured by making the window narrower, the problem when selecting text goes away
- Attached image for reference (bug-chrome.png)
- Attached Xorg.log

Created attachment 82823 [details]
bug-chrome.png
Created attachment 82824 [details]
Xorg.log
Ben confirmed seeing something similar after updating his kernels to 3.10.3, and only on ivb (not ilk, but then again not exhaustively tested). I've switched to browsing with chromium (rather than just light testing) and have also seen the occasional glitch. I have not yet found a pattern, so it remains elusively unreproducible.

Spotted something that looks like it would be hit by Chromium from time to time:

commit c9d89499806779cd6c62d5d6d34df76279cc5abd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jul 29 11:51:39 2013 +0100

    sna: Composite region is already in dst drawable space

    So do not offset it again when processing the fallback composite
    operation.

    Regression from
    commit 6921abd81017c9ed7f3b2413784068fbc609a0ea
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Thu Jul 18 16:21:27 2013 +0100

        sna: Add a fast path for the most common fallback for CPU-CPU blits

    References: https://bugs.freedesktop.org/show_bug.cgi?id=66990
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

I'm hoping that we can still find a pattern behind the corruption, otherwise it will remain nigh on impossible to test. :|

I'm away on vacation. In a week I'll be able to test with the scenario above, which was reproducible every time on my machine.

I'm not sure this helps, but I've been seeing what looks like exactly this bug on my Lenovo T430s as well. I first saw it when entering text: the garbled long, horizontal rectangle changed as I typed (my typing was a part of it and/or near it). As the reporter of this bug describes, the rectangles often appear as if the width does not match the contents, causing offsetting/slanting of the pixel lines (perhaps just one effect). I only remember noticing it within roughly the past few weeks. I am on xf86-video-intel 2.21.13-1 (Arch Linux) and kernel 3.10.2, but I do believe I saw the issue with kernel 3.9.9 as well (and I know I also saw it with xf86-video-intel 2.21.12-1).

It happens at seemingly random times, and often I just see a random rectangle (mostly in Chromium) with garbled contents, but it goes away quickly when something changes. The rectangles sometimes look different from what is described above, but I suspect the cause is the same. I'll try to get a screen capture next time I see it. I'll attach my Xorg.0.log file, in case that is of help.

Created attachment 83554 [details]
Xorg log file
Here's my Xorg log file, in case it helps.
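Several comments in this report ask for Xorg.0.log so that the exact driver build can be identified; the relevant line has the form `(II) intel(0): SNA compiled from <version>`, as quoted elsewhere in this thread. A minimal sketch of pulling that version token out of a log line follows. The helper name `sna_version_from_line` is hypothetical and is not part of the driver or of any X.org tool.

```c
#include <string.h>

/*
 * Hypothetical helper: copy the token that follows "SNA compiled from "
 * into out. Returns 1 on success, 0 if the marker is absent or the
 * token does not fit into the supplied buffer.
 */
static int sna_version_from_line(const char *line, char *out, size_t outlen)
{
    static const char marker[] = "SNA compiled from ";
    const char *p = strstr(line, marker);
    size_t n;

    if (p == NULL || outlen == 0)
        return 0;

    p += sizeof(marker) - 1;        /* skip past the marker text */
    n = strcspn(p, " \t\r\n)");     /* token ends at whitespace or ')' */
    if (n == 0 || n >= outlen)
        return 0;

    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```

Feeding each line of Xorg.0.log through this function and printing the first match would give the same information a maintainer reads off the attached log by hand.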
Created attachment 83675 [details]
Screenshot showing problem on my Lenovo T430s

OK, I captured a screenshot, finally, of the bug as I've often seen it. I think it looks strikingly similar to the reporter's visual effect. Hope this is of help! Note that this is from my Lenovo T430s (not the older Lenovo that I was using when I reported the unrelated bug 55500 a while back).

Retested with latest DDX (c01c66b), drm-intel-nightly kernel 3224cf6c3ee5ab9c280052c9fbed4f660310c411. Still able to reproduce with the instructions above.

If you are keen, you can try the userptr branch from http://cgit.freedesktop.org/~ickle/linux-2.6 and compiling the ddx with

    ./configure --enable-userptr <usual configure options>

The difference will be subtle: only a path where we need to operate on a busy target will use the userptr directly. At the moment, we will allocate a staging buffer to perform the copy. My feeling is that we are missing some barrier around that staging buffer and the GPU reads garbage instead of the updated content from chromium. So if switching to userptr does fix the corruption, I think that points towards the staging buffer.

*** Bug 67894 has been marked as a duplicate of this bug. ***

I am seeing very similar rendering corruption (in the chrome browser) to that reported by Vedran Rodic. My system is archlinux, with the following key packages:

linux 3.10.5-1
xf86-video-intel 2.21.14-2

By reverting to UXA the rendering problems don't seem to appear, so this seems to be due to SNA in the current version of xf86-video-intel.

Created attachment 83939 [details]
xorg log where graphics corruption was observed (SNA)
Created attachment 83945 [details]
xorg log after switching back to UXA
(In reply to comment #27)
> I am seeing very similar rendering corruption (in the chrome browser) to
> those reported by Vedran Rodic. My system is archlinux, with the following
> key packages
>
> linux 3.10.5-1
> xf86-video-intel 2.21.14-2
>
> By reverting to UXA the rendering problems don't seem to appear so this
> seems to be due to SNA in the current version of xf86-video-intel.

If it is any additional help, the system has an i3-3220T processor with HD2500 graphics.

$ lspci | egrep -i vga
00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)

Can you all please test whether:

diff --git a/src/sna/sna_composite.c b/src/sna/sna_composite.c
index 58dd356..6f24eeb 100644
--- a/src/sna/sna_composite.c
+++ b/src/sna/sna_composite.c
@@ -520,7 +520,7 @@ sna_composite_fb(CARD8 op,
 	if (mask)
 		validate_source(mask);
 
-	if (mask == NULL &&
+	if (mask == NULL && 0 &&
 	    src->pDrawable &&
 	    dst->pDrawable->bitsPerPixel >= 8 &&
 	    src->filter != PictFilterConvolution &&

stops the corruption?

Another step in the saga,

commit e8dfc5b3f4ffeec93e52a5319b5a3118edf0e94e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Aug 12 10:33:41 2013 +0100

    sna: Fix destination offset along memcpy composite fallback fastback

    The application of dst_x|y was incorrect, and so the drawing could end
    up in the wrong location for a window.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=66990
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Pretty sure this is it! Haven't been able to reproduce my irregular chromium corruption since, but then it was fairly irregular...

No, that's not it, sorry. I can still reproduce it with "SNA compiled from 2.21.14-27-g9645e71".

I'll retest with latest drm-nightly-intel

(In reply to comment #33)
> No, that's not it, sorry. I can still reproduce it with "SNA compiled from
> 2.21.14-27-g9645e71".
>
> I'll retest with latest drm-nightly-intel

Can you also try with the quick little hack from c31

diff --git a/src/sna/sna_composite.c b/src/sna/sna_composite.c
index 58dd356..6f24eeb 100644
--- a/src/sna/sna_composite.c
+++ b/src/sna/sna_composite.c
@@ -520,7 +520,7 @@ sna_composite_fb(CARD8 op,
 	if (mask)
 		validate_source(mask);
 
-	if (mask == NULL &&
+	if (mask == NULL && 0 &&
 	    src->pDrawable &&
 	    dst->pDrawable->bitsPerPixel >= 8 &&
 	    src->filter != PictFilterConvolution &&

to sanity check that I am barking up the right tree.

Latest git plus this patch applied still has the same corruption.

Thanks, that suggests you have something I haven't seen yet. Ben, what happens with your test case?

No luck for me either.

Elsewhere it has been reported that strange artefacts occur when ring switching on IVB. It is definitely worth testing with current -intel to see if the corruption has changed.

You are right! :) This is fixed on my IVB with the current version (intel(0): SNA compiled from 2.21.15-13-ge98cc0b). Thanks Chris.

I'll be optimistic and close this bug now.

I am still experiencing this issue, both on

[ 792.429] (II) intel(0): SNA compiled from 2.21.15-13-ge98cc0b

and on 2.99.901. I get exactly the same results as in comment #14.

Platform: Lenovo Thinkpad X1 (Sandy Bridge/Intel(R) HD Graphics 3000)
Kernel 3.10.5
Google Chrome 28.0.1500.95

*** Bug 68964 has been marked as a duplicate of this bug. ***

Not to be redundant, but I am also still seeing this issue on Arch Linux (on Lenovo T430s) with xf86-video-intel 2.21.15-1.

Looks like it is fixed as of:

(II) intel(0): SNA compiled from 2.99.901-45-g76790db

Can anyone else confirm this?

Confirming. I've updated the video driver from git and I haven't seen any corruption since. But it will take several days before I can say so with confidence.

I've just seen screen corruption with the latest version of the intel driver. But now it is really rare, only once in two days (it was 10-20 times).
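The test patch quoted in the comments above works by appending `&& 0` to the condition that guards the fast path. Because `&&` evaluates left to right and short-circuits, the constant `0` makes the whole condition false, so the code always falls through to the general path without any other lines being touched. A minimal sketch of the idiom, using hypothetical names (`struct picture`, `choose_path`, `enable_fast_path`) rather than the real `sna_composite_fb()` code:

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's picture objects. */
struct picture {
    int bits_per_pixel;
};

enum path { PATH_FAST, PATH_GENERIC };

/*
 * enable_fast_path plays the role of the literal in the test patch:
 * passing 0 corresponds to "mask == NULL && 0 && ...".
 */
static enum path choose_path(const struct picture *mask,
                             const struct picture *dst,
                             int enable_fast_path)
{
    /* A zero operand short-circuits the chain; later tests never run. */
    if (mask == NULL && enable_fast_path && dst->bits_per_pixel >= 8)
        return PATH_FAST;

    return PATH_GENERIC;
}
```

If the corruption disappears with the fast path forced off, the fast path is implicated; if it persists, as the testers reported here, the cause lies elsewhere.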
Just experienced that weird corruption again, but, as bay said, it is only once in a while: very intermittent, and I cannot reproduce it at will.

I am still experiencing corruption as well using a git build (from the day before yesterday) of xf86-video-intel... It's very difficult to reliably reproduce the issue but it's definitely still there.

I've made quite a few minor fixes, none of which ostensibly looks like it should fix this issue, but double checking with the current rc kernel (3.12-rc2) and latest xf86-video-intel.git is a must. Afterwards, you can try enabling (independently, and/or in combination)

#define DBG_NO_UPLOAD_CACHE
#define DBG_NO_UPLOAD_ACTIVE
#define DBG_NO_MAP_UPLOAD

in src/sna/kgem.c and see if any of those prevent the corruption.

I can reproduce it most of the time on http://www.pastebin.com/ now, when clicking on the search box. Tested on gentoo-sources-3.10.5 and vanilla 3.12-rc3 with xf86-video-intel c724098.

It looks like enabling just

#define DBG_NO_UPLOAD_CACHE

is enough to prevent the corruption on 3.12-rc3 (I enabled all 3 options first, then disabled them one by one to pinpoint the right combination).

Finally!

commit a048f436a0210d076fc844404bf56b8b7fcb4b7b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Oct 2 14:59:11 2013 +0100

    sna: Only delete unused io buffers

    Before deleting the io buffer, we need to check that it is not active.
    Currently we check that it is not pending use in the current batch, but
    we also need to double check that it does not have outstanding use by
    the GPU. Failing to do so could mean overwriting the data prior to it
    being read by the GPU, a very small race but often hit!

    Reported-by: Vedran Rodic <vrodic@gmail.com> # and many others
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66990
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Yep, seems to work fine on 3.12.0-rc3 and a048f436. Good work, thanks!

Confirming. I haven't seen any corruption for a week since I applied the patch.
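The commit message above describes two separate conditions that must both hold before an io buffer may be deleted: it must not be pending in the current batch, and the GPU must not have outstanding use of it. The following is a toy model of that check, not the actual kgem.c code; the struct, its fields, and both function names are hypothetical and exist only to illustrate the logic.

```c
#include <stdbool.h>

/* Toy model of an upload ("io") buffer; field names are illustrative. */
struct io_buffer {
    bool in_current_batch;  /* referenced by commands not yet submitted */
    bool gpu_busy;          /* submitted work may still read from it */
};

/* Behaviour as described before the fix: only the batch is checked. */
static bool can_delete_before_fix(const struct io_buffer *b)
{
    return !b->in_current_batch;
}

/* Behaviour as described after the fix: GPU activity is checked too. */
static bool can_delete_after_fix(const struct io_buffer *b)
{
    return !b->in_current_batch && !b->gpu_busy;
}
```

The racy case is a buffer whose batch has already been submitted but which the GPU has not finished reading: the first check allows deletion and reuse, so new data can overwrite the old contents before they are read, while the second check refuses. This fits the tester's finding that defining DBG_NO_UPLOAD_CACHE was enough to prevent the corruption.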
Created attachment 82530 [details]
SNA-issue-4

I'm seeing various screen corruption issues with the latest Intel SNA DDX on my Ivy Bridge machine. On the kernel side I have 3.10.1, and on the X server side I have the Ubuntu PPA build of xserver-xorg-core 1.13.4~git20130508. I'm not sure exactly when it started (it could be months ago). Switching to UXA makes the issue in bug-intel-ddx-4.png go away, and probably the one in bug-intel-ddx-3.png too, but that one is a bit harder to reproduce. I'm using mostly GTK2 clients in an LXDE environment, without compositing.