Created attachment 116810 [details] Xorg log file Running: xf86-video-intel 2.99.917-5 in Arch Linux If I launch LibreOffice with a couple of dark-background terminal windows on the screen, select the "Impress" application (clone of powerpoint), bring the terminal windows forward, and move my mouse around a bit (hitting areas of the LibreOffice window), bright pixels appear in my dark terminal windows. This only happens when using "SNA" mode in the intel driver. This is on a ThinkPad T430s. I can reproduce this very easily. It does not happen in UXA. I've attached an Xorg.0.log file. I'll attach a screenshot showing some bright pixels in the term windows in the next entry. Let me know if I can provide any other files.
Created attachment 116811 [details] Screenshot showing bright pixels Here's the screenshot showing the bright pixels.
A couple of possibilities: - it's a missed GPU w/a, state, flush - it's just tearfree damage tracking going awry The second is very easy to test as you can quickly turn it off from your xorg.conf. At a guess, I would say the window manager is wmaker?
(In reply to Chris Wilson from comment #2) > - it's just tearfree damage tracking going awry > > The second is very easy to test as you can quickly turn it off from your > xorg.conf. Nope, I turned off tearfree and could still make it happen. > At a guess, I would say the window manager is wmaker? It's fvwm. Let me know of any more tests I can try.
(In reply to Joe Peterson from comment #3) > It's fvwm. > > Let me know of any more tests I can try. Note that I use "focus follows mouse" (and with fvwm, it's strict, meaning focus goes in and out of windows as the mouse passes over them). This may make the issue much easier to reproduce for me, as moving the mouse causes focus/unfocus events, changes terminal cursor, etc. The terminal I use mainly is urxvt (rxvt-unicode), but I also checked to see if xterm gets the bright pixels, and it does.
It would be handy if you could compile with --enable-debug=pixmap. That will make sure that we verify the coordinates before drawing and so hopefully ruling out a simple mistake.
Strange, I recompiled xf86-video-intel (using the Arch package). With or without the option, --enable-debug=pixmap, I cannot get X to start. The following is in Xorg.0.log: [ 8087.510] (==) intel(0): Depth 24, (--) framebuffer bpp 32 [ 8087.510] (==) intel(0): RGB weight 888 [ 8087.510] (==) intel(0): Default visual is TrueColor [ 8087.510] (**) intel(0): Option "AccelMethod" "sna" [ 8087.510] (**) intel(0): Option "TearFree" "true" [ 8087.510] (EE) [ 8087.510] (EE) Backtrace: [ 8087.510] (EE) 0: /usr/lib/xorg-server/Xorg (OsLookupColor+0x139) [0x596749] [ 8087.510] (EE) 1: /usr/lib/libc.so.6 (__restore_rt+0x0) [0x7f4f84cba5af] [ 8087.511] (EE) 2: ? (?+0x0) [0x0] [ 8087.511] (EE) [ 8087.511] (EE) Segmentation fault at address 0x0 [ 8087.511] (EE) Fatal server error: [ 8087.511] (EE) Caught signal 11 (Segmentation fault). Server aborting Am I missing a vital step in building it?
You are trying to build -intel-2.99.917 with new libdrm, and there was an abi break in libdrm. You need to build from -intel.git to assist with debugging anyway.
Created attachment 116832 [details] Xorg.0.log
OK, thanks for the tip on that. I do get an extra line in Xorg.0.log: intel(0): SNA compiled with extra pixmap/damage validation So I assume my build it active (although version appears the same). At first try, I could not, this time, cause the issue to appear. Could the compile option prevent it now? I've attached a new log file (just before this comment) in case it helps...
It'll change timings (it runs a bit slower), but it shouldn't otherwise impact the rendering results - I tried to avoid perturbing the paths it validates.
(In reply to Chris Wilson from comment #10) > It'll change timings (it runs a bit slower), but it shouldn't otherwise > impact the rendering results - I tried to avoid perturbing the paths it > validates. OK, I compiled the git version *without* the pixmap option, and now I cannot make it happen anymore. Have there been changes, that you know of, since the version I was using that could have fixed this? If you have any more suggestions of things to try, let me know.
(In reply to Joe Peterson from comment #11) > OK, I compiled the git version *without* the pixmap option, and now I cannot > make it happen anymore. Have there been changes, that you know of, since the > version I was using that could have fixed this? If you have any more > suggestions of things to try, let me know. About the only thing which appears relevant are engine selection tweaks. Nothing that purports to change the rendering for your machine, just fine tuning.
(In reply to Chris Wilson from comment #12) > About the only thing which appears relevant are engine selection tweaks. > Nothing that purports to change the rendering for your machine, just fine > tuning. OK, interesting. I was trying latest git version, which seemed to either fix it or mask the issue. And today, I dyd an Arch Linux update, and it seems the official package now grabs git commit b24e758 (backlight: Factor known names into preferred interfaces). This one is earlier than latest, of course, but it also seems to mask or fix the issue. I guess there is no way to go back to the 2.99.917 (due to the abi change) to find out what cause the "fix". I hope it's not just a random timing change that is now causing it to work better, since that means it could still be lurking...
You can bisect, you just have to cherry-pick the fix every time. I would appreciate it because as you say without understanding why it is no longer apparent, we have no idea if it is fixed or merely hidden.
(In reply to Chris Wilson from comment #14) > You can bisect, you just have to cherry-pick the fix every time. I would > appreciate it because as you say without understanding why it is no longer > apparent, we have no idea if it is fixed or merely hidden. Is there a patch I could make that would update the older versions to current abi, or is it more complex than that? Bisecting would be easier if that were possible. BTW, do you know the commit at which the abi changed to the current?
commit 8576527cfacaf42af8316e1030c192193e94225a Author: Neil Roberts <neil@linux.intel.com> Date: Thu Mar 19 17:11:08 2015 +0000 intel: Merge latest i915_drm.h is the fateful commit in libdrm. diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h index ded43b1..e398949 100644 --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -497,14 +497,6 @@ struct drm_i915_gem_mmap { * This is a fixed-size type for 32/64 compatibility. */ __u64 addr_ptr; - - /** - * Flags for extended behaviour. - * - * Added in version 2. - */ - __u64 flags; -#define I915_MMAP_WC 0x1 }; should be all that is required.
(In reply to Chris Wilson from comment #16) > commit 8576527cfacaf42af8316e1030c192193e94225a > Author: Neil Roberts <neil@linux.intel.com> > Date: Thu Mar 19 17:11:08 2015 +0000 > > intel: Merge latest i915_drm.h > > is the fateful commit in libdrm. > > diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h > index ded43b1..e398949 100644 > --- a/include/drm/i915_drm.h > +++ b/include/drm/i915_drm.h > @@ -497,14 +497,6 @@ struct drm_i915_gem_mmap { > * This is a fixed-size type for 32/64 compatibility. > */ > __u64 addr_ptr; > - > - /** > - * Flags for extended behaviour. > - * > - * Added in version 2. > - */ > - __u64 flags; > -#define I915_MMAP_WC 0x1 > }; > > > should be all that is required. Hi, so that's the change in libdrm, but when in xf86-video-intel did the code start to use the new libdrm abi? And is there a patch I can apply to the older versions of xf86-video-intel (not to libdrm) to allow them to work with the new libdrm? I'd rather just bisect xf86-video-intel's code and reinstall that package than downgrade libdrm on mys system (that sounds a lot more dangerous).
Is it this one:? 7fe2b29 sna: Protect against ABI breakage in recent versions of libdrm
(In reply to Joe Peterson from comment #17) > Hi, so that's the change in libdrm, but when in xf86-video-intel did the > code start to use the new libdrm abi? The point is the existent ABI changed. We thought we were being smart by zero extending the field and believed it wouldn't break. But then we started passing in nearby stack because the ioctl was passing in more data than the old struct. (In reply to Joe Peterson from comment #18) > Is it this one:? > > 7fe2b29 sna: Protect against ABI breakage in recent versions of libdrm Yes.
(In reply to Chris Wilson from comment #19) > (In reply to Joe Peterson from comment #17) > > Hi, so that's the change in libdrm, but when in xf86-video-intel did the > > code start to use the new libdrm abi? > > The point is the existent ABI changed. We thought we were being smart by > zero extending the field and believed it wouldn't break. But then we started > passing in nearby stack because the ioctl was passing in more data than the > old struct. > > > (In reply to Joe Peterson from comment #18) > > Is it this one:? > > > > 7fe2b29 sna: Protect against ABI breakage in recent versions of libdrm > > Yes. OK, cool. So do you expect that if I apply the patch (git diff bacaf7f..7fe2b29) to earlier versions of xf86-video-intel that it will work with new abi? I can try, but if hopeless, I'd rather know now. :)
Ok, according to my bisection (and I want to test the "good" version a little more before I declare it good), the following commit appears to "fix" it: e47eb0c sna: Don't unroll BLT points Looking at the diff for that commit, I do see something potentially suspect in the old code in the second unrolled block: - if (n_this_time & 4) { - *((uint64_t *)b + 0) = pt_add(cmd, p+0, dx, dy); - *((uint64_t *)b + 1) = pt_add(cmd, p+1, dx, dy); - *((uint64_t *)b + 2) = pt_add(cmd, p+2, dx, dy); - *((uint64_t *)b + 3) = pt_add(cmd, p+3, dx, dy); - b += 8; - p += 8; Note that both b and p are incremented by 8. In the previous block, which looks more correct they are: - if (n_this_time & 4) { - *((uint64_t *)b + 0) = pt_add(cmd, p+0, 0, 0); - *((uint64_t *)b + 1) = pt_add(cmd, p+1, 0, 0); - *((uint64_t *)b + 2) = pt_add(cmd, p+2, 0, 0); - *((uint64_t *)b + 3) = pt_add(cmd, p+3, 0, 0); - b += 8; - p += 4; I didn't analyze the unrolled code completely yet, but I wonder if that is where the bug was before the code was "rolled". I'll look at this a bit more later, but I wanted to get where I was so far to you now...
Oh, I remember seeing that bug as well and feeling guilty for not commenting on it in the changelog. That would explain the pixels very nicely and why they evade --enable-debug. Thanks.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.