Bug 90725

Summary: [965g] Undetected bit17 swizzling (note desktop not mobile)
Product: DRI
Component: DRM/Intel
Status: CLOSED FIXED
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Fab Stz <fabstz-it>
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: fabstz-it, intel-gfx-bugs, mar.kolya, maxx
Whiteboard:
i915 platform: I965G
i915 features: GEM/Other
Attachments:
Description | Flags
Display chunks | none
Xorg log | none
Xorg.0.log with latest git master : fb1643f0f904eb258da71cd0b8deb8d3ec6dafed | none
Xorg log with enable-debug=full | none
Xorg log (see comment 15) | none
Log for case (7) | none
Log in reply to comment 35 | none
Display issue when not using debug=full | none
Log in reply to comment 40 | none
Another display issue | none
Another display issue (2) | none

Description Fab Stz 2015-05-28 13:03:05 UTC
Created attachment 116114 [details]
Display chunks

Hello,

I get display corruption ("display chunks") as soon as some of my swap space is
in use. I first noticed it with a VirtualBox VM, as it generally needs space from
my swap partition. But it also appears when compiling programs, as soon as the
build needs the swap partition.

I'm on an up-to-date Debian stable (jessie) (kernel 3.16.7-ckt9-3~deb8u1 x86_64) and my Xorg uses
the intel driver.
The issue appears with the default config (no xorg.conf, i.e. driver intel, with DRI
& Accel).
But I found this workaround: using either of these options in xorg.conf hides
the issue:
    Option        "NoAccel"     "on"
    Option        "AccelMethod" "off"

Using the fbdev driver is another workaround.

The other AccelMethod values (UXA, SNA, blt) and DRI=off do not work around the issue.

BTW: Compiling & installing the latest version of xserver-xorg-video-intel from
experimental (2:2.99.917) doesn't fix the issue. Neither Mesa 10.5.5 nor kernel 4.0.2 fixes it.

Regards


NB: I initially reported this bug in Debian:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=780363
Comment 1 Chris Wilson 2015-05-28 13:36:19 UTC
commit 656bfa3afc14e45e2d9e1624bf60d79b3beb12f2 [v3.19]
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Nov 20 09:26:30 2014 +0100

    drm/i915: Pin tiled objects for L-shaped configs

*** This bug has been marked as a duplicate of bug 28813 ***
Comment 2 Fab Stz 2015-05-28 17:25:29 UTC
Created attachment 116121 [details]
Xorg log
Comment 3 Fab Stz 2015-05-28 17:27:01 UTC
> > My device is as follows, which is apparently Gen3
> >  - 00:02.0 VGA compatible controller: Intel Corporation 82946GZ/GL
> > Integrated Graphics Controller (rev 02)
> >  - 00:02.0 0300: 8086:2972 (rev 02)
> >
> > Since your patch is designed for Gen4 could that explain that I'm still
> > facing the issue ?
> 
> No, only that I guessed incorrectly you had gen4 given the tiling artifacts.
> 
> > Should I reopen this one, or reopen 90725 ?
> 
> Hmm, reopen bug 90725 and this time add your Xorg.0.log! Can you also please
> use xf86-video-intel.git to rule out one gen3 swizzling bug in the process.

log attached.

I also installed driver v2.99.917.
What do you mean by "use xf86-video-intel.git"? Should I take the very latest version, or is 2.99.917 enough?
Comment 4 Chris Wilson 2015-05-28 22:25:30 UTC
I mean the latest from http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/ as there is one swizzling bug that has been fixed since 2.99.917 that would affect 4.0 and could be affecting you.

Interesting, it is a gen4 device! One of the bizarre gen3 chipset + gen4 GPU combinations...
Comment 5 Fab Stz 2015-05-29 16:26:40 UTC
Linux 4.0.2 + xf86-video-intel (latest git fb1643f0f904eb258da71cd0b8deb8d3ec6dafed) is worse than Linux 4.0.2 + xf86-video-intel 2.99.917

With the latest git master, even the kdm logon screen has the display issues (at boot up, and while no swap space is used yet), while with 2.99.917, the issue is triggered only when the system starts using swap space

I attach the new Xorg.0.log with the latest master from git. (version displayed in the log is 2.99.917 though...)
Comment 6 Fab Stz 2015-05-29 16:27:25 UTC
Created attachment 116154 [details]
Xorg.0.log with latest git master : fb1643f0f904eb258da71cd0b8deb8d3ec6dafed
Comment 7 Chris Wilson 2015-05-29 17:50:37 UTC
Ok, that's worse! Can you please send me an Xorg.0.log with xf86-video-intel compiled with ./configure --enable-debug=full (it will be huge, so just the part from logging into the system will be enough).
Comment 8 Fab Stz 2015-05-29 18:14:39 UTC
Created attachment 116157 [details]
Xorg log with enable-debug=full

I didn't log into my session but just started kdm
Comment 9 Chris Wilson 2015-05-29 19:25:21 UTC
Hmm, no swizzling. That wasn't what I was expecting, can you please cat /sys/kernel/debug/dri/0/i915_swizzle_info to confirm?
Comment 10 Fab Stz 2015-05-29 19:32:41 UTC
# cat /sys/kernel/debug/dri/0/i915_swizzle_info

bit6 swizzle for X-tiling = none
bit6 swizzle for Y-tiling = none
DDC = 0x00200010
DDC2 = 0x00200020
C0DRB3 = 0x0020
C1DRB3 = 0x0010
Comment 11 Chris Wilson 2015-05-29 19:37:06 UTC
Ok, swizzle path looks most suspicious here. All the rendering is done as an XCopyArea from a ShmPixmap which basically results in manual detiling through WC mmap.

Can you please test which of

diff --git a/src/sna/kgem.c b/src/sna/kgem.c
index 463f65f..853da31 100644
--- a/src/sna/kgem.c
+++ b/src/sna/kgem.c
@@ -86,7 +86,7 @@ search_snoop_cache(struct kgem *kgem, unsigned int num_pages, unsigned flags);
 #define DBG_NO_WC_MMAP 0
 #define DBG_NO_BLT_Y 0
 #define DBG_NO_SCANOUT_Y 0
-#define DBG_NO_DETILING 0
+#define DBG_NO_DETILING 1
 #define DBG_DUMP 0
 #define DBG_NO_MALLOC_CACHE 0

or

diff --git a/src/sna/kgem.c b/src/sna/kgem.c
index 463f65f..251a299 100644
--- a/src/sna/kgem.c
+++ b/src/sna/kgem.c
@@ -83,7 +83,7 @@ search_snoop_cache(struct kgem *kgem, unsigned int num_pages, unsigned flags);
 #define DBG_NO_FAST_RELOC 0
 #define DBG_NO_HANDLE_LUT 0
 #define DBG_NO_WT 0
-#define DBG_NO_WC_MMAP 0
+#define DBG_NO_WC_MMAP 1
 #define DBG_NO_BLT_Y 0
 #define DBG_NO_SCANOUT_Y 0
 #define DBG_NO_DETILING 0

fixes the issue. I think either of them will fix the kdm corruption, so they require a little soak testing (pretty much try and reproduce the earlier swapping -> corruption issue).
Comment 12 Chris Wilson 2015-05-29 19:46:48 UTC
Ok, I think the swizzle detection here is bogus...

kernel:
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index af23a5a7b870..b90eb4812c6b 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -215,8 +215,8 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
                 * the minimum size of a rank.
                 */
                if (I915_READ16(C0DRB3) != I915_READ16(C1DRB3)) {
-                       swizzle_x = I915_BIT_6_SWIZZLE_NONE;
-                       swizzle_y = I915_BIT_6_SWIZZLE_NONE;
+                       swizzle_x = I915_BIT_6_SWIZZLE_UNKNOWN;
+                       swizzle_y = I915_BIT_6_SWIZZLE_UNKNOWN;
                } else {
                        swizzle_x = I915_BIT_6_SWIZZLE_9_10;
                        swizzle_y = I915_BIT_6_SWIZZLE_9;

Danvet, do you still have the source for the magic ddc2 bit? By any chance do you also have the desktop variant of the spec?
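
For anyone following the register dump in comment 10, the comparison that decides between the two branches in the patch above can be looked at in isolation. This is only a minimal sketch, not the kernel code: `banks_uneven` is a hypothetical helper, and its two arguments stand in for the values the driver obtains with I915_READ16(C0DRB3) and I915_READ16(C1DRB3).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the arguments stand in for the 16-bit values
 * the driver reads from the C0DRB3 and C1DRB3 registers. */
static bool banks_uneven(uint16_t c0drb3, uint16_t c1drb3)
{
	/* Differing values mean the two memory channels are populated
	 * unevenly, which is the case the detection code special-cases. */
	return c0drb3 != c1drb3;
}
```

With the values reported in comment 10 (C0DRB3 = 0x0020, C1DRB3 = 0x0010) the helper returns true, so this machine takes the first branch.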
Comment 13 Fab Stz 2015-05-29 20:06:13 UTC
Both fix the kdm issue, but neither of them fixes the chunks when starting to use swap space.

BTW, in case it helps: most of the time a refresh or resize of the app's window fixes the corruption, but some elements are never redrawn, so the corruption persists there.

Should I also apply the kernel patch and relaunch cat /sys/kernel/debug/dri/0/i915_swizzle_info ?
Comment 14 Chris Wilson 2015-05-29 20:45:22 UTC
The kernel patch will fix the kdm corruption (should at least). However, to fix the swap related corruption we need to port the l-shape tiling quirk to your machine. As a quick test of that hypothesis:

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_
index cfd85f4..ce36ebb 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -225,6 +225,7 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
 
        dev_priv->mm.bit_6_swizzle_x = swizzle_x;
        dev_priv->mm.bit_6_swizzle_y = swizzle_y;
+       dev_priv->quirks |= QUIRK_PIN_SWIZZLED_PAGES;
 }
 
 /* Check pitch constriants for all chips & tiling formats */
Comment 15 Fab Stz 2015-05-29 22:42:20 UTC
* xf86-video-intel git master + kernel patch -> error while launching X11 (see Xorg.log error while launching x11)

* xf86-video-intel patched with comment 11  + kernel patch (comment 12 & 14) -> I have other display issues

* output of cat /sys/kernel/debug/dri/0/i915_swizzle_info with kernel patches (comment 12 & 14) :
bit6 swizzle for X-tiling = unknown
bit6 swizzle for Y-tiling = unknown
DDC = 0x00200010
DDC2 = 0x00200020
C0DRB3 = 0x0020
C1DRB3 = 0x0010
L-shaped memory detected
Comment 16 Fab Stz 2015-05-29 22:42:59 UTC
Created attachment 116161 [details]
Xorg log (see comment 15)
Comment 17 Chris Wilson 2015-05-30 08:22:47 UTC
(In reply to Fab Stz from comment #16)
> Created attachment 116161 [details]
> Xorg log (see comment 15)

Can you put a breakpoint on FatalError and get the symbolic backtrace? That seems an odd repercussion from marking swizzling as unknown - so I probably have a logic error before the assertion.
Comment 18 Fab Stz 2015-05-30 09:15:59 UTC
That's beyond my skills. Is there any other way I can help or do you have a short tutorial for me on that ?
Comment 19 Chris Wilson 2015-05-30 11:21:20 UTC
Can you please try instead:
$ addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x2b3af 0x3eee4 0x4d5f9 0x4e667 0xd2250 0x920a6 0xce3af
Comment 20 Fab Stz 2015-05-30 14:35:55 UTC
(In reply to Chris Wilson from comment #19)
> Can you please try instead:
> $ addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x2b3af 0x3eee4
> 0x4d5f9 0x4e667 0xd2250 0x920a6 0xce3af

/src/sna/kgem.c:292 (discriminator 1)
/src/sna/kgem.c:5720
/src/sna/sna_accel.c:1053 (discriminator 4)
/src/sna/sna_accel.c:1330
/src/sna/sna_glyphs.c:237
/src/sna/sna_accel.c:18089
/src/sna/sna_driver.c:242


BTW, I tried making a backtrace (I found an easy tutorial) but I wasn't able to switch back to console to create the backtrace in gdb. (even though I launched with "X -keeptty"). The screen seems locked. But pressing the button of the PC shuts it down.
Comment 21 Chris Wilson 2015-05-31 07:55:36 UTC
Oh, the kernel doesn't like setting swizzle unknown:

 /* If we can't handle the swizzling, make it untiled. */
 if (args->swizzle_mode == I915_BIT_6_SWIZZLE_UNKNOWN) {
      args->tiling_mode = I915_TILING_NONE;
      args->swizzle_mode = I915_BIT_6_SWIZZLE_NONE;
       args->stride = 0;
 }

and of course doesn't flag an error to the user. So it is just a broken API.
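
To make the consequence concrete, here is a minimal, self-contained sketch of that fallback as seen from the caller. Everything in it is a stand-in: the enums, `struct set_tiling_args`, `set_tiling` and `tiling_was_honoured` are hypothetical names for illustration and are not the real ioctl interface. The only behaviour taken from the snippet above is that an unknown swizzle silently turns the request into an untiled one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the uapi constants; the values are illustrative. */
enum { TILING_NONE, TILING_X, TILING_Y };
enum { SWIZZLE_NONE, SWIZZLE_9_10, SWIZZLE_UNKNOWN };

/* Hypothetical, trimmed-down version of the ioctl argument. */
struct set_tiling_args {
	uint32_t tiling_mode;
	uint32_t swizzle_mode;
	uint32_t stride;
};

/* Mirrors the fallback quoted above: the call reports success (0)
 * even when the request has been downgraded to untiled. */
static int set_tiling(struct set_tiling_args *args, uint32_t detected_swizzle)
{
	args->swizzle_mode = (args->tiling_mode == TILING_NONE)
				? SWIZZLE_NONE : detected_swizzle;

	if (args->swizzle_mode == SWIZZLE_UNKNOWN) {
		args->tiling_mode = TILING_NONE;
		args->swizzle_mode = SWIZZLE_NONE;
		args->stride = 0;
	}
	return 0;
}

/* The only way for a caller to notice: compare the result with the request. */
static bool tiling_was_honoured(uint32_t requested,
				const struct set_tiling_args *out)
{
	return out->tiling_mode == requested;
}
```

A caller that asks for X tiling and only checks the return value would carry on as if the buffer were tiled, which would be consistent with the assertion seen in comment 15.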
Comment 22 Chris Wilson 2015-05-31 08:03:28 UTC
So, setting swizzle unknown is not a good test. Probably, something like

diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index af23a5a7b870..a192a9e83aa8 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -214,13 +214,10 @@ i915_gem_detect_bit_6_swizzle(struct drm_device *dev)
                 * the table above, or from the 1-ch value being less than
                 * the minimum size of a rank.
                 */
-               if (I915_READ16(C0DRB3) != I915_READ16(C1DRB3)) {
-                       swizzle_x = I915_BIT_6_SWIZZLE_NONE;
-                       swizzle_y = I915_BIT_6_SWIZZLE_NONE;
-               } else {
-                       swizzle_x = I915_BIT_6_SWIZZLE_9_10;
-                       swizzle_y = I915_BIT_6_SWIZZLE_9;
-               }
+               swizzle_x = I915_BIT_6_SWIZZLE_9_10;
+               swizzle_y = I915_BIT_6_SWIZZLE_9;
+               if (I915_READ16(C0DRB3) != I915_READ16(C1DRB3))
+                       dev_priv->quirks |= QUIRK_PIN_SWIZZLED_PAGES;
        }
 
        dev_priv->mm.bit_6_swizzle_x = swizzle_x;

is better for testing.
Comment 23 Fab Stz 2015-05-31 08:55:12 UTC
(In reply to Chris Wilson from comment #22)
> So, setting swizzle unknown is not a good test. Probably, something like
> is better for testing.


# cat /sys/kernel/debug/dri/0/i915_swizzle_info                                                                
bit6 swizzle for X-tiling = bit9/bit10
bit6 swizzle for Y-tiling = bit9
DDC = 0x00200010
DDC2 = 0x00200020
C0DRB3 = 0x0020
C1DRB3 = 0x0010
L-shaped memory detected


(1)* xf86 git + kernel patch (comment 22 only) -> kdm screen is ugly (I didn't test further)

(2)* xf86 git with patch for wc_mmap (comment 11) + kernel patch (comment 22 only) -> kdm ok, swapping ok, but at some point I still get display issues, I guess with a program whose memory was put into swap space

(3)* xf86 git with patch for detiling (comment 11) + kernel patch (comment 22 only) -> kdm ok, swapping ok, no problem faced at any time

I'll stay with (3) for some time to see how it behaves
Comment 24 Chris Wilson 2015-06-01 08:33:56 UTC
(In reply to Fab Stz from comment #23)
> (In reply to Chris Wilson from comment #22)
> > So, setting swizzle unknown is not a good test. Probably, something like
> > is better for testing.
> 
> 
> # cat /sys/kernel/debug/dri/0/i915_swizzle_info                             
> 
> bit6 swizzle for X-tiling = bit9/bit10
> bit6 swizzle for Y-tiling = bit9
> DDC = 0x00200010
> DDC2 = 0x00200020
> C0DRB3 = 0x0020
> C1DRB3 = 0x0010
> L-shaped memory detected
> 
> 
> (1)* xf86 git + kernel patch (comment 22 only) -> kdm screen is ugly (I
> didn't test further)

That implies the swizzle bits are still wrong. Once you have done a soak test for (3), it would be interesting to try

swizzle_x = I915_BIT_6_SWIZZLE_NONE;
then 
I915_BIT_6_SWIZZLE_9_10
I915_BIT_6_SWIZZLE_9_10_11

(I can only hope we don't have bit17 mixed in, but for completeness:
I915_BIT_6_SWIZZLE_9_10_17)

and see if any of those fix the display of kdm.

The important observation from 3 is that swapping is suspect. This either means we have the L-shaped issue with varying swizzling depending upon memory address, or we have bit17 swizzling. I am beginning to suspect the latter.
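
As background on why swapping is the tell-tale sign, here is a simplified model of bit-6 swizzling. It is a sketch under assumptions, not driver code: `swizzle_9_10` and `swizzle_9_10_17` are hypothetical helper names, and the model only covers the XOR that the mode names describe (bit 6 of the address is flipped by bits 9 and 10, and in the bit17 variant also by bit 17 of the physical address).

```c
#include <stdint.h>

/* 9/10 swizzling: bit 6 of the offset is XORed with bits 9 and 10
 * of the same offset. */
static uint32_t swizzle_9_10(uint32_t offset)
{
	uint32_t bit = ((offset >> 9) ^ (offset >> 10)) & 1;

	return offset ^ (bit << 6);
}

/* 9/10/17 variant: bit 17 of the page's physical address joins the XOR,
 * so the result depends on where the page currently lives in RAM. */
static uint32_t swizzle_9_10_17(uint32_t offset, uint64_t phys_addr)
{
	uint32_t bit17 = (uint32_t)(phys_addr >> 17) & 1;

	return swizzle_9_10(offset) ^ (bit17 << 6);
}
```

In this model, offset 0x200 maps to 0x240 while the page sits at a physical address with bit 17 clear, and to 0x200 once the same page comes back from swap at an address with bit 17 set. The data itself is unchanged, but 64-byte chunks appear exchanged, which matches corruption that only shows up after swapping.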
Comment 25 Fab Stz 2015-06-01 18:12:19 UTC
(In reply to Chris Wilson from comment #24)
> That implies the swizzle bits are still wrong. Once you have done a soak
> test for (3), it would be interesting to try
> 
> swizzle_x = I915_BIT_6_SWIZZLE_NONE;
> then 
> I915_BIT_6_SWIZZLE_9_10
> I915_BIT_6_SWIZZLE_9_10_11
> 
> (I can only hope we don't have bit17 mixed in, but for completeness:
> I915_BIT_6_SWIZZLE_9_10_17)
> 
> and see if any of those fix the display of kdm.


I didn't face any issue with (3) yet.

Results of the other tests:
(4)* intel_drv git + I915_BIT_6_SWIZZLE_NONE -> kdm first displays, and after logout kdm has display problems
(5)* intel_drv git + I915_BIT_6_SWIZZLE_9_10 (ie. same as (1) above) -> kdm is ugly
(6)* intel_drv git + I915_BIT_6_SWIZZLE_9_10_11  -> kdm is ugly
(7)* intel_drv git + I915_BIT_6_SWIZZLE_9_10_17 -> kdm is ok, I login, and after some time X crashes (see attachment)
(8)* intel_drv git with patch for detiling + I915_BIT_6_SWIZZLE_9_10_17 -> kdm is ok, no crash yet, seems fine

Shall I test (7) also with patch for wc_mmap ?
Comment 26 Fab Stz 2015-06-01 18:17:33 UTC
Log for (7) is way too big. The error in the file is 

[   377.715] (EE)
[   377.715] (EE) Backtrace:
[   377.716] (EE) 0: /usr/bin/X (xorg_backtrace+0x56) [0x7fe5e61e7d46]
[   377.716] (EE) 1: /usr/bin/X (0x7fe5e6031000+0x1baf29) [0x7fe5e61ebf29]
[   377.716] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6 (0x7fe5e3d25000+0x35180) [0x7fe5e3d5a180]
[   377.716] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe5dffcb000+0xd6e4a) [0x7fe5e00a1e4a]
[   377.716] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe5dffcb000+0xd6fb3) [0x7fe5e00a1fb3]
[   377.716] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe5dffcb000+0xd74e3) [0x7fe5e00a24e3]
[   377.716] (EE) 6: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fe5dffcb000+0xd8925) [0x7fe5e00a3925]
[   377.716] (EE) 7: /usr/bin/X (0x7fe5e6031000+0x13d643) [0x7fe5e616e643]
[   377.716] (EE) 8: /usr/bin/X (0x7fe5e6031000+0x133587) [0x7fe5e6164587]
[   377.716] (EE) 9: /usr/bin/X (0x7fe5e6031000+0x573f7) [0x7fe5e60883f7]
[   377.716] (EE) 10: /usr/bin/X (0x7fe5e6031000+0x5b596) [0x7fe5e608c596]
[   377.716] (EE) 11: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf5) [0x7fe5e3d46b45]
[   377.716] (EE) 12: /usr/bin/X (0x7fe5e6031000+0x4590e) [0x7fe5e607690e]
[   377.716] (EE)
[   377.716] (EE) Segmentation fault at address 0x9
[   377.716] (EE)
Fatal server error:
[   377.716] (EE) Caught signal 11 (Segmentation fault). Server aborting
[   377.716] (EE)
[   377.716] (EE)
Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
[   377.716] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[   377.717] (EE)
[   377.717] (II) AIGLX: Suspending AIGLX clients for VT switch
[   377.717] sna_leave_vt
[   377.717] sna_accel_leave
[   377.717] sna_mode_reset
[   377.717] sna_disable_cursors
[   377.728] __sna_crtc_disable: releasing handle=9 from scanout, active=0
[   377.755] (EE) Server terminated with error (1). Closing log file.
Comment 27 Chris Wilson 2015-06-01 19:44:56 UTC
(In reply to Fab Stz from comment #26)
> Log for (7) is way too big. 

That would be really useful to have, say, the last 1500 lines.

> The error in the file is 
> 
> [   377.715] (EE)
> [   377.715] (EE) Backtrace:
> [   377.716] (EE) 0: /usr/bin/X (xorg_backtrace+0x56) [0x7fe5e61e7d46]
> [   377.716] (EE) 1: /usr/bin/X (0x7fe5e6031000+0x1baf29) [0x7fe5e61ebf29]
> [   377.716] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6
> (0x7fe5e3d25000+0x35180) [0x7fe5e3d5a180]
> [   377.716] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so
> (0x7fe5dffcb000+0xd6e4a) [0x7fe5e00a1e4a]

Can you please run addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0xd6e4a 0xd6fb3 0xd74e3 0xd8925

Though I expect the log to be more informative in this case - the crash is likely to occur much later than the actual error.
Comment 28 Chris Wilson 2015-06-01 19:49:02 UTC
(In reply to Fab Stz from comment #25)
> (7)* intel_drv git + I915_BIT_6_SWIZZLE_9_10_17 -> kdm is ok, I login, and
> after some time X crashes (see attachment)
> (8)* intel_drv git with patch for detiling + I915_BIT_6_SWIZZLE_9_10_17 ->
> kdm is ok, no crash yet, seems fine
> 
> Shall I test (7) also with patch for wc_mmap ?

No. That pair conclusively says the issue is bit17 swizzling, and not L-shaped swizzling. The swizzle detection routine is completely bogus for your machine.
Comment 29 Fab Stz 2015-06-01 20:02:42 UTC
(In reply to Chris Wilson from comment #27)
> (In reply to Fab Stz from comment #26)
> > Log for (7) is way too big. 
> 
> That would be really useful to have, say, the last 1500 lines.
> 
> > The error in the file is 
> > 
> > [   377.715] (EE)
> > [   377.715] (EE) Backtrace:
> > [   377.716] (EE) 0: /usr/bin/X (xorg_backtrace+0x56) [0x7fe5e61e7d46]
> > [   377.716] (EE) 1: /usr/bin/X (0x7fe5e6031000+0x1baf29) [0x7fe5e61ebf29]
> > [   377.716] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6
> > (0x7fe5e3d25000+0x35180) [0x7fe5e3d5a180]
> > [   377.716] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so
> > (0x7fe5dffcb000+0xd6e4a) [0x7fe5e00a1e4a]
> 
> Can you please run addr2line -i -e
> /usr/lib/xorg/modules/drivers/intel_drv.so 0xd6e4a 0xd6fb3 0xd74e3 0xd8925
> 
> Though I expect the log to be more informative in this case - the crash is
> likely to occur much later than the actual error.

Strange... Is it normal that addr2line returns this (git version of the driver) ?
addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0xd6e4a 0xd6fb3 0xd74e3 0xd8925
??:0
??:0
??:0
??:0
Comment 30 Fab Stz 2015-06-01 20:03:52 UTC
Created attachment 116220 [details]
Log for case (7)

(In reply to Chris Wilson from comment #27)
> (In reply to Fab Stz from comment #26)
> > Log for (7) is way too big. 
> 
> That would be really useful to have, say, the last 1500 lines.
Comment 31 Fab Stz 2015-06-01 20:23:14 UTC
I reproduced the crash for (7), I noticed these additional errors in the log that I didn't have in the previous log

[  3022.724] (EE) intel(0): sna_mode_check: invalid state found on pipe 0, disabling CRTC:20
[  3041.861] (EE) intel(0): sna_mode_check: invalid state found on pipe 0, disabling CRTC:20
[  3056.676] (EE) intel(0): sna_mode_check: invalid state found on pipe 0, disabling CRTC:20


This time the log ends with 
[  3079.772] (EE) Backtrace:
[  3079.772] (EE) 0: /usr/bin/X (xorg_backtrace+0x56) [0x7f0a4aa6dd46]
[  3079.773] (EE) 1: /usr/bin/X (0x7f0a4a8b7000+0x1baf29) [0x7f0a4aa71f29]
[  3079.773] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6 (0x7f0a485ab000+0x35180) [0x7f0a485e0180]
[  3079.773] (EE) 3: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f0a44851000+0xd6e4a) [0x7f0a44927e4a]
[  3079.773] (EE) 4: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f0a44851000+0xd6fb3) [0x7f0a44927fb3]
[  3079.773] (EE) 5: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7f0a44851000+0xd8790) [0x7f0a44929790]
[  3079.773] (EE) 6: /usr/bin/X (0x7f0a4a8b7000+0x13d643) [0x7f0a4a9f4643]
[  3079.773] (EE) 7: /usr/bin/X (0x7f0a4a8b7000+0x133587) [0x7f0a4a9ea587]
[  3079.773] (EE) 8: /usr/bin/X (0x7f0a4a8b7000+0x573f7) [0x7f0a4a90e3f7]
[  3079.773] (EE) 9: /usr/bin/X (0x7f0a4a8b7000+0x5b596) [0x7f0a4a912596]
[  3079.773] (EE) 10: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf5) [0x7f0a485ccb45]
[  3079.773] (EE) 11: /usr/bin/X (0x7f0a4a8b7000+0x4590e) [0x7f0a4a8fc90e]
[  3079.773] (EE) 
[  3079.773] (EE) Segmentation fault at address 0x9
[  3079.773] (EE) 
[  3079.773] (EE) Caught signal 11 (Segmentation fault). Server aborting

I ran 
addr2line -i -e /usr/lib/xorg/modules/drivers/intel_drv.so 0xd6e4a 0xd6fb3 0xd8790

And got the same output :
??:0
??:0
??:0
Comment 32 Fab Stz 2015-06-02 07:28:14 UTC
The issues where kde suddenly logs out, or X "crashes" (7), seem linked to "enable-debug=full". If I don't compile with it, it doesn't happen...

It also happens with (8) when I use enable-debug=full.
Comment 33 Chris Wilson 2015-06-02 08:16:17 UTC
(In reply to Fab Stz from comment #32)
> The issues where kde suddenly logs out, or X "crashes" (7), seem linked to
> "enable-debug=full". If I don't compile with it, it doesn't happen...
> 
> It also happens with (8) when I use enable-debug=full.

commit 2fa48450c79a27cdd923faa690e5e8e772ae4dad
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 2 09:15:09 2015 +0100

    sna: Avoid using NULL pointer inside DBG
    
    When pretty printing the format for DBG, make sure it is not NULL!
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90725#c32
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 34 Fab Stz 2015-06-02 09:33:25 UTC
(In reply to Chris Wilson from comment #33)
> commit 2fa48450c79a27cdd923faa690e5e8e772ae4dad
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Jun 2 09:15:09 2015 +0100
> 
>     sna: Avoid using NULL pointer inside DBG
>     
>     When pretty printing the format for DBG, make sure it is not NULL!
>     
>     Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90725#c32
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

It doesn't fix the crash issue.
Comment 35 Chris Wilson 2015-06-02 09:55:42 UTC
Same last messages in the log? I'm having to guess which one it is dying on...
Comment 36 Fab Stz 2015-06-02 10:18:46 UTC
Created attachment 116236 [details]
Log in reply to comment 35

BTW, addr2line still returns ??:0
I also noticed that during kde startup, the screen usually flickers at a given moment. It is at that moment that X crashes.
Comment 37 Fab Stz 2015-06-02 10:21:41 UTC
Created attachment 116237 [details]
Display issue when not using debug=full

When compiling git without debug=full then I have other display issues, see attachment (I still have i915.ko with I915_BIT_6_SWIZZLE_9_10_17)
Comment 38 Fab Stz 2015-06-02 10:24:17 UTC
And when using i915.ko with I915_BIT_6_SWIZZLE_9_10_17 + intel_drv 2.99.917 or 2.21.15 there is no display issue either (not compiled with debug=full)
Comment 39 Chris Wilson 2015-06-02 10:33:08 UTC
(In reply to Fab Stz from comment #37)
> Created attachment 116237 [details]
> Display issue when not using debug=full
> 
> When compiling git without debug=full then I have other display issues, see
> attachment (I still have i915.ko with I915_BIT_6_SWIZZLE_9_10_17)

Hmm, that is more disturbing. But the debug log is complaining about a glitch...

In the meantime, can I please ask you to run a git bisection on xf86-video-intel:

$ cd xf86-video-intel
$ git bisect start
$ git bisect bad
$ git bisect good 2.99.917

This will checkout a version between good/bad for you to build, install and test.

Then say "git bisect good" or "bad" based on whether you see the corruption. Rinse and repeat, until it proclaims the bad commit.
Comment 40 Chris Wilson 2015-06-02 10:36:24 UTC
Hmm, we are also triggering shadow scanout. Can you please attach both the first 1500 lines and last 1500 lines?

Something like:

log=/var/log/Xorg.0.log.old; (head -1500 $log; echo ...; tail -1500 $log) | gzip -c9 > Xorg.0.log.gz
Comment 41 Fab Stz 2015-06-02 10:44:55 UTC
Created attachment 116239 [details]
Log in reply to comment 40
Comment 42 Chris Wilson 2015-06-02 10:53:15 UTC
I think I know why you hit the full-debug assertion, and that should be fixed by:

commit dbfbbcb4b37548172fd6fe9a6976e5ec310477ca
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 2 11:50:40 2015 +0100

    sna: Mark GPU as wholly damage when replacing a drawable
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=90725#c37
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

There is a good chance that is the display corruption, but there is an equally good chance that there are further bugs...
Comment 43 Chris Wilson 2015-06-02 10:58:53 UTC
commit b0aa9d349ddf727dc544bc46d066f990d3e42776
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 2 11:57:18 2015 +0100

    sna: Reorder can-fence test to account for bit17 swizzling
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=90725#c40
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

will take care of the unwanted shadow scanout.
Comment 44 Fab Stz 2015-06-02 12:42:52 UTC
(In reply to Chris Wilson from comment #39)
> In the meantime, can I please ask you to run a git bisection on
> xf86-video-intel:

I always had bad displays. Result of git bisection :

0d42b0ed25d4112e0b3e3218e5c42947bbeb9e27 is the first bad commit
commit 0d42b0ed25d4112e0b3e3218e5c42947bbeb9e27
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 24 08:12:52 2014 +0000

    sna/gen7: Limit threads on HSW GT1

    For whatever reason, it seems that for HSW GT1 we cannot specify the
    maximum value of the field and leave it to the hardware to clamp the
    value to the maximum supported. The impact should be zero other than the
    possibilty it workarounds an issue if the hardware doesn't apply the
    limit.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=87564
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

:040000 040000 ea65d57f3c6f2e8ce48d2dc3f20dda9c0c9e1147 feb8dd1f98bc4a14d9fddcca3be4272b9fdc211a M      src


Latest git containing these two commits from Comment 42 and Comment 43
> commit b0aa9d349ddf727dc544bc46d066f990d3e42776
> commit dbfbbcb4b37548172fd6fe9a6976e5ec310477ca
fixed both issues (the crashing issue and the display issue of attachment 116237 [details] )

So I can say I didn't face any issue yet with kernel I915_BIT_6_SWIZZLE_9_10_17 + git as of today 11am GMT whether I use --enable-debug=full or don't set debug=full
Comment 45 Mads Villadsen 2015-06-05 16:50:40 UTC
I am seeing the same behaviour on a Dell Latitude 1525 laptop.

00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 0c)

xorg-x11-drv-intel-2.99.917-10.20150526.fc22.x86_64

cat /sys/kernel/debug/dri/0/i915_swizzle_info

bit6 swizzle for X-tiling = bit9/bit10/bit11
bit6 swizzle for Y-tiling = bit9/bit11
DDC = 0x000f0002
DDC2 = 0x00000000
C0DRB3 = 0x0000
C1DRB3 = 0x0000
L-shaped memory detected
Comment 46 Chris Wilson 2015-06-05 16:54:35 UTC
(In reply to Mads Villadsen from comment #45)
> I am seeing the same behaviour on a Dell Latitude 1525 laptop.
> 
> 00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960
> Integrated Graphics Controller (primary) (rev 0c)

I actually expect that to be a different bug and not unknown bit17 swizzling (since we do have accurate detection for i965gm).

Can you please file this separately? We can remerge later after testing if required.
Comment 47 Fab Stz 2015-06-12 13:13:21 UTC
The status of this bug is "needinfo". I didn't face any issue in the meantime. Are you still expecting some feedback from me? Or are you waiting for someone else's response to patch the kernel (see comment 12)?
Comment 48 Chris Wilson 2015-06-12 13:18:55 UTC
Nope, just forgetting to unflag it. In the chipset specs, there is no description of the swizzling or the appropriate detection. Bleh.
Comment 49 Fab Stz 2015-06-17 05:49:12 UTC
Created attachment 116545 [details]
Another display issue

In Iceweasel / Firefox.
It disappears when the window is redrawn.
Before I took the screenshot, the 2 gray lines above & below the "Fichier" menu were actually a single gray rectangle that extended below the URL bar.
Comment 50 Fab Stz 2015-06-17 11:03:12 UTC
Created attachment 116555 [details]
Another display issue (2)
Comment 51 Chris Wilson 2015-11-23 10:05:19 UTC
Should be all fixed up now:

commit a53f2afb7e3dfc2c7acbb0c015b44783d99d8119
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 19 09:58:05 2015 +0000

    drm/i915: Mark uneven memory banks on gen4 desktop as unknown swizzling
    
    We have varied reports of swizzling corruption on gen4 desktop, and
    confirmation that one at least is triggered by uneven memory banks
    (L-shaped memory). The implication is that the swizzling varies between
    the paired channels and the remainder of memory on the single channel. As
    the object then has unpredictable swizzling (it will vary depending on
    exact page allocation and may even change during the object's lifetime as
    the pages are replaced), we have to report to userspace that the swizzling
    is unknown.
    
    However, some existing userspace is buggy when it meets an unknown
    swizzling configuration and so we need to tell another white lie and
    mark the swizzling as NONE but report it as UNKNOWN through the extended
    get-tiling-ioctl. See
    
    commit 5eb3e5a5e11d14f9deb2a4b83555443b69ab9940
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Sun Jun 28 09:19:26 2015 +0100
    
        drm/i915: Declare the swizzling unknown for L-shaped configurations
    
    for the previous example where we found that telling the truth to
    userspace just ends up in a world of hurt.
    
    Also since we don't truly know what the swizzling is on the pages, we
    need to keep them pinned to prevent swapping as the reports also
    suggest that some gen4 devices have previously undetected bit17
    swizzling.
    
    v2: Combine unknown + quirk patches to prevent userspace ever seeing
    unknown swizzling through the normal get-tiling-ioctl. Also use the same
    path for the existing uneven bank detection for mobile gen4.
    
    Reported-by: Matti Hämäläinen <ccr@tnsp.org>
    Tested-by: Matti Hämäläinen <ccr@tnsp.org>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=90725
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matti Hämäläinen <ccr@tnsp.org>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Jani Nikula <jani.nikula@intel.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: http://patchwork.freedesktop.org/patch/msgid/1447927085-31726-1-git-send-email-chris@chris-wilson.co.uk
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
