Bug 88112 - [865g 3.16 regression] Desktop image is distorted
Summary: [865g 3.16 regression] Desktop image is distorted
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-06 14:21 UTC by Eugene
Modified: 2017-07-24 22:49 UTC
CC List: 2 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg (44.26 KB, text/plain), 2015-01-06 14:22 UTC, Eugene
glxinfo (30.81 KB, text/plain), 2015-01-06 14:23 UTC, Eugene
Xorg.0.log (19.29 KB, text/plain), 2015-01-06 14:23 UTC, Eugene
Distorted desktop image screenshot (356.00 KB, image/png), 2015-01-06 14:28 UTC, Eugene
/sys/class/drm/card0/error (697.64 KB, text/plain), 2015-01-09 01:12 UTC, Eugene
Xorg.0.log (xz compressed) (130.13 KB, text/plain), 2015-01-09 15:55 UTC, Eugene
Xorg.0.log with backtrace (255.59 KB, text/plain), 2015-01-09 16:15 UTC, Eugene
Xorg.0.log.old with backtrace2 (148.50 KB, text/plain), 2015-01-09 17:54 UTC, Eugene
Xorg.0.log.old (no backtrace but errors) (1.27 MB, text/plain), 2015-01-09 19:09 UTC, Eugene
desktop screen (through teamviewer) (433.12 KB, image/png), 2015-01-09 19:45 UTC, Eugene
Xorg.0.log.old (still errors; no backtrace) (1.30 MB, text/plain), 2015-01-09 20:33 UTC, Eugene
Screenshot of an image (good) (404.79 KB, image/png), 2015-01-09 20:35 UTC, Eugene

Description Eugene 2015-01-06 14:21:37 UTC
With the latest MESA 10.5git my desktop screen is distorted: http://www.zimagez.com/zimage/screenshot-181214-163540.php
It occurs on Linux kernels 3.16 through 3.19 (including drm-intel-nightly). With kernel 3.15 and earlier it looks fine.

Xubuntu 14.04 LTS
Linux: 3.19.0-994 (drm-intel-nightly)
MESA 10.5.0-devel
xserver-xorg-video-intel 2.99.917+git1412311932.19a95b

glxinfo errors:

libGL error: failed to create dri screen
libGL error: failed to load driver: i915

Xorg.0.log errors:

intel: Failed to load module "present" (module does not exist, 0)
AIGLX error: Calling driver entry point failed
AIGLX: reverting to software rendering

With the old MESA 10.3 release I'm experiencing a different issue, a GPU hang. Please see this report: https://bugs.freedesktop.org/show_bug.cgi?id=86583
Comment 1 Eugene 2015-01-06 14:22:59 UTC
Created attachment 111851 [details]
dmesg
Comment 2 Eugene 2015-01-06 14:23:26 UTC
Created attachment 111852 [details]
glxinfo
Comment 3 Eugene 2015-01-06 14:23:56 UTC
Created attachment 111853 [details]
Xorg.0.log
Comment 4 Eugene 2015-01-06 14:28:15 UTC
Created attachment 111854 [details]
Distorted desktop image screenshot
Comment 5 Chris Wilson 2015-01-06 14:32:30 UTC
It looks like the desktop is in the right place, but incorrectly rendered. I would have said that was a userspace bug, except that if 3.15 works correctly... Would it be possible to do a bisection of i915.ko between 3.15 and 3.16?
Comment 6 Eugene 2015-01-06 14:46:20 UTC
Update.
There is no distortion with desktop effects turned off; the rendering issue disappears.
Comment 7 Eugene 2015-01-06 14:50:21 UTC
(In reply to Chris Wilson from comment #5)
>Would it be possible to do a bisection of i915.ko between 3.15
> and 3.16?

Unfortunately I don't know how to do that, but if you could explain it step by step I would try.
Comment 8 Daniel Vetter 2015-01-06 16:19:41 UTC
My preferred bisect howto: https://wiki.ubuntu.com/Kernel/KernelBisection (you need to scroll down a bit for bisecting upstream versions).
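
For orientation, here is a minimal sketch of how an upstream bisection limited to the i915 driver could be started. It assumes the current directory is a clone of the mainline kernel tree, where the release tags v3.15 and v3.16 exist; distribution trees may name their tags differently. The names `run`, `bisect_i915` and `DRY_RUN` are hypothetical helpers for illustration, not part of git or of the howto above.

```shell
#!/bin/sh
# Sketch only: start a bisection of i915 changes between two kernel tags.
# Assumption: we are inside a clone of the mainline kernel tree.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bisect_i915() {
    good=$1
    bad=$2
    # "git bisect start <bad> <good> -- <path>" limits the search to
    # commits that touch the given path, here the i915 driver.
    run git bisect start "$bad" "$good" -- drivers/gpu/drm/i915
}

# After building and booting each kernel that git checks out, mark it:
#   git bisect good    # desktop renders correctly
#   git bisect bad     # desktop is distorted
# and when git reports the first bad commit, clean up with:
#   git bisect reset
```

Calling `bisect_i915 v3.15 v3.16` would start the search; setting `DRY_RUN=1` first only prints the git command, which is a cheap way to check it before running it for real.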
Comment 9 Chris Wilson 2015-01-06 16:21:06 UTC
If I had to guess, I would say that it was the ddx screwing up with userptr. Slightly quicker than doing a full bisect, if you download xf86-video-intel:

$ sudo apt-get build-dep xserver-xorg-video-intel
$ git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel

edit line 71 of src/sna/kgem.c to read:
#define DBG_NO_USERPTR 1

$ ./autogen.sh --prefix=/usr
$ make && sudo make install

And retest.
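
Since the line number can drift between checkouts, the same edit can be made by matching the macro name instead. This is a minimal sketch, assuming kgem.c contains a line of the form `#define DBG_NO_USERPTR <value>`; the helper name `set_dbg_no_userptr` is hypothetical.

```shell
#!/bin/sh
# Sketch only: force "#define DBG_NO_USERPTR 1" in the given source file.
# Assumption: the file already has a "#define DBG_NO_USERPTR <value>" line.

set_dbg_no_userptr() {
    file=$1
    tmp=$(mktemp) || return 1
    # Rewrite the define to 1, leaving every other line untouched.
    sed 's/^#define[[:space:]]\{1,\}DBG_NO_USERPTR[[:space:]].*$/#define DBG_NO_USERPTR 1/' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
    # Succeed only if the define is now present with the value 1.
    grep -q '^#define DBG_NO_USERPTR 1$' "$file"
}
```

Run it as `set_dbg_no_userptr src/sna/kgem.c` from the top of the xf86-video-intel checkout before building; a non-zero exit status means the define was not found and the file should be edited by hand.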
Comment 10 Eugene 2015-01-06 20:32:35 UTC
Sorry, but I didn't understand any of that. I only found out that 3.15.10 is the last "good" kernel and 3.16.0 is the first "bad" one. So what should I do?

All I understood is that I should install git. Then I also need to obtain the kernel sources. But from where: from kernel.org, or with the 'git clone git://kernel.ubuntu.com/ubuntu/ubuntu-utopic.git' command?

I did the last one but then:
git log --oneline Ubuntu-3.15.10..Ubuntu-3.16.0
fatal: ambiguous argument 'Ubuntu-3.15.10..Ubuntu-3.16.0': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

P.S. Sorry, I have never done this before.
Comment 11 Eugene 2015-01-09 01:10:21 UTC
Ok, what I've done:

$ sudo apt-get build-dep xserver-xorg-video-intel
$ git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel

edit line 71 of src/sna/kgem.c to read:
#define DBG_NO_USERPTR 1

$ ./autogen.sh --prefix=/usr
$ make && sudo make install

The image became fine after reboot, but the libGL error is still there:

glxinfo | grep render
libGL error: failed to create dri screen
libGL error: failed to load driver: i915
direct rendering: Yes
    GLX_MESA_multithread_makecurrent, GLX_MESA_query_renderer, 
    GLX_MESA_query_renderer, GLX_OML_swap_method, GLX_SGIS_multisample, 
OpenGL renderer string: Gallium 0.4 on llvmpipe (LLVM 3.5, 128 bits)
    GL_ARB_conditional_render_inverted, GL_ARB_conservative_depth, 
    GL_NV_blend_square, GL_NV_conditional_render, GL_NV_depth_clamp,

Also:

dmesg | grep error
[   66.825508] [drm] GPU crash dump saved to /sys/class/drm/card0/error

I think that's the same issue I've already described in: https://bugs.freedesktop.org/show_bug.cgi?id=86583


So, should I try bisect still?

Thanks.
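
Note that "direct rendering: Yes" alone does not show that the GPU is in use: the renderer string above names llvmpipe, which is Mesa's software rasterizer. A minimal sketch of a check for this, reading `glxinfo` output on stdin, follows. The helper name `renderer_kind` is hypothetical, and treating every renderer that is not a known software rasterizer as "hardware" is a simplifying assumption.

```shell
#!/bin/sh
# Sketch only: classify the OpenGL renderer reported by glxinfo.
# Prints "software", "hardware", or "unknown" (no renderer line found).

renderer_kind() {
    line=$(grep '^OpenGL renderer string:' | head -n 1)
    case $line in
        '')
            echo unknown ;;
        *llvmpipe*|*softpipe*|*"Software Rasterizer"*)
            echo software ;;
        *)
            # Assumption: anything else is a hardware driver.
            echo hardware ;;
    esac
}
```

Usage would be `glxinfo 2>/dev/null | renderer_kind`.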
Comment 12 Eugene 2015-01-09 01:12:29 UTC
Created attachment 111984 [details]
/sys/class/drm/card0/error
Comment 13 Chris Wilson 2015-01-09 10:27:04 UTC
(In reply to Eugene from comment #11)
> Ok, what I've done:
> 
> $ sudo apt-get build-dep xserver-xorg-video-intel
> $ git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-intel
> 
> edit line 71 of src/sna/kgem.c to read:
> #define DBG_NO_USERPTR 1
> 
> $ ./autogen.sh --prefix=/usr
> $ make && sudo make install
> 
> The image became fine after reboot
...
> 
> So, should I try bisect still?

It seems clear as to what the trigger is, so no bisect necessary.

Can you reset src/sna/kgem.c (i.e. reenable userptr) and compile with ./configure --enable-debug=full, reproduce and attach the Xorg.0.log (you will need to compress it!).
Comment 14 Chris Wilson 2015-01-09 10:34:58 UTC
(In reply to Eugene from comment #12)
> Created attachment 111984 [details]
> /sys/class/drm/card0/error

That error state is nasty. It implies incoherence with the CS (*ACTHD != IPEHR).
Comment 15 Eugene 2015-01-09 14:21:31 UTC
>Can you reset src/sna/kgem.c (i.e. reenable userptr) and compile with ./configure --enable-debug=full, reproduce and attach the Xorg.0.log (you will need to compress it!).

Yes, I've already done:

make uninstall

and reinstalled the "xserver-xorg-video-intel" package.

So, instead of:

./autogen.sh --prefix=/usr

I should run

./configure --enable-debug=full ?

Or I should run first

./autogen.sh --prefix=/usr

and then:

./configure --enable-debug=full

both?
Comment 16 Chris Wilson 2015-01-09 14:30:17 UTC
Just "./autogen.sh --prefix=/usr --enable-debug=full" will do :) I hope this actually triggers a FatalError catching the error in progress, so watch this space...
Comment 17 Eugene 2015-01-09 15:55:43 UTC
Created attachment 112011 [details]
Xorg.0.log (xz compressed)
Comment 18 Eugene 2015-01-09 16:00:52 UTC
Done, but there seems to be nothing special in Xorg.0.log. If you need any additional info, please ask.
Comment 19 Eugene 2015-01-09 16:13:00 UTC
Wait a minute, it seems I had to reboot that machine to catch the backtrace (you need the backtrace, right?). So it is in the Xorg.0.log.old file in the attachment.
Comment 20 Eugene 2015-01-09 16:15:56 UTC
Created attachment 112013 [details]
Xorg.0.log with backtrace
Comment 21 Chris Wilson 2015-01-09 16:34:04 UTC
Looks like userptr is being used with ShmPixmaps, that's useful to know.

It blew up dereferencing a mmap(wc) pointer. That's new with drm-intel-nightly. Looks like I forgot some sigtrap safe-guards.
Comment 22 Chris Wilson 2015-01-09 16:50:01 UTC
This should handle that failure, thanks:

commit e0463036bbd4e5f0201e122f9b29dd776ba4446f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 9 16:45:32 2015 +0000

    sna: Replace assert with conditional setting of sna_pixmap->mapped
    
    The status of sna_pixmap->mapped was changed with the introduction of
    mmap(wc), but the code was still asserting that the mmap could only be
    cached.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=88112#c20
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Please could you update and retest.
Comment 23 Eugene 2015-01-09 17:08:54 UTC
>sna: Replace assert with conditional setting of sna_pixmap->mapped
>Please could you update and retest.

Please explain what to do: what should I change, and in which file?
Comment 24 Chris Wilson 2015-01-09 17:12:09 UTC
$ cd xf86-video-intel
$ git pull
$ make && sudo make install

Now you are ready for retesting
Comment 25 Eugene 2015-01-09 17:54:31 UTC
Created attachment 112020 [details]
Xorg.0.log.old with backtrace2
Comment 26 Eugene 2015-01-09 17:55:55 UTC
Backtrace again. And now I can't see the desktop at all, just a black screen.
Comment 27 Chris Wilson 2015-01-09 18:01:15 UTC
Silly, silly me. I rewrote one line too many before pushing. :(

Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 9 17:58:24 2015 +0000

    sna: Actually set the priv->mapped type for mmap(wc)
    
    A glitch from the last patch forgot to set the priv->mapped flag
    correctly after setting up a mmap(wc).
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Hopefully, if you update again, we should be closer to getting to the original bug. :)
Comment 28 Eugene 2015-01-09 19:09:55 UTC
Created attachment 112024 [details]
Xorg.0.log.old (no backtrace but errors)
Comment 29 Eugene 2015-01-09 19:11:10 UTC
This time there is no backtrace, but the errors are still there. Also, the screen seems to be more distorted than before.
Comment 30 Chris Wilson 2015-01-09 19:36:46 UTC
Thanks. That we don't hit an assert is a little daunting - I have the obvious bugs already tested for, so I have to look for something that didn't occur to me in the first place...

What does the corruption now look like?
Comment 31 Eugene 2015-01-09 19:45:41 UTC
Created attachment 112026 [details]
desktop screen (through teamviewer)

The real image may be slightly better, but not by much.
Comment 32 Chris Wilson 2015-01-09 19:57:48 UTC
Spotted one issue in that the new detiling routines are incorrect for gen2, so I've disabled those:

commit 836f9e11d67356babc80464f1183b907cb6cb2f2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 9 19:55:41 2015 +0000

    sna: Disable detiling for gen2
    
    gen2 use a different tile layout to all the other generations, and are
    not supported by the existing routines. Disable for now.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=88112
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

does that restore the original corruption (if you git pull and rebuild)?
Comment 33 Chris Wilson 2015-01-09 20:17:34 UTC
Implementation looked straightforward, so

commit ebdc4d1eeb23604cf5c57c3d3a70629af041297d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jan 9 20:15:36 2015 +0000

    sna: Add basic unswizzled manual detilers for gen2
    
    gen2 uses a unique tile setup, and as far as we known, no swizzling.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 34 Eugene 2015-01-09 20:31:00 UTC
> Spotted one issue in that the new detiling routines are incorrect for gen2,
> so I've disabled those:
> 
> commit 836f9e11d67356babc80464f1183b907cb6cb2f2
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jan 9 19:55:41 2015 +0000
> 
>     sna: Disable detiling for gen2
>     
>     gen2 use a different tile layout to all the other generations, and are
>     not supported by the existing routines. Disable for now.
>     
>     References: https://bugs.freedesktop.org/show_bug.cgi?id=88112
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> does that restore the original corruption (if you git pull and rebuild)?

The screen now looks fine. In a minute I'll add a screenshot and the Xorg log, which is downloading from the remote machine now. But the errors are still present in it.
Comment 35 Eugene 2015-01-09 20:33:08 UTC
Created attachment 112028 [details]
Xorg.0.log.old (still errors; no backtrace)
Comment 36 Eugene 2015-01-09 20:35:17 UTC
Created attachment 112029 [details]
Screenshot of an image (good)
Comment 37 Eugene 2015-01-09 20:54:25 UTC
(In reply to Chris Wilson from comment #33)
> Implementation looked straightforward, so
> 
> commit ebdc4d1eeb23604cf5c57c3d3a70629af041297d
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Jan 9 20:15:36 2015 +0000
> 
>     sna: Add basic unswizzled manual detilers for gen2
>     
>     gen2 uses a unique tile setup, and as far as we known, no swizzling.
>     
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The image also looks fine with this one. But the errors are still there in Xorg.0.log:

grep -i "(EE)" /var/log/Xorg.0.log
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[    35.678] (EE) intel: Failed to load module "present" (module does not exist, 0)
[    36.239] (EE) AIGLX error: Calling driver entry point failed
[    36.239] (EE) AIGLX: reverting to software rendering
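
The first line of that output is only the log's legend, which happens to contain "(EE)". A minimal sketch of a stricter filter, assuming error lines start with a bracketed timestamp as in the output above; the helper name `xorg_errors` is hypothetical.

```shell
#!/bin/sh
# Sketch only: list real (EE) lines from an Xorg log, skipping the legend.
# Assumption: error lines look like "[    35.678] (EE) message".

xorg_errors() {
    # Match "(EE)" only when it directly follows the "[ time ]" prefix.
    grep -E '^\[ *[0-9]+\.[0-9]+\] \(EE\)' "${1:-/var/log/Xorg.0.log}"
}
```

With no argument it reads /var/log/Xorg.0.log; pass another path, for example Xorg.0.log.old, to inspect the log of the previous session.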
Comment 38 Chris Wilson 2015-01-10 09:18:26 UTC
(In reply to Eugene from comment #37)
> (In reply to Chris Wilson from comment #33)
> > Implementation looked straightforward, so
> > 
> > commit ebdc4d1eeb23604cf5c57c3d3a70629af041297d
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Jan 9 20:15:36 2015 +0000
> > 
> >     sna: Add basic unswizzled manual detilers for gen2
> >     
> >     gen2 uses a unique tile setup, and as far as we known, no swizzling.
> >     
> >     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Image is also looks fine. But errors still there in Xorg.0.log:

Good. Thinking more about it, the incorrect detiling explains the original corruption, but I am not sure how the userptr test became a red herring. Anyway problem solved.
 
> grep -i "(EE)" /var/log/Xorg.0.log
>         (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
> [    35.678] (EE) intel: Failed to load module "present" (module does not
> exist, 0)

Not an issue, should really be a warning rather than error - the present extension won't be enabled if it doesn't exist.

> [    36.239] (EE) AIGLX error: Calling driver entry point failed
> [    36.239] (EE) AIGLX: reverting to software rendering

Something is wrong with mesa, the call to dri2->createNewScreen() fails. Check your build?

If you see the error from comment 12 again, please file a new bug. Thanks for the report and the testing!
Comment 39 Eugene 2015-01-10 15:12:48 UTC
> [    36.239] (EE) AIGLX error: Calling driver entry point failed
> [    36.239] (EE) AIGLX: reverting to software rendering

>Something is wrong with mesa, the call to dri2->createNewScreen() fails. Check your build?

What exactly should I check? This issue was there from the start. I'm using MESA from the oibaf PPA: https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers?field.series_filter=utopic

> If you see the error from comment 12 again, please file a new bug. Thanks for the report and the testing!
No, I'm not seeing it now.

So, should I file a new report for:

> [    36.239] (EE) AIGLX error: Calling driver entry point failed
> [    36.239] (EE) AIGLX: reverting to software rendering

?

