Bug 43716 - [945GM SNA] X crashes when using Xv [possibly resolved with xserver update]
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Xorg Project Team
Reported: 2011-12-11 06:13 UTC by Paul Neumann
Modified: 2011-12-18 15:15 UTC

Attachments
Backtrace of the crash (2.74 KB, text/plain)
2011-12-11 06:13 UTC, Paul Neumann
Backtrace of X crashing with mplayer (3.93 KB, text/plain)
2011-12-11 09:00 UTC, Paul Neumann
tail -n 10000 of Xorg.log of the crash (43.15 KB, application/x-xz)
2011-12-14 10:50 UTC, Paul Neumann
backtrace of the crash I have Xorg.log of (2.37 KB, text/plain)
2011-12-14 10:51 UTC, Paul Neumann
yet another backtrace of X (5.46 KB, text/plain)
2011-12-15 09:44 UTC, Paul Neumann

Description Paul Neumann 2011-12-11 06:13:17 UTC
Created attachment 54322
Backtrace of the crash

When playing a video in vlc with Xv acceleration, after some time (from about 30 seconds to 10 minutes) X segfaults.

I am experiencing this with
xorg-server 1.11.2
xf86-video-intel from git with SNA enabled
Comment 1 Chris Wilson 2011-12-11 06:25:02 UTC
That hints at memory corruption. Are you using the textured or the overlay adaptor? [And what is calling XGetImage? ;-]
Comment 2 Paul Neumann 2011-12-11 06:58:32 UTC
In the vlc settings, I have enabled the "Accelerated video output (Overlay)" checkbox and "XVideo output (XCB)". Is this information reliable enough?
Comment 3 Chris Wilson 2011-12-11 07:53:49 UTC
Despite what it says, it uses the TexturedAdaptor as opposed to the OverlayAdaptor. Not that it makes much difference, and is certainly preferable with a compositing manager. Which brings us to the next point. Running X under valgrind whilst using xfce4 and vlc, the only hits I'm getting are use of uninitialised values within XComposite. (They would appear to have been fixed in xserver.git, but not yet in my distro X package.)

Please can you keep running X under gdb (or grabbing the core file) and attach any other stacktraces if they differ.
Comment 4 Chris Wilson 2011-12-11 08:30:07 UTC
What version of X are you using? I keep hitting lots of bugs in 1.11.2.901 (including memory corruption) that I know are fixed in 1.11.99.1...
Comment 5 Paul Neumann 2011-12-11 08:59:58 UTC
I am using 1.11.2.
I just happened to experience another crash. This time with mplayer however and the backtrace looks different.
Comment 6 Paul Neumann 2011-12-11 09:00:54 UTC
Created attachment 54324
Backtrace of X crashing with mplayer
Comment 7 Chris Wilson 2011-12-11 09:24:34 UTC
Yup, that does suggest memory corruption. If you feel brave (and can withstand a slow desktop for a bit), you can try ./configure --enable-sna --enable-debug=full which will check everything I've had the foresight (or bad experience) to assert.
Comment 8 Paul Neumann 2011-12-14 10:50:57 UTC
Created attachment 54428
tail -n 10000 of Xorg.log of the crash

I managed to crash X very quickly by using Xv with gtkperf open at the same time. I am not sure whether this is the bug I am looking for, as the backtrace seems completely different, but I just want to let you know.
Comment 9 Paul Neumann 2011-12-14 10:51:40 UTC
Created attachment 54429
backtrace of the crash I have Xorg.log of
Comment 10 Chris Wilson 2011-12-14 11:44:47 UTC
Unfortunately that's a separate bug:

commit 95cceb5ae5503af0ac50a923fa47e134f0da8743
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Dec 14 19:27:53 2011 +0000

    sna: Fix DBG crash whilst pruning inactive GPU buffers
    
    Don't attempt to dereference the NULL gpu_bo after having just freed it.
    Here in lies the folly of trying to blindly silence the compiler.
    
    Instead we should heed the error return as it means that we didn't
    decouple the pixmap from the inactive list and so we choose to place it
    back on the active list to purge again in the near future.
    
    Reported-by: Paul Neumann <paul104x@yahoo.de
    References: https://bugs.freedesktop.org/show_bug.cgi?id=43716
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

I've almost got my 945gm ready to roll again...
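
The pattern described in that commit message (do not touch the buffer after it has been freed, and heed the error return by putting the pixmap back on the active list) can be illustrated with a minimal sketch. This is not the driver's code: `struct buffer`, `struct pixmap`, `release_gpu_copy` and `prune_inactive` are hypothetical names invented for the illustration, and the `migration_ok` flag only simulates whether the release step succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the driver's real types. */
struct buffer { int handle; };

enum list_id { LIST_ACTIVE, LIST_INACTIVE };

struct pixmap {
    struct buffer *gpu_bo;   /* NULL once the GPU copy has been released */
    enum list_id on_list;    /* which list the pixmap currently sits on */
};

/*
 * Try to release the pixmap's GPU buffer. Returns false if the release
 * could not be done (simulated by migration_ok); in that case the buffer
 * is left untouched and still owned by the pixmap.
 */
static bool release_gpu_copy(struct pixmap *p, bool migration_ok)
{
    if (!migration_ok)
        return false;
    free(p->gpu_bo);
    p->gpu_bo = NULL;
    return true;
}

/*
 * Prune one inactive pixmap. Returns the handle of the freed buffer,
 * or -1 if nothing was freed.
 */
static int prune_inactive(struct pixmap *p, bool migration_ok)
{
    int handle;

    assert(p != NULL);

    /* Read anything still needed from the buffer before it is freed. */
    handle = p->gpu_bo ? p->gpu_bo->handle : -1;

    if (!release_gpu_copy(p, migration_ok)) {
        /* Heed the error: the pixmap was not decoupled, so move it back
         * to the active list and let a later pass try again. */
        p->on_list = LIST_ACTIVE;
        return -1;
    }

    /* p->gpu_bo is NULL from here on and must not be dereferenced. */
    return handle;
}
```

The point of the sketch is the ordering: the handle is read before the release call, and the failure path changes list membership instead of ignoring the return value.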
Comment 11 Chris Wilson 2011-12-14 14:04:14 UTC
I'm running gtkperf and mplayer in a loop, with and without compiz. Do I need anything else to trigger this? Seems happy enough with that so far...
Comment 12 Paul Neumann 2011-12-14 21:29:31 UTC
(In reply to comment #11)
> I'm running gtkperf and mplayer in a loop, with and without compiz. Do I need
> anything else to trigger this? Seems happy enough with that so far...

Gtkperf shouldn't really be necessary, this only triggered the bug you already fixed. I am using xfwm4 with compositing enabled.
What I also noticed is that I can relatively easily reproduce this without --enable-debug=full (seldom more than 30 minutes of watching video), but with full debugging, X never crashed in such a short time, which is kind of problematic since I "only" have about 17G on / for the logs ;).
Comment 13 Chris Wilson 2011-12-15 03:55:24 UTC
Ok, I used xfce4-session instead of compiz, and ran X under valgrind, not even a whimper. :(

Can you keep on pasting the crash backlogs, and see if you can narrow down the reproduction steps?
Comment 14 Paul Neumann 2011-12-15 09:44:38 UTC
Created attachment 54469
yet another backtrace of X

So what I did this time (xfwm4 with compositing, DDX -git from just half an hour ago):

- turn on vlc, windowed
- browse around a bit with chromium (no flash because of chromium's click-to-play)
- quit chromium
- watch vlc in fullscreen (the same stream as before, MPEG2 DVB-T)
- segfault happens

total time maybe 20-30 minutes

If you want, I can leave out the chromium part or even start vlc in twm.
Comment 15 Chris Wilson 2011-12-15 10:24:25 UTC
Looks to be another unrelated buglet, keep them coming! :>

I think this is only possible after/during a gpu hang, but anyway crashing is not an option...

commit 19c184b7e4f8de747ed6fb1f6f910238193cf2a1
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Dec 15 18:18:19 2011 +0000

    sna/gen3: Check for upload failure of video bo
    
    And propagate that failure back to the client.
    
    Reported-by: Paul Neumann <paul104x@yahoo.de>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=43716
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
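
As a rough illustration of "check for upload failure and propagate that failure back to the client", the following sketch returns an error status instead of carrying on with a NULL buffer. It is not the actual gen3 video code: `upload_frame`, `put_image`, `struct video_bo` and the `simulate_failure` parameter are hypothetical, and the status values are only modelled on the X protocol's Success (0) and BadAlloc (11).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes, modelled on X's Success and BadAlloc. */
enum status { STATUS_OK = 0, STATUS_BAD_ALLOC = 11 };

/* Hypothetical stand-in for a video buffer object. */
struct video_bo {
    unsigned char *data;
    size_t size;
};

/*
 * Allocate a buffer and copy the frame into it. Returns NULL on failure.
 * simulate_failure forces the failure path for demonstration purposes.
 */
static struct video_bo *upload_frame(const unsigned char *src, size_t size,
                                     int simulate_failure)
{
    struct video_bo *bo;

    assert(src != NULL);
    if (simulate_failure || size == 0)
        return NULL;

    bo = malloc(sizeof(*bo));
    if (bo == NULL)
        return NULL;

    bo->data = malloc(size);
    if (bo->data == NULL) {
        free(bo);
        return NULL;
    }

    memcpy(bo->data, src, size);
    bo->size = size;
    return bo;
}

/* Handle one frame; report the failure to the caller instead of crashing. */
static enum status put_image(const unsigned char *src, size_t size,
                             int simulate_failure)
{
    struct video_bo *bo = upload_frame(src, size, simulate_failure);

    if (bo == NULL)
        return STATUS_BAD_ALLOC;   /* propagate, never dereference NULL */

    /* Rendering from bo would happen here. */

    free(bo->data);
    free(bo);
    return STATUS_OK;
}
```

With this shape, a failed upload (for instance during a GPU hang) costs the client one frame and an error reply rather than taking the server down.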
Comment 16 Chris Wilson 2011-12-15 10:31:20 UTC
Is it the same movie every time? Video format, source size, output placement all can cause the code to behave slightly differently. I'm guessing from the source size that you're playing back an mpeg2ts stream, or perhaps dvd?
Comment 17 Paul Neumann 2011-12-15 11:29:36 UTC
(In reply to comment #16)
> Is it the same movie every time? Video format, source size, output placement
> all can cause the code to behave slightly differently. I'm guessing from the
> source size that you're playing back an mpeg2ts stream, or perhaps dvd?

It is TV but it is mostly the same channel.
Here is the ffmpeg -i output:

Stream #0:1[0x101]: Video: mpeg2video (Main), yuv420p, 704x576 [SAR 16:11 DAR 16:9], 10000 kb/s, 25.20 fps, 25 tbr, 90k tbn, 50 tbc
Comment 18 Paul Neumann 2011-12-17 01:20:33 UTC
Hmm, the crashes seem to be gone. However, after some time now, the video is not updated anymore until I manually resize the window.
The only pattern I have seen now, although I have not investigated very much yet, is: watching video in fullscreen -> after some time it stops -> exiting fullscreen makes the video update again.
I hope this gives you some idea about what is going on.
Comment 19 Chris Wilson 2011-12-17 01:48:38 UTC
When the video stops, could you make a note of system activity (top, sudo perf top) and if anything unusual appears in the logs. Otherwise there is no real difference between the first frame and the 1,000,000th except for the accumulation of errors...
Comment 20 Chris Wilson 2011-12-17 14:04:16 UTC
Although the first crashes were before the introduction of some of the recent bugs, can you try updating to get the crash fix for the vma cache, and cross your fingers! ;-)
Comment 21 Paul Neumann 2011-12-18 05:39:23 UTC
(In reply to comment #20)
> Although the first crashes were before the introduction of some of the recent
> bugs, can you try updating to get the crash fix for the vma cache, and cross
> your fingers! ;-)

I crossed my fingers, but it didn't help :(. As before, there are no crashes anymore but after some time, the video stalls. This only happens when vlc is in fullscreen.

However, as you suggested earlier, I tried 1.11.99.2 and it works so far, no crashes, no video stalls. It works perfectly.

As I am not able to reproduce the bug when enabling full debugging and the issue is fixed in 1.12, I don't know whether there is a point in trying to provide meaningful logs so you can think of a workaround.
Besides, I have no access to my 945gm in the next three weeks.

Anyways, thank you very much for your help :).
Comment 22 Chris Wilson 2011-12-18 05:47:20 UTC
Hmm, that perhaps points towards one of the Composite bugs fixed in the xserver. But I still don't have a clue, so let's leave this open for the time being and do enjoy the holidays!
Comment 23 Chris Wilson 2011-12-18 15:15:44 UTC
Now this could explain why you had a failure that took a long time to show up...

commit fed8d145c148bfa8a8a29f4088902377f9a10440
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Dec 18 19:26:38 2011 +0000

    sna: Use a safe iterator whilst searching for inactive linear bo
    
    As we may free a purged bo whilst iterating, we need to keep the next bo
    as a local member.
    
    Include the debugging that led to this find.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

I'm considering this bug closed for now, let me know if I break anything else when you get the chance.
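
The "safe iterator" idea from that commit can be sketched as follows: when an element may be freed during the walk, the pointer to the next element is saved in a local variable before the current one is touched. This is only an illustration under assumptions. It uses a plain singly linked list rather than the driver's list implementation, and `struct bo`, `find_inactive` and `push` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer object on a singly linked inactive list. */
struct bo {
    struct bo *next;
    size_t size;
    bool purged;   /* backing storage already discarded */
};

/*
 * Search the list at *head for a bo of at least `want` bytes. Purged
 * entries met on the way are unlinked and freed. A match is unlinked
 * and returned; NULL is returned if nothing fits.
 */
static struct bo *find_inactive(struct bo **head, size_t want)
{
    struct bo **link = head;   /* the pointer that points at the current bo */
    struct bo *bo, *next;

    for (bo = *head; bo != NULL; bo = next) {
        next = bo->next;       /* saved first: bo may be freed below */

        if (bo->purged) {
            *link = next;      /* unlink, then free */
            free(bo);
            continue;          /* bo is never read again */
        }

        if (bo->size >= want) {
            *link = next;      /* unlink and hand over to the caller */
            bo->next = NULL;
            return bo;
        }

        link = &bo->next;
    }
    return NULL;
}

/* Helper for building a list: push a new bo at the front. */
static struct bo *push(struct bo *head, size_t size, bool purged)
{
    struct bo *bo = malloc(sizeof(*bo));

    assert(bo != NULL);
    bo->next = head;
    bo->size = size;
    bo->purged = purged;
    return bo;
}
```

An unsafe loop would advance with `bo = bo->next` after the `free(bo)`, reading freed memory. That kind of bug may go unnoticed for a long time, because it only matters once the freed memory has been reused.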

