Bug 28077

Summary: X segfault in miCopyRegion / fbCopyNtoN
Product: xorg Reporter: Eric Piel <e.a.b.piel>
Component: Server/Acceleration/EXAAssignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: albertomilone, alfredo, bernardo, luca.forina, maarten.fonville
Version: 7.5 (2009.10)   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Xorg log of crash with psb driver
none
full backtrace of core after crash
none
Patch to fix the issue
none
unified, not inverted version of yves' patch
none
udiff tested version of Yves' patch none

Description Eric Piel 2010-05-12 11:00:50 UTC
It looks like the little brother of bug 27380. I have this segfault happening when running xserver 1.7.7 with DRI disabled (with the psb driver):

Backtrace:
0: /usr/bin/Xorg (xorg_backtrace+0x37) [0x80e9127]
1: /usr/bin/Xorg (0x8047000+0x61a3a) [0x80a8a3a]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0xb78ee40c]
3: /usr/lib/xorg/modules/libfb.so (fbBlt+0xd08) [0xb77dc108]
4: /usr/lib/xorg/modules/libfb.so (fbCopyNtoN+0x2c7) [0xb77df007]
5: /usr/bin/Xorg (miCopyRegion+0x1d3) [0x81a04e3]
6: /usr/bin/Xorg (miDoCopy+0x52b) [0x81a0b7b]
7: /usr/lib/xorg/modules/libfb.so (fbCopyArea+0x7e) [0xb77de3ee]
8: /usr/lib/xorg/modules/libexa.so (0xb6f1a000+0xf9d9) [0xb6f299d9]
9: /usr/lib/xorg/modules/libexa.so (0xb6f1a000+0x8b34) [0xb6f22b34]
10: /usr/bin/Xorg (miCopyRegion+0x1d3) [0x81a04e3]
11: /usr/bin/Xorg (miDoCopy+0x52b) [0x81a0b7b]
12: /usr/lib/xorg/modules/libexa.so (0xb6f1a000+0x7c9a) [0xb6f21c9a]
13: /usr/bin/Xorg (0x8047000+0xde619) [0x8125619]
14: /usr/bin/Xorg (0x8047000+0x262ea) [0x806d2ea]
15: /usr/bin/Xorg (0x8047000+0x27ad7) [0x806ead7]
16: /usr/bin/Xorg (0x8047000+0x1c025) [0x8063025]
17: /lib/i686/libc.so.6 (__libc_start_main+0xe6) [0x484a4b96]
18: /usr/bin/Xorg (0x8047000+0x1bc11) [0x8062c11]
Segmentation fault at address 0xef8


Or in gdb:
Program received signal SIGSEGV, Segmentation fault.
fbBlt (srcLine=0x0, srcStride=1376, srcX=30368, dstLine=0x0, dstStride=1376, dstX=31136, width=320, 
    height=25, alu=3, pm=<value optimized out>, bpp=32, reverse=1, upsidedown=0) at fbblt.c:146
146				WRITE(--dst, FbDoDestInvarientMergeRop(READ(--src)));
(gdb) bt
#0  fbBlt (srcLine=0x0, srcStride=1376, srcX=30368, dstLine=0x0, dstStride=1376, dstX=31136, 
    width=320, height=25, alu=3, pm=<value optimized out>, bpp=32, reverse=1, upsidedown=0)
    at fbblt.c:146
#1  0xb77a1007 in fbCopyNtoN (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    pbox=0xbfb94264, nbox=<value optimized out>, dx=-24, dy=0, reverse=1, upsidedown=0, bitplane=0, 
    closure=0x0) at fbcopy.c:110
#2  0x081a04e3 in miCopyRegion (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    pDstRegion=0xbfb94264, dx=-24, dy=0, copyProc=0xb77a0d40 <fbCopyNtoN>, bitPlane=0, closure=0x0)
    at micopy.c:138
#3  0x081a0b7b in miDoCopy (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    xIn=<value optimized out>, yIn=0, widthSrc=10, heightSrc=25, xOut=963, yOut=0, 
    copyProc=0xb77a0d40 <fbCopyNtoN>, bitPlane=0, closure=0x0) at micopy.c:338
#4  0xb77a03ee in fbCopyArea (pSrcDrawable=0x0, pDstDrawable=0xa, pGC=0x954a658, xIn=949, yIn=0, 
    widthSrc=10, heightSrc=25, xOut=963, yOut=0) at fbcopy.c:344
#5  0xb6ef09d9 in ExaCheckCopyNtoN (pSrc=0x9403d38, pDst=0x9421e60, pGC=0x954a658, pbox=0xbfb9449c, 
    nbox=1, dx=-24, dy=0, reverse=1, upsidedown=0, bitplane=0, closure=0x0) at exa_unaccel.c:133
#6  0xb6ee9b34 in exaCopyNtoN (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    pbox=0xbfb94494, nbox=1, dx=-24, dy=0, reverse=1, upsidedown=0, bitplane=0, closure=0x0)
    at exa_accel.c:587
#7  0x081a04e3 in miCopyRegion (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    pDstRegion=0xbfb94494, dx=-24, dy=0, copyProc=0xb6ee9a10 <exaCopyNtoN>, bitPlane=0, closure=0x0)
    at micopy.c:138
#8  0x081a0b7b in miDoCopy (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    xIn=<value optimized out>, yIn=0, widthSrc=10, heightSrc=25, xOut=963, yOut=0, 
    copyProc=0xb6ee9a10 <exaCopyNtoN>, bitPlane=0, closure=0x0) at micopy.c:338
#9  0xb6ee8c9a in exaCopyArea (pSrcDrawable=0x9403d38, pDstDrawable=0x9421e60, pGC=0x954a658, 
    srcx=949, srcy=0, width=10, height=25, dstx=963, dsty=0) at exa_accel.c:601
#10 0x08125619 in damageCopyArea (pSrc=0x9403d38, pDst=0x9421e60, pGC=0x954a658, srcx=949, srcy=0, 
    width=10, height=25, dstx=963, dsty=0) at damage.c:949
#11 0x0806d2ea in ProcCopyArea (client=0x93e53e0) at dispatch.c:1725
#12 0x0806ead7 in Dispatch () at dispatch.c:439
#13 0x08063025 in main (argc=8, argv=0xbfb94744, envp=0xbfb94768) at main.c:285


Maybe all it needs is a similar fix to the one applied to exaHWCopyNtoN() for the non-accelerated equivalent fbCopyNtoN()? 

However, I looked a bit at the code, and it was not obvious how to do it...
Comment 1 Michel Dänzer 2010-05-14 08:46:35 UTC
(In reply to comment #1)
> Maybe all it needs is a similar fix to the one applied to exaHWCopyNtoN() for
> the non-accelerated equivalent fbCopyNtoN()? 

ExaCheckCopyNtoN() should take care of it (that's how the exaHWCopyNtoN() fix works). Basically we need to find out why the PrepareAccess calls in there don't result in ->devPrivate.ptr being non-NULL for all pixmaps involved.

BTW, does the psb driver set the EXA_MIXED_PIXMAPS flag?
Comment 2 Alberto Milone 2010-05-15 00:46:05 UTC
(In reply to comment #1)
> ExaCheckCopyNtoN() should take care of it (that's how the exaHWCopyNtoN() fix
> works). Basically we need to find out why the PrepareAccess calls in there
> don't result in ->devPrivate.ptr being non-NULL for all pixmaps involved.
>

Any suggestions on how to debug this further?

> BTW, does the psb driver set the EXA_MIXED_PIXMAPS flag?

No, it doesn't set the EXA_MIXED_PIXMAPS flag.
Comment 3 Michel Dänzer 2010-05-20 02:58:30 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > ExaCheckCopyNtoN() should take care of it (that's how the exaHWCopyNtoN() fix
> > works). Basically we need to find out why the PrepareAccess calls in there
> > don't result in ->devPrivate.ptr being non-NULL for all pixmaps involved.
> 
> Any suggestions on how to debug this further?

I'd trace the pExaScr->prepare_access_reg / exaPrepareAccess calls in ExaCheckCopyNtoN to try and find out the above.


> > BTW, does the psb driver set the EXA_MIXED_PIXMAPS flag?
> 
> No, it doesn't set the EXA_MIXED_PIXMAPS flag.

Actually I guess it still uses the 'classic' EXA scheme, where this flag isn't relevant. Maybe you can attach a log file to clarify this and maybe other things.
Comment 4 Jose Bernardo Silva 2010-05-20 06:30:23 UTC
Created attachment 35772 [details]
Xorg log of crash with psb driver
Comment 5 Jose Bernardo Silva 2010-05-20 06:31:12 UTC
Created attachment 35773 [details]
full backtrace of core after crash
Comment 6 Jose Bernardo Silva 2010-05-20 06:39:28 UTC
I took the liberty of adding my log and gdb backtrace, as I have the same problem on my pc.
Comment 7 Jose Bernardo Silva 2010-06-02 03:00:40 UTC
Any progress on this? It seems the main blocking point on having 3D support on poulsbo drivers under xorg 1.7.
Thanks.
Comment 8 Yves De Muyter 2010-06-06 14:46:39 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > ExaCheckCopyNtoN() should take care of it (that's how the exaHWCopyNtoN() fix
> > > works). Basically we need to find out why the PrepareAccess calls in there
> > > don't result in ->devPrivate.ptr being non-NULL for all pixmaps involved.
> > 
> > Any suggestions on how to debug this further?
> 
> I'd trace the pExaScr->prepare_access_reg / exaPrepareAccess calls in
> ExaCheckCopyNtoN to try and find out the above.
> 
> 
> > > BTW, does the psb driver set the EXA_MIXED_PIXMAPS flag?
> > 
> > No, it doesn't set the EXA_MIXED_PIXMAPS flag.
> 
> Actually I guess it still uses the 'classic' EXA scheme, where this flag isn't
> relevant. Maybe you can attach a log file to clarify this and maybe other
> things.

Is it possible that our PrepareAccess handler is never called? I see not a single invocation in our xorg.0.log  ... (I added some ErrorF messages to it, as i did on other handlers)

Some other strange thing is, ->devPrivate.ptr is sometimes NULL on xorg side, but for the same object (same address) on our driver side, it is filled in allright...

-Yves
Comment 9 Yves De Muyter 2010-06-07 13:09:54 UTC
I found a bug in EXA, notably exa_classic.

In case you want to copy a region with source = dest, you have the same pixmap as source and dest.

At the end of exaPixmapIsOffscreen_classic() the devPrivate.ptr is reset to NULL (look at the sources).

Now this is what happens in ExaCheckCopyNtoN:

exaPrepareAccess( pDst );
   Calls IsOffscreen()
      sets devPrivate.ptr to NULL
   sets up devPrivate.ptr to real pointer
   Everything OK
exaPrepareAccess( pSrc );
   Calls IsOffscreen()
      sets devPrivate.ptr to NULL
   BAILS OUT CAUSE OF NESTED OPERATION SINCE DST EQUALS SRC

We end up with devPrivate.ptr as NULL, and that is clearly wrong.
   
Simple patch will follow shortly

Yves De Muyter
Comment 10 Yves De Muyter 2010-06-07 13:35:46 UTC
Created attachment 36116 [details] [review]
Patch to fix the issue

This fixed a bug in EXA. exaPixmapIsOffscreen_classic() should not set devPrivate.ptr to but set it to the previous value
Comment 11 Michel Dänzer 2010-06-08 02:33:36 UTC
Sounds good, but please attach a patch generated by Git or diff -u, or just send it to the xorg-devel mailing list for review.
Comment 12 Adam Williamson 2010-06-08 10:25:31 UTC
Created attachment 36156 [details] [review]
unified, not inverted version of yves' patch

I believe as well as being old-style and without context, Yves' patch is inverted. I think this is what he meant (this is against 1.8.0). Untested, I'm just looking at Yves' patch to figure out how it's meant to be.
Comment 13 Jose Bernardo Silva 2010-06-08 10:49:17 UTC
Created attachment 36157 [details] [review]
udiff tested version of Yves' patch

Here is the version we are using on the gma500 ppa. It applies successfully, and seems to fix the bug, at least now the driver won't crash.
Comment 14 Michel Dänzer 2010-06-09 01:19:01 UTC
Review of attachment 36157 [details] [review]:

Looks good, though I'd move the declaration of old_ptr into the block where it's used.
Comment 15 Adam Williamson 2010-06-09 11:50:25 UTC
(In reply to comment #14)
> Review of attachment 36157 [details] [review]:
> 
> Looks good, though I'd move the declaration of old_ptr into the block where
> it's used.

That's exactly what mine did, and I think what Yves' did, if you count the line numbers and check the indent :)
Comment 16 Jose Bernardo Silva 2010-06-09 11:55:48 UTC
Well, I just applied Yves patch then generated a udiff... :) But I can move the declaration inside the block and regenerate and test the patch, if you guys want.
Comment 17 Michel Dänzer 2010-06-10 03:47:53 UTC
Note that the patch will need to be posted to the xorg-devel mailing list to be considered for inclusion.
Comment 18 Eric Piel 2010-06-10 15:08:58 UTC
Sent a clean version of the patch on the mailing list:
http://lists.x.org/archives/xorg-devel/2010-June/009970.html
Comment 19 Michel Dänzer 2010-06-11 10:31:37 UTC
Fix has been applied to the master branch, thanks.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.