Bug 42813 - firefox causes crash while loading page [SNA]
Summary: firefox causes crash while loading page [SNA]
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Chris Wilson
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-11-11 04:45 UTC by Clemens Eisserer
Modified: 2011-12-13 07:52 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
cairo-trace which leads to xorg crash (24.00 KB, application/octet-stream)
2011-11-11 09:48 UTC, Clemens Eisserer
dmesg (68.23 KB, text/plain)
2011-11-11 14:52 UTC, Clemens Eisserer
lspci (21.24 KB, text/plain)
2011-11-11 14:53 UTC, Clemens Eisserer
cairo-trace from console with --flush (14.81 KB, application/octet-stream)
2011-11-12 02:05 UTC, Clemens Eisserer
xtrace log (285.28 KB, application/x-bzip2)
2011-11-12 03:20 UTC, Clemens Eisserer
debug=full log (1.64 MB, application/x-bzip2)
2011-11-12 03:31 UTC, Clemens Eisserer

Description Clemens Eisserer 2011-11-11 04:45:20 UTC
When loading the page http://www.koenig-waermepumpen.de/ using the official firefox build, I reliably get the following crash:

Program received signal SIGBUS, Bus error.
__memcpy_ssse3 () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3.S:714
714        movdqu    %xmm0, (%esi)
(gdb) bt
#0  __memcpy_ssse3 () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3.S:714
#1  0x0014acb4 in memcpy_blt (src=0xb3234048, dst=0xae414000, bpp=<optimized out>, src_stride=4200, dst_stride=8192,
    src_x=0, src_y=0, dst_x=0, dst_y=0, width=1050, height=10000) at blt.c:68
#2  0x001750c4 in sna_replace (sna=0x8a76d90, bo=0x8ea72f0, width=1050, height=10000, bpp=32, src=0xb3234048, stride=4200)
    at sna_io.c:455
#3  0x00160dde in sna_pixmap_move_to_gpu (pixmap=0xb3234008) at sna_accel.c:1028
#4  0x001638d7 in sna_copy_boxes (src=0xb3234008, dst=0xb616a008, gc=0x8e946a0, box=0xbff6b944, n=1, dx=-106, dy=-160,
    reverse=0, upsidedown=0, bitplane=0, closure=0x0) at sna_accel.c:1959
#5  0x081a9efb in miCopyRegion (pSrcDrawable=0xb3234008, pDstDrawable=0xb616a008, pGC=0x8e946a0, pDstRegion=0xbff6b944,
    dx=-106, dy=-160, copyProc=0x1636a0 <sna_copy_boxes>, bitPlane=0, closure=0x0) at micopy.c:137
#6  0x081aa3e0 in miDoCopy (pSrcDrawable=0xb3234008, pDstDrawable=0xb616a008, pGC=0x8e946a0, xIn=0, yIn=0, widthSrc=1050,
    heightSrc=520, xOut=106, yOut=160, copyProc=0x1636a0 <sna_copy_boxes>, bitPlane=0, closure=0x0) at micopy.c:334
#7  0x0015f086 in sna_copy_area (src=0xb3234008, dst=0xb616a008, gc=0x8e946a0, src_x=0, src_y=0, width=1050, height=520,
    dst_x=106, dst_y=160) at sna_accel.c:2233
#8  0x081597b7 in damageCopyArea (pSrc=0xb3234008, pDst=0xb616a008, pGC=0x8e946a0, srcx=0, srcy=0, width=1050, height=520,
    dstx=106, dsty=160) at damage.c:864
#9  0x08071c16 in ProcCopyArea (client=0x8da92d0) at dispatch.c:1645
#10 0x0807609f in Dispatch () at dispatch.c:432
#11 0x0806431a in main (argc=6, argv=0xbff6bc04, envp=0xbff6bc20) at main.c:287

intel i945GM
libdrm-2.4.27
pixman-0.23.8
linux-3.1.0
xorg 1.11.1
Comment 1 Chris Wilson 2011-11-11 06:06:40 UTC
A bus error (SIGBUS) indicates the kernel hit an unsolvable resource issue when trying to fault in the page. Can you please attach the output of lspci -vv and dmesg?

Is it extremely reliable after a fresh start, or does it take some time?
Comment 2 Chris Wilson 2011-11-11 06:52:52 UTC
Loading that page here (in the UK, but using the German layout), I don't hit this path with firefox 8.0 from the tarball and a local build of cairo master. I don't see the 10,000-tall pixmap created at all.

Can you please run: cairo-trace --profile firefox http://www.koenig-waermepumpen.de/
Comment 3 Clemens Eisserer 2011-11-11 09:48:26 UTC
Created attachment 53414 [details]
cairo-trace which leads to xorg crash
Comment 4 Clemens Eisserer 2011-11-11 14:52:50 UTC
Created attachment 53423 [details]
dmesg
Comment 5 Clemens Eisserer 2011-11-11 14:53:21 UTC
Created attachment 53424 [details]
lspci
Comment 6 Chris Wilson 2011-11-12 01:51:53 UTC
(In reply to comment #3)
> Created attachment 53414 [details]
> cairo-trace which leads to xorg crash

Ah, I think the trace was truncated by the crash. I think you need to run firefox from the console under cairo-trace and watch X burn. (Or you can use cairo-trace --flush firefox, but that still carries a slight risk of losing the last command compared to running it from outside X.)
Comment 7 Clemens Eisserer 2011-11-12 02:05:00 UTC
Created attachment 53436 [details]
cairo-trace from console with --flush
Comment 8 Chris Wilson 2011-11-12 02:44:05 UTC
Hmm, that's still a lot smaller than I'm expecting, and it notably lacks any huge pixmaps or operations. I'm guessing that firefox is embedding a mozcairo. :(
The alternatives are an xtrace or a debug=full Xorg.log.

Have you tried the recent tweaks for source pixmap migration on the CopyArea path? I am slightly concerned about uploading the whole 1050x10000 pixmap for a 1050x520 copy.
Comment 9 Clemens Eisserer 2011-11-12 03:20:09 UTC
Created attachment 53439 [details]
xtrace log
Comment 10 Clemens Eisserer 2011-11-12 03:31:58 UTC
Created attachment 53440 [details]
debug=full log
Comment 11 Clemens Eisserer 2011-11-12 03:35:12 UTC
Just tested: the crash also happens with FF7 as shipped with Fedora 16, using the system cairo.
Comment 12 Chris Wilson 2011-11-12 04:07:38 UTC
I'd just like to say wtf.

CreatePixmap 1050x10000
clear using GPU
PutImage 1050x10000
CopyArea 1050x750

Never to be used again.

I've just tweaked the heuristics to hopefully behave better for the incremental PutImage. (Yes, I know I'm just papering over the issue rather than root-causing it, except that this is equally unacceptable in terms of performance.)
Comment 13 Chris Wilson 2011-11-12 04:25:52 UTC
And hopefully this would outlaw trying to fence such large objects in the first place:

commit e8799cdea461df5102d421fda26fecceae79b929
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Nov 12 12:19:31 2011 +0000

    sna: Be stricter and disallow allocation of large fenced objects
    
    When allocating objects, we need to check the size of the full fenced
    regions against the mappable limits in order to be able to mmap the
    object later.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=42813
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
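
The idea in the commit message can be illustrated with a rough sketch; it is not the driver's code. It assumes that on gen3 hardware such as the i945GM a fenced object occupies a power-of-two sized region of at least 1 MiB, so the region can be much larger than the object itself. The function names and the 96 MiB limit are hypothetical and chosen only for illustration; the thread does not state the real mappable limit.

```python
MIB = 1024 * 1024


def fenced_size_gen3(size_bytes: int) -> int:
    """Smallest power-of-two region (at least 1 MiB) that holds size_bytes.

    Assumption: this models the gen3 fence-region rule; it is not taken
    from the driver source.
    """
    region = MIB
    while region < size_bytes:
        region *= 2
    return region


def fits_mappable(size_bytes: int, mappable_limit: int) -> bool:
    """Compare the full fenced region, not the raw size, against the limit."""
    return fenced_size_gen3(size_bytes) <= mappable_limit


# Values from the backtrace above: dst_stride=8192, height=10000.
object_bytes = 8192 * 10000
HYPOTHETICAL_LIMIT = 96 * MIB  # made-up budget, for illustration only

print(object_bytes <= HYPOTHETICAL_LIMIT)               # True: raw size passes
print(fenced_size_gen3(object_bytes) // MIB)            # 128
print(fits_mappable(object_bytes, HYPOTHETICAL_LIMIT))  # False: region does not
```

Under these assumptions the roughly 78 MiB object passes a naive size check but needs a 128 MiB fenced region, which is the kind of gap a stricter check on the fenced size would close.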
Comment 14 Clemens Eisserer 2011-11-12 05:24:48 UTC
Thanks, I'll give it a try soon.

I'll also file a bug against Mozilla; I don't see any need to allocate a 1050x10000 pixmap (40 MB!), which seems really crazy.
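
As a quick check of the 40 MB figure, here is a small sketch using the values from the backtrace (width=1050, height=10000, bpp=32, src_stride=4200). The helper name is hypothetical, and it assumes an unpadded stride and a bpp that is a multiple of 8.

```python
def pixmap_bytes(width: int, height: int, bpp: int) -> int:
    """Bytes for a linear pixmap, assuming no stride padding."""
    stride = width * bpp // 8  # 1050 * 32 / 8 = 4200, matching src_stride
    return stride * height


total = pixmap_bytes(1050, 10000, 32)
print(total)                    # 42000000
print(round(total / 2**20, 2))  # 40.05 (MiB)

# Share of the pixmap's rows actually needed by the 1050x520 CopyArea.
print(520 / 10000)              # 0.052
```

So the full upload moves about 40 MiB to serve a copy that touches roughly 5% of the rows.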
Comment 15 Chris Wilson 2011-12-13 04:06:19 UTC
I'm pretty sure, even with the more recent changes, that I've appropriately capped the largest bo to prevent this issue...
Comment 16 Clemens Eisserer 2011-12-13 07:52:49 UTC
Sorry for the lack of feedback; I forgot to reply after everything worked as intended =)

Thanks!

