Bug 73083 - Semi-repeatable SIGBUS in R600UploadToScreenCS's memcpy
Summary: Semi-repeatable SIGBUS in R600UploadToScreenCS's memcpy
Status: RESOLVED DUPLICATE of bug 44099
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Duplicates: 73128
Depends on:
Blocks:
 
Reported: 2013-12-27 20:22 UTC by Dave Gilbert
Modified: 2014-01-06 09:35 UTC

See Also:
i915 platform:
i915 features:


Attachments
possible fix (599 bytes, patch)
2013-12-27 23:50 UTC, Alex Deucher
Workaround for the evergreen driver (569 bytes, patch)
2013-12-29 13:22 UTC, François Guerraz

Description Dave Gilbert 2013-12-27 20:22:06 UTC
I can trigger a SIGBUS in the X server, in the memcpy in R600UploadToScreenCS, by visiting a specific webpage in Firefox. I load the page, scroll around a bit and it SIGBUSes, some of the time. Sometimes it gets in a mood where it doesn't, but once it's in the mood it's pretty repeatable.
I can also get Firefox to segfault in the radeon code on that side, but I suspect they're two separate bugs.

The X side corresponds to https://bugzilla.redhat.com/show_bug.cgi?id=1029144

Graphics hardware:
07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV710 [Radeon HD 4350/4550] (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device 02a8

OS: Fedora F20; kernel 3.12.5-302.fc20.x86_64

X.Org X Server 1.14.4
Release Date: 2013-10-31
xorg-x11-drv-ati-7.2.0-3.20131101git3b38701.fc20.x86_64

The page that triggers the error:
http://www.theguardian.com/technology/2013/oct/07/nokia-lumia-1020-review-41-megapixel-camera

which has a heck of a lot of very high-res images (7k pixels across).

Debug so far:

BT:
#0  __memcpy_sse2_unaligned ()
    at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:39
No locals.
#1  0x00007fe4ae0f4cb7 in memcpy (__len=30848, __src=<optimized out>,
    __dest=0x7fe49fd18000) at /usr/include/bits/string3.h:51
No locals.
#2  R600UploadToScreenCS (pDst=0x136e220, x=0, y=0, w=7712, h=8,
    src=<optimized out>, src_pitch=30848) at r600_exa.c:1609     ./BUILD/xf86-video-ati-20131101/src/r600_exa.c
        pScrn = 0xe05310
        info = 0xe06170
        accel_state = <optimized out>
        driver_priv = 0x147f680
        scratch = <optimized out>
        copy_dst = 0x1440a30
        dst = 0x7fe49fd18000 <Address 0x7fe49fd18000 out of bounds>
        size = 30848
        dst_domain = 4
        bpp = <optimized out>
        scratch_pitch = <optimized out>
        copy_pitch = 31232
        dst_pitch_hw = <optimized out>
        ret = <optimized out>
        flush = <optimized out>
        r = 1
        i = 0   <---- so first iteration
        src_obj = {pitch = 7712, width = 30848, height = 32, bpp = 0, domain = 0,
          bo = 0x24005cbd63d5600, tiling_flags = 14740400,
          surface = 0x7fe4ae0e9b69 <RADEONEXAPixmapIsOffscreen+9>}
        dst_obj = {pitch = 20374048, width = 0, height = 2907249802, bpp = 32740,
                                                                                          but seems unlikely because src_obj is reasonable
          domain = 2909410176, bo = 0x7fe4ae0e9b69 <RADEONEXAPixmapIsOffscreen+9>,
          tiling_flags = 20374048,
          surface = 0x7fe4ad491c8a <exaPixmapHasGpuCopy_mixed+106>}
        height = <optimized out>
        base_align = <optimized out>

All the parameters to R600UploadToScreenCS (x, y, w, h, src_pitch) are ints, and all of them look reasonable (src_pitch = w * 4).
There are multiple ways to reach the copy: label, which is the bit where it blew up.
h looks unusually small, only 8 rows, so is it a redraw as the page scrolled?

1609:   memcpy(dst + i * copy_pitch, src, size);

#3  0x00007fe4ad492e70 in exaDoPutImage (depth=32, src_stride=<optimized out>,
    bits=0x1442048 "6Va\377\062R]\377/R\\\377\062U_\377-Q[\377\063Wa\377\062Yb\377/V_\377\065Yc\377,PZ\377\070[e\377\063V`\377\064R]\377:Xc\377\064P[\377\066T_\377*HS\377\065U`\377\060P[\377\063S^\377\070Xc\377\061Q\\\377\060S]\377-PZ\377\062U_\377\065Xb\377*MW\377\060S]\377\063S`\377\070Xe\377\064Ta\377/O\\\377:\\i\377\067Yf\377\063Ub\377\062Ta\377\064Xb\377\063Wa\377\061U_\377\062V`\377/T\\\377\060U]\377.S[\377\066[c\377-QY\377*NV\377\063W_\377\064Wa\377\062Ta\377\065Wd\377"..., format=2, h=8, w=7712, y=0,
    x=0, pGC=0x122dff0, pDrawable=0x136e220) at exa_accel.c:212
        y1 = <optimized out>
        x2 = <optimized out>
        ok = <optimized out>
        x1 = <optimized out>
        y2 = <optimized out>
        src = <optimized out>
        pExaPixmap = <optimized out>
        nbox = <optimized out>
        pPix = 0x136e220
        xoff = 0
        ret = 1
        pClip = <optimized out>
        pbox = 0x116f900
        yoff = 0
        bpp = <optimized out>
#4  exaPutImage (pDrawable=0x136e220, pGC=0x122dff0, depth=32, x=0, y=0, w=7712,
    h=8, leftPad=0, format=2,
    bits=0x1442048 "6Va\377\062R]\377/R\\\377\062U_\377-Q[\377\063Wa\377\062Yb\377/V_\377\065Yc\377,PZ\377\070[e\377\063V`\377\064R]\377:Xc\377\064P[\377\066T_\377*HS\377\065U`\377\060P[\377\063S^\377\070Xc\377\061Q\\\377\060S]\377-PZ\377\062U_\377\065Xb\377*MW\377\060S]\377\063S`\377\070Xe\377\064Ta\377/O\\\377:\\i\377\067Yf\377\063Ub\377\062Ta\377\064Xb\377\063Wa\377\061U_\377\062V`\377/T\\\377\060U]\377.S[\377\066[c\377-QY\377*NV\377\063W_\377\064Wa\377\062Ta\377\065Wd\377"...) at exa_accel.c:233
No locals.
#5  0x0000000000436b89 in ProcPutImage (client=0x116db50) at dispatch.c:1966
        pGC = 0x122dff0
        pDraw = 0x136e220
        length = <optimized out>
        lengthProto = <optimized out>
        tmpImage = 0x1442048 "6Va\377\062R]\377/R\\\377\062U_\377-Q[\377\063Wa\377\062Yb\377/V_\377\065Yc\377,PZ\377\070[e\377\063V`\377\064R]\377:Xc\377\064P[\377\066T_\377*HS\377\065U`\377\060P[\377\063S^\377\070Xc\377\061Q\\\377\060S]\377-PZ\377\062U_\377\065Xb\377*MW\377\060S]\377\063S`\377\070Xe\377\064Ta\377/O\\\377:\\i\377\067Yf\377\063Ub\377\062Ta\377\064Xb\377\063Wa\377\061U_\377\062V`\377/T\\\377\060U]\377.S[\377\066[c\377-QY\377*NV\377\063W_\377\064Wa\377\062Ta\377\065Wd\377"...
        stuff = 0x1442030
#6  0x000000000043a137 in Dispatch () at dispatch.c:432
        clientReady = 0x10c8220
        result = <optimized out>
        client = 0x116db50
        nready = 0
        icheck = 0x821570 <checkForInput>
        start_tick = 80
#7  0x00000000004286ca in main (argc=2, argv=0x7fff628342a8, envp=<optimized out>)
    at main.c:298
        i = <optimized out>
        alwaysCheckForInput = {0, 1}


So I added some debug; when it blows up it is taking the goto marked here:

    if (!(driver_priv->tiling_flags & (RADEON_TILING_MACRO | RADEON_TILING_MICRO))) {
        if (!radeon_bo_is_referenced_by_cs(driver_priv->bo, info->cs)) {
            flush = FALSE;
            if (!radeon_bo_is_busy(driver_priv->bo, &dst_domain)) {
                goto copy;  <-----
            }
        }

That goto is taken before the dst_obj structure is set up, which would explain the junk in dst_obj in the backtrace.
After the copy: label we have a memcpy:

1609:   memcpy(dst + i * copy_pitch, src, size);
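
For context, a copy of this shape moves one scanline per memcpy, stepping the source and the destination by their own pitches. The sketch below is not the driver's code: `copy_rows` and `row_size` are hypothetical names, and ordinary memory stands in for the mapped buffer object. With the values from the backtrace (w=7712 at 32 bits per pixel) a row is 30848 bytes, which matches `size` and `src_pitch` above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes per scanline for a width in pixels and a depth in bits per pixel. */
static size_t row_size(size_t width, size_t bpp)
{
    return width * bpp / 8;
}

/* Hypothetical helper: copy `rows` scanlines of `row_bytes` bytes each.
 * The source advances by src_pitch and the destination by dst_pitch,
 * so any padding at the end of a destination row is left untouched. */
static void copy_rows(uint8_t *dst, size_t dst_pitch,
                      const uint8_t *src, size_t src_pitch,
                      size_t row_bytes, size_t rows)
{
    /* A row must fit inside both pitches, or rows would overlap. */
    assert(row_bytes <= dst_pitch && row_bytes <= src_pitch);

    for (size_t i = 0; i < rows; i++)
        memcpy(dst + i * dst_pitch, src + i * src_pitch, row_bytes);
}
```

Nothing in a loop like this can detect that the destination mapping has no memory behind it; the first store to such a page is where the fault would show up, which fits i = 0 in the backtrace.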

initially I assumed that dst/etc was bogus, but looking at gdb:
Program received signal SIGBUS, Bus error.
__memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:39
39              movdqu  %xmm8, (%rdi)

rdi            0x7fee9b646000   140662785990656

which is mapped as:
7fee9b646000-7feea37e6000 rw-s 185556000 00:05 9484                      /dev/dri/card0
(gdb) print $_siginfo
$1 = {si_signo = 7, si_errno = 0, si_code = 2, _sifields = {_pad = {-1687920640, 32750, 0, 0, 1872902144, 32767, 0, 0, 91, 110, 0, 0, 0, 0, 119, 124, 1872902143, 32767,                                                                     
      574453248, -938495348, 0, 0, 0, 0, 27891376, 0, 682180569, 50}, _kill = {si_pid = -1687920640, si_uid = 32750}, _timer = {si_tid = -1687920640, si_overrun = 32750,                                                                    
      si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _rt = {si_pid = -1687920640, si_uid = 32750, si_sigval = {sival_int = 0, sival_ptr = 0x0}}, _sigchld = {                                                                                
      si_pid = -1687920640, si_uid = 32750, si_status = 0, si_utime = 8044053457088282624, si_stime = 32767}, _sigfault = {si_addr = 0x7fee9b646000}, _sigpoll = {                                                                           
      si_band = 140662785990656, si_fd = 0}}}                                                                                                                                                            

and si_code=2 for a SIGBUS is BUS_ADRERR, which is apparently a non-existent physical address.
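
As a rough userspace analogy (not the radeon case itself), the same signal can be provoked by touching a shared file mapping that has nothing behind it. The sketch below assumes Linux behaviour, where writing to a page mapped beyond the end of an empty file typically raises SIGBUS with si_code BUS_ADRERR; `bus_code_name` and `provoke_sigbus` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_code;

/* Record si_code and jump back out of the faulting store. */
static void on_sigbus(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    fault_code = info->si_code;
    siglongjmp(fault_env, 1);
}

/* Map the SIGBUS si_code values defined by POSIX to their names. */
static const char *bus_code_name(int code)
{
    switch (code) {
    case BUS_ADRALN: return "BUS_ADRALN"; /* invalid address alignment */
    case BUS_ADRERR: return "BUS_ADRERR"; /* nonexistent physical address */
    case BUS_OBJERR: return "BUS_OBJERR"; /* object-specific hardware error */
    default:         return "unknown";
    }
}

/* Map one page of an empty temporary file and write to it.
 * Returns the si_code seen by the handler, or -1 if setup failed
 * or no fault occurred. */
static int provoke_sigbus(void)
{
    struct sigaction sa, old;
    int code = -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    FILE *f = tmpfile(); /* zero-length file: no backing store */
    if (f != NULL) {
        long page = sysconf(_SC_PAGESIZE);
        assert(page > 0);
        volatile char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fileno(f), 0);
        if (p != MAP_FAILED) {
            if (sigsetjmp(fault_env, 1) == 0)
                p[0] = 1;          /* expected to fault */
            else
                code = fault_code; /* reached via siglongjmp */
            munmap((void *)p, (size_t)page);
        }
        fclose(f);
    }
    sigaction(SIGBUS, &old, NULL); /* restore the previous handler */
    return code;
}
```

Calling `bus_code_name(provoke_sigbus())` should then name the code that was delivered. The mapping itself succeeds in both cases; it is only the access that fails, which is consistent with the /dev/dri/card0 mapping looking valid in the maps listing above.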
Comment 1 Alex Deucher 2013-12-27 23:50:02 UTC
Created attachment 91242
possible fix

Does this fix the issue?  The issue is that the buffer is larger than the PCI BAR window into VRAM.  The patch forces a scratch copy for all uploads.  It would probably also be good to fix the kernel to check the offset in radeon_mode_dumb_mmap() when the domain is VRAM, to see if it's larger than the PCI window, and return an error if so.
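
The kind of bounds check suggested here could look roughly like the sketch below. This is not kernel code: `range_is_cpu_visible` and its parameters are hypothetical, and the size of the CPU-visible window is simply passed in as a number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true when the byte range [offset, offset + size)
 * lies entirely inside a CPU-visible window of visible_size bytes.
 * Written so that offset + size is never computed and cannot overflow. */
static bool range_is_cpu_visible(uint64_t offset, uint64_t size,
                                 uint64_t visible_size)
{
    if (size > visible_size)
        return false;
    return offset <= visible_size - size;
}
```

For example, with an assumed 256 MiB window, a 4 KiB range at offset 0 passes, while an 8 KiB range starting 4 KiB below the end of the window does not. A caller could return an error from the mmap path, or fall back to a scratch copy, when the check fails.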
Comment 2 Dave Gilbert 2013-12-28 00:21:55 UTC
Thanks, that does seem to have survived a test session, but as I say it's only partially repeatable, so I'll keep trying tomorrow. If it survives a couple of days I'd say it's good.
Comment 3 Dave Gilbert 2013-12-29 12:54:43 UTC
Hi Alex,
  I think this is surviving for me OK, but if I understand you correctly this is a workaround rather than a fix, since it now does copies that are unnecessary for small images that do fit?

  Also, François Guerraz posted a very similar-looking backtrace from his HD6750M in the evergreen code; it reads to me as the same problem in the matching Evergreen code:
https://bugzilla.redhat.com/show_bug.cgi?id=955617#c31
Comment 4 François Guerraz 2013-12-29 13:22:44 UTC
Created attachment 91287
Workaround for the evergreen driver

A similar patch for the evergreen family. The bug was reproducible 100% of the time in my case (by uploading a bunch of pictures with the image uploader on photobox.com in Firefox 26), and this fixes or works around the problem. It's hard to tell whether there is any performance impact, because it used to crash. However, using that uploader now triggers high (>50%) CPU usage for Xorg.
Comment 5 Michel Dänzer 2014-01-06 09:32:18 UTC
*** Bug 73128 has been marked as a duplicate of this bug. ***
Comment 6 Michel Dänzer 2014-01-06 09:35:15 UTC

*** This bug has been marked as a duplicate of bug 44099 ***

