Bug 43893 - X is crashing very often X: radeon_cs_gem.c:181: cs_gem_write_reloc: Assertion `boi->space_accounted' failed.
Summary: X is crashing very often X: radeon_cs_gem.c:181: cs_gem_write_reloc: Assertio...
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
 
Reported: 2011-12-16 08:38 UTC by Matthias
Modified: 2012-02-09 01:12 UTC (History)

Attachments
xorg.log of crash machine (30.54 KB, text/plain)
2011-12-16 09:07 UTC, Matthias
kernel log (49.55 KB, text/plain)
2011-12-16 09:07 UTC, Matthias
libdrm_radeon: account for write domain or read domains, not both (1.04 KB, patch)
2011-12-20 10:38 UTC, Michel Dänzer
xf86-video-ati: Specify read domains or write domain for space check, not both (1.10 KB, patch)
2011-12-20 10:39 UTC, Michel Dänzer

Description Matthias 2011-12-16 08:38:42 UTC
Hi there,

I encountered the following error:
X: radeon_cs_gem.c:181: cs_gem_write_reloc: Assertion `boi->space_accounted' failed.

on my x86_64 Arch Linux box. I verified it with the newest Arch Xorg version, with the video-ati driver replaced by the git version.

Steps to reproduce (very simple):
Start an xinit session (the default one, without any fancy window manager, just an xterm in it).
In the xterm I ran "yaourt -S milkytracker" to show some coloured output.
When it asks me whether I want to see the PKGBUILD, X crashes.

I attached gdb to X (with the radeon driver's symbols not stripped):


failed to revalidate
X: radeon_cs_gem.c:181: cs_gem_write_reloc: Assertion `boi->space_accounted' failed.

Program received signal SIGABRT, Aborted.
0x00007fd09ade5905 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007fd09ade5905 in raise () from /lib/libc.so.6
#1  0x00007fd09ade6d7b in abort () from /lib/libc.so.6
#2  0x00007fd09adde74e in ?? () from /lib/libc.so.6
#3  0x00007fd09adde7f2 in __assert_fail () from /lib/libc.so.6
#4  0x00007fd098b035eb in ?? () from /usr/lib/libdrm_radeon.so.1
#5  0x00007fd098dbcfd3 in r600_cp_set_surface_sync.isra.0 ()
   from /usr/lib/xorg/modules/drivers/radeon_drv.so
#6  0x00007fd098dc7c29 in r600_finish_op () from /usr/lib/xorg/modules/drivers/radeon_drv.so
#7  0x00007fd098dbba0a in R600Copy () from /usr/lib/xorg/modules/drivers/radeon_drv.so
#8  0x00007fd0982e1148 in ?? () from /usr/lib/xorg/modules/libexa.so
#9  0x00007fd0982e147f in ?? () from /usr/lib/xorg/modules/libexa.so
#10 0x00000000005456ca in miCopyRegion ()
#11 0x0000000000545bc2 in miDoCopy ()
#12 0x00007fd0982df786 in ?? () from /usr/lib/xorg/modules/libexa.so
#13 0x00000000004fa78c in ?? ()
#14 0x000000000042fcf3 in ?? ()
#15 0x0000000000433c59 in ?? ()
#16 0x0000000000422e8a in ?? ()
#17 0x00007fd09add214d in __libc_start_main () from /lib/libc.so.6
#18 0x000000000042317d in _start ()



I have a M4A78LT-M-LE, BIOS 0507    02/23/2010 board
with CPU0: AMD Athlon(tm) II X2 235e Processor stepping 02.

Don't ask me about the onboard video chip; here is what dmesg says:
radeon 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[    3.968925] radeon 0000:01:05.0: setting latency timer to 64
[    3.969170] [drm] initializing kernel modesetting (RS780 0x1002:0x9616 0x1043:0x8388).
[    3.969187] [drm] register mmio base: 0xFEAF0000
[    3.969188] [drm] register mmio size: 65536
[    3.969815] ATOM BIOS: B27722_RS780C
[    3.969831] radeon 0000:01:05.0: VRAM: 32M 0x00000000C0000000 - 0x00000000C1FFFFFF (32M used)
[    3.969833] radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
[    3.970062] [drm] Detected VRAM RAM=32M, BAR=32M
[    3.970065] [drm] RAM width 32bits DDR
[    3.970121] [TTM] Zone  kernel: Available graphics memory: 1010832 kiB.
[    3.970123] [TTM] Initializing pool allocator.
[    3.970146] [drm] radeon: 32M of VRAM memory ready
[    3.970148] [drm] radeon: 512M of GTT memory ready.
[    3.970167] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[    3.970168] [drm] Driver supports precise vblank timestamp query.
[    3.970186] [drm] radeon: irq initialized.
[    3.970190] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    3.971148] [drm] Loading RS780 Microcode



I also installed XFCE4. With very light usage I encountered the same problem. I did not try to reproduce it there, but it might be possible with some steps...

Does nobody except me use this kind of graphics chip under Linux?! The other bug reports I found for this crash are not relevant; the only relevant one was closed as fixed in April 2011.
(http://comments.gmane.org/gmane.linux.debian.devel.x/102978)
Comment 1 Matthias 2011-12-16 08:59:38 UTC
I managed to compile libdrm and the radeon driver with debug symbols (and with -O0). Here is a new backtrace:

failed to revalidate
X: radeon_cs_gem.c:181: cs_gem_write_reloc: Assertion `boi->space_accounted' failed.

Program received signal SIGABRT, Aborted.
0x00007fe4d0943905 in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00007fe4d0943905 in raise () from /lib/libc.so.6
#1  0x00007fe4d0944d7b in abort () from /lib/libc.so.6
#2  0x00007fe4d093c74e in ?? () from /lib/libc.so.6
#3  0x00007fe4d093c7f2 in __assert_fail () from /lib/libc.so.6
#4  0x00007fe4ce5e9856 in cs_gem_write_reloc (cs=0x8e5410, bo=0xc756d0, read_domain=2, 
    write_domain=0, flags=0) at radeon_cs_gem.c:181
#5  0x00007fe4ce5eb380 in radeon_cs_write_reloc (cs=0x8e5410, bo=0xc756d0, read_domain=2, 
    write_domain=0, flags=0) at radeon_cs.c:20
#6  0x00007fe4ce8f4dcf in r600_cp_set_surface_sync (pScrn=0x8d8e40, ib=0x0, sync_type=8388608, 
    size=48, mc_addr=0, bo=0xc756d0, rdomains=2, wdomain=0) at r6xx_accel.c:333
#7  0x00007fe4ce8f92f2 in r600_set_vtx_resource (pScrn=0x8d8e40, ib=0x0, res=0x7fff1b876570, 
    domain=2) at r6xx_accel.c:583
#8  0x00007fe4ce905429 in r600_finish_op (pScrn=0x8d8e40, vtx_size=16) at r6xx_accel.c:1267
#9  0x00007fe4ce8e9240 in R600DoCopy (pScrn=0x8d8e40) at r600_exa.c:514
#10 0x00007fe4ce8e9dd3 in R600Copy (pDst=0x92f000, srcX=4, srcY=17, dstX=4, dstY=4, w=480, h=299)
    at r600_exa.c:752
#11 0x00007fe4cddc7148 in ?? () from /usr/lib/xorg/modules/libexa.so
#12 0x00007fe4cddc747f in ?? () from /usr/lib/xorg/modules/libexa.so
#13 0x00000000005456ca in miCopyRegion ()
#14 0x0000000000545bc2 in miDoCopy ()
#15 0x00007fe4cddc5786 in ?? () from /usr/lib/xorg/modules/libexa.so
#16 0x00000000004fa78c in ?? ()
#17 0x000000000042fcf3 in ?? ()
#18 0x0000000000433c59 in ?? ()
#19 0x0000000000422e8a in ?? ()
#20 0x00007fe4d093014d in __libc_start_main () from /lib/libc.so.6
#21 0x000000000042317d in _start ()



Also, the reproduction steps do not work 100% of the time. Sometimes it crashes only after I perform one action inside yaourt, so it seems to need some more output in the xterm... but mostly it crashes without me doing anything except calling yaourt.
Comment 2 Alex Deucher 2011-12-16 09:02:48 UTC
What version of the ddx (xf86-video-ati) are you using?  Please attach your xorg log and dmesg output.
Comment 3 Matthias 2011-12-16 09:07:14 UTC
Created attachment 54504
xorg.log of crash machine
Comment 4 Matthias 2011-12-16 09:07:50 UTC
Created attachment 54505
kernel log
Comment 5 Matthias 2011-12-16 09:09:23 UTC
I attached the dmesg output and the Xorg log.

The version is git aacbd629b02cbee3f9e6a0ee452b4e3f21376bd3
(also reproducible with extra/xf86-video-ati 6.14.3-1).

I also tried libdrm from git (latest). The latest backtrace was taken with both packages built from git, and so were the attached logs.
Comment 6 Michel Dänzer 2011-12-20 10:38:08 UTC
Created attachment 54605
libdrm_radeon: account for write domain or read domains, not both
Comment 7 Michel Dänzer 2011-12-20 10:39:40 UTC
Created attachment 54606
xf86-video-ati: Specify read domains or write domain for space check, not both
Comment 8 Michel Dänzer 2011-12-20 10:41:27 UTC
I've attached two patches, one of which should fix the root problem in libdrm_radeon, the other of which should avoid the problem in xf86-video-ati. Can you verify that either patch alone fixes / avoids the problem?
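
For readers following along, the idea named in the two patch titles (count a buffer against its write domain or its read domains, not both) can be illustrated with a toy model. This is only a sketch under stated assumptions: every name below (toy_bo, toy_budget, toy_account, toy_reloc_allowed, and the DOMAIN_* values) is hypothetical, and the code does not reproduce the attached patches or the libdrm_radeon source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placement bits, used by this sketch only. */
enum { DOMAIN_GTT = 0x2, DOMAIN_VRAM = 0x4 };

/* Toy buffer object: space_accounted is nonzero once it has been counted. */
struct toy_bo {
    uint32_t size;
    uint32_t space_accounted;
};

/* Toy per-command-stream budget, split by how a buffer is used. */
struct toy_budget {
    uint32_t vram_write;
    uint32_t gtt_write;
    uint32_t read;
};

/*
 * Count a buffer exactly once: against its write domain if one is given,
 * otherwise against its read domains. Never against both.
 * Simplification: any non-VRAM write is lumped into gtt_write.
 */
void toy_account(struct toy_budget *b, struct toy_bo *bo,
                 uint32_t rdomains, uint32_t wdomain)
{
    if (bo->space_accounted)
        return;                      /* already counted, do nothing */

    if (wdomain) {
        if (wdomain & DOMAIN_VRAM)
            b->vram_write += bo->size;
        else
            b->gtt_write += bo->size;
    } else if (rdomains) {
        b->read += bo->size;
    } else {
        return;                      /* no domain given, nothing to count */
    }

    bo->space_accounted = wdomain ? wdomain : rdomains;
}

/* Stand-in for the failed assertion: a relocation needs prior accounting. */
int toy_reloc_allowed(const struct toy_bo *bo)
{
    return bo->space_accounted != 0;
}
```

In this model, a buffer passed with both rdomains and wdomain set is charged only to the write bucket, and a later call for the same buffer changes nothing. A buffer that was never passed to toy_account is rejected by toy_reloc_allowed, which is the kind of situation the `boi->space_accounted` assertion in the backtrace appears to guard against.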
Comment 9 Matthias 2011-12-20 10:45:29 UTC
Thank you very much.

Unfortunately I'm not at home until January 9th. I will try both patches (combined and individual) and reply here at that time.

Maybe someone else is affected too and can try that earlier than me? 3 weeks is a long time :)
Comment 10 Michel Dänzer 2012-01-17 02:24:14 UTC
(In reply to comment #9)
> Unfortunately I'm not at home until January 9th. I will try both patches
> (combined and individual) and reply here at that time.

Did you get a chance to test the patches?
Comment 11 Matthias 2012-02-03 09:20:56 UTC
Hi,

I tested both patches alone and combined just today; sorry for the delay.

I used "old" Arch packages as the base. The patches applied without any problem, and each one alone fixed the bug.

I did not encounter any other problems during my test session (worked about 2 hours with firefox/graphical terminal/milkytracker on the system - this would not have been possible for even one minute without the fix).

Thank you very much!
Comment 12 Matthias 2012-02-03 09:22:49 UTC
Ah, sorry,
I used these versions from today's Arch Linux repository, if it matters:
libdrm-2.4.28
xf86-video-ati-6.14.3
Comment 13 Michel Dänzer 2012-02-08 06:07:10 UTC
Fix pushed, thanks for testing.
Comment 14 aceman 2012-02-08 11:02:28 UTC
Was this the whole X server crashing?
May it be related to bug 33038? There, though, only a single program was crashing.
Comment 15 Michel Dänzer 2012-02-09 01:12:31 UTC
(In reply to comment #14)
> May it be related to bug 33038?

No, that was a bug in r600c, which is now dead in favour of r600g.

