Bug 28995

Summary: [RADEON:KMS::R300G] gl program -> rejects command buffers
Product: Mesa
Reporter: almos <aaalmosss>
Component: Drivers/Gallium/r300
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86 (IA32)   
OS: All   
Attachments: dmesg after trying vdrift

Description almos 2010-07-09 13:14:52 UTC
In some games, hexadecimal numbers are printed to the console, rendering errors occur, and performance drops significantly. Affected games: VDrift, Nexuiz, certain levels in Sauerbraten (e.g. venice), and certain places in Doom3.
Comment 1 Marek Olšák 2010-07-09 13:41:01 UTC
I fixed VDrift with commit 347c00c46e9ecf858a8c21abf58a706b658b5b37 just 2 hours ago. Have you tested the commit?
Comment 2 almos 2010-07-09 14:22:08 UTC
After a git pull the situation is the same here.
Comment 3 Marek Olšák 2010-07-09 18:11:49 UTC
Could you please post your dmesg?
Comment 4 almos 2010-07-10 02:41:08 UTC
Created attachment 36933
dmesg after trying vdrift

Several hundred lines of numbers were printed to the console, but only 22 messages appeared in dmesg.
Comment 5 almos 2010-07-25 05:48:40 UTC
After updating today, VDrift is fixed, but Nexuiz and Sauerbraten are still broken.

dmesg contains lots of these after Sauerbraten:
[153770.722790] [drm:r100_cs_track_texture_check] *ERROR* Texture of unit 0 needs 45184 bytes but is 45056
[153770.722796] [drm:r100_cs_track_texture_print] *ERROR* pitch         1
[153770.722800] [drm:r100_cs_track_texture_print] *ERROR* use_pitch     0
[153770.722803] [drm:r100_cs_track_texture_print] *ERROR* width        32
[153770.722806] [drm:r100_cs_track_texture_print] *ERROR* width_11   2048
[153770.722809] [drm:r100_cs_track_texture_print] *ERROR* height      256
[153770.722811] [drm:r100_cs_track_texture_print] *ERROR* height_11  2048
[153770.722814] [drm:r100_cs_track_texture_print] *ERROR* num levels    8
[153770.722817] [drm:r100_cs_track_texture_print] *ERROR* depth         0
[153770.722820] [drm:r100_cs_track_texture_print] *ERROR* bpp           4
[153770.722822] [drm:r100_cs_track_texture_print] *ERROR* coordinate type   0
[153770.722825] [drm:r100_cs_track_texture_print] *ERROR* width round to power of 2  1
[153770.722828] [drm:r100_cs_track_texture_print] *ERROR* height round to power of 2 1
[153770.722830] [drm:r100_cs_track_texture_print] *ERROR* compress format    2
[153770.722833] [drm:radeon_cs_ioctl] *ERROR* Invalid command stream !
Comment 6 Marek Olšák 2010-07-25 06:28:47 UTC
Which level in sauerbraten?
Comment 7 almos 2010-07-25 06:55:47 UTC
The only map I found so far to be problematic is venice, but the CS reject also happens when I press <ESC> to get to the menu, and select something (except for 'editing...' and 'about...').
Comment 8 Fabio Pedretti 2010-07-26 00:44:41 UTC
This looks like bug #28459, which is already fixed in newer kernels.
Comment 9 Marek Olšák 2010-07-26 04:16:11 UTC
Unfortunately I can't reproduce this bug.
Comment 10 almos 2010-07-28 11:35:44 UTC
I've been using the 2.6.33.1 kernel, but after reading the conclusion of bug #28459, I upgraded to 2.6.35-rc6. Now Sauerbraten venice is OK; however, Nexuiz is still problematic. It says:

[  929.906079] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!

if I look in certain directions on most maps, but in Desert Factory, for example, all directions are bad.
Comment 11 almos 2010-07-28 12:57:03 UTC
An off-topic side note: I needed to downgrade back to 2.6.33.1, as both 2.6.34 and 2.6.35-rc6 hang after an hour or so. The system suddenly freezes, and only the reset button works.
Comment 12 Marek Olšák 2010-07-28 13:28:02 UTC
I recall I was getting "Failed to parse relocation" on a PC with very little RAM and VRAM. I have no idea why, though.
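Note: the -12 in "Failed to parse relocation -12!" is a negated errno value. On Linux, errno 12 is ENOMEM (out of memory), which fits a memory shortage. A minimal Python sketch for decoding such codes from a dmesg line follows; the helper name decode_reloc_error and the regular expression are illustrative assumptions, not part of any driver tooling.

```python
import errno
import os
import re

# Hypothetical helper: extract the error code from a radeon
# "Failed to parse relocation" dmesg line and map it to its errno name.
RELOC_RE = re.compile(r"Failed to parse relocation (-?\d+)")

def decode_reloc_error(line):
    match = RELOC_RE.search(line)
    if match is None:
        return None
    code = abs(int(match.group(1)))  # the kernel reports -errno
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)

line = "[  929.906079] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!"
print(decode_reloc_error(line))
```

The errno names are those of the machine running the script, so this is only meaningful when run on Linux.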
Comment 13 almos 2010-07-28 13:36:59 UTC
I have 2535MB RAM and 128MB VRAM:
[   13.494922] radeon 0000:01:00.0: putting AGP V3 device into 8x mode
[   13.494994] [drm] radeon: VRAM 128M
[   13.495056] [drm] radeon: VRAM from 0x00000000 to 0x07FFFFFF
[   13.495103] [drm] radeon: GTT 64M
[   13.495146] [drm] radeon: GTT from 0xF0000000 to 0xF3FFFFFF
[   13.495207] [drm] radeon: irq initialized.
[   13.495341] [drm] Detected VRAM RAM=128M, BAR=256M
[   13.495390] [drm] RAM width 128bits DDR
[   13.495494] [TTM] Zone  kernel: Available graphics memory: 441928 kiB.
[   13.495542] [TTM] Zone highmem: Available graphics memory: 1297964 kiB.
[   13.495604] [drm] radeon: 128M of VRAM memory ready
[   13.495650] [drm] radeon: 64M of GTT memory ready.
[   13.495925] [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
[   13.495978] [drm] radeon: cp idle (0x10000C03)
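Note: the aperture sizes can be cross-checked from the address ranges in this log. The sketch below is plain arithmetic on the values printed above. The 4 KiB page size is an assumption, although it agrees with the "5462 pages, 21848K" figure in the later TTM dump.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on this i686 system

def range_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1

vram_bytes = range_size(0x00000000, 0x07FFFFFF)  # "VRAM from ... to ..."
gtt_bytes = range_size(0xF0000000, 0xF3FFFFFF)   # "GTT from ... to ..."

print(vram_bytes // 2**20, "MiB VRAM")
print(gtt_bytes // 2**20, "MiB GTT =", gtt_bytes // PAGE_SIZE, "pages")
```

The resulting GTT page count is the same as the "total: 16384" reported by the allocator dump in comment 14.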
Comment 14 almos 2010-08-07 15:34:58 UTC
I have now tried vdrift again with texture size set to large. Most parts of the track flash, and a series of these messages appears in dmesg:

[16009.453899] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
[16009.636217] [TTM] Failed to find memory space for buffer 0xdb60462c eviction.
[16009.636225] [TTM] No space for db60462c (5462 pages, 21848K, 21M)
[16009.636229] [TTM]   placement[0]=0x00070002 (1)
[16009.636232] [TTM]     has_type: 1
[16009.636234] [TTM]     use_type: 1
[16009.636237] [TTM]     flags: 0x00000002
[16009.636240] [TTM]     gpu_offset: 0xF0000000
[16009.636242] [TTM]     size: 16384
[16009.636245] [TTM]     available_caching: 0x00060000
[16009.636248] [TTM]     default_caching: 0x00040000
[16009.636252] [TTM]  0x00000000-0x00000100:      256: used
[16009.636256] [TTM]  0x00000100-0x00000101:        1: used
[16009.636260] [TTM]  0x00000101-0x00000201:      256: used
[16009.636263] [TTM]  0x00000201-0x000002d1:      208: free
[16009.636267] [TTM]  0x000002d1-0x000002d9:        8: used
[16009.636270] [TTM]  0x000002d9-0x00000361:      136: free
[16009.636274] [TTM]  0x00000361-0x00000369:        8: used
[16009.636277] [TTM]  0x00000369-0x00000411:      168: free
[16009.636280] [TTM]  0x00000411-0x00000419:        8: used
[16009.636284] [TTM]  0x00000419-0x000004a1:      136: free
[16009.636287] [TTM]  0x000004a1-0x000004a9:        8: used
[16009.636291] [TTM]  0x000004a9-0x000004c9:       32: free
[16009.636294] [TTM]  0x000004c9-0x000004d1:        8: used
[16009.636298] [TTM]  0x000004d1-0x00000551:      128: free
[16009.636301] [TTM]  0x00000551-0x00000559:        8: used
[16009.636305] [TTM]  0x00000559-0x000005b9:       96: free
[16009.636308] [TTM]  0x000005b9-0x000005c1:        8: used
[16009.636312] [TTM]  0x000005c1-0x000005c9:        8: free
[16009.636315] [TTM]  0x000005c9-0x000005d1:        8: used
[16009.636319] [TTM]  0x000005d1-0x000005d9:        8: used
[16009.636322] [TTM]  0x000005d9-0x000005e1:        8: used
[16009.636326] [TTM]  0x000005e1-0x000005e9:        8: used
[16009.636329] [TTM]  0x000005e9-0x000005f1:        8: used
[16009.636332] [TTM]  0x000005f1-0x000005f9:        8: used
[16009.636336] [TTM]  0x000005f9-0x00000601:        8: used
[16009.636339] [TTM]  0x00000601-0x00000c32:     1585: free
[16009.636343] [TTM]  0x00000c32-0x00000c3a:        8: used
[16009.636346] [TTM]  0x00000c3a-0x00000c42:        8: used
[16009.636350] [TTM]  0x00000c42-0x00000c4a:        8: used
[16009.636353] [TTM]  0x00000c4a-0x00000c52:        8: used
[16009.636357] [TTM]  0x00000c52-0x000013f4:     1954: free
[16009.636360] [TTM]  0x000013f4-0x0000149f:      171: used
[16009.636364] [TTM]  0x0000149f-0x000019d3:     1332: free
[16009.636367] [TTM]  0x000019d3-0x0000247e:     2731: used
[16009.636371] [TTM]  0x0000247e-0x00002f29:     2731: used
[16009.636374] [TTM]  0x00002f29-0x00002f31:        8: used
[16009.636378] [TTM]  0x00002f31-0x00002f39:        8: used
[16009.636381] [TTM]  0x00002f39-0x00002f41:        8: used
[16009.636385] [TTM]  0x00002f41-0x00003172:      561: free
[16009.636388] [TTM]  0x00003172-0x000036c8:     1366: used
[16009.636392] [TTM]  0x000036c8-0x00003987:      703: free
[16009.636395] [TTM]  0x00003987-0x00003a07:      128: used
[16009.636399] [TTM]  0x00003a07-0x00003a1d:       22: used
[16009.636402] [TTM]  0x00003a1d-0x00003a33:       22: used
[16009.636406] [TTM]  0x00003a33-0x00003d8c:      857: free
[16009.636409] [TTM]  0x00003d8c-0x00003e0c:      128: used
[16009.636413] [TTM]  0x00003e0c-0x00003e8c:      128: used
[16009.636416] [TTM]  0x00003e8c-0x00003f0c:      128: used
[16009.636420] [TTM]  0x00003f0c-0x00003f62:       86: used
[16009.636423] [TTM]  0x00003f62-0x00003fb8:       86: used
[16009.636427] [TTM]  0x00003fb8-0x00003fc0:        8: used
[16009.636430] [TTM]  0x00003fc0-0x00003fc8:        8: used
[16009.636434] [TTM]  0x00003fc8-0x00003fd0:        8: used
[16009.636437] [TTM]  0x00003fd0-0x00003fd8:        8: used
[16009.636441] [TTM]  0x00003fd8-0x00003fe0:        8: used
[16009.636444] [TTM]  0x00003fe0-0x00003fe8:        8: used
[16009.636448] [TTM]  0x00003fe8-0x00003ff0:        8: used
[16009.636451] [TTM]  0x00003ff0-0x00003ff8:        8: used
[16009.636455] [TTM]  0x00003ff8-0x00004000:        8: used
[16009.636458] [TTM]  total: 16384, used 8480 free 7904

I also noticed that the kernel hang I wrote about in comment 11 happens shortly after such CS rejects happen. It's interesting that .33 can survive that...
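Note: the dump above reports 7904 free pages in total, but the largest single free hole is 1954 pages, while the buffer needs 5462 contiguous pages. The failure therefore looks like fragmentation of the 64M GTT rather than a simple lack of free pages. A minimal Python sketch for summarizing such a dump follows; the function name summarize_ttm_dump and the regular expression are illustrative assumptions based only on the line format shown here.

```python
import re

# Matches allocator lines such as
# "[16009.636263] [TTM]  0x00000201-0x000002d1:      208: free"
HOLE_RE = re.compile(
    r"\[TTM\]\s+0x[0-9a-fA-F]+-0x[0-9a-fA-F]+:\s+(\d+):\s+(used|free)"
)

def summarize_ttm_dump(text):
    """Return (used_pages, free_pages, largest_free_hole) for a TTM dump."""
    used = free = largest_free = 0
    for pages, state in HOLE_RE.findall(text):
        pages = int(pages)
        if state == "free":
            free += pages
            largest_free = max(largest_free, pages)
        else:
            used += pages
    return used, free, largest_free

# Short excerpt from the dump above; pass the full dmesg text in practice.
excerpt = """\
[16009.636353] [TTM]  0x00000c4a-0x00000c52:        8: used
[16009.636357] [TTM]  0x00000c52-0x000013f4:     1954: free
[16009.636360] [TTM]  0x000013f4-0x0000149f:      171: used
[16009.636364] [TTM]  0x0000149f-0x000019d3:     1332: free
"""
needed_pages = 5462  # from "No space for db60462c (5462 pages, ...)"
used, free, largest = summarize_ttm_dump(excerpt)
print("used:", used, "free:", free, "largest free hole:", largest)
print("buffer fits in one hole:", largest >= needed_pages)
```

Comparing the largest free hole against the requested size, rather than the total free count, is what distinguishes fragmentation from plain exhaustion.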
Comment 15 Marek Olšák 2010-08-07 15:47:10 UTC
It looks like you don't have enough memory and the memory manager is fighting with it. Nevertheless, r300g should not submit a hardlocking CS no matter how much memory you have. I'll see what I can do.
Comment 16 Dave Airlie 2010-08-07 15:56:47 UTC
http://git.kernel.org/?p=linux/kernel/git/airlied/drm-2.6.git;a=commit;h=e376573f7267390f4e1bdc552564b6fb913bce76

might be helpful, or you may need to increase the AGP aperture in the BIOS if you can.

Also, it might be worth trying to boot with radeon.agpmode=-1 as a test.
Comment 17 almos 2010-08-22 07:45:24 UTC
I tried the patch. It improves the situation a bit, but does not solve the problem. After increasing the agpgart from 64MB to 256MB vdrift runs correctly even with texture size set to 'large'. I also tried sauerbraten and nexuiz, and both seem to be ok. After these tests I wanted to report the result, but the kernel (2.6.35-rc6) crashed, so I think the deadlock problem still exists.
Comment 18 Marek Olšák 2010-08-22 07:53:28 UTC
What deadlock problem? Did the 3D driver cause a hardlock or did it freeze when no 3D application was running?
Comment 19 almos 2010-08-22 08:33:45 UTC
After running the above-mentioned games with r300g and a kernel newer than 2.6.33, the kernel freezes completely. This does not necessarily happen during the game; it often happens shortly after closing it. If I don't run any of them, 2.6.35-rc6 works fine, so I suspect there is a correlation. I might be wrong, though...

About other running 3d applications: I use compiz with mesa 7.8.2 installed. Should I try with a non-compositing wm?
Comment 20 Marek Olšák 2010-11-13 12:38:16 UTC
Is this still an issue with current mesa git and kernel 2.6.36?
Comment 21 almos 2010-11-16 15:19:40 UTC
I have now tried with 2.6.36 and mesa git from Nov 12.
vdrift 2009 release with large textures on monaco: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
vdrift 2010 release with the same settings: OK
sauerbraten with venice map: OK
nexuiz with desertfactory map: [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
doom3: I couldn't test, because since I increased GART from 64MB to 256MB, it segfaults immediately with:
  WARNING: vertex array range in virtual memory (SLOW)
  signal caught: Segmentation fault
  si_code 1
  Trying to exit gracefully..

Meanwhile, I found out what caused the hardlock: the preemption model was set to preemptible kernel. With voluntary kernel preemption, it now seems to be stable.
Comment 22 Marek Olšák 2010-11-17 03:59:52 UTC
Aren't you running an x86_64 kernel?
Comment 23 almos 2010-11-17 06:36:47 UTC
Nope, it's i686 on a Pentium 4, which is not 64-bit capable.
Comment 24 Jerome Glisse 2011-02-09 07:59:06 UTC
Is this still an issue with recent kernel + mesa ?
Comment 25 almos 2011-03-09 04:40:53 UTC
I now tried nexuiz, sauerbraten, alien-arena, doom3, quake4, and both versions of vdrift with 2.6.37 kernel and current mesa master. No sign of this bug. Closing.

BTW the doom3 segfault (either complains about vertex array in virtual memory or cannot open libGL.so) is solved by removing the libgcc_s.so.1 and libstdc++.so.5 from its directory. The same applies to quake4.
Comment 26 Tobias Jakobi 2011-03-09 07:30:59 UTC
Yeah, that's because doom3/quake4 and your system's libGL link to incompatible libgcc_s.so.1 and libstdc++.so.5.
Comment 27 almos 2011-03-09 10:25:57 UTC
(In reply to comment #26)
> Yeah, that's because doom3/quake4 and your system's libGL link to incompatible
> libgcc_s.so.1 and libstdc++.so.5.

Yeah, I knew the reason immediately after I removed those. Furthermore, the FAQ on zerowing.idsoftware.com says:
Should I replace libgcc and libstdc++ with the ones from my distro?
While we haven't seen that replacing them will noticeably improve performance, on some distribution (gentoo for instance) this is known to cause crashes. At your own risks.

Now _not_ replacing them causes a crash. Maybe this should be mentioned on http://www.x.org/wiki/RadeonProgram.
Who can edit that page anyways? It seems quite outdated.
Comment 28 Marek Olšák 2011-03-09 14:18:05 UTC
(In reply to comment #27)
> Maybe this should be mentioned on
> http://www.x.org/wiki/RadeonProgram.
> Who can edit that page anyways? It seems quite outdated.

Anyone I guess. All you need to do is to create an account.
