Bug 29901 - [r300g] emit_aos_swtcl crash
Summary: [r300g] emit_aos_swtcl crash
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r300
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Default DRI bug account
Reported: 2010-08-31 02:14 UTC by Niels Clemmensen
Modified: 2010-09-16 11:38 UTC

Attachments
Output from civ4 with RADEON_DEBUG=swtcl (939.20 KB, text/plain)
2010-09-05 20:35 UTC, Tom Stellard
possible fix (3.55 KB, patch)
2010-09-15 17:10 UTC, Marek Olšák

Description Niels Clemmensen 2010-08-31 02:14:59 UTC
Over the last week, many of my programs have started crashing with a segfault (OpenSceneGraph, Blender 2.5.3, MakeHuman, etc.). Other programs using Gallium work fine. They all crash the same way; here is my gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.

#0  0x00007fffe52fd130 in r300_emit_aos_swtcl (r300=0xfceb40, indexed=0 '\000') at r300_emit.c:847
#1  0x00007fffe52fec13 in r300_render_draw_arrays (render=0xb01750, start=<value optimised out>, count=4) at r300_render.c:834
#2  0x00007fffe54bb52b in draw_pt_emit_linear (emit=<value optimised out>, vert_info=<value optimised out>, prim_info=0x7fffffff9930) at draw/draw_pt_emit.c:265
#3  0x00007fffe54ba49e in emit (middle=0xa69800, fetch_info=0x7fffffff9960, prim_info=0x7fffffff9930) at draw/draw_pt_fetch_shade_pipeline_llvm.c:200
#4  llvm_pipeline_generic (middle=0xa69800, fetch_info=0x7fffffff9960, prim_info=0x7fffffff9930) at draw/draw_pt_fetch_shade_pipeline_llvm.c:288
#5  0x00007fffe54ba5a4 in llvm_middle_end_linear_run (middle=0xc7, start=<value optimised out>, count=4, prim_flags=201) at draw/draw_pt_fetch_shade_pipeline_llvm.c:348
#6  0x00007fffe5479981 in vsplit_segment_simple_linear (frontend=0xff60c0, start=368, count=4) at draw/draw_pt_vsplit_tmp.h:229
#7  vsplit_run_linear (frontend=0xff60c0, start=368, count=4) at draw/draw_split_tmp.h:61
#8  0x00007fffe5475ba5 in draw_pt_arrays (draw=0xfe1570, info=0x7fffffffa170) at draw/draw_pt.c:113
#9  draw_vbo (draw=0xfe1570, info=0x7fffffffa170) at draw/draw_pt.c:398
#10 0x00007fffe52ffc86 in r300_swtcl_draw_vbo (pipe=0xfceb40, info=0x7fffffffa170) at r300_render.c:687
#11 0x00007fffe5430b8b in st_draw_vbo (ctx=<value optimised out>, arrays=<value optimised out>, prims=<value optimised out>, nr_prims=<value optimised out>, ib=0x0, index_bounds_valid=<value optimised out>, min_index=0, max_index=511)
    at state_tracker/st_draw.c:722
#12 0x00007fffe5454588 in vbo_save_playback_vertex_list (ctx=0x100cc20, data=0xe178b0) at vbo/vbo_save_draw.c:287
#13 0x00007fffe5340d7a in ext_opcode_execute (ctx=0x100cc20, list=<value optimised out>) at main/dlist.c:533
#14 execute_list (ctx=0x100cc20, list=<value optimised out>) at main/dlist.c:6883
#15 0x00007fffe5343922 in _mesa_CallList (list=3) at main/dlist.c:8140

My system: Radeon 1250x (rs600), Ubuntu Maverick 10.10 64-bit, running Mesa from xorg-edgers "fresh X crack" (Mesa from 2010-08-30).
Comment 1 Tom Stellard 2010-09-05 20:35:27 UTC
Created attachment 38465
Output from civ4 with RADEON_DEBUG=swtcl
Comment 2 Tom Stellard 2010-09-05 20:36:44 UTC
I have the same problem running civ4 via wine on my RC410.  I've bisected this bug and the first bad commit is eb430b0e948caf02b9f4095d0e1435880073c2aa
Comment 3 Dave Airlie 2010-09-06 15:33:01 UTC
Weird, I'm having trouble reproducing this on a real r300 using RADEON_NO_TCL=1.

Niels, can you give me an exact OSG example to run?

Blender 2.4 seems fine here.
Comment 4 Tom Stellard 2010-09-09 16:41:17 UTC
I can reproduce this bug with the game nexuiz (http://alientrap.org/nexuiz/). This is on an RV515 with RADEON_NO_TCL=1. The game crashes right before the main menu loads.
Comment 5 Marek Olšák 2010-09-13 04:30:10 UTC
The problem is:

- r300_render_allocate_vertices allocates a vertex buffer.
- Draw stores vertices in the buffer.
- r300_render_draw_arrays is called.
- r300_prepare_for_rendering is called. Suppose that, inside that function, the command stream turns out to be full and must be flushed: r300_flush calls r300_draw_flush_vbuf, which releases the vertex buffer.
- r300_prepare_for_rendering begins to emit states.
- r300_emit_aos_swtcl is called, which crashes because the vertex buffer has already been released.

The commit 0392e48867c27f2aa445c5c9b35f4a52ecef2f2d in Mesa fixes the regressions I was able to reproduce.

I have split r300_prepare_for_rendering into 2 parts:
- r300_reserve_cs_dwords, which is called before r300_render_allocate_vertices
- r300_emit_states, which is called instead of r300_prepare_for_rendering
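The ordering problem and the split described above can be illustrated with a small, self-contained toy model. This is only a sketch, not the driver code: every `toy_*` name, the struct layout, and the sizes are made up for illustration, and the only thing taken from this report is the order of the steps (reserve command stream space first, then allocate vertices, then emit).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; none of these are the real r300g types or functions. */
struct toy_ctx {
    unsigned cs_used;   /* dwords already written to the command stream */
    unsigned cs_size;   /* capacity of the command stream in dwords */
    float   *vbuf;      /* vertex buffer owned by the context, or NULL */
};

/* Flushing resets the command stream and, as in this bug, releases the vertex buffer. */
void toy_flush(struct toy_ctx *c)
{
    free(c->vbuf);
    c->vbuf = NULL;
    c->cs_used = 0;
}

/* Make room for `dwords` more dwords, flushing if they would not fit. */
void toy_reserve_cs_dwords(struct toy_ctx *c, unsigned dwords)
{
    if (c->cs_used + dwords > c->cs_size)
        toy_flush(c);
}

/* Allocate a fresh vertex buffer, dropping any previous one. */
void toy_allocate_vertices(struct toy_ctx *c, size_t nfloats)
{
    free(c->vbuf);
    c->vbuf = calloc(nfloats, sizeof(float));
}

/* Emit state that refers to the vertex buffer; reports failure if it is gone.
 * In the toy model this check stands in for the crash. */
bool toy_emit(struct toy_ctx *c, unsigned dwords)
{
    if (c->vbuf == NULL)
        return false;
    c->cs_used += dwords;
    return true;
}

/* Old ordering: allocate, then reserve (which may flush), then emit. */
bool toy_draw_old_order(struct toy_ctx *c, unsigned dwords)
{
    toy_allocate_vertices(c, 16);
    toy_reserve_cs_dwords(c, dwords);
    return toy_emit(c, dwords);
}

/* New ordering: reserve first, so any flush happens before the allocation. */
bool toy_draw_new_order(struct toy_ctx *c, unsigned dwords)
{
    toy_reserve_cs_dwords(c, dwords);
    toy_allocate_vertices(c, 16);
    return toy_emit(c, dwords);
}
```

With a nearly full command stream (for example 90 of 100 dwords used and 20 requested), the old ordering loses the buffer before emission, while the new ordering flushes first and then emits against a live buffer.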

Please test.
Comment 6 Tom Stellard 2010-09-14 20:26:59 UTC
Nexuiz works now, but I am still having problems running civ4 with wine. Using mesa from git at commit fd7f70af4897e4e31b11562eb1c473f0ee00fce5, I get this assertion failure:

r300_render.c:973:r300_render_draw_elements: Assertion `6 + (short_count+1)/2 <= (cs_copy->ndw - cs_copy->cdw)' failed.
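Read literally, the assertion checks that 6 dwords plus one dword per pair of indices still fit in the space left in the command stream. The sketch below only reproduces that arithmetic. It assumes that `short_count` counts 16-bit indices packed two per dword, that the constant 6 is fixed per-draw overhead, and that `ndw` and `cdw` are the total and already-used dword counts. None of this is confirmed in this report, and the `toy_*` names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical mirror of the two fields that appear in the assertion. */
struct toy_cs {
    unsigned cdw;   /* assumed: dwords already written */
    unsigned ndw;   /* assumed: total dwords available */
};

/* Dwords the assertion expects to be free for `short_count` indices:
 * 6 dwords of assumed fixed overhead plus two indices per dword, rounded up. */
unsigned toy_dwords_for_indices(unsigned short_count)
{
    return 6 + (short_count + 1) / 2;
}

/* The same condition as the failing assertion, as a predicate. */
bool toy_indices_fit(const struct toy_cs *cs, unsigned short_count)
{
    return toy_dwords_for_indices(short_count) <= cs->ndw - cs->cdw;
}
```

Under these assumptions, a stream with 10 dwords left has room for 8 indices (6 + 4 dwords) but not for 9 (6 + 5 dwords).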


With commit 0b9eb5c9bb03e5134d9a41786178100109e80c5a wine segfaults.  Here is the backtrace:

Backtrace:
=>0 0x7cf4b24f radeon_bo_get_tiling+0xf() in libdrm_radeon.so.1 (0x0033edc4)
  1 0x7d0759de radeon_drm_bufmgr_get_tiling+0x4d(ws=0x3, _buf=0x300091fc, microtiled=0x1, macrotiled=(nil)) [/home/steltho/mesa/src/gallium/winsys/radeon/drm/radeon_drm_buffer.c:379] in r300_dri.so (0x0033edc4)
  2 0x00000000 (0x7c04f798)
  3 0x7d074460 in r300_dri.so (+0x11345f) (0x7d074670)
  4 0x8b182474 (0x891cec83)
0x7cf4b24f radeon_bo_get_tiling+0xf in libdrm_radeon.so.1: movl 0x0(%edx),%edx
Comment 7 Marek Olšák 2010-09-15 05:52:12 UTC
The crash in radeon_bo_get_tiling has been fixed by commit 09ef8e9283f17. I'll look into the assertion failure.
Comment 8 Marek Olšák 2010-09-15 17:10:22 UTC
Created attachment 38727
possible fix

Could you please test this patch?
Comment 9 Tom Stellard 2010-09-15 20:00:07 UTC
(In reply to comment #8)
> Created an attachment (id=38727)
> possible fix
> 
> Could you please test this patch?

This patch causes the game to freeze (no more assertion failure) and outputs the following errors:
Mesa: User error: GL_INVALID_OPERATION in glGetUniformfv(program)
Mesa: 2105 similar GL_INVALID_OPERATION errors
Mesa: User error: GL_INVALID_OPERATION in glUseProgram(program 17 not linked)
Mesa: User error: GL_INVALID_OPERATION in glGetUniformfv(program)
Mesa: 19 similar GL_INVALID_OPERATION errors
Mesa: User error: GL_INVALID_OPERATION in glUseProgram(program 17 not linked)
Mesa: User error: GL_INVALID_OPERATION in glUseProgram(program 17 not linked)


I've seen errors like this before, so they might not be directly related to this bug.
Comment 10 Marek Olšák 2010-09-16 03:01:08 UTC
We have two options for 7.9: either I commit this patch, in which case I must be sure there are no regressions, or I revert all the swtcl fixes and Dave's commit (in 7.9, not in master) and get back to it later when I have some time.

I'll do some additional testing here later.
Comment 11 Tom Stellard 2010-09-16 11:05:32 UTC
(In reply to comment #8)
> Created an attachment (id=38727)
> possible fix
> 
> Could you please test this patch?

I upgraded from 2.6.34 to 2.6.35 and applied this patch, and now civ4 doesn't crash. I also tested 2.6.35 without this patch and still saw the assertion failure, so your patch fixed the assertion failure and upgrading to 2.6.35 fixed the lockup.
Comment 12 Marek Olšák 2010-09-16 11:38:50 UTC
OK, I've pushed the patch, and additional testing here shows no other regressions. Closing.

