Summary: | doom3 segfaults sometimes in the _tnl_DrawElements code | ||
---|---|---|---|
Product: | Mesa | Reporter: | Roland Scheidegger <sroland> |
Component: | Mesa core | Assignee: | mesa-dev |
Status: | RESOLVED NOTOURBUG | QA Contact: | |
Severity: | normal | ||
Priority: | high | ||
Version: | git | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | insert an assertion to make the problem more apparent |
Description
Roland Scheidegger
2006-01-26 03:10:33 UTC
Is it possible to record a demo at the place where the SEGFAULT is hit? If there's only a 20% chance of hitting it (that isn't dependent on exact view parameters), looping the demo 5 times should be able to consistently reproduce it.

(In reply to comment #1)
> Is it possible to record a demo at the place where the SEGFAULT is hit? If
> there's only a 20% hitting it (that isn't dependent on exact view parameters),
> looping the demo 5 times should be able to consistently reproduce it.

This is a good idea. I've recorded a short, yet 6MB, demo: http://homepage.hispeed.ch/rscheidegger/segfault.demo. However, I'm no longer sure this is easily reproducible; it seems highly moon-phase dependent. For instance, if it succeeds once, it will always succeed in subsequent runs. It looks like you need to restart doom3 or at least run a different demo (like demo1). You can try your luck :-).

This bug has unfortunately easily survived all tnl changes so far. A newer backtrace with some more information:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1077327872 (LWP 24516)]
0x41a76d87 in loopback_Color4ubv_f (v=0x11845334 <Address 0x11845334 out of bounds>) at main/api_loopback.c:280
280 COLORF( UBYTE_TO_FLOAT(v[0]), UBYTE_TO_FLOAT(v[1]),
(gdb) bt
#0 0x41a76d87 in loopback_Color4ubv_f (v=0x11845334 <Address 0x11845334 out of bounds>) at main/api_loopback.c:280
#1 0x4190cae2 in _ae_loopback_array_elt (elt=1391) at main/api_arrayelt.c:1293
#2 0x41ac128f in fallback_drawelements (ctx=0xb93bc00, mode=4, count=336, indices=0x52456f60) at tnl/t_array_api.c:75
#3 0x41ac1b48 in _tnl_DrawElements (mode=4, count=336, type=5125, indices=0x52456f60) at tnl/t_array_api.c:393
#4 0x08156450 in ?? ()
(gdb) print *aa->array
$25 = {Size = 4, Type = 5121, Stride = 60, StrideB = 60, Ptr = 0x38 <Address 0x38 out of bounds>, Enabled = 1, Normalized = 1 '\001', BufferObj = 0xa9de738, _MaxElement = 326, Flags = 1}

That doesn't sound right.
_MaxElement is only 326, and we're trying to access element 1391? I wonder what's going on here. Some problem if the stride isn't 0?

Created attachment 7868
insert an assertion to make the problem more apparent

This diff inserts an assertion to make the problem more visible. With it, doom3 reproducibly and instantly fails an assertion where previously some garbage triangles were seen and where a segfault was a bit hard to reproduce. It will also crash in the intro movie, always at the same place, even with the software renderer if you're patient enough... (just make sure you use r_renderer arb; at least r_renderer r200 does not appear to be affected).

(In reply to comment #4)
> Created an attachment (id=7868)
> insert an assertion to make the problem more apparent

I looked a bit closer but couldn't find anything that goes wrong. So I suspect this is a bug in the doom3 arb render path: it seems to sometimes use a (color) vertex buffer which does not contain all the elements (given the stride) it will try to address. Depending on how the buffers are laid out in memory, this usually just means random data will be used, but it sometimes results in a segfault. The spec only mentions "implementation dependent behaviour" in such a case, so if that includes segfaults, mesa's behaviour should be fine...

Sorry for the bug hijacking; this isn't related to Doom 3 at all, but it could be the same bug. I'm not sure. I have actually noticed some cases with my own code where it appears to get "random" data (vertices, texcoords, colors, ...); so far I haven't found the cause of this... So maybe there is a bug in the DrawElements code. I also haven't been able to reliably reproduce this with my code. It seems to happen randomly, 1 time in 100, maybe more. :( This is on R300, btw.

Sounds like a job for valgrind. It'll be slow, but valgrind is probably the best tool for solving intermediate crashes/glitches.
(In reply to comment #6)
> Sorry for the bug hijacking; this isn't related to Doom 3 at all, but it could
> be the same bug. I'm not sure.

I don't think so. I'm pretty sure the issue with doom3 is a game bug (nobody really uses the arb path, so it's probably not that well tested). If you enable CheckArrayBounds you can see (with some additional debug output) that, exactly at the places where it sometimes used to segfault, some DrawRangeElements calls are ignored because they try to access array elements beyond the allocated buffer. This happens ONLY with the arb path (with today's mesa git fixes), and I believe it's always due to the colorPointer buffer. I don't think there is anything mesa can do about that (well, other than always using array bounds checking, which sucks and doesn't work if you don't use ARB_vbo anyway). Thus closing this bug now. Open a new bug for your issues, but I'm afraid you need to be more specific.

(In reply to comment #7)
> Sounds like a job for valgrind. It'll be slow, but valgrind is probably the
> best tool for solving intermediate crashes/glitches.

I've actually tried that, but it's not practical for doom3 at least, simply because it's way too slow. On top of that, you need to filter the output a lot because it detects tons of bogus things by mistake.

Mass version move, cvs -> git
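For readers following the diagnosis above, here is a minimal sketch in C of the kind of check being described: how many elements of a strided array fit in a buffer object, and whether an index list stays inside that range. This is not Mesa's CheckArrayBounds code. The struct, its field names, and both function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one vertex array stored in a buffer object. */
struct array_desc {
    size_t buffer_size; /* total bytes in the buffer object */
    size_t offset;      /* byte offset of element 0 inside the buffer */
    size_t stride;      /* bytes between the starts of consecutive elements */
    size_t elem_size;   /* bytes per element, e.g. 4 for four unsigned bytes */
};

/* Number of elements that lie entirely inside the buffer. */
size_t max_elements(const struct array_desc *a)
{
    assert(a != NULL);
    if (a->stride == 0 || a->buffer_size < a->offset + a->elem_size)
        return 0;
    return (a->buffer_size - a->offset - a->elem_size) / a->stride + 1;
}

/* Return 1 if every index addresses an element inside the buffer, else 0. */
int indices_in_bounds(const struct array_desc *a,
                      const unsigned *indices, size_t count)
{
    size_t limit = max_elements(a);
    for (size_t i = 0; i < count; i++) {
        if (indices[i] >= limit)
            return 0;
    }
    return 1;
}
```

With the values from the gdb output (offset 0x38, stride 60, 4-byte elements) and an assumed buffer size of 19560 bytes, max_elements yields 326, so an index list containing 1391 would be rejected. The buffer size is an assumption picked to match the reported _MaxElement; the actual size is not given in this report.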