Summary: | Illegal instruction _mesa_x86_64_transform_points4_general | ||
---|---|---|---|
Product: | Mesa | Reporter: | John Wimer <john> |
Component: | Mesa core | Assignee: | mesa-dev |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Attachments: |
attachment-30621-0.html
Use prefetcht1 instead of prefetch[w] |
Description
John Wimer
2010-04-07 05:09:31 UTC
Possible workaround is to set the MESA_NO_ASM env var. Further, I noted in the xform4.S file that there are several instances of
.byte 0x66, 0x66, 0x90 /* manual align += 3 */
but one instance of
.byte 0x66, 0x66, 0x66, 0x90 /* manual align += 3 */
in the failing _mesa_x86_64_transform_points4_general function. I think that the point of these lines is to insert 3 nops, actually a single noop preceded by operand size override opcodes, for alignment. The single instance noted above inserts 4 bytes, probably not what was intended. Just a wild guess; hope it helps.

Reassigning away from the driver as it appears to be an issue in the common x86-64 assembly, and hopefully someone more knowledgeable will be able to give a definite answer.

*** Bug 29245 has been marked as a duplicate of this bug. ***

Is this still an issue with the current Mesa master branch?

(In reply to comment #4)
> Is this still an issue with the current Mesa master branch?
I haven't got hardware to test it on, but the code hasn't changed except for this, [http://cgit.freedesktop.org/mesa/mesa/commit/?id=3fda80246f0c41edebdfb4b1ce35bb4726a8c521] and I don't think that is related.

I am experiencing a crash with a SIGILL, Illegal instruction in Debian when using Kodi.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/x86_64-linux-gnu/kodi/kodi.bin --standalone'.
Program terminated with signal SIGILL, Illegal instruction.
#0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
72 prefetch 16(%rdx)
[Current thread is 1 (Thread 0x7f9054aeb9c0 (LWP 791))]

Thread 1 (Thread 0x7f9054aeb9c0 (LWP 791)):
#0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
#1 0x00007f902577102d in run_vertex_stage (ctx=0x1ae3248, stage=<optimized out>) at tnl/t_vb_vertex.c:160
#2 0x00007f902575fc62 in _tnl_run_pipeline (ctx=ctx@entry=0x1ae3248) at tnl/t_pipeline.c:241
#3 0x00007f90258f856f in intelRunPipeline (ctx=0x1ae3248) at intel_tris.c:1086
#4 0x00007f902575f27c in _tnl_draw_prims (ctx=0x1ae3248, prim=0x1b53938, nr_prims=1, ib=0x0, index_bounds_valid=<optimized out>, min_index=0, max_index=7, tfb_vertcount=0x0, stream=0, indirect=0x0) at tnl/t_draw.c:521
#5 0x00007f9025745504 in vbo_exec_vtx_flush (exec=0x1b53158, keepUnmapped=keepUnmapped@entry=0 '\000') at vbo/vbo_exec_draw.c:422
#6 0x00007f902572732f in vbo_exec_wrap_buffers (exec=exec@entry=0x1b53158) at vbo/vbo_exec_api.c:104
#7 0x00007f90257278e3 in vbo_exec_wrap_upgrade_vertex (exec=0x1b53158, attr=attr@entry=3, newSize=newSize@entry=4) at vbo/vbo_exec_api.c:280
#8 0x00007f9025727e73 in vbo_exec_fixup_vertex (ctx=ctx@entry=0x1ae3248, attr=attr@entry=3, newSize=newSize@entry=4, newType=newType@entry=5126) at vbo/vbo_exec_api.c:406
#9 0x00007f902572fe6e in vbo_Color4f (x=<optimized out>, y=<optimized out>, z=<optimized out>, w=<optimized out>) at vbo/vbo_attrib_tmp.h:402
#10 0x00000000009a7535 in CLinuxRendererGL::RenderUpdate(bool, unsigned int, unsigned int) ()
#11 0x000000000099ff84 in CXBMCRenderManager::PresentSingle(bool, unsigned int, unsigned int) ()
#12 0x00000000009a02f2 in CXBMCRenderManager::Render(bool, unsigned int, unsigned int, bool) ()
#13 0x0000000000eb63a8 in CGUIWindowFullScreen::Render() ()
#14 0x000000000081f239 in CGUIControl::DoRender() ()
#15 0x00000000008008a4 in CGUIWindow::DoRender() ()
#16 0x000000000080661e in CGUIWindowManager::RenderPass() const ()
#17 0x0000000000806853 in CGUIWindowManager::Render() ()
#18 0x0000000000d09d33 in CApplication::RenderNoPresent() ()
#19 0x0000000000d0df31 in CApplication::Render() ()
#20 0x0000000000dae551 in CXBApplicationEx::Run() ()
#21 0x0000000000db3dfb in XBMC_Run ()
#22 0x00000000006cb2e8 in main ()

(In reply to Michael Harder from comment #6)
> I am experiencing a crash with a SIGILL, Illegal instruction in Debian when
> using Kodi.
>
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/lib/x86_64-linux-gnu/kodi/kodi.bin --standalone'.
> Program terminated with signal SIGILL, Illegal instruction.
> #0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
> 72 prefetch 16(%rdx)
> [Current thread is 1 (Thread 0x7f9054aeb9c0 (LWP 791))]
Oh what cpu? As far as I can tell, intel cpus never supported "prefetch" (new ones support prefetchw which is the same opcode with different modr/m), only prefetcht0/t1/t2/nta so I wonder how this is supposed to work.

From 'cat /proc/cpuinfo'
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
and from 'lspci'
VGA compatible controller: Intel Corporation 82915G/GV/910GL Integrated Graphics Controller (rev 04)

(In reply to Michael Harder from comment #8)
> From 'cat /proc/cpuinfo'
> model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
Using any special build flags? As said I can't see how this code could work with intel cpus. There's other functions which should work (like _mesa_sse_transform_points4_general) albeit these might be working in 32bit builds only. Not my area of expertise...
At a quick glance USE_X86_64_ASM actually might be defined by default, but this particular cpu instruction just doesn't look like it could run on intel cpus. Unless some cpus tolerate that instruction even if the manuals don't say so (would not be all that surprising even, seems the OP also had a P4, so maybe all later cpus support prefetch/prefetchw for some reason regardless...).
If so the code should be fixed up (replacing prefetch/prefetchw with one of prefetcht0/t1/t2/nta, these should run on all x86_64 capable cpus). I don't really know that code, though...

Created attachment 120821
attachment-30621-0.html
Given that there is a _mesa_3dnow_transform_points4_2d in the x86-64 asm
(using MMX/3DNow! is deprecated in x86-64), it appears that this code was
copy-pasted. I wrote a quick patch to change prefetch[w] to prefetcht1,
which is more or less the equivalent in SSE. However, I'm not actually sure
those prefetches really benefit the code since they appear to be monotonic
addresses and hinting only 16 bytes ahead (a cache line is almost always at
least 32 bytes) -- maybe that sort of testing is for another day.
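
Editorial note: the "only 16 bytes ahead" point can be checked with a little arithmetic. The following is a minimal sketch in C, not code from Mesa; the helper names, the 16-byte stride (one 4-float vertex) and the aligned base address are all assumptions made for illustration. It counts how many prefetches land in the cache line that the current load already touches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa): returns 1 when addr and
 * addr + ahead fall in the same cache line. line_size must be a
 * power of two. */
static int same_cache_line(uintptr_t addr, unsigned ahead, unsigned line_size)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    uintptr_t mask = ~(uintptr_t)(line_size - 1);
    return (addr & mask) == ((addr + ahead) & mask);
}

/* Hypothetical helper: walks n elements starting at base, stride bytes
 * apart, and counts the prefetches of (element + ahead) that hit the
 * line the element itself lives in, i.e. prefetches that fetch nothing new. */
static unsigned redundant_prefetches(uintptr_t base, unsigned stride,
                                     unsigned n, unsigned ahead,
                                     unsigned line_size)
{
    unsigned count = 0;
    for (unsigned i = 0; i < n; i++)
        count += (unsigned)same_cache_line(base + (uintptr_t)i * stride,
                                           ahead, line_size);
    return count;
}
```

Under those assumptions, `redundant_prefetches(0x1000, 16, 8, 16, 64)` reports 6 of 8 prefetches as redundant with 64-byte lines, and 4 of 8 with 32-byte lines, which is consistent with the doubt expressed above about whether these hints help.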
Created attachment 120822
Use prefetcht1 instead of prefetch[w]

This should fix the SIGILL when running this code. It replaces 3DNow! prefetch[w] instructions with SSE prefetcht1. I'm still not convinced that the prefetch logic actually has any beneficial performance characteristics.

(In reply to Patrick Baggett from comment #10)
> Created attachment 120821
> attachment-30621-0.html
>
> Given that there is a _mesa_3dnow_transform_points4_2d in the x86-64 asm
> (using MMX/3DNow! is deprecated in x86-64), it appears that this code was
> copy-pasted. I wrote a quick patch to change prefetch[w] to prefetcht1,
> which is more or less the equivalent in SSE. However, I'm not actually sure
> those prefetches really benefit the code since they appear to be monotonic
> addresses and hinting only 16 bytes ahead (a cache line is almost always at
> least 32 bytes) -- maybe that sort of testing is for another day.
I'd agree that it's dubious that "modern" cpus would benefit - as you said addresses are monotonic and certainly hw prefetchers should handle that pretty well. Though you could argue someone might still use some cpus with terrible prefetchers, and the prefetch instructions should not hurt (at least not much) on modern cpus either...

(In reply to Patrick Baggett from comment #11)
> Created attachment 120822
> Use prefetcht1 instead of prefetch[w]
>
> This should fix the SIGILL when running this code. It replaces 3DNow!
> prefetch[w] instructions with SSE prefetcht1. I'm still not convinced that
> the prefetch logic actually has any beneficial performance characteristics.
I don't see much point in replacing the prefetch[w] instructions in the 3dnow functions, however. Though I'd guess since this is x86_64 it should still work...

(In reply to Patrick Baggett from comment #11)
> Created attachment 120822
> Use prefetcht1 instead of prefetch[w]
>
> This should fix the SIGILL when running this code. It replaces 3DNow!
> prefetch[w] instructions with SSE prefetcht1. I'm still not convinced that
> the prefetch logic actually has any beneficial performance characteristics.
Thank you. This has resolved the issue for me.

It worked for a few days but now I get this:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/x86_64-linux-gnu/kodi/kodi.bin --standalone'.
Program terminated with signal SIGILL, Illegal instruction.
#0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
72 prefetcht1 16(%rdx)
[Current thread is 1 (Thread 0x7fd24af779c0 (LWP 797))]

I've been able to reinstall and get it working with the patch again. Not sure what I was doing wrong before. Do I need to do anything to move this along?

(In reply to Michael Harder from comment #15)
> It worked for a few days but now I get this:
>
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/lib/x86_64-linux-gnu/kodi/kodi.bin --standalone'.
> Program terminated with signal SIGILL, Illegal instruction.
> #0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
> 72 prefetcht1 16(%rdx)
> [Current thread is 1 (Thread 0x7fd24af779c0 (LWP 797))]
I ran into this problem with my new old hardware I've been playing with recently.

The problem can be reproduced running a number of piglit tests such as:

./bin/fbo-stencil readpixels GL_DEPTH24_STENCIL8 -auto -fbo

The patch doesn't fix the problem as it seems prefetcht1 doesn't like offsets.
If I change for example prefetcht1 16(%rdx) -> prefetcht1 (%rdx), removing the offset for all instances, the piglit will now pass. Not sure how to work around this problem.

(In reply to Timothy Arceri from comment #17)
> (In reply to Michael Harder from comment #15)
> > It worked for a few days but now I get this:
> >
> > [Thread debugging using libthread_db enabled]
> > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > Core was generated by `/usr/lib/x86_64-linux-gnu/kodi/kodi.bin --standalone'.
> > Program terminated with signal SIGILL, Illegal instruction.
> > #0 _mesa_x86_64_transform_points4_general () at x86-64/xform4.S:72
> > 72 prefetcht1 16(%rdx)
> > [Current thread is 1 (Thread 0x7fd24af779c0 (LWP 797))]
>
> I ran into this problem with my new old hardware I've been playing with
> recently.
>
> The problem can be reproduced running a number of piglit tests such as:
>
> ./bin/fbo-stencil readpixels GL_DEPTH24_STENCIL8 -auto -fbo
>
>
> The patch doesn't fix the problem as it seems prefetcht1 doesn't like
> offsets.
> If I change for example prefetcht1 16(%rdx) -> prefetcht1 (%rdx) removing
> the offset for all instances the piglit will now pass. Not sure how to work
> around this problem.
That doesn't make sense to me. The offset is just part of the memory operand. Unless the assembler encodes it wrong I can't see why that wouldn't work (which I would think to be unlikely, but the locality hints are also encoded into the mod r/m byte - what's the encoding of the instruction?)
I suppose a solution would just be to ditch prefetch - as was pointed out it's not really far ahead enough in any case, even k8 and p4 had primitive hw prefetchers which should make such a simple software prefetch completely unnecessary.

(In reply to Roland Scheidegger from comment #18)
>
> That doesn't make sense to me. The offset is just part of the memory
> operand. Unless the assembler encodes it wrong I can't see why that wouldn't
> work (which I would think to be unlikely, but the locality hints are also
> encoded into the mod r/m byte - what's the encoding of the instruction?)
> I suppose a solution would just be to ditch prefetch - as was pointed out
> it's not really far ahead enough in any case, even k8 and p4 had primitive
> hw prefetchers which should make such a simple software prefetch completely
> unnecessary.
I shouldn't play with asm past my bedtime ... I was having trouble with the patch applying so recreated it myself. Seems the problem was I missed one instruction. All works well once fixed so I've sent that patch to the mailing list.

Fix pushed.

commit 9c78cfd547a69f6f45d7acaa8ade681640caee95

mesa: Use SSE prefetch instructions rather than 3DNow instructions

64-bit Pentium 4 CPUs don't have the 3DNow prefetch instructions
which results in an Illegal instruction crash.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Timothy Arceri <t_arceri@yahoo.com.au>
https://bugs.freedesktop.org/show_bug.cgi?id=27512
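
Editorial note: on the question of how these instructions are encoded, the sketch below builds the byte sequences by hand. It is an illustration only, not assembler output captured from this bug; the enum and function names are hypothetical, and it covers just the `disp8(%rdx)` addressing form seen in the backtraces. The opcode values follow the published instruction encodings: the 3DNow! pair uses `0F 0D` with /0 (prefetch) and /1 (prefetchw), while the SSE hints use `0F 18` with /0 to /3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, introduced only for this illustration. */
enum prefetch_kind {
    PF_3DNOW_PREFETCH,   /* 0F 0D /0 */
    PF_3DNOW_PREFETCHW,  /* 0F 0D /1 */
    PF_SSE_PREFETCHNTA,  /* 0F 18 /0 */
    PF_SSE_PREFETCHT0,   /* 0F 18 /1 */
    PF_SSE_PREFETCHT1,   /* 0F 18 /2 */
    PF_SSE_PREFETCHT2    /* 0F 18 /3 */
};

/* Encodes "<prefetch> disp(%rdx)" for an 8-bit displacement and returns
 * the number of bytes written to out (3 without an offset, 4 with one). */
static size_t encode_prefetch_rdx(enum prefetch_kind kind, int8_t disp,
                                  uint8_t out[4])
{
    static const uint8_t second_opcode[] = {0x0D, 0x0D, 0x18, 0x18, 0x18, 0x18};
    static const uint8_t reg_field[]     = {0, 1, 0, 1, 2, 3};
    const uint8_t rm_rdx = 2;                 /* r/m = 010 selects (%rdx) */
    const uint8_t mod = disp != 0 ? 1 : 0;    /* 00: no disp, 01: disp8 */
    size_t n = 0;

    assert(kind >= PF_3DNOW_PREFETCH && kind <= PF_SSE_PREFETCHT2);
    out[n++] = 0x0F;
    out[n++] = second_opcode[kind];
    /* The locality hint lives in the reg field of the ModRM byte; the
     * offset only changes the mod field and appends one byte. */
    out[n++] = (uint8_t)((mod << 6) | (reg_field[kind] << 3) | rm_rdx);
    if (mod)
        out[n++] = (uint8_t)disp;
    return n;
}
```

With this sketch, `prefetch 16(%rdx)` comes out as `0F 0D 42 10`, `prefetcht1 16(%rdx)` as `0F 18 52 10`, and `prefetcht1 (%rdx)` as `0F 18 12`. The offset does not touch the opcode or the hint bits, which fits the conclusion above that the remaining SIGILL came from a missed 3DNow! instruction rather than from the offset.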