Bugzilla – Bug 33388
Dubious assembler in xform4.S
Last modified: 2011-01-24 13:35:23 UTC
Created attachment 42347 [details]
Change movzx to movzbl in xform4.S
When building Mesa with the trunk version of clang, using its integrated
assembler, it complains about the ambiguous 'movzx' mnemonics in
clang -c -I../../include -I../../src/mesa -I../../src/mesa/main -I/usr/local/include -O2 -pipe -fno-strict-aliasing -Wall -Wmissing-prototypes -std=c99 -ffast-math -fno-strict-aliasing -fPIC -DUSE_X86_64_ASM -DHAVE_POSIX_MEMALIGN -DUSE_XCB -DPTHREADS -DUSE_EXTERNAL_DXTN_LIB=1 -DIN_DRI_DRIVER -DHAVE_ALIAS -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING x86-64/xform4.S -o x86-64/xform4.o
/tmp/cc-A7bOtT.s:36:2: error: invalid operand for instruction
movzx 20(%rdx), %eax
The mnemonics are ambiguous, because there are no suffixes to specify
whether the source is byte or word-sized. They are accepted by GNU as,
and it apparently just assumes we want to zero-extend a byte:
23: 0f b6 42 14 movzbl 0x14(%rdx),%eax
E.g. move 'b' (byte) to 'l' (dword). The source in question says:
movzx V4F_STRIDE(%rdx), %eax /* stride */
where V4F_STRIDE is the offset to the stride member of the GLvector4f
struct, which is a GLuint.
If the original intention was to ignore the upper bytes of the stride
field, and just use the first byte, then I propose to change the 'movzx'
mnemonics to 'movzbl', which explicitly specify the source and
destination type. See the attached patch.
I'm not an x86 assembly expert and the original author isn't involved anymore. Have you tested this change with a few things?
If there's no concerns I'll commit the patch in a few days.
Yes, I have built and run the Mesa port on FreeBSD several times with
this patch. In any case, it is very simple to show the patch does not
change the output at all. Just use GNU as to assemble the following two
and then disassemble the resulting .o file, which gives:
example.o: file format elf64-x86-64
Disassembly of section .text:
0: 0f b6 42 14 movzbl 0x14(%rdx),%eax
4: 0f b6 42 14 movzbl 0x14(%rdx),%eax
E.g exactly the same opcodes are produced, so nothing will change in the
actual behaviour of the routine at runtime. The only difference is that
now the .S file is unambiguous.
OK, thanks. Committed as 3fda80246f0c41edebdfb4b1ce35bb4726a8c521