Created attachment 105842 [details] Windows 32-bit binary exhibiting the crash Unhandled exception at 0x027B8FAA in glyphy-demo.exe: 0xC0000094: Integer division by zero. > opengl32.dll!lp_rast_shade_quads_mask(lp_rasterizer_task * task, const lp_rast_shader_inputs * inputs, unsigned int x, unsigned int y, unsigned int mask) Line 454 C opengl32.dll!do_block_4_32_2(lp_rasterizer_task * task, const lp_rast_triangle * tri, const lp_rast_plane * plane, int x, int y, const __int64 * c) Line 61 C opengl32.dll!do_block_16_32_2(lp_rasterizer_task * task, const lp_rast_triangle * tri, const lp_rast_plane * plane, int x, int y, const __int64 * c) Line 130 C opengl32.dll!lp_rast_triangle_32_2(lp_rasterizer_task * task, const lp_rast_cmd_arg arg) Line 232 C opengl32.dll!do_rasterize_bin(lp_rasterizer_task * task, const cmd_bin * bin, int x, int y) Line 595 C opengl32.dll!rasterize_scene(lp_rasterizer_task * task, lp_scene * scene) Line 663 C opengl32.dll!thread_function(void * init_data) Line 788 C opengl32.dll!impl_thrd_routine(void * p) Line 144 C Source code: https://github.com/rconde01/glyphy/tree/add_some_demo_features
I haven't really gotten a grasp on the code base, but my assessment is that the issue is: tgsi_exec.c static void micro_idiv(union tgsi_exec_channel *dst, const union tgsi_exec_channel *src0, const union tgsi_exec_channel *src1) { dst->i[0] = src0->i[0] / src1->i[0]; dst->i[1] = src0->i[1] / src1->i[1]; dst->i[2] = src0->i[2] / src1->i[2]; dst->i[3] = src0->i[3] / src1->i[3]; } Section 4.1.3 of the 4.40 spec says: "Division and multiplication operations resulting in overflow or underflow will not cause any exception but will result in an undefined value." Therefore I believe it should be changed to: static void micro_idiv(union tgsi_exec_channel *dst, const union tgsi_exec_channel *src0, const union tgsi_exec_channel *src1) { dst->i[0] = src1->i[0] != 0 ? src0->i[0] / src1->i[0] : 0; dst->i[1] = src1->i[1] != 0 ? src0->i[1] / src1->i[1] : 0; dst->i[2] = src1->i[2] != 0 ? src0->i[2] / src1->i[2] : 0; dst->i[3] = src1->i[3] != 0 ? src0->i[3] / src1->i[3] : 0; } I've confirmed that this fixes my particular issue.
Actually, that only fixes the non-llvm build. I'm not sure how to track down the equivalent piece of code in the llvm variant.
I figured out the LLVM side - I'll have a patch in the next few days.
Ok, I just was thinking it is probably something along these lines, and could reproduce this with some hacked up piglit test doing integer division by zero. Though I got a SIGFPE instead, I guess that's due to different environment. Should be in lp_build_div().
Actually, I think it would be a good idea if you could return 0xffffffff instead of 0. This is typically what GPUs will do, since the d3d10 docs claim this is required, both for div and mod - http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs.85%29.aspx
(In reply to comment #5) > Actually, I think it would be a good idea if you could return 0xffffffff > instead of 0. This is typically what GPUs will do, since the d3d10 docs > claim this is required, both for div and mod - > http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs. > 85%29.aspx Though actually that might not make sense for signed div (which d3d10 doesn't have).
(In reply to comment #6) > (In reply to comment #5) > > Actually, I think it would be a good idea if you could return 0xffffffff > > instead of 0. This is typically what GPUs will do, since the d3d10 docs > > claim this is required, both for div and mod - > > http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs. > > 85%29.aspx > > Though actually that might not make sense for signed div (which d3d10 > doesn't have). And actually I didn't look closely enough, the unsigned versions already handle this (in udiv_emit_cpu() etc for gallivm), so it's just idiv_emit_cpu() which needs fixing. And zero is probably fine, I can't really think of a more useful value (maybe -MAX_INT or something would work too if anyone cares).
for the llvm version I was going to return bld->undef. I figure I might as well change tgsi_exec.c version to 0xffffffff for consistency...but I don't have strong feelings either way.
(In reply to comment #8) > for the llvm version I was going to return bld->undef. I figure I might as > well change tgsi_exec.c version to 0xffffffff for consistency...but I don't > have strong feelings either way. I'm not sure how your code is going to look like, but I can't really see a way to return bld->undef since you will be required to replace the input vector with something sensible (similar to udiv_emit_cpu()) (*). That said, I guess using the same code as udiv_emit_cpu() but omitting the final "or" would also work - in which case the result for the idiv for x / 0 would be x / -1, why not... I'm not really sure it makes sense to return 0xffffffff just for consistency, for uint div this value sort of makes sense, but for signed div it does not - MAX_INT or MIN_INT probably would make more sense. But since d3d10 doesn't support it, and glsl doesn't care, I think whatever is cheapest is ok (another cheap option would be to use a "andnot" instead of the "or" udiv does, in which case you'd get zero - MIN_INT/MAX_INT are going to be slightly more complicated). (*) Because sse doesn't have int div, llvm actually breaks down the vector div to ordinary multiple int divs anyway (and yes it's bound to be very slow), thus breaking it down manually and do per element selection wouldn't really be too bad. Still wouldn't really make sense imho however.
Could be totally borked, but this is what I have at the moment: if (type.floating) { return LLVMBuildFDiv(builder, a, b, ""); } else { //The spec says divide by zero is undefined, but shouldn't throw an exception LLVMValueRef zeromask = lp_build_cmp(bld, PIPE_FUNC_EQUAL, b, lp_build_zero(bld->gallivm,type)); if (type.sign) { return lp_build_select(bld, zeromask, bld->undef, LLVMBuildSDiv(builder, a, b, "")); } else { return lp_build_select(bld, zeromask, bld->undef, LLVMBuildUDiv(builder, a, b, "")); } } I'll take a look at udiv_emit_cpu
(In reply to comment #10) > Could be totally borked, but this is what I have at the moment: > > if (type.floating) { > return LLVMBuildFDiv(builder, a, b, ""); > } > else { > //The spec says divide by zero is undefined, but shouldn't throw an > exception > LLVMValueRef zeromask = lp_build_cmp(bld, > PIPE_FUNC_EQUAL, > b, > lp_build_zero(bld->gallivm,type)); > if (type.sign) { > return lp_build_select(bld, > zeromask, > bld->undef, > LLVMBuildSDiv(builder, a, b, "")); > } > else { > return lp_build_select(bld, > zeromask, > bld->undef, > LLVMBuildUDiv(builder, a, b, "")); > } > } > > I'll take a look at udiv_emit_cpu Hmm that can't quite work since you still do the division by zero, you just don't use the result in the end.
Roland, Perhaps you can help me here. I'm looking at udiv_emit_cpu and I'm just not getting it. The comment says: /* udiv by zero is guaranteed to return 0xffffffff */ but the way I read it, udiv by zero returns the numerator. Am I missing something?
(In reply to comment #12) > Roland, > Perhaps you can help me here. I'm looking at udiv_emit_cpu and I'm just > not getting it. The comment says: > > /* udiv by zero is guaranteed to return 0xffffffff */ > > but the way I read it, udiv by zero returns the numerator. Am I missing > something? Well in the end it is "ored" with the comparison mask, hence it's going to be all ones (comparison masks are either all ones or all zeros). So the first or is there to make sure the zero in the division gets replaced by something else (all ones because this is easiest) so it doesn't crash, and the second or is there to make sure the result is all ones (because d3d10 requires this result, but since d3d10 lacks signed integer division there's not really any such requirement for idiv).
(In reply to comment #13) > (comparison masks are either all ones or all zeros). AHH!!! That was the piece I was missing. My hair follicles thank you.
I've fixed up the commit message and commited it, ffeb77c7b0552a8624e46e65d6347240ac5ae84d.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.