|Summary:||[REGRESSION] [BISECTED] Large CS workgroup sizes broken in combination with FP64 on Intel.|
|Product:||Mesa||Reporter:||Francisco Jerez <currojerez>|
|Component:||Drivers/DRI/i965||Assignee:||Jason Ekstrand <jason>|
|Status:||RESOLVED MOVED||QA Contact:||Intel 3D Bugs Mailing List <intel-3d-bugs>|
|i915 platform:||i915 features:|
Description Francisco Jerez 2019-09-05 21:45:20 UTC
Workgroup sizes large enough to make the Intel compiler back-end generate SIMD32 code are currently broken on master in combination with some shader features that cause the back-end to allocate VGRFs larger than 16 GRFs. VGRFs of that size aren't supported by the register allocator and other back-end compiler passes, and lead to assertion failures like:

"shader_runner: ../src/intel/compiler/brw_fs.cpp:2020: void fs_visitor::split_virtual_grfs(): Assertion `offset <= MAX_VGRF_SIZE' failed.
Aborted (core dumped)"

I've written the following Piglit test that reproduces the issue:

https://gitlab.freedesktop.org/currojerez/piglit/commit/331b8afb47b9b79e5a6f697d4fe1fc2be6702617

I bisected the regression to this change:

"commit f4ef34f207d15bcade7aed644328035dd0f2cc16
Author: Jason Ekstrand <email@example.com>
Date: Wed May 29 17:46:55 2019 -0500

intel/fs: Add an UNDEF instruction to avoid excess live ranges

With 8 and 16-bit types and anything where we have to use non-trivial strides registers to deal with restrictions, we end up with things that look like partial writes even though we don't care about any values in the register except those written by that instruction. This is particularly important when dealing with loops because liveness sees is_partial_write and the fact that an old version from a previous loop iteration may be valid at that point and extends all purely partially written values to the entire loop.

This commit adds a new UNDEF instruction which does nothing (the generator doesn't emit anything) but which does a fake write to the register. This informs liveness that we don't care about any values before that point so it won't consider those registers to be falsely live. We can safely emit UNDEF instructions for all SSA values that come in from NIR and nearly all temporaries generated by various stages of the compiler. In particular, we need to insert UNDEF instructions when we handle region restrictions because the newly allocated registers are almost guaranteed to be partially written.

No shader-db changes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110432
Reviewed-by: Matt Turner <firstname.lastname@example.org>"

I had a half-baked fix for this, but it seems like Jason is in the mood for fixing SIMD32 regressions, so here you've got another one.
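For readers unfamiliar with the 16-GRF limit mentioned in the description, here is a rough back-of-the-envelope sketch of how SIMD32 plus FP64 can exceed it. This is not Mesa code: the names `GRF_BYTES`, `VGRF_LIMIT_GRFS`, `vgrf_size_in_grfs` and `fits_in_vgrf` are hypothetical, the 32-byte GRF size is an assumption about the hardware generations in question, and the 4-component double vector is only an illustrative value, not necessarily what the Piglit test uses.

```cpp
// Illustrative arithmetic only; none of these names exist in Mesa.
// Assumption: one GRF holds 32 bytes.
constexpr unsigned GRF_BYTES = 32;
// Assumption: the back-end limit of 16 GRFs per VGRF quoted in the report.
constexpr unsigned VGRF_LIMIT_GRFS = 16;

// GRFs needed to hold `components` values of `type_bytes` bytes each
// for every one of `simd_width` channels, rounded up to whole GRFs.
constexpr unsigned vgrf_size_in_grfs(unsigned simd_width,
                                     unsigned type_bytes,
                                     unsigned components)
{
    return (simd_width * type_bytes * components + GRF_BYTES - 1) / GRF_BYTES;
}

// True if a value of that shape stays within the assumed per-VGRF limit.
constexpr bool fits_in_vgrf(unsigned simd_width,
                            unsigned type_bytes,
                            unsigned components)
{
    return vgrf_size_in_grfs(simd_width, type_bytes, components) <= VGRF_LIMIT_GRFS;
}
```

Under these assumptions a 4-component FP64 value fits exactly at SIMD16 (16 GRFs) but needs 32 GRFs at SIMD32, while the same vector of 32-bit values still fits at SIMD32. That is consistent with the failure only showing up when large workgroups and FP64 are combined.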
Comment 1 Jason Ekstrand 2019-09-07 01:59:38 UTC
Comment 2 GitLab Migration User 2019-09-25 20:35:26 UTC
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1832.