| Summary: | Talos Principle Vulkan version crash: spirv_to_nir() returns NULL entry_point | ||
|---|---|---|---|
| Product: | Mesa | Reporter: | Eero Tamminen <eero.t.tamminen> |
| Component: | Drivers/DRI/i965 | Assignee: | Jason Ekstrand <jason> |
| Status: | VERIFIED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
| Severity: | normal | ||
| Priority: | medium | ||
| Version: | git | ||
| Hardware: | Other | ||
| OS: | All | ||
| Whiteboard: | |||
| i915 platform: | i915 features: | ||
Manual git bisect gave the following as the first bad commit:
---------------------------------------------------------
commit a7c2be9944a9e2028a02fcfbab501891293401b1
Author: Jason Ekstrand <jason.ekstrand@intel.com>
AuthorDate: Wed Dec 6 09:14:20 2017 -0800
Commit: Jason Ekstrand <jason.ekstrand@intel.com>
CommitDate: Mon Dec 11 22:28:34 2017 -0800

    spirv: Add type validation for OpSelect

    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
---------------------------------------------------------

FYI: Ever since 94ca8e04adf681b0cad6ade1c9f28856efe35ae6, most SPIR-V errors result in spirv_to_nir bailing cleanly and returning NULL. anv_pipeline_compile_to_nir then dereferences the NULL and crashes. A more useful backtrace would be if you set a breakpoint on _vtn_fail and gave me the backtrace from there. Arguably, it may be better to add an abort() to _vtn_fail when built in debug mode, because the NULL dereference is kind of mean.

My dev SSD was completely corrupted this morning (fsck.ext4 has been listing inodes with issues for the last half an hour). -> I won't be able to provide a better backtrace before next year (without a working setup it would take too much time, so stuff other than Steam gets priority until then). :-/

Miraculously, the SSD got into fully working condition eventually (never happened to me before, with that many errors from fsck).

Here's the backtrace you asked for:
----------------------------------------------
Thread 1 "Talos" hit Breakpoint 1, _vtn_fail (b=b@entry=0x5169c60, file=file@entry=0x7fffe6828cc0 "../../../src/compiler/spirv/spirv_to_nir.c",
line=line@entry=3517,
fmt=fmt@entry=0x7fffe6829980 "Condition type of OpSelect must be a scalar or vector of Boolean type. It must have the same number of components as Result Type") at ../../../src/compiler/spirv/spirv_to_nir.c:112
112 {
(gdb) bt
#0 _vtn_fail (b=b@entry=0x5169c60, file=file@entry=0x7fffe6828cc0 "../../../src/compiler/spirv/spirv_to_nir.c", line=line@entry=3517,
fmt=fmt@entry=0x7fffe6829980 "Condition type of OpSelect must be a scalar or vector of Boolean type. It must have the same number of components as Result Type") at ../../../src/compiler/spirv/spirv_to_nir.c:112
#1 0x00007fffe67b5f0c in vtn_handle_body_instruction (b=0x5169c60, opcode=<optimized out>, w=0x3c67bfc, count=<optimized out>)
at ../../../src/compiler/spirv/spirv_to_nir.c:3514
#2 0x00007fffe67ae7a6 in vtn_foreach_instruction (b=b@entry=0x5169c60, start=<optimized out>, end=end@entry=0x3c67c60,
handler=handler@entry=0x7fffe67b4a00 <vtn_handle_body_instruction>) at ../../../src/compiler/spirv/spirv_to_nir.c:323
#3 0x00007fffe67c31e1 in vtn_emit_cf_list (b=b@entry=0x5169c60, cf_list=cf_list@entry=0x5121f38, switch_fall_var=switch_fall_var@entry=0x0,
has_switch_break=has_switch_break@entry=0x0, handler=handler@entry=0x7fffe67b4a00 <vtn_handle_body_instruction>) at ../../../src/compiler/spirv/vtn_cfg.c:703
#4 0x00007fffe67c3562 in vtn_function_emit (b=b@entry=0x5169c60, func=func@entry=0x5121f10,
instruction_handler=instruction_handler@entry=0x7fffe67b4a00 <vtn_handle_body_instruction>) at ../../../src/compiler/spirv/vtn_cfg.c:878
#5 0x00007fffe67b6394 in spirv_to_nir (words=<optimized out>, words@entry=0x3c675e8, word_count=416, spec=spec@entry=0x0, num_spec=num_spec@entry=0,
stage=stage@entry=MESA_SHADER_FRAGMENT, entry_point_name=<optimized out>, options=0x7fffffff8c80, nir_options=0x7fffe6806fc0 <scalar_nir_options>)
at ../../../src/compiler/spirv/spirv_to_nir.c:3742
#6 0x00007fffe6411a55 in anv_shader_compile_to_nir (pipeline=0x5123fd0, pipeline=0x5123fd0, spec_info=0x0, stage=MESA_SHADER_FRAGMENT,
entrypoint_name=0x7fffffff8e40 "", module=0x3c675d0, mem_ctx=0x3c674b0) at ../../../src/intel/vulkan/anv_pipeline.c:149
#7 anv_pipeline_compile (pipeline=pipeline@entry=0x5123fd0, mem_ctx=mem_ctx@entry=0x3c674b0, module=module@entry=0x3c675d0,
entrypoint=entrypoint@entry=0x237b915 "main", stage=stage@entry=MESA_SHADER_FRAGMENT, spec_info=spec_info@entry=0x0, prog_data=0x7fffffff8e40,
map=0x7fffffff8d60) at ../../../src/intel/vulkan/anv_pipeline.c:395
#8 0x00007fffe6412162 in anv_pipeline_compile_fs (pipeline=pipeline@entry=0x5123fd0, cache=cache@entry=0x3af2090, info=info@entry=0x7fffec7ac9b0,
module=module@entry=0x3c675d0, entrypoint=0x237b915 "main", spec_info=0x0) at ../../../src/intel/vulkan/anv_pipeline.c:871
#9 0x00007fffe641393e in anv_pipeline_init (pipeline=pipeline@entry=0x5123fd0, device=device@entry=0x3bfde10, cache=cache@entry=0x3af2090,
pCreateInfo=pCreateInfo@entry=0x7fffec7ac9b0, alloc=0x3bfde18, alloc@entry=0x0) at ../../../src/intel/vulkan/anv_pipeline.c:1347
#10 0x00007fffe65ae8cf in gen9_graphics_pipeline_create (pPipeline=0x7fffffffcaf0, pAllocator=0x0, pCreateInfo=0x7fffec7ac9b0, cache=0x3af2090,
_device=0x3bfde10) at ../../../src/intel/vulkan/genX_pipeline.c:1661
#11 gen9_CreateGraphicsPipelines (_device=0x3bfde10, pipelineCache=0x3af2090, count=1, pCreateInfos=<optimized out>, pAllocator=0x0, pPipelines=0x7fffffffcaf0)
at ../../../src/intel/vulkan/genX_pipeline.c:1864
(gdb) up
#1 0x00007fffe67b5f0c in vtn_handle_body_instruction (b=0x5169c60, opcode=<optimized out>, w=0x3c67bfc, count=<optimized out>)
at ../../../src/compiler/spirv/spirv_to_nir.c:3514
3514 vtn_fail_if(sel_val->type->type != sel_type,
(gdb) info locals
ssa = <optimized out>
sel_type = <optimized out>
res_type = <optimized out>
(gdb) print *b
$1 = {nb = {cursor = {option = nir_cursor_after_instr, {block = 0x516a5d0, instr = 0x516a5d0}}, exact = false, shader = 0x5125820, impl = 0x51220d0},
fail_jump = {{__jmpbuf = {37206293, -845490108815733866, 85082064, 140737488326208, 63337960, 63337960, 845490111369350038, 845546221558336406},
__mask_was_saved = 0, __saved_mask = {__val = {0 <repeats 16 times>}}}}, spirv = 0x3c675e8, shader = 0x5125820, options = 0x7fffffff8c80, block = 0x0,
spirv_offset = 1556, file = 0x0, line = -1, col = -1, const_table = 0x5122830, phi_table = 0x5122950, num_specializations = 0, specializations = 0x0,
value_id_bound = 24916, values = 0x516e490, entry_point_stage = MESA_SHADER_FRAGMENT, entry_point_name = 0x237b915 "main", entry_point = 0x51a5968,
origin_upper_left = true, pixel_center_integer = false, func = 0x0, functions = {head_sentinel = {next = 0x5121f10, prev = 0x0}, tail_sentinel = {next = 0x0,
prev = 0x5121f10}}, func_param_idx = 0, has_loop_continue = false}
(gdb) c
Continuing.
Thread 1 "Talos" received signal SIGSEGV, Segmentation fault.
anv_shader_compile_to_nir (pipeline=0x5123fd0, pipeline=0x5123fd0, spec_info=0x0, stage=MESA_SHADER_FRAGMENT, entrypoint_name=0x7fffffff8e40 "",
module=0x3c675d0, mem_ctx=0x3c674b0) at ../../../src/intel/vulkan/anv_pipeline.c:153
153 nir_shader *nir = entry_point->shader;
----------------------------------------------
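The failure mode visible in this session (the translator bails out through a saved jump buffer and returns NULL, and the caller then dereferences the result) can be illustrated with a minimal C sketch. All names here (`toy_builder`, `toy_fail`, `toy_translate`, `toy_compile`, `TOY_ABORT_ON_FAIL`) are hypothetical stand-ins and not Mesa code; the sketch only assumes the general setjmp/longjmp bail-out pattern suggested by the `fail_jump` member in the `print *b` output, and shows a caller that checks for NULL instead of crashing.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are Mesa types or functions. */
struct toy_shader { int id; };
struct toy_function { struct toy_shader *shader; };

struct toy_builder {
   jmp_buf fail_jump;            /* where toy_fail() unwinds to */
   struct toy_function entry;
   struct toy_shader shader;
};

/* Report the error, then unwind to the setjmp() in toy_translate().
 * Defining the (hypothetical) TOY_ABORT_ON_FAIL macro makes the failure
 * stop right here, which gives a backtrace at the point of the error. */
static void toy_fail(struct toy_builder *b, const char *msg)
{
   fprintf(stderr, "translation error: %s\n", msg);
#ifdef TOY_ABORT_ON_FAIL
   abort();
#endif
   longjmp(b->fail_jump, 1);
}

/* Returns the entry point, or NULL when validation fails. */
static struct toy_function *
toy_translate(struct toy_builder *b, int module_is_valid)
{
   if (setjmp(b->fail_jump))
      return NULL;               /* reached via longjmp(): clean bail-out */

   if (!module_is_valid)
      toy_fail(b, "invalid module");

   b->shader.id = 42;
   b->entry.shader = &b->shader;
   return &b->entry;
}

/* The caller checks for NULL before touching entry_point->shader. */
static struct toy_shader *
toy_compile(struct toy_builder *b, int module_is_valid)
{
   assert(b != NULL);

   struct toy_function *entry_point = toy_translate(b, module_is_valid);
   if (entry_point == NULL)
      return NULL;               /* propagate the failure, no crash */

   return entry_point->shader;
}
```

With this shape, `toy_compile(&b, 0)` returns NULL rather than faulting, while building with `-DTOY_ABORT_ON_FAIL` stops at the failing check itself, similar in spirit to the abort-in-debug-builds idea raised in the thread.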
There's a patch on the list to fix this: https://patchwork.freedesktop.org/patch/193453/

This is fixed by the following commit:

commit 3be382cd7cb637f463a4618dc19d87d66a644b0e
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date: Thu Dec 14 19:53:05 2017 -0800

    spirv: Relax the validation conditions of OpSelect

    The Talos Principle contains shaders with an OpSelect between two
    vectors where the condition is a scalar boolean. This is technically
    against the spec but nir_builder gracefully handles it by splatting
    out the condition to all the channels. So long as the condition is a
    boolean, just emit a warning instead of failing.

    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246

Verified with Mesa git tip.
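As a rough illustration of the relaxed rule described in that commit message, the following C sketch classifies an OpSelect condition type as accepted, accepted with a warning, or rejected, and shows what splatting a scalar condition across all channels amounts to. The types and functions (`toy_type`, `toy_check_select_cond`, `toy_select_splat`) are hypothetical and much simpler than the real SPIR-V front end; in particular, rejecting a boolean vector whose size differs from the result type is an assumption of this sketch, not something the thread states.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified type description; not Mesa's type system. */
enum toy_base { TOY_BOOL, TOY_FLOAT };

struct toy_type {
   enum toy_base base;
   unsigned components;          /* 1 for a scalar, 2..4 for a vector */
};

enum toy_verdict { TOY_OK, TOY_WARN, TOY_FAIL };

/* Classify the condition type of a select against its result type. */
static enum toy_verdict
toy_check_select_cond(struct toy_type cond, struct toy_type result)
{
   assert(result.components >= 1 && result.components <= 4);

   if (cond.base != TOY_BOOL)
      return TOY_FAIL;           /* condition must be Boolean */

   if (cond.components == result.components)
      return TOY_OK;             /* what the spec asks for */

   if (cond.components == 1) {
      /* Scalar Boolean selecting between vectors: warn, do not fail. */
      fprintf(stderr, "warning: select condition has fewer components "
                      "than the result type\n");
      return TOY_WARN;
   }

   return TOY_FAIL;              /* assumption: mismatched vectors fail */
}

/* Apply one scalar condition to every channel ("splatting"). */
static void
toy_select_splat(int cond, const float *obj1, const float *obj2,
                 float *out, unsigned n)
{
   for (unsigned i = 0; i < n; i++)
      out[i] = cond ? obj1[i] : obj2[i];
}
```

A scalar Boolean condition paired with a four-component result yields `TOY_WARN` here, which corresponds to the shader pattern the commit message attributes to The Talos Principle.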
Setup:
- KBL GT3e
- Ubuntu 16.04
- Mesa git version
- Latest Talos Principle available from Steam downloaded
- Steam game launch options set to use Vulkan: "%command% +gfxStrAPI VLK"
- Talos Gfx options set to high GPU speed

Test-case:
- Start Talos Principle

Expected outcome:
- Talos starts, like with Mesa commit "mesa-17.3.0"

Actual outcome:
- Talos Principle segfaults before showing anything

Crash is because of NULL pointer access in spirv->nir fragment shader compilation:
---------------------------------------------------------
Thread 1 "Talos" received signal SIGSEGV, Segmentation fault.
anv_shader_compile_to_nir (pipeline=0x5142730, pipeline=0x5142730, spec_info=0x0, stage=MESA_SHADER_FRAGMENT, entrypoint_name=0x7fffffff90d0 "", module=0x3c69600, mem_ctx=0x37a8170) at ../../../src/intel/vulkan/anv_pipeline.c:153
153 nir_shader *nir = entry_point->shader;
(gdb) bt
#0 anv_shader_compile_to_nir (pipeline=0x5142730, pipeline=0x5142730, spec_info=0x0, stage=MESA_SHADER_FRAGMENT, entrypoint_name=0x7fffffff90d0 "", module=0x3c69600, mem_ctx=0x37a8170) at ../../../src/intel/vulkan/anv_pipeline.c:153
#1 anv_pipeline_compile (pipeline=pipeline@entry=0x5142730, mem_ctx=mem_ctx@entry=0x37a8170, module=module@entry=0x3c69600, entrypoint=entrypoint@entry=0x237b915 "main", stage=stage@entry=MESA_SHADER_FRAGMENT, spec_info=spec_info@entry=0x0, prog_data=0x7fffffff90d0, map=0x7fffffff8ff0) at ../../../src/intel/vulkan/anv_pipeline.c:395
#2 0x00007fffe6056162 in anv_pipeline_compile_fs (pipeline=pipeline@entry=0x5142730, cache=cache@entry=0x3923c20, info=info@entry=0x7fffecabf8f0, module=module@entry=0x3c69600, entrypoint=0x237b915 "main", spec_info=0x0) at ../../../src/intel/vulkan/anv_pipeline.c:871
#3 0x00007fffe605793e in anv_pipeline_init (pipeline=pipeline@entry=0x5142730, device=device@entry=0x3c059c0, cache=cache@entry=0x3923c20, pCreateInfo=pCreateInfo@entry=0x7fffecabf8f0, alloc=0x3c059c8, alloc@entry=0x0) at ../../../src/intel/vulkan/anv_pipeline.c:1347
#4 0x00007fffe61f28cf in gen9_graphics_pipeline_create (pPipeline=0x7fffffffcd80, pAllocator=0x0, pCreateInfo=0x7fffecabf8f0, cache=0x3923c20, _device=0x3c059c0) at ../../../src/intel/vulkan/genX_pipeline.c:1661
#5 gen9_CreateGraphicsPipelines (_device=0x3c059c0, pipelineCache=0x3923c20, count=1, pCreateInfos=<optimized out>, pAllocator=0x0, pPipelines=0x7fffffffcd80) at ../../../src/intel/vulkan/genX_pipeline.c:1864
(gdb) list anv_shader_compile_to_nir
...
149 nir_function *entry_point =
150 spirv_to_nir(spirv, module->size / 4,
151 spec_entries, num_spec_entries,
152 stage, entrypoint_name, &spirv_options, nir_options);
153 nir_shader *nir = entry_point->shader;
(gdb) disassemble
Dump of assembler code for function anv_pipeline_compile:
...
0x00007fffe6055a50 <+256>: callq 0x7fffe63fa130 <spirv_to_nir>
=> 0x00007fffe6055a55 <+261>: mov 0x18(%rax),%rbx
0x00007fffe6055a59 <+265>: mov 0x20(%rsp),%rdi
(gdb) info registers rax rbx
rax 0x0 0
rbx 0x0 0
---------------------------------------------------------

In case it matters, here are variable values & struct contents:
---------------------------------------------------------
(gdb) info locals
device = <optimized out>
spec_entries = 0x0
spirv_options = {lower_workgroup_access_to_offsets = true, caps = {float64 = true, image_ms_array = false, tessellation = true, draw_parameters = true, image_read_without_format = false, image_write_without_format = true, int64 = true, multiview = true, variable_pointers = true, storage_16bit = true}, debug = {func = 0x0, private_data = 0x0}}
entry_point = <optimized out>
nir = <optimized out>
compiler = 0x39d2330
nir_options = 0x7fffe644afc0 <scalar_nir_options>
spirv = 0x3c69618
num_spec_entries = 0
(gdb) print *module
$7 = {sha1 = "Y%cewe\242\022\065\064\225\t\354ͥ\222\222A\333 ", size = 1664, data = 0x3c69618 "\003\002#\a"}
(gdb) print *nir_options
$1 = {lower_fdiv = true, lower_ffma = false, fuse_ffma = false, lower_flrp32 = false, lower_flrp64 = true, lower_fpow = false, lower_fsat = false, lower_fsqrt = false, lower_fmod32 = true, lower_fmod64 = false, lower_bitfield_extract = true, lower_bitfield_insert = true, lower_uadd_carry = true, lower_usub_borrow = true, lower_negate = false, lower_sub = true, lower_scmp = true, lower_idiv = false, fdot_replicates = false, lower_ffract = false, lower_pack_half_2x16 = true, lower_pack_unorm_2x16 = true, lower_pack_snorm_2x16 = true, lower_pack_unorm_4x8 = true, lower_pack_snorm_4x8 = true, lower_unpack_half_2x16 = true, lower_unpack_unorm_2x16 = true, lower_unpack_snorm_2x16 = true, lower_unpack_unorm_4x8 = true, lower_unpack_snorm_4x8 = true, lower_extract_byte = false, lower_extract_word = false, native_integers = true, vertex_id_zero_based = true, lower_cs_local_index_from_id = false, use_interpolated_input_intrinsics = true, max_unroll_iterations = 32}
---------------------------------------------------------

Debug output I got by prefixing launch options with:
gdbserver 127.0.0.1:1234
And in another terminal doing:
(gdb) target remote :1234