Bug 110240 - Assassins Creed Odyssey crashes with nir errors
Summary: Assassins Creed Odyssey crashes with nir errors
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Vulkan/radeon (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: mesa-dev
QA Contact: mesa-dev
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-25 23:47 UTC by James.Dutton
Modified: 2019-03-27 21:28 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Log of wine output from "wine ACOdyssey.exe" (762.80 KB, text/plain)
2019-03-25 23:47 UTC, James.Dutton
Details
Log of wine output from "wine ACOdyssey.exe" after trying revert (128.70 KB, text/plain)
2019-03-27 08:55 UTC, James.Dutton
Details

Description James.Dutton 2019-03-25 23:47:58 UTC
Created attachment 143777 [details]
Log of wine output from "wine ACOdyssey.exe"

Summary:
error: instr->type == glsl_get_array_element(parent->type) (../src/compiler/nir/nir_validate.c:466)

Please see attachment for full log.
git checkout master
at point
git checkout 39da1deb497af55496308c0867b5ce5a0e9df56e
Comment 1 James.Dutton 2019-03-26 00:26:38 UTC
Comment on attachment 143777 [details]
Log of wine output from "wine ACOdyssey.exe"

The following combination is good:
git checkout 5f5ac19f138125b04d8ddedd6334b996f8925a4a
git cherry-pick db07f0554a83bef534525c47d2dd3a2c3fcbf8b9
git cherry-pick 427a6fee439b2df96edc813c56572169385772a6
Comment 2 James.Dutton 2019-03-26 00:45:19 UTC
the git checkout and cherry-pick are from the mesa git.

My graphics card is a AMD Vega 56.
Comment 3 Jason Ekstrand 2019-03-27 03:45:46 UTC
I can reproduce locally and I think I know what the culpret is.  Try reverting 3b3653c4cfbedf55a00cbddd0f341ebd1de81665 on master and see if it fixes it.
Comment 4 James.Dutton 2019-03-27 08:46:15 UTC
git checkout 39da1deb497af55496308c0867b5ce5a0e9df56e
git revert 3b3653c4cfbedf55a00cbddd0f341ebd1de81665

The game ACO still fails to work, with a NIR compile error.
Although, it might be crashing at a different point now.
Comment 5 James.Dutton 2019-03-27 08:55:21 UTC
Created attachment 143786 [details]
Log of wine output from "wine ACOdyssey.exe" after trying revert
Comment 6 Jason Ekstrand 2019-03-27 15:25:16 UTC
This bug appears to be some sort of memory corruption in the type system.  In particular, type names show up garbled in most of the validation fail dumps.  I'm still working on where the corruption could be coming from.

This seems to be related to this commit:

commit 9d0ae777dd68dad682dcc7768726996639ae2684
Author: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Date:   Thu Mar 21 13:26:48 2019 -0700

    spirv: Use interface type for block and buffer block
    
    Also handle GLSL_TYPE_INTERFACE the same way we do GLSL_TYPE_STRUCT in
    various places.  Motivated by ARB_gl_spirv work, that will take
    advantage of the interface types when handling NIR coming from SPIR-V.
    
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Though I have no idea why this is causing problems.:-(
Comment 7 Jason Ekstrand 2019-03-27 16:25:13 UTC
Got a proper bisect:

commit 4e1bbb000cdfe4ba01bee5a6868c54fed7285dae
Author: Tapani Pälli <tapani.palli@intel.com>
Date:   Tue Mar 19 11:01:53 2019 +0200

    anv/radv: release memory allocated by glsl types during spirv_to_nir
    
    Fixes leaks for each glsl_type generated:
    
       ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
       ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
       ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
       ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
       ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
       ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
       ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
       ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
       ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
       ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
       ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
       ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
       ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)
    
    v2: move release call to vkDestroyInstance
    v3: apply fix also to radv driver
    
    Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
    Cc: mesa-stable@lists.freedesktop.org
    Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

I've reverted that patch on master and it's now playing properly.
Comment 8 James.Dutton 2019-03-27 17:42:07 UTC
git checkout master  at point:
git checkout 15f131b7b7fdb9baca69e19eedf16b4568ab32c8
is Good. The ACO benchmark test passes.
Comment 9 James.Dutton 2019-03-27 18:33:34 UTC
Fixed in git master.
Comment 10 Jason Ekstrand 2019-03-27 21:28:22 UTC
Fixed by the following commit in master:

commit ce47999ceed7efe010a1b6cc592780514803670a (public/master)
Author: Jason Ekstrand <jason@jlekstrand.net>
Date:   Wed Mar 27 11:16:15 2019 -0500

    Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"
    
    This reverts commit 4e1bbb000cdfe4ba01bee5a6868c54fed7285dae.  It turns
    out that some DXVK apps due to some implementation detail of DXVK or
    other create and destroy instances in an interleaved way.  Freeing the
    glsl_type memory without being a bit more careful causes use-after-free
    issues.  Looks like we need to try again.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.