Summary: | [i965 on HD4600 Haswell] xcom switch to ingame cinematics cause segmentation fault | ||
---|---|---|---|
Product: | Mesa | Reporter: | _archuser_ <banana_banana_banana> |
Component: | Drivers/DRI/i965 | Assignee: | Matt Turner <mattst88> |
Status: | RESOLVED FIXED | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | banana_banana_banana |
Version: | git | Keywords: | bisected, regression |
Hardware: | Other | ||
OS: | All | ||
URL: | https://lists.freedesktop.org/archives/mesa-dev/2016-August/126898.html | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | [PATCH] nir: Walk blocks in source code order in lower_vars_to_ssa. |
Description
_archuser_
2016-08-06 12:09:29 UTC
(In reply to _archuser_ from comment #0)
> Specific parts of xcom: enemy unknown cause a segfault in i965_dri.so
> (64bit), last known mesa version to work without segfault was 11.2.2-1 (arch
> package from Feb 11, if i965_dri.so loaded via LIBGL_DRIVERS_PATH, it still
> works).
> Xcom seems to do some specific things when switching to ingame cinematics
> (e.g. after first mission transition, some stuff near endgame etc.) and only
> crashes on these parts, rest of the game seems to work fine.
>
> apitrace / glretrace produce traces that always crash semi-randomly during
> replay and are unusable for me atm.

If the trace was used to get the backtrace, I suspect it'll be useful to me. Could you post it somewhere?

> Trace shows recursive calls to nir_phi_builder_value_get_block_def ? ->
> stack overflow causing SIGSEGV ?

That looks plausible.

> Not a C programmer here, took some time to set this up to work at all.

I'm very appreciative!

I created a new trace via the apitrace wrapper library to run while running gdb using latest mesa-git, the new trace and backtrace can be found here (glretrace still crashes randomly on replay though):

https://drive.google.com/open?id=0B_oUaHk11vY2T2t5X1lVUEdLNDg

(In reply to _archuser_ from comment #2)
> I created a new trace via the apitrace wrapper library to run while running
> gdb using latest mesa-git, the new trace and backtrace can be found here
> (glretrace still crashes randomly on replay though):
>
> https://drive.google.com/open?id=0B_oUaHk11vY2T2t5X1lVUEdLNDg

Thanks. I have run this trace in valgrind on my Haswell, with master and 12.0.1, and I cannot reproduce the same backtrace as you.
However, I do see

==26838== Thread 4:
==26838== Invalid read of size 2
==26838==    at 0x4C30D78: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26838==    by 0xA75A1BD: memcpy (string3.h:53)
==26838==    by 0xA75A1BD: copy_array_to_vbo_array.isra.1 (brw_draw_upload.c:428)
==26838==    by 0xA75AB98: brw_prepare_vertices (brw_draw_upload.c:663)
==26838==    by 0xA75B1C8: brw_emit_vertices (brw_draw_upload.c:779)
==26838==    by 0xA76FACC: check_and_emit_atom (brw_state_upload.c:761)
==26838==    by 0xA76FACC: brw_upload_pipeline_state (brw_state_upload.c:874)
==26838==    by 0xA76FACC: brw_upload_render_state (brw_state_upload.c:896)
==26838==    by 0xA75941B: brw_try_draw_prims (brw_draw.c:582)
==26838==    by 0xA75941B: brw_draw_prims (brw_draw.c:673)
==26838==    by 0xA57B837: vbo_validated_drawrangeelements (vbo_exec_array.c:813)
==26838==    by 0xA57BB77: vbo_exec_DrawRangeElementsBaseVertex (vbo_exec_array.c:907)
==26838==    by 0x4AC5A5: ??? (in /usr/bin/glretrace)
==26838==    by 0x40CDA8: ??? (in /usr/bin/glretrace)
==26838==    by 0x40D3E7: ??? (in /usr/bin/glretrace)
==26838==    by 0x40D7B1: ??? (in /usr/bin/glretrace)

I also see

==26838== Invalid read of size 4
==26838==    at 0xA83D002: brw::vec4_visitor::var_range_end(unsigned int, unsigned int) const (brw_vec4_live_variables.cpp:335)
==26838==    by 0xA82AFE9: brw::vec4_visitor::opt_register_coalesce() (brw_vec4.cpp:1114)
==26838==    by 0xA82F18D: brw::vec4_visitor::run() [clone .part.18] [clone .constprop.19] (brw_vec4.cpp:1992)
==26838==    by 0xA82FE5F: brw_compile_vs (brw_vec4.cpp:2175)
==26838==    by 0xA774E99: brw_codegen_vs_prog (brw_vs.c:194)
==26838==    by 0xA7754F2: brw_vs_precompile (brw_vs.c:405)
==26838==    by 0xA75F93A: brw_shader_precompile (brw_link.cpp:65)
==26838==    by 0xA75F93A: brw_link_shader (brw_link.cpp:283)
==26838==    by 0xA6012BA: _mesa_glsl_link_shader (ir_to_mesa.cpp:3070)
==26838==    by 0xA51534B: _mesa_link_program.part.20 (shaderapi.c:1093)
==26838==    by 0x5816AD: ??? (in /usr/bin/glretrace)
==26838==    by 0x40CDA8: ??? (in /usr/bin/glretrace)
==26838==    by 0x40D3E7: ??? (in /usr/bin/glretrace)

which I've fixed with commit

commit e7c376adfdecd4c1333997c8be8bb066a87c67b4
Author: Matt Turner <mattst88@gmail.com>
Date:   Thu Aug 18 15:54:47 2016 -0700

    i965/vec4: Ignore swizzle of VGRF for use by var_range_end().

I'll investigate the problem in copy_array_to_vbo_array now.

I've tried Mesa master and 12.0.1 compiled with and without debugging (-O0 -ggdb3) with gcc 4.9.3 and gcc 5.4.0. A co-worker tried on Haswell and Arch, with the default Arch packages, as well as mesa master compiled with gcc 6.1.1. He could only reproduce the same problems I've noted in this comment.

I wouldn't think the problem in copy_array_to_vbo_array was related to the crash you're seeing, but I guess we won't know for sure until it's fixed.

I should also mention that I can reproduce the two problems I noted with Mesa 11.2.2, so they seem unlikely to be the cause of the problem you see.

Another idea, you might try reverting this commit:

commit 2c1c060b031a7c179653ee83f28f7325c47ebd04
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Tue May 10 14:22:13 2016 -0700

    util/ralloc: Remove double zero'ing of rzalloc buffers

It is in the 12.0 branch but not 11.2.

(In reply to Matt Turner from comment #5)
> Another idea, you might try reverting this commit:
>
> commit 2c1c060b031a7c179653ee83f28f7325c47ebd04
> Author: Jordan Justen <jordan.l.justen@intel.com>
> Date: Tue May 10 14:22:13 2016 -0700
>
> util/ralloc: Remove double zero'ing of rzalloc buffers
>
> It is in the 12.0 branch but not 11.2.

Note that both before and after this patch we should be (still) zero filling all ralloc'd buffers by using calloc. This patch was just removing the additional memset when rzalloc was used.

(In reply to Jordan Justen from comment #6)
> This patch was just removing the additional memset
> when rzalloc was used.

Oh, right. That's probably not the culprit then.
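To make the point about the rzalloc patch concrete, here is a minimal sketch in C of why dropping a second memset cannot change what callers see when the underlying allocation already comes from calloc. The names `toy_alloc`, `toy_zalloc_before`, `toy_zalloc_after` and `all_zero` are hypothetical stand-ins invented for this illustration; this is not Mesa's ralloc code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical base allocator: like the thread describes for ralloc,
 * it is assumed to obtain its memory from calloc, so the buffer is
 * already zero-filled when it is returned. */
static void *toy_alloc(size_t size)
{
    return calloc(1, size);
}

/* "Before" variant: the zeroing allocator clears the buffer again. */
static void *toy_zalloc_before(size_t size)
{
    void *p = toy_alloc(size);
    if (p != NULL)
        memset(p, 0, size);  /* redundant: calloc already zeroed it */
    return p;
}

/* "After" variant: the extra memset is gone, the contents are the same. */
static void *toy_zalloc_after(size_t size)
{
    return toy_alloc(size);
}

/* Returns 1 if every byte of the buffer is zero, 0 otherwise. */
static int all_zero(const void *p, size_t size)
{
    const unsigned char *bytes = p;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}
```

Calling `all_zero` on a buffer from either variant gives the same answer, which is why reverting such a change would not be expected to affect the crash.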
I just played the first mission of Enemy Within and got a segfault immediately before the cinematic began. Progress!

After a very painful bisect, I've found this commit to be the culprit:

commit 7d539080c1a491aff9fb3e90c25df89884477aa8
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Tue Nov 17 00:26:37 2015 -0800

    nir: Add a writemask to store intrinsics.

Interestingly, it is in 11.2. I found 11.2-branchpoint to be bad and 11.1-branchpoint to be good, and so those were the starting points of my bisection.

I've got a fix. Patch on the list:

[PATCH] nir: Walk blocks in source code order in lower_vars_to_ssa.

Created attachment 126028
[PATCH] nir: Walk blocks in source code order in lower_vars_to_ssa.

Thanks again for the great bug report. I've pushed the fix as

commit e53130cc27b966a09d48be53cb51e09ea7ad0649
Author: Matt Turner <mattst88@gmail.com>
Date:   Wed Aug 24 19:25:58 2016 -0700

    nir: Walk blocks in source code order in lower_vars_to_ssa.

and tagged it for the stable branch, so it should be available in 11.2.3 and 12.0.2.
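The reported symptom (deep recursion in nir_phi_builder_value_get_block_def) and the title of the fix (walking blocks in source code order) suggest that visiting order can decide how deep a cached recursive lookup goes. The toy model below is only a sketch of that general effect, not Mesa's implementation: it assumes a straight chain of blocks where each block's value depends on the previous block, and every name in it (`toy_state`, `toy_get_block_def`, `toy_max_depth`, `TOY_NUM_BLOCKS`) is hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TOY_NUM_BLOCKS 1000

/* Hypothetical toy state: block i's only predecessor is block i - 1. */
struct toy_state {
    int resolved[TOY_NUM_BLOCKS]; /* 1 once a block's value is cached */
    int depth;                    /* current recursion depth */
    int max_depth;                /* deepest recursion observed so far */
};

/* Resolve a block's value by resolving its predecessor first, then
 * caching the result so later lookups return immediately. */
static void toy_get_block_def(struct toy_state *s, int block)
{
    if (s->resolved[block])
        return;

    s->depth++;
    if (s->depth > s->max_depth)
        s->max_depth = s->depth;

    if (block > 0)
        toy_get_block_def(s, block - 1);

    s->resolved[block] = 1;
    s->depth--;
}

/* Visit every block once, either first-to-last (source order) or
 * last-to-first, and report the deepest recursion that occurred. */
static int toy_max_depth(int source_order)
{
    struct toy_state s;
    memset(&s, 0, sizeof(s));

    for (int i = 0; i < TOY_NUM_BLOCKS; i++) {
        int block = source_order ? i : TOY_NUM_BLOCKS - 1 - i;
        toy_get_block_def(&s, block);
    }
    return s.max_depth;
}
```

In this model, visiting blocks first-to-last keeps the recursion one level deep because the predecessor is always cached already, while starting from the last block recurses through the whole chain. On a shader with very many blocks, that second pattern is the kind of behaviour that could exhaust the stack.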