Bug 65654 - Bugs in i965 driver reported by Valgrind memcheck
Summary: Bugs in i965 driver reported by Valgrind memcheck
Status: VERIFIED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965 (show other bugs)
Version: git
Hardware: Other All
: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on: 66779 68233
Blocks:
  Show dependency treegraph
 
Reported: 2013-06-11 13:55 UTC by Eero Tamminen
Modified: 2014-12-30 11:18 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments

Description Eero Tamminen 2013-06-11 13:55:04 UTC
Running apitrace's eglretrace under Valgrind memcheck reported following problems in Mesa's DRI i965 driver:
-------------------------
==19691== Conditional jump or move depends on uninitialised value(s)
==19691==    at 0x4C2B0C1: bcmp (mc_replace_strmem.c:889)
==19691==    by 0x803B241: brw_vs_prog_data_compare (brw_vs.c:231)
==19691==    by 0x8013D91: brw_upload_cache (brw_state_cache.c:218)
==19691==    by 0x803ACA1: do_vs_prog (brw_vs.c:376)
==19691==    by 0x803B5F7: brw_vs_precompile (brw_vs.c:609)
==19691==    by 0x8013317: brw_link_shader (brw_shader.cpp:72)
==19691==    by 0x84D57D1: _mesa_glsl_link_shader (ir_to_mesa.cpp:3221)
==19691==    by 0x8393AED: link_program (shaderapi.c:784)
==19691==    by 0x5789E5: retrace_glLinkProgram(trace::Call&) (glproc.hpp:12147)

==19691== Conditional jump or move depends on uninitialised value(s)
==19691==    at 0x8013D94: brw_upload_cache (brw_state_cache.c:218)
==19691==    by 0x803ACA1: do_vs_prog (brw_vs.c:376)
==19691==    by 0x803B5F7: brw_vs_precompile (brw_vs.c:609)
==19691==    by 0x8013317: brw_link_shader (brw_shader.cpp:72)
==19691==    by 0x84D57D1: _mesa_glsl_link_shader (ir_to_mesa.cpp:3221)
==19691==    by 0x8393AED: link_program (shaderapi.c:784)
==19691==    by 0x5789E5: retrace_glLinkProgram(trace::Call&) (glproc.hpp:12147)

==19691== Conditional jump or move depends on uninitialised value(s)
==19691==    at 0x8013D94: brw_upload_cache (brw_state_cache.c:218)
==19691==    by 0x803ACA1: do_vs_prog (brw_vs.c:376)
==19691==    by 0x803B0EC: brw_upload_vs_prog (brw_vs.c:551)
==19691==    by 0x8016051: brw_upload_state (brw_state_upload.c:506)
==19691==    by 0x7FD2DFE: brw_draw_prims (brw_draw.c:446)
==19691==    by 0x83E8233: vbo_draw_arrays (vbo_exec_array.c:624)
==19691==    by 0x4C3D9C: retrace_glDrawArrays(trace::Call&) (glproc.hpp:8982)

==19691== Conditional jump or move depends on uninitialised value(s)
==19691==    at 0x8013C2B: brw_search_cache (brw_state_cache.c:159)
==19691==    by 0x7FC62D0: brw_blorp_clear_params::get_wm_prog(brw_context*, brw_blorp_prog_data**) const (brw_blorp_clear.cpp:173)
==19691==    by 0x8045E4C: gen7_blorp_exec(intel_context*, brw_blorp_params const*) (gen7_blorp.cpp:845)
==19691==    by 0x7FC1A68: brw_blorp_exec(intel_context*, brw_blorp_params const*) (brw_blorp.cpp:178)
==19691==    by 0x7FC640F: brw_blorp_clear_color (brw_blorp_clear.cpp:298)
==19691==    by 0x7FC7822: brw_clear (brw_clear.c:245)
==19691==    by 0x4D77AA: retrace_glClear(trace::Call&) (glproc.hpp:4602)
-------------------------

I.e. 3 differe problems.  At least first 2 would seem to be real bugs.


First one is for this code:
-------- dri/i965/brw_vs.c code -------
   /* Compare the rest of the struct. */
   const unsigned offset = sizeof(struct brw_vec4_prog_data);
   if (memcmp(((char *) &a) + offset, ((char *) &b) + offset,
              sizeof(struct brw_vs_prog_data) - offset)) {
      return false;
   }
--------------------------
(last touched by Paul Berry in commit 5fff3752c88255ea3f4eb26cddb2c996694b33b1)

Which does comparison on padding bytes of brw_vs_prog_data, which apparently aren't initialized as Valgrind complains about them.  Structure layout is following according to pahole:
--------------------------
struct brw_vs_prog_data {
        struct brw_vec4_prog_data  base;                 /*     0   184 */
        /* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
        GLbitfield64               inputs_read;          /*   184     8 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        _Bool                      uses_vertexid;        /*   192     1 */

        /* size: 200, cachelines: 4, members: 3 */
        /* padding: 7 */
        /* last cacheline: 8 bytes */
};
--------------------------


Second issue is prog_offset not being initialized before use which also can cause random behavior, depending on stack content:
--------------------------
bool
brw_search_cache(struct brw_cache *cache,
                 enum brw_cache_id cache_id,
                 const void *key, GLuint key_size,
                 uint32_t *inout_offset, void *out_aux)
{
...
   if (item->offset != *inout_offset) {
...
uint32_t
brw_blorp_clear_params::get_wm_prog(struct brw_context *brw,
                                   brw_blorp_prog_data **prog_data) const
{
   uint32_t prog_offset;
   if (!brw_search_cache(&brw->cache, BRW_BLORP_CLEAR_PROG,
                         &this->wm_prog_key, sizeof(this->wm_prog_key),
                         &prog_offset, prog_data)) {
--------------------------
(added by Eric Anholt in commit e56095dc2e40d6d1e37e123c694a609d16932b4a)


On quick look I didn't see what's the problem with the last valgrind warning, but the code which it complains about is this:
--------------------------
  if (!cache->aux_compare[result_item->cache_id](item_aux, aux,
                                                 item->aux_size,
                                                 item->key))
     continue;
--------------------------

I didn't check whether there are similar issues in other drivers.
Comment 1 Eero Tamminen 2013-08-05 14:07:38 UTC
> Running apitrace's eglretrace under Valgrind memcheck

Note: trace was of GLBenchmark Egypt test that had been run on Android. Retracing was done on Debian stable with new Mesa libs and latest Apitrace from git.


> On quick look I didn't see what's the problem with the last valgrind warning

That was reported also separately, as bug 66779, which got just fixed. Adding dep to that.
Comment 2 Kenneth Graunke 2013-09-17 06:04:29 UTC
I just posted a patch to mesa-div which fixes the first case:
"i965: Fix brw_vs_prog_data_compare to actually check field members."
(unfortunately the list archive hasn't updated yet, otherwise I'd link it)

That case was also reported as bug #68233.
Comment 3 Eero Tamminen 2013-09-17 07:45:43 UTC
(In reply to comment #2)
> I just posted a patch to mesa-div which fixes the first case:
> "i965: Fix brw_vs_prog_data_compare to actually check field members."
> (unfortunately the list archive hasn't updated yet, otherwise I'd link it)

Oh, I had missed it using the wrong variable (or using &)... <blush>

> That case was also reported as bug #68233.

Adding as dep.

The only valgrind report left from GLBenchmark startup strace eglretrace is this (which happens also with Mesa build from last week):
------------------------------------------
==19691== Conditional jump or move depends on uninitialised value(s)
==19691==    at 0x8013D94: brw_upload_cache (brw_state_cache.c:218)
==19691==    by 0x803ACA1: do_vs_prog (brw_vs.c:376)
==19691==    by 0x803B5F7: brw_vs_precompile (brw_vs.c:609)
==19691==    by 0x8013317: brw_link_shader (brw_shader.cpp:72)
==19691==    by 0x84D57D1: _mesa_glsl_link_shader (ir_to_mesa.cpp:3221)
==19691==    by 0x8393AED: link_program (shaderapi.c:784)
==19691==    by 0x5789E5: retrace_glLinkProgram(trace::Call&) (glproc.hpp:12147)
------------------------------------------

I.e. this code:
------------------------------------------
  if (!cache->aux_compare[result_item->cache_id](item_aux, aux,
                                                 item->aux_size,
                                                 item->key))
     continue;
------------------------------------------

From which I couldn't see what variable was triggering it.
Comment 4 Eero Tamminen 2014-12-30 11:18:47 UTC
These don't happen anymore, I assume this was fixed a long time ago.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.