Bug 104895

Summary: [CI] igt@tools_test@tools_test - fail - Test assertion failure function __real_main62 - Failed assertion: igt_system_quiet("./intel_reg dump") == 0
Product: DRI
Component: IGT
Version: XOrg git
Hardware: Other
OS: All
Status: CLOSED FIXED
Severity: normal
Priority: medium
Reporter: Marta Löfstedt <marta.lofstedt>
Assignee: Mika Kahola <mika.kahola>
QA Contact:
CC: intel-gfx-bugs
Whiteboard: ReadyForDev
i915 platform:
i915 features:

Description Marta Löfstedt 2018-02-01 07:57:35 UTC
Starting at IGT_4206:

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4206/shard-apl2/igt@tools_test@tools_test.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4207/shard-glkb4/igt@tools_test@tools_test.html
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4207/shard-kbl6/igt@tools_test@tools_test.html

(tools_test:1577) CRITICAL: Test assertion failure function __real_main62, file tools_test.c:141:
(tools_test:1577) CRITICAL: Failed assertion: igt_system_quiet("./intel_reg dump") == 0
(tools_test:1577) CRITICAL: error: 139 != 0
Subtest tools_test failed.
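
Note on the "139 != 0" above: when a command run through a shell is killed by a signal, the shell conventionally reports 128 plus the signal number as its exit status, so 139 would correspond to signal 11 (SIGSEGV on Linux). The following is a minimal sketch of that decoding, assuming the status came from a shell in this way; the helper name is hypothetical and not part of IGT.

```c
#include <signal.h>

/*
 * Hypothetical helper (not an IGT function): map a shell-style exit
 * status to the signal that killed the command, or 0 if the status
 * does not look like "128 + signal number".
 */
int status_to_signal(int status)
{
    /* Only treat 129..192 as "killed by signal"; an assumption for this sketch. */
    if (status > 128 && status <= 128 + 64)
        return status - 128;
    return 0;
}
```

With this sketch, status_to_signal(139) yields 11, which matches the segfault captured later in this report.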
Comment 1 Marta Löfstedt 2018-02-01 07:59:41 UTC
IGT-Version: 1.21-g7f0be0e7


commit 7f0be0e7d9becb79630093bf0e6daeadcd937062
Author:    Mika Kuoppala <mika.kuoppala@linux.intel.com> 2018-01-10 15:42:58 +0200
Committer: Mika Kuoppala <mika.kuoppala@linux.intel.com> 2018-01-31 15:27:09 +0200

tools/intel_reg: Add reading and writing registers through engine

Add option to specify engine for register read/write operation.
If engine is specified, use MI_LOAD_REGISTER_IMM and MI_STORE_REGISTER_IMM
to write and read register using a batch targeted at that engine.

v2: no MI_NOOP after BBE (Chris)
v3: use modern engine names (Chris), use global fd
v4: strcasecmp (Chris)
v5: use register definition format for engine (Jani)

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Acked-by: Jani Nikula <jani.nikula@intel.com>
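
For readers unfamiliar with the mechanism the commit message describes, here is a rough sketch of a batch that writes one register with MI_LOAD_REGISTER_IMM. The command encodings are written from memory of the i915 kernel headers and should be treated as assumptions; build_lri_batch is a hypothetical name, and this is not the code from intel_reg.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Command encodings as assumed from the i915 headers (opcode in bits 28:23). */
#define MI_INSTR(opcode, flags)  (((uint32_t)(opcode) << 23) | (flags))
#define MI_LOAD_REGISTER_IMM(n)  MI_INSTR(0x22, 2 * (n) - 1)
#define MI_BATCH_BUFFER_END      MI_INSTR(0x0a, 0)

/*
 * Hypothetical helper: fill `batch` (at least 4 dwords) with commands
 * that load `value` into the register at `reg_offset`, then end the
 * batch. Returns the number of dwords written.
 */
size_t build_lri_batch(uint32_t *batch, uint32_t reg_offset, uint32_t value)
{
    size_t n = 0;

    batch[n++] = MI_LOAD_REGISTER_IMM(1); /* one (offset, value) pair follows */
    batch[n++] = reg_offset;
    batch[n++] = value;
    batch[n++] = MI_BATCH_BUFFER_END;     /* nothing after BBE, as in the v2 note */

    return n;
}
```

Submitting such a batch to a specific engine needs the usual execbuf plumbing, which is omitted here.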
Comment 2 Mika Kuoppala 2018-02-01 13:08:42 UTC
The test in question fails before the commit in question, so that is not the offending commit.
Comment 3 Marta Löfstedt 2018-02-01 13:20:47 UTC
(In reply to Mika Kuoppala from comment #2)
> The test in question fails before the commit in question, so that is not the
> offending commit.

Please specify on which CI/IGT run before IGT_4206 the test was failing.

IGT_4206 is, in my opinion, the first failure. That run has the kernel from CI_DRM_3707. However, the test passes on CI_DRM_3707, then fails on CI_DRM_3708 and on all subsequent runs so far. CI_DRM_3708 has the same IGT commit as IGT_4206.
Comment 4 Chris Wilson 2018-02-02 09:17:26 UTC
Could someone log into the machine and run "intel_reg dump" by hand and grab the segfault?
Comment 5 Tomi Sarvela 2018-02-02 09:29:57 UTC
shard-apl2:

sudo /opt/igt/bin/intel_reg dump
[  329.455080] intel_reg[1851]: segfault at 781216 ip 00007f6db49f9a56 sp 00007ffc25918cf8 error 4 in libc-2.24.so[7f6db496c000+1be000]


sudo gdb --args /opt/igt/bin/intel_reg dump
run
Starting program: /opt/igt/bin/intel_reg dump
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Warning: register spec not found in '/opt/igt/share/intel-gpu-tools/registers'. Using builtin register spec.

Program received signal SIGSEGV, Segmentation fault.
strlen () at ../sysdeps/x86_64/strlen.S:106
106	../sysdeps/x86_64/strlen.S: No such file or directory.
Comment 6 Tomi Sarvela 2018-02-02 09:39:00 UTC
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x0000555555560f7e in find_engine (
    name=0x781216 <error: Cannot access memory at address 0x781216>) at intel_reg.c:248
#2  0x00005555555615ec in register_srm (config=0x7fffffffe8c0, val_in=0x0, reg=<optimized out>, 
    reg=<optimized out>) at intel_reg.c:287
#3  0x000055555556198a in read_register (config=<optimized out>, reg=0x555555798630, 
    valp=0x7fffffffe844) at intel_reg.c:365
#4  0x0000555555562266 in dump_register (config=0x7fffffffe8c0, reg=0x555555798630)
    at intel_reg.c:409
#5  0x00005555555622fc in intel_reg_dump (config=0x7fffffffe8c0, argc=<optimized out>, 
    argv=<optimized out>) at intel_reg.c:636
#6  0x0000555555560d2d in main (argc=1, argv=0x7fffffffea30) at intel_reg.c:1049
Comment 7 Mika Kuoppala 2018-02-02 10:33:54 UTC
(In reply to Marta Löfstedt from comment #3)
> (In reply to Mika Kuoppala from comment #2)
> > The test in question fails before the commit in question, so that is not the
> > offending commit.
> 
> please specify on which CI/IGT run before IGT_4206 the test was failing.
> 
> IGT_4206 is in my opinion the first fail. This run has kernel from
> CI_DRM_3707. However, the test pass on CI_DRM_3707, then it fails on
> CI_DRM_3708 and so far all consecutive runs. CI_DRM_3708 has the same IGT
> commit as IGT_4206.

My apologies, please ignore the above. My build left an old, stale intel_reg binary lying around. (meson!?)

It is the offending commit, and a fix has been mailed to intel-gfx.
Comment 8 Marta Löfstedt 2018-02-05 07:59:45 UTC
Fix integrated in IGT_4212

commit c219cc5307474cb53612ca759354f9473955e110
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Feb 2 10:07:05 2018 +0000

    tools: Clear unused fields in register spec
    
    If we fail to clear the other fields inside the register spec, they may
    be left with garbage instructing us to access the register via an
    invalid path.
    
    v2: Grab Mika's fix for get_regs() and check all parse_port_desc()
    callers.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104895
    Fixes: 7f0be0e7d9be ("tools/intel_reg: Add reading and writing registers through engine")
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Jani Nikula <jani.nikula@intel.com>
    CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Jani Nikula <jani.nikula@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
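
The failure mode described in the commit message, together with the garbage name pointer passed to find_engine() in the backtrace from comment 6, can be illustrated with a small sketch. The struct and function names below are hypothetical simplifications, not the real IGT definitions; the point is only that zeroing the whole structure before filling it leaves unset fields as NULL/0 rather than stale memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified register description; not the real IGT struct. */
struct reg_sketch {
    const char *engine; /* NULL means "plain MMIO access, no engine" */
    uint32_t addr;
};

/*
 * Fill in a register from an address. Clearing the struct first means
 * fields this function does not set (here: engine) read as NULL/0
 * instead of whatever happened to be in that memory.
 */
void reg_sketch_init(struct reg_sketch *reg, uint32_t addr)
{
    memset(reg, 0, sizeof(*reg));
    reg->addr = addr;
}

/* Returns 1 if the register should be accessed through an engine. */
int reg_sketch_uses_engine(const struct reg_sketch *reg)
{
    /* Without the memset above, strlen() could run on a garbage pointer. */
    return reg->engine != NULL && strlen(reg->engine) > 0;
}
```

Without the memset() in reg_sketch_init(), a caller that leaves the struct uninitialized could hand reg_sketch_uses_engine() a non-NULL garbage pointer, which is the kind of strlen() crash seen above.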
