Bug 111559

Summary: [CI][DRMTIP] igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV
Product: DRI
Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: medium
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform: GLK
i915 features: GEM/Other

Description Lakshmi 2019-09-05 07:14:25 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_359/fi-glk-dsi/igt@gem_eio@in-flight-suspend.html

Starting subtest: in-flight-suspend
Received signal SIGSEGV.
Stack trace: 
 #0 [fatal_sig_handler+0xd6]
 #1 [killpg+0x40]
 #2 [_dl_find_dso_for_object+0x3194]
 #3 [igt_spin_free+0x66]
 #4 [__real_main825+0xd05]
 #5 [main+0x27]
 #6 [__libc_start_main+0xe7]
 #7 [_start+0x2a]
Subtest in-flight-suspend: CRASH (2.179s)
Comment 1 CI Bug Log 2019-09-05 07:15:44 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* GLK:  igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_359/fi-glk-dsi/igt@gem_eio@in-flight-suspend.html
Comment 2 Chris Wilson 2019-09-05 11:52:55 UTC
Seems quite bizarre. igt_spin_free has the obligatory if (!spin) return guard, and 0x66 does imply we got into the function before dying. On a local build, gdb suggests 0x66 is:

(gdb) list *igt_spin_free+0x66
0x23a66 is in igt_spin_free (igt_dummyload.c:448).
443	
444		igt_spin_end(spin);
445		gem_munmap((void *)((unsigned long)spin->condition & (~4095UL)),
446			   BATCH_SIZE);
447	
448		if (spin->poll) {
449			gem_munmap(spin->poll, 4096);
450			gem_close(fd, spin->poll_handle);
451		}
452

spin is not NULL, so the suggestion is that either spin->condition led to a SIGSEGV in gem_munmap() (unlikely, it should return -EFAULT if broken) or spin->poll is garbage. But igt_spin_t is allocated with calloc, and spin->poll is never assigned to again.

I don't see this as being an i915.ko bug, and since I haven't spotted a potential issue here, my worries turn towards random memory corruption. Hopefully a second look can find a way igt_spin_t can be corrupted.
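
The reasoning above rests on three small facts that can be checked in isolation. The following is a minimal sketch, not the IGT code: struct fake_spin and all helper names are hypothetical, a 4096-byte mask is assumed to match the (~4095UL) in the listing, and plain munmap() stands in for gem_munmap(). It shows the page-mask arithmetic, that a calloc'ed struct starts with a NULL poll pointer, and that munmap() reports a bad argument through its return value and errno instead of raising a signal. The errno seen here (EINVAL for a misaligned address) is what Linux documents for this particular misuse and may differ from what the real code path would report.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#define ASSUMED_PAGE_SIZE 4096UL /* assumption: mirrors the ~4095UL mask */

/* Hypothetical stand-in for the two igt_spin_t fields under discussion. */
struct fake_spin {
	uint32_t *condition; /* points somewhere inside the mapped batch */
	uint32_t *poll;      /* stays NULL unless a poll page is set up */
};

/* Round a pointer down to the start of its 4096-byte page. */
static void *page_base(const void *ptr)
{
	return (void *)((uintptr_t)ptr & ~(ASSUMED_PAGE_SIZE - 1));
}

/*
 * Returns 0 if a freshly calloc'ed struct has poll == NULL, -1 otherwise.
 * This relies on all-bits-zero being the NULL pointer, which holds on
 * Linux but is not strictly guaranteed by the C standard.
 */
static int calloc_leaves_poll_null(void)
{
	struct fake_spin *spin = calloc(1, sizeof(*spin));
	int ok;

	if (!spin)
		return -1;
	ok = (spin->poll == NULL);
	free(spin);
	return ok ? 0 : -1;
}

/*
 * Calls munmap() on a deliberately misaligned address and returns the
 * resulting errno (0 if the call unexpectedly succeeded, -1 if the
 * initial mmap() failed). The process is not killed by a signal.
 */
static int munmap_misaligned_errno(void)
{
	void *map = mmap(NULL, ASSUMED_PAGE_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int err;

	if (map == MAP_FAILED)
		return -1;
	errno = 0;
	err = (munmap((char *)map + 4, ASSUMED_PAGE_SIZE) == -1) ? errno : 0;
	munmap(map, ASSUMED_PAGE_SIZE); /* release the real mapping */
	return err;
}
```

If all three checks hold on the affected machine, neither the masked munmap nor the initial state of the struct explains the crash on its own, which is consistent with the suspicion that the struct contents were overwritten after allocation.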
Comment 3 CI Bug Log 2019-09-20 08:04:58 UTC
A CI Bug Log filter associated to this bug has been updated:

{- GLK:  igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV -}
{+ APL GLK:  igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6922/shard-apl3/igt@gem_eio@in-flight-suspend.html
Comment 4 Chris Wilson 2019-09-20 10:42:50 UTC
(In reply to CI Bug Log from comment #3)
Interesting, but that's a new bug. Should be more easily fixed as it is an unwanted fixture failing that leads to the death.
Comment 5 Lakshmi 2019-09-24 06:27:01 UTC
(In reply to Chris Wilson from comment #4)
> Interesting, but that's a new bug. Should be more easily fixed as it is an
> unwanted fixture failing that leads to the death.

A separate bug, 111796, has been created.
Comment 6 CI Bug Log 2019-09-24 06:27:45 UTC
A CI Bug Log filter associated to this bug has been updated:

{- APL GLK:  igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV -}
{+ GLK:  igt@gem_eio@in-flight-suspend - crash - Received signal SIGSEGV +}


  No new failures caught with the new filter
Comment 7 Martin Peres 2019-11-29 19:26:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/395.