Bug 86643 - [ILK/SNB/BYT Bisected]672e7b7 drm/i915: Don't continually defer the hangcheck
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2014-11-24 00:44 UTC by Guo Jinxian
Modified: 2017-10-06 14:33 UTC

Description Guo Jinxian 2014-11-24 00:44:36 UTC
==System Environment==
--------------------------
Regression: Yes

Non-working platforms: ILK SNB BYT

==kernel==
--------------------------
origin/drm-intel-nightly: 0f8cb1fb8e01c53f9ad47344e9448d72df49fcf2
    drm-intel-nightly: 2014y-11m-21d-19h-18m-03s UTC integration manifest

==Bug detailed description==
(ILK)igt/drv_missed_irq_hang PASS->TIMEOUT
(SNB)igt/drv_missed_irq_hang PASS->TIMEOUT
(BYT)igt/drv_missed_irq_hang PASS->TIMEOUT

==Reproduce steps==
---------------------------- 
1. ./drv_missed_irq_hang
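
The PASS->TIMEOUT transition reported above can be told apart from an ordinary failure by running the test under a time limit. The following is only a minimal sketch, not the PRTS harness: the helper name `run_test` and the time limit are assumptions made for illustration.

```python
import subprocess
import sys


def run_test(cmd, timeout_s):
    """Run cmd and classify the outcome as PASS, FAIL or TIMEOUT.

    timeout_s is a hypothetical limit chosen by the caller; the report
    does not state which limit the test harness actually uses.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return "TIMEOUT"
    return "PASS" if result.returncode == 0 else "FAIL"
```

From the tests directory, something like `run_test(["./drv_missed_irq_hang"], timeout_s=120)` would then return one of the three labels; the 120-second value is an arbitrary example.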

==Bisect results from PRTS==
----------------------------
Bisect shows: 672e7b7c1849c904b2c55185906b3940843c55c6 is the first bad commit
commit 672e7b7c1849c904b2c55185906b3940843c55c6
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Wed Nov 19 09:47:19 2014 +0000
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Wed Nov 19 11:44:50 2014 +0100

    drm/i915: Don't continually defer the hangcheck
    
    With multiple rings, we may continue to render on the blitter whilst
    executing an infinite shader on the render ring. As we currently, rearm
    the timer with each execbuf, in this scenario the hangcheck will never
    fire and we will never detect the lockup on the render ring. Instead,
    only arm the timer once per hangcheck, so that hangcheck runs more
    frequently.
    
    v2: Rearrange code to avoid triggering a BUG_ON in add_timer from
    softirq context.
    
    Testcase: igt/gem_reset_stats/defer-hangcheck*
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86225
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
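
The behaviour change described in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the i915 code: the `Timer` class, the `simulate` function, the tick-based clock and the period of 5 ticks are all hypothetical. It compares re-arming the timer on every submission with arming it only when it is not already pending.

```python
HANGCHECK_PERIOD = 5  # hypothetical period, in abstract ticks


class Timer:
    """Toy one-shot timer; expires is None while the timer is not pending."""

    def __init__(self):
        self.expires = None

    def pending(self):
        return self.expires is not None

    def mod(self, expires):
        self.expires = expires


def simulate(submit_times, defer_on_every_submit, horizon):
    """Return the ticks at which the (toy) hangcheck fired.

    submit_times: ticks at which work is submitted (an execbuf in the report).
    defer_on_every_submit: True re-arms the timer on each submission,
    False arms it only when it is not already pending.
    """
    timer = Timer()
    fired = []
    submits = set(submit_times)
    for now in range(horizon):
        # Fire the timer first if it has expired.
        if timer.pending() and now >= timer.expires:
            timer.expires = None
            fired.append(now)
        # Then handle a submission at this tick, if any.
        if now in submits:
            if defer_on_every_submit or not timer.pending():
                timer.mod(now + HANGCHECK_PERIOD)
    return fired


# Work submitted every 2 ticks, more often than the hangcheck period.
print(simulate(range(0, 20, 2), True, 20))   # prints []
print(simulate(range(0, 20, 2), False, 20))  # prints [5, 11, 17]
```

In this model, steady submissions on one ring keep pushing the expiry forward, so the check never runs; arming only when idle lets it run once per period regardless of submission rate.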
Comment 1 Chris Wilson 2014-11-24 07:44:56 UTC
commit d9e600b2e4a5e9f1dfe80cfcb453c8f5067a2a8a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 20 20:10:33 2014 +0000

    drm/i915: Only call mod_timer() if not already pending
    
    The final arrangement of updating timer->expires and calling mod_timer()
    used in
    
    commit 672e7b7c1849c904b2c55185906b3940843c55c6
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Wed Nov 19 09:47:19 2014 +0000
    
        drm/i915: Don't continually defer the hangcheck
    
    turns out to be very unsafe. Try again.
    
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Comment 2 Guo Jinxian 2014-11-25 06:58:43 UTC
Verified on the latest -nightly (ab4b258a6ea5f58b5cb17131aced8f9a8dd64499).

[root@x-hnr9 tests]# ./drv_missed_irq_hang
Interrupts masked
Interrupts unmasked
Cleared missed interrupts
[root@x-hnr9 tests]# echo $?
0


[root@x-pk5 tests]# ./drv_missed_irq_hang
Interrupts masked
Interrupts unmasked
Cleared missed interrupts
Comment 3 Elizabeth 2017-10-06 14:33:36 UTC
Closing old verified bug.

