Bug 71029

Summary: [ilk] igt/gem_close_race is slow due to mutex contention vs clflush
Product: DRI Reporter: lu hua <huax.lu>
Component: DRM/IntelAssignee: Damien Lespiau <damien.lespiau>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: minor    
Priority: low CC: intel-gfx-bugs, yi.sun
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
dmesg
none
dmesg on ilk
none
dmesg with Serialise i915_gem_release() and i915_gem_free_request()
none
dmesg after clean boot
none
dmesg after run case none

Description lu hua 2013-10-30 05:27:17 UTC
Created attachment 88338 [details]
dmesg

System Environment:
--------------------------
Platform: Q965/Pineview/GM45/Ironlake/Sandybridge/Ivybridge/Haswell
Kernel:	(drm-intel-nightly)d3c493af99141b2ca1f189bce56ba7e8c2257aa6

Bug detailed description:
-------------------------
It fails on all platforms with -fixes, -nightly and -queued kernel.It's a new igt case.

output:
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: drmtest.c:925: igt_skip: Assertion `!test_child' failed.
Subtest gem-close-race: SUCCESS
child 0 died with signal 6, Aborted
gem_close_race: drmtest.c:994: igt_fail: Assertion `!test_with_subtests || in_fi
gem_close_race: drmtest.c:1161: children_exit_handler: Assertion `kill(test_chil
Aborted (core dumped)
[root@x-pk2 tests]# gem_close_race: drmtest.c:925: igt_skip: Assertion `!test_ch
gem_close_race: drmtest.c:925: igt_skip: Assertion `!test_child' failed.
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed
gem_close_race: intel_drm.c:69: intel_get_drm_devid: Assertion `ret == 0' failed


Reproduce steps:
-------------------------
1. ./gem_close_race
Comment 2 lu hua 2013-11-04 06:56:11 UTC
Test on latest -nightly kernel.
It works well on sandybridge, ivybridge and haswell.

It causes call trace on ironlake, and it doesn't exit test.
output:
Subtest process-exit: SUCCESS
Call Trace:
[  962.964724]  [<ffffffffa0077091>] ? i915_gem_retire_requests_ring+0x3a/0x8e [i915]
[  962.964777]  [<ffffffffa0077c96>] ? i915_gem_busy_ioctl+0x63/0xbb [i915]
[  962.964825]  [<ffffffffa0009f94>] ? drm_ioctl+0x26c/0x38d [drm]
[  962.964868]  [<ffffffffa0077c33>] ? i915_gem_set_seqno+0x4a/0x4a [i915]
[  962.964913]  [<ffffffff8104be83>] ? hrtimer_try_to_cancel+0x67/0x70
[  962.964954]  [<ffffffff810e84ae>] ? vfs_ioctl+0x1e/0x31
[  962.964988]  [<ffffffff810e8c84>] ? do_vfs_ioctl+0x3ad/0x3ef
[  962.965026]  [<ffffffff81074d75>] ? SyS_futex+0x12e/0x161
[  962.965060]  [<ffffffff810e8d14>] ? SyS_ioctl+0x4e/0x7e
[  962.965095]  [<ffffffff8170ebe2>] ? system_call_fastpath+0x16/0x1b
[  962.965133] Code: 8b 6b 40 48 89 7b 30 4c 89 43 38 48 85 ed 74 40 48 8d 7d 08 e8 44 3c 69 e1 48 8b 53 48 48 8b 43 50 48 b9 00 02 20 00 00 00 ad de <48> 89 42 08 48 89 10 48 ba 00 01 10 00 00 00 ad de 48 89 53 48
[  962.965413] RIP  [<ffffffffa0075cff>] i915_gem_free_request+0x55/0x9e [i915]
[  962.965462]  RSP <ffff88009e2b5d48>

Also test the comment 1 patch, It has same result as the latest kernel.
Comment 3 lu hua 2013-11-04 06:56:34 UTC
Created attachment 88586 [details]
dmesg on ilk
Comment 4 Daniel Vetter 2013-11-04 07:04:59 UTC
Apparently the test works ...
Comment 5 Chris Wilson 2013-11-04 09:05:21 UTC
See bug 70874.
Comment 6 Daniel Vetter 2013-11-04 09:56:11 UTC
(In reply to comment #5)
> See bug 70874.

I.e. please test the patch attached there.
Comment 7 lu hua 2013-11-05 07:46:18 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > See bug 70874.
> 
> I.e. please test the patch attached there.

Test the patch on Ironlake. Run more than 1 hour, the test doesn't exit. 
output:
Subtest process-exit: SUCCESS

dmesg:
[ 5201.186618] [drm:i915_gem_open],
[ 5201.195147] [drm:i915_gem_open],
[ 5201.219552] [drm:i915_gem_open],
[ 5201.227760] [drm:i915_gem_open],
[ 5201.236246] [drm:i915_gem_open],
[ 5201.258691] [drm:i915_gem_open],
[ 5201.266885] [drm:i915_gem_open],
[ 5201.266911] [drm:i915_gem_open],
[ 5201.290080] [drm:i915_gem_open],
[ 5201.290105] [drm:i915_gem_open],
[ 5201.297611] [drm:i915_gem_open],
[ 5201.305690] [drm:i915_gem_open],
Comment 8 lu hua 2013-11-05 07:47:16 UTC
Created attachment 88667 [details]
dmesg with Serialise i915_gem_release() and i915_gem_free_request()
Comment 9 Chris Wilson 2013-12-13 14:20:16 UTC
The test works, but is just slow. For some reason, abnormally slow on Ironlake.
Comment 10 Daniel Vetter 2013-12-16 09:15:28 UTC
Should be "fixed" now ...
Comment 11 lu hua 2013-12-20 02:38:55 UTC
(In reply to comment #10)
> Should be "fixed" now ...

Test on latest -nightly kernel. The fail goes away.
It spends about 25 minutes testing. Is it expected?

It spends about 2 minutes testing on Sandybridge.
Comment 12 lu hua 2013-12-30 05:55:35 UTC
Reopen it.
Test on Pineview, It takes about 5 minutes.
It takes about 25 minutes on Ironlake.
Comment 13 Chris Wilson 2013-12-30 12:18:06 UTC
Is the ilk w/a active?

E.g. does

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 998f9a0b322a..2adfef0fa6ac 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1727,6 +1727,9 @@ static int i915_gmch_probe(struct drm_device *dev,
        dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
        dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
 
+       if (unlikely(dev_priv->gtt.do_idle_maps))
+               DRM_INFO("applying Ironlake quirks for intel_iommu\n");
+
        return 0;
 }
 

report anything?
Comment 14 lu hua 2013-12-31 06:52:15 UTC
(In reply to comment #13)
> Is the ilk w/a active?
> 
> E.g. does
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
> b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 998f9a0b322a..2adfef0fa6ac 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1727,6 +1727,9 @@ static int i915_gmch_probe(struct drm_device *dev,
>         dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
>         dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
>  
> +       if (unlikely(dev_priv->gtt.do_idle_maps))
> +               DRM_INFO("applying Ironlake quirks for intel_iommu\n");
> +
>         return 0;
>  }
>  
> 
> report anything?

output:
Subtest process-exit: SUCCESS
Subtest gem-close-race: SUCCESS

dmesg:
[  173.963588] systemd-journald[2374]: Got invalid event from epoll.
[  174.229685] systemd[1]: Unit systemd-journald.service entered failed state.
[  174.313572] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  174.409767] systemd[1]: Stopping Journal Service...
[  174.468447] systemd[1]: Starting Journal Service...
[  174.528119] systemd[1]: Started Journal Service.
[  174.583752] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  280.871090] systemd-journald[14135]: Got invalid event from epoll.
[  281.348185] systemd[1]: Unit systemd-journald.service entered failed state.
[  281.432383] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  281.529290] systemd[1]: Stopping Journal Service...
[  281.588296] systemd[1]: Starting Journal Service...
[  281.648607] systemd[1]: Started Journal Service.
[  281.704748] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  297.277227] systemd-journald[7062]: Got invalid event from epoll.
[  297.887872] systemd[1]: Unit systemd-journald.service entered failed state.
[  297.972935] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  298.070402] systemd[1]: Stopping Journal Service...
[  298.129492] systemd[1]: Starting Journal Service...
[  298.188906] systemd[1]: Started Journal Service.
[  298.245167] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  298.368935] systemd-journald[21308]: Got invalid event from epoll.
[  298.445478] systemd[1]: Unit systemd-journald.service entered failed state.
[  298.529547] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  298.625551] systemd[1]: Stopping Journal Service...
[  298.685082] systemd[1]: Starting Journal Service...
[  298.744512] systemd[1]: Started Journal Service.
[  298.801121] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  407.183744] systemd-journald[21534]: Got invalid event from epoll.
[  407.725206] systemd[1]: Unit systemd-journald.service entered failed state.
[  407.809623] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  407.906516] systemd[1]: Stopping Journal Service...
[  407.965268] systemd[1]: Starting Journal Service...
[  408.025209] systemd[1]: Started Journal Service.
[  408.081274] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  408.195253] systemd-journald[14884]: Got invalid event from epoll.
[  408.273373] systemd[1]: Unit systemd-journald.service entered failed state.
[  408.357218] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  408.453920] systemd[1]: Stopping Journal Service...
[  408.512768] systemd[1]: Starting Journal Service...
[  408.572970] systemd[1]: Started Journal Service.
[  408.629781] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
[  408.656757] systemd-journald[15133]: Got invalid event from epoll.
[  408.799599] systemd[1]: systemd-journald.service: main process exited, code=exited, status=1/FAILURE
[  408.909586] systemd[1]: Unit systemd-journald.service entered failed state.
[  408.993369] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  409.089380] systemd[1]: Stopping Journal Service...
[  409.148239] systemd[1]: Starting Journal Service...
[  409.207622] systemd[1]: Started Journal Service.
[  409.284042] systemd-journald[15581]: Got invalid event from epoll.
[  409.363928] systemd[1]: Unit systemd-journald.service entered failed state.
[  409.447596] systemd[1]: systemd-journald.service holdoff time over, scheduling restart.
[  409.543684] systemd[1]: Stopping Journal Service...
[  409.606480] systemd[1]: Starting Journal Service...
[  409.666021] systemd[1]: Started Journal Service.
[  409.721899] systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
Comment 15 Chris Wilson 2014-01-08 10:15:27 UTC
It would be reported during boot. Can you please attach a dmesg from the ilk machine upon booting with the current drm-intel-nightly?
Comment 16 lu hua 2014-01-09 02:48:31 UTC
Created attachment 91730 [details]
dmesg after clean boot
Comment 17 lu hua 2014-01-09 02:49:01 UTC
Created attachment 91731 [details]
dmesg after run case
Comment 18 Chris Wilson 2014-01-24 12:04:19 UTC
Finally booted an Ironlake (i3-330m). Woohoo!

The time here is spent in mutex contention (specifically throwing away cycles in mutes_spin_on_owner whilst we are clflushing). I think this is just a pathologically bad condition and not one to worry about (just yet).
Comment 19 Chris Wilson 2014-01-24 12:38:28 UTC
For reference:

ickle@u160 ~/intel-gpu-tools/tests $ time sudo ./gem_close_race
IGT-Version: 1.5-ga9d1fe0 (x86_64) (Linux: 3.13.0-rc8+ x86_64)
Subtest process-exit: SUCCESS
Subtest gem-close-race: SUCCESS

real	211m48.483s
user	0m5.860s
sys	432m50.890s
Comment 20 Chris Wilson 2014-07-11 22:34:17 UTC
One idea is to recognise that we have some extremely long mutex hold times, and disabling spinning for thosei, i.e:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 338f80c..ed45276 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3769,7 +3769,10 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj,
                return false;
 
        trace_i915_gem_object_clflush(obj);
+
+       mutex_spin_disable(&obj->base.dev->struct_mutex);
        drm_clflush_sg(obj->pages);
+       mutex_spin_enable(&obj->base.dev->struct_mutex);
 
        return true;
 }

That's obvious not going to fly upstream. A better suggestion would perhaps be to autotune the spinning by timing out. So something like:

diff --git a/kernel/locking/mcs_spinlock.c b/kernel/locking/mcs_spinlock.c
index 838dc9e..5e0f27e 100644
--- a/kernel/locking/mcs_spinlock.c
+++ b/kernel/locking/mcs_spinlock.c
@@ -63,6 +63,7 @@ bool osq_lock(struct optimistic_spin_queue **lock)
 {
        struct optimistic_spin_queue *node = this_cpu_ptr(&osq_node);
        struct optimistic_spin_queue *prev, *next;
+       unsigned long timeout;
 
        node->locked = 0;
        node->next = NULL;
@@ -82,7 +83,11 @@ bool osq_lock(struct optimistic_spin_queue **lock)
         * cmpxchg in an attempt to undo our queueing.
         */
 
+       timeout = jiffies + 1;
        while (!smp_load_acquire(&node->locked)) {
+               if (time_after(jiffies, timeout))
+                       goto unqueue;
+
                /*
                 * If we need to reschedule bail... so we can block.
                 */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index bc73d33..ac8c435 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -147,8 +147,17 @@ static inline bool owner_running(struct mutex *lock, struct task_struct *owner)
 static noinline
 int mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 {
+       unsigned long timeout;
+
+       if (need_resched())
+               return false;
+
        rcu_read_lock();
+       timeout = jiffies + 1;
        while (owner_running(lock, owner)) {
+               if (time_after(jiffies, timeout))
+                       break;
+
                if (need_resched())
Comment 21 Chris Wilson 2014-07-11 22:40:12 UTC
commit 98165d60bec04d258eeb9623550e58a8bc2f3906
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Jul 11 23:36:15 2014 +0100

    igt/gem_close_race: Trim object size
    
    The challenge here is to race GPU activity versus the gem_close. Using a
    large objects makes the window of GPU activity larger - except on !llc
    systems we then incur massive overhead from clflush, likely destroying
    any race (due to mutex contention). And that mutex contention is
    amplified by the number of cores - bad news for Baytrail.
    
    Give up and make the objects smaller and hope that the test is run
    frequently enough to catch the race.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=71029
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 22 lu hua 2014-07-14 02:28:21 UTC
Fixed.
IGT-Version: 1.7-ge74dd22 (x86_64) (Linux: 3.16.0-rc4_drm-intel-nightly_271bff_20140713+ x86_64)
Subtest process-exit: SUCCESS
Subtest gem-close-race: SUCCESS

real    0m9.192s
user    0m0.336s
sys     0m14.457s
Comment 23 Elizabeth 2017-10-06 14:42:19 UTC
Closing old verified.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.