Bug 88928

Summary: [all Bisected] igt pm_rps/reset doesn't exit testing
Product: DRI
Component: DRM/Intel
Status: CLOSED DUPLICATE
Severity: major
Priority: high
Version: unspecified
Hardware: All
OS: Linux (All)
Reporter: lu hua <huax.lu>
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: hengx.ding, intel-gfx-bugs
Whiteboard:
i915 platform:
i915 features:
Attachments:
  Description: dmesg
  Flags: none

Description lu hua 2015-02-03 02:24:35 UTC
Created attachment 113079
dmesg

==System Environment==
--------------------------
Regression: Yes

Non-working platforms: all. BDW has a separate bug, 88654.

==kernel==
--------------------------
drm-intel-nightly/8b4216f91c7bf8d3459cadf9480116220bd6545e
commit 8b4216f91c7bf8d3459cadf9480116220bd6545e
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Jan 31 17:46:32 2015 +0100

    drm-intel-nightly: 2015y-01m-31d-16h-46m-12s UTC integration manifest

==Bug detailed description==
-----------------------------
The test runs for more than 10 minutes without exiting on all platforms with the drm-intel-nightly and drm-intel-next-queued kernels. It works on the drm-intel-fixes kernel.

output:
IGT-Version: 1.9-g51d87b8 (x86_64) (Linux: 3.19.0-rc6_drm-intel-nightly_8b4216_20150202+ x86_64)
^C(pm_rps:4100) CRITICAL: Test assertion failure function load_helper_stop, file pm_rps.c:290:
(pm_rps:4100) CRITICAL: Failed assertion: igt_wait_helper(&lh.igt_proc) == 0
(pm_rps:4100) CRITICAL: Last errno: 10, No child processes
Subtest reset: FAIL (748.362s)
pm_rps: igt_core.c:1012: fork_helper_exit_handler: Assertion `helper_process_count == 0' failed.
Aborted (core dumped)

real    12m30.762s
user    0m0.276s
sys     0m0.980s

dmesg:
[  115.786326] WARNING: CPU: 2 PID: 1056 at drivers/gpu/drm/i915/i915_irq.c:2615 i915_handle_error+0x58/0x5bd [i915]()
[  115.786386] WARN_ON(mutex_is_locked(&dev_priv->dev->struct_mutex))
[  115.786422] Modules linked in:
[  115.786444]  dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support ppdev snd_hda_codec_idt snd_hda_codec_generic pcspkr serio_raw uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev lpc_ich mfd_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep firewire_ohci snd_pcm firewire_core crc_itu_t snd_timer snd soundcore wmi parport_pc parport tpm_infineon tpm_tis tpm battery ac acpi_cpufreq joydev i915 button video drm_kms_helper drm cfbfillrect cfbimgblt cfbcopyarea
[  115.786804] CPU: 2 PID: 1056 Comm: kworker/u16:4 Not tainted 3.19.0-rc6_drm-intel-nightly_8b4216_20150202+ #126
[  115.786862] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.22 12/22/2011
[  115.786922] Workqueue: i915-hangcheck i915_hangcheck_elapsed [i915]
[  115.786962]  0000000000000000 0000000000000009 ffffffff8179a48b ffff8800b810fcc8
[  115.787032]  ffffffff8103bdec 0000000000000246 ffffffffa00b6a74 0000000000000006
[  115.787096]  ffff8800b82140c0 ffff8800028d8ad0 ffff8800b818f000 ffff88013878ae00
[  115.787165] Call Trace:
[  115.787192]  [<ffffffff8179a48b>] ? dump_stack+0x40/0x50
[  115.787237]  [<ffffffff8103bdec>] ? warn_slowpath_common+0x98/0xb0
[  115.787299]  [<ffffffffa00b6a74>] ? i915_handle_error+0x58/0x5bd [i915]
[  115.787355]  [<ffffffff8103be9c>] ? warn_slowpath_fmt+0x45/0x4a
[  115.787415]  [<ffffffffa00b6a74>] ? i915_handle_error+0x58/0x5bd [i915]
[  115.787470]  [<ffffffff817971f6>] ? printk+0x48/0x4d
[  115.787524]  [<ffffffffa00b7350>] ? i915_hangcheck_elapsed+0x339/0x3d5 [i915]
[  115.787585]  [<ffffffff8104d128>] ? process_one_work+0x1ad/0x31a
[  115.787638]  [<ffffffff8104d4ef>] ? worker_thread+0x235/0x330
[  115.787686]  [<ffffffff8104d2ba>] ? process_scheduled_works+0x25/0x25
[  115.787740]  [<ffffffff81050dee>] ? kthread+0xc5/0xcd
[  115.787784]  [<ffffffff81050d29>] ? kthread_freezable_should_stop+0x40/0x40
[  115.787843]  [<ffffffff8179fdec>] ? ret_from_fork+0x7c/0xb0
[  115.787892]  [<ffffffff81050d29>] ? kthread_freezable_should_stop+0x40/0x40
[  115.787951] ---[ end trace 06291ce2c930ef1e ]---


Bisect shows: b8d24a06568368076ebd5a858a011699a97bfa42 is the first bad commit.
commit b8d24a06568368076ebd5a858a011699a97bfa42
Author:     Mika Kuoppala <mika.kuoppala@linux.intel.com>
AuthorDate: Wed Jan 28 17:03:14 2015 +0200
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Thu Jan 29 18:03:07 2015 +0100

    drm/i915: Remove nested work in gpu error handling

    Now when we declare gpu errors only through our own dedicated
    hangcheck workqueue there is no need to have a separate workqueue
    for handling the resetting and waking up the clients as the deadlock
    concerns are no more.

    The only exception is i915_debugfs::i915_set_wedged, which triggers
    error handling through process context. However as this is only used through
    test harness it is responsibility for test harness not to introduce hangs
    through both debug interface and through hangcheck mechanism at the same time.

    Remove gpu_error.work and let the hangcheck work do the tasks it used to.

    v2: Add a big warning sign into i915_debugfs::i915_set_wedged (Chris)

    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

==Reproduce steps==
---------------------------- 
1.  ./pm_rps --run-subtest reset
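
Because the subtest does not exit on its own, it can help to run the reproduce step under a time limit so that a hang is reported rather than blocking the session. The following is a minimal sketch, not part of the original report: the helper name run_with_limit and the 600-second limit are assumptions (the limit is chosen only because the description mentions a run time above 10 minutes). It relies on the coreutils timeout command, which exits with status 124 when the limit is reached.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and report a hang.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
    limit="$1"
    shift
    # coreutils timeout sends SIGTERM once the limit expires and exits 124.
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT after ${limit}s: $*"
    fi
    return "$status"
}

# Example invocation for this bug (assumed limit of 600 seconds),
# run from the directory that contains the pm_rps binary:
#   run_with_limit 600 ./pm_rps --run-subtest reset
```

A return status of 124 then indicates that the subtest had to be stopped by the time limit, while any other status is the exit status of the test itself.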
Comment 1 Mika Kuoppala 2015-02-03 13:51:18 UTC
*** Bug 88915 has been marked as a duplicate of this bug. ***
Comment 2 Mika Kuoppala 2015-02-03 13:52:15 UTC
*** Bug 88908 has been marked as a duplicate of this bug. ***
Comment 3 Mika Kuoppala 2015-02-03 13:55:52 UTC

*** This bug has been marked as a duplicate of bug 88933 ***
Comment 4 Elizabeth 2017-10-06 14:31:39 UTC
Closing old verified bug.
