Bug 78652

Summary: [BDW GT3] Some subcases of igt/drv_suspend cause the system to hang sporadically
Product: DRI
Component: DRM/Intel
Status: CLOSED FIXED
Severity: normal
Priority: high
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Reporter: Guo Jinxian <jinxianx.guo>
Assignee: Ville Syrjala <ville.syrjala>
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: intel-gfx-bugs, wendy.wang

Attachments:
Description    Flags
dmesg          none

Description Guo Jinxian 2014-05-13 08:31:08 UTC
Created attachment 98964
dmesg

==System Environment==
--------------------------
Regression: No.
This is the first time the test has been run on BDW GT3.

Non-working platforms: BDW GT3

==kernel==
--------------------------
-nightly: 18a7661946082f5a3b353e50d1ced5d89f864024 (fails)
-queued: ca26115df8a2ed25b2c0ed1b7c9557c7779d1556 (fails)
    Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Date:   Thu May 8 22:19:42 2014 +0300

    x86/gpu: Sprinkle const, __init and __initconst to stolen memory quirks

    gen8_stolen_size() is missing __init, so add it.

    Also all the intel_stolen_funcs structures can be marked
    __initconst.

    intel_stolen_ids[] can also be made const if we replace the
    __initdata with __initconst.

    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

-fixes: 05adaf1f101f25f40f12c29403e6488f0e45f6b6 (fails)
    Author: Jani Nikula <jani.nikula@intel.com>
    Date:   Fri May 9 14:52:34 2014 +0300

    drm/i915/vlv: reset VLV media force wake request register

    Media force wake get hangs the machine when the system is booted without
    displays attached. The assumption is that (at least some versions of)
    the firmware has skipped some initialization in that case.

    Empirical evidence suggests we need to reset the media force wake
    request register in addition to the render one to avoid hangs.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75895
    Reported-by: Imre Deak <imre.deak@intel.com>
    Reported-by: Darren Hart <dvhart@linux.intel.com>
    Tested-by: Darren Hart <dvhart@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>

==Bug detailed description==
-----------------------------
Some subcases of igt/drv_suspend cause the system to hang sporadically; the hang rate was about 1 out of 5.

Dmesg shows:
[  158.631171] PM: early resume of devices complete after 0.480 msecs
[  158.633311] sd 1:0:0:0: [sda] Starting disk
[  158.635570] ------------[ cut here ]------------
[  158.635579] WARNING: CPU: 1 PID: 4328 at drivers/pnp/pnpacpi/core.c:97 pnpacpi_set_resources+0x84/0x10b()
[  158.635606] Modules linked in: ip6table_filter ip6_tables ipv6 iptable_filter ip_tables ebtable_nat ebtables x_tables dm_mod snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ppdev iTCO_wdt iTCO_vendor_support pcspkr i2c_i801 lpc_ich mfd_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore battery parport_pc parport ac acpi_cpufreq i915 video button drm_kms_helper drm
[  158.635611] CPU: 1 PID: 4328 Comm: rtcwake Not tainted 3.15.0-rc3_drm-intel-next-queued_ca2611_20140513+ #2578
[  158.635615]  0000000000000000 0000000000000009 ffffffff817233c0 0000000000000000
[  158.635617]  ffffffff8103517a ffffea0002ab1800 ffffffff81351a3d ffff880149ac5000
[  158.635620]  ffff880149ac5000 ffff880149ac5000 ffff88014a853488 0000000000000000
[  158.635620] Call Trace:
[  158.635628]  [<ffffffff817233c0>] ? dump_stack+0x41/0x51
[  158.635633]  [<ffffffff8103517a>] ? warn_slowpath_common+0x73/0x8b
[  158.635636]  [<ffffffff81351a3d>] ? pnpacpi_set_resources+0x84/0x10b
[  158.635640]  [<ffffffff81351a3d>] ? pnpacpi_set_resources+0x84/0x10b
[  158.635643]  [<ffffffff8134e4d0>] ? pnp_bus_suspend+0xd/0xd
[  158.635647]  [<ffffffff8134fd89>] ? pnp_start_dev+0x5f/0x95
[  158.635653]  [<ffffffff8133f864>] ? acpi_evaluate_object+0x257/0x268
[  158.635656]  [<ffffffff8134e52e>] ? pnp_bus_resume+0x5e/0x90
[  158.635662]  [<ffffffff813896c1>] ? dpm_run_callback.isra.8+0x24/0x52
[  158.635666]  [<ffffffff81389cb0>] ? device_resume+0x10c/0x14e
[  158.635669]  [<ffffffff8138aae3>] ? dpm_resume+0xc8/0x1d6
[  158.635673]  [<ffffffff8138ad75>] ? dpm_resume_end+0x8/0x10
[  158.635679]  [<ffffffff81062839>] ? suspend_devices_and_enter+0x267/0x2af
[  158.635683]  [<ffffffff81062988>] ? pm_suspend+0x107/0x1bb
[  158.635687]  [<ffffffff81061aea>] ? state_store+0x9f/0xbe
[  158.635693]  [<ffffffff811376b1>] ? kernfs_fop_write+0xca/0x110
[  158.635698]  [<ffffffff810e0b64>] ? vfs_write+0xba/0x176
[  158.635702]  [<ffffffff810e0f1c>] ? SyS_write+0x41/0x84
[  158.635707]  [<ffffffff8172dba2>] ? system_call_fastpath+0x16/0x1b
[  158.635708] ---[ end trace 29072e4048251a91 ]---
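
A console capture taken during resume can mix kernel messages with systemd and fsck output on the same line. The sketch below is one possible way to pull out only the timestamped kernel lines. It assumes the usual printk timestamp layout ("[", optional padding, seconds, a dot, six digits, "]"); the name kernel_lines is hypothetical and not part of any tool mentioned in this report.

```python
import re

# Assumed printk timestamp layout, e.g. "[  158.635570] ".
KERNEL_STAMP = re.compile(r"\[\s*\d+\.\d{6}\] ")


def kernel_lines(console_text):
    """Return the kernel messages found in a mixed console capture."""
    found = []
    for line in console_text.splitlines():
        match = KERNEL_STAMP.search(line)
        if match:
            # Text before the first timestamp is unrelated userspace output.
            found.append(line[match.start():].rstrip())
    return found


sample = (
    "k[3850]: /dev/sd[  158.635633]  [<ffffffff8103517a>] ? warn_slowpath_common+0x73/0x8b\n"
    "6840792221db.\n"
    "[  158.635702]  [<ffffffff810e0f1c>] ? SyS_write+0x41/0x84\n"
)
for entry in kernel_lines(sample):
    print(entry)
```

Lines without a timestamp are dropped entirely, so userspace text that landed on its own line is not preserved by this filter.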



Output:
./drv_suspend --run-subtest sysfs-reader
IGT-Version: 1.6-gd848a36 (x86_64) (Linux: 3.15.0-rc3_drm-intel-next-queued_ca2611_20140513+ x86_64)
rtcwake: wakeup from "mem" using /dev/rtc0 at Mon Jan 22 23:07:17 2001


==Reproduce steps==
---------------------------- 
1. ./drv_suspend --run-subtest sysfs-reader
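
Because the hang is sporadic, a single run may pass. A minimal sketch of repeating the step several times is shown here; the helper name run_repeatedly, the default of 5 runs and the 300 second timeout are assumptions for illustration, not part of igt. If the whole machine hangs, the script hangs with it, so the last line printed indicates which run was in progress.

```python
import subprocess


def run_repeatedly(command, runs=5, timeout_s=300):
    """Run `command` up to `runs` times; return how many runs passed."""
    passed = 0
    for attempt in range(1, runs + 1):
        try:
            # subprocess.run kills the child and raises if the timeout expires.
            result = subprocess.run(command, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"run {attempt}: no result after {timeout_s}s", flush=True)
            break
        if result.returncode != 0:
            print(f"run {attempt}: exit status {result.returncode}", flush=True)
            break
        passed += 1
        # Flush so the progress line is visible even if the next run hangs.
        print(f"run {attempt}: passed", flush=True)
    return passed
```

For the subtest in this report the call would be run_repeatedly(["./drv_suspend", "--run-subtest", "sysfs-reader"]), started from the directory that contains the test binary.
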
Comment 1 Chris Wilson 2014-05-13 08:36:59 UTC
That warning is not the actual error, though it is scary and looks more like an ACPI issue.

The hang is from the mismerge of 
commit 78325f2d270897c9ee0887125b7abb963eb8efea
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Tue Apr 29 14:52:29 2014 -0700

    drm/i915: Virtualize the ringbuffer signal func
Comment 2 Chris Wilson 2014-05-13 14:05:58 UTC
The hangs should be fixed by

commit d1533379584f8edcfcabb024dffc1b334db8da0f
Author: Oscar Mateo <oscar.mateo@intel.com>
Date:   Fri May 9 13:44:59 2014 +0100

    drm/i915: Ringbuffer signal func for the second BSD ring
    
    This is missing in:
    
    commit 78325f2d270897c9ee0887125b7abb963eb8efea
    Author: Ben Widawsky <benjamin.widawsky@intel.com>
    Date:   Tue Apr 29 14:52:29 2014 -0700
    
        drm/i915: Virtualize the ringbuffer signal func
    
    Looks to me like a rebase side-effect...
    
    Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Comment 3 Guo Jinxian 2014-05-16 05:21:01 UTC
On the latest -next-queued (b7bb243924e9284f605368e22c3aa4ca3c980d81), I ran this test 5 times and did not see a system hang. Verified.
Comment 4 Jari Tahvanainen 2017-09-04 10:19:23 UTC
Closing old verified+fixed.