Bug 98804

Summary: WARN_ON(reg->pin_count) during IGT
Product: DRI
Reporter: Tvrtko Ursulin <tvrtko.ursulin>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: bblanco, bjorn, ernstp, intel-gfx-bugs, lionel.g.landwerlin
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform: SKL
i915 features:
Attachments:
Description          Flags
dmesg 1              none
dmesg 2              none
dmesg.txt            none
journalctl -k -b     none

Description Tvrtko Ursulin 2016-11-21 10:01:33 UTC
Created attachment 128097 [details]
dmesg 1

With the current nightly I have hit the trace below in two different scenarios so far. The first is while running the BAT suite, where it happens during the kms_busy/basic-flip-default-C subtest. I also hit it while running kms_setmode/basic manually.

[ 1915.935231] [drm:intel_runtime_suspend [i915]] Suspending device
[ 1915.935248] ------------[ cut here ]------------
[ 1915.935332] WARNING: CPU: 1 PID: 10861 at drivers/gpu/drm/i915/i915_gem.c:2022 i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.935383] WARN_ON(reg->pin_count)
[ 1915.935399] Modules linked in: snd_hda_intel i915 drm_kms_helper vgem netconsole scsi_transport_iscsi fuse vfat fat x86_pkg_temp_thermal coretemp intel_cstate intel_uncore snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mei_me mei serio_raw intel_rapl_perf intel_pch_thermal soundcore wmi acpi_pad i2c_algo_bit syscopyarea sysfillrect sysimgblt fb_sys_fops drm r8169 mii video [last unloaded: drm_kms_helper]
[ 1915.935785] CPU: 1 PID: 10861 Comm: kworker/1:0 Tainted: G     U  W       4.9.0-rc5+ #170
[ 1915.935799] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 1915.935822] Workqueue: pm pm_runtime_work
[ 1915.935845]  ffffc900044fbbf0 ffffffffac3220bc ffffc900044fbc40 0000000000000000
[ 1915.935890]  ffffc900044fbc30 ffffffffac059bcb 000007e6044fbc60 ffff8801626e3198
[ 1915.935937]  ffff8801626e0000 0000000000000002 ffffffffc05e5d4e 0000000000000000
[ 1915.935985] Call Trace:
[ 1915.936013]  [<ffffffffac3220bc>] dump_stack+0x4f/0x73
[ 1915.936038]  [<ffffffffac059bcb>] __warn+0xcb/0xf0
[ 1915.936060]  [<ffffffffac059c4f>] warn_slowpath_fmt+0x5f/0x80
[ 1915.936158]  [<ffffffffc052d916>] i915_gem_runtime_suspend+0x116/0x130 [i915]
[ 1915.936251]  [<ffffffffc04f1c74>] intel_runtime_suspend+0x64/0x280 [i915]
[ 1915.936277]  [<ffffffffac0926f1>] ? dequeue_entity+0x241/0xbc0
[ 1915.936298]  [<ffffffffac36bb85>] pci_pm_runtime_suspend+0x55/0x180
[ 1915.936317]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936339]  [<ffffffffac4514e2>] __rpm_callback+0x32/0x70
[ 1915.936356]  [<ffffffffac451544>] rpm_callback+0x24/0x80
[ 1915.936375]  [<ffffffffac36bb30>] ? pci_pm_runtime_resume+0xa0/0xa0
[ 1915.936392]  [<ffffffffac45222d>] rpm_suspend+0x12d/0x680
[ 1915.936415]  [<ffffffffac69f6d7>] ? _raw_spin_unlock_irq+0x17/0x30
[ 1915.936435]  [<ffffffffac0810b8>] ? finish_task_switch+0x88/0x220
[ 1915.936455]  [<ffffffffac4534bf>] pm_runtime_work+0x6f/0xb0
[ 1915.936477]  [<ffffffffac074353>] process_one_work+0x1f3/0x4d0
[ 1915.936501]  [<ffffffffac074678>] worker_thread+0x48/0x4e0
[ 1915.936523]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936542]  [<ffffffffac074630>] ? process_one_work+0x4d0/0x4d0
[ 1915.936559]  [<ffffffffac07a2c9>] kthread+0xd9/0xf0
[ 1915.936580]  [<ffffffffac07a1f0>] ? kthread_park+0x60/0x60
[ 1915.936600]  [<ffffffffac69fe62>] ret_from_fork+0x22/0x30
[ 1915.936646] ---[ end trace f9357374d2333f67 ]---
[ 1915.940175] [drm:intel_runtime_suspend [i915]] Device suspended
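
When comparing splats like the one above across kernel builds, it can help to pull the source location and symbol offset out of the WARNING header line. The following is a minimal Python sketch, assuming the header keeps the format shown in this trace. The names WARN_RE and parse_warning are illustrative and not part of any kernel tooling.

```python
import re

# Matches the header line the kernel prints for a WARN_ON splat.
# Assumption: the layout is the one shown in the trace above.
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(line):
    """Return a dict describing a WARNING header line, or None if no match."""
    m = WARN_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    # Numeric fields are converted; offset and size stay as hex strings.
    for key in ("cpu", "pid", "line"):
        info[key] = int(info[key])
    return info


# Header line copied from the trace in the description.
sample = (
    "[ 1915.935332] WARNING: CPU: 1 PID: 10861 at "
    "drivers/gpu/drm/i915/i915_gem.c:2022 "
    "i915_gem_runtime_suspend+0x116/0x130 [i915]"
)
print(parse_warning(sample))
```

Only the file, line, and symbol offset are extracted; the call trace itself is not parsed.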
Comment 1 Tvrtko Ursulin 2016-11-21 10:01:59 UTC
Created attachment 128098 [details]
dmesg 2
Comment 2 Ernst Sjöstrand 2016-12-12 09:31:38 UTC
Created attachment 128425 [details]
dmesg.txt

I got this 4 times in a row.

I'm running this kernel:

http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-next/2016-12-01/

These binary packages represent builds of the mainline or stable Linux kernel tree at the commit below:

  cod/tip/drm-next/2016-12-01 (0d5320fc194128a1a584a7e91a606cb3af2ded80)
Comment 3 Chris Murphy 2017-01-11 00:40:54 UTC
Created attachment 128881 [details]
journalctl -k -b

I'm getting a call trace like this every 24 seconds with kernels 4.10-rc1 through rc3. It happens with or without i915.enable_guc_loading=-1 i915.enable_guc_submission=-1

The SKL firmware is always being used; I haven't tried inhibiting it.

Jan 10 12:04:08 f25h kernel: [drm] Finished loading i915/skl_dmc_ver1_26.bin (v1.26)
Jan 10 12:04:08 f25h kernel: [drm] Initialized i915 1.6.0 20161121 for 0000:00:02.0 on minor 0
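
A repeat interval like the 24 seconds reported here can be measured from a saved log by differencing the timestamps of successive WARNING headers. The following is a rough Python sketch, assuming `dmesg`-style `[ seconds.micros]` prefixes. The `journalctl` excerpt above uses wall-clock prefixes and would need a different pattern. The name warning_intervals is illustrative.

```python
import re

# Assumption: lines carry a dmesg-style "[ 1915.935332]" prefix and the
# WARNING header follows it directly.
STAMPED_WARN_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: ")


def warning_intervals(lines):
    """Return the gaps, in seconds, between consecutive WARNING headers."""
    stamps = []
    for line in lines:
        m = STAMPED_WARN_RE.match(line)
        if m:
            stamps.append(float(m.group("ts")))
    # Pair each timestamp with its successor and take the difference.
    return [round(b - a, 6) for a, b in zip(stamps, stamps[1:])]
```

Feeding it the lines of a saved `dmesg` dump yields one gap per pair of consecutive warnings, which makes a steady period easy to spot.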
Comment 4 Bjørn Mork 2017-01-30 15:19:29 UTC
Still there in v4.10.0-rc6:

[10534.837409] ------------[ cut here ]------------
[10534.837476] WARNING: CPU: 1 PID: 4083 at drivers/gpu/drm/i915/i915_gem.c:2013 i915_gem_runtime_suspend+0x122/0x130 [i915]
[10534.837481] WARN_ON(reg->pin_count)
[10534.837483] Modules linked in: rfcomm xt_multiport iptable_filter 8021q garp mrp stp llc tun cmac bnep snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic intel_rapl mei_wdt x86_pkg_temp_thermal intel_powerclamp coretemp cdc_mbim cdc_wdm uvcvideo kvm_intel videobuf2_vmalloc cdc_ncm videobuf2_memops usbnet videobuf2_v4l2 videobuf2_core mii videodev arc4 kvm qcserial usb_wwan usbserial nls_utf8 irqbypass nls_cp437 crct10dif_pclmul crc32_pclmul vfat fat ghash_clmulni_intel pcbc btusb btrtl btbcm btintel bluetooth efi_pstore i915 intel_gtt iwlmvm mac80211 aesni_intel aes_x86_64 snd_hda_intel crypto_simd glue_helper cryptd evdev snd_hda_codec snd_hwdep snd_hda_core efivars serio_raw i2c_algo_bit drm_kms_helper syscopyarea snd_pcm iwlwifi sysfillrect iTCO_wdt sysimgblt fb_sys_fops snd_timer
[10534.837592]  iTCO_vendor_support drm cfg80211 intel_pch_thermal mei_me thinkpad_acpi mei wmi nvram snd soundcore rfkill ac battery tpm_crb video button tpm_tis tpm_tis_core tpm sunrpc efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache crc32c_generic intel_ishtp_hid hid rtsx_pci_sdmmc mmc_core crc32c_intel psmouse e1000e ptp i2c_i801 pps_core xhci_pci nvme nvme_core xhci_hcd rtsx_pci mfd_core usbcore intel_ish_ipc intel_ishtp thermal
[10534.837664] CPU: 1 PID: 4083 Comm: kworker/1:0 Tainted: G        W       4.10.0-rc6 #439
[10534.837669] Hardware name: LENOVO 20FB006AMN/20FB006AMN, BIOS N1FET47W (1.21 ) 11/28/2016
[10534.837681] Workqueue: pm pm_runtime_work
[10534.837686] Call Trace:
[10534.837699]  dump_stack+0x67/0x92
[10534.837705]  __warn+0xd1/0xf0
[10534.837710]  warn_slowpath_fmt+0x5f/0x80
[10534.837761]  i915_gem_runtime_suspend+0x122/0x130 [i915]
[10534.837772]  ? pci_pm_runtime_resume+0xa0/0xa0
[10534.837808]  intel_runtime_suspend+0x64/0x270 [i915]
[10534.837817]  pci_pm_runtime_suspend+0x5b/0x180
[10534.837824]  __rpm_callback+0xbe/0x200
[10534.837832]  ? cpuacct_charge+0x36/0x1e0
[10534.837839]  rpm_callback+0x22/0x90
[10534.837847]  ? pci_pm_runtime_resume+0xa0/0xa0
[10534.837853]  rpm_suspend+0x132/0x700
[10534.837861]  pm_runtime_work+0x7b/0xc0
[10534.837867]  process_one_work+0x1fe/0x6d0
[10534.837873]  ? process_one_work+0x17f/0x6d0
[10534.837879]  worker_thread+0x69/0x4c0
[10534.837888]  kthread+0x12b/0x160
[10534.837893]  ? process_one_work+0x6d0/0x6d0
[10534.837900]  ? kthread_create_on_node+0x60/0x60
[10534.837907]  ret_from_fork+0x2e/0x40
[10534.837958] ---[ end trace 412b5ca7a53a8b1d ]---


Easily triggered by doing

bjorn@miraculix:~$  grep . /sys/class/drm/card0/device/power/*
/sys/class/drm/card0/device/power/async:enabled
/sys/class/drm/card0/device/power/autosuspend_delay_ms:2000
/sys/class/drm/card0/device/power/control:auto
/sys/class/drm/card0/device/power/runtime_active_kids:0
/sys/class/drm/card0/device/power/runtime_active_time:10806522
/sys/class/drm/card0/device/power/runtime_enabled:enabled
/sys/class/drm/card0/device/power/runtime_status:active
/sys/class/drm/card0/device/power/runtime_suspended_time:8310
/sys/class/drm/card0/device/power/runtime_usage:6
bjorn@miraculix:~$ xset dpms force off 

and waiting a couple of seconds.
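
The grep listing above can also be captured programmatically before forcing DPMS off, for instance to record whether runtime PM is under automatic control on the machine being tested. The following is a minimal Python sketch, assuming the same card0 path as in the listing. The helper names read_power_attrs and runtime_pm_armed are hypothetical.

```python
import os

# Path taken from the listing above; other machines may use a different card.
POWER_DIR = "/sys/class/drm/card0/device/power"


def read_power_attrs(power_dir=POWER_DIR):
    """Read every readable attribute file in power_dir into a dict."""
    attrs = {}
    for name in sorted(os.listdir(power_dir)):
        path = os.path.join(power_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                attrs[name] = f.read().strip()
        except OSError:
            # Some sysfs attributes return an error on read; skip them.
            continue
    return attrs


def runtime_pm_armed(attrs):
    """True if runtime PM is enabled and left under automatic control.

    A 'control' value of "auto" lets the PM core suspend the device when
    idle, whereas "on" keeps it powered.
    """
    return (
        attrs.get("control") == "auto"
        and attrs.get("runtime_enabled") == "enabled"
    )


if __name__ == "__main__" and os.path.isdir(POWER_DIR):
    attrs = read_power_attrs()
    for name, value in attrs.items():
        print(f"{name}:{value}")
    print("runtime PM armed:", runtime_pm_armed(attrs))
```

With the values from the listing (control set to auto, runtime_enabled set to enabled), the check reports that the device is eligible for runtime suspend once it goes idle.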
Comment 5 Chris Wilson 2017-02-03 12:43:07 UTC
*** Bug 99652 has been marked as a duplicate of this bug. ***
Comment 6 Lionel Landwerlin 2017-02-03 12:46:36 UTC
Same issue on a KBL system. I can confirm it happens every time the screen goes off.
Comment 7 Chris Wilson 2017-02-10 22:22:58 UTC
commit e0ec3ec698851a6c97a12d696407b3ff77700c23
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Feb 3 12:57:17 2017 +0000

    drm/i915: Remove overzealous fence warn on runtime suspend
