Bug 107915

Summary: drivers/gpu/drm/i915/intel_uncore.c:1083 __unclaimed_reg_debug - HP Pavilion x360
Product: DRI
Reporter: Len Brown <lenb>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, ville.syrjala
Version: XOrg git
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard: Triaged
i915 platform: BSW/CHT
i915 features:
Bug Depends on:
Bug Blocks: 110785

Description Len Brown 2018-09-13 00:25:47 UTC
Ubuntu 18.04 base distro with Linux-4.19.0-rc2

Running suspend-to-idle (freeze) endurance tests.
In one run out of 1500, the kernel WARNING below appeared.


[ 1273.448987] PM: resume from suspend-to-idle
[ 1273.449272] i915 0000:00:02.0: calling i915_pm_resume_early+0x0/0x10 [i915] @ 46, parent: pci0000:00
[ 1273.449550] intel-vbtn INT33D6:00: calling acpi_subsys_resume_early+0x0/0x30 @ 6289, parent: PNP0C09:00
[ 1273.449580] intel-vbtn INT33D6:00: acpi_subsys_resume_early+0x0/0x30 returned 0 after 12 usecs
[ 1273.449604] i2c_designware 808622C1:00: calling acpi_lpss_resume_early+0x0/0x30 @ 6289, parent: pci0000:00
[ 1273.449902] ------------[ cut here ]------------
[ 1273.449912] Unclaimed write to register 0x1e0100
[ 1273.450298] WARNING: CPU: 3 PID: 46 at drivers/gpu/drm/i915/intel_uncore.c:1083 __unclaimed_reg_debug+0x43/0x60 [i915]
[ 1273.450302] Modules linked in: btrfs zstd_compress xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c ext2 rfcomm cmac bnep nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek intel_rapl intel_powerclamp coretemp snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core kvm_intel snd_hwdep hid_sensor_accel_3d hid_sensor_gyro_3d hid_sensor_magn_3d hid_sensor_rotation hid_sensor_incl_3d hid_sensor_trigger punit_atom_debug industrialio_triggered_buffer kfifo_buf uvcvideo hid_sensor_iio_common industrialio videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 crct10dif_pclmul videobuf2_common snd_pcm snd_seq_dummy cmdlinepart crc32_pclmul videodev media intel_spi_platform ghash_clmulni_intel snd_seq_oss pcbc intel_spi snd_seq_midi snd_seq_midi_event aesni_intel arc4 spi_nor aes_x86_64
[ 1273.450547]  mtd crypto_simd cryptd glue_helper snd_rawmidi iwlmvm mac80211 intel_cstate iwlwifi cfg80211 snd_seq snd_seq_device snd_timer rtsx_pci_ms memstick btusb btrtl nxp_nci_i2c nxp_nci input_leds snd intel_xhci_usb_role_switch roles nci serio_raw soundcore hci_uart hp_wmi btqca btbcm wmi_bmof btintel bluetooth ecdh_generic joydev nfc intel_vbtn pwm_lpss_platform processor_thermal_device sparse_keymap lpc_ich rfkill_gpio dw_dmac dw_dmac_core mei_txe mei soc_button_array pwm_lpss intel_soc_dts_iosf hp_accel int3403_thermal int340x_thermal_zone int3400_thermal acpi_thermal_rel lis3lv02d mac_hid input_polldev intel_int0002_vgpio hp_wireless acpi_pad hid_multitouch sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 hid_sensor_custom hid_sensor_hub rtsx_pci_sdmmc hid_generic
[ 1273.450809]  kvmgt vfio_mdev mdev vfio_iommu_type1 vfio kvm irqbypass i2c_algo_bit cec rc_core drm_kms_helper psmouse syscopyarea sysfillrect sysimgblt ahci fb_sys_fops rtsx_pci libahci drm r8169 usbhid wmi video i2c_hid hid
[ 1273.450942] CPU: 3 PID: 46 Comm: kworker/u8:1 Tainted: G     U  W         4.19.0-rc2+ #2
[ 1273.450949] Hardware name: Hewlett-Packard HP Pavilion x360 Convertible/8074, BIOS F.16 05/08/2015
[ 1273.450966] Workqueue: events_unbound async_run_entry_fn
[ 1273.451189] RIP: 0010:__unclaimed_reg_debug+0x43/0x60 [i915]
[ 1273.451204] Code: ff ff 38 d8 76 2d 45 84 ed 48 c7 c0 33 30 7e c0 48 c7 c6 3d 30 7e c0 48 0f 45 f0 44 89 e2 48 c7 c7 46 30 7e c0 e8 1d eb 77 cd <0f> 0b 83 2d 4c 7d 10 00 01 5b 41 5c 41 5d 5d c3 0f 1f 00 66 2e 0f
[ 1273.451211] RSP: 0018:ffffa75380837cd8 EFLAGS: 00010082
[ 1273.451223] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[ 1273.451229] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff925eb9b96420
[ 1273.451236] RBP: ffffa75380837cf0 R08: 0000000000000000 R09: 0000000000000024
[ 1273.451242] R10: 00000000000604f0 R11: 0000000000000000 R12: 00000000001e0100
[ 1273.451248] R13: 0000000000000000 R14: 0000000000000202 R15: 00000000050007fc
[ 1273.451258] FS:  0000000000000000(0000) GS:ffff925eb9b80000(0000) knlGS:0000000000000000
[ 1273.451265] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1273.451271] CR2: 00005582d2196718 CR3: 000000013a9e4000 CR4: 00000000001006e0
[ 1273.451278] Call Trace:
[ 1273.451500]  fwtable_write32+0x190/0x1e0 [i915]
[ 1273.451695]  intel_power_domains_init_hw+0x925/0xa80 [i915]
[ 1273.451876]  i915_drm_resume_early+0x9b/0x130 [i915]
[ 1273.452056]  ? i915_pm_thaw_early+0x10/0x10 [i915]
[ 1273.452236]  i915_pm_restore_early+0x1e/0x30 [i915]
[ 1273.452415]  i915_pm_resume_early+0xe/0x10 [i915]
[ 1273.452429]  dpm_run_callback+0x59/0x180
[ 1273.452443]  device_resume_early+0xe8/0x170
[ 1273.452455]  async_resume_early+0x1d/0x50
[ 1273.452467]  async_run_entry_fn+0x3c/0x150
[ 1273.452480]  process_one_work+0x167/0x3f0
[ 1273.452491]  worker_thread+0x4d/0x460
[ 1273.452506]  kthread+0x105/0x140
[ 1273.452515]  ? rescuer_thread+0x360/0x360
[ 1273.452528]  ? kthread_destroy_worker+0x50/0x50
[ 1273.452541]  ret_from_fork+0x35/0x40
[ 1273.452554] ---[ end trace c8450f2c8aab2bf7 ]---
[ 1273.454957] i2c_designware 808622C1:00: acpi_lpss_resume_early+0x0/0x30 returned 0 after 5181 usecs
[ 1273.455003] INT0002 Virtual GPIO INT0002:00: calling acpi_subsys_resume_early+0x0/0x30 @ 6289, parent: platform
[ 1273.455037] INT0002 Virtual GPIO INT0002:00: acpi_subsys_resume_early+0x0/0x30 returned 0 after 14 usecs

Comment 1 Imre Deak 2018-09-13 14:38:22 UTC
Triggered by chv_phy_control_init()/I915_WRITE(DISPLAY_PHY_CONTROL,...).

According to Ville, the register is backed by the 'Display' power well. I'm not actually sure why the WARN triggers only sometimes; this power well should be off whenever we resume from idle.
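
To illustrate the suspected mechanism, here is a minimal user-space sketch in C. It is not i915 code: every name in it (model_device, model_power_well, model_write32, MODEL_DISPLAY_PHY_CONTROL) is hypothetical, and the rule "a write to this register is unclaimed whenever the Display power well is off" is an assumption taken from the comment above, not a statement of how the hardware detects unclaimed accesses. Only the offset 0x1e0100 comes from the WARN in the log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model only; none of these names exist in the i915 driver. */

/* Register offset reported by the WARN in the log above. */
#define MODEL_DISPLAY_PHY_CONTROL 0x1e0100u

struct model_power_well {
	bool enabled;			/* assumed state of the 'Display' power well */
};

struct model_device {
	struct model_power_well display_well;
	uint32_t phy_control;		/* last value the register accepted */
	unsigned int unclaimed_writes;	/* number of writes nothing claimed */
};

/*
 * Model a 32-bit MMIO write. Returns true if the write was claimed.
 * Assumption: the register is only reachable while its power well is on.
 */
static bool model_write32(struct model_device *dev, uint32_t offset,
			  uint32_t val)
{
	if (offset != MODEL_DISPLAY_PHY_CONTROL)
		return true;	/* other registers are outside this model */

	if (!dev->display_well.enabled) {
		/* Corresponds to "Unclaimed write to register 0x1e0100". */
		dev->unclaimed_writes++;
		return false;
	}

	dev->phy_control = val;
	return true;
}
```

In this model, a write issued while display_well.enabled is false is counted as unclaimed and leaves phy_control unchanged, whereas the same write after enabling the well is accepted. It does not explain why the real WARN appears only rarely, which is what the debug log requested below is meant to show.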

Len, could you provide a full drm.debug=0x1e log including the WARN with the following patch applied (based on 4.19.0-rc2):

From d7065f41143d9826554a11fd26052f6f9110dc9f Mon Sep 17 00:00:00 2001
From: Imre Deak <imre.deak@intel.com>
Date: Thu, 13 Sep 2018 17:26:58 +0300
Subject: [PATCH] drm/i915: Dump power domains state during init/resume

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_runtime_pm.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 6b5aa3b074ec..4ac88ed3c1a4 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -3528,6 +3528,8 @@ static void vlv_cmnlane_wa(struct drm_i915_private *dev_priv)
 	cmn->ops->disable(dev_priv, cmn);
 }
 
+static void intel_power_domains_dump_info(struct drm_i915_private *dev_priv);
+
 /**
  * intel_power_domains_init_hw - initialize hardware power domain state
  * @dev_priv: i915 device instance
@@ -3545,6 +3547,8 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
 
 	power_domains->initializing = true;
 
+	intel_power_domains_dump_info(dev_priv);
+
 	if (IS_ICELAKE(dev_priv)) {
 		icl_display_core_init(dev_priv, resume);
 	} else if (IS_CANNONLAKE(dev_priv)) {
@@ -3606,8 +3610,11 @@ static void intel_power_domains_dump_info(struct drm_i915_private *dev_priv)
 	for_each_power_well(dev_priv, power_well) {
 		enum intel_display_power_domain domain;
 
-		DRM_DEBUG_DRIVER("%-25s %d\n",
-				 power_well->name, power_well->count);
+		DRM_DEBUG_DRIVER("%-25s %d enabled %d\n",
+				 power_well->name,
+				 power_well->count,
+				 power_well->ops->is_enabled(dev_priv,
+							     power_well));
 
 		for_each_power_domain(domain, power_well->domains)
 			DRM_DEBUG_DRIVER("  %-23s %d\n",
-- 
2.13.2

Comment 2 Lakshmi 2018-09-21 10:44:07 UTC
Len, any progress here?

Comment 3 Lakshmi 2018-10-24 10:44:26 UTC
Len, do you still have the issue?
Can you apply the patch suggested by Imre and send the full dmesg captured with the kernel parameter drm.debug=0x1e?

Comment 4 Lakshmi 2018-11-02 09:33:13 UTC
No feedback for more than a month. Closing as resolved, works for me.