Bug 98695 - [byt bisected] WARN_ON(!power_domains->domain_use_count[domain])
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: highest blocker
Assignee: Matwey V. Kornilov
QA Contact: Intel GFX Bugs mailing list
Keywords: bisected, regression
Reported: 2016-11-12 13:17 UTC by Matwey V. Kornilov
Modified: 2017-07-24 22:40 UTC

i915 platform: BYT
i915 features: display/DP


Attachments
hwinfo (326.99 KB, text/plain)
2016-11-12 13:29 UTC, Matwey V. Kornilov
.config (42.94 KB, application/x-gzip)
2016-11-12 14:44 UTC, Matwey V. Kornilov
dmesg with drm.debug=0xe and latest drm-intel nightly (54.26 KB, text/plain)
2016-12-19 17:15 UTC, Matwey V. Kornilov
dmesg with drm.debug=0xe and latest drm-intel nightly (103.13 KB, text/plain)
2016-12-19 20:25 UTC, Matwey V. Kornilov
[PATCH] drm/i915: Force VDD off on the new power seqeuencer before starting to use it (5.51 KB, patch)
2016-12-20 11:02 UTC, Ville Syrjala
dmesg for patch (173.00 KB, text/plain)
2016-12-20 16:35 UTC, Matwey V. Kornilov


Description Matwey V. Kornilov 2016-11-12 13:17:42 UTC
Hello,

I see the following warnings at boot and every 5 seconds:

[   17.637647] ------------[ cut here ]------------
[   17.637814] WARNING: CPU: 0 PID: 123 at ../drivers/gpu/drm/i915/intel_runtime_pm.c:1455 intel_display_power_put+0x13a/0x170 [i915]()
[   17.637831] WARN_ON(!power_domains->domain_use_count[domain])
[   17.637989] Modules linked in: iscsi_ibft iscsi_boot_sysfs sd_mod hid_generic snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic joydev hid_penmount usbhid intel_rapl intel_soc_dts_iosf intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel jitterentropy_rng snd_hda_intel iTCO_wdt drbg iTCO_vendor_support igb snd_hda_codec ahci libahci ptp pps_core lpc_ich dca ansi_cprng i915 mei_txe aesni_intel xhci_pci xhci_hcd aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw drm_kms_helper pcspkr i2c_i801 snd_hda_core mfd_core shpchp snd_hwdep mei usbcore usb_common drm fb_sys_fops syscopyarea sysfillrect sysimgblt libata i2c_algo_bit fan thermal snd_intel_sst_acpi snd_intel_sst_core snd_soc_sst_mfld_platform snd_soc_rt5640 snd_soc_rl6231 snd_soc_core snd_compress 8250_fintek snd_pcm regmap_i2c snd_timer fjes intel_smartconnect video i2c_hid rfkill_gpio snd soundcore sdhci_acpi sdhci i2c_designware_platform i2c_designware_core mmc_core snd_soc_sst_acpi rfkill 8250_dw pwm_lpss_platform pwm_lpss button processor sg scsi_mod autofs4
[   17.638061] CPU: 0 PID: 123 Comm: kworker/0:2 Tainted: G        W        4.4.27-2-default #1
[   17.638066] Hardware name: Lex BayTrail       3I380CW A2         /Type2 - Board Product Name, BIOS 3I380CW A3 09/29/2014
[   17.638220] Workqueue: events edp_panel_vdd_work [i915]
[   17.638235]  0000000000000000 ffffffff81327657 ffff88007f84bda8 ffffffffa059c598
[   17.638245]  ffffffff8107e821 ffff88012ed4006c ffff88007f84bdf8 ffff88012ed40000
[   17.638256]  ffff88012ed49570 0000000000000000 ffffffff8107e89c ffffffffa05adebf
[   17.638258] Call Trace:
[   17.638308]  [<ffffffff81019e69>] dump_trace+0x59/0x320
[   17.638330]  [<ffffffff8101a22a>] show_stack_log_lvl+0xfa/0x180
[   17.638348]  [<ffffffff8101afd1>] show_stack+0x21/0x40
[   17.638367]  [<ffffffff81327657>] dump_stack+0x5c/0x85
[   17.638387]  [<ffffffff8107e821>] warn_slowpath_common+0x81/0xb0
[   17.638404]  [<ffffffff8107e89c>] warn_slowpath_fmt+0x4c/0x50
[   17.638544]  [<ffffffffa04e6faa>] intel_display_power_put+0x13a/0x170 [i915]
[   17.638584]  [<ffffffff81097205>] process_one_work+0x155/0x440
[   17.638603]  [<ffffffff81097d46>] worker_thread+0x116/0x4b0
[   17.638623]  [<ffffffff8109d328>] kthread+0xc8/0xe0
[   17.638645]  [<ffffffff8160978f>] ret_from_fork+0x3f/0x70
[   17.652669] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[   17.652671] 
[   17.652674] Leftover inexact backtrace:
[   17.652674] 
[   17.652693]  [<ffffffff8109d260>] ? kthread_park+0x50/0x50
[   17.652769] ---[ end trace a1367dbfc4ba2a52 ]---

Using git bisect, I found that a781ce79d51fc4952870c998937980a042927e84 is the first bad commit. The issue is also present in kernel master.

Additional information is available in the openSUSE Bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=1009674
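
For readers unfamiliar with the check in the trace: the warning fires when a display power domain reference is released although none is currently held. The following is a minimal Python model of that bookkeeping, not the i915 code. The class and method names are hypothetical; only `domain_use_count` mirrors the field named in the warning, and refusing to decrement below zero is an assumption of the model.

```python
import warnings
from collections import defaultdict


class PowerDomains:
    """Toy model of per-domain use counting (hypothetical, not i915 code)."""

    def __init__(self):
        # Mirrors the idea of power_domains->domain_use_count[domain].
        self.domain_use_count = defaultdict(int)

    def get(self, domain):
        """Take a reference on a power domain."""
        self.domain_use_count[domain] += 1

    def put(self, domain):
        """Drop a reference; warn if there was none to drop."""
        if self.domain_use_count[domain] == 0:
            # Analogue of WARN_ON(!power_domains->domain_use_count[domain]).
            warnings.warn(f"unbalanced put on {domain}", RuntimeWarning)
            return False  # assumption: the model refuses to go negative
        self.domain_use_count[domain] -= 1
        return True
```

A `put()` that is not preceded by a matching `get()` is the situation the kernel warning reports; the model says nothing about which code path is responsible.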
Comment 1 Matwey V. Kornilov 2016-11-12 13:29:59 UTC
Created attachment 127931 [details]
hwinfo
Comment 2 Matwey V. Kornilov 2016-11-12 14:44:27 UTC
Created attachment 127932 [details]
.config
Comment 3 Chris Wilson 2016-11-12 17:58:47 UTC
commit a781ce79d51fc4952870c998937980a042927e84
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri Nov 27 18:55:25 2015 +0200

    drm/i915: Clean up AUX power domain handling
    
    Introduce intel_display_port_aux_power_domain() which simply returns
    the appropriate AUX power domain for a specific port, and then replace
    the intel_display_port_power_domain() with calls to the new function
    in the DP code. As long as we're not actually enabling the port we don't
    need the lane power domains, and those are handled now purely from
    modeset_update_crtc_power_domains().
    
    My initial motivation for this was to see if I could keep the DPIO power
    wells powered down while doing AUX on CHV, but turns out I can't so this
    doesn't change anything for CHV at least. But I think it's still a
    worthwile change.
Comment 4 yann 2016-11-14 09:55:23 UTC
(In reply to Chris Wilson from comment #3)
> commit a781ce79d51fc4952870c998937980a042927e84
> Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Date:   Fri Nov 27 18:55:25 2015 +0200
> 
>     drm/i915: Clean up AUX power domain handling
>     
>     Introduce intel_display_port_aux_power_domain() which simply returns
>     the appropriate AUX power domain for a specific port, and then replace
>     the intel_display_port_power_domain() with calls to the new function
>     in the DP code. As long as we're not actually enabling the port we don't
>     need the lane power domains, and those are handled now purely from
>     modeset_update_crtc_power_domains().
>     
>     My initial motivation for this was to see if I could keep the DPIO power
>     wells powered down while doing AUX on CHV, but turns out I can't so this
>     doesn't change anything for CHV at least. But I think it's still a
>     worthwile change.

Matwey, please re-test with the latest drm-intel-nightly (https://cgit.freedesktop.org/drm-intel/) and confirm that it is now fixed on your side as well.
Comment 5 Jani Nikula 2016-11-14 10:58:11 UTC
a781ce79d51f ("drm/i915: Clean up AUX power domain handling") is the *bad* commit, not a fix.
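
To make the distinction concrete: a bisect is a binary search over history for the first commit at which the problem appears, so its result names the commit that introduced the regression. A minimal sketch of that search follows. The function name and the `is_bad` callback are hypothetical, and the sketch assumes a linear history in which every commit after the first bad one is also bad.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]
```

Each call to `is_bad` stands for building and booting a kernel at that commit and checking whether the warning appears.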
Comment 6 Matwey V. Kornilov 2016-11-14 11:12:22 UTC
(In reply to yann from comment #4)
> Matwey, please re-test with the latest drm-intel-nightly
> (https://cgit.freedesktop.org/drm-intel/) and confirm that it is now fixed
> on your side as well.

OK, I will. Should I pay extra attention to a specific commit that you believe fixes the issue?
Comment 7 yann 2016-11-14 11:23:37 UTC
(In reply to Matwey V. Kornilov from comment #6)
> OK, I will. Should I pay extra attention to a specific commit that you
> believe fixes the issue?

The goal is to test a version in which that commit is merged. If you take the latest drm-intel-nightly, for instance, you should be safe because the commit is already merged in that branch.
Comment 8 yann 2016-11-14 11:28:45 UTC
(In reply to yann from comment #7)
> The goal is to test a version in which that commit is merged. If you take
> the latest drm-intel-nightly, for instance, you should be safe because the
> commit is already merged in that branch.

Matwey, in light of Jani's comment #5, don't re-test: this is the commit that introduced the regression (sorry, my bad).
Comment 9 Jari Tahvanainen 2016-12-19 09:38:28 UTC
Highest+Blocker, since this is a regression without a workaround.
Comment 10 Ville Syrjala 2016-12-19 11:02:41 UTC
Does this happen with the latest nightly?
Comment 11 Matwey V. Kornilov 2016-12-19 17:00:23 UTC
commit cda2d70a4395323bcf064c81ee0f89d2de015544 still does NOT work
Comment 12 Ville Syrjala 2016-12-19 17:07:23 UTC
(In reply to Matwey V. Kornilov from comment #11)
> commit cda2d70a4395323bcf064c81ee0f89d2de015544 still does NOT work

No idea what that commit is since I don't have it. You shouldn't refer to commits purely by their sha1 when dealing with rebasing trees.

Also what does "does NOT work" mean exactly? The warning is still printed? Can you grab a full dmesg all the way from boot with drm.debug=0xe passed to the kernel, and attach it here?
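
As background on the requested parameter, `drm.debug` is a bitmask that selects debug message categories. The sketch below decodes a mask into category names. The bit table is an assumption based on the parameter description in kernels of that era and should be checked against the documentation for the kernel actually in use; the function name is hypothetical.

```python
# Assumed bit meanings for drm.debug in kernels of that era; verify
# against the documentation of the kernel you are running.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vblank",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]
```

Under that assumed table, `decode_drm_debug(0xe)` selects the driver, kms and prime categories and leaves the core messages off.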
Comment 13 Matwey V. Kornilov 2016-12-19 17:15:56 UTC
Created attachment 128561 [details]
dmesg with drm.debug=0xe and latest drm-intel nightly

https://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-nightly&id=cda2d70a4395323bcf064c81ee0f89d2de015544

cda2d70a4395 ("drm-tip: 2016y-12m-19d-13h-00m-10s UTC integration manifestdrm-intel-nightly")

Requested dmesg is attached.
Comment 14 Matwey V. Kornilov 2016-12-19 20:25:25 UTC
Created attachment 128568 [details]
dmesg with drm.debug=0xe and latest drm-intel nightly

I've updated the dmesg; I managed to obtain a more verbose version.
Comment 15 Ville Syrjala 2016-12-20 11:02:57 UTC
Created attachment 128576 [details] [review]
[PATCH] drm/i915: Force VDD off on the new power seqeuencer before  starting to use it

The only explanation I can derive from the log is that the VDD force bit is left on by the BIOS for the other power sequencer as well.

This patch should make sure our state tracking will be synced properly with that reality. Please test.
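
One plausible reading of that theory can be sketched as a toy state model. It is not the driver's code, and every name in it is hypothetical. It assumes that a reference is taken only when the hardware VDD bit was off at the time of the request, and dropped whenever the hardware bit is found on, so a bit left on by the BIOS leads to a release without a matching acquire.

```python
class PanelVdd:
    """Toy model of VDD tracking (hypothetical names, not i915 code)."""

    def __init__(self, hw_vdd_forced):
        self.hw_vdd_forced = hw_vdd_forced  # state left behind by firmware
        self.refs = 0                       # power domain references held
        self.underflow = False              # set where the WARN_ON would fire

    def force_off_before_use(self):
        # Idea of the patch: clear a stale force bit before tracking begins.
        self.hw_vdd_forced = False

    def vdd_on(self):
        # Assumption: a reference is taken only when hardware VDD was off.
        if not self.hw_vdd_forced:
            self.refs += 1
            self.hw_vdd_forced = True

    def vdd_off(self):
        # Assumption: a reference is dropped whenever hardware VDD is on.
        if self.hw_vdd_forced:
            if self.refs == 0:
                self.underflow = True
            else:
                self.refs -= 1
            self.hw_vdd_forced = False
```

In this model, `PanelVdd(True)` followed by `vdd_on()` and `vdd_off()` underflows, whereas calling `force_off_before_use()` first keeps the get and put balanced.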
Comment 16 Ville Syrjala 2016-12-20 11:03:57 UTC
(In reply to Ville Syrjala from comment #15)
> Created attachment 128576 [details] [review] [review]
> [PATCH] drm/i915: Force VDD off on the new power seqeuencer before  starting
> to use it
> 
> The only explanation I can derive from the log is that the VDD force bit is
> left on by the BIOS for the other power sequencer as well.
> 
> This patch should make sure our state tracking will be synced properly with
> that reality. Please test.

Oh and I'd like to see the dmesg w/ drm.debug=0xe with that patch whether or not it fixes the problem.
Comment 17 Matwey V. Kornilov 2016-12-20 16:35:47 UTC
Created attachment 128593 [details]
dmesg for patch

The warning doesn't appear with the patch.
Attached here is dmesg.
Comment 18 Ville Syrjala 2016-12-20 16:46:19 UTC
(In reply to Matwey V. Kornilov from comment #17)
> Created attachment 128593 [details]
> dmesg for patch
> 
> The warning doesn't appear with the patch.
> Attached here is dmesg.


[    9.445473] [drm:vlv_detach_power_sequencer [i915]] detaching pipe B power sequencer from port C
[    9.445530] [drm:intel_enable_dp [i915]] initializing pipe A power sequencer for port C
[    9.445586] [drm:intel_dp_init_panel_power_sequencer_registers [i915]] VDD already on, disabling first
[    9.445645] [drm:intel_dp_init_panel_power_sequencer_registers [i915]] panel power sequencer register settings: PP_ON 0x87d00001, PP_OFF 0x1f40001, PP_DIV 0x270f06

Yep. So my theory was correct. Thanks for testing.
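
For anyone who wants to check their own dmesg for the same confirmation message, the following sketch parses drm debug lines of the form shown above and reports whether the "VDD already on" message is present. The regular expression and function names are hypothetical and cover only the line format visible in this excerpt.

```python
import re

# Matches lines such as:
# [    9.445586] [drm:some_function [i915]] message text
DRM_LINE = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+)(?:\s+\[(?P<module>\w+)\])?\]\s+"
    r"(?P<msg>.*)$"
)


def parse_drm_line(line):
    """Return (timestamp, function, message), or None for non-drm lines."""
    m = DRM_LINE.match(line)
    if m is None:
        return None
    return float(m["time"]), m["func"], m["msg"]


def vdd_was_forced_off(lines):
    """True if any drm debug line reports that VDD was already on."""
    return any(
        parsed is not None and "VDD already on" in parsed[2]
        for parsed in map(parse_drm_line, lines)
    )
```

Lines that do not carry a `[drm:...]` tag, such as the call trace in the description, are ignored.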
Comment 19 Ville Syrjala 2016-12-22 14:48:48 UTC
Fixed with

commit 5d5ab2d26f32bdaa5872b938658e0bf8d341bc4c
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Dec 20 18:51:17 2016 +0200

    drm/i915: Force VDD off on the new power seqeuencer before starting to use it

