Bug 69251 - [DP] Getting kernel WARN every time I disconnect a DisplayPort monitor
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2013-09-12 06:05 UTC by Stéphane Graber
Modified: 2017-07-24 22:57 UTC

Attachments
dmesg when running with drm.debug=0xe (245.57 KB, text/plain)
2013-09-12 14:13 UTC, Stéphane Graber

Description Stéphane Graber 2013-09-12 06:05:05 UTC
Every single time I unplug a DisplayPort monitor, I'm getting the following WARN in dmesg:

[88314.863262] ------------[ cut here ]------------
[88314.863325] WARNING: CPU: 3 PID: 1563 at /build/buildd/linux-3.11.0/drivers/gpu/drm/i915/intel_dp.c:2211 intel_dp_link_down+0x1b2/0x1e0 [i915]()
[88314.863328] Modules linked in: pl2303 usbserial mmc_block ufs(F) qnx4 hfsplus hfs minix ntfs msdos(F) jfs xfs(F) reiserfs ext2(F) md4(F) nls_utf8 cifs(F) fscache(F) overlayfs(F) usb_storage(F) veth(F) ipt_MASQUERADE(F) iptable_nat(F) nf_nat_ipv4(F) nf_conntrack_ipv4(F) nf_defrag_ipv4(F) xt_conntrack(F) ip6t_REJECT(F) ipt_REJECT(F) xt_CHECKSUM(F) iptable_mangle(F) xt_tcpudp(F) bridge(F) stp(F) llc(F) ip6table_filter(F) iptable_filter(F) ip_tables(F) ebtable_nat(F) ebtables(F) ip6t_MASQUERADE(F) ip6table_nat(F) nf_conntrack_ipv6(F) nf_defrag_ipv6(F) nf_nat_ipv6(F) nf_nat(F) nf_conntrack(F) zram(C) ip6_tables(F) x_tables(F) parport_pc(F) ppdev(F) rfcomm bnep binfmt_misc(F) dm_crypt(F) btusb bluetooth arc4(F) x86_pkg_temp_thermal intel_powerclamp coretemp iwldvm kvm_intel(F) mac80211 kvm(F) crc32_pclmul(F) ghash_clmulni_intel(F) aesni_intel(F) aes_x86_64(F) lrw(F) gf128mul(F) glue_helper(F) ablk_helper(F) cryptd(F) dm_multipath(F) scsi_dh(F) microcode(F) snd_hda_codec_hdmi snd_hda_codec_realtek psmouse(F) serio_raw(F) snd_hda_intel iwlwifi snd_hda_codec lpc_ich cfg80211 snd_hwdep(F) snd_pcm(F) snd_page_alloc(F) snd_seq_midi(F) snd_seq_midi_event(F) snd_rawmidi(F) snd_seq(F) snd_seq_device(F) thinkpad_acpi snd_timer(F) tpm_tis nvram(F) snd(F) mei_me mei soundcore(F) mac_hid lp(F) parport(F) nls_iso8859_1(F) btrfs(F) xor(F) zlib_deflate(F) raid6_pq(F) libcrc32c(F) i915 ahci(F) sdhci_pci libahci(F) i2c_algo_bit sdhci drm_kms_helper drm e1000e(F) ptp(F) pps_core(F) wmi video(F)
[88314.863430] CPU: 3 PID: 1563 Comm: Xorg Tainted: GF       WC   3.11.0-5-generic #11-Ubuntu
[88314.863433] Hardware name: LENOVO 2306CT0/2306CT0, BIOS G2ET92WW (2.52 ) 02/22/2013
[88314.863436]  0000000000000009 ffff880403b0fbe8 ffffffff816f23a1 0000000000000000
[88314.863441]  ffff880403b0fc20 ffffffff81061cfd ffff88040247c0c0 ffff880402298000
[88314.863445]  ffff88040d818000 0000000080180344 ffff880402622800 ffff880403b0fc30
[88314.863450] Call Trace:
[88314.863459]  [<ffffffff816f23a1>] dump_stack+0x45/0x56
[88314.863466]  [<ffffffff81061cfd>] warn_slowpath_common+0x7d/0xa0
[88314.863472]  [<ffffffff81061dda>] warn_slowpath_null+0x1a/0x20
[88314.863498]  [<ffffffffa0136f82>] intel_dp_link_down+0x1b2/0x1e0 [i915]
[88314.863522]  [<ffffffffa0138e98>] intel_disable_dp+0x68/0x70 [i915]
[88314.863545]  [<ffffffffa01247ca>] ironlake_crtc_disable+0x18a/0x8c0 [i915]
[88314.863567]  [<ffffffffa0128c6f>] intel_crtc_update_dpms+0x6f/0xa0 [i915]
[88314.863589]  [<ffffffffa0128d3a>] intel_encoder_dpms+0x1a/0x30 [i915]
[88314.863611]  [<ffffffffa012c078>] intel_connector_dpms+0x38/0x70 [i915]
[88314.863635]  [<ffffffffa00824e8>] drm_mode_obj_set_property_ioctl+0x308/0x320 [drm]
[88314.863654]  [<ffffffffa0082530>] drm_mode_connector_property_set_ioctl+0x30/0x40 [drm]
[88314.863673]  [<ffffffffa0071212>] drm_ioctl+0x532/0x660 [drm]
[88314.863687]  [<ffffffff815e3fb1>] ? sock_aio_read+0x21/0x30
[88314.863693]  [<ffffffff811a6a00>] ? do_sync_read+0x80/0xb0
[88314.863698]  [<ffffffff811b90c5>] do_vfs_ioctl+0x2e5/0x4d0
[88314.863702]  [<ffffffff81097f69>] ? vtime_account_user+0x69/0x80
[88314.863706]  [<ffffffff811b9331>] SyS_ioctl+0x81/0xa0
[88314.863713]  [<ffffffff8170246f>] tracesys+0xe1/0xe6
[88314.863716] ---[ end trace 953c742bbb2ebb3f ]---

As far as I can tell, it has no side effect other than spamming my kernel logs. I can go for weeks plugging and unplugging monitors multiple times a day; every time I get a similar warning, but things still work fine and nothing crashes.

That's running on Ubuntu 13.10 (development release) with a kernel based on the stable 3.11 mainline kernel.
The xorg intel driver is at version 2.21.14.
That's all running on a Thinkpad x230 with a 64bit OS under UEFI.

This bug isn't new. As a matter of fact, I can't remember ever not seeing it, so I think it's been around since at least the 3.2 kernel.

https://bugzilla.redhat.com/show_bug.cgi?id=889220 has a few similar reports on the 3.7 and 3.9 kernel.

https://www.google.com/search?q=WARNING+intel_dp_link_down will get you some more reports too.
Comment 1 Mika Kuoppala 2013-09-12 08:06:30 UTC
This is a kernel WARN produced by:

if (WARN_ON((I915_READ(intel_dp->output_reg) & DP_PORT_EN) == 0))

So we are disabling an already disabled DP link.

Was the monitor you unplugged active and showing some stuff
or was it turned off or in sleep when you unplugged?
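
A minimal userspace sketch of the check quoted above, for readers who want to see why a second disable trips it. This is not kernel code: `struct model_port`, `model_link_down` and `MODEL_DP_PORT_EN` are hypothetical stand-ins, the register is a plain variable, and returning early after the warning is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the port-enable bit; the real definition lives in the i915 headers. */
#define MODEL_DP_PORT_EN (UINT32_C(1) << 31)

/* Hypothetical model of one DP port: a fake register plus a warning counter. */
struct model_port {
    uint32_t output_reg;
    unsigned int warn_count;
};

/*
 * Models the guard quoted above. Returns true if the warning fired,
 * i.e. the port was already disabled when link-down was requested.
 * Returning early after the warning is an assumption of this sketch.
 */
static bool model_link_down(struct model_port *port)
{
    if ((port->output_reg & MODEL_DP_PORT_EN) == 0) {
        port->warn_count++;
        fprintf(stderr, "WARNING: link down requested on a disabled port\n");
        return true;
    }

    /* Normal path: clear the enable bit. */
    port->output_reg &= ~MODEL_DP_PORT_EN;
    return false;
}
```

Calling `model_link_down()` twice on a port that starts out enabled is silent the first time and warns the second time, which is the "disabling an already disabled link" case.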
Comment 2 Daniel Vetter 2013-09-12 08:29:47 UTC
Our state tracking when unplugging a DP port is a bit wrecked: first we get an hpd event, try to retrain the link, and since that fails we shut down the port. Then userspace tries to do something else (usually a modeset to shut down the pipe) and we notice the pipe is off already.

Imo the right thing to do is, instead of disabling the port, to enable it with some failsafe link training parameters. That should also help with a few other inconsistencies. I.e. instead of calling intel_dp_link_down we should imo call intel_dp_set_idle_link_train.

Stéphane, can you please boot with drm.debug=0xe added to your kernel bootline, reproduce the issue once and then attach the complete dmesg? I just want to confirm that I didn't miss anything.
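
The sequence described above can be sketched as a small userspace model. Everything here is hypothetical (`struct model_dp`, the `model_*` functions, the `keep_port_enabled` switch); it only illustrates how hardware state and software state drift apart on unplug, and how leaving the port enabled after a failed retrain would avoid the second disable. It makes no claim about what intel_dp_set_idle_link_train actually programs.

```c
#include <stdbool.h>

/* Hypothetical model: hardware port state vs. what the modeset code believes. */
struct model_dp {
    bool hw_port_enabled;   /* models the port-enable bit in hardware */
    bool sw_active;         /* models the software view of the output */
    unsigned int warnings;  /* how often the "already disabled" guard fired */
};

/* Disable the port; count a warning if it is already off. */
static void model_link_down(struct model_dp *dp)
{
    if (!dp->hw_port_enabled) {
        dp->warnings++;
        return;
    }
    dp->hw_port_enabled = false;
}

/*
 * Hotplug handling on unplug: the sink is gone, so retraining fails.
 * Current behaviour (keep_port_enabled == false): shut the port down.
 * Suggested behaviour (keep_port_enabled == true): leave the port on
 * with failsafe parameters. Software state is untouched either way.
 */
static void model_hpd_unplug(struct model_dp *dp, bool keep_port_enabled)
{
    if (!keep_port_enabled)
        model_link_down(dp);
}

/* Userspace reacts to the unplug and turns the output off. */
static void model_userspace_dpms_off(struct model_dp *dp)
{
    if (dp->sw_active) {
        model_link_down(dp);
        dp->sw_active = false;
    }
}

/* Runs the whole unplug sequence and returns the number of warnings. */
static unsigned int model_unplug_sequence(bool keep_port_enabled)
{
    struct model_dp dp = {
        .hw_port_enabled = true,
        .sw_active = true,
        .warnings = 0,
    };

    model_hpd_unplug(&dp, keep_port_enabled);
    model_userspace_dpms_off(&dp);
    return dp.warnings;
}
```

In this model `model_unplug_sequence(false)` yields one warning, because the port is disabled twice, while `model_unplug_sequence(true)` yields none, because the only disable comes from the later userspace request.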
Comment 3 Stéphane Graber 2013-09-12 13:48:01 UTC
(In reply to comment #1)
> This is kernel warn produced by:
> 
> if (WARN_ON((I915_READ(intel_dp->output_reg) & DP_PORT_EN) == 0))
> 
> So we are disabling an already disabled dp link.
> 
> Was the monitor you unplugged active and showing some stuff
> or was it turned off or in sleep when you unplugged?

The monitor was active.
Comment 4 Stéphane Graber 2013-09-12 14:13:34 UTC
Created attachment 85723
dmesg when running with drm.debug=0xe

Attaching the dmesg as requested by Daniel.

I plug in the display at around 261 and unplug it at around 275 (dmesg timestamps).

Unfortunately the kernel log buffer was too short to store the boot log on top of that, but hopefully just that intel drm debug output will be enough.
Comment 5 Daniel Vetter 2013-09-12 14:28:19 UTC
With the kernel option log_buf_len=4M or so you can extend the dmesg buffer. But the file contains enough I think and confirms my theory.
Comment 6 Daniel Vetter 2013-09-16 06:09:07 UTC
This isn't ivb-specific, but a generic issue with our DP code.
Comment 7 Paulo Zanoni 2013-09-17 14:24:11 UTC
(In reply to comment #6)
> This isn't ivb-specific, but a generic issue with our DP code.

I can't reproduce this on Haswell.
Comment 8 Todd Previte 2013-09-17 17:09:09 UTC
I can reliably reproduce this on my Maho Bay machine. It looks like there's a discrepancy between the HW state and the SW state. I'm already looking into this one.
Comment 9 Todd Previte 2013-09-27 21:53:59 UTC
As might be expected, just issuing an HPD pulse will also cause this to happen. I'll see if I can get this cleaned up sometime in the near future.

-T
Comment 10 Jani Nikula 2013-12-17 13:05:45 UTC
Todd, have you had (or do you expect to have) the chance to look at this, or shall we just reset the assignee back to intel-gfx-bugs?
Comment 11 Todd Previte 2013-12-20 17:28:11 UTC
If I recall correctly, Ville had fixed some WARNs around this not too long ago. I've been up to my eyeballs in the ByT DisplayPort work, hence the lack of forward motion on this issue. I did find that it's not just DisplayPort though - I noticed this same WARN when hotplugging VGA on an Acer laptop as well. This issue isn't critical (as far as I can tell) so I'd like to hang onto it so I can get it fixed as soon as the DP/ByT stuff is done. If someone else comes along in the interim and wants to take a shot at it though, feel free...

-T
Comment 12 Todd Previte 2014-01-24 21:46:22 UTC
Is this problem still present? As I mentioned above, Ville had fixed a bunch of these a while back. I seem to recall this happening when hotplugging VGA as well, so it might not be limited to DP. Let me know if it's still happening though and I can look into it further.

-T
Comment 13 andreas.sturmlechner 2014-01-25 14:03:46 UTC
I got lots of these today right at startup, LVDS only, no DisplayPort monitor available on the train ;)


[    0.530358] ------------[ cut here ]------------
[    0.530430] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/i915/i915_irq.c:1240 i965_irq_handler+0x59a/0x6b0()
[    0.530517] Received HPD interrupt although disabled
[    0.530567] Modules linked in:
[    0.530632] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-gentoo-r1 #1
[    0.530686] Hardware name: LENOVO 7469CTO/7469CTO, BIOS 6DET70WW (3.20 ) 05/16/2011
[    0.530769]  0000000000000000 0000000000000009 ffffffff815d88df ffff88023bc03df8
[    0.531346]  ffffffff810426fb 0000000000000004 ffff880232338030 0000000000020000
[    0.531346]  ffffffff8166a850 ffff880232338000 ffffffff810427ea ffffffff817f3568
[    0.531346] Call Trace:
[    0.531346]  <IRQ>  [<ffffffff815d88df>] ? dump_stack+0x41/0x51
[    0.531346]  [<ffffffff810426fb>] ? warn_slowpath_common+0x8b/0xc0
[    0.531346]  [<ffffffff810427ea>] ? warn_slowpath_fmt+0x4a/0x50
[    0.531346]  [<ffffffff8138977b>] ? gen4_read32+0x4b/0xc0
[    0.531346]  [<ffffffff8131e23a>] ? i965_irq_handler+0x59a/0x6b0
[    0.531346]  [<ffffffff81080313>] ? handle_irq_event_percpu+0x53/0x1c0
[    0.531346]  [<ffffffff810804bf>] ? handle_irq_event+0x3f/0x70
[    0.531346]  [<ffffffff81082ccf>] ? handle_edge_irq+0x6f/0x110
[    0.531346]  [<ffffffff810049aa>] ? handle_irq+0x1a/0x30
[    0.531346]  [<ffffffff815e1bd7>] ? do_IRQ+0x57/0xe0
[    0.531346]  [<ffffffff815df52a>] ? common_interrupt+0x6a/0x6a
[    0.531346]  <EOI>  [<ffffffff81091608>] ? clockevents_notify+0x1d8/0x200
[    0.531346]  [<ffffffff8146d5eb>] ? cpuidle_enter_state+0x5b/0xe0
[    0.531346]  [<ffffffff8146d5e7>] ? cpuidle_enter_state+0x57/0xe0
[    0.531346]  [<ffffffff8146d71e>] ? cpuidle_idle_call+0xae/0x1d0
[    0.531346]  [<ffffffff8100c3fe>] ? arch_cpu_idle+0xe/0x30
[    0.531346]  [<ffffffff8107f83a>] ? cpu_startup_entry+0x7a/0x230
[    0.531346]  [<ffffffff81ab9d28>] ? start_kernel+0x2e6/0x2f1
[    0.531346]  [<ffffffff81ab9859>] ? repair_env_string+0x5b/0x5b
[    0.531346] ---[ end trace 5e69a3b4183dfd2a ]---
Comment 14 Jani Nikula 2014-08-14 13:27:04 UTC
(In reply to comment #13)
> I got lots of these today right at startup, LVDS only, no Displayport
> monitor available on train ;)

Completely unrelated.
Comment 15 Jani Nikula 2014-08-14 13:27:56 UTC
(In reply to comment #12)
> Is this problem still present? As I mentioned above, Ville had fixed a bunch
> of these a while back. I seem to recall this happening when hotplugging VGA
> as well, so it might not be limited to DP. Let me know if it's still
> happening though and I can look into further.

I also recall we have some fixes related to this. Please retest with recent kernels and report back.
Comment 16 Leho Kraav (:macmaN :lkraav) 2014-08-31 12:59:52 UTC
I'm experiencing this with 3.16.1.

aug   31 15:46:00 xps14 kernel: ------------[ cut here ]------------
aug   31 15:46:00 xps14 kernel: WARNING: CPU: 2 PID: 5938 at drivers/gpu/drm/i915/intel_dp.c:3122 intel_dp_link_down+0x53/0x1cb [i915]()
aug   31 15:46:00 xps14 kernel: Modules linked in: nfsv3 nfs_acl nfs lockd sunrpc snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device msr cpufreq_stats tun ctr ccm btusb bnep b
aug   31 15:46:00 xps14 kernel:  processor snd crc32_pclmul i915 fbcon bitblit softcursor cfbfillrect font cfbimgblt intel_gtt cfbcopyarea drm_kms_helper drm ehci_pci agpgart ehci_hcd fb xhci
aug   31 15:46:00 xps14 kernel: CPU: 2 PID: 5938 Comm: Xorg Tainted: G        W     3.16.1 #37
aug   31 15:46:00 xps14 kernel: Hardware name: Dell Inc. Latitude E7440/0PPXP5, BIOS A08 02/18/2014
aug   31 15:46:00 xps14 kernel:  0000000000000009 ffff8800bb687988 ffffffff8147b2c7 0000000000000006
aug   31 15:46:00 xps14 kernel:  0000000000000000 ffff8800bb6879c8 ffffffff81038df3 0000000000000006
aug   31 15:46:00 xps14 kernel:  ffffffffa0149dd4 ffff880405910000 ffff88040694d8d8 ffff88040694f000
aug   31 15:46:00 xps14 kernel: Call Trace:
aug   31 15:46:00 xps14 kernel:  [<ffffffff8147b2c7>] dump_stack+0x4e/0x71
aug   31 15:46:00 xps14 kernel:  [<ffffffff81038df3>] warn_slowpath_common+0x7c/0x96
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0149dd4>] ? intel_dp_link_down+0x53/0x1cb [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffff81038e22>] warn_slowpath_null+0x15/0x17
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0149dd4>] intel_dp_link_down+0x53/0x1cb [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa014e5a3>] intel_dp_complete_link_train+0x107/0x296 [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0147e56>] intel_ddi_pre_enable+0x124/0x164 [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0134639>] haswell_crtc_enable+0x4b0/0x8ef [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0148af2>] ? intel_ddi_pll_select+0x41/0x2b0 [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0137595>] __intel_set_mode+0x10d7/0x11dd [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa013954b>] intel_set_mode+0x11/0x2a [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa013a18f>] intel_crtc_set_config+0x710/0xa0f [i915]
aug   31 15:46:00 xps14 kernel:  [<ffffffff8126de4e>] ? idr_alloc+0x91/0xb9
aug   31 15:46:00 xps14 kernel:  [<ffffffffa00728d6>] drm_mode_set_config_internal+0x4e/0xba [drm]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa0075882>] drm_mode_setcrtc+0x3da/0x484 [drm]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa00690d5>] drm_ioctl+0x2b3/0x414 [drm]
aug   31 15:46:00 xps14 kernel:  [<ffffffffa00754a8>] ? drm_mode_setplane+0x39f/0x39f [drm]
aug   31 15:46:00 xps14 kernel:  [<ffffffff8111f214>] do_vfs_ioctl+0x3f9/0x443
aug   31 15:46:00 xps14 kernel:  [<ffffffff81110b06>] ? vfs_write+0x134/0x195
aug   31 15:46:00 xps14 kernel:  [<ffffffff8111f297>] SyS_ioctl+0x39/0x62
aug   31 15:46:00 xps14 kernel:  [<ffffffff814805d2>] system_call_fastpath+0x16/0x1b
aug   31 15:46:00 xps14 kernel: ---[ end trace 22887e7bb10acd1a ]---
aug   31 15:46:00 xps14 kernel: [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting

Even though hostname says xps14, it's actually the E7440, as the trace also says in Hardware name.
Comment 17 Daniel Vetter 2014-11-04 16:29:14 UTC
Oops, that's another regression, which is fixed in latest kernels. Specifically this one here:

commit e0a52ac9dd2a77b71bc132075cbe0057d22963ba
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Nov 3 11:39:24 2014 +0100

    drm/i915/dp: Don't stop the link when retraining

So I'm presuming this is fixed in the latest drm-intel-nightly kernels (the patch will only land in 3.19). Please reopen if that's not the case, and thanks a lot for reporting this issue and quickly coming back with updated information.

