Bug 111219 - [CI][DRMTIP] igt@kms_chamelium@hdmi-hpd-storm - dmesg-warn - Received HPD interrupt on pin 9 although disabled
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-26 06:48 UTC by Lakshmi
Modified: 2019-11-29 19:20 UTC

See Also:
i915 platform: ICL
i915 features: display/Other


Description Lakshmi 2019-07-26 06:48:48 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/igt@kms_chamelium@hdmi-hpd-storm.html

 Received HPD interrupt on pin 9 although disabled
<4> [72.639073] WARNING: CPU: 5 PID: 0 at drivers/gpu/drm/i915/display/intel_hotplug.c:506 intel_hpd_irq_handler+0x34d/0x3d0 [i915]
<4> [72.639078] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel e1000e snd_pcm ptp pps_core mei_me mei prime_numbers cdc_ether usbnet mii
<4> [72.639111] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G     U            5.3.0-rc1-ga6efe73f1e08-drmtip_335+ #1
<4> [72.639115] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
<4> [72.639220] RIP: 0010:intel_hpd_irq_handler+0x34d/0x3d0 [i915]
<4> [72.639226] Code: dc 37 17 00 00 0f 85 0a ff ff ff 89 de 48 c7 c7 88 7b 78 c0 44 88 44 24 0c 44 88 4c 24 08 c6 05 bc 37 17 00 01 e8 f3 26 9f c6 <0f> 0b 44 0f b6 4c 24 08 44 0f b6 44 24 0c e9 d8 fe ff ff 80 bd f0
<4> [72.639230] RSP: 0000:ffffb8d840224df8 EFLAGS: 00010082
<4> [72.639236] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000006
<4> [72.639239] RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffffa4315fea6670
<4> [72.639243] RBP: ffffa4313dda0000 R08: 0000000000000000 R09: 0000000000000001
<4> [72.639248] R10: 00000000f96bc3bb R11: 00000000fecf4604 R12: 0000000000000200
<4> [72.639251] R13: 0000000000000000 R14: 0000000000000200 R15: 0000000000000200
<4> [72.639256] FS:  0000000000000000(0000) GS:ffffa4315fe80000(0000) knlGS:0000000000000000
<4> [72.639260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [72.639264] CR2: 00007fbe5b66b3c0 CR3: 000000026b210006 CR4: 0000000000760ee0
<4> [72.639268] PKRU: 55555554
<4> [72.639272] Call Trace:
<4> [72.639278]  <IRQ>
<4> [72.639357]  gen8_de_irq_handler+0x729/0x870 [i915]
<4> [72.639435]  gen11_irq_handler+0x2c9/0x420 [i915]
<4> [72.639455]  __handle_irq_event_percpu+0x41/0x2a0
<4> [72.639461]  ? handle_irq_event+0x27/0x50
<4> [72.639472]  handle_irq_event_percpu+0x2b/0x70
<4> [72.639481]  handle_irq_event+0x2f/0x50
<4> [72.639490]  handle_edge_irq+0xa6/0x190
<4> [72.639498]  handle_irq+0x17/0x20
<4> [72.639506]  do_IRQ+0x52/0x110
<4> [72.639515]  common_interrupt+0xf/0xf
<4> [72.639521]  </IRQ>
<4> [72.639529] RIP: 0010:cpuidle_enter_state+0xae/0x440
<4> [72.639533] Code: 44 00 00 31 ff e8 52 fd 90 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 62 03 00 00 31 ff e8 7b 49 98 ff e8 86 4c 9c ff fb 45 85 ed <0f> 88 b3 02 00 00 4c 2b 24 24 48 ba cf f7 53 e3 a5 9b c4 20 49 63
<4> [72.639538] RSP: 0000:ffffb8d840133e80 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffdc
<4> [72.639543] RAX: ffffa4315d4dc040 RBX: ffffffff882a5440 RCX: 0000000000000000
<4> [72.639547] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffa4315d4dc040
<4> [72.639551] RBP: ffffa43159d33528 R08: 0000000000000000 R09: 0000000000000000
<4> [72.639555] R10: 0000000000000000 R11: 0000000000000000 R12: 00000010e999eb34
<4> [72.639559] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
<4> [72.639586]  cpuidle_enter+0x24/0x40
<4> [72.639595]  do_idle+0x1e7/0x250
<4> [72.639605]  cpu_startup_entry+0x14/0x20
<4> [72.639612]  start_secondary+0x15f/0x1b0
<4> [72.639621]  secondary_startup_64+0xa4/0xb0
<4> [72.639641] irq event stamp: 2545456
<4> [72.639648] hardirqs last  enabled at (2545453): [<ffffffff877dbe3a>] cpuidle_enter_state+0xaa/0x440
<4> [72.639653] hardirqs last disabled at (2545454): [<ffffffff8700199a>] trace_hardirqs_off_thunk+0x1a/0x20
<4> [72.639660] softirqs last  enabled at (2545456): [<ffffffff870b9f99>] irq_enter+0x59/0x60
<4> [72.639665] softirqs last disabled at (2545455): [<ffffffff870b9f7e>] irq_enter+0x3e/0x60
<4> [72.639767] WARNING: CPU: 5 PID: 0 at drivers/gpu/drm/i915/display/intel_hotplug.c:506 intel_hpd_irq_handler+0x34d/0x3d0 [i915]
Comment 1 Lakshmi 2019-07-26 06:50:00 UTC
(In reply to Lakshmi from comment #0)
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/
> igt@kms_chamelium@hdmi-hpd-storm.html
> 
>  Received HPD interrupt on pin 9 although disabled

@Simon, you might be interested in this bug.
Comment 2 CI Bug Log 2019-07-26 06:50:57 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* fi-icl-u4: igt@kms_chamelium@hdmi-hpd-storm - dmesg-warn - Received HPD interrupt on pin 9 although disabled
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/igt@kms_chamelium@hdmi-hpd-storm.html
Comment 3 emersion 2019-07-26 07:06:08 UTC
Seems like a kernel bug in HPD storm handling. The driver correctly falls back to polling, but then errors because of an HPD interrupt. Note that this happens on USB-C, which might be relevant:

<7> [71.911207] [drm:__intel_tc_port_lock [i915]] Port F/TC#4: TC port mode reset (dp-alt -> tbt-alt)

User impact: seems like the driver ends up in a bad state, which could result in a blank screen when a monitor with e.g. a bad cable is connected via USB-C.
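
To make the failure mode concrete, here is a minimal, self-contained sketch of storm detection with a fallback to polling. It is not the i915 implementation: the names (hpd_ctx, hpd_pin, hpd_irq), the window length and the threshold are all hypothetical and chosen only for illustration. The idea is that each pin counts interrupts inside a time window; once the count exceeds a threshold the pin is switched to polling, and any interrupt that still arrives on that pin afterwards is the unexpected case this bug is about (the sketch only counts it, where the kernel log above shows a WARNING).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; assumptions, not taken from the i915 source. */
#define NUM_PINS        16
#define STORM_WINDOW_MS 1000
#define STORM_THRESHOLD 5

enum pin_state { PIN_IRQ_ENABLED, PIN_POLLING };

/* Hypothetical per-pin bookkeeping. */
struct hpd_pin {
    enum pin_state state;
    uint64_t window_start_ms; /* start of the current counting window */
    unsigned int count;       /* interrupts seen in the current window */
};

struct hpd_ctx {
    struct hpd_pin pins[NUM_PINS];
    unsigned int spurious;    /* interrupts seen on pins already in polling mode */
};

/*
 * Account for one hotplug interrupt on `pin` at time `now_ms`.
 * Returns true if this interrupt tipped the pin into polling mode.
 */
static bool hpd_irq(struct hpd_ctx *ctx, unsigned int pin, uint64_t now_ms)
{
    assert(pin < NUM_PINS);
    struct hpd_pin *p = &ctx->pins[pin];

    /* The pin's interrupt is supposed to be off: this is the unexpected case. */
    if (p->state == PIN_POLLING) {
        ctx->spurious++;
        return false;
    }

    /* First interrupt ever, or the previous window expired: start a new window. */
    if (p->count == 0 || now_ms - p->window_start_ms > STORM_WINDOW_MS) {
        p->window_start_ms = now_ms;
        p->count = 1;
        return false;
    }

    /* Too many interrupts inside one window: treat it as a storm. */
    if (++p->count > STORM_THRESHOLD) {
        p->state = PIN_POLLING;
        return true;
    }
    return false;
}
```

With these illustrative values, six interrupts on one pin within a second flip that pin to PIN_POLLING, and a seventh increments the spurious counter instead of being handled normally. A real driver would also have to mask the interrupt in hardware at that point; an interrupt still being delivered after that step would produce exactly the kind of warning seen here.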
Comment 4 Lakshmi 2019-07-26 07:25:17 UTC
(In reply to emersion from comment #3)
> Seems like a kernel bug in HPD storm handling. The driver correctly falls
> back to polling, but then errors because of an HPD interrupt. Note that this
> happens on USB-C, which might be relevant:
> 
> <7> [71.911207] [drm:__intel_tc_port_lock [i915]] Port F/TC#4: TC port mode
> reset (dp-alt -> tbt-alt)
> 
> User impact: seems like the driver ends up in a bad state, which could
> result in a blank screen when a monitor with e.g. a bad cable is connected
> via USB-C.

Thanks for the quick assessment. Setting the priority to High.
Comment 5 ashutosh.dixit 2019-11-06 19:42:08 UTC
Bug assessment: still happening regularly, with a 3.4 percent repro rate. We now have a description of the test from Simon, which helps:

static const char test_hpd_storm_detect_desc[] =
        "Trigger a series of hotplugs in a very small timeframe to simulate a "
        "bad cable, check the kernel falls back to polling to avoid a hotplug "
        "storm";

The warning is the same as above, and I agree that this seems to be a kernel/i915 issue. Leaving severity/priority unchanged.
Comment 6 Martin Peres 2019-11-29 19:20:34 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed to further activity.

You can subscribe to and participate in the new bug on our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/351.

