Bug 111219 - [CI][DRMTIP] igt@kms_chamelium@hdmi-hpd-storm - dmesg-warn - Received HPD interrupt on pin 9 although disabled
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-26 06:48 UTC by Lakshmi
Modified: 2019-11-06 19:42 UTC

See Also:
i915 platform: ICL
i915 features: display/Other


Description Lakshmi 2019-07-26 06:48:48 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/igt@kms_chamelium@hdmi-hpd-storm.html

 Received HPD interrupt on pin 9 although disabled
<4> [72.639073] WARNING: CPU: 5 PID: 0 at drivers/gpu/drm/i915/display/intel_hotplug.c:506 intel_hpd_irq_handler+0x34d/0x3d0 [i915]
<4> [72.639078] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep snd_hda_core ghash_clmulni_intel e1000e snd_pcm ptp pps_core mei_me mei prime_numbers cdc_ether usbnet mii
<4> [72.639111] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G     U            5.3.0-rc1-ga6efe73f1e08-drmtip_335+ #1
<4> [72.639115] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
<4> [72.639220] RIP: 0010:intel_hpd_irq_handler+0x34d/0x3d0 [i915]
<4> [72.639226] Code: dc 37 17 00 00 0f 85 0a ff ff ff 89 de 48 c7 c7 88 7b 78 c0 44 88 44 24 0c 44 88 4c 24 08 c6 05 bc 37 17 00 01 e8 f3 26 9f c6 <0f> 0b 44 0f b6 4c 24 08 44 0f b6 44 24 0c e9 d8 fe ff ff 80 bd f0
<4> [72.639230] RSP: 0000:ffffb8d840224df8 EFLAGS: 00010082
<4> [72.639236] RAX: 0000000000000000 RBX: 0000000000000009 RCX: 0000000000000006
<4> [72.639239] RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffffa4315fea6670
<4> [72.639243] RBP: ffffa4313dda0000 R08: 0000000000000000 R09: 0000000000000001
<4> [72.639248] R10: 00000000f96bc3bb R11: 00000000fecf4604 R12: 0000000000000200
<4> [72.639251] R13: 0000000000000000 R14: 0000000000000200 R15: 0000000000000200
<4> [72.639256] FS:  0000000000000000(0000) GS:ffffa4315fe80000(0000) knlGS:0000000000000000
<4> [72.639260] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [72.639264] CR2: 00007fbe5b66b3c0 CR3: 000000026b210006 CR4: 0000000000760ee0
<4> [72.639268] PKRU: 55555554
<4> [72.639272] Call Trace:
<4> [72.639278]  <IRQ>
<4> [72.639357]  gen8_de_irq_handler+0x729/0x870 [i915]
<4> [72.639435]  gen11_irq_handler+0x2c9/0x420 [i915]
<4> [72.639455]  __handle_irq_event_percpu+0x41/0x2a0
<4> [72.639461]  ? handle_irq_event+0x27/0x50
<4> [72.639472]  handle_irq_event_percpu+0x2b/0x70
<4> [72.639481]  handle_irq_event+0x2f/0x50
<4> [72.639490]  handle_edge_irq+0xa6/0x190
<4> [72.639498]  handle_irq+0x17/0x20
<4> [72.639506]  do_IRQ+0x52/0x110
<4> [72.639515]  common_interrupt+0xf/0xf
<4> [72.639521]  </IRQ>
<4> [72.639529] RIP: 0010:cpuidle_enter_state+0xae/0x440
<4> [72.639533] Code: 44 00 00 31 ff e8 52 fd 90 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 62 03 00 00 31 ff e8 7b 49 98 ff e8 86 4c 9c ff fb 45 85 ed <0f> 88 b3 02 00 00 4c 2b 24 24 48 ba cf f7 53 e3 a5 9b c4 20 49 63
<4> [72.639538] RSP: 0000:ffffb8d840133e80 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffdc
<4> [72.639543] RAX: ffffa4315d4dc040 RBX: ffffffff882a5440 RCX: 0000000000000000
<4> [72.639547] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffffa4315d4dc040
<4> [72.639551] RBP: ffffa43159d33528 R08: 0000000000000000 R09: 0000000000000000
<4> [72.639555] R10: 0000000000000000 R11: 0000000000000000 R12: 00000010e999eb34
<4> [72.639559] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
<4> [72.639586]  cpuidle_enter+0x24/0x40
<4> [72.639595]  do_idle+0x1e7/0x250
<4> [72.639605]  cpu_startup_entry+0x14/0x20
<4> [72.639612]  start_secondary+0x15f/0x1b0
<4> [72.639621]  secondary_startup_64+0xa4/0xb0
<4> [72.639641] irq event stamp: 2545456
<4> [72.639648] hardirqs last  enabled at (2545453): [<ffffffff877dbe3a>] cpuidle_enter_state+0xaa/0x440
<4> [72.639653] hardirqs last disabled at (2545454): [<ffffffff8700199a>] trace_hardirqs_off_thunk+0x1a/0x20
<4> [72.639660] softirqs last  enabled at (2545456): [<ffffffff870b9f99>] irq_enter+0x59/0x60
<4> [72.639665] softirqs last disabled at (2545455): [<ffffffff870b9f7e>] irq_enter+0x3e/0x60
<4> [72.639767] WARNING: CPU: 5 PID: 0 at drivers/gpu/drm/i915/display/intel_hotplug.c:506 intel_hpd_irq_handler+0x34d/0x3d0 [i915]
Comment 1 Lakshmi 2019-07-26 06:50:00 UTC
(In reply to Lakshmi from comment #0)
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/
> igt@kms_chamelium@hdmi-hpd-storm.html
> 
>  Received HPD interrupt on pin 9 although disabled

@Simon, you might be interested in this bug.
Comment 2 CI Bug Log 2019-07-26 06:50:57 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* fi-icl-u4: igt@kms_chamelium@hdmi-hpd-storm - dmesg-warn - Received HPD interrupt on pin 9 although disabled
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_335/fi-icl-u4/igt@kms_chamelium@hdmi-hpd-storm.html
Comment 3 emersion 2019-07-26 07:06:08 UTC
This looks like a kernel bug in HPD storm handling. The driver correctly falls back to polling, but then warns because it still receives an HPD interrupt on the pin it has disabled. Note that this happens on USB-C, which might be relevant:

<7> [71.911207] [drm:__intel_tc_port_lock [i915]] Port F/TC#4: TC port mode reset (dp-alt -> tbt-alt)

User impact: seems like the driver ends up in a bad state, which could result in a blank screen when a monitor with e.g. a bad cable is connected via USB-C.
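
For illustration, the storm handling described above can be sketched roughly as follows. This is a minimal, self-contained sketch and not the i915 implementation: the names (`handle_hpd_irq`, `pin_stats`, `spurious_irqs`) and the constants (window length, threshold, pin count) are hypothetical and chosen only to show the idea. Interrupts are counted per pin within a time window; once the count exceeds the threshold, the pin is marked disabled and is expected to be polled instead. Any interrupt that still arrives on a disabled pin is the unexpected case that this bug's warning reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* All names and values below are hypothetical, for illustration only. */
#define NUM_PINS        16
#define STORM_PERIOD_MS 1000 /* length of the counting window */
#define STORM_THRESHOLD 5    /* interrupts tolerated per window */

enum pin_state {
    PIN_ENABLED,  /* hotplug interrupts are handled normally */
    PIN_DISABLED  /* storm detected: pin should be polled instead */
};

struct pin_stats {
    enum pin_state state;
    uint64_t window_start_ms; /* start of the current counting window */
    unsigned int count;       /* interrupts seen in the current window */
};

static struct pin_stats pins[NUM_PINS];
static unsigned int spurious_irqs; /* interrupts seen on disabled pins */

/*
 * Account for one hotplug interrupt on `pin` at time `now_ms`.
 * Returns true if the interrupt should be processed as a hotplug event.
 */
static bool handle_hpd_irq(unsigned int pin, uint64_t now_ms)
{
    assert(pin < NUM_PINS);
    struct pin_stats *p = &pins[pin];

    if (p->state == PIN_DISABLED) {
        /* The interrupt was supposed to be masked: report it. */
        spurious_irqs++;
        fprintf(stderr,
                "Received HPD interrupt on pin %u although disabled\n", pin);
        return false;
    }

    /* Start a new counting window once the old one has expired. */
    if (now_ms - p->window_start_ms > STORM_PERIOD_MS) {
        p->window_start_ms = now_ms;
        p->count = 0;
    }

    /* Too many interrupts in one window: treat it as a storm. */
    if (++p->count > STORM_THRESHOLD) {
        p->state = PIN_DISABLED;
        return false;
    }

    return true;
}
```

In this sketch, the first five calls for a pin within one window return true, the sixth marks the pin disabled, and every later call for that pin increments `spurious_irqs` and prints the warning. A real driver would also mask the interrupt in hardware and re-enable the pin after a delay; neither is modelled here.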
Comment 4 Lakshmi 2019-07-26 07:25:17 UTC
(In reply to emersion from comment #3)
> Seems like a kernel bug in HPD storm handling. The driver correctly falls
> back to polling, but then errors because of an HPD interrupt. Note that this
> happens on USB-C, which might be relevant:
> 
> <7> [71.911207] [drm:__intel_tc_port_lock [i915]] Port F/TC#4: TC port mode
> reset (dp-alt -> tbt-alt)
> 
> User impact: seems like the driver ends up in a bad state, which could
> result in a blank screen when a monitor with e.g. a bad cable is connected
> via USB-C.

Thanks for the quick assessment. Setting the priority to High.
Comment 5 ashutosh.dixit 2019-11-06 19:42:08 UTC
Bug assessment: still happening regularly, with a 3.4 percent reproduction rate. We now have a description of the test from Simon, which helps:

static const char test_hpd_storm_detect_desc[] =
        "Trigger a series of hotplugs in a very small timeframe to simulate a "
        "bad cable, check the kernel falls back to polling to avoid a hotplug "
        "storm";

The warning is the same as above, and I agree that this seems to be a kernel/i915 issue. Leaving severity/priority unchanged.

