Bug 103997 - [CNL] eDP + HDMI sometimes Hard Hang machine
Summary: [CNL] eDP + HDMI sometimes Hard Hang machine
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: high critical
Assignee: James Ausmus
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-30 15:36 UTC by Elizabeth
Modified: 2018-01-05 17:19 UTC

See Also:
i915 platform: CNL
i915 features: display/HDMI


Attachments
log_testdisplay_hang (19.52 KB, text/plain)
2017-11-30 15:36 UTC, Elizabeth

Description Elizabeth 2017-11-30 15:36:39 UTC
Created attachment 135830
log_testdisplay_hang

eDP - ok
DP - ok
HDMI - ok
eDP + DP - ok
eDP + HDMI - fail
DP + HDMI - fail
eDP + DP + HDMI - fail
Comment 1 Chris Wilson 2017-11-30 15:59:53 UTC
Is there any chance of getting the lost kernel messages? Particularly the ones at the end, for the panic.
Comment 2 Elizabeth 2017-11-30 16:13:58 UTC
(In reply to Chris Wilson from comment #1)
> Is there any chance of getting the lost kernel messages? Particularly the
> ones at the end, for the panic.
Hello Chris, Rodrigo Vivi is looking into this bug; I'm not sure whether that information can be obtained.
Comment 3 Rodrigo Vivi 2017-11-30 18:51:14 UTC
They never appeared on the serial console here, even with ignoreloglevel.
After setting up the DPLL there is a "CPU didn't receive broadcast message" error, then the platform starts rebooting and the reboot gets stuck.

I believe I should have tried journalctl --boot=-1 to confirm there was nothing. I will try this as soon as I finish the bisect on the other bug, unless James gets to it first ;)
Comment 4 Rodrigo Vivi 2017-11-30 18:54:02 UTC
Oh, and BTW, the table in the description is wrong.
That was actually an old issue.

The new issue occurs when eDP is already using DPLL0
and an external output is on DPLL1, and we then disable eDP and move
the external output (HDMI or DP) to DPLL0. The machine then hard hangs.

My bad. I should have opened a new bug here at fdo directly for this, but I was lazy and reused the old one there.
Comment 5 Rodrigo Vivi 2017-12-01 01:23:19 UTC
There is no way of getting more messages after that panic.
The most information we have is this:

[  129.621015] [drm:intel_enable_shared_dpll [i915]] enable DPLL 0 (active 1, on? 0) for crtc 43
[  129.629651] [drm:intel_enable_shared_dpll [i915]] enabling DPLL 0
[  129.635903] [drm:intel_dp_dual_mode_set_tmds_output [i915]] Enabling DP dual mode adaptor TMDS output
[  129.646168] [drm:intel_power_well_enable [i915]] enabling DDI C IO power well
[  129.653635] [drm:intel_ena[  129.656104] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 5: ba00000011000402
[  129.656105] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff81b503f9> {acpi_processor_ffh_cstate_enter+0x69/0xb0}
[  129.656106] mce: [Hardware Error]: TSC 565b752bda
[  129.656107] mce: [Hardware Error]: PROCESSOR 0:60662 TIME 1512065764 SOCKET 0 APIC 0 microcode 12
[  129.656107] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[  129.656107] mce: [Hardware Error]: Machine check: Processor context corrupt
[  129.656108] Kernel panic - not syncing: Fatal machine check
[  129.656113] Kernel Offset: disabled
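
For anyone reading the machine-check line by hand, the Bank 5 status value can be split into its architectural fields. The following is a minimal sketch, assuming the standard IA32_MCi_STATUS layout from the Intel SDM (VAL bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, MCA error code in bits 15:0). The struct and function names are made up for illustration and are not kernel or mcelog APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the architectural IA32_MCi_STATUS flags. */
struct mci_status {
    bool valid;        /* bit 63: VAL, register contents are valid */
    bool overflow;     /* bit 62: OVER, an earlier error was overwritten */
    bool uncorrected;  /* bit 61: UC, error was not corrected by hardware */
    bool enabled;      /* bit 60: EN, error reporting was enabled */
    bool miscv;        /* bit 59: MISCV, MCi_MISC holds extra information */
    bool addrv;        /* bit 58: ADDRV, MCi_ADDR holds a valid address */
    bool pcc;          /* bit 57: PCC, processor context corrupt */
    uint16_t model_code; /* bits 31:16: model-specific error code */
    uint16_t mca_code;   /* bits 15:0: MCA error code */
};

/* Split a raw 64-bit status value into the fields above. */
static struct mci_status decode_mci_status(uint64_t status)
{
    struct mci_status s;

    s.valid       = (status >> 63) & 1;
    s.overflow    = (status >> 62) & 1;
    s.uncorrected = (status >> 61) & 1;
    s.enabled     = (status >> 60) & 1;
    s.miscv       = (status >> 59) & 1;
    s.addrv       = (status >> 58) & 1;
    s.pcc         = (status >> 57) & 1;
    s.model_code  = (uint16_t)(status >> 16);
    s.mca_code    = (uint16_t)status;
    return s;
}
```

Feeding it the value from the log, 0xba00000011000402, gives VAL, UC, EN, MISCV and PCC set, which matches the "Processor context corrupt" message, and no valid address. The meaning of the model-specific bits is not decoded here.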
Comment 6 James Ausmus 2017-12-01 02:19:54 UTC
The issue was that we were not masking out the previous value of the PLL-to-DDI mapping before OR'ing in the new value. Thus, with HDMI/DP previously mapped to DPLL1, when we tried to map it to DPLL0, the register continued to contain the mapping to DPLL1, which was disabled.

Patch is at https://patchwork.freedesktop.org/series/34726/
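
To illustrate the class of bug, here is a minimal standalone sketch of the read-modify-write pattern. It is not the i915 code: the 2-bit-per-port field layout and all macro and function names below are assumptions made for the example, and a plain uint32_t stands in for the hardware register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: each DDI port owns a 2-bit PLL-select field. */
#define NUM_PORTS            4u
#define CLK_SEL_BITS         2u
#define CLK_SEL_SHIFT(port)  ((port) * CLK_SEL_BITS)
#define CLK_SEL_MASK(port)   (0x3u << CLK_SEL_SHIFT(port))
#define CLK_SEL(pll, port) \
    (((uint32_t)(pll) << CLK_SEL_SHIFT(port)) & CLK_SEL_MASK(port))

/* Buggy variant: ORs in the new selection without clearing the old one. */
static uint32_t map_pll_buggy(uint32_t reg, unsigned port, unsigned pll)
{
    assert(port < NUM_PORTS);
    return reg | CLK_SEL(pll, port);
}

/* Fixed variant: clear this port's field first, then OR in the new value. */
static uint32_t map_pll_fixed(uint32_t reg, unsigned port, unsigned pll)
{
    assert(port < NUM_PORTS);
    reg &= ~CLK_SEL_MASK(port);
    return reg | CLK_SEL(pll, port);
}

/* Read back which PLL a port is currently mapped to. */
static unsigned mapped_pll(uint32_t reg, unsigned port)
{
    assert(port < NUM_PORTS);
    return (reg & CLK_SEL_MASK(port)) >> CLK_SEL_SHIFT(port);
}
```

With port 2 previously mapped to PLL 1, map_pll_buggy(reg, 2, 0) leaves the field at 1, because OR'ing in zero changes nothing, while map_pll_fixed(reg, 2, 0) yields 0 and leaves the other ports' fields untouched.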
Comment 7 Elizabeth 2018-01-05 17:19:21 UTC
For reference:
commit 46442beed972d439210580739bc006713375c5b4
Author: James Ausmus <james.ausmus@intel.com>
Date:   Thu Nov 30 18:17:00 2017 -0800

    drm/i915/cnl: Mask previous DDI - PLL mapping

