Bug 88449 - [hsw hdmi] display not detected in low ambient temperature
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2015-01-15 08:10 UTC by samuelchueh
Modified: 2017-07-18 20:30 UTC
i915 platform: HSW
i915 features: display/HDMI


Attachments
Linux kernel message (61.98 KB, text/plain)
2015-01-15 08:10 UTC, samuelchueh
Non working kernel message(drm.debug=14) (72.42 KB, text/plain)
2015-01-21 10:43 UTC, samuelchueh
working kernel message(drm.debug=14) (79.86 KB, text/plain)
2015-01-21 10:44 UTC, samuelchueh
3.19-rc6 Non working kernel message(drm.debug=14) (63.44 KB, text/plain)
2015-01-29 06:11 UTC, samuelchueh

Description samuelchueh 2015-01-15 08:10:38 UTC
Created attachment 112274
Linux kernel message

System environment:
-- chipset: Intel Haswell/Denlow
-- system architecture: 64-bit
-- xf86-video-intel: 2.99.917
-- xserver: 1.15
-- mesa: 10
-- libdrm: 2.4.54
-- kernel: 3.12.20, 3.16.2, 3.18.2
-- Linux distribution: Ubuntu

Reproducing steps:
  The Linux kernel prints an exception message at boot when the ambient temperature is low. Kernel dmesg output is below:

When the Intel i915 driver is loaded:

[   21.917499] i915 0000:00:02.0: No connectors reported connected with modes
[   21.924407] [drm] Cannot find any crtc or sizes - going 1024x768

When Xorg starts:

[   47.236534] WARNING: CPU: 1 PID: 78 at /asustor/branch2_3/x86_64/source/linux-3.12.20/drivers/gpu/drm/i915/intel_display.c:6035 hsw_disable_lcpll+0xa1/0x53a [i915]()
[   47.251280] Power well on
[   47.253904] Modules linked in: snd_seq_midi snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_midi_event snd_seq snd_seq_device cryptodev(O) iptable_filter ip_tables x_tables iscsi_scst(O) scst_vdisk(O) scst(O) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd_page_alloc snd i915 drm_kms_helper drm isofs udf crc_itu_t sr_mod cdrom asleddrv(O) quota_v2 quota_tree dm_crypt [last unloaded: bluetooth]
[   47.292187] CPU: 1 PID: 78 Comm: kworker/1:1 Tainted: G           O 3.12.20 #1
[   47.299401] Hardware name: Intel Corporation Shark Bay Platform/Flathead Creek, BIOS SBD_3.0.0.247 X64 10/22/2014
[   47.309663] Workqueue: events hsw_enable_pc8_work [i915]
[   47.314993]  0000000000001793 ffff88006b685d08 ffffffff814a8130 ffff88010024dc08
[   47.322441]  ffff88006b685d58 ffff88006b685d48 ffffffff8103ab35 00000000000034d9
[   47.329887]  ffffffffa00f7afa ffff88006d3b0000 ffff88006b50bc48 ffff88006b50b801
[   47.337334] Call Trace:
[   47.339781]  [<ffffffff814a8130>] dump_stack+0x46/0x58
[   47.344920]  [<ffffffff8103ab35>] warn_slowpath_common+0x77/0x91
[   47.350932]  [<ffffffffa00f7afa>] ? hsw_disable_lcpll+0xa1/0x53a [i915]
[   47.357540]  [<ffffffff8103abe3>] warn_slowpath_fmt+0x41/0x43
[   47.363291]  [<ffffffffa0120853>] ? i915_read32+0x91/0x9d [i915]
[   47.369301]  [<ffffffffa00f7afa>] hsw_disable_lcpll+0xa1/0x53a [i915]
[   47.375743]  [<ffffffffa00f849a>] hsw_enable_pc8_work+0xa0/0xa9 [i915]
[   47.382267]  [<ffffffff8104d56d>] process_one_work+0x172/0x232
[   47.388107]  [<ffffffff8104da14>] worker_thread+0x159/0x1ee
[   47.393676]  [<ffffffff8104d8bb>] ? rescuer_thread+0x264/0x264
[   47.399504]  [<ffffffff8104d8bb>] ? rescuer_thread+0x264/0x264
[   47.405349]  [<ffffffff81052c53>] kthread+0x88/0x90
[   47.410223]  [<ffffffff81052bcb>] ? kthread_freezable_should_stop+0x39/0x39
[   47.417169]  [<ffffffff814b17bc>] ret_from_fork+0x7c/0xb0
[   47.422562]  [<ffffffff81052bcb>] ? kthread_freezable_should_stop+0x39/0x39
[   47.429507] ---[ end trace d3b3f133587ed2f1 ]---
Comment 1 Jani Nikula 2015-01-15 13:42:11 UTC
(In reply to samuelchueh from comment #0)
> Reproducing steps:
>   The Linux kernel prints an exception message at boot when the ambient
> temperature is low. Kernel dmesg output is below:

How low is low? Are you within chip specs?

Please attach dmesg with drm.debug=14 module parameter set, all the way from boot, for both working and non-working cases.
Comment 2 samuelchueh 2015-01-16 02:21:39 UTC
The ambient temperature is low (20 degrees). That is within the chipset's specified temperature range.

How do I capture dmesg with the drm.debug=14 module parameter set?
Comment 3 Jani Nikula 2015-01-16 15:28:19 UTC
(In reply to samuelchueh from comment #2)
> How do I capture dmesg with the drm.debug=14 module parameter set?

Depends on your setup. For example, in grub you can hit 'e' before booting the kernel to edit the kernel command line. Add drm.debug=14 to the boot line.

If you have drm as a loadable module, it's possible to add that in the modprobe config.
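
As a rough illustration of what drm.debug=14 asks for, the following Python sketch decodes the value as a bitmask and builds the matching modprobe config line. The bit names follow the kernel's DRM_UT_* debug categories. The helper names are made up for this example, and the config file path in the comment is only an assumption about where such a line might be placed.

```python
# Debug category bits for the drm.debug parameter
# (names follow the kernel's DRM_UT_* constants).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


def modprobe_option_line(mask):
    """Build an 'options' line for a modprobe config file.

    The file name is up to the administrator, for example
    /etc/modprobe.d/drm-debug.conf (a hypothetical name).
    """
    return f"options drm debug={mask}"


if __name__ == "__main__":
    print(decode_drm_debug(14))
    print(modprobe_option_line(14))
```

With the value 14 (0x0e), the driver, KMS and PRIME categories are enabled and the core category is left off. When drm is built into the kernel instead of being a loadable module, the same setting goes on the kernel command line as drm.debug=14.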
Comment 4 samuelchueh 2015-01-21 10:43:21 UTC
Created attachment 112595
Non working kernel message(drm.debug=14)
Comment 5 samuelchueh 2015-01-21 10:44:03 UTC
Created attachment 112596
working kernel message(drm.debug=14)
Comment 6 Rodrigo Vivi 2015-01-22 00:21:42 UTC
This hsw_enable_pc8_work function no longer exists; PC8 handling is now part of runtime PM suspend.
This patch and the rest of its series fixed this issue: [PATCH 02/16] drm/i915: make PC8 be part of runtime PM suspend/resume.
Comment 7 samuelchueh 2015-01-22 15:11:22 UTC
Where is the patch with the fix? I can't find it in the attachments.
Comment 8 Jani Nikula 2015-01-26 08:43:46 UTC
(In reply to samuelchueh from comment #7)
> Where is the patch with the fix? I can't find it in the attachments.

The code has changed considerably in v3.15. Please try a recent kernel, preferably one of the v3.19-rc releases or the drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel.
Comment 9 Jani Nikula 2015-01-26 08:44:16 UTC
And attach logs similar to comment #4 and comment #5 for the new kernel.
Comment 10 samuelchueh 2015-01-26 09:08:27 UTC
We tested the latest Linux kernel, 3.18.2. The test fails on this version.
Comment 11 samuelchueh 2015-01-28 09:02:36 UTC
We tested Linux kernel v3.19-rc6. It did not fix this issue.
Comment 12 Jani Nikula 2015-01-28 09:46:53 UTC
Please attach the logs for the new kernel as requested in comment #9. Thanks.
Comment 13 samuelchueh 2015-01-29 06:11:49 UTC
Created attachment 112926
3.19-rc6 Non working kernel message(drm.debug=14)
Comment 14 samuelchueh 2015-01-29 06:20:30 UTC
[    6.943962] [drm:intel_hdmi_detect] [CONNECTOR:22:HDMI-A-1]
[    6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)


GMBUS returns a NAK when the ambient temperature is low.
Can the driver retry intel_hdmi_detect when GMBUS returns a NAK?
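
For comparing the working and non-working logs, a small Python sketch like the one below could pull out the GMBUS NAK lines. It is not part of the driver. The regular expression is written only against the line format quoted in this comment, and the function name is made up for the example. The address in the log is hexadecimal, and 0x50 is the I2C address a monitor's EDID is normally read from over DDC.

```python
import re

# Matches debug lines of the form quoted above, e.g.
# [    6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)
NAK_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm:gmbus_xfer\]\s+"
    r"GMBUS \[(?P<bus>[^\]]+)\] NAK for addr: (?P<addr>[0-9a-fA-F]+)"
)


def find_gmbus_naks(lines):
    """Return (timestamp, bus name, address) for every GMBUS NAK line."""
    hits = []
    for line in lines:
        match = NAK_RE.search(line)
        if match:
            hits.append(
                (
                    float(match.group("time")),
                    match.group("bus"),
                    int(match.group("addr"), 16),  # address is logged in hex
                )
            )
    return hits


if __name__ == "__main__":
    sample = [
        "[    6.943962] [drm:intel_hdmi_detect] [CONNECTOR:22:HDMI-A-1]",
        "[    6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)",
    ]
    for timestamp, bus, addr in find_gmbus_naks(sample):
        print(f"{timestamp:.6f}s {bus}: NAK from address 0x{addr:02x}")
```

Running the same function over both attached logs would show whether the NAK appears only in the low-temperature boot.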
Comment 15 Jani Nikula 2015-01-29 09:22:58 UTC
I'm suspecting a hardware problem. But let's go back to basics and check some of the assumptions.

Do you get a picture on screen from BIOS when you boot up, before the kernel and i915 are loaded, in the non-working conditions?

Is the display in the same ambient temperature as the machine?

Can you try the machine with another display, or another machine with the same display, in the non-working conditions?

Can you try having the machine in the non-working temperature and the display in the working temperature, and vice versa?

Does the problem go away if you let the machine and display warm up in the low ambient temperature? (You can e.g. unplug and replug the monitor, or just reboot, to retry.)
Comment 16 samuelchueh 2015-01-29 10:22:43 UTC
My answers to your questions are marked with ==> below:

Do you get a picture on screen from BIOS when you boot up, before the kernel and i915 are loaded, in the non-working conditions?
==> Yes, I can see the picture on screen from BIOS at boot in the non-working conditions. There is no signal output on the monitor after the i915 driver is loaded.

Is the display in the same ambient temperature as the machine?
==> In the working conditions, it can output the 1080p resolution.

Can you try the machine with another display, or another machine with the same display, in the non-working conditions?
Can you try having the machine in the non-working temperature and the display in the working temperature, and vice versa?
==> We tried another monitor, and the test also fails in the non-working conditions. All of the tested monitors work normally with other machines.

Does the problem go away if you let the machine and display warm up in the low ambient temperature? (You can e.g. unplug and replug the monitor, or just reboot, to retry.)
==> Yes, the problem goes away when the machine warms up.
==> It can display at 1024x768 resolution after unplugging and replugging the HDMI cable.
==> Can the i915 driver fix the issue?
==> If we want to work around this issue, do you know how to make the i915/drm driver reset or re-detect the EDID from the Linux command line?
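
One possible workaround for the last question, sketched in Python: on kernels where the connector's sysfs status attribute is writable, writing "detect" to it asks the DRM core to probe the connector again. Whether the kernels tested in this report support this is not confirmed here, so treat it as an assumption. The connector name card0-HDMI-A-1 is a guess based on the HDMI-A-1 connector in the log, and the function name is made up for the example.

```python
from pathlib import Path


def force_redetect(connector="card0-HDMI-A-1", sysfs_root="/sys/class/drm"):
    """Ask the DRM core to re-probe one connector and return its status.

    Assumes the connector's 'status' attribute is writable (needs root).
    The default connector name is an assumption; list sysfs_root to find
    the real one.
    """
    status = Path(sysfs_root) / connector / "status"
    status.write_text("detect")        # trigger a fresh connector probe
    return status.read_text().strip()  # e.g. "connected" or "disconnected"
```

Calling force_redetect() as root once the machine has warmed up would be the software equivalent of unplugging and replugging the cable, without a reboot.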
Comment 17 Paulo Zanoni 2015-02-23 21:00:28 UTC
(In reply to samuelchueh from comment #2)
> The ambient temperature is low (20 degrees).

Celsius or Fahrenheit?
Comment 18 samuelchueh 2015-03-02 11:13:19 UTC
Celsius
Comment 19 Elizabeth 2017-07-18 20:30:34 UTC
Closing this bug since there have been no new updates on the case. Thanks.