Created attachment 112274 [details]
Linux kernel message

System environment:
-- chipset: Intel Haswell/Denlow
-- system architecture: 64-bit
-- xf86-video-intel: 2.99.917
-- xserver: 1.15
-- mesa: 10
-- libdrm: 2.4.54
-- kernel: 3.12.20, 3.16.2, 3.18.2
-- Linux distribution: ubuntu

Reproducing steps:
linux kernel boot exception message happen when environment temperature is low. kernel dmesg as below:

When Intel i915 driver insert:
[ 21.917499] i915 0000:00:02.0: No connectors reported connected with modes
[ 21.924407] [drm] Cannot find any crtc or sizes - going 1024x768

When start Xorg :
[ 47.236534] WARNING: CPU: 1 PID: 78 at /asustor/branch2_3/x86_64/source/linux-3.12.20/drivers/gpu/drm/i915/intel_display.c:6035 hsw_disable_lcpll+0xa1/0x53a [i915]()
[ 47.251280] Power well on
[ 47.253904] Modules linked in: snd_seq_midi snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_midi_event snd_seq snd_seq_device cryptodev(O) iptable_filter ip_tables x_tables iscsi_scst(O) scst_vdisk(O) scst(O) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd_page_alloc snd i915 drm_kms_helper drm isofs udf crc_itu_t sr_mod cdrom asleddrv(O) quota_v2 quota_tree dm_crypt [last unloaded: bluetooth]
[ 47.292187] CPU: 1 PID: 78 Comm: kworker/1:1 Tainted: G O 3.12.20 #1
[ 47.299401] Hardware name: Intel Corporation Shark Bay Platform/Flathead Creek, BIOS SBD_3.0.0.247 X64 10/22/2014
[ 47.309663] Workqueue: events hsw_enable_pc8_work [i915]
[ 47.314993] 0000000000001793 ffff88006b685d08 ffffffff814a8130 ffff88010024dc08
[ 47.322441] ffff88006b685d58 ffff88006b685d48 ffffffff8103ab35 00000000000034d9
[ 47.329887] ffffffffa00f7afa ffff88006d3b0000 ffff88006b50bc48 ffff88006b50b801
[ 47.337334] Call Trace:
[ 47.339781] [<ffffffff814a8130>] dump_stack+0x46/0x58
[ 47.344920] [<ffffffff8103ab35>] warn_slowpath_common+0x77/0x91
[ 47.350932] [<ffffffffa00f7afa>] ? hsw_disable_lcpll+0xa1/0x53a [i915]
[ 47.357540] [<ffffffff8103abe3>] warn_slowpath_fmt+0x41/0x43
[ 47.363291] [<ffffffffa0120853>] ? i915_read32+0x91/0x9d [i915]
[ 47.369301] [<ffffffffa00f7afa>] hsw_disable_lcpll+0xa1/0x53a [i915]
[ 47.375743] [<ffffffffa00f849a>] hsw_enable_pc8_work+0xa0/0xa9 [i915]
[ 47.382267] [<ffffffff8104d56d>] process_one_work+0x172/0x232
[ 47.388107] [<ffffffff8104da14>] worker_thread+0x159/0x1ee
[ 47.393676] [<ffffffff8104d8bb>] ? rescuer_thread+0x264/0x264
[ 47.399504] [<ffffffff8104d8bb>] ? rescuer_thread+0x264/0x264
[ 47.405349] [<ffffffff81052c53>] kthread+0x88/0x90
[ 47.410223] [<ffffffff81052bcb>] ? kthread_freezable_should_stop+0x39/0x39
[ 47.417169] [<ffffffff814b17bc>] ret_from_fork+0x7c/0xb0
[ 47.422562] [<ffffffff81052bcb>] ? kthread_freezable_should_stop+0x39/0x39
[ 47.429507] ---[ end trace d3b3f133587ed2f1 ]---
(In reply to samuelchueh from comment #0)
> Reproducing steps:
> linux kernel boot exception message happen when environment temperature is
> low. kernel dmesg as below:

How low is low? Are you within chip specs?

Please attach dmesg with drm.debug=14 module parameter set, all the way from boot, for both working and non-working cases.
Environment temperature is lower (20 degree). The temperature is within the range of the chipset spec. How to enable dmesg with drm.debug=14 module parameter set?
(In reply to samuelchueh from comment #2)
> How to enable dmesg with drm.debug=14 module parameter set?

Depends on your setup. For example in grub you can hit 'e' before booting the kernel to edit the module parameters. Add drm.debug=14 to the boot line. If you have drm as loadable module it's possible to add that in modprobe config.
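For reference, drm.debug is a bitmask, so 14 (0x0e) turns on several message categories at once. A minimal Python sketch that decodes the value, assuming the category bits used by the DRM core in kernels of this era (core 0x01, driver 0x02, KMS 0x04, PRIME 0x08); the function name is made up for illustration:

```python
# Debug category bits as used by the DRM core in 3.x kernels.
# Assumption: later kernels define additional bits not listed here.
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug value."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


print(decode_drm_debug(14))  # ['driver', 'kms', 'prime']
```

On a running system the value currently in effect can usually be read back from /sys/module/drm/parameters/debug.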
Created attachment 112595 [details] Non working kernel message(drm.debug=14) Non working kernel message(drm.debug=14)
Created attachment 112596 [details] working kernel message(drm.debug=14) working kernel message(drm.debug=14)
This hsw_enable_pc8_work no longer exists in current kernels; PC8 handling is now part of runtime PM suspend. This patch and the rest of its series fixed this issue: [PATCH 02/16] drm/i915: make PC8 be part of runtime PM suspend/resume.
Where is the fixed patch? I can't find it in attachments.
(In reply to samuelchueh from comment #7)
> Where is the fixed patch? I can't find it in attachments.

The code has changed considerably in v3.15. Please try a recent kernel, preferably one of the v3.19-rc or drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel.
And attach logs similar to comment #4 and comment #5 for the new kernel.
We tested the latest Linux kernel, 3.18.2. The test fails on this version as well.
We tested Linux kernel v3.19-rc6. It did not fix this issue.
Please attach the logs for the new kernel as requested in comment #9. Thanks.
Created attachment 112926 [details] 3.19-rc6 Non working kernel message(drm.debug=14)
[ 6.943962] [drm:intel_hdmi_detect] [CONNECTOR:22:HDMI-A-1]
[ 6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)

GMBUS returns NAK when the environment temperature is low. Can intel_hdmi_detect be retried when GMBUS returns NAK?
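To compare the working and non-working logs, it can help to pull out every GMBUS NAK line automatically. A rough Python sketch, assuming the message format shown in the two log lines above; the function name and regular expression are illustrative and not part of any i915 tooling:

```python
import re

# Matches debug lines of the form:
# [ 6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)
NAK_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+\[drm:gmbus_xfer\]\s+GMBUS\s+"
    r"\[(?P<bus>[^\]]+)\]\s+NAK for addr:\s+(?P<addr>[0-9a-fA-F]+)"
)


def find_gmbus_naks(log_text):
    """Return (timestamp, bus name, I2C address) for each GMBUS NAK in a dmesg dump."""
    naks = []
    for line in log_text.splitlines():
        match = NAK_RE.search(line)
        if match:
            naks.append(
                (
                    float(match.group("time")),
                    match.group("bus"),
                    int(match.group("addr"), 16),  # the address is printed in hex
                )
            )
    return naks


sample = (
    "[ 6.943962] [drm:intel_hdmi_detect] [CONNECTOR:22:HDMI-A-1]\n"
    "[ 6.944112] [drm:gmbus_xfer] GMBUS [i915 gmbus dpc] NAK for addr: 0050 r(1)\n"
)
for timestamp, bus, addr in find_gmbus_naks(sample):
    print(f"{timestamp:.6f}s bus={bus!r} addr=0x{addr:02x}")
```

Address 0x50 is the standard DDC address used for reading EDID, so a NAK there means the EDID read from the monitor did not get an acknowledgement.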
I'm suspecting a hardware problem. But let's go back to basics and check some of the assumptions. Do you get a picture on screen from BIOS when you boot up, before the kernel and i915 are loaded, in the non-working conditions? Is the display in the same ambient temperature as the machine? Can you try the machine with another display, or another machine with the same display, in the non-working conditions? Can you try having the machine in the non-working temperature and the display in the working temperature, and vice versa? Does the problem go away if you let the machine and display warm up in the low ambient temperature? (You can e.g. unplug and replug the monitor, or just reboot, to retry.)
My answers to your questions are below (marked with ==>):

Do you get a picture on screen from BIOS when you boot up, before the kernel and i915 are loaded, in the non-working conditions?
==> Yes, I can see the picture on screen from the BIOS at boot in the non-working conditions. There is no signal output on the monitor after the i915 driver is loaded.

Is the display in the same ambient temperature as the machine?
==> In the working conditions, it can output the 1080p resolution.

Can you try the machine with another display, or another machine with the same display, in the non-working conditions? Can you try having the machine in the non-working temperature and the display in the working temperature, and vice versa?
==> We tried another monitor, and the test also fails in the non-working conditions. All tested monitors work normally on other machines.

Does the problem go away if you let the machine and display warm up in the low ambient temperature? (You can e.g. unplug and replug the monitor, or just reboot, to retry.)
==> Yes, the problem goes away when the machine warms up.
==> It can display at 1024x768 resolution after unplugging and replugging the HDMI cable.
==> Can the i915 driver fix the issue?
==> If we want to work around this issue, do you know how to make the i915/drm driver reset or re-detect the EDID from the Linux command line?
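On the last question, one possible workaround is to ask the DRM core to re-probe the connector through sysfs. A hedged Python sketch, assuming the running kernel exposes a writable status attribute for the connector and that the connector appears as card0-HDMI-A-1 (the card index is a guess; HDMI-A-1 is taken from the log above). The helper name is hypothetical, and writing to the real sysfs file normally requires root:

```python
from pathlib import Path


def force_redetect(connector="card0-HDMI-A-1", sysfs_root="/sys/class/drm"):
    """Write "detect" to the connector's status attribute and return what it reads back.

    Assumption: the attribute is writable on this kernel. On real hardware the
    value read back is the connector state, such as "connected" or "disconnected".
    """
    status = Path(sysfs_root) / connector / "status"
    status.write_text("detect")
    return status.read_text().strip()


# Example call (as root, on the affected machine):
#   print(force_redetect())
```

This only triggers a new probe. If the EDID read is still NAKed at that moment, the connector will most likely be reported as disconnected again.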
(In reply to samuelchueh from comment #2)
> Environment temperature is lower (20 degree).

Celsius or Fahrenheit?
Celsius
Closing bug since no new updates on the case. Thanks.