Bug 101745

Summary: [SKL] SYSTEM HANG on boot-up
Product: DRI Reporter: Giles Anderson <agander>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical    
Priority: high CC: bugs+freedesktop, didierg-divers, freedesktop, intel-gfx-bugs, mail, nobodyless, reupke, ricardo.vega, xblitz
Version: XOrg git   
Hardware: Other   
OS: Linux (All)   
Whiteboard:
i915 platform: SKL i915 features:
Attachments:
Description (Flags)
Output from lspci -nn (none)
journalctl kernel 4.11.12 (none)
i915_vbt from Asus UX305UA (none)
drm/i915/vbt: ignore extraneous child devices for a port (none)
dmesg with patch by jani nikula 2017-08-10 19:40:35 (none)

Description Giles Anderson 2017-07-10 18:44:27 UTC
Created attachment 132597 [details]
Output from lspci -nn

Reproducing steps:
On power-up the grub menu is displayed. On selection of a kernel version or the automatic selection of the latest, the screen immediately goes blank.

How often:
Always


System environment:

-- chipset:

-- system architecture: x86_64

-- xf86-video-intel:
(xorg-x11-drv-intel-2.99.917-26.20160929.fc25.x86_64)
Version: 2.99.917 
Release: 26.20160929.fc25

-- xserver:
Version: 1.19.3 
Release: 1.fc25

-- mesa:
Version: 17.0.5 
Release: 3.fc25

-- libdrm:
Version: 2.4.81 
Release: 1.fc25

-- kernel: 
4.11.3-202.fc25.x86_64

-- Linux distribution:
Fedora 25

-- Machine or mobo model:
ASUSTeK COMPUTER INC. UX305UA/UX305UA, BIOS UX305UA.201 10/12/2015

-- Display connector:

Additional info:


I have files with other information which are all over the upload limit:
rom
snapshot
jrnlctl_b-1
gdm-x-session

/bin/su -c "dd if=/dev/mem of=/tmp/rom bs=64k skip=12 count=1"
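
For reference, the dd invocation above skips twelve 64 KiB blocks and then reads one, so it captures the 64 KiB window that starts at physical address 0xC0000, the conventional location of the legacy video BIOS. A minimal sketch of that arithmetic follows; the helper name dd_window is made up for illustration and is not part of any tool.

```python
def dd_window(bs: int, skip: int, count: int) -> tuple[int, int]:
    """Return the (start, end) byte offsets read by `dd bs=... skip=... count=...`.

    `end` is exclusive. Hypothetical helper, for illustration only.
    """
    start = bs * skip
    end = start + bs * count
    return start, end


# bs=64k skip=12 count=1, as in the command above
start, end = dd_window(bs=64 * 1024, skip=12, count=1)
print(hex(start), hex(end))  # 0xc0000 0xd0000
```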

/bin/su -c "mount -t debugfs debugfs /sys/kernel/debug ; cp /sys/kernel/debug/dri*/i915_opregion /tmp/opregion"

cp: cannot stat '/sys/kernel/debug/dri*/i915_opregion': No such file or directory

gnome-abrt messages:
A kernel problem occurred, but your kernel has been tainted (flags:GW).
WARNING: CPU: 2 PID: 114 at drivers/gpu/drm/i915/intel_pm.c:3749 skl_compute_wm+0xcde/0x1540 [i915]
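
The WARNING line above packs several fields together: CPU, PID, source file and line, then the function name with offset/size, and the module in brackets. As a rough sketch (the regular expression is my own and only assumes the layout seen in this one line), it can be split apart like this:

```python
import re

# Pattern written against the single WARNING line quoted in this report;
# other kernel versions may format the line differently.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_warning(text):
    """Return the fields of a kernel WARNING line as a dict, or None."""
    match = WARNING_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("cpu", "pid", "line"):
        fields[key] = int(fields[key])
    for key in ("offset", "size"):
        fields[key] = int(fields[key], 16)
    return fields


sample = (
    "WARNING: CPU: 2 PID: 114 at drivers/gpu/drm/i915/intel_pm.c:3749 "
    "skl_compute_wm+0xcde/0x1540 [i915]"
)
info = parse_warning(sample)
print(info["file"], info["line"], info["func"], info["module"])
```

Here the warning points at line 3749 of intel_pm.c, inside skl_compute_wm in the i915 module.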
Comment 1 _nobody_ 2017-07-11 04:14:18 UTC
http://www.forums.fedoraforum.org/showthread.php?t=314575

_nobody_
Comment 2 Elizabeth 2017-07-12 21:13:08 UTC
I tried to reproduce with the following configuration without success:

-Software:
kernel version            : 4.11.3
architecture              : x86_64
kernel driver             : i915
bios revision             : 5.6
bios release date         : 09/29/2016
mesa                      : 17.0.5

-Hardware:
platform                  : SKL Canyon, NUC6i7KYB
cpu information           : Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz
gpu card                  : Intel Corporation Iris Pro Graphics 580 (rev 09)
displays connected        : DP-3

-Firmware:
dmc fw loaded             : yes
dmc version               : 1.26
guc fw loaded             : NONE
guc version wanted        : 6.1
guc version found         : 0.0

I also tried the latest 4.11.10 and could not reproduce the problem there either.
Could you please try 4.11.10 or 4.12 and, if the problem persists, send a full dmesg captured with the parameter drm.debug=0xe added to the grub boot options?
The journalctl output could also be helpful. Thanks.
Comment 3 Giles Anderson 2017-07-13 09:05:04 UTC
> Could you please try 4.11.10 or 4.12 and, if the problem persists, send a
> full dmesg captured with the parameter drm.debug=0xe added to the grub boot options?
> The journalctl output could also be helpful. Thanks.

Will do when they appear: today I noticed that only 4.11.9-200.fc25 is available.
Comment 4 Didier G 2017-07-24 23:17:42 UTC
Created attachment 132939 [details]
journalctl kernel 4.11.12
Comment 5 Didier G 2017-07-24 23:18:49 UTC
According to https://bugzilla.redhat.com/show_bug.cgi?id=1463085 and https://bugzilla.kernel.org/show_bug.cgi?id=196233, this problem seems to be reported only on the Asus laptops UX305UA, UX305U and UX306U.

In my case I have a UX305UA, and I did not have any problem with kernel 4.11.4 and earlier. The problem was introduced in kernel 4.11.5 and is present in every later version I tried. I am certain of that.

I have attached the journalctl output of a failing boot with kernel 4.11.12, with the parameter drm.debug=0xe added to the grub boot options.

The message "[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 31: Can't calculate constants, dotclock = 0!" has been present since 4.11.5.
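
To check whether a given boot log shows the same symptom, one option is to scan it for that message. This is a small sketch, not an official diagnostic; the function name is hypothetical, and the second sample line is a made-up placeholder rather than text from the attached log.

```python
import re

# Matches the error text quoted in this comment; the CRTC id is captured.
DOTCLOCK_RE = re.compile(
    r"\*ERROR\* crtc (?P<crtc>\d+): Can't calculate constants, dotclock = 0!"
)


def crtcs_with_zero_dotclock(lines):
    """Return the sorted CRTC ids that logged the zero-dotclock error."""
    found = set()
    for line in lines:
        match = DOTCLOCK_RE.search(line)
        if match:
            found.add(int(match.group("crtc")))
    return sorted(found)


log = [
    "[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 31: "
    "Can't calculate constants, dotclock = 0!",
    "placeholder line that does not match",  # invented for the example
]
print(crtcs_with_zero_dotclock(log))  # [31]
```

In practice the lines would come from a saved journalctl or dmesg dump, for example by iterating over an open file.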
Comment 6 Jani Nikula 2017-07-31 13:40:28 UTC
Assuming this is the same as https://bugzilla.kernel.org/show_bug.cgi?id=196233.

Judging by the git logs, the only thing that looks potentially suspicious between v4.11.4 and v4.11.5 in drm/i915 is

commit 6b183d1b84764e81dcb64f6b41151e79db23679c
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Fri Mar 10 15:27:58 2017 +0200

    drm/i915/vbt: split out defaults that are set when there is no VBT
    
    commit bb1d132935c2f87cd261eb559759fe49d5e5dc43 upstream.

Please try reverting that.
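
For anyone wanting to repeat this check, the sketch below only builds the git command lines for listing the drm/i915 changes between the two tags and for reverting the suspect commit. It assumes a checkout of the stable kernel tree that contains the v4.11.4 and v4.11.5 tags; the helper names are invented for illustration.

```python
import shlex

# Stable-tree commit named in this comment (backport of bb1d132935c2).
SUSPECT_COMMIT = "6b183d1b84764e81dcb64f6b41151e79db23679c"


def log_cmd(old_tag, new_tag, path):
    """Command listing commits between two tags that touch `path`."""
    return ["git", "log", "--oneline", f"{old_tag}..{new_tag}", "--", path]


def revert_cmd(commit):
    """Command reverting `commit` without opening an editor."""
    return ["git", "revert", "--no-edit", commit]


commands = [
    log_cmd("v4.11.4", "v4.11.5", "drivers/gpu/drm/i915"),
    revert_cmd(SUSPECT_COMMIT),
]
for cmd in commands:
    # Print rather than execute, so nothing is changed by running this.
    print(shlex.join(cmd))
```

The printed commands can be pasted into a shell inside the kernel checkout; the kernel then has to be rebuilt and booted to see whether the hang goes away.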
Comment 7 Jani Nikula 2017-08-01 09:52:16 UTC
Another thing to check, are you booting in legacy BIOS or UEFI mode?
Comment 8 Jani Nikula 2017-08-02 09:24:31 UTC
(In reply to Jani Nikula from comment #6)
> Assuming this is the same as
> https://bugzilla.kernel.org/show_bug.cgi?id=196233.
> 
> Judging by the git logs, the only thing that looks potentially suspicious
> between v4.11.4 and v4.11.5 in drm/i915 is
> 
> commit 6b183d1b84764e81dcb64f6b41151e79db23679c
> Author: Jani Nikula <jani.nikula@intel.com>
> Date:   Fri Mar 10 15:27:58 2017 +0200
> 
>     drm/i915/vbt: split out defaults that are set when there is no VBT
>     
>     commit bb1d132935c2f87cd261eb559759fe49d5e5dc43 upstream.
> 
> Please try reverting that.

Quoting https://bugzilla.kernel.org/show_bug.cgi?id=196233#c15

"i just built the 4.13.0-rc3 kernel WITHOUT bb1d132935c2f87cd261eb559759fe49d5e5dc43 and that's working for me.

To verify i built the same kernel WITH bb1d132935c2f87cd261eb559759fe49d5e5dc43 and got the same error as before."

Confirms the culprit.
Comment 9 Jani Nikula 2017-08-03 08:05:07 UTC
Please attach /sys/kernel/debug/dri/0/i915_vbt
Comment 10 Didier G 2017-08-03 11:00:25 UTC
Created attachment 133226 [details]
i915_vbt from Asus UX305UA
Comment 11 Jani Nikula 2017-08-10 19:40:35 UTC
Created attachment 133428 [details] [review]
drm/i915/vbt: ignore extraneous child devices for a port

Please try the attached patch. Obviously, don't revert bb1d132935c2 ("drm/i915/vbt: split out defaults that are set when there is no VBT") for this.
Comment 12 Jani Nikula 2017-08-10 19:41:11 UTC
And please attach dmesg with drm.debug=14 regardless of the outcome.
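
Note that drm.debug=14 is the same value as the drm.debug=0xe requested earlier in this thread. The sketch below decodes the bitmask; the bit-to-category table reflects my understanding of the drm module's debug parameter description for kernels of this era, so treat it as an assumption and check it against the kernel source in use.

```python
# Debug categories of the drm.debug bitmask (assumed from the drm module
# parameter description; verify against the kernel version in question).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(int("0xe", 16))        # 14
print(decode_drm_debug(14))  # ['driver', 'kms', 'prime']
```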
Comment 13 Oliver Weißbarth 2017-08-10 21:55:30 UTC
Created attachment 133429 [details]
dmesg with patch by jani nikula 2017-08-10 19:40:35

I tested the latest kernel with your patch. It worked. The dmesg is attached. Thank you very much.
Comment 14 Jani Nikula 2017-08-11 11:35:47 UTC
(In reply to Oliver Weißbarth from comment #13)
> I tested the latest kernel with your patch. It worked. The dmesg is
> attached. Thank you very much.

Thanks for testing. A patch with a proper commit message has been submitted: http://patchwork.freedesktop.org/patch/msgid/20170811113907.6716-1-jani.nikula@intel.com
Comment 15 Eric 2017-08-15 19:18:39 UTC
I am also affected by this bug on an Asus UX306UA. I can't use any kernel > 4.10.
Comment 16 Jani Nikula 2017-08-17 07:10:40 UTC
commit 7c648bde211baeda7a029bd6be4957e8be48d8c9
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Fri Aug 11 14:39:07 2017 +0300

    drm/i915/vbt: ignore extraneous child devices for a port

in drm-intel-fixes, expected to land upstream at about v4.13-rc7.
