Summary: | [KBL/SKL] Screen does not wake after screen blank | ||||||
---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | dopey | ||||
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||
Status: | CLOSED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||
Severity: | normal | ||||||
Priority: | medium | CC: | bugs+freedesktop, gordon.messmer, intel-gfx-bugs, mschiffer+misc | ||||
Version: | unspecified | ||||||
Hardware: | x86-64 (AMD64) | ||||||
OS: | All | ||||||
Whiteboard: | |||||||
i915 platform: | KBL, SKL | i915 features: | power/runtime PM | ||||
Description
dopey
2017-08-14 20:42:46 UTC
After updating my laptop's BIOS to the latest version and upgrading to kernel 4.12.4, I'm still experiencing the issue. The same message appears in the logs:

Aug 06 11:26:02 localhost.localdomain kernel: [drm:i915_gem_idle_work_handler [i915]] *ERROR* Timeout waiting for engines to idle

The screen is blank and I am unable to unblank it. Keys light up for a moment when pressed and then turn off, and I cannot ssh into the machine even though the login prompt asks for the password. This is definitely a bug in i915 rc6 support.

Just in case it matters, my VAIO Z Flip uses a Skylake CPU:

model name : Intel(R) Core(TM) i7-6567U CPU @ 3.30GHz

Kernel:

[root@localhost]# uname -a
Linux localhost.localdomain 4.12.4 #1 SMP Sat Aug 5 11:00:30 UYT 2017 x86_64 x86_64 x86_64 GNU/Linux

i915 module parameters used:

[root@localhost ]# systool -vm i915
Module = "i915"

  Attributes:
    coresize = "1277952"
    initsize = "0"
    initstate = "live"
    refcnt = "21"
    srcversion = "9F705B72B03F193BC3EF19B"
    taint = ""
    uevent = <store method only>

  Parameters:
    alpha_support = "N"
    disable_display = "N"
    disable_power_well = "1"
    edp_vswing = "0"
    enable_cmd_parser = "Y"
    enable_dc = "-1"
    enable_dp_mst = "Y"
    enable_dpcd_backlight= "N"
    enable_execlists = "1"
    enable_fbc = "0"
    enable_guc_loading = "0"
    enable_guc_submission= "0"
    enable_gvt = "N"
    enable_hangcheck = "Y"
    enable_ips = "1"
    enable_ppgtt = "3"
    enable_psr = "1"
    enable_rc6 = "1"
    error_capture = "Y"
    fastboot = "N"
    force_reset_modeset_test= "N"
    guc_firmware_path = "(null)"
    guc_log_level = "-1"
    huc_firmware_path = "(null)"
    inject_load_failure = "0"
    invert_brightness = "0"
    load_detect_test = "N"
    lvds_channel_mode = "0"
    lvds_use_ssc = "-1"
    mmio_debug = "0"
    modeset = "-1"
    nuclear_pageflip = "N"
    panel_ignore_lid = "1"
    prefault_disable = "N"
    reset = "Y"
    semaphores = "0"
    use_mmio_flip = "0"
    vbt_sdvo_panel_type = "-1"
    verbose_state_checks= "Y"

Again, setting i915.enable_rc6 to 0 is NOT an option, as it destroys battery life.
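For anyone comparing their own settings against the parameter dump above, a minimal Python sketch along these lines may help. It assumes the `systool -vm i915` layout shown in this report (a `Parameters:` section holding `name = "value"` lines). The helper names `parse_systool_params` and `read_sysfs_params` are hypothetical and not part of any tool mentioned in this thread. The second helper reads the same values from `/sys/module/i915/parameters`, where the kernel exposes module parameters.

```python
import os
import re

# Matches lines such as:  enable_rc6 = "1"   or   enable_dpcd_backlight= "N"
_PARAM_RE = re.compile(r'^\s*(\w+)\s*=\s*"([^"]*)"\s*$')


def parse_systool_params(text):
    """Return {name: value} for the 'Parameters:' section of systool output.

    Assumes the layout pasted in this report; other systool versions may differ.
    """
    params = {}
    in_params = False
    for line in text.splitlines():
        stripped = line.strip()
        # Section headers look like 'Attributes:' or 'Parameters:'.
        if stripped.endswith(":") and "=" not in stripped:
            in_params = stripped == "Parameters:"
            continue
        if in_params:
            match = _PARAM_RE.match(line)
            if match:
                params[match.group(1)] = match.group(2)
    return params


def read_sysfs_params(path="/sys/module/i915/parameters"):
    """Read module parameters directly from sysfs; returns {} if unavailable."""
    params = {}
    if not os.path.isdir(path):
        return params
    for name in sorted(os.listdir(path)):
        try:
            with open(os.path.join(path, name)) as handle:
                params[name] = handle.read().strip()
        except OSError:
            # Some parameters are not readable without root; skip them.
            continue
    return params
```

Comparing the two dictionaries from two machines (or two boots) is one way to spot which options actually differ before reporting results here.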
I hope this bug can be fixed, because i915 rc6 bugs have been around for a long time on Skylake and Kaby Lake CPUs. I'm currently testing some i915 module parameters and will report back if the problem appears again.

Hello everyone,
Could you please boot with drm.debug=0x1e log_buf_len=2M on the kernel command line (via GRUB) and provide the full dmesg?
If possible, could you also try to reproduce this with the drm-tip branch: https://cgit.freedesktop.org/drm-tip
Thank you.

I can't reproduce this issue on kernel 4.13-rc5 anymore (the last kernel I tried was some 4.12.x, which was affected).
Hardware: ThinkPad T470, i5-7200U, Intel(R) HD Graphics 620

UPDATE: It seems the issue has disappeared! I modified three i915 module options, and after three days of testing, including leaving the laptop on overnight, I haven't experienced the problem again. Battery consumption has been great, around 2.5 watts when idle. The module options modified were:

enable_guc_loading = "1"
enable_guc_submission = "1"
disable_power_well = "0"

For the GuC module options, make sure you have installed the latest firmware from https://01.org/linuxgraphics/downloads/firmware.
After booting, you will see these messages in your dmesg:

[ 2.303462] Setting dangerous option enable_guc_loading - tainting kernel
[ 2.303463] Setting dangerous option enable_guc_submission - tainting kernel
[ 2.340111] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])

These are the GRUB boot options used:

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.12.4 root=UUID=d9310d7b-9422-463c-89ec-e1431caba3c4 ro nosplash quiet noiswmd i915.enable_rc6=1 i915.enable_psr=1 i915.disable_power_well=0 i915.enable_guc_loading=1 i915.enable_guc_submission=1 i915.enable_fbc=1 pcie_aspm=force resume=/dev/nvme0n1p6

Again, this has worked on a VAIO Z Flip:

model name : Intel(R) Core(TM) i7-6567U CPU @ 3.30GHz

Kernel:

[root@localhost]# uname -a
Linux localhost.localdomain 4.12.4 #1 SMP Sat Aug 5 11:00:30 UYT 2017 x86_64 x86_64 x86_64 GNU/Linux

Please give it a try and let me know if it fixes your issues.

(In reply to Elizabeth from comment #2)
> Hello everyone,
> Could you please boot with drm.debug=0x1e log_buf_len=2M on grub and provide
> the full dmesg?
> If it's possible could you try to replicate with drm-tip branch:
> https://cgit.freedesktop.org/drm-tip
> Thank you.

I am currently running tests on a ThinkPad T470 with kernel 4.13-rc4, with the options you gave here. Sometimes it takes days for the symptom to show up. As soon as it happens I will provide the full dmesg (I am piping it into an ever-growing file!)

Tomislav

After finding the enable_guc_loading option stable, I disabled that option and added the debugging options requested by Elizabeth. I think that was on the 1st or 2nd of this month. Since then, I'm still unable to reproduce the original problem under Fedora kernels 4.12 or 4.11.

If the problem recurs, I'll provide additional information. In the meantime, I wonder if loading the GuC firmware introduced a persistent change. If the problem were solved by a firmware update, that would explain why I can no longer reproduce the problem.
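Since several comments in this thread toggle options between boots, it can be useful to confirm which i915 options a given boot actually received. The boot command line quoted in this thread passes them as `i915.<name>=<value>` tokens, and the sketch below extracts those from a command-line string such as the contents of `/proc/cmdline`. The function name `module_options_from_cmdline` is hypothetical, and the sketch ignores quoting rules for values that contain spaces.

```python
def module_options_from_cmdline(cmdline, module="i915"):
    """Extract '<module>.<name>=<value>' tokens from a kernel command line.

    Simplified sketch: splits on whitespace only and does not handle
    quoted values containing spaces.
    """
    prefix = module + "."
    options = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            options[name] = value
    return options


def current_module_options(module="i915", path="/proc/cmdline"):
    """Read the running kernel's command line and extract the module options."""
    with open(path) as handle:
        return module_options_from_cmdline(handle.read(), module)
```

Attaching this output next to a dmesg makes it easier to tell whether a report was taken with rc6, PSR, or GuC loading enabled.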
(In reply to Gordon Messmer from comment #6)
> After finding the enable_guc_loading option stable, I disabled that option
> and added the debugging options requested by Elizabeth. I think that was on
> the 1st or 2nd of this month. Since then, I'm still unable to reproduce the
> original problem under Fedora kernels 4.12 or 4.11.
>
> If the problem recurs, I'll provide additional information. In the mean
> time, I wonder if loading the GuC firmware introduced a persistent change.
> If the problem were solved by a firmware update, that would explain why I
> can no longer reproduce the problem.

Likewise, I am not able to reproduce the issue on mainline 4.13.0-rc4 and Fedora's 4.12.9-300, after five days of normal use on each and with enable_guc_loading=0.

On a side note, is it likely that GuC firmware loading introduces persistent changes?

UPDATE: I have found an issue with kernel 4.13 stable: the GPU becomes stuck powered on at 100% for no apparent reason. CPU usage and load are low; the only way to notice is the heat coming from the laptop and the Idle stats section in powertop. This is dangerous, as it can kill the battery or maybe even degrade the life of the GPU. I have reverted to my older kernel, 4.12.4. Can anyone confirm this?

I ran 4.11.11-300.fc26.x86_64 with "drm.debug=0x1e log_bug_len=2M" for a few weeks and was not able to reproduce the problem. Yesterday I removed those options, and today I got the blank-screen hang and the "*ERROR* Timeout waiting for engines to idle" error message. It seems the failure might not manifest while debugging is enabled.

Created attachment 134626 [details]
dmesg captured by abrt after oops
My system (running 4.11.11-300.fc26.x86_64 for the purpose of locating this bug) recorded the attached "oops" today. It's hard to say if it's related. With debugging enabled, the laptop never fails to return from low power mode, and naturally I'm not getting the same error text. I'm hoping this is useful information, though:
...
[264710.592668] Device suspended during HW access
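Because the hang can take days or weeks to appear, checking saved logs for the two messages quoted in this report can be automated. The following is a rough sketch that searches dmesg or journal text for those strings. The function name `find_i915_errors` and the `SIGNATURES` tuple are hypothetical, and the list covers only the messages seen in this thread, not every i915 failure.

```python
# Messages quoted in this bug report; extend as needed.
SIGNATURES = (
    "Timeout waiting for engines to idle",
    "Device suspended during HW access",
)


def find_i915_errors(log_text, signatures=SIGNATURES):
    """Return (signature, line) pairs for log lines containing a known message."""
    hits = []
    for line in log_text.splitlines():
        for signature in signatures:
            if signature in line:
                hits.append((signature, line.strip()))
                break  # report each log line once
    return hits
```

Feeding it the text of a continuously captured dmesg file gives a quick yes/no on whether either message occurred since the last check.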
While I first got the impression that the issue is not reproducible anymore with kernel 4.13.x, I am still experiencing it on occasion after all. I have no idea if all 4.13.x versions are affected and I was just lucky at first, or if the issue was reintroduced in later linux-stable releases (I am on 4.13.10 at the moment). I still see it only about once a week, so I don't think there's an effective way to bisect it to be sure...

I am also experiencing this on F27 with 4.13.16-302.fc27.x86_64. My system is a Lenovo T470 with Intel integrated graphics.

First of all, sorry about the spam. This is a mass update for our bugs. Sorry if you find this annoying, but with it we are trying to understand whether this bug is still valid. If the bug investigation is still in progress, please ignore this, and I apologize! If you think this bug is no longer valid, please comment on the bug that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), please do that as well to see whether the issue is still present there, and if you cannot see the issue there, please comment on the bug.

Closing; please re-open if the issue still exists.