Created attachment 102100
screenshot-showing-corruptions.png

Sometimes after resuming from hibernation, client windows are corrupted as shown in the screenshot (contents of a Firefox tab, compiz-0.8 as window manager). Restarting compiz doesn't help, but restarting the clients (in this case Firefox) does get rid of the corruptions.

00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 220c
	Flags: bus master, fast devsel, latency 0, IRQ 56
	Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at 3000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [a4] PCI Advanced Features
	Kernel driver in use: i915

linux-3.15.2, xorg-server-1.15.99.903, mesa git (also occurs with 10.2.2), xf86-video-intel git

This might be related to bug #80727, but it does not always lead to a GPU hang, so I've opened a separate bug for this.
Possible duplicate of #65496?
I'm not quite sure what the real problem is/was in bug #65496. At the end of that report, "[PATCH] drm/i915: Undo gtt scratch pte unmapping again" is mentioned. Shall I revert that patch, is that your suggestion? Lenovo T440s BIOS version is GJET60WW (2.10). 3.13.11 does not produce these corruptions, it only hangs sometimes when starting hibernation at freeing memory, could be some freezing problem, but I believe it is still driver related. The real issues start with 3.14 (as explained in bug #80727), and the corruptions appear only in 3.15.2 AFAICT so far. I believe I've not seen these corruptions with S3, but that may be another thing to thoroughly investigate...
Yes, you could try reverting "[PATCH] drm/i915: Undo gtt scratch pte unmapping again" to see if it fixes your problem. You could also try the suggestions from comment #36 there: boot your machine normally, hibernate, then when you want to resume it, use the modprobe.blacklist=i915 option and see if it solves the problem. You could also run "slabinfo -v" every time you resume to see if it catches some corruption in the kernel slabs. It may not solve your problem, but then at least we'll know your bug is a different one.
Ok, issue solved. First, reverting "[PATCH] drm/i915: Undo gtt scratch pte unmapping again" got rid of the corruptions in 3.15.2. Turning off fbsplash seemed to make hibernation more stable; freezes no longer occurred. However, when hibernating on AC and then resuming on battery, hard disk errors would start to appear, and I found https://bugzilla.kernel.org/show_bug.cgi?id=72191. The solution was to update the BIOS of the T440s from 2.10 to 2.27. Now hibernation and resuming work without errors, currently counting 37 successful attempts. Probably that measure resolves other stability issues too. I'll try again without reverting the patch to see whether it is still required.
Reverting the patch is no longer required. So the simple, yet a bit risky solution is to update the BIOS to the newest version.
Unfortunately, I was wrong. First the instabilities/freezes have returned, finally the corruptions have appeared again. I'm testing now again with the patch reverted.
(In reply to comment #6)
> Unfortunately, I was wrong. First the instabilities/freezes have returned,
> finally the corruptions have appeared again. I'm testing now again with the
> patch reverted.

Harald, what's the status with that?
I still revert the patch with linux-3.16.0. Using that setup everything including hibernating/resuming works fine so far. There have been changes between 3.15.8 and 3.16.0 which seem to fix the freezes on hibernation, as reverting the patch alone did not help with the freezes. Shall I retry without reverting the patch ("[PATCH] drm/i915: Undo gtt scratch pte unmapping again")?
(In reply to comment #8)
> Shall I retry without reverting the patch ("[PATCH] drm/i915: Undo gtt
> scratch pte unmapping again")?

Doesn't hurt to try. Any chance of trying a later kernel?
I was wrong, the freezes persisted. However, I've tried current vanilla kernel 3.18-rc1 (or whatever is the most up-to-date git version at the moment), and first the good news: hibernation/resume and suspend/resume finally seem to work fine.

The bad news: New bugs concerning daily usage. The worst one is that the eDP screen goes dark and cannot be brought back to life, only resolution is a hard reset. I assume this is due to DPMS off/on, and/or connecting/disconnecting the VGA monitor. The monitor on the VGA port however works fine, and the machine is not dead. No output in dmesg, however.

I can reproduce the dark screen issue by booting the machine, waiting until lightdm has started up, then restarting lightdm => LCD display is completely off. Also, the backlight display will stop working (does not respond to changes via keys or sysfs).

I hope that hibernation/resume is really solved now. I will try to reproduce the other issues more consistently. Is there any debug kernel param that would help, or another repository with fixes to pull?

Here are two error messages in dmesg, but maybe not related to the above issues:
[22343.421856] [drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[22343.421861] [drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
It is easy to reproduce the black LCD panel issue:

xset dpms force off && sleep 10 && xset dpms force on

The panel turns off (completely black) but does not turn on again. Machine is still responding, but no output in dmesg. I assume the error message mentioned previously in comment #10 does not have anything to do with this.
Another status update: The real problem seems to be that the backlight is not restored after dpms on, and any attempt to set it using various commands fails. So dpms off/on works, but it's a backlight problem.
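To make "the backlight is not restored" easier to verify, the sysfs state can be captured before and after the dpms cycle and compared. The following is only a sketch: the function name snapshot_backlight is made up for illustration, and the default path assumes the intel_backlight device that appears elsewhere in this report. The attribute names are the standard ones of the kernel backlight class.

```shell
#!/bin/sh
# Hypothetical helper (not from this report): print the backlight attributes
# found in one sysfs backlight directory, one "name=value" pair per line.
# The default directory is an assumption based on the device in this bug.
snapshot_backlight() {
    dir=${1:-/sys/class/backlight/intel_backlight}
    for attr in brightness actual_brightness max_brightness bl_power; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Keep the line so that two snapshots stay comparable with diff.
            printf '%s=unreadable\n' "$attr"
        fi
    done
}

# Possible use around the reproducer from this report:
#   snapshot_backlight > /tmp/bl.before
#   xset dpms force off && sleep 10 && xset dpms force on
#   snapshot_backlight > /tmp/bl.after
#   diff /tmp/bl.before /tmp/bl.after
```

If the two snapshots differ only in bl_power, that would point at whatever toggles that attribute rather than at the brightness value itself.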
git bisect start
# bad: [0429fbc0bdc297d64188483ba029a23773ae07b0] Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
git bisect bad 0429fbc0bdc297d64188483ba029a23773ae07b0
# good: [bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9] Linux 3.17
git bisect good bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9
# good: [35a9ad8af0bb0fa3525e6d0d20e32551d226f38e] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good 35a9ad8af0bb0fa3525e6d0d20e32551d226f38e
# good: [ca321885b0511a85e2d1cd40caafedbeb18f4af6] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect good ca321885b0511a85e2d1cd40caafedbeb18f4af6
# good: [da92da3638a04894afdca8b99e973ddd20268471] Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
git bisect good da92da3638a04894afdca8b99e973ddd20268471
# bad: [d898ce03675fc061f89a347a22d41271ed75c436] drm/tilcdc: panel: Add support for enable GPIO
git bisect bad d898ce03675fc061f89a347a22d41271ed75c436
# good: [91b06a8e1cfd400c65e16b1ee0747bc6aca35e9e] Merge branch 'drm-next-3.18' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect good 91b06a8e1cfd400c65e16b1ee0747bc6aca35e9e
# good: [4ac073640a528662a7c072a30e92e70ce00ded33] Merge branch 'linux-3.18' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next
git bisect good 4ac073640a528662a7c072a30e92e70ce00ded33
# bad: [e240d55d671c63056b118ec29acb26b273a94405] drm/i915: Don't call DVO mode_set hook on DPMS changes
git bisect bad e240d55d671c63056b118ec29acb26b273a94405
# bad: [ec49ba2d709f3a1a4cd822e547db2f07e121b375] drm/i915: fix panel unlock register mask
git bisect bad ec49ba2d709f3a1a4cd822e547db2f07e121b375
# bad: [98a2e5f94275b6aafb12a3650937f6c54222cdc2] drm/i915: Bring UP Power Wells before disabling RC6.
git bisect bad 98a2e5f94275b6aafb12a3650937f6c54222cdc2
# bad: [e6755fb78e8f20ecadf2a4080084121336624ad9] drm/i915: switch off backlight for backlight class 0 brightness
git bisect bad e6755fb78e8f20ecadf2a4080084121336624ad9
# good: [9dd3c605a395c27afeadbb95cf73cdb35e99e135] drm/i915: fix i915_frequency_info on BDW
git bisect good 9dd3c605a395c27afeadbb95cf73cdb35e99e135
# good: [ab656bb9012b9eabc21234caa47af478ea6ceec5] drm/i915: add some framework for backlight bl_power support
git bisect good ab656bb9012b9eabc21234caa47af478ea6ceec5
# bad: [73580fb764c4213d305c0d36bd8f856ae631eb42] drm/i915/dp: make backlight bl_power control power sequencer backlight
git bisect bad 73580fb764c4213d305c0d36bd8f856ae631eb42
# first bad commit: [73580fb764c4213d305c0d36bd8f856ae631eb42] drm/i915/dp: make backlight bl_power control power sequencer backlight
Seems whatever sets bl_power to 1 doesn't reset it to 0.
Reverting that commit helps, backlight gets restored properly.
Just a hunch, please try

diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index 18784470a760..2af1a83f813b 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -990,8 +990,6 @@ static int intel_backlight_device_update_status(struct backlight_device *bd)
 				bd->props.brightness != 0;
 			panel->backlight_power(connector, enable);
 		}
-	} else {
-		bd->props.power = FB_BLANK_POWERDOWN;
 	}
 
 	drm_modeset_unlock(&dev->mode_config.connection_mutex);
No, unfortunately that patch did not help.
Please provide dmesg from boot to suspend and resume, with drm.debug=14 module parameter set, and this patch applied.

diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index e18b3f49074c..9c77afb571b8 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -968,8 +968,10 @@ static int intel_backlight_device_update_status(struct backlight_device *bd)
 	struct drm_device *dev = connector->base.dev;
 
 	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-	DRM_DEBUG_KMS("updating intel_backlight, brightness=%d/%d\n",
-		      bd->props.brightness, bd->props.max_brightness);
+	DRM_DEBUG_KMS("updating intel_backlight, brightness=%d/%d, "
+		      "bl_power=%d, by %s\n",
+		      bd->props.brightness, bd->props.max_brightness,
+		      bd->props.power, current->comm);
 	intel_panel_set_backlight(connector, bd->props.brightness,
 				  bd->props.max_brightness);
 
There is a mixture of issues reported in this bug, but it seems the most pressing is the backlight after resume regression. I'm making this bug be about that alone, and changing the subject to reflect that. If there are other issues, please report new bugs about them.

Another report on the issue: http://lkml.kernel.org/r/5454E2F3.2060404@odi.ch
Created attachment 108913
debug output

This is the debug output of a complete cycle: power-on/suspend/resume/power-off that demos the backlight problem.

Excerpt:

after resume backlight comes on quickly:
[ 90.150434] [drm:intel_dp_complete_link_train] Channel EQ done. DP Training successful
[ 90.150584] [drm:intel_edp_backlight_on]
[ 90.150586] [drm:intel_panel_enable_backlight] pipe A
[ 90.150591] [drm:intel_panel_actually_set_backlight] set backlight PWM = 2789

this is pm-utils writing 0, then 2789 to /sys/class/backlight/intel_backlight/brightness:
[ 90.235459] [drm:intel_backlight_device_update_status] updating intel_backlight, brightness=2789/4648, bl_power=1, by pm-suspend
[ 90.235468] [drm:intel_panel_actually_set_backlight] set backlight PWM = 2789

then something turns it off again:
[ 90.235510] [drm:intel_edp_backlight_power] panel power control backlight disable
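For longer logs it may help to pull out only the lines that say which process touched the backlight. The sketch below assumes the log was produced with the debug patch requested earlier in this thread applied, because only that patch adds the "bl_power=..., by ..." part to the message. The function name backlight_writers is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter (not part of any tool mentioned in this report):
# read dmesg text on stdin and print, for every "updating intel_backlight"
# line in the patched format, the writing process, bl_power and brightness.
backlight_writers() {
    sed -n 's|.*updating intel_backlight, brightness=\([0-9]*\)/\([0-9]*\), bl_power=\([0-9]*\), by \(.*\)$|\4 bl_power=\3 brightness=\1/\2|p'
}

# Possible use:
#   dmesg | backlight_writers
```

Lines that do not match the patched message format are dropped silently, so an empty result can also mean that the debug patch or drm.debug=14 was not active.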
> There are a mixture of issues reported in this bug, but it seems the most
> pressing is the backlight after resume regression. I'm making this bug be
> about that alone, and changing subject to reflect that. If there are other
> issues, please report new bugs about them.

Jani Nikula, the issue was not about backlight resume. The original report was about hibernate/resume, but that is solved.

In fact, the backlight issue does not have anything to do with resume, but only with dpms off/on. To reproduce my problem, you only need to

xset dpms force off && sleep 10 && xset dpms force on
(In reply to Harald Judt from comment #21)
> > There are a mixture of issues reported in this bug, but it seems the most
> > pressing is the backlight after resume regression. I'm making this bug be
> > about that alone, and changing subject to reflect that. If there are other
> > issues, please report new bugs about them.
>
> Jani Nikula, the issue was not about backlight resume. The original report
> was about hibernate/resume, but that is solved.
>
> In fact, the backlight issue does not have anything to do with resume, but
> only with dpms off/on.

Argh. This is why we generally prefer new bug reports for new issues. I understand it's all clear to you, but when looking at loads and loads of bugs, it's easy to be confused when you quickly read through the bugs. Still, my bad.

> To reproduce my problem, you only need to
> xset dpms force off && sleep 10 && xset dpms force on

Please provide dmesg with drm.debug=14, running the patch, and reproduce this.

Thanks.
(In reply to Ortwin Glück from comment #20)
> this is pm-utils writing 0, then 2789 to
> /sys/class/backlight/intel_backlight/brightness:
> [ 90.235459] [drm:intel_backlight_device_update_status] updating
> intel_backlight, brightness=2789/4648, bl_power=1, by pm-suspend

Actually it seems like this is pm-suspend writing bl_power=1 which promptly switches the backlight off as requested. (Before you ask, bl_power=0 is on, other values off.)

However I can't find a version of pm-utils that touches bl_power by default. Which version are you running? Distro? Or is that your own addition?
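Since the meaning of bl_power is easy to get backwards, here is a small sketch that encodes the rule stated in the comment above (0 is on, any other value is off). The function name bl_power_state is made up for illustration, and the sysfs path in the usage comment assumes the intel_backlight device from this report.

```shell
#!/bin/sh
# Hypothetical helper: translate a bl_power value into "on" or "off",
# following the rule quoted in this thread (0 is on, other values are off).
# An empty or non-numeric argument is also reported as "off".
bl_power_state() {
    case "$1" in
        0) echo on ;;
        *) echo off ;;
    esac
}

# Possible use on the affected machine:
#   bl_power_state "$(cat /sys/class/backlight/intel_backlight/bl_power)"
```

With the value from the log excerpt (bl_power=1) this reports the backlight as off, which matches the observed behaviour.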
May be relevant: http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=7ecc778691c452285f754743a93a46fa1d3da52f
(In reply to Jani Nikula from comment #24)
> May be relevant:
>
> http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=7ecc778691c452285f754743a93a46fa1d3da52f

Thanks, that indeed fixes it. Test case "xset dpms force off && sleep 10 && xset dpms force on" now works as intended.
Harald, thanks for the report and trying out xf86-video-intel with the commit referenced in comment #24. Ortwin, please also try the same. If it does not work for you, instead of reopening, please do open a new bug. Please also include the information requested in comment #23. I think there's enough different issues on this one already. Thanks.
(In reply to Jani Nikula from comment #23)
> (Before you ask, bl_power=0 is on,
> other values off.)

...very intuitive choice of values indeed...

> However I can't find a version of pm-utils that touches bl_power by default.
> Which version are you running? Distro? Or is that your own addition?

Oh crap, yes. It's an ancient pm-utils script that seems now to do something really stupid. Sorry for the noise.
(In reply to Ortwin Glück from comment #27)
> > However I can't find a version of pm-utils that touches bl_power by default.
> > Which version are you running? Distro? Or is that your own addition?
>
> Oh crap, yes. It's an ancient pm-utils script that seems now to do something
> really stupid. Sorry for the noise.

The thing is, this script shouldn't matter at all. It is hooked into the suspend path, but the log appears in the resume path. Assuming pm-utils isn't completely broken, has something changed in the kernel with respect to locking or userspace notification of suspend/resume events?