Summary: | [Lenovo ThinkPad T450s] System hang with MST dock | ||
---|---|---|---|
Product: | DRI | Reporter: | Alexander Kops <alexkops> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | high | CC: | anshuman.gupta, intel-gfx-bugs, johan.freedesktop, tomi, ville.syrjala |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
URL: | https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1727662 | ||
Whiteboard: | Triaged, ReadyForDev | ||
i915 platform: | BXT | i915 features: | display/DP MST |
Attachments: |
Description
Alexander Kops
2017-11-09 11:54:00 UTC
Hello Alexander,
Could you share a dmesg and/or kern.log with debug information from boot until the problem occurs? Please add drm.debug=0x1e log_buf_len=2M to the kernel command line in GRUB.

I'm currently running my computer with the current mainline kernel (4.14-rc8) and these settings, and will post the dmesg as soon as I'm able to reproduce it.

Created attachment 135371 [details]
kern.log - Crash seemed to happen at 13:31:38
I added a compressed kern.log (the computer shut down at Nov 10 13:31:38).
I reproduced the bug running the current mainline kernel 4.14.0-041400rc8-generic.
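For readers following along: the debug parameters requested above are usually added through the GRUB defaults file. The sketch below is only an illustration, assuming an Ubuntu-style /etc/default/grub that contains a GRUB_CMDLINE_LINUX_DEFAULT line. The helper name add_kernel_params is made up for this example, and it is demonstrated on a scratch file rather than on the live configuration.

```shell
#!/bin/sh
# Hypothetical helper: append kernel parameters to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file.
# Usage: add_kernel_params FILE "param1 param2"
add_kernel_params() {
    file=$1
    params=$2
    # Insert the parameters just before the closing quote of the existing value.
    # Uses GNU sed in-place editing; params must not contain "|" or "&".
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${params}\"|" "$file"
}

# Demonstrate on a scratch file instead of the real /etc/default/grub.
demo=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$demo"
add_kernel_params "$demo" "drm.debug=0x1e log_buf_len=4M"
cat "$demo"
```

On a real Ubuntu system the same edit would presumably be made as root on /etc/default/grub, followed by `update-grub` and a reboot; `cat /proc/cmdline` then shows whether the parameters took effect.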
Created attachment 135436 [details]
kern.log with running drm-tip kernel from today - Computer shut down at 15:20:39
Today I was able to reproduce it with the drm-tip kernel found here: http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/
I attach the kern.log; the computer shut itself down at 15:20:39.
This time there are no messages about a FIFO underrun in the log.

(In reply to Alexander Kops from comment #4)
> Created attachment 135436 [details]
> kern.log with running drm-tip kernel from today - Computer shut down at
> 15:20:39
>
> Today I was able to reproduce it with the drm-tip kernel found here:
> http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/
>
> I attach the kern.log, the computer shut itself down at 15:20:39
> This time no messages about a fifo underrun are in the log.
Hello Alexander, it seems that the log still contains all the information that you already shared in the first attachment. Could you reproduce with a clean kern.log? Since the machine shuts down, I guess a dmesg can't be obtained.
To clean kern.log:
# rm /var/log/kern.log
# reboot
The kern.log will regenerate after boot.
Also, I noticed you marked this bug as a regression. Could you please share the last known good kernel commit and the first known bad commit?

Created attachment 135438 [details]
kern.log with running drm-tip kernel from today - Computer shut down at 15:20:39
Oops, looks like I re-uploaded the kern.log from last time. This one is the correct one from today.
Also the regression tag was added by "Christopher M. Penalver" from the Ubuntu bug tracker. So I can't point to specific kernel commits.
I just noticed that it started appearing with the kernel shipped in Ubuntu 17.10, and it wasn't happening with the kernel from 17.04.
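As an alternative to deleting kern.log, a log that spans several boots can be cut down to the most recent boot before attaching it. This is a minimal sketch under one assumption: every boot begins with a kernel banner line containing "Linux version". The function name last_boot and the sample log lines are invented for illustration.

```shell
#!/bin/sh
# Print only the most recent boot from a kernel log.
# Assumption: every boot starts with a line containing "Linux version".
last_boot() {
    awk '
        /Linux version/ { buf = "" }   # a new boot begins: drop older lines
        { buf = buf $0 "\n" }          # collect the lines of the current boot
        END { printf "%s", buf }       # emit whatever the last boot produced
    ' "$1"
}

# Demonstrate on a small synthetic log that contains two boots.
log=$(mktemp)
cat > "$log" <<'EOF'
Nov 10 09:00:01 host kernel: [    0.000000] Linux version 4.13.0-16-generic
Nov 10 09:00:02 host kernel: [    1.000000] first boot message
Nov 10 13:40:01 host kernel: [    0.000000] Linux version 4.14.0-041400rc8-generic
Nov 10 13:40:02 host kernel: [    1.000000] second boot message
EOF
last_boot "$log"
```

Redirecting the output to a new file (for example `last_boot /var/log/kern.log > kern-last-boot.log`) gives a much smaller attachment without touching the original log.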
(In reply to Alexander Kops from comment #7)
> ...
> Also the regression tag was added by "Christopher M. Penalver" from the
> Ubuntu bug tracker. So I can't point to specific kernel commits.
> I just noticed that it started appearing after using the Kernel shipping
> with Ubuntu 17.10 and it wasn't happening with the kernel from 17.04
That would be 4.9 and 4.13, I guess...

(In reply to Alexander Kops from comment #7)
> Created attachment 135438 [details]
> kern.log with running drm-tip kernel from today - Computer shut down at
> 15:20:39
>
> Oops, looks like I re-uploaded the kern.log from last time. This one is the
> correct one from today.
The logs contain multiple boots with multiple different kernels, so it's hard to say what's what. But this log doesn't seem to have any FIFO underruns. So am I to assume this is now fixed?

> But this log doesn't seem to have any FIFO underruns. So am I to assume this is now fixed?
Well, it is fixed in the sense that these FIFO underruns no longer appear with the drm-tip kernel. But the behaviour where the computer often just turns itself off after the lock screen is enabled is still there.
Created attachment 135451 [details]
kern.log with running drm-tip kernel from today - Computer froze at 16:17:45
I'll attach this current kern.log. This time the situation was a bit different: the notebook was not turned off and the power light was still on, but all three screens were black and it didn't react to anything. So I had to hard reboot it.
Maybe you can see something in the logs that would lead to a follow-up bug report?
The last thing I see in the log before the crash is a bunch of
[drm:drm_mode_addfb2 [drm]] [FB:87]
lines.
(In reply to Alexander Kops from comment #11)
> Created attachment 135451 [details]
> kern.log with running drm-tip kernel from today - Computer froze at 16:17:45
>
> I'll attach this current kern.log. This time the situation was a bit
> different, I didn't find the notebook turned off, but the power light was
> still on, but all three screens were black and it didn't react to anything.
> So I had to hard reboot it.
>
> Maybe you can see something in the logs that would lead to a follow up bug
> report?
>
> The last thing I see in the log before the crash are a bunch of
>
> [drm:drm_mode_addfb2 [drm]] [FB:87]
>
> lines.
Nothing interesting there, unfortunately. So I guess we're dealing with some kind of hard system hang, and it doesn't manage to write anything useful to the logs. So it's not even clear whether this has anything to do with i915 or is caused by something totally different.
Maybe try netconsole or a serial console if the machine has an Ethernet or serial port. Or you may want to look into pstore to see if that might catch something when the machine dies.
Maybe also enable various debug features in the kernel config:
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_PROVE_LOCKING=y
PS. Your logs are huuuuge. Might want to trim away the unrelated boots from the logs.

IIRC, when Ubuntu puts the system in slumber, it first calls a bunch of addfb's for the fade-to-black animation. That doesn't mean it's the cause of the issue, though it could very well be related to DPMS off.

I can confirm that this weird issue also happens without a docking station. It happens both with an external screen and without it.
system-manufacturer: LENOVO
system-version: ThinkPad T450s
bios-version: JBET66WW (1.30 )
bios-release-date: 09/13/2017

(In reply to Angelo Lisco from comment #14)
> I can confirm that this weird issue also happens without a docking station.
> It happens both with an external screen and without it.
Alexander Kops, as the original reporter, can you confirm the same without a docking station or external screen?

I tried to reproduce it without the docking station once, but wasn't able to. But I also don't usually use the notebook without the docking station for longer periods, so no thorough testing happened.

First of all, sorry about the spam.
This is a mass update for our bugs. Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!
If you think this bug is no longer valid, please comment on the bug so that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), can you also do that to see whether the issue is still present there? If you cannot see the issue there, please comment on the bug.

Closing, please re-open if it still occurs.

This error still occurs on kernel 4.20.
My model is T470s and the behavior is consistent: it only happens when connected to an external display through the dock, not when using it disconnected from the dock. It can (most of the time) be triggered by changing display settings through xrandr.
Dock model is SD20F82750.
Dmesg error message:
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
GPU: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)

(In reply to Johan Thorén from comment #19)
> This error still occurs on kernel 4.20.
>
> My model is T470s and the behavior is consistent: It only happens when
> connected to an external display through the dock, not when using it
> disconnected from the dock. Can (most of the time) be triggered by changing
> display settings through xrandr.
>
> Dock model is SD20F82750.
>
> Dmesg error message:
> [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO
> underrun
>
> GPU: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)
The original bug was reported on Broadwell.
So, your issue could be different from the original issue reported in this bug.
Can you please attach the full dmesg log from boot with kernel parameters drm.debug=0x1e log_buf_len=4M?
What is the impact of this issue other than the error in the log? Can you elaborate on the issue?

Created attachment 143607 [details]
dmesg
Here is my dmesg output with the requested parameters. I'm now running the 5.0.0 kernel with the same behavior. The trigger is sometimes an xrandr change, but almost always coming back from suspend. A reboot is necessary.
Created attachment 143608 [details]
Video showing the screen
Would appreciate feedback on the data given, if more is needed or if a separate bug report should be filed. Thanks.

Johan - can you re-run on drm-tip, and see if the issue persists? If it does, please provide the dmesg log output.

Created attachment 144595 [details]
New dmesg 2019-06-19 running drm-tip
James, the issue persists running drm-tip as of today.
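When comparing kernels, it can help to count how often the underrun error quoted earlier in this thread appears in a saved log. The following is a rough sketch, assuming the message text matches the line reported above ("CPU pipe B FIFO underrun"); the function name count_underruns and the sample file are illustrative only.

```shell
#!/bin/sh
# Count "CPU pipe X FIFO underrun" messages per pipe in a saved dmesg or kern.log.
count_underruns() {
    # -o prints only the matching part, so identical messages group together.
    grep -o 'CPU pipe [A-Z] FIFO underrun' "$1" | sort | uniq -c
}

# Demonstrate on a synthetic log built from the error line quoted in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[drm:drm_mode_addfb2 [drm]] [FB:87]
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
EOF
count_underruns "$sample"
```

On a live system the input could come from `dmesg > dmesg.txt`. An empty result only means no underrun was logged, not that the hang is gone, as the earlier comments in this thread show.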
Hi Johan,
I need a few inputs.
1. Is this issue seen when you connect an external display directly to the laptop without the dock?
2. Have you seen the display tear issue on the embedded panel of the laptop?
3. As I see from the dmesg logs, your external display resolution is 1920x1080. Have you observed the issue with other monitors or with other resolution modes?
4. Could you please let me know if you see this issue after running the command below?
echo "2 0 0 0 0 0 0 0" > /sys/kernel/debug/dri/0/i915_pri_wm_latency
Thanks,
Anshuman

Hi Anshuman,
1. Is this issue seen when you connect an external display directly to the laptop without the dock?
It happens more frequently (it seems) when connected through a dock, but it also happens when using a cable directly to the laptop.
2. Have you seen the display tear issue on the embedded panel of the laptop?
It never happens on the embedded panel, only on external displays.
3. As I see from the dmesg logs, your external display resolution is 1920x1080. Have you observed the issue with other monitors or with other resolution modes?
Actually, the resolution of the screen is 1920x1200. I have, however, observed the same behavior on a screen that has 1920x1080 resolution, as well as on an older VGA screen with a lower resolution that I don't know offhand.
4. Could you please let me know if you see this issue after running the command below?
echo "2 0 0 0 0 0 0 0" > /sys/kernel/debug/dri/0/i915_pri_wm_latency
After running this command I did not notice the problem. Was this command supposed to fix the problem or provoke it? If it's meant to fix it, I will need to test more extensively, since I only had the opportunity to test for maybe 20 minutes.
Let me know if anything is needed. Thanks for taking the time!

I've done some additional testing, and after issuing 'echo "2 0 0 0 0 0 0 0" > /sys/kernel/debug/dri/0/i915_pri_wm_latency' I ran for over 2 hours without any freeze, and I tried to provoke the error by switching resolution and screen layout several times.
I've now verified that the error still occurs with or without that command issued, especially when coming back from suspend.

(In reply to Johan Thorén from comment #27)
> Hi Anshuman,
>
> 1. Is this issue is seen when u connect a external display directly to
> laptop without dock?
>
> It happens more frequently (it seems) when connected through a dock, but it
> also happens when using a cable directly to the laptop.
>
> 2. Have u screen the display tear issue on embedder panel of laptop.
>
> It never happens on the embedded panel, only on external displays.
>
> 3. As i see from dmesg logs your external display resolution is 1920x1080,
> have u observed the issue with other monitors or with other resolution modes.
>
> Actually, the resolution of the screen is 1920x1200. I have, however,
> observed the same behavior on a screen that has 1920x1080 resolution, as
> well as an older VGA screen with a lower resolution that I don't know as we
> speak.
>
> 4. could you please let me know if you see this issue after running below
> command
>
> echo "2 0 0 0 0 0 0 0" > /sys/kernel/debug/dri/0/i915_pri_wm_latency
>
> After running this command I did not notice the problem. Was this command
> supposed to fix the problem or provoke it? If it's meant to fix it, I will
> need to test more extensively since I only had the opportunity to test for
> maybe 20 minutes.
>
> Let me know if anything is needed. Thanks for taking the time!
@Anshuman, any further suggestions?

(In reply to Johan Thorén from comment #29)
> I've now verified that the error still occurs with or without that command
> issued, especially when coming back from suspend.
Hmm, I was expecting this command to improve the issue. If it had improved the issue, then we could suspect a watermark issue.
I was experiencing the same issue with a ThinkPad 25 (which is almost the same thing as a T470), and I implemented these precautions as a workaround:
- disable DPMS when docked and external monitors are enabled
- never switch VTs with external monitors enabled
- always disable external monitors before suspending or undocking
Posting here just in case someone is still suffering from this and hasn't figured out a workaround yet.

Just lately there were some MST fixes on drm-tip. Are you able to test with the latest drm-tip and report back on the behaviour?

I've just built the drm-tip kernel. Will test this for a few days and report back. Thanks.

I'm happy to report that I've not had a single problem since I installed 5.4.0-rc7-drm-tip-git-g3ff71899c56c. I'm marking this as resolved, with thanks!

Excellent to hear, thank you.
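For anyone still on an affected kernel, the workaround list posted earlier in this thread (disable DPMS while docked, turn external monitors off before suspending or undocking) could be scripted roughly as follows. This is only a sketch for an X11 session with xrandr and xset installed. It assumes the internal panel is named eDP* or LVDS*, which is common with i915 but not guaranteed. All function names and the sample xrandr lines are made up for illustration.

```shell
#!/bin/sh
# Read "xrandr --query" output on stdin and print connected outputs
# that are not the internal panel.
# Assumption: the internal panel is named eDP* or LVDS*.
external_outputs() {
    awk '$2 == "connected" && $1 !~ /^(eDP|LVDS)/ { print $1 }'
}

# Turn off every connected external output; meant to run before suspend or undock.
disable_external_outputs() {
    xrandr --query | external_outputs | while read -r out; do
        xrandr --output "$out" --off
    done
}

# Disable DPMS for the current X session (first item of the workaround list).
disable_dpms() {
    xset -dpms
}

# Demonstrate only the parsing step on synthetic xrandr output; no X server needed.
printf '%s\n' \
    'eDP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis)' \
    'DP-2-1 connected 1920x1200+1920+0 (normal left inverted right x axis y axis)' \
    'HDMI-1 disconnected (normal left inverted right x axis y axis)' |
    external_outputs
```

Calling disable_external_outputs from a pre-suspend hook is one possible way to apply it automatically; how such a hook is installed depends on the distribution and is not covered in this thread.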