Summary: | Regression - system fails to boot with Link Rate Fallback | |
---|---|---|---
Product: | DRI | Reporter: | alexander.wilson
Component: | DRM/Intel | Assignee: | Manasi <manasi.d.navare>
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | major | |
Priority: | high | CC: | alexander.wilson, intel-gfx-bugs, jani.nikula, manasi.d.navare
Version: | unspecified | Keywords: | bisected, regression
Hardware: | Other | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | BDW | i915 features: | display/DP
Attachments: |
|
Is it possible to get xorg.logs and/or kern.logs from the working and non-working kernels? Thank you.

I'm suddenly having trouble patching the current kernel; I will keep working on that. For now I'm attaching the Xorg and systemd journal logs for v4.16-rc4 (problematic on my current computer).

Created attachment 137878 [details]
Xorg.log of nonfunctioning v4.16-rc4
Created attachment 137879 [details]
systemd journal log of nonfunctioning v4.16-rc4
As reference:

commit 9301397a63b3bf1090dffe846c6f1c8efa032236
Author: Manasi Navare <manasi.d.navare@intel.com>
Date:   Thu Apr 6 16:44:19 2017 +0300

    drm/i915: Implement Link Rate fallback on Link training failure

commit 713946d16f45ad0509434970ae6ff71529faab4b
Author: Manasi Navare <manasi.d.navare@intel.com>
Date:   Thu Oct 26 14:52:00 2017 -0700

    drm/i915: Cancel the modeset retry work during modeset cleanup

First of all, sorry about the spam. This is a mass update for our bugs. Sorry if you find this annoying, but we are trying to understand whether this bug is still valid. If the bug investigation is still in progress, please ignore this, and I apologize! If you think the bug is no longer valid, please comment that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), please do that as well, and comment on the bug if you can no longer see the issue there.

As to the mass comment above, the problem persists with the latest pre-upstream tree. In general, if more tests or logs are needed from my system, please don't hesitate to ask.

Jani, Manasi, options here?

(In reply to alexander.wilson from comment #4)
> Created attachment 137879 [details]
> systemd journal log of nonfunctioning v4.16-rc4

That's a completely different problem, fixed by

commit a95845ba184b854106972f5d8f50354c2d272c06
Author: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Date:   Thu Apr 5 06:51:15 2018 -0300

    media: v4l2-core: fix size of devnode_nums[] bitarray

and also backported to v4.16.

The original report here is a bit short on details, but it's possible this has been fixed by

commit a306343bcd7df89d9d45a601929e26866e7b7a81
Author: Manasi Navare <manasi.d.navare@intel.com>
Date:   Thu Oct 12 12:13:38 2017 -0700

    drm/i915/edp: Do not do link training fallback or prune modes on EDP

and also backported to v4.15+. Please retry with up-to-date kernels.

Created attachment 139072 [details]
systemd journal log of functioning v4.17-rc2
The latest kernel does indeed run. There is still a related error logged by journald:
kernel: [drm:intel_dp_start_link_train [i915]] *ERROR* [CONNECTOR:65:eDP-1] Link Training failed at link rate >
I am unsure what the consequences of this are, if any. The full journal log is attached.
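For anyone scanning a longer journal or dmesg dump for the same message, the following is a minimal sketch of a filter. The regular expression is modelled only on the single line quoted above, so it is an assumption that other i915 messages follow the same layout; the helper name `find_link_training_failures` is hypothetical.

```python
import re

# Pattern modelled on the one journal line quoted in this report;
# other kernel versions may format the message differently.
LINK_TRAIN_ERROR = re.compile(
    r"\[drm:(?P<function>\w+) \[i915\]\] \*ERROR\* "
    r"\[CONNECTOR:(?P<id>\d+):(?P<name>[^\]]+)\] Link Training failed"
)


def find_link_training_failures(lines):
    """Return (function, connector_id, connector_name) for each matching line."""
    failures = []
    for line in lines:
        match = LINK_TRAIN_ERROR.search(line)
        if match:
            failures.append(
                (match.group("function"), int(match.group("id")), match.group("name"))
            )
    return failures


sample = [
    "kernel: [drm:intel_dp_start_link_train [i915]] *ERROR* "
    "[CONNECTOR:65:eDP-1] Link Training failed at link rate >",
    "kernel: some unrelated message",
]
print(find_link_training_failures(sample))
```

The truncated tail of the line (the `>` left by the pager) is deliberately not parsed, since the link rate value is cut off in the quoted log.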
(In reply to alexander.wilson from comment #10)
> Created attachment 139072 [details]
> systemd journal log of functioning v4.17-rc2
>
> The latest kernel does indeed run. There is still a related error logged by
> journald:
>
> kernel: [drm:intel_dp_start_link_train [i915]] *ERROR* [CONNECTOR:65:eDP-1]
> Link Training failed at link rate >
>
> I am unsure what the consequences of this are, if any. The full journal log
> is attached.

Please add the drm.debug=14 module parameter and reproduce. Please also make sure the log lines are not cut off. For example, get the dmesg using 'dmesg > dmesg.log' and attach that.

Created attachment 139104 [details]
dmesg for 4.17-rc2
OK, here is the dmesg with the drm.debug=14 kernel param on the same kernel as my last log. I'll compile the latest kernel and test that today.
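As a side note on the drm.debug=14 value requested above: the parameter is a bitmask of debug categories. The sketch below decodes it, assuming the low category bits as listed in the kernel's drm.debug parameter description (only the first six are shown here; check the documentation for your kernel version). The helper name `decode_drm_debug` is hypothetical.

```python
# Low debug category bits of drm.debug, as I understand the kernel's
# parameter description. This table is partial and is an assumption to
# verify against your kernel version.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug value."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if value & bit]


print(decode_drm_debug(14))  # 14 == 0x0e
```

Under that assumption, 14 enables the DRIVER, KMS and PRIME categories, which would cover the modeset and link training messages of interest here.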
Created attachment 139109 [details]
dmesg.log from latest 4.17rc2
Replacing dmesg with latest kernel
Looking at the logs, it looks like on boot the optimum link parameters are link rate = 27000 and lane count = 2, but with these, link training fails in the clock recovery phase after 5 retries. That is why you see the debug message "Link training failed", but it then still recovers and brings up the display after enabling the pipe. It is probably one of those panels that do not handle the voltage swing values according to the spec, so link training fails but we still have the display.

Jani, it looks like the hack of retrying clock recovery 5 times and then giving up, which was removed during the compliance efforts, needs to be added back. I can give a test patch that adds these retries before declaring link failure. What are your thoughts here?

Manasi

Jani, any advice on Manasi's comment?

The status of the bug is currently NEEDINFO. Is there anything more I can provide of use? Current logs show the same link training failure.

No more info is needed from the reporter AFAICT, back to ASSIGNED.

I rewrote the link training fallback in this patch: https://patchwork.freedesktop.org/patch/223573/
Could you try this patch to see if it fixes the issue?

Regards
Manasi

Created attachment 140308 [details]
dmesg of patched 4.18-rc1 kernel
The patch appears to work! I'm attaching the dmesg.log just in case. Thank you for your effort on this. How can I follow this patch to see when it gets into mainline and/or LTS?
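For readers trying to picture the behaviour discussed in Manasi's analysis earlier in the thread (retry clock recovery a few times, then fall back to a lower link configuration), here is a simplified sketch. It is not the i915 implementation: the fallback order (lower the link rate first, then halve the lane count), the `clock_recovery_ok` callback, and the rate value 16200 are all assumptions made for illustration; only the rate 27000, the lane count 2 and the 5 retries come from the thread.

```python
def train_link(clock_recovery_ok, link_rates, max_lanes, max_cr_tries=5):
    """Try link configurations from fastest to slowest.

    clock_recovery_ok(rate, lanes) is a hypothetical stand-in for the
    hardware; it returns True when clock recovery succeeds.
    Returns (rate, lanes, failed_configs); rate and lanes are None if
    no configuration trained.
    """
    failed = []
    lanes = max_lanes
    while lanes >= 1:
        for rate in sorted(link_rates, reverse=True):
            # Retry clock recovery before declaring this configuration failed.
            for _ in range(max_cr_tries):
                if clock_recovery_ok(rate, lanes):
                    return rate, lanes, failed
            failed.append((rate, lanes))
        lanes //= 2  # assumed fallback step: halve the lane count
    return None, None, failed


# A simulated panel whose clock recovery only succeeds on the third attempt.
attempts = {"count": 0}


def flaky_panel(rate, lanes):
    attempts["count"] += 1
    return attempts["count"] >= 3


print(train_link(flaky_panel, [16200, 27000], 2))
```

With retries in place, the simulated panel trains at the first configuration instead of being pushed down to a slower one, which is the effect the retry hack is meant to have.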
That's great! I will rebase this patch and submit it to the Intel-GFX mailing list. If you are subscribed to that, you can track the status there and also give a Tested-By tag if you could.

Manasi

Great, I've just subscribed to the mailing list. For the Tested-By tag, is this something I would commit to the patch as submitted?

(In reply to alexander.wilson from comment #21)
> Great, I've just subscribed to the mailing list. For the Tested-By tag, is
> this something I would commit to the patch as submitted?

You just reply via email with Tested-By: your name

Manasi, any updates here?

I am working on the patch to address the review comments from Jani Nikula, which ask to also compare with the downclock mode and to disconnect the downclock mode from the DRRS mode. Let's keep this as assigned for now and then close it once this patch gets upstreamed.

Manasi

Presumed fixed by

commit 1e712535c51ab025ebc776d4405683d81521996d
Author: Manasi Navare <manasi.d.navare@intel.com>
Date:   Tue Oct 9 14:28:04 2018 -0700

    drm/i915/dp: Link train Fallback on eDP only if fallback link BW can fit panel's native mode

Thanks for the report and testing. |
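The title of the fixing commit says fallback is allowed on eDP only when the fallback link bandwidth can still carry the panel's native mode. The arithmetic behind such a check can be sketched as below. This is not the driver's code: the function names are hypothetical, and the panel mode (138500 kHz pixel clock, 24 bits per pixel) is an illustrative assumption, not a value taken from the reporter's logs. The link clocks 162000 kHz and 270000 kHz are the standard DisplayPort RBR and HBR symbol clocks.

```python
def link_data_rate_kbps(link_clock_khz, lanes):
    # With 8b/10b encoding, each lane carries 8 data bits per link symbol clock.
    return link_clock_khz * lanes * 8


def mode_rate_kbps(pixel_clock_khz, bits_per_pixel):
    # Raw pixel data rate required by a display mode.
    return pixel_clock_khz * bits_per_pixel


def fallback_fits_mode(link_clock_khz, lanes, pixel_clock_khz, bits_per_pixel):
    """True if the candidate link configuration can carry the given mode."""
    return link_data_rate_kbps(link_clock_khz, lanes) >= mode_rate_kbps(
        pixel_clock_khz, bits_per_pixel
    )


# Hypothetical native mode: 138500 kHz pixel clock at 24 bpp.
print(fallback_fits_mode(270000, 2, 138500, 24))  # HBR x2: 4,320,000 >= 3,324,000 -> True
print(fallback_fits_mode(162000, 2, 138500, 24))  # RBR x2: 2,592,000 <  3,324,000 -> False
```

In this example a fallback from HBR to RBR on two lanes would no longer fit the assumed native mode, so a check of this kind would refuse that fallback rather than leave an eDP panel with no usable mode.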
Created attachment 137780 [details]
Bisect log of kernel

Commit 9301397a63b3bf1090dffe846c6f1c8efa032236 and the related commit 713946d16f45ad0509434970ae6ff71529faab4b cause the Linux kernel boot process to fail on a Dell Chromebook 13 running Arch Linux. I bisected the kernel and the log is attached. Reverting those two commits fixes the regression.