| Summary: | [BAT][SKL] WARNING: no modes for connector 48 when running igt@kms_setmode@basic-clone-single-crtc | ||
|---|---|---|---|
| Product: | DRI | Reporter: | Martin Peres <martin.peres> |
| Component: | IGT | Assignee: | Manasi <manasi.d.navare> |
| Status: | CLOSED FIXED | QA Contact: | |
| Severity: | critical | ||
| Priority: | medium | CC: | intel-gfx-bugs, manasi.d.navare |
| Version: | DRI git | ||
| Hardware: | Other | ||
| OS: | All | ||
| Whiteboard: | |||
| i915 platform: | ALL | i915 features: | |
|
Description
Martin Peres
2017-06-20 13:38:03 UTC
Connector 48 from the logs is eDP-1. What's interesting is that I see this:

[ 392.746337] [drm:drm_mode_prune_invalid] Not using 1920x1080 mode: CLOCK_HIGH

Some more debugging from dmesg:

[ 382.469405] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:48:eDP-1]
[ 382.469523] [drm:intel_dp_detect [i915]] [CONNECTOR:48:eDP-1]
[ 382.469656] [drm:intel_power_well_enable [i915]] enabling DC off
[ 382.469777] [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00
[ 382.469871] [drm:intel_dp_detect [i915]] Display Port TPS3 support: source yes, sink yes
[ 382.469947] [drm:intel_dp_print_rates [i915]] source rates: 162000, 216000, 270000, 324000, 432000, 540000
[ 382.470015] [drm:intel_dp_print_rates [i915]] sink rates: 162000, 270000
[ 382.470081] [drm:intel_dp_print_rates [i915]] common rates: 162000, 270000
[ 382.470249] [drm:edp_panel_vdd_on [i915]] Turning eDP port A VDD on
[ 382.470501] [drm:edp_panel_vdd_on [i915]] PP_STATUS: 0x80000008 PP_CONTROL: 0x0000000f
[ 382.471733] [drm:drm_dp_read_desc] DP sink: OUI 00-22-b9 dev-ID sivarT HW-rev 0.0 SW-rev 0.0 quirks 0x0000
[ 382.472916] [drm:drm_edid_to_eld] ELD: no CEA Extension found
[ 382.472944] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:48:eDP-1] probed modes :
[ 382.472970] [drm:drm_mode_debug_printmodeline] Modeline 49:"1920x1080" 60 138700 1920 1968 2000 2080 1080 1083 1088 1111 0x48 0x9

OK, looks good. Now getting weird HPD and tons of dp_aux_ch timeouts.

But what I also noticed:
$ grep ERROR dmesg-during.log
[ 385.088911] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training
[ 390.392461] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training
[ 391.236144] [drm:intel_dp_check_link_status [i915]] *ERROR* Failed to get link status
[ 407.184193] [drm:intel_dp_check_link_status [i915]] *ERROR* Failed to get link status
[ 417.936164] [drm:intel_dp_check_link_status [i915]] *ERROR* Failed to get link status
[ 420.311740] [drm:intel_dp_check_link_status [i915]] *ERROR* Failed to get link status

So I guess we fail with warn, without adding the dmesg error? It should be dmesg-fail I think, from piglit/framework/dmesg.py

When adding a quick test for this in igt/tests/meta_test.c, it seems the bug is in intel-CI. The piglit html summary is correctly generated and shows it as dmesg-fail.

Manasi, now that you fixed the failing platform, could you have a look into this IGT bug? Basically, IGT does not know how to deal with the kernel pruning modes. Do we want to simply ignore this possibility or do we want to be more robust to it? At the very least, it would be nice to add a debug message saying that the modes may all have been pruned, according to the DP spec.

*** Bug 101519 has been marked as a duplicate of this bug. ***

Assigning Manasi to this issue, because it is a fallout of the link-status patch. We may argue all we want about whether the platform is bad or not, but the tests are for sure wrong since they don't check for their dependencies. Sorry to throw you under the bus Manasi, but this is only a problem after your patch.

I see the old logs; do you have the dmesg logs for the most recent testing after the T12 delay fix patch went in? Do we still see AUX timeouts? Like I mentioned, the problem here is those AUX timeouts. We should not have those, since for an eDP panel we should never fail link training. These AUX timeouts are putting the system in an unexpected state.
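To make the warn versus dmesg-fail question above concrete, here is a minimal sketch of how a result could be upgraded when the kernel log contains errors. It is not piglit's actual code: the function name `classify` and the `*ERROR*` pattern are hypothetical, and the pass to dmesg-warn mapping is an assumption; the thread itself only states that a warn result with dmesg errors should have been reported as dmesg-fail.

```python
import re

# Assumption: DRM errors are tagged "*ERROR*" in dmesg, as in the grep output above.
ERROR_PATTERN = re.compile(r"\*ERROR\*")


def classify(result, dmesg_lines):
    """Return the test result, upgraded if new dmesg lines contain errors.

    Hypothetical helper for illustration only; not piglit's implementation.
    """
    has_error = any(ERROR_PATTERN.search(line) for line in dmesg_lines)
    if not has_error:
        return result
    if result == "pass":
        # Assumed mapping: a passing test with kernel errors is only a warning.
        return "dmesg-warn"
    if result in ("warn", "fail"):
        # The case discussed in this bug: warn plus kernel errors.
        return "dmesg-fail"
    return result


if __name__ == "__main__":
    log = [
        "[ 385.088911] [drm:intel_dp_start_link_train [i915]] "
        "*ERROR* failed to enable link training",
    ]
    print(classify("warn", log))
```

Under these assumptions, the run reported in this bug (a warn result plus the link training errors) would be classified as dmesg-fail rather than warn.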
(In reply to Manasi from comment #8)
> I see the old logs, do you have the dmesg logs for the most recent testing
> after the T12 delay fix patch went in.
> Do we still see AUX timeouts? Like I mentioned, the problem here is those
> aux timeouts. We should not have those since for an eDP panel we should
> never fail link training. These aux timeouts are putting the system in an
> unexpected state.

Sure, it should not happen, but the failure mode taken is also wrong. Why are we pruning the mode when we are failing to enable the link training? It should only prune a mode if it failed to perform the actual link training. You still need to send the hotplug event, though, to ask userspace to redo the modeset.

And also, we should NEVER prune the last mode. It is not a problem on DP, but it is on eDP. The quickest fix would be to never prune modes on eDP since there is only one mode anyway.

Here are newer logs: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_2903/fi-skl-6700hq/igt@kms_busy@basic-flip-default-a.html

Please get familiar with the results we export on https://intel-gfx-ci.01.org, the landing page describes everything. If it is confusing, talk to me about it on IRC.

Probably a direct consequence of bug #101144 and the DP link status patch then nuking the mode list. Time to revert link status handling since it doesn't work?

No, I think it is time to fix the AUX CH timeouts for good. That's the real cause, not the DP link status patch. It might be a good idea, though, to remove the link status and fallback handling for eDP: on eDP link training should not fail, and if it does we should just fail. There is no need to retry with a different mode since there is only 1 mode.

Manasi

(In reply to Manasi from comment #12)
> No i think time to fix the AUX CH timeouts for good.
> That's the real cause not the DP link status patch.
> Although might be a good idea to remove the link status and fallback
> handling if its eDP since in case of eDP link training should not fail and
> if it does then we just fail no need to retry with different mode since there
> is only 1 mode.
>
> Manasi

Yes, nuke the mode pruning from eDP altogether. This will allow us to un-blacklist a lot of tests for this platform and we'll be able to start checking what to do with the AUX channel issue.

Systems have been quite stable lately.

(In reply to Jani Saarinen from comment #14)
> Systems have been quite stable lately.

Has anyone committed anything that would affect the machine?

Not sure. Asked that on the intel-gfx ML too ;) I'll resolve this now and wait for it to pop up again.

So the patch that I submitted to fix the T12 delay and increase it further to 900ms got merged (SHA: 5b2eff59160e), so that could have fixed the issue of AUX timeouts on the SKL system.

Came back again on CI_DRM_3055: https://intel-gfx-ci.01.org/tree/drm-tip/fi-skl-6700hq.html

Series now merged: https://patchwork.freedesktop.org/series/31361/

No Jani, this bug has not been addressed yet.

Manasi thinks it does....

Manasi, did you land the patch that prevents pruning the last mode?

OK, so this bug was mainly caused by the AUX timeouts seen on the panel and the mode getting pruned because of that. So this bug should be fixed because the patch series that prevents the AUX timeouts got merged. As a precaution, we should not prune the preferred mode on eDP. I had submitted a patch for that as well: https://patchwork.freedesktop.org/series/31102/

I am still waiting to get some feedback on this, since I really want to know how to handle the case where the preferred mode cannot be handled by the lowered link rate and we don't prune it, but then in the next modeset we get an encoder config failure since the requested BW > available BW. What to do in that case?

Manasi

Ping, what to do with this?

(In reply to Jani Saarinen from comment #25)
> Ping, what to do with this?
The patch Manasi was proposing needs to land. Then we also need to make IGT more resistant against modes disappearing (skip instead of fail).

(In reply to Martin Peres from comment #26)
> (In reply to Jani Saarinen from comment #25)
> > Ping, what to do with this?
>
> The patch Manasi was proposing needs to land. Then we also need to make IGT
> more resistant against modes disappearing (skip instead of fail).

Manasi, could you please prioritise this bug? We will be expecting an update weekly on this issue until this is resolved...

The patch for not pruning the modes for eDP got merged already in drm-tip:

drm/i915/edp: Do not do link training fallback or prune modes on EDP

Could you check if this failure is still seen?

I also have a new patch to handle link training on EDP in a better way on the M-L: https://patchwork.freedesktop.org/patch/223573/

This needs a respin as per Jani's comments and I am working on it.

Regards
Manasi

(In reply to Manasi from comment #28)
> The patch for not pruning the modes for eDP got merged already in drm-tip:
>
> drm/i915/edp: Do not do link training fallback or prune modes on EDP
>
> Could you check if this failure is still seen?
>
> I also have a new patch to handle link training on EDP in a better way on
> the M-L:
> https://patchwork.freedesktop.org/patch/223573/
>
> This needs a respin as per Jani's comments and I am working on it.
>
> Regards
> Manasi

Thanks Manasi! Dropping the priority now as the likelihood of this issue ever being hit again is low. Let's close the bug after we finish reviewing the IGT tests that assume modes cannot disappear. Could you try to hack something to prune modes when running a test and see how the IGT tests react to it? This is something you can do locally.

Hi Martin, can we close this bug since the patches to fix this have already been upstreamed?

Manasi

Closing this bug as fixed. |
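The two policies discussed in this thread (do not prune modes on eDP and never prune the last mode; skip rather than fail when a connector has no modes) can be modelled in a few lines. This is only a sketch in Python, not the i915 or IGT code: `Mode`, `prune_modes`, and `modeset_test_result` are hypothetical names, and keeping the lowest-clock mode as the last-mode fallback is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    clock_khz: int


def prune_modes(modes, max_clock_khz, is_edp):
    """Hypothetical pruning policy modelled on the discussion in this bug."""
    if is_edp:
        # eDP panels typically expose a single mode, so never prune it.
        return list(modes)
    kept = [m for m in modes if m.clock_khz <= max_clock_khz]
    if not kept and modes:
        # Never prune the last mode. Keeping the lowest-clock one is an
        # assumption of this sketch, not something the thread specifies.
        kept = [min(modes, key=lambda m: m.clock_khz)]
    return kept


def modeset_test_result(modes):
    """Test-side check: skip instead of fail when no modes are available."""
    if not modes:
        return "skip"
    return "pass"


if __name__ == "__main__":
    panel = [Mode("1920x1080", 138700)]
    print(prune_modes(panel, max_clock_khz=100000, is_edp=True))
    print(modeset_test_result([]))
```

The open question raised above still applies to any such policy: a mode that is kept despite exceeding the available bandwidth can still fail at the next modeset, so the test side needs the skip path regardless.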