Bug 97363 - [drm:ironlake_crtc_compute_clock [i915]] *ERROR* Couldn't find PLL settings for mode!
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Ville Syrjala
QA Contact: Intel GFX Bugs mailing list
Reported: 2016-08-16 11:57 UTC by Alexander Kobel
Modified: 2017-07-24 22:40 UTC
i915 platform: IVB
i915 features: display/Other


Attachments
Xorg.log (9.18 KB, text/plain)
2016-08-16 11:57 UTC, Alexander Kobel
dmesg (53.63 KB, text/plain)
2016-08-16 11:58 UTC, Alexander Kobel
drm-debug-linux-4.6 (3.45 KB, text/plain)
2016-08-24 21:07 UTC, Robin Müller
drm-debug-linux-4.7 (3.54 KB, text/plain)
2016-08-24 21:13 UTC, Robin Müller
drm-debug-opregion_panel_type_stuff (701.26 KB, text/plain)
2016-09-06 19:22 UTC, Robin Müller

Description Alexander Kobel 2016-08-16 11:57:46 UTC
Created attachment 125813 [details]
Xorg.log

This is a followup on https://bugzilla.kernel.org/show_bug.cgi?id=153181, where we were redirected to here.
Robin reports:

After upgrading to linux-4.7 I get the following error during boot time and while starting Wayland or xserver (ends in a black screen):

[drm:ironlake_crtc_compute_clock [i915]] *ERROR* Couldn't find PLL settings for mode!

I'm not sure whether this is the cause of the error, but I looked at the changes to intel_display.c between 4.6 and 4.7 and noticed the following:
In linux-4.6, ironlake_crtc_compute_clock used ironlake_compute_clocks to retrieve the PLL settings, which in turn used one of the following functions depending on the device:
- g4x_find_best_dpll
- chv_find_best_dpll
- vlv_find_best_dpll
- pnv_find_best_dpll
- i9xx_find_best_dpll
In linux-4.7, ironlake_crtc_compute_clock uses only g4x_find_best_dpll.


I can confirm this on a Samsung NP900X4C laptop (Intel HD Graphics 4000) running Arch Linux. It worked fine in 4.6.5. Booting to the console already prints the message a number of times. When trying to start X as root, the system freezes entirely (only Magic SysRq comes to the rescue); as a regular user, X fails to start, but the system stays responsive. The error message from the title also appears whenever I switch virtual terminals.

A discussion with some (unsuccessful) hints is here: https://bbs.archlinux.org/viewtopic.php?id=215847
Things that didn't work:
- uninstall xf86-video-intel and use the native modesetting
- add iomem=relaxed as a kernel parameter

I attach the dmesg output as well as an Xorg.log. The btrfs complaints are unrelated and due to the fact that I had to shut down the system hard.
Comment 1 Alexander Kobel 2016-08-16 11:58:13 UTC
Created attachment 125814 [details]
dmesg
Comment 2 Robin Müller 2016-08-16 21:21:24 UTC
I added some logging to intel_display.c for both linux-4.6 and linux-4.7. Running these customized kernels, I found the following:
In both cases the same functions are called, so my first assumption, that linux-4.6 used a function other than g4x_find_best_dpll to calculate the PLL settings, was wrong. However, the values passed to this function differ: the target parameter is 100400 with linux-4.6 and 25180 with linux-4.7. Since the code of g4x_find_best_dpll hasn't changed between these two versions, I'd say the different input value causes the bug.
I haven't yet found where the value for target (crtc_state->port_clock) comes from; navigating the code with just vim isn't that easy. If I find some time tomorrow, I'll see how well IntelliJ handles C projects and try to find the cause of the different crtc_state->port_clock values.
I hope this information helps somebody else dig into this issue.
Comment 3 Robin Müller 2016-08-23 21:19:14 UTC
After some further investigations I figured out the following:
* Only the LVDS port is affected by this bug. If I attach a monitor via HDMI, Wayland starts and shows a picture on that monitor, but the laptop's display goes black at the same time.
* The wrong crtc_state->port_clock value comes from the clock field of the drm_display_mode struct, so it seems the wrong frequency is set very early during DRM initialization. Unfortunately I still haven't found where this happens, because I'm not familiar enough with the kernel code to know which part starts the DRM initialization and when. So I mostly did this by trial and error.
Comment 4 Robin Müller 2016-08-24 21:07:36 UTC
Created attachment 126020 [details]
drm-debug-linux-4.6
Comment 5 Robin Müller 2016-08-24 21:13:23 UTC
Created attachment 126021 [details]
drm-debug-linux-4.7

I finally found out how to enable debugging for the drm module(s) \o/. After doing this, I found the reason for the different clock values:
For linux 4.6 I get these lines:
[drm:parse_lfp_panel_data] Found panel mode in BIOS VBT tables:
[drm:drm_mode_debug_printmodeline] Modeline 0:"1600x900" 0 100400 1600 1648 1680 1792 900 902 907 932 0x8 0xa
For linux 4.7 I get these lines:
[drm:parse_lfp_panel_data] Found panel mode in BIOS VBT tables:
[drm:drm_mode_debug_printmodeline] Modeline 0:"640x480" 0 25180 640 648 744 784 480 482 484 509 0x8 0xa

I attached the beginning of both logs for further information. It seems the code that retrieves the available modes from the BIOS behaves differently between these two versions.
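
To make the difference between the two logs easier to see, here is a minimal Python sketch that parses the two modelines quoted above and derives the refresh rate from the pixel clock. It assumes the field order that drm_mode_debug_printmodeline appears to use in these logs (object id, name, vrefresh, clock in kHz, four horizontal timings, four vertical timings, type, flags). The names Modeline, parse_modeline and refresh_hz are made up for this illustration and are not kernel code.

```python
import re
from typing import NamedTuple


class Modeline(NamedTuple):
    name: str
    clock_khz: int
    hdisplay: int
    htotal: int
    vdisplay: int
    vtotal: int


# Assumed layout:
#   Modeline <id>:"<name>" <vrefresh> <clock> <4 horizontal> <4 vertical> <type> <flags>
_PATTERN = re.compile(
    r'Modeline \d+:"([^"]+)" (\d+) (\d+) ((?:\d+ ){8})0x[0-9a-f]+ 0x[0-9a-f]+'
)

LINE_4_6 = ('[drm:drm_mode_debug_printmodeline] Modeline 0:"1600x900" 0 100400 '
            '1600 1648 1680 1792 900 902 907 932 0x8 0xa')
LINE_4_7 = ('[drm:drm_mode_debug_printmodeline] Modeline 0:"640x480" 0 25180 '
            '640 648 744 784 480 482 484 509 0x8 0xa')


def parse_modeline(line: str) -> Modeline:
    """Extract name, pixel clock and display/total sizes from one debug line."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError(f"no modeline found in: {line!r}")
    timings = [int(x) for x in match.group(4).split()]
    hdisplay, _hss, _hse, htotal, vdisplay, _vss, _vse, vtotal = timings
    return Modeline(match.group(1), int(match.group(3)),
                    hdisplay, htotal, vdisplay, vtotal)


def refresh_hz(mode: Modeline) -> float:
    """Refresh rate = pixel clock / (total pixels per frame)."""
    return mode.clock_khz * 1000 / (mode.htotal * mode.vtotal)


if __name__ == "__main__":
    for label, line in (("4.6", LINE_4_6), ("4.7", LINE_4_7)):
        mode = parse_modeline(line)
        print(f"linux-{label}: {mode.name} clock={mode.clock_khz} kHz "
              f"refresh={refresh_hz(mode):.2f} Hz")
```

For the two lines above this gives roughly 60.1 Hz for the 1600x900 mode and 63.1 Hz for the 640x480 mode. Both are plausible timings on their own, which suggests the 4.7 kernel picked a different but valid entry rather than reading garbage.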
Comment 6 Alexander Kobel 2016-08-24 22:52:46 UTC
The second line of the description of
  https://lists.freedesktop.org/archives/dri-devel/2016-August/116579.html
rings a bell... Could this be related and/or of any help?
Comment 7 philipp.reinkemeier 2016-09-04 12:47:03 UTC
Just wanted to add that the same error message and bug also affect my laptop: the error message is shown at boot, and the screen goes blank when starting X or Wayland.
Comment 8 Trudy Tective 2016-09-05 19:10:55 UTC
Just to add that this also affects my laptop with an i5-3320M and onboard Intel HD 4000 graphics. The panel's modes are identified incorrectly, which results in distortion across the screen and makes the 4.7* and 4.8* kernels unusable for me as of this writing.
Comment 10 Trudy Tective 2016-09-05 19:25:12 UTC
Bug 97060 seems related to this one.
Comment 11 Ville Syrjala 2016-09-06 12:10:06 UTC
Can you try 
git://github.com/vsyrjala/linux.git opregion_panel_type_stuff

I'm not expecting it would really fix the problem, but it will at least show us the full response from the BIOS.
Comment 12 Robin Müller 2016-09-06 19:22:29 UTC
Created attachment 126253 [details]
drm-debug-opregion_panel_type_stuff

I just built a kernel from the given branch and enabled debugging for the drm module. The attachment contains all of the drm debug logs. I hope this helps to find the cause of this bug.
Comment 13 Ville Syrjala 2016-09-06 19:40:22 UTC
(In reply to Robin Müller from comment #12)
> Created attachment 126253 [details]
> drm-debug-opregion_panel_type_stuff
> 
> I just built a kernel from the given branch and enabled the debugging for
> the drm module. The attachment contains all debug logs of the drm. Hope this
> helps to find the cause of this bug.

Thanks. I pushed a new patch to the branch which I think should get us back on track with your machine. So please pull again, and retest.

I don't really know if it's any more correct than the current code though, and I don't yet know if it works for the original machine that needs the opregion panel type.
Comment 14 Robin Müller 2016-09-06 20:18:59 UTC
(In reply to Ville Syrjala from comment #13)
> (In reply to Robin Müller from comment #12)
> > Created attachment 126253 [details]
> > drm-debug-opregion_panel_type_stuff
> > 
> > I just built a kernel from the given branch and enabled the debugging for
> > the drm module. The attachment contains all debug logs of the drm. Hope this
> > helps to find the cause of this bug.
> 
> Thanks. I pushed a new patch to the branch which I think should get us back
> on track with your machine. So please pull again, and retest.
> 
> I don't really know if it's any more correct than the current code though,
> and I don't yet know if it works for the original machine that needs the
> opregion panel type.

Thanks, this patch did the trick. After applying the patch the error is gone and my display manager starts again :-)
Comment 15 Emil Andersen Lauridsen 2016-09-07 15:20:02 UTC
(In reply to Ville Syrjala from comment #13)
> (In reply to Robin Müller from comment #12)
> > Created attachment 126253 [details]
> > drm-debug-opregion_panel_type_stuff
> > 
> > I just built a kernel from the given branch and enabled the debugging for
> > the drm module. The attachment contains all debug logs of the drm. Hope this
> > helps to find the cause of this bug.
> 
> Thanks. I pushed a new patch to the branch which I think should get us back
> on track with your machine. So please pull again, and retest.
> 
> I don't really know if it's any more correct than the current code though,
> and I don't yet know if it works for the original machine that needs the
> opregion panel type.

This branch resolves the problem on my i5-3317U as well.
Comment 16 Ville Syrjala 2016-09-08 09:14:51 UTC
I had to resort to a quirk to fix this for all machines. Please test this branch:

git://github.com/vsyrjala/linux.git opregion_panel_type_quirk
Comment 17 oceans112 2016-09-08 09:55:56 UTC
(In reply to Ville Syrjala from comment #16)
> I had to resort to a quirk to fix this for all machines. Please test this
> branch:
> 
> git://github.com/vsyrjala/linux.git opregion_panel_type_quirk

Thank you for this, it's working (on np900x4c).

PS: I was trying to enable drm debugging by adding drm.debug=14 to the kernel command line. Is this the right way? And then I read the debugging info with dmesg, right?
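
For reference, drm.debug is a bitmask, so a value like 14 selects a particular set of message categories, and the resulting messages land in the kernel log that dmesg prints. Below is a small Python sketch that decodes such a value. It assumes the category bits used by the DRM headers of this kernel generation (CORE=0x01, DRIVER=0x02, KMS=0x04, PRIME=0x08, ATOMIC=0x10, VBL=0x20); please check the headers of the tree you are building, since the set of categories has changed over time. The name decode_drm_debug is hypothetical.

```python
from typing import List

# Assumed debug category bits (check include/drm in your own kernel tree).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(value: int) -> List[str]:
    """Return the names of all categories whose bit is set in value."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if value & bit]


if __name__ == "__main__":
    # 14 == 0x0e == 0x02 | 0x04 | 0x08
    print(decode_drm_debug(14))
```

Under these assumptions, 14 enables the DRIVER, KMS and PRIME categories, which would include the driver and modesetting messages quoted earlier in this report.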
Comment 18 oceans112 2016-09-08 10:03:21 UTC
(In reply to oceans112 from comment #17)
> (In reply to Ville Syrjala from comment #16)
> > I had to resort to a quirk to fix this for all machines. Please test this
> > branch:
> > 
> > git://github.com/vsyrjala/linux.git opregion_panel_type_quirk
> 
> Thank you for this, it's working (on np900x4c).
> 
> ps: I was trying to enable drm debugging, adding drm.debug=14 to kernel
> line..is this the right way? And then I'll read the debugging infos with
> dmesg, right?

Oops! I'm sorry, but I copy/pasted the wrong branch; I actually built and tested opregion_panel_type_stuff. I'll try opregion_panel_type_quirk ASAP. Please ask if you need any details.
Comment 19 oceans112 2016-09-08 14:38:22 UTC
ok, so git://github.com/vsyrjala/linux.git opregion_panel_type_quirk (this time I double-checked) works too.

From Xorg.log I can tell that the driver is picking up the correct output mode for the panel.
Comment 20 Emil Andersen Lauridsen 2016-09-08 15:22:50 UTC
(In reply to Ville Syrjala from comment #16)
> I had to resort to a quirk to fix this for all machines. Please test this
> branch:
> 
> git://github.com/vsyrjala/linux.git opregion_panel_type_quirk

Also an affirmative that the quirk based fix works on my Samsung NP900X4D.
Comment 21 Ville Syrjala 2016-09-09 08:46:22 UTC
(In reply to Emil Andersen Lauridsen from comment #20)
> (In reply to Ville Syrjala from comment #16)
> > I had to resort to a quirk to fix this for all machines. Please test this
> > branch:
> > 
> > git://github.com/vsyrjala/linux.git opregion_panel_type_quirk
> 
> Also an affirmative that the quirk based fix works on my Samsung NP900X4D.

I had to revise the patch a bit more to make it actually work on the original machine that needs the quirk:
git://github.com/vsyrjala/linux.git opregion_panel_type_quirk_2

Please test to make sure I didn't fumble anything for other machines. Fingers crossed that this is the last revision...
Comment 22 Emil Andersen Lauridsen 2016-09-09 10:37:46 UTC
(In reply to Ville Syrjala from comment #21)
> (In reply to Emil Andersen Lauridsen from comment #20)
> > (In reply to Ville Syrjala from comment #16)
> > > I had to resort to a quirk to fix this for all machines. Please test this
> > > branch:
> > > 
> > > git://github.com/vsyrjala/linux.git opregion_panel_type_quirk
> > 
> > Also an affirmative that the quirk based fix works on my Samsung NP900X4D.
> 
> I had to revise the patch a bit more to make it actually work on the
> original machine that needs the quirk:
> git://github.com/vsyrjala/linux.git opregion_panel_type_quirk_2
> 
> Please test to make sure I didn't fumble anything for other machines.
> Fingers crossed that this is the last revision...

New branch works here. Nice work.
Comment 23 Robin Müller 2016-09-09 10:44:10 UTC
(In reply to Emil Andersen Lauridsen from comment #22)
> (In reply to Ville Syrjala from comment #21)
> > (In reply to Emil Andersen Lauridsen from comment #20)
> > > (In reply to Ville Syrjala from comment #16)
> > > > I had to resort to a quirk to fix this for all machines. Please test this
> > > > branch:
> > > > 
> > > > git://github.com/vsyrjala/linux.git opregion_panel_type_quirk
> > > 
> > > Also an affirmative that the quirk based fix works on my Samsung NP900X4D.
> > 
> > I had to revise the patch a bit more to make it actually work on the
> > original machine that needs the quirk:
> > git://github.com/vsyrjala/linux.git opregion_panel_type_quirk_2
> > 
> > Please test to make sure I didn't fumble anything for other machines.
> > Fingers crossed that this is the last revision...
> 
> New branch works here. Nice work.

This branch also works on my machine.
Comment 24 oceans112 2016-09-09 12:19:07 UTC
Quite a useless comment, but I can confirm that the latest branch works on np900x4c.

So, can we expect to see this merged into 4.8.1?
Comment 25 yann 2016-09-13 09:37:02 UTC
Patch submitted and under review: https://patchwork.freedesktop.org/series/12380/
Comment 26 yann 2016-09-13 09:42:16 UTC
Resolving as fixed due to re-test feedback and waiting for patch to be merged and tested to close it.
Comment 27 Jani Nikula 2016-09-13 13:11:27 UTC
(In reply to yann from comment #26)
> Resolving as fixed due to re-test feedback and waiting for patch to be
> merged and tested to close it.

Please let's resolve fixed only after the patch has been merged. Thanks.
Comment 28 Jani Nikula 2016-09-14 09:37:06 UTC
Fixed by

commit ea54ff4008892b46c7a3e6bc8ab8aaec9d198639
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Sep 13 12:22:19 2016 +0300

    drm/i915: Ignore OpRegion panel type except on select machines

in drm-intel-fixes.
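
As a rough illustration of what the commit title describes, the selection logic can be pictured as an allowlist check: the panel type reported via OpRegion is only trusted on machines known to need it, and everything else falls back to the panel type from the VBT. The Python sketch below is not the kernel code. The function name, the machine strings and the allowlist entry are hypothetical placeholders, and the real quirk table is not reproduced here.

```python
from typing import Optional, Set

# Hypothetical allowlist; the real list of machines lives in the kernel patch.
OPREGION_PANEL_TYPE_MACHINES: Set[str] = {"example-vendor example-model"}


def choose_panel_type(machine: str,
                      vbt_panel_type: int,
                      opregion_panel_type: Optional[int]) -> int:
    """Prefer the VBT panel type unless the machine is explicitly allowlisted."""
    if machine in OPREGION_PANEL_TYPE_MACHINES and opregion_panel_type is not None:
        return opregion_panel_type
    return vbt_panel_type


if __name__ == "__main__":
    # A machine that is not allowlisted keeps the VBT value, even if the
    # OpRegion reports something else.
    print(choose_panel_type("some other laptop", vbt_panel_type=2,
                            opregion_panel_type=0))
```

With this shape, laptops like the ones in this report would keep using the VBT panel type, as they did with 4.6, while the machine that originally needed the OpRegion value would still get it.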

