Summary: | Horizontal lines in radeon driver on kernel 3.15 and upwards | ||||||
---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | lockheed <qwrules> | ||||
Component: | DRM/Radeon | Assignee: | Default DRI bug account <dri-devel> | ||||
Status: | RESOLVED FIXED | QA Contact: | |||||
Severity: | major | ||||||
Priority: | high | CC: | chris.bainbridge, madeforspam, qwrules, turkal | ||||
Version: | unspecified | ||||||
Hardware: | x86-64 (AMD64) | ||||||
OS: | Linux (All) | ||||||
URL: | https://www.youtube.com/watch?v=nx2-Fvihzxg | ||||||
Description
lockheed
2014-12-24 15:35:20 UTC
Can you isolate the kernel change which introduced the problem with git bisect?

Most likely another problem caused by the PLL rework. I would guess it's one of those patches.

@Michel Dänzer, I can describe the bug in as much detail as I can, but I don't think I have the necessary combination of time and skill to "bisect" a kernel. However, since I gave the specific kernel version in which the error emerges, it should be enough information for someone with more knowledge to find the cause.

Possibly related to https://bugzilla.kernel.org/show_bug.cgi?id=83461

I can confirm this bug:
Laptop HP 6735s, 2x Turion + RS780M video chip.
I happen to have the exact same artefacts with any kernel version higher than 3.13.
It affects the built-in LVDS but NOT the VGA output.
I tested kernels up to 4.4.0 (to no avail).
I don't know what "git bisect" is but eager to learn.

I also dropped a note on https://bugzilla.kernel.org/show_bug.cgi?id=83461
I used the link to Lockheed's video as illustration on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1479136 where I originally filed a bug.

I am in the happy circumstance of being able to dedicate this laptop to any test you want me to throw at it.

(In reply to Thom from comment #5)
> I don't know what "git bisect" is but eager to learn.

"Bisecting" is a way to find out which commit caused a specific regression. This involves compiling the Linux kernel from git and testing the compiled versions. If you can find out which commit is the culprit, chances are pretty good that the problem can be fixed quickly. To learn more about bisecting I suggest searching for "git bisect".

Ok, I did my first bisect. It worked out well, but I encountered something that puzzles me a bit.
Here is the last part of the bisect:

    3.15.0-rc3-00725-g1465967 bad
    Bisecting: 658 revisions left to test after this (roughly 9 steps)
    [e9dba837640d960f56bef22ff08611955ff8a5b4] Merge tag 'pm+acpi-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
    3.15.0-rc2-00219-ge9dba83 bad
    Bisecting: 355 revisions left to test after this (roughly 8 steps)
    [6e66d5dab5d530a368314eb631201a02aabb075d] Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
    3.15.0-rc1-00303-g6e66d5d good
    Bisecting: 176 revisions left to test after this (roughly 8 steps)
    [4d0fa8a0f01272d4de33704f20303dcecdb55df1] Merge tag 'gpio-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
    3.15.0-rc2-00042-g4d0fa8a good
    Bisecting: 99 revisions left to test after this (roughly 7 steps)
    [76e7745e8e4330fdb30f049303d524261c0b7a2c] Merge tag 'zynq-dt-fixes-for-3.15' of git://git.xilinx.com/linux-xlnx into fixes
    3.15.0-rc2-00077-g76e7745 good (how can this be ??)
    Bisecting: 49 revisions left to test after this (roughly 6 steps)
    [92891ed6b1fdb49655f9a071ef2880a567807375] Merge branch 'fixes_for_v3.15' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
    3.15.0-rc2-00092-g92891ed bad
    Bisecting: 22 revisions left to test after this (roughly 5 steps)
    [1aae31c8306e5f1bdeafd87b2cd9e3f0df3709e5] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
    3.15.0-rc2-00069-g1aae31c bad
    Bisecting: 13 revisions left to test after this (roughly 4 steps)
    [7740fc52105c9e6d2beac389a9ae0ce7138cf5ab] Input: soc_button_array - fix a crash during rmmod
    3.14.0-rc4-00065-g7740fc5 good
    Bisecting: 6 revisions left to test after this (roughly 3 steps)
    [3ed9a335cfc64b2c83545f341cdddf2347b12b97] drm/radeon/pm: don't walk the crtc list before it has been initialized (v2)
    3.15.0-rc1-00075-g3ed9a33 bad
    Bisecting: 3 revisions left to test after this (roughly 2 steps)
    [c2fb3094669a3205f16a32f4119d0afe40b1a1fd] drm/radeon: improve PLL limit handling in post div calculation
    3.15.0-rc1-00071-gc2fb309 bad
    Bisecting: 0 revisions left to test after this (roughly 1 step)
    [24315814239a3fdb306244c99bd076bc79db4ade] drm/radeon: use fixed PPL ref divider if needed
    3.15.0-rc1-00070-g2431581 good

    c2fb3094669a3205f16a32f4119d0afe40b1a1fd is the first bad commit
    commit c2fb3094669a3205f16a32f4119d0afe40b1a1fd
    Author: Christian König <christian.koenig@amd.com>
    Date: Sun Apr 20 13:24:32 2014 +0200

        drm/radeon: improve PLL limit handling in post div calculation

        This improves the PLL parameters when we work at the limits of the allowed ranges.

        Signed-off-by: Christian König <christian.koenig@amd.com>

    :040000 040000 5c3ac5ddf911c2c1f8926ecde2d83fdbcd6bb269 4731ceed6e1c149abd6fda6a06318700750f8

So far so good, but what I'm puzzled about is this:
As far as I understand; 3.15.0-rc2-00077-g76e7745 is a later revision (good) than 3.15.0-rc2-00069-g1aae31c (bad) and an earlier revision than 3.15.0-rc2-00092-g92891ed (bad), which doesn't seem to make sense to me.
It is as if someone did a patch to improve on 3.15.0-rc1-00071-gc2fb309 but that it got revoked afterwards, is that possible ?

> As far as I understand; 3.15.0-rc2-00077-g76e7745 is a later revision (good)
> than 3.15.0-rc2-00069-g1aae31c (bad)

This is not correct. The 77/69 does not imply a linear ordering because of forks:

    $ git merge-base --is-ancestor 3.15.0-rc2-00069-g1aae31c 3.15.0-rc2-00077-g76e7745; echo $?
    1

Trust git ;-)

> c2fb3094669a3205f16a32f4119d0afe40b1a1fd is the first bad commit

Not familiar with this code, but from the patch the PLL values are printed out:

    DRM_DEBUG_KMS("%d - %d, pll dividers - fb: %d.%d ref: %d, post %d\n",
                  freq, *dot_clock_p * 10, *fb_div_p, *frac_fb_div_p,
                  ref_div, post_div);

So suggest enabling debug log and compare those two lines from a working and non-working kernel.

It should also be trivial to checkout a recent tag and revert the bad commit (there is a conflict but just delete the avivo_get_fb_ref_div function to resolve it).
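
The point about forks can be illustrated with a toy commit graph. This is only a sketch in Python and not how git is implemented; the commit names and the shape of the graph are made up. It assumes the number in a version string such as 3.15.0-rc2-00077-g76e7745 counts the commits reachable from that commit but not from the tag (the way `git describe` reports it), and shows that a larger count does not make one commit a descendant of the other.

```python
# Toy commit graph: each commit maps to its list of parents.
# All names are hypothetical; "tag" stands in for a release tag.
PARENTS = {
    "tag": [],
    "a1": ["tag"],            # branch A: two commits on top of the tag
    "a2": ["a1"],
    "b1": ["tag"],            # branch B: three commits on top of the tag
    "b2": ["b1"],
    "b3": ["b2"],
    "merge": ["a2", "b3"],    # both branches merged later
}


def reachable(commit):
    """Return the set of commits reachable from `commit`, itself included."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def commits_since(tag, commit):
    """Count commits reachable from `commit` but not from `tag`."""
    return len(reachable(commit) - reachable(tag))


def is_ancestor(older, newer):
    """True if `older` is reachable from `newer`, the same question that
    `git merge-base --is-ancestor older newer` answers with exit code 0."""
    return older in reachable(newer)


if __name__ == "__main__":
    print(commits_since("tag", "a2"), commits_since("tag", "b3"))
    print(is_ancestor("a2", "b3"))
    print(is_ancestor("a2", "merge"))
```

In this toy graph b3 has the higher count (3 versus 2), yet a2 is not one of its ancestors, so a "good" result for b3 says nothing about whether the change in a2 is present.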
(In reply to Chris Bainbridge from comment #8)
> This is not correct. The 77/69 does not imply a linear ordering because of
> forks:
> Trust git ;-)

Thanks for the update, that explains everything. I hardly know git, and before yesterday I didn't even know what git or bisecting was... it's a bit overwhelming.

> > c2fb3094669a3205f16a32f4119d0afe40b1a1fd is the first bad commit
>
> Not familiar with this code, but from the patch the PLL values are printed
> out:
>
> DRM_DEBUG_KMS("%d - %d, pll dividers - fb: %d.%d ref: %d, post
> %d\n",
> freq, *dot_clock_p * 10, *fb_div_p, *frac_fb_div_p,
> ref_div, post_div);

That is like magic :-) How did you get git to give you the source of that patch so quickly? (I googled for hours on this stuff without success.)

> So suggest enabling debug log and compare those two lines from a working and
> non-working kernel.

I assume that I have to enable the debug log via a boot option, because I couldn't find anything in menuconfig that wasn't already marked for inclusion. What boot option do I have to use to enable the right kind (and the right amount) of debug logging? (And after that, where do I find the log output?)

> It should also be trivial to checkout a recent tag and revert the bad commit

I don't even know yet what that is or how to do that, even after reading the man pages about checkout, tag, revert and commit; but I'm convinced I'll get there in the end ;-)

Hmmm.... I'm afraid I have to enable "debug boot parameters" in menuconfig.
What git command do I use to get a specific kernel version's source lined up so I can recompile selected kernels for debugging?
ok, some results:

PLL readings on good working compilations:

    3.15.0-rc1-00303-g6e66d5d
    [drm:radeon_compute_pll_avivo] 69300 - 6949, pll dividers - fb: 165.0 ref: 2, post 17

    3.15.0-rc2-00042-g4d0fa8a
    [drm:radeon_compute_pll_avivo] 69300 - 6930, pll dividers - fb: 329.1 ref: 4, post 17

    3.15.0-rc2-00077-g76e7745
    [drm:radeon_compute_pll_avivo] 69300 - 6930, pll dividers - fb: 329.1 ref: 4, post 17

    3.15.0-rc1-00070-g2431581
    no output, system hangs loading the driver in debug mode (probably because this one didn't have the patch yet); works ok when not in debug mode.

PLL readings on bad compilations (noise/artefacts):

    3.15.0-rc2-00069-g1aae31c
    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14

    3.15.0-rc1-00071-gc2fb309
    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14

    3.15.0-rc1-00075-g3ed9a33
    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14

    3.15.0-rc2-00092-g92891ed
    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14

Problem is: I haven't the slightest clue what it all means.

(In reply to Chris Bainbridge from comment #8)
> So suggest enabling debug log and compare those two lines from a working and
> non-working kernel.

Done (see previous message) :-)

> It should also be trivial to checkout a recent tag and revert the bad commit

Done :-) Reverted the bad commit on current 4.6.0-rc6+ and tested, and it worked like a charm!! No display problems anymore.

> (there is a conflict but just delete the avivo_get_fb_ref_div function to
> resolve it).

I did, and thanks to your directions it all worked out perfectly :-)

This might be https://bugzilla.kernel.org/show_bug.cgi?id=75241 - there is a one-line patch there from Christian König but it doesn't look like it was ever merged.
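
The PLL readings listed above can be cross-checked against the formula commented in radeon_compute_pll_avivo, dot_clock = (ref_freq * feedback_div) / (ref_div * post_div). What follows is a minimal Python sketch, not the kernel's code. The reference frequency is not printed in the log, so REF_FREQ = 1432 (in 10 kHz units, i.e. 14.32 MHz) is an assumption, picked because it reproduces the logged clocks; the helper names are hypothetical.

```python
import re

# Assumed reference clock in 10 kHz units (14.32 MHz). The debug log does
# not print it; the value is inferred because it reproduces the readings.
REF_FREQ = 1432

# Matches the divider part of the "pll dividers" debug line from the thread.
LINE_RE = re.compile(
    r"fb: (?P<fb>\d+)\.(?P<frac>\d) ref: (?P<ref>\d+), post (?P<post>\d+)"
)


def parse_dividers(line):
    """Return (fb_div in tenths, ref_div, post_div) from one debug line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("no PLL dividers found in line")
    fb_x10 = int(match.group("fb")) * 10 + int(match.group("frac"))
    return fb_x10, int(match.group("ref")), int(match.group("post"))


def dot_clock(fb_x10, ref_div, post_div, ref_freq=REF_FREQ):
    """dot_clock = (ref_freq * feedback_div) / (ref_div * post_div),
    in 10 kHz units; the feedback divider is passed in tenths."""
    return (ref_freq * fb_x10) // (ref_div * post_div * 10)


if __name__ == "__main__":
    good = ("[drm:radeon_compute_pll_avivo] 69300 - 6930, "
            "pll dividers - fb: 329.1 ref: 4, post 17")
    bad = ("[drm:radeon_compute_pll_avivo] 69300 - 69290, "
           "pll dividers - fb: 135.5 ref: 2, post 14")
    for line in (good, bad):
        dividers = parse_dividers(line)
        print(dividers, dot_clock(*dividers))
```

With that assumed reference clock, the three divider sets in the readings work out to 6949, 6930 and 6929, which matches the logged values (the kernels containing the commit print the last one multiplied by 10, as 69290). By this arithmetic the good and bad settings all land within roughly 0.3% of the requested 69300 kHz, so the resulting frequency alone does not seem to separate them.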
(In reply to Chris Bainbridge from comment #13)
> This might be https://bugzilla.kernel.org/show_bug.cgi?id=75241 - there is
> one line patch there from Christian König but it doesn't look like it was
> ever merged.

I did a git fetch origin, git reset --hard origin/master to get a plain unaltered current kernel again (4.6.0-rc7+).

I changed the one line in ./drivers/gpu/drm/radeon/radeon_display.c:

    fb_div_max = pll->max_feedback_div;

to:

    fb_div_max = min(pll->max_feedback_div, 512u);

according to https://bugzilla.kernel.org/attachment.cgi?id=142281 (linked from https://bugzilla.kernel.org/show_bug.cgi?id=75241), and compiled (make && make modules_install install).

Assuming that I did not make a mistake or overlook something, this patch didn't work: lots of noise/artefacts.
Timings seem identical to the other "bad" compilations, i.e. nothing changed (boot param drm.debug=4):

    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14

Too bad, but it was absolutely worth a try.

I wonder if "fb" and "post" are consistently too low.... is that possible?

ok, I created a variation of the one-liner patch that works without reverting any of the existing code:

This patch prevents fb from going lower than 140, preventing noise/snow on the display (for RS780M + LVDS).

diff:

    @@ void radeon_compute_pll_avivo(struct radeon_pll *pll,
     	/* determine allowed feedback divider range */
    -	fb_div_min = pll->min_feedback_div;
    +	fb_div_min = max(pll->min_feedback_div, 140u);
     	fb_div_max = pll->max_feedback_div;

     	if (pll->flags & RADEON_PLL_USE_FRAC_FB_DIV) {
     		fb_div_min *= 10;

results in:

    [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 271.0 ref: 4, post 14

This "works for me (TM)".

But it would be good if someone could check that there are no "unforeseen consequences" to this patch. I don't know much about GPU stuff and I am not familiar with the code.
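
To see why the floor changes the outcome, here is a small Python sketch of the range check that the patch touches. It is an illustration only, not the kernel's divider search. HW_MIN_FB and HW_MAX_FB are placeholder limits (the real pll->min_feedback_div and pll->max_feedback_div values are not given in this thread), and the sketch assumes that both limits are scaled by 10 when fractional feedback dividers are in use, as the quoted code does for fb_div_min.

```python
# Placeholder hardware limits; the real values live in the driver's pll
# struct and are not stated in this thread.
HW_MIN_FB = 4
HW_MAX_FB = 2047


def fb_range(min_fb, max_fb, floor=None, frac=True):
    """Allowed feedback divider range, in tenths when `frac` is set.

    `floor` mimics fb_div_min = max(pll->min_feedback_div, 140u).
    """
    low = max(min_fb, floor) if floor is not None else min_fb
    high = max_fb
    if frac:  # fractional dividers are handled as integer tenths
        low *= 10
        high *= 10
    return low, high


def allowed(fb_x10, limits):
    """True if a feedback divider (in tenths) lies inside `limits`."""
    low, high = limits
    return low <= fb_x10 <= high


if __name__ == "__main__":
    stock = fb_range(HW_MIN_FB, HW_MAX_FB)
    patched = fb_range(HW_MIN_FB, HW_MAX_FB, floor=140)
    print(allowed(1355, stock), allowed(1355, patched))  # fb 135.5
    print(allowed(2710, patched))                         # fb 271.0
    # Doubling fb and ref together leaves the fb/ref ratio unchanged.
    print(1355 * 4 == 2710 * 2)
```

fb 135.5 falls below the 140 floor, so the search has to settle on a different combination. fb 271.0 with ref 4 has the same fb/ref ratio as fb 135.5 with ref 2, which would explain why the reported dot clock stays at 69290 even though the dividers differ.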
(and yes I know: hardcoding values is definitely "not done")

fb lower than 140 is possible; my current stock kernel 3.13.0-86 works flawlessly:

    [drm:radeon_compute_pll_avivo], 6928, pll dividers - fb: 125.8 ref: 2, post 13

(sigh) I just wish I understood why some modes work and some don't.

Christian König posted an explanation of the PLL divider values at https://bugzilla.kernel.org/show_bug.cgi?id=91861#c12 (another "no screen after 3.15" bug report). The various fixes adjust the divider value limits slightly for different displays. The basic formula is commented in the radeon_compute_pll_avivo function:

    dot_clock = (ref_freq * feedback_div) / (ref_div * post_div)

So by adjusting the limits of those values you can find something that works for your laptop display. But I don't know which solution is technically correct - if you don't get a reply here you could try emailing Christian König and asking.

(In reply to Chris Bainbridge from comment #17)
> if you don't get a reply here you could try emailing Christian
> König and asking.

I did, and Christian responded almost instantly, so I will be busy for quite a while with testing.

Don't close this bug yet.... work in progress :-)

Patch submitted by Christian König:
https://lists.freedesktop.org/archives/dri-devel/2016-June/110724.html
This solved the bug.

Thanks everyone for all the help.

I have this same problem after an upgrade from Ubuntu 14.04 LTS to 16.04 LTS:

    Linux DV7 4.6.4-040604-generic #201607111332 SMP Mon Jul 11 17:34:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

    01:05.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RS780M [Mobility Radeon HD 3200] [1002:9612] (prog-if 00 [VGA controller])

I noticed that a patch was submitted. Can I expect to see this in a future kernel, or perhaps an RC version after my 4.6.4 kernel?

-------

Gilbert, same with me, also ubuntu 14.04 -> 16.04.

The patch is already in the 4.7+ kernel tree so it should be in the first 4.7 kernel (pre) release.
I'm not familiar with ubuntu's kernel policy and I also don't know anyone who is, but I guess that the 4.7 kernel will land in 16.10 or 17.04.
Best to ask the Ubuntu Kernel team.

addendum:
https://github.com/torvalds/linux/commit/9ef8537e68941d858924a3eacee5a1945767cbab
i.e. kernel 4.7-rc4 and up

(In reply to Thom from comment #21)
> Gilbert, same with me, also ubuntu 14.04 -> 16.04.
>
> The patch is already in the 4.7+ kernel tree so it should be in the first
> 4.7 kernel (pre) release.
>
> I'm not familiar with ubuntu's kernel policy and I also don't know anyone
> who does but I guess that the 4.7 kernel will land in 16.10 or 17.04.
> Best to ask the Ubuntu Kernelteam.

Thank you for the informative information. I'll probably stay on the LTS 16.04, but as soon as I get wind of the release of kernel 4.7+ I will install it. I was able to get my system working properly by reverting to kernel 3.13.0-92-generic. Here's a link to a discussion I found which states that users who upgraded may use older kernels from 12.04 and 14.04 on 16.04, even if not supported.
http://askubuntu.com/questions/776910/install-old-kernel-in-ubuntu-16-04/801847#801847

(In reply to Thom from comment #21)
> Gilbert, same with me, also ubuntu 14.04 -> 16.04.
>
> The patch is already in the 4.7+ kernel tree so it should be in the first
> 4.7 kernel (pre) release.
>
> I'm not familiar with ubuntu's kernel policy and I also don't know anyone
> who does but I guess that the 4.7 kernel will land in 16.10 or 17.04.
> Best to ask the Ubuntu Kernelteam.

I just installed the new kernel 4.7.0-040700-generic but it didn't fix the display problem. Reverting back to 3.13.0-92-generic. :(

AFAIK the patch has been in since 4.7-RC4. Could it be that your version is older?
See also: https://github.com/torvalds/linux/commit/9ef8537e68941d858924a3eacee5a1945767cbab