Summary: | Macbook pro 11,5 screen flicker when AC adapter plugged in
---|---
Product: | DRI
Component: | DRM/Radeon
Status: | RESOLVED MOVED
Severity: | normal
Priority: | medium
Version: | unspecified
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Reporter: | Tom B <tom>
Assignee: | Default DRI bug account <dri-devel>
CC: | bugs.freedesktop, ed, paulgier, stesen

Description
Tom B
2016-11-29 09:50:38 UTC
Update: After suspending the machine overnight and booting it this morning, the `battery` setting is still active and, upon resume, seems to have had an effect on temperature. I'm now seeing 42C on radeon-pci-0100 after 20 minutes on the desktop, which is far more sane. Previously the temperature was always around 60C, although radeon_pm_info reports the same numbers.

The suspend/resume cycle seems to have forced the `battery` setting, fixing both the temperatures and the flicker issue. Could the flicker issue be due to overheating/overclocking? I've seen similar graphical artifacts/corruption in games when overclocking GPUs too high.

Please attach your dmesg output and xorg log. Is this system a hybrid laptop with 2 GPUs (integrated and discrete)?

Created attachment 128266 [details]
dmesg output
Created attachment 128267 [details]
Xorg log
dmesg and xorg attached. Yes, this has dual GPUs: an onboard Intel on the i7 4870HQ and a Radeon M370X.

lspci | grep VGA output:
00:02.0 VGA compatible controller: Intel Corporation Crystal Well Integrated Graphics Controller (rev 08)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X] (rev 83)

glxinfo | grep OpenGL output:
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD CAPE VERDE (DRM 2.46.0 / 4.8.10-1-ARCH, LLVM 3.9.0)
OpenGL core profile version string: 4.3 (Core Profile) Mesa 13.0.1
OpenGL core profile shading language version string: 4.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 13.0.1
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 13.0.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:

The Radeon card is in use. I can switch to the Intel card using the gpu-switch utility ( https://aur.archlinux.org/packages/gpu-switch/ ), but the external HDMI and DP ports are connected to the Radeon card, and I need to use external monitors frequently. When the Intel GPU is used, there is no flicker issue.

Can you bisect, or at least narrow down the kernel version which introduced the issue?

The issue began at version 4.8.7, as also mentioned here: https://aur.archlinux.org/packages/linux-macbook/

I did a bit more digging to see what the connection is with the power cable, and the cable itself does not seem to be the cause. The flicker happens only when the power cable is connected AND the power mode is performance (or balanced; see the note at the bottom of this post). But power_dpm_state seems to be ignored unless the power cable is plugged in.
To test this I ran unigine-heaven with default settings:

power cable + battery dpm state = 4fps
power cable + performance dpm state = 8fps
no power cable + battery dpm state = 4fps
no power cable + performance dpm state = 4fps

So the last result tells us that having the power cable unplugged forces "battery" mode regardless of whether "performance" is set in power_dpm_state. The power cable itself is a bit of a red herring: it's the performance dpm state which causes the flicker, and having the power cable connected is the only way to get the GPU into performance mode.

(These numbers seem rather low for an M370X GPU, since it apparently gets 35 in Windows; see http://www.mobiletechreview.com/notebooks/15-inch-Retina-MacBook-Pro-2015.htm . I'm not sure how the radeon driver stacks up, and I'm not really bothered about that as I don't do any GPU-intensive work, but the performance may highlight clock speed issues.)

Note on "balanced": since "balanced" always seems to cause the flicker and high temperatures when enabled, it suggests that it's being rather over-zealous with its clock speeds. Forcing "battery" has no noticeable impact on performance in desktop applications. I'm not sure how it's measured, but "balanced" would probably be better with a different threshold.

If you can't or don't want to bisect, there are only 4 radeon driver commits between 4.8.6 and 4.8.7, so it shouldn't take long to try manually reverting each of those. https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.8.y&id=e136de5d733161fdfd203f23b448434170d189ea seems like a good candidate, since it's clock related and explicitly references your GPU in the code.

I've never done that before, so give me some time and I'll try it. Is there any command I can run to work out what my rdev->pdev->revision is?

I'm also struggling to find any official information on clock speeds for the M370X. I'll keep looking; hopefully it's on the AMD site somewhere!
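On the question of finding the PCI device and revision IDs: `lspci -n` prints them numerically, and the revision there should correspond to what the driver sees as rdev->pdev->revision. Below is a minimal sketch, not part of the original thread, that pulls the IDs out of one such line. The function name `parse_lspci_n` and the regular expression are illustrative assumptions; the sample line is the one reported for this machine further down the thread.

```python
import re

# Matches one numeric lspci line, for example:
#   01:00.0 0300: 1002:6821 (rev 83)
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>[0-9a-fA-F]{4}):\s+"
    r"(?P<vendor>[0-9a-fA-F]{4}):(?P<device>[0-9a-fA-F]{4})"
    r"(?:\s+\(rev (?P<rev>[0-9a-fA-F]{2})\))?"
)


def parse_lspci_n(line):
    """Return slot, vendor, device and revision from one `lspci -n` line."""
    m = LSPCI_N_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised lspci -n line: {line!r}")
    rev = m.group("rev")
    return {
        "slot": m.group("slot"),
        "vendor": int(m.group("vendor"), 16),
        "device": int(m.group("device"), 16),
        # lspci leaves out "(rev xx)" when the revision is 00
        "revision": int(rev, 16) if rev is not None else 0,
    }


if __name__ == "__main__":
    info = parse_lspci_n("01:00.0 0300: 1002:6821 (rev 83)")
    print(f"device=0x{info['device']:04x} revision=0x{info['revision']:02x}")
```

The hex values can then be compared directly against the device and revision checks in the commit under suspicion.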
Apologies for repeated comments. According to http://www.anandtech.com/show/9276/2015-15inch-retina-macbook-pros-dgpu-r9-m370x-is-cape-verde (someone else running the same GPU on Windows with the Catalyst driver), the clock speeds are:

GPU: 800 MHz
Memory: 1125 MHz

I'm not sure how that correlates to the speeds in the patch you listed:

+ max_sclk = 75000;
+ max_mclk = 80000;

What is the difference between sclk and mclk? If mclk is the memory clock (as suggested here: http://askubuntu.com/questions/569085/radeon-pm-info-what-are-vclk-dclk-sclk-mclk-vddc-and-vddci ), that looks way off. If it's the GPU clock, then 80000 (assuming 80000 means 800 MHz) is correct, but then I don't know what the memory clock is or how to work it out.

radeon_pm_info does show seemingly correct information that tallies with the patch above:

uvd vclk: 0 dclk: 0
power level 0 sclk: 30000 mclk: 30000 vddc: 900 vddci: 850 pcie gen: 3

It shows the same in both battery and performance mode and seems to be working correctly. In performance mode, if I run a GPU-intensive program such as unigine-heaven, the power level changes as expected:

uvd vclk: 0 dclk: 0
power level 4 sclk: 75000 mclk: 80000 vddc: 1025 vddci: 900 pcie gen: 3

Interestingly, when the GPU is running at power level 4, there's no flicker! So it seems to be an issue with power level 0. Despite the clocks showing the same on both "battery" and "performance" mode at power level 0, the flicker (and additional heat) only happens in "performance" mode. So the flicker is caused by some combination of the "performance" dpm state and power level 0, even though the clock speeds seem the same.

I can confirm this bug does not affect the 4.8.6 kernel on the MacBook Pro 11,5. I built the 4.8.6 kernel this morning and do not have the flickering problem that I had from 4.8.7 onward. I did notice a very big jump in GPU performance, though, from 4.7.0 to 4.8.6 (around 35-40% improvement on an OpenGL benchmark) on the MacBook Pro 11,5 with the radeon driver.
I did find that the GPU fan was at full speed on the initial first boot on 4.8.6+, noticeably very noisy, but it quietened down after some time.

While I'm not experiencing any flickering on 4.8.6, I have noticed some subtle screen tearing on Gnome Shell transitions, such as when you hit the super key. In the past, with the fglrx driver, there was an option in the ATI/AMD configuration manager to address this tearing. I only noticed the tearing because I was looking for the flickering that this bug mentions.

(In reply to Michel Dänzer from comment #10)
> If you can't or don't want to bisect, there are only 4 radeon driver commits
> between 4.8.6 and 4.8.7, so it shouldn't take long to try manually reverting
> each of those.
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/
> ?h=linux-4.8.y&id=e136de5d733161fdfd203f23b448434170d189ea seems like a good
> candidate, since it's clock related and explicitly references your GPU in
> the code.

Hi,

I have reverted this commit on a 4.8.14 and the flickering stopped.

C.

(In reply to Cédric Le Goater from comment #14)
> Hi,
>
> I have reverted this commit on a 4.8.14 and the flickering stopped.
>
> C.

What chip do you have (pci device id and revision id)?

>
> What chip do you have (pci device id and revision id)?
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X] (rev 83) (prog-if 00 [VGA controller])
Subsystem: Apple Inc. Radeon R9 M370X Mac Edition
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 256 bytes
Interrupt: pin A routed to IRQ 45
Region 0: Memory at 80000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at b0c00000 (64-bit, non-prefetchable) [size=256K]
Region 4: I/O ports at 3000 [size=256]
Expansion ROM at b0c40000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: radeon
Kernel modules: radeon
so this is a CHIP_VERDE revision 0x83

(In reply to Cédric Le Goater from comment #17)
> so this is a CHIP_VERDE revision 0x83

(In reply to Cédric Le Goater from comment #14)
> (In reply to Michel Dänzer from comment #10)
> > If you can't or don't want to bisect, there are only 4 radeon driver commits
> > between 4.8.6 and 4.8.7, so it shouldn't take long to try manually reverting
> > each of those.
> > https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/
> > ?h=linux-4.8.y&id=e136de5d733161fdfd203f23b448434170d189ea seems like a good
> > candidate, since it's clock related and explicitly references your GPU in
> > the code.
>
> Hi,
>
> I have reverted this commit on a 4.8.14 and the flickering stopped.
>
> C.

Having a look at the diff, the new code actually configures:

} else if (rdev->family == CHIP_VERDE) {
+ if ((rdev->pdev->revision == 0x81) ||
+ (rdev->pdev->revision == 0x83) ||
...
+ (rdev->pdev->device == 0x6821) ||
...
+ (rdev->pdev->device == 0x682B)) {
+ max_sclk = 75000;
+ max_mclk = 80000;
+ }

So on my MacBook Pro 11,5 the device ID and revision are:

01:00.0 0300: 1002:6821 (rev 83)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X] (rev 83)

So, since this commit, max_sclk and max_mclk have been set to 75000 and 80000 for this GPU. In previous versions of this driver module, this specific GPU was being skipped. I think these values have been incorrectly set for this GPU. According to these specifications for the M370X Mac chip, http://gpuboss.com/graphics-card/Radeon-R9-M370X-Mac, the two values max_sclk and max_mclk are probably:

Clock speed 775 MHz
Turbo clock speed 800 MHz

So we are setting this stuff to run possibly 25 MHz out of sync with the actual GPU clock.
I'm guessing this would be subtle enough to cause the flickering we're seeing. Perhaps it should be something like this:

} else if (rdev->family == CHIP_VERDE) {
    if (rdev->pdev->device == 0x6821 && rdev->pdev->revision == 0x83) {
        max_sclk = 77500;
        max_mclk = 80000;
    } else if other conditions

In general, though, the new block of device and revision checks is VERY loose and not very well thought out. The OR conditionals are too far reaching. This GPU is matched in two different sections, and even the device ID or the revision alone is enough to modify the aforementioned values.

I might compile 4.9.0 tonight to try this theory out and set max_sclk to 77500. Perhaps the best actual solution is to not include this device and revision in the dpm quirks at all, as it was previously omitted and was never an actual problem.

I haven't figured out how to determine the actual GPU frequency right now, but if we can confirm it's running at a stock speed of 775 MHz, that would give me greater confidence in testing this idea out.

In hindsight the new code is not as bad as I thought, as it is conditional on the family type and does simplify things. Building a 4.9 kernel with a new diff now.

Created attachment 128481 [details] [review]
Setting correct core and memory clock for M370X in MBP 11,5

No screen flickering after testing out this patch I made on the v4.9 branch. I confirmed the GPU configuration on this page: https://en.wikipedia.org/wiki/AMD_Radeon_Rx_300_series. Set mclk and sclk accordingly, assuming that mclk is for memory and sclk is for the GPU clock.

Noticeably everything is running very nicely. No strange behaviour so far and both GPU and CPU temps are OK.

In addition, my CPU turbo boost is also working, which it previously wasn't on 3.8.6. Test at your own risk.

I've still not noticed any problems with this patch. I accidentally left the top two git diff lines at the top of the patch file; this will probably cause the patch command to fail if you don't remove them.
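For anyone comparing the figures above, here is a minimal sketch of the unit arithmetic. It is not taken from any of the attached patches. It assumes the driver's clock values are in units of 10 kHz, which is consistent with the thread's reading of 80000 as 800 MHz and of 77500 as 775 MHz. The helper names `to_mhz`, `from_mhz` and `parse_power_level` are hypothetical.

```python
import re

# Assumption: sclk/mclk values are expressed in units of 10 kHz,
# so 100 driver units correspond to 1 MHz.
UNITS_PER_MHZ = 100


def to_mhz(driver_value):
    """Convert a driver clock value (assumed 10 kHz units) to MHz."""
    return driver_value / UNITS_PER_MHZ


def from_mhz(mhz):
    """Convert MHz to the driver's assumed 10 kHz units."""
    return round(mhz * UNITS_PER_MHZ)


# Matches the "power level" line as pasted from radeon_pm_info in this thread.
POWER_LEVEL = re.compile(r"power level (\d+)\s+sclk: (\d+)\s+mclk: (\d+)")


def parse_power_level(line):
    """Return the power level and clocks in MHz, or None if the line differs."""
    m = POWER_LEVEL.search(line)
    if m is None:
        return None
    level, sclk, mclk = (int(g) for g in m.groups())
    return {"level": level, "sclk_mhz": to_mhz(sclk), "mclk_mhz": to_mhz(mclk)}


if __name__ == "__main__":
    line = "power level 4 sclk: 75000 mclk: 80000 vddc: 1025 vddci: 900 pcie gen: 3"
    print(parse_power_level(line))
    print(from_mhz(775))
```

Under that assumption, the quirk's max_sclk of 75000 is a 750 MHz cap, and the 77500 proposed above corresponds to the 775 MHz figure from the spec page.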
Created attachment 128780 [details] [review]
fix

This patch should fix it.

Running this patch on Linux kernel version 4.9.1 fixes the problem for me.

The patch in comment 22 works for me. Running Fedora 24 with the 4.8.16 kernel on a MacBookPro11,5. Fedora Copr repo available here: https://copr.fedorainfracloud.org/coprs/pgier/macbook-kernel/

(In reply to berg from comment #20)
> Created attachment 128481 [details] [review] [review]
> Setting correct core and memory clock for M370X in MBP 11,5
>
> No screen flickering after testing out this patch I made on the v4.9 branch.
> I confirmed the GPU configuration on this page
> https://en.wikipedia.org/wiki/AMD_Radeon_Rx_300_series. Set mclk and sclk
> accordingly; assuming that mclk is for memory and sclk is for the GPU clock.
>
> Noticeably everything is running very nicely. No strange behaviour so far
> and both GPU and CPU temps are OK.
>
> In addition. My CPU turbo boost is also working, which it previously wasn't
> on 3.8.6. Test at your own risk.

Hi berg, thanks for your attachment. I am new to the Linux kernel, and I don't know how to apply your attachment, i.e. how to compile a single radeon.ko.

I found a radeon.ko under "/lib/modules/4.8.0-36-generic/kernel/drivers/gpu/drm/radeon", so I want to compile it, like here: https://bugzilla.kernel.org/show_bug.cgi?id=105051#c37

I modified si_dpm.c and wrote a Makefile:

obj-m += si_dpm.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

But when I run make I get:

/si_dpm.c:24:18: fatal error: drmP.h: No such file or directory
compilation terminated.

So could you tell me how to compile a single .ko, or point me to a tutorial I can follow, or do I have to compile the whole kernel? Thanks!

I can now compile a single radeon module, ref: http://www.codewhirl.com/2012/04/how-to-compile-a-single-module-in-ubuntu-linux/

However, neither attachment 128481 [details] [review] nor attachment 128780 [details] [review] works for me.
The screen still flickers whether the AC adapter is plugged in or not. I am on a MacBookPro 11,5 with Ubuntu 16.04.

uname -a:
4.8.0-36-generic #36~16.04.1-Ubuntu SMP Sun Feb 5 09:39:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

tail /var/log/Xorg.0.log:
[ 877.072] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 52285 < target_msc 52286
[ 877.222] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 52294 < target_msc 52295
[ 878.838] (WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 52391 < target_msc 52392

Can anyone help? Thanks!

It seems that if I boot without the AC adapter plugged in, there is no flicker problem. Plugging in AC after I have logged in is OK. But when I suspend and wake up with AC plugged in, the screen flicker happens. Any idea?

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/759.