Bug 105725

Summary: WARNING: CPU: 0 PID: 487 at drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_base.c:64 dal_gpio_open_ex+0xc/0x30 [amdgpu]
Product: DRI Reporter: hjpriester
Component: DRM/AMDgpu Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium CC: higuita, hjpriester, marc, petrcvekcz, stfn+freedesktop
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Dmesg output with the two "WARNINGS of admgpu" none
dmesg of drm-next-4.17-wip (date of clone 2018-03-29) none
Suggested fix with a semaphore none

Description hjpriester 2018-03-24 08:18:36 UTC
Created attachment 138331 [details]
Dmesg output with the two "WARNINGS of admgpu"

Using Linux kernel 4.15.11, I get two warnings during boot.
I am using the "amdgpu.dc=1" kernel parameter to get sound.
I have an RX550 video card.

Could these messages be related to a "hang" I sometimes have?
Comment 1 Harry Wentland 2018-03-26 14:26:27 UTC
I'm not sure, but I think they could be related to a hang.

A couple questions:
 * Do you see a hang at boot, or in a different scenario?
 * What display are you using?
 * What's your distribution?
 * Are you able to try the drm-next-4.17-wip branch from https://cgit.freedesktop.org/~agd5f/linux/ ?
Comment 2 hjpriester 2018-03-26 14:40:28 UTC
Is there a guide to building the 4.15.x kernel using the drm-next-4.17-wip branch?
I can try to compile/test it next weekend.

The problem I have is that X11 hangs when I display a certain PNG with "imagemagick display".
The screen is no longer updated and only a reboot helps.

When I run the program using strace the last system call displayed is:

 ioctl(6, DRM_IOCTL_AMDGPU_WAIT_CS. 

That is why I am thinking it might have to do with the amdgpu.
Comment 3 Harry Wentland 2018-03-27 19:06:15 UTC
The warnings are likely not related to the hang in this case.

Installing or building the kernel might depend on your distribution. You should be able to find a guide by googling "building a custom kernel <distroname>".

In general you might want to
 * mkdir mybuilddir && cd mybuilddir
 * git clone git://people.freedesktop.org/~agd5f/linux
 * cd linux
 * cp /boot/config-<your-current-config> .config
 * if you're on ubuntu/debian:
   * make deb-pkg
   * sudo dpkg -i <path to custom kernel .deb from your install>
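
The steps above can be strung together roughly as follows. This is a minimal sketch for Debian/Ubuntu only, not a tested build recipe: the `run` and `build_kernel` helper names and the `DRY_RUN` switch are made up for illustration, the config path `/boot/config-$(uname -r)` is an assumption about where the distribution keeps the running kernel's config, and the `make olddefconfig` step is an addition (it fills in defaults for options that are new in the cloned tree). Setting `DRY_RUN=1` only prints the commands.

```shell
# Print each command; execute it only when DRY_RUN is not 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Clone the given branch (default: drm-next-4.17-wip), reuse the running
# kernel's config, and build Debian packages.
build_kernel() {
    branch=${1:-drm-next-4.17-wip}
    run git clone -b "$branch" "git://people.freedesktop.org/~agd5f/linux" || return 1
    run cd linux || return 1
    # Assumed location of the current kernel config (Debian/Ubuntu layout).
    run cp "/boot/config-$(uname -r)" .config || return 1
    # Accept defaults for any options the old config does not know about.
    run make olddefconfig || return 1
    run make -j"$(getconf _NPROCESSORS_ONLN)" deb-pkg || return 1
    echo "Done. The .deb packages are written to the parent directory;"
    echo "install the linux-image package with: sudo dpkg -i <file>"
}
```

Calling it as `( DRY_RUN=1 build_kernel )` shows the command sequence without touching the system; the subshell keeps the `cd` from changing the caller's working directory.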
Comment 4 hjpriester 2018-03-30 17:26:37 UTC
Harry,

I have built a kernel using:
 
    git clone -b drm-next-4.17-wip git://people.freedesktop.org/~agd5f/linux

I did the clone at about 2018-03-29 20:00 UTC.

The warnings are still there.
root@hjp2:~# dmesg | grep -i Warning
[    0.000000] ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20180105/tbfadt-658)
[    5.351226] WARNING: CPU: 1 PID: 491 at drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_base.c:64 dal_gpio_open_ex+0xc/0x30 [amdgpu]
[    5.353376] WARNING: CPU: 1 PID: 491 at drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_service.c:487 dal_ddc_open+0x2c/0xd0 [amdgpu]

I never get the hang during a boot. I will report that as a separate bug.

The display I am using is an IIYama Prolite XUB2792QSU connected via DisplayPort.
It is a 27" 2560x1440 monitor.
The distribution is Slackware-current ("pre 15.0").
Comment 5 hjpriester 2018-03-30 17:27:38 UTC
Created attachment 138448 [details]
dmesg of  drm-next-4.17-wip (date of clone 2018-03-29)
Comment 6 higuita 2018-06-22 02:08:40 UTC
Running 4.17.1 on an Asus RX480, I also get this.

I do not see any hang or problem due to this warning/oops.
Comment 7 Marc Thomas 2018-07-30 14:39:15 UTC
I also see the WARNING messages in 4.17.10 and 4.18-rc6:

# grep WARNING dmesg.4.18.0-rc6
[    9.167769] WARNING: CPU: 4 PID: 1133 at drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_base.c:64 dal_gpio_open_ex+0xc/0x30 [amdgpu]
[    9.184735] WARNING: CPU: 4 PID: 1133 at drivers/gpu/drm/amd/amdgpu/../display/dc/gpio/gpio_service.c:488 dal_ddc_open+0x31/0xe0 [amdgpu]
[    9.246097] WARNING: CPU: 10 PID: 1134 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:88 dm_dp_aux_transfer+0xa5/0xb0 [amdgpu]
[    9.411094] WARNING: CPU: 4 PID: 1133 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:88 dm_dp_aux_transfer+0xa5/0xb0 [amdgpu]
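
When comparing listings like the one above across kernel versions, it can help to collapse the WARNING lines to one entry per source location. The following is a minimal sketch; `summarize_warnings` is a hypothetical helper name, and the pattern assumes the line layout shown in this report (`... at <file>:<line> <function>+<offset> ...`).

```shell
# Read dmesg-style text on stdin and print "count file:line function",
# most frequent first. Only kernel "WARNING: CPU:" lines are considered,
# so lines such as "ACPI BIOS Warning" are ignored.
# Usage: dmesg | summarize_warnings
summarize_warnings() {
    grep 'WARNING: CPU:' |
        sed -E 's/.* at ([^ ]+:[0-9]+) ([A-Za-z0-9_]+)\+.*/\1 \2/' |
        sort | uniq -c | sort -rn
}
```

Fed the four lines above, it would report the amdgpu_dm_mst_types.c:88 warning with a count of 2 and the two GPIO warnings with a count of 1 each.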


This is on a Ryzen 2700X based system with an RX480 GPU, connected to an IIyama ProLite XB2380HS via DVI.

I'm also running Slackware-current Linux, so I wonder whether it's something in Slackware's kernel config.

PS: This may be the same as Bug 106164.
Comment 8 Marc Thomas 2018-08-02 10:53:40 UTC
After some testing, experimentation, and scouring the internet, I've found that blacklisting the modules gpio_amdpt (gpio-amdpt.ko) and pinctrl_amd (pinctrl-amd.ko) greatly reduces the likelihood of the gpio_base.c:64 and gpio_service.c:488 dmesg warnings. I suspect some kind of race condition during boot, as the modules may load in any order.

In Slackware-current I think pinctrl_amd was compiled in, so I had to change the kernel config and recompile:

-CONFIG_PINCTRL_AMD=y
+CONFIG_PINCTRL_AMD=m

The blacklist was achieved by creating this file:
root@deepthought:~# cat /etc/modprobe.d/gpio_amdpt.conf 
blacklist gpio_amdpt
blacklist pinctrl_amd
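
A modprobe blacklist only affects loadable modules, which is why the config change from `=y` to `=m` above was needed first. A minimal sketch for checking how an option is set in a given kernel config file follows; `config_state` is a hypothetical helper name, and the config file path in the usage note is an assumption that depends on the distribution.

```shell
# Print how a kernel option is configured in a config file:
#   y = built in (a modprobe.d blacklist has no effect)
#   m = loadable module (can be blacklisted)
#   n = not set or absent
config_state() {
    option=$1
    config_file=$2
    value=$(sed -n "s/^${option}=\([ym]\)\$/\1/p" "$config_file" | head -n 1)
    echo "${value:-n}"
}
```

For example, `config_state CONFIG_PINCTRL_AMD "/boot/config-$(uname -r)"` would print `y` before the change described above and `m` after it, assuming the config file lives at that path.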


NB: I found that once I'd had these kernel warnings I needed to reset the hardware (front panel reset button or power cycle) to clear them, even with the blacklist in place. A reboot didn't seem to be enough.

Finally, I do still get this warning (in 4.18-rc6 & 4.17.10), but I think the cause is something else:

root@deepthought:~# dmesg | grep WARNING
[    9.260098] WARNING: CPU: 15 PID: 1138 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:88 dm_dp_aux_transfer+0xa5/0xb0 [amdgpu]
[    9.260163] WARNING: CPU: 4 PID: 1142 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:88 dm_dp_aux_transfer+0xa5/0xb0 [amdgpu]
Comment 9 Alex Deucher 2018-08-02 13:35:39 UTC
(In reply to Marc Thomas from comment #8)
> After some testing/experimentation/and scouring the internet, I've found
> that blacklisting modules gpio_amdpt (gpio-amdpt.ko) & pinctrl_amd
> (pinctrl-amd.ko) greatly reduces the likelihood of the gpio_base.c:64 &
> gpio_service.c:488 dmesg warnings. I suspect some kind of race condition
> during boot as the modules may load in any order.

It's possible that there are issues with gpio-amdpt and pinctrl-amd that lead to general system stability issues, but it should be noted that the gpio stuff in amdgpu is completely unrelated to those modules.
Comment 10 Marc Thomas 2018-09-25 11:41:09 UTC
(In reply to Alex Deucher from comment #9)
> It's possible that there are issues with gpio-amdpt and pinctrl-amd that
> lead to general system stability issues, but it should be noted that the gpio
> stuff in amdgpu is completely unrelated to those modules.

I think you're right - I still do get these warnings in 4.18.9 even with the modules blacklisted; just not all the time.

Interestingly I found the other warning I get from amdgpu_dm/amdgpu_dm_mst_types.c can be provoked by running "xcmddc -lv". It seems udevd does this during boot. See also bug 107456.
Comment 11 higuita 2018-09-26 23:06:56 UTC
*** Bug 106164 has been marked as a duplicate of this bug. ***
Comment 12 Petr Cvek 2018-11-16 01:47:19 UTC
Created attachment 142485 [details] [review]
Suggested fix with a semaphore
Comment 13 Petr Cvek 2018-11-16 01:50:06 UTC
Comment on attachment 142485 [details] [review]
Suggested fix with a semaphore

It seems there is a race condition between multiple threads that call dal_ddc_open(). With the attached patch (testing only, ugly) there are no warnings anymore.
Comment 14 hjpriester 2018-11-28 10:05:59 UTC
I am now running 4.19.4 and did not get the messages anymore.
The messages were there from 4.15.11 to 4.18.16.
Comment 15 Petr Cvek 2018-11-29 05:50:14 UTC
(In reply to hjpriester from comment #14)
> I am now running 4.19.4 and did not get the messages anymore.
> The messages were there from 4.15.11 to 4.18.16

I've got the warnings with vanilla 4.20-rc2 (-next-20181113) on RX460 card.
Comment 16 Martin Peres 2019-11-19 08:33:30 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/336.
