Bug 108940

Summary: QHD bug? drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040
Product: DRI Reporter: Stefan <stfn+freedesktop>
Component: DRM/AMDgpu Assignee: Default DRI bug account <dri-devel>
Status: NEW --- QA Contact:
Severity: normal    
Priority: medium CC: ddstreet, h.habighorst, jzahraoui, lmoiseichuk, nicholas.kazlauskas
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
4.19 dmesg crash at boot
none
4.18 dmesg crash at boot
none
4.19 dmesg crash at boot, monitor off
none
4.19 Xorg.0.log
none
lspci -nnk | grep VGA -A 12
none
dmesg 4.20rc5 ubuntu 18.10 + iobaf
none
xubuntu 18.10 kernel with high loglevel
none
4.20.0-rc3-amd-staging-drm-next-git-b8cd95e15410+
none
dmesg 4.20.2-gentoo AMD Ryzen 5 2400G
none
dmesg 4.20.2 - drm.debug=0x06
none
dmesg from ~agd5f/linux, branch drm-next-5.1-wip, drm.debug=0x06, commit e7498c5ed98802940cb21e4fb18c9c440b646e6a
none

Description Stefan 2018-12-04 11:17:16 UTC
Created attachment 142710 [details]
4.19 dmesg crash at boot

I recently switched from a dual monitor setup with 1 HDMI and 1 VGA monitor (connected through a powered HDMI -> VGA converter) to a QHD 34" Acer ED347CKR monitor (https://www.acer.com/ac/en/US/content/model/UM.CE7AA.001), since VGA doesn't work on Ryzen (https://bugs.freedesktop.org/show_bug.cgi?id=105880).

It solved that crash, but it introduced a new crash.

4.18, amdgpu.dc_log=1 amdgpu.dc=0 pcie_aspm=off: doesn't boot, display corruption, colored line through the boot text
4.19, amdgpu.dc_log=1 amdgpu.dc=0 pcie_aspm=off: doesn't boot, display corruption, colored line through the boot text
4.18, amdgpu.dc_log=1 amdgpu.dc=1: crash at boot and the system hangs, see attachment 418_dmesg_boot_crash.log
4.19, amdgpu.dc_log=1 amdgpu.dc=1: crash at boot, but the system runs, see attachment 419_dmesg_boot_crash.log

All errors below are with 4.19 and amdgpu.dc_log=1 amdgpu.dc=1.

I see a difference in crashes when I boot with my monitor on or off.

Monitor on at boot (419_dmesg_boot_crash.log):

dec 04 11:25:48 spin kernel: ACPI BIOS Warning (bug): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20180810/tbfadt-624)
dec 04 11:25:48 spin kernel: random: 7 urandom warning(s) missed due to ratelimiting
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1372 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1570 core_link_enable_stream+0x657/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1599 core_link_enable_stream+0x6a2/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1570 core_link_enable_stream+0x657/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1589 core_link_enable_stream+0x9cd/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1599 core_link_enable_stream+0x6a2/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1570 core_link_enable_stream+0x657/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1589 core_link_enable_stream+0x9cd/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1599 core_link_enable_stream+0x6a2/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1617 core_link_enable_stream+0xa5b/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1646 core_link_enable_stream+0xaa7/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1617 core_link_enable_stream+0xa5b/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1636 core_link_enable_stream+0xb68/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1646 core_link_enable_stream+0xaa7/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1617 core_link_enable_stream+0xa5b/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1636 core_link_enable_stream+0xb68/0xb90 [amdgpu]
dec 04 11:25:49 spin kernel: WARNING: CPU: 6 PID: 460 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1646 core_link_enable_stream+0xaa7/0xb90 [amdgpu]

Monitor off at boot (419_dmesg_monitor_off_at_boot.log):

[    9.162954] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]
[    9.163778] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1632 core_link_enable_stream+0xe2e/0x1040 [amdgpu]
[    9.164599] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1645 core_link_enable_stream+0xc8b/0x1040 [amdgpu]
[    9.165418] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[    9.166236] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
[    9.167059] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[    9.167879] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1685 core_link_enable_stream+0x1018/0x1040 [amdgpu]
[    9.168696] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
[    9.169517] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[    9.170343] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1685 core_link_enable_stream+0x1018/0x1040 [amdgpu]
[    9.171162] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
[   51.610218] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]
[   51.611366] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1645 core_link_enable_stream+0xc8b/0x1040 [amdgpu]
[   51.612436] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]
[   51.613504] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1632 core_link_enable_stream+0xe2e/0x1040 [amdgpu]
[   51.614573] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1645 core_link_enable_stream+0xc8b/0x1040 [amdgpu]
[   51.615639] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]
[   51.616711] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1632 core_link_enable_stream+0xe2e/0x1040 [amdgpu]
[   51.617776] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1645 core_link_enable_stream+0xc8b/0x1040 [amdgpu]
[   51.618773] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[   51.619734] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
[   51.620694] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[   51.621657] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1685 core_link_enable_stream+0x1018/0x1040 [amdgpu]
[   51.622617] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
[   51.623581] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1666 core_link_enable_stream+0xedd/0x1040 [amdgpu]
[   51.624547] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1685 core_link_enable_stream+0x1018/0x1040 [amdgpu]
[   51.625508] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1698 core_link_enable_stream+0xf58/0x1040 [amdgpu]
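
To compare the monitor-on and monitor-off logs without reading every line, the WARNING lines can be tallied per source location. The following is only a minimal Python sketch: `tally_warnings` and `WARN_RE` are hypothetical names chosen here, and the pattern assumes the `file.c:LINE function+0xOFF/0xSIZE` layout seen in the excerpts above. The sample lines are copied from the monitor-off log.

```python
import re
from collections import Counter

# Matches "... at <path>:<line> <function>+0x<offset>/0x<size>" in a kernel WARNING line.
# Works for both dmesg-style and journal-style prefixes because it uses search().
WARN_RE = re.compile(r"WARNING: .* at (\S+):(\d+) (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def tally_warnings(lines):
    """Count WARNING lines per (source file name, line number, function)."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            path, lineno, func = match.groups()
            # Keep only the file name; the directory part is identical for all entries.
            counts[(path.rsplit("/", 1)[-1], int(lineno), func)] += 1
    return counts


sample = [
    "[    9.162954] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]",
    "[    9.163778] WARNING: CPU: 3 PID: 537 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1632 core_link_enable_stream+0xe2e/0x1040 [amdgpu]",
    "[   51.610218] WARNING: CPU: 5 PID: 50 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1613 core_link_enable_stream+0xc14/0x1040 [amdgpu]",
]

for (filename, lineno, func), count in sorted(tally_warnings(sample).items()):
    print(f"{filename}:{lineno} {func}: {count}")
```

Running the same function over both attached logs would show which dc_link.c line numbers appear in only one of the two boot scenarios.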

Without pcie_aspm=off my console gets spammed infrequently with errors like:

[ 16.588103] pcieport 0000:00:01.2: AER: Corrected error received: id=0008
[ 16.588110] pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Receiver ID)
[ 16.588121] pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00000040/00006000
[ 16.588128] pcieport 0000:00:01.2: [ 6] Bad TLP

The first time I saw those was with 4.16, but they don't occur on every boot. I'm not sure if they were present before 4.16. I'm also not sure if this is still present in 4.19. I'll soon find out, since my system no longer boots with pcie_aspm=off.
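
For readers unfamiliar with the AER output above: the `error status/mask=00000040/00006000` pair is a bitfield, and the `[ 6] Bad TLP` line names bit 6 of the status word. A minimal Python sketch of that decoding follows. The bit table reflects my understanding of the PCIe AER Correctable Error Status register (the `PCI_ERR_COR_*` definitions in the kernel's pci_regs.h). Apart from "Bad TLP", the labels are descriptive names chosen here and not necessarily what the kernel prints, and `decode_correctable` is a hypothetical helper.

```python
# Bit positions in the AER Correctable Error Status register (assumed from
# the PCI_ERR_COR_* masks; labels other than "Bad TLP" are descriptive only).
COR_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable(status, mask=0):
    """Return (bit, label, is_masked) for every known bit set in `status`."""
    decoded = []
    for bit, label in sorted(COR_BITS.items()):
        if status & (1 << bit):
            decoded.append((bit, label, bool(mask & (1 << bit))))
    return decoded


# Values taken from the log excerpt: status=00000040, mask=00006000.
print(decode_correctable(0x00000040, 0x00006000))  # [(6, 'Bad TLP', False)]
```

With these values only bit 6 is set, and it is not covered by the mask, which is consistent with the error being reported.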

When running 4.19 I sometimes see

(EE) AMDGPU(0): drmmode_do_crtc_dpms cannot get last vblank counter

in xorg.log. See 419_Xorg.0.monitor_off_at_boot.log

I tried 4.20rc2 (I can't get newer RCs to build the Manjaro way), but it shows the same errors as 4.19.

Also attached lspci -nnk | grep VGA -A 12.

System:    Host: spin Kernel: 4.19.6-1-MANJARO x86_64 bits: 64 compiler: gcc v: 8.2.1 Desktop: KDE Plasma 5.14.4 tk: Qt 5.11.2 
           wm: kwin_x11 dm: SDDM Distro: Manjaro Linux 
Graphics:  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] driver: amdgpu v: kernel bus ID: 38:00.0 
           chip ID: 1002:15dd 
           Display: x11 server: X.Org 1.20.3 driver: amdgpu,ati unloaded: modesetting alternate: fbdev,vesa 
           compositor: kwin_x11 resolution: 3440x1440~60Hz 
           OpenGL: renderer: AMD RAVEN (DRM 3.27.0 4.19.6-1-MANJARO LLVM 7.0.0) v: 4.5 Mesa 18.2.6 compat-v: 4.4 
           direct render: Yes

If you need more info, please let me know!
Comment 1 Stefan 2018-12-04 11:17:42 UTC
Created attachment 142711 [details]
4.18 dmesg crash at boot
Comment 2 Stefan 2018-12-04 11:18:09 UTC
Created attachment 142712 [details]
4.19 dmesg crash at boot, monitor off
Comment 3 Stefan 2018-12-04 11:18:52 UTC
Created attachment 142713 [details]
4.19 Xorg.0.log
Comment 4 Stefan 2018-12-04 11:19:25 UTC
Created attachment 142714 [details]
lspci -nnk | grep VGA -A 12
Comment 5 fin4478 2018-12-05 15:09:41 UTC Comment hidden (spam)
Comment 6 Stefan 2018-12-08 16:40:00 UTC
Nope, exactly the same error is present with your setup:

System:
  Host: spin Kernel: 4.20.0-042000rc5-generic x86_64 bits: 64 
  Desktop: Gnome 3.30.1 Distro: Ubuntu 18.10 (Cosmic Cuttlefish) 
Machine:
  Type: Desktop Mobo: ASRock model: AB350 Gaming-ITX/ac 
  serial: <root required> UEFI [Legacy]: American Megatrends v: P4.90 
  date: 10/08/2018 
CPU:
  Quad Core: AMD Ryzen 5 2400G with Radeon Vega Graphics type: MT MCP 
  speed: 1418 MHz min/max: 1600/3600 MHz 
Graphics:
  Device-1: AMD Raven Ridge [Radeon Vega Series / Radeon Vega Mobile Series] 
  driver: amdgpu v: kernel 
  Display: x11 server: X.Org 1.20.1 driver: amdgpu 
  resolution: 3440x1440~60Hz 
  OpenGL: 
  renderer: AMD RAVEN (DRM 3.27.0 4.20.0-042000rc5-generic LLVM 7.0.1) 
  v: 4.5 Mesa 19.0.0-devel (git-cc6a5e9 2018-12-08 cosmic-oibaf-ppa)
Comment 7 Stefan 2018-12-08 16:40:47 UTC
Created attachment 142753 [details]
dmesg 4.20rc5 ubuntu  18.10 + iobaf
Comment 8 Leonid Moiseichuk 2018-12-09 00:39:59 UTC
I see exactly the same crash on ASrock X470 Fatality ITX, BIOS 1.70, kernel 4.18.0-12-generic (xubuntu 18.10)
Comment 9 Leonid Moiseichuk 2018-12-09 00:52:08 UTC
The implementation was introduced in commit 1e8635ea0ea370bf4f0f2b2f1b3eb61474dd962a
by Zeyu.Fan at amd.com
Comment 10 Leonid Moiseichuk 2018-12-15 18:42:36 UTC
Created attachment 142817 [details]
xubuntu 18.10 kernel with high loglevel

I see a number of problems reported during boot on ASRock X470 + AMD Ryzen 5 2400G:

$ grep  WARNING 4.18.0-13-generic.log 
[    3.938686] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1553 core_link_enable_stream+0x74c/0xc90 [amdgpu]
[    3.939285] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1582 core_link_enable_stream+0x78e/0xc90 [amdgpu]
[    3.939863] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1553 core_link_enable_stream+0x74c/0xc90 [amdgpu]
[    3.940450] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1572 core_link_enable_stream+0xacd/0xc90 [amdgpu]
[    3.941029] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1582 core_link_enable_stream+0x78e/0xc90 [amdgpu]
[    3.941610] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1553 core_link_enable_stream+0x74c/0xc90 [amdgpu]
[    3.942190] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1572 core_link_enable_stream+0xacd/0xc90 [amdgpu]
[    3.942766] WARNING: CPU: 0 PID: 323 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1582 core_link_enable_stream+0x78e/0xc90 [amdgpu]
Comment 11 Stefan 2018-12-23 08:25:28 UTC
I just tried with amd-staging-drm-next. The error is still present.
Comment 12 Stefan 2018-12-23 08:26:43 UTC
Created attachment 142876 [details]
4.20.0-rc3-amd-staging-drm-next-git-b8cd95e15410+
Comment 13 H.Habighorst 2019-01-16 13:26:51 UTC
Created attachment 143140 [details]
dmesg 4.20.2-gentoo AMD Ryzen 5 2400G

dmesg of boot
- linux-firmware: 2019-01-14
- kernel: 4.20.2
- AMD Ryzen 5 2400G
- X11 driver: modeset
- Xorg Log: 
Modeline "1920x1200"x0.0  154.00  1920 1968 2000 2080  1200 1203 1209 1235 +hsync -vsync (74.0 kHz eP)
Comment 14 H.Habighorst 2019-01-16 13:52:49 UTC
Can confirm. 

But I'm a bit worried: there is a part before the "core_link_enable_stream" warnings, during device initialization, that seems wrong to me and that seems to occur in all dmesg outputs.

I've attached line numbers via less -N - maybe it helps.

837 [    8.686299] [drm] BIOS signature incorrect 0 0
838 [    8.686326] ATOM BIOS: 113-RAVEN-110
839 [    8.686357] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
...
854 [    8.688113] [drm] Found VCN firmware Version: 1.73 Family ID: 18
...
857 [    8.797714] [drm] DM_PPLIB: values for Invalid clock
858 [    8.797715] [drm] DM_PPLIB:   0 in kHz
859 [    8.797715] [drm] DM_PPLIB:   400000 in kHz
860 [    8.797716] [drm] DM_PPLIB:   933000 in kHz
861 [    8.797716] [drm] DM_PPLIB:   1067000 in kHz
862 [    8.797809] WARNING: CPU: 6 PID: 725 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1380 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]

This seems plain wrong to me: shouldn't the engine clock be mentioned here?

This WARNING stems from not getting the clock values right, which seems intended.

The same warning occurs again, I wildly guess for the memory (?!).

These warnings span lines 862-1017; only afterwards does the mentioned "core_link_enable_stream" bug occur. It seems to me, as a total layman, that the initialization fails and bogus behaviour follows.
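
To check whether the same "Invalid clock" table with a 0 kHz entry shows up in the other attached logs, the DM_PPLIB lines can be grouped per header. This is only a sketch under the assumption that every table starts with a `DM_PPLIB: values for ... clock` line followed by `DM_PPLIB: <n> in kHz` lines, as in the excerpt above. `pplib_clock_tables` is a hypothetical helper, and the sketch makes no claim about how the driver itself interprets these values.

```python
import re

HEADER_RE = re.compile(r"DM_PPLIB: values for (.+?) clock")
VALUE_RE = re.compile(r"DM_PPLIB:\s+(\d+) in kHz")


def pplib_clock_tables(lines):
    """Return a list of (clock_name, [values in kHz]) in log order."""
    tables = []
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            # A new table starts; its values follow on the next lines.
            tables.append((header.group(1), []))
            continue
        value = VALUE_RE.search(line)
        if value and tables:
            tables[-1][1].append(int(value.group(1)))
    return tables


sample = [
    "[    8.797714] [drm] DM_PPLIB: values for Invalid clock",
    "[    8.797715] [drm] DM_PPLIB:   0 in kHz",
    "[    8.797715] [drm] DM_PPLIB:   400000 in kHz",
    "[    8.797716] [drm] DM_PPLIB:   933000 in kHz",
    "[    8.797716] [drm] DM_PPLIB:   1067000 in kHz",
]

for name, values in pplib_clock_tables(sample):
    flag = " (contains a 0 kHz entry)" if 0 in values else ""
    print(f"{name}: {values}{flag}")
```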
Comment 15 H.Habighorst 2019-01-17 12:18:03 UTC
Created attachment 143145 [details]
dmesg 4.20.2 - drm.debug=0x06

I had forgotten to increase the log size and to add drm.debug=0x06 to the kernel command line.
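
For reference, drm.debug is a bitmask of message categories, so 0x06 enables two of them. The Python sketch below decodes the value. The category table is my recollection of the `DRM_UT_*` flags for kernels of this era and should be treated as an assumption (later kernels add more bits), and `decode_drm_debug` is a hypothetical helper.

```python
# Assumed drm.debug category bits (DRM_UT_* flags, roughly as of the 4.20 era).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
}


def decode_drm_debug(value):
    """List the category names whose bit is set in a drm.debug value."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


print(decode_drm_debug(0x06))  # ['DRIVER', 'KMS']
```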
Comment 16 H.Habighorst 2019-01-17 12:22:08 UTC
Created attachment 143146 [details]
dmesg from ~agd5f/linux, branch drm-next-5.1-wip, drm.debug=0x06, commit e7498c5ed98802940cb21e4fb18c9c440b646e6a

It still happens in latest GIT.

I've added drm.debug=0x06 and attached the dmesg.

If you need anything, please comment.
Comment 17 Stefan 2019-02-25 09:52:04 UTC
Still present in 4.20.12, 5.0rc8 and amd-staging-drm-next-git-b8cd95e15410.

What is needed to resolve this? My system remains unstable with frequent/continuous crashes. A hard reset is the only option.
Comment 18 udo 2019-03-28 17:14:23 UTC
I see this WARNING on boot with various kernels, currently 4.19.30.
What can I try to avoid this WARNING?
Comment 19 udo 2019-03-28 17:29:46 UTC
In my kernel history 4.9.12 is the first kernel that shows the WARNING in dmesg.
Comment 20 udo 2019-03-28 17:31:15 UTC
Correction: 4.19.12.
