Bug 107296 - WARNING: CPU: 0 PID: 370 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1355 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Duplicates: 107209
Reported: 2018-07-19 17:13 UTC by Paul Menzel
Modified: 2019-11-20 07:56 UTC

Attachments
Linux 4.18-rc5+ (with merged drm-tip) messages (163.46 KB, text/x-log)
2018-07-19 17:13 UTC, Paul Menzel
These errors appear on 2400G on kernel 4.20.11 (5.22 KB, text/plain)
2019-02-22 15:05 UTC, Öyvind Saether
Complete kern.log (72.98 KB, text/plain)
2019-05-25 21:46 UTC, John Pathy
dmesg warning on 5.1.14 with vega 11 (8.99 KB, text/plain)
2019-07-03 06:49 UTC, Janpieter Sollie
dmesg.txt blank screen on boot (93.02 KB, text/plain)
2019-09-25 14:18 UTC, frido.ferdinand
dmesg of PPLIB error (10.04 KB, text/plain)
2019-10-17 12:12 UTC, Janpieter Sollie

Description Paul Menzel 2018-07-19 17:13:51 UTC
Created attachment 140716
Linux 4.18-rc5+ (with merged drm-tip) messages

On an MSI B350M MORTAR (MS-7A37) with an AMD Ryzen 3 2200G with Radeon Vega Graphics, running Linux 4.18-rc5+ with drm-tip merged, the warning below is always shown.

15.355: [   24.263999] WARNING: CPU: 0 PID: 370 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1355 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
15.356: [   24.294351] Modules linked in: nls_ascii nls_cp437 vfat fat amdkfd snd_hda_codec_realtek amdgpu(+) snd_hda_codec_generic chash snd_hda_codec_hdmi efi_pstore i2c_algo_bit edac_mce_amd snd_hda_intel snd_hda_codec gpu_sched ccp rng_core drm_kms_helper snd_hda_core syscopyarea sysfillrect r8169 kvm sysimgblt irqbypass fb_sys_fops snd_hwdep ttm sp5100_tco crct10dif_pclmul snd_pcm crc32_pclmul snd_timer pcspkr snd sg ghash_clmulni_intel k10temp efivars mii drm soundcore i2c_piix4 video button efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 fscrypto dm_crypt dm_mod sd_mod evdev hid_generic usbhid hid crc32c_intel aesni_intel ahci xhci_pci aes_x86_64 libahci crypto_simd xhci_hcd libata cryptd glue_helper usbcore scsi_mod gpio_amdpt gpio_generic
15.360: [   24.365932] CPU: 0 PID: 370 Comm: systemd-udevd Not tainted 4.18.0-rc5+ #2
15.360: [   24.365933] Hardware name: MSI MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.G1 05/17/2018
15.361: [   24.365984] RIP: 0010:dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
15.361: [   24.388384] Code: d8 ca d8 f1 d9 5a 50 8b 44 fc 14 49 8b 94 24 70 01 00 00 48 89 04 24 df 2c 24 d8 f1 db 42 78 de c9 de ca de f9 d9 5a 4c eb 02 <0f> 0b 48 89 da be 04 00 00 00 48 89 ef e8 33 5a fe ff 84 c0 0f 84 
15.363: [   24.388405] RSP: 0018:ffffa31f0259f7c8 EFLAGS: 00010246
15.363: [   24.414366] RAX: 0000000000000001 RBX: ffffa31f0259f828 RCX: 0000000000000000
15.364: [   24.414367] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff97d5de816730
15.365: [   24.414368] RBP: ffff97d5c3f09d40 R08: 00000028c4bad032 R09: 000000000000001f
15.365: [   24.414369] R10: 0000000000000af4 R11: 00000000003c7cb4 R12: ffff97d5c0e3d000
15.366: [   24.445133] R13: ffff97d5c825dd00 R14: ffff97d5c0e3d000 R15: 0000000000000000
15.366: [   24.445134] FS:  00007fd442ec88c0(0000) GS:ffff97d5de800000(0000) knlGS:0000000000000000
15.367: [   24.445135] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
15.367: [   24.467698] CR2: 00007fd34c0034c8 CR3: 0000000403e6c000 CR4: 00000000003406f0
15.368: [   24.467699] Call Trace:
15.368: [   24.467748]  dcn10_create_resource_pool+0x756/0x990 [amdgpu]
15.369: [   24.484250]  dc_create_resource_pool+0x42/0x180 [amdgpu]
15.369: [   24.484254]  ? __kmalloc+0x1b4/0x250
15.369: [   24.493877]  ? dal_gpio_service_create+0x8f/0x110 [amdgpu]
15.370: [   24.493922]  dc_create+0x228/0x650 [amdgpu]
15.370: [   24.504339]  ? amdgpu_cgs_create_device+0x23/0x50 [amdgpu]
15.371: [   24.504386]  dm_hw_init+0xc8/0x130 [amdgpu]
15.371: [   24.514764]  amdgpu_device_init.cold.28+0x10ea/0x1295 [amdgpu]
15.372: [   24.514803]  amdgpu_driver_load_kms+0x86/0x2c0 [amdgpu]
15.372: [   24.526681]  drm_dev_register+0x109/0x140 [drm]
15.372: [   24.526720]  amdgpu_pci_probe+0x13c/0x1c0 [amdgpu]
15.373: [   24.536741]  local_pci_probe+0x41/0x90
15.373: [   24.536743]  pci_device_probe+0x189/0x1a0
15.373: [   24.545118]  driver_probe_device+0x2b9/0x460
15.374: [   24.545120]  __driver_attach+0xdd/0x110
15.374: [   24.553853]  ? driver_probe_device+0x460/0x460
15.374: [   24.553855]  bus_for_each_dev+0x76/0xc0
15.375: [   24.562813]  ? klist_add_tail+0x3b/0x70
15.375: [   24.562814]  bus_add_driver+0x152/0x230
15.375: [   24.571094]  ? 0xffffffffc08bf000
15.376: [   24.571096]  driver_register+0x6b/0xb0
15.376: [   24.578735]  ? 0xffffffffc08bf000
15.376: [   24.578737]  do_one_initcall+0x46/0x1c3
15.377: [   24.578739]  ? kmem_cache_alloc_trace+0x183/0x1f0
15.377: [   24.591626]  ? do_init_module+0x22/0x210
15.377: [   24.591628]  do_init_module+0x5a/0x210
15.378: [   24.591630]  load_module+0x21c4/0x2410
15.378: [   24.603927]  ? vfs_read+0x110/0x140
15.378: [   24.603929]  ? __do_sys_finit_module+0xa8/0x110
15.379: [   24.612601]  __do_sys_finit_module+0xa8/0x110
15.379: [   24.612602]  do_syscall_64+0x55/0xe0
15.379: [   24.621178]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
15.380: [   24.621179] RIP: 0033:0x7fd443f25a79
15.380: [   24.630489] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d df 43 0c 00 f7 d8 64 89 01 48 
15.382: [   24.650822] RSP: 002b:00007ffffe006958 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
15.382: [   24.650823] RAX: ffffffffffffffda RBX: 000056217c76ecc0 RCX: 00007fd443f25a79
15.383: [   24.650825] RDX: 0000000000000000 RSI: 00007fd443c2d0ed RDI: 0000000000000018
15.383: [   24.674422] RBP: 00007fd443c2d0ed R08: 0000000000000000 R09: 0000000000000000
15.384: [   24.674422] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000000
15.385: [   24.674423] R13: 000056217c76b5d0 R14: 0000000000020000 R15: 000056217c76ecc0
15.385: [   24.674470] WARNING: CPU: 0 PID: 370 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1355 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
15.386: [   24.712143] ---[ end trace 115af7e91900bac6 ]---
Comment 1 Paul Menzel 2018-07-23 14:29:35 UTC
In today’s drm-tip this is:

    [   20.149515] WARNING: CPU: 0 PID: 347 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1372 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]

It looks like this is `BREAK_TO_DEBUGGER()` [1], which is defined as `#define BREAK_TO_DEBUGGER() ASSERT(0)`.

```
	/* TODO: This is not the proper way to obtain fabric_and_dram_bandwidth, should be min(fclk, memclk) */
	res = dm_pp_get_clock_levels_by_type_with_voltage(
			ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);

	if (res)
		res = verify_clock_values(&fclks);

	if (res) {
		ASSERT(fclks.num_levels >= 3);
		dc->dcn_soc->fabric_and_dram_bandwidth_vmin0p65 = 32 * (fclks.data[0].clocks_in_khz / 1000.0) / 1000.0;
		dc->dcn_soc->fabric_and_dram_bandwidth_vmid0p72 = dc->dcn_soc->number_of_channels *
				(fclks.data[fclks.num_levels - (fclks.num_levels > 2 ? 3 : 2)].clocks_in_khz / 1000.0)
				* ddr4_dram_factor_single_Channel / 1000.0;
		dc->dcn_soc->fabric_and_dram_bandwidth_vnom0p8 = dc->dcn_soc->number_of_channels *
				(fclks.data[fclks.num_levels - 2].clocks_in_khz / 1000.0)
				* ddr4_dram_factor_single_Channel / 1000.0;
		dc->dcn_soc->fabric_and_dram_bandwidth_vmax0p9 = dc->dcn_soc->number_of_channels *
				(fclks.data[fclks.num_levels - 1].clocks_in_khz / 1000.0)
				* ddr4_dram_factor_single_Channel / 1000.0;
	} else
		BREAK_TO_DEBUGGER();
```

So, either `dm_pp_get_clock_levels_by_type_with_voltage()` or `verify_clock_values(&fclks)` returns 0.

[1]: https://cgit.freedesktop.org/drm-tip/tree/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c#n1372
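
To make the indexing in the excerpt above concrete, here is a small userspace sketch (not kernel code) of which `fclks.data[]` entry feeds each voltage level, plus the `vmin0p65` arithmetic. The names `fclk_level_indices`, `fclk_indices` and `vmin_bandwidth` are made up for illustration; only the index expressions and the `32 * (khz / 1000.0) / 1000.0` formula are taken from the excerpt.

```c
/* Hypothetical container for the fclks.data[] index read per voltage level. */
struct fclk_level_indices {
	unsigned int vmin, vmid, vnom, vmax;
};

/* Same index expressions as in the excerpt. Assumes num_levels >= 2;
 * with fewer levels the unsigned subtractions would wrap around. */
static struct fclk_level_indices fclk_indices(unsigned int num_levels)
{
	struct fclk_level_indices idx;

	idx.vmin = 0;
	idx.vmid = num_levels - (num_levels > 2 ? 3 : 2);
	idx.vnom = num_levels - 2;
	idx.vmax = num_levels - 1;
	return idx;
}

/* Same arithmetic as the fabric_and_dram_bandwidth_vmin0p65 line. */
static double vmin_bandwidth(unsigned int clocks_in_khz)
{
	return 32 * (clocks_in_khz / 1000.0) / 1000.0;
}
```

With four levels the excerpt reads entries 0, 1, 2 and 3; with only two levels, vmid and vnom both read entry 0. Note that the excerpt asserts `num_levels >= 3` but still carries a fallback for two levels in the vmid index.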
Comment 2 Paul Menzel 2018-07-23 15:39:13 UTC
After adding print statements, it turns out that the second function, `verify_clock_values(&fclks)`, returns *false*.
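
For anyone who wants to repeat this, the following is a userspace sketch of the idea, not the actual debug patch. `struct clk_table`, `table_is_sane` and `classify_fclk_failure` are hypothetical stand-ins: `query_ok` models the return value of `dm_pp_get_clock_levels_by_type_with_voltage()`, and `table_is_sane` is assumed to behave like `verify_clock_values()` as quoted in comment 16 (reject empty tables and any 0 kHz entry).

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_LEVELS 8

/* Simplified stand-in for the driver's clock table (assumption: the
 * real struct dm_pp_clock_levels_with_voltage carries more fields). */
struct clk_table {
	unsigned int num_levels;
	unsigned int clocks_in_khz[MAX_LEVELS];
};

enum fclk_failure { FCLK_OK, FCLK_QUERY_FAILED, FCLK_VERIFY_FAILED };

/* Assumed equivalent of verify_clock_values(): no levels or any
 * 0 kHz entry makes the table invalid. */
static bool table_is_sane(const struct clk_table *t)
{
	unsigned int i;

	if (t->num_levels == 0)
		return false;
	for (i = 0; i < t->num_levels; i++)
		if (t->clocks_in_khz[i] == 0)
			return false;
	return true;
}

/* Reports which of the two steps would make `res` false. */
static enum fclk_failure classify_fclk_failure(bool query_ok,
					       const struct clk_table *t)
{
	if (!query_ok) {
		fprintf(stderr, "fclk: clock level query failed\n");
		return FCLK_QUERY_FAILED;
	}
	if (!table_is_sane(t)) {
		fprintf(stderr, "fclk: table rejected by sanity check\n");
		return FCLK_VERIFY_FAILED;
	}
	return FCLK_OK;
}
```

On the affected system this corresponds to the `FCLK_VERIFY_FAILED` case: the query succeeds, but the returned table is rejected.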
Comment 3 Paul Menzel 2018-08-08 16:20:03 UTC
This is also present in agd5f/amd-staging-drm-next (v4.18-rc1-779-gbae3d5443de1).
Comment 4 Öyvind Saether 2018-10-07 01:14:17 UTC
This bug is still present in the linux.git kernel as of now.

kernel 4.19.0-rc6-ChaeKyung-April-00336-gc1d84a1b42ef
chip 2400G
board ROG STRIX X470-F GAMING, BIOS 4011 04/19/2018

on boot,
[    7.913285] amdgpu: [powerplay] dpm has been enabled
[    7.913430] [drm] DM_PPLIB: values for Invalid clock
[    7.913514] [drm] DM_PPLIB:   0 in kHz
[    7.913592] [drm] DM_PPLIB:   0 in kHz
[    7.913669] [drm] DM_PPLIB:   0 in kHz
[    7.913747] [drm] DM_PPLIB:   1400000 in kHz
[    7.913905] WARNING: CPU: 6 PID: 483 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1372 dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
[    7.914052] Modules linked in: amdgpu(+) chash gpu_sched drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm raid1 igb crct10dif_pclmul crc32_pclmul crc32c_intel r8169 ghash_clmulni_intel libphy serio_raw dca i2c_algo_bit it87(OE) hwmon_vid k10temp
[    7.914287] CPU: 6 PID: 483 Comm: systemd-udevd Tainted: G           OE     4.19.0-rc6-ChaeKyung-April-00336-gc1d84a1b42ef #34
[    7.914419] Hardware name: System manufacturer System Product Name/ROG STRIX X470-F GAMING, BIOS 4011 04/19/2018
[    7.914602] RIP: 0010:dcn_bw_update_from_pplib+0x16b/0x280 [amdgpu]
[    7.914698] Code: d8 ca d8 f1 d9 5a 50 8b 44 fc 14 49 8b 94 24 78 01 00 00 48 89 04 24 df 2c 24 d8 f1 db 42 78 de c9 de ca de f9 d9 5a 4c eb 02 <0f> 0b 48 89 da be 04 00 00 00 48 89 ef e8 73 55 fe ff 84 c0 0f 84
[    7.914883] RSP: 0018:ffffada04206b788 EFLAGS: 00010246
[    7.914969] RAX: 0000000000000001 RBX: ffffada04206b7e8 RCX: 0000000000000000
[    7.915065] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff98c390395540
[    7.915160] RBP: ffff98c38069be80 R08: 0000000000000000 R09: ffff98bf800b9f40
[    7.915256] R10: ffffffff934c1f80 R11: ffffffff9491f00d R12: ffff98c3813dd000
[    7.915351] R13: ffff98c3853214c0 R14: ffff98c3813dd000 R15: 0000000000000000
[    7.915447] FS:  00007f895d8f3180(0000) GS:ffff98c390380000(0000) knlGS:0000000000000000
[    7.915562] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.915650] CR2: 00007f5aacdab120 CR3: 0000000405b94000 CR4: 00000000003406e0
[    7.915745] Call Trace:
[    7.915823]  ? _cond_resched+0x15/0x30
[    7.915959]  dcn10_create_resource_pool+0x781/0x9d0 [amdgpu]
[    7.916110]  ? dal_aux_engine_dce110_create+0x39/0x80 [amdgpu]
[    7.916258]  dc_create_resource_pool+0x42/0x190 [amdgpu]
[    7.916350]  ? _cond_resched+0x15/0x30
[    7.916430]  ? __kmalloc+0x17e/0x220
[    7.916563]  ? dal_gpio_service_create+0x8f/0x110 [amdgpu]
[    7.916709]  dc_create+0x20f/0x630 [amdgpu]
[    7.916796]  ? vgacon_putc+0x10/0x10
[    7.916930]  dm_hw_init+0xc8/0x130 [amdgpu]
[    7.917069]  amdgpu_device_init.cold.28+0x10ea/0x1295 [amdgpu]
[    7.917210]  amdgpu_driver_load_kms+0x86/0x2c0 [amdgpu]
[    7.917311]  drm_dev_register+0x109/0x140 [drm]
[    7.917440]  amdgpu_pci_probe+0x13c/0x1c0 [amdgpu]
[    7.917530]  local_pci_probe+0x41/0x90
[    7.917610]  pci_device_probe+0x188/0x1a0
[    7.917691]  really_probe+0x235/0x3a0
[    7.917770]  driver_probe_device+0xb3/0xf0
[    7.917851]  __driver_attach+0xdd/0x110
[    7.917930]  ? driver_probe_device+0xf0/0xf0
[    7.918011]  bus_for_each_dev+0x76/0xc0
[    7.918092]  ? klist_add_tail+0x3b/0x60
[    7.918171]  bus_add_driver+0x152/0x230
[    7.918250]  ? 0xffffffffc07e4000
[    7.918326]  driver_register+0x6b/0xb0
[    7.918405]  ? 0xffffffffc07e4000
[    7.918482]  do_one_initcall+0x46/0x1b8
[    7.918562]  ? _cond_resched+0x15/0x30
[    7.918640]  ? kmem_cache_alloc_trace+0x15f/0x1e0
[    7.918725]  do_init_module+0x5a/0x210
[    7.918804]  load_module+0x203c/0x2280
[    7.918885]  ? __do_sys_init_module+0x13b/0x180
[    7.918967]  __do_sys_init_module+0x13b/0x180
[    7.919050]  do_syscall_64+0x55/0x150
[    7.919129]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    7.919214] RIP: 0033:0x7f895c50786a
[    7.919292] Code: 48 8b 0d 39 e6 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 06 e6 2b 00 f7 d8 64 89 01 48
[    7.919476] RSP: 002b:00007ffca7439ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[    7.919587] RAX: ffffffffffffffda RBX: 0000562804b0a9c0 RCX: 00007f895c50786a
[    7.919682] RDX: 00007f895d06e4cd RSI: 000000000858f176 RDI: 00007f894c87f010
[    7.919777] RBP: 00007f895d06e4cd R08: 0000000000000002 R09: 0000000000000001
[    7.919871] R10: 0000562804ae2010 R11: 0000000000000246 R12: 00007f894c87f010
[    7.919966] R13: 0000562804afb2f0 R14: 0000000000020000 R15: 0000000000000000
[    7.920061] ---[ end trace 3de9567f671e90b4 ]---
[    7.920148] [drm] DM_PPLIB: values for Invalid clock
[    7.920232] [drm] DM_PPLIB:   300000 in kHz
[    7.920313] [drm] DM_PPLIB:   600000 in kHz
[    7.920391] [drm] DM_PPLIB:   626000 in kHz
[    7.920470] [drm] DM_PPLIB:   654000 in kHz
[    7.929270] [drm] Display Core initialized with v3.1.59!
Comment 5 darkbasic 2018-11-11 15:14:22 UTC
I confirm, the same with B450M + 2400g and 4.18/4.19/4.20-rc1.
Comment 6 Romain Reignier 2018-12-16 18:34:57 UTC
(In reply to darkbasic from comment #5)
> I confirm, the same with B450M + 2400g and 4.18/4.19/4.20-rc1.

Exactly like darkbasic, MSI B450M Mortar, 2400G on 4.19.8 (from Ubuntu kernel team).

Most of the time, even with that call trace, the system is usable, but from time to time I get a freeze with the following (not sure it is related):

[ 2590.157513] romain-desktop kernel: gmc_v9_0_process_interrupt: 10 callbacks suppressed
[ 2590.157518] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157525] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x000000010980b000 from 27
[ 2590.157528] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157535] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157539] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109810000 from 27
[ 2590.157541] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157548] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157551] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109811000 from 27
[ 2590.157553] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157560] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157562] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x000000010980c000 from 27
[ 2590.157564] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157571] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157574] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x000000010980b000 from 27
[ 2590.157576] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157583] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157585] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109800000 from 27
[ 2590.157587] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157594] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157597] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109810000 from 27
[ 2590.157599] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157605] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157608] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109810000 from 27
[ 2590.157610] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157617] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157619] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109811000 from 27
[ 2590.157621] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2590.157628] romain-desktop kernel: amdgpu 0000:38:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pi
                                      )
[ 2590.157630] romain-desktop kernel: amdgpu 0000:38:00.0:   at address 0x0000000109810000 from 27
[ 2590.157632] romain-desktop kernel: amdgpu 0000:38:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 2600.376443] romain-desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=107913, emitted seq=107915
[ 2600.376449] romain-desktop kernel: [drm] GPU recovery disabled.

Does anyone know a workaround or a tree with a fix?
Comment 7 Paul Menzel 2019-01-13 16:52:44 UTC
I am still getting this error with Linux 5.0-rc1+.

    [   19.518149] WARNING: CPU: 2 PID: 359 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1380 dcn_bw_update_from_pplib+0x158/0x290 [amdgpu]
Comment 8 Paul Menzel 2019-02-17 12:28:59 UTC
Testing Linux 5.0-rc6, I am still seeing this.
Comment 9 Öyvind Saether 2019-02-22 15:05:11 UTC
Created attachment 143440
These errors appear on 2400G on kernel 4.20.11
Comment 10 jzahraoui@gmail.com 2019-03-18 23:03:37 UTC
Testing Linux 5.1-rc1, error still present.
Comment 11 Jan 2019-04-29 20:08:50 UTC
Same here:

HW: Asrock B450M Pro4 + Ryzen 2400g
Kernel: 5.0.9

I'm totally unable to use the gpu.
Comment 12 John Pathy 2019-05-25 21:46:34 UTC
Created attachment 144345
Complete kern.log

I have attached a complete copy of my kern.log as a reference to show the trace.
Comment 13 John Pathy 2019-05-25 22:02:37 UTC
I have the same problem with my 2400G.

My kernel is as follows

[    0.000000] Linux version 4.19.0-5-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-7)) #1 SMP Debian 4.19.37-3 (2019-05-15)

When my system boots I get a RIP with a trace as above

[    1.532390] amdgpu: [powerplay] dpm has been enabled
[    1.532453] [drm] DM_PPLIB: values for Invalid clock
[    1.532454] [drm] DM_PPLIB:	 0 in kHz
[    1.532456] [drm] DM_PPLIB:	 400000 in kHz
[    1.532457] [drm] DM_PPLIB:	 933000 in kHz
[    1.532459] [drm] DM_PPLIB:	 1067000 in kHz
[    1.532537] WARNING: CPU: 6 PID: 134 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1372 dcn_bw_update_from_pplib+0x171/0x290 [amdgpu]
[    1.532540] Modules linked in: amdkfd amdgpu(+) crc32c_intel mxm_wmi chash gpu_sched ttm aesni_intel ahci drm_kms_helper aes_x86_64 crypto_simd libahci cryptd glue_helper xhci_pci libata drm xhci_hcd igb i2c_piix4 scsi_mod usbcore dca i2c_algo_bit usb_common video wmi gpio_amdpt gpio_generic button
[    1.532558] CPU: 6 PID: 134 Comm: systemd-udevd Not tainted 4.19.0-5-amd64 #1 Debian 4.19.37-3
[    1.532561] Hardware name: Gigabyte Technology Co., Ltd. X470 AORUS ULTRA GAMING/X470 AORUS ULTRA GAMING-CF, BIOS F30 04/16/2019
[    1.532620] RIP: 0010:dcn_bw_update_from_pplib+0x171/0x290 [amdgpu]

I looked into this, and the problem I see is that the fclks are not initialized properly in this routine:

res = dm_pp_get_clock_levels_by_type_with_voltage(
			ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);

When the sanity check in 

res = verify_clock_values(&fclks);

runs, it detects the 0 at index [0] and triggers the trace. This is the entry that triggered it:

[    1.532454] [drm] DM_PPLIB:	 0 in kHz

The fclks in my log appear to be shifted down. It seems the table should look like this, which is what a 2200G reports on boot (taken from somebody else's system):

[    3.774892] [drm] DM_PPLIB: values for F clock
[    3.774894] [drm] DM_PPLIB:	 400000 in kHz
[    3.774894] [drm] DM_PPLIB:	 933000 in kHz
[    3.774895] [drm] DM_PPLIB:	 1067000 in kHz
[    3.774895] [drm] DM_PPLIB:	 1200000 in kHz
[    3.774896] [drm] DM_PPLIB: values for DCF clock
[    3.774896] [drm] DM_PPLIB:	 300000 in kHz
[    3.774897] [drm] DM_PPLIB:	 600000 in kHz
[    3.774897] [drm] DM_PPLIB:	 626000 in kHz
[    3.774897] [drm] DM_PPLIB:	 654000 in kHz

Also note that on my kernel (4.19.0-5-amd64) the F and DCF clocks are both labelled "Invalid clock", whereas on the later kernels 5.1.3 and 5.2-rc1 this appears to be corrected.

My whole kern.log is located in the attachment for comment 12.
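
The "shifted down" observation can be checked mechanically. The sketch below is only an illustration of that comparison; `is_shifted_down_by_one` is a made-up helper, and it says nothing about why the table ends up this way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `observed` looks like `reference` moved down by one
 * slot: a 0 in front, followed by the first n-1 reference entries. */
static bool is_shifted_down_by_one(const unsigned int *observed,
				   const unsigned int *reference, size_t n)
{
	size_t i;

	if (n == 0 || observed[0] != 0)
		return false;
	for (i = 1; i < n; i++)
		if (observed[i] != reference[i - 1])
			return false;
	return true;
}
```

Fed with my table (0, 400000, 933000, 1067000) and the 2200G table (400000, 933000, 1067000, 1200000), the check holds. The table from comment 4 (0, 0, 0, 1400000) does not fit this pattern, so the zeros there may have a different origin.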
Comment 14 vono 2019-05-27 11:06:32 UTC
*** Bug 107209 has been marked as a duplicate of this bug. ***
Comment 15 Janpieter Sollie 2019-07-03 06:49:05 UTC
Created attachment 144689
dmesg warning on 5.1.14 with vega 11

This bug looks very much like the one I have on my Zen system, but with a B450 board instead, so the error seems to affect other hardware as well. I filtered the log to show only DRM/amdgpu output; I will post my .config file if necessary.
Comment 16 Paul Menzel 2019-07-13 18:30:40 UTC
Could an AMD developer please comment on how to fix this? Tables(?) containing “0 kHz” are apparently shipped by vendors, so what to do?

```
static bool verify_clock_values(struct dm_pp_clock_levels_with_voltage *clks)
{
        int i;

        if (clks->num_levels == 0)
                return false;

        for (i = 0; i < clks->num_levels; i++)
                /* Ensure that the result is sane */
                if (clks->data[i].clocks_in_khz == 0)
                        return false;

        return true;
}
```

Should commit 00893681a0ff4 (drm/amd/display: Reject PPLib clock values if they are invalid) [1] be reverted? Andrew, Tony, Harry?

> drm/amd/display: Reject PPLib clock values if they are invalid
>
> We should be sticking with the default clock values if the values
> obtained from PPLib are bogus.
>
> Signed-off-by: Andrew Jiang <Andrew.Jiang@amd.com>
> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
> Acked-by: Harry Wentland <harry.wentland@amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

PS: AMDGPU’s commit messages are too terse, and should be more elaborate.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=00893681a0ff41cacecabc3dafe0987593a3d5c5
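
Purely as a discussion aid, one conceivable alternative to rejecting the whole table would be to drop the 0 kHz entries and keep the rest. The sketch below is userspace C with a hypothetical helper name (`compact_nonzero_levels`); it is not what the driver does and not a proposed patch, and whether the remaining levels would be meaningful is exactly the open question.

```c
#include <stddef.h>

/* Hypothetical helper, not driver code: move the non-zero entries to
 * the front of the array, preserving their order, and return how many
 * there are. Entries past the returned count are left stale. */
static size_t compact_nonzero_levels(unsigned int *clocks_in_khz, size_t n)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (clocks_in_khz[i] != 0)
			clocks_in_khz[kept++] = clocks_in_khz[i];
	return kept;
}
```

For the table in comment 4 (0, 0, 0, 1400000) this would leave a single level, which is fewer than the `num_levels - 2` indexing quoted in comment 1 can handle, so filtering alone would not be enough there.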
Comment 17 BRULE Herman 2019-08-30 20:49:12 UTC
Same here with 3400G
Comment 18 Karl Riki 2019-09-12 10:33:23 UTC
Similar here on an ASRock B450M Pro4 with an Athlon 200GE (Raven Ridge):
The screen blanks for a few seconds on boot, as well as on wakeups.

[    0.849231] Linux agpgart interface v0.103
[    0.927246] [drm] amdgpu kernel modesetting enabled.
[    0.927385] Parsing CRAT table with 1 nodes
[    0.927395] Creating topology SYSFS entries
[    0.927433] Topology: Add APU node [0x0:0x0]
[    0.927434] Finished initializing topology
[    0.927497] amdgpu 0000:07:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xe0000000 -> 0xefffffff
[    0.927498] amdgpu 0000:07:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xf0000000 -> 0xf01fffff
[    0.927499] amdgpu 0000:07:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfcb00000 -> 0xfcb7ffff
[    0.927501] amdgpu 0000:07:00.0: vgaarb: deactivate vga console
[    0.928793] Console: switching to colour dummy device 80x25
[    0.929058] [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x1002:0x15DD 0xCB).
[    0.929072] [drm] register mmio base: 0xFCB00000
[    0.929072] [drm] register mmio size: 524288
[    0.929130] [drm] add ip block number 0 <soc15_common>
[    0.929131] [drm] add ip block number 1 <gmc_v9_0>
[    0.929131] [drm] add ip block number 2 <vega10_ih>
[    0.929132] [drm] add ip block number 3 <psp>
[    0.929132] [drm] add ip block number 4 <gfx_v9_0>
[    0.929133] [drm] add ip block number 5 <sdma_v4_0>
[    0.929133] [drm] add ip block number 6 <powerplay>
[    0.929134] [drm] add ip block number 7 <dm>
[    0.929135] [drm] add ip block number 8 <vcn_v1_0>
[    0.929186] [drm] VCN decode is enabled in VM mode
[    0.929187] [drm] VCN encode is enabled in VM mode
[    0.929187] [drm] VCN jpeg decode is enabled in VM mode
[    0.952670] [drm] BIOS signature incorrect 20 7
[    0.952690] ATOM BIOS: 113-RAVEN-113
[    0.952723] [drm] RAS INFO: ras initialized successfully, hardware ability[0] ras_mask[0]
[    0.952725] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[    0.952737] amdgpu 0000:07:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[    0.952738] amdgpu 0000:07:00.0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[    0.952739] amdgpu 0000:07:00.0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    0.952743] [drm] Detected VRAM RAM=2048M, BAR=2048M
[    0.952744] [drm] RAM width 128bits DDR4
[    0.952804] [TTM] Zone  kernel: Available graphics memory: 7173824 KiB
[    0.952804] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[    0.952805] [TTM] Initializing pool allocator
[    0.952808] [TTM] Initializing DMA pool allocator
[    0.952865] [drm] amdgpu: 2048M of VRAM memory ready
[    0.952873] [drm] amdgpu: 3072M of GTT memory ready.
[    0.952884] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    0.953019] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
[    0.954302] [drm] use_doorbell being set to: [true]
[    0.954418] [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28
[    0.954428] [drm] PSP loading VCN firmware
[    0.975073] [drm] reserve 0x400000 from 0xf400c00000 for PSP TMR SIZE
[    1.140184] [drm] DM_PPLIB: values for F clock
[    1.140185] [drm] DM_PPLIB:	 0 in kHz
[    1.140186] [drm] DM_PPLIB:	 0 in kHz
[    1.140186] [drm] DM_PPLIB:	 0 in kHz
[    1.140186] [drm] DM_PPLIB:	 1333000 in kHz
[    1.140187] ------------[ cut here ]------------
[    1.140280] WARNING: CPU: 2 PID: 199 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1401 dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
[    1.140281] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart
[    1.140287] CPU: 2 PID: 199 Comm: modprobe Not tainted 5.2.14-arch1-1-ARCH #1
[    1.140288] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P3.50 07/18/2019
[    1.140364] RIP: 0010:dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
[    1.140366] Code: 48 8b 93 e0 02 00 00 db 42 78 83 f9 02 77 37 b8 02 00 00 00 8d 71 ff e9 ca 2b f7 ff 48 c7 c7 98 03 3e c0 31 c0 e8 6b 67 9b e1 <0f> 0b e9 44 2c f7 ff 48 c7 c7 98 03 3e c0 31 c0 e8 56 67 9b e1 0f
[    1.140367] RSP: 0018:ffff9bbc81d2f668 EFLAGS: 00010246
[    1.140369] RAX: 0000000000000024 RBX: ffff8addc5723000 RCX: 0000000000000000
[    1.140369] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 00000000ffffffff
[    1.140370] RBP: ffff8addc620c980 R08: 00000000000002b3 R09: 0000000000000004
[    1.140370] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9bbc81d2f708
[    1.140371] R13: 0000000000000001 R14: 000000000000000a R15: 0000000000000000
[    1.140372] FS:  00007fb896a9e740(0000) GS:ffff8addd0680000(0000) knlGS:0000000000000000
[    1.140373] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.140373] CR2: 0000565204d50058 CR3: 00000004060c8000 CR4: 00000000003406e0
[    1.140374] Call Trace:
[    1.140458]  dcn10_create_resource_pool+0x983/0xa50 [amdgpu]
[    1.140462]  ? _raw_spin_lock_irqsave+0x26/0x50
[    1.140537]  dc_create_resource_pool+0x1c0/0x270 [amdgpu]
[    1.140612]  dc_create+0x229/0x630 [amdgpu]
[    1.140615]  ? kmem_cache_alloc_trace+0x34/0x1c0
[    1.140687]  ? amdgpu_cgs_create_device+0x23/0x50 [amdgpu]
[    1.140763]  amdgpu_dm_init+0xeb/0x160 [amdgpu]
[    1.140839]  dm_hw_init+0xe/0x20 [amdgpu]
[    1.140915]  amdgpu_device_init.cold+0x1000/0x15e3 [amdgpu]
[    1.140975]  amdgpu_driver_load_kms+0x88/0x270 [amdgpu]
[    1.140987]  drm_dev_register+0x111/0x150 [drm]
[    1.141046]  amdgpu_pci_probe+0xbd/0x120 [amdgpu]
[    1.141049]  ? __pm_runtime_resume+0x49/0x60
[    1.141051]  local_pci_probe+0x42/0x80
[    1.141053]  ? pci_match_device+0xc5/0x100
[    1.141054]  pci_device_probe+0xfa/0x190
[    1.141057]  really_probe+0xf0/0x380
[    1.141058]  driver_probe_device+0xb6/0x100
[    1.141060]  device_driver_attach+0x53/0x60
[    1.141061]  __driver_attach+0x8a/0x150
[    1.141063]  ? device_driver_attach+0x60/0x60
[    1.141064]  ? device_driver_attach+0x60/0x60
[    1.141065]  bus_for_each_dev+0x89/0xd0
[    1.141066]  bus_add_driver+0x14a/0x1e0
[    1.141068]  driver_register+0x6c/0xb0
[    1.141070]  ? 0xffffffffc052b000
[    1.141072]  do_one_initcall+0x59/0x234
[    1.141076]  do_init_module+0x5c/0x230
[    1.141078]  load_module+0x2122/0x23a0
[    1.141082]  ? __se_sys_finit_module+0xa8/0x100
[    1.141083]  __se_sys_finit_module+0xa8/0x100
[    1.141086]  do_syscall_64+0x5f/0x1d0
[    1.141089]  ? page_fault+0x8/0x30
[    1.141091]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    1.141092] RIP: 0033:0x7fb896bbfe3d
[    1.141094] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 50 0c 00 f7 d8 64 89 01 48
[    1.141095] RSP: 002b:00007ffdaf7dd488 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    1.141096] RAX: ffffffffffffffda RBX: 0000557476d8caf0 RCX: 00007fb896bbfe3d
[    1.141097] RDX: 0000000000000000 RSI: 0000557475ac0400 RDI: 000000000000000d
[    1.141097] RBP: 0000557475ac0400 R08: 0000000000000000 R09: 0000000000000000
[    1.141098] R10: 000000000000000d R11: 0000000000000246 R12: 0000000000000000
[    1.141098] R13: 0000557476d8cb70 R14: 0000000000060000 R15: 0000557476d8caf0
[    1.141100] ---[ end trace e8ff844124760292 ]---
[    1.141102] [drm] DM_PPLIB: values for DCF clock
[    1.141102] [drm] DM_PPLIB:	 300000 in kHz
[    1.141103] [drm] DM_PPLIB:	 600000 in kHz
[    1.141103] [drm] DM_PPLIB:	 626000 in kHz
[    1.141103] [drm] DM_PPLIB:	 654000 in kHz
[    1.141368] [drm:construct [amdgpu]] *ERROR* construct: Invalid Connector ObjectID from Adapter Service for connector index:2! type 0 expected 3
[    1.144192] [drm] Display Core initialized with v3.2.27!
[    1.157042] [drm] SADs count is: -2, don't need to read it
[    1.157380] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    1.157380] [drm] Driver supports precise vblank timestamp query.
[    1.170880] [drm] VCN decode and encode initialized successfully(under SPG Mode).
[    1.171884] kfd kfd: Allocated 3969056 bytes on gart
[    1.171899] Topology: Add APU node [0x15dd:0x1002]
[    1.172085] kfd kfd: added device 1002:15dd
[    1.173162] [drm] fb mappable at 0x61000000
[    1.173162] [drm] vram apper at 0x60000000
[    1.173163] [drm] size 8294400
[    1.173163] [drm] fb depth is 24
[    1.173164] [drm]    pitch is 7680
[    1.173214] fbcon: amdgpudrmfb (fb0) is primary device
[    1.195455] Console: switching to colour frame buffer device 240x67
[    1.217430] amdgpu 0000:07:00.0: fb0: amdgpudrmfb frame buffer device
[    1.260159] amdgpu 0000:07:00.0: ring gfx uses VM inv eng 0 on hub 0
[    1.260163] amdgpu 0000:07:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    1.260165] amdgpu 0000:07:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    1.260167] amdgpu 0000:07:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    1.260169] amdgpu 0000:07:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    1.260171] amdgpu 0000:07:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    1.260172] amdgpu 0000:07:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    1.260174] amdgpu 0000:07:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    1.260176] amdgpu 0000:07:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    1.260178] amdgpu 0000:07:00.0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    1.260180] amdgpu 0000:07:00.0: ring sdma0 uses VM inv eng 0 on hub 1
[    1.260182] amdgpu 0000:07:00.0: ring vcn_dec uses VM inv eng 1 on hub 1
[    1.260184] amdgpu 0000:07:00.0: ring vcn_enc0 uses VM inv eng 4 on hub 1
[    1.260186] amdgpu 0000:07:00.0: ring vcn_enc1 uses VM inv eng 5 on hub 1
[    1.260187] amdgpu 0000:07:00.0: ring vcn_jpeg uses VM inv eng 6 on hub 1
[    1.263555] [drm] Initialized amdgpu 3.32.0 20150101 for 0000:07:00.0 on minor 0
[    1.353427] SCSI subsystem initialized
[    1.355923] xhci_hcd 0000:01:00.0: xHCI Host Controller
Comment 19 frido.ferdinand 2019-09-25 14:16:28 UTC
Same problem here. I initially installed Arch with an NVIDIA GTX 970 card and later switched to the internal Vega graphics over DisplayPort with a 144 Hz monitor. Now I get a blank (flickering) screen on boot. Details:

5.3.1-arch1-1-ARCH
ROG STRIX B450-F GAMING, BIOS 2801 09/18/2019
AMD Ryzen 5 2400G with Radeon Vega Graphics (Using displayport)
GeForce GTX 970

dmesg attached.
Comment 20 frido.ferdinand 2019-09-25 14:18:09 UTC
Created attachment 145514 [details]
dmesg.txt blank screen on boot
Comment 21 Janpieter Sollie 2019-10-17 12:12:56 UTC
Created attachment 145764 [details]
dmesg of PPLIB error

Still present on Linux 5.3, with the following differences:
- the values in mV seem to be initialized
- DRM does not complain about 'Cannot find any crtc or sizes' after the GPU is added
- the DRM construct error is gone

So it's heading in the right direction, I guess.
I investigated the source around dcn_bw_update_from_pplib

and saw the following code in gpu/drm/amd/display/dc/calcs/dcn_calcs.c:

========================================
        bool res;

        /* TODO: This is not the proper way to obtain fabric_and_dram_bandwidth, should be min(fclk, memclk) */
        res = dm_pp_get_clock_levels_by_type_with_voltage(
                        ctx, DM_PP_CLOCK_TYPE_FCLK, &fclks);

        kernel_fpu_begin();

        if (res)
                res = verify_clock_values(&fclks);

        if (res) {
//unimportant, left out
        } else
                BREAK_TO_DEBUGGER();
=============================================

This probably explains what happens: dm_pp_get_clock_levels_by_type_with_voltage fills fclks with a number of clock values and sets res to true.
verify_clock_values then tries to validate those clock values, which fails because of the invalid numbers.
After that, the code breaks to the debugger.
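
To make the failing step concrete, here is a minimal sketch of the kind of sanity check I assume verify_clock_values performs: reject an empty table or any level that reports 0 kHz. The struct layout and the function name below are hypothetical stand-ins, not the driver's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's clock-level structures. */
struct clock_level {
	unsigned int clocks_in_khz;
	unsigned int voltage_in_mv;
};

struct clock_levels {
	size_t num_levels;
	struct clock_level data[8];
};

/*
 * Assumed check: the table is considered sane only if it has at least
 * one level and no level reports a clock of 0 kHz.
 */
static bool clock_levels_look_sane(const struct clock_levels *clks)
{
	size_t i;

	if (clks->num_levels == 0)
		return false;

	for (i = 0; i < clks->num_levels; i++)
		if (clks->data[i].clocks_in_khz == 0)
			return false;

	return true;
}
```

Under that assumption, a table in which any level reports 0 kHz makes the check return false, which is the path that ends in BREAK_TO_DEBUGGER() in the excerpt above.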

Is it possible the vega11 needs more time to initialize its clock limits?
Comment 22 Janpieter Sollie 2019-11-01 16:16:34 UTC
Using kernel 5.3.8 + firmware 22/10 instead of 5.3.6, and GCC 8.3.0 instead of 9.2.0:
[    3.421856] [drm] use_doorbell being set to: [true]
[    3.421888] amdgpu: [powerplay] hwmgr_sw_init smu backed is smu10_smu
[    3.423662] [drm] Found VCN firmware Version: 1.73 Family ID: 18
[    3.423667] [drm] PSP loading VCN firmware
[    3.444352] [drm] reserve 0x400000 from 0xf400c00000 for PSP TMR
[    3.458608] usb 1-5: reset high-speed USB device number 2 using xhci_hcd
[    3.620099] [drm] DM_PPLIB: values for F clock
[    3.620101] [drm] DM_PPLIB:   400000 in kHz, 3649 in mV
[    3.620102] [drm] DM_PPLIB:   933000 in kHz, 3974 in mV
[    3.620102] [drm] DM_PPLIB:   1067000 in kHz, 4174 in mV
[    3.620103] [drm] DM_PPLIB:   1200000 in kHz, 4325 in mV
[    3.620104] [drm] DM_PPLIB: values for DCF clock
[    3.620105] [drm] DM_PPLIB:   300000 in kHz, 3649 in mV
[    3.620106] [drm] DM_PPLIB:   600000 in kHz, 3974 in mV
[    3.620107] [drm] DM_PPLIB:   626000 in kHz, 4174 in mV
[    3.620107] [drm] DM_PPLIB:   654000 in kHz, 4325 in mV
[    3.708441] [drm] Display Core initialized with v3.2.35!
[    3.733737] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    3.733738] [drm] Driver supports precise vblank timestamp query.
[    3.735809] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd
[    3.745448] [drm] VCN decode and encode initialized successfully(under SPG Mode).
[    3.746451] kfd kfd: Allocated 3969056 bytes on gart
[    3.747092] Topology: Add APU node [0x15dd:0x1002]
[    3.747094] kfd kfd: added device 1002:15dd
[    3.748312] [drm] fb mappable at 0xA1000000
[    3.748313] [drm] vram apper at 0xA0000000
[    3.748314] [drm] size 8294400
[    3.748314] [drm] fb depth is 24
[    3.748315] [drm]    pitch is 7680
[    3.748384] fbcon: amdgpudrmfb (fb0) is primary device
[    3.794386] Console: switching to colour frame buffer device 240x67
[    3.816030] amdgpu 0000:29:00.0: fb0: amdgpudrmfb frame buffer device
[    3.830110] amdgpu 0000:29:00.0: ring gfx uses VM inv eng 0 on hub 0
[    3.830113] amdgpu 0000:29:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    3.830115] amdgpu 0000:29:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    3.830117] amdgpu 0000:29:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    3.830120] amdgpu 0000:29:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    3.830121] amdgpu 0000:29:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    3.830123] amdgpu 0000:29:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    3.830125] amdgpu 0000:29:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    3.830127] amdgpu 0000:29:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    3.830129] amdgpu 0000:29:00.0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    3.830131] amdgpu 0000:29:00.0: ring sdma0 uses VM inv eng 0 on hub 1
[    3.830132] amdgpu 0000:29:00.0: ring vcn_dec uses VM inv eng 1 on hub 1
[    3.830134] amdgpu 0000:29:00.0: ring vcn_enc0 uses VM inv eng 4 on hub 1
[    3.830136] amdgpu 0000:29:00.0: ring vcn_enc1 uses VM inv eng 5 on hub 1
[    3.830138] amdgpu 0000:29:00.0: ring vcn_jpeg uses VM inv eng 6 on hub 1
[    3.843923] [drm] Initialized amdgpu 3.33.0 20150101 for 0000:29:00.0 on minor 0

Looks good, doesn't it? I don't know what actually changed the behaviour, but it seems to work.
Comment 23 Janpieter Sollie 2019-11-06 16:14:27 UTC
Additional comment:
When downgrading the BIOS version for the B450i from AA0 to A60, the bug appears again. It is probably caused by a combination of both software and EFI code.
Comment 24 Fab Stz 2019-11-19 08:26:23 UTC
This is not fixed at all for me with the setup below, so I am reopening:
Linux debian 5.3.0-2-amd64 #1 SMP Debian 5.3.9-2 (2019-11-12) x86_64 GNU/Linux

CPU/APU: Athlon 200GE with integrated Vega 3
MB: Asus PRIME B450M-A, BIOS 1823 10/15/2019



[    1.575098] [drm] DM_PPLIB: values for F clock
[    1.575099] [drm] DM_PPLIB:   0 in kHz, 3649 in mV
[    1.575099] [drm] DM_PPLIB:   0 in kHz, 3649 in mV
[    1.575100] [drm] DM_PPLIB:   0 in kHz, 3649 in mV
[    1.575100] [drm] DM_PPLIB:   1200000 in kHz, 4399 in mV
[    1.575101] ------------[ cut here ]------------
[    1.575206] WARNING: CPU: 1 PID: 151 at drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1452 dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
[    1.575207] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit ttm drm_kms_helper xhci_pci crc32c_intel drm psmouse ahci i2c_piix4 libahci xhci_hcd libata usbcore scsi_mod mfd_core usb_common wmi video gpio_amdpt gpio_generic button
[    1.575217] CPU: 1 PID: 151 Comm: systemd-udevd Not tainted 5.3.0-2-amd64 #1 Debian 5.3.9-2
[    1.575218] Hardware name: System manufacturer System Product Name/PRIME B450M-A, BIOS 1823 10/15/2019
[    1.575304] RIP: 0010:dcn_bw_update_from_pplib.cold+0x73/0x9c [amdgpu]
[    1.575306] Code: 48 8b 93 60 03 00 00 db 42 78 83 f9 02 77 37 b8 02 00 00 00 8d 71 ff e9 3b fc f3 ff 48 c7 c7 c0 af 6d c0 31 c0 e8 4c bc 6c dd <0f> 0b e9 b5 fc f3 ff 48 c7 c7 c0 af 6d c0 31 c0 e8 37 bc 6c dd 0f
[    1.575307] RSP: 0018:ffffb77040437650 EFLAGS: 00010246
[    1.575308] RAX: 0000000000000024 RBX: ffff9e9207a90000 RCX: 000000000000032e
[    1.575309] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 0000000000000247
[    1.575310] RBP: ffff9e9208cb7680 R08: 000000000000032e R09: 0000000000000004
[    1.575310] R10: 0000000000000000 R11: 0000000000000001 R12: ffffb770404376f0
[    1.575311] R13: 0000000000000001 R14: ffff9e9209613200 R15: ffffb77040437880
[    1.575312] FS:  00007f43eee43d40(0000) GS:ffff9e9217240000(0000) knlGS:0000000000000000
[    1.575313] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.575313] CR2: 00007ffd4d5e9d28 CR3: 0000000209774000 CR4: 00000000003406e0
[    1.575314] Call Trace:
[    1.575402]  dcn10_create_resource_pool+0x99d/0xa50 [amdgpu]
[    1.575407]  ? _cond_resched+0x15/0x30
[    1.575409]  ? kmem_cache_alloc_trace+0x18d/0x210
[    1.575491]  ? firmware_parser_create+0x17e/0x5f0 [amdgpu]
[    1.575571]  dc_create_resource_pool+0x1b2/0x1d0 [amdgpu]
[    1.575653]  ? dal_gpio_service_create+0x95/0xe0 [amdgpu]
[    1.575732]  dc_create+0x233/0x660 [amdgpu]
[    1.575806]  ? amdgpu_cgs_create_device+0x23/0x50 [amdgpu]
[    1.575885]  amdgpu_dm_init+0x130/0x1b0 [amdgpu]
[    1.575962]  ? phm_wait_for_register_unequal.part.0+0x50/0x80 [amdgpu]
[    1.576039]  dm_hw_init+0xe/0x20 [amdgpu]
[    1.576125]  amdgpu_device_init.cold+0x1502/0x1738 [amdgpu]
[    1.576183]  amdgpu_driver_load_kms+0x58/0x1c0 [amdgpu]
[    1.576195]  drm_dev_register+0x111/0x150 [drm]
[    1.576252]  amdgpu_pci_probe+0x154/0x1b0 [amdgpu]
[    1.576255]  local_pci_probe+0x42/0x80
[    1.576258]  pci_device_probe+0x104/0x1a0
[    1.576261]  really_probe+0xf0/0x380
[    1.576263]  driver_probe_device+0x59/0xd0
[    1.576265]  device_driver_attach+0x53/0x60
[    1.576266]  __driver_attach+0x8a/0x150
[    1.576268]  ? device_driver_attach+0x60/0x60
[    1.576270]  bus_for_each_dev+0x78/0xc0
[    1.576272]  bus_add_driver+0x14a/0x1e0
[    1.576273]  driver_register+0x6c/0xb0
[    1.576274]  ? 0xffffffffc085d000
[    1.576277]  do_one_initcall+0x46/0x1f4
[    1.576278]  ? _cond_resched+0x15/0x30
[    1.576280]  ? kmem_cache_alloc_trace+0x1d4/0x210
[    1.576282]  ? do_init_module+0x23/0x230
[    1.576283]  do_init_module+0x5c/0x230
[    1.576284]  load_module+0x2349/0x24f0
[    1.576287]  ? __do_sys_finit_module+0xaf/0x110
[    1.576288]  __do_sys_finit_module+0xaf/0x110
[    1.576290]  do_syscall_64+0x53/0x140
[    1.576292]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    1.576294] RIP: 0033:0x7f43ef62df59
[    1.576296] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
[    1.576296] RSP: 002b:00007ffd4d5ee0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    1.576298] RAX: ffffffffffffffda RBX: 00005643081f44b0 RCX: 00007f43ef62df59
[    1.576298] RDX: 0000000000000000 RSI: 00007f43ef532cad RDI: 0000000000000011
[    1.576299] RBP: 00007f43ef532cad R08: 0000000000000000 R09: 0000000000000000
[    1.576299] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000000000
[    1.576300] R13: 00005643081e1aa0 R14: 0000000000020000 R15: 00005643081f44b0
[    1.576302] ---[ end trace 0ec2ed871fa17fbb ]---
[    1.576306] [drm] DM_PPLIB: values for DCF clock
[    1.576307] [drm] DM_PPLIB:   300000 in kHz, 3649 in mV
[    1.576307] [drm] DM_PPLIB:   600000 in kHz, 3974 in mV
[    1.576308] [drm] DM_PPLIB:   626000 in kHz, 4174 in mV
[    1.576308] [drm] DM_PPLIB:   654000 in kHz, 4325 in mV
Comment 25 Martin Peres 2019-11-20 07:56:06 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/963.

