Bug 106263

Summary: amdgpu produces several stack traces (Fiji, Bonaire) at boot since kernel 4.16.4
Product: DRI Reporter: erhard_f
Component: DRM/AMDgpu Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED WORKSFORME QA Contact:
Severity: normal    
Priority: medium CC: charlene.liu, harry.wentland, hiwatari.seiji, lucas.yamanishi
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
kernel config 4.16.5 none
journalctl -k (kernel 4.16.5) none
bisect.log none

Description erhard_f 2018-04-26 23:12:51 UTC
Created attachment 139156 [details]
kernel config 4.16.5

Getting these stack traces at boot time since kernel 4.16.4. Last working was 4.16.3. On another machine I got similar issues since kernel 4.14.36 (last working was 4.14.35 there). Booting is somewhat delayed (maybe other issues as well?), but after a few minutes I get a working console and X.

Apr 27 00:49:02 hakla03 kernel: WARNING: CPU: 2 PID: 148 at drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:132 generic_reg_update_ex+0xe4/0x120 [amdgpu]
Apr 27 00:49:02 hakla03 kernel: Modules linked in: ohci_pci(+) evdev crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 crypto_simd cryptd glue_helper amdgpu(+) r8169 k10temp mii chash i2c_algo_bit gpu_sched i2c_piix4 ohci_hcd ehci_pci drm_kms_helper ehci_hcd syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm xhci_pci xhci_hcd drm_panel_orientation_quirks usbcore usb_common shpchp video acpi_cpufreq button processor nct6775 hwmon_vid hwmon
Apr 27 00:49:02 hakla03 kernel: CPU: 2 PID: 148 Comm: systemd-udevd Not tainted 4.16.5-gentoo #2
Apr 27 00:49:02 hakla03 kernel: Hardware name: System manufacturer System Product Name/A88X-PRO, BIOS 2603 03/10/2016
Apr 27 00:49:02 hakla03 kernel: RIP: 0010:generic_reg_update_ex+0xe4/0x120 [amdgpu]
Apr 27 00:49:02 hakla03 kernel: RSP: 0018:ffff880420e33320 EFLAGS: 00010246
Apr 27 00:49:02 hakla03 kernel: RAX: ffff880420e33338 RBX: ffff88042006f140 RCX: 0000000000000000
Apr 27 00:49:02 hakla03 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88042c63ac00
Apr 27 00:49:02 hakla03 kernel: RBP: ffff880420e33388 R08: 0000000000000000 R09: 0000000000000000
Apr 27 00:49:02 hakla03 kernel: R10: ffff880420e333a0 R11: 0000000000000001 R12: 0000000000000001
Apr 27 00:49:02 hakla03 kernel: R13: 0000000000000000 R14: ffff88042021e000 R15: ffff88041da10000
Apr 27 00:49:02 hakla03 kernel: FS:  00007fbfb3b21840(0000) GS:ffff88043ed00000(0000) knlGS:0000000000000000
Apr 27 00:49:02 hakla03 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 27 00:49:02 hakla03 kernel: CR2: 00007f8222b4af08 CR3: 0000000420e26000 CR4: 00000000000406e0
Apr 27 00:49:02 hakla03 kernel: Call Trace:
Apr 27 00:49:02 hakla03 kernel:  dce110_stream_encoder_update_hdmi_info_packets+0x374/0x590 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  apply_single_controller_ctx_to_hw+0x217/0x340 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? dce110_apply_ctx_to_hw+0x4c1/0x6d8 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? dc_commit_state+0x2ef/0x548 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? mod_freesync_set_user_enable+0x105/0x130 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_dm_atomic_commit_tail+0x353/0xdb0 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_bo_pin_restricted+0x1a8/0x288 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? dm_plane_helper_prepare_fb+0x16f/0x238 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? commit_tail+0x38/0x60 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? drm_atomic_helper_commit+0xaf/0x120 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? restore_fbdev_mode_atomic+0x1a8/0x200 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? drm_fb_helper_restore_fbdev_mode_unlocked+0x40/0x88 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? drm_fb_helper_set_par+0x24/0x50 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? fbcon_init+0x525/0x6b8
Apr 27 00:49:02 hakla03 kernel:  ? visual_init+0xcd/0x128
Apr 27 00:49:02 hakla03 kernel:  ? do_bind_con_driver+0x1ee/0x3e8
Apr 27 00:49:02 hakla03 kernel:  ? do_take_over_console+0x76/0x180
Apr 27 00:49:02 hakla03 kernel:  ? do_fbcon_takeover+0x52/0xa8
Apr 27 00:49:02 hakla03 kernel:  ? notifier_call_chain+0x41/0x60
Apr 27 00:49:02 hakla03 kernel:  ? blocking_notifier_call_chain+0x39/0x58
Apr 27 00:49:02 hakla03 kernel:  ? down+0xd/0x50
Apr 27 00:49:02 hakla03 kernel:  ? register_framebuffer+0x21d/0x2f8
Apr 27 00:49:02 hakla03 kernel:  ? __drm_fb_helper_initial_config_and_unlock+0x209/0x3e0 [drm_kms_helper]
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_fbdev_init+0xba/0xf0 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_device_init+0xc88/0x1228 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? __alloc_pages_nodemask+0xcf/0x1b8
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_driver_load_kms+0x73/0x1c8 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? drm_dev_register+0x12d/0x1b8 [drm]
Apr 27 00:49:02 hakla03 kernel:  ? amdgpu_pci_probe+0xf4/0x180 [amdgpu]
Apr 27 00:49:02 hakla03 kernel:  ? local_pci_probe+0x3d/0x90
Apr 27 00:49:02 hakla03 kernel:  ? pci_device_probe+0xae/0x128
Apr 27 00:49:02 hakla03 kernel:  ? driver_probe_device+0x20b/0x318
Apr 27 00:49:02 hakla03 kernel:  ? __driver_attach+0x88/0x90
Apr 27 00:49:02 hakla03 kernel:  ? driver_probe_device+0x318/0x318
Apr 27 00:49:02 hakla03 kernel:  ? bus_for_each_dev+0x70/0xa0
Apr 27 00:49:02 hakla03 kernel:  ? bus_add_driver+0x18c/0x210
Apr 27 00:49:02 hakla03 kernel:  ? 0xffffffffa047f000
Apr 27 00:49:02 hakla03 kernel:  ? driver_register+0x52/0xc0
Apr 27 00:49:02 hakla03 kernel:  ? 0xffffffffa047f000
Apr 27 00:49:02 hakla03 kernel:  ? do_one_initcall+0x49/0x180
Apr 27 00:49:02 hakla03 kernel:  ? __vunmap+0x67/0xa0
Apr 27 00:49:02 hakla03 kernel:  ? do_init_module+0x51/0x1d6
Apr 27 00:49:02 hakla03 kernel:  ? load_module+0x2007/0x25b8
Apr 27 00:49:02 hakla03 kernel:  ? SYSC_finit_module+0x90/0xa8
Apr 27 00:49:02 hakla03 kernel:  ? SYSC_finit_module+0x90/0xa8
Apr 27 00:49:02 hakla03 kernel:  ? do_syscall_64+0x69/0x1a0
Apr 27 00:49:02 hakla03 kernel:  ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr 27 00:49:02 hakla03 kernel: Code: a8 45 84 d2 75 4b 48 8b 7f 18 89 da 48 8b 07 48 8b 40 38 e8 df 6a 47 e1 48 83 c4 48 89 d8 5b 41 5a 41 5c 41 5d 5d c3 0f 0b eb c3 <0f> 0b e9 52 ff ff ff 41 8b 0c 24 41 89 c0 49 83 c4 08 45 8b 2c 
Apr 27 00:49:02 hakla03 kernel: ---[ end trace 647d706200fdea57 ]---
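
When comparing traces like the one above across machines or kernel versions, the two most useful pieces are the WARNING site (file, line, function) and the first call-trace frame that is not prefixed with "? " (frames marked "?" are ones the unwinder is unsure about). The following is a minimal Python sketch for pulling those out of `journalctl -k` output. The helper name `summarize_warning` and the regular expressions are illustrative assumptions based only on the log format shown in this report, not part of any kernel tooling.

```python
import re

# Header of a kernel WARNING as it appears in the journal, e.g.
# "WARNING: CPU: 2 PID: 148 at <file>:<line> <func>+0x../0x.. [module]"
WARNING_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)

# One call-trace frame; a leading "? " marks an unreliable frame.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<guess>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_warning(log_lines):
    """Return a dict describing the first WARNING found, or None.

    'caller' is the first call-trace frame without a '? ' prefix,
    or None if the trace contains no such frame.
    """
    summary = None
    in_trace = False
    for line in log_lines:
        if summary is None:
            m = WARNING_RE.search(line)
            if m:
                summary = {
                    "file": m.group("file"),
                    "line": int(m.group("line")),
                    "func": m.group("func"),
                    "module": m.group("module"),
                    "caller": None,
                }
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if "end trace" in line:
            break
        if in_trace:
            frame = FRAME_RE.search(line)
            if frame and not frame.group("guess"):
                summary["caller"] = frame.group("func")
                break
    return summary
```

Fed the lines of the trace above, this should report generic_reg_update_ex at dm_services.h:132 with dce110_stream_encoder_update_hdmi_info_packets as the caller. Two traces that share the same WARNING site but differ in caller are a hint that they come from different code paths.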

$ inxi -b
System:    Host: hakla03 Kernel: 4.16.5-gentoo x86_64 bits: 64 Console: tty 0
           Distro: Gentoo Base System release 2.4.1
Machine:   Device: desktop Mobo: ASUSTeK model: A88X-PRO v: Rev X.0x serial: N/A
           UEFI [Legacy]: American Megatrends v: 2603 date: 03/10/2016
CPU:       Quad core AMD A8-6600K APU with Radeon HD Graphics (-MCP-) speed/max: 1897/3600 MHz
Graphics:  Card-1: Advanced Micro Devices [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series]
           Card-2: Advanced Micro Devices [AMD/ATI] Bonaire XTX [Radeon R7 260X/360]
           Display Server: X.org 1.19.5 drivers: ati,amdgpu (unloaded: modesetting,radeon)
           tty size: 211x53 Advanced Data: N/A out of X
Network:   Card: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller driver: r8169
Drives:    HDD Total Size: 128.0GB (4.8% used)
Info:      Processes: 200 Uptime: 11 min Memory: 568.1/16054.6MB Init: systemd Client: Shell (bash) inxi: 2.3.56 

# lspci 
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Complex
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) I/O Memory Management Unit
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Port
00:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Root Port
00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09)
00:10.1 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 16)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] FCH PCI Bridge (rev 40)
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Hudson PCI to PCI bridge (PCIE port 0)
00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Hudson PCI to PCI bridge (PCIE port 1)
00:15.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Hudson PCI to PCI bridge (PCIE port 2)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models 10h-1fh) Processor Function 5
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev cb)
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Fiji HDMI/DP Audio [Radeon R9 Nano / FURY/FURY X]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XTX [Radeon R7 260X/360]
02:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Tobago HDMI Audio [Radeon R7 360 / R9 360 OEM]
05:00.0 USB controller: ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller
06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 11)
Comment 1 erhard_f 2018-04-26 23:13:57 UTC
Created attachment 139157 [details]
journalctl -k (kernel 4.16.5)
Comment 2 Harry Wentland 2018-04-27 14:14:08 UTC
Are you able to bisect?
Comment 3 erhard_f 2018-04-27 14:23:13 UTC
I did that once, so technically yes. I hope I'll have time for it in the next few days.
Comment 4 hiwatari.seiji 2018-04-28 09:57:24 UTC
Having the same problem with HAWAII (R9 290) and amdgpu.
I bisected the bug and will upload my bisect.log
Comment 5 hiwatari.seiji 2018-04-28 09:57:53 UTC
Created attachment 139194 [details]
bisect.log
Comment 6 Michel Dänzer 2018-04-30 09:28:31 UTC
Pasting bisected commit for convenience:

0c04975852a65307a909db6f235166c10301c950 is the first bad commit
commit 0c04975852a65307a909db6f235166c10301c950
Author: Charlene Liu <charlene.liu@amd.com>
Date:   Fri Apr 6 23:03:12 2018 -0400

    drm/amd/display: HDMI has no sound after Panel power off/on
    
    commit af2ac326087da632e9580f65205f4cc4205caf85 upstream.
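
The bisect output above carries two hashes: the stable-tree commit that bisect flagged, and the mainline commit it was backported from (the "commit ... upstream." line is the usual stable-kernel backport marker). A small Python sketch for extracting both from pasted bisect output follows. The function name `bisect_result` is hypothetical, and the patterns assume exactly the text layout shown in this comment.

```python
import re

# "<40-hex-sha> is the first bad commit", as printed at the end of a bisect
FIRST_BAD_RE = re.compile(r"^([0-9a-f]{40}) is the first bad commit$", re.M)

# "commit <40-hex-sha> upstream." marker inside a stable backport message
UPSTREAM_RE = re.compile(r"^[ \t]*commit ([0-9a-f]{40}) upstream\.$", re.M)


def bisect_result(text):
    """Return (first_bad_sha, upstream_sha); either may be None."""
    bad = FIRST_BAD_RE.search(text)
    upstream = UPSTREAM_RE.search(text)
    return (
        bad.group(1) if bad else None,
        upstream.group(1) if upstream else None,
    )
```

Having the upstream hash makes it easier to check whether a later mainline fix references the same change.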
Comment 7 erhard_f 2018-06-24 21:52:33 UTC
Issue still present on 4.17.2.
Comment 8 Harry Wentland 2018-06-27 20:00:26 UTC
The issue caused by the bisected commit is different from what was originally reported (both are in generic_reg_update_ex, but otherwise completely different codepaths and cases).

The issue from the bisected commit is fixed in 4.18-rc1 by this commit:

commit 9356badb2636b0afe2b34a8133ab246547cdf9ca
Author: Roman Li <Roman.Li@amd.com>
Date:   Thu May 17 18:08:54 2018 -0400

    drm/amd/display: check if audio clk enable is applicable
    
    Fixing warning on dce10 with HDMI display.
    
    Signed-off-by: Roman Li <Roman.Li@amd.com>
    Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
    Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Comment 9 erhard_f 2018-06-28 23:09:28 UTC
Tried 4.18-rc2 today, which also fixed the issue for me, whether it was this fix or some other. ;)

Hence closing.
