| Field | Value |
|---|---|
| Summary | REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:563 during boot |
| Product | DRI |
| Component | DRM/AMDgpu |
| Status | RESOLVED MOVED |
| Severity | major |
| Priority | medium |
| Version | unspecified |
| Hardware | x86-64 (AMD64) |
| OS | Linux (All) |
| Reporter | Tobias Theisselmann <mail> |
| Assignee | Default DRI bug account <dri-devel> |
| CC | asmith, harry.wentland, root.main, sunpeng.li |

Description
Tobias Theisselmann
2018-09-16 06:34:40 UTC
Bug 107947 might be related, but the backtrace appears to be rather different.

(In reply to Gediminas Jakutis from comment #1)
> Bug 107947 might be related, but the backtrace appears to be rather
> different.

please ignore this comment - accidentally replied to the wrong bug report in the wrong browser tab

I have this issue, or one very like it.

I have a system with both a Vega64 and a GTX 980ti. The 980ti is controlled by the vfio-pci driver (preempting nouveau, nvidiafb, etc. on boot) and is therefore not in use by this host system.

If a display is plugged into the 980ti at boot time, booting will always fail with a black screen and more failures from drm. If a display is not plugged in, booting will almost always work without a problem, with fewer errors from drm.

Even with no display plugged into the 980ti, when the system's displays (plugged into the vega64) resume when they have been sleeping (such as on the lock screen) there is a substantial chance all displays will be completely black and drm failures can be seen in dmesg. If a display is plugged into the 980ti when the main displays wake, the outcome is much less likely to be good.

In the case where the system is returning from sleeping screens it fortunately does not fully necessitate a complete reboot, as the displays will sleep again in a few seconds and it can be tried again until success. Nonetheless, as this system is intended to be used as a VM host with a passed-through monitor, this bug is a showstopper.

I have full dmesg logs from boot to an unusable black screen, and from boot to a login screen.

Created attachment 142268 [details]
dmesg logs for failed boot to black screen
Created attachment 142269 [details]
dmesg logs for successful boot to login screen
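
The two attachments lend themselves to a quick side-by-side count of drm errors, since the failing boot is described as producing more of them. Below is a minimal Python sketch, assuming the logs use the `[drm:function [module]] *ERROR*` prefix that appears in the trace quoted in this report. The helper name `count_drm_errors` is hypothetical, and the sample strings are illustrative stand-ins, not the contents of the attachments.

```python
import re
from collections import Counter

# Matches "[drm:func [module]] *ERROR*" as well as "[drm:func] *ERROR*".
# Assumption: the logs follow the prefix format quoted in this report.
DRM_ERROR_RE = re.compile(r"\[drm:(\w+)[^\]]*\]+ \*ERROR\*")


def count_drm_errors(log_text):
    """Count drm *ERROR* lines per reporting function."""
    return Counter(DRM_ERROR_RE.findall(log_text))


# Illustrative input only: the single error line quoted in this report,
# repeated to imitate a noisier failing boot.
line = ("[drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout "
        "10us * 3500 tries - dce_mi_free_dmif line:636")
failed_boot = "\n".join([line, line])
good_boot = line

failed_counts = count_drm_errors(failed_boot)
good_counts = count_drm_errors(good_boot)

# Print one row per function: name, failed-boot count, good-boot count.
for func in sorted(set(failed_counts) | set(good_counts)):
    print(func, failed_counts[func], good_counts[func])
```

To use it on real logs, read each saved dmesg file into a string and pass it to `count_drm_errors`; a function that shows up only in the failing log is a reasonable place to start looking.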

Good-citizen update for you: I have resolved my immediate problem. These apparent crashes are still happening pretty often when display modes change, but do not seem to be the root cause of my issue.

(In summary: my AMD and NVIDIA GPUs were in slots 1 and 4 of my motherboard; I moved them to slots 3 and 1 respectively, which gives both of them full x16 lanes instead of both being on x8 lanes, if the documentation tells the truth. While the system still shows the BIOS and GRUB on the NVIDIA card, booting and resuming in Ubuntu now seems to work fine. Quite unsure why, but there you have it.)

Apologies for any unhelpful noise, good luck with whatever this bug actually is, and thank you for your work.

This is occurring for me on a Vega 64. When it occurs the machine boots to a black screen.

It started happening after upgrading to Fedora 27's 4.18.16; previously I was on 4.18.7 (does not happen there). I've since updated to Fedora 29, 4.18.16-300.fc29.x86_64, and it still happens there.

[drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:636
WARNING: CPU: 6 PID: 122 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:254 generic_reg_wait+0xe8/0x160 [amdgpu]
Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c tun bridge stp llc ebtable_filter ebtables > ttm alx drm crc32c_intel mdio video
CPU: 6 PID: 122 Comm: kworker/6:1 Not tainted 4.18.16-100.fc27.x86_64 #1
Hardware name: Gigabyte Technology Co., Ltd. Z170X-Gaming 3/Z170X-Gaming 3, BIOS F7 06/03/2016
Workqueue: events drm_mode_rmfb_work_fn [drm]
RIP: 0010:generic_reg_wait+0xe8/0x160 [amdgpu]
Code: 58 48 8b 4c 24 50 89 ee 8b 54 24 48 48 c7 c7 e0 93 75 c0 44 89 4c 24 08 e8 b5 4f c1 ff 41 83 7c 24 20 01 44 8b 4c 24 08 74 02 <0f> 0b 48 83 c4 10 44 89 c8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f af
RSP: 0018:ffffb846441e7a78 EFLAGS: 00010297
RAX: 0000000000000000 RBX: 0000000000000dad RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9365c1d96938 RDI: ffff9365c1d96938
RBP: 000000000000000a R08: 0000000000000473 R09: 0000000000000002
R10: 000000000000beee R11: ffffffff889a21ed R12: ffff93659551d900
R13: 00000000000035af R14: 0000000000000010 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff9365c1d80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000250e080 CR3: 00000005e720a005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 dce_mi_free_dmif+0xf7/0x180 [amdgpu]
 dce110_reset_hw_ctx_wrap+0x13f/0x1e0 [amdgpu]
 dce110_apply_ctx_to_hw+0x58/0x9e0 [amdgpu]
 ? _cond_resched+0x15/0x40
 ? pp_dpm_dispatch_tasks+0x41/0x60 [amdgpu]
 ? amdgpu_pm_compute_clocks.part.9+0xb7/0x590 [amdgpu]
 dc_commit_state+0x30a/0x590 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0x385/0xd70 [amdgpu]
 ? _cond_resched+0x15/0x40
 ? wait_for_completion_timeout+0x3a/0x190
 ? wait_for_completion_interruptible+0x35/0x1c0
 commit_tail+0x3d/0x70 [drm_kms_helper]
 drm_atomic_helper_commit+0xfc/0x110 [drm_kms_helper]
 drm_framebuffer_remove+0x30d/0x400 [drm]
 drm_mode_rmfb_work_fn+0x4f/0x60 [drm]
 process_one_work+0x18f/0x370
 worker_thread+0x30/0x380
 ? process_one_work+0x370/0x370
 kthread+0x113/0x130
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x35/0x40
---[ end trace 9529270edb28c719 ]---

I think this is related to display power management - it occurs when the monitor is woken from standby.
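
For anyone comparing logs across kernel versions, the REG_WAIT line quoted above can be decoded mechanically. Read literally, it says the register was polled 3500 times at 10us intervals before the driver gave up, which is a nominal wait of about 35 ms, not counting the time spent on the register reads themselves. The sketch below is a small Python helper under that reading. The name `parse_reg_wait` is hypothetical and the pattern assumes the exact message wording seen in this report, which may differ on other kernels.

```python
import re

# Pattern for the message wording quoted in this report (assumption:
# other kernel versions may format the message differently).
REG_WAIT_RE = re.compile(
    r"REG_WAIT timeout (?P<delay_us>\d+)us \* (?P<tries>\d+) tries - "
    r"(?P<func>\w+) line:(?P<line>\d+)"
)


def parse_reg_wait(text):
    """Return one dict per REG_WAIT timeout message found in text."""
    results = []
    for match in REG_WAIT_RE.finditer(text):
        delay_us = int(match.group("delay_us"))
        tries = int(match.group("tries"))
        results.append({
            "func": match.group("func"),
            "line": int(match.group("line")),
            "delay_us": delay_us,
            "tries": tries,
            # Nominal polling budget: per-try delay times number of tries.
            "total_ms": delay_us * tries / 1000,
        })
    return results


sample = ("[drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout "
          "10us * 3500 tries - dce_mi_free_dmif line:636")
for entry in parse_reg_wait(sample):
    print(entry)
```

The `line` field is worth keeping in the output: the bug summary reports `line:563` while this trace reports `line:636` for the same function, so the number evidently varies between builds and helps tell reports apart.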

The blank screen I get at startup occurs between the login screen (gdm) and the desktop. After logging in, gdm briefly puts the display to sleep, and then when it wakes up again the screen is blank and this error appears in dmesg. I've found I'm usually able to get things working again by doing a few VT switches between gdm and the desktop, but this is still quite annoying as it is happening almost every boot.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/528.