Summary: | Kernel panic when waking up after screen goes blank. | ||
---|---|---|---|
Product: | DRI | Reporter: | L.S.S. <ragnaros39216> |
Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | critical | ||
Priority: | high | CC: | abolte, bugs, harry.wentland, juston.li, mezcalbert, nicholas.kazlauskas, tones111, wirch.eduard |
Version: | unspecified | Keywords: | regression |
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=109001 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Description
L.S.S.
2018-02-09 01:13:44 UTC
(In reply to L.S.S. from comment #0)
> I've thought about the possibility of it being DC-related as I saw similar
> bug reports, but I was wrong, as at one time I was able to reproduce it even
> after passing amdgpu.dc=0 during boot.

The rest of your report mostly points towards a DC-specific issue. If you can still reproduce an issue without DC, it would be best to file a separate report about that.

I too have the same problem.

Sorry, I forgot to write: I too have the same problem, but on a desktop.

Just now I tried reproducing it without DC (passing amdgpu.dc=0), but I was not able to: the system successfully got back to the lock screen after the screen had been blank for an extended period. As for the one time I did manage to reproduce it, maybe I passed the parameter incorrectly, or there was some other reason. For now I will keep the issue DC-specific, as it is always reproducible with DC enabled (Arch/Manjaro enable DC by default, including on pre-Vega hardware).

Can you attach a full dmesg log with amdgpu.dc_log=1 and drm.debug=6 passed as kernel options?

Created attachment 137308 [details]
dmesg output with amdgpu.dc_log=1 and drm.debug=6, right after login.
I'm not sure what counts as a "full" dmesg. Attached is the dmesg I exported right after startup, with the above parameters passed.
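Since one earlier attempt may have been made with a mistyped parameter, it can help to confirm that the requested options actually reached the running kernel before collecting logs. The following is a minimal sketch, assuming a Linux system where /proc/cmdline holds the boot command line; the helper names (parse_cmdline, missing_params, read_running_cmdline) are made up for illustration, and quoted values containing spaces are not handled.

```python
from pathlib import Path

# Kernel options requested in this report for DC debug logging.
REQUESTED = {"amdgpu.dc_log": "1", "drm.debug": "6"}


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def missing_params(cmdline, wanted=REQUESTED):
    """Return the names from `wanted` that are absent or set to a different value."""
    params = parse_cmdline(cmdline)
    return [name for name, value in wanted.items() if params.get(name) != value]


def read_running_cmdline():
    """Read the command line the running kernel was booted with (Linux only)."""
    return Path("/proc/cmdline").read_text()
```

On the affected machine, missing_params(read_running_cmdline()) returning an empty list would suggest both options took effect. Passing {"amdgpu.dc": "0"} as the second argument checks the DC-disabled case in the same way.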
And I just crashed my system the same usual way. With those parameters set there are some additional outputs besides the usual ones. Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:handle_cursor_update.isra.22 [amdgpu]] handle_cursor_update: crtc_id=0 with size 128 to 128 Feb 13 11:31:55 linuxsys kernel: [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:handle_cursor_update.isra.22 [amdgpu]] handle_cursor_update: crtc_id=0 with size 0 to 0 Feb 13 11:31:55 linuxsys kernel: [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:handle_cursor_update.isra.22 [amdgpu]] handle_cursor_update: crtc_id=0 with size 128 to 128 Feb 13 11:31:55 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding the best encoder Feb 13 11:31:55 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding the best encoder Feb 13 11:31:55 linuxsys kernel: [drm:update_stream_scaling_settings [amdgpu]] Destination Rectangle x:0 y:0 width:1920 height:1080 Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:0, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding 
the best encoder Feb 13 11:31:55 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding the best encoder Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_planes_state.part.28 [amdgpu]] Disabling DRM plane: 36 on DRM crtc 43 Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:0, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Disabling DRM crtc: 43 Feb 13 11:31:55 linuxsys kernel: [drm:update_stream_scaling_settings [amdgpu]] Destination Rectangle x:0 y:0 width:1920 height:1080 Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:55 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:0, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:1, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] Atomic commit: RESET. 
crtc id 0:[000000000ce1e17c] Feb 13 11:31:55 linuxsys kernel: [drm] dc_commit_state: 0 streams Feb 13 11:31:55 linuxsys kernel: [drm] hwss_edp_backlight_control: backlight action: Off Feb 13 11:31:55 linuxsys kernel: [drm] hwss_edp_backlight_control: backlight action: Off Feb 13 11:31:55 linuxsys kernel: [drm:amdgpu_vm_init [amdgpu]] VM update mode is SDMA Feb 13 11:31:55 linuxsys kernel: [drm] hwss_edp_backlight_control: backlight action: Off Feb 13 11:31:55 linuxsys kernel: [drm] hwss_edp_power_control: Panel Power action: Off Feb 13 11:31:56 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:56 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:56 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:56 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:56 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:56 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:31:56 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Mode change not required, setting mode_changed to 0 Feb 13 11:31:56 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:0, planes_changed:1, mode_changed:0,active_changed:0,connectors_changed:0 Feb 13 11:32:06 linuxsys kernel: BUG: unable to handle kernel NULL pointer dereference at (null) Feb 13 11:32:06 linuxsys kernel: IP: dce110_vblank_set+0x4f/0xb0 
[amdgpu] Feb 13 11:32:06 linuxsys kernel: PGD 7d98ee067 P4D 7d98ee067 PUD 7d98ef067 PMD 0 Feb 13 11:32:06 linuxsys kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI Feb 13 11:32:06 linuxsys kernel: Modules linked in: cmac rfcomm fuse bnep vmnet(O) arc4 nls_iso8859_1 nls_cp437 vfat fat amdkfd amd_iommu_v2 iwlmvm amdgpu mac80211 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core vi Feb 13 11:32:06 linuxsys kernel: k10temp i2c_piix4 shpchp thermal wmi battery ac tpm_crb tpm_tis tpm_tis_core tpm video i2c_hid asus_wireless button acpi_cpufreq sch_fq_codel vmmon(O) vmw_vmci vboxnetflt(O) vboxnetadp(O) pci_stub vboxpci(O Feb 13 11:32:06 linuxsys kernel: CPU: 10 PID: 1451 Comm: xfwm4 Tainted: G O 4.15.0-1-MANJARO #1 Feb 13 11:32:06 linuxsys kernel: Hardware name: ASUSTeK COMPUTER INC. GL702ZC/GL702ZC, BIOS GL702ZC.303 12/15/2017 Feb 13 11:32:06 linuxsys kernel: RIP: 0010:dce110_vblank_set+0x4f/0xb0 [amdgpu] Feb 13 11:32:06 linuxsys kernel: RSP: 0018:ffff994148b27be0 EFLAGS: 00010002 Feb 13 11:32:06 linuxsys kernel: RAX: ffff8c4273cd0000 RBX: 0000000000000001 RCX: 0000000000000000 Feb 13 11:32:06 linuxsys kernel: RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000000000 Feb 13 11:32:06 linuxsys kernel: RBP: ffff8c42b7fcaba0 R08: 0000000000000000 R09: 0000000000000000 Feb 13 11:32:06 linuxsys kernel: R10: 00007ffef0182bf0 R11: ffff8c42b9079d80 R12: ffff8c42b6c96b80 Feb 13 11:32:06 linuxsys kernel: R13: ffffffffc1178ba0 R14: ffff8c42a84d0000 R15: ffff8c42acb3c368 Feb 13 11:32:06 linuxsys kernel: FS: 00007f87374b6980(0000) GS:ffff8c42dee80000(0000) knlGS:0000000000000000 Feb 13 11:32:06 linuxsys kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 13 11:32:06 linuxsys kernel: CR2: 0000000000000000 CR3: 00000007d990c000 CR4: 00000000003406e0 Feb 13 11:32:06 linuxsys kernel: Call Trace: Feb 13 11:32:06 linuxsys kernel: amdgpu_dm_set_crtc_irq_state+0x31/0x60 [amdgpu] Feb 13 11:32:06 linuxsys kernel: amdgpu_irq_update+0x55/0x90 [amdgpu] Feb 13 
11:32:06 linuxsys kernel: drm_vblank_enable+0x84/0x100 [drm] Feb 13 11:32:06 linuxsys kernel: drm_vblank_get+0x8d/0xb0 [drm] Feb 13 11:32:06 linuxsys kernel: drm_wait_vblank_ioctl+0x12a/0x690 [drm] Feb 13 11:32:06 linuxsys kernel: ? unix_stream_recvmsg+0x53/0x70 Feb 13 11:32:06 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Feb 13 11:32:06 linuxsys kernel: drm_ioctl_kernel+0x5b/0xb0 [drm] Feb 13 11:32:06 linuxsys kernel: drm_ioctl+0x2d5/0x370 [drm] Feb 13 11:32:06 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Feb 13 11:32:06 linuxsys kernel: ? do_iter_write+0xdc/0x190 Feb 13 11:32:06 linuxsys kernel: ? vfs_writev+0xb9/0x110 Feb 13 11:32:06 linuxsys kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] Feb 13 11:32:06 linuxsys kernel: do_vfs_ioctl+0xa4/0x630 Feb 13 11:32:06 linuxsys kernel: ? __sys_recvmsg+0x4e/0x90 Feb 13 11:32:06 linuxsys kernel: ? __sys_recvmsg+0x7d/0x90 Feb 13 11:32:06 linuxsys kernel: SyS_ioctl+0x74/0x80 Feb 13 11:32:06 linuxsys kernel: entry_SYSCALL_64_fastpath+0x20/0x83 Feb 13 11:32:06 linuxsys kernel: RIP: 0033:0x7f8733b03d87 Feb 13 11:32:06 linuxsys kernel: RSP: 002b:00007ffef0182c38 EFLAGS: 00000246 Feb 13 11:32:06 linuxsys kernel: Code: e8 17 20 04 00 83 e8 4e 0f b6 d0 48 89 d0 48 c1 e0 05 48 01 d0 48 c1 e0 05 49 03 86 60 01 00 00 84 db 48 8b b8 78 02 00 00 74 18 <48> 8b 07 be 02 00 00 00 48 8b 80 d8 00 00 00 e8 6d 73 92 cb 84 Feb 13 11:32:06 linuxsys kernel: RIP: dce110_vblank_set+0x4f/0xb0 [amdgpu] RSP: ffff994148b27be0 Feb 13 11:32:06 linuxsys kernel: CR2: 0000000000000000 Feb 13 11:32:06 linuxsys kernel: ---[ end trace de1630a0c4489cb7 ]--- Feb 13 11:32:06 linuxsys kernel: note: xfwm4[1451] exited with preempt_count 3 Feb 13 11:32:32 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding the best encoder Feb 13 11:32:32 linuxsys kernel: [drm:best_encoder [amdgpu]] Finding the best encoder Feb 13 11:32:32 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, 
active:1, planes_changed:0, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:32:32 linuxsys kernel: [drm:update_stream_scaling_settings [amdgpu]] Destination Rectangle x:0 y:0 width:1920 height:1080 Feb 13 11:32:32 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:0, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:32:32 linuxsys kernel: [drm:dm_update_crtcs_state [amdgpu]] Enabling DRM crtc: 43 Feb 13 11:32:32 linuxsys kernel: [drm:dm_update_planes_state.part.28 [amdgpu]] Enabling DRM plane: 36 on DRM crtc 43 Feb 13 11:32:32 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:0 crtc_state_flags: enable:1, active:1, planes_changed:1, mode_changed:0,active_changed:1,connectors_changed:0 Feb 13 11:32:32 linuxsys kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] Atomic commit: SET crtc id 0: [000000000ce1e17c] Feb 13 11:32:32 linuxsys kernel: [drm] dc_commit_state: 1 streams Feb 13 11:32:32 linuxsys kernel: [drm] core_stream 0x866b8000: src: 0, 0, 1920, 1080; dst: 0, 0, 1920, 1080, colorSpace:1 Feb 13 11:32:32 linuxsys kernel: [drm] pix_clk_khz: 138700, h_total: 2080, v_total: 1111, pixelencoder:1, displaycolorDepth:1 Feb 13 11:32:32 linuxsys kernel: [drm] sink name: , serial: 0 Feb 13 11:32:32 linuxsys kernel: [drm] link: 0 Feb 13 11:32:32 linuxsys kernel: [drm] [Mode] [eDP][ConnIdx:0] {1920x1080, 2080x1111@138700Khz}^ Feb 13 11:32:32 linuxsys kernel: [drm] hwss_edp_power_control: Panel Power action: On Feb 13 11:32:32 linuxsys kernel: [drm] hwss_edp_backlight_control: backlight action: On Feb 13 11:32:32 linuxsys kernel: [drm] Link: 0 eDP panel mode supported: 1 eDP panel mode enabled: 1 Feb 13 11:32:32 linuxsys kernel: [drm] [LKTN] [eDP][ConnIdx:0] RBRx2 pass VS=1, PE=0^ Feb 13 11:32:32 linuxsys kernel: [drm] hwss_edp_backlight_control: backlight action: On Created attachment 137322 [details] [review] Patch 1 Use crtc enable/disable_vblank 
hooks

Created attachment 137323 [details] [review]
Patch 2 Return success when enabling interrupt

Created attachment 137324 [details] [review]
Patch 3 Clean up formatting in irq_service_dce110.c

Created attachment 137325 [details] [review]
Patch 4 Don't blow up if TG is NULL in dce110_vblank_set

Are you able to rebuild the kernel with the attached patches and see if that fixes things?

Created attachment 137327 [details] [review]
Patch 2 Return success when enabling interrupt

Goofed up my original patch 2. This should work.

The first patch got rejected with the most recent 4.15 kernel source pulled using the PKGBUILD file (Feb 14, 2018). The reject file contains:

--- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2523,6 +2545,8 @@ static const struct drm_crtc_funcs amdgpu_dm_crtc_funcs = {
 	.atomic_duplicate_state = dm_crtc_duplicate_state,
 	.atomic_destroy_state = dm_crtc_destroy_state,
 	.set_crc_source = amdgpu_dm_crtc_set_crc_source,
+	.enable_vblank = dm_enable_vblank,
+	.disable_vblank = dm_disable_vblank,
 };
 
 static enum drm_connector_status

In the original amdgpu_dm.c:

/* Implemented only the options currently availible for the driver */
static const struct drm_crtc_funcs amdgpu_dm_crtc_funcs = {
	.reset = dm_crtc_reset_state,
	.destroy = amdgpu_dm_crtc_destroy,
	.gamma_set = drm_atomic_helper_legacy_gamma_set,
	.set_config = drm_atomic_helper_set_config,
	.page_flip = drm_atomic_helper_page_flip,
	.atomic_duplicate_state = dm_crtc_duplicate_state,
	.atomic_destroy_state = dm_crtc_destroy_state,
};

This line:

	.set_crc_source = amdgpu_dm_crtc_set_crc_source,

is not there.

Never mind, I just figured out how to adjust the patch file to match the kernel source I have so that it applies properly. The rest of the patches went through without complaints and the kernel is now building...
However, I still need to check whether this one-line change in the patch will lead to any side effects during build or at runtime...

Just installed and booted the new kernel. It seems to have fixed the issue, at least to the extent that it no longer totally crashes the system like it used to.

However, I still see these in journalctl. This is after I locked the screen and then woke it up 3 times.

1st time:
Feb 14 14:35:17 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:35:37 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:35:57 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:36:10 linuxsys kernel: [drm] {1920x1080, 2080x1111@138700Khz}
...
Feb 14 14:36:17 linuxsys kernel: [drm] RBRx2 pass VS=1, PE=0
Feb 14 14:36:17 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:17 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:17 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:17 linuxsys kernel: WARNING: CPU: 10 PID: 1485 at drivers/gpu/drm/drm_vblank.c:612 drm_calc_vbltimestamp_from_scanoutpos+0x2c5/0x340 [drm]
Feb 14 14:36:17 linuxsys kernel: Modules linked in: cmac rfcomm fuse bnep vmnet(O) arc4 nls_iso8859_1 nls_cp437 vfat fat amdkfd amd_iommu_v2 amdgpu iwlmvm ax88179_178a usbnet mac80211 mii uvcvideo btusb videobuf2_vmalloc btrtl videobuf2_mem
Feb 14 14:36:17 linuxsys kernel: rng_core tpm_tis tpm_tis_core k10temp shpchp battery ac rtc_cmos wmi i2c_piix4 tpm asus_wireless i2c_hid pinctrl_amd gpio_amdpt evdev mac_hid acpi_cpufreq sch_fq_codel vmmon(O) vmw_vmci vboxnetflt(O) vboxne
Feb 14 14:36:17 linuxsys kernel: CPU: 10 PID: 1485 Comm: xfwm4 Tainted: G O 4.15.3-1-MANJARO #1
Feb 14 14:36:17 linuxsys kernel: Hardware name: ASUSTeK COMPUTER INC.
GL702ZC/GL702ZC, BIOS GL702ZC.303 12/15/2017 Feb 14 14:36:17 linuxsys kernel: RIP: 0010:drm_calc_vbltimestamp_from_scanoutpos+0x2c5/0x340 [drm] Feb 14 14:36:17 linuxsys kernel: RSP: 0018:ffffa97548adfb30 EFLAGS: 00010082 Feb 14 14:36:17 linuxsys kernel: RAX: ffffffffc15e54c0 RBX: ffff8a1375ae6800 RCX: 0000000000000001 Feb 14 14:36:17 linuxsys kernel: RDX: ffffffffc0d4d380 RSI: 0000000000000001 RDI: ffffffffc0d4b24e Feb 14 14:36:17 linuxsys kernel: RBP: ffffa97548adfb98 R08: 0000000000000000 R09: ffffffffc0d2c870 Feb 14 14:36:17 linuxsys kernel: R10: ffffffffc140b320 R11: ffffffffb15c7f2d R12: 0000000000000001 Feb 14 14:36:17 linuxsys kernel: R13: ffffa97548adfbac R14: ffffa97548adfbe0 R15: ffff8a1377b6a000 Feb 14 14:36:17 linuxsys kernel: FS: 00007f2e273eb980(0000) GS:ffff8a137e880000(0000) knlGS:0000000000000000 Feb 14 14:36:17 linuxsys kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 14 14:36:17 linuxsys kernel: CR2: 00007feda6f5a000 CR3: 00000007da154000 CR4: 00000000003406e0 Feb 14 14:36:17 linuxsys kernel: Call Trace: Feb 14 14:36:17 linuxsys kernel: ? set_cursor+0x80/0x80 Feb 14 14:36:17 linuxsys kernel: ? set_cursor+0x80/0x80 Feb 14 14:36:17 linuxsys kernel: drm_get_last_vbltimestamp+0x54/0x90 [drm] Feb 14 14:36:17 linuxsys kernel: drm_update_vblank_count+0x77/0x250 [drm] Feb 14 14:36:17 linuxsys kernel: drm_vblank_enable+0xbd/0x100 [drm] Feb 14 14:36:17 linuxsys kernel: drm_vblank_get+0x8d/0xb0 [drm] Feb 14 14:36:17 linuxsys kernel: drm_wait_vblank_ioctl+0x12a/0x6a0 [drm] Feb 14 14:36:17 linuxsys kernel: ? unix_stream_recvmsg+0x53/0x70 Feb 14 14:36:17 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Feb 14 14:36:17 linuxsys kernel: drm_ioctl_kernel+0x5b/0xb0 [drm] Feb 14 14:36:17 linuxsys kernel: drm_ioctl+0x2d5/0x370 [drm] Feb 14 14:36:17 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Feb 14 14:36:17 linuxsys kernel: ? do_iter_write+0xdc/0x190 Feb 14 14:36:17 linuxsys kernel: ? 
vfs_writev+0xb9/0x110 Feb 14 14:36:17 linuxsys kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] Feb 14 14:36:17 linuxsys kernel: do_vfs_ioctl+0xa4/0x630 Feb 14 14:36:17 linuxsys kernel: ? __sys_recvmsg+0x4e/0x90 Feb 14 14:36:17 linuxsys kernel: ? __sys_recvmsg+0x7d/0x90 Feb 14 14:36:17 linuxsys kernel: SyS_ioctl+0x74/0x80 Feb 14 14:36:17 linuxsys kernel: do_syscall_64+0x75/0x190 Feb 14 14:36:17 linuxsys kernel: entry_SYSCALL_64_after_hwframe+0x21/0x86 Feb 14 14:36:17 linuxsys kernel: RIP: 0033:0x7f2e23a38d87 Feb 14 14:36:17 linuxsys kernel: RSP: 002b:00007ffd8b7da1c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Feb 14 14:36:17 linuxsys kernel: RAX: ffffffffffffffda RBX: 00007ffd8b7da1f0 RCX: 00007f2e23a38d87 Feb 14 14:36:17 linuxsys kernel: RDX: 00007ffd8b7da1f0 RSI: 00000000c018643a RDI: 000000000000000c Feb 14 14:36:17 linuxsys kernel: RBP: 0000000001006d10 R08: 0000000000800109 R09: 0000000000000000 Feb 14 14:36:17 linuxsys kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c018643a Feb 14 14:36:17 linuxsys kernel: R13: 000000000080619a R14: 00000000010cd380 R15: 0000000000000000 Feb 14 14:36:17 linuxsys kernel: Code: e1 48 c7 c2 80 d3 d4 c0 be 01 00 00 00 48 c7 c7 4e b2 d4 c0 e8 6d 62 fe ff 48 8b 83 98 03 00 00 48 83 78 20 00 0f 84 6f fd ff ff <0f> ff e9 68 fd ff ff 48 c7 c2 48 d3 d4 c0 31 f6 48 c7 c7 4b b2 Feb 14 14:36:17 linuxsys kernel: ---[ end trace e345f4b7c52fbc5c ]--- Feb 14 14:36:17 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 14 14:36:17 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 14 14:36:17 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 14 14:36:25 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 14 14:36:25 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! 
Feb 14 14:36:25 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:25 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:25 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 14 14:36:25 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!

2nd time:
Feb 14 14:40:37 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:40:57 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:41:17 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:41:37 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:41:57 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:42:05 linuxsys kernel: [drm] {1920x1080, 2080x1111@138700Khz}

3rd time:
Feb 14 14:44:26 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 14 14:44:46 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!

The "Failed to get VBLANK" errors still appear, and the 1st time something seems to have crashed, but the system still works, and the 2nd and 3rd times show no crash traces like the one from the 1st time.

Thanks for fixing the patch conflict. I based them on https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next but should have based them on the 4.15 kernel.

Thanks as well for testing. The patches don't fix the root cause of the issue, but they make sure you don't crash in this case. Catching the root cause is a bit more difficult and would require more debugging. I haven't seen the "Failed to get VBLANK" error on the platforms available to me, but will keep an eye out for it.
I applied the patches to 4.15.3 on Arch Linux and tested successfully with xset dpms force {standby,suspend}.

Created attachment 137383 [details]
stacktrace even with patches
I just got another freeze despite using the patches. I'm not sure if this is the same bug since it mentions slub.c but I see amdgpu/drm stuff in the trace. After this trace the journal was flooded with items like "amdgpu_dm_irq_schedule_work FAILED src 4" (it alternates between 2 and 4)
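To check whether the flood really alternates between the two sources, the messages can be tallied per "src" value. This is only a sketch: the report quotes just the fragment "amdgpu_dm_irq_schedule_work FAILED src 4", so the pattern below matches that fragment alone and assumes nothing about the rest of each journal line. The function name tally_failed_sources is made up for illustration.

```python
import re
from collections import Counter

# Only this fragment is quoted in the report; the rest of each journal line is not shown.
FAILED_SRC = re.compile(r"amdgpu_dm_irq_schedule_work FAILED src (\d+)")


def tally_failed_sources(lines):
    """Count how often each 'src' number appears among the FAILED messages."""
    counts = Counter()
    for line in lines:
        match = FAILED_SRC.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Feeding it the journal text line by line (for example, the lines of a saved journalctl export) gives one count per source number, which may be more useful in a report than the raw flood.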
I've been using this patchset on linux 4.15.3 and 4.14.4 and haven't had a problem since.

Another problem: when I wake up the screen, the system sometimes has intermittent soft lockups that make it nearly unusable. This is with the patches applied to the latest 4.15 kernel, 4.15.5 (4.15.5-1-MANJARO).

In journalctl I see the following pattern. While the screen is blanked, the error "Failed to get VBLANK" is written every 20 seconds.

Feb 27 11:57:29 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:57:49 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:58:09 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:58:29 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:58:49 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:59:09 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:59:29 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Feb 27 11:59:49 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!

After I wake the screen up and the intermittent soft lockups begin, the error "dc_stream_state is NULL for crtc '1'" is written several times, from either dm_vblank_get_counter or dm_crtc_get_scanoutpos, every 8 or 18 seconds, which seems to correspond to the lockup interval. I did not test the issue further, as the system was almost unusable due to the lockups and I had to reboot.

Feb 27 12:17:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:17:08 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:17:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:17:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:08 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:16 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:34 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:42 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:42 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! 
Feb 27 12:17:42 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:42 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:42 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:17:42 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:00 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Feb 27 12:18:08 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! ... Feb 27 12:18:26 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! 
Feb 27 12:18:26 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:26 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:26 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:26 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:26 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Feb 27 12:18:35 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!

Unfortunately, this issue is random and I cannot always reproduce it. It just happens from time to time.

It seems some of the patches (1, 2, and 4) have entered the 4.16 kernel: when building the kernel for Manjaro, these three patches were rejected, and the exact code changes were already present in the original kernel source files.

However, after some testing I found that Patch 3 (which is not included) is still needed on 4.16 to fix the problem, as I can still crash the system without it.
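The message cadence reported in this bug (a "Failed to get VBLANK" error every 20 seconds while blanked, and bursts of "dc_stream_state is NULL" errors every 8 or 18 seconds after waking) can be checked mechanically from the journal timestamps. Below is a minimal sketch, assuming the "Mon DD HH:MM:SS host kernel: ..." prefix seen in the logs here and an English locale for month names. The function name message_gaps is invented for illustration, and the year is an assumed parameter because the journal prefix does not carry one. SAMPLE is copied from the log excerpts in this report.

```python
import re
from datetime import datetime

# Matches the journal prefix used in this report, e.g. "Feb 27 11:57:29 linuxsys kernel: ...".
STAMP = re.compile(r"^(\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) ")

# Lines taken from the log excerpts in this report.
SAMPLE = [
    "Feb 27 11:57:29 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!",
    "Feb 27 11:57:49 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!",
    "Feb 27 11:58:09 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!",
]


def message_gaps(lines, needle, year=2018):
    """Return gaps in seconds between distinct timestamps of lines containing `needle`."""
    stamps = []
    for line in lines:
        match = STAMP.match(line)
        if match and needle in line:
            # The prefix has no year, so one is supplied by the caller (assumed 2018 here).
            when = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
            # Several messages logged in the same second count as one burst.
            if not stamps or when != stamps[-1]:
                stamps.append(when)
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(stamps, stamps[1:])]
```

For the blanked-screen excerpt, message_gaps(SAMPLE, "Failed to get VBLANK") gives [20, 20]. Applying the same function to the "dc_stream_state is NULL" lines is one way to compare the burst spacing with the observed lockup interval.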
EDIT: It seems I'm experiencing some intermittent screen flicker with the current 4.16 kernel (on the same system, with only Patch 3 applied, as it's the only patch needed for 4.16), although it doesn't really affect normal system usage.

I'm not sure if this flicker is related to this problem, but I'm noting it here as part of my continued tracking of this issue.

My stability has been fine since I last commented. I'm now on 4.15.10 plus these patches. However, my monitors won't turn off: when the screen turns off, it comes right back on after a second.

Just now I also had my second panic just like in #19, sadly (really bad time to have that happen, too) :(

(In reply to L.S.S. from comment #23)
> EDIT: It seems I'm experiencing some intermittent screen flicker with
> current 4.16 kernel (on the same system, with only Patch 3 applied as it's
> the only patch needed for 4.16), although it doesn't really affect normal
> system usage.
>
> I'm not sure if this flicker is related to this problem, but I'm putting it
> up here as it's still a continuation of my watching this issue's condition.

Do you have TearFree on?

https://bugs.freedesktop.org/show_bug.cgi?id=105530

Also probably related: https://bugs.freedesktop.org/show_bug.cgi?id=101580

(In reply to Mez from comment #25)
> (In reply to L.S.S. from comment #23)
> > EDIT: It seems I'm experiencing some intermittent screen flicker with
> > current 4.16 kernel (on the same system, with only Patch 3 applied as it's
> > the only patch needed for 4.16), although it doesn't really affect normal
> > system usage.
> >
> > I'm not sure if this flicker is related to this problem, but I'm putting it
> > up here as it's still a continuation of my watching this issue's condition.
>
> Do you have TearFree on?
>
> https://bugs.freedesktop.org/show_bug.cgi?id=105530

I don't know about TearFree, as I haven't actually configured it, so it should be at Manjaro's default setting.
And I'm only getting the flicker on the 4.16 kernel; on 4.15 it is and has always been fine.
Just updated to the 4.17-rc0 kernel and it seems the problem has been mostly fixed there. The patches are no longer needed (already in there) and trying to reproduce the issue only resulted in a couple of "Failed to get VBLANK!" errors that aren't fatal.
I'm not certain about the flickering issue I mentioned earlier... It looked like one but might actually be some kind of sudden color palette distortion. The problem has only appeared since 4.16. I don't recall having the issue in 4.15. I can partially reproduce it in the Firefox new tab page, by quickly hovering over the links in "Top Sites" and "Highlights". The whole screen turns a bit darker for a very short instant, then returns to normal. It happens randomly and it doesn't seem to produce any errors in the log. Not a major issue, but it can be annoying sometimes.
EDIT: Maybe not really fixed in 4.17 (regression again?!)... just now after the screen went blank, I got another panic and had to reboot... :-(
When the panic occurred, it spawned two errors.
Apr 12 11:32:03 linuxsys systemd[5491]: Started Virtual filesystem service.
Apr 12 11:32:03 linuxsys udisksd[1970]: udisks_mount_get_mount_path: assertion 'mount->type == UDISKS_MOUNT_TYPE_FILESYSTEM' failed
Apr 12 11:32:13 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Apr 12 11:32:13 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Apr 12 11:32:13 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Apr 12 11:32:13 linuxsys kernel: WARNING: CPU: 12 PID: 1761 at drivers/gpu/drm/drm_vblank.c:620 drm_calc_vbltimestamp_from_scanoutpos+0x2a8/0x2f0 [drm] Apr 12 11:32:13 linuxsys kernel: Modules linked in: btrfs zstd_compress zstd_decompress xxhash xor raid6_pq ufs hfsplus hfs minix ntfs msdos jfs ext4 mbcache jbd2 fscrypto dm_mod vmw_vsock_vmci_transport vsock cmac rfcomm fuse bnep vmnet(O> Apr 12 11:32:13 linuxsys kernel: agpgart snd_timer syscopyarea rfkill sysfillrect sysimgblt fb_sys_fops aesni_intel snd tpm_crb aes_x86_64 tpm_tis crypto_simd ccp cryptd soundcore tpm_tis_core sp5100_tco glue_helper pcspkr k10temp i2c_pii> Apr 12 11:32:13 linuxsys kernel: CPU: 12 PID: 1761 Comm: xfwm4 Tainted: G O 4.17.0-1-MANJARO #1 Apr 12 11:32:13 linuxsys kernel: Hardware name: ASUSTeK COMPUTER INC. GL702ZC/GL702ZC, BIOS GL702ZC.303 12/15/2017 Apr 12 11:32:13 linuxsys kernel: RIP: 0010:drm_calc_vbltimestamp_from_scanoutpos+0x2a8/0x2f0 [drm] Apr 12 11:32:13 linuxsys kernel: RSP: 0018:ffffa96189f63b28 EFLAGS: 00010082 Apr 12 11:32:13 linuxsys kernel: RAX: ffffffffc131d9e0 RBX: ffff9d76fa423000 RCX: 0000000000000000 Apr 12 11:32:13 linuxsys kernel: RDX: 0000000000000001 RSI: ffffffffc09b98d0 RDI: 0000000000000001 Apr 12 11:32:13 linuxsys kernel: RBP: ffffa96189f63b90 R08: 0000000000000000 R09: ffffffffc0998ab0 Apr 12 11:32:13 linuxsys kernel: R10: ffff9d76f74131d8 R11: ffffffffc114e500 R12: 0000000000000001 Apr 12 11:32:13 linuxsys kernel: R13: ffff9d76f7413000 R14: ffffa96189f63ba4 R15: ffffa96189f63bd8 Apr 12 11:32:13 linuxsys kernel: FS: 00007fefa4d35980(0000) GS:ffff9d76fe900000(0000) knlGS:0000000000000000 Apr 12 11:32:13 linuxsys kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 12 11:32:13 linuxsys kernel: CR2: 00007f5f09dfa000 CR3: 00000007cc786000 CR4: 00000000003406e0 Apr 12 11:32:13 linuxsys kernel: Call Trace: Apr 12 11:32:13 linuxsys kernel: drm_get_last_vbltimestamp+0x54/0x90 [drm] Apr 12 11:32:13 linuxsys kernel: drm_update_vblank_count+0x79/0x240 [drm] 
Apr 12 11:32:13 linuxsys kernel: drm_vblank_enable+0xce/0x120 [drm] Apr 12 11:32:13 linuxsys kernel: drm_vblank_get+0x8d/0xb0 [drm] Apr 12 11:32:13 linuxsys kernel: drm_wait_vblank_ioctl+0x12a/0x620 [drm] Apr 12 11:32:13 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Apr 12 11:32:13 linuxsys kernel: drm_ioctl_kernel+0x5b/0xb0 [drm] Apr 12 11:32:13 linuxsys kernel: drm_ioctl+0x2c3/0x360 [drm] Apr 12 11:32:13 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Apr 12 11:32:13 linuxsys kernel: ? do_iter_write+0xdc/0x190 Apr 12 11:32:13 linuxsys kernel: ? vfs_writev+0xb9/0x110 Apr 12 11:32:13 linuxsys kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] Apr 12 11:32:13 linuxsys kernel: do_vfs_ioctl+0xa4/0x630 Apr 12 11:32:13 linuxsys kernel: ? __sys_recvmsg+0x5b/0xa0 Apr 12 11:32:13 linuxsys kernel: ? __sys_recvmsg+0x8a/0xa0 Apr 12 11:32:13 linuxsys kernel: ksys_ioctl+0x70/0x80 Apr 12 11:32:13 linuxsys kernel: SyS_ioctl+0xa/0x10 Apr 12 11:32:13 linuxsys kernel: do_syscall_64+0x74/0x190 Apr 12 11:32:13 linuxsys kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Apr 12 11:32:13 linuxsys kernel: RIP: 0033:0x7fefa1389d87 Apr 12 11:32:13 linuxsys kernel: RSP: 002b:00007ffedb5d0ee8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Apr 12 11:32:13 linuxsys kernel: RAX: ffffffffffffffda RBX: 00007ffedb5d0f10 RCX: 00007fefa1389d87 Apr 12 11:32:13 linuxsys kernel: RDX: 00007ffedb5d0f10 RSI: 00000000c018643a RDI: 000000000000000c Apr 12 11:32:13 linuxsys kernel: RBP: 0000000000ffad10 R08: 0000000000e00109 R09: 0000000000000000 Apr 12 11:32:13 linuxsys kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c018643a Apr 12 11:32:13 linuxsys kernel: R13: 0000000000f67cbb R14: 00000000010bf7a0 R15: 0000000000000000 Apr 12 11:32:13 linuxsys kernel: Code: e9 b5 fd ff ff 44 89 e2 48 c7 c6 d0 98 9b c0 bf 01 00 00 00 e8 fa e9 ff ff 48 8b 83 98 03 00 00 48 83 78 28 00 0f 84 8c fd ff ff <0f> 0b 45 31 ed e9 85 fd ff ff 48 c7 c7 98 98 9b c0 45 31 ed e8 Apr 12 
11:32:13 linuxsys kernel: ---[ end trace bc02c50ede9b0814 ]--- Apr 12 11:32:13 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:13 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:13 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:32:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'! Apr 12 11:42:05 linuxsys kernel: ------------[ cut here ]------------ Apr 12 11:42:05 linuxsys kernel: kernel BUG at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4692! 
Apr 12 11:42:05 linuxsys kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI Apr 12 11:42:05 linuxsys kernel: Modules linked in: btrfs zstd_compress zstd_decompress xxhash xor raid6_pq ufs hfsplus hfs minix ntfs msdos jfs ext4 mbcache jbd2 fscrypto dm_mod vmw_vsock_vmci_transport vsock cmac rfcomm fuse bnep vmnet(O> Apr 12 11:42:05 linuxsys kernel: agpgart snd_timer syscopyarea rfkill sysfillrect sysimgblt fb_sys_fops aesni_intel snd tpm_crb aes_x86_64 tpm_tis crypto_simd ccp cryptd soundcore tpm_tis_core sp5100_tco glue_helper pcspkr k10temp i2c_pii> Apr 12 11:42:05 linuxsys kernel: CPU: 6 PID: 5473 Comm: Xorg Tainted: G W O 4.17.0-1-MANJARO #1 Apr 12 11:42:05 linuxsys kernel: Hardware name: ASUSTeK COMPUTER INC. GL702ZC/GL702ZC, BIOS GL702ZC.303 12/15/2017 Apr 12 11:42:05 linuxsys kernel: RIP: 0010:dm_update_crtcs_state+0x347/0x3c0 [amdgpu] Apr 12 11:42:05 linuxsys kernel: RSP: 0018:ffffa9618c3b3b10 EFLAGS: 00010246 Apr 12 11:42:05 linuxsys kernel: RAX: 0000000000000000 RBX: ffff9d76f7bf2000 RCX: 0000000025a00806 Apr 12 11:42:05 linuxsys kernel: RDX: 0000000025a00606 RSI: ffff9d76fe7a7160 RDI: ffff9d76fe006e80 Apr 12 11:42:05 linuxsys kernel: RBP: ffff9d76eec29800 R08: 0000000000027160 R09: ffffffffc125a16d Apr 12 11:42:05 linuxsys kernel: R10: ffffedaf06a59400 R11: 00000000ffffffff R12: 0000000000000000 Apr 12 11:42:05 linuxsys kernel: R13: ffff9d70a9657400 R14: ffff9d70a9652400 R15: ffff9d76af8b3980 Apr 12 11:42:05 linuxsys kernel: FS: 00007f9cdd552940(0000) GS:ffff9d76fe780000(0000) knlGS:0000000000000000 Apr 12 11:42:05 linuxsys kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 12 11:42:05 linuxsys kernel: CR2: 0000562ed2083fc0 CR3: 00000001a09ae000 CR4: 00000000003406e0 Apr 12 11:42:05 linuxsys kernel: Call Trace: Apr 12 11:42:05 linuxsys kernel: amdgpu_dm_atomic_check+0x2a0/0x4d0 [amdgpu] Apr 12 11:42:05 linuxsys kernel: drm_atomic_check_only+0x33a/0x4f0 [drm] Apr 12 11:42:05 linuxsys kernel: drm_atomic_commit+0x13/0x50 [drm] Apr 12 11:42:05 
linuxsys kernel: drm_atomic_connector_commit_dpms+0xe5/0xf0 [drm] Apr 12 11:42:05 linuxsys kernel: drm_mode_obj_set_property_ioctl+0x170/0x290 [drm] Apr 12 11:42:05 linuxsys kernel: ? drm_mode_connector_set_obj_prop+0x70/0x70 [drm] Apr 12 11:42:05 linuxsys kernel: drm_mode_connector_property_set_ioctl+0x3e/0x60 [drm] Apr 12 11:42:05 linuxsys kernel: drm_ioctl_kernel+0x5b/0xb0 [drm] Apr 12 11:42:05 linuxsys kernel: drm_ioctl+0x2c3/0x360 [drm] Apr 12 11:42:05 linuxsys kernel: ? drm_mode_connector_set_obj_prop+0x70/0x70 [drm] Apr 12 11:42:05 linuxsys kernel: ? __handle_mm_fault+0xbff/0x14d0 Apr 12 11:42:05 linuxsys kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] Apr 12 11:42:05 linuxsys kernel: do_vfs_ioctl+0xa4/0x630 Apr 12 11:42:05 linuxsys kernel: ? handle_mm_fault+0x10b/0x260 Apr 12 11:42:05 linuxsys kernel: ? __do_page_fault+0x317/0x5a0 Apr 12 11:42:05 linuxsys kernel: ksys_ioctl+0x70/0x80 Apr 12 11:42:05 linuxsys kernel: SyS_ioctl+0xa/0x10 Apr 12 11:42:05 linuxsys kernel: do_syscall_64+0x74/0x190 Apr 12 11:42:05 linuxsys kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Apr 12 11:42:05 linuxsys kernel: RIP: 0033:0x7f9cdae0cd87 Apr 12 11:42:05 linuxsys kernel: RSP: 002b:00007ffcd39d6308 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Apr 12 11:42:05 linuxsys kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f9cdae0cd87 Apr 12 11:42:05 linuxsys kernel: RDX: 00007ffcd39d6340 RSI: 00000000c01064ab RDI: 000000000000000e Apr 12 11:42:05 linuxsys kernel: RBP: 00007ffcd39d6340 R08: 0000562ed21dc3e0 R09: 0000000000000000 Apr 12 11:42:05 linuxsys kernel: R10: 00007f9cdae85220 R11: 0000000000000246 R12: 00000000c01064ab Apr 12 11:42:05 linuxsys kernel: R13: 000000000000000e R14: 0000562ed0451470 R15: 0000562ed0456040 Apr 12 11:42:05 linuxsys kernel: Code: 18 c6 00 01 0f 84 f7 fd ff ff e9 e9 fd ff ff 45 0f b6 4d 0a 41 f6 c1 0e 0f 84 5c fd ff ff 48 c7 04 24 00 00 00 00 e9 16 fe ff ff <0f> 0b 48 83 bb 08 0d 00 00 00 0f 84 13 ff ff ff 48 83 3c 24 00 Apr 12 11:42:05 linuxsys 
kernel: RIP: dm_update_crtcs_state+0x347/0x3c0 [amdgpu] RSP: ffffa9618c3b3b10
Apr 12 11:42:05 linuxsys kernel: ---[ end trace bc02c50ede9b0815 ]---
Apr 12 11:42:16 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Apr 12 11:42:36 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Apr 12 11:42:56 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Apr 12 11:43:16 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
EDIT 2: I couldn't reproduce the issue on 4.17 this time even after 2 wakeups. However, the issue I encountered was similar (the system apparently froze when trying to wake up the screen). After looking into it, I found that along with errors like those in Comment 16 (which did not crash the system), there was an additional "kernel BUG" in dm_update_crtcs_state, called from amdgpu_dm_atomic_check, but the log appeared to have been cut (the log entries between the two errors were apparently unrelated, so I did not include them, and the new error began with a "--[ cut here ]--"). This additional error (not 100% reproducible) might be what actually crashed the system that time.
Kernel 4.17.0-rc3-linus.git-keumjo on a 2400G with a RX 560 GPU:
> login to xfce desktop
> type "xset dpms force standby" in a terminal
The screens go blank and the box stops responding; it looks dead. However, it is still possible to ssh into it and find the following information in dmesg:
[12743.692027] ------------[ cut here ]------------
[12743.692030] kernel BUG at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4708!
[12743.692039] invalid opcode: 0000 [#1] SMP NOPTI
[12743.692041] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common loop serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic rfcomm fuse rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle lz4 lz4_compress ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc vfat fat snd_hda_codec_realtek snd_hda_codec_generic
[12743.692069] snd_hda_codec_hdmi snd_hda_intel btusb snd_hda_codec btrtl btbcm edac_mce_amd btintel bluetooth snd_hda_core snd_hwdep snd_seq kvm_amd ccp wmi_bmof kvm snd_seq_device snd_pcm ecdh_generic snd_timer irqbypass rfkill joydev pcspkr snd soundcore shpchp i2c_piix4 k10temp wmi video acpi_cpufreq binfmt_misc dm_crypt amdkfd raid1 amd_iommu_v2 amdgpu chash i2c_algo_bit gpu_sched drm_kms_helper ttm drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel r8169 mii
[12743.692091] CPU: 6 PID: 3708 Comm: Xorg Not tainted 4.17.0-rc3-linus.git-keumjo #3
[12743.692093] Hardware name: System manufacturer System Product Name/TUF B350M-PLUS GAMING, BIOS 4009 04/14/2018
[12743.692154] RIP: 0010:dm_update_crtcs_state+0x419/0x480 [amdgpu]
[12743.692156] RSP: 0018:ffffb4e688f07b30 EFLAGS: 00010246
[12743.692158] RAX: ffff8b720090d001 RBX: ffff8b720527d000 RCX: 000000000008dccc
[12743.692160] RDX: 000000000008dccb RSI: ffff8b723eda6160 RDI: ffff8b723e806e80
[12743.692162] RBP: ffff8b720275c000 R08: 0000000000026160 R09: 0000000000000000
[12743.692163] R10: ffffdb50a0024200 R11: 0000000000000a00 R12: 0000000000000000
[12743.692165] R13: ffff8b7205189800 R14: ffff8b720090ec00 R15: ffff8b71f16c8a00
[12743.692167] FS: 00007f88668e2ac0(0000) GS:ffff8b723ed80000(0000) knlGS:0000000000000000
[12743.692169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12743.692170] CR2: 00007f885f886000 CR3: 00000007fbc90000 CR4: 00000000003406e0
[12743.692172] Call Trace:
[12743.692231] amdgpu_dm_atomic_check+0x1b1/0x3b0 [amdgpu]
[12743.692248] drm_atomic_check_only+0x360/0x4f0 [drm]
[12743.692264] drm_atomic_commit+0x13/0x50 [drm]
[12743.692278] drm_atomic_connector_commit_dpms+0xdb/0x100 [drm]
[12743.692292] drm_mode_obj_set_property_ioctl+0x178/0x280 [drm]
[12743.692307] ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[12743.692320] drm_mode_connector_property_set_ioctl+0x39/0x60 [drm]
[12743.692333] drm_ioctl_kernel+0x5b/0xb0 [drm]
[12743.692346] drm_ioctl+0x1b3/0x370 [drm]
[12743.692359] ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
[12743.692364] ? _cond_resched+0x15/0x30
[12743.692404] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[12743.692408] do_vfs_ioctl+0xa4/0x610
[12743.692411] ksys_ioctl+0x60/0x90
[12743.692414] ? ksys_read+0x9c/0xb0
[12743.692416] __x64_sys_ioctl+0x16/0x20
[12743.692420] do_syscall_64+0x5b/0x160
[12743.692423] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[12743.692425] Code: ff ff e9 21 fe ff ff 48 83 3c 24 00 0f 84 2f fe ff ff 48 8b 3c 24 e8 e7 99 07 00 80 7c 24 0f 00 0f 85 02 fe ff ff e9 40 ff ff ff <0f> 0b 48 83 c4 20 b8 ea ff ff ff 5b 5d 41 5c 41 5d 41 5e 41 5f
[12743.692510] RIP: dm_update_crtcs_state+0x419/0x480 [amdgpu] RSP: ffffb4e688f07b30
[12743.692525] ---[ end trace fb5e2b69e8f8d9c9 ]---
There's no recovery from this; service lightdm restart doesn't restart it. shutdown -r now doesn't reboot either, it just hangs (can't tell why since the screens are blank and ssh closes).
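For anyone comparing traces like the one above across kernel builds, a small script can pull the BUG location and the faulting function out of saved dmesg or journal text. This is only a sketch: the function name summarize_oops and both regular expressions are my own assumptions, based on the log format shown in this report, and are not part of any kernel tooling.

```python
import re

# "kernel BUG at <path>:<line>!" as it appears in the traces in this report.
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")
# Kernel-mode RIP line, e.g. "RIP: 0010:dm_update_crtcs_state+0x419/0x480 [amdgpu]".
# The user-space "RIP: 0033:0x..." line is deliberately not matched.
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def summarize_oops(log_text):
    """Return one dict per 'kernel BUG at' report found in log_text.

    Each dict holds the source path, the line number, and (if a kernel-mode
    RIP line follows) the faulting symbol and module.
    """
    reports = []
    current = None
    for raw in log_text.splitlines():
        bug = BUG_RE.search(raw)
        if bug:
            current = {
                "path": bug["path"],
                "line": int(bug["line"]),
                "symbol": None,
                "module": None,
            }
            reports.append(current)
            continue
        rip = RIP_RE.search(raw)
        # Only the first RIP line after a BUG line is attributed to that report.
        if rip and current is not None and current["symbol"] is None:
            current["symbol"] = rip["symbol"]
            current["module"] = rip["module"]
    return reports


# Three lines copied from the trace above.
sample = """\
[12743.692030] kernel BUG at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4708!
[12743.692039] invalid opcode: 0000 [#1] SMP NOPTI
[12743.692154] RIP: 0010:dm_update_crtcs_state+0x419/0x480 [amdgpu]
"""

for report in summarize_oops(sample):
    print(report)
```

The line number after amdgpu_dm.c differs between the kernels in this thread (4692, 4708, 4713), so matching on the function name is likely more useful than matching on the line number when deciding whether two traces are the same bug.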
So it seems there definitely is a regression on 4.17 for this issue (the patches are not required as the lines were already there in 4.17). This time it isn't a kernel panic, but an "invalid opcode" error raised in dm_update_crtcs_state, called from amdgpu_dm_atomic_check. The issue is not 100% reproducible, but it still means that going to standby with an AMD GPU and DC is dangerous and may result in loss of unsaved work.
In my case the problem was not having xorg-x11-drv-amdgpu installed (how embarrassing), which made xorg use xorg-x11-drv-ati. Yes, really. I assumed Fedora 28 beta installed it along with all the other drivers and didn't realize until comparing the X logs on a box which didn't have the problem with one that did. I did file a Fedora bug kindly asking for xorg-x11-drv-amdgpu to be installed by default. Simply installing xorg-x11-drv-amdgpu solved this error and the amdgpu kernel crash:
[12743.692030] kernel BUG at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4708!
I realize the immensity of my stupidity in not noticing that xorg-x11-drv-amdgpu was missing will make you laugh, but it's not like the kernel panic message warned me.
While it's good to hear that xf86-video-amdgpu doesn't trigger it, the kernel BUG is still a kernel driver bug.
I believe I've been seeing the same bug as of late.
[Sat Jun 23 23:02:04 2018] ------------[ cut here ]------------
[Sat Jun 23 23:02:04 2018] kernel BUG at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:4713!
[Sat Jun 23 23:02:04 2018] invalid opcode: 0000 [#1] SMP PTI [Sat Jun 23 23:02:04 2018] Modules linked in: ipt_REJECT(E) nf_reject_ipv4(E) tun(E) bridge(E) stp(E) llc(E) fuse(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) snd_hrtimer(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_rawmidi(E) snd_seq(E) snd_seq_device(E) cpufreq_conservative(E) cpufreq_powersave(E) cpufreq_userspace(E) nf_log_ipv4(E) nf_log_common(E) xt_LOG(E) xt_multiport(E) xt_conntrack(E) iptable_filter(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack(E) xt_CHECKSUM(E) xt_tcpudp(E) iptable_mangle(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) mxm_wmi(E) amdkfd(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) amdgpu(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) snd_hda_codec_hdmi(E) [Sat Jun 23 23:02:04 2018] chash(E) gpu_sched(E) snd_hda_intel(E) kvm_intel(E) ttm(E) snd_hda_codec(E) efi_pstore(E) drm_kms_helper(E) snd_hda_core(E) snd_pcsp(E) kvm(E) snd_hwdep(E) snd_pcm_oss(E) drm(E) irqbypass(E) snd_mixer_oss(E) intel_cstate(E) snd_pcm(E) mei_me(E) i2c_algo_bit(E) intel_uncore(E) snd_timer(E) coretemp(E) vhba(OE) snd(E) iTCO_wdt(E) intel_rapl_perf(E) efivars(E) joydev(E) evdev(E) iTCO_vendor_support(E) soundcore(E) shpchp(E) mei(E) sg(E) intel_pch_thermal(E) wmi(E) video(E) acpi_pad(E) button(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) parport_pc(E) ppdev(E) sunrpc(E) lp(E) parport(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) btrfs(E) zstd_decompress(E) zstd_compress(E) xxhash(E) algif_skcipher(E) af_alg(E) raid10(E) raid456(E) [Sat Jun 23 23:02:04 2018] async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) raid1(E) multipath(E) linear(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_crypt(E) dm_mod(E) raid0(E) md_mod(E) 
hid_generic(E) usbhid(E) hid(E) sr_mod(E) cdrom(E) sd_mod(E) uas(E) usb_storage(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) ahci(E) xhci_pci(E) aes_x86_64(E) libahci(E) crypto_simd(E) nvme(E) xhci_hcd(E) cryptd(E) glue_helper(E) libata(E) i2c_i801(E) alx(E) mdio(E) nvme_core(E) scsi_mod(E) usbcore(E) fan(E) thermal(E) [Sat Jun 23 23:02:04 2018] CPU: 2 PID: 1340 Comm: Xorg Tainted: G W OE 4.17.2+ #2 [Sat Jun 23 23:02:04 2018] Hardware name: MSI MS-7976/Z170A GAMING M7 (MS-7976), BIOS 1.J0 12/07/2017 [Sat Jun 23 23:02:04 2018] RIP: 0010:dm_update_crtcs_state+0x424/0x4b0 [amdgpu] [Sat Jun 23 23:02:04 2018] RSP: 0018:ffffb84fc4affa90 EFLAGS: 00010246 [Sat Jun 23 23:02:04 2018] RAX: 0000000000000000 RBX: ffff9d7e34528280 RCX: fffff1505f079c9f [Sat Jun 23 23:02:04 2018] RDX: 0000000000000017 RSI: ffff9d7e41f63800 RDI: 0000000000000286 [Sat Jun 23 23:02:04 2018] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [Sat Jun 23 23:02:04 2018] R10: ffffb84fc4affa90 R11: 00000000000005a0 R12: ffff9d7e41f63800 [Sat Jun 23 23:02:04 2018] R13: ffff9d7ec0f61800 R14: ffff9d7ec66a8c00 R15: 0000000000000000 [Sat Jun 23 23:02:04 2018] FS: 00007f614ec0ba40(0000) GS:ffff9d7eeec80000(0000) knlGS:0000000000000000 [Sat Jun 23 23:02:04 2018] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Sat Jun 23 23:02:04 2018] CR2: 00007f1dbb1ec0c8 CR3: 000000081aa56005 CR4: 00000000003606e0 [Sat Jun 23 23:02:04 2018] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [Sat Jun 23 23:02:04 2018] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [Sat Jun 23 23:02:04 2018] Call Trace: [Sat Jun 23 23:02:04 2018] amdgpu_dm_atomic_check+0x1a1/0x3d0 [amdgpu] [Sat Jun 23 23:02:04 2018] drm_atomic_check_only+0x3f3/0x4f0 [drm] [Sat Jun 23 23:02:04 2018] ? 
handle_conflicting_encoders+0x26c/0x280 [drm_kms_helper] [Sat Jun 23 23:02:04 2018] drm_atomic_commit+0x13/0x50 [drm] [Sat Jun 23 23:02:04 2018] drm_atomic_helper_set_config+0x67/0x90 [drm_kms_helper] [Sat Jun 23 23:02:04 2018] __drm_mode_set_config_internal+0x67/0x110 [drm] [Sat Jun 23 23:02:04 2018] drm_mode_setcrtc+0x452/0x5a0 [drm] [Sat Jun 23 23:02:04 2018] ? amdgpu_cs_wait_ioctl+0xe5/0x160 [amdgpu] [Sat Jun 23 23:02:04 2018] ? drm_mode_getcrtc+0x170/0x170 [drm] [Sat Jun 23 23:02:04 2018] drm_ioctl_kernel+0x67/0xb0 [drm] [Sat Jun 23 23:02:04 2018] drm_ioctl+0x2d1/0x390 [drm] [Sat Jun 23 23:02:04 2018] ? drm_mode_getcrtc+0x170/0x170 [drm] [Sat Jun 23 23:02:04 2018] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [Sat Jun 23 23:02:04 2018] do_vfs_ioctl+0xa2/0x620 [Sat Jun 23 23:02:04 2018] ? __x64_sys_futex+0x88/0x180 [Sat Jun 23 23:02:04 2018] ksys_ioctl+0x70/0x80 [Sat Jun 23 23:02:04 2018] __x64_sys_ioctl+0x16/0x20 [Sat Jun 23 23:02:04 2018] do_syscall_64+0x55/0x100 [Sat Jun 23 23:02:04 2018] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [Sat Jun 23 23:02:04 2018] RIP: 0033:0x7f614c650dd7 [Sat Jun 23 23:02:04 2018] RSP: 002b:00007ffd9280d9d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [Sat Jun 23 23:02:04 2018] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f614c650dd7 [Sat Jun 23 23:02:04 2018] RDX: 00007ffd9280da10 RSI: 00000000c06864a2 RDI: 000000000000000d [Sat Jun 23 23:02:04 2018] RBP: 00007ffd9280da10 R08: 0000000000000000 R09: 000055a69128dca0 [Sat Jun 23 23:02:04 2018] R10: 00007ffd9280dad0 R11: 0000000000000246 R12: 00000000c06864a2 [Sat Jun 23 23:02:04 2018] R13: 000000000000000d R14: 000055a690954b90 R15: 000055a69128dca0 [Sat Jun 23 23:02:04 2018] Code: 4c 89 ee 48 89 c7 e8 bc f5 ff ff 84 c0 0f 84 b7 fe ff ff e9 a0 fe ff ff 48 83 b8 08 0d 00 00 00 0f 85 67 ff ff ff e9 f5 fe ff ff <0f> 0b 41 8b 4f 60 48 c7 c2 d0 95 c7 c0 48 c7 c6 a0 57 ca c0 bf [Sat Jun 23 23:02:04 2018] RIP: dm_update_crtcs_state+0x424/0x4b0 [amdgpu] RSP: ffffb84fc4affa90 [Sat Jun 23 
23:02:04 2018] ---[ end trace 293f9551ffc27adc ]---
This is on a Fiji card. I have a 144Hz FreeSync-capable monitor, and can easily reproduce the error with this command (where 143.86 is the xrandr-advertised maximum frequency):
xrandr --output DisplayPort-0 --mode 2560x1440 --rate 143.86 --set "scaling mode" "Full aspect"
Interestingly, xrandr reports 59.95*+ as the current frequency, but my monitor says 144Hz. I tried firing up Grey Goo under Wine and that game also reports my monitor running at 144Hz.
If I just run:
xrandr --output DisplayPort-0 --mode 2560x1440 --rate 143.86
then xrandr correctly reports 143.86*, indicating that this frequency is now selected.
I can also run the following:
xrandr --output DisplayPort-0 --mode 2560x1440 --set "scaling mode" "Full aspect"
But if I combine these options as in the first command above, I get a GUI crash.
The symptoms are similar. In my case the screen is still on (not blank) but completely frozen. I was able to SSH in to get the above trace from the dmesg command. The machine cannot successfully shut down or reboot, and I need to physically hard reset the box at this point.
As others have said, this is definitely a regression. This didn't happen in older kernels.
The kernel BUG should be fixed since 4.17 by this commit:
commit 20d4ac659c76034586a3ab79489b0940631a65de
Author: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Date: Tue May 29 09:51:51 2018 -0400
drm/amd/display: Fix BUG_ON during CRTC atomic check update
For cases where the CRTC is inactive (DPMS off), where a modeset is not required, yet the CRTC is still in the atomic state, we should not attempt to update anything on it.
Previously, we were relying on the modereset_required() helper to check the above condition. However, the function returns false immediately if a modeset is not required, ignoring the CRTC's enable/active state flags. The correct way to filter is by looking at these flags instead.
Fixes: e277adc5a06c "drm/amd/display: Hookup color management functions"
Bugzilla: https://bugs.freedesktop.org/106194
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
I'll mark this as resolved but please reopen if it still reproduces.
I locked the screen and left it blank for a while to see what happens. The first time, there were still VBLANK-related errors, but they appear minor as the system woke up just fine. I let the screen blank twice, and the errors did not show up the second time. They appeared only the first time, though this is probably not enough to prove much. Still, I don't see that kernel BUG this time, so that must have been fixed.
I have yet to assess whether the bug can no longer be reproduced, as I've been avoiding leaving the screen blank: back then this bug caused loss of unsaved work and other problems due to the crash.
Jul 05 13:35:42 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Jul 05 13:35:55 linuxsys blueman-mechanism[2281]: Exiting
Jul 05 13:36:02 linuxsys kernel: [drm:dm_logger_write [amdgpu]] *ERROR* Failed to get VBLANK!
Jul 05 13:36:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Jul 05 13:36:22 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Jul 05 13:36:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Jul 05 13:36:22 linuxsys kernel: WARNING: CPU: 14 PID: 1998 at drivers/gpu/drm/drm_vblank.c:620 drm_calc_vbltimestamp_from_scanoutpos+0x2af/0x2f0 [drm] Jul 05 13:36:22 linuxsys kernel: Modules linked in: cmac rfcomm fuse bnep vmnet(O) nls_iso8859_1 nls_cp437 vfat fat arc4 amdkfd amd_iommu_v2 amdgpu iwlmvm chash gpu_sched i2c_algo_bit ttm mac80211 drm_kms_helper btusb uvcvideo btrtl iwlwif> Jul 05 13:36:22 linuxsys kernel: glue_helper pcspkr k10temp i2c_piix4 rtc_cmos shpchp tpm_tis_core battery ac tpm wmi rng_core asus_wireless gpio_amdpt i2c_hid pinctrl_amd evdev mac_hid acpi_cpufreq vmmon(O) vmw_vmci vboxnetflt(O) vboxnet> Jul 05 13:36:22 linuxsys kernel: CPU: 14 PID: 1998 Comm: xfwm4 Tainted: G O 4.17.3-1-MANJARO #1 Jul 05 13:36:22 linuxsys kernel: Hardware name: ASUSTeK COMPUTER INC. GL702ZC/GL702ZC, BIOS GL702ZC.303 12/15/2017 Jul 05 13:36:22 linuxsys kernel: RIP: 0010:drm_calc_vbltimestamp_from_scanoutpos+0x2af/0x2f0 [drm] Jul 05 13:36:22 linuxsys kernel: RSP: 0018:ffff9bae0b6d7b38 EFLAGS: 00010082 Jul 05 13:36:22 linuxsys kernel: RAX: ffffffffc114e400 RBX: ffff8f4ff7840800 RCX: 0000000000000000 Jul 05 13:36:22 linuxsys kernel: RDX: 0000000000000001 RSI: ffffffffc0e128a0 RDI: 0000000000000001 Jul 05 13:36:22 linuxsys kernel: RBP: ffff9bae0b6d7b98 R08: 0000000000000000 R09: ffffffffc0df1770 Jul 05 13:36:22 linuxsys kernel: R10: 0000000000000000 R11: ffffffffc0f814d0 R12: 0000000000000001 Jul 05 13:36:22 linuxsys kernel: R13: ffff8f4fe8eb01d8 R14: ffff8f4fe8eb0000 R15: ffff9bae0b6d7bac Jul 05 13:36:22 linuxsys kernel: FS: 00007f4c76bcdfc0(0000) GS:ffff8f4ffe980000(0000) knlGS:0000000000000000 Jul 05 13:36:22 linuxsys kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jul 05 13:36:22 linuxsys kernel: CR2: 00007f8de800bf98 CR3: 00000007520fc000 CR4: 00000000003406e0 Jul 05 13:36:22 linuxsys kernel: Call Trace: Jul 05 13:36:22 linuxsys kernel: drm_get_last_vbltimestamp+0x78/0x90 [drm] Jul 05 13:36:22 linuxsys kernel: drm_update_vblank_count+0x79/0x230 [drm] 
Jul 05 13:36:22 linuxsys kernel: drm_vblank_enable+0x101/0x120 [drm] Jul 05 13:36:22 linuxsys kernel: drm_vblank_get+0x8d/0xb0 [drm] Jul 05 13:36:22 linuxsys kernel: drm_wait_vblank_ioctl+0x138/0x630 [drm] Jul 05 13:36:22 linuxsys kernel: ? import_iovec+0x37/0xd0 Jul 05 13:36:22 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Jul 05 13:36:22 linuxsys kernel: drm_ioctl_kernel+0x5b/0xb0 [drm] Jul 05 13:36:22 linuxsys kernel: drm_ioctl+0x1b7/0x370 [drm] Jul 05 13:36:22 linuxsys kernel: ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm] Jul 05 13:36:22 linuxsys kernel: ? do_iter_write+0xdc/0x190 Jul 05 13:36:22 linuxsys kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] Jul 05 13:36:22 linuxsys kernel: do_vfs_ioctl+0xa4/0x610 Jul 05 13:36:22 linuxsys kernel: ? __sys_recvmsg+0x83/0xa0 Jul 05 13:36:22 linuxsys kernel: ksys_ioctl+0x60/0x90 Jul 05 13:36:22 linuxsys kernel: __x64_sys_ioctl+0x16/0x20 Jul 05 13:36:22 linuxsys kernel: do_syscall_64+0x5b/0x170 Jul 05 13:36:22 linuxsys kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Jul 05 13:36:22 linuxsys kernel: RIP: 0033:0x7f4c731df667 Jul 05 13:36:22 linuxsys kernel: RSP: 002b:00007ffcd9a51ae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Jul 05 13:36:22 linuxsys kernel: RAX: ffffffffffffffda RBX: 00007ffcd9a51b10 RCX: 00007f4c731df667 Jul 05 13:36:22 linuxsys kernel: RDX: 00007ffcd9a51b10 RSI: 00000000c018643a RDI: 000000000000000c Jul 05 13:36:22 linuxsys kernel: RBP: 00000000017bde80 R08: 0000000001000109 R09: 0000000000000000 Jul 05 13:36:22 linuxsys kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c018643a Jul 05 13:36:22 linuxsys kernel: R13: 00000000010037c3 R14: 000000000189dd10 R15: 0000000000000000 Jul 05 13:36:22 linuxsys kernel: Code: e9 af fd ff ff 44 89 e2 48 c7 c6 a0 28 e1 c0 bf 01 00 00 00 e8 53 ea ff ff 48 8b 83 98 03 00 00 48 83 78 28 00 0f 84 89 fd ff ff <0f> 0b 45 31 f6 e9 82 fd ff ff 48 c7 c7 68 28 e1 c0 45 31 f6 e8 Jul 05 13:36:22 linuxsys kernel: ---[ end trace e82ad29a813c3d81 
]---
Jul 05 13:36:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Jul 05 13:36:22 linuxsys kernel: [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
Jul 05 13:36:22 linuxsys kernel: [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
EDIT: Forgot to mention that I'm currently on the 4.17.3-1-MANJARO kernel, which is the latest at the time of writing.
I have this same problem with 4.18.5-1-MANJARO x86_64
$ inxi
CPU: Dual Core AMD A6-9500 RADEON R5 8 COMPUTE CORES 2C+6G (-MCP-) speed/min/max: 1622/1400/3500 MHz Kernel: 4.18.5-1-MANJARO x86_64 Up: 14m Mem: 1365.6/15029.9 MiB (9.1%) Storage: 1.93 TiB (23.8% used) Procs: 189 Shell: bash 4.4.23 inxi: 3.0.21
Running an XFCE desktop.
(In reply to ecomer from comment #39)
> I have this same problem with 4.18.5-1-MANJARO x86_64
Please file your own report. The reporter of this one says it's fixed.
ecomer, as I suspect we are both experiencing the same issue, here's a new report you (and anyone else lurking) can follow: https://bugs.freedesktop.org/show_bug.cgi?id=107863