Bug 109001

Summary: Freezes when waking up after screen goes blank.
Product: DRI
Reporter: Julio <juliolokooo>
Component: DRM/AMDgpu
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: normal
Priority: medium
CC: FD, harry.wentland, nicholas.kazlauskas, wirch.eduard
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=105018
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description     Flags
Dmesg-output    none
Xorg Log        none

Description Julio 2018-12-10 14:46:12 UTC
Created attachment 142769 [details]
Dmesg-output

Similar to bug 105018, but instead of a kernel panic, the system freezes for some time after the screen wakes up.

Tested on Arch Linux; it can be reproduced on Linux 4.18 and 4.19, using an AMD RX550.
It always happens when using DPMS/screen blanking, after leaving the screen blank for some minutes and waking it up.
It keeps freezing for some time, sometimes for seconds, other times for minutes.
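
For anyone trying to reproduce this without waiting for the blanking timeout, a minimal sketch of a forced DPMS cycle follows. It assumes an X session with the `xset` utility available; the helper name `dpms_cycle` and its `run`/`sleep` parameters are hypothetical, and a forced cycle may or may not trigger the same freeze as an idle timeout.

```python
import subprocess
import time


def dpms_cycle(blank_seconds, run=subprocess.run, sleep=time.sleep):
    """Blank the screen via DPMS, wait, then wake it again.

    `run` and `sleep` are injectable so the sequence can be checked
    without touching a real display.
    """
    # Force the monitor into the DPMS "off" state.
    run(["xset", "dpms", "force", "off"], check=True)
    # Keep it blank for the requested time (the report mentions minutes).
    sleep(blank_seconds)
    # Wake the monitor again; the freezes were observed after this point.
    run(["xset", "dpms", "force", "on"], check=True)
```

Calling `dpms_cycle(300)` from a terminal inside the session would blank the screen for five minutes; any keyboard or mouse input during that time wakes it earlier.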

The attached dmesg output is similar to bug 105018. 
Every time the system is frozen, those 2 lines are shown:
"[drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
[ 1701.356720] [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!"

Tried removing xf86-video-amdgpu (DDX Driver), and while the freezes disappear, it takes longer to wake up the screen and dmesg only shows "[amdgpu]] *ERROR* Failed to get VBLANK!" several times.
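
To compare how often each message appears with and without the DDX driver, a small sketch like the following could tally them from saved dmesg output. The patterns are copied from the messages quoted above; the function name `count_amdgpu_errors` and the key layout are assumptions made for illustration.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this report.
NULL_STREAM = re.compile(
    r"\[drm:(?P<func>\w+) \[amdgpu\]\] \*ERROR\* "
    r"dc_stream_state is NULL for crtc '(?P<crtc>\d+)'"
)
VBLANK_FAIL = re.compile(r"\*ERROR\* Failed to get VBLANK!")


def count_amdgpu_errors(dmesg_text):
    """Count both error kinds, keyed by (kind, detail)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = NULL_STREAM.search(line)
        if match:
            detail = f"{match['func']} crtc {match['crtc']}"
            counts[("null_stream", detail)] += 1
        elif VBLANK_FAIL.search(line):
            counts[("vblank_fail", "")] += 1
    return counts
```

Feeding it the text of a saved dmesg file, for example `count_amdgpu_errors(open("dmesg.txt").read())` with a hypothetical file name, gives one counter entry per function and CRTC.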
Comment 1 Michel Dänzer 2018-12-10 14:49:10 UTC
Please attach the corresponding Xorg log file.
Comment 2 Julio 2018-12-10 15:24:06 UTC
Created attachment 142770 [details]
Xorg Log
Comment 3 Michel Dänzer 2018-12-10 15:30:46 UTC
The Xorg log file doesn't have any messages corresponding to those in dmesg. Was it really captured after reproducing the problem?

Does https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/15 help with the freezes by any chance?
Comment 4 Julio 2018-12-10 16:35:06 UTC
(In reply to Michel Dänzer from comment #3)
> The Xorg log file doesn't have any messages corresponding to those in dmesg.
> Was it really captured after reproducing the problem?
> 
> Does
> https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/
> 15 help or the freezes by any chance?

I checked again, but nothing is printed to the Xorg log.
Those "device removed" messages are always shown after the freeze; I'm not sure if they are related.

I tried that patch but it doesn't seem to make any difference.
Comment 5 L.S.S. 2018-12-11 05:20:51 UTC
I'm not sure about the freeze you experienced. I'm having similar issues on the latest Manjaro (kernels 4.18-4.19): after wakeup, the screen freezes intermittently for a few seconds every 2-3 minutes. Aside from the good old VBLANK-related error messages (like those in the attached dmesg), no other errors are recorded in the log, and the system keeps freezing like that until I reboot.

The patches in my old bug report (which were eventually merged at some time around 4.17) did not fix the root cause but at least fixed the 100% reproducible kernel panic I used to have.

The problem has been observed on the same laptop I use for work (which was also the one I used to test and report the previous bug), and for that reason, I'm still not confident about locking the screen (since LightDM GTK Greeter doesn't honor the "Timeout until the screen blanks", nor related settings in power options), as the risk of losing unsaved work is still there.
Comment 6 fin4478 2018-12-11 20:53:54 UTC Comment hidden (spam)
Comment 7 Julio 2018-12-12 16:01:45 UTC
(In reply to fin4478 from comment #6)
> Comment on attachment 142770 [details]
> Xorg Log
> 
> Disable the Xfce compositor, vsync feature in that causes it tainted. That
> might not help.  Use the display port, development focus is in the display
> port. Developers have  display monitors and are not interested about hdmi so
> much.

That's it. I disabled V-Sync in the Xfce compositor settings and there are no more freezes or errors.
So this is an Xfwm4 bug?
Comment 8 Michel Dänzer 2018-12-12 16:12:06 UTC
I suspect disabling V-Sync in xfwm4 just avoids the problem, i.e. it's a workaround, not a fix. Unless you're using rotation or other RandR transforms, leaving V-Sync enabled and leaving TearFree at the default (auto) might also avoid the problem, and would be more efficient.

Does https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu/merge_requests/15 help by any chance?
Comment 9 Matt Garman 2018-12-31 00:18:30 UTC
I too am having the same problem.  After the screen blanks, the system is barely usable: pauses/freezes randomly.  If I switch to a virtual (text) console, everything works as expected.  Switching back to X usually causes X to crash, kicking me back to the login screen.  Usually after I re-login, the pauses are gone.  Sometimes it takes a second try.  Every now and then, though, the pausing/freezing won't go away, and a reboot is required.

I am using Manjaro, which is based on Arch, so fairly similar to the original poster.  Kernel 4.19.12-2-MANJARO.

In addition to the error log lines already reported:

[32952.381123] [drm:dm_vblank_get_counter [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!
[32952.381201] [drm:dm_crtc_get_scanoutpos [amdgpu]] *ERROR* dc_stream_state is NULL for crtc '1'!

I also have the following kernel trace in dmesg:

[24708.647186] WARNING: CPU: 2 PID: 638 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1570 core_link_enable_stream+0x657/0xb90 [amdgpu]
[24708.647187] Modules linked in: iptable_mangle xt_CHECKSUM iptable_nat ipt_MASQUERADE nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache cmac bnep nct6775 hwmon_vid amdkfd amd_iommu_v2 amdgpu arc4 iwlmvm edac_mce_amd snd_hda_codec_hdmi kvm_amd snd_hda_intel ccp mac80211 rng_core kvm snd_hda_codec btusb chash btrtl gpu_sched btbcm fuse irqbypass ttm snd_hda_core btintel snd_hwdep iwlwifi crct10dif_pclmul drm_kms_helper crc32_pclmul bluetooth snd_pcm ghash_clmulni_intel pcbc drm snd_timer mousedev cfg80211 igb agpgart syscopyarea sysfillrect aesni_intel wmi_bmof
[24708.647214]  ecdh_generic snd sysimgblt aes_x86_64 i2c_algo_bit crypto_simd cryptd input_leds glue_helper fb_sys_fops soundcore dca rfkill pcspkr sp5100_tco i2c_piix4 k10temp wmi evdev mac_hid gpio_amdpt pinctrl_amd pcc_cpufreq acpi_cpufreq sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto sr_mod cdrom sd_mod hid_generic usbhid hid ahci libahci libata xhci_pci scsi_mod xhci_hcd crc32c_intel
[24708.647233] CPU: 2 PID: 638 Comm: Xorg Tainted: G        W         4.19.12-2-MANJARO #1
[24708.647234] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Gaming-ITX/ac, BIOS P5.30 12/18/2018
[24708.647268] RIP: 0010:core_link_enable_stream+0x657/0xb90 [amdgpu]
[24708.647270] Code: e8 b9 02 00 00 00 48 8d 54 24 58 44 89 ee 48 89 df 45 0f b6 74 24 01 44 88 7c 24 58 44 88 74 24 59 e8 9d dd ff ff 84 c0 75 02 <0f> 0b 41 8d 47 f6 3c 02 77 b5 41 80 ff 0a 0f 85 2c 03 00 00 44 88
[24708.647271] RSP: 0018:ffffb75a43e578d8 EFLAGS: 00010246
[24708.647272] RAX: 0000000000000000 RBX: ffff9496df3b4188 RCX: 0000000000000000
[24708.647273] RDX: 0000000001d5f402 RSI: ffff9498108a70c0 RDI: ffff949810407600
[24708.647274] RBP: ffffb75a43e5791c R08: 00000000000270c0 R09: ffffffffc0e7803d
[24708.647274] R10: ffffe0670974f400 R11: ffffb75a43e57720 R12: ffffb75a43e5791a
[24708.647275] R13: 000000000000005e R14: 000000000000004c R15: 000000000000000c
[24708.647276] FS:  00007f03c1ad5e00(0000) GS:ffff949810880000(0000) knlGS:0000000000000000
[24708.647277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[24708.647278] CR2: 00007f75a1af47a0 CR3: 000000038c548000 CR4: 00000000003406e0
[24708.647278] Call Trace:
[24708.647320]  dce110_apply_ctx_to_hw+0x63f/0x650 [amdgpu]
[24708.647355]  dc_commit_state+0x2c6/0x520 [amdgpu]
[24708.647383]  ? set_freesync_on_streams.part.6+0x4d/0x250 [amdgpu]
[24708.647410]  ? mod_freesync_set_user_enable+0x11f/0x150 [amdgpu]
[24708.647439]  amdgpu_dm_atomic_commit_tail+0x388/0xdb0 [amdgpu]
[24708.647441]  ? _raw_spin_lock_irq+0x1a/0x40
[24708.647442]  ? _raw_spin_unlock_irq+0x1d/0x30
[24708.647443]  ? wait_for_common+0x113/0x190
[24708.647444]  ? _raw_spin_unlock_irq+0x1d/0x30
[24708.647445]  ? wait_for_common+0x113/0x190
[24708.647450]  commit_tail+0x3d/0x70 [drm_kms_helper]
[24708.647454]  drm_atomic_helper_commit+0x103/0x110 [drm_kms_helper]
[24708.647458]  drm_atomic_helper_set_config+0x80/0x90 [drm_kms_helper]
[24708.647466]  drm_mode_setcrtc+0x187/0x6b0 [drm]
[24708.647468]  ? __switch_to_asm+0x34/0x70
[24708.647469]  ? __switch_to_asm+0x40/0x70
[24708.647470]  ? __switch_to_asm+0x34/0x70
[24708.647470]  ? __switch_to_asm+0x40/0x70
[24708.647476]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[24708.647482]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[24708.647488]  drm_ioctl+0x30e/0x3c0 [drm]
[24708.647494]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[24708.647512]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[24708.647514]  do_vfs_ioctl+0xa4/0x630
[24708.647516]  ? syscall_slow_exit_work+0x19b/0x1b0
[24708.647517]  ksys_ioctl+0x60/0x90
[24708.647518]  __x64_sys_ioctl+0x16/0x20
[24708.647520]  do_syscall_64+0x65/0x180
[24708.647521]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[24708.647522] RIP: 0033:0x7f03c46f480b
[24708.647523] Code: 0f 1e fa 48 8b 05 55 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 25 b6 0c 00 f7 d8 64 89 01 48
[24708.647524] RSP: 002b:00007ffe7fa7e4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[24708.647525] RAX: ffffffffffffffda RBX: 00007ffe7fa7e520 RCX: 00007f03c46f480b
[24708.647525] RDX: 00007ffe7fa7e520 RSI: 00000000c06864a2 RDI: 000000000000000d
[24708.647526] RBP: 00007ffe7fa7e520 R08: 0000000000000000 R09: 0000564f3061e110
[24708.647527] R10: 00007ffe7fa7e5e0 R11: 0000000000000246 R12: 00000000c06864a2
[24708.647527] R13: 000000000000000d R14: 0000000000000000 R15: 0000564f3061e110
[24708.647529] ---[ end trace 47c6b0b9b26e8c53 ]---
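
When comparing traces like the one above across reports, it can help to reduce them to the frames the kernel considers reliable. In kernel stack dumps, entries prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain. The sketch below is only an illustration; the helper name `reliable_frames` and the line-prefix handling (dmesg timestamps or a journal-style `kernel:` prefix) are assumptions based on the logs in this report.

```python
import re

# One stack frame line, for example:
#   "[24708.647355]  dc_commit_state+0x2c6/0x520 [amdgpu]"
# An optional "? " marks a frame the unwinder could not confirm.
FRAME = re.compile(
    r"^\s*(?:\[\s*\d+\.\d+\]|kernel:)?\s+"
    r"(?P<guess>\? )?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)


def reliable_frames(oops_text):
    """Return (symbol, module) pairs for confirmed frames after "Call Trace:"."""
    frames = []
    in_trace = False
    for line in oops_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME.match(line)
        if match is None:
            # The first non-frame line (e.g. "RIP: ...") ends the trace.
            in_trace = False
            continue
        if not match["guess"]:
            frames.append((match["symbol"], match["module"]))
    return frames
```

Frames that belong to the core kernel rather than a module are returned with `None` as the module.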

Any other info I can provide to help track this one down?

Thanks!
Comment 10 wirch.eduard 2019-01-02 09:18:02 UTC
Disabling TearFree and page flipping does not help. I'm running kernel 4.20.0-1 with these settings:

    Section "Device"
        Identifier "Device0"
        Driver     "amdgpu"
        BusID      "PCI:14:0:0"
        Option     "DRI" "2"
        Option     "TearFree" "off"
        Option     "EnablePageFlip" "off"
    EndSection

The system still freezes after waking up from power save mode:

    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: [drm:dce110_vblank_set [amdgpu]] *ERROR* Failed to get VBLANK!
    kernel: general protection fault: 0000 [#1] PREEMPT SMP NOPTI
    kernel: CPU: 12 PID: 1442 Comm: xfwm4 Not tainted 4.20.0-1-MANJARO #1
    kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Taichi, BIOS P3.30 01/15/2018
    kernel: RIP: 0010:dce110_vblank_set+0x4a/0xa0 [amdgpu]
    kernel: Code: ef e8 ba 7b 03 00 84 db 74 38 83 e8 4e 0f b6 c0 48 69 c0 30 04 00 00 49 03 85 30 01 00 00 48 8b b8 88 02 00 00 48 85 ff 74 32 <48> 8b 07 be 02 00 00 00 48 8b 80 f0 00 00 00 e8 f2 c1 12 e9 84 c0
    kernel: RSP: 0018:ffffaa058767bbe0 EFLAGS: 00010002
    kernel: RAX: ffff93ede3b66960 RBX: 0000000000000001 RCX: 0000000000000000
    kernel: RDX: 0000000000000000 RSI: 00000000fffffff9 RDI: 0004000600010001
    kernel: RBP: ffffffffc137d050 R08: 0000000000000000 R09: ffffffffc0888c4b
    kernel: R10: 0000000000000000 R11: ffffffffc0875c10 R12: ffff93ef8c8fa000
    kernel: R13: ffff93ef91e06000 R14: ffff93f0d4ea01c8 R15: ffffaa058767bd98
    kernel: FS:  00007fbb6b5e9180(0000) GS:ffff93f0def00000(0000) knlGS:0000000000000000
    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    kernel: CR2: 000010ef9ae91000 CR3: 0000000e72de2000 CR4: 00000000003406e0
    kernel: Call Trace:
    kernel:  dm_enable_vblank+0x26/0x30 [amdgpu]
    kernel:  drm_vblank_enable+0xd4/0x120 [drm]
    kernel:  drm_vblank_get+0x88/0xa0 [drm]
    kernel:  drm_wait_vblank_ioctl+0x138/0x630 [drm]
    kernel:  ? import_iovec+0x52/0xb0
    kernel:  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
    kernel:  drm_ioctl_kernel+0xaf/0xf0 [drm]
    kernel:  drm_ioctl+0x333/0x3e0 [drm]
    kernel:  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
    kernel:  ? do_iter_write+0xda/0x190
    kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
    kernel:  do_vfs_ioctl+0xa4/0x630
    kernel:  ksys_ioctl+0x60/0x90
    kernel:  __x64_sys_ioctl+0x16/0x20
    kernel:  do_syscall_64+0x65/0x180
    kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    kernel: RIP: 0033:0x7fbb6d8ff80b
    kernel: Code: 0f 1e fa 48 8b 05 55 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 25 b6 0c 00 f7 d8 64 89 01 48
    kernel: RSP: 002b:00007ffe82b09618 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    kernel: RAX: ffffffffffffffda RBX: 00007ffe82b09640 RCX: 00007fbb6d8ff80b
    kernel: RDX: 00007ffe82b09640 RSI: 00000000c018643a RDI: 000000000000000c
    kernel: RBP: 000055843bb359e0 R08: 00007ffe82b790b0 R09: 00007ffe82b79080
    kernel: R10: 000000000001e54e R11: 0000000000000246 R12: 00000000c018643a
    kernel: R13: 000000000080e489 R14: 0000000000000000 R15: 000055843bc31240
    kernel: Modules linked in: cmac rfcomm fuse ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay aufs bnep arc4 amdgpu iwlmvm edac_mce_amd kvm mac80211 irqbypass nls_iso8859_1 nls_cp437 >
    kernel:  crc32c_generic crc16 mbcache jbd2 fscrypto sr_mod cdrom hid_generic usbhid hid ahci libahci libata xhci_pci crc32c_intel xhci_hcd scsi_mod
    kernel: ---[ end trace 53aef1365621a569 ]---
    kernel: RIP: 0010:dce110_vblank_set+0x4a/0xa0 [amdgpu]
    kernel: Code: ef e8 ba 7b 03 00 84 db 74 38 83 e8 4e 0f b6 c0 48 69 c0 30 04 00 00 49 03 85 30 01 00 00 48 8b b8 88 02 00 00 48 85 ff 74 32 <48> 8b 07 be 02 00 00 00 48 8b 80 f0 00 00 00 e8 f2 c1 12 e9 84 c0
    kernel: RSP: 0018:ffffaa058767bbe0 EFLAGS: 00010002
    kernel: RAX: ffff93ede3b66960 RBX: 0000000000000001 RCX: 0000000000000000
    kernel: RDX: 0000000000000000 RSI: 00000000fffffff9 RDI: 0004000600010001
    kernel: RBP: ffffffffc137d050 R08: 0000000000000000 R09: ffffffffc0888c4b
    kernel: R10: 0000000000000000 R11: ffffffffc0875c10 R12: ffff93ef8c8fa000
    kernel: R13: ffff93ef91e06000 R14: ffff93f0d4ea01c8 R15: ffffaa058767bd98
    kernel: FS:  00007fbb6b5e9180(0000) GS:ffff93f0def00000(0000) knlGS:0000000000000000
    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    kernel: CR2: 000010ef9ae91000 CR3: 0000000e72de2000 CR4: 00000000003406e0
    kernel: note: xfwm4[1442] exited with preempt_count 2


I'm stuck with kernel 4.14 because of this nasty bug.
Comment 11 Michel Dänzer 2019-01-03 11:35:50 UTC
(In reply to wirch.eduard from comment #10)
> I'm stuck with kernel 4.14 because of this nasty bug.

You can try amdgpu.dc=0 as a workaround with current kernels.
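
After adding the parameter to the boot loader configuration and rebooting, one way to confirm it actually reached the kernel is to look for it in the kernel command line. The following is a minimal sketch; the helper name `amdgpu_dc_setting` is hypothetical, and taking the last occurrence when the option is repeated is an assumption of this sketch.

```python
def amdgpu_dc_setting(cmdline):
    """Return the value given for amdgpu.dc on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "amdgpu.dc" and sep:
            # Assumption: if the option is repeated, keep the last one.
            value = val
    return value
```

On a running system the input would come from `/proc/cmdline`, for example `amdgpu_dc_setting(open("/proc/cmdline").read())`; a result of `"0"` means the workaround was passed at boot.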
Comment 12 wirch.eduard 2019-01-04 07:50:44 UTC
Disabling the new display code does indeed help. Thanks for the hint. Hopefully the recent patch set for 4.21 (https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-Fixes-For-Linux-4.21) fixes this problem.
Comment 13 Matt Garman 2019-01-05 03:00:42 UTC
If I set amdgpu.dc=0 on my Ryzen 5 2400g, the system hard-locks as soon as X starts (i.e. hard reset via physical power button required).
Comment 14 Alex Deucher 2019-01-05 15:57:15 UTC
(In reply to Matt Garman from comment #13)
> If I set amdgpu.dc=0 on my Ryzen 5 2400g, the system hard-locks as soon as X
> starts (i.e. hard reset via physical power button required).

There is no non-DC code for raven, so if you set dc=0, the display hardware is not initialized and no display features are exposed to the OS.  The GPU is basically a render-only device in that case.
Comment 15 Martin Peres 2019-11-19 09:07:59 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/641.
