Bug 108098 - Ryzen 7 2700U, amdgpu, graphics freezes on 4.19.0-041900-generic
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2018-09-29 15:45 UTC by Antonio Chirizzi
Modified: 2019-01-09 17:12 UTC (History)
Attachments
dmesg gzipped (21.47 KB, application/gzip)
2018-09-29 15:45 UTC, Antonio Chirizzi
kern.log gzipped with fault from 4.19.0-041900-generic (24.69 KB, application/gzip)
2018-10-28 12:21 UTC, Antonio Chirizzi

Description Antonio Chirizzi 2018-09-29 15:45:59 UTC
Created attachment 141794 [details]
dmesg gzipped

Just bought and installed a new Acer Swift 3 (From kern.log: Acer Swift SF315-41/Becks_RR, BIOS V2.03 06/15/2018)

# uname -a
Linux antonioRyzen 4.15.0-34-generic #37~16.04.1-Ubuntu SMP Tue Aug 28 10:44:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

# inxi -G
Graphics:  Card: Advanced Micro Devices [AMD/ATI] Device 15dd
           Display Server: X.org 1.18.4 drivers: ati (unloaded: fbdev,vesa,radeon,amdgpu)
           tty size: 156x48 Advanced Data: N/A for root



Running Linux Mint 18.3 with KDE, fully updated.
After a few hours of use the screen freezes. I can still connect via ssh, but a full shutdown is impossible (systemd reports timeouts after waiting 2 minutes on some processes).



This is what is logged at the moment of the freeze:



Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337272] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337279] amdgpu 0000:02:00.0:   at page 0x0000000104a4c000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337283] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00701031
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337290] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337292] amdgpu 0000:02:00.0:   at page 0x0000000104a4e000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337295] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337301] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337304] amdgpu 0000:02:00.0:   at page 0x0000000104a51000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337306] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337312] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337314] amdgpu 0000:02:00.0:   at page 0x0000000104a53000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337316] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337323] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337325] amdgpu 0000:02:00.0:   at page 0x0000000104a56000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337327] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337333] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337336] amdgpu 0000:02:00.0:   at page 0x0000000104a58000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337338] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337344] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337346] amdgpu 0000:02:00.0:   at page 0x0000000104a4f000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337348] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337354] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337357] amdgpu 0000:02:00.0:   at page 0x0000000104a50000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337358] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337365] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337367] amdgpu 0000:02:00.0:   at page 0x0000000104a54000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337369] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337375] amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337377] amdgpu 0000:02:00.0:   at page 0x0000000104a55000 from 27
Sep 29 15:21:13 antonioRyzen kernel: [ 7156.337379] amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
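
For anyone comparing this with their own logs, here is a minimal sketch that summarizes fault lines like the ones above. The regular expressions are modelled only on the messages quoted in this report, so other kernel versions may word them differently, and the name summarize_faults is hypothetical, not part of any amdgpu tooling.

```python
import re
from collections import Counter

# Patterns modelled on the amdgpu lines quoted in this report (assumption:
# other kernel versions may format these messages differently).
FAULT_RE = re.compile(
    r"VMC page fault \(src_id:(\d+) ring:(\d+) vm_id:(\d+) pas_id:(\d+)\)"
)
PAGE_RE = re.compile(r"at page (0x[0-9a-fA-F]+) from (\d+)")
STATUS_RE = re.compile(r"VM_L2_PROTECTION_FAULT_STATUS:(0x[0-9a-fA-F]+)")


def summarize_faults(lines):
    """Collect faulting pages, status words and (ring, vm_id) pairs."""
    pages = []            # faulting page addresses, in log order
    statuses = Counter()  # raw status word -> number of occurrences
    sources = Counter()   # (ring, vm_id) -> number of faults
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            sources[(int(m.group(2)), int(m.group(3)))] += 1
            continue
        m = PAGE_RE.search(line)
        if m:
            pages.append(int(m.group(1), 16))
            continue
        m = STATUS_RE.search(line)
        if m:
            statuses[m.group(1)] += 1
    return pages, statuses, sources


# The first two faults from the log above, syslog prefix removed.
sample = [
    "amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)",
    "amdgpu 0000:02:00.0:   at page 0x0000000104a4c000 from 27",
    "amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00701031",
    "amdgpu 0000:02:00.0: [gfxhub] VMC page fault (src_id:0 ring:24 vm_id:7 pas_id:0)",
    "amdgpu 0000:02:00.0:   at page 0x0000000104a4e000 from 27",
    "amdgpu 0000:02:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000",
]
pages, statuses, sources = summarize_faults(sample)
print(f"{len(pages)} faults, pages {min(pages):#x}..{max(pages):#x}")
print(dict(statuses), dict(sources))
```

Run against the full kern.log, this shows at a glance whether every fault comes from the same ring and vm_id and how tightly the faulting pages are clustered, which is the pattern visible in the excerpt above.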



This is what I can see at boot time in the kern.log:


[    3.392314] usb 3-2.2: new high-speed USB device number 4 using xhci_hcd
[    3.436384] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 1us * 100 tries - tgn10_lock line:566
[    3.436449] WARNING: CPU: 1 PID: 291 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:190 generic_reg_wait+0xf5/0x110 [amdgpu]
[    3.436450] Modules linked in: dm_mirror dm_region_hash dm_log hid_logitech_hidpp hid_logitech_dj usbhid amdkfd amd_iommu_v2 amdgpu chash i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt ahci fb_sys_fops libahci drm video wmi i2c_hid hid
[    3.436468] CPU: 1 PID: 291 Comm: plymouthd Not tainted 4.15.0-34-generic #37~16.04.1-Ubuntu
[    3.436469] Hardware name: Acer Swift SF315-41/Becks_RR, BIOS V2.03 06/15/2018
[    3.436507] RIP: 0010:generic_reg_wait+0xf5/0x110 [amdgpu]
[    3.436508] RSP: 0018:ffffb749c1913900 EFLAGS: 00010297
[    3.436509] RAX: 0000000000000001 RBX: 0000000000000065 RCX: ffffffffb02626e8
[    3.436510] RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000246
[    3.436511] RBP: ffffb749c1913940 R08: 000000000006a8a0 R09: 0000000000000383
[    3.436512] R10: 0000000000000002 R11: ffffffffb075380d R12: 0000000000000001
[    3.436513] R13: 0000000000000064 R14: ffff93089a3e6680 R15: 0000000000000100
[    3.436514] FS:  00007fa393f96740(0000) GS:ffff9308a7640000(0000) knlGS:0000000000000000
[    3.436515] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.436516] CR2: 0000557d22ffc680 CR3: 000000021aa98000 CR4: 00000000003406e0
[    3.436517] Call Trace:
[    3.436561]  tgn10_lock+0xa2/0xb0 [amdgpu]
[    3.436601]  program_all_pipe_in_tree+0x80d/0x8c0 [amdgpu]
[    3.436641]  ? amdgpu_cgs_write_register+0x14/0x20 [amdgpu]
[    3.436679]  ? amdgpu_cgs_read_register+0x14/0x20 [amdgpu]
[    3.436718]  dcn10_apply_ctx_for_surface+0x413/0x510 [amdgpu]
[    3.436755]  dc_commit_state+0x265/0x550 [amdgpu]
[    3.436797]  amdgpu_dm_atomic_commit_tail+0x2c4/0xae0 [amdgpu]
[    3.436828]  ? amdgpu_bo_pin_restricted+0x9f/0x290 [amdgpu]
[    3.436868]  ? dm_plane_helper_prepare_fb+0x1d3/0x260 [amdgpu]
[    3.436879]  commit_tail+0x42/0x70 [drm_kms_helper]
[    3.436885]  drm_atomic_helper_commit+0x116/0x120 [drm_kms_helper]
[    3.436900]  ? drm_atomic_check_only+0x456/0x570 [drm]
[    3.436939]  amdgpu_dm_atomic_commit+0x91/0xa0 [amdgpu]
[    3.436951]  drm_atomic_commit+0x51/0x60 [drm]
[    3.436956]  restore_fbdev_mode_atomic+0x17b/0x1e0 [drm_kms_helper]
[    3.436962]  restore_fbdev_mode+0x32/0x130 [drm_kms_helper]
[    3.436967]  drm_fb_helper_restore_fbdev_mode_unlocked.part.31+0x28/0x70 [drm_kms_helper]
[    3.436972]  drm_fb_helper_restore_fbdev_mode_unlocked+0x25/0x30 [drm_kms_helper]
[    3.437003]  amdgpu_fbdev_restore_mode+0x1f/0x50 [amdgpu]
[    3.437032]  amdgpu_driver_lastclose_kms+0x12/0x20 [amdgpu]
[    3.437041]  drm_lastclose+0x3c/0xf0 [drm]
[    3.437050]  drm_release+0x2cf/0x390 [drm]
[    3.437056]  __fput+0xea/0x220
[    3.437058]  ____fput+0xe/0x10
[    3.437062]  task_work_run+0x8a/0xb0
[    3.437066]  exit_to_usermode_loop+0xc4/0xd0
[    3.437068]  do_syscall_64+0xf4/0x130
[    3.437073]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[    3.437074] RIP: 0033:0x7fa3936978f0
[    3.437075] RSP: 002b:00007ffd115bb108 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[    3.437076] RAX: 0000000000000000 RBX: 00000000020c3dc0 RCX: 00007fa3936978f0
[    3.437077] RDX: 00000000020b1290 RSI: 00007fa393964b28 RDI: 0000000000000009
[    3.437078] RBP: 0000000000000009 R08: 00000000020c3dc0 R09: 0000000000000000
[    3.437079] R10: 00007fa393964b78 R11: 0000000000000246 R12: 000000000000e200
[    3.437079] R13: 00007fa39397d650 R14: 0000000000000000 R15: 00000000020c66f0
[    3.437081] Code: 18 31 f6 45 89 e8 44 89 e1 48 c7 c7 7a c1 71 c0 89 45 d4 52 48 c7 c2 f8 46 71 c0 e8 26 85 dd ff 41 83 7e 20 01 58 8b 45 d4 74 02 <0f> 0b 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b e9 2f 
[    3.437106] ---[ end trace fce3b3b89c93a318 ]---
[    3.437277] [drm] DC: Cursor address is 0!
[    3.437447] [drm] DC: Cursor address is 0!
[    3.437488] [drm] {1920x1080, 2230x1140@152600Khz}
[    3.449512] [drm] HBRx2 pass VS=1, PE=0
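
To make a trace like the one above easier to compare across kernels, here is a rough sketch that pulls the stack frames out of the log text. It assumes the "[timestamp]  symbol+0xOFFSET/0xSIZE [module]" layout seen in this report; parse_frames is a hypothetical helper name, and frames without a module tag are labelled "vmlinux" purely as a convention of this sketch. A leading "?" marks a frame the unwinder could not verify.

```python
import re

# One stack frame as printed in this report:
#   [    3.436561]  tgn10_lock+0xa2/0xb0 [amdgpu]
#   [    3.436641]  ? amdgpu_cgs_write_register+0x14/0x20 [amdgpu]
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\? )?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
    r"(?: \[(\w+)\])?\s*$"
)


def parse_frames(lines):
    """Return one dict per stack frame found in the given log lines."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line)
        if not m:
            continue  # register dumps, WARNING line, etc. are skipped
        frames.append({
            "reliable": m.group(1) is None,   # False for "? symbol" frames
            "symbol": m.group(2),
            "offset": int(m.group(3), 16),    # offset into the function
            "size": int(m.group(4), 16),      # total function size
            "module": m.group(5) or "vmlinux",
        })
    return frames


sample = [
    "[    3.436507] RIP: 0010:generic_reg_wait+0xf5/0x110 [amdgpu]",
    "[    3.436561]  tgn10_lock+0xa2/0xb0 [amdgpu]",
    "[    3.436641]  ? amdgpu_cgs_write_register+0x14/0x20 [amdgpu]",
    "[    3.436967]  drm_fb_helper_restore_fbdev_mode_unlocked.part.31+0x28/0x70 [drm_kms_helper]",
    "[    3.437056]  __fput+0xea/0x220",
]
frames = parse_frames(sample)
for f in frames:
    mark = " " if f["reliable"] else "?"
    print(f'{mark} {f["module"]:>16}  {f["symbol"]}')
```

Filtering on the reliable frames gives the path from the close() syscall through drm_lastclose and the fbdev restore down to tgn10_lock, where the REG_WAIT timeout is reported.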
Comment 1 Michel Dänzer 2018-10-01 09:46:07 UTC
Can you try a newer kernel? Support for Ryzen APUs was known to be a bit rough around the edges until recently.
Comment 2 Antonio Chirizzi 2018-10-01 21:44:24 UTC
Hello Michel,

I installed the latest stable Ubuntu kernel after my laptop froze again half an hour ago (it was still accessible through ssh).

# uname -a
Linux antonioRyzen 4.18.0-041800-generic #201808122131 SMP Sun Aug 12 21:33:20 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

With this one, at least no call traces are shown in the log.

I'll let you know if it freezes again.

Thanks!
Comment 3 Antonio Chirizzi 2018-10-15 15:30:28 UTC
Hello,

After a few days without problems, I got a complete freeze today.
The laptop was really stuck; no ssh access was possible.

In the meantime I managed to install the latest Linux Mint 19 Cinnamon, and also reinstalled the latest kernel 4.18.0-041800-generic from Ubuntu.

I have no info in any log file or any core available. The computer just froze.

What can I do to collect useful data the next time it freezes?

Thanks
Comment 4 Antonio Chirizzi 2018-10-28 12:21:48 UTC
Created attachment 142243 [details]
kern.log gzipped with fault from 4.19.0-041900-generic

This keeps happening with the latest Ubuntu kernel 4.19.0-041900-generic as well.

AMD Ryzen 7 2700U frozen.

It's becoming really difficult to work on Linux with the Ryzen 7!
Comment 5 fin4478 2018-10-28 17:23:51 UTC Comment hidden (spam)
Comment 6 Antonio Chirizzi 2018-11-01 13:15:55 UTC
Hello fin4478, thanks for the suggestion.
I have tried the latest Manjaro Xfce. It boots, but it is not able to start the graphical environment. I have seen that a few other people with the Ryzen 7 2700U (my CPU) are experiencing the same problem with Manjaro.
Moreover, it starts up with kernel 4.14, which I think is too old for the Ryzen.

Going back to your comment on Linux Mint 19 and the old software: is it possible, in your opinion, that the old software is causing the amdgpu fault in the kernel?
Comment 7 Michel Dänzer 2018-11-01 15:28:27 UTC
FWIW, I advise against paying too much attention to fin4478. They are not involved in driver development and known for making rather questionable suggestions which are definitely not suitable for everyone.
Comment 8 Antonio Chirizzi 2018-11-17 16:17:07 UTC
Hello again,

I have been testing my laptop (AMD Ryzen 7 2700U) with the latest Ubuntu kernel 4.19.0-041900-generic for the last 2 weeks.

It does not freeze and everything works well if I use the Wi-Fi at 2.4 GHz.

It DOES FREEZE if I use the Wi-Fi at 5 GHz.

What I don't understand is that no amdgpu kernel fault lines appear in the kern.log if I use the 2.4 GHz Wi-Fi.

How is that connected?

Thanks,

-Antonio
Comment 9 Daniel Stone 2019-01-09 17:12:20 UTC
(In reply to Michel Dänzer from comment #7)
> FWIW, I advise against paying too much attention to fin4478. They are not
> involved in driver development and known for making rather questionable
> suggestions which are definitely not suitable for everyone.

@fin4478: Please stop posting in these bugs asserting which distributions/environments/etc do and don't work. Most of what you say is not factual, and though I'm sure you have good intentions, you are misleading users and frustrating both users and developers. If you do not stop with these interjections, we may have to remove your ability to comment.

