This bug happens for me like once a day, at seemingly random times with latest kernel from torvalds tree. Last merge was yesterday https://github.com/torvalds/linux/commit/0fe4e2d5cd931ad2ff99d61cfdd5c6dc0c3ec60b but in the previous drm merge the bug was also present. System: Host: mjb Kernel: 4.20.0-1-tkg-cfs x86_64 bits: 64 Desktop: KDE Plasma 5.14.4 Distro: Manjaro Linux Machine: Type: Desktop Mobo: ASUSTeK model: TUF B450M-PLUS GAMING v: Rev X.0x serial: <root required> UEFI: American Megatrends v: 0601 date: 10/29/2018 CPU: 6-Core: AMD Ryzen 5 2600 type: MT MCP speed: 3885 MHz min/max: 1550/3900 MHz Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] driver: amdgpu v: kernel Display: x11 server: X.Org 1.20.3 driver: amdgpu resolution: 1920x1080~60Hz OpenGL: renderer: Radeon RX 580 Series (POLARIS10 DRM 3.27.0 4.21.0-torvaldsgit LLVM 8.0.0) v: 4.5 Mesa 19.0.0-devel (git-8847370424) Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver: r8169 Drives: Local Storage: total: 1.59 TiB used: 283.09 GiB (17.3%) Info: Processes: 262 Uptime: 6m Memory: 15.66 GiB used: 1.22 GiB (7.8%) Shell: zsh inxi: 3.0.28 dmesg from previous boot always shows a trace like this: jan 06 18:37:32 mjb kernel: general protection fault: 0000 [#1] PREEMPT SMP NOPTI jan 06 18:37:32 mjb kernel: CPU: 4 PID: 676 Comm: Xorg:cs0 Tainted: G O 4.21.0-torvaldsgit #1 jan 06 18:37:32 mjb kernel: Hardware name: System manufacturer System Product Name/TUF B450M-PLUS GAMING, BIOS 0601 10/29/2018 jan 06 18:37:32 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 06 18:37:32 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 06 18:37:32 mjb kernel: RSP: 0018:ffffc9000327bc30 EFLAGS: 00010246 jan 06 18:37:32 mjb kernel: RAX: 0000a0050f003b80 RBX: ffff888105fdb0b0 RCX: 0000000000000200 jan 06 18:37:32 mjb kernel: 
RDX: 0000000000000000 RSI: ffff8880d3369000 RDI: 0000a0050f003b80 jan 06 18:37:32 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000000001cc jan 06 18:37:32 mjb kernel: R10: 0000000000000000 R11: ffff8883fa6a4828 R12: 0000000000001000 jan 06 18:37:32 mjb kernel: R13: 0000000000000000 R14: 00000000d3369000 R15: ffff88840bf6fb28 jan 06 18:37:32 mjb kernel: FS: 00007fc0419a4700(0000) GS:ffff88840eb00000(0000) knlGS:0000000000000000 jan 06 18:37:32 mjb kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 jan 06 18:37:32 mjb kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 jan 06 18:37:32 mjb kernel: CR2: 00007fa2d142d210 CR3: 0000000401080000 CR4: 00000000003406e0 jan 06 18:37:32 mjb kernel: Call Trace: jan 06 18:37:32 mjb kernel: dma_direct_unmap_page+0x92/0xa0 jan 06 18:37:32 mjb kernel: ttm_unmap_and_unpopulate_pages+0x148/0x170 [ttm] jan 06 18:37:32 mjb kernel: ttm_tt_destroy+0x81/0xd0 [ttm] jan 06 18:37:32 mjb kernel: ttm_bo_put+0x25e/0x2f0 [ttm] jan 06 18:37:32 mjb kernel: amdgpu_bo_unref+0x1a/0x30 [amdgpu] jan 06 18:37:32 mjb kernel: amdgpu_gem_object_free+0x23/0x30 [amdgpu] jan 06 18:37:32 mjb kernel: drm_gem_handle_delete+0x9b/0x130 [drm] jan 06 18:37:32 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 06 18:37:32 mjb kernel: drm_ioctl_kernel+0x8b/0xd0 [drm] jan 06 18:37:32 mjb kernel: drm_ioctl+0x1e5/0x390 [drm] jan 06 18:37:32 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 06 18:37:32 mjb kernel: ? tlb_finish_mmu+0x1f/0x30 jan 06 18:37:32 mjb kernel: ? unmap_region+0xc9/0xf0 jan 06 18:37:32 mjb kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] jan 06 18:37:32 mjb kernel: do_vfs_ioctl+0x97/0x720 jan 06 18:37:32 mjb kernel: ? 
__do_munmap.constprop.9+0x263/0x3a0 jan 06 18:37:32 mjb kernel: __x64_sys_ioctl+0x62/0x90 jan 06 18:37:32 mjb kernel: do_syscall_64+0x55/0x100 jan 06 18:37:32 mjb kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 jan 06 18:37:32 mjb kernel: RIP: 0033:0x7fc04ba0580b jan 06 18:37:32 mjb kernel: Code: 0f 1e fa 48 8b 05 55 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3> jan 06 18:37:32 mjb kernel: RSP: 002b:00007fc0419a3968 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 jan 06 18:37:32 mjb kernel: RAX: ffffffffffffffda RBX: 0000562b657e9710 RCX: 00007fc04ba0580b jan 06 18:37:32 mjb kernel: RDX: 00007fc0419a39a0 RSI: 0000000040086409 RDI: 000000000000000e jan 06 18:37:32 mjb kernel: RBP: 00007fc0419a39a0 R08: 0000562b6333bc48 R09: 0000000000000007 jan 06 18:37:32 mjb kernel: R10: 0000000000000026 R11: 0000000000000246 R12: 0000000040086409 jan 06 18:37:32 mjb kernel: R13: 000000000000000e R14: 0000562b63390960 R15: 0000562b657e3fa0 jan 06 18:37:32 mjb kernel: Modules linked in: devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter nf_tables nfnetlink edac_mce_amd kvm_amd kvm irqbypass snd_hda_code> jan 06 18:37:32 mjb kernel: ---[ end trace 28938eb196cb96ca ]--- jan 06 18:37:32 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 06 18:37:32 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 06 18:37:32 mjb kernel: RSP: 0018:ffffc9000327bc30 EFLAGS: 00010246 jan 06 18:37:32 mjb kernel: RAX: 0000a0050f003b80 RBX: ffff888105fdb0b0 RCX: 0000000000000200 jan 06 18:37:32 mjb kernel: RDX: 0000000000000000 RSI: ffff8880d3369000 RDI: 0000a0050f003b80 jan 06 18:37:32 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000000001cc jan 06 18:37:32 mjb kernel: R10: 0000000000000000 R11: ffff8883fa6a4828 R12: 0000000000001000 jan 06 18:37:32 mjb 
kernel: R13: 0000000000000000 R14: 00000000d3369000 R15: ffff88840bf6fb28 jan 06 18:37:32 mjb kernel: FS: 00007fc0419a4700(0000) GS:ffff88840eb00000(0000) knlGS:0000000000000000 jan 06 18:37:32 mjb kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 jan 06 18:37:32 mjb kernel: CR2: 00007fa2d142d210 CR3: 0000000401080000 CR4: 00000000003406e0
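A note for readers decoding the trace above: the faulting instruction bytes `<f3> 48 a5` are `rep movsq`, which copies RCX quadwords from the address in RSI to the address in RDI. The following is only a minimal sketch of what the register dump implies, assuming 48-bit virtual addresses (4-level paging); the helper name `is_canonical` is made up for illustration and is not part of any kernel tooling.

```python
# Register values copied from the first trace in this report.
RCX = 0x0000000000000200  # quadwords still to copy (rep movsq count)
RSI = 0xFFFF8880D3369000  # source pointer
RDI = 0x0000A0050F003B80  # destination pointer
R12 = 0x0000000000001000


def is_canonical(addr, va_bits=48):
    """Return True if bits va_bits-1..63 are all equal (sign-extended).

    Assumption: 48-bit virtual addresses, i.e. 4-level paging.
    """
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


copy_bytes = RCX * 8  # each movsq iteration moves 8 bytes
print(f"copy length: {copy_bytes} bytes")
print(f"source canonical: {is_canonical(RSI)}")
print(f"destination canonical: {is_canonical(RDI)}")
```

The copy length is 4096 bytes, one page, which equals the value in R12. The source is a canonical kernel address, but the destination is not canonical, and an access to a non-canonical address raises a general protection fault rather than a page fault. This only describes the symptom; it says nothing about where the bad pointer came from.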
Still happens in 5.0-rc1
Can you bisect?
(In reply to Michel Dänzer from comment #2)
> Can you bisect?

I could, but I don't have a reliable reproduction yet. I don't know what triggers the bug.
I haven't triggered it again yet in 5.0-rc1 after a BIOS update; let's see what happens in the next few days.
got it playing a steam game in wine now, but still can't reproduce reliably: jan 07 21:27:20 mjb kernel: BUG: unable to handle kernel paging request at ffff8e08888b4c00 jan 07 21:27:20 mjb kernel: #PF error: [WRITE] jan 07 21:27:20 mjb kernel: PGD 0 P4D 0 jan 07 21:27:20 mjb kernel: Oops: 0002 [#1] SMP NOPTI jan 07 21:27:20 mjb kernel: CPU: 1 PID: 18040 Comm: Steam.exe Tainted: G O 5.0.0-1-tkg-cfs #1 jan 07 21:27:20 mjb kernel: Hardware name: System manufacturer System Product Name/TUF B450M-PLUS GAMING, BIOS 0604 12/07/2018 jan 07 21:27:20 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 07 21:27:20 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 07 21:27:20 mjb kernel: RSP: 0018:ffffc90001b73cc0 EFLAGS: 00210246 jan 07 21:27:20 mjb kernel: RAX: ffff8e08888b4c00 RBX: ffff888105fd80b0 RCX: 0000000000000200 jan 07 21:27:20 mjb kernel: RDX: 0000000000000000 RSI: ffff8880d50f0000 RDI: ffff8e08888b4c00 jan 07 21:27:20 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 jan 07 21:27:20 mjb kernel: R10: ffffea000d8bf580 R11: ffff888143d89710 R12: 0000000000001000 jan 07 21:27:20 mjb kernel: R13: 0000000000000000 R14: 00000000d50f0000 R15: ffff8883fcaefd28 jan 07 21:27:20 mjb kernel: FS: 000000007ffd8000(0063) GS:ffff88840ea40000(006b) knlGS:00000000f7b810c0 jan 07 21:27:20 mjb kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 jan 07 21:27:20 mjb kernel: CR2: ffff8e08888b4c00 CR3: 000000033c700000 CR4: 00000000003406e0 jan 07 21:27:20 mjb kernel: Call Trace: jan 07 21:27:20 mjb kernel: dma_direct_unmap_page+0x92/0xa0 jan 07 21:27:20 mjb kernel: ttm_unmap_and_unpopulate_pages+0x148/0x170 [ttm] jan 07 21:27:20 mjb kernel: ttm_tt_destroy+0x81/0xd0 [ttm] jan 07 21:27:20 mjb kernel: ttm_bo_put+0x262/0x2f0 [ttm] jan 07 21:27:20 mjb kernel: amdgpu_bo_unref+0x1a/0x30 [amdgpu] jan 07 21:27:20 mjb kernel: 
amdgpu_gem_object_free+0x23/0x30 [amdgpu] jan 07 21:27:20 mjb kernel: drm_gem_handle_delete+0x9e/0x130 [drm] jan 07 21:27:20 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 07 21:27:20 mjb kernel: drm_ioctl_kernel+0x8b/0xd0 [drm] jan 07 21:27:20 mjb kernel: drm_ioctl+0x1e5/0x390 [drm] jan 07 21:27:20 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 07 21:27:20 mjb kernel: ? kmem_cache_free+0x18e/0x1b0 jan 07 21:27:20 mjb kernel: ? remove_vma_list+0xe6/0x140 jan 07 21:27:20 mjb kernel: ? __do_munmap.constprop.9+0x263/0x3a0 jan 07 21:27:20 mjb kernel: __se_compat_sys_ioctl+0x2e3/0xe10 jan 07 21:27:20 mjb kernel: ? __ia32_sys_munmap+0x75/0x90 jan 07 21:27:20 mjb kernel: do_fast_syscall_32+0x98/0x210 jan 07 21:27:20 mjb kernel: entry_SYSCALL_compat_after_hwframe+0x45/0x4d jan 07 21:27:20 mjb kernel: Modules linked in: edac_mce_amd kvm_amd kvm snd_hda_codec_realtek amdgpu irqbypass snd_hda_codec_generic ledtrig_audio chash snd_hda_codec_hdmi amd_iommu_v2 gpu> jan 07 21:27:20 mjb kernel: CR2: ffff8e08888b4c00 jan 07 21:27:20 mjb kernel: ---[ end trace b2ffa643a20c80fe ]--- jan 07 21:27:20 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 07 21:27:20 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 07 21:27:20 mjb kernel: RSP: 0018:ffffc90001b73cc0 EFLAGS: 00210246 jan 07 21:27:20 mjb kernel: RAX: ffff8e08888b4c00 RBX: ffff888105fd80b0 RCX: 0000000000000200 jan 07 21:27:20 mjb kernel: RDX: 0000000000000000 RSI: ffff8880d50f0000 RDI: ffff8e08888b4c00 jan 07 21:27:20 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 jan 07 21:27:20 mjb kernel: R10: ffffea000d8bf580 R11: ffff888143d89710 R12: 0000000000001000 jan 07 21:27:20 mjb kernel: R13: 0000000000000000 R14: 00000000d50f0000 R15: ffff8883fcaefd28 jan 07 21:27:20 mjb kernel: FS: 000000007ffd8000(0063) GS:ffff88840ea40000(006b) knlGS:00000000f7b810c0 jan 07 
21:27:20 mjb kernel: CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 jan 07 21:27:20 mjb kernel: CR2: ffff8e08888b4c00 CR3: 000000033c700000 CR4: 00000000003406e0 jan 07 21:27:22 mjb kernel: general protection fault: 0000 [#2] SMP NOPTI jan 07 21:27:22 mjb kernel: CPU: 0 PID: 649 Comm: Xorg Tainted: G D O 5.0.0-1-tkg-cfs #1 jan 07 21:27:22 mjb kernel: Hardware name: System manufacturer System Product Name/TUF B450M-PLUS GAMING, BIOS 0604 12/07/2018 jan 07 21:27:22 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 07 21:27:22 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 07 21:27:22 mjb kernel: RSP: 0018:ffffc90002203c30 EFLAGS: 00010246 jan 07 21:27:22 mjb kernel: RAX: c930ce4031168b49 RBX: ffff888105fd80b0 RCX: 0000000000000200 jan 07 21:27:22 mjb kernel: RDX: 0000000000000000 RSI: ffff8880d5297000 RDI: c930ce4031168b49 jan 07 21:27:22 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000041 jan 07 21:27:22 mjb kernel: R10: ffffea00060c1d40 R11: ffff8883fa0950f8 R12: 0000000000001000 jan 07 21:27:22 mjb kernel: R13: 0000000000000000 R14: 00000000d5297000 R15: ffff8883fc85ef28 jan 07 21:27:22 mjb kernel: FS: 00007fed1f70bdc0(0000) GS:ffff88840ea00000(0000) knlGS:0000000000000000 jan 07 21:27:22 mjb kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 jan 07 21:27:22 mjb kernel: CR2: 0000561332707448 CR3: 0000000402090000 CR4: 00000000003406f0 jan 07 21:27:22 mjb kernel: Call Trace: jan 07 21:27:22 mjb kernel: dma_direct_unmap_page+0x92/0xa0 jan 07 21:27:22 mjb kernel: ttm_unmap_and_unpopulate_pages+0x148/0x170 [ttm] jan 07 21:27:22 mjb kernel: ttm_tt_destroy+0x81/0xd0 [ttm] jan 07 21:27:22 mjb kernel: ttm_bo_put+0x262/0x2f0 [ttm] jan 07 21:27:22 mjb kernel: amdgpu_bo_unref+0x1a/0x30 [amdgpu] jan 07 21:27:22 mjb kernel: amdgpu_gem_object_free+0x23/0x30 [amdgpu] jan 07 21:27:22 mjb kernel: 
drm_gem_handle_delete+0x9e/0x130 [drm] jan 07 21:27:22 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 07 21:27:22 mjb kernel: drm_ioctl_kernel+0x8b/0xd0 [drm] jan 07 21:27:22 mjb kernel: drm_ioctl+0x1e5/0x390 [drm] jan 07 21:27:22 mjb kernel: ? drm_gem_handle_create+0x40/0x40 [drm] jan 07 21:27:22 mjb kernel: ? tlb_finish_mmu+0x1f/0x30 jan 07 21:27:22 mjb kernel: ? unmap_region+0xc9/0xf0 jan 07 21:27:22 mjb kernel: amdgpu_drm_ioctl+0x49/0x80 [amdgpu] jan 07 21:27:22 mjb kernel: do_vfs_ioctl+0x97/0x720 jan 07 21:27:22 mjb kernel: ? __do_munmap.constprop.9+0x263/0x3a0 jan 07 21:27:22 mjb kernel: __x64_sys_ioctl+0x62/0x90 jan 07 21:27:22 mjb kernel: do_syscall_64+0x55/0x100 jan 07 21:27:22 mjb kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 jan 07 21:27:22 mjb kernel: RIP: 0033:0x7fed21f6480b jan 07 21:27:22 mjb kernel: Code: 0f 1e fa 48 8b 05 55 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3> jan 07 21:27:22 mjb kernel: RSP: 002b:00007ffe6b968648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 jan 07 21:27:22 mjb kernel: RAX: ffffffffffffffda RBX: 00005630ead1d5e0 RCX: 00007fed21f6480b jan 07 21:27:22 mjb kernel: RDX: 00007ffe6b968680 RSI: 0000000040086409 RDI: 000000000000000e jan 07 21:27:22 mjb kernel: RBP: 00007ffe6b968680 R08: 00005630e9527c48 R09: 0000000000000000 jan 07 21:27:22 mjb kernel: R10: 000000000000001c R11: 0000000000000246 R12: 0000000040086409 jan 07 21:27:22 mjb kernel: R13: 000000000000000e R14: 00005630eaee5c80 R15: 00005630e957c960 jan 07 21:27:22 mjb kernel: Modules linked in: edac_mce_amd kvm_amd kvm snd_hda_codec_realtek amdgpu irqbypass snd_hda_codec_generic ledtrig_audio chash snd_hda_codec_hdmi amd_iommu_v2 gpu> jan 07 21:27:22 mjb kernel: ---[ end trace b2ffa643a20c80ff ]--- jan 07 21:27:22 mjb kernel: RIP: 0010:__memcpy+0x12/0x20 jan 07 21:27:22 mjb kernel: Code: 48 89 c8 e9 f9 fc ff ff 48 89 f0 e9 f1 fc ff ff 90 90 90 90 90 90 90 90 0f 1f 44 00 
00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 c3 66> jan 07 21:27:22 mjb kernel: RSP: 0018:ffffc90001b73cc0 EFLAGS: 00210246 jan 07 21:27:22 mjb kernel: RAX: ffff8e08888b4c00 RBX: ffff888105fd80b0 RCX: 0000000000000200 jan 07 21:27:22 mjb kernel: RDX: 0000000000000000 RSI: ffff8880d50f0000 RDI: ffff8e08888b4c00 jan 07 21:27:22 mjb kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001 jan 07 21:27:22 mjb kernel: R10: ffffea000d8bf580 R11: ffff888143d89710 R12: 0000000000001000 jan 07 21:27:22 mjb kernel: R13: 0000000000000000 R14: 00000000d50f0000 R15: ffff8883fcaefd28 jan 07 21:27:22 mjb kernel: FS: 00007fed1f70bdc0(0000) GS:ffff88840ea00000(0000) knlGS:0000000000000000 jan 07 21:27:22 mjb kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 jan 07 21:27:22 mjb kernel: CR2: 0000561332707448 CR3: 0000000402090000 CR4: 00000000003406f0
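A note on the first trace in the comment above: for a page fault, the number after `Oops:` is the x86 page-fault error code. The sketch below decodes it using the architectural bit layout; the function name `decode_pf_error` and the table are illustrative assumptions, not kernel code.

```python
# Low bits of the x86 page-fault error code.
# Each entry: (mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write", "read"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code):
    """Return a list of human-readable flags for a page-fault error code."""
    flags = []
    for mask, when_set, when_clear in PF_BITS:
        if code & mask:
            flags.append(when_set)
        elif when_clear is not None:
            flags.append(when_clear)
    return flags


# "Oops: 0002" from the Steam.exe trace.
print(decode_pf_error(0x0002))
```

For `0002` this yields a kernel-mode write to a page that is not present, which agrees with the `#PF error: [WRITE]` and `PGD 0` lines in the log.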
Arch Linux distributions are for NVIDIA GPUs. Use Debian testing/sid Xfce and the Oibaf PPA Mesa cosmic version. The AMD wip kernel is the best kernel for AMD GPUs: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.21-wip

System: Host: ryzenpc Kernel: 5.0.0-rc1 x86_64 bits: 64 Desktop: Xfce 4.12.4 Distro: Debian GNU/Linux buster/sid
Machine: Type: Desktop Mobo: ASUSTeK model: PRIME B350M-K v: Rev X.0x serial: <root required> UEFI [Legacy]: American Megatrends v: 4207 date: 12/07/2018
CPU: 6-Core: AMD Ryzen 5 1600 type: MT MCP speed: 2959 MHz
Graphics: Device-1: AMD Ellesmere [Radeon RX 470/480] driver: amdgpu v: kernel
Display: x11 server: X.Org 1.20.3 driver: amdgpu resolution: 3840x2160~60Hz
OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.27.0 5.0.0-rc1 LLVM 7.0.1) v: 4.5 Mesa 19.0.0-devel (git-70be9af 2019-01-02 cosmic-oibaf-ppa)
(In reply to fin4478 from comment #6)
> blablabla

This makes zero sense and is totally uncalled for, especially here. Go back to posting your usual bs in Phoronix debianxfce, this is not the place. You are polluting this and other bug reports without adding anything.
(In reply to bmilreu from comment #7)
> (In reply to fin4478 from comment #6)
> > blablabla
>
> This makes zero sense and is totally uncalled for, especially here. Go back
> to posting your usual bs in Phoronix debianxfce, this is not the place. You
> are polluting this and other bug reports without adding anything.

Look in the mirror: Arch Linux, Ubuntu and Fedora users are polluting this system, see:
https://bugs.freedesktop.org/buglist.cgi?bug_status=__open__&component=DRM%2FAMDgpu&list_id=663649&product=DRI

You are using old kernels, old Mesa, buggy LLVM 8, etc. You do not seem to know that kernel configuration and BIOS settings can cause instability. Steam games support only Ubuntu and SteamOS, etc.
(In reply to fin4478 from comment #8)
> (In reply to bmilreu from comment #7)
> > (In reply to fin4478 from comment #6)
> > > blablabla
> >
> > This makes zero sense and is totally uncalled for, especially here. Go back
> > to posting your usual bs in Phoronix debianxfce, this is not the place. You
> > are polluting this and other bug reports without adding anything.
>
> Look in the mirror: Arch Linux, Ubuntu and Fedora users are polluting this
> system, see:
> https://bugs.freedesktop.org/buglist.cgi?bug_status=__open__&component=DRM%2FAMDgpu&list_id=663649&product=DRI
>
> You are using old kernels, old Mesa, buggy LLVM 8, etc. You do not seem to
> know that kernel configuration and BIOS settings can cause instability.
> Steam games support only Ubuntu and SteamOS, etc.

Old kernels? This report is specifically about 4.21 wip/5.0-rc1, so that doesn't make any sense either. Old Mesa? My Mesa builds daily and is usually newer than oibaf's by a couple of days. Buggy LLVM 8? Maybe, but unless you can point out specific bugs, it works just fine and is very close to a stable release. Lastly, the reported bug is very likely to be in kernel code anyway, so Mesa and LLVM are mostly irrelevant here. I'm not answering you anymore; I'll leave it up to moderation to take care of this.
Created attachment 143038 [details] Bisect result
I've been running into this issue multiple times a day. I noticed I hit the OOPS a lot more frequently when my system was under load (e.g. compiling a kernel) and I then opened a new tab in Firefox. Don't ask me how, but eventually I figured out I could reproduce the problem reliably on my system by starting many instances of my terminal emulator until I hit the OOM killer.

v4.20 (good): The OOM killer kills processes and/or my user session, and I can log in again.
v5.0-rc1 (bad): The system hangs with an OOPS in dmesg.

So I started bisecting; the result is attached. I have not been able to reproduce the problem after reverting the parent merge commit [af7ddd8a627c62a835524b3f5b471edbbbcce025] and these related commits:

06f55fd2d22742ed7e725124dfea68936d12ce40
2e05ea5cdc1ac55d9ef678ed5ea6c38acf7fd2a3
d7076f07840851bbe57cb21ba052d6a4a9b1efa9
4788ba5792cc1368ba4867e1488dc168b4fe97b7
ed6ccf10f24bdfc1955bc8b976ddedc370fc3869

See the full tree here: https://github.com/SibrenVasse/linux/tree/revert

Hope this helps!
Looks like this should be reported to Christoph Hellwig and other kernel DMA mapping helper developers then. Please Cc the dri-devel mailing list when doing so.
There are a few new DMA fixes in the torvalds tree, but I'm still triggering the bug. I now got something similar but slightly different while watching a real-time 60fps interpolated video that uses OpenCL acceleration via ROCm. I attached the log: the first error is from the kfd driver and the second looks like the one reported in the OP.
Created attachment 143066 [details] dmesg kfd and amdgpu hangs attachment for last comment
@Sibren Vasse Have you forwarded this to dma devs yet?
@bmilreu: No, Michel beat me to it. See thread here: https://lists.linuxfoundation.org/pipermail/iommu/2019-January/032528.html
@bmilreu: Could you try this patch? It works for me. https://lists.linuxfoundation.org/pipermail/iommu/2019-January/032651.html
(In reply to Sibren Vasse from comment #17)
> @bmilreu: Could you try this patch? It works for me.
> https://lists.linuxfoundation.org/pipermail/iommu/2019-January/032651.html

Sure, will report if it fixes it for me.
Thanks for the report, turned out to be a bug in the DMA subsystem.
Michel, thanks. I tested this patch https://patchwork.codeaurora.org/patch/699617/ for several days and can confirm that it fixes the problem.
Forgot to ask: when will it be merged into Linus' tree?
(In reply to mikhail.v.gavrilov from comment #21)
> Forgot to ask: when it will be merged in Linus tree?

https://github.com/torvalds/linux/commit/6d060fa39035d5ff6bb3e720a8119aeb50453e3b

I can confirm my system has been stable for 3 days with the patch.