I just tried to use a Wayland session with KDE Plasma 5.13.2 (Debian testing), and it segfaults right after login and falls back to sddm. This is what I see in dmesg:
[ 176.359816] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:956
[ 176.814144] QThread[2620]: segfault at f ip 00007f30f30b0e60 sp 00007f30e506e4f0 error 4 in libwayland-client.so.0.3.0[7f30f30a9000+d000]
I'm not sure whether it's an amdgpu problem or something with KWin.
GPU: AMD Vega 56, connected over DisplayPort.
OpenGL renderer string: Radeon RX Vega (VEGA10, DRM 3.25.0, 4.17.0-1-amd64, LLVM 6.0.1)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.1.3
Corresponding KDE bug: https://bugs.kde.org/show_bug.cgi?id=396066
Please attach the full dmesg.
Created attachment 140624: dmesg output
See attached dmesg output. One thing to note: I consistently get a black screen after boot (the monitor goes into sleep mode). I need to turn the monitor off, turn it back on, switch to tty1 (then the monitor turns on), log in there, and restart sddm. Only then does sddm appear on tty7. After that, the Wayland session login fails (I tried a couple of times, which is reflected in dmesg). The first try didn't result in a segfault in dmesg, just in *ERROR* REG_WAIT, but the second attempt also added a segfault.
Did anyone manage to narrow down the cause?
Still a problem with kernel 4.18.5 and the latest firmware for Vega (20180825). Except now the session doesn't crash but just hangs with a black screen. A similar dmesg can be seen:
[ 162.743804] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:636
[ 162.743830] WARNING: CPU: 6 PID: 1575 at /build/linux-ETX4PU/linux-4.18.5/drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:254 generic_reg_wait+0xe8/0x160 [amdgpu]
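For anyone comparing logs across kernels: a minimal Python sketch that pulls the REG_WAIT errors out of saved dmesg text and multiplies the poll delay by the number of tries. The regex is written against the lines quoted in this thread only, and `parse_reg_wait` is a hypothetical helper name, not part of any amdgpu tooling. The computed total is only the nominal polling time (delay times tries); the real elapsed time may be longer.

```python
import re

# Pattern modelled on the REG_WAIT lines quoted in this thread (assumption:
# other kernels may format the message differently).
REG_WAIT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:generic_reg_wait \[amdgpu\]\] \*ERROR\* "
    r"REG_WAIT timeout (?P<delay_us>\d+)us \* (?P<tries>\d+) tries - "
    r"(?P<caller>\w+) line:(?P<line>\d+)"
)


def parse_reg_wait(text):
    """Return one dict per REG_WAIT timeout found in the given dmesg text."""
    hits = []
    for m in REG_WAIT_RE.finditer(text):
        delay_us = int(m["delay_us"])
        tries = int(m["tries"])
        hits.append({
            "timestamp": float(m["ts"]),
            "caller": m["caller"],        # function that was waiting on the register
            "line": int(m["line"]),       # source line reported by the driver
            "total_ms": delay_us * tries / 1000,  # nominal polling time
        })
    return hits


if __name__ == "__main__":
    sample = (
        "[ 162.743804] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout "
        "10us * 3500 tries - dce_mi_free_dmif line:636"
    )
    for hit in parse_reg_wait(sample):
        print(hit)
```

Grouping the results by `caller` makes it easier to see that the reports here hit two different wait sites (`dce110_stream_encoder_dp_blank` and `dce_mi_free_dmif`).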
Hello, I found this bug via web search. I am experiencing the *exact* same bug. I'm running Fedora 28 with the MATE desktop, so I'm confident this is not a KDE problem. My error message:
[39911.150851] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:956
[39911.150927] WARNING: CPU: 5 PID: 1452 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:195 generic_reg_wait+0xe7/0x160 [amdgpu]
Mobo: Supermicro X9SRL
CPU: Intel Xeon E5 2680 v2
GPU: MSI Radeon RX 480 4GB with latest polaris10 bin file
Kernel: 4.17.19
Mesa: 18.0.5
This must be an AMDGPU driver bug. I'm also connected via DisplayPort and I frequently get the monitor sleeping to powersave mode during boot. If I switch to TTY1 and do a Ctrl-Alt-Del reboot, it usually boots up normally after the blind "three finger salute" reboot.
Created attachment 141418: amdgpu crash in dmesg output
I'm waiting for kernel 4.19.x to see if it improves anything, since it apparently had some fix that looks related: https://lists.freedesktop.org/archives/dri-devel/2018-August/185123.html
> drm/amd/display: Fix Vega10 black screen after mode change
Same issue here, with the same error in dmesg:
GPU: R9 380 connected over DisplayPort
Monitor: DELL U2515H
CPU: AMD Ryzen 7 1700
Motherboard: ASRock AB350 Gaming-ITX/ac
OpenGL Renderer: AMD Radeon R9 380 Series (TONGA DRM 3.26.0 4.18.5-1-MANJARO LLVM 6.0.1)
OpenGL version: 4.5 Mesa 18.1.7
I'm using KDE Plasma 5.13.4
Created attachment 141438: dmesg crash output
May be related to (if your kernel has the same faulty commit): https://bugs.freedesktop.org/show_bug.cgi?id=107784
I'm inclined to believe that this is a userspace issue. I can observe the crash happening on the newest stable Ubuntu/Debian releases. However, the crash does *not* occur for distributions that have newer userspace and kernel configurations (Fedora, Arch). I can boot and use Wayland under this ASIC and many others. That said, I haven't done any investigation into the root cause of the issue. It might be worth looking into a bisection on Wayland or the kernel. It shouldn't be specific to a particular ASIC at least.
May be different then, because my bug https://bugs.freedesktop.org/show_bug.cgi?id=107784 is with git userspace no older than a few days, and DisplayPort is broken whatever the screen resolution. I did manually bisect the kernel and found the faulty commit though; I guess the guys at AMD are now looking into it.
I have a recent kernel and userland stack with Debian testing, it's still crashing and falling back into sddm. But with newest kernel I don't see the segfault message in dmesg anymore.
Linux 4.19.0-rc2-amd64 #1 SMP Debian 4.19~rc2-1~exp1 (2018-09-03) x86_64 GNU/Linux
firmware-amd-graphics: 20180825+dfsg-1
I see this in dmesg:
[ 21.111724] [drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:922
[ 21.111795] WARNING: CPU: 6 PID: 153 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:254 generic_reg_wait+0xe7/0x160 [amdgpu]
[ 21.111796] Modules linked in: devlink(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) cmac(E) bnep(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) arc4(E) amdkfd(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) uvcvideo(E) edac_mce_amd(E) btusb(E) snd_hda_codec_hdmi(E) btrtl(E) mxm_wmi(E) wmi_bmof(E) videobuf2_vmalloc(E) btbcm(E) btintel(E) amdgpu(E) kvm_amd(E) videobuf2_memops(E) iwlmvm(E) bluetooth(E) snd_hda_intel(E) chash(E) videobuf2_v4l2(E) kvm(E) irqbypass(E) snd_usb_audio(E) gpu_sched(E) snd_hda_codec(E) mac80211(E) snd_usbmidi_lib(E) videobuf2_common(E) snd_hda_core(E) crct10dif_pclmul(E) jitterentropy_rng(E) crc32_pclmul(E) ttm(E) snd_rawmidi(E) snd_seq_device(E) snd_hwdep(E) efi_pstore(E) videodev(E) evdev(E) drm_kms_helper(E) ghash_clmulni_intel(E)
[ 21.111824] snd_pcm(E) iwlwifi(E) pcspkr(E) drbg(E) efivars(E) media(E) drm(E) ansi_cprng(E) snd_timer(E) cfg80211(E) ecdh_generic(E) snd(E) soundcore(E) rfkill(E) sp5100_tco(E) crc16(E) k10temp(E) ccp(E) rng_core(E) sg(E) wmi(E) pcc_cpufreq(E) button(E) acpi_cpufreq(E) nct6775(E) hwmon_vid(E) parport_pc(E) ppdev(E) lp(E) parport(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) xfs(E) btrfs(E) xor(E) zstd_decompress(E) zstd_compress(E) xxhash(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) crc32c_intel(E) ahci(E) xhci_pci(E) aesni_intel(E) aes_x86_64(E) libahci(E) crypto_simd(E) xhci_hcd(E) igb(E) cryptd(E) glue_helper(E) libata(E) i2c_piix4(E) nvme(E) i2c_algo_bit(E) dca(E) usbcore(E) scsi_mod(E) usb_common(E) nvme_core(E) gpio_amdpt(E) gpio_generic(E)
[ 21.111860] CPU: 6 PID: 153 Comm: kworker/6:3 Tainted: G E 4.19.0-rc2-amd64 #1 Debian 4.19~rc2-1~exp1
[ 21.111861] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Taichi, BIOS L4.64 04/03/2018
[ 21.111877] Workqueue: events drm_mode_rmfb_work_fn [drm]
[ 21.111931] RIP: 0010:generic_reg_wait+0xe7/0x160 [amdgpu]
[ 21.111932] Code: 44 24 58 8b 54 24 48 89 de 44 89 4c 24 08 48 8b 4c 24 50 48 c7 c7 20 dd 1e c2 e8 64 76 ab fe 83 7d 18 01 44 8b 4c 24 08 74 02 <0f> 0b 48 83 c4 10 44 89 c8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 0f
[ 21.111933] RSP: 0018:ffffaf830207fa20 EFLAGS: 00010297
[ 21.111935] RAX: 0000000000000000 RBX: 000000000000000a RCX: 0000000000000000
[ 21.111936] RDX: 0000000000000000 RSI: ffff96dcceb966a8 RDI: ffff96dcceb966a8
[ 21.111937] RBP: ffff96dcc61f1700 R08: 0000000000000005 R09: 0000000000010200
[ 21.111938] R10: 0000000000000498 R11: ffffffff9a1dc6ed R12: 0000000000000bb9
[ 21.111938] R13: 00000000000051e2 R14: 0000000000010000 R15: 0000000000000000
[ 21.111940] FS: 0000000000000000(0000) GS:ffff96dcceb80000(0000) knlGS:0000000000000000
[ 21.111941] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.111942] CR2: 000055ab41a08358 CR3: 00000003f4b52000 CR4: 00000000003406e0
[ 21.111943] Call Trace:
[ 21.112005] dce110_stream_encoder_dp_blank+0x12c/0x1a0 [amdgpu]
[ 21.112061] core_link_disable_stream+0x54/0x220 [amdgpu]
[ 21.112116] dce110_reset_hw_ctx_wrap+0xc1/0x1e0 [amdgpu]
[ 21.112170] dce110_apply_ctx_to_hw+0x45/0x650 [amdgpu]
[ 21.112224] ? dc_remove_plane_from_context+0x1fc/0x240 [amdgpu]
[ 21.112276] dc_commit_state+0x2c6/0x520 [amdgpu]
[ 21.112334] amdgpu_dm_atomic_commit_tail+0x37a/0xd80 [amdgpu]
[ 21.112338] ? __wake_up_common_lock+0x89/0xc0
[ 21.112341] ? _cond_resched+0x15/0x30
[ 21.112342] ? wait_for_completion_timeout+0x3b/0x1a0
[ 21.112399] ? amdgpu_dm_atomic_commit_tail+0xd80/0xd80 [amdgpu]
[ 21.112407] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 21.112414] drm_atomic_helper_commit+0xb4/0x120 [drm_kms_helper]
[ 21.112428] drm_framebuffer_remove+0x361/0x410 [drm]
[ 21.112442] drm_mode_rmfb_work_fn+0x4f/0x60 [drm]
[ 21.112446] process_one_work+0x1a7/0x360
[ 21.112447] worker_thread+0x30/0x390
[ 21.112449] ? pwq_unbound_release_workfn+0xd0/0xd0
[ 21.112451] kthread+0x112/0x130
[ 21.112452] ? kthread_bind+0x30/0x30
[ 21.112454] ret_from_fork+0x22/0x40
[ 21.112456] ---[ end trace b22dbbbbffd241d9 ]---
Correction, segfault is still happening, it's just not consistent (not every time).
[ 683.792530] QThread[3520]: segfault at f ip 00007f41f9f2ae60 sp 00007f41f19d4500 error 4 in libwayland-client.so.0.3.0[7f41f9f23000+d000]
[ 683.792538] Code: 48 83 c4 10 5b c3 e8 cf d1 ff ff 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 41 55 41 54 49 89 cc 55 53 48 89 fb 48 83 ec 08 <48> 8b 7f 08 44 0f b6 07 45 84 c0 0f 84 17 01 00 00 48 89 f8 44 89
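The two segfault lines in this thread can be compared by subtracting the mapping base from the instruction pointer. Below is a small Python sketch of that arithmetic; `decode_segfault` is a hypothetical helper, the regex is based only on the lines quoted here, and the error-code decoding uses the standard x86 page-fault bits (bit 0: page present, bit 1: write, bit 2: user mode).

```python
import re

# Pattern modelled on the kernel segfault lines quoted in this thread.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<object>[^\[\s]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel segfault line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    error = int(m["error"], 16)
    return {
        "fault_addr": int(m["addr"], 16),
        "object": m["object"],
        # Offset of the faulting instruction inside the reported mapping.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        "page_present": bool(error & 1),  # 0 means the page was not mapped
        "write": bool(error & 2),         # 0 means the access was a read
        "user_mode": bool(error & 4),     # 1 means the fault came from userspace
    }


if __name__ == "__main__":
    lines = [
        "QThread[2620]: segfault at f ip 00007f30f30b0e60 sp 00007f30e506e4f0 "
        "error 4 in libwayland-client.so.0.3.0[7f30f30a9000+d000]",
        "QThread[3520]: segfault at f ip 00007f41f9f2ae60 sp 00007f41f19d4500 "
        "error 4 in libwayland-client.so.0.3.0[7f41f9f23000+d000]",
    ]
    for line in lines:
        info = decode_segfault(line)
        print(info["object"], hex(info["offset"]), hex(info["fault_addr"]))
```

Both reports resolve to the same offset inside libwayland-client (0x7e60) and the same fault address (0xf), which suggests the same instruction is faulting each time. If the reported mapping starts at the beginning of the library file (an assumption), that offset can be handed to a symbolizer such as addr2line together with the matching debug symbols.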
I managed to make it produce a core. It's from kwin_wayland. After installing needed debug symbol packages, here is a backtrace:
Core was generated by `/usr/bin/kwin_wayland --xwayland --libinput --exit-with-session=/usr/lib/x86_64'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007eff59760f30 in wl_closure_init (message=message@entry=0x7, size=size@entry=52, num_arrays=num_arrays@entry=0x7eff5140858c, args=args@entry=0x0) at ../src/connection.c:562
562 ../src/connection.c: No such file or directory.
[Current thread is 1 (Thread 0x7eff51409700 (LWP 7249))]
(gdb) bt
#0 0x00007eff59760f30 in wl_closure_init (message=message@entry=0x7, size=size@entry=52, num_arrays=num_arrays@entry=0x7eff5140858c, args=args@entry=0x0) at ../src/connection.c:562
#1 0x00007eff59761aa0 in wl_connection_demarshal (connection=0x7eff440053e0, size=size@entry=52, objects=objects@entry=0x7eff440052e8, message=0x7) at ../src/connection.c:698
#2 0x00007eff5975fae8 in queue_event (len=52, display=0x7eff44005270) at ../src/wayland-client.c:1364
#3 read_events (display=0x7eff44005270) at ../src/wayland-client.c:1466
#4 wl_display_read_events (display=display@entry=0x7eff44005270) at ../src/wayland-client.c:1549
#5 0x00007eff59760169 in wl_display_dispatch_queue (display=0x7eff44005270, queue=0x7eff44005338) at ../src/wayland-client.c:1788
#6 0x00007eff5d123933 in KWayland::Client::ConnectionThread::Private::<lambda()>::operator() (__closure=0x7eff44009550) at ./src/client/connection_thread.cpp:129
#7 QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, KWayland::Client::ConnectionThread::Private::setupSocketNotifier()::<lambda()> >::call (arg=<optimized out>, f=...) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:128
#8 QtPrivate::Functor<KWayland::Client::ConnectionThread::Private::setupSocketNotifier()::<lambda()>, 0>::call<QtPrivate::List<>, void> (arg=<optimized out>, f=...) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:238
#9 QtPrivate::QFunctorSlotObject<KWayland::Client::ConnectionThread::Private::setupSocketNotifier()::<lambda()>, 0, QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase *, QObject *, void **, bool *) (which=<optimized out>, this_=0x7eff44009540, r=<optimized out>, a=<optimized out>, ret=<optimized out>) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:421
#10 0x00007eff5e606910 in QtPrivate::QSlotObjectBase::call (a=0x7eff514087d0, r=0x564a80be84f0, this=0x7eff44009540) at ../../include/QtCore/../../src/corelib/kernel/qobjectdefs_impl.h:376
#11 QMetaObject::activate(QObject*, int, int, void**) () at kernel/qobject.cpp:3754
#12 0x00007eff5e606dd7 in QMetaObject::activate (sender=sender@entry=0x7eff44009440, m=m@entry=0x7eff5e863c60 <QSocketNotifier::staticMetaObject>, local_signal_index=local_signal_index@entry=0, argv=argv@entry=0x7eff514087d0) at kernel/qobject.cpp:3633
#13 0x00007eff5e611ff9 in QSocketNotifier::activated (this=this@entry=0x7eff44009440, _t1=<optimized out>, _t2=...) at .moc/moc_qsocketnotifier.cpp:136
#14 0x00007eff5e612341 in QSocketNotifier::event (this=0x7eff44009440, e=0x7eff51408a30) at kernel/qsocketnotifier.cpp:266
#15 0x00007eff5e9cb4a1 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /lib/x86_64-linux-gnu/libQt5Widgets.so.5
#16 0x00007eff5e9d2ae0 in QApplication::notify(QObject*, QEvent*) () from /lib/x86_64-linux-gnu/libQt5Widgets.so.5
#17 0x00007eff5e5dd579 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at ../../include/QtCore/5.11.1/QtCore/private/../../../../../src/corelib/thread/qthread_p.h:307
#18 0x00007eff5e62fe4a in QCoreApplication::sendEvent (event=0x7eff51408a30, receiver=<optimized out>) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:234
#19 socketNotifierSourceDispatch(_GSource*, int (*)(void*), void*) () at kernel/qeventdispatcher_glib.cpp:106
#20 0x00007eff5a647287 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#21 0x00007eff5a6474c0 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#22 0x00007eff5a64754c in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#23 0x00007eff5e62f223 in QEventDispatcherGlib::processEvents (this=0x7eff44000b20, flags=...) at kernel/qeventdispatcher_glib.cpp:423
#24 0x00007eff5e5dc24b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at ../../include/QtCore/../../src/corelib/global/qflags.h:140
#25 0x00007eff5e42b176 in QThread::exec() () at ../../include/QtCore/../../src/corelib/global/qflags.h:120
#26 0x00007eff5e434d47 in QThreadPrivate::start(void*) () at thread/qthread_unix.cpp:367
#27 0x00007eff5efb5f2a in start_thread (arg=0x7eff51409700) at pthread_create.c:463
#28 0x00007eff5e0fdedf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
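A side note on how frame #0 relates to the "segfault at f" lines in dmesg. The Python sketch below works through the pointer arithmetic. It assumes that `struct wl_message` is laid out as three consecutive pointers (`name`, `signature`, `types`) and that pointers are 8 bytes on x86_64; the helper names are hypothetical. It is only a consistency check, not a confirmed root cause.

```python
POINTER_SIZE = 8  # bytes per pointer on x86_64, as in the reports above

# Assumed field order of struct wl_message (three consecutive pointers).
WL_MESSAGE_FIELDS = ("name", "signature", "types")


def field_offset(field, pointer_size=POINTER_SIZE):
    """Byte offset of a field inside the assumed wl_message layout."""
    return WL_MESSAGE_FIELDS.index(field) * pointer_size


def fault_address(message_ptr, field, pointer_size=POINTER_SIZE):
    """Address that would be read when accessing message_ptr->field."""
    return message_ptr + field_offset(field, pointer_size)


if __name__ == "__main__":
    bogus_message = 0x7  # value of `message` in frames #0 and #1
    for field in WL_MESSAGE_FIELDS:
        print(field, hex(fault_address(bogus_message, field)))
```

With `message=0x7`, reading the second pointer-sized field lands on address 0xf, which matches the fault address the kernel logged. That is consistent with wl_closure_init dereferencing an invalid `message` pointer passed down from queue_event, but where the bad pointer comes from is still open.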
kwin crashes in libwayland code. You should probably report this to libwayland (uses Gitlab issues now) and/or kwin.
Thanks. I already reported it for KWin (linked above): https://bugs.kde.org/show_bug.cgi?id=396066 I'll open libwayland bug too.
Corresponding wayland-client bug: https://gitlab.freedesktop.org/wayland/wayland/issues/56
Created attachment 141650: dmesg error msg while suspending.
I found this bug report through a Google search. I'm not using Wayland, and the machine appears to have suspended and resumed fine, but I did happen to see the same error in the logs. Posting in case it helps narrow down the problem.
System: Host: Plasma Kernel: 4.18.8-arch1-1-ARCH x86_64 bits: 64 Desktop: KDE Plasma 5.13.5 Distro: Antergos Linux
CPU: Topology: 6-Core model: Intel Core i7-5820K bits: 64 type: MT MCP L2 cache: 15.0 MiB Speed: 2697 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 2292 2: 1949 3: 2333 4: 2576 5: 2371 6: 3401 7: 2804 8: 2979 9: 2767 10: 2782 11: 3069 12: 3402
Graphics: Card-1: AMD Vega 10 XT [Radeon RX Vega 64] driver: amdgpu v: kernel Display: x11 server: X.Org 1.20.1 driver: modesetting unloaded: fbdev,vesa resolution: 2560x1440~144Hz, 1280x720~60Hz OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.26.0 4.18.8-arch1-1-ARCH LLVM 6.0.1) v: 4.5 Mesa 18.2.0
I also opened downstream Debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=909636
I tested it with Intel GPU recently, and it doesn't crash. So it's amdgpu specific.
It looks like it's related to https://bugs.freedesktop.org/show_bug.cgi?id=107978
My monitor (Dell U2413) has a setting for toggling DisplayPort 1.2. When I disable it, Wayland Plasma session isn't crashing anymore and is logging in properly! So it's likely an amdgpu issue actually.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/445.