Created attachment 119547 [details]
Dmesg from system start until and including the second time a kernel backtrace appears

My distribution's bug tracker referred me here to report an error related to the Nouveau driver. I use the "Tumbleweed" distribution by Opensuse, a kind of rolling release with package upgrades roughly once a week. On 2015-11-08 I pulled the distribution's latest snapshot, which installed, among a few other packages, Linux kernel version 4.3.0, an upgrade from 4.2.4. Since then I have experienced grave issues when using the KDE desktop.

Beginning on that Sunday (i.e. 2015-11-08), after some time of using the desktop, I noticed that KDE's text editor ("Kwrite") would no longer start when launched from the file manager. Initially I thought this was a communication problem inside KDE, because I could restart the file manager and managed to open one text file. But attempting to launch a second instance of Kwrite via the file manager failed again. I tried repeatedly and found that after very few attempts Kwrite could not be launched anymore. At that time I could still interact with other running programs, but after some time the whole desktop locked up. Not even switching to a text console via Ctrl + F1 worked. The system had to be rebooted.

I can now reproduce a whole desktop lockup with this simple procedure:
- Power on.
- Log in via KDM.
- Press Alt + F2, then type konsole in the mini command line.
- Enter dmesg in the Konsole window.
- Open a second Konsole tab.
- In that new tab, type kwrite.

Kwrite is not launched successfully by that attempt.

To gather information, I installed kernel version 4.2.4 from the distribution's package in parallel to 4.3.0. When I boot 4.2.4, I cannot reproduce the desktop lockups.

I have attached the complete dmesg output produced by the procedure above. As you can see, there are some suspicious kernel backtraces related to Nouveau.
One of these backtraces is closely associated in time with the attempt to launch Kwrite: after I type "dmesg" for the first time I see only one backtrace. Then, after entering kwrite, I can request dmesg again and spot the second kernel backtrace.

With slightly older kernel versions I also get these kernel backtraces in the system log (journalctl), but I do NOT experience whole desktop lockups. With even older kernel versions, I do not get these types of kernel backtraces.

These are the lines from when journalctl indicates a similar backtrace for the first time. (The installed kernel must have been 4.2.3, as far as I can determine from the package install history logfile.)

----- Kernel 4.2.3: ------------------------------------------------------------
Okt 27 15:42:10 linux-5rjk kernel: resource sanity check: requesting [mem 0xddf6d000-0xde06cfff], which spans more than 0000:01:00.0 [mem 0xdc000000-0xddffffff 64bit pref]
Okt 27 15:42:10 linux-5rjk kernel: ------------[ cut here ]------------
Okt 27 15:42:10 linux-5rjk kernel: WARNING: CPU: 0 PID: 5113 at ../arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2de/0x360()
Okt 27 15:42:10 linux-5rjk kernel: Info: mapping multiple BARs. Your kernel is fine.
Okt 27 15:42:10 linux-5rjk kernel: Modules linked in:
Okt 27 15:42:10 linux-5rjk kernel: nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit iscsi_ibft iscsi_boot_sysfs af_packet ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables snd_hda_codec_hdmi snd_hda_codec_analog snd_hda_codec_generic iTCO_wdt gpio_ich iTCO_vendor_support ppdev dm_mod coretemp kvm_intel kvm pcspkr i2c_i801 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep lpc_ich mfd_core snd_pcm asus_atk0110 8250_fintek parport_pc parport snd_timer nouveau snd mxm_wmi wmi video ttm drm_kms_helper drm i2c_algo_bit acpi_cpufreq button processor shpchp soundcore hid_generic usbhid
Okt 27 15:42:10 linux-5rjk kernel: ata_generic serio_raw firewire_ohci firewire_core crc_itu_t atl1 mii pata_jmicron ehci_pci uhci_hcd ehci_hcd usbcore usb_common sg
Okt 27 15:42:10 linux-5rjk kernel: CPU: 0 PID: 5113 Comm: kwrite Not tainted 4.2.3-1-default #1
Okt 27 15:42:10 linux-5rjk kernel: Hardware name: System manufacturer System Product Name/P5B-E, BIOS 1002 01/30/2007
Okt 27 15:42:10 linux-5rjk kernel: ffffffff81a20135 ffff880180b93758 ffffffff81661dad 0000000000000007
Okt 27 15:42:10 linux-5rjk kernel: ffff880180b937a8 ffff880180b93798 ffffffff81068246 ffffc90006cfffff
Okt 27 15:42:10 linux-5rjk kernel: 0000000000100000 ffffc90006c00000 00000000ddf6d000 0000000000000000
Okt 27 15:42:10 linux-5rjk kernel: Call Trace:
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81007a15>] try_stack_unwind+0x175/0x190
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81006223>] dump_trace+0x93/0x3a0
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81007a7f>] show_trace_log_lvl+0x4f/0x60
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff8100663c>] show_stack_log_lvl+0x10c/0x180
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81007b15>] show_stack+0x25/0x50
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81661dad>] dump_stack+0x4c/0x6e
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81068246>] warn_slowpath_common+0x86/0xc0
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff810682c6>] warn_slowpath_fmt+0x46/0x50
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff8105425e>] __ioremap_caller+0x2de/0x360
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff810542f7>] ioremap_nocache+0x17/0x20
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa0234e72>] nvkm_barobj_ctor+0xc2/0xf0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02338d1>] nvkm_object_ctor+0x31/0xd0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa0234ece>] nvkm_bar_alloc+0x2e/0x40 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa023092d>] nvkm_gpuobj_create_+0x26d/0x2a0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa023099d>] _nvkm_gpuobj_ctor+0x3d/0x50 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02338d1>] nvkm_object_ctor+0x31/0xd0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02309fc>] nvkm_gpuobj_new+0x4c/0x50 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa0274f41>] nvkm_vm_get+0x171/0x2c0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02c4e9e>] nouveau_bo_vma_add+0x2e/0x90 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02d63c5>] nouveau_channel_prep+0x215/0x2f0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02d6511>] nouveau_channel_new+0x71/0x700 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02d53da>] nouveau_abi16_ioctl_channel_alloc+0x12a/0x3f0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa01493a5>] drm_ioctl+0x125/0x610 [drm]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffffa02bdab0>] nouveau_drm_ioctl+0x70/0xd0 [nouveau]
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff811f2bf5>] do_vfs_ioctl+0x285/0x460
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff811f2e49>] SyS_ioctl+0x79/0x90
Okt 27 15:42:10 linux-5rjk kernel: [<ffffffff81667e32>] entry_SYSCALL_64_fastpath+0x16/0x75
Okt 27 15:42:10 linux-5rjk kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x75
Okt 27 15:42:10 linux-5rjk kernel:
Okt 27 15:42:10 linux-5rjk kernel: Leftover inexact backtrace:
Okt 27 15:42:10 linux-5rjk kernel: ---[ end trace d43371eb12dab49d ]---
Okt 27 15:42:10 linux-5rjk kernel: nouveau E[kwrite[5113]] channel failed to initialise, -12
Okt 27 15:42:13 linux-5rjk kernel: SFW2-INext-DROP-DEFLT IN=enp3s0 OUT= MAC [... cut long line, the reporter]
Okt 27 15:42:16 linux-5rjk kernel: SFW2-INext-DROP-DEFLT IN=enp3s0 OUT= MAC [... cut long line, the reporter]
Okt 27 15:42:30 linux-5rjk kernel: resource sanity check: requesting [mem 0xddf6d000-0xde06cfff], which spans more than 0000:01:00.0 [mem 0xdc000000-0xddffffff 64bit pref]
Okt 27 15:42:30 linux-5rjk kernel: nouveau E[kwrite[5122]] channel failed to initialise, -12
--------------------------------------------------------------------------------

For your information, I have attached files showing the package install history (only the most recent weeks), the output of "hwinfo --gfx", and some information about installed packages. I hope you can make something of the RPM output; if not, I am glad to supply any missing information. I have also attached the dmesg output, as mentioned above.

For reference, I opened this report in Opensuse's bug tracker:
https://bugzilla.opensuse.org/show_bug.cgi?id=954473

As further information: for some time now (roughly since mid 2015) I have also been getting Nouveau failure messages similar to those attached here:
https://bugs.freedesktop.org/show_bug.cgi?id=92504
But these do not usually provoke hard desktop lockups, and are only seen when I also use Firefox, which I do sparingly. So that is likely a separate problem with only a weak relation to my recent troubles.
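The "resource sanity check" message in the log above names two address ranges: the one being requested and the PCI resource it is compared against. As a minimal sketch (the function name and the returned dictionary keys are made up for illustration, and the regular expression is only fitted to the message as quoted in this report), the following Python computes how large the request is and by how much it runs past the end of the named resource:

```python
import re

# Fitted to the message text quoted in this report; other kernels may word it differently.
SANITY_RE = re.compile(
    r"requesting \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\], "
    r"which spans more than \S+ \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)",
    re.IGNORECASE,
)


def analyse_sanity_check(line):
    """Return size and overrun of the requested range, or None if the line does not match."""
    m = SANITY_RE.search(line)
    if m is None:
        return None
    req_start, req_end, res_start, res_end = (int(g, 16) for g in m.groups())
    return {
        # Both ends of a "[mem a-b]" range are inclusive, hence the +1.
        "request_size": req_end - req_start + 1,
        "bytes_past_resource_end": max(0, req_end - res_end),
        "bytes_before_resource_start": max(0, res_start - req_start),
    }


# Message text copied from the journal excerpt in this report.
SAMPLE = (
    "resource sanity check: requesting [mem 0xddf6d000-0xde06cfff], "
    "which spans more than 0000:01:00.0 [mem 0xdc000000-0xddffffff 64bit pref]"
)

if __name__ == "__main__":
    info = analyse_sanity_check(SAMPLE)
    for key, value in info.items():
        print(f"{key}: {value:#x}")
```

For the quoted line this gives a request of 0x100000 bytes (1 MiB) that ends 0x6d000 bytes beyond the end of the resource at 0000:01:00.0. This is only arithmetic on the logged numbers; it says nothing about why the driver asked for that range.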
Created attachment 119548 [details] Output of hwinfo --gfx
Created attachment 119549 [details] Version information of some relevant packages
Created attachment 119550 [details] Recent weeks of package install history
Nouveau underwent a significant rewrite for kernel 4.3. Any chance you could bisect the changes to drivers/gpu/drm/nouveau between v4.2 and v4.3?
I will try to bisect between 4.2 and 4.3. I will likely not report back until the weekend. Thanks for answering so fast.
I believe I'm seeing the same bug:
* plasma5 hangs with kernel 4.3, but not with 4.2
* "resource sanity check" appears in /var/log/messages

I have an "NVIDIA Corporation G98 [Quadro NVS 295] (rev a1)" as seen by lspci (Dell workstation). I have kernel-default-4.3.0-2.1 on Tumbleweed, but when I had kernel-default-4.3.0-1.1, I was also seeing "DRM: GPU lockup - switching to software fbcon" like in #92971.
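For others who want to check whether their logs show the same symptoms, here is a small Python sketch that counts the three message fragments quoted so far in this thread. The category names and the function name are invented for this example, and the match is a plain substring test, so it will miss messages that are worded differently on other kernel versions.

```python
import sys
from collections import Counter

# Message fragments quoted in this thread; the keys are labels made up for this sketch.
SIGNATURES = {
    "resource_sanity_check": "resource sanity check: requesting",
    "channel_init_failed": "channel failed to initialise",
    "gpu_lockup": "GPU lockup - switching to software fbcon",
}


def count_signatures(lines):
    """Count how many of the given log lines contain each known fragment."""
    counts = Counter()
    for line in lines:
        for name, fragment in SIGNATURES.items():
            if fragment in line:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Reads log text from standard input.
    for name, n in sorted(count_signatures(sys.stdin).items()):
        print(f"{name}: {n}")
```

Saved under a name of your choice (say, a hypothetical scan_nouveau.py), it can be fed from a log file, for example `python3 scan_nouveau.py < /var/log/messages`, or from `journalctl -k` through a pipe.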
I used git bisect to find the first bad kernel revision. This is Git's "BISECT_LOG":

git bisect start
# good: [1c02865136fee1d10d434dc9e3616c8e39905e9b] Linux 4.2.6
git bisect good 1c02865136fee1d10d434dc9e3616c8e39905e9b
# bad: [6ff33f3902c3b1c5d0db6b1e2c70b6d76fba357f] Linux 4.3-rc1
git bisect bad 6ff33f3902c3b1c5d0db6b1e2c70b6d76fba357f
# good: [64291f7db5bd8150a74ad2036f1037e6a0428df2] Linux 4.2
git bisect good 64291f7db5bd8150a74ad2036f1037e6a0428df2
# good: [dd5cdb48edfd34401799056a9acf61078d773f90] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good dd5cdb48edfd34401799056a9acf61078d773f90
# bad: [f377ea88b862bf7151be96d276f4cb740f8e1c41] Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
git bisect bad f377ea88b862bf7151be96d276f4cb740f8e1c41
# good: [abebcdfb64f1b39eeeb14282d9cd4aad1ed86f8d] Merge tag 'sound-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good abebcdfb64f1b39eeeb14282d9cd4aad1ed86f8d
# good: [bef2c7bd578e91c9c10983e0c15c4501127b77ca] Merge tag 'drm/tegra/for-4.3-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
git bisect good bef2c7bd578e91c9c10983e0c15c4501127b77ca
# good: [99336ed363f49f484b4d93600c4dfec1f2ebb84a] drm/nouveau/ltc: switch to device pri macros
git bisect good 99336ed363f49f484b4d93600c4dfec1f2ebb84a
# bad: [97070f23c60869830039b216ff88230f54ef7107] drm/nouveau/pm: convert to new-style nvkm_engine
git bisect bad 97070f23c60869830039b216ff88230f54ef7107
# good: [c813d8e048740ca82b88a9d3f639bbd8095b24ac] drm/nouveau/bin: punt client/device argument handling into a common helper
git bisect good c813d8e048740ca82b88a9d3f639bbd8095b24ac
# bad: [6157091177102638c7d94ffc159c0b157a1c9b56] drm/nouveau/sw: remove dependence on namedb/engctx lookup
git bisect bad 6157091177102638c7d94ffc159c0b157a1c9b56
# good: [168c2e213d3a9b605856d3676d9e93733c8b37d3] drm/nouveau/engine: implement support for new-style nvkm_engine
git bisect good 168c2e213d3a9b605856d3676d9e93733c8b37d3
# good: [358ce601ae5de59bf6f08f79455c5b3cb7d359d4] drm/nouveau/fifo: directly use instmem for runlists and polling areas
git bisect good 358ce601ae5de59bf6f08f79455c5b3cb7d359d4
# bad: [344c2d429dd86b1b0113177e18f15adb74e9d936] drm/nouveau/fb: remove dependence on namedb/engctx lookup
git bisect bad 344c2d429dd86b1b0113177e18f15adb74e9d936
# bad: [1d2a1e53865266a67fb569705eba3ec992682721] drm/nouveau/ramht: remove dependence on namedb
git bisect bad 1d2a1e53865266a67fb569705eba3ec992682721
# good: [f027f49166171c98d5945af12ac3ee9bc9f9bf4c] drm/nouveau/gpuobj: separate allocation from nvkm_object
git bisect good f027f49166171c98d5945af12ac3ee9bc9f9bf4c
# first bad commit: [1d2a1e53865266a67fb569705eba3ec992682721] drm/nouveau/ramht: remove dependence on namedb
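A bisect log like the one above is easy to summarise mechanically. The Python sketch below is only an illustration (the function name and result layout are made up here). It reads the "# good:", "# bad:" and "# first bad commit:" comment lines that git writes into the log and reports the number of tested revisions together with the culprit:

```python
import re

# Comment lines as they appear in the log above: "# good: [<hash>] <subject>".
MARK_RE = re.compile(r"^# (good|bad): \[([0-9a-f]{7,40})\] (.*)$")
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{7,40})\] (.*)$")


def summarise_bisect_log(text):
    """Count good/bad verdicts and extract the first bad commit, if the log names one."""
    good, bad, first_bad = [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        m = FIRST_BAD_RE.match(line)
        if m:
            first_bad = (m.group(1), m.group(2))
            continue
        m = MARK_RE.match(line)
        if m:
            (good if m.group(1) == "good" else bad).append(m.group(2))
    return {"good": len(good), "bad": len(bad), "first_bad": first_bad}


# The last entries of the log above, used as sample input.
SAMPLE_LOG = """\
# bad: [1d2a1e53865266a67fb569705eba3ec992682721] drm/nouveau/ramht: remove dependence on namedb
git bisect bad 1d2a1e53865266a67fb569705eba3ec992682721
# good: [f027f49166171c98d5945af12ac3ee9bc9f9bf4c] drm/nouveau/gpuobj: separate allocation from nvkm_object
git bisect good f027f49166171c98d5945af12ac3ee9bc9f9bf4c
# first bad commit: [1d2a1e53865266a67fb569705eba3ec992682721] drm/nouveau/ramht: remove dependence on namedb
"""

if __name__ == "__main__":
    print(summarise_bisect_log(SAMPLE_LOG))
```

Only the comment lines are parsed. The "git bisect good/bad <hash>" command lines repeat the same verdicts and are ignored, so nothing is counted twice.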
I have a very similar setup:
- OpenSUSE Tumbleweed
- Dell Laptop (E6510)
- NVIDIA Corporation GT218M [NVS 3100M]
- upgraded kernel to 4.3.0

And exactly the same symptoms (down to the same call backtrace). So I can help test driver fixes if needed.

Also @Volker Lukas:
- Where did you get the older, still functioning copy (4.2.4?) I would like to download it and keep it until the 4.3.0 kernel gets fixed, but all the Tumbleweed mirrors seem to have deleted the older kernel RPMs and only have the latest one.
Hi doktor.yak, at the time I encountered this bug, the Opensuse Linux 4.2.4 RPM was still downloadable. If you build linux-4.2.6.tar.xz from kernel.org via "make rpm" you should be able to get a working kernel package. You can copy the /boot/config-4.x-something to the kernel source directory to copy the build configuration (rename it to ".config").
Thanks for your answer. Do you know if Suse did apply any patch on their version of the 4.2.4 kernel ? Otherwise I'll follow your recommendation and compile a vanilla kernel. (with "make oldconfig"-ing /proc/config.gz)
(In reply to doktor.yak from comment #10)
> Thanks for your answer.
>
> Do you know if Suse did apply any patch on their version of the 4.2.4 kernel
> ?

Nothing about nouveau.

> Otherwise I'll follow your recommendation and compile a vanilla kernel.
> (with "make oldconfig"-ing /proc/config.gz)

It's better to compile it yourself anyway, to exclude any subtle differences.
With current Opensuse snapshots this problem is apparently gone. One notable upgrade is that of Linux to 4.4.0, but X-Server, Mesa, KDE, Qt, etc. were also upgraded.
Created attachment 121479 [details]
dmesg 4.5.0-0.rc2.git0.1.fc24.x86_64 nouveau KDE5

SW:
kernel-modules-4.5.0-0.rc2.git0.1.fc24.x86_64
libdrm-2.4.66-1.fc24.x86_64
xorg-x11-server-Xorg-1.18.0-5.fc24.x86_64
xorg-x11-drv-nouveau-1.0.12-1.fc24.x86_64
mesa-dri-drivers-11.2.0-0.devel.8.24ea81a.fc24.x86_64
plasma-workspace-5.5.4-1.fc24.x86_64
qt5-qtdeclarative-5.6.0-0.7.beta.fc24.x86_64

HW:
NVIDIA G98
After upgrade to:
$ rpm --query --file /usr/lib64/libQt5Qml.so.5.6.0
qt5-qtdeclarative-5.6.0-0.8.beta.fc24.x86_64
KDE5 starts without hassle.

Ref.
- Info: qt5-qtdeclarative-5.6.0-0.8.beta.fc24
  http://koji.fedoraproject.org/koji/buildinfo?buildID=715479
  "build with -fno-delete-null-pointer-checks to workaround gcc6-related runtime crashes (#1303643)"
- "qt5-qtdeclarative-5.6.0-0.7.beta.fc24 broken"
  https://bugzilla.redhat.com/show_bug.cgi?id=1303643
http://download.opensuse.org/tumbleweed/iso/ openSUSE-Tumbleweed-KDE-Live-x86_64-Snapshot20160130-Media.iso works OK