Created attachment 117881
Host machine is Slackware64-current.
- xf86-video-nouveau 1.0.11
- xorg-server 1.17.2
- Linux 4.1.6
X fails to start, apparently because it is blocked by nouveau. Blacklisting nouveau in /etc/modprobe.d/nouveau.conf results in a kernel panic. See the attached dmesg and Xorg.0.log. Forgive me if I am reporting this bug in the wrong place.
dmesg after startx:
Created attachment 117882
Created attachment 117883
What is this xcmddc? I suspect it's not helping.
But actually your problem is due to xf86-video-intel built against a recent libdrm version. You either need a patch, or to downgrade libdrm below 2.4.60. The intel ddx patch is available at http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/commit/?id=7fe2b2948652443ff43d907855bd7a051d54d309
By the way, for the xcmddc issue, it may be helpful if you could provide your vbios (available at /sys/kernel/debug/dri/0/vbios.rom).
But Xorg is crashing most likely because of intel, which you need to rebuild against an older libdrm (or rebuild with the patch I linked to).
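To see which side of the 2.4.60 threshold an installed libdrm falls on, something like the following may help. It is only a sketch: `version_ge` and `check_libdrm` are hypothetical helper names, and it assumes GNU `sort` (for `-V`) plus a working `pkg-config` entry for libdrm.

```shell
# Hypothetical helpers, not part of libdrm or xf86-video-intel.

version_ge() {
    # True when $1 >= $2, using GNU sort's version ordering.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

check_libdrm() {
    # $1: libdrm version string, e.g. the output of
    #     pkg-config --modversion libdrm
    if version_ge "$1" 2.4.60; then
        echo "libdrm $1: intel ddx needs the patch or a libdrm downgrade"
    else
        echo "libdrm $1: below 2.4.60"
    fi
}

# Usage (assumes pkg-config can find libdrm):
#   check_libdrm "$(pkg-config --modversion libdrm)"
```

The version is passed in as an argument so the comparison can be tried without touching the system.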
Thank you Ilia! I will try that patch.
Upgrading xf86-video-intel did the trick! Thank you Ilia. On my first reboot, though, X would not start: it was a hard lockup (I was unable to get to another tty).
After forcing a reboot with the power button, I was able to start X on the next boot. I don't know if there is a relevant bug somewhere, but I will provide the necessary logs in case someone can make sense of them.
Looks like you still have the same failure from xcmddc. IMO it's worth investigating/fixing. Will need your vbios though, as I mentioned before.
[ 8.700223] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 8.700262] IP: [<ffffffffc0a3ca10>] nvkm_i2c_try_acquire_pad+0x40/0x80 [nouveau]
[ 8.700265] PGD 2eb8067 PUD 2f1c067 PMD 0
[ 8.700269] Oops: 0000 [#1] SMP
[ 8.700313] Modules linked in: snd_hda_codec_hdmi snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep joydev snd_pcm snd_timer hid_generic usbhid hid snd x86_pkg_temp_thermal intel_powerclamp coretemp intel_rapl btusb btbcm iwlmvm mac80211 btintel bluetooth iosf_mbi i2c_dev soundcore iwlwifi cfg80211 kvm_intel i915 r8169 rtsx_pci_ms nouveau kvm ttm drm_kms_helper drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel intel_gtt thermal processor video psmouse agpgart rtsx_pci_sdmmc rfkill mmc_core memstick mei_me mei lpc_ich rtsx_pci thermal_sys hwmon i2c_algo_bit xhci_pci xhci_hcd mxm_wmi ehci_pci wmi mii microcode i2c_i801 ehci_hcd evdev tpm_infineon serio_raw tpm_tis tpm battery button ac i2c_core loop
[ 8.700321] CPU: 7 PID: 406 Comm: xcmddc Tainted: G I 4.1.6 #2
[ 8.700322] Hardware name: Notebook W230SS /W230SS , BIOS 4.6.5 05/13/2014
[ 8.700324] task: ffff88041bec4380 ti: ffff8800c613c000 task.ti: ffff8800c613c000
[ 8.700366] RIP: 0010:[<ffffffffc0a3ca10>] [<ffffffffc0a3ca10>] nvkm_i2c_try_acquire_pad+0x40/0x80 [nouveau]
[ 8.700369] RSP: 0018:ffff8800c613fc38 EFLAGS: 00010286
[ 8.700370] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000010000000
[ 8.700370] RDX: ffff88041817ec80 RSI: 0000000000000005 RDI: ffff88041c26d000
[ 8.700371] RBP: ffff8800c613fc38 R08: 0000000000018de0 R09: ffff88041d803e00
[ 8.700372] R10: ffff88041d803e00 R11: 0000000000000246 R12: ffff88041c26d000
[ 8.700372] R13: 0000000000000000 R14: ffff880002ffdd00 R15: ffff88041c26d000
[ 8.700373] FS: 00007fa6d1bab780(0000) GS:ffff88042fbc0000(0000) knlGS:0000000000000000
[ 8.700374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.700375] CR2: 0000000000000008 CR3: 000000041bd0d000 CR4: 00000000001406e0
[ 8.700375] Stack:
[ 8.700378] ffff8800c613fca8 ffffffffc0a3cb1d ffff8800c613fc68 ffff8800c613fe08
[ 8.700379] 0000000000008002 ffff8800c613fda8 ffff8800c613fe08 ffffffff811d3e74
[ 8.700380] ffff8800c613fcc8 ffffffff810b7779 ffff88041c26d000 ffff88041c26d398
[ 8.700381] Call Trace:
[ 8.700412] [<ffffffffc0a3cb1d>] nvkm_i2c_acquire_pad+0xcd/0x150 [nouveau]
[ 8.700422] [<ffffffff811d3e74>] ? mntput+0x24/0x40
[ 8.700433] [<ffffffff810b7779>] ? update_curr+0xd9/0x160
[ 8.700448] [<ffffffffc0a3c930>] nvkm_i2c_acquire+0x40/0x60 [nouveau]
[ 8.700462] [<ffffffffc0a3e863>] aux_xfer+0x53/0x160 [nouveau]
[ 8.700480] [<ffffffffc000abf5>] __i2c_transfer+0x245/0x440 [i2c_core]
[ 8.700484] [<ffffffffc000ae44>] i2c_transfer+0x54/0x90 [i2c_core]
[ 8.700487] [<ffffffffc000aebf>] i2c_master_send+0x3f/0x50 [i2c_core]
[ 8.700490] [<ffffffffc001e460>] i2cdev_write+0x50/0x70 [i2c_dev]
[ 8.700509] [<ffffffff811b2ad8>] __vfs_write+0x28/0xf0
[ 8.700518] [<ffffffff81cc7982>] ? do_nanosleep+0x82/0x110
[ 8.700534] [<ffffffff815e2733>] ? security_file_permission+0x23/0xa0
[ 8.700536] [<ffffffff811b2f73>] ? rw_verify_area+0x53/0x100
[ 8.700538] [<ffffffff811b3209>] vfs_write+0xa9/0x1b0
[ 8.700545] [<ffffffff810e4180>] ? hrtimer_get_res+0x50/0x50
[ 8.700547] [<ffffffff811b4016>] SyS_write+0x46/0xb0
[ 8.700549] [<ffffffff81cc869b>] system_call_fastpath+0x16/0x6e
[ 8.700567] Code: 42 08 48 8b 08 8b 09 81 e1 00 00 00 10 74 ec b8 01 00 00 00 f0 0f c1 42 1c 85 c0 74 36 48 8b 42 28 eb 11 0f 1f 84 00 00 00 00 00 <48> 8b 40 08 48 85 c0 74 0f 48 39 c7 75 f2 31 c0 5d c3 66 0f 1f
[ 8.700618] RIP [<ffffffffc0a3ca10>] nvkm_i2c_try_acquire_pad+0x40/0x80 [nouveau]
[ 8.700622] RSP <ffff8800c613fc38>
[ 8.700622] CR2: 0000000000000008
Hi Ilia, I appreciate your help. This directory is empty: /sys/kernel/debug/
Is there anywhere else I can look?
Make sure that debugfs is mounted. e.g.
mount -t debugfs debugfs /sys/kernel/debug
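A slightly more defensive variant is sketched below: it mounts debugfs only when it is not already mounted, then copies the vbios out. `debugfs_mounted` and `dump_vbios` are hypothetical helper names, the default output path is an arbitrary choice, and the mount and the read of vbios.rom are assumed to need root.

```shell
# Hypothetical helpers; run as root.

debugfs_mounted() {
    # $1: mounts table to read (defaults to /proc/mounts).
    # Succeeds if any entry has filesystem type "debugfs".
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

dump_vbios() {
    # $1: destination file (defaults to ./vbios.rom, an arbitrary choice).
    debugfs_mounted || mount -t debugfs debugfs /sys/kernel/debug || return 1
    cp /sys/kernel/debug/dri/0/vbios.rom "${1:-./vbios.rom}"
}

# Usage:
#   dump_vbios /tmp/vbios.rom
```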
Ah, I did not know that! Learning is fun. :-)
As requested, vbios.rom:
Also, the system is still experiencing a hard lockup after a few minutes, so this is definitely worth investigating.
Please attach files here.
Sorry, trying to do this from my phone, since the computer will not stay on for more than a few minutes.
Add nouveau.modeset=0 to your kernel cmdline; that should prevent nouveau from loading. I believe that xcmddc is killing off CPUs one by one, as nouveau is mishandling something it's asking for.
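After rebooting, it may be worth confirming that the parameter actually reached the kernel. The sketch below only inspects the command line; it does not verify what nouveau did with it. `has_modeset_off` is a hypothetical helper name.

```shell
# Hypothetical helper: checks the kernel command line for the parameter.

has_modeset_off() {
    # $1: file holding the kernel command line (defaults to /proc/cmdline).
    # Split on spaces and look for an exact nouveau.modeset=0 token.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'nouveau\.modeset=0'
}

# Usage:
#   has_modeset_off && echo "nouveau.modeset=0 is set"
```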
Created attachment 117904
Thank you, I was able to get it attached.
Ilia, should I change the status of this bug?
Created attachment 118404
dmesg on linux 4.1.8
Just so you know, I am still getting a hard system freeze on Linux 4.1.8.
(In reply to ryanpcmcquen from comment #17)
> Created attachment 118404
> dmesg on linux 4.1.8
> Just so you know, I am still getting a hard system freeze on Linux 4.1.8.
Are you able to try with a 4.3-rc kernel? This code got reworked to be more sane and hopefully fixes this issue as a side-effect.
(In reply to Ben Skeggs from comment #18)
> Are you able to try with a 4.3-rc kernel? This code got reworked to be more
> sane and hopefully fixes this issue as a side-effect.
Thanks for the reply Ben. X does not work for me on Linux 4.2.1 (it will not start). I just tried building Linux 4.3-rc2, but it fails to compile. I will try again when rc3 comes out; hopefully that will fix it.
Linux 4.3-rc3 will not compile either, here is the error:
arch/x86/built-in.o: In function `hv_machine_crash_shutdown':
mshyperv.c:(.text+0x31dcf): undefined reference to `native_machine_crash_shutdown'
make: *** [vmlinux] Error 1
Is there a patch I can try against nouveau?
Created attachment 119187
dmesg for linux 4.3-rc7
I haven't had a freeze yet on Linux 4.3-rc7.
Do you notice anything wonky in dmesg?
This appears to be fixed in the Nov 19 version of nouveau (git commit 6e6d8ac).
I spoke too soon. The issue seems absent with the latest nouveau driver on Linux 4.2.x+, but it still exists on 4.1.x. I now have Xorg 1.18.0, if that makes any difference.