Created attachment 122587 [details] dmesg from boot Booting Linux 4.6-rc1 and Mesa 11.2/11.3 fails to load the nouveau driver on a GTX 970M (6GB) on an MSI GS60 Ghost Pro 4K (i7-6700HQ). It spews out wonderful messages like [ 2.146398] nouveau 0000:01:00.0: priv: HUB0: 10ecc0 ffffffff (1940822c) [ 2.154362] vga_switcheroo: enabled [ 2.154567] [TTM] Zone kernel: Available graphics memory: 8170764 kiB [ 2.154568] [TTM] Zone dma32: Available graphics memory: 2097152 kiB [ 2.154569] [TTM] Initializing pool allocator [ 2.154572] [TTM] Initializing DMA pool allocator [ 2.154577] nouveau 0000:01:00.0: DRM: VRAM: 6144 MiB [ 2.154578] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB [ 2.154580] nouveau 0000:01:00.0: DRM: Pointer to TMDS table invalid [ 2.154582] nouveau 0000:01:00.0: DRM: DCB version 4.1 [ 2.154583] nouveau 0000:01:00.0: DRM: Pointer to flat panel table invalid Attached is a dmesg from boot. The driver does just drop to the i915 driver so the machine is usable, but whenever I run lspci or lshw or try to logout of the X session, it hangs when it switches back to the nVidia GPU (the laptop has an LED indicator showing which GPU is in use)
Nouveau is successfully loaded on your laptop, but it seems to fail when it tries to wake up the NVIDIA GPU (if you look at the dmesg you linked, around 11sec, the NVIDIA GPU goes to sleep). You could try booting with `nouveau.runpm=0` on the kernel command line, and see if you still get the issue. Do you have any dmesg from when it hangs? IIRC, Alexandre Courbot sent a patch some time ago to fix an issue where the driver would try to reload the signed firmware upon resume and fail, but I would have guessed it is included in 4.6-rc1.
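When comparing results across reporters it helps to confirm that a boot option such as `nouveau.runpm=0` actually reached the kernel. The following is a minimal sketch, not part of the original thread: the helper names `kernel_param` and `read_cmdline` are hypothetical, and it assumes the command line is exposed at `/proc/cmdline` and that quoted values (such as an `acpi_osi` string containing a space) still carry their quotes there.

```python
import shlex


def kernel_param(cmdline: str, name: str):
    """Return the value given for `name` on a kernel command line.

    Returns None if the parameter is absent and "" if it appears as a
    bare flag. If the parameter is repeated, the last occurrence is
    returned (a choice made for this sketch).
    """
    value = None
    # shlex.split keeps a quoted value with spaces together as one token.
    for token in shlex.split(cmdline):
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (the path is an assumption)."""
    with open(path) as f:
        return f.read().strip()
```

On an affected machine, `kernel_param(read_cmdline(), "nouveau.runpm")` would then show whether the option was picked up by the running kernel.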
In addition to the runpm=0 thing, please ensure that you have the appropriate firmware installed for this GPU - it should be in linux-firmware.git by now (nvidia/*). I don't see a message about nouveaufb, which could be due to how you configured your kernel, but it could also be because you don't have the firmware, and the user helper is kicking in and waiting 60 seconds for it to fail out, so nouveau's not fully done loading by the time the runpm stuff kicks in. Just a theory.
(In reply to Ilia Mirkin from comment #2) > In addition to the runpm=0 thing, please ensure that you have the > appropriate firmware installed for this GPU - it should be in > linux-firmware.git by now (nvidia/*). I don't see a message about nouveaufb, > which could be due to how you configured your kernel, but it could also be > because you don't have the firmware, and the user helper is kicking in and > waiting 60 seconds for it to fail out, so nouveau's not fully done loading > by the time the runpm stuff kicks in. Just a theory. I have the gm20x firmware from the linux-firmware repo installed. (In reply to Pierre Moreau from comment #1) > Nouveau is successfully loaded on your laptop, but it seems to fail when it > tries to wake up the NVIDIA GPU (if you look at the dmesg you linked, around > 11sec, the NVIDIA GPU goes to sleep). You could try booting with > `nouveau.runpm=0` on the kernel command line, and see if you still get the > issue. > Do you have any dmesg from when it hangs? I'll try that in a bit as well as try to get a dmesg when it hangs (not at my computer ATM) I see "NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!" when it hangs during a logout/shutdown but that's not particularly helpful.
Created attachment 122591 [details] dmesg while crashing Here is the dmesg from when it crashes. I ran lshw and it seems that triggered the nVidia card to start back up which caused the crash. With runpm=0 the nVidia card is never powered off so it doesn't crash.
I have the same problem with a GM206, GTX 960. I have to power-cycle the computer twice.
Maybe these messages are pointing to the root of the problem: [ 51.608479] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 51.683924] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 51.700020] nouveau 0000:01:00.0: Refused to change power state, currently in D3 If the device is still in D3 when we resume it, then accessing registers would understandably result in a freeze. Devinit comes early enough in the resume chain to make this plausible. FWIW I can successfully suspend/resume (echo mem >/sys/power/state) a GTX 960, but runtime PM works slightly differently. I would like to enable runtime PM on my desktop GTX960 to repro this, but for some reason I am failing - despite loading nouveau with "modeset=2 runpm=1", I cannot see runtime PM kicking in and /sys/class/drm/card0/power/runtime_status says "unsupported". What am I doing wrong?
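Several reporters in this thread see the same "Refused to change power state" message, so a quick way to compare logs is to count those lines per device. The sketch below is only an illustration under stated assumptions: the function name `refused_transitions` is hypothetical, and the regular expression is modelled on the message format quoted above rather than on the kernel source.

```python
import re
from collections import Counter

# Pattern modelled on the dmesg lines quoted in this thread, for example:
# "nouveau 0000:01:00.0: Refused to change power state, currently in D3"
REFUSED = re.compile(
    r"(?P<driver>\S+) "
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Refused to change power state, currently in (?P<state>D\d)"
)


def refused_transitions(log_text: str) -> Counter:
    """Count refused power-state changes, keyed by (PCI address, state)."""
    counts = Counter()
    for match in REFUSED.finditer(log_text):
        counts[(match.group("bdf"), match.group("state"))] += 1
    return counts
```

Feeding it the output of `dmesg` (saved to a file or captured as text) gives one counter entry per device and reported state, which makes it easier to see whether a given log shows the D3 symptom at all.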
Can you try booting with acpi_osi="!Windows 2013" on the kernel command line?
Created attachment 122653 [details] Tentative fix The attached patch *might* help with this issue, but I have no way to test it. Rashed, Efrem, can one of you give it a try and tell us if it helps?
Dave, I tried adding the option you suggested, but it did not allow me to enable runtime PM, sadly. /sys/class/drm/card0/power/runtime_status still "unsupported" despite nouveau.ko being loaded with "modeset=2 runpm=1".
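One thing that may be worth checking here, and this is an assumption rather than something confirmed in the thread, is which sysfs node is being read: `/sys/class/drm/card0/power/` belongs to the DRM class device, while the PCI device that nouveau runtime-suspends is normally reachable through `/sys/class/drm/card0/device/power/`. The sketch below reads the runtime PM attributes from any device directory so the two can be compared; the helper name `runtime_pm_status` is hypothetical.

```python
import os


def runtime_pm_status(device_dir: str) -> dict:
    """Read runtime PM attributes from `<device_dir>/power`.

    Missing or unreadable attributes are reported as None instead of
    raising, since not every device exposes every file.
    """
    result = {}
    for attr in ("runtime_status", "control"):
        path = os.path.join(device_dir, "power", attr)
        try:
            with open(path) as f:
                result[attr] = f.read().strip()
        except OSError:
            result[attr] = None
    return result
```

Calling it once with `/sys/class/drm/card0` and once with `/sys/class/drm/card0/device` would show whether the "unsupported" status applies to the GPU itself or only to the class device.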
Created attachment 122654 [details] dmesg using tentative fix Alexandre, using the tentative fix you uploaded it switches GPUs properly now. Here is the dmesg from that since there are still some errors related to power state in it. Also, for some reason lshw is returning this for the nVidia GPU now: *-generic description: Unassigned class product: Illegal Vendor ID vendor: Illegal Vendor ID physical id: 0 bus info: pci@0000:01:00.0 version: ff width: 32 bits clock: 66MHz capabilities: bus_master vga_palette cap_list rom configuration: driver=nouveau latency=255 maxlatency=255 mingnt=255 resources: irq:129 memory:dc000000-dcffffff memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:e000(size=128) memory:dd000000-dd07ffff I don't know if that's related to this at all but before, if I set runpm=0 and run lshw, it would return the proper description (running it without runpm would cause the system to hang)
Thanks Rashed. This looks better but something seems to be going wrong with PCI. I'm pretty clueless about PCI/ACPI, so let's see if someone else has something to suggest...
Created attachment 122662 [details] attachment-28993-0.html I will not be able to test until later this afternoon. I have a GTX 960 as PCI 01:00.0 and GTX 730 as PCI 02:00.0. I will send over dmesg/journalctl -k. Regards Efrem
(In reply to Rashed Abdel-Tawab from comment #10) > Created attachment 122654 [details] > dmesg using tentative fix > > Alexandre, using the tentative fix you uploaded it switches GPUs properly > now. Here is the dmesg from that since there are still some errors related > to power state in it. Also, for some reason lshw is returning this for the > nVidia GPU now: > > *-generic > description: Unassigned class > product: Illegal Vendor ID > vendor: Illegal Vendor ID > physical id: 0 > bus info: pci@0000:01:00.0 > version: ff > width: 32 bits > clock: 66MHz > capabilities: bus_master vga_palette cap_list rom > configuration: driver=nouveau latency=255 maxlatency=255 > mingnt=255 > resources: irq:129 memory:dc000000-dcffffff > memory:b0000000-bfffffff memory:c0000000-c1ffffff ioport:e000(size=128) > memory:dd000000-dd07ffff > > I don't know if that's related to this at all but before, if I set runpm=0 > and run lshw, it would return the proper description (running it without > runpm would cause the system to hang) I have the same issue with bbswitch (and maybe vgaswitcheroo too). Basically this means that in d3cold we can't talk to the gpu and the information isn't cached or something like that.
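The lshw output quoted above (version `ff`, latency 255, "Illegal Vendor ID") is consistent with every config-space byte reading back as 0xFF, which is what a PCI read returns when the device does not respond. The following is a small sketch of that check, assuming the config space is exposed as a binary file under `/sys/bus/pci/devices/<address>/config`; the helper names are hypothetical. Note that, as seen earlier in this thread with lspci and lshw, reading config space may itself trigger a runtime resume of the GPU, so on an affected machine this could reproduce the hang.

```python
def config_space_unreachable(header: bytes) -> bool:
    """True if the vendor and device ID bytes all read back as 0xFF."""
    # The first four bytes of PCI config space hold vendor ID and device ID.
    return len(header) >= 4 and all(b == 0xFF for b in header[:4])


def read_config_header(bdf: str, length: int = 64) -> bytes:
    """Read the start of a device's config space via sysfs.

    `bdf` is a PCI address such as "0000:01:00.0". The sysfs path is an
    assumption of this sketch, and the read may wake the device.
    """
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(length)
```

A result of `True` from `config_space_unreachable(read_config_header("0000:01:00.0"))` would match the explanation given above that the GPU cannot be talked to while it is powered down.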
Created attachment 122708 [details] _msg_kernel_pci-23.txt This is what I captured from dmesg. PCI 02:00.0 is a 730 GTX w/1Gb DDR5 PCI 01:00.0 is a 960 GTX w/4Gb DDR5 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: bios: version 84.06.26.00.2c Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: gr: using external firmware Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: Direct firmware load for nvidia/gm206/fecs_inst.bin failed with error -2 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: gr: failed to load fecs_inst Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: disp: dcb 15 type 8 unknown Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: fb: 4096 MiB GDDR5 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: GART: 1048576 MiB Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: TMDS table version 2.0 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB version 4.1 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 01: 02000f00 00000000 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 02: 04011f82 00020030 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 03: 02022f62 00020010 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 05: 02833f76 04400020 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 06: 02033f72 00020020 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB outp 15: 01df5ff8 00000000 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB conn 01: 01000131 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB conn 02: 00010261 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB conn 03: 00020346 
Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: DCB conn 05: 00000570 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: Pointer to flat panel table invalid Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: unknown connector type 70 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: failed to create encoder 1/8/0: -19 Apr 02 17:08:56 localhost kernel: nouveau 0000:01:00.0: DRM: Unknown-1 has no encoders, removing Apr 02 17:08:57 localhost kernel: nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies Apr 02 17:08:57 localhost kernel: nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb: 0x60000, bo ffff88089ac02800 Apr 02 17:08:57 localhost kernel: nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device Apr 02 17:08:57 localhost kernel: nouveau 0000:02:00.0: enabling device (0000 -> 0003) Apr 02 17:08:57 localhost kernel: nouveau 0000:02:00.0: NVIDIA GK208B (b06070b1) Apr 02 17:08:57 localhost kernel: nouveau 0000:02:00.0: bios: version 80.28.78.00.01 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: priv: HUB0: 086014 ffffffff (1f70820c) Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: fb: 1024 MiB GDDR5 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: VRAM: 1024 MiB Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: GART: 1048576 MiB Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: TMDS table version 2.0 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB version 4.0 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB outp 00: 01000f02 00020030 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB outp 01: 02011f62 00020010 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB outp 02: 02022f10 00000000 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB conn 00: 00001031 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: DCB conn 01: 00002161 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: 
DRM: DCB conn 02: 00000200 Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: DRM: MM: using COPY for buffer copies Apr 02 17:08:58 localhost kernel: nouveau 0000:02:00.0: No connectors reported connected with modes Apr 02 17:08:59 localhost kernel: nouveau 0000:02:00.0: DRM: allocated 1024x768 fb: 0x60000, bo ffff88089a47c400 Apr 02 17:08:59 localhost kernel: nouveau 0000:02:00.0: fb1: nouveaufb frame buffer device Apr 02 17:09:01 localhost.localdomain kernel: mei_me 0000:00:16.0: enabling device (0000 -> 0002) Apr 02 17:09:01 localhost.localdomain kernel: snd_hda_intel 0000:01:00.1: Disabling MSI Apr 02 17:09:01 localhost.localdomain kernel: snd_hda_intel 0000:01:00.1: Handle vga_switcheroo audio client Apr 02 17:09:01 localhost.localdomain kernel: snd_hda_intel 0000:02:00.1: Disabling MSI Apr 02 17:09:01 localhost.localdomain kernel: snd_hda_intel 0000:02:00.1: Handle vga_switcheroo audio client Apr 02 17:09:02 localhost.localdomain kernel: snd_hda_intel 0000:00:1f.3: failed to add i915 component master (-19) Apr 02 17:09:02 localhost.localdomain kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode Apr 02 17:09:02 localhost.localdomain kernel: e1000e 0000:00:1f.6 eth1: registered PHC clock Apr 02 17:09:02 localhost.localdomain kernel: e1000e 0000:00:1f.6 eth1: (PCI Express:2.5GT/s:Width x1) 00:1f:bc:0f:37:76 Apr 02 17:09:02 localhost.localdomain kernel: e1000e 0000:00:1f.6 eth1: Intel(R) PRO/1000 Network Connection Apr 02 17:09:02 localhost.localdomain kernel: e1000e 0000:00:1f.6 eth1: MAC: 12, PHY: 12, PBA No: FFFFFF-0FF Apr 02 17:09:03 localhost.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth1 Apr 02 17:09:11 localhost.localdomain kernel: ahci 0000:00:17.0: port does not support device sleep Apr 02 17:09:29 localhost.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: 10/100 speed: disabling TSO
So according to Karol's comment it seems like the issue might be fixed. Rashed, can you confirm that the GPU is operational after runtime resume with the patch I posted?
(In reply to Alexandre Courbot from comment #15) > So according to Karol's comment it seems like the issue might be fixed. > Rashed, can you confirm that the GPU is operational after runtime resume > with the patch I posted? I can confirm the driver no longer hangs on runtime resume, yes. I don't know how to offload to the GPU so I guess I can't say whether it's operational.
Created attachment 122747 [details] new dmesg during hang I decided to keep testing this in case we missed something, and running lshw twice in a row causes it to hang. I'm not sure what's up with it so I've attached the dmesg. It looks pretty similar to before, but that doesn't make sense since Alexandre patched the original problem.
uhh the card seems pretty much messed up after resume, because several things just fail.
I think I'm seeing the same issue here on a Schenker XMG A506 notebook. The NV GPU is the dedicated one. AFAIK all display hardware is connected to the Intel iGPU. Kernel is vanilla 4.6.1. Actually this is the first kernel (and the first time) that I try to use the dedicated GPU. lspci: 01:00.0 VGA compatible controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2) Without runpm=0 even just calling DRI_PRIME=1 glxinfo leaves the system unresponsive shortly afterwards. SysRq still works though. Going to attach the part of the log in a second.
Created attachment 124326 [details] nouveau run on GM170 with default runpm
Created attachment 124332 [details] Schenker XMD A506 acpidump
Tobias, your issue might be different and is caused by an issue with a Windows 10-specific workaround in the firmware for your Clevo-based hardware (notice the AML_INFINITE_LOOP in the PGON method). Workaround for you: add acpi_osi="!Windows 2015" to your cmdline. @Rashed, @Efrem Please attach the uncompressed acpidump output.
http://forums.fedoraforum.org/showthread.php?t=310422 Is this issue related to mine?
@Efrem Your issue occurs with completely different (desktop) hardware. Please file a new bug. (In reply to Rashed Abdel-Tawab from comment #17) > Created attachment 122747 [details] > new dmesg during hang > > I decided to keep testing this in case we missed something, and running lshw > twice in a row causes it to hang. I'm not sure what's up with it so I've > attached the dmesg. It looks pretty similar to before, but that doesn't make > sense since Alexandre patched the original problem. Chris Wilson reported a similar trace for an Acer Aspire VN7-791G. https://lists.freedesktop.org/archives/nouveau/2016-July/025602.html I found ACPI information in the BIOS firmware for your laptop from https://us.msi.com/Laptop/GS60-Ghost-Pro-4K-6th-Gen-GTX-970M.html It could be the same PCI problem that I and Tobias are experiencing, except that our laptops hang in ACPI while Rashed's firmware does not check the PCIe link status in the ACPI firmware. Still digging...
Hi guys. This bug looks similar to my graphics panic, so I am adding the related logs here. When I boot the system, it hangs in a nouveau driver panic and repeatedly prints the panic info, so I have to force the system off. If I add nouveau.runpm=0, the system boots normally without a panic. My graphics card: NVIDIA GM107. The panic info is related to PCI operations. --------------------------------------------------- [ 21.303467] Bluetooth: RFCOMM socket layer initialized [ 21.303471] Bluetooth: RFCOMM ver 1.11 [ 22.095603] nouveau 0000:01:00.0: DRM: evicting buffers... [ 22.095605] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle... [ 22.095622] nouveau 0000:01:00.0: DRM: suspending client object trees... [ 22.098407] nouveau 0000:01:00.0: DRM: suspending kernel object tree... [ 22.220120] FAT-fs (sdb1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck. [ 23.669052] Non-volatile memory driver v1.3 [ 24.083662] pci_raw_set_power_state: 48 callbacks suppressed [ 24.083665] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 24.159560] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 24.179571] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 24.179575] nouveau 0000:01:00.0: DRM: resuming kernel object tree... [ 24.179617] nouveau 0000:01:00.0: pci: failed to adjust cap speed [ 24.179618] nouveau 0000:01:00.0: pci: failed to adjust lnkctl speed [ 24.317867] wlp2s0: authenticate with f0:b4:29:87:9e:e4 --------------------------------------------------- and the frequent panic is: Jul 21 18:38:50 localhost kernel: [<ffffffffa03c463b>] ? 
nv04_timer_read+0x2b/0x70 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa03c41af>] nvkm_timer_read+0xf/0x20 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa03bc598>] nvkm_pmu_init+0x58/0x480 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa0374e8e>] nvkm_subdev_init+0xee/0x230 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa03c84df>] nvkm_device_init+0x18f/0x280 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa03cc138>] nvkm_udevice_init+0x48/0x60 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa03736c0>] nvkm_object_init+0x50/0x1c0 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa0373705>] nvkm_object_init+0x95/0x1c0 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa037062e>] nvkm_client_init+0xe/0x10 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa04125fe>] nvkm_client_resume+0xe/0x10 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa036f7b4>] nvif_client_resume+0x14/0x20 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa040fbed>] nouveau_do_resume+0x4d/0x130 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffffa041000c>] nouveau_pmops_runtime_resume+0x7c/0x120 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffff81350c8b>] pci_pm_runtime_resume+0x7b/0xc0 Jul 21 18:38:50 localhost kernel: [<ffffffff81350c10>] ? pci_restore_standard_config+0x40/0x40 Jul 21 18:38:50 localhost kernel: [<ffffffff8142ddb6>] __rpm_callback+0x36/0xc0 Jul 21 18:38:50 localhost kernel: [<ffffffff8142de64>] rpm_callback+0x24/0x80 Jul 21 18:38:50 localhost kernel: [<ffffffff8142ee39>] rpm_resume+0x4e9/0x670 Jul 21 18:38:50 localhost kernel: [<ffffffff812a7ffb>] ? cred_has_capability+0x6b/0x120 Jul 21 18:38:50 localhost kernel: [<ffffffff8142f00f>] __pm_runtime_resume+0x4f/0x80 Jul 21 18:38:50 localhost kernel: [<ffffffffa04107bb>] nouveau_drm_open+0x3b/0x1b0 [nouveau] Jul 21 18:38:50 localhost kernel: [<ffffffff812a80de>] ? selinux_capable+0x2e/0x40 Jul 21 18:38:50 localhost kernel: [<ffffffff812a1cc8>] ? 
security_capable+0x18/0x20 Jul 21 18:38:50 localhost kernel: [<ffffffffa008add6>] drm_open+0x1f6/0x470 [drm] Jul 21 18:38:50 localhost kernel: [<ffffffffa0091759>] drm_stub_open+0xa9/0x120 [drm] Jul 21 18:38:50 localhost kernel: [<ffffffff811fc4b1>] chrdev_open+0xa1/0x1e0 Jul 21 18:38:50 localhost kernel: [<ffffffff811f5567>] do_dentry_open+0x1a7/0x2e0 Jul 21 18:38:50 localhost kernel: [<ffffffff812a21ac>] ? security_inode_permission+0x1c/0x30 Jul 21 18:38:50 localhost kernel: [<ffffffff811fc410>] ? cdev_put+0x30/0x30 Jul 21 18:38:50 localhost kernel: [<ffffffff811f573d>] vfs_open+0x5d/0xd0 Jul 21 18:38:50 localhost kernel: [<ffffffff81203058>] ? may_open+0x68/0x110 Jul 21 18:38:50 localhost kernel: [<ffffffff81206b3d>] do_last+0x1ed/0x12a0 Jul 21 18:38:50 localhost kernel: [<ffffffff811d9c96>] ? kmem_cache_alloc_trace+0x1d6/0x200 Jul 21 18:38:50 localhost kernel: [<ffffffff81207cb2>] path_openat+0xc2/0x490 Jul 21 18:38:50 localhost kernel: [<ffffffff8120947b>] do_filp_open+0x4b/0xb0 Jul 21 18:38:50 localhost kernel: [<ffffffff812160a7>] ? __alloc_fd+0xa7/0x130 Jul 21 18:38:50 localhost kernel: [<ffffffff811f6a93>] do_sys_open+0xf3/0x1f0 Jul 21 18:38:50 localhost kernel: [<ffffffff811f6bae>] SyS_open+0x1e/0x20 Jul 21 18:38:50 localhost kernel: [<ffffffff816956c9>] system_call_fastpath+0x16/0x1b
I too have an Optimus laptop with a GM107 / Intel graphics card, and I have not been able to use the nouveau driver since I bought the laptop (using kernels 4.3.5 up to 4.8.10 on Fedora 23-25 x64). I have searched for multiple solutions so far, and always ended up blacklisting nouveau in order to be able to use the laptop. I was not even able to launch lspci without rendering my system unresponsive (the process then takes 100% CPU and I get CPU soft lockups). I've tried nouveau.runpm=0 and well, I can at least boot without having nouveau totally disabled :) When not using runpm=0, I get the following when launching lspci or DRI_PRIME=1 glxgears: [ 1470.121123] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 1470.193738] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 1470.205762] nouveau 0000:01:00.0: Refused to change power state, currently in D3 [ 1470.205766] nouveau 0000:01:00.0: DRM: resuming kernel object tree... [ 1470.205792] nouveau 0000:01:00.0: pci: failed to adjust cap speed [ 1470.205793] nouveau 0000:01:00.0: pci: failed to adjust lnkctl speed So I guess the nouveau driver cannot put my NVIDIA card to sleep or wake it up for whatever reason. I am actually trying to rely on the discrete graphics in order to reduce power consumption by default. I noticed that I can't use vgaswitcheroo to choose my primary graphics card. I also noticed the following error messages on boot. 
[ 1.795356] [drm] Initialized drm 1.1.0 20060810 [ 1.881362] ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160422/nsarguments-95) [ 1.881428] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160422/nsarguments-95) [ 1.881552] ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20160422/nsarguments-95) [ 1.881753] VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP handle [ 1.881753] nouveau: detected PR support, will not use DSM So far, when I switch graphics cards, I get the following in dmesg: [ 1174.921195] nouveau 0000:01:00.0: DRM: resuming kernel object tree... [ 1174.921209] nouveau 0000:01:00.0: DRM: resuming client object trees... [ 1174.921226] nouveau 0000:01:00.0: fifo: BIND_ERROR 01 [BIND_NOT_UNBOUND] [ 1317.637168] vga_switcheroo: client 0 refused switch I guess that the nouveau driver does not allow disabling the NVIDIA card when runpm=0 is passed as an argument? Is there anything I could try, please?
Orsiris What laptop model do you have? It sounds like you are affected by https://bugzilla.kernel.org/show_bug.cgi?id=156341 See also https://github.com/Bumblebee-Project/Bumblebee/issues/764#issuecomment-234494238
Can you try adding pcie_port_pm=off to your boot flags?
@Peter I have an MSI GE62 6QD (Skylake i7-6700HQ), which is on the list in both links you gave me. Using acpi_osi=! acpi_osi="Windows 2009", I can launch lspci, but the system freezes a minute or so later with [ 92.249957] nouveau 0000:01:00.0: DRM: resuming kernel object tree... [ 92.337917] nouveau 0000:01:00.0: DRM: resuming client object trees... [ 98.017700] nouveau 0000:01:00.0: DRM: evicting buffers... [ 98.017704] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle... [ 98.017726] nouveau 0000:01:00.0: DRM: suspending client object trees... [ 98.020446] nouveau 0000:01:00.0: DRM: suspending kernel object tree... [ 98.617684] nouveau 0000:01:00.0: priv: HUB0: 10ecc0 ffffffff (1e40822c) followed by a CPU lockup. Is there any debug info I can provide? @Pablo I have tried pcie_port_pm=off alone or together with the acpi_osi boot flags, but it doesn't seem to make any difference.
Hi, I have the same problems with my new MSI GP62 6QF laptop. It has a GeForce GTX 960M video card. I installed Fedora 25 on it. The only way to log in to the system (GNOME) was to disable nouveau power management with the GRUB option nouveau.runpm=0. I did manage to log in to the system with the option: acpi_osi=! "acpi_os=Windows 2013" In /sys/kernel/debug/vgaswitcheroo/switch I see the following: 0:IGD:+:Pwr:0000:00:02.0 1:DIS: :DynOff:0000:01:00.0 But it looks like it is not working properly: I see the red light on the power button, meaning the other card is active as well. I also tried the options i915.modeset=-1 nouveau.runpm=-1; occasionally I can log in, but then the system freezes completely. Is this a kernel bug or a nouveau bug? Thanks.
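The vgaswitcheroo `switch` contents quoted above pack several fields into each entry: an index, the client type (`IGD` or `DIS`), a `+` marking the active client, the power state, and the PCI address. As a rough aid for reading such output, here is a minimal parsing sketch. It assumes one client per line in the real file (the two entries above appear to have been joined onto one line when the comment was posted), and the names `SwitcherooClient` and `parse_switch` are hypothetical.

```python
from typing import List, NamedTuple


class SwitcherooClient(NamedTuple):
    index: int
    kind: str         # "IGD" or "DIS", as shown in the thread
    active: bool      # True when the third field is "+"
    power: str        # e.g. "Pwr" or "DynOff", as shown in the thread
    pci_address: str  # e.g. "0000:01:00.0"


def parse_switch(text: str) -> List[SwitcherooClient]:
    """Parse the text of the vgaswitcheroo `switch` debugfs file."""
    clients = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The PCI address itself contains colons, so split only four times.
        index, kind, active, power, address = line.split(":", 4)
        clients.append(
            SwitcherooClient(int(index), kind, active == "+", power, address.strip())
        )
    return clients
```

For the output in this comment, the second entry would come back with `power == "DynOff"`, which suggests the driver considers the discrete GPU powered down even though the laptop's indicator light suggests otherwise.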
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/257.