Bug 99482

Summary: nouveau driver returns -16 trying to hibernate
Product: xorg Reporter: Brian J. Murrell <brian>
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED QA Contact: Xorg Project Team <xorg-team>
Severity: blocker    
Priority: medium CC: dr_bugzilla
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:

Description Brian J. Murrell 2017-01-21 15:31:58 UTC
I am running Fedora 25 with kernel 4.8.15-300.fc25.x86_64.  I don't actively use the nouveau driver: my machine has hybrid Intel/Nvidia graphics, so I use the Intel driver for X.  Even so, the nouveau driver gets loaded (with 2 references) unless I blacklist it.
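
One way to check whether nouveau is loaded, and with how many references, is to read /proc/modules, where each line starts with the module name, its size, and its reference count. The following is a minimal sketch, not something taken from this report: the helper name parse_module_line and the sample lines are illustrative assumptions, and on a real system the text would come from open("/proc/modules").read().

```python
from typing import List, NamedTuple, Optional


class ModuleInfo(NamedTuple):
    name: str
    size: int
    refcount: Optional[int]  # None if the kernel reports "-" for this field
    used_by: List[str]


def parse_module_line(text: str, wanted: str) -> Optional[ModuleInfo]:
    """Find `wanted` in /proc/modules-style text (hypothetical helper)."""
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: name size refcount dependents state address
        if len(fields) < 4 or fields[0] != wanted:
            continue
        refcount = int(fields[2]) if fields[2].isdigit() else None
        # Dependents are comma-separated with a trailing comma, or "-" if none.
        used_by = [d for d in fields[3].split(",") if d and d != "-"]
        return ModuleInfo(fields[0], int(fields[1]), refcount, used_by)
    return None


# Illustrative sample, NOT copied from the reporter's machine.
SAMPLE = (
    "nouveau 1622016 2 - Live 0x0000000000000000\n"
    "ttm 98304 1 nouveau, Live 0x0000000000000000\n"
)

info = parse_module_line(SAMPLE, "nouveau")
print(info)
```

A result of None would mean the module is not loaded, which is the state the blacklist is meant to produce.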

Given the above, hibernating typically works once, but a second attempt (after the machine has been woken up, of course) fails and the kernel reports:

Jan 20 17:42:15 laptop kernel: PM: Syncing filesystems ... done.
Jan 20 17:42:15 laptop kernel: Freezing user space processes ... (elapsed 0.264 seconds) done.
Jan 20 17:42:15 laptop kernel: Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
Jan 20 17:42:15 laptop kernel: PM: Marking nosave pages: [mem 0x00000000-0x00000fff]
Jan 20 17:42:15 laptop kernel: PM: Marking nosave pages: [mem 0x0009d000-0x000fffff]
Jan 20 17:42:15 laptop kernel: PM: Marking nosave pages: [mem 0x48b7f000-0x49ffefff]
Jan 20 17:42:15 laptop kernel: PM: Marking nosave pages: [mem 0x4a000000-0xffffffff]
Jan 20 17:42:15 laptop kernel: PM: Basic memory bitmaps created
Jan 20 17:42:15 laptop kernel: PM: Preallocating image memory... done (allocated 1487391 pages)
Jan 20 17:42:15 laptop kernel: PM: Allocated 5949564 kbytes in 0.34 seconds (17498.71 MB/s)
Jan 20 17:42:15 laptop kernel: Freezing remaining freezable tasks ... (elapsed 0.216 seconds) done.
Jan 20 17:42:15 laptop kernel: Suspending console(s) (use no_console_suspend to debug)
Jan 20 17:42:15 laptop kernel: tpm tpm0: TPM savestate took 3100ms
Jan 20 17:42:15 laptop kernel: parport_pc 00:04: disabled
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: suspending console...
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: suspending display...
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: evicting buffers...
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: failed to idle channel 0 [DRM]
Jan 20 17:42:15 laptop kernel: nouveau 0000:01:00.0: DRM: resuming display...
Jan 20 17:42:15 laptop kernel: pci_pm_freeze(): nouveau_pmops_freeze+0x0/0x20 [nouveau] returns -16
Jan 20 17:42:15 laptop kernel: dpm_run_callback(): pci_pm_freeze+0x0/0xf0 returns -16
Jan 20 17:42:15 laptop kernel: PM: Device 0000:01:00.0 failed to freeze async: error -16
Jan 20 17:42:15 laptop kernel: usb usb1: root hub lost power or was reset
Jan 20 17:42:15 laptop kernel: usb usb3: root hub lost power or was reset
Jan 20 17:42:15 laptop kernel: usb usb4: root hub lost power or was reset
Jan 20 17:42:15 laptop kernel: usb usb2: root hub lost power or was reset
Jan 20 17:42:15 laptop kernel: ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
Jan 20 17:42:15 laptop kernel: ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
Jan 20 17:42:15 laptop kernel: sd 0:0:0:0: [sda] Starting disk
Jan 20 17:42:15 laptop kernel: pciehp 0000:3c:03.0:pcie204: Device 0000:5f:00.0 already exists at 0000:5f:00, cannot hot-add
Jan 20 17:42:15 laptop kernel: pciehp 0000:3c:03.0:pcie204: Cannot add device at 0000:5f:00
Jan 20 17:42:15 laptop kernel: usb 1-1: reset high-speed USB device number 3 using ehci-pci
Jan 20 17:42:15 laptop kernel: usb 2-1: reset high-speed USB device number 2 using ehci-pci
Jan 20 17:42:15 laptop kernel: ata3: SATA link down (SStatus 0 SControl 300)
Jan 20 17:42:15 laptop kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 20 17:42:15 laptop kernel: ata1.00: supports DRM functions and may not be fully accessible
Jan 20 17:42:15 laptop kernel: ata4: SATA link down (SStatus 0 SControl 300)
Jan 20 17:42:15 laptop kernel: ata1.00: supports DRM functions and may not be fully accessible
Jan 20 17:42:15 laptop kernel: ata1.00: configured for UDMA/133
Jan 20 17:42:15 laptop kernel: usb 3-7: reset high-speed USB device number 3 using xhci_hcd
Jan 20 17:42:15 laptop kernel: usb 3-5: reset full-speed USB device number 2 using xhci_hcd
Jan 20 17:42:15 laptop kernel: usb 3-12: reset full-speed USB device number 4 using xhci_hcd
Jan 20 17:42:15 laptop kernel: rtc_cmos 00:03: System wakeup disabled by ACPI
Jan 20 17:42:15 laptop kernel: parport_pc 00:04: activated
Jan 20 17:42:15 laptop kernel: PM: restore of devices complete after 1370.000 msecs

As you can see, the fatal part is:

Jan 20 17:42:15 laptop kernel: pci_pm_freeze(): nouveau_pmops_freeze+0x0/0x20 [nouveau] returns -16
Jan 20 17:42:15 laptop kernel: dpm_run_callback(): pci_pm_freeze+0x0/0xf0 returns -16
Jan 20 17:42:15 laptop kernel: PM: Device 0000:01:00.0 failed to freeze async: error -16

If I blacklist the nouveau driver completely, so that it is never loaded into the kernel, I can hibernate and resume over and over again with no problem.
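
For readers unfamiliar with the return value: the kernel reports errors as negative errno numbers, and on Linux errno 16 is EBUSY ("Device or resource busy"). That fits the "failed to idle channel" message. The sketch below pulls the failing device and error code out of a "failed to freeze" line and maps the code to its symbolic name. The regular expression and the function name find_freeze_failures are assumptions made for illustration; they are matched only against the line quoted in this report, not against every kernel version's wording.

```python
import errno
import re

# Pattern written against the log line quoted in this report (an assumption,
# not an official kernel log format specification).
FREEZE_FAILURE = re.compile(
    r"PM: Device (?P<device>\S+) failed to freeze(?: async)?: "
    r"error (?P<code>-?\d+)"
)


def find_freeze_failures(log_text):
    """Return (device, errno_name) pairs for each freeze failure found."""
    failures = []
    for match in FREEZE_FAILURE.finditer(log_text):
        # The kernel logs -errno; take the absolute value to look it up.
        code = abs(int(match.group("code")))
        name = errno.errorcode.get(code, "unknown")
        failures.append((match.group("device"), name))
    return failures


LOG = (
    "Jan 20 17:42:15 laptop kernel: PM: Device 0000:01:00.0 "
    "failed to freeze async: error -16"
)

failures = find_freeze_failures(LOG)
print(failures)
```

The symbolic names come from Python's errno table for the platform the script runs on, so the lookup should be run on Linux to match the kernel's numbering.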
Comment 1 dr_bugzilla 2017-05-27 18:04:47 UTC
I confirm this bug on Xubuntu 16.04 Xenial, when using a new mainline kernel from PPA (4.11.3-041103-generic) rather than the stock kernel. My laptop has hybrid Intel/Nvidia graphics. Any attempts to hibernate (in my case, using uswsusp's s2disk) fail, with control being returned to the GUI. In the kernel log, I see the same messages as Brian does, specifically:

May 25 17:18:13 *** kernel: [ 3356.651761] nouveau 0000:01:00.0: DRM: evicting buffers...
May 25 17:18:13 *** kernel: [ 3356.651765] nouveau 0000:01:00.0: DRM: waiting for kernel channels to go idle...
May 25 17:18:13 *** kernel: [ 3371.651076] nouveau 0000:01:00.0: DRM: failed to idle channel 0 [DRM]
May 25 17:18:13 *** kernel: [ 3371.651132] pci_pm_freeze(): nouveau_pmops_freeze+0x0/0x20 [nouveau] returns -16
May 25 17:18:13 *** kernel: [ 3371.651134] dpm_run_callback(): pci_pm_freeze+0x0/0xf0 returns -16
May 25 17:18:13 *** kernel: [ 3371.651136] PM: Device 0000:01:00.0 failed to freeze async: error -16

With the older stock kernel (4.4.0-78-generic), I think the underlying cause of this bug was already having an effect, even though I did not see these kernel messages. I say this because hibernation with the older kernel would fail about 25% of the time, hanging the system completely (a hard power-off was required and all work was lost). I now believe this happened because the kernel was unable to idle nouveau but continued the attempt to hibernate anyway.
Also, even when hibernation succeeded, my laptop would not actually power off with 'platform' written to /sys/power/disk; I would either have to hold the power button down or set /sys/power/disk to 'shutdown' instead. Again, I think this was happening because nouveau was not being properly idled.
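
For reference, reading /sys/power/disk lists the available hibernation modes, with the currently selected one in square brackets. A small sketch for inspecting that setting follows; the helper name parse_power_disk and the sample string are illustrative assumptions, and on a real system the text would be read from /sys/power/disk.

```python
def parse_power_disk(text):
    """Split /sys/power/disk content into (selected_mode, available_modes)."""
    selected = None
    available = []
    for token in text.split():
        # The active mode is wrapped in square brackets, e.g. "[platform]".
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return selected, available


# Illustrative sample; the set of modes offered varies by system.
SAMPLE = "[platform] shutdown reboot suspend test_resume\n"

selected, available = parse_power_disk(SAMPLE)
print(selected, available)
```

Switching to the 'shutdown' workaround described above means writing that word to the same file as root.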

When I blacklist nouveau and use only Intel graphics (i915), hibernation always works (including powering off the laptop automatically).

My nouveau is stock Xenial (version 1.0.12-1build2 of package xserver-xorg-video-nouveau).
Comment 2 dr_bugzilla 2017-05-27 18:34:48 UTC
Quick followup:

Bug #99889 "nouveau preventing shutdown after suspend-resume" seems to be related to this one:

https://bugs.freedesktop.org/show_bug.cgi?id=99889

See also this thread on the Nouveau mailing list:

https://lists.freedesktop.org/archives/nouveau/2017-February/027316.html
Comment 3 Martin Peres 2019-12-04 09:23:28 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/319.
