Bug 70213 - [git-v3.12-rc3 + nouveau HEAD] Vmalloc failure -> pci_pm_freeze(): nouveau_pmops_freeze+0x0/0x50 returns -12
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2013-10-07 05:27 UTC by Ronald
Modified: 2014-02-17 06:50 UTC (History)

Attachments
Dmesg containing failed s2disk output. (27.85 KB, text/plain)
2013-10-07 05:27 UTC, Ronald

Description Ronald 2013-10-07 05:27:25 UTC
Created attachment 87219
Dmesg containing failed s2disk output.

I tried to run s2disk under high memory pressure. I have attached a dmesg with the noise from my firewall and wireless card removed. Because of those messages the log had wrapped, so it starts exactly at the point where I ran 's2disk'.
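
For readers less familiar with kernel return values: the -12 in the bug title is a negated errno value, and on Linux errno 12 is ENOMEM, which fits the vmalloc failure. Below is a minimal user-space sketch of that mapping. The helper names `describe_kernel_ret` and `print_kernel_ret` are hypothetical and exist only for this illustration; they are not kernel APIs.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a kernel API): kernel callbacks report failure
 * as a negated errno value, so a return of -12 means errno 12. */
const char *describe_kernel_ret(int ret)
{
	if (ret >= 0)
		return "success";
	return strerror(-ret); /* look up the text for the positive errno */
}

/* Print a return value together with its description. */
void print_kernel_ret(int ret)
{
	printf("ret=%d: %s\n", ret, describe_kernel_ret(ret));
}
```

Calling `print_kernel_ret(-12)` on a Linux system prints the libc description of ENOMEM, the out-of-memory error.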

This is the card:

[    0.355731] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x0d90a0a1
[    0.355734] nouveau  [  DEVICE][0000:01:00.0] Chipset: GF119 (NVD9)
[    0.355737] nouveau  [  DEVICE][0000:01:00.0] Family : NVD0
[    0.356330] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[    0.429669] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[    0.429671] nouveau  [   VBIOS][0000:01:00.0] using image from PRAMIN
[    0.429874] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[    0.429878] nouveau  [   VBIOS][0000:01:00.0] version 75.19.55.00.02
[    0.430371] nouveau 0000:01:00.0: irq 44 for MSI/MSI-X
[    0.430381] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[    0.430413] nouveau  [     PFB][0000:01:00.0] RAM type: DDR3
[    0.430416] nouveau  [     PFB][0000:01:00.0] RAM size: 1024 MiB
[    0.430418] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 0 tags
[    0.453371] nouveau  [  PTHERM][0000:01:00.0] FAN control: PWM
[    0.453393] nouveau  [  PTHERM][0000:01:00.0] fan management: disabled
[    0.453398] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[    0.455773] [TTM] Zone  kernel: Available graphics memory: 1025304 kiB
[    0.455775] [TTM] Initializing pool allocator
[    0.455782] [TTM] Initializing DMA pool allocator
[    0.455794] nouveau  [     DRM] VRAM: 1024 MiB
[    0.455797] nouveau  [     DRM] GART: 1048576 MiB
[    0.455802] nouveau  [     DRM] TMDS table version 2.0
[    0.455804] nouveau  [     DRM] DCB version 4.0
[    0.455807] nouveau  [     DRM] DCB outp 00: 02000300 00000000
[    0.455810] nouveau  [     DRM] DCB outp 01: 01000302 00020030
[    0.455812] nouveau  [     DRM] DCB outp 02: 02011362 00020010
[    0.455815] nouveau  [     DRM] DCB outp 03: 04022310 00000000
[    0.455817] nouveau  [     DRM] DCB conn 00: 00001030
[    0.455820] nouveau  [     DRM] DCB conn 01: 00002161
[    0.455822] nouveau  [     DRM] DCB conn 02: 00000200
[    0.456951] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[    0.456953] [drm] No driver support for vblank timestamp query.
[    0.457177] nouveau  [     DRM] 2 available performance level(s)
[    0.457181] nouveau  [     DRM] 1: core 270MHz shader 540MHz memory 405MHz voltage 900mV
[    0.457185] nouveau  [     DRM] 3: core 810MHz shader 1620MHz memory 500MHz voltage 1110mV
[    0.457189] nouveau  [     DRM] c: core 270MHz shader 540MHz memory 405MHz voltage 900mV fanspeed 40%
[    0.461446] nouveau  [     DRM] MM: using COPY0 for buffer copies
[    0.537183] nouveau  [     DRM] allocated 1280x1024 fb: 0x60000, bo ffff88007cc94000
[    0.537284] fbcon: nouveaufb (fb0) is primary device
[    0.605604] Console: switching to colour frame buffer device 160x64
[    0.607900] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    0.607902] nouveau 0000:01:00.0: registered panic notifier
[    0.607908] [drm] Initialized nouveau 1.1.1 20120801 for 0000:01:00.0 on minor 0
Comment 1 Ronald 2013-10-07 05:28:36 UTC
Btw, card fails to come back properly:
- Cursor gone
- TTYs show a black screen (maybe the process was killed, I don't know)
Comment 2 Ilia Mirkin 2014-01-23 07:49:05 UTC
I believe that this patch should prevent the negative side effects of the suspend failing: http://lists.freedesktop.org/archives/nouveau/2014-January/015812.html
Comment 3 Ronald 2014-01-23 11:11:53 UTC
This bug is tough and cumbersome to reproduce. Is the patch going upstream anyway?
Comment 4 Ilia Mirkin 2014-01-23 18:44:01 UTC
(In reply to comment #3)
> This bug is tough and cumbersome to reproduce.

No kidding. However, you could just add a "return false;" in nv84_fence_suspend, which should have an identical effect.

> Is the patch going upstream anyway?

Not up to me, but I would assume so.
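
To illustrate why forcing "return false;" should mimic the allocation failure, here is a simplified user-space model. It is only a sketch: every type and function name in it is a hypothetical stand-in rather than the real nouveau code, and it assumes the caller turns a failed fence suspend into -ENOMEM, which is consistent with the -12 in the bug title.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model; these names are hypothetical stand-ins,
 * not the real nouveau structures. */
struct model_fence {
	bool (*suspend)(void); /* returns false if state could not be saved */
};

/* Stand-in for the suggested experiment: always report failure, as if
 * the vmalloc() of the save buffer had returned NULL. */
bool model_fence_suspend_forced_fail(void)
{
	return false;
}

/* Stand-in for the normal case where the buffer was allocated. */
bool model_fence_suspend_ok(void)
{
	return true;
}

/* Stand-in for the freeze path (assumption): a failed fence suspend
 * becomes -ENOMEM, which aborts the suspend. */
int model_freeze(const struct model_fence *fence)
{
	if (fence->suspend && !fence->suspend())
		return -ENOMEM;
	return 0;
}
```

With the forced-failure hook installed, `model_freeze()` returns -ENOMEM whether or not memory was actually available, which is what makes the experiment a cheap substitute for reproducing real memory pressure.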
Comment 5 Ronald 2014-01-24 13:58:01 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > This bug is tough and cumbersome to reproduce.
> 
> No kidding. However you could just add a "return false;" in
> nv84_fence_suspend, which should have an identical effect.
> 
> > Is the patch going upstream anyway?
> 
> Not up to me, but I would assume so.

I will test this out once I upgrade my kernel, which will probably be around 3.14-rc1 :) (if that is okay, of course).
Comment 6 Ronald 2014-02-16 17:17:07 UTC
I tried this:

static bool
nv84_fence_suspend(struct nouveau_drm *drm)
{
	struct nouveau_fifo *pfifo = nouveau_fifo(drm->device);
	struct nv84_fence_priv *priv = drm->fence;
	int i;

	priv->suspend = vmalloc((pfifo->max + 1) * sizeof(u32));
	if (priv->suspend) {
		for (i = 0; i <= pfifo->max; i++)
			priv->suspend[i] = nouveau_bo_rd32(priv->bo, i*4);
	}

	/*return priv->suspend != NULL;*/
	return false;
}

This did not work: the laptop suspended anyway (which is actually good news, since suspend has started working on this hardware).
Comment 7 Ilia Mirkin 2014-02-17 05:31:38 UTC
(In reply to comment #6)
> I tried this:
> 
> static bool
> nv84_fence_suspend(struct nouveau_drm *drm)
> {
> 	struct nouveau_fifo *pfifo = nouveau_fifo(drm->device);
> 	struct nv84_fence_priv *priv = drm->fence;
> 	int i;
> 
> 	priv->suspend = vmalloc((pfifo->max + 1) * sizeof(u32));
> 	if (priv->suspend) {
> 		for (i = 0; i <= pfifo->max; i++)
> 			priv->suspend[i] = nouveau_bo_rd32(priv->bo, i*4);
> 	}
> 
> 	/*return priv->suspend != NULL;*/
> 	return false;
> }
> 
> This did not work, the laptop suspended (which is actually good news since
> it started working for this hardware).

I find that very hard to believe. The suspend should have been aborted... Was this on the NVD9? Or on your NV4E/whatever else you have? If it was on a pre-nv84 card, there is no suspend function, and so that code path isn't hit.
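
A rough sketch of the point about cards without a suspend function, again as a user-space model with hypothetical names (not the real driver code): if the hook pointer is not set for a chipset, the edited function is never reached and the forced "return false;" cannot abort anything.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of per-chipset fence operations. */
struct sketch_fence_ops {
	bool (*suspend)(void); /* NULL when the chipset has no suspend hook */
};

/* Models the edited nv84-style hook that always reports failure. */
bool sketch_always_fail(void)
{
	return false;
}

/* Returns true if the suspend may proceed. */
bool sketch_can_suspend(const struct sketch_fence_ops *ops)
{
	if (!ops->suspend)
		return true; /* no hook: the edited function is never called */
	return ops->suspend();
}
```

In this model a card whose `suspend` pointer is NULL always proceeds, which would explain a successful suspend on older hardware despite the edit.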
Comment 8 Ronald 2014-02-17 06:50:54 UTC
My apologies. I forgot I reported this issue with my NVD9.

The rig it was in is dead and I have no other place to put it in. I have not decided on what to purchase next. It's hard to decide.

So far, I only have the nv34 and nv4e.

