Bug 46557 - nouveau: NV4E acceleration corruption when DMA above 31-bit (2 Gig barrier)
Summary: nouveau: NV4E acceleration corruption when DMA above 31-bit (2 Gig barrier)
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Duplicates: 54988
Reported: 2012-02-23 22:13 UTC by Salah Coronya
Modified: 2013-03-01 17:53 UTC

Attachments
live dmesg before X was started (72.67 KB, text/plain)
2012-02-23 22:13 UTC, Salah Coronya
Xorg logs (27.61 KB, text/plain)
2012-02-23 22:13 UTC, Salah Coronya
Picture from the framebugger after the nouveau drivers loads with acceleration (56.99 KB, image/png)
2012-05-04 01:30 UTC, Salah Coronya
VBOIS from card (64.00 KB, application/octet-stream)
2012-05-09 15:53 UTC, Salah Coronya
64-bit dmesg (774.83 KB, text/plain)
2012-05-22 09:55 UTC, Salah Coronya
32-bit dmesg (116.05 KB, text/plain)
2012-05-22 09:57 UTC, Salah Coronya
Hugh's dmesg + lspci + /proc/modules (73.92 KB, text/plain)
2012-06-17 06:14 UTC, D. Hugh Redelmeier
Hugh's Xorg.0.log (59.09 KB, text/plain)
2012-06-17 06:16 UTC, D. Hugh Redelmeier
Hugh's mem=2g dmesg + lspci -v + /proc/modules (69.12 KB, text/plain)
2012-06-17 06:18 UTC, D. Hugh Redelmeier
Hugh's mem=2g Xorg.0.log (33.43 KB, text/plain)
2012-06-17 06:19 UTC, D. Hugh Redelmeier
ugly workaround (532 bytes, patch)
2012-06-17 07:16 UTC, Marcin Slusarz
dmesg of new kernel (189.26 KB, text/plain)
2012-10-03 07:47 UTC, Salah Coronya
dmesg|egrep -i 'drm|agp|fb' (60.01 KB, text/plain)
2012-10-29 14:26 UTC, Raphaël Droz
nouveau CALL_SUBR_ACTIVE errors using dma_bits=32 kernel on NV34 (111.62 KB, text/plain)
2012-11-30 17:42 UTC, Raphaël Droz
limit vm size to 31 bits (nv04-nv40,nv45) (632 bytes, patch)
2012-12-28 01:33 UTC, Marcin Slusarz
nouveau CALL_SUBR_ACTIVE errors unpatched 3.7.0 (15.91 KB, text/plain)
2012-12-29 14:35 UTC, Raphaël Droz
netconsole log of a NV34 crash when mem > 3GB (5.30 KB, text/plain)
2013-03-01 17:53 UTC, Raphaël Droz

Description Salah Coronya 2012-02-23 22:13:18 UTC
Created attachment 57574 [details]
live dmesg before X was started

When I start nouveau with 2D acceleration, the framebuffer is unusable (it typically displays the frozen contents of the last shutdown), but the keyboard is responsive. Once X starts, the display is unreadable (anything from a tiled staircase picture to a partially solid grey screen, with no mouse cursor) and the keyboard locks up: there is no response from the Caps/Num/Scroll Lock keys, although it DOES respond to Magic SysRq. The machine itself doesn't seem hard locked, though.

The syslog is spammed with tons of messages from the nouveau driver complaining about PFIFO_CACHE_ERROR, PFIFO_DMA_PUSHER: MEM_FAULT, INVALID_CMD, CALL_SUBR_ACTIVE, etc. The errors vary wildly on each boot: sometimes it's only a few, sometimes tons of them, but the end result is the same.

Tried with the nouveau git tree on freedesktop as of 2/23. Using nouveau.nofbaccel=1 clears up the framebuffer corruption, but the X display corruption and lockups still happen.

Display adapter: 

00:05.0 VGA compatible controller: nVidia Corporation C51 [GeForce 6150 LE] (rev a2) (prog-if 00 [VGA controller])
	Subsystem: Hewlett-Packard Company Device 2a34
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at fb000000 (64-bit, non-prefetchable) [size=16M]
	[virtual] Expansion ROM at c0000000 [disabled] [size=128K]
	Capabilities: [48] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Kernel driver in use: nouveau
	Kernel modules: nouveau
Comment 1 Salah Coronya 2012-02-23 22:13:58 UTC
Created attachment 57575 [details]
Xorg logs
Comment 2 Salah Coronya 2012-05-04 01:30:53 UTC
Created attachment 61012 [details]
Picture from the framebugger after the nouveau drivers loads with acceleration

Using the current (5/2/12) nouveau git: this is a picture of the framebuffer when nouveau loads with acceleration enabled. It displays the shutdown screen from the previous boot and the display is frozen. It stays this way until either the driver is unloaded or the GPU hangs and the fbcon code switches back to software fbcon (dmesg displays "GPU lockup - switching to software fbcon").
Comment 3 Salah Coronya 2012-05-09 15:53:36 UTC
Created attachment 61315 [details]
VBOIS from card

VBIOS dump attached; also sent an mmiotrace via e-mail.
Comment 4 Salah Coronya 2012-05-13 01:07:06 UTC
After experimenting with a few kernels: acceleration works normally in the framebuffer and X on 32-bit (x86) kernels, but not on 64-bit (amd64) kernels.
Comment 5 Salah Coronya 2012-05-22 09:55:47 UTC
Created attachment 61969 [details]
64-bit dmesg

Here's the "bad" 64-bit dmesg from kernel 3.4.0
Comment 6 Salah Coronya 2012-05-22 09:57:19 UTC
Created attachment 61970 [details]
32-bit dmesg

This is the "good" 32-bit dmesg for comparison (both dmesgs were taken with drm.debug=0x06)
Comment 7 Salah Coronya 2012-06-01 22:51:07 UTC
After playing with the kernel command line, I've found the problem does not occur in a 64-bit kernel if mem=2G is added to the kernel command line (the machine in question has 3G of RAM). The higher mem is set above that, the greater the chance of corruption. It's hard to tell exactly where the threshold is because the failure is intermittent: sometimes it works, sometimes it doesn't. (So far, 5 attempts at 2G produced no failures in either X or the framebuffer, whereas one start at 2G + 80M worked but the second did not.)
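
For reference, the 2G threshold coincides with the 31-bit address limit (2^31 bytes = 2 GiB). The arithmetic can be sketched as follows, assuming the device can only reach memory below a given number of address bits; `fits_dma_mask` is a hypothetical helper name used for illustration, not a kernel function:

```python
GIB = 1 << 30
MIB = 1 << 20


def fits_dma_mask(addr: int, bits: int) -> bool:
    """Return True if a physical address is reachable with a `bits`-wide DMA mask."""
    return 0 <= addr < (1 << bits)


# The last byte below 2 GiB is reachable with 31 address bits...
print(fits_dma_mask(2 * GIB - 1, 31))          # True
# ...but an address at 2 GiB + 80 MiB (the mem=2G + 80M case) is not.
print(fits_dma_mask(2 * GIB + 80 * MIB, 31))   # False
# The same address would still fit under a 32-bit mask.
print(fits_dma_mask(2 * GIB + 80 * MIB, 32))   # True
```

This would also be consistent with the intermittent failures: with mem just above 2G, only the pages that happen to land above the boundary would be affected.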
Comment 8 D. Hugh Redelmeier 2012-06-17 06:10:59 UTC
I experience something similar on my notebook.

- AMD Turion CPU; 3GiB RAM; nVidia GeForce Go 6100 video controller

- was working fine with 64 bit Ubuntu 10.04 using nv X video driver

- problems on 64 bit Ubuntu 12.04 with nouveau driver (VESA driver works but does not support native resolution)

Symptoms are various.  Simplest, on fully updated system: LightDM seems to work to allow login but Unity desktop does not show up except for background.

dmesg shows a lot of messages like this (and variants):
 [drm] nouveau 0000:00:05.0: PFIFO_DMA_PUSHER - Ch 3 Get 0x01256038 Put 0x011c60b0 State 0x4ffe0004 (err: INVALID_MTHD) Push 0x00000000
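
Since these messages vary from boot to boot, it can help to pull the fields out of each line before comparing logs. A minimal sketch, assuming every PFIFO_DMA_PUSHER line follows the layout of the one quoted above; the regular expression and the `parse_pusher_error` name are illustrative and not part of nouveau:

```python
import re

# Pattern written against the single sample line above; other kernel
# versions may format the message differently.
PUSHER_RE = re.compile(
    r"PFIFO_DMA_PUSHER - Ch (?P<ch>\d+) "
    r"Get (?P<get>0x[0-9a-fA-F]+) Put (?P<put>0x[0-9a-fA-F]+) "
    r"State (?P<state>0x[0-9a-fA-F]+) \(err: (?P<err>\w+)\) "
    r"Push (?P<push>0x[0-9a-fA-F]+)"
)


def parse_pusher_error(line):
    """Extract channel, pointers and error name from one dmesg line, or None."""
    m = PUSHER_RE.search(line)
    if m is None:
        return None
    return {
        "channel": int(m["ch"]),
        "get": int(m["get"], 16),
        "put": int(m["put"], 16),
        "state": int(m["state"], 16),
        "err": m["err"],
        "push": int(m["push"], 16),
    }


sample = (
    "[drm] nouveau 0000:00:05.0: PFIFO_DMA_PUSHER - Ch 3 Get 0x01256038 "
    "Put 0x011c60b0 State 0x4ffe0004 (err: INVALID_MTHD) Push 0x00000000"
)
info = parse_pusher_error(sample)
print(info["channel"], info["err"], hex(info["get"]))  # 3 INVALID_MTHD 0x1256038
```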

/var/log/Xorg.0.log shows lots of mayhem after this message:
 [mi] EQ overflowing.  Additional events will be discarded until existing events are processed.

Adding mem=2g to kernel seems to fix the problem!

Thanks Salah for this discovery!  Thanks xexaxo on #nouveau for recognizing my problem and pointing me here!

I will attach logs.
Comment 9 D. Hugh Redelmeier 2012-06-17 06:14:37 UTC
Created attachment 63133 [details]
Hugh's dmesg + lspci + /proc/modules
Comment 10 D. Hugh Redelmeier 2012-06-17 06:16:18 UTC
Created attachment 63134 [details]
Hugh's Xorg.0.log
Comment 11 D. Hugh Redelmeier 2012-06-17 06:18:58 UTC
Created attachment 63135 [details]
Hugh's mem=2g dmesg + lspci -v + /proc/modules
Comment 12 D. Hugh Redelmeier 2012-06-17 06:19:47 UTC
Created attachment 63136 [details]
Hugh's mem=2g Xorg.0.log
Comment 13 Marcin Slusarz 2012-06-17 07:16:35 UTC
Created attachment 63137 [details] [review]
ugly workaround

Does this patch help too? It's still a workaround, but it doesn't lose memory above 2GB.
Comment 14 D. Hugh Redelmeier 2012-06-17 08:03:12 UTC
Marcin:

Thanks for the proposed patch.

I'm in "dumb Ubuntu user mode".  Would testing your patch be valuable to the cause, valuable enough for me to learn how to rebuild Ubuntu kernels with patches?  (I've rebuilt CentOS kernels and long ago built kernel.org kernels, but not Debian or Ubuntu kernels.)

Is there a good test for "broken PCI/AGP" hardware?  I take it that there is a lot broken at the 4G boundary but you suspect mine is broken at the 2G boundary.

Possibly relevant factoid: the notebook is specced to accept 4G of RAM but won't with this BIOS (the latest).  It will accept 4G with an older BIOS.  The manufacturer (Acer) does not accept that this is a defect.

I had guessed (based on no evidence) that there was a sign-extension bug in the nouveau code. That guess was based on the apparent fact that a 32-bit kernel worked.
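
To make the guess concrete: a sign-extension bug would only affect 32-bit values with bit 31 set, i.e. exactly the addresses at or above 2G. The sketch below only illustrates that hypothesis; nothing in this report confirms that nouveau contains such a bug, and `sign_extend_32_to_64` is a made-up helper name:

```python
def sign_extend_32_to_64(value: int) -> int:
    """Widen a 32-bit value to 64 bits as if it were a signed integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:             # bit 31 set: treated as negative
        value |= 0xFFFFFFFF00000000    # upper 32 bits get filled with ones
    return value


below_2g = 0x7FFFF000   # just under the 2G boundary
above_2g = 0x80001000   # just over the 2G boundary

print(hex(sign_extend_32_to_64(below_2g)))  # 0x7ffff000 (unchanged)
print(hex(sign_extend_32_to_64(above_2g)))  # 0xffffffff80001000 (bogus address)
```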

How do you distinguish hardware vs software bug?  Your patch should bypass either.

I would have thought that a kernel parameter to set MAX_DMA32_PFN might be useful.
Comment 15 D. Hugh Redelmeier 2012-06-17 09:02:34 UTC
Thinking some more.  Some inconclusive evidence that this is not a hardware bug: 64-bit Ubuntu worked fine on it with 3G of RAM.  So too does MS Windows Vista.  This surely included DMAing into the high 1G by disk I/O, DVD writer I/O, and video driver I/O.

What's new is nouveau.  Perhaps it uses part of the video controller that nv and Vista do not, part that does defective DMA, but that isn't obvious to me.

(Note: I'm using "DMA" in the computer architecture sense, not the IBM PC clone sense.)
Comment 16 D. Hugh Redelmeier 2012-06-17 10:52:34 UTC
Sorry, I said my notebook had 3G of RAM.  I misremembered.  It has 2.25G.
Comment 17 Salah Coronya 2012-06-17 13:12:54 UTC
The above "workaround" works. Framebuffer is OK, X is good, glxgears runs without having to specify mem=2G (this is against the nouveau git).

I think it is buggy hardware; it's just that the blob and the Windows driver know about it and only do 31-bit DMA (or maybe they just get lucky). Attempting to set dma_bits=31 in nouveau_vram_init causes nouveau_sgdma_init to fail to map the page, and attempting to allocate a suitable page using pci_alloc_consistent / dma_alloc_coherent, or alloc_page with the GFP_DMA flag, causes BUGs and paging faults.

If I specify both nouveau.vram_pushbuf=1 AND nouveau.vram_notify=1 (just one alone does not work), it "sort of" works without mem=2G. The framebuffer is OK. X is still distorted, but not as badly, and isn't locked up, and the dmesg is no longer filled with errors, but glxgears does not work (it doesn't crash but just shows crazy flashing triangles).
Comment 18 D. Hugh Redelmeier 2012-06-17 14:26:03 UTC
Salah:

If the limitation were in the hardware, why would the kernel arch (32 bit vs 64 bit) make a difference?  By the time that addresses get to the PCI bus, the architecture should make no difference.
Comment 19 Marcin Slusarz 2012-09-16 19:45:48 UTC
*** Bug 54988 has been marked as a duplicate of this bug. ***
Comment 20 Marcin Slusarz 2012-10-02 16:24:06 UTC
I talked to Ben about this bug at XDC2012, and he told me we are using the nv04-style virtual memory interface because of some then-unknown bugs in the nv4x implementation; this is probably the reason why you are seeing this bug.

Since XDC, Ben fixed and enabled nv4x-style virtual memory in nouveau.git, so please test it!
Comment 21 Salah Coronya 2012-10-03 07:47:46 UTC
Created attachment 68024 [details]
dmesg of new kernel

It runs FAR better: the distortion and lockups are gone, and the picture is substantially better (on par with the blob).

It's not quite 100%: X crashes reliably if I switch to another VT, back to X, and then use something with video acceleration. (After it crashes and restarts, it's distorted like before, but not locked up. If I apply the workaround in bug 31961, X still crashes but restarts without distortion.)

Regardless of console switching, CACHE_ERRORs start flooding the syslog, though not immediately, but there are no noticeable artifacts or slowdown.
Comment 22 baldur 2012-10-03 11:14:04 UTC
For my system (reported Bug 54988) it starts up now and works for a RAM size of 3G.

But after logging into the GNOME desktop the screen becomes blurry, and while trying to open any GNOME menu, the screen now gets scrambled.
This is even a problem when I boot with mem=2G, which used to work with the current version of the Fedora driver in kernel 3.5.4.

Seems the DMA problem is gone, however some new bugs are showing up :-)

System description is still the same as in bug 54988. I am running a current Fedora kernel, patched with nouveau from 2nd October.


Linux version 3.5.4-2.localnouveau.fc17.x86_64 (root@baldur) (gcc version 4.7.2 20120921 (Red Hat 4.7.2-2) (GCC) ) #1 SMP Wed Oct 3 11:36:49 CEST 2012
Comment 23 baldur 2012-10-03 11:21:49 UTC
Here is the output for nouveau from dmesg:

[    1.098755] nouveau 0000:00:05.0: >setting latency timer to 64
[    1.099370] nouveau  [  DEVICE][0000:00:05.0] BOOT0  : 0x04e000a2
[    1.099374] nouveau  [  DEVICE][0000:00:05.0] Chipset: C51 (NV4E)
[    1.099377] nouveau  [  DEVICE][0000:00:05.0] Family : NV40
[    1.100226] nouveau  [   VBIOS][0000:00:05.0] checking PRAMIN for image...
[    1.136693] nouveau  [   VBIOS][0000:00:05.0] ... appears to be valid
[    1.136696] nouveau  [   VBIOS][0000:00:05.0] using image from PRAMIN
[    1.136922] nouveau  [   VBIOS][0000:00:05.0] BIT signature found
[    1.136925] nouveau  [   VBIOS][0000:00:05.0] version 05.51.22.33
[    1.137122] nouveau  [     PFB][0000:00:05.0] RAM type: stolen system memory
[    1.137125] nouveau  [     PFB][0000:00:05.0] RAM size: 32 MiB
[    1.789533] nouveau  [     DRM] VRAM: 29 MiB
[    1.789540] nouveau  [     DRM] GART: 512 MiB
[    1.789546] nouveau  [     DRM] BIT BIOS found
[    1.789550] nouveau  [     DRM] Bios version 05.51.22.33
[    1.789554] nouveau  [     DRM] TMDS table version 1.1
[    1.789557] nouveau  [     DRM] DCB version 3.0
[    1.789560] nouveau  [     DRM] DCB outp 00: 02000300 00000023
[    1.789563] nouveau  [     DRM] DCB outp 01: 03011312 00000000
[    1.789566] nouveau  [     DRM] DCB outp 02: 020023f1 0040c080
[    1.789569] nouveau  [     DRM] DCB conn 00: 0000
[    1.789572] nouveau  [     DRM] DCB conn 01: 0131
[    1.789575] nouveau  [     DRM] DCB conn 02: 0210
[    1.789577] nouveau  [     DRM] DCB conn 03: 0211
[    1.789580] nouveau  [     DRM] DCB conn 04: 0213
[    1.791153] nouveau  [     DRM] 0xD186: Parsing digital output script table
[    1.841924] nouveau  [     DRM] 1 available performance level(s)
[    1.841930] nouveau  [     DRM] 0: core 475MHz shader 475MHz fanspeed 100%
[    1.841932] nouveau  [     DRM] c:
[    1.843560] nouveau  [     DRM] MM: using M2MF for buffer copies
[    1.843567] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[    1.843570] nouveau  [     DRM] Setting dpms mode 3 on tmds encoder (output 1)
[    1.843574] nouveau  [     DRM] Setting dpms mode 3 on TV encoder (output 2)
[    1.878032] nouveau  [     DRM] Load detected on output B
[    1.892156] nouveau  [     DRM] allocated 1024x768 fb: 0x9000, bo ffff880036fed400
[    1.892270] fbcon: nouveaufb (fb0) is primary device
[    1.902714] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[    1.902716] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[    1.903846] fb0: nouveaufb frame buffer device
[    1.903853] [drm] Initialized nouveau 1.1.0 20120801 for 0000:00:05.0 on minor 0
[    1.980036] nouveau  [     DRM] Load detected on output B
[    2.081584] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[    2.101978] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[    2.101986] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   36.584586] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   36.604973] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   36.604978] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   37.232622] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   37.253031] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   37.253038] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   37.268629] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   37.289024] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   37.289028] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   37.381346] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   37.401716] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   37.401720] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   42.247520] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   42.267901] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   42.267906] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   55.168028] nouveau  [     DRM] Load detected on output B
[   55.185027] nouveau  [     DRM] Load detected on output B
[   58.741447] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[   58.761829] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[   58.761833] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[   59.054034] nouveau  [     DRM] Load detected on output B
[   62.834059] nouveau  [     DRM] Load detected on output B
[   63.022824] nouveau  [     DRM] Load detected on output B
[   67.077190] nouveau E[     DRM] fail ttm_validate
[   67.077198] nouveau E[     DRM] validate vram_list
[   67.077208] nouveau E[     DRM] validate: -12
[   67.137161] nouveau E[     DRM] fail ttm_validate
[   67.137169] nouveau E[     DRM] validate vram_list
[   67.137175] nouveau E[     DRM] validate: -12
[   77.311034] nouveau  [     DRM] Load detected on output B
[   80.197045] nouveau  [     DRM] Load detected on output B
[   80.290070] nouveau  [     DRM] Load detected on output B
[   87.611597] nouveau E[     DRM] fail ttm_validate
[   87.611605] nouveau E[     DRM] validate vram_list
[   87.611611] nouveau E[     DRM] validate: -12
[   87.637939] nouveau E[     DRM] fail ttm_validate
[   87.637946] nouveau E[     DRM] validate vram_list
[   87.637950] nouveau E[     DRM] validate: -12
[  211.914453] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[  211.934857] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[  211.934862] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[  256.428249] nouveau  [     DRM] Setting dpms mode 3 on vga encoder (output 0)
[  256.448634] nouveau  [     DRM] Setting dpms mode 0 on vga encoder (output 0)
[  256.448638] nouveau  [     DRM] Output VGA-1 is running on CRTC 0 using output B
[  256.481072] nouveau  [     DRM] Load detected on output B
Comment 24 Emil Velikov 2012-10-03 22:38:33 UTC
(In reply to comment #23)
> here is the output for nouveau from dmesg
> 
> ...
> [    1.098755] nouveau 0000:00:05.0: >setting latency timer to 64
> ...
> [   67.077190] nouveau E[     DRM] fail ttm_validate
> [   67.077198] nouveau E[     DRM] validate vram_list
> [   67.077208] nouveau E[     DRM] validate: -12 (ENOMEM)
> ...

You have allocated only 32MB of RAM for the GPU.

Try bumping it to 128 or 256MB; it should resolve your issue.
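
For anyone reading such logs: the number after "validate:" is a negated errno value, which is why -12 is annotated as ENOMEM in the quote above. A minimal sketch for decoding these codes with the Python standard library; `decode_kernel_errno` is a hypothetical helper name:

```python
import errno


def decode_kernel_errno(code: int) -> str:
    """Map a negative kernel-style return code (e.g. -12) to its errno name."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


print(decode_kernel_errno(-12))  # ENOMEM
print(decode_kernel_errno(-22))  # EINVAL
```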
Comment 25 Raphaël Droz 2012-10-29 14:26:03 UTC
Created attachment 69231 [details]
dmesg|egrep -i 'drm|agp|fb'

I have regularly encountered a trace similar to the one in the first attachment (attachment 57574 [details]) since I went from 1.5G to 4G of RAM, using an NV34 [GeForce FX 5200]
(linux 3.6.0, xf86-video-nouveau 1.0.2).
It happens almost every day, by the end of the day, always during a (basic) graphical operation (e.g. opening a PDF viewer).
Comment 26 Maarten Lankhorst 2012-11-07 12:19:34 UTC
Judging from the errors, I'd say it can't look up the handle it created.

Diving into the old dma implementation seems 

The handles for vram and gart could not be looked up, so I'm guessing an invalid entry was used.

Does setting dma_bits = 32 inside drivers/gpu/drm/nouveau/core/subdev/vm/nv44.c help?

The old nouveau driver seemed to have commented out the part about 39-bit support for cards < nv50.
Comment 27 Raphaël Droz 2012-11-30 17:42:41 UTC
Created attachment 70841 [details]
nouveau CALL_SUBR_ACTIVE errors using dma_bits=32 kernel on NV34

I tried your suggestion of setting dma_bits to 32 inside drivers/gpu/drm/nouveau/core/subdev/vm/nv44.c,
but sadly the same issue arises (dmesg attached).
I hope it still makes sense to post these traces (from my NV34) in this bug report, and that the root cause is common.
I currently use the drm kernel module with debug=2; debug=3 dumps too much output, but let me know if it would provide additional useful info.
Comment 28 Raphaël Droz 2012-12-26 17:15:55 UTC
I should add that I have no problem with 3GB. The problem arises when I add 1 more GB.
Comment 29 Marcin Slusarz 2012-12-28 01:33:39 UTC
Created attachment 72201 [details] [review]
limit vm size to 31 bits (nv04-nv40,nv45)

OK, Salah's original issue seems to be fixed. The Xorg crashes and CACHE_ERRORs look like separate bugs; please open new bug reports for them (note that for CACHE_ERRORs I advise running the nouveau git kernel with http://lists.freedesktop.org/archives/nouveau/2012-December/011780.html).

Raphaël Droz: you have an nv34, so changing something in *nv44.c* obviously won't fix anything for you... Does the above patch help?
Comment 30 Salah Coronya 2012-12-28 23:38:51 UTC
As of kernel 3.7, xorg-1.13, nouveau DDX 1.0.4, and mesa-9.0, all the errors related to this bug are gone for me: no distortion, no crashes, and no CACHE_ERROR, even after switching VT and running accelerated programs for over a week.
Comment 31 Raphaël Droz 2012-12-29 12:21:47 UTC
I switched to 3.7.0 and I can't reproduce it either. All seems stable with 4GB.
I'm confident, but I may need to do longer testing.

Note that I can consistently trigger "nouveau: ib channel create, -22" messages (e.g. each time I run glxgears), but they seem harmless (and maybe even unrelated).
Comment 32 Marcin Slusarz 2012-12-29 13:09:53 UTC
Heh, you were probably experiencing a different bug.

"ib channel create" messages are not errors; if you turn debugging off you won't see them again.

I'm changing the status of this bug to RESOLVED FIXED.
Comment 33 Raphaël Droz 2012-12-29 14:35:27 UTC
Created attachment 72253 [details]
nouveau CALL_SUBR_ACTIVE errors unpatched 3.7.0

Oops, I spoke too soon. It just happened again with an unpatched 3.7 kernel.
I'll come back later after testing your patch heavily.
Comment 34 Raphaël Droz 2013-03-01 17:53:04 UTC
Created attachment 75747 [details]
netconsole log of a NV34 crash when mem > 3GB

Finally I took some time to seriously dig (with netconsole) into "how" it crashes when I boot using my 4th memory module.
Trace attached.

