When booting a Sun Blade 2500 with a PCI GeForce 8400 GS, everything appears normal, but the probe fails for no apparent reason. Error -12 looks like "out of memory" (ENOMEM), but bug 56721, which looks similar, does not apply. My machine also has 6GB of memory, so I'm sure it isn't actually failing to allocate system memory.
Running Linux 3.8-rc1 git and nouveau git from last night (Jan 2 2013).
Doh, dmesg.log didn't attach. I'll get that later today.
Created attachment 72491
Please attach kernel log with nouveau.debug=trace.
Created attachment 72526
dmesg output with tracing enabled.
It's odd: I don't see any obvious errors; it just finalizes after the DCB messages.
Appears to be failing in nouveau_display.c
nouveau_display_create(struct drm_device *dev)
296: disp = drm->display = kzalloc(sizeof(*disp), GFP_KERNEL);
Is this still present in 3.8 RC kernels? I saw a bunch of patches posted.
Can you sprinkle printks in nouveau_drm_load around nouveau_bios_init (and go deeper) and figure out what is actually failing? I doubt it's the line Emil thinks it is.
I added some printk()s starting from nouveau_drm_load(), and nouveau_display_create() is returning -12.
Do you need me to go further and add some printk()s in nouveau_display_create()? It looks like Emil may be correct. If so -- is there a fix for this issue applied to the kernel git yet?
Yes, please go further.
I'm not aware of any patch fixing this.
OK, I found some time to dig into this. I'm now running linux + nouveau git latest.
The error continues into nv50_display_create(), where nouveau_bo_new() returns -12. Tracing into nouveau_bo_new(), the function ttm_bo_init() fails with -12. This seems to be "outside" of the nouveau driver. However, since I've had success with nouveau on NV17 hardware, I thought perhaps the parameters were somehow problematic.
I see these integer (non-pointer/boolean) parameters to ttm_bo_init():
size = 8192
type = 0
align >> PAGE_SHIFT = 0
acc_size = 17536
PAGE_SHIFT = 13
SPARC has 8KB pages by default (1<<13), if that helps explain anything.
Hmm, so I saw nouveau_bo_new() has this line:
nvbo->page_shift = 12
That implies a page size of 4KB (the default on 32-bit x86). When I changed it to "nvbo->page_shift = PAGE_SHIFT", the problem was fixed!
I now see as the final message:
[drm] Initialized nouveau 1.1.0 20120801 for 0001:01:00.0 on minor 0
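To illustrate the mismatch between the two shifts, here is a rough userspace sketch using the numbers from this report (size = 8192, PAGE_SHIFT = 13, hard-coded page_shift = 12). The macro and function names are made up for illustration, and this does not reproduce whatever check inside ttm_bo_init() actually returns -12; it only shows that the same byte values mean different things under the two page sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values taken from this report: sparc64 uses 8 KiB CPU pages,
 * while nouveau_bo_new() hard-codes a 4 KiB page. Names are hypothetical. */
#define CPU_PAGE_SHIFT 13u  /* PAGE_SHIFT on this machine */
#define GPU_PAGE_SHIFT 12u  /* the hard-coded nvbo->page_shift */

/* Number of pages of size (1 << shift) needed to hold `size` bytes. */
static size_t pages_needed(size_t size, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    size_t page = (size_t)1 << shift;
    return (size + page - 1) >> shift;
}

/* Nonzero if `value` is a multiple of the page size (1 << shift). */
static int is_page_aligned(size_t value, unsigned shift)
{
    assert(shift < sizeof(size_t) * 8);
    return (value & (((size_t)1 << shift) - 1)) == 0;
}
```

With these helpers, pages_needed(8192, CPU_PAGE_SHIFT) is 1 but pages_needed(8192, GPU_PAGE_SHIFT) is 2, and a 4096-byte value is aligned for the GPU shift but not for the CPU shift (4096 >> 13 is 0).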
Since you mentioned in the mailing list post that other problems remain after "fixing" this bug, I think the problem is more widespread.
If the GPU page size is independent of the CPU page size (which I think is more likely than the alternative), we need our own GPU-specific versions of the PAGE_SHIFT / PAGE_MASK / PAGE_SIZE macros and must use them everywhere. Otherwise, we should use the existing CPU macros consistently (a lot of places use a hard-coded "12").
Either way, it's not as simple as this one-liner...
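For the first option, a minimal sketch of what private GPU page macros could look like, written as plain userspace C. The NV_GPU_PAGE_* names and the two helpers are hypothetical and are not taken from the nouveau tree; the 4KB value is simply the hard-coded "12" given a name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: nothing here is confirmed to exist in nouveau. */
#define NV_GPU_PAGE_SHIFT 12
#define NV_GPU_PAGE_SIZE  (UINT64_C(1) << NV_GPU_PAGE_SHIFT)
#define NV_GPU_PAGE_MASK  (~(NV_GPU_PAGE_SIZE - 1))

/* Round a byte count up to a whole number of GPU pages. */
static uint64_t nv_gpu_page_align(uint64_t bytes)
{
    /* Guard against wrap-around when adding the rounding term. */
    assert(bytes <= UINT64_MAX - (NV_GPU_PAGE_SIZE - 1));
    return (bytes + NV_GPU_PAGE_SIZE - 1) & NV_GPU_PAGE_MASK;
}

/* Convert a byte offset into a GPU page index. */
static uint64_t nv_gpu_page_index(uint64_t offset)
{
    return offset >> NV_GPU_PAGE_SHIFT;
}
```

The point of the separate names is that each use site states which page size it means, so an audit of every literal "12" only has to decide between the GPU macro and the CPU's PAGE_SHIFT.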
Hmm, that does make sense. Alright then, since you seem to have more of an idea of what is going on, how can I help?
Which makes sense / is the right route? A private PAGE_SHIFT|MASK|SIZE for GPUs or using the system page size consistently? Are all NV GPUs that support paging using 4K pages? (I say "support paging" because I had an NV17 GPU that worked fine on SPARC, OpenGL games worked and all).
Should I just audit nouveau/*.c for any usage of "12" and try changing it to PAGE_SHIFT to see what happens? Guide me and I'll do all the annoying busy work. ;)
(In reply to comment #15)
> Which makes sense / is the right route? A private PAGE_SHIFT|MASK|SIZE for
> GPUs or using the system page size consistently? Are all NV GPUs that
> support paging using 4K pages? (I say "support paging" because I had an NV17
> GPU that worked fine on SPARC, OpenGL games worked and all).
NVIDIA GPUs all the way back to NV01 support 4kB pages. NV50-generation GPUs also support 16kB and 64kB pages, and NVC0-generation GPUs also support 128kB pages, but these larger sizes are usually only used for VRAM and not system RAM.
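For reference when auditing shift constants, the page sizes listed above correspond to shifts of 12 (4kB), 14 (16kB), 16 (64kB) and 17 (128kB). A small userspace helper, with a made-up name, that derives the shift from a power-of-two page size:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return log2 of a power-of-two page size in bytes. */
static unsigned page_shift_for_size(uint32_t page_size)
{
    /* Only non-zero powers of two are valid page sizes here. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    unsigned shift = 0;
    while ((page_size >> shift) != 1)
        shift++;
    return shift;
}
```

For comparison, the 8KB CPU page size on SPARC gives a shift of 13, which does not match any of the GPU page sizes mentioned above.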
I was looking to audit the use of 12/PAGE_SHIFT in nouveau to help determine what is meant in each context (i.e. host CPU page size or GPU page size?). Marcin Slusarz suggested that there be a private PAGE_XXX macro for NV cards; would these then be 4KB pages only? I realize the GPUs may support other sizes, but my understanding is that the current driver does not support them.