Bug 58984 - [NV98] Probe error on SPARC with 8kb pages
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: SPARC Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2013-01-03 14:26 UTC by Patrick Baggett
Modified: 2013-12-13 20:12 UTC

Attachments
dmesg output (20.41 KB, text/plain)
2013-01-03 23:31 UTC, Patrick Baggett
dmesg output with tracing enabled. (119.54 KB, text/plain)
2013-01-04 23:59 UTC, Patrick Baggett

Description Patrick Baggett 2013-01-03 14:26:00 UTC
When booting a Sun Blade 2500 with a GeForce 8400 GS PCI, everything appears pretty normal, but the probe fails for no apparent reason. Error -12 looks like "out of memory" (ENOMEM), but bug 56721, which looks similar, does not apply. My machine has 6GB of memory, so I'm sure it isn't actually failing to allocate system memory.

Running Linux 3.8-rc1 git and nouveau git from last night (Jan 2 2013).
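
For reference, a minimal user-space sketch of how a kernel-style return code such as -12 maps to an errno name. It assumes a Linux libc, where ENOMEM is 12; describe_kernel_error() is a hypothetical helper name used only for illustration, not a kernel or nouveau function.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Translate a kernel-style return code (0 or -errno) into libc's
 * human-readable errno text. Hypothetical helper, illustration only. */
const char *describe_kernel_error(int ret)
{
    /* Kernel functions report failure as a negative errno value. */
    assert(ret <= 0);
    return strerror(-ret);
}
```

Calling describe_kernel_error(-12) returns the same text as strerror(ENOMEM) on such a system. That only confirms which error code is being reported; it says nothing about which allocation failed.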
Comment 1 Patrick Baggett 2013-01-03 15:23:29 UTC
Doh, dmesg.log didn't attach. I'll get that later today.
Comment 2 Patrick Baggett 2013-01-03 23:31:49 UTC
Created attachment 72491
dmesg output
Comment 3 Marcin Slusarz 2013-01-04 14:08:41 UTC
Please attach kernel log with nouveau.debug=trace.
Comment 4 Patrick Baggett 2013-01-04 23:59:18 UTC
Created attachment 72526
dmesg output with tracing enabled.
Comment 5 Patrick Baggett 2013-01-05 00:15:44 UTC
It's odd: I don't see any obvious errors; it just finalizes after the DCB messages.
Comment 6 Emil Velikov 2013-01-05 00:52:10 UTC
Appears to be failing in nouveau_display.c

nouveau_display_create(struct drm_device *dev)
...
	disp = drm->display = kzalloc(sizeof(*disp), GFP_KERNEL);	/* nouveau_display.c:296 */
	if (!disp)
		return -ENOMEM;
...

See
https://bugzilla.kernel.org/show_bug.cgi?id=51301
https://lkml.org/lkml/2012/11/27/486
Comment 7 Patrick Baggett 2013-01-05 01:37:17 UTC
Is this still present in 3.8 RC kernels? I saw a bunch of patches posted.
Comment 8 Marcin Slusarz 2013-01-05 14:46:49 UTC
Can you sprinkle printks in nouveau_drm_load around nouveau_bios_init (and go deeper) and figure out what is actually failing? I doubt it's the line Emil thinks it is.
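
A rough user-space sketch of that kind of tracing, assuming nothing about the real nouveau code: trace_ret(), TRACE_RET(), the two stub functions and load_sketch() are all hypothetical names, and fprintf() stands in for printk(). The idea is just to log each step's return code so the first failing call is visible in the log.

```c
#include <assert.h>
#include <stdio.h>

/* Log a step's return code and pass it through unchanged.
 * In kernel code the fprintf() would be a printk(). */
int trace_ret(const char *what, int ret)
{
    assert(what != NULL);
    fprintf(stderr, "trace: %s returned %d\n", what, ret);
    return ret;
}

/* Stringify the call so the log names the step that was run. */
#define TRACE_RET(call) trace_ret(#call, (call))

/* Hypothetical stubs standing in for real initialisation steps. */
int stub_bios_init(void)      { return 0; }
int stub_display_create(void) { return -12; } /* fails for illustration only */

/* Run the steps in order and stop at the first failure. */
int load_sketch(void)
{
    int ret;

    ret = TRACE_RET(stub_bios_init());
    if (ret)
        return ret;

    ret = TRACE_RET(stub_display_create());
    if (ret)
        return ret;

    return 0;
}
```

Wrapping each call this way and then repeating the exercise inside whichever step fails is what "go deeper" amounts to.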
Comment 9 Patrick Baggett 2013-01-09 15:13:25 UTC
I added some printk()s starting from nouveau_drm_load(), and nouveau_display_create() is returning -12.
Comment 10 Patrick Baggett 2013-01-10 16:48:18 UTC
Do you need me to go further and add some printk()s in nouveau_display_create()? It looks like Emil may be correct. If so -- is there a fix for this issue applied to the kernel git yet?
Comment 11 Marcin Slusarz 2013-01-10 17:07:52 UTC
Yes, please go further.
I'm not aware of any patch fixing this.
Comment 12 Patrick Baggett 2013-03-13 02:28:20 UTC
OK, I found some time to dig into this. I'm now running linux + nouveau git latest.

The error continues into nv50_display_create(), where nouveau_bo_new() returns -12. Tracing into nouveau_bo_new(), I found that ttm_bo_init() fails with -12. This seems to be "outside" of the nouveau driver. However, since I've had success with nouveau using NV17 hardware, I thought that perhaps the parameters were somehow problematic.

I see these integer (non-pointer/boolean) parameters to ttm_bo_init():

size = 8192
type = 0
align >> PAGE_SHIFT = 0
acc_size = 17536


PAGE_SHIFT = 13


SPARC has 8KB pages by default (1<<13), if that helps explain anything.
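
To make the arithmetic concrete, here is a small user-space sketch of how many pages a buffer of a given size occupies at a given page shift. pages_for() is a hypothetical helper, not a TTM or nouveau function, and the sketch does not claim to show why ttm_bo_init() fails.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages needed to hold `size` bytes when each page is
 * (1 << page_shift) bytes, rounding up. Illustration only. */
size_t pages_for(size_t size, unsigned int page_shift)
{
    size_t page_size;

    assert(page_shift < sizeof(size_t) * 8);
    page_size = (size_t)1 << page_shift;
    return (size + page_size - 1) >> page_shift;
}
```

With the values above, pages_for(8192, 13) is 1, while the same 8192-byte buffer counted in 4KB units, pages_for(8192, 12), is 2.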
Comment 13 Patrick Baggett 2013-03-13 02:33:01 UTC
Hmm, so I saw nouveau_bo_new() has this line:

nvbo->page_shift = 12

That implies a page size of 4KB (default on 32-bit x86). When I changed it to "nvbo->page_shift = PAGE_SHIFT", the problem was fixed!

I now see as the final message:

[drm] Initialized nouveau 1.1.0 20120801 for 0001:01:00.0 on minor 0
Comment 14 Marcin Slusarz 2013-03-13 20:32:10 UTC
Since you mentioned in the mailing list post that there are other problems after "fixing" this bug, I think the problem is more widespread.

If the page size on the GPU is unrelated to the page size on the CPU (which I think is more likely than the alternative), we need our own PAGE_SHIFT / PAGE_MASK / PAGE_SIZE macros for the GPU and have to use them everywhere, OR we have to use the existing CPU macros consistently (there are a lot of places using the constant "12").

Either way it's not as simple as this one-liner...
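
A minimal sketch of what the first option could look like, written as plain user-space C. The NV_GPU_PAGE_* names are made up for illustration and do not exist in the driver; CPU_PAGE_SHIFT is a stand-in for the kernel's PAGE_SHIFT, fixed at 13 here to mirror the 8KB SPARC case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical GPU-side page macros (illustrative names only),
 * assuming 4KB GPU pages. */
#define NV_GPU_PAGE_SHIFT 12u
#define NV_GPU_PAGE_SIZE  (UINT64_C(1) << NV_GPU_PAGE_SHIFT)
#define NV_GPU_PAGE_MASK  (~(NV_GPU_PAGE_SIZE - 1))

/* Stand-in for the kernel's PAGE_SHIFT on an 8KB-page system. */
#define CPU_PAGE_SHIFT 13u

/* How many GPU pages are covered by one CPU page. */
uint64_t gpu_pages_per_cpu_page(void)
{
    assert(CPU_PAGE_SHIFT >= NV_GPU_PAGE_SHIFT);
    return UINT64_C(1) << (CPU_PAGE_SHIFT - NV_GPU_PAGE_SHIFT);
}

/* Round an address down to the start of its GPU page. */
uint64_t gpu_page_align_down(uint64_t addr)
{
    return addr & NV_GPU_PAGE_MASK;
}
```

Keeping the two shifts as separate names makes every conversion between CPU pages and GPU pages explicit, instead of relying on a bare "12" that happens to match PAGE_SHIFT on x86.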
Comment 15 Patrick Baggett 2013-03-13 21:48:34 UTC
Hmm, that does make sense. Alright then, since you seem to have more of an idea of what is going on, how can I help?

Which makes sense / is the right route? A private PAGE_SHIFT|MASK|SIZE for GPUs or using the system page size consistently? Are all NV GPUs that support paging using 4K pages? (I say "support paging" because I had an NV17 GPU that worked fine on SPARC, OpenGL games worked and all).

Should I just audit nouveau/*.c for any usage of "12" and try changing it to PAGE_SHIFT to see what happens? Guide me and I'll do all the annoying busy work. ;)


Patrick
Comment 16 Marcin Kościelnicki 2013-03-13 23:46:18 UTC
(In reply to comment #15)

> Which makes sense / is the right route? A private PAGE_SHIFT|MASK|SIZE for
> GPUs or using the system page size consistently? Are all NV GPUs that
> support paging using 4K pages? (I say "support paging" because I had an NV17
> GPU that worked fine on SPARC, OpenGL games worked and all).
> 

All nvidia GPUs all the way back to NV01 support 4kB pages (with NV50-gen ones also supporting 16kB + 64kB and NVC0-gen also supporting 128kB, but these sizes are usually only used for VRAM and not system RAM).
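
As a rough summary, the list above could be written as a lookup like the one below. The enum and function names are hypothetical, the three generations are coarse buckets, and treating the sizes as cumulative (NVC0-gen keeping the NV50-gen sizes) is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>

/* Coarse, hypothetical generation buckets in chronological order. */
enum gpu_gen { GEN_NV01, GEN_NV50, GEN_NVC0 };

/* Does a generation support pages of (1 << shift) bytes?
 * ASSUMPTION: later generations keep the sizes of earlier ones. */
bool gen_supports_page_shift(enum gpu_gen gen, unsigned int shift)
{
    assert(gen == GEN_NV01 || gen == GEN_NV50 || gen == GEN_NVC0);

    switch (shift) {
    case 12:                      /* 4kB: every generation */
        return true;
    case 14:                      /* 16kB */
    case 16:                      /* 64kB: NV50-gen and later */
        return gen >= GEN_NV50;
    case 17:                      /* 128kB: NVC0-gen */
        return gen == GEN_NVC0;
    default:
        return false;
    }
}
```

Note that 8kB (shift 13), the SPARC CPU page size, is not in the list for any generation.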
Comment 17 Patrick Baggett 2013-05-13 18:53:29 UTC
Hey all,

I was looking to audit the use of 12/PAGE_SHIFT in nouveau to help determine what is meant in each context (i.e. host CPU page size or GPU page size?). Marcin Slusarz suggested that there be a private PAGE_XXX macro for NV cards; would these then be 4KB pages only? I realize the GPUs may support other sizes, but my understanding is that the current driver does not support them.

