Bug 103284 - Unable to handle kernel NULL pointer dereference on 4.13.0-1-sparc64-smp
Summary: Unable to handle kernel NULL pointer dereference on 4.13.0-1-sparc64-smp
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-10-15 21:59 UTC by James
Modified: 2017-10-22 20:57 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
dmesg with oops (26.34 KB, text/plain)
2017-10-15 21:59 UTC, James
debug.trace output from modprobe nouveau (184.60 KB, text/plain)
2017-10-16 22:13 UTC, James

Description James 2017-10-15 21:59:18 UTC
Created attachment 134850 [details]
dmesg with oops

dmesg output attached.

I'll admit this is a somewhat bizarre hardware combination (PC GeForce 210 card in a Sun Ultra 45), but it probably shouldn't Oops on a null pointer.
Comment 1 Ilia Mirkin 2017-10-15 23:19:21 UTC
Looks like we error out somewhere we never have before, and the attempt to clean things up ends up using something that hasn't been initialized yet. Can you try booting with nouveau.debug=trace to see where we end up going wrong? That should hopefully point out the bad piece of code wrt error recovery.
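
For illustration, here is a minimal userspace sketch of that failure pattern and one way to guard against it. This is not nouveau code: `fake_disp`, `fake_disp_create` and `fake_disp_destroy` are hypothetical names, and the list type is a simplified stand-in for the kernel's `struct list_head`. The assumption is that a zero-allocated object contains a list head whose pointers are still NULL, so a teardown path that walks the list after an early failure would dereference NULL unless the head is initialized before anything that can fail.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified circular list head, modelled loosely on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list points at itself; a zeroed head (NULL pointers) is NOT a valid empty list. */
static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Hypothetical display object; the field names are illustrative only. */
struct fake_disp {
    struct list_head channels;
    int hw_ready;
};

/* Tear down the object and return how many list entries were walked.
 * This loop would dereference NULL if 'channels' had never been initialized. */
static int fake_disp_destroy(struct fake_disp *d)
{
    int walked = 0;

    if (!d)
        return 0;
    assert(d->channels.next != NULL); /* catches an uninitialized head early */
    for (struct list_head *p = d->channels.next; p != &d->channels; p = p->next)
        walked++;
    free(d);
    return walked;
}

/* Create the object; 'fail_early' simulates an error before hardware setup. */
static struct fake_disp *fake_disp_create(int fail_early)
{
    struct fake_disp *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    /* Initialize everything the teardown path touches BEFORE any step that can fail. */
    list_init(&d->channels);
    if (fail_early) {
        fake_disp_destroy(d); /* safe: the list head is already valid */
        return NULL;
    }
    d->hw_ready = 1;
    return d;
}
```

Moving the `list_init` call below the `fail_early` check reproduces the shape of the problem: the error path then runs teardown on a head whose `next` pointer is still NULL.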
Comment 2 James 2017-10-16 22:13:52 UTC
Created attachment 134875 [details]
debug.trace output from modprobe nouveau

Requested info attached -- taken from manually doing modprobe nouveau debug=trace.
Comment 3 Ilia Mirkin 2017-10-16 22:30:20 UTC
OK, so looks like it very successfully creates a GT214_DISP object, and then proceeds to call nv50_display_create. This then does not trigger *any* further traces, and errors out. It errors out before the dmac is initialized, and I think something in the list head thing fails.

Now, as I recall, SPARC has 8k pages. You could try bumping some of those 4096 values up to 8192 and see if that magically makes it work, at least a little (like the nouveau_bo_new).

If you're sufficiently interested, you could add a lot more prints throughout that file to see exactly what fails, and then try to resolve it.

It might be an uphill battle though - a bunch of stuff is tied to 4K pages, I think.
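
As a rough illustration of the page-size point, the sketch below rounds a size up to the running system's page size rather than a hard-coded 4096. It is a userspace example, not driver code: `round_up_pow2`, `system_page_size` and `page_aligned_size` are hypothetical helper names, and whether any particular 4096 in the driver actually needs to follow the CPU page size is an open question in this bug. Kernel code would use the compile-time `PAGE_SIZE` constant instead of `sysconf`.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round 'size' up to the next multiple of 'align'; 'align' must be a power of two. */
static size_t round_up_pow2(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Ask the OS for the page size at run time; fall back to 4096 if the query fails. */
static size_t system_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);

    return ps > 0 ? (size_t)ps : (size_t)4096;
}

/* Size a buffer in whole pages of the running system (8192-byte pages on sparc64). */
static size_t page_aligned_size(size_t size)
{
    return round_up_pow2(size, system_page_size());
}
```

With 8K pages, a request sized as 4096 rounds up to 8192, which is the kind of bump suggested above for calls like nouveau_bo_new.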
Comment 4 Ilia Mirkin 2017-10-22 20:55:59 UTC
https://github.com/skeggsb/nouveau/commit/0f6081089699b9f882def35ebe2b1649e77b7688

I believe this patch should fix your crash, but won't help you get the board running. If you want to do that, you'll need to figure out which thing is failing.
Comment 5 James 2017-10-22 20:57:54 UTC
OK -- I'll try giving that a go some time this week. To be honest I'm not too bothered about actually getting this card working on an Ultra 45...

