Bug 41114

Summary: [NVCx] nouveau module crashes on boot
Product: xorg
Reporter: Timo Aaltonen <tjaalton>
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: mog55356, ryzion, ypwong
Version: unspecified
Hardware: Other
OS: Linux (All)
Whiteboard:
Attachments:
Description          Flags
dmesg log            none
vbios                none
another vbios dump   none
nouveau crash        none

Description Timo Aaltonen 2011-09-22 05:33:55 UTC
Created attachment 51516 [details]
dmesg log

Upon loading the kernel module it crashes. An extract from the dmesg:

[   39.247309] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 0 at offset 0x759C
[   39.358806] [drm] nouveau 0000:01:00.0: unknown i2c port 51
[   39.425347] [drm] nouveau 0000:01:00.0: 0x7568: i2c bus not found
[   39.498115] [drm] nouveau 0000:01:00.0: unknown i2c port 51
[   39.564653] [drm] nouveau 0000:01:00.0: 0x757A: i2c bus not found
[   39.637531] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 1 at offset 0x7BF8
[   39.733665] [drm] nouveau 0000:01:00.0: unknown i2c port 51
[   39.800206] [drm] nouveau 0000:01:00.0: 0x8CE4: Failed parsing init table opcode: INIT_I2C_LONG_IF -19
[   39.911419] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 2 at offset 0x8E58
[   40.002897] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 3 at offset 0x8E5C
[   40.094423] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table 4 at offset 0x8F56
[   40.185903] [drm] nouveau 0000:01:00.0: Parsing VBIOS init table at offset 0x8FBB
[   40.295285] [drm] nouveau 0000:01:00.0: 0x74F7: Condition still not met after 20ms, skipping following opcodes
[   40.414831] [drm] nouveau 0000:01:00.0: unknown i2c port 51
[   40.481375] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
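
An extract like the one above can be triaged mechanically. The following is a minimal sketch, not part of the original report: it counts the "unknown i2c port" messages and picks out the first kernel BUG line. The function name `summarize_nouveau_log` and both regular expressions are assumptions derived only from the line formats quoted in this report.

```python
import re
from collections import Counter

# Assumed patterns, based solely on the dmesg lines quoted in this report.
PORT_RE = re.compile(r"unknown i2c port (\d+)")
BUG_RE = re.compile(r"\]\s*BUG: (.*)$")


def summarize_nouveau_log(text):
    """Return (Counter of unknown i2c port numbers, first BUG message or None)."""
    ports = Counter()
    first_bug = None
    for line in text.splitlines():
        port_match = PORT_RE.search(line)
        if port_match:
            ports[int(port_match.group(1))] += 1
        bug_match = BUG_RE.search(line)
        if bug_match and first_bug is None:
            first_bug = bug_match.group(1).strip()
    return ports, first_bug
```

Passing the text of a saved dmesg log to `summarize_nouveau_log` returns the per-port counts and the first BUG message, which makes it easier to compare logs from different machines or kernels.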
Comment 1 Timo Aaltonen 2011-09-22 05:34:52 UTC
Created attachment 51517 [details]
vbios
Comment 2 Timo Aaltonen 2011-09-22 05:36:37 UTC
The NVidia binary driver works seemingly fine, if it matters.
Comment 3 Emil Velikov 2011-09-22 10:54:39 UTC
Hi Timo

Can you try the latest kernel [1], or at least this patch [2] (drm/nouveau: fix oops if i2c bus not found in nouveau_i2c_identify())?

It should resolve the "NULL pointer dereference".

Cheers
Emil

[1] http://cgit.freedesktop.org/nouveau/linux-2.6/
[2] http://cgit.freedesktop.org/nouveau/linux-2.6/commit/?id=4213df3fa3af818b4ec0397428b007a9d033084c
Comment 4 Ben Skeggs 2011-09-22 14:57:48 UTC
How did you acquire the VBIOS image?  If you didn't already do it this way, can you mount debugfs with nouveau loaded (I guess you may need the patch already mentioned), and copy the VBIOS image from /<debugfs>/dri/0/vbios.rom for me please?
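
The copy step requested above could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the default mount point `/sys/kernel/debug` is only the conventional debugfs location (the comment does not name one), and the function name `copy_vbios` and its parameters are hypothetical. The signature check relies on PCI expansion ROM images starting with the bytes 0x55 0xAA.

```python
from pathlib import Path

# PCI expansion ROM images begin with the two bytes 0x55 0xAA.
ROM_SIGNATURE = b"\x55\xaa"


def copy_vbios(dest, debugfs_root="/sys/kernel/debug", card=0):
    """Copy <debugfs_root>/dri/<card>/vbios.rom to dest.

    Returns True if the copied image starts with the expansion ROM signature.
    The default debugfs_root is an assumption, not taken from the report.
    """
    src = Path(debugfs_root) / "dri" / str(card) / "vbios.rom"
    data = src.read_bytes()          # read the whole image into memory
    Path(dest).write_bytes(data)     # write an identical copy to dest
    return data[:2] == ROM_SIGNATURE
```

Reading from debugfs normally requires root, and the nouveau module must be loaded for the file to exist. A False return value suggests the dump is not a usable ROM image.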
Comment 5 Timo Aaltonen 2011-09-23 02:40:01 UTC
It wasn't acquired by me, but I heard that the 'dd' method mentioned on the wiki was used. I'll ask to test a new kernel with the fix and get a new dump.
Comment 6 Timo Aaltonen 2011-09-27 15:09:36 UTC
Created attachment 51692 [details]
another vbios dump

OK, this should be a better (?) dump of the vbios. With the oops fixed, the output is "garbled beyond recognition".
Comment 7 Xavier Bru 2011-11-09 00:34:36 UTC
Created attachment 53317 [details]
nouveau crash
Comment 8 Xavier Bru 2011-11-09 00:39:59 UTC
We have the same crash when loading the nouveau driver on a platform that has an
NVIDIA GPU (NVC0 generation card, 0x0c8880a1). See the previously attached traces.
I applied the proposed patch [2]. It solves the crash, but then the nouveau
driver continuously logs errors:
[drm] nouveau 0000:83:00.0: unknown i2c port 48
Comment 9 Xavier Bru 2011-11-09 06:27:14 UTC
It seems that with the nouveau.modeset=0 kernel boot parameter added, the driver loads without problems (but no traces are seen in dmesg).
Comment 10 Timo Aaltonen 2011-11-09 06:31:33 UTC
xavier: nomodeset disables nouveau completely.

Ben: Is the vbios I posted helpful?
Comment 11 Xavier Bru 2011-11-09 09:22:21 UTC
Hi Timo

Thanks for your answer. I was worried about this option because, when running the lsmod command, the "nouveau" driver is present, but I could not see any traces in dmesg.
As a workaround, we also tried unsuccessfully to blacklist the driver (rdblacklist=nouveau kernel boot parameter).
Using the nouveau.disable=1 kernel boot parameter prevents the driver from being loaded, but that is a side effect (an error, because the driver has no "disable" option).
Is there a clean way to blacklist it?

Thanks again.
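
To see what actually happened on a given boot, the kernel command line can be compared with the list of loaded modules. The following is a minimal sketch, assuming the usual Linux text formats of `/proc/modules` (module name as the first field of each line) and `/proc/cmdline` (space-separated parameters); the function names are hypothetical and this does not by itself blacklist anything.

```python
def module_loaded(name, modules_text):
    """Return True if `name` is listed as a module in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the module name.
        if fields and fields[0] == name:
            return True
    return False


def cmdline_params(cmdline_text, prefix="nouveau."):
    """Return the kernel command-line parameters that start with `prefix`."""
    return [param for param in cmdline_text.split() if param.startswith(prefix)]
```

For example, `module_loaded("nouveau", open("/proc/modules").read())` reports whether the module is present, and `cmdline_params(open("/proc/cmdline").read())` lists any `nouveau.*` options that were passed at boot.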
Comment 12 Emil Velikov 2012-05-23 15:34:19 UTC
*** Bug 44652 has been marked as a duplicate of this bug. ***
Comment 13 Emil Velikov 2012-05-23 15:34:31 UTC
*** Bug 43029 has been marked as a duplicate of this bug. ***
Comment 14 Robert Riches 2012-05-28 19:52:23 UTC
With an ENGTX560 DC/2DI/1GD5 card, an experiment with Mageia 2 LiveCD with a desktop 3.3.6 kernel was successful, which may mean the solution is in the kernel at least by that version.  Otherwise, there are some attachments in #43029 that might help, including dmesg output from Knoppix 6.7.1 with the above-named card.
Comment 15 Robert Riches 2013-01-03 03:37:02 UTC
Unfortunately, installed Mageia 2 doesn't work properly with Nouveau and the ENGTX560 DC/2DI/1GD5 card.  According to observation and dmesg timestamps, the kernel appears to go out to lunch for about four minutes.  Later, when attempting startx, keyboard input becomes extremely unresponsive, and video output is blacked out.  This is with kernel 3.3.8-desktop-2.mga2.  The same installation and kernel work well with an ENGT430 card.  What information (dmesg, Xorg log) would be useful?
Comment 16 Robert Riches 2013-01-04 03:39:24 UTC
With multiple duplicate reports and apparently multiple reporters, shouldn't this have a status of 'confirmed' rather than 'new'?
Comment 17 Ilia Mirkin 2013-08-31 03:21:01 UTC
Does this still happen with more recent kernels?
Comment 18 Robert Riches 2013-09-01 02:51:53 UTC
I won't be able to test a newer kernel for the symptoms I observed, because I don't have a system available to install a release with a newer kernel.  However, please don't take that as reason to disqualify this report.  I'm a minor late arrival at this party.  It's not clear whether the issue I mentioned with kernel 3.3.8-desktop-2.mga2 on Mageia 2 is consistent with the original symptoms reported here.  My symptoms should probably be reported separately, if I could afford the time to do that.  Hopefully, the earlier reporters can test with a newer kernel.
Comment 19 Ilia Mirkin 2013-10-01 22:33:46 UTC
No (positive) response to re-test request after a month. Closing as invalid. The driver has been rewritten enough since the original report that it's very unlikely that the same crash still exists.
Comment 20 Timo Aaltonen 2013-10-02 06:22:18 UTC
yeah, sorry for the delay.

My old T420s works pretty well these days. It is running Ubuntu 13.10 with a 3.11 kernel, and I tested booting the discrete GPU with it; all fine (NVD9 here though). Acceleration seems to be enabled as well :)
Comment 21 Robert Riches 2013-10-03 02:07:54 UTC
Ilia, would you please point me toward where I can find out what kernel version contains the rewrite(s) of this module that may solve the problem with the ENGTX560 DC/2DI/1GD5 card?  I Googled for changelog and release notes for Nouveau, but I came up empty.  Thanks.
Comment 22 Ilia Mirkin 2013-10-03 02:54:00 UTC
A lot of parts of nouveau are rewritten in various kernel versions. A bunch of stuff happened in 3.5, more in 3.7, more in 3.8. Probably a bunch of stuff before then too.
Comment 23 Robert Riches 2013-10-04 02:41:37 UTC
Thank you for the info.
