91523 – [NVE7] driver cannot initialize gpu(failed to parse ramcfg data)

Status:	RESOLVED MOVED

Alias:	None

Product:	xorg
Classification:	Unclassified
Component:	Driver/nouveau
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium blocker
Assignee:	Nouveau Project
QA Contact:	Xorg Project Team

Reported:	2015-08-01 17:04 UTC by Ilia Bozhinov
Modified:	2019-12-04 09:02 UTC
CC List:	2 users

Attachments
dmesg log with enabled nouveau debug (185.76 KB, text/plain) 2015-08-01 17:04 UTC, Ilia Bozhinov
vbios.rom (88.50 KB, application/octet-stream) 2015-08-01 20:55 UTC, Ilia Bozhinov

Description Ilia Bozhinov 2015-08-01 17:04:32 UTC

Created attachment 117481
dmesg log with enabled nouveau debug

When the kernel tries to load the nouveau module during boot, it fails with "failed to parse ramcfg data" (see the attached log). The same failure occurs when I load the module manually later. The regression appeared between kernels 3.17 and 3.18: downgrading to 3.17 fixes the problem, and every later version fails. Hardware: NVIDIA GT 645M (GK107) in a Lenovo Z500 with an Intel i5-3230 processor. Tested on Arch Linux (both the stock and -ck kernels) and Fedora 22. Let me know if any other information is needed.

Comment 1 Ilia Mirkin 2015-08-01 17:16:56 UTC

Relevant bits:

[    3.766892] nouveau D[     PFB][0000:01:00.0] 0x100800: 0x00000002
[    3.766892] nouveau D[     PFB][0000:01:00.0] parts 0x00000002 mask 0x00000000
[    3.766900] nouveau D[     PFB][0000:01:00.0] 0: mem_amount 0x00000400
[    3.766902] nouveau D[     PFB][0000:01:00.0] 1: mem_amount 0x00000400
[    3.766913] nouveau E[     PFB][0000:01:00.0] failed to parse ramcfg data
[    3.766914] nouveau E[     PFB][0000:01:00.0] failed to create 0x00000000, -22
[    3.766915] nouveau ![     PFB][0000:01:00.0] error detecting memory configuration!!
[    3.766916] nouveau E[  DEVICE][0000:01:00.0] failed to create 0x1000e00b, -22

Could you attach your vbios.rom from a successful boot (/sys/kernel/debug/dri/1/vbios.rom)?

Comment 2 Ilia Bozhinov 2015-08-01 20:55:56 UTC

Created attachment 117483
vbios.rom

Comment 3 Ilia Bozhinov 2016-04-17 17:45:43 UTC

The code that triggers this bug appears to be in drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c, around line 1583. The assumption in the comment there does not seem to hold: on my card this condition clearly does not mean that the card should be ignored completely. It may have some other meaning. Does anybody know how I can help debug the problem? Here is the comment from the code:

/* parse bios data for all rammap table entries up-front, and
 * build information on whether certain fields differ between
 * any of the entries.
 *
 * the binary driver appears to completely ignore some fields
 * when all entries contain the same value.  at first, it was
 * hoped that these were mere optimisations and the bios init
 * tables had configured as per the values here, but there is
 * evidence now to suggest that this isn't the case and we do
 * need to treat this condition as a "don't touch" indicator.
 */
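
For readers unfamiliar with the idea in that comment, here is a minimal sketch of what "build information on whether certain fields differ between any of the entries" could look like. It is not the driver's code: `struct ram_entry`, `struct ram_diff`, the field names and `ram_entries_diff()` are all hypothetical, chosen only to illustrate the technique.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one parsed rammap entry.
 * The field names are illustrative, not the driver's. */
struct ram_entry {
	unsigned int field_a;
	unsigned int field_b;
};

/* One flag per field: true if at least two entries disagree on it.
 * A false flag is the "all entries contain the same value" case that
 * the quoted comment treats as a "don't touch" indicator. */
struct ram_diff {
	bool field_a;
	bool field_b;
};

/* Compare each entry with its predecessor and accumulate the flags. */
static struct ram_diff ram_entries_diff(const struct ram_entry *e, size_t n)
{
	struct ram_diff d = { false, false };
	size_t i;

	for (i = 1; i < n; i++) {
		d.field_a |= e[i].field_a != e[i - 1].field_a;
		d.field_b |= e[i].field_b != e[i - 1].field_b;
	}
	return d;
}
```

With entries `{1, 5}`, `{1, 6}`, `{1, 5}`, `field_a` never changes and `field_b` does, so only `field_b` would be flagged as differing.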

Comment 4 Ilia Bozhinov 2016-04-18 17:17:04 UTC

I tested a kernel build with the relevant code commented out (the code right after the comment I posted), and the kernel module does work: it loads, and DRI_PRIME=1 lets me run glxgears/glxinfo. What is more, I now get better power savings, because with the driver loaded vgaswitcheroo can turn off the NVIDIA card.

Comment 5 FeepingCreature 2017-11-22 20:43:43 UTC

I still get this error in 4.14.1. Same laptop, same vbios.rom. If I comment out the "return ret" in https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c#L1577, it boots and runs 3D acceleration with xrandr offload, but cannot reclock successfully.

Comment 6 FeepingCreature 2017-11-22 22:57:12 UTC

Looking at envytools' nvbios decode, given ramcfg=1, the first entry of the timing mapping table (0x718f) has a value of 0xf at index 1, offset 0. This is taken as an index into the timing table (0x733c); however, the timing table only defines 12 entries, and only six of them are nonzero.

If I simply skip entry 0, reclocking works fine. Maybe that's what 0x0f indicates?
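
A minimal sketch of the check described above, assuming the only problem is an out-of-range timing index. The helper names and `TIMING_ENTRY_COUNT` are hypothetical and this is not the driver's code. Whether 0x0f really means "skip this entry" is only a guess from this vbios, not something confirmed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Taken from the decode above: this vbios's timing table defines 12 entries. */
#define TIMING_ENTRY_COUNT 12

/* True when an index read from the timing mapping table refers to an
 * entry that actually exists in the timing table. */
static bool timing_index_valid(uint8_t index, size_t timing_count)
{
	return index < timing_count;
}

/* Count the mapping entries that can be used, skipping out-of-range
 * indices instead of failing the whole parse (assumed behaviour). */
static size_t count_usable_entries(const uint8_t *indices, size_t n,
				   size_t timing_count)
{
	size_t usable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (timing_index_valid(indices[i], timing_count))
			usable++;
	}
	return usable;
}
```

With a 12-entry timing table, index 0x0f (15) is rejected while indices such as 0x00 or 0x05 are accepted, which matches the observation that skipping entry 0 lets reclocking proceed.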

Comment 7 Martin Peres 2019-12-04 09:02:12 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/206.
