Bug 91523 - [NVE7] driver cannot initialize gpu(failed to parse ramcfg data)
Summary: [NVE7] driver cannot initialize gpu(failed to parse ramcfg data)
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium blocker
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Depends on:
Reported: 2015-08-01 17:04 UTC by Ilia Bozhinov
Modified: 2017-11-22 22:57 UTC

dmesg log with enabled nouveau debug (185.76 KB, text/plain)
2015-08-01 17:04 UTC, Ilia Bozhinov
vbios.rom (88.50 KB, application/octet-stream)
2015-08-01 20:55 UTC, Ilia Bozhinov

Description Ilia Bozhinov 2015-08-01 17:04:32 UTC
Created attachment 117481 [details]
dmesg log with enabled nouveau debug

When the kernel tries to load the nouveau kernel module during boot, it fails with "failed to parse ramcfg data" (see attached logs). The same failure occurs when I load the module manually later. The bug appeared between kernels 3.17 and 3.18: downgrading to 3.17 fixes the problem, and no later version works. I'm using an NVIDIA GT 645M (GK107) in a Lenovo Z500 with an Intel i5-3230 processor. OS: tested on Arch Linux (both stock and -ck kernels) and Fedora 22. Let me know if any other info is needed.
Comment 1 Ilia Mirkin 2015-08-01 17:16:56 UTC
Relevant bits:

[    3.766892] nouveau D[     PFB][0000:01:00.0] 0x100800: 0x00000002
[    3.766892] nouveau D[     PFB][0000:01:00.0] parts 0x00000002 mask 0x00000000
[    3.766900] nouveau D[     PFB][0000:01:00.0] 0: mem_amount 0x00000400
[    3.766902] nouveau D[     PFB][0000:01:00.0] 1: mem_amount 0x00000400
[    3.766913] nouveau E[     PFB][0000:01:00.0] failed to parse ramcfg data
[    3.766914] nouveau E[     PFB][0000:01:00.0] failed to create 0x00000000, -22
[    3.766915] nouveau ![     PFB][0000:01:00.0] error detecting memory configuration!!
[    3.766916] nouveau E[  DEVICE][0000:01:00.0] failed to create 0x1000e00b, -22

Could you attach your vbios.rom from a successful boot (/sys/kernel/debug/dri/1/vbios.rom)?
Comment 2 Ilia Bozhinov 2015-08-01 20:55:56 UTC
Created attachment 117483 [details]
vbios.rom
Comment 3 Ilia Bozhinov 2016-04-17 17:45:43 UTC
It seems that the code that triggers this bug is in drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c around line 1583. The assumption in the comment there does not seem to hold: on my card this condition evidently doesn't mean that the card should be ignored completely. Maybe it has some other meaning. Does anybody know how I can help debug the problem? Here is the comment from the code:

/* parse bios data for all rammap table entries up-front, and
 * build information on whether certain fields differ between
 * any of the entries.
 * the binary driver appears to completely ignore some fields
 * when all entries contain the same value.  at first, it was
 * hoped that these were mere optimisations and the bios init
 * tables had configured as per the values here, but there is
 * evidence now to suggest that this isn't the case and we do
 * need to treat this condition as a "don't touch" indicator.
 */
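
To make the idea in that comment concrete, here is a minimal standalone sketch of "building information on whether fields differ between entries". It is not the kernel's code: the struct, its three fields, and the function name are hypothetical and only illustrate the technique of comparing every entry against the first one and recording a bit per field that varies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified table entry; the real driver
 * parses many more fields per rammap/ramcfg entry. */
struct cfg_entry {
	uint8_t field_a;
	uint8_t field_b;
	uint8_t field_c;
};

/* One bit per field (names are assumptions for this sketch). */
enum {
	DIFF_FIELD_A = 1 << 0,
	DIFF_FIELD_B = 1 << 1,
	DIFF_FIELD_C = 1 << 2,
};

/* Return a bitmask with a bit set for every field whose value is not
 * identical across all entries.  Comparing each entry against entry 0
 * is enough: a field is uniform only if every entry matches entry 0. */
static unsigned int cfg_diff_mask(const struct cfg_entry *e, size_t count)
{
	unsigned int mask = 0;
	size_t i;

	assert(e != NULL || count == 0);

	for (i = 1; i < count; i++) {
		if (e[i].field_a != e[0].field_a)
			mask |= DIFF_FIELD_A;
		if (e[i].field_b != e[0].field_b)
			mask |= DIFF_FIELD_B;
		if (e[i].field_c != e[0].field_c)
			mask |= DIFF_FIELD_C;
	}
	return mask;
}
```

Under the interpretation in the comment, a field whose bit stays clear (same value in every entry) would be treated as "don't touch", and only fields whose bit is set would be programmed.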
Comment 4 Ilia Bozhinov 2016-04-18 17:17:04 UTC
I tested a kernel build with the relevant code commented out (the code right after the comment I posted), and indeed the kernel module works: it loads, and DRI_PRIME=1 allows me to run glxgears/glxinfo. What is more, I now get better power savings, because once the driver is loaded vgaswitcheroo can turn off the NVIDIA card.
Comment 5 FeepingCreature 2017-11-22 20:43:43 UTC
I still get this error in 4.14.1. Same laptop, same vbios.rom. If I comment out the "return ret" at https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c#L1577, it boots and runs 3D acceleration with xrandr offload, but cannot reclock successfully.
Comment 6 FeepingCreature 2017-11-22 22:57:12 UTC
Looking at envytools' nvbios decode, given ramcfg=1, the first entry of the timing mapping table (0x718f) has a value of 0x0f at index 1, offset 0. This value is taken as an index into the timing table (0x733c); however, the timing table only defines 12 entries, and only six of them are nonzero.

If I simply skip entry 0, reclocking works fine. Maybe that's what 0x0f indicates?
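
The check being suggested here can be sketched as a plain bounds test. This is only an illustration under stated assumptions: the function name and the flat array of timing indices are hypothetical, not nouveau's data structures, and whether 0x0f really means "skip this entry" is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the position of the first mapping entry whose timing index
 * falls inside the timing table, or -1 if there is none.
 * timing_idx[i] stands in for the timing-table index stored in mapping
 * entry i; table_entries is the number of entries the timing table
 * defines (12 on the VBIOS discussed in this bug). */
static int first_usable_entry(const uint8_t *timing_idx, size_t count,
			      size_t table_entries)
{
	size_t i;

	assert(timing_idx != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* An index at or past the end of the table (such as
		 * 0x0f with 12 entries) cannot be looked up: skip it. */
		if (timing_idx[i] < table_entries)
			return (int)i;
	}
	return -1;
}
```

For example, with hypothetical indices {0x0f, 0x02, 0x05} and a 12-entry timing table, entry 0 is skipped and entry 1 is the first usable one.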
