| Summary: | [NVE7] Lenovo Y500 with 2x GT650M, first card gets wrong vbios, fails boot on 3.13+ | | |
|---|---|---|---|
| Product: | xorg | Reporter: | Claas Lorenz <cllorenz> |
| Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> |
| Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
| Severity: | normal | | |
| Priority: | medium | CC: | cllorenz |
| Version: | git | | |
| Hardware: | x86-64 (AMD64) | | |
| OS: | Linux (All) | | |
Am I correct that you have 2 ~identical GK107 cards in there? Unfortunately it appears that the VBIOS for the first one is corrupt. It's being retrieved from PCIROM, which more often than not doesn't actually have what we want (despite it passing the checksum). Note that your 2nd card appears to init correctly, and it gets its VBIOS from the PROM. Not sure where the VBIOS for your first card lives...

I assume that the VBIOS error about the unknown opcode exists in 3.12 as well? Except that 3.13 made that an error. Who knows, maybe it's a legitimately unknown opcode... Could you grab a copy of envytools and retrieve the VBIOSes for both of your cards with nvagetbios? i.e.

    nvagetbios -c 0 -s PROM > vbios-0.rom
    nvagetbios -c 1 -s PROM > vbios-1.rom

If the first one fails, also try PRAMIN instead of PROM (I assume -c 1 will work just fine since nouveau is able to find it without problem). If that still fails, try to get it from the pci rom directly:

    echo 1 > /sys/bus/pci/devices/0000:01:00.0/rom
    cat /sys/bus/pci/devices/0000:01:00.0/rom > vbios-0.rom
    echo 0 > /sys/bus/pci/devices/0000:01:00.0/rom

(all of this as root, obviously.)

Created attachment 94736 [details]
output of "cat /sys/bus/pci/devices/0000:01:00.0/rom > vbios-0.rom"
Created attachment 94737 [details]
output of "nvagetbios -c 1 -s PROM > vbios-1.rom"
Created attachment 94738 [details]
output of "nvagetbios -c 0 -s PRAMIN > vbios-0.ramin"
Created attachment 94739 [details]
output of "nvagetbios -c 1 -s PRAMIN > vbios-1.ramin"
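An aside on the "passing the checksum" remark earlier in this thread: the sketch below shows what such a check typically covers. It assumes the standard legacy PCI expansion ROM layout (a 0x55 0xAA signature, the image length in 512-byte blocks at offset 2, and all bytes of the image summing to zero modulo 256). The function name `check_pci_rom` is made up for illustration; it is not part of envytools or nouveau, and nouveau's own validation may differ in detail.

```python
def check_pci_rom(image: bytes) -> dict:
    """Check the legacy PCI expansion ROM header of a dumped image.

    Hypothetical helper for illustration only. Assumes the image starts
    with the 0x55 0xAA signature and stores its length, in 512-byte
    blocks, in byte 2.
    """
    if len(image) < 3:
        return {"signature_ok": False, "length": 0, "checksum_ok": False}

    signature_ok = image[0] == 0x55 and image[1] == 0xAA
    length = image[2] * 512

    # The checksum is only meaningful if the dump contains the whole image.
    complete = 0 < length <= len(image)
    checksum_ok = complete and sum(image[:length]) % 256 == 0

    return {
        "signature_ok": signature_ok,
        "length": length,
        "checksum_ok": checksum_ok,
    }


# Possible usage with a dump from this thread (file name as used above):
#     with open("vbios-0.rom", "rb") as f:
#         print(check_pci_rom(f.read()))
```

A passing result only says the image is internally consistent. It says nothing about whether the init scripts inside it are the ones the driver expects, which is the situation described for the first card.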
Wow, that was fast. Yes, I do have two cards in there, and yes, the kernel message also occurred in all earlier kernels I used (at least since 3.9.8, when I got my laptop). As you can see from the attachments, I had to go the direct pci way to get vbios-0.rom.

Well, analyzing the vbios from the second (working) card, what I see is:

    Init script 0 at 0x83e0:
    0x83e0: 8c UNK8C
    0x83e1: 7a 00 02 00 00 20 20 00 00 ZM_REG R[0x000200] = 0x00002020
    0x83ea: 33 14 REPEAT 0x14
    0x83ec: 6e 00 00 00 00 ff ff ff ff 00 00 00 00 NV_REG R[0x000000] &= 0xffffffff |= 0x00000000
    0x83f9: 36 END_REPEAT
    0x83fa: 7a 00 02 00 00 25 21 01 40 ZM_REG R[0x000200] = 0x40012125
    0x8403: 7a c0 24 12 00 00 00 00 00 ZM_REG R[0x1224c0] = 0x00000000
    0x840c: 7a 40 26 12 00 00 00 00 00 ZM_REG R[0x122640] = 0x00000000
    0x8415: 6e 00 24 02 00 ff f7 ff ff 00 08 00 00 NV_REG R[0x022400] &= 0xfffff7ff |= 0x00000800

and so on. Looking at the pci rom of the first (non-working) card, I see:

    Init script 0 at 0x83e0:
    0x83e0: 42 ???
    0x83e1: 66 CONFIGURE_MEM
    0x83e2: ad ???
    0x83e3: 66 CONFIGURE_MEM
    0x83e4: c1 ???
    0x83e5: c8 ???

As you can see, the bytes are all different. I'm pretty sure this is 16-bit real mode x86 code:

    $ udcli -16 -x 42 66 ad 66 c1 c8 10 ee 66 c1 c0 08 ee 66 c1 c0 08 ee e2 ed 1f 66 61
    0000000000000000 42 inc dx
    0000000000000001 66ad lodsd
    0000000000000003 66c1c810 ror eax, 0x10
    0000000000000007 ee out dx, al
    0000000000000008 66c1c008 rol eax, 0x8
    000000000000000c ee out dx, al
    000000000000000d 66c1c008 rol eax, 0x8
    0000000000000011 ee out dx, al
    0000000000000012 e2ed loop 0x1
    0000000000000014 1f pop ds
    0000000000000015 6661 popad

Otherwise the bioses appear identical... at least the DCB and GPIO tables match up. So I'd recommend simply grabbing that good vbios-1.rom, sticking it in /lib/firmware (in the initrd if that's where nouveau is loaded from), and adding

    nouveau.config=NvBios=vbios-1.rom

to the kernel command line, which will use that as the vbios instead of trying to read it from the card.
I'm not extremely happy with this solution, of course, but it's the fastest way to get something that works.

Can you provide any relevant details about your system? These both appear to be mobile chips; it's pretty uncommon to have two of them in a single system. Would you also mind providing an acpidump? Perhaps the vbios for the first card is hiding in ACPI somewhere unexpected.

I have a Lenovo Y500 with two GeForce GT 650M cards. I think this laptop was intended as a mid-range gaming machine, where two graphics cards make some sense. Here is the lspci output:

    01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 650M] (rev a1)
    02:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 650M] (rev a1)

Since I do not use it for gaming, I never cared about having a second card. I put the working vbios in /lib/firmware and added the config to the kernel line, as you suggested, and it works fine now (also with 3.13). Thank you very much for your help! I also attached my acpidump; I hope that helps.

Created attachment 94782 [details]
output of acpidump
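The comparison described earlier in this thread (the init script bytes at 0x83e0 differ between the two images while other tables appear to match) could be reproduced roughly as follows. This is only a sketch: `differing_ranges` is a hypothetical helper, not an envytools tool, and it does a plain byte-for-byte comparison without understanding any VBIOS structure.

```python
from typing import List, Tuple


def differing_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ.

    Only the common prefix length of the two inputs is compared; a size
    difference between the dumps has to be checked separately.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    return ranges


# Possible usage with the dumps from this thread:
#     with open("vbios-0.rom", "rb") as f0, open("vbios-1.rom", "rb") as f1:
#         for lo, hi in differing_ranges(f0.read(), f1.read()):
#             print(f"0x{lo:04x}-0x{hi:04x}")
```

Offsets printed this way can then be looked up in the decoded output for each image to see which tables or scripts are affected.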
Would you mind trying to edit nouveau_acpi.c:nouveau_acpi_rom_supported and removing the check for dsm_detected and optimus_detected? (And then booting that without the NvBios setting.) I think the situation is that those aren't set, but we should still get the rom from ACPI for that first card.

(In reply to comment #10)
> Would you mind trying to edit nouveau_acpi.c:nouveau_acpi_rom_supported and
> removing the check for dsm_detected and optimus_detected? (And then booting
> that without the NvBios setting.) I think the situation is that those aren't
> set, but we should still get the rom from ACPI for that first card.

A patch to do this was merged a while back, and backported to stable kernels. Pretty sure it will fix the issue for you (without needing the NvBios thing). Feel free to re-open if not.
Created attachment 94735 [details]
dmesg output

During boot (with late module load of nouveau) the screen freezes at about the point where a glitch normally occurs and the resolution is set much higher. It does not recover from this state, and the machine must be restarted or powered off via ctrl+alt+del or the power button. The journald log shows that systemd booted normally. I worked around the problem by downgrading to kernel 3.12, where everything works fine.

Here are the Arch Linux package versions of the configuration in which the error occurs:

    kernel - 3.13.5-1
    xf86-video-nouveau - 1.0.10-2
    nouveau-dri - 10.0.3-1
    mesa - 10.0.3-1
    xorg-server - 1.15.0-5
    libdrm - 2.4.52-1

The dmesg output in the attachment was taken during a boot (as last systemd target before gnome) with the problem described above.