Bug 76475 - [NVE7] fails to load due to unknown opcode 0x80 (incorrect vbios)
Summary: [NVE7] fails to load due to unknown opcode 0x80 (incorrect vbios)
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2014-03-22 09:19 UTC by patrick.clara
Modified: 2014-05-09 21:37 UTC

Attachments
vbios from /sys/bus/pci/devices/<pciid>/rom (58.00 KB, application/octet-stream)
2014-03-22 09:19 UTC, patrick.clara
dmesg from kernel 3.12 (76.08 KB, text/plain)
2014-03-22 09:20 UTC, patrick.clara
dmesg from kernel 3.14 rc7 (71.85 KB, text/plain)
2014-03-22 09:20 UTC, patrick.clara
lspci (27.99 KB, text/plain)
2014-03-22 19:10 UTC, patrick.clara
dmidecode (8.99 KB, text/plain)
2014-03-22 19:10 UTC, patrick.clara
acpidump (367.37 KB, application/octet-stream)
2014-03-22 19:12 UTC, patrick.clara
output of "echo w > /proc/sysrq-trigger" (74.35 KB, text/plain)
2014-03-27 11:07 UTC, patrick.clara
powertop html report (69.72 KB, text/html)
2014-03-27 11:19 UTC, patrick.clara

Description patrick.clara 2014-03-22 09:19:47 UTC
Created attachment 96195
vbios from /sys/bus/pci/devices/<pciid>/rom

Since kernel 3.13, nouveau fails to load with error -22.
It also reports an unknown opcode 0x80.
I suspect it fails to load because of the unknown opcode, since I have seen that the return code of the function checking the opcodes changed in 3.13.

I have attached the vbios taken from /sys/bus/pci/devices/<pciid>/rom and the dmesg from both kernel 3.12 and 3.14 rc7.

Would an mmiotrace be helpful?
Comment 1 patrick.clara 2014-03-22 09:20:20 UTC
Created attachment 96196
dmesg from kernel 3.12
Comment 2 patrick.clara 2014-03-22 09:20:39 UTC
Created attachment 96197
dmesg from kernel 3.14 rc7
Comment 3 Ilia Mirkin 2014-03-22 13:29:36 UTC
Unfortunately the VBIOS from the PCIROM is often the wrong one. The data here is x86 opcodes rather than VBIOS opcodes. And as you observed, we now consider unknown opcodes an error. (The init tables need to be run on e.g. resume, and if it's a secondary card, on load as well.)
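
The effect of treating an unknown opcode as an error can be sketched in userspace C. This is only an illustration, not nouveau's interpreter: the opcode values and lengths in known_opcode_len() are hypothetical, and the only detail taken from this report is that the failure surfaces as -22, which is -EINVAL on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode table: returns the encoded length of a known
 * opcode, or 0 if the opcode is not recognised. All values are made up. */
static size_t known_opcode_len(uint8_t op)
{
	switch (op) {
	case 0x10: return 1; /* hypothetical opcode without operands */
	case 0x20: return 3; /* hypothetical opcode with two operand bytes */
	case 0x71: return 1; /* hypothetical "done" marker */
	default:   return 0;
	}
}

/* Walk a table until the "done" marker. Returns 0 on success and
 * -EINVAL for an unknown or truncated opcode, or a missing marker. */
static int run_init_table(const uint8_t *tbl, size_t len)
{
	size_t pos = 0;

	assert(tbl != NULL);
	while (pos < len) {
		size_t n = known_opcode_len(tbl[pos]);

		if (n == 0 || pos + n > len)
			return -EINVAL;
		if (tbl[pos] == 0x71)
			return 0;
		pos += n;
	}
	return -EINVAL;
}
```

Feeding this sketch a table that contains the byte 0x80 makes run_init_table() return -EINVAL instead of skipping the byte, which is the behaviour change described above: data that is not a VBIOS init table now aborts the load.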

An mmiotrace of the blob loading would potentially be helpful, you can send one (compressed... xz -9 works well) to mmio.dumps@gmail.com.

Can you provide a bit more info about this system? Is it an optimus setup? Could you provide an acpidump?
Comment 4 patrick.clara 2014-03-22 19:10:20 UTC
Created attachment 96213
lspci
Comment 5 patrick.clara 2014-03-22 19:10:53 UTC
Created attachment 96214
dmidecode
Comment 6 patrick.clara 2014-03-22 19:12:07 UTC
Created attachment 96215
acpidump
Comment 7 patrick.clara 2014-03-22 21:40:55 UTC
> Unfortunately the VBIOS from the PCIROM is often the wrong one. The data
> here is the x86 opcodes instead of the vbios opcodes. And as you observed,
> we now consider unknown opcodes an error. (The init tables need to be run on
> e.g. resume, and if it's a secondary card, on load as well.)

I have read many times on the internet about PCIROM providing the wrong BIOS. I have tried various methods of getting one, but all of them failed or provided exactly the same BIOS as PCIROM. Vbtracetool freezes. nvclock also fails.

> An mmiotrace of the blob loading would potentially be helpful, you can send
> one (compressed... xz -9 works well) to mmio.dumps@gmail.com.

I have sent one. I used blob version 331.49. During the trace I loaded the nvidia module and started X. I did it with an Ubuntu live CD with kernel 3.13, since on my Debian installation I am not able to launch X anymore. Hopefully I did it right.

> Can you provide a bit more info about this system? Is it an optimus setup?
> Could you provide an acpidump?

The machine is an ASUS ZENBOOK UX51VZH, the version with the 2880x1620 display.
It appears to have only the NVIDIA GPU, or at least the Intel one in the i7-3632QM is permanently disabled.

I have attached lspci, dmidecode and an acpidump.
Comment 8 Ilia Mirkin 2014-03-23 17:52:02 UTC
OK, as best I can tell from the mmiotrace, it's using the ACPI method[1]. Further, your acpi dump indeed contains a _DSM method, as well as a _ROM method, which should be used to get at the ROM. Unfortunately, it would appear that evaluating _DSM fails for you:

[    2.920235] ACPI Warning: \_SB_.PCI0.PEG0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131218/nsarguments-95)
[    2.920300] ACPI: \_SB_.PCI0.PEG0.GFX0: failed to evaluate _DSM

Why it's failing to evaluate the _DSM is, unfortunately, beyond my knowledge of how these things work. The failure did not seem present in earlier versions (at least not as vocally), although it still didn't find the ACPI ROM back then either. I don't know if that ACPI warning is what actually causes it, or if there's something else.

At the top of nouveau_acpi.c:nouveau_acpi_rom_supported, there's a check for dsm_detected or optimus_detected. Try removing that check entirely and let it just try to grab the ROM from ACPI no matter what. I think that dsm_detected might not be getting set even if the earlier _DSM call had succeeded.

[1] Here is what I'm seeing. The other things are identifiable as the PRAMIN and PROM methods. I'm assuming that the maps of random-seeming memory are actually it reading out the ACPI rom.

[0] 1248.299066 MMIO32 R 0x619f04 0x007ffe09 PDISPLAY.VGA.ROM_WINDOW => { BASE = 0x7ffe0000 | ENABLE | TARGET = VRAM }
[0] 1248.299087 MMIO32 R 0x022438 0x00000002 PUNITS.DESIGN_PART_COUNT => 0x2
[0] 1248.299105 MMIO32 R 0x022554 0x00000000 PUNITS.HW_MC_DISABLE_MASK => 0
[0] 1248.299123 MMIO32 R 0x100800 0x00000002 PFFB.PART_CONFIG => { PART_COUNT = 0x2 | MEM_SPLIT_PART = 0 }
[0] 1248.299139 MMIO32 R 0x11020c 0x00000400 PBFB[0].MEM_AMOUNT => 0x40000000
[0] 1248.299158 MMIO32 R 0x11120c 0x00000400 PBFB[0x1].MEM_AMOUNT => 0x40000000
[0] 1248.299177 MMIO32 R 0x001700 0x00007ff0 PBUS.HOST_MEM.PMEM => { OFFSET = 0x7ff00000 | TARGET = VRAM }
[0] 1248.299188, MEM8 7ffe0000 => ef
[0] 1248.299197, MEM8 7ffe0001 => be
SLEEP 0.212000ms
[0] 1248.299409 MMIO32 R 0x619f04 0x007ffe09 PDISPLAY.VGA.ROM_WINDOW => { BASE = 0x7ffe0000 | ENABLE | TARGET = VRAM }
[0] 1248.299423, MEM16 7ffe0000 => beef
[0] 1248.299442 MMIO32 R 0x101000 0x9f408480 PSTRAPS.STRAPS0_PRIMARY => { VALUE = { RAMCFG = 0 | CRYSTAL = 0 | DEVICE_ID_0_3 = 0x1 | BAR1_SIZE = 256MB | ROM_TYPE = SPI | FP_CONFIG = 0xf | DEVICE_ID_4 = 0x1 | DEVICE_ID_5 = 0 | 0x80 } | OVERRIDE_ENABLE }
[0] 1248.299483 MMIO32 R 0x088050 0x00000001 PPCI.ROM_SHADOW_ENABLE => 0x1
[0] 1248.299505 MMIO32 W 0x088050 0x00000000 PPCI.ROM_SHADOW_ENABLE <= 0
[0] 1248.299524 MMIO32 R 0x088050 0x00000000 PPCI.ROM_SHADOW_ENABLE => 0
[0] 1248.299540 MMIO32 R 0x300000 0x00000000 PROM[0] => 0
                                             PROM[0x1] => 0
                                             PROM[0x2] => 0
                                             PROM[0x3] => 0
[0] 1248.299557 MMIO32 W 0x088050 0x00000001 PPCI.ROM_SHADOW_ENABLE <= 0x1
MAP 1248.299831 6 0xde759000 0xffffc900040ba000 0x1000 0x0 0
SLEEP 0.304000ms
UNMAP 1248.309290 6 0x0 0
MAP 1248.309326 7 0xde75a000 0xffffc900040bc000 0x1000 0x0 0
SLEEP 4.378000ms
UNMAP 1248.309595 7 0x0 0
MAP 1248.309828 8 0xde75a000 0xffffc900040be000 0x1000 0x0 0
SLEEP 0.472000ms
UNMAP 1248.317354 8 0x0 0
MAP 1248.317361 9 0xde75b000 0xffffc900040fa000 0x1000 0x0 0
SLEEP 2.454000ms
UNMAP 1248.317627 9 0x0 0
Comment 9 patrick.clara 2014-03-26 22:50:46 UTC
I have played a bit with the _ROM method and acpi_call. It actually returns a BIOS nearly double the size of the one from PCIROM, so there should be no doubt that the PCIROM one is wrong.
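
For reference, _ROM takes an offset and a length and returns at most 4 KiB per call, so a complete image has to be fetched in chunks. The following is a minimal userspace sketch of such a loop, not the nouveau code: rom_call_fn, read_rom() and the in-memory fake_rom backend are hypothetical names used only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROM_CHUNK 4096 /* _ROM returns at most 4 KiB per call */

/* Hypothetical stand-in for one _ROM evaluation: copies up to len bytes
 * starting at offset into dst and returns the number of bytes copied. */
typedef size_t (*rom_call_fn)(void *ctx, uint32_t offset, uint32_t len,
			      uint8_t *dst);

/* Read up to want bytes in ROM_CHUNK-sized pieces; stops on a short read. */
static size_t read_rom(rom_call_fn call, void *ctx, uint8_t *dst, size_t want)
{
	size_t got = 0;

	assert(call != NULL && dst != NULL);
	while (got < want) {
		size_t len = want - got;
		size_t n;

		if (len > ROM_CHUNK)
			len = ROM_CHUNK;
		n = call(ctx, (uint32_t)got, (uint32_t)len, dst + got);
		got += n;
		if (n < len)
			break; /* short read: end of the image */
	}
	return got;
}

/* In-memory backend, used only to exercise read_rom() without firmware. */
struct fake_rom {
	const uint8_t *data;
	size_t size;
};

static size_t fake_rom_call(void *ctx, uint32_t offset, uint32_t len,
			    uint8_t *dst)
{
	const struct fake_rom *rom = ctx;
	size_t n;

	if (offset >= rom->size)
		return 0;
	n = rom->size - offset;
	if (n > len)
		n = len;
	memcpy(dst, rom->data + offset, n);
	return n;
}
```

Comparing the number of bytes such a loop returns with the size of the PCIROM dump is one way to see that the two images differ.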

I have tried removing that check in nouveau_acpi.c:nouveau_acpi_rom_supported and it actually seems to load correctly, except that a kworker uses 100% of a CPU. This also happened in 3.12, so it may be another bug that is unrelated to this one.

[   24.412373] [drm] Initialized drm 1.1.0 20060810
[   24.645116] ACPI Warning: \_SB_.PCI0.PEG0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131218/nsarguments-95)
[   24.645180] ACPI: \_SB_.PCI0.PEG0.GFX0: failed to evaluate _DSM
[   24.645209] checking generic (e0000000 1e8000) vs hw (e0000000 10000000)
[   24.645211] fb: conflicting fb hw usage nouveaufb vs simple - removing generic driver
[   24.645285] Console: switching to colour dummy device 80x25
[   24.646505] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x0e7110a2
[   24.646511] nouveau  [  DEVICE][0000:01:00.0] Chipset: GK107 (NVE7)
[   24.646514] nouveau  [  DEVICE][0000:01:00.0] Family : NVE0
[   24.654219] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[   24.654229] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[   24.654231] nouveau  [   VBIOS][0000:01:00.0] checking PROM for image...
[   24.654288] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[   24.654290] nouveau  [   VBIOS][0000:01:00.0] checking ACPI for image...
[   26.781797] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[   26.781806] nouveau  [   VBIOS][0000:01:00.0] using image from ACPI
[   26.782089] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[   26.782095] nouveau  [   VBIOS][0000:01:00.0] version 80.07.46.00.45
[   26.783154] nouveau 0000:01:00.0: irq 48 for MSI/MSI-X
[   26.783173] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[   26.783251] nouveau  [     PFB][0000:01:00.0] RAM type: GDDR5
[   26.783254] nouveau  [     PFB][0000:01:00.0] RAM size: 2048 MiB
[   26.783257] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 0 tags
[   26.802029] nouveau  [    VOLT][0000:01:00.0] GPU voltage: 925000uv
[   26.826224] nouveau  [  PTHERM][0000:01:00.0] FAN control: none / external
[   26.826233] nouveau  [  PTHERM][0000:01:00.0] fan management: automatic
[   26.826238] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[   26.826280] nouveau  [     CLK][0000:01:00.0] 07: core 270-405 MHz memory 810 MHz
[   26.826365] nouveau  [     CLK][0000:01:00.0] 0a: core 270-835 MHz memory 1600 MHz
[   26.826431] nouveau  [     CLK][0000:01:00.0] 0f: core 270-835 MHz memory 4000 MHz
[   26.826571] nouveau  [     CLK][0000:01:00.0] --: core 405 MHz memory 648 MHz
[   26.870817] [TTM] Zone  kernel: Available graphics memory: 4074432 kiB
[   26.870819] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[   26.870821] [TTM] Initializing pool allocator
[   26.870826] [TTM] Initializing DMA pool allocator
[   26.870838] nouveau  [     DRM] VRAM: 2048 MiB
[   26.870840] nouveau  [     DRM] GART: 1048576 MiB
[   26.870844] nouveau  [     DRM] TMDS table version 2.0
[   26.870846] nouveau  [     DRM] DCB version 4.0
[   26.870848] nouveau  [     DRM] DCB outp 00: 04800fb6 0f430014
[   26.870850] nouveau  [     DRM] DCB outp 01: 02011f00 00000000
[   26.870852] nouveau  [     DRM] DCB outp 02: 02022f62 00020010
[   26.870853] nouveau  [     DRM] DCB conn 00: 00020047
[   26.870856] nouveau  [     DRM] DCB conn 01: 00000100
[   26.870857] nouveau  [     DRM] DCB conn 02: 00010261
[   26.872193] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   26.872195] [drm] Driver supports precise vblank timestamp query.
[   26.872197] nouveau  [     DRM] ACPI backlight interface available, not registering our own
[   26.884895] nouveau  [     DRM] MM: using COPY for buffer copies
[   26.936448] nouveau  [     DRM] allocated 2880x1620 fb: 0x80000, bo ffff880210ea2400
[   27.246200] Console: switching to colour frame buffer device 360x101
[   27.260592] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[   27.260594] nouveau 0000:01:00.0: registered panic notifier
[   27.260598] [drm] Initialized nouveau 1.1.1 v3.14-rc7-59-g08edb33c4e1b81001 for 0000:01:00.0 on minor 0


At this point I would like to clarify that I don't really understand much about the whole thing, but in any case I have written down some of my experimentation. I hope it doesn't contain too much nonsense...

I actually don't understand this check in nouveau_acpi.c:nouveau_acpi_rom_supported

if (!nouveau_dsm_priv.dsm_detected && !nouveau_dsm_priv.optimus_detected)
		return false;

Why do we need either a DSM or Optimus in order to read the VBIOS from ACPI?
Let's suppose this check makes sense.

nouveau_dsm_priv.dsm_detected and nouveau_dsm_priv.optimus_detected should get set in nouveau_acpi.c:nouveau_dsm_detect. In my case vga_count==1, which gets set in the first while loop. Since nouveau_dsm_priv.dsm_detected needs vga_count==2 in order to become true, it never will.

So let's check why nouveau_dsm_priv.optimus_detected does not become true.

has_dsm and has_optimus both turn out to be 0, simply because
retval = nouveau_dsm_pci_probe(pdev); is also 0.
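
The decision described so far can be condensed into a small model. This is a simplified sketch based on the description in this comment, not a copy of nouveau_acpi.c: the real nouveau_dsm_detect has further conditions that are left out here, and detect() and acpi_rom_supported() are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>

struct dsm_state {
	bool dsm_detected;
	bool optimus_detected;
};

/* Simplified model: Optimus wins if present, otherwise the older
 * DSM switching method is only accepted with exactly two VGA devices. */
static struct dsm_state detect(int vga_count, bool has_dsm, bool has_optimus)
{
	struct dsm_state s = { false, false };

	assert(vga_count >= 0);
	if (has_optimus)
		s.optimus_detected = true;
	else if (vga_count == 2 && has_dsm)
		s.dsm_detected = true;
	return s;
}

/* Mirrors the quoted check: the ACPI ROM is only tried if a flag is set. */
static bool acpi_rom_supported(struct dsm_state s)
{
	return s.dsm_detected || s.optimus_detected;
}
```

For this machine the inputs are vga_count == 1, has_dsm == 0 and has_optimus == 0, so neither flag is set and the ACPI ROM is never tried.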

In nouveau_acpi.c:nouveau_dsm_pci_probe the relevant check should be

if (nouveau_check_optimus_dsm(dhandle))
		retval |= NOUVEAU_DSM_HAS_OPT;

nouveau_check_optimus_dsm obviously returns 0.

nouveau_acpi.c:nouveau_check_optimus_dsm exits at this check

if (nouveau_optimus_dsm(handle, 0, 0, &result))
		return 0;

because of the "failed to evaluate _DSM" error, triggered in nouveau_acpi.c:nouveau_optimus_dsm at this point:

obj = acpi_evaluate_dsm_typed(handle, nouveau_op_dsm_muid, 0x00000100,
				      func, &argv4, ACPI_TYPE_BUFFER);

It calls the _DSM method and checks whether obj->type == ACPI_TYPE_BUFFER. On my machine the returned object turns out to be of type ACPI_TYPE_INTEGER. That is the type expected by the call in nouveau_acpi.c:nouveau_dsm, but as seen before, that path seems to be excluded because nouveau_dsm_priv.dsm_detected requires vga_count==2.

Other notebooks in the same Zenbook Prime series seem to have Optimus rather than some older switching method, so for the moment I assume the whole condition around vga_count==2 is correct and nouveau_dsm_priv.dsm_detected is not interesting for us. Also, my machine doesn't have 2 GPUs, but maybe it has some kind of permanently disabled Optimus.

I tried changing that ACPI_TYPE_BUFFER to ACPI_TYPE_INTEGER. The "failed to evaluate _DSM" error disappears and nouveau_acpi.c:nouveau_check_optimus_dsm executes to the end, but it still returns 0 (false) because result in nouveau_acpi.c:nouveau_optimus_dsm doesn't get set to anything other than 0.

It does not enter this block

if (obj->buffer.length == 4) {
	*result |= obj->buffer.pointer[0];
	*result |= (obj->buffer.pointer[1] << 8);
	*result |= (obj->buffer.pointer[2] << 16);
	*result |= (obj->buffer.pointer[3] << 24);
}

simply because the object returned by acpi_evaluate_dsm_typed is the integer 0. acpi_call confirms this as well.

I have seen it is called using this parameter

static const char nouveau_op_dsm_muid[] = {
	0xF8, 0xD8, 0x86, 0xA4, 0xDA, 0x0B, 0x1B, 0x47,
	0xA7, 0x2B, 0x60, 0x42, 0xA6, 0xB5, 0xBE, 0xE0,
};

I think it should be somewhere in the DSDT, but I have found this inside _DSM instead:

0x75,0x0B,0xA5,0xD4,0xC7,0x65,0xF7,0x46,
0xBF,0xB7,0x41,0x51,0x4C,0xEA,0x02,0x44

echo "\_SB.PCI0.PEG0.GFX0._DSM {0x75,0x0B,0xA5,0xD4,0xC7,0x65,0xF7,0x46,0xBF,0xB7,0x41,0x51,0x4C,0xEA,0x02,0x44} 0x100 0 0" > call

gives {0x01, 0x00, 0x50, 0x00} as output.

So passing the above instead of the original nouveau_op_dsm_muid removes the need to change ACPI_TYPE_BUFFER into ACPI_TYPE_INTEGER, since the output now has the correct type.
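
To look for these values in a disassembled DSDT it helps to print the 16-byte arrays in the usual GUID notation. The sketch below assumes the mixed-endian layout used for ACPI UUID buffers (first three fields little-endian, the remaining bytes in order); guid_to_string() and same_guid() are illustrative helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format 16 raw bytes as a lowercase GUID string.
 * out must provide room for 36 characters plus the terminating NUL. */
static void guid_to_string(const uint8_t b[16], char out[37])
{
	assert(b != NULL && out != NULL);
	snprintf(out, 37,
		 "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
		 "%02x%02x-%02x%02x%02x%02x%02x%02x",
		 b[3], b[2], b[1], b[0], b[5], b[4], b[7], b[6],
		 b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}

/* Byte-wise comparison of two raw 16-byte identifiers. */
static bool same_guid(const uint8_t a[16], const uint8_t b[16])
{
	return memcmp(a, b, 16) == 0;
}
```

Under that assumption nouveau_op_dsm_muid prints as a486d8f8-0bda-471b-a72b-6042a6b5bee0 and the bytes found inside _DSM print as d4a50b75-65c7-46f7-bfb7-41514cea0244, so the firmware on this machine answers to a different identifier than the one nouveau passes.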

Still, this condition in nouveau_acpi.c:nouveau_check_optimus_dsm doesn't hold:

return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);

I actually don't fully understand the purpose of nouveau_op_dsm_muid[], but maybe it is coupled with NOUVEAU_DSM_OPTIMUS_CAPS.

Changing NOUVEAU_DSM_OPTIMUS_CAPS to 0x16 instead of 0x1A, in order to simulate interpretation of {0x01, 0x00, 0x50, 0x00}, leads to this:


[drm] Initialized drm 1.1.0 20060810
[  188.108609] ACPI Warning: \_SB_.PCI0.PEG0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131218/nsarguments-95)
[  188.108682] ACPI Warning: \_SB_.PCI0.PEG0.GFX0._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131218/nsarguments-95)
[  188.108770] ACPI: \_SB_.PCI0.PEG0.GFX0: failed to evaluate _DSM
[  188.108776] pci 0000:01:00.0: optimus capabilities: disabled, status
[  188.108780] VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.GFX0 handle
[  188.108803] checking generic (e0000000 1e8000) vs hw (e0000000 10000000)
[  188.108805] fb: conflicting fb hw usage nouveaufb vs simple - removing generic driver
[  188.108887] Console: switching to colour dummy device 80x25
[  188.109934] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x0e7110a2
[  188.109938] nouveau  [  DEVICE][0000:01:00.0] Chipset: GK107 (NVE7)
[  188.109941] nouveau  [  DEVICE][0000:01:00.0] Family : NVE0
[  188.116780] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
[  188.116790] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[  188.116793] nouveau  [   VBIOS][0000:01:00.0] checking PROM for image...
[  188.116868] nouveau  [   VBIOS][0000:01:00.0] ... signature not found
[  188.116872] nouveau  [   VBIOS][0000:01:00.0] checking ACPI for image...
[  190.339394] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
[  190.339402] nouveau  [   VBIOS][0000:01:00.0] using image from ACPI
[  190.339686] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
[  190.339692] nouveau  [   VBIOS][0000:01:00.0] version 80.07.46.00.45
[  190.341007] nouveau 0000:01:00.0: irq 48 for MSI/MSI-X
[  190.341022] nouveau  [     PMC][0000:01:00.0] MSI interrupts enabled
[  190.341090] nouveau  [     PFB][0000:01:00.0] RAM type: GDDR5
[  190.341093] nouveau  [     PFB][0000:01:00.0] RAM size: 2048 MiB
[  190.341096] nouveau  [     PFB][0000:01:00.0]    ZCOMP: 0 tags
[  190.359931] nouveau  [    VOLT][0000:01:00.0] GPU voltage: 925000uv
[  190.384143] nouveau  [  PTHERM][0000:01:00.0] FAN control: none / external
[  190.384152] nouveau  [  PTHERM][0000:01:00.0] fan management: automatic
[  190.384157] nouveau  [  PTHERM][0000:01:00.0] internal sensor: yes
[  190.384198] nouveau  [     CLK][0000:01:00.0] 07: core 270-405 MHz memory 810 MHz
[  190.384284] nouveau  [     CLK][0000:01:00.0] 0a: core 270-835 MHz memory 1600 MHz
[  190.384348] nouveau  [     CLK][0000:01:00.0] 0f: core 270-835 MHz memory 4000 MHz
[  190.384489] nouveau  [     CLK][0000:01:00.0] --: core 405 MHz memory 648 MHz
[  190.428782] [TTM] Zone  kernel: Available graphics memory: 4074432 kiB
[  190.428785] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[  190.428786] [TTM] Initializing pool allocator
[  190.428791] [TTM] Initializing DMA pool allocator
[  190.428803] nouveau  [     DRM] VRAM: 2048 MiB
[  190.428805] nouveau  [     DRM] GART: 1048576 MiB
[  190.428809] nouveau  [     DRM] TMDS table version 2.0
[  190.428811] nouveau  [     DRM] DCB version 4.0
[  190.428813] nouveau  [     DRM] DCB outp 00: 04800fb6 0f430014
[  190.428815] nouveau  [     DRM] DCB outp 01: 02011f00 00000000
[  190.428817] nouveau  [     DRM] DCB outp 02: 02022f62 00020010
[  190.428819] nouveau  [     DRM] DCB conn 00: 00020047
[  190.428821] nouveau  [     DRM] DCB conn 01: 00000100
[  190.428823] nouveau  [     DRM] DCB conn 02: 00010261
[  190.430141] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[  190.430143] [drm] Driver supports precise vblank timestamp query.
[  190.430145] nouveau  [     DRM] ACPI backlight interface available, not registering our own
[  190.442596] nouveau  [     DRM] MM: using COPY for buffer copies
[  190.493975] nouveau  [     DRM] allocated 2880x1620 fb: 0x80000, bo ffff880213a52c00
[  190.798327] Console: switching to colour frame buffer device 360x101
[  190.812684] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[  190.812686] nouveau 0000:01:00.0: registered panic notifier
[  190.812691] [drm] Initialized nouveau 1.1.1 v3.14-rc7-59-g08edb33c4e1b81001 for 0000:01:00.0 on minor 0


There is again the "failed to evaluate _DSM" error, but this time it gets triggered when nouveau_optimus_dsm() is called for the second time.

I have also seen that the "failed to evaluate _DSM" error was introduced with commits b072e53b0a27a885d8be3d08c8d8758292762f39 and e284175a96e5af087ea7806b3e38282b524ff5b9. I could even understand that _DSM fails, since this is not an Optimus laptop, but this brings me back to the initial question: why do we need _DSM to load the VBIOS from _ROM?
Comment 10 Ilia Mirkin 2014-03-26 23:06:01 UTC
My (limited) understanding is that _DSM is used to turn the card on/off and switch the mux on relevant laptops between the different GPUs. And the ACPI vbios method was introduced for optimus laptops -- those checks have always been there. I'm fairly sure they can be dropped.

Just to confirm -- dropping the checks from nouveau_acpi.c:nouveau_acpi_rom_supported (and no other changes) makes the VBIOS get loaded from ACPI, without errors.

For the kworker thing -- see what's going on -- 'echo w > /proc/sysrq-trigger' (iirc; perhaps a different sysrq would be helpful... see if it's in nouveau or not. even if it is, it may not be related to this specific issue.)
Comment 11 patrick.clara 2014-03-27 11:07:35 UTC
Created attachment 96452
output of "echo w > /proc/sysrq-trigger"
Comment 12 patrick.clara 2014-03-27 11:08:51 UTC
I have attached the output of "echo w > /proc/sysrq-trigger".

The kworker shows up in "top" as kworker/7:1
Comment 13 patrick.clara 2014-03-27 11:19:51 UTC
Created attachment 96453
powertop html report
Comment 14 Ilia Mirkin 2014-03-28 03:25:31 UTC
Erm oops. I guess 'w' is not what I wanted (perhaps because the tasks aren't actually blocked). From powertop, seems like it's from nouveau_connector_hotplug_work which is probably triggering more work for itself :(

That's a separate bug though. While I'm fairly sure I've seen this before, I don't see a bug open about it. Could you open a fresh one, and include a boot log with the following module settings (e.g. passed in via kernel cmdline)

nouveau.debug=debug drm.debug=0xe

I'm guessing this will keep generating oodles of output, I don't need the infinite log, but a good chunk that includes a bunch of the repeats would be great.
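
For readers wondering what drm.debug=0xe selects: the value is a bitmask of DRM debug categories. The sketch below assumes the category bits used by the DRM core in kernels of that period (CORE 0x01, DRIVER 0x02, KMS 0x04, PRIME 0x08); check the headers of the kernel in use before relying on them. debug_enabled() is an illustrative helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed DRM debug category bits for kernels around 3.14. */
enum {
	DRM_UT_CORE   = 0x01,
	DRM_UT_DRIVER = 0x02,
	DRM_UT_KMS    = 0x04,
	DRM_UT_PRIME  = 0x08,
};

/* True if the given category bit is selected by the drm.debug mask. */
static bool debug_enabled(uint32_t mask, uint32_t category)
{
	assert(category != 0);
	return (mask & category) != 0;
}
```

Under that assumption 0xe enables driver, KMS and PRIME messages but leaves out the core messages.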
Comment 15 patrick.clara 2014-03-28 11:25:10 UTC
Ok thank you.
I will immediately open a new one.

