Bug 103463 - [drm:amdgpu_get_bios [amdgpu]] *ERROR* ACPI VFCT table present but broken (too short #2)
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu (show other bugs)
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2017-10-26 08:18 UTC by Dennis Schridde
Modified: 2017-10-30 20:35 UTC (History)


Attachments
dmesg (broken boot) (81.61 KB, text/plain)
2017-10-26 08:18 UTC, Dennis Schridde
dmesg (regular boot) (80.76 KB, text/x-log)
2017-10-26 08:19 UTC, Dennis Schridde
lshw (broken boot) (23.33 KB, text/x-log)
2017-10-26 08:19 UTC, Dennis Schridde
lshw (regular boot) (23.35 KB, text/x-log)
2017-10-26 08:19 UTC, Dennis Schridde
lspci (broken boot) (14.18 KB, text/x-log)
2017-10-26 08:20 UTC, Dennis Schridde
lspci (regular boot) (14.23 KB, text/x-log)
2017-10-26 08:20 UTC, Dennis Schridde
glxinfo (broken boot) (101.34 KB, text/x-log)
2017-10-26 08:20 UTC, Dennis Schridde
glxinfo (regular boot) (101.34 KB, text/x-log)
2017-10-26 08:21 UTC, Dennis Schridde
glxinfo (regular boot, DRI_PRIME=1) (101.32 KB, text/x-log)
2017-10-26 08:21 UTC, Dennis Schridde
Linux 4.13.10-gentoo config (122.07 KB, text/plain)
2017-10-30 16:21 UTC, Dennis Schridde
photo of screen when kernel hangs (4.44 MB, image/jpeg)
2017-10-30 20:35 UTC, Dennis Schridde

Description Dennis Schridde 2017-10-26 08:18:50 UTC
Created attachment 135050 [details]
dmesg (broken boot)

I am seeing messages like the following during startup:
```
[drm:amdgpu_get_bios [amdgpu]] *ERROR* ACPI VFCT table present but broken (too short #2)
AMD-Vi: Event logged [IO_PAGE_FAULT device=00:00.0 domain=0x0000 address=0x00000000fffc0000 flags=0x0070]
[drm:dce_v11_0_set_pageflip_irq_state [amdgpu]] *ERROR* invalid pageflip crtc 5
[drm:amdgpu_irq_disable_all [amdgpu]] *ERROR* error disabling interrupt (-22)
amdgpu 0000:01:00.0: Fatal error during GPU init
[TTM] Memory type 2 has not been initialized
```

The first message always appears, while the others are not as easily reproducible.  During a boot like this, the second card (AMD Radeon RX 560) fails to come up and is not available to the system.  After a "regular" startup, `dmesg -l err` shows the following messages:
```
[    3.056584] [drm:amdgpu_get_bios [amdgpu]] *ERROR* ACPI VFCT table present but broken (too short #2)
[    6.592964] ACPI Error: [AFN7] Namespace lookup failure, AE_NOT_FOUND (20170531/psargs-364)
[    6.593020] ACPI Error: Method parse/execution failed \_SB.PCI0.VGA.LCD._BCM, AE_NOT_FOUND (20170531/psparse-550)
[    6.593062] ACPI Error: Evaluating _BCM failed (20170531/video-364)
[    6.593207] ACPI Error: [AFN7] Namespace lookup failure, AE_NOT_FOUND (20170531/psargs-364)
[    6.593243] ACPI Error: Method parse/execution failed \_SB.PCI0.PB21.VGA.LCD._BCM, AE_NOT_FOUND (20170531/psparse-550)
[    6.593286] ACPI Error: Evaluating _BCM failed (20170531/video-364)
[    6.628143] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
[    6.631508] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
[    6.637737] snd_hda_intel 0000:01:00.1: control 3:0:0:ELD:0 is already present
```
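
To see at a glance which subsystems are reporting errors, output like the above can be grouped by its source. This is a minimal sketch in Python, not part of any kernel tooling: the names `source_of` and `group_errors` and the grouping rule (the `[drm:function [module]]` prefix, otherwise the first word of the message) are assumptions made for illustration.

```python
import re
import sys
from collections import Counter

# Leading "[    3.056584] " timestamp that dmesg prints by default.
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+")
# "[drm:function [module]]" prefix used by DRM error messages.
DRM_RE = re.compile(r"^\[drm:(?P<func>\w+) \[(?P<mod>\w+)\]\]")


def source_of(message):
    """Return a short label for whatever emitted this log message."""
    match = DRM_RE.match(message)
    if match:
        return f"{match.group('mod')}:{match.group('func')}"
    # Fall back to the first word before the first colon,
    # e.g. "ACPI" or "snd_hda_intel".
    return message.split(":", 1)[0].split()[0]


def group_errors(text):
    """Count log lines per source label, ignoring blank lines."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        counts[source_of(TIMESTAMP_RE.sub("", line))] += 1
    return counts


if __name__ == "__main__":
    for source, count in group_errors(sys.stdin.read()).most_common():
        print(f"{count:4d}  {source}")
```

Saved under a hypothetical name such as `group_errors.py`, it could be fed with `dmesg -l err | python3 group_errors.py`.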

Other weird behaviour I have noticed:
* Hangs of the entire system when I start Steam using `env DRI_PRIME=1 steam` (nothing reacts to commands anymore, including mouse clicks, the power button and the num-lock key, and the mouse cursor moves very sluggishly)
* Crashes of KWin when using Alt+Tab (see below)
* The firmware and GRUB (and Linux, initially) display at 1024x768, while the monitor's native resolution is 2560x1080.  After the Linux kernel takes over, the monitor switches back to the native resolution.
* Sometimes the system fails to boot entirely and gets stuck after the "*ERROR* ACPI VFCT table present but broken" error message

I hope that someone can guide me in gathering more information about this and, in the best case, in getting additional output or a backtrace from the kernel.

Please find the full output of dmesg, lshw, lspci, glxinfo attached.  Output taken after a "broken" boot with the AMD Radeon RX 560 not coming up is suffixed with "broken-boot", while output taken from a system that came up more completely is suffixed with "regular-boot".

I run Gentoo Linux with the following software:
* Linux 4.13.8
* Mesa 17.2.3
* LLVM 5.0.0

I have two graphics cards plugged in:
* AMD Radeon R7 / AMD A10-7800
* AMD Radeon RX 560

The monitor is connected via DisplayPort to the first card (R7).

If more information would be helpful, please tell me how and I will try to 
acquire it.

See-Also: https://bugs.freedesktop.org/show_bug.cgi?id=103234
Comment 1 Dennis Schridde 2017-10-26 08:19:14 UTC
Created attachment 135051 [details]
dmesg (regular boot)
Comment 2 Dennis Schridde 2017-10-26 08:19:36 UTC
Created attachment 135052 [details]
lshw (broken boot)
Comment 3 Dennis Schridde 2017-10-26 08:19:54 UTC
Created attachment 135053 [details]
lshw (regular boot)
Comment 4 Dennis Schridde 2017-10-26 08:20:12 UTC
Created attachment 135054 [details]
lspci (broken boot)
Comment 5 Dennis Schridde 2017-10-26 08:20:26 UTC
Created attachment 135055 [details]
lspci (regular boot)
Comment 6 Dennis Schridde 2017-10-26 08:20:51 UTC
Created attachment 135056 [details]
glxinfo (broken boot)
Comment 7 Dennis Schridde 2017-10-26 08:21:09 UTC
Created attachment 135057 [details]
glxinfo (regular boot)
Comment 8 Dennis Schridde 2017-10-26 08:21:52 UTC
Created attachment 135058 [details]
glxinfo (regular boot, DRI_PRIME=1)
Comment 9 Michel Dänzer 2017-10-26 09:08:59 UTC
(In reply to Dennis Schridde from comment #0)
> During a boot like this, the second card (AMD Radeon RX 560) fails to come up
> and is not available to the system.

That's actually because of:

 [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
 [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -22
 amdgpu 0000:01:00.0: amdgpu_init failed

the other messages are probably mostly harmless / not directly related to this problem.
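
For reference, the `-22` in the `hw_init` line follows the kernel convention of returning negated errno values, and errno 22 is `EINVAL` ("Invalid argument"). A small Python sketch for decoding such codes; the helper name `decode_kernel_errno` is made up for illustration, and the symbolic names come from the `errno` module of the machine the script runs on, which is assumed here to be Linux.

```python
import errno
import os


def decode_kernel_errno(code):
    """Map a kernel-style return code such as -22 to (name, description)."""
    number = abs(code)
    # errno.errorcode maps numbers to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, description = decode_kernel_errno(-22)
    print(f"-22 -> {name}: {description}")
```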


> * Hangs of the entire system when I start Steam using `env DRI_PRIME=1
> steam` (nothing reacts to commands anymore, including mouse clicks, the power
> button and the num-lock key, and the mouse cursor moves very sluggishly)

That's probably related to the above.


> * The firmware and GRUB (and Linux, initially) display at 1024x768, while
> the monitor's native resolution is 2560x1080.

That's a motherboard firmware / video card ROM issue, nothing to do with the Linux kernel / drivers.
Comment 10 Dennis Schridde 2017-10-26 09:41:37 UTC
(In reply to Michel Dänzer from comment #9)
> (In reply to Dennis Schridde from comment #0)
> > During a boot like this, the second card (AMD Radeon RX 560) fails to come up
> > and is not available to the system.
> 
> That's actually because of:
> 
>  [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed
> (scratch(0xC040)=0xCAFEDEAD)
>  [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0>
> failed -22
>  amdgpu 0000:01:00.0: amdgpu_init failed
> 
> the other messages are probably mostly harmless / not directly related to
> this problem.
> 
> 
> > * Hangs of the entire system when I start Steam using `env DRI_PRIME=1
> > steam` (nothing reacts to commands anymore, including mouse clicks, the power
> > button and the num-lock key, and the mouse cursor moves very sluggishly)
> 
> That's probably related to the above.

Clarification / more information: The RX 560 goes missing for a few boots *after* the full system hang.  That is, I first run Steam with DRI_PRIME=1 and click around for a bit until the system hangs (sometimes waiting alone seems to be enough).  Then I hard-reset the system; once Linux has started, the RX 560 is missing.  I reboot, and the RX 560 is still missing (this repeats for a few iterations), until eventually the RX 560 is back and we have a "regular boot".
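
When comparing many boots, a saved kernel log can be labelled "broken" or "regular" by looking for the failure strings quoted in this report. A rough Python sketch, assuming those three strings are representative of a failed RX 560 init; `classify_boot` and `FAILURE_MARKERS` are illustrative names. The "ACPI VFCT table present but broken" message is deliberately not a marker, because it also shows up on regular boots.

```python
# Strings taken from the log excerpts in this report; the list is an
# assumption about what distinguishes a broken boot, not a complete rule.
FAILURE_MARKERS = (
    "ring 0 test failed",
    "amdgpu_init failed",
    "Fatal error during GPU init",
)


def classify_boot(dmesg_text):
    """Return ("broken" | "regular", list of failure markers found)."""
    hits = [marker for marker in FAILURE_MARKERS if marker in dmesg_text]
    return ("broken" if hits else "regular"), hits
```

Calling `classify_boot(open(path).read())` on each saved dmesg file would then give one label per boot.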

> > * The firmware and GRUB (and Linux, initially) display at 1024x768, while
> > the monitor's native resolution is 2560x1080.
> 
> That's a motherboard firmware / video card ROM issue, nothing to do with the
> Linux kernel / drivers.

One more bit of information (though probably still unrelated): The resolution is correct when I do not plug in the RX 560, or when I connect the display to the RX 560 directly (and set up the mainboard firmware to use the dGPU as primary video adapter).
Comment 11 Dennis Schridde 2017-10-30 16:16:42 UTC
One more thing: The first "cold" boot (booting the system after it was powered off) usually fails: the kernel hangs after "[drm:amdgpu_get_bios [amdgpu]] *ERROR* ACPI VFCT table present but broken (too short #2)" and the blinking cursor freezes.  The next boot, after hard-resetting the machine, usually succeeds.  I write "usually" because sometimes the first boot already succeeds, and sometimes it takes two hard resets to bring up the machine successfully, but I cannot yet make out a pattern.  I will enable the verbose and debug kernel command line arguments to hopefully get some more information the next time it happens.
Comment 12 Dennis Schridde 2017-10-30 16:20:03 UTC
P.S. If you have ANY hint on how to gather more information about this issue, I would be most grateful.  Maybe there is some way to make the kernel dump stack traces somewhere, or to make the hardware itself dump information someplace?
Comment 13 Dennis Schridde 2017-10-30 16:21:35 UTC
Created attachment 135165 [details]
Linux 4.13.10-gentoo config
Comment 14 Dennis Schridde 2017-10-30 20:35:04 UTC
Created attachment 135171 [details]
photo of screen when kernel hangs

Please find attached a photo of the screen contents, with "verbose debug" in the kernel command line, at the time when the kernel hangs. If you have a tip on how to get the full log in text form, even though the kernel hangs, that would be great and might give us additional information on what is actually happening.

