Bug 109366 - NULL pointer at pcie_capability_read_dword with Radeon SI vfio passthrough
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2019-01-15 15:39 UTC by Ryan Bair
Modified: 2019-11-19 09:34 UTC (History)


Attachments
dmesg output (4.88 KB, text/plain)
2019-01-15 15:39 UTC, Ryan Bair
possible fix (2.21 KB, patch)
2019-01-15 17:16 UTC, Alex Deucher

Description Ryan Bair 2019-01-15 15:39:07 UTC
Created attachment 143131
dmesg output

My guest hits the attached bug and call trace during boot. This is on kernel 4.20.2-200; the 4.19 series is also affected. 4.18 fails similarly, in the older drm_-prefixed version of the function.
Comment 1 Alex Williamson 2019-01-15 16:22:20 UTC
As a workaround, use a Q35 VM configuration with the assigned GPU downstream of an emulated PCIe root port. The driver assumes this configuration (presumably it's the only one that exists on bare metal) and reads from the upstream device without checking that it is actually present.
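
To illustrate the kind of check that is missing, here is a minimal userspace sketch. It is not the driver code and not the attached patch. The struct and function names (fake_pci_dev, fake_pci_bus, read_upstream_lnkcap) are hypothetical stand-ins; the only kernel fact assumed is that a bus's bridge pointer (pci_bus->self) is NULL on a root bus, which is presumably what the GPU sits on in an i440FX guest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct fake_pci_dev {
    uint32_t lnkcap; /* pretend Link Capabilities register value */
};

/* Hypothetical stand-in for struct pci_bus: 'self' is the bridge
 * leading to this bus, and is NULL when the bus is a root bus. */
struct fake_pci_bus {
    struct fake_pci_dev *self;
};

/* Store the upstream port's link capabilities in *out and return 0,
 * or return -1 when there is no upstream port to read from. */
int read_upstream_lnkcap(const struct fake_pci_bus *bus, uint32_t *out)
{
    if (bus == NULL || bus->self == NULL)
        return -1; /* device sits directly on a root bus */

    *out = bus->self->lnkcap;
    return 0;
}
```

With a check of this shape, a caller can fall back to a conservative default when the function returns -1 instead of dereferencing a NULL upstream device.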
Comment 2 Alex Deucher 2019-01-15 17:16:12 UTC
Created attachment 143133
possible fix

Does this patch fix it?  dGPUs are always add-in cards, so on bare metal they always plug into an upstream port.  The driver needs to query the upstream port to determine which PCIe gen speeds and lane counts are available on the platform so that it can adjust them at runtime to save power.
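
For readers unfamiliar with what "querying the upstream port" yields, the following is a rough sketch of decoding a PCIe Link Capabilities register value. It is an illustration only, not the radeon code. The macro and function names are made up for this example; the bit layout (maximum link speed in bits 3:0, maximum link width in bits 9:4) follows the PCIe Link Capabilities register definition. Treating the speed code directly as the generation number is a simplification.

```c
#include <stdint.h>

/* Illustrative names; field positions follow the PCIe Link
 * Capabilities register layout. */
#define LNKCAP_MAX_SPEED_MASK  0x0000000fu /* bits 3:0 */
#define LNKCAP_MAX_WIDTH_MASK  0x000003f0u /* bits 9:4 */
#define LNKCAP_MAX_WIDTH_SHIFT 4

/* Maximum link speed code: 1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s. */
unsigned lnkcap_max_speed_code(uint32_t lnkcap)
{
    return lnkcap & LNKCAP_MAX_SPEED_MASK;
}

/* Maximum link width in lanes (for example 1, 4, 8, 16). */
unsigned lnkcap_max_lanes(uint32_t lnkcap)
{
    return (lnkcap & LNKCAP_MAX_WIDTH_MASK) >> LNKCAP_MAX_WIDTH_SHIFT;
}
```

A port that can do 8 GT/s on 16 lanes would report a value whose low bits are 0x103: speed code 3 and width 16.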
Comment 3 Ryan Bair 2019-01-17 14:12:53 UTC
Thank you both for the responses. 

I can confirm that the issue does not occur with the Q35 machine type.

I'm rebuilding a kernel today to test the patch and will report back.
Comment 4 Ryan Bair 2019-01-17 14:51:47 UTC
I can confirm the attached patch does fix the issue for i440FX.
Comment 5 Martin Peres 2019-11-19 09:34:52 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/861.

