Bug 111582

Summary: VGPU fails and kills KVM host
Product: DRI
Reporter: velde666
Component: DRM/iGVT-g
Assignee: Terrence Xu <terrence.xu>
Status: RESOLVED MOVED
QA Contact: Terrence Xu <terrence.xu>
Severity: normal
Priority: not set
CC: velde666
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments (description, flags):
trace 1, none
trace 2, none
trace 3, none

Description velde666 2019-09-07 08:41:53 UTC
Created attachment 145291 [details]
trace 1

Hi there,

I am running CentOS 7 on an Intel NUC7i3BNH with KVM/QEMU, passing a virtualized GPU (vGPU) through to a Windows 10 (1903) VM.

The kernel is 5.2.11, compiled with a config merged from the standard CentOS 3.10.x kernel and the 5.2.x kernel-ml from elrepo-kernel.

I have an issue where the vGPU used in the Windows 10 guest throws the following errors a gazillion times on the host:

Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446000
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446008
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446010
[...]
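
To get a handle on a flood like this, a small script can count the message kinds and check how the failing guest addresses move. The following is only a sketch: the function name `summarize` and the category labels are made up for illustration, and it assumes the `gpa` values are printed in hexadecimal, which fits the step from `...008` to `...010` in the excerpt above.

```python
import re
from collections import Counter

# Log lines copied from the excerpt above. A real run would read the
# host's kernel log instead (location depends on the system).
SAMPLE = """\
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446000
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446008
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000c3628cae guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: spt 00000000eed2450a guest entry 0xffffffffffffffff type 9
Sep  6 19:59:55 floor13 kernel: gvt: vgpu 1: fail: shadow page 00000000eed2450a guest entry 0xffffffffffffffff type 9.
Sep  6 19:59:55 floor13 kernel: gvt: guest page write error, gpa 193446010
"""

GVT_LINE = re.compile(r"kernel: gvt: (?P<msg>.*)$")
GPA = re.compile(r"guest page write error, gpa ([0-9a-fA-F]+)")


def summarize(lines):
    """Count gvt message kinds and collect the failing guest addresses."""
    kinds = Counter()
    gpas = []
    for line in lines:
        match = GVT_LINE.search(line)
        if not match:
            continue  # not a gvt kernel message
        msg = match.group("msg")
        gpa = GPA.search(msg)
        if gpa:
            # Assumption: the address is hexadecimal without a 0x prefix.
            gpas.append(int(gpa.group(1), 16))
            kinds["guest page write error"] += 1
        elif "fail: shadow page" in msg:
            kinds["shadow page"] += 1
        elif "fail: spt" in msg:
            kinds["spt"] += 1
        else:
            kinds["other"] += 1
    # Distance in bytes between consecutive failing addresses.
    strides = [b - a for a, b in zip(gpas, gpas[1:])]
    return kinds, gpas, strides


if __name__ == "__main__":
    kinds, gpas, strides = summarize(SAMPLE.splitlines())
    print(dict(kinds))
    print([hex(g) for g in gpas], strides)
```

On the excerpt, consecutive failing addresses are 8 bytes apart under the hexadecimal assumption, so the errors walk through adjacent 8-byte slots of one guest page rather than hitting scattered locations.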

After that, there is this entry:

Sep  6 20:02:37 floor13 kernel: gvt: vgpu 1: GVT doesn't support 1GB entry

followed by three nearly identical stack traces, which I will attach to this bug report.

The result is that a) the Win10 VM dies and b) the complete KVM host dies.

Many thanks in advance for your time and support! :)

Alex
Comment 1 velde666 2019-09-07 08:42:25 UTC
Created attachment 145292 [details]
trace 2
Comment 2 velde666 2019-09-07 08:42:41 UTC
Created attachment 145293 [details]
trace 3
Comment 3 Martin Peres 2019-11-29 16:48:19 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/15.
