Created attachment 128752 [details]
The issue happens when the app calls map_gtt to read or write data in memory and, at that moment, the KMD calls i915_reset to reset the GPU.
The fence registers are reinitialized during the GPU reset while the CPU is still reading or writing data through the GTT aperture, which depends on those same fence registers.
As a result, the data is inconsistent.
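To illustrate why a cleared fence register makes the data look inconsistent, here is a minimal user-space sketch in C. It is a toy model only, not i915 code: the geometry constants, `struct toy_fence`, and all `toy_*` functions are hypothetical names made up for this example. It assumes a simplified tiled layout and models a fence written with 0 as "access falls back to linear addressing".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the i915 driver. */
enum { WIDTH = 8, HEIGHT = 8, TILE_W = 4 };   /* hypothetical geometry */

struct toy_fence {
    int enabled;   /* 0 models a fence register that was written with 0 */
};

/* Offset into backing storage for pixel (x, y) as seen through the aperture.
 * Fence enabled: the tiled layout is applied on behalf of the CPU.
 * Fence cleared: the same CPU address is treated as plain linear. */
static size_t toy_aperture_offset(const struct toy_fence *f, int x, int y)
{
    assert(x >= 0 && x < WIDTH && y >= 0 && y < HEIGHT);
    if (!f->enabled)
        return (size_t)y * WIDTH + (size_t)x;
    /* Tiles of TILE_W columns by HEIGHT rows, stored one after another. */
    return (size_t)(x / TILE_W) * TILE_W * HEIGHT
         + (size_t)y * TILE_W + (size_t)(x % TILE_W);
}

/* Fill the surface through the aperture with a per-pixel pattern. */
static void toy_fill(uint8_t *store, const struct toy_fence *f)
{
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 0; x < WIDTH; x++)
            store[toy_aperture_offset(f, x, y)] = (uint8_t)(y * WIDTH + x);
}

/* Count pixels whose value read through the aperture is not the pattern. */
static int toy_count_mismatches(const uint8_t *store, const struct toy_fence *f)
{
    int bad = 0;
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 0; x < WIDTH; x++)
            if (store[toy_aperture_offset(f, x, y)] != (uint8_t)(y * WIDTH + x))
                bad++;
    return bad;
}
```

Filling the buffer with the fence enabled and reading it back the same way gives no mismatches. Reading it back after setting `enabled = 0` (standing in for the reset writing 0 to the fence) returns values from the wrong locations, which is the kind of inconsistency described above.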
Created attachment 128753 [details]
The issue might be caused by i915_gem_restore_fences calling i965_write_fence_reg, which writes 0 to the fence register during the GPU reset.
Zhipeng Gong, can you try Chris' patchset: https://patchwork.freedesktop.org/series/17498/
Author: Chris Wilson <email@example.com>
Date: Wed Jan 4 14:51:10 2017 +0000
drm/i915: Revoke fenced GTT mmapings across GPU reset
The fence registers are clobbered by a GPU reset. If there is concurrent
user access to a fenced region via a GTT mmaping, the access will not be
fenced during the reset (until we restore the fences afterwards). In order
to prevent invalid access during the reset, before we clobber the fences
first we must invalidate the GTT mmapings. Access to the mmap will then
be forced to fault in the page, and in handling the fault, i915_gem_fault()
will take the struct_mutex and wait upon the reset to complete.
v2: Fix up commentary.
Signed-off-by: Chris Wilson <firstname.lastname@example.org>
Reviewed-by: Tvrtko Ursulin <email@example.com>
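The ordering described in the commit message can be sketched as a small state machine in C. This is a rough illustration under stated assumptions, not the driver's implementation: `struct toy_dev`, `enum toy_access`, and the `toy_*` functions are hypothetical, and waiting on struct_mutex is modelled as a `TOY_BLOCKED` return value rather than a real sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state machine; all names are hypothetical, not the i915 API. */
struct toy_dev {
    bool fence_valid;        /* fence register holds the right setup */
    bool mmap_valid;         /* user PTEs for the GTT mmap are present */
    bool reset_in_progress;  /* models the reset holding struct_mutex */
};

enum toy_access { TOY_OK, TOY_CORRUPT, TOY_BLOCKED };

/* Start a reset. With revoke_first set, the GTT mmap is invalidated
 * before the fence is clobbered, as the commit message describes. */
static void toy_reset_begin(struct toy_dev *d, bool revoke_first)
{
    d->reset_in_progress = true;
    if (revoke_first)
        d->mmap_valid = false;
    d->fence_valid = false;      /* the reset clobbers the fence */
}

/* Finish a reset: fences are restored, then the lock is dropped. */
static void toy_reset_end(struct toy_dev *d)
{
    assert(d->reset_in_progress);
    d->fence_valid = true;
    d->reset_in_progress = false;
}

/* One user access through the GTT mmap. */
static enum toy_access toy_user_access(struct toy_dev *d)
{
    if (!d->mmap_valid) {
        /* Fault path: the handler needs the lock the reset is holding. */
        if (d->reset_in_progress)
            return TOY_BLOCKED;  /* a real handler would sleep here */
        d->mmap_valid = true;    /* fault the page back in */
    }
    /* The mapping is live, so the access goes straight to the aperture. */
    return d->fence_valid ? TOY_OK : TOY_CORRUPT;
}
```

Without the revoke, an access during the reset goes through a live mapping while the fence is clobbered and comes back as `TOY_CORRUPT`. With the revoke, the same access is held off until `toy_reset_end()` has restored the fence, and only then succeeds.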
I have verified the patch locally and it works. Thanks for the patch.