Summary: | map_gtt reading/writing data mismatch during the GPU reset | | |
---|---|---|---|
Product: | DRI | Reporter: | Zhipeng Gong <zhipeng.gong> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | | |
Priority: | medium | CC: | intel-gfx-bugs, tvrtko.ursulin |
Version: | DRI git | | |
Hardware: | Other | | |
OS: | All | | |
Whiteboard: | | | |
i915 platform: | | i915 features: | |
Created attachment 128753 [details]
ftrace file
The issue might be because i915_gem_restore_fences calls i965_write_fence_reg, which writes 0 to the fence register during the GPU reset.

Zhipeng Gong, can you try Chris' patchset: https://patchwork.freedesktop.org/series/17498/

commit b1ed35d9179bc42c5ac7b86548cfae589be17b3e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 4 14:51:10 2017 +0000

drm/i915: Revoke fenced GTT mmapings across GPU reset

The fence registers are clobbered by a GPU reset. If there is concurrent user access to a fenced region via a GTT mmaping, the access will not be fenced during the reset (until we restore the fences afterwards). In order to prevent invalid access during the reset, before we clobber the fences first we must invalidate the GTT mmapings. Access to the mmap will then be forced to fault in the page, and in handling the fault, i915_gem_fault() will take the struct_mutex and wait upon the reset to complete.

v2: Fix up commentary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99274
Testcase: igt/gem_mmap_gtt/hang
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170104145110.1486-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I have verified the patch locally and it works. Thanks for the patch.
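The ordering described in the commit message (revoke the mmap, clobber the fence, restore the fence, let faults back in) can be pictured with a small single-threaded userspace model. This is only a sketch and not i915 code: `struct toy_dev`, `toy_reset_begin`, `toy_reset_end` and `toy_access` are hypothetical names, and the `reset_in_progress` flag merely stands in for the reset holding struct_mutex. A real fault handler would block on the mutex rather than return a status.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_access_result { TOY_ACCESS_OK, TOY_ACCESS_MUST_WAIT };

/* Hypothetical model state; none of these fields exist in i915. */
struct toy_dev {
    bool reset_in_progress; /* stands in for the reset holding struct_mutex */
    bool mmap_valid;        /* user PTEs for the GTT mmap are present */
    bool fence_valid;       /* fence register holds the tiling setup */
};

static void toy_reset_begin(struct toy_dev *d)
{
    d->reset_in_progress = true;
    d->mmap_valid = false;  /* revoke the mmap first ... */
    d->fence_valid = false; /* ... only then does the reset clobber the fence */
}

static void toy_reset_end(struct toy_dev *d)
{
    assert(d->reset_in_progress);
    d->fence_valid = true;  /* fences are restored after the reset */
    d->reset_in_progress = false;
}

/* Models one user access; *saw_fence is only written on TOY_ACCESS_OK. */
static enum toy_access_result toy_access(struct toy_dev *d, bool *saw_fence)
{
    if (!d->mmap_valid) {
        /* Fault path: the mapping was revoked, so it must be re-established. */
        if (d->reset_in_progress)
            return TOY_ACCESS_MUST_WAIT; /* real code would sleep on the mutex */
        d->mmap_valid = true;
    }
    *saw_fence = d->fence_valid;
    return TOY_ACCESS_OK;
}
```

In this model an access can never observe `fence_valid == false`, because the mapping is always revoked before the fence is cleared and is only re-established once the reset has finished.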
Created attachment 128752 [details]
test case

The issue happens when the app uses map_gtt to read/write data in memory and, at the same moment, the KMD calls i915_reset to reset the GPU. The fence registers are reinitialized during the GPU reset while the CPU is still reading/writing data through the GTT aperture, which relies on those fence registers. As a result, the data is inconsistent.
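To illustrate why a cleared fence register leads to mismatched data, here is a toy model of a fenced aperture. It assumes a simplified X-tiled layout (512-byte by 8-row tiles, no bit swizzling) and an illustrative 1024-byte stride. The names `struct toy_gtt`, `toy_translate`, `toy_write` and `toy_read` are hypothetical and do not correspond to driver functions or to the attached test case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes: one row of two tiles. */
enum { TILE_W = 512, TILE_H = 8, STRIDE = 1024, ROWS = 8, SIZE = STRIDE * ROWS };

struct toy_gtt {
    uint8_t backing[SIZE]; /* pages behind the aperture */
    int fence_enabled;     /* 0 models a fence register written with 0 */
};

/* Map a linear aperture offset to an offset in the backing store. */
static size_t toy_translate(const struct toy_gtt *g, size_t off)
{
    assert(off < SIZE);
    if (!g->fence_enabled)
        return off; /* no fence: the access goes straight through */

    size_t x = off % STRIDE, y = off / STRIDE;
    size_t tiles_per_row = STRIDE / TILE_W;
    size_t tile = (y / TILE_H) * tiles_per_row + x / TILE_W;
    return tile * TILE_W * TILE_H + (y % TILE_H) * TILE_W + x % TILE_W;
}

static void toy_write(struct toy_gtt *g, size_t off, uint8_t v)
{
    g->backing[toy_translate(g, off)] = v;
}

static uint8_t toy_read(const struct toy_gtt *g, size_t off)
{
    return g->backing[toy_translate(g, off)];
}
```

A byte written at a given offset while the fence is enabled lands at the tiled location. Reading the same offset after the fence has been cleared fetches a different byte of the backing store, which is the kind of mismatch reported here.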