Bug 44748

Summary: [pnv] CS prefetches into invalid page at end of GTT
Product: DRI
Reporter: Chris Wilson <chris>
Component: DRM/Intel
Assignee: Daniel Vetter <daniel>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: ben, chris, daniel, guang.a.yang, jbarnes
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 42991, 44622
Attachments: error state with guard-page-fix (flags: none)

Description Chris Wilson 2012-01-13 01:02:39 UTC
Time: 1326419416 s 302125 us
PCI ID: 0xa011
Detected GEN3 chipset
EIR: 0x00000010
PGTBL_ER: 0x00100000
    CS Instruction: Invalid GTT PTE
...
  ACTHD: 0x1fffe8a0

Why is that precisely 128 bytes before the end of the last valid page? Didn't we populate the entire GTT with valid pages? Oh noes...
Comment 1 Daniel Vetter 2012-01-24 13:51:57 UTC
Chris confirmed that i-g-t/tests/gem_cs_prefetch does indeed blow up on his pnv. It seems to work flawlessly on all my machines here ... Now I'll just have to stitch together the patch.
Comment 2 Chris Wilson 2012-03-08 00:37:34 UTC
*** Bug 47085 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2012-03-08 00:39:33 UTC
*** Bug 47082 has been marked as a duplicate of this bug. ***
Comment 5 Chris Wilson 2012-03-25 08:28:17 UTC
The good news is that this does vary the fault... Now the GPU detects that the last PTE is marked as invalid when the prefetch goes past the end of the penultimate page.
Comment 6 Chris Wilson 2012-03-25 08:52:30 UTC
Created attachment 59006
error state with guard-page-fix

Just to quieten the doubting Daniel...
Comment 7 Chris Wilson 2012-03-25 09:13:22 UTC
Pebkac; boot into the correct kernel and the test passes.

I think the patch itself is a bit rude, and makes an over-assumption that I would like to be made explicit.
Comment 8 Chris Wilson 2012-03-25 14:20:01 UTC
Passes gem_cs_prefetch and I think looks much better.

commit f2db5a0f464378dc072474c0eb860eeb881f0b33
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Mar 21 16:04:38 2012 +0100

    drm/i915: clear the entire gtt when using gem
    
    We've lost our guard page somewhere in the gtt rewrite, this patch
    here will restore it.
    
    Exercised by i-g-t/tests/gem_cs_prefetch.
    
    v2: Substract the guard page from the range we're supposed to manage
    with gem. Suggested by Chris Wilson to increase the odds of old ums +
    gem userspace not blowing up. To compensate for the loss of a page,
    don't substract the guard page in the modeset init code any longer.
    
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748
    Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 9 Chris Wilson 2012-03-27 07:58:51 UTC
commit d1dd20a96524ac462ed4f28dae168b9637f079e5
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Mar 26 09:45:42 2012 +0200

    drm/i915: clear the entire gtt when using gem
    
    We've lost our guard page somewhere in the gtt rewrite, this patch
    here will restore it.
    
    Exercised by i-g-t/tests/gem_cs_prefetch.
    
    v2: Substract the guard page from the range we're supposed to manage
    with gem. Suggested by Chris Wilson to increase the odds of old ums +
    gem userspace not blowing up. To compensate for the loss of a page,
    don't substract the guard page in the modeset init code any longer.
    
    Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44748
    Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
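
The v2 note in the commit message describes subtracting one guard page from the range that gem manages, so that a command streamer prefetch running past the last object still lands on a valid PTE. A minimal sketch of that idea follows; it is not the code from the patch. The type gtt_range, the helpers gtt_managed_range() and prefetch_stays_mapped(), the 4 KiB page size and the prefetch length parameter are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB GTT pages. */
#define GTT_PAGE_SIZE 4096u

/* Hypothetical type for illustration; not a struct from the i915 driver. */
struct gtt_range {
    uint64_t start; /* first byte handed to the allocator */
    uint64_t end;   /* one past the last byte handed to the allocator */
};

/*
 * Shrink the range the allocator may use by one page, leaving the last
 * page of the aperture as a guard page. The guard page is expected to
 * keep a valid PTE so that an over-read does not fault.
 */
static struct gtt_range gtt_managed_range(uint64_t start, uint64_t end)
{
    struct gtt_range r = { start, end };

    assert(end >= start);
    if (r.end - r.start >= GTT_PAGE_SIZE)
        r.end -= GTT_PAGE_SIZE;
    return r;
}

/*
 * Return 1 if a read of prefetch_bytes starting at addr stays below
 * mapped_end, 0 if it would run past it. prefetch_bytes is a caller
 * supplied assumption, not a documented hardware value.
 */
static int prefetch_stays_mapped(uint64_t addr, uint64_t prefetch_bytes,
                                 uint64_t mapped_end)
{
    return addr <= mapped_end && prefetch_bytes <= mapped_end - addr;
}
```

With a hypothetical 512 MiB aperture, gtt_managed_range(0, 0x20000000) yields an end of 0x1ffff000. A batch ending flush with the managed range can then be over-read by up to one page while staying inside the mapped aperture.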
