Bug 18347 - intelReadPixels needs to wait for hardware
Summary: intelReadPixels needs to wait for hardware
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i915 (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-11-01 19:15 UTC by Owen Taylor
Modified: 2008-11-05 17:58 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
Patch as described (733 bytes, patch)
2008-11-01 19:15 UTC, Owen Taylor
Details | Splinter Review
Patch on top of keithp's patch (681 bytes, patch)
2008-11-02 14:27 UTC, Owen Taylor
Details | Splinter Review

Description Owen Taylor 2008-11-01 19:15:13 UTC
Created attachment 20001 [details] [review]
Patch as described

With the current intel driver, small 1x1 pixel reads often (perhaps even
usually) give the wrong result. I think the problem is that when the
code falls back to _swrast_ReadPixels(), it doesn't wait for the GPU
to finish.

The patch I'll attach here adds an intelFinish() call before calling
_swrast_ReadPixels(). It fixes the problems I'm observing.

Comments:

 - I think the intelFlush() call at the beginning of of intelReadPixels()
   can be removed. do_blit_readpixels() calls intelFlush() itself, 
   as does intelFinish(). Performance difference for doing so is 
   probably minimal.

 - intelFinish() does a bit more than is necessary, since it waits for
   rendering to all color buffers to finish, not just the one we
   are going to read from. Seems unlikely to matter, since presumably
   the last thing done before glReadPixels() is to render to the 
   buffer you are going to read from.

 - I don't see where do_blit_readpixels() waits for the blit it emitted
   to finish, but I don't have a test case for that code path, and maybe
   that happens implicitly or in the upper layers.

 - I don't know the code at all, so please double check for sanity :-)
Comment 1 Philipp Klaus Krause 2008-11-02 06:14:51 UTC
bugzilla-daemon@freedesktop.org schrieb:

>  - intelFinish() does a bit more than is necessary, since it waits for
>    rendering to all color buffers to finish, not just the one we
>    are going to read from. Seems unlikely to matter, since presumably
>    the last thing done before glReadPixels() is to render to the 
>    buffer you are going to read from.

I don't think so. AFAIR using multiple buffers, rendering the next one
before reading the current one is used by some applications to improve
speed.

Philipp
Comment 2 Owen Taylor 2008-11-02 08:58:11 UTC
In further testing, the patch works as advertised for direct rendering but when running inside the server (AIGLX) seems to have no effect. My only idea is that the operation dri_bo_wait_rendering() (which calls DRM_IOCTL_I915_GEM_SET_DOMAIN with read_domains of I915_GEM_DOMAIN_GTT) is ineffective inside the server for some reason.
Comment 3 Eric Anholt 2008-11-02 10:10:35 UTC
We should figure out why the existing code is failing to synchronize correctly, since swrast_ReadPixels should call the span read functions, which use dri_bo_get_subdata, which should correctly synchronize with any flushed rendering.  My guess is you're running into a kernel bug and papering over it with adding more waiting, and losing in the server case where the timing/cache usage is different.
Comment 4 Owen Taylor 2008-11-02 11:15:52 UTC
Any advice about how to investigate further?
Comment 5 Owen Taylor 2008-11-02 14:26:28 UTC
Turns out after discussion on IRC that the mesa patch here is a red
herring. The problem is basically what is discussed in:

  http://lists.freedesktop.org/archives/intel-gfx/2008-October/000482.html

"i915_gem_object_set_domain is also used from the set_domain ioctl. In
this case, the domains will be mashed and a flush operation pushed into
the ring, but the object will not be placed on the active list when that
flush is posted."

The patch in there doesn't quite fix the problem, since
i915_gem_object_set_domain_range() thinks that it can tell whether to
call i915_gem_object_wait_rendering() by just looking at object->write_domain.

A small additional patch ... removing the check and just always
calling wait_rendering() gets things working.
Comment 6 Owen Taylor 2008-11-02 14:27:07 UTC
Created attachment 20014 [details] [review]
Patch on top of keithp's patch
Comment 7 Eric Anholt 2008-11-03 15:02:51 UTC
Applied the patch to for-airlied and drm-intel-next, since it was obviously needed.  Is it good enough to fix your bug on its own?
Comment 8 Owen Taylor 2008-11-05 06:39:01 UTC
I'll try this evening, but my understanding is no. To quote Keith's mail:

  i915_gem_object_set_domain is also used from the set_domain ioctl. In
  this case, the domains will be mashed and a flush operation pushed into
  the ring, but the object will not be placed on the active list when that
  flush is posted. This means that no-one knows to wait for the operation
  to complete before accessing the object.

and (as far as I recall) I think that the situation I was hitting fit that
situation. So, checking if the texture is on the active list doesn't do
any good if the object didn't get added to the active list.
Comment 9 Owen Taylor 2008-11-05 17:53:34 UTC
Contrary to my expectation, my patch in isolation does fix the problem I was observing!
Comment 10 Eric Anholt 2008-11-05 17:58:54 UTC
Excellent!  I'll mark this one closed, then.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.