Bug 78769

Summary: i965: glGetGraphicsResetStatusARB always returns GUILTY_CONTEXT_RESET_ARB status for guilty context
Product: Mesa Reporter: pavel.e.popov
Component: Drivers/DRI/i965 Assignee: Ian Romanick <idr>
Status: RESOLVED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: 10.1   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 79039    
Attachments: Initial patch to resolve this issue

Description pavel.e.popov 2014-05-16 05:15:53 UTC
Created attachment 99138
Initial patch to resolve this issue

Overview:
=========
glGetGraphicsResetStatusARB, from the ARB_robustness extension, always returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for a guilty context created with the LOSE_CONTEXT_ON_RESET_ARB strategy.
This is because Mesa returns GUILTY_CONTEXT_RESET_ARB whenever batch_active != 0, whereas the kernel driver never resets batch_active, so this variable is always > 0 for a guilty context.
The same behaviour can also be observed for batch_pending and INNOCENT_CONTEXT_RESET_ARB.

But the spec says the following (http://www.opengl.org/registry/specs/ARB/robustness.txt):

  If a reset status other than NO_ERROR is returned and subsequent
  calls return NO_ERROR, the context reset was encountered and
  completed. If a reset status is repeatedly returned, the context may
  be in the process of resetting.

  8. How should the application react to a reset context event?
  RESOLVED: For this extension, the application is expected to query
  the reset status until NO_ERROR is returned. If a reset is encountered,
  at least one *RESET* status will be returned. Once NO_ERROR is
  encountered, the application can safely destroy the old context and
  create a new one.


Reproducer:
===========
1. Create a context with the GL_LOSE_CONTEXT_ON_RESET_ARB strategy using glXCreateContextAttribsARB().
2. Provoke a hardware reset, for example with a shader that never terminates.
3. Call glGetGraphicsResetStatusARB() in a loop.
4. Observe that this function always returns GUILTY_CONTEXT_RESET_ARB. This is the wrong behavior.
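
Step 3 can be sketched roughly as below. This is only an illustration, not code from the reproducer: the callback type reset_status_fn, the helper wait_for_reset_complete(), the max_polls bound, and the two toy status functions are hypothetical names. The callback stands in for glGetGraphicsResetStatusARB() so that the loop can be compiled without a GL context; the enum values are the ones from the GL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Enum values as defined by OpenGL / ARB_robustness. */
#define GL_NO_ERROR                   0
#define GL_GUILTY_CONTEXT_RESET_ARB   0x8253
#define GL_INNOCENT_CONTEXT_RESET_ARB 0x8254
#define GL_UNKNOWN_CONTEXT_RESET_ARB  0x8255

/* Hypothetical stand-in for glGetGraphicsResetStatusARB(). */
typedef unsigned int (*reset_status_fn)(void *user);

/*
 * Poll the reset status until NO_ERROR is returned, as issue 8 of the
 * spec expects applications to do.  Returns the number of *RESET*
 * statuses seen before NO_ERROR, or -1 if NO_ERROR was never returned
 * within max_polls queries (max_polls is an assumption of this sketch).
 */
int wait_for_reset_complete(reset_status_fn query, void *user, int max_polls)
{
    int resets_seen = 0;

    assert(query != NULL);
    for (int i = 0; i < max_polls; i++) {
        if (query(user) == GL_NO_ERROR)
            return resets_seen;
        resets_seen++;
    }
    return -1;
}

/* Toy model of the reported behaviour: the status never clears. */
unsigned int buggy_status(void *user)
{
    (void)user;
    return GL_GUILTY_CONTEXT_RESET_ARB;
}

/* Toy model of the expected behaviour: one *RESET*, then NO_ERROR. */
unsigned int expected_status(void *user)
{
    int *already_reported = user;

    if (*already_reported)
        return GL_NO_ERROR;
    *already_reported = 1;
    return GL_GUILTY_CONTEXT_RESET_ARB;
}
```

With buggy_status() the loop never sees NO_ERROR and gives up after max_polls queries, which is what an application following the spec would run into on an affected driver.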


Solution:
=========
The main problem is that the context may be in the process of resetting, and in that case a reset status should be returned repeatedly. But it looks like the kernel driver returns nonzero active/pending values only if the context reset has already been encountered and completed.
For this reason the *RESET* status cannot be returned repeatedly and should be returned only once.

The reset_count and brw->reset_count variables can be used to ensure that glGetGraphicsResetStatusARB returns a *RESET* status only once for each context.
Note that i915 triggers reset_count twice, which allows the correct reset count to be returned immediately after active/pending have been incremented.
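
The idea can be modelled roughly as follows. This is a simplified sketch and not the attached patch: struct kernel_reset_stats, struct model_context and model_get_reset_status() are hypothetical names, and the kernel query is replaced by a plain struct so that the logic can be compiled on its own. The cached counter plays the role of brw->reset_count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Enum values as defined by OpenGL / ARB_robustness. */
#define GL_NO_ERROR                   0
#define GL_GUILTY_CONTEXT_RESET_ARB   0x8253
#define GL_INNOCENT_CONTEXT_RESET_ARB 0x8254

/* Hypothetical stand-in for the per-context values reported by the kernel. */
struct kernel_reset_stats {
    uint32_t reset_count;
    uint32_t batch_active;   /* never cleared once nonzero */
    uint32_t batch_pending;  /* never cleared once nonzero */
};

/* Hypothetical stand-in for the driver context; only the cached count. */
struct model_context {
    uint32_t reset_count;
};

/*
 * Report a *RESET* status only when the reset count differs from the
 * cached one, and cache the new count when doing so.  Later queries
 * with the same counters then return NO_ERROR even though
 * active/pending stay nonzero.
 */
unsigned int model_get_reset_status(struct model_context *ctx,
                                    const struct kernel_reset_stats *stats)
{
    assert(ctx != NULL && stats != NULL);

    if (stats->reset_count != ctx->reset_count && stats->batch_active != 0) {
        ctx->reset_count = stats->reset_count;
        return GL_GUILTY_CONTEXT_RESET_ARB;
    }

    if (stats->reset_count != ctx->reset_count && stats->batch_pending != 0) {
        ctx->reset_count = stats->reset_count;
        return GL_INNOCENT_CONTEXT_RESET_ARB;
    }

    return GL_NO_ERROR;
}
```

In this model, two consecutive queries against unchanged counters yield one *RESET* status followed by NO_ERROR, which matches the sequence the spec tells applications to wait for.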

The patch is attached.
Comment 1 pavel.e.popov 2014-05-19 11:50:17 UTC
A mail with the patch "[Mesa-dev][PATCH] i965: Properly return *RESET* status in glGetGraphicsResetStatusARB" was sent to the mesa-dev mailing list using git-send-email.
Comment 2 Ian Romanick 2014-05-23 17:00:46 UTC
Fixed on master by the commit below.  This has been cherry picked to the 10.2 branch as 9a8f12ae, and it will be included in 10.2-rc3.

commit 8dc4a98c44a824630f3cc234136833dbac9a1f4c
Author: Pavel Popov <pavel.e.popov@intel.com>
Date:   Fri May 16 12:00:02 2014 +0700

    i965: Properly return *RESET* status in glGetGraphicsResetStatusARB
    
    The glGetGraphicsResetStatusARB from ARB_robustness extension always
    returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for guilty
    context with LOSE_CONTEXT_ON_RESET_ARB strategy.  This is because Mesa
    returns GUILTY_CONTEXT_RESET_ARB if batch_active !=0 whereas kernel
    driver never reset batch_active and this variable always > 0 for guilty
    context.  The same behaviour also can be observed for batch_pending and
    INNOCENT_CONTEXT_RESET_ARB.
    
    But ARB_robustness spec says:
    
      If a reset status other than NO_ERROR is returned and subsequent calls
      return NO_ERROR, the context reset was encountered and completed. If a
      reset status is repeatedly returned, the context may be in the process
      of resetting.
    
      8. How should the application react to a reset context event?
      RESOLVED: For this extension, the application is expected to query the
      reset status until NO_ERROR is returned. If a reset is encountered, at
      least one *RESET* status will be returned. Once NO_ERROR is
      encountered, the application can safely destroy the old context and
      create a new one.
    
    The main problem is the context may be in the process of resetting and
    in this case a reset status should be repeatedly returned.  But looks
    like the kernel driver returns nonzero active/pending only if the
    context reset has already been encountered and completed.  For this
    reason the *RESET* status cannot be repeatedly returned and should be
    returned only once.
    
    The reset_count and brw->reset_count variables can be used to control
    that glGetGraphicsResetStatusARB returns *RESET* status only once for
    each context.  Note the i915 triggers reset_count twice which allows to
    return correct reset count immediately after active/pending have been
    incremented.
    
    v2 (idr): Trivial reformatting of comments.
    
    Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>
    Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
    Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
