Bug 55282 - Crash in drm_intel_gem_bo_unreference() in intel_bufmgr_gem.c
Summary: Crash in drm_intel_gem_bo_unreference() in intel_bufmgr_gem.c
Status: RESOLVED FIXED
Alias: None
Product: libva
Classification: Unclassified
Component: intel
Version: unspecified
Hardware: Other Linux (All)
Importance: medium critical
Assignee: haihao
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-24 12:54 UTC by Gautam
Modified: 2012-12-28 00:49 UTC
CC List: 7 users

See Also:
i915 platform:
i915 features:


Attachments
Sample application that plays videos [MP4 with H264 codec] from the current directory. It includes a script file to build and run it. (4.91 KB, application/octet-stream)
2012-09-24 12:54 UTC, Gautam
Kernel configuration file. (82.06 KB, text/plain)
2012-09-26 13:34 UTC, Gautam
Patch for mutex in gen6_mfd_free_avc_surface (1.04 KB, text/plain)
2012-09-27 09:29 UTC, Gautam
updated patch (1.34 KB, text/plain)
2012-10-12 09:07 UTC, Gautam

Description Gautam 2012-09-24 12:54:23 UTC
Created attachment 67629
Sample application that plays videos [MP4 with H264 codec] from the current directory. It includes a script file to build and run it.

Hardware:-
 Using x86_64 kernel and user land.
 CPU and GPU: Intel(R) Core(TM) i3-2105 CPU i965 chipset
 linux 3.5.0
Packages used:-
  Packages are taken from http://intellinuxgraphics.org/2012.07.html
  Driver name: intel-driver
  Driver source code repository: http://cgit.freedesktop.org/vaapi/intel-driver/
  Driver version: 1.0.18 (latest stable)

Crash Details:-
  When we run the sample application [wall], which plays videos with hardware-accelerated decoding [gst-vaapi] and renders them through cluttersink, the application crashes at random with the following error:
wall: intel_bufmgr_gem.c:1116: drm_intel_gem_bo_unreference: Assertion `((&bo_gem->refcount)->atomic) > 0' failed.

The sample application takes .mp4 files [H264 codec] from the current directory and plays them one by one in a loop.

Reproducibility:- 80% [sometimes within 5 to 15 minutes]

Observation:-
             The problem was observed to be a double free triggered by a race condition in gen6_mfd_free_avc_surface(). The two functions i965_PutSurface() and i965_EndPicture() in i965_drv_video.c call gen6_mfd_free_avc_surface(), directly or indirectly, simultaneously from two threads.

One of the threads unreferences the variable "gen6_avc_surface->dmv_top" while, at the same time, the other thread hits the assertion because the reference count is already zero.

If we bypass this assertion, then after the NULL check on the variable "gen6_avc_surface" in gen6_mfd_free_avc_surface(), both i965_PutSurface() and i965_EndPicture() try to free the same pointer.

Solution:-
           We fixed this issue by adding a mutex in the function gen6_mfd_free_avc_surface().
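
For illustration only, here is a minimal sketch of the kind of locking described above. It is not the attached patch and does not use the driver's real types: demo_bo, demo_avc_surface, demo_free_avc_surface() and demo_run() are hypothetical stand-ins, and only the pthread calls are real APIs. It assumes the free function receives a pointer to the surface's private-data pointer, so that the NULL check and the clearing of that pointer can both happen under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted buffer object. */
typedef struct {
    int refcount;
} demo_bo;

/* Hypothetical stand-in for the per-surface AVC private data. */
typedef struct {
    demo_bo *dmv_top;
    demo_bo *dmv_bottom;
} demo_avc_surface;

static pthread_mutex_t free_lock = PTHREAD_MUTEX_INITIALIZER;
static int unref_calls; /* only modified while free_lock is held */

static demo_bo *demo_bo_alloc(void)
{
    demo_bo *bo = malloc(sizeof *bo);
    if (bo != NULL)
        bo->refcount = 1;
    return bo;
}

/* Mirrors the shape of the failing assertion: unreferencing a buffer
 * whose count has already reached zero is a bug. */
static void demo_bo_unreference(demo_bo *bo)
{
    assert(bo->refcount > 0);
    unref_calls++;
    if (--bo->refcount == 0)
        free(bo);
}

/* The NULL check, the unreferences, the free and the clearing of the
 * caller's pointer all happen under one lock, so a second concurrent
 * caller sees NULL and does nothing. */
static void demo_free_avc_surface(void **data)
{
    pthread_mutex_lock(&free_lock);
    demo_avc_surface *surface = *data;
    if (surface != NULL) {
        demo_bo_unreference(surface->dmv_top);
        demo_bo_unreference(surface->dmv_bottom);
        free(surface);
        *data = NULL;
    }
    pthread_mutex_unlock(&free_lock);
}

static void *demo_worker(void *arg)
{
    demo_free_avc_surface(arg);
    return NULL;
}

/* Two threads race to free the same surface. Returns the number of
 * unreference calls that ran, or -1 on setup failure. */
int demo_run(void)
{
    demo_avc_surface *surface = malloc(sizeof *surface);
    if (surface == NULL)
        return -1;
    surface->dmv_top = demo_bo_alloc();
    surface->dmv_bottom = demo_bo_alloc();
    if (surface->dmv_top == NULL || surface->dmv_bottom == NULL)
        return -1;

    void *private_data = surface;
    pthread_t a, b;
    unref_calls = 0;
    if (pthread_create(&a, NULL, demo_worker, &private_data) != 0)
        return -1;
    if (pthread_create(&b, NULL, demo_worker, &private_data) != 0)
        return -1;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return private_data == NULL ? unref_calls : -1;
}
```

Calling demo_run() from a main() linked with -pthread should unreference each of the two buffers exactly once, whichever thread wins the race. Without the lock, both threads could pass the NULL check before either clears the pointer, which is the double free described in the observation.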
Comment 1 Gautam 2012-09-26 13:34:46 UTC
Created attachment 67728
Kernel configuration file.

3.5.0 kernel configuration file.
Comment 2 Gautam 2012-09-26 13:35:55 UTC
OS: vanilla 3.5.0 linux kernel
No external patches were applied to the kernel. We are using 3rd party drivers for which we have full source code.
Comment 3 haihao 2012-09-27 07:31:08 UTC
(In reply to comment #0)
> Solution:-
>            We fixed this issue by adding mutex in the function
> gen6_mfd_free_avc_surface().

Where is the patch?
Comment 4 Gautam 2012-09-27 09:29:22 UTC
Created attachment 67759
Patch for mutex in gen6_mfd_free_avc_surface

The patch adds a mutex in the gen6_mfd_free_avc_surface function.
Comment 5 Gautam 2012-10-12 09:07:36 UTC
Created attachment 68477
updated patch

The fix in the previous patch was incomplete, so it has been updated.
Comment 6 Gautam 2012-10-12 09:13:09 UTC
The packages used are:

clutter-gst: 1.5.6
gstreamer-0.10.36
gst-plugins-base-0.10.36
gst-plugins-bad-0.10.23
gst-plugins-good-0.10.31
gst-plugins-ugly-0.10.19
gstreamer-vaapi: 0.10 version 0.3.7
libva-1.1.0
intel-driver 1.0.18
xorg-server-1.12.1
linux 3.5.0
Comment 7 haihao 2012-10-24 08:31:14 UTC
The patch looks good to me, although I still can't reproduce the issue. I pushed your patch with some modifications to fix the same issue on other platforms.

Thanks a lot.