Bug 69683 - [r600g][GLAMOR] Crash when viewing a movie full screen with mplayer
Summary: [r600g][GLAMOR] Crash when viewing a movie full screen with mplayer
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/Acceleration/glamor (show other bugs)
Version: git
Hardware: x86-64 (AMD64) Linux (All)
: medium normal
Assignee: Zhigang Gong
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-22 21:26 UTC by Shawn Starr
Modified: 2013-09-29 01:12 UTC (History)
2 users (show)

See Also:
i915 platform:
i915 features:


Attachments
X crashes w/ GLAMOR enabled while using mplayer (66.17 KB, text/plain)
2013-09-22 21:26 UTC, Shawn Starr
no flags Details
Backtrace (3.19 KB, text/plain)
2013-09-26 08:22 UTC, Raul Fernandes
no flags Details
workaround for the bug (317 bytes, patch)
2013-09-26 08:23 UTC, Raul Fernandes
no flags Details | Splinter Review
Destroy invalid sub_pixmap (526 bytes, patch)
2013-09-26 19:56 UTC, Raul Fernandes
no flags Details | Splinter Review
Backtrace of the freeze when openning a movie with mplayer (6.12 KB, text/plain)
2013-09-28 13:27 UTC, Raul Fernandes
no flags Details
Destroys the pixmap if data is invalid (552 bytes, patch)
2013-09-28 23:28 UTC, Raul Fernandes
no flags Details | Splinter Review

Description Shawn Starr 2013-09-22 21:26:27 UTC
Created attachment 86330 [details]
X crashes w/ GLAMOR enabled while using mplayer

X will crash if I view a video full screen with GLAMOR enabled on my RV635 (HD 3650) GPU, it will crash if I load the video full screen quickly (from commandline load mplayer video -loop 0 then press F quickly).

Attached is stack dump of crash
Comment 1 Shawn Starr 2013-09-22 21:29:31 UTC
Reassign to AMD team til we figure this out
Comment 2 Shawn Starr 2013-09-22 21:31:54 UTC
more info:

ATI DDX:     xorg-x11-drv-ati-7.2.0-0.fc21.x86_64
libdrm:         libdrm-2.4.46-1.fc19.x86_64
kernel:         kernel-3.12.0-0.rc1.git4.2.fc21.x86_64 (non-debug release repo)
mesa:          mesa-dri-drivers-9.2-1.20130919.fc19.x86_64
X:                xorg-x11-server-Xorg-1.14.3-1.fc19.x86_64
Comment 3 Shawn Starr 2013-09-22 21:32:58 UTC
GLAMOR:  xorg-x11-glamor-0.5.1-1.20130921git29c0907.fc19.x86_64

Custom compiled from git master + patch from http://lists.freedesktop.org/archives/glamor/2013-September/000386.html but this doesn't change this.
Comment 4 Michel Dänzer 2013-09-25 09:57:26 UTC
There doesn't seem to be anything radeon specific about the crash.
Comment 5 Zhigang Gong 2013-09-25 15:45:19 UTC
Thanks to report bug.
It seems taht something bad happen when glamor try to fallback to sw path.
It failed to find a matched format for the following formats. And the formats
value are not valid formats. 
[ 71790.287] (II) fail to get matched format for 7893d260 
[ 71790.287] (II) fail to get matched format for 7893d260 
 As I don't have a machine ready to run glamor now, I can't reproduce here. If possible, could you set env
variable GLAMOR_DEBUG=2 before you start the X to log more debug information and may give more hints of the root cause.

(In reply to comment #3)
> GLAMOR:  xorg-x11-glamor-0.5.1-1.20130921git29c0907.fc19.x86_64
> 
> Custom compiled from git master + patch from
> http://lists.freedesktop.org/archives/glamor/2013-September/000386.html but
> this doesn't change this.
Comment 6 Raul Fernandes 2013-09-26 08:22:09 UTC
Created attachment 86624 [details]
Backtrace

Backtrace of the crash. I have compiled without optimizations (-O0)
Comment 7 Raul Fernandes 2013-09-26 08:23:27 UTC
Created attachment 86625 [details] [review]
workaround for the bug

I think this is only a workaround not the real fix.
Comment 8 Raul Fernandes 2013-09-26 08:32:07 UTC
I have a problem that seems similar.
When I will see a movie with mplayer, the Xorg crashs randomly while opening the movie.
It doesn't matter if is fullscreen or not.
Looking at backtrace, your bug passes through:

[ 71790.290] (EE) 5: /lib64/libglamor.so.0 (glamor_add_traps_nf+0x1e3) [0x7fb5799b3f83]

and mine:

[   417.612] (EE) 6: /usr/lib/libglamor.so.0 (0x7f1542250000+0x1c073) [0x7f154226c073]

The rest seems to be similar (fbGetImage -> fbBltStip -> fbBlt).
Looking in the full backtrace using gdb, the function:

 fbGetDrawable(pDrawable, src, srcStride, srcBpp, srcXoff, srcYoff); (line 242 - fbimage.c)

returns the src = 0x0
The invalid src causes the crash when the Xorg tries to read from src.
The patch is a workaround because it only stop processing if src is invalid, but I think that src should be valid.
The error is in another place.
Comment 9 Raul Fernandes 2013-09-26 08:32:53 UTC
Try the patch and see if it works for you.
Comment 10 Raul Fernandes 2013-09-26 08:43:33 UTC
One thing that I think is strange is that the pixmap has the width of 1 and height of 1.
It comes from

#7  0x0000000000439d8f in DoGetImage (client=0x2e0b8a0, format=2, 
    drawable=330, x=0, y=0, width=1, height=1, planemask=4294967295, 
    im_return=0x0) at dispatch.c:2131
#8  0x000000000043a057 in ProcGetImage (client=0x2e0b8a0) at dispatch.c:2206

It seems not to be the glamor. Or the glamor should treat this case??
Comment 11 Zhigang Gong 2013-09-26 13:03:56 UTC
Did you also get the following log information:

[ 71790.287] (II) fail to get matched format for 7893d260 
[ 71790.287] (II) fail to get matched format for 7893d260 

From the back trace log, glamor fail to download a valid sub_pixmap for
the input 1x1 pixmap. Please see code in glamor_get_sub_pixmap().

...
	data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h, sub_pixmap->devKind,
						 data, pbo, access);
	if (pbo) {
                //we should check whether the data is NULL.
                // if the data is NULL, we should destroy the sub pixmap and return a NULL pointer. And we should also back trace why we get a NULL pointer. 
		assert(sub_pixmap->devPrivate.ptr == NULL);
		sub_pixmap->devPrivate.ptr = data;
		sub_pixmap_priv->base.fbo->pbo_valid = 1;
	}
...


(In reply to comment #6)
> Created attachment 86624 [details]
> Backtrace
> 
> Backtrace of the crash. I have compiled without optimizations (-O0)
Comment 12 Shawn Starr 2013-09-26 13:07:46 UTC
For me to to reproduce, how do i set the environment variable with GDM running? or should I run X manually (startx)?
Comment 13 Zhigang Gong 2013-09-26 13:10:11 UTC
(In reply to comment #12)
> For me to to reproduce, how do i set the environment variable with GDM
> running? or should I run X manually (startx)?
start X manually is a good idea.
Comment 14 Raul Fernandes 2013-09-26 19:54:20 UTC
(In reply to comment #11)
> Did you also get the following log information:
> 
> [ 71790.287] (II) fail to get matched format for 7893d260 
> [ 71790.287] (II) fail to get matched format for 7893d260 
> 

Yes.

> From the back trace log, glamor fail to download a valid sub_pixmap for
> the input 1x1 pixmap. Please see code in glamor_get_sub_pixmap().
> 
> ...
> 	data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h,
> sub_pixmap->devKind,
> 						 data, pbo, access);
> 	if (pbo) {
>                 //we should check whether the data is NULL.
>                 // if the data is NULL, we should destroy the sub pixmap and
> return a NULL pointer. And we should also back trace why we get a NULL
> pointer. 
> 		assert(sub_pixmap->devPrivate.ptr == NULL);
> 		sub_pixmap->devPrivate.ptr = data;
> 		sub_pixmap_priv->base.fbo->pbo_valid = 1;
> 	}
> ...
> 

I have made another patch to glamor now.

diff --git a/src/glamor_pixmap.c b/src/glamor_pixmap.c
index 9bbc989..52c3c64 100644
--- a/src/glamor_pixmap.c
+++ b/src/glamor_pixmap.c
@@ -1377,6 +1377,10 @@ glamor_get_sub_pixmap(PixmapPtr pixmap, int x, int y, int w, int h, glamor_acces
 
 	data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h, sub_pixmap->devKind,
 						 data, pbo, access);
+	if(data == NULL) {
+		fbDestroyPixmap(sub_pixmap);
+		return NULL;
+	}
 	if (pbo) {
 		assert(sub_pixmap->devPrivate.ptr == NULL);
 		sub_pixmap->devPrivate.ptr = data;

It seems to fix the bug. From the code, I have saw it:

/*
 * Download a sub region of pixmap to a specified memory region.
 * The pixmap must have a valid FBO, otherwise return a NULL.
 * */

static void *
_glamor_download_sub_pixmap_to_cpu(PixmapPtr pixmap, GLenum format,

It is expected that the function should return NULL if the pixmap has not a valid FBO.
So the code is wrong when calls the function and doesn't check for NULL.
Or the pixmap should always had a valid FBO??
I have another question. The data will be used only in if(pob){ }.
Why not do this:

	if (pbo) {
		assert(sub_pixmap->devPrivate.ptr == NULL);
		data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h, sub_pixmap->devKind,
							 data, pbo, access);
		sub_pixmap->devPrivate.ptr = data;
		sub_pixmap_priv->base.fbo->pbo_valid = 1;
	}
Comment 15 Raul Fernandes 2013-09-26 19:56:25 UTC
Created attachment 86693 [details] [review]
Destroy invalid sub_pixmap

This patch checks if data is NULL. If so, destroy sub_pixmap and returns NULL.
Comment 16 Zhigang Gong 2013-09-26 21:20:40 UTC
(In reply to comment #14)
> (In reply to comment #11)
> > Did you also get the following log information:
> > 
> > [ 71790.287] (II) fail to get matched format for 7893d260 
> > [ 71790.287] (II) fail to get matched format for 7893d260 
> > 
> 
> Yes.
> 
> > From the back trace log, glamor fail to download a valid sub_pixmap for
> > the input 1x1 pixmap. Please see code in glamor_get_sub_pixmap().
> > 
> > ...
> > 	data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h,
> > sub_pixmap->devKind,
> > 						 data, pbo, access);
> > 	if (pbo) {
> >                 //we should check whether the data is NULL.
> >                 // if the data is NULL, we should destroy the sub pixmap and
> > return a NULL pointer. And we should also back trace why we get a NULL
> > pointer. 
> > 		assert(sub_pixmap->devPrivate.ptr == NULL);
> > 		sub_pixmap->devPrivate.ptr = data;
> > 		sub_pixmap_priv->base.fbo->pbo_valid = 1;
> > 	}
> > ...
> > 
> 
> I have made another patch to glamor now.
> 
> diff --git a/src/glamor_pixmap.c b/src/glamor_pixmap.c
> index 9bbc989..52c3c64 100644
> --- a/src/glamor_pixmap.c
> +++ b/src/glamor_pixmap.c
> @@ -1377,6 +1377,10 @@ glamor_get_sub_pixmap(PixmapPtr pixmap, int x, int y,
> int w, int h, glamor_acces
>  
>  	data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h,
> sub_pixmap->devKind,
>  						 data, pbo, access);
> +	if(data == NULL) {
> +		fbDestroyPixmap(sub_pixmap);
> +		return NULL;
> +	}
>  	if (pbo) {
>  		assert(sub_pixmap->devPrivate.ptr == NULL);
>  		sub_pixmap->devPrivate.ptr = data;
> 
> It seems to fix the bug. From the code, I have saw it:
Good.
> 
> /*
>  * Download a sub region of pixmap to a specified memory region.
>  * The pixmap must have a valid FBO, otherwise return a NULL.
>  * */
> 
> static void *
> _glamor_download_sub_pixmap_to_cpu(PixmapPtr pixmap, GLenum format,
> 
> It is expected that the function should return NULL if the pixmap has not a
> valid FBO.
> So the code is wrong when calls the function and doesn't check for NULL.
> Or the pixmap should always had a valid FBO??
I
> I have another question. The data will be used only in if(pob){ }.
> Why not do this:
> 
> 	if (pbo) {
> 		assert(sub_pixmap->devPrivate.ptr == NULL);
> 		data = glamor_download_sub_pixmap_to_cpu(pixmap, x, y, w, h,
> sub_pixmap->devKind,
> 							 data, pbo, access);
> 		sub_pixmap->devPrivate.ptr = data;
> 		sub_pixmap_priv->base.fbo->pbo_valid = 1;
> 	}
No, You can't change it that way, as for non-pbo mode, we pass in a null data
to the glamor_download_sub_pixmap_to_cpu, and then that function will use the
devPrivate.ptr directly and download required content to that memory region.
If pass in Null data pointer, that function will do a map to the pbo, and return
the mapped data pointer. So with or without pbo we both need to call that function.
Comment 17 Zhigang Gong 2013-09-26 21:23:05 UTC
Shawn Starr,

Could you try this patch? I will push it latter, and if it can fix your problem,
please close this bug, thanks.

(In reply to comment #15)
> Created attachment 86693 [details] [review] [review]
> Destroy invalid sub_pixmap
> 
> This patch checks if data is NULL. If so, destroy sub_pixmap and returns
> NULL.
Comment 18 Shawn Starr 2013-09-26 21:33:35 UTC
I'll test this evening, compiling it now but won't be able to test as I'm not at the laptop.
Comment 19 Raul Fernandes 2013-09-26 21:49:47 UTC
(In reply to comment #16)
> No, You can't change it that way, as for non-pbo mode, we pass in a null data
> to the glamor_download_sub_pixmap_to_cpu, and then that function will use the
> devPrivate.ptr directly and download required content to that memory region.
> If pass in Null data pointer, that function will do a map to the pbo, and
> return
> the mapped data pointer. So with or without pbo we both need to call that
> function.

Ok. Only checking.
Thanks for the help.
The bug seems to be fixed to me.
My card is a HD 7750 by the way.
Let's see if it will work too for Shawn Starr.
Comment 20 Shawn Starr 2013-09-27 03:54:25 UTC
I can't crash it now, this appears to be fixed, keeping it open til weekend just to be sure.
Comment 21 Raul Fernandes 2013-09-28 13:26:10 UTC
Unfortanely, I have to say that but there are another bug that triggers in the same way (openning a movie with mplayer).
When opening the movie, a mplayer window appears but with a black content (no movie).
In this moment, the whole desktop freezes and after some seconds the system reboots.
I have took a backtrace with glamor-egl compiled without optimizations (-O0).
Zhigang Gong, you have any idea??
I will upload the backtrace.
Comment 22 Raul Fernandes 2013-09-28 13:27:16 UTC
Created attachment 86768 [details]
Backtrace of the freeze when openning a movie with mplayer
Comment 23 Zhigang Gong 2013-09-28 13:38:40 UTC
Have no idea currently. Not sure this is the same bug. But the previous fix
is indeed not a complete fix. We need to find the root cause why it get a
Null data pointer. As I don't have a testing machine, it's hard for me to
dig into it further.
在 2013年9月28日 下午9:26, <bugzilla-daemon@freedesktop.org>写道:

>  Raul Fernandes <rgfernandes@gmail.com> changed bug 69683<https://bugs.freedesktop.org/show_bug.cgi?id=69683>
>  What Removed Added  Status VERIFIED REOPENED  Resolution FIXED ---
>
>  *Comment # 21 <https://bugs.freedesktop.org/show_bug.cgi?id=69683#c21>on bug
> 69683 <https://bugs.freedesktop.org/show_bug.cgi?id=69683> from Raul
> Fernandes <rgfernandes@gmail.com> *
>
> Unfortanely, I have to say that but there are another bug that triggers in the
> same way (openning a movie with mplayer).
> When opening the movie, a mplayer window appears but with a black content (no
> movie).
> In this moment, the whole desktop freezes and after some seconds the system
> reboots.
> I have took a backtrace with glamor-egl compiled without optimizations (-O0).
> Zhigang Gong, you have any idea??
> I will upload the backtrace.
>
>  ------------------------------
> You are receiving this mail because:
>
>    - You are the assignee for the bug.
>
>
Comment 24 Raul Fernandes 2013-09-28 23:28:49 UTC
Created attachment 86778 [details] [review]
Destroys the pixmap if data is invalid

I forgot to free the sub_pixmap_priv too in previous patch.
Comment 25 Raul Fernandes 2013-09-28 23:30:41 UTC
(In reply to comment #23)
> Have no idea currently. Not sure this is the same bug. But the previous fix
> is indeed not a complete fix. We need to find the root cause why it get a
> Null data pointer. As I don't have a testing machine, it's hard for me to
> dig into it further.

The bug really is not the same. It is another bug. I see it now.
I've just tried to test it without the patch.
The patch fixes a bug that causes the crash with any movie randomly.
This new bug reboots the system with some certain movies.
With the majority of movies, the problem doesn't happen.
But with such movies, the system reboots all times.
I will close this bug with resolved and open another in bugzilla.
One more thing.
I take another look in the patch and see that I should free the sub_pixmap_priv too.
So I made another patch. Please, submit the newer.
Comment 26 Raul Fernandes 2013-09-29 00:17:43 UTC
I have some more informations.
I have been using the vdpau (from open source ATI drivers) and seems that the vdpau is generating the wrong pixmaps.
I have changed to sdl in mplayer and the problems have gone.
I think the patch is still valid because the glamor should treat this type of error appropriately.
Thanks for help.
Comment 27 Zhigang Gong 2013-09-29 00:27:25 UTC
I am on vacation now, will continue to handle this bug when I back to work. Thanks.
(In reply to comment #26)
> I have some more informations.
> I have been using the vdpau (from open source ATI drivers) and seems that
> the vdpau is generating the wrong pixmaps.
> I have changed to sdl in mplayer and the problems have gone.
> I think the patch is still valid because the glamor should treat this type
> of error appropriately.
> Thanks for help.
Comment 28 Raul Fernandes 2013-09-29 01:12:12 UTC
Sorry for wrong information.
I just have to crash the Xorg when not using the patch and using the sdl in mplayer and the backtrace is the same.
It seems that only the other bug is caused by vdpau.
So the patch is still necessary.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.