Created attachment 93689 [details] Xorg log from crash Hello, I decided to try the latest glamor from git today, hearing that it had fixed some glyph rendering issues. However, I am getting reliable X crashes when using it. The steps I've found to reproduce are: 1) Go to gmail or google+ 2) Open a hangouts chat with someone This causes a segfault that results in X closing. I'll attach a log file. I do not see this using glamor 0.6.0, so it must be related to one of the three patches since then. It seems most likely that it is related to one of the two patches added on January 31st, 2014. I'm running a relatively recent (few weeks) from-git version of mesa with a radeon 7870. Kernel version is 3.12.9. If any additional information is desired, I'll be happy to provide it if I can.
CC to Alex, it seems that this bug is a regression caused by the patches after 0.6.0. Could you help to look into it? Thanks.
Works fine for me here, although, I am running xserver from git, I'll try downgrading to 1.15.0 to see if I can reproduce.
I tried reproducing with xorg-server-1.15.0 and mesa-10.0.3 with glamor from git and I couldn't reproduce with firefox-27.0, cairo-1.12.16, and gtk+-3.10.7/gtk+-2.24.22. What browser does this happen in?
I'm using: Firefox 27.0 xorg-server 1.15.0 gtk3 3.10.7 gtk2 2.24.22 mesa 10.1 from git (git rev-list --count HEAD = 60899) xf86-video-ati 7.3.0 I was using a version of cairo from git as well, to enable hidpi mode in Gnome. But I just reinstalled the stock version and see the same crash. I'll try rolling back mesa next.
Still happens with mesa 10.0.3. And I just verified that it happens with my desktop in single-display mode, and with that single display in 1920x1080, so it doesn't seem to be related to very high resolutions. Are there any additional logs or traces I can send that would help narrow things down?
I reproduced it with a 32 bit version of firefox-27.0, I'll see what I can do.
Created attachment 93821 [details] [review] Possible fix for the crash See if this patch fixes it, I'm not too sure what was wrong the old code but from the looks of it, the code was doing too much work with copying data when all it needed to do was update the location of the data to put into the pixmap.
(In reply to comment #7) > Created attachment 93821 [details] [review] [review] > Possible fix for the crash > > See if this patch fixes it, I'm not too sure what was wrong the old code but > from the looks of it, the code was doing too much work with copying data > when all it needed to do was update the location of the data to put into the > pixmap. IMO, this patch is not correct. The reason why we need to gather the subregion to a contiguous buffer is that OpenGL ES(glTexImage2D/glTexSubImage2D)only support to upload from a contiguous external buffer.
It does eliminate the crash, yes.
(In reply to comment #8) > (In reply to comment #7) > > Created attachment 93821 [details] [review] [review] [review] > > Possible fix for the crash > > > > See if this patch fixes it, I'm not too sure what was wrong the old code but > > from the looks of it, the code was doing too much work with copying data > > when all it needed to do was update the location of the data to put into the > > pixmap. > > IMO, this patch is not correct. The reason why we need to gather the > subregion to a contiguous buffer is that OpenGL > ES(glTexImage2D/glTexSubImage2D)only support to upload from a contiguous > external buffer. I see, is https://www.opengl.org/sdk/docs/man/html/glTexImage2D.xhtml the correct documentation for OpenGL ES? I was having trouble finding the specs. When I set MAX_FBO_SIZE I did notice some corruption, not sure of the cause, but I will work more on this.
(In reply to comment #10) > (In reply to comment #8) > > (In reply to comment #7) > > > Created attachment 93821 [details] [review] [review] [review] [review] > > > Possible fix for the crash > > > > > > See if this patch fixes it, I'm not too sure what was wrong the old code but > > > from the looks of it, the code was doing too much work with copying data > > > when all it needed to do was update the location of the data to put into the > > > pixmap. > > > > IMO, this patch is not correct. The reason why we need to gather the > > subregion to a contiguous buffer is that OpenGL > > ES(glTexImage2D/glTexSubImage2D)only support to upload from a contiguous > > external buffer. > > I see, is https://www.opengl.org/sdk/docs/man/html/glTexImage2D.xhtml the > correct documentation for OpenGL ES? I was having trouble finding the specs. > > > When I set MAX_FBO_SIZE I did notice some corruption, not sure of the cause, > but I will work more on this. I checked the commit: commit 654e1536a6d6539dcc30d9151122536c1807f34a Author: Anthony Waters <awaters1@gmail.com> Date: Wed Jan 29 11:21:13 2014 -0500 glamor: Add in support for the stride parameter when uploading texture data glamor_upload_sub_pixmap_to_texture(temp_pixmap, 0, 0, w, h, - pixmap->devKind, bits, 0); - + 0, bits, 0); glamor_copy_area(&temp_pixmap->drawable, drawable, gc, 0, 0, w, h, x, y); glamor_destroy_pixmap(temp_pixmap); - } else + } else { glamor_upload_sub_pixmap_to_texture(pixmap, x + drawable->x + x_off, y + drawable->y + y_off, - w, h, PixmapBytePad(w, depth), bits, 0); + w, h, 0, bits, 0); + } why do you set the stride to zero at the glamor_upload_sub_pixmap_to_texture()? The original stride is pixmap->devKind which is needed to upload the sub pixmap. I think this is the root cause of this regression. Could you look at it? Thanks.
*** Bug 74841 has been marked as a duplicate of this bug. ***
I had it set to 0 originally because it would cause the xserver to crash, but turns out that crash was related to something else, so changing it back to be that PixmapBytePad is the correct path. I also noticed that I was using the variable i instead of j when getting the data into temp_bits, so I will fix that up and repost the patch. Also, from looking at what the old code did with glamor_put_bits to copy the data from bits to temp_bits it should be equivalent to setting temp_bits to point to the correct place in bits. I believe the only difference would be if temp_stride is different from stride, but this doesn't matter because _glamor_upload_bits_to_pixmap_texture takes the stride parameter into account when uploading the data.
(In reply to comment #13) > I had it set to 0 originally because it would cause the xserver to crash, > but turns out that crash was related to something else, so changing it back > to be that PixmapBytePad is the correct path. I also noticed that I was > using the variable i instead of j when getting the data into temp_bits, so I > will fix that up and repost the patch. > > Also, from looking at what the old code did with glamor_put_bits to copy the > data from bits to temp_bits it should be equivalent to setting temp_bits to > point to the correct place in bits. I believe the only difference would be > if temp_stride is different from stride, but this doesn't matter because > _glamor_upload_bits_to_pixmap_texture takes the stride parameter into > account when uploading the data. I just mentioned in my previous comment. What we need to do in the glamor_upload_sub_pixmap_to_texture() is to gather the sub-region into a contiguous buffer. Let me explain more details here: As the temp_bits and the temp_stride will pass to the function __glamor_upload_pixmap_to_texture() eventually. In that function, it will use the stride as below: dispatch->glPixelStorei(GL_UNPACK_ROW_LENGTH, stride / alignment); But unfortunately, OpenGL ES doesn't support that storage type which means, OpenGL ES will treat the external buffer of glTexImage2D/glTexSubImage2D is always contiguous buffer. The stride will be determined by the width and height of that temp_bit buffer. That is obviously incorrect. So my comments to your new patch is that: 1. change back to pass correct stride value devKind to glamor_upload_sub_pixmap_to_texture(). 2. Don't change the logic in glamor_upload_sub_pixmap_to_texture(). BTW, even when we are using OpenGL which does support GL_UNPACK_ROW_LENGTH, to gather the subregion in glamor_upload_sub_pixmap_to_texture is not really a overhead as what it looks like. As when you set GL_UNPACK_ROW_LENGTH to a larger stride, the OpenGL dri driver will know that this is not a contiguous buffer, and will go to a slow path to upload that buffer.
I think I see, so in OpenGL ES the stride is taken from width/height (so boxes[j].x2 - boxes[j].x1 and boxes[j].y2 - boxes[j].y1), however, because the data in temp_bits is according to w and h, it will copy incorrectly. Assuming I am understanding it correctly, I see how it works now. I'll work on restoring how it used to work and see if the bug is still apparent.
(In reply to comment #0) > Created attachment 93689 [details] > Xorg log from crash > > Hello, > > I decided to try the latest glamor from git today, hearing that it had fixed > some glyph rendering issues. However, I am getting reliable X crashes when > using it. > > The steps I've found to reproduce are: > > 1) Go to gmail or google+ > 2) Open a hangouts chat with someone > > This causes a segfault that results in X closing. I'll attach a log file. > > I do not see this using glamor 0.6.0, so it must be related to one of the > three patches since then. It seems most likely that it is related to one of > the two patches added on January 31st, 2014. > > I'm running a relatively recent (few weeks) from-git version of mesa with a > radeon 7870. Kernel version is 3.12.9. > > If any additional information is desired, I'll be happy to provide it if I > can. I just pushed Anthony's patch to the git master branch. Could you test whether it fix this bug in your system? Thanks.
Yes. Fixed here. Thanks.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.