Bug 92323 - Font rendering issue on Intel GMA X3100/X4500 with Android-x86
Summary: Font rendering issue on Intel GMA X3100/X4500 with Android-x86
Status: RESOLVED INVALID
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
Reported: 2015-10-07 01:01 UTC by Mauro Rossi
Modified: 2017-02-10 22:38 UTC

Attachments
Initial setup systematic artifact (236.16 KB, image/png)
2015-10-07 01:01 UTC, Mauro Rossi
Another systematic artifact (take a look at the O letters) (520.25 KB, image/png)
2015-10-07 01:04 UTC, Mauro Rossi
problematic GUI (1.22 MB, image/png)
2015-10-07 01:08 UTC, Mauro Rossi
Google apps transparencies and glyphs glitches (19.46 KB, image/jpeg)
2015-10-07 01:10 UTC, Mauro Rossi

Description Mauro Rossi 2015-10-07 01:01:57 UTC
Created attachment 118723
Initial setup systematic artifact

There is a nasty font rendering problem affecting Intel GMA X3100/X4500
and other chipsets in the same i965GM driver family.

The problem appeared with Mesa 10.5, persists in Mesa 10.6 and 11.0, and also affects Mesa 11.1.0devel.

Mesa 10.4 branch was completely unaffected.

The font artifacts observed on both kitkat-x86 and lollipop-x86 builds closely resemble the artifacts described in the following links:

https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1342675

https://bugs.launchpad.net/xserver-xorg-video-intel/+bug/1432194

http://ubuntuforums.org/showthread.php?t=2241482

http://askubuntu.com/questions/584922/how-do-i-fix-fonts-not-rendering-and-missing-letters

We tried very hard to isolate the commit causing this, which should be around 10.5 branchpoint ... 10.5.0 release, but a lot happened (format_pack, NIR and i965)

Prime suspect is 

Forcing MSAA antialiasing in Developer Settings does not solve the problem.
The GUI elements shown in the attachments are systematically affected with 10.6, 11.0 and 11.1.0devel, but in the latter the font artifacts started appearing as vertical stripes, while in the former branches

Another clue is that every instance of a given letter shows exactly the same artifact. Settings and Google Play are heavily affected by font transparency, which makes the GUI unusable.

S.O.S.

Mauro
Comment 1 Mauro Rossi 2015-10-07 01:04:54 UTC
Created attachment 118724
Another systematic artifact (take a look at the O letters)

The azure O letters are rendered perfectly;
I hope this helps.
Comment 2 Mauro Rossi 2015-10-07 01:08:48 UTC
Created attachment 118725
problematic GUI
Comment 3 Mauro Rossi 2015-10-07 01:10:01 UTC
Created attachment 118726
Google apps transparencies and glyphs glitches
Comment 4 Mauro Rossi 2015-10-07 01:34:44 UTC
We tried very hard to isolate the commit causing this, which should be around 10.5 branchpoint ... 10.5.0rc1, to be precise.

Commits between 10.4.4 and the 10.5 branchpoint were all reverted, with no effect.

I am available to try reverting key commits, following your kind suggestions, and to report back.

Mauro
Comment 5 Ian Romanick 2015-10-07 01:59:27 UTC
git-bisect is the usual way to find the problem.
Comment 6 Mauro Rossi 2015-10-19 19:57:40 UTC
Hi,

I performed a sequence of builds, reaching the 10.5 branchpoint, and the problem is still there. I need to continue looking at i965-related commits back to the 10.4 branchpoint, to cover everything that happened during the 10.5.0devel stage.

While performing tests I found evidence of a GL error by activating EGL logging.
In each and every case of font artifacts I get the following indication in logcat:

10-18 23:21:25.597  3514  3605 D libEGL  : [glTexSubImage2D] 0x500

I checked that the 0x500 error was systematic by logging 10-12 hazy-font events,
and the timestamps match in all cases.

More info here: https://groups.google.com/forum/m/#!topic/android-x86/zaz57OckjJs

I have enabled the LIBGL_DEBUG, MESA_DEBUG and MESA_LOG_FILE environment variables,
but they produce no useful log output.

Does it help if I use INTEL_DEBUG environment variable?
All flags or a subset?
Will the output go in MESA_LOG_FILE?

Thanks for any pointers
Mauro
Comment 7 Mauro Rossi 2015-10-31 19:48:49 UTC
Hi,

while trying to collect an apitrace on Android L, which is not very easy for me
(but I will not surrender), I have collected the INTEL_DEBUG messages preceding the 0x500 error.

Mauro


W/INTEL-MESA(29959): intelTexSubImage mesa_format MESA_FORMAT_A_UNORM8 target GL_TEXTURE_2D format GL_ALPHA type GL_UNSIGNED_BYTE level 0 1024x512x1

D/libEGL  (29959): [glTexSubImage2D] 0x500

D/glGetError:glTexSubImage2D(29959): #00 pc 00003121  /android/system/lib/libbacktrace.so
D/glGetError:glTexSubImage2D(29959): #01 pc 0000f24b  /android/system/lib/libutils.so
D/glGetError:glTexSubImage2D(29959): #02 pc 0001bead  /android/system/lib/libEGL.so

...

W/INTEL-MESA(29959): intelTexSubImage mesa_format MESA_FORMAT_A_UNORM8 target GL_TEXTURE_2D format GL_ALPHA type GL_UNSIGNED_BYTE level 0 1024x512x1

D/libEGL  (29959): [glTexSubImage2D] 0x500

D/glGetError:glTexSubImage2D(29959): #00 pc 00003121  /android/system/lib/libbacktrace.so
D/glGetError:glTexSubImage2D(29959): #01 pc 0000f24b  /android/system/lib/libutils.so
D/glGetError:glTexSubImage2D(29959): #02 pc 0001bead  /android/system/lib/libEGL.so
...
Comment 8 Mauro Rossi 2015-11-02 02:29:59 UTC
Hi,

I have added a few checkpoints in src/mesa/drivers/dri/i965/intel_texsubimage.c
to log the value of the bool variable named ok.

The result is:

11-01 16:49:04.645  4817  4945 W INTEL-MESA: intelTexSubImage mesa_format MESA_FORMAT_A_UNORM8 target GL_TEXTURE_2D format GL_ALPHA type GL_UNSIGNED_BYTE level 0 1024x512x1
11-01 16:49:04.645  4817  4945 W INTEL-MESA: intelNewTextureObject
11-01 16:49:04.645  4817  4945 W INTEL-MESA: intelNewTextureImage
11-01 16:49:04.652  4817  4945 W INTEL-MESA: MAUROSSI intelTexSubImage _mesa_meta_pbo_TexSubImage() returns: 1
11-01 16:49:04.652  4817  4945 D libEGL  : [glTexSubImage2D] 0x500


This means that _mesa_meta_pbo_TexSubImage() reports that it completed its job.

To check whether the font artifacts were related to the meta implementation, I skipped _mesa_meta_pbo_TexSubImage() completely by commenting it out with /* ... */ and logged the execution of
intel_texsubimage_tiled_memcpy() and _mesa_store_texsubimage().

11-02 02:20:44.825  5078  5096 W INTEL-MESA: intel_texsubimage_tiled_memcpy: MAUROSSI 1st check failed, packing?
11-02 02:20:44.825  5078  5096 W INTEL-MESA: MAUROSSI intelTexSubImage intel_texsubimage_tiled_memcpy() returns: 0
11-02 02:20:44.826  5078  5096 W INTEL-MESA: MAUROSSI intelTexSubImage _mesa_store_texsubimage() executed.

With _mesa_meta_pbo_TexSubImage() skipped, intel_texsubimage_tiled_memcpy() bails out because of packing,
_mesa_store_texsubimage() is executed, and there are no font artifacts at all (triple checked).

This means that the offending commit lies among the following ones:

i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage
i965/tex_image: Use meta for instead of the blitter PBO TexImage and GetTexImage
i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels
meta: Add an implementation of GetTexSubImage for PBOs
meta: Add a BlitFramebuffers-based implementation of TexSubImage

Mauro
Comment 9 Jason Ekstrand 2015-11-03 04:15:55 UTC
This looks like an ugly little bug and I can totally believe that it's caused by that commit.  Do you have a test-case that can reproduce this without Android?  An api-trace should be sufficient if you can get one.
Comment 10 Ian Romanick 2015-11-03 15:18:56 UTC
(In reply to Mauro Rossi from comment #6)
> While performing tests I got evidence of glError by activating EGL logging,
> in each and every case of font artifacts I get the following indication in
> logcat:

Can you run in GDB or similar debugger?  If so, please set a breakpoint at _mesa_error and provide a backtrace.  From within _mesa_error, could you also print the value of ctx->API?

From some out-of-band communication there was a suggestion that the problem could be because of uses of GL_PIXEL_UNPACK_BUFFER and GL_PIXEL_PACK_BUFFER in meta.  Meta usually sets the API to API_OPENGL_COMPAT to avoid problems like this, but it's possible that was missed.  I believe PBOs were added in OpenGL ES 3.0, but i965GM is an OpenGL ES 2.0 part.

FWIW... I'm working on another meta related bug, and I have some patches that may affect the code from the "meta:" tagged commits.
Comment 11 Mauro Rossi 2015-11-03 23:02:00 UTC
Hi,

> Can you run in GDB or similar debugger?  If so, please set a breakpoint at
> _mesa_error and provide a backtrace.  From within _mesa_error, could you also
> print the value of ctx->API?

I'm not skilled enough with GDB, sorry, but I could use DBG(...) to print checkpoint information to logcat.
Apitrace collection is possible on kitkat-x86 while it is still very difficult to perform on lollipop-x86.

We are trying to collect an apitrace on kitkat-x86, while dreaming of one on lollipop-x86, but in the meantime I will proceed with the OpenGL ES version checks.

> From some out-of-band communication there was a suggestion that the problem
> could be because of uses of GL_PIXEL_UNPACK_BUFFER and GL_PIXEL_PACK_BUFFER in
> meta.  Meta usually sets the API to API_OPENGL_COMPAT to avoid problems like
> this, but it's possible that was missed.  I believe PBOs were added in OpenGL ES
> 3.0, but i965GM is an OpenGL ES 2.0 part.

pstglia, a fellow member of the Android-x86 forum, came to the same conclusion: PBOs are an OpenGL ES 3.0 feature, and the binding to the GL_PIXEL_UNPACK_BUFFER/GL_PIXEL_PACK_BUFFER targets fails, returning NULL and causing the 0x500 error.

I think that API_OPENGL_COMPAT is definitely worth a try.
Do you have a proposed patch for us to test and report?

Thanks a lot

Mauro
Comment 12 pstglia 2015-11-05 00:21:52 UTC
Hi all,

> > From some out-of-band communication there was a suggestion that the problem
> > could be because of uses of GL_PIXEL_UNPACK_BUFFER and GL_PIXEL_PACK_BUFFER
> > in meta.  Meta usually sets the API to API_OPENGL_COMPAT to avoid problems
> > like this, but it's possible that was missed.  I believe PBOs were added in
> > OpenGL ES 3.0, but i965GM is an OpenGL ES 2.0 part.
> 
> pstglia, a fellow member of Android-x86 forum came to the same conclusion
> that PBO are OpenGL ES 3.0 and the binding with
> GL_PIXEL_UNPACK_BUFFER/GL_PIXEL_PACK_BUFFER targets fails returning NULL and
> causing the 0x500 error.
> 
> I think that API_OPENGL_COMPAT is definitely worth a try.
> Do you have a proposed patch for us to test and report?

Just to add more info, _mesa_error is returning the following in this case:

glBindBufferARB(target 0x88ec)

It's being raised here:

### src/mesa/main/bufferobj.c:
   bindTarget = get_buffer_target(ctx, target);
   if (!bindTarget) {
      _mesa_error(ctx, GL_INVALID_ENUM, "glBindBufferARB(target 0x%x)", target);
      return;
   }

I didn't trace get_buffer_target directly (for example by printing ctx->API as suggested), but judging by its first check I assume that ctx is neither an API_OPENGL_COMPAT nor an API_OPENGL_CORE context, and that if it is an API_OPENGLES2 context, its ctx->Version is below 30:


### src/mesa/main/bufferobj.c
   /* Other targets are only supported in desktop OpenGL and OpenGL ES 3.0.
    */
   if (!_mesa_is_desktop_gl(ctx) && !_mesa_is_gles3(ctx)
       && target != GL_ARRAY_BUFFER && target != GL_ELEMENT_ARRAY_BUFFER)
      return NULL;


### src/mesa/main/context.h

/**
 * Checks if the context is for any GLES version
 */
static inline bool
_mesa_is_gles(const struct gl_context *ctx)
{
   return ctx->API == API_OPENGLES || ctx->API == API_OPENGLES2;
}


/**
 * Checks if the context is for GLES 3.0 or later
 */
static inline bool
_mesa_is_gles3(const struct gl_context *ctx)
{
   return ctx->API == API_OPENGLES2 && ctx->Version >= 30;
}


Of course, I may have missed something.

Regards,
Pstglia
Comment 13 Mauro Rossi 2015-11-26 00:57:40 UTC
Hi,

I have good news!

The following commit in mesa 11.2.0devel solved the problem:

http://cgit.freedesktop.org/mesa/mesa/commit/?id=89a61afdd7346d6e36caccc4d6f2a2607dc4a1f6

Tested mesa 11.2.0devel with Dell E630 i965GM

I also tested mesa 11.1.0rc1 after a git cherry-pick of the same commit;
the problem of hazy fonts/0x500 GL_INVALID_ENUM is gone.

Mesa 11.0.x would require a backport, by applying the following commit prior to the aforementioned one. Tested on top of current 11.0.6, and it works.

http://cgit.freedesktop.org/mesa/mesa/commit/?id=f30cf3258e495a583e011e07d5b4a19031c5518f

Regards

Mauro
Comment 14 Annie 2017-02-10 22:38:30 UTC
Dear Reporter,

This Mesa bug has been in the "NEEDINFO" status for over 60 days. I am closing this bug based on lack of response but feel free to reopen if resolution is still needed. Please ensure you're supplying the correct information as requested.

Thank you.

