33220 – Use of talloc_autofree_context causes segfault when dynamically loaded

Summary: Use of talloc_autofree_context causes segfault when dynamically loaded

Status:	RESOLVED WONTFIX

Product:	Mesa
Classification:	Unclassified
Component:	glsl-compiler
Version:	7.9
Hardware:	Other All

Importance:	medium normal
Assignee:	mesa-dev

Reported:	2011-01-17 19:33 UTC by Bryan Henderson
Modified:	2016-07-30 03:35 UTC
CC List:	0 users

Description Bryan Henderson 2011-01-17 19:33:39 UTC

The fix for 29722 makes glsl use talloc_autofree_context(), which sets up an atexit that refers to a location in the talloc library.

If the talloc library is bound into the module that contains glsl and that module gets dynamically unloaded, the atexit refers to memory that no longer exists.  As the program exits, the libc exit code refers to that memory and segfaults or worse.

talloc_autofree_context() warns against doing this.  But it says that on some platforms, it isn't a problem because the atexit handler runs at module unload time instead of program exit time.  Modern GNU libc is one of those.
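The ordering problem described above can be illustrated with a toy model. This is only a sketch: it makes no real dlopen(), dlclose() or atexit() calls, and `ProcessModel`, `dangling_calls_after_unload` and the integer module ids are hypothetical names invented for the illustration. It counts how many exit handlers would be called after the module containing their code has been unmapped, under each of the two platform behaviours mentioned in the description.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model only: no real dlopen(), dlclose() or atexit() calls are made.
// A module is a plain integer id; a handler is recorded as the id of the
// module whose code it would jump into.
struct ProcessModel {
    explicit ProcessModel(bool on_unload) : run_handlers_on_unload(on_unload) {}

    bool run_handlers_on_unload;   // true models the "runs at unload" platforms
    std::vector<int> mapped;       // modules currently loaded
    std::vector<int> handlers;     // owning module of each registered handler
    int handlers_run = 0;          // handlers run while their code was mapped
    int dangling_calls = 0;        // handlers called after their code was unmapped

    void load(int module) { mapped.push_back(module); }

    void register_exit_handler(int module) { handlers.push_back(module); }

    void unload(int module) {
        if (run_handlers_on_unload) {
            // Run and drop this module's handlers while its code still exists.
            handlers_run += static_cast<int>(
                std::count(handlers.begin(), handlers.end(), module));
            handlers.erase(std::remove(handlers.begin(), handlers.end(), module),
                           handlers.end());
        }
        mapped.erase(std::remove(mapped.begin(), mapped.end(), module),
                     mapped.end());
    }

    void exit_process() {
        // Remaining handlers run in reverse order of registration.
        for (auto it = handlers.rbegin(); it != handlers.rend(); ++it) {
            bool still_mapped =
                std::find(mapped.begin(), mapped.end(), *it) != mapped.end();
            if (still_mapped)
                ++handlers_run;
            else
                ++dangling_calls;   // on a real system: a jump into unmapped memory
        }
        handlers.clear();
    }
};

// Load a module, let it register a handler, unload it, then exit.
// Returns the number of handler calls that would land in unmapped memory.
int dangling_calls_after_unload(bool run_handlers_on_unload) {
    ProcessModel process(run_handlers_on_unload);
    const int glsl_module = 1;                   // hypothetical module id
    process.load(glsl_module);
    process.register_exit_handler(glsl_module);  // what the autofree context does
    process.unload(glsl_module);
    process.exit_process();
    assert(process.handlers_run + process.dangling_calls == 1);
    return process.dangling_calls;
}
```

With `run_handlers_on_unload` set to false the single handler is still registered at exit and points into the unloaded module; with it set to true the handler has already run by the time the module goes away.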

Comment 1 Ian Romanick 2011-02-08 16:00:43 UTC

Now that talloc has been replaced with ralloc (which is built into Mesa), does this issue still exist?

Comment 2 Bryan Henderson 2011-02-11 19:46:30 UTC

Yes, the problem is the same.  ralloc ends up in a dynamically loaded module and ralloc_autofree_context() sets up code within ralloc as an atexit handler.  On a traditional platform, that handler runs at program exit time, after the dynamically loaded module has been unloaded.

Comment 3 Kenneth Graunke 2012-01-25 02:03:28 UTC

I see a few options:

1. Use C++ static objects with destructors instead of the talloc-style autofree context.  These are supposed to work in libraries, though I'm uncertain whether all platforms actually do that correctly.  You also might need the -fuse-cxa-atexit compiler flag.  If it does work, this would be fairly easy.

2. Avoid the autofree context altogether, by plumbing a context in core Mesa, to be created and destroyed at the same time.  Figuring out exactly when to delete or recreate this is tricky, and it may also involve refactoring glsl_type code to not use static objects whose constructors rely on this context existing.
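A minimal sketch of what option 1 might look like, assuming a mock allocator in place of talloc/ralloc. `mock_context_create`, `mock_context_free`, `ContextOwner` and `type_ctx_owner` are hypothetical names, not Mesa or talloc API. The point is only that the context's lifetime is tied to an object with static storage duration, so cleanup is left to the platform's static-destructor mechanism rather than to an explicit atexit() call. Whether that destructor runs at module unload still depends on the platform and compiler flags, as noted above.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for the allocator; NOT the real talloc/ralloc API.
int live_contexts = 0;

void *mock_context_create() {
    void *ctx = std::malloc(1);
    assert(ctx != nullptr);
    ++live_contexts;
    return ctx;
}

void mock_context_free(void *ctx) {
    std::free(ctx);
    --live_contexts;
}

// Owns one context for exactly as long as the object itself lives.
class ContextOwner {
public:
    ContextOwner() : ctx_(mock_context_create()) {}
    ~ContextOwner() { mock_context_free(ctx_); }

    // Non-copyable, so the context is never freed twice.
    ContextOwner(const ContextOwner &) = delete;
    ContextOwner &operator=(const ContextOwner &) = delete;

    void *get() const { return ctx_; }

private:
    void *ctx_;
};

// Static storage duration: constructed during static initialisation and
// destroyed by whatever mechanism the platform uses for static destructors.
ContextOwner type_ctx_owner;
```

Code that currently takes the autofree context would, under this sketch, use `type_ctx_owner.get()` instead.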

As my platform uses modern GNU libc, I can't reproduce this issue, and unfortunately don't think I'll have time to solve this in the near future.  Patches welcome...

Comment 4 Bryan Henderson 2012-01-25 09:25:17 UTC

And I can't help because it was so long ago that I no longer have the build and test environment and knowledge to work on it.

The workaround I have been using (probably just undoing the fix for 29722) is to replace

  glsl_type::mem_ctx = talloc_autofree_context();

with

  glsl_type::mem_ctx = talloc_init("glsl_type");

and then at screen close time do

  talloc_free(glsl_type::mem_ctx);
  glsl_type::mem_ctx = NULL;

I believe the cost of that is a crash if you try to open a screen twice in the same program instance, because for the second open, static variables set at the first open will still be pointing to the memory freed during the first close.

I don't know what causes a screen to get opened twice and I apparently have not done it yet, so I assume I never will and I am safe.
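The double-open hazard mentioned above can be sketched as follows. Everything here is a hypothetical stand-in: `MockTypeCache`, `screen_open` and `screen_close` are invented for the illustration, and malloc/free play the role of talloc_init() and talloc_free(). The sketch assumes the stale state is a single cached static pointer and shows one way a close routine could reset it so that a second open rebuilds it. The real glsl_type code may hold more such statics.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins; not the real talloc API or Mesa's glsl_type.
struct MockTypeCache {
    static void *mem_ctx;      // plays the role of glsl_type::mem_ctx
    static int *cached_type;   // a static that points into mem_ctx's memory
};
void *MockTypeCache::mem_ctx = nullptr;
int *MockTypeCache::cached_type = nullptr;

void screen_open() {
    if (MockTypeCache::mem_ctx == nullptr) {
        // Stands in for talloc_init("glsl_type").
        MockTypeCache::mem_ctx = std::malloc(sizeof(int));
        assert(MockTypeCache::mem_ctx != nullptr);
    }
    if (MockTypeCache::cached_type == nullptr) {
        // Built once inside the context, then reused by later callers.
        MockTypeCache::cached_type = static_cast<int *>(MockTypeCache::mem_ctx);
        *MockTypeCache::cached_type = 42;
    }
}

void screen_close() {
    // Stands in for talloc_free(glsl_type::mem_ctx).
    std::free(MockTypeCache::mem_ctx);
    MockTypeCache::mem_ctx = nullptr;
    // Without this reset, cached_type would keep pointing at the freed block
    // and a second screen_open() would reuse it.
    MockTypeCache::cached_type = nullptr;
}
```

Calling `screen_open()`, `screen_close()` and `screen_open()` again leaves `cached_type` pointing into the newly created context rather than the freed one.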

Comment 5 Kenneth Graunke 2012-01-25 12:52:08 UTC

Reassigning to the list in case someone else has the setup to reproduce this and work on a fix.

Comment 6 Timothy Arceri 2016-07-30 03:35:36 UTC

I don't think anyone is going to fix this and since no one else has complained I'm assuming nobody cares. Closing.
