Created attachment 117080 [details] exec_list fix for optimizing compilers Observed: Segfaulting and optimization passes that fail to terminate. Problems were observed with both GCC and Clang. Expected: None of it. Analysis: Optimizing compilers will ignore modification through the sentinel nodes cast from the list structure. Code like while (! list.is_empty()) ... list.pop_head(); ... breaks, because the compiler assumes that the list structure doesn't change. Fix: Good old Amiga-style lists can still work with modern, optimizing compilers as long as they are told to check for side effects on the sentinels' respective next / prev pointers. See contained patch - it's actually quite simple and should leave most optimizations intact.
Created attachment 117094 [details] [review] alternative / more thorough fix using union types
`configure.ac` at lines 282 and 329 reads: # Work around aliasing bugs - developers should comment this out CFLAGS="$CFLAGS -fno-strict-aliasing" This workaround may keep the effects from showing - and also prevents lots of useful compiler optimizations. These lines will need to be commented out before recreating the buildfiles for reproducing the problems (I encountered them in a derived codebase). An alternative, more thorough solution to my first patch would be telling the compiler what's really going on: It can be done using union types - as in the second patch. This approach results in ~4K less executable size. In this case, it even works without volatile qualifiers for GCC (tested with version 4.8.3, Gentoo build). I kept the previous members because a lot of the codebase relies on their presence. I also kept the volatile qualifiers, since the generated code was smallest and did not run measurably slower. The patch optimistically removes the constructor of 'exec_node' (what's a node's life without a list?) and I didn't experience problems. Since types in unions may not have constructors, the recipe to keep that constructor would be a separate struct 'exec_node_c' (C-style POD type, without constructor) that 'exec_node' (with constructor) inherits from. Both variants work with "-O6 -fstrict-aliasing".
Successfully tested with Clang 3.5.1.
FWIW, There's been a lot of discussion about this and the interactions with strict-aliasing on the mesa-dev mailing list recently.
Thanks for the pointer. There are external projects using the glsl-compiler code and it'd be great to see a fixed upstream. We're running it in the browser via emscripten where optimized builds are crucial to achieve reasonable speed and download size. I don't really care about the debate on principles, whether the rest of Mesa should obey strict aliasing rules or not. I neither care whose patch fixes the issue. That being said, there are some advantages over the changes proposed on the mailing list: o My first patch already gets the job done. It may not be "clean" in terms of language lawery, but it is very much "to the point", as I actually traced the assembly to spot the one assumption that breaks it all. The patch can very easily be approved to be non-breaking. o My second patch is a 1:1 translation of the introductory inline comment into code. Why not share our cleverness with the compiler so it can act less stupid? It's API compatible, but allows for a optional and gradual rewrite of code that uses 'head', 'tail', and 'tail_pred' directly as a cleanup step. Other than PATCHv3 from the mailing list, it does not increase the memory footprint of 'exec_list' - in fact, it's binary compatible. I also added #pragma pack(1) so its binary format is portable to ABIs that prefer an alignment of more than one address size (x32 may be one of these, but I'm not sure).
This issue should be solved by: author Matt Turner <mattst88@gmail.com> commit d1f6f656973a2e18641441e3c97b30799a82de52 (patch) glsl: Separate overlapping sentinel nodes in exec_list. I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing. No difference in OglBatch7 (n=20). Co-authored-by: Davin McCall <davmac@davmac.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(In reply to Timothy Arceri from comment #6) Oh, thanks Tim. I'd totally forgotten that this bug existed.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.