Bug 63632

Summary: mesa +r600 llvm = segfault
Product: Mesa
Reporter: Andy Furniss <adf.lists>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
Severity: normal
Priority: medium
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)

Description Andy Furniss 2013-04-17 09:56:25 UTC
With current mesa and llvm heads I get a segfault running anything; with R600_LLVM=0 it is OK.

Haven't had time to bisect what with the extra hassle of two trees with inter dependencies.

I did find a recent working combination, with mesa on

commit 1d6eb23f2dc1bb53636802cb698e6788ca0a26ac
Author: Roland Scheidegger <sroland@vmware.com>
Date:   Mon Apr 15 03:57:23 2013 +0200

    gallivm: fix small but severe bug in handling multiple lod level strides

and llvm on the commit before

commit ef1762b6a1d3353790bdb415788e7d8963e70372
Author: Nico Rieck <nico.rieck@gmail.com>
Date:   Sun Apr 14 21:18:36 2013 +0000

    Use object file specific section type for initial text section

With llvm on that commit and mesa on the one above I get

/mnt/sdb1/Src64/llvm/include/llvm/MC/MCStreamer.h:224: void llvm::MCStreamer::SwitchSection(const llvm::MCSection*): Assertion `Section && "Cannot switch to a null section!"' failed

With both on head I get a segfault:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff1e9b68c in r600_bytecode_from_byte_stream (num_bytes=0, bytes=0x0, ctx=0x7fffffffc670) at r600_shader.c:593
593             ctx->bc->nstack = bytes[bytes_read++];
(gdb) bt
#0  0x00007ffff1e9b68c in r600_bytecode_from_byte_stream (num_bytes=0, bytes=0x0, ctx=0x7fffffffc670) at r600_shader.c:593
#1  r600_shader_from_tgsi (rscreen=0x634630, pipeshader=0x61da70, key=...) at r600_shader.c:1558
#2  0x00007ffff1e9dd73 in r600_pipe_shader_create (ctx=0x6478f0, shader=0x61da70, key=...) at r600_shader.c:132
#3  0x00007ffff1eb06db in r600_shader_select (ctx=0x6478f0, sel=0x61d910, dirty=0x0) at r600_state_common.c:747
#4  0x00007ffff1eb0876 in r600_create_shader_state (ctx=0x6478f0, pipe_shader_type=0, state=<optimized out>) at r600_state_common.c:794
#5  0x00007ffff1dc2f16 in ureg_create_shader (ureg=0x618fa0, pipe=0x6478f0, so=0x0) at tgsi/tgsi_ureg.c:1701
#6  0x00007ffff1de8172 in ureg_create_shader_with_so_and_destroy (so=0x0, pipe=0x6478f0, p=0x618fa0) at ./tgsi/tgsi_ureg.h:131
#7  util_make_vertex_passthrough_shader_with_so (pipe=0x6478f0, num_attribs=2, semantic_names=0x7fffffffdd70, semantic_indexes=0x7fffffffdd80, so=0x0) at util/u_simple_shaders.c:98
#8  0x00007ffff1dcf4a6 in util_blitter_create (pipe=0x6478f0) at util/u_blitter.c:301
#9  0x00007ffff1e8e59c in r600_create_context (screen=0x634630, priv=0x0) at r600_pipe.c:466
#10 0x00007ffff1cf7a56 in st_api_create_context (stapi=<optimized out>, smapi=0x633e60, attribs=0x7fffffffdee0, error=0x7fffffffdf0c, shared_stctxi=0x0) at ../../src/mesa/state_tracker/st_manager.c:633
#11 0x00007ffff1ebdbf3 in dri_create_context (api=<optimized out>, visual=0x6388e0, cPriv=<optimized out>, major_version=<optimized out>, minor_version=<optimized out>, flags=<optimized out>, error=0x7fffffffdfdc, 
    sharedContextPrivate=0x0) at dri_context.c:122
#12 0x00007ffff1bc36aa in dri2CreateContextAttribs (screen=0x633ca0, api=<optimized out>, config=0x6388e0, shared=<optimized out>, num_attribs=<optimized out>, attribs=<optimized out>, error=0x7fffffffdfdc, data=0x647680)
    at ../../../../src/mesa/drivers/dri/common/dri_util.c:288
#13 0x00007ffff1bc37dd in dri2CreateNewContextForAPI (screen=<optimized out>, api=<optimized out>, config=<optimized out>, shared=<optimized out>, data=<optimized out>) at ../../../../src/mesa/drivers/dri/common/dri_util.c:306
#14 0x00007ffff716dd88 in dri2_create_context (base=0x617290, config_base=0x642ac0, shareList=<optimized out>, renderType=<optimized out>) at dri2_glx.c:230
#15 0x00007ffff7142b8e in CreateContext (dpy=0x604050, generic_id=185, config=0x642ac0, shareList_user=0x0, allowDirect=<optimized out>, code=24, renderType=32788, screen=0) at glxcmds.c:274
#16 0x00007ffff7142dba in glXCreateNewContext (dpy=<optimized out>, fbconfig=<optimized out>, renderType=<optimized out>, shareList=<optimized out>, allowDirect=<optimized out>) at glxcmds.c:1591
#17 0x00007ffff73cdf61 in fghCreateNewContext (window=<optimized out>) at freeglut_window.c:458
#18 0x00007ffff73ce81b in fgOpenWindow (window=0x6139e0, title=0x402f20 "Gears", positionUse=0 '\000', x=-1, y=-1, sizeUse=1 '\001', w=300, h=300, gameMode=0 '\000', isSubWindow=0 '\000') at freeglut_window.c:1228
#19 0x00007ffff73cd182 in fgCreateWindow (parent=0x0, title=0x402f20 "Gears", positionUse=0 '\000', x=-1, y=-1, sizeUse=1 '\001', w=300, h=300, gameMode=0 '\000', isMenu=0 '\000') at freeglut_structure.c:108
#20 0x00007ffff73cea12 in glutCreateWindow (title=0x402f20 "Gears") at freeglut_window.c:1583
#21 0x0000000000401553 in main (argc=1, argv=0x7fffffffe568) at gears.c:391
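For illustration only, and not the fix that eventually landed: frame #0 shows bytes=0x0 and num_bytes=0, so the read at r600_shader.c:593 goes through a null pointer. A minimal sketch of the kind of guard that would turn this into an error return, assuming hypothetical names (parse_header, struct parsed_header) that are not part of Mesa:

```c
#include <stddef.h>

/* Hypothetical stand-in for the shader state filled from the byte stream. */
struct parsed_header {
    unsigned nstack;
};

/*
 * Returns 0 on success, -1 if the stream is missing or empty.
 * The check runs before the first bytes[...] access, which is the
 * access that faults in the backtrace (bytes=0x0, num_bytes=0).
 */
static int parse_header(const unsigned char *bytes, size_t num_bytes,
                        struct parsed_header *out)
{
    size_t bytes_read = 0;

    if (bytes == NULL || num_bytes == 0 || out == NULL)
        return -1;

    /* Same shape as line 593: the first byte carries the stack size. */
    out->nstack = bytes[bytes_read++];
    return 0;
}
```

A caller would check the return value and fail shader creation instead of crashing. This only hides the symptom; the empty stream coming back from the LLVM side is the actual problem.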
Comment 1 Andy Furniss 2013-04-19 21:55:08 UTC
I am still getting segfaults with updated mesa/llvm, though the bt is a bit different.

I can reproduce with a new 64 bit LFS and an old 32 bit one.

Have put RV790 back in this box so it's not just RS880.

Program received signal SIGSEGV, Segmentation fault.
0xb6451433 in r600_llvm_compile (mod=0x80658c0, inst_bytes=inst_bytes@entry=0xbfffa7d4, inst_byte_count=inst_byte_count@entry=0xbfffa7d8, family=CHIP_RV770, ngpr=0x806425c, dump=dump@entry=0) at r600_llvm.c:567
567             *ngpr = util_le32_to_cpu(*(uint32_t*)binary.config);
(gdb) bt
#0  0xb6451433 in r600_llvm_compile (mod=0x80658c0, inst_bytes=inst_bytes@entry=0xbfffa7d4, inst_byte_count=inst_byte_count@entry=0xbfffa7d8, family=CHIP_RV770, ngpr=0x806425c, dump=dump@entry=0) at r600_llvm.c:567
#1  0xb6433f16 in r600_shader_from_tgsi (rscreen=0x8074cf0, pipeshader=pipeshader@entry=0x8064230, key=...) at r600_shader.c:1464
#2  0xb6435135 in r600_pipe_shader_create (ctx=ctx@entry=0x805b850, shader=0x8064230, key=...) at r600_shader.c:132
#3  0xb6448f6b in r600_shader_select (ctx=ctx@entry=0x805b850, sel=sel@entry=0x8094230, dirty=dirty@entry=0x0) at r600_state_common.c:747
#4  0xb6449134 in r600_create_shader_state (ctx=0x805b850, state=<optimized out>, pipe_shader_type=0) at r600_state_common.c:794
#5  0xb633d0a3 in ureg_create_shader (ureg=ureg@entry=0x808f8f0, pipe=pipe@entry=0x805b850, so=so@entry=0x0) at tgsi/tgsi_ureg.c:1701
#6  0xb636a434 in ureg_create_shader_with_so_and_destroy (so=0x0, pipe=0x805b850, p=0x808f8f0) at ./tgsi/tgsi_ureg.h:131
#7  util_make_vertex_passthrough_shader_with_so (pipe=pipe@entry=0x805b850, num_attribs=num_attribs@entry=2, semantic_names=semantic_names@entry=0xbffff16c, semantic_indexes=semantic_indexes@entry=0xbffff20c, so=so@entry=0x0) at util/u_simple_shaders.c:98
#8  0xb636a48f in util_make_vertex_passthrough_shader (pipe=pipe@entry=0x805b850, num_attribs=num_attribs@entry=2, semantic_names=semantic_names@entry=0xbffff16c, semantic_indexes=semantic_indexes@entry=0xbffff20c) at util/u_simple_shaders.c:64
#9  0xb634c604 in util_blitter_create (pipe=pipe@entry=0x805b850) at util/u_blitter.c:301
#10 0xb64248a1 in r600_create_context (screen=0x8074cf0, priv=0x0) at r600_pipe.c:466
#11 0xb6261a9b in st_api_create_context (stapi=0xb77752c0 <st_gl_api>, smapi=0x80747b0, attribs=0xbffff474, error=0xbffff470, shared_stctxi=0x0) at ../../src/mesa/state_tracker/st_manager.c:633
#12 0xb645777c in dri_create_context (api=API_OPENGL_COMPAT, visual=0x80787f8, cPriv=0x807fc28, major_version=1, minor_version=0, flags=0, error=0xbffff53c, sharedContextPrivate=0x0) at dri_context.c:124
#13 0xb611b89d in dri2CreateContextAttribs (screen=screen@entry=0x80746f8, api=api@entry=0, config=config@entry=0x80787f8, shared=shared@entry=0x0, num_attribs=num_attribs@entry=0, attribs=attribs@entry=0x0, error=error@entry=0xbffff53c, data=data@entry=0x807f518)
    at ../../../../src/mesa/drivers/dri/common/dri_util.c:288
#14 0xb611ba17 in dri2CreateNewContextForAPI (screen=screen@entry=0x80746f8, api=api@entry=0, config=config@entry=0x80787f8, shared=shared@entry=0x0, data=data@entry=0x807f518) at ../../../../src/mesa/drivers/dri/common/dri_util.c:306
#15 0xb611ba4f in dri2CreateNewContext (screen=0x80746f8, config=0x80787f8, shared=0x0, data=0x807f518) at ../../../../src/mesa/drivers/dri/common/dri_util.c:314
#16 0xb7f11209 in dri2_create_context (base=0x805b300, config_base=0x809ed70, shareList=0x0, renderType=32788) at dri2_glx.c:230
#17 0xb7ee73f9 in CreateContext (dpy=dpy@entry=0x804c050, generic_id=545, config=0x809ed70, shareList_user=shareList_user@entry=0x0, allowDirect=allowDirect@entry=1, code=code@entry=3, renderType=32788, screen=screen@entry=0) at glxcmds.c:274
#18 0xb7ee7c15 in glXCreateContext (dpy=0x804c050, vis=0x805b6b8, shareList=0x0, allowDirect=1) at glxcmds.c:379
#19 0xb7cb7c6a in __glutCreateWindow (parent=0x0, x=0, y=0, width=300, height=300, gameMode=0) at glut_win.c:609
#20 0xb7cb7e11 in glutCreateWindow (title=title@entry=0x804a900 "Gears") at glut_win.c:731
#21 0x0804906f in main (argc=1, argv=0xbffff7f4) at gears.c:391
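A note on this second trace: the fault is on the dereference of binary.config at r600_llvm.c:567. The trace does not show the contents of binary, so a null or too-short config buffer is only an assumption here. Under that assumption, a defensive little-endian read could look like the sketch below; read_le32_config is a hypothetical helper, not a Mesa function:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a little-endian 32-bit value from the start
 * of a config blob. Returns 0 on success, -1 if the blob is missing or
 * shorter than four bytes.
 */
static int read_le32_config(const unsigned char *config, size_t config_size,
                            uint32_t *value)
{
    if (config == NULL || config_size < 4 || value == NULL)
        return -1;

    /* Assemble byte by byte so the result is independent of host endianness. */
    *value = (uint32_t)config[0]
           | ((uint32_t)config[1] << 8)
           | ((uint32_t)config[2] << 16)
           | ((uint32_t)config[3] << 24);
    return 0;
}
```

Reading byte by byte also avoids the unaligned uint32_t cast seen on line 567, at the cost of a few extra instructions.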
Comment 2 Tom Stellard 2013-04-20 06:07:16 UTC
I can't reproduce this with LLVM r179895 and Mesa 12eab7cc564a6928197f9b87ded9e368e56976f0

Have you done full rebuilds of both projects?
Comment 3 Andy Furniss 2013-04-20 09:48:23 UTC
(In reply to comment #2)
> I can't reproduce this with LLVM r179895 and Mesa
> 12eab7cc564a6928197f9b87ded9e368e56976f0
> 
> Have you done full rebuilds of both projects?

Yes, I always do make [dist]clean and git clean -dfx.

I have just deleted both trees and re-cloned to be sure, but the segfault is still there.

When I was on my working commits, moving either llvm or mesa to head while keeping the other on "working" produced the segfault (which is why I didn't do a proper bisect).

I always clean and rebuild mesa after llvm has changed.
Comment 4 Tom Stellard 2013-04-22 21:56:38 UTC
(In reply to comment #3)
> Yes, I always do make [dist]clean and git clean -dfx.
> 
> I have just deleted both trees and re-cloned to be sure, but the segfault is
> still there.
> 
> When I was on my working commits moving either llvm or mesa to head while
> keeping the other on "working" produced the segfault (which is why I didn't
> do a proper bisect).
> 
> I always clean and rebuild mesa after llvm has changed.

I was able to reproduce this on my Gentoo system, but not on either of my Fedora systems. I will investigate further. What distro are you using?
Comment 5 Andy Furniss 2013-04-22 23:06:22 UTC
(In reply to comment #4)
> I was able to reproduce this on my gentoo system, but not on either of my
> fedora systems.  I will investigate further, what distro are you using?

I use Linux From Scratch and can reproduce on an old 32-bit build and a more recent pure 64-bit one.
Comment 6 Michel Dänzer 2013-04-23 09:13:55 UTC
Which version of libelf is used in each case? I was running into problems with the one from http://www.mr511.de/software/ but the one from Fedora's elfutils works fine. Tom told me on IRC the former requires an additional initialization step.
Comment 7 Tom Stellard 2013-04-23 14:32:49 UTC
(In reply to comment #4)
> I was able to reproduce this on my gentoo system, but not on either of my
> fedora systems.  I will investigate further, what distro are you using?

The problem on my gentoo system was that I had removed --enable-shared from my llvm configure script a few days ago, so I was still linking with an older LLVM.

Can you check that you are passing --enable-shared when configuring LLVM?
Comment 8 Andy Furniss 2013-04-23 19:00:29 UTC
(In reply to comment #7)

> The problem on my gentoo system was that I had removed --enable-shared from
> my llvm configure script a few days ago, so I was still linking with an
> older LLVM.
> 
> Can you check that you are passing --enable-shared when configuring LLVM?

I wasn't, but passing it does not prevent the segfault - will look into libelf.
Comment 9 Andy Furniss 2013-04-23 19:05:17 UTC
(In reply to comment #6)
> Which version of libelf is used in each case? I was running into problems
> with the one from http://www.mr511.de/software/ but the one from Fedora's
> elfutils works fine. Tom told me on IRC the former requires an additional
> initialization step.

On my old 32-bit setup - haven't got a clue, it was ages ago :-)

On the 64-bit build I have only recently installed it, as it became required for llvm - I used the source from Debian sid. Have now tried it vanilla and with the Debian diff, but it still segfaults.

Will look into the Fedora version.
Comment 10 Tom Stellard 2013-04-23 19:08:10 UTC
(In reply to comment #8)
> I wasn't, but passing it does not prevent the segfault - will look into
> libelf.

Is it the same segfault?  If the problem is libelf you will see a segfault in radeon_llvm_emit.cpp.  Are you building mesa with --enable-opencl or --with-llvm-shared-libs?
Comment 11 Andy Furniss 2013-04-23 19:28:51 UTC
(In reply to comment #10)
> Is it the same segfault?  If the problem is libelf you will see a segfault
> in radeon_llvm_emit.cpp.  Are building mesa with --enable-opencl or
> --with-llvm-shared-libs ?

It is the same segfault as in comment 1.

I was building mesa with neither option.

Have just tried --with-llvm-shared-libs but get the same segfault.
Comment 12 Andy Furniss 2013-04-23 20:26:59 UTC
(In reply to comment #11)
> It is the same segfault as comment1.
> 
> I was building mesa with neither option.
> 
> Have just tried --with-llvm-shared-libs but get the same segfault.

Just tried with the two elf patches you posted to the list and the segfault is fixed.
Comment 13 Michel Dänzer 2013-04-24 07:46:51 UTC
(In reply to comment #9)
> On the 64 bit build I have only recently installed as it [libelf] became
> required for llvm - I used the source from debian sid.

For the record, that's ambiguous, as Debian has both variants as libelfg0{,-dev} and libelf{1,-dev} respectively.
Comment 14 Andy Furniss 2013-04-24 10:47:00 UTC
(In reply to comment #13)
> (In reply to comment #9)
> > On the 64 bit build I have only recently installed as it [libelf] became
> > required for llvm - I used the source from debian sid.
> 
> For the record, that's ambiguous, as Debian has both variants as
> libelfg0{,-dev} and libelf{1,-dev} respectively.

Ahh I guess I got the wrong one :-)

To be more specific, I just used the orig from below, then tried with the diff applied after you mentioned libelf could be the problem:

http://packages.debian.org/source/sid/libelf
Comment 15 Andy Furniss 2013-04-27 14:27:01 UTC
Working with current gits now that the patches are in.
