Bug 99515

Summary: SIGSEGV MAPERR on Android nougat-x86 with mesa 17.0.0rc
Product: Mesa
Reporter: Mauro Rossi <issor.oruam>
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
Version: 17.0
Hardware: x86-64 (AMD64)
OS: other
Attachments: Dmesg, Logcat, addr2line log, various EGL issues with addr2line logs

Description Mauro Rossi 2017-01-24 13:24:30 UTC
Created attachment 129125 [details]
Dmesg

Hi,

I've been seeing SIGSEGV MAPERR crashes with radeonsi (HD7750 and HD7950).
They can be triggered in various situations/primitives,
such as EGLSwapBuffersWithDamage, EGLterminate and others,
but the crash always has the same outcome and backtrace.

I think it is important to highlight that the issue started happening with nougat, which uses multiple threads/render threads, whereas marshmallow used a single thread and is not affected.

This may have exposed a non-thread-safe path in the radeonsi code,
because other drivers (intel, nouveau with Ilia's draft locking patches)
are not affected by this issue.

Dmesg, Logcat and addr2line log are in the attachments.
I'm available to provide further info and to test patches, if needed.
Mauro

--------- beginning of crash
01-21 19:30:40.195  1457  1457 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 1457 (surfaceflinger)
01-21 19:30:40.195  1443  1443 W         : debuggerd: handling request: pid=1457 uid=0 gid=1003 tid=1457
01-21 19:30:40.196  1743  1743 E         : debuggerd: Unable to connect to activity manager (connect failed: No such file or directory)
01-21 19:30:40.246  1743  1743 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-21 19:30:40.246  1743  1743 F DEBUG   : Build fingerprint: 'Android-x86/android_x86_64/x86_64:7.1.1/NMF26O/utente12301216:eng/test-keys'
01-21 19:30:40.246  1743  1743 F DEBUG   : Revision: '0'
01-21 19:30:40.246  1743  1743 F DEBUG   : ABI: 'x86_64'
01-21 19:30:40.246  1743  1743 F DEBUG   : pid: 1457, tid: 1457, name: surfaceflinger  >>> /system/bin/surfaceflinger <<<
01-21 19:30:40.246  1743  1743 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
01-21 19:30:40.246  1743  1743 F DEBUG   :     rax 000077505a696cc0  rbx 000077505fa75a20  rcx 0000000000000000  rdx 0000000000000020
01-21 19:30:40.246  1743  1743 F DEBUG   :     rsi 0000000000000000  rdi 000077505a696cc0
01-21 19:30:40.246  1743  1743 F DEBUG   :     r8  ffffffffffffffd8  r9  0400050000000000  r10 0000000000000002  r11 0000000000000246
01-21 19:30:40.246  1743  1743 F DEBUG   :     r12 000077505fa75600  r13 0000000001000000  r14 000077505fa75a48  r15 000077505a6c3980
01-21 19:30:40.246  1743  1743 F DEBUG   :     cs  0000000000000033  ss  000000000000002b
01-21 19:30:40.246  1743  1743 F DEBUG   :     rip 000077505eb7b5f8  rbp 0000000000500000  rsp 00007ffea2d36c70  eflags 0000000000010206
01-21 19:30:40.326  1743  1743 F DEBUG   : 
01-21 19:30:40.326  1743  1743 F DEBUG   : backtrace:
01-21 19:30:40.326  1743  1743 F DEBUG   :     #00 pc 000000000024f5f8  /system/lib64/dri/gallium_dri.so
01-21 19:30:40.326  1743  1743 F DEBUG   :     #01 pc 0000000000262843  /system/lib64/dri/gallium_dri.so


utente@utente-MS-7576:~/nougat-x86_kernel_49$ addr2line -Cfe out/target/product/x86_64/symbols/system/lib64/dri/gallium_dri.so
000000000024f5f8
list_add
/proc/self/cwd/external/mesa/src/util/list.h:62
0000000000262843
pb_destroy
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_buffer.h:232
0000000000260816
Comment 1 Mauro Rossi 2017-01-24 13:25:25 UTC
Created attachment 129126 [details]
Logcat
Comment 2 Mauro Rossi 2017-01-24 13:25:53 UTC
Created attachment 129127 [details]
addr2line log
Comment 3 Mauro Rossi 2017-01-30 21:26:31 UTC
Created attachment 129239 [details]
various EGL issues with addr2line logs
Comment 4 Mauro Rossi 2017-04-03 23:09:33 UTC
Hi,

the problem has disappeared in the last week.
I've seen a series of commits for radeonsi,
but could a developer point out which changes
may have fixed the NULL pointer dereference (SEGV_MAPERR at address 0x0)
happening in list_add after destroy_buffer_locked?


11-27 18:00:55.044  3215  3215 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0

addr2line output (when the issue was present):

utente@utente-MS-7576:~/nougat-x86_kernel_49$ addr2line -Cfe out/target/product/x86_64/symbols/system/lib64/dri/gallium_dri.so
245108
list_add
/proc/self/cwd/external/mesa/src/util/list.h:62
5eb80e
destroy_buffer_locked
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_cache.c:50
248872
pb_destroy
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_buffer.h:232
250044
r600_fence_reference
/proc/self/cwd/external/mesa/src/gallium/drivers/radeon/r600_pipe_common.c:1072
2f5a8c
dri_flush
/proc/self/cwd/external/mesa/src/gallium/state_trackers/dri/dri_drawable.c:527
^C

Besides closing this bug, this information is important for backporting the fix to mesa 17.0.x and mesa 13.0.x.

Mauro
Comment 5 Mauro Rossi 2017-04-04 00:30:27 UTC
The previous post shows what happened with the x86_64 nougat-x86 build throughout the whole mesa 17.0.x cycle.

Here follows what still happens with mesa 17.0.3 with the x86 build of nougat-x86.
I hope that combining the information from the x86_64 and x86 builds helps to figure out the issue.

Mauro

--------- beginning of crash
04-04 00:08:20.947  1528  1889 F libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x8 in tid 1889 (Thread-2)
04-04 00:08:20.947  1234  1234 W         : debuggerd: handling request: pid=1528 uid=1000 gid=1000 tid=1889
04-04 00:08:20.957  1902  1902 E         : debuggerd: Unable to connect to activity manager (connect failed: No such file or directory)
04-04 00:08:20.957  1902  1902 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
04-04 00:08:20.957  1902  1902 F DEBUG   : Build fingerprint: 'Android-x86/android_x86/x86:7.1.1/NMF26O/utente01090017:eng/test-keys'
04-04 00:08:20.957  1902  1902 F DEBUG   : Revision: '0'
04-04 00:08:20.957  1902  1902 F DEBUG   : ABI: 'x86'
04-04 00:08:20.957  1902  1902 F DEBUG   : pid: 1528, tid: 1889, name: Thread-2  >>> system_server <<<
04-04 00:08:20.957  1902  1902 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x8
04-04 00:08:20.957  1902  1902 F DEBUG   :     eax 00a44000  ebx 00000000  ecx 00000000  edx 903089c0
04-04 00:08:20.957  1902  1902 F DEBUG   :     esi 00000000  edi 9bf0eff4
04-04 00:08:20.957  1902  1902 F DEBUG   :     xcs 00000073  xds 0000007b  xes 0000007b  xfs 0000003b  xss 0000007b
04-04 00:08:20.957  1902  1902 F DEBUG   :     eip 8d1e708f  ebp 8df43048  esp 8df42fd0  flags 00010217
04-04 00:08:20.965  1902  1902 F DEBUG   : 
04-04 00:08:20.965  1902  1902 F DEBUG   : backtrace:
04-04 00:08:20.965  1902  1902 F DEBUG   :     #00 pc 0022f08f  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #01 pc 002318f3  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #02 pc 00245685  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #03 pc 00694298  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #04 pc 004c9853  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #05 pc 004cea9a  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #06 pc 0044f289  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #07 pc 004ce687  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #08 pc 004595cb  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #09 pc 004ce6e3  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #10 pc 0045b0ea  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #11 pc 0000d276  /android/system/lib/libglapi.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #12 pc 000108af  /android/system/lib/libandroid_servers.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #13 pc 00063d46  /android/data/dalvik-cache/x86/system@framework@services.jar@classes.dex (offset 0xc3f000)


utente@utente-desktop:~/nougat-x86$ addr2line -Cfe out/target/product/x86/symbols/system/lib/dri/gallium_dri.so 
0022f08f  /android/system/lib/dri/gallium_dri.so
002318f3  /android/system/lib/dri/gallium_dri.so
00245685  /android/system/lib/dri/gallium_dri.so
00694298  /android/system/lib/dri/gallium_dri.so
004c9853  /android/system/lib/dri/gallium_dri.so
004cea9a  /android/system/lib/dri/gallium_dri.so
0044f289  /android/system/lib/dri/gallium_dri.so
004ce687  /android/system/lib/dri/gallium_dri.so
004595cb  /android/system/lib/dri/gallium_dri.so
004ce6e3  /android/system/lib/dri/gallium_dri.so
0045b0ea  /android/system/lib/dri/gallium_dri.so
radeon_bomgr_free_va
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:289
radeon_bo_destroy_or_cache
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:399
pb_destroy
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_buffer.h:232
u_resource_destroy_vtbl
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_transfer.c:127
pipe_resource_reference
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_inlines.h:141
st_FreeTextureImageBuffer
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_cb_texture.c:187
_mesa_delete_texture_image
/proc/self/cwd/external/mesa/src/mesa/main/teximage.c:189
st_DeleteTextureImage
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_cb_texture.c:144
_mesa_delete_texture_object
/proc/self/cwd/external/mesa/src/mesa/main/texobj.c:408
st_DeleteTextureObject
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_cb_texture.c:171
_mesa_reference_texobj_
/proc/self/cwd/external/mesa/src/mesa/main/texobj.c:569
^C
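
The addr2line session above lists the pasted addresses first and the resolved symbols afterwards, so frames and symbols have to be matched by position. Below is a minimal Python sketch of that pairing. It assumes that addr2line -f prints exactly one function line and one file:line line per address, and that the addresses were pasted in backtrace order. The helper names parse_frames and pair_symbols are hypothetical and not part of any tool; the sample data is copied from the log above.

```python
import re

# Matches debuggerd backtrace lines such as "#00 pc 0022f08f  /path/lib.so"
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)")

SAMPLE_LOG = """\
04-04 00:08:20.965  1902  1902 F DEBUG   :     #00 pc 0022f08f  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #01 pc 002318f3  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #02 pc 00245685  /android/system/lib/dri/gallium_dri.so
04-04 00:08:20.965  1902  1902 F DEBUG   :     #11 pc 0000d276  /android/system/lib/libglapi.so
"""

SAMPLE_SYMBOLS = """\
radeon_bomgr_free_va
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:289
radeon_bo_destroy_or_cache
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:399
pb_destroy
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_buffer.h:232
"""


def parse_frames(log_text):
    """Return (frame index, pc, library path) tuples found in the log text."""
    return [(int(m.group(1)), m.group(2), m.group(3))
            for m in FRAME_RE.finditer(log_text)]


def pair_symbols(frames, addr2line_output, library_suffix="gallium_dri.so"):
    """Pair the frames of one library with (function, file:line) output, by position."""
    lines = [ln.strip() for ln in addr2line_output.splitlines() if ln.strip()]
    symbols = list(zip(lines[0::2], lines[1::2]))  # function line, then location line
    wanted = [f for f in frames if f[2].endswith(library_suffix)]
    return [(idx, pc, func, loc)
            for (idx, pc, _lib), (func, loc) in zip(wanted, symbols)]


frames = parse_frames(SAMPLE_LOG)
paired = pair_symbols(frames, SAMPLE_SYMBOLS)
for idx, pc, func, loc in paired:
    print(f"#{idx:02d} {pc} {func} ({loc})")
```

Frames from other libraries (libglapi.so here) are skipped, because their addresses would have to be resolved against their own symbol files.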


04-04 00:08:22.152  1907  1907 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
04-04 00:08:22.153  1907  1907 F DEBUG   : Build fingerprint: 'Android-x86/android_x86/x86:7.1.1/NMF26O/utente01090017:eng/test-keys'
04-04 00:08:22.153  1907  1907 F DEBUG   : Revision: '0'
04-04 00:08:22.153  1907  1907 F DEBUG   : ABI: 'x86'
04-04 00:08:22.153  1907  1907 F DEBUG   : pid: 1515, tid: 1518, name: BootAnimation  >>> /system/bin/bootanimation <<<
04-04 00:08:22.153  1907  1907 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x8
04-04 00:08:22.153  1907  1907 F DEBUG   :     eax 01500000  ebx 00000000  ecx 00000000  edx a1cc5040
04-04 00:08:22.153  1907  1907 F DEBUG   :     esi 00000000  edi a753b0f4
04-04 00:08:22.153  1907  1907 F DEBUG   :     xcs 00000073  xds 0000007b  xes 0000007b  xfs 00000000  xss 0000007b
04-04 00:08:22.153  1907  1907 F DEBUG   :     eip a250308f  ebp a2d001e8  esp a2d00170  flags 00010217
04-04 00:08:22.159  1907  1907 F DEBUG   : 
04-04 00:08:22.159  1907  1907 F DEBUG   : backtrace:
04-04 00:08:22.159  1907  1907 F DEBUG   :     #00 pc 0022f08f  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #01 pc 0063111a  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #02 pc 002318e9  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #03 pc 002382b1  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #04 pc 00694298  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #05 pc 002430a3  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #06 pc 00245668  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #07 pc 00694298  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #08 pc 002430a3  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #09 pc 002452b7  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #10 pc 00683274  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #11 pc 0021a0d5  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #12 pc 00608a99  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #13 pc 004b39e7  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #14 pc 004b23dd  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #15 pc 004bab58  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #16 pc 0037418a  /system/lib/dri/gallium_dri.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #17 pc 0000a7ee  /system/lib/libglapi.so
04-04 00:08:22.159  1907  1907 F DEBUG   :     #18 pc 0000550b  /system/bin/bootanimation
04-04 00:08:22.159  1907  1907 F DEBUG   :     #19 pc 000051ae  /system/bin/bootanimation
04-04 00:08:22.159  1907  1907 F DEBUG   :     #20 pc 00012095  /system/lib/libutils.so (_ZN7android6Thread11_threadLoopEPv+309)
04-04 00:08:22.159  1907  1907 F DEBUG   :     #21 pc 00011883  /system/lib/libutils.so (_ZN13thread_data_t10trampolineEPKS_+259)
04-04 00:08:22.159  1907  1907 F DEBUG   :     #22 pc 000750c2  /system/lib/libc.so (_ZL15__pthread_startPv+210)
04-04 00:08:22.159  1907  1907 F DEBUG   :     #23 pc 000202de  /system/lib/libc.so (__start_thread+30)
04-04 00:08:22.159  1907  1907 F DEBUG   :     #24 pc 0001e0b6  /system/lib/libc.so (__bionic_clone+70)


utente@utente-desktop:~/nougat-x86$ addr2line -Cfe out/target/product/x86/symbols/system/lib/dri/gallium_dri.so 
0022f08f  /system/lib/dri/gallium_dri.so
0063111a  /system/lib/dri/gallium_dri.so
002318e9  /system/lib/dri/gallium_dri.so
002382b1  /system/lib/dri/gallium_dri.so
00694298  /system/lib/dri/gallium_dri.so
002430a3  /system/lib/dri/gallium_dri.so
00245668  /system/lib/dri/gallium_dri.so
00694298  /system/lib/dri/gallium_dri.so
002430a3  /system/lib/dri/gallium_dri.so
002452b7  /system/lib/dri/gallium_dri.so
00683274  /system/lib/dri/gallium_dri.so
0021a0d5  /system/lib/dri/gallium_dri.so
00608a99  /system/lib/dri/gallium_dri.so
004b39e7  /system/lib/dri/gallium_dri.so
004b23dd  /system/lib/dri/gallium_dri.so
004bab58  /system/lib/dri/gallium_dri.so
0037418a  /system/lib/dri/gallium_dri.so
radeon_bomgr_free_va
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:289
destroy_buffer_locked
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_cache.c:50
radeon_bo_destroy_or_cache
/proc/self/cwd/external/mesa/src/gallium/winsys/radeon/drm/radeon_drm_bo.c:397
pb_destroy
/proc/self/cwd/external/mesa/src/gallium/auxiliary/pipebuffer/pb_buffer.h:232
u_resource_destroy_vtbl
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_transfer.c:127
pipe_resource_reference
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_inlines.h:141
r600_resource_reference
/proc/self/cwd/external/mesa/src/gallium/drivers/radeon/r600_pipe_common.h:865
u_resource_destroy_vtbl
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_transfer.c:127
pipe_resource_reference
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_inlines.h:141
r600_surface_destroy
/proc/self/cwd/external/mesa/src/gallium/drivers/radeon/r600_texture.c:1822
pipe_surface_reference
/proc/self/cwd/external/mesa/src/gallium/auxiliary/util/u_inlines.h:113
si_set_framebuffer_state
/proc/self/cwd/external/mesa/src/gallium/drivers/radeonsi/si_state.c:2387
cso_set_framebuffer
/proc/self/cwd/external/mesa/src/gallium/auxiliary/cso_cache/cso_context.c:712
update_framebuffer_state
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_atom_framebuffer.c:213
st_validate_state
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_atom.c:219 (discriminator 1)
st_Clear
/proc/self/cwd/external/mesa/src/mesa/state_tracker/st_cb_clear.c:409
_mesa_Clear
/proc/self/cwd/external/mesa/src/mesa/main/clear.c:224
^C
Comment 6 Mauro Rossi 2017-04-08 00:20:01 UTC
Hi,

commit ce27b27 "radeon: initialize hole variable before calling container_of"
solves the issue and should be backported to the mesa 13.0 and mesa 17.0 branches
so that this bug is resolved in the next maintenance releases.

Tested with nougat-x86 on HD7750 and HD7950

Mauro
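
The commit title points at how container_of is used. Below is a rough model in Python, with addresses as plain integers. It assumes container_of is the variant that derives the member offset from a sample pointer, as (char *)&sample->member - (char *)sample. The Hole layout and its field names are hypothetical. Arithmetically the sample address cancels out, but in C reading an uninitialized pointer is undefined behavior, so a compiler is not obliged to produce the expected result. That would be consistent with initializing the variable before the call.

```python
import ctypes


class ListHead(ctypes.Structure):
    """Stand-in for an intrusive list node with prev/next pointers."""
    _fields_ = [("prev", ctypes.c_void_p), ("next", ctypes.c_void_p)]


class Hole(ctypes.Structure):
    """Hypothetical struct that embeds a list node after two data fields."""
    _fields_ = [("start", ctypes.c_uint64),
                ("size", ctypes.c_uint64),
                ("list", ListHead)]


def container_of(member_addr, sample_addr, struct_type, member):
    """Model of (char *)ptr - ((char *)&sample->member - (char *)sample)."""
    member_in_sample = sample_addr + getattr(struct_type, member).offset
    return member_addr - (member_in_sample - sample_addr)


hole = Hole(start=0x1000, size=0x2000)
hole_addr = ctypes.addressof(hole)
node_addr = hole_addr + Hole.list.offset  # address of the embedded list node

# Any *defined* sample value yields the same answer, including 0 (NULL).
recovered_null = container_of(node_addr, 0, Hole, "list")
recovered_other = container_of(node_addr, 0xDEAD0000, Hole, "list")

print(recovered_null == hole_addr, recovered_other == hole_addr)
print(hex(Hole.from_address(recovered_null).size))
```

The model only shows why a defined sample value is sufficient; it cannot reproduce the miscompilation itself, which depends on the C compiler.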
