Bug 101996

Summary: Having problems when drawing lots of mesh with texture array
Product: Mesa
Reporter: benau2006
Component: Drivers/Gallium/radeonsi
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact: Default DRI bug account <dri-devel>
Severity: normal
Priority: medium
CC: asdfghrbljzmkd, deveee
Version: 17.2
Hardware: Other
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:

Description benau2006 2017-08-01 04:23:47 UTC
Hi,

I'm developing a new engine for the game STK, and I'm seeing a serious regression after switching to texture arrays for drawing meshes in game:

Code:
https://github.com/Benau/stk-code/tree/sp_new
(Basically only data/shader/sp*.* and src/sp/* are relevant)

Assets:
https://github.com/Benau/sp-assets
(In case real-time testing is needed)

Apitrace:
http://kobato.stan.hk/bug.trace.lzma

Linux is 4.12.4
DRM is built from git
Mesa version is 17.2.0-rc1 (git-a455f594bb)

When you enter the game with lots of meshes being rendered, the game spends a lot of time waiting in glFlush
(~100 ms at worst),
specifically in util_queue_fence_wait, as seen by interrupting it with Ctrl-C in gdb.
Full trace:

#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f85d8c30d22 in cnd_wait (cond=0x1ca8d50, mtx=0x1ca8d28) at ../../include/c11/threads_posix.h:159
#2  0x00007f85d8c31221 in util_queue_fence_wait (fence=0x1ca8d28) at u_queue.c:106
#3  0x00007f85d90bc6c3 in radeon_drm_cs_sync_flush (rcs=0x1c80b70) at radeon_drm_cs.c:489
#4  0x00007f85d90bcb20 in radeon_drm_cs_flush (rcs=0x1c80b70, flags=1, pfence=0x1ad97e8) at radeon_drm_cs.c:614
#5  0x00007f85d9078f00 in si_context_gfx_flush (context=0x1ad9430, flags=1, fence=0x0) at si_hw_context.c:154
#6  0x00007f85d90fff53 in r600_flush_from_st (ctx=0x1ad9430, fence=0x0, flags=0) at r600_pipe_common.c:396
#7  0x00007f85d8d8fb30 in tc_flush (_pipe=0x1ca8da0, fence=0x0, flags=0) at util/u_threaded_context.c:1799
#8  0x00007f85d8ada3ec in st_flush (st=0x1ce2300, fence=0x0, flags=0) at state_tracker/st_cb_flush.c:87
#9  0x00007f85d8ada4b3 in st_glFlush (ctx=0x1cb0ec0) at state_tracker/st_cb_flush.c:121
#10 0x00007f85d884cff5 in _mesa_flush (ctx=0x1cb0ec0) at main/context.c:1846
#11 0x00007f85d884d17d in _mesa_Flush () at main/context.c:1884
#12 0x0000000000f649ce in irr::video::COpenGLDriver::endScene (this=0x1cfe410) at /data/game/stk-code/lib/irrlicht/source/Irrlicht/COpenGLDriver.cpp:908
(As you can see in apitrace)
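
To make the shape of this stall easier to reason about, here is a minimal sketch of a caller blocking on a fence that a worker thread signals later, with the blocked time measured. It is only an analogue of the pthread_cond_wait / util_queue_fence_wait frames in the trace above, not Mesa's code: `Fence`, `timed_wait_ms` and `simulate_flush_stall` are hypothetical names, and the sleeping worker is an assumed stand-in for a submission that is still in flight.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a queue fence: signalled once by a worker thread.
struct Fence {
    std::mutex mtx;
    std::condition_variable cond;
    bool signalled = false;

    void signal() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            signalled = true;
        }
        cond.notify_all();
    }

    // Blocks until signal() has been called (returns at once if it already was).
    void wait() {
        std::unique_lock<std::mutex> lock(mtx);
        cond.wait(lock, [this] { return signalled; });
    }
};

// Returns how many milliseconds the caller spent blocked in fence.wait().
double timed_wait_ms(Fence& fence) {
    const auto start = std::chrono::steady_clock::now();
    fence.wait();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Simulates a flush that has to wait for earlier work taking work_ms to finish.
double simulate_flush_stall(int work_ms) {
    Fence fence;
    std::thread worker([&fence, work_ms] {
        std::this_thread::sleep_for(std::chrono::milliseconds(work_ms));
        fence.signal();
    });
    const double blocked = timed_wait_ms(fence);
    worker.join();
    return blocked;
}
```

Wrapping the real glFlush call in the same kind of steady_clock measurement would show, per frame, how much of the frame time is spent blocked rather than submitting work.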


With enough bad luck, it can even hang or lock up the whole Linux system. This is the journalctl output from just before that happens:
8月 01 10:55:52 kobato kernel: [TTM] Failed allocating page table
8月 01 10:55:52 kobato kernel: [TTM] Buffer eviction failed
8月 01 10:55:52 kobato kernel: [TTM] Failed allocating page table
8月 01 10:55:52 kobato kernel: [TTM] Buffer eviction failed
8月 01 10:55:52 kobato kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
8月 01 10:55:52 kobato kernel: IP: drm_mm_remove_node+0x280/0x2c0
8月 01 10:55:52 kobato kernel: PGD 295f96067 
8月 01 10:55:52 kobato kernel: P4D 295f96067 
8月 01 10:55:52 kobato kernel: PUD 20e4a1067 
8月 01 10:55:52 kobato kernel: PMD 0 
8月 01 10:55:52 kobato kernel: 
8月 01 10:55:52 kobato kernel: Oops: 0002 [#1] PREEMPT SMP
8月 01 10:55:52 kobato kernel: Modules linked in: ctr ccm ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack af_packet fuse
8月 01 10:55:52 kobato kernel: CPU: 5 PID: 4349 Comm: radeon_cs:0 Tainted: G           O    4.12.4 #4
8月 01 10:55:52 kobato kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87 Pro4, BIOS P2.30 07/11/2014
8月 01 10:55:52 kobato kernel: task: ffff88010c9f8c00 task.stack: ffffc9000ee1c000
8月 01 10:55:52 kobato kernel: RIP: 0010:drm_mm_remove_node+0x280/0x2c0
8月 01 10:55:52 kobato kernel: RSP: 0018:ffffc9000ee1f600 EFLAGS: 00010246
8月 01 10:55:52 kobato kernel: RAX: 0000000000000000 RBX: 00000000000007d8 RCX: 0000000200000002
8月 01 10:55:52 kobato kernel: RDX: 0000000000000000 RSI: ffffc9000ee1f668 RDI: 0000000000000000
8月 01 10:55:52 kobato kernel: RBP: ffff88011f267780 R08: 0000000000000000 R09: ffff88011f2677c0
8月 01 10:55:52 kobato kernel: R10: ffffea00047c99c0 R11: 0000000000000000 R12: 00000000000007f8
8月 01 10:55:52 kobato kernel: R13: 0000000000000000 R14: ffff88030ff70980 R15: ffff88020e7f9068
8月 01 10:55:52 kobato kernel: FS:  00007f7ee903d700(0000) GS:ffff88031ed40000(0000) knlGS:0000000000000000
8月 01 10:55:52 kobato kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
8月 01 10:55:52 kobato kernel: CR2: 00000000000000b8 CR3: 0000000297b41000 CR4: 00000000001406a0
8月 01 10:55:52 kobato kernel: Call Trace:
8月 01 10:55:52 kobato kernel:  ? ttm_bo_man_put_node+0x26/0x50
8月 01 10:55:52 kobato kernel:  ? ttm_bo_evict+0x156/0x2a0
8月 01 10:55:52 kobato kernel:  ? ttm_mem_evict_first+0x137/0x190
8月 01 10:55:52 kobato kernel:  ? ttm_bo_mem_space+0x327/0x4a0
8月 01 10:55:52 kobato kernel:  ? ttm_bo_validate+0xa5/0x120
8月 01 10:55:52 kobato kernel:  ? ttm_eu_reserve_buffers+0x28a/0x300
8月 01 10:55:52 kobato kernel:  ? radeon_bo_list_validate+0xbf/0x200
8月 01 10:55:52 kobato kernel:  ? radeon_cs_parser_relocs+0x2c0/0x3d0
8月 01 10:55:52 kobato kernel:  ? radeon_cs_ioctl+0xc0/0x750
8月 01 10:55:52 kobato kernel:  ? drm_ioctl+0x1c8/0x3e0
8月 01 10:55:52 kobato kernel:  ? radeon_cs_parser_init+0x20/0x20
8月 01 10:55:52 kobato kernel:  ? do_futex+0x26d/0xb20
8月 01 10:55:52 kobato kernel:  ? shmem_truncate_range+0x19/0x30
8月 01 10:55:52 kobato kernel:  ? radeon_drm_ioctl+0x44/0x80
8月 01 10:55:52 kobato kernel:  ? do_vfs_ioctl+0x8a/0x5d0
8月 01 10:55:52 kobato kernel:  ? __fget+0x62/0xa0
8月 01 10:55:52 kobato kernel:  ? SyS_ioctl+0x36/0x70
8月 01 10:55:52 kobato kernel:  ? entry_SYSCALL_64_fastpath+0x17/0x98
8月 01 10:55:52 kobato kernel: Code: 00 00 00 e9 b2 fe ff ff 48 8b 85 88 00 00 00 4c 8b 41 08 49 89 ca 48 89 41 48 48 89 c8 e9 65 fe ff ff 4c 89 51 10 e9 8f fe ff ff <49> 89 8d b8 00 00 0
8月 01 10:55:52 kobato kernel: RIP: drm_mm_remove_node+0x280/0x2c0 RSP: ffffc9000ee1f600
8月 01 10:55:52 kobato kernel: CR2: 00000000000000b8
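
The TTM messages above report failed allocations and failed buffer eviction, so one thing that may be worth checking is how much memory the texture arrays themselves ask for. The sketch below is a rough estimate only, under stated assumptions: uncompressed texels, a full mip chain down to 1x1, and no driver padding or alignment. `texture_array_bytes` is a hypothetical helper, not part of stk-code or Mesa, and the log does not prove that texture size is the cause.

```cpp
#include <algorithm>
#include <cstdint>

// Rough size in bytes of a 2D texture array (assumed: uncompressed texels,
// no padding or alignment; real GPU allocations will differ).
std::uint64_t texture_array_bytes(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t layers,
                                  std::uint32_t bytes_per_texel,
                                  bool with_mipmaps) {
    if (width == 0 || height == 0) {
        return 0;
    }
    std::uint64_t total = 0;
    std::uint32_t w = width;
    std::uint32_t h = height;
    while (true) {
        // Every mip level stores all layers of the array.
        total += static_cast<std::uint64_t>(w) * h * layers * bytes_per_texel;
        if (!with_mipmaps || (w == 1 && h == 1)) {
            break;
        }
        // Each following level halves both dimensions, clamped at 1.
        w = std::max<std::uint32_t>(1u, w / 2);
        h = std::max<std::uint32_t>(1u, h / 2);
    }
    return total;
}
```

Summing this over every texture array the engine creates and comparing the total with the card's VRAM gives a first indication of whether the driver is being pushed into constant eviction.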


The same code has been tested on Windows on the same hardware, and on another PC with the NVIDIA proprietary driver, and it works fine in both cases.

Did I do something wrong in my OpenGL code, or do you think the problem is somewhere in Mesa?

Thanks
Comment 1 GitLab Migration User 2019-09-25 17:59:41 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1274.
