* thread #1, name = 'mpv', stop reason = signal SIGSEGV
  * frame #0: 0x00007fa379611cbf libc.so.6`__memmove_avx_unaligned_erms at memmove-vec-unaligned-erms.S:306
    frame #1: 0x00007fa362f82639 i965_dri.so`brw_upload_cs_work_groups_surface(brw=0x00007fa358538010) at brw_wm_surface_state.c:1660
    frame #2: 0x00007fa362f7a829 i965_dri.so`brw_upload_compute_state [inlined] check_and_emit_atom(atom=0x00007fa35854f3f0, state=<unavailable>, brw=0x00007fa358538010) at brw_state_upload.c:496
    frame #3: 0x00007fa362f7a810 i965_dri.so`brw_upload_compute_state at brw_state_upload.c:615
    frame #4: 0x00007fa362f7a6b8 i965_dri.so`brw_upload_compute_state(brw=0x00007fa358538010) at brw_state_upload.c:675
    frame #5: 0x00007fa362f613a8 i965_dri.so`brw_dispatch_compute_common(ctx=0x00007fa358538010) at brw_compute.c:192
    frame #6: 0x00007fa36319f7ff i965_dri.so`_mesa_DispatchCompute at compute.c:265
    frame #7: 0x00007fa36319f762 i965_dri.so`_mesa_DispatchCompute(num_groups_x=480, num_groups_y=270, num_groups_z=1) at compute.c:280
    frame #8: 0x0000556f661adf42 mpv`gl_renderpass_run(ra=0x00007fa358567700, params=0x00007fa368f984c0) at ra_gl.c:1051
    frame #9: 0x0000556f661936db mpv`gl_sc_dispatch_compute(sc=0x00007fa358677ca0, w=480, h=270, d=1) at shader_cache.c:1021
    frame #10: 0x0000556f6619a27c mpv`dispatch_compute(p=0x00007fa358678a30, w=3840, h=2160) at video.c:1165
    frame #11: 0x0000556f6619a379 mpv`finish_pass_tex(p=0x00007fa358678a30, dst_tex=0x00007fa358678e88, w=3840, h=2160) at video.c:1264
    frame #12: 0x0000556f6619cce7 mpv`pass_draw_to_screen(p=0x00007fa358678a30, fbo=<unavailable>) at video.c:2815
    frame #13: 0x0000556f6619fc64 mpv`gl_video_render_frame(p=0x00007fa358678a30, frame=0x00007fa1da397ce0, fbo=<unavailable>, flags=3) at video.c:3124
    frame #14: 0x0000556f661b3f5c mpv`draw_frame(vo=0x0000556f6726b090, frame=0x00007fa1da397ce0) at vo_gpu.c:87
    frame #15: 0x0000556f661b1b27 mpv`vo_render_frame_external(vo=0x0000556f6726b090) at vo.c:898
    frame #16: 0x0000556f661b25f7 mpv`vo_thread(ptr=0x0000556f6726b090) at vo.c:1055
    frame #17: 0x00007fa37d006a9d libpthread.so.0`start_thread(arg=<unavailable>) at pthread_create.c:486
    frame #18: 0x00007fa3795adaf3 libc.so.6`__GI___clone at clone.S:95
To be more accurate, it appears to be a NULL pointer dereference:
(gdb) p $_siginfo._sifields._sigfault.si_addr
$1 = (void *) 0x4
(gdb)
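The reasoning behind that reading can be sketched in C: a fault address of 0x4 sits a few bytes above zero, which is what one would expect from an access at a small offset from a NULL base pointer. The helper name and the 4096-byte threshold below are assumptions made for illustration; they come from neither gdb, Mesa, nor mpv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the first page (taken as 4096 bytes here) is never mapped,
 * so a fault address inside it is treated as "NULL plus a small offset". */
#define ASSUMED_PAGE_SIZE 4096u

/* Hypothetical helper: classify a faulting address such as si_addr. */
static bool looks_like_null_deref(uintptr_t fault_addr)
{
    return fault_addr < ASSUMED_PAGE_SIZE;
}
```

With this heuristic, the reported si_addr of 0x4 is classified as a NULL-based access, while a heap pointer such as the brw value in the backtrace is not. It is only a heuristic and does not say which pointer was NULL.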
Is there a specific set of options that need to be given to mpv to reproduce this issue?
Please also provide your SW/HW configuration: Mesa version, GPU, kernel, etc.
(In reply to Lionel Landwerlin from comment #2)
> Is there a specific set of options that need to be given to mpv to reproduce
> this issue?

Nope
(In reply to Denis from comment #3)
> also provide please your SW/HW configurations, mesa version, gpu, kernel etc

Linux distro is ArchLinux.
Output of lshw attached.
mesa 18.3.4-1.
Linux kernel 5.0.0-arch1-1-ARCH.
Created attachment 143683: output of lshw
Happened again (doesn't occur frequently). Backtrace from coredump (gdb):
(gdb) p $_siginfo._sifields._sigfault.si_addr
$1 = (void *) 0x4
(gdb) bt
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:306
#1  0x00007f386c1f4639 in brw_upload_cs_work_groups_surface (brw=0x7f38644bcc40) at ../mesa-18.3.4/src/mesa/drivers/dri/i965/brw_wm_surface_state.c:1660
#2  0x00007f386c1ec829 in check_and_emit_atom (atom=0x7f38644d4020, state=<synthetic pointer>, brw=0x7f38644bcc40) at ../mesa-18.3.4/src/mesa/drivers/dri/i965/brw_state_upload.c:496
#3  brw_upload_pipeline_state (pipeline=BRW_COMPUTE_PIPELINE, brw=0x7f38644bcc40, brw@entry=0x7f38644d4068) at ../mesa-18.3.4/src/mesa/drivers/dri/i965/brw_state_upload.c:615
#4  brw_upload_compute_state (brw=brw@entry=0x7f38644bcc40) at ../mesa-18.3.4/src/mesa/drivers/dri/i965/brw_state_upload.c:675
#5  0x00007f386c1d33a8 in brw_dispatch_compute_common (ctx=0x7f38644bcc40) at ../mesa-18.3.4/src/mesa/drivers/dri/i965/brw_compute.c:192
#6  0x00007f386c4117ff in dispatch_compute (no_error=false, num_groups_z=1, num_groups_y=135, num_groups_x=240) at ../mesa-18.3.4/src/mesa/main/compute.c:265
#7  _mesa_DispatchCompute (num_groups_x=240, num_groups_y=135, num_groups_z=1) at ../mesa-18.3.4/src/mesa/main/compute.c:280
#8  0x000055a08dbdaf42 in gl_renderpass_run (ra=0x7f38644e2cd0, params=0x7f386dc724c0) at ../video/out/opengl/ra_gl.c:1051
#9  0x000055a08dbc06db in gl_sc_dispatch_compute (sc=0x7f38646780a0, w=w@entry=240, h=h@entry=135, d=d@entry=1) at ../video/out/gpu/shader_cache.c:1021
#10 0x000055a08dbc727c in dispatch_compute (p=p@entry=0x7f3864678e30, w=w@entry=1916, h=h@entry=1077, info=...) at ../video/out/gpu/video.c:1165
#11 0x000055a08dbc7379 in finish_pass_tex (p=p@entry=0x7f3864678e30, dst_tex=dst_tex@entry=0x7f3864679288, w=1916, h=1077) at ../video/out/gpu/video.c:1264
#12 0x000055a08dbc9ce7 in pass_draw_to_screen (p=p@entry=0x7f3864678e30, fbo=...) at ../video/out/gpu/video.c:2815
#13 0x000055a08dbccc64 in gl_video_render_frame (p=0x7f3864678e30, frame=frame@entry=0x7f3865434650, fbo=..., flags=flags@entry=3) at ../video/out/gpu/video.c:3124
#14 0x000055a08dbe0f5c in draw_frame (vo=0x55a08e1d0090, frame=0x7f3865434650) at ../video/out/vo_gpu.c:87
#15 0x000055a08dbdeb27 in vo_render_frame_external (vo=vo@entry=0x55a08e1d0090) at ../video/out/vo.c:898
#16 0x000055a08dbdf5f7 in vo_thread (ptr=0x55a08e1d0090) at ../video/out/vo.c:1055
#17 0x00007f3886113a9d in start_thread (arg=<optimized out>) at pthread_create.c:486
#18 0x00007f38826baaf3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
All I can find is that somehow the buffer manager of i965 is failing to allocate or map some memory. Which could indicate there is a memory leak somewhere... Running mpv with INTEL_DEBUG=buf might print out some traces that would help.
Hello Anthony, could you please share a video, at least as an example? I downloaded 4 of them (from the available demos, such as the cartoon with the bunny, etc.) and ran them about 6 times in a row. I didn't get any segfaults.
My configuration:
Kernel - 5.0
Distro - Manjaro (all packages up-to-date)
Mesa version - 18.3.4
Env - Gnome-shell
GPU - UHD630
Which display server are you using, X or Wayland?
The video which triggers the crash is available at the following link: https://drive.google.com/file/d/1d3IcZwYunqWJbIph7gICM2spMKPrh-kz/view?usp=sharing Just confirmed that the crash still occurs. It required playing the video for 44 minutes and 30 seconds (length of the video is 52:40). No extra command-line args were supplied to mpv. My desktop environment is i3-wm (X server).
OK, thanks for the video and the clarification. I will try it on my configuration; if nothing happens there, I will also check with the i3 desktop environment.
Hi, sorry for the delay, but it looks like I was able to reproduce your issue. Reproducibility is very poor: it has happened only 2 times so far, so I am continuing my investigation.
Test configuration:
Manjaro
Kernel 5.0
Mesa 19.1.0 git-master
UHD 630 GPU (CFL)
Created attachment 144210: full backtrace
Created attachment 144211: intel_debug=buf log
tail -n100 of the log, since the full thing is very big (100 MB)
brw_upload_cs_work_groups_surface is leaking buffer objects like nobody's business.
Patch out for review: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/857
(In reply to Kenneth Graunke from comment #16)
> Patch out for review:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/857

I used to be able to reproduce this issue without fail after watching 20-30 minutes of HEVC video with HDR. Now, with this patch, I can't even trigger it after watching 2 hours of video. If it happens again I'll reply to this, but for now this patch has fixed it for me.
A slightly simplified version of that patch has landed in master:

commit 3f60810de0a2960ec15118ef9888d9efc9ea605a
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Thu May 9 15:40:13 2019 -0700

    i965: Fix memory leaks in brw_upload_cs_work_groups_surface().

    This was taking a reference to the 64kB upload buffer and never
    returning it, leaking a reference each time this atom triggered.
    This leaked lots of 64kB upload BOs, eventually running us out of
    VMA space. This would usually happen when using mpv to watch a
    movie, after 20-40 minutes.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
    Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups
    Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

It's been tagged for backporting to stable branches as well. Thanks for the report!
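For readers unfamiliar with this class of bug, here is a minimal sketch of how a reference that is taken and never returned keeps buffers alive. It is not the Mesa code: `toy_bo`, `emit_atom` and `run_dispatches` are hypothetical names. As a simplification, one 64 kB buffer is allocated per dispatch, whereas the real upload buffer is shared across many uploads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted buffer object (BO). */
struct toy_bo {
    int refcount;
    size_t size;
};

static int live_bos = 0; /* toy BOs currently allocated */

static struct toy_bo *toy_bo_alloc(size_t size)
{
    struct toy_bo *bo = malloc(sizeof(*bo));
    if (!bo)
        return NULL;
    bo->refcount = 1; /* the allocator's caller owns the first reference */
    bo->size = size;
    live_bos++;
    return bo;
}

static void toy_bo_reference(struct toy_bo *bo)
{
    bo->refcount++;
}

static void toy_bo_unreference(struct toy_bo *bo)
{
    if (!bo)
        return;
    assert(bo->refcount > 0);
    if (--bo->refcount == 0) {
        live_bos--;
        free(bo);
    }
}

/* One simulated dispatch. The "atom" takes its own reference to the
 * upload buffer; release_ref selects whether it gives it back. */
static void emit_atom(int release_ref)
{
    struct toy_bo *upload = toy_bo_alloc(64 * 1024);
    if (!upload)
        return;

    struct toy_bo *bo = upload;
    toy_bo_reference(bo);        /* atom's reference */
    /* ... surface state pointing at bo would be emitted here ... */
    if (release_ref)
        toy_bo_unreference(bo);  /* fixed behaviour: return the reference */

    toy_bo_unreference(upload);  /* uploader is done with this buffer */
}

/* Returns how many BOs are still alive after n dispatches. */
static int run_dispatches(int n, int release_ref)
{
    int before = live_bos;
    for (int i = 0; i < n; i++)
        emit_atom(release_ref);
    return live_bos - before;
}
```

Calling `run_dispatches(1000, 0)` leaves 1000 buffers alive, while `run_dispatches(1000, 1)` leaves none. In the sketch the leak costs only heap memory; according to the commit message, the real leak eventually exhausted VMA space.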