75061 – bug in clearing color buffer

Bug 75061 - bug in clearing color buffer

Summary: bug in clearing color buffer

Status:	RESOLVED MOVED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/r600 (show other bugs)
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Default DRI bug account
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2014-02-16 17:22 UTC by Maxim
Modified:	2019-09-18 19:15 UTC (History)
CC List:	1 user (show)

See Also:
i915 platform:
i915 features:

Attachments
program which seg fault. (2.19 KB, text/plain) 2014-02-16 17:22 UTC, Maxim	Details
gdb full backtrace (2.01 KB, text/plain) 2014-02-18 10:10 UTC, Maxim	Details
valgrind + hellgrind (6.01 KB, text/plain) 2014-02-18 10:10 UTC, Maxim	Details
valgrind only (5.99 KB, text/plain) 2014-02-18 10:11 UTC, Maxim	Details
Bail if rr->buf is NULL (544 bytes, patch) 2014-02-19 04:03 UTC, Michel Dänzer	Details \| Splinter Review
gdb with patch (3.00 KB, text/plain) 2014-02-20 09:43 UTC, Maxim	Details
valgrind only (with patch) (5.55 KB, text/plain) 2014-02-20 09:44 UTC, Maxim	Details
valgrind + helgrind (with patch) (83.92 KB, text/plain) 2014-02-20 09:52 UTC, Maxim	Details
possible fix (3.53 KB, patch) 2014-03-08 22:39 UTC, Marek Olšák	Details \| Splinter Review
possible fix 2 (3.88 KB, patch) 2014-03-08 22:57 UTC, Marek Olšák	Details \| Splinter Review
Show Obsolete (2) View All

Description Maxim 2014-02-16 17:22:00 UTC

Created attachment 94169 [details]
program which seg fault.

I have example program, that creates 2 context for 2 threads. Program runs 2-3 seconds and crashes (segmentation fault). It crashes in glClear( GL_COLOR_BUFFER_BIT ); Exactly in: rctx->gtt += rr->buf->size; (114 line; function: r600_context_add_resource_size; file: r600_context_adr600_buffer_common.c) Flags: autogen.sh --with-dri-drivers="" --enable-gbm --enable-gallium-gbm --enable-glx-tls --with-gallium-drivers=r600 --enable-driglx-direct --enable-opencl --enable-opencl-icd --with-llvm-shared-libs --enable-r600-llvm-compiler --enable-openvg --enable-gles2 --enable-gles1 --prefix=/usr/local --enable-xa --enable-gallium-egl --enable-texture-float --enable-selinux

Comment 1 Michel Dänzer 2014-02-17 06:45:35 UTC

Please attach a backtrace of the crash from gdb.

Also, running the program in valgrind (with/out --tool=helgrind) might give some hints.

Comment 2 Maxim 2014-02-18 10:10:10 UTC

Created attachment 94275 [details]
gdb full backtrace

Comment 3 Maxim 2014-02-18 10:10:42 UTC

Created attachment 94276 [details]
valgrind + hellgrind

Comment 4 Maxim 2014-02-18 10:11:14 UTC

Created attachment 94277 [details]
valgrind only

Comment 5 Michel Dänzer 2014-02-19 04:03:03 UTC

Created attachment 94321 [details] [review]
Bail if rr->buf is NULL

Does this patch help?

(In reply to comment #3)
> valgrind + hellgrind

Thanks, but the proper spelling is 'helgrind', so it was still using memcheck as you can see. Just as a note for next time.

Comment 6 Maxim 2014-02-20 09:43:27 UTC

(In reply to comment #5)
> Created attachment 94321 [details] [review] [review]
> Bail if rr->buf is NULL
> 
> Does this patch help?
> 
> (In reply to comment #3)
> > valgrind + hellgrind
> 
> Thanks, but the proper spelling is 'helgrind', so it was still using
> memcheck as you can see. Just as a note for next time.

SIGSEGV in evergreen_emit_vertex_buffers (evergreen_state.c:2479)

Comment 7 Maxim 2014-02-20 09:43:56 UTC

Created attachment 94418 [details]
gdb with patch

Comment 8 Maxim 2014-02-20 09:44:35 UTC

Created attachment 94419 [details]
valgrind only (with patch)

Comment 9 Maxim 2014-02-20 09:52:36 UTC

Created attachment 94422 [details]
valgrind + helgrind (with patch)

Comment 10 Maxim 2014-02-23 09:09:24 UTC

Any progress?

Comment 11 Michel Dänzer 2014-02-24 07:04:30 UTC

AFAICT the problem is that both threads access the same struct r600_resource concurrently. It might be relatively easy to avoid the crashes by updating the buf member atomically in r600_init_resource() instead of setting it to NULL first in r600_invalidate_buffer(), but I suspect there could be more subtle issues with other members, in particular valid_buffer_range.

Marek, any thoughts on how to solve this?

Comment 12 Marek Olšák 2014-02-24 12:04:42 UTC

All writes to valid_buffer_range are protected by a mutex. Only the reads are not.

I've got no idea what to do with invalidate_buffer. If we added mutexes everywhere, it would slow down the driver.

I think that calling BufferData in one thread and using the buffer for rendering in some other thread is a race condition in the application and should be fixed in the app.

Comment 13 Marek Olšák 2014-02-24 12:36:17 UTC

If the program called MapBufferRange(MAP_INVALIDATE_BUFFER_BIT) instead of BufferData, I think it would be okay to crash:

From GL 4.4 spec:

"MAP_INVALIDATE_BUFFER_BIT indicates that the previous contents of the entire buffer may be discarded. Data within the entire buffer are undefined with the exception of subsequently written data. No GL error is generated if subsequent GL operations access unwritten data, but the result is undefined and system errors (possibly including program termination) may occur."

In other words, no context should access the buffer until UnmapBuffer() is complete, because there may be unwritten data.

Comment 14 Marek Olšák 2014-03-08 22:39:36 UTC

Created attachment 95375 [details] [review]
possible fix

Could you please try this patch?

Comment 15 Marek Olšák 2014-03-08 22:57:28 UTC

Created attachment 95378 [details] [review]
possible fix 2

Please test this one. The previous patch wasn't correct.

Comment 16 Maxim 2014-09-17 23:27:29 UTC

I'm sorry that I leave this bug. Just read comment 13 and realized that it was my fault. Do I need to test this patch?

Comment 17 Vedran Miletić 2017-03-22 15:24:02 UTC

Was this fixed?

Comment 18 GitLab Migration User 2019-09-18 19:15:06 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/495.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.