Bug 101425

Summary: deleted context will leak memory if it contains internal programs
Product: Beignet Reporter: Patrick Beaulieu <heypaku>
Component: BeignetAssignee: Xiuli Pan <xiuli.pan>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: patch for possible fix

Description Patrick Beaulieu 2017-06-14 18:03:18 UTC
Created attachment 131958 [details] [review]
patch for possible fix

This should be an issue on all platforms, found in 1.2, confirmed still an issue in master.

Initially noticed when running valgrind leak checking on own code (which uses clEnqueueCopyImage which uses an internal program/kernel CL_ENQUEUE_COPY_IMAGE_2D_TO_2D), reproduced in benchmarks once root cause was understood.
To repro simply run
beignet/out/benchmark$ valgrind --leak-check=full ./benchmark_run -c benchmark_copy_image_to_buffer

and see output:
==20526== 16,777,216 bytes in 1 blocks are definitely lost in loss record 217 of 217
==20526==    at 0x641F960: ??? (in /usr/lib/x86_64-linux-gnu/libdrm_intel.so.1.0.0)
==20526==    by 0x409B738: cl_mem_map (cl_mem.c:2258)
==20526==    by 0x4097A64: cl_enqueue_map_buffer (cl_enqueue.c:308)
==20526==    by 0x4097A64: cl_enqueue_handle (cl_enqueue.c:649)
==20526==    by 0x4096197: cl_event_exec (cl_event.c:578)
==20526==    by 0x4089026: clEnqueueMapBuffer (cl_api_mem.c:313)
==20526==    by 0x403C862: benchmark_copy_image_to_buffer() (benchmark_copy_image_to_buffer.cpp:41)
==20526==    by 0x403D00C: __ANON__benchmark_copy_image_to_buffer__() (benchmark_copy_image_to_buffer.cpp:64)
==20526==    by 0x402F1B9: UTest::run(char const*) (utest.cpp:177)
==20526==    by 0x1099DE: main (benchmark_run.cpp:77)
==20526==
==20526== LEAK SUMMARY:
==20526==    definitely lost: 16,957,440 bytes in 9 blocks

The problem is that cl_context_delete does not account for context references from context internal programs when it decides if the context is ready to be actually deleted so this return is always hit if any internal_prgs exists:
   /* We are not done yet */
  if (CL_OBJECT_DEC_REF(ctx) > 1)
     return;

The attached patch fixes/works around the issue but I'm not sure how safe the internal context accounting is (is context create/delete use supposed to be thread-safe?) so you may wish to develop your own fix.

After my patch:
beignet/out/benchmark$ valgrind --leak-check=full ./benchmark_run -c benchmark_copy_image_to_buffer
.
.
.
==20644== HEAP SUMMARY:
==20644==     in use at exit: 9,366 bytes in 25 blocks
==20644==   total heap usage: 171,673 allocs, 171,648 frees, 77,049,165 bytes allocated
==20644==
==20644== LEAK SUMMARY:
==20644==    definitely lost: 0 bytes in 0 blocks
==20644==    indirectly lost: 0 bytes in 0 blocks



Extra info if needed:
platform number 1
platform_profile "FULL_PROFILE"
platform_name "Intel Gen OCL Driver"
platform_vendor "Intel"
platform_version "OpenCL 1.2 beignet 1.4 (git-f66ce212)"
platform_extensions "cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_intel_media_block_io cl_intel_planar_yuv cl_khr_gl_sharing"
device_profile "FULL_PROFILE"
device_name "Intel(R) HD Graphics Kabylake ULT GT3"
device_vendor "Intel"
device_version "OpenCL 1.2 beignet 1.4 (git-f66ce212)"


valgrind --version
valgrind-3.12.0

using a pretty vanilla Ub17.04, except with llvm 3.7 and ocl1.2 mode.
cat /proc/cpuinfo | grep "model name"
model name      : Intel(R) Core(TM) i5-7260U CPU @ 2.20GHz

Thank you for this excellent open source project!
Comment 1 Xiuli Pan 2017-06-15 02:47:56 UTC
Hi,

I have see the reproduce and if you have already got the fix patch you can send it to our mail list: beignet@lists.freedesktop.org

And I will reveiew it.

Thanks
Xiuli
Comment 2 Patrick Beaulieu 2017-06-15 23:32:28 UTC
Sent patch to mail list
Comment 3 Patrick Beaulieu 2017-06-16 20:21:07 UTC
https://cgit.freedesktop.org/beignet/commit/?id=4775abfe1c4d0614cf06be02707b9f2f4eabbc2f

Can be moved to resolved?
Comment 4 Xiuli Pan 2017-06-19 02:50:12 UTC
Patch is mereged.

Thanks

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.