Summary: | drm_intel_gem_bo_context_exec() failed: No space left on device | ||
---|---|---|---|
Product: | Beignet | Reporter: | kenneth johansson <ken> |
Component: | Beignet | Assignee: | rongyang <rong.r.yang> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | aklhfex, evangelos, intelfx, ismo.puustinen, johann_frei, jolan, jylo06g, nroof, sauron, victzhang |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
URL: | https://patchwork.freedesktop.org/patch/121347/ | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
kenneth johansson
2016-11-08 23:16:49 UTC
Hmm, wrong errno - possibly a different bug...

(In reply to kenneth johansson from comment #0)
> on a macbook pro running ubuntu 16.04
>
> have never used this software before. This is after following the readme.
> works until I try to run the test program.
>
> -----------------------
> ./utest_run some_unit_test
> platform number 1
> platform_profile "FULL_PROFILE"
> platform_name "Intel Gen OCL Driver"
> platform_vendor "Intel"
> platform_version "OpenCL 1.2 beignet 1.3 (git-75b6f38)"
> platform_extensions "cl_khr_global_int32_base_atomics
> cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics
> cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
> cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images
> cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups
> cl_intel_subgroups_short"
> drm_intel_gem_bo_context_exec() failed: No space left on device
> Interrupt signal (SIGSEGV) received.
> summary:
> ----------
> total: 982
> run: 0
> pass: 0
> fail: 1
> pass rate: 0.000000

Hi Kenneth,

Could you clear your dmesg first and then run the utest again to see if there is any dmesg output about the drm?
Could you also run clinfo (you can get it with apt-get install clinfo) to get some of the platform info?
Also, please provide the LLVM version you are using.

Thanks
Xiuli

*** Bug 98882 has been marked as a duplicate of this bug. ***

(In reply to Chris Wilson from comment #2)
> Hmm, wrong errno - possibly a different bug...

Well, at least my bug 98882 was marked as a duplicate of *this*.

BTW, any plans to actually fix this? beignet has been broken since exactly that commit...

Hi,

We have tried to reproduce this bug but failed, so we are not sure if this bug is related to PPGTT or something else. The commit you bisected is our pre-work for OpenCL 2.0, and this may need PPGTT support. Could you check if the PPGTT on your device is on and provide the dmesg with the drm debug on?
Thanks
Xiuli

(In reply to Xiuli Pan from comment #6)
> We have tried to reproduce this bug but failed, we are not sure if this bug
> is related to PPGTT or something else. The commit you bisect out is our
> pre-work for OpenCL 2.0 and this may need PPGTT support, could you check if
> the PPGTT on your device is on and provide the dmesg with the drm debug on.

How do I check that? Not that I have any familiarity with Intel's driver internals...

I got the same error message as described in #98882, which was marked as a duplicate of this one:

```
$ clinfo
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 1.2 beignet 1.3
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short
Platform Extensions function suffix Intel
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Segmentation fault
```

PPGTT seems to be enabled on my end:

```
# cat /sys/module/i915/parameters/enable_ppgtt
1
```

I'm running the Haswell platform.
```
$ lspci -s 00:02.0 -nn
00:02.0 VGA compatible controller [0300]: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller [8086:0416] (rev 06)
$ glxinfo | grep 'OpenGL renderer'
OpenGL renderer string: Mesa DRI Intel(R) Haswell Mobile
```

I am also affected by this bug, with current beignet master (8efa803f2f93e377b30ff957a74c5d69beec7744), on a Dell XPS 15 from 2013.

/proc/cpuinfo reads:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Core(TM) i7-4712HQ CPU @ 2.30GHz
stepping : 3
microcode : 0x20
cpu MHz : 2300.421
cache size : 6144 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_t
bugs :
bogomips : 4589.58
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

A workaround I've found is to disable the detection of HAVE_DRM_INTEL_BO_SET_SOFTPIN. Commenting out the line in CMakeLists.txt ensures that the HAS_BO_SET_SOFTPIN define does not get set, and the device becomes usable again.

I also have PPGTT enabled. I'm running Linux kernel 4.9.2 (from Debian unstable) with libdrm 2.4.74.

(In reply to Xiuli Pan from comment #6)
> Hi,
>
> We have tried to reproduce this bug but failed, we are not sure if this bug
> is related to PPGTT or something else. The commit you bisect out is our
> pre-work for OpenCL 2.0 and this may need PPGTT support, could you check if
> the PPGTT on your device is on and provide the dmesg with the drm debug on.
>
> Thanks
> Xiuli

So:

1. PPGTT is on
----
# cat /sys/module/i915/parameters/enable_ppgtt
1
----

2. This is Haswell, integrated graphics of Intel Core i7-4700MQ

3. I'm not sure which logs you want. The kernel log with `drm.debug=0x3f` is very big. Which parts of it do you need?
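For anyone else unsure how to check PPGTT or how to trim a large `drm.debug` kernel log before attaching it, here is a minimal Python sketch. It is not something the Beignet developers asked for: the helper names `read_ppgtt` and `filter_drm_lines` and the keyword list (`drm`, `i915`) are assumptions chosen for illustration. Only the sysfs path is taken from the comments above.

```python
from pathlib import Path

# sysfs path quoted earlier in this thread; it may be absent on other kernels.
PPGTT_PARAM = Path("/sys/module/i915/parameters/enable_ppgtt")


def read_ppgtt(path=PPGTT_PARAM):
    """Return the enable_ppgtt value as text, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def filter_drm_lines(log_text, keywords=("drm", "i915")):
    """Keep only the log lines that mention one of the keywords.

    The match is case-insensitive. The default keywords are a guess at
    what is relevant, not a list requested by the developers.
    """
    wanted = tuple(k.lower() for k in keywords)
    return [
        line
        for line in log_text.splitlines()
        if any(k in line.lower() for k in wanted)
    ]
```

A possible use is to save the kernel log to a file, read it with `Path(...).read_text(errors="replace")`, and attach only the output of `filter_drm_lines`. The full log is still worth keeping in case more context is requested.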
(In reply to Xiuli Pan from comment #6)
> Hi,
>
> We have tried to reproduce this bug but failed, we are not sure if this bug
> is related to PPGTT or something else. The commit you bisect out is our
> pre-work for OpenCL 2.0 and this may need PPGTT support, could you check if
> the PPGTT on your device is on and provide the dmesg with the drm debug on.
>
> Thanks
> Xiuli

OK, here is the kernel log with `drm.debug=0x1f` corresponding (roughly) to the time of running a sample OpenCL workload (CLBlast unit test):

https://intelfx.name/files/2017-02-02%20beignet%20debug.log

That link doesn't work - did you misspell it?

(In reply to Rebecca Palmer from comment #12)
> That link doesn't work - did you misspell it?

Sorry, my server went down unexpectedly (I could not find a pastebin that would accommodate a 6 MiB file). I'll bring it back later today.

(In reply to Rebecca Palmer from comment #12)
> That link doesn't work - did you misspell it?

Fixed. The link did wrap in that comment, but it should not matter.

I can confirm the error (dup. of this - 98882) on my system:

Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
Linux (source compiled) 4.9.0 #1 SMP Sat Dec 31 11:03:07 CET 2016 x86_64 GNU/Linux
OS: debian sid
ii beignet-dev:amd64 1.3.0-1
ii beignet-opencl-icd:amd64 1.3.0-1
Application is source compiled GIMP 2.9.5
PPGTT is on

Error:
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x80efe190 error, type is 4592, error staus is -5"
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x81711200 error, type is 4597, error staus is -5"

Cheers - Adrian

Arch Linux with stock 4.9.8-1 kernel, libdrm 2.4.75. CPU is i3-3110M (Ivy Bridge).
Beignet 1.3.0 and the latest git version (cb4f2adc) don't work either, but there is no SIGSEGV.
For example (with cb4f2adc commit):
------
$ ./utest_run test_load_program_from_bin_file
platform number 1
platform_profile "FULL_PROFILE"
platform_name "Intel Gen OCL Driver"
platform_vendor "Intel"
platform_version "OpenCL 2.0 beignet 1.4 (git-cb4f2adc)"
platform_extensions "cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing"
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x55b9cbc4cc00 error, type is 4592, error status is -5"
device_profile "FULL_PROFILE"
device_name "Intel(R) HD Graphics IvyBridge M GT2"
device_vendor "Intel"
device_version "OpenCL 1.2 beignet 1.4 (git-cb4f2adc)"
device_extensions "cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_spir cl_khr_icd cl_intel_accelerator cl_intel_subgroups cl_intel_subgroups_short cl_khr_gl_sharing cl_intel_motion_estimation"
device_opencl_c_version "OpenCL C 1.2 beignet 1.4 (git-cb4f2adc)"
27 image formats are supported
......
test_load_program_from_bin_file()drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x55b9cb927fe0 error, type is 4592, error status is -5"
[FAILED]
Error: ((float *)buf_data[1])[i] == cpu_dst[i]
at file /mnt/Lilar/userdata/Downloads/beignet/utests/load_program_from_bin_file.cpp, function test_load_program_from_bin_file, line 76
......
------

I'm also hit by the 'Beignet: "Exec event 0x55b9cb927fe0 error, type is 4592, error status is -5"' bug. I'm running Beignet 1.4.0 (git cb4f2adcb78c71fae4) on Minnowboard Turbot (Atom E3826), kernel 4.9.6.
I wonder if this is the same bug ... but it looks like it.

I am running Debian 9 (stretch, kernel 4.9) with Beignet 1.3 on a 13-inch MacBook Pro from late 2014 (Haswell processor).
Former versions of Beignet used to work.

This is the output of clinfo (short version):

root@mac13:/home/kieffer/bin# clinfo -l
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x15603c0 error, type is 4592, error staus is -5"
Platform #0: Intel Gen OCL Driver
 `-- Device #0: Intel(R) HD Graphics Haswell Ultrabook GT3 reserved
Platform #1: Intel(R) OpenCL
 `-- Device #0: Intel(R) Core(TM) i5-4308U CPU @ 2.80GHz
Platform #2: AMD Accelerated Parallel Processing
 `-- Device #0: Intel(R) Core(TM) i5-4308U CPU @ 2.80GHz

Running any kernel fails (for example, clpeak is a simple OpenCL benchmarking tool which used to work with Beignet 1.2):

kieffer@mac13:~$ clpeak -p 0 --kernel-latency
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0xf2f050 error, type is 4592, error staus is -5"
Platform: Intel Gen OCL Driver
Device: Intel(R) HD Graphics Haswell Ultrabook GT3 reserved
Driver version : 1.3 (Linux x64)
Compute units : 40
Clock frequency : 1000 MHz
Kernel launch latency : drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0xb68d20 error, type is 4592, error staus is -5"
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x1ac60a0 error, type is 4592, error staus is -5"
drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0xb68d20 error, type is 4592, error staus is -5"
clGetEventProfileInfo (-7)
Tests skipped

Kernel:

kieffer@mac13:~$ uname -a
Linux mac13 4.9.0-1-amd64 #1 SMP Debian 4.9.6-3 (2017-01-28) x86_64 GNU/Linux

Version installed:

kieffer@mac13:~$ dpkg -l |grep beignet
ii beignet 1.3.0-1 amd64 OpenCL library for Intel GPUs - transitional dummy package
ii beignet-dev:amd64 1.3.0-1 amd64 OpenCL for Intel GPUs (development files and documentation)
ii beignet-opencl-icd:amd64 1.3.0-1 amd64 OpenCL library for Intel GPUs
kieffer@mac13:~$ dpkg -l |grep intel
ii intel-microcode 3.20161104.1 amd64 Processor microcode firmware for Intel CPUs
ii intel-opencl-icd 5.0.0.57-2 amd64 OpenCL™ runtime for Intel® CPU device
ii libdrm-intel1:amd64 2.4.74-1 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii libdrm-intel1:i386 2.4.74-1 i386 Userspace interface to intel-specific kernel DRM services -- runtime
ii xserver-xorg-video-intel 2:2.99.917+git20161206-1 amd64 X.Org X server -- Intel i8xx, i9xx display driver

Have the same issue on Ubuntu 17.04, kernel 4.10.0-19-generic and beignet 1.3. My GPU is Intel(R) HD Graphics IvyBridge M GT2. I get the following when running clinfo:

drm_intel_gem_bo_context_exec() failed: Device or resource busy
Beignet: "Exec event 0x249e1d0 error, type is 4592, error staus is -5"

I'm getting the same error as in the Description when Compatibility Support Module (CSM) is disabled and/or VT-d is enabled in my motherboard BIOS (UEFI). The same issue is also reproducible with LuxRender.

When CSM is enabled and VT-d is disabled, the issue is not reproducible. For example, "utest_run compiler_box_blur_float" passes, and LuxRender renders without errors.

Addition to my previous comment:
The system is freshly updated Arch Linux with linux-libre 4.11.4_gnu-1. MB is AsRock Z97-Extreme6. CPU is i7-4970K with HD 4600 iGPU.

(In reply to nRoof from comment #21)
> Addition to my previous comment:
> The system is freshly updated Arch Linux with linux-libre 4.11.4_gnu-1. MB
> is AsRock Z97-Extreme6. CPU is i7-4970K with HD 4600 iGPU.

Typo: CPU is i7-4790K

Which version of beignet do you use?

(In reply to rongyang from comment #23)
> which version beignet do you use?

1.3.1

OK, this now works for me (apart from a type mix-up in 8d3e93fa, but that's unrelated).

I can confirm this error using convert (imagemagick), beignet version 1.3.1-4.
Processor: Intel Core i7-3517U
arch-linux system 4.13.12-1-ARCH

Also seeing this on a Xeon E3-1246 v3 (HD P4600 GPU) using Beignet 1.3.1 and Kernel 4.14.4.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/beignet/beignet/issues/54.