Summary: | [bisected i965] Bus error (core dumped) on oglc texdecaltile | ||
---|---|---|---|
Product: | Mesa | Reporter: | fangxun <xunx.fang> |
Component: | Drivers/DRI/i965 | Assignee: | Ian Romanick <idr> |
Status: | VERIFIED FIXED | QA Contact: | |
Severity: | major | ||
Priority: | medium | CC: | chris, shuang.he |
Version: | git | ||
Hardware: | All | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=38423 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Limit texture size to fit in GTT
Check the buffer is mappable at the time of creation
Catch SIGBUS and propagate GL_OUT_OF_MEMORY |
Description
fangxun
2011-06-02 22:26:21 UTC
=0 sandybridge:/usr/src/oglconform_31/dump/linux/debug64/OGLconform (master)$ ./oglconform64 -z -s -suite all -v 2 -D 33 -test texdecaltile
Intel OpenGL Conformance Test
Version ENG (Feb 1 2011 11:48:42)
CLI options echo:
oglconform64 -z -s -suite all -v 2 -D 33 -test texdecaltile
WARNING: OpenCL is not supported.
Pixel format 33
GLX_USE_GL: Yes
GLX_BUFFER_SIZE: 32
GLX_LEVEL: 0
GLX_RGBA: Yes
GLX_DOUBLEBUFFER: Yes
GLX_STEREO: No
GLX_AUX_BUFFERS 0
GLX_RED_SIZE 8
GLX_GREEN_SIZE 8
GLX_BLUE_SIZE 8
GLX_ALPHA_SIZE 8
GLX_DEPTH_SIZE 24
GLX_STENCIL_SIZE 8
GLX_ACCUM_RED_SIZE 0
GLX_ACCUM_GREEN_SIZE 0
GLX_ACCUM_BLUE_SIZE 0
GLX_ACCUM_ALPHA_SIZE 0
Setup Report.
Verbose level = 2.
Path inactive.
Visual Report.
Display ID = 33.
Double Buffered.
RGBA (8, 8, 8, 8).
Stencil (8).
Depth (24).
Accumulation (0, 0, 0, 0).
0 Auxilary Buffers.
OpenGL Report.
Vendor - 'Tungsten Graphics, Inc'
Renderer - 'Mesa DRI Intel(R) Sandybridge Desktop '
Version - '2.1 Mesa 7.11-devel (git-3aeb596)'
GLSL Version - '1.20'
WARNING: Extension GL_ARB_draw_elements_base_vertex is reported but its API is NOT COMPLETE.
>> Texture Decal Tiling (texdecaltile) test:
<< Texture Decal Tiling (texdecaltile) test passed.
Intel Conformance passed.
Total Passed : 1
Total Failed : 0
Total Not run: 0
Puzzled.
Increasing the maximum texture size may lead to odd out-of-memory-like problems. How much memory do the failing and non-failing configurations have? Are you both running the same kernel?

Chris, is your SNB a Huronriver? We have seen some differences between the platforms, and this could be another one.

Ah, I see the error on my HuronRiver. And actually thinking about the issue: 8192*8192*4 = 256MiB, which exceeds the available mappable aperture size on my laptop but still fits comfortably within the 512MiB aperture on my SugarBay. So i915_gem_fault() is detecting an E2BIG when trying to bind the bo into the mappable region, which gets translated to a SIGBUS in the fault handler. The easiest approach is to ratchet down the maximum texture size until we can safely mmap it through the GTT.

Created attachment 47518
Limit texture size to fit in GTT

(In reply to comment #4)
> Created an attachment (id=47518)
> Limit texture size to fit in GTT

This isn't generally the right way to do this, but this should fix the issue for now. We should fail at texture creation time instead. After all, an 8192x1 texture will fit in the GTT. It's only when both dimensions are too big that we should fail.

Created attachment 47626
Check the buffer is mappable at the time of creation

Ian, something along the lines of this then? I'm not sure if this will make the testcase happy though; I guess it will complain about the GL_OUT_OF_MEMORY. Proof-of-concept patch only. (Obviously doesn't even compile ;-)

Created attachment 47832
Catch SIGBUS and propagate GL_OUT_OF_MEMORY

The third scheme is to trap the SIGBUS and convert it to a GL_OUT_OF_MEMORY. Insert the horror of signal handling and multithreaded applications here... (Again another proof-of-concept patch; it prevents the SIGBUS in the test case, but cannot prevent it from failing, although this time gracefully.)
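For reference, the size arithmetic in the comments above (8192*8192*4 = 256MiB against the mappable aperture) can be written out as a small check. This is only a sketch: `texture_bytes()` and `texture_fits_mappable()` are hypothetical helper names, not Mesa or libdrm functions, and the calculation ignores mipmaps, tiling alignment and anything else already bound in the aperture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed for one uncompressed width x height
 * image, ignoring mipmaps, padding and tiling alignment. */
static uint64_t texture_bytes(uint32_t width, uint32_t height,
                              uint32_t bytes_per_texel)
{
    assert(bytes_per_texel > 0);
    /* Widen before multiplying so 8192 * 8192 * 4 cannot wrap in 32 bits. */
    return (uint64_t)width * height * bytes_per_texel;
}

/* Hypothetical check: does the image fit in a mappable region of
 * mappable_bytes? A real driver would also need headroom for other
 * objects that are already bound. */
static bool texture_fits_mappable(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_texel,
                                  uint64_t mappable_bytes)
{
    return texture_bytes(width, height, bytes_per_texel) <= mappable_bytes;
}
```

With a 512MiB aperture the 8192x8192 RGBA case passes, while with a smaller mappable region (128MiB is an assumed figure, the thread does not give the laptop's value) it is rejected. An 8192x1 texture is accepted either way, which is the point made about failing per texture rather than lowering the global maximum.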
When I run the test on my SNB system, the conformance test app seg faults in the file:
src/CONFSHEL/windowing.cpp
inside the function:
FbConfig::get_config_with_id()
because glXChooseFBConfig() returns a NULL configs[] array and the code tries to access the first element of that array, dereferencing NULL. The test does not seg fault in the underlying graphics system, which appears to be well behaved.

When I modify the test to check whether configs is NULL and just return NULL from the function in that situation, the test no longer seg faults. Instead, the test emits the following error message:
Error encountered during test scheduling:
Error: no test in schedule is compatibile with selected pixel format

This appears to me to be a test application failure, not a failure of the underlying system.

(In reply to comment #9)
> When I run the test on my SNB system, the conformance test app seg faults in
> the file:
> src/CONFSHEL/windowing.cpp
> inside the function:
> FbConfig::get_config_with_id()
> because glXChooseFBConfig() returns a NULL configs[] array and the code tries
> to access the first element of that array, dereferencing NULL. The test does
> not seg fault in the underlying graphics system, which appears to be well
> behaved.
> When I modify the test to check whether configs is NULL and just return NULL
> from the function in that situation, the test no longer seg faults. Instead,
> the test emits the following error message:
> Error encountered during test scheduling:
> Error: no test in schedule is compatibile with selected pixel format
> This appears to me to be a test application failure, not a failure of the
> underlying system.

This error is in the oglc code; it now uses glXChooseFBConfig to select the fbconfig and its visual.
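The NULL check described in comment #9 might look roughly like the sketch below. It is not the oglconform source: `fb_config_t`, `choose_fn`, `choose_none()`, `choose_one()` and `first_config_or_null()` are hypothetical stand-ins so the example compiles without GLX headers. The only property borrowed from glXChooseFBConfig() is that it returns NULL when nothing matches.

```c
#include <stddef.h>

/* Stand-in for GLXFBConfig so the sketch builds without GLX headers. */
struct fb_config_stub { int id; };
typedef struct fb_config_stub *fb_config_t;

/* Hypothetical chooser type with the same "NULL on no match" contract
 * as glXChooseFBConfig(). */
typedef fb_config_t *(*choose_fn)(int *nelements);

/* Stub chooser: nothing matches, so it returns NULL and a count of 0. */
static fb_config_t *choose_none(int *nelements)
{
    *nelements = 0;
    return NULL;
}

/* Stub chooser: exactly one match, with an illustrative ID of 115. */
static fb_config_t *choose_one(int *nelements)
{
    static struct fb_config_stub cfg = { 115 };
    static fb_config_t list[] = { &cfg };
    *nelements = 1;
    return list;
}

/* Return the first matching config, or NULL instead of dereferencing
 * an empty result. Real GLX code would also XFree() the returned array. */
static fb_config_t first_config_or_null(choose_fn choose)
{
    int count = 0;
    fb_config_t *configs = choose(&count);

    if (configs == NULL || count <= 0)
        return NULL;
    return configs[0];
}
```

The caller then has to handle a NULL result itself, which matches the thread: the application stops crashing and reports a scheduling error instead.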
So now we should test it using the command:
oglconform -z -s -suite all -v 2 -test texdecaltile basic.allCases -D 115
(without the -D option it will test all the available visuals). I find that it passes with some visuals while failing with others, so I guess it is related to how our driver handles some visuals.

It passes with the visual:
ID      |ACCELERA|DB      |REND_T  |SURF_T  |C_BUF_T |BUF_S   |RED_S   |
     115|       1|       1|      gl|  wipbpx|    rgba|      32|       8|
GREEN_S |BLUE_S  |ALPHA_S |DEPTH_S |STENC_S |ACCUM_S |SPL_BUF |SAMPLES |
       8|       8|       8|      24|       8|      64|       0|       0|
SRGB    |TEX_RGB |TEX_RGBA|CAVEAT  |SWAP    |M_PBUF_W|M_PBUF_H|M_PBUF_P
      -1|       0|       0|    slow|   undef|       0|       0|       0

It fails with the visual:
ID      |ACCELERA|DB      |REND_T  |SURF_T  |C_BUF_T |BUF_S   |RED_S   |
     115|       1|       1|      gl|  wipbpx|    rgba|      24|       8|
GREEN_S |BLUE_S  |ALPHA_S |DEPTH_S |STENC_S |ACCUM_S |SPL_BUF |SAMPLES |
       8|       8|       0|      24|       8|      48|       0|       0|
SRGB    |TEX_RGB |TEX_RGBA|CAVEAT  |SWAP    |M_PBUF_W|M_PBUF_H|M_PBUF_P
      -1|       0|       0|    slow|   undef|       0|       0|       0

Another issue in mesa now is that even with the same fbconfig ID it will have different formats on different platforms. This may also be a driver bug?

This bug also causes piglit/fbo-maxsize to occasionally segfault on my ILK machine. See also Bug 38423 - i965/gen5: fbo-maxsize fails on master
https://bugs.freedesktop.org/show_bug.cgi?id=38423

Demote to unblock the release.

The root cause of this bug appears to have been in the code base for quite a while. The issue is that texture map allocation fails silently if too large a texture map or too many small texture maps are allocated. As I understand it (after discussions with Eric Anholt), in DRM (file: intel_bufmgr_gem.c proc: drm_intel_gem_bo_map_gtt()), the memory allocation for a texture map occurs in 2 stages: the first stage creates a mapped file and its virtual memory, and the second stage passes that mapped file to the kernel to be placed into the GTT (via drmIoctl(..., DRM_IOCTL_I915_GEM_SET_DOMAIN, ...)).
If the memory allocated by mmap() is fragmented (and the likelihood of that happening gets higher the larger the texture map is), DRM_IOCTL_I915_GEM_SET_DOMAIN will not be able to use that memory. However, no indication is returned if that memory is not usable due to fragmentation (as far as I understand the code, which is only partially). In that event, on the first attempt to write to that memory (such as during the initialization of a texture map using an image supplied by the application), the system segfaults in texstore() (which is located in mesa/src/mesa/main/texstore.c).

*** Bug 44436 has been marked as a duplicate of this bug. ***

Fixed by the series ending with:
commit ca9a7d975af228cabb79c3040ec67f26f94f90a2
Author: Eric Anholt <eric@anholt.net>
Date: Tue Apr 2 17:28:41 2013 -0700
intel: Avoid making tiled miptrees we won't be able to blit.

Verified it on latest mesa master and 10.2 branch.
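As an illustration of the failure mode discussed in this thread (a mapping that is created successfully but faults on the first write) and of the idea of trapping SIGBUS, here is a minimal standalone sketch. It is not the code from attachment 47832 and does not touch the GPU. It assumes Linux, where touching a page of a shared file mapping that lies entirely beyond end-of-file raises SIGBUS. `map_empty_file()` and `guarded_fill()` are hypothetical names, and the handler is process-wide, so the approach is not thread safe (the "horror" mentioned earlier).

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf bus_jmp;

/* SIGBUS handler: unwind back to the sigsetjmp() in guarded_fill(). */
static void on_sigbus(int signo)
{
    (void)signo;
    siglongjmp(bus_jmp, 1);
}

/* Map len bytes of an empty temporary file. mmap() itself succeeds;
 * only touching the pages faults, because they lie beyond end-of-file. */
static void *map_empty_file(size_t len)
{
    FILE *f = tmpfile();
    void *p;

    if (f == NULL)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fileno(f), 0);
    fclose(f); /* the mapping stays valid after the file is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Fill dst with value; return false instead of dying if SIGBUS arrives.
 * A caller could turn a false result into GL_OUT_OF_MEMORY. */
static bool guarded_fill(void *dst, int value, size_t len)
{
    struct sigaction sa, old;
    volatile bool ok = true;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return false;

    if (sigsetjmp(bus_jmp, 1) == 0)
        memset(dst, value, len); /* may fault on the first touched page */
    else
        ok = false;              /* reached via siglongjmp() from the handler */

    sigaction(SIGBUS, &old, NULL); /* restore the previous disposition */
    return ok;
}
```

Writing to ordinary memory through `guarded_fill()` succeeds, while writing to the region returned by `map_empty_file()` reports failure instead of killing the process. As noted in the discussion of the proof-of-concept patch, this only makes the failure graceful; it does not make the oversized allocation usable.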