Summary: | [i965] Xorg lockup with incorrect usage of VBOs | ||
---|---|---|---|
Product: | Mesa | Reporter: | Peter Clifton <pcjc2> |
Component: | Drivers/DRI/i965 | Assignee: | Eric Anholt <eric> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | normal | ||
Priority: | high | CC: | arekm, bgamari, kedgedev |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 20277 | ||
Attachments: |
Test case to trigger the crash
Xorg log (for driver information |
Description
Peter Clifton
2009-02-02 09:41:24 UTC
Created attachment 22489 [details]
Test case to trigger the crash
This might not be the minimal test-case, but I've so far been unable to un-wedge the GPU once this lock-up has occurred - so testing each crash requires a full reboot.
Created attachment 22490 [details]
Xorg log (for driver information
I couldn't reproduce it when my second buffer was created on the stack, only when I malloc / free the second buffer. Perhaps this gives some clues.

I see the same thing when using UXA (random freezes). It would be great if you could fix this, because UXA is unusable with this bug and EXA is slow :(

My backtrace:

#0 0x00007f114589b027 in ioctl () from /lib/libc.so.6
    No symbol table info available.
#1 0x00007f1144b31c63 in drmIoctl (fd=11, request=25688, arg=0x0) at xf86drm.c:187
    ret = -1
#2 0x00007f1144b31f66 in drmCommandNone (fd=11, drmCommandIndex=<value optimized out>) at xf86drm.c:2313
    No locals.
#3 0x00007f11446ac798 in I830BlockHandler (i=0, blockData=0x0, pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at i830_driver.c:2737
    flushed = <value optimized out>
    pScreen = (ScreenPtr) 0x862f40
    pScrn = (ScrnInfoPtr) 0x813ce0
    pI830 = (I830Ptr) 0x8163d0
#4 0x0000000000530a38 in AnimCurScreenBlockHandler (screenNum=0, blockData=0x0, pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at animcur.c:222
    pScreen = (ScreenPtr) 0x862f40
    as = (AnimCurScreenPtr) 0x35ce280
    dev = (DeviceIntPtr) 0x0
    now = 0
    soonest = 4294967295
#5 0x00000000004fcaae in compBlockHandler (i=0, blockData=0x0, pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at compinit.c:158
    pScreen = (ScreenPtr) 0x862f40
    cs = (CompScreenPtr) 0x35b7660
#6 0x000000000044f2fb in BlockHandler (pTimeout=0x7fff503afd78, pReadmask=0x7d4e60) at dixutils.c:384
    i = 1
#7 0x00000000004ead91 in WaitForSomething (pClientsReady=0x3662910) at WaitFor.c:215
    i = 86708032
    waittime = {tv_sec = 996394, tv_usec = 117000}
    wt = (struct timeval *) 0x7fff503afd60
    timeout = <value optimized out>
    clientsReadable = {fds_bits = {0 <repeats 16 times>}}
    clientsWritable = {fds_bits = {86708032, 56535488, 512, 79663048, 32, 32, 0, 32, 110854160, 5192572, 140734539431104, 5174049, 31, 140734539431184, 0, 110854160}}
    curclient = <value optimized out>
    selecterr = 0
    nready = <value optimized out>
    devicesReadable = {fds_bits = {40, 65518993, 1073741825, 140734539430996, 16, 65536, 140734539431324, 40, 139712160475648, 79663048, 16, 140734539431028, 16, 56583792, 57026832, 5332129}}
    now = 86708032
    someReady = 0
#8 0x000000000044b750 in Dispatch () at dispatch.c:367
    result = 0
    client = (ClientPtr) 0x52b0f40
    nready = -1
    start_tick = <value optimized out>
#9 0x00000000004319fd in main (argc=10, argv=0x7fff503aff58, envp=<value optimized out>) at main.c:397
    i = 1
    alwaysCheckForInput = {0, 1}

Yeah, while you're passing garbage to GL, from the spec it sounds like we should not render (or kill your app), but not hang the GPU.

(In reply to comment #2)
> Created an attachment (id=22490) [details]
> Xorg log (for driver information
>

(In reply to comment #1)
> Created an attachment (id=22489) [details]
> Test case to trigger the crash
>
> This might not be the minimal test-case, but I've so far been unable to
> un-wedge the GPU once this lock-up has occurred - so testing each crash
> requires a full reboot.

I was discussing this with Eric on IRC. Looking at the glDrawArrays man page, the second draw call shouldn't do *anything* because GL_VERTEX_ARRAY is disabled:

    When glDrawArrays is called, it uses count sequential elements from each enabled array to construct a sequence of geometric primitives, beginning with element first. mode specifies what kind of primitives are constructed, and how the array elements construct those primitives. If GL_VERTEX_ARRAY is not enabled, no geometric primitives are generated.

(In reply to comment #6)
> I was discussing this with Eric on IRC. Looking at the glDrawArrays man page,
> the second draw call shouldn't do *anything* because GL_VERTEX_ARRAY is

And by second I mean the second one that draws GL_TRIANGLES. This is actually the third call to glDrawArrays.

Thanks for the great testcase! piglit test added that reproduces the problem, patches sent out for review.
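The man page rule quoted in this thread (no primitives are generated when GL_VERTEX_ARRAY is disabled) can be illustrated with a minimal C sketch. This is not Mesa code: `sketch_array_state` and `sketch_draw_arrays` are hypothetical names invented for illustration, and the function only reports how many vertices would be submitted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified client-array state; not Mesa's real structures. */
struct sketch_array_state {
    bool vertex_array_enabled;  /* stands in for the GL_VERTEX_ARRAY enable */
};

/* Returns the number of vertices that would be submitted for drawing.
 * When the vertex position array is disabled, the call is a no-op. */
static size_t sketch_draw_arrays(const struct sketch_array_state *st,
                                 size_t first, size_t count)
{
    (void)first;  /* the start index does not matter for this sketch */
    if (!st->vertex_array_enabled)
        return 0;  /* nothing generated, so nothing reaches the GPU */
    return count;
}
```

The point of the guard is that an early return happens before any buffer address is computed, so stale or invalid array pointers are never looked at.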
I've committed a modified version of Eric's patch from the mesa3d mailing list (posted 2/25/09) that no-ops the glDrawArrays() call when there's no enabled vertex position array. Commit 97dd2ddbd97ba95e8bc8ab572ec05e8081556e1e

Peter, could you test Mesa/master with this change and your original test case?

bug #19740 looks to be the same issue, and I just hit it with mesa 7.4, which contains the commit from comment #9 (details in bug #19740).

With mesa master from 1 hour ago I also hit this:

0x00007fc499f48327 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fc499f48327 in ioctl () from /lib64/libc.so.6
#1 0x00007fc4987241c3 in drmIoctl (fd=7, request=25688, arg=0x0) at xf86drm.c:187
#2 0x00007fc4987244c6 in drmCommandNone (fd=7, drmCommandIndex=<value optimized out>) at xf86drm.c:2313
#3 0x00007fc49829c838 in I830BlockHandler (i=<value optimized out>, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at i830_driver.c:2655
#4 0x000000000052d4b8 in AnimCurScreenBlockHandler (screenNum=0, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at animcur.c:222
#5 0x00000000004f93fe in compBlockHandler (i=0, blockData=0x0, pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at compinit.c:158
#6 0x000000000044b170 in BlockHandler (pTimeout=0x7fffa3efda88, pReadmask=0x7d1ea0) at dixutils.c:384
#7 0x00000000004e7661 in WaitForSomething (pClientsReady=0x5571860) at WaitFor.c:215
#8 0x00000000004474f0 in Dispatch () at dispatch.c:367
#9 0x000000000042d63d in main (argc=7, argv=0x7fffa3efdc68, envp=<value optimized out>) at main.c:397

I'm using mesa from master (fetched at 20090408), xserver 1.6, intel driver from master, recent linux kernel from git, GM45.

My system locks up with test program #1. I need to run & stop & run #1 several times for the lockup to happen. Running once is not enough.

I also applied http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg06658.html but it didn't help.

Backtrace is different:

0x00007fb060db6327 in ioctl () from /lib64/libc.so.6
(gdb) bt
#0 0x00007fb060db6327 in ioctl () from /lib64/libc.so.6
#1 0x00007fb05eedcb05 in drm_intel_gem_bo_map_gtt (bo=0x5100d30) at intel_bufmgr_gem.c:721
#2 0x00007fb05f12e5ed in i830_uxa_prepare_access (pixmap=0x51fd370, access=UXA_ACCESS_RW) at i830_exa.c:865
#3 0x00007fb05f14d8c4 in uxa_check_poly_fill_rect (pDrawable=0x51fd370, pGC=0x3cc4800, nrect=1, prect=0x7fff6ad6a090) at uxa-unaccel.c:255
#4 0x00007fb05f14a84e in uxa_create_alpha_picture (pScreen=0xf2cba0, pDst=<value optimized out>, pPictFormat=0xf2d988, width=7, height=7) at uxa-render.c:841
#5 0x00007fb05f14ae4c in uxa_trapezoids (op=8 '\b', pSrc=0x4624ae0, pDst=0x51fce30, maskFormat=0xf2d988, xSrc=10, ySrc=7, ntrap=49, traps=0x4e78d64) at uxa-render.c:909
#6 0x000000000052ad8d in ProcRenderTrapezoids (client=0x3fcccf0) at render.c:782
#7 0x00000000004477bc in Dispatch () at dispatch.c:437
#8 0x000000000042d63d in main (argc=7, argv=0x7fff6ad6a368, envp=<value optimized out>) at main.c:397

This works fine on my G45 and GM45 at this point. Peter, do you still have the problem?

I gave up on getting the 100% solution I wanted, and came up with:

commit d7430d942f6c7950a92367aeb13b80cf76ccad78
Author: Eric Anholt <eric@anholt.net>
Date: Mon Aug 3 17:55:14 2009 -0700

    i965: Assert that the offset in the VBO is below the VBO size.

    This avoids sending a bad buffer address to the GPU due to programmer error, and is permitted by the ARB_vbo spec. Note that we still have the opportunity to dereference past the end of the GPU, because we aren't clipping to a correct _MaxElement, but that appears to be harder than it should be. This gets us the 90% solution.

    Bug #19911.

This fixes it again on my GM45 (which was lucking out somehow for a while, and then started failing).
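For readers unfamiliar with the kind of check the commit message describes, here is a minimal C sketch of asserting that an offset into a VBO is below the VBO size. It is an illustration under stated assumptions, not the actual i965 change: `sketch_vbo`, `sketch_offset_in_vbo` and `sketch_emit_vertex_offset` are hypothetical names, and the buffer is reduced to a size in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vertex buffer object; not Mesa's real type. */
struct sketch_vbo {
    size_t size;  /* bytes allocated for the buffer */
};

/* True when a client-supplied byte offset points inside the buffer. */
static bool sketch_offset_in_vbo(const struct sketch_vbo *vbo, size_t offset)
{
    return offset < vbo->size;
}

/* Returns the offset that would be handed to the hardware. The assert
 * stops a bad offset (programmer error) before it becomes a bad address. */
static size_t sketch_emit_vertex_offset(const struct sketch_vbo *vbo,
                                        size_t offset)
{
    assert(sketch_offset_in_vbo(vbo, offset));
    return offset;
}
```

As the commit message notes, a check like this covers only the starting offset. Reads that begin inside the buffer can still run past its end unless the element count is also clamped, which is why it is described as the 90% solution.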