Bug 74726

Summary: r600g: unrecoverable GPU lockup after glDrawElements INVALID_ENUM
Product: Mesa
Reporter: Török Edwin <edwin+bugs>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: normal
Priority: medium
CC: vmerlet
Version: git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments: corrupt.trace, gpureset.log, text_recovered.log, retrace.log

Description Török Edwin 2014-02-08 22:31:09 UTC
Created attachment 93679 [details]
corrupt.trace

If you send a lot of invalid glDrawElements commands to Mesa, it can cause a GPU lockup. Would it be possible to validate these calls on the Mesa side, or in the kernel CS checker, to avoid the GPU lockup? [1]
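As a rough illustration of the kind of argument validation asked about here, the C sketch below rejects a draw call whose mode, count, or index type is invalid before anything would reach the hardware. The helper name check_draw_elements is hypothetical and is not a Mesa function. The enum values are copied from the OpenGL headers so the sketch builds without GL, and the accepted modes are assumed to be those of a 3.3 core profile.

```c
/* Enum values as defined in the OpenGL headers, repeated here so the
 * sketch compiles without any GL dependency. */
#define GL_NO_ERROR                 0
#define GL_INVALID_ENUM             0x0500
#define GL_INVALID_VALUE            0x0501
#define GL_UNSIGNED_BYTE            0x1401
#define GL_UNSIGNED_SHORT           0x1403
#define GL_UNSIGNED_INT             0x1405
#define GL_TRIANGLE_FAN             0x0006
#define GL_LINES_ADJACENCY          0x000A
#define GL_TRIANGLE_STRIP_ADJACENCY 0x000D

/* Hypothetical helper (not part of Mesa): returns the GL error that a
 * glDrawElements call with these arguments should raise, or GL_NO_ERROR
 * if the arguments look valid. Assumes a 3.3 core profile context. */
static unsigned check_draw_elements(unsigned mode, int count, unsigned type)
{
    /* Core-profile primitive modes: GL_POINTS (0) through GL_TRIANGLE_FAN,
     * plus the four adjacency modes. */
    int mode_ok = mode <= GL_TRIANGLE_FAN ||
                  (mode >= GL_LINES_ADJACENCY &&
                   mode <= GL_TRIANGLE_STRIP_ADJACENCY);
    if (!mode_ok)
        return GL_INVALID_ENUM;

    /* A negative element count is an invalid value. */
    if (count < 0)
        return GL_INVALID_VALUE;

    /* Only the three unsigned integer index types are accepted. */
    switch (type) {
    case GL_UNSIGNED_BYTE:
    case GL_UNSIGNED_SHORT:
    case GL_UNSIGNED_INT:
        return GL_NO_ERROR;
    default:
        return GL_INVALID_ENUM;
    }
}
```

A caller would skip the draw entirely whenever the result is not GL_NO_ERROR, so that a garbage argument produces only an error and never a command stream.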

If you replay the attached trace on r600g you get a GPU lockup; see the attachments:
 * gpureset.log: dmesg from a run where a reboot was the only solution: no text console, no X, nothing works until a reboot
 * text_recovered.log: dmesg from a run where I could kill the application and X and get to a framebuffer text console. Starting X again is impossible, though, unless I reboot

I've run the trace under valgrind and I see no valgrind errors, but of course I see a lot of Mesa errors. The Mesa errors shouldn't cause a GPU lockup, though.

I reproduced this with the 10.1 branch, but similar lockups happen on the 10.0.2 release too (if you force the version to 3.3), so it doesn't seem to be related to the 3.3 work on the 10.1 branch.

Mesa built like this:
$ ./configure --enable-dri --enable-glx-tls --enable-shared-glapi --enable-texture-float --enable-xa --disable-xvmc --disable-vdpau --with-gallium-drivers=r600,swrast LLVM_CONFIG=/usr/bin/llvm-config-3.4 --disable-dri3 --enable-debug

OpenGL version:

OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD RV730
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.1.0-rc1 (git-1e6bba5)
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 10.1.0-rc1 (git-1e6bba5)
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:

kernel:
$ uname -a
Linux debian 3.14.0-rc1-00015-g7c4c62a #48 SMP PREEMPT Sat Feb 8 17:33:48 EET 2014 x86_64 GNU/Linux


[1]
There are some use-after-free bugs in the gltut tutorials when you press Escape: the tutorial frees some resources, then calls glutLeaveMainLoop(), but freeglut still calls display(), causing use-after-frees. Of course it's expected that the application itself might crash or otherwise misbehave, but I was not expecting an unrecoverable GPU lockup.
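The use-after-free pattern described in [1] can be avoided on the application side with a guard in the display callback. The C sketch below is only an illustration under stated assumptions: the names scene, g_scene, g_shutting_down, scene_create and on_escape are hypothetical and do not come from gltut, and the GL and freeglut calls are left as comments so the sketch builds on its own.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical application state; not taken from the gltut sources. */
struct scene {
    float *vertices;
    size_t vertex_count;
};

static struct scene *g_scene;
static bool g_shutting_down;
static int g_frames_drawn;

/* Allocate the scene; returns false on allocation failure. */
static bool scene_create(size_t vertex_count)
{
    g_scene = malloc(sizeof *g_scene);
    if (g_scene == NULL)
        return false;
    g_scene->vertices = calloc(vertex_count, sizeof *g_scene->vertices);
    if (g_scene->vertices == NULL) {
        free(g_scene);
        g_scene = NULL;
        return false;
    }
    g_scene->vertex_count = vertex_count;
    return true;
}

/* Display callback: returns early once shutdown has started, so freed
 * data is never handed to the GL. */
static void display(void)
{
    if (g_shutting_down || g_scene == NULL)
        return;
    /* A real application would issue its draw calls here. */
    g_frames_drawn++;
}

/* Escape handler: set the flag first, then free, then leave the loop. */
static void on_escape(void)
{
    g_shutting_down = true;
    if (g_scene != NULL) {
        free(g_scene->vertices);
        free(g_scene);
        g_scene = NULL;
    }
    /* A real freeglut application would call glutLeaveMainLoop() here. */
}
```

With this guard, a display() call that freeglut issues after the Escape handler has run becomes a no-op and does not touch freed memory.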
Comment 1 Török Edwin 2014-02-08 22:31:26 UTC
Created attachment 93680 [details]
gpureset.log

Unrecoverable GPU lockup
Comment 2 Török Edwin 2014-02-08 22:31:52 UTC
Created attachment 93681 [details]
text_recovered.log

I killed glretrace and X via ssh, and then I could get to a framebuffer text console.
Comment 3 Török Edwin 2014-02-08 22:32:14 UTC
Created attachment 93682 [details]
retrace.log

Output of running glretrace under valgrind.
Comment 4 GitLab Migration User 2019-09-18 19:14:27 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe to and participate in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/492.
