Bug 31670 - RV670 GPU lockup with OpenArena benchmark and kernel 2.6.37 rc1/rc2
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2010-11-16 10:35 UTC by Alain Perrot
Modified: 2010-12-09 10:11 UTC



Attachments
dmesg output after GPU lockup (50.62 KB, text/plain)
2010-11-22 13:21 UTC, Alain Perrot
kern.log extract on GPU lockup with kernel 2.6.35rc5 (5.03 KB, text/plain)
2010-12-07 11:12 UTC, Alain Perrot

Description Alain Perrot 2010-11-16 10:35:00 UTC
When running the OpenArena anholt benchmark on my Radeon HD 3870 (RV670, PCIe) on my 64-bit Kubuntu 10.10 system with kernel 2.6.37 rc1 or rc2, and libdrm, mesa, xorg-video-ati from git master, I often get a GPU lockup on the level loading screen.

I can still log into the system through SSH and reboot it, but I cannot kill the openarena process. There is nothing in dmesg when the lockup occurs.

There is no such issue with kernel 2.6.36 and the same versions of libdrm, mesa, xorg-video-ati.
Comment 1 Alain Perrot 2010-11-16 10:40:51 UTC
I forgot to mention that this is with KMS enabled and the Gallium-based r600g Mesa driver.
Comment 2 Alex Deucher 2010-11-16 10:49:22 UTC
Can you bisect to see which commit is problematic?
Comment 3 Alain Perrot 2010-11-21 07:27:07 UTC
Here it is:

d0f8a854c340986359a3b0a97e380c71def7a440 is the first bad commit
commit d0f8a854c340986359a3b0a97e380c71def7a440
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Sat Sep 4 05:04:34 2010 -0400

    drm/radeon/kms/r6xx+: use new style fencing (v3)

    On r6xx+ a newer fence mechanism was implemented to replace
    the old wait_until plus scratch regs setup.  A single EOP event
    will flush the destination caches, write a fence value, and generate
    an interrupt.  This is the recommended fence mechanism on r6xx+ asics.

    This requires my previous writeback patch.

    v2: fix typo that enabled event fence checking on all asics
    rather than just r6xx+.

    v3: properly enable EOP interrupts
    Should fix:
    https://bugs.freedesktop.org/show_bug.cgi?id=29972

    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

:040000 040000 f9980a7432d5daac116586b03587fd90be768be9 ad8ce09ffba05e2438c50ddff1fa6243e1349ccd M      drivers
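Conceptually, the bisection requested in comment 2 is a binary search for the first bad commit between a known-good and a known-bad kernel. A minimal Python sketch of that idea follows, assuming a linear history. The placeholder commit names (`c1`, `c2`, `c4`), their ordering, and the `is_bad` callback are hypothetical; the callback stands in for building each kernel and running the OpenArena benchmark. This is an illustration only, not how `git bisect` is implemented.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    `commits` is ordered oldest to newest. The first entry is assumed to be
    known good and the last entry known bad, so neither is tested again.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad]


# Hypothetical history: only the endpoints and the abbreviated hash
# d0f8a854 come from this report; c1, c2, c4 are placeholders.
history = ["v2.6.36", "c1", "c2", "d0f8a854", "c4", "v2.6.37-rc1"]


def lockup_with(commit: str) -> bool:
    # Stand-in for "build this kernel and run the benchmark".
    return history.index(commit) >= history.index("d0f8a854")


print(first_bad_commit(history, lockup_with))
```

Each iteration halves the remaining range, so a range of N commits needs roughly log2(N) kernel builds.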
Comment 4 Alex Deucher 2010-11-21 07:41:45 UTC
You can disable writeback (wb) by adding:
radeon.no_wb=1
to your kernel command line.
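
To check that the option actually reached the running kernel, the boot command line in `/proc/cmdline` can be inspected. The sketch below is one way to do that in Python. The helper name `module_param` is hypothetical, the simple whitespace split assumes the command line contains no quoted values with spaces, and taking the last occurrence when a parameter is repeated is an assumption.

```python
from typing import Optional


def module_param(cmdline: str, module: str, name: str) -> Optional[str]:
    """Return the value of `module.name=value` from a kernel command line.

    Returns None when the parameter is absent. If the parameter appears
    more than once, the last occurrence is returned (assumption).
    """
    key = f"{module}.{name}"
    value = None
    for token in cmdline.split():
        k, sep, v = token.partition("=")
        if sep and k == key:
            value = v
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            current = f.read()
    except OSError:
        current = ""  # not on Linux, or /proc is unavailable
    print("radeon.no_wb =", module_param(current, "radeon", "no_wb"))
```

A result of `1` means the option was passed at boot; `None` means it was not on the command line.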
Comment 5 Alain Perrot 2010-11-21 08:22:49 UTC
With kernel 2.6.37-rc2 and option radeon.no_wb=1, it seems that there is no GPU lockup with the OpenArena benchmark and both the r600g and r600c Mesa drivers.

With the same kernel and without that option, the OpenArena benchmark locks the GPU with the r600g driver (runs fine with r600c).
Comment 6 Alex Deucher 2010-11-21 08:29:58 UTC
Sounds like an r600g bug then.
Comment 7 Jerome Glisse 2010-11-22 08:32:34 UTC
Did you try with the latest Mesa for r600g? The following commit might help:

http://cgit.freedesktop.org/mesa/mesa/commit/?id=3e76ed4e256dd7964deaf37b89220c775fd2891e
Comment 8 Alain Perrot 2010-11-22 09:49:02 UTC
I have just tested with the new kernel 2.6.37-rc3 and Mesa up to commit http://cgit.freedesktop.org/mesa/mesa/commit/?id=c63a86e1e5665fb5cd94de42d6c59171398e12ee, which should include the one you pointed out.

Still the same result: GPU lockup with OpenArena benchmark and the r600g driver.
Comment 9 Jerome Glisse 2010-11-22 10:37:07 UTC
Is it fullscreen or not? If a lockup happens, is it always during the loading screen? Can you please attach the kernel log after a lockup?
Comment 10 Alain Perrot 2010-11-22 13:21:09 UTC
Created attachment 40485 [details]
dmesg output after GPU lockup

I have always run the OpenArena benchmark fullscreen. Actually, it is fullscreen on one screen (1920x1080) of my dual-screen setup; the second screen (1920x1200) goes black when OpenArena is started. I have just checked in windowed mode, with the same result (GPU lockup).

The lockup always happens on the loading screen, but the exact time varies slightly: while loading sounds, while loading the map... Once, while bisecting the kernel, the lockup even happened before the loading screen was displayed (black screen). But I never get past the loading screen.

About the kernel log: I used to have nothing in it, but this time I waited longer (about 5 minutes) and got a call trace; see the attachment.
Comment 11 Alain Perrot 2010-12-07 11:12:57 UTC
Created attachment 40880 [details]
kern.log extract on GPU lockup with kernel 2.6.35rc5

I have just checked with the new kernel 2.6.37rc5 and updated Mesa packages (http://cgit.freedesktop.org/mesa/mesa/commit?id=44094356149d9a63c197e15f9db344ef2f651d86), with the same result: the OpenArena benchmark causes a GPU lockup with the r600g driver.

A difference from previous tests (in case it matters): I am now running KWin in OpenGL compositing mode (with the r600c driver); it was previously running in XRender compositing mode.
Comment 12 Jerome Glisse 2010-12-07 12:20:17 UTC
Can you please confirm that this patch:
http://people.freedesktop.org/~glisse/0001-r600g-fix-userspace-fence-against-lastest-kernel.patch

applied on top of the latest Mesa fixes the issue for you?
Comment 13 Alain Perrot 2010-12-07 14:21:33 UTC
Your patch applied on top of Mesa master (http://cgit.freedesktop.org/mesa/mesa/commit/?id=72845d206e692581b6084c56b8d1f3bc689e8a03) seems to fix the issue with kernel 2.6.37rc5 for me.

I have run the OpenArena benchmark a few times and played a level of the game without a GPU lockup.

Thanks.
Comment 14 Alain Perrot 2010-12-09 10:11:02 UTC
I have seen that the patch has been committed to Mesa, so I have checked with updated Mesa packages (http://cgit.freedesktop.org/mesa/mesa/commit/?id=05e534e6c4395269b1ca3a9694a1f437363dd186) and I can confirm that there is no more GPU lockup with Linux kernel 2.6.37rc5.

Many thanks.

