Bug 7111 - racer: [drm:drm_lock_take] *ERROR* 3 holds heavyweight lock
Summary: racer: [drm:drm_lock_take] *ERROR* 3 holds heavyweight lock
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/r300
Version: 6.5
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Default DRI bug account
QA Contact:
URL: http://www.liflg.org/?what=dl&catid=6...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-04 10:16 UTC by pmhere
Modified: 2013-03-15 23:34 UTC

Attachments
backtrace of the problem (5.76 KB, text/plain)
2006-10-16 12:40 UTC, mrsteven
backtrace of racer.bin (5.05 KB, text/plain)
2006-10-20 03:35 UTC, mrsteven

Description pmhere 2006-06-04 10:16:43 UTC
Just after starting a race in the racer game I get a hard lock. Only the mouse
still moves around the screen. The keyboard is dead. Only a hard restart helps.

I get this with mesa-6.5, xorg-server-1.1.0 and drm cvs.

With mesa-6.4 and xorg-server-1.0.2 this game works fine.

I use r200. [Radeon 8500 LE]

link to the racer: http://racer.gamenavigator.ru/downloads/racer/patch/rr052b89l.

best regards.
Comment 1 pmhere 2006-06-04 12:49:05 UTC
Previous link doesn't work so here is working one:

http://www.liflg.org/?what=dl&catid=6&gameid=13&filename=racer_0.5.2beta8.9-english.run
Comment 2 Obmun Hjikoal 2006-08-11 05:25:06 UTC
Same DRM error in dmesg here, using my own OpenGL app.

Hardware: ATI Radeon 9200
Kernel: 2.6.17 (with gentoo patchset)
Mesa: 6.5
Xorg-server: 1.1.1
Via AGP.

Result: only the mouse moves. No keyboard, no way to get back to a terminal. The
system is still alive (I can ssh in). strace shows X is in:
--- SIGALRM (Alarm clock) @ 0 (0) ---.

When the application runs inside a nested X server (Xnest), no hang happens and
the app runs without problems.
Comment 3 Obmun Hjikoal 2006-08-11 06:43:31 UTC
Running with DRI, lock error appears in dmesg.

Running without DRI (for example by changing the permissions of dri/card0), X and
the computer still hang, but no drm_lock_take error appears in dmesg. So maybe my
problem is not related to DRI at all.
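
To compare runs with and without DRI, a saved copy of the dmesg output can be scanned for the message. The following is a minimal sketch, assuming the message has exactly the form quoted in this report's summary; the function name and the sample text are hypothetical, and the leading number is read here as the ID of whatever holds the lock, which is an interpretation rather than something stated in this report.

```python
import re

# Pattern for the message quoted in the bug summary, for example:
# "[drm:drm_lock_take] *ERROR* 3 holds heavyweight lock"
LOCK_ERROR = re.compile(
    r"\[drm:drm_lock_take\] \*ERROR\* (\d+) holds heavyweight lock"
)


def find_lock_errors(log_text):
    """Return the number reported in each drm_lock_take error line."""
    return [int(match.group(1)) for match in LOCK_ERROR.finditer(log_text)]


# Hypothetical sample; in practice pass in the saved output of `dmesg`.
sample = "some other line\n[drm:drm_lock_take] *ERROR* 3 holds heavyweight lock\n"
print(find_lock_errors(sample))
```

An empty list for a run without DRI and a non-empty list for a run with DRI would match the behaviour described above.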

My own app runs correctly on NVIDIA-based computers with xorg-6.8.2 and
xorg-server-1.0.2. But since they use NVIDIA cards, I cannot tell whether this
is an ATI driver problem.

The problem also occurs on two other computers with the same ATI card and:
Kernel 2.6.16 - xorg 1.1.0
Kernel 2.6.15 - xorg 1.1.0

If needed, I can attach an strace of the problematic app.
Comment 4 Obmun Hjikoal 2006-08-11 07:16:52 UTC
Tested with a Radeon 9000 (RV250) on one of the computers with the Radeon 9200 +
xorg 1.1.0, and the behaviour is the same. With DRI, drm_lock_take appears in
dmesg. Without it, there is no info in dmesg. In both cases, X hangs. This time
no cursor is shown.
Comment 5 mrsteven 2006-10-14 09:55:45 UTC
I can confirm this: With correct permissions to /dev/dri/card0, the X server 
freezes but the mouse is still moving, and the locking message is written to 
the logfile.

When I change these permissions, it freezes too, but the locking message is 
not written. I see that the intro of racer still runs fast enough (maybe 
because of AIGLX).

Now the interesting part: I have a Mobility Radeon 9600, so I use the 
experimental r300 driver. If I start racer with the environment variable 
R300_SPAN_DISABLE_LOCKING set to 1, it runs fine.
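
A minimal sketch of launching the game with that variable set follows. The variable name is the one given above; the binary path `./racer.bin` and both function names are assumptions made for illustration.

```python
import os
import subprocess


def env_with_locking_disabled(base_env=None):
    """Copy an environment and set R300_SPAN_DISABLE_LOCKING to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["R300_SPAN_DISABLE_LOCKING"] = "1"
    return env


def run_racer(binary="./racer.bin"):
    """Start the game with the workaround applied (path is hypothetical)."""
    return subprocess.run([binary], env=env_with_locking_disabled())


# Show the variable as the child process would see it.
print(env_with_locking_disabled({"HOME": "/tmp"})["R300_SPAN_DISABLE_LOCKING"])
```

The variable is set only for the child process, so the rest of the session is unaffected.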
Comment 6 Michel Dänzer 2006-10-14 10:16:32 UTC
(In reply to comment #5)
> Now the interesting part: I have a Mobility Radeon 9600, so I use the 
> experimental r300 driver. If I start racer with the environment variable 
> R300_SPAN_DISABLE_LOCKING set to 1, it runs fine.

This indicates that the problem is due to a crash in a software fallback or
similar. If that's the case, killing the application should allow the X server
to continue normally, and you should try to get a backtrace of the crash (from
a remote login).
Comment 7 mrsteven 2006-10-14 10:50:02 UTC
After killing racer.bin, the system continues to run normally.

I'll try to get a backtrace in a few days. How exactly do I do that? Is it ok 
to kill the X-Server with SIGSEGV or SIGABRT? Or do I have to use the debugger 
gdb?

Besides, wouldn't this be better assigned to the mesa-dev 
(mesa3d-dev@lists.sourceforge.net) mailing list?
Comment 8 Michel Dänzer 2006-10-16 08:01:31 UTC
(In reply to comment #7)
> I'll try to get a backtrace in a few days. How exactly do I do that? Is it ok 
> to kill the X-Server with SIGSEGV or SIGABRT? Or do I have to use the debugger 
> gdb?

Killing the X server certainly won't help to get a backtrace from the
application. :) Attach gdb to the app after it happens or run the app from
within gdb to boot.
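
A minimal sketch of the attach approach, suitable for running from a remote login, is shown below. It assumes gdb is installed and that the caller is allowed to attach to the process; the function names and the PID 1234 are hypothetical. The gdb options used are `-p` (attach to a PID), `-batch` (exit after running the commands) and `-ex` (run one command).

```python
import subprocess


def gdb_attach_backtrace_command(pid):
    """Build a gdb command line that attaches to pid and prints all backtraces."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def backtrace_of(pid):
    """Run gdb against a live process and return what it printed."""
    result = subprocess.run(
        gdb_attach_backtrace_command(pid), capture_output=True, text=True
    )
    return result.stdout


# Only print the command here; 1234 is a placeholder PID.
print(gdb_attach_backtrace_command(1234))
```

Because gdb runs in batch mode, no interactive prompt is needed on the remote terminal.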
Comment 9 mrsteven 2006-10-16 12:40:59 UTC
Created attachment 7435 [details]
backtrace of the problem

So, here you are, but it doesn't look that promising. Note that it freezes as
soon as racer wants to switch to the main game screen.

I was able to kill racer and the system continued to run normally.
Comment 10 Michel Dänzer 2006-10-17 02:49:55 UTC
Thanks, but again, we need a backtrace from *the application*, i.e. racer in
this case.
Comment 11 mrsteven 2006-10-20 03:35:40 UTC
Created attachment 7475 [details]
backtrace of racer.bin

Oops, I'm sorry about that! I should really read what other people write... o_O


Getting this was not that easy, because gdb didn't show its prompt after using
attach or when starting it with "gdb racer.bin $(pidof racer.bin)". Maybe it
couldn't stop the running process...
So I killed racer.bin with SIGSEGV (fortunately this caused it to terminate)
and I could get the backtrace from the core dump.
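
The core dump route described above can be sketched as follows. This is an illustration only: it assumes core dumps are enabled on the system and that the core file ends up at the path passed in. The file name `core` and the function names are hypothetical.

```python
import os
import signal


def request_core_dump(pid):
    """Send SIGSEGV so the process terminates and, if enabled, dumps core."""
    os.kill(pid, signal.SIGSEGV)


def gdb_core_backtrace_command(binary, core_file):
    """Build a gdb command line that prints a backtrace from a core file."""
    return ["gdb", "-batch", "-ex", "bt", binary, core_file]


# Only print the command here; both file names are placeholders.
print(gdb_core_backtrace_command("racer.bin", "core"))
```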
Comment 12 Michel Dänzer 2006-10-20 05:45:12 UTC
(In reply to comment #11)
> Getting this was not that easy, because gdb didn't show its prompt after using
> attach or when starting it with "gdb racer.bin $(pidof racer.bin)". 

Probably because it was still running at that point, in which case Ctrl-C should
have given you a prompt.

Thanks for the backtrace, it looks like the r300 driver attempts to grab the
hardware lock recursively.
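
To illustrate why a recursive grab is a problem, here is a minimal analogy in Python. It is not the r300 or DRM code: `threading.Lock` merely stands in for a lock that cannot be taken twice by the same holder, and `threading.RLock` for one that can. The function name is hypothetical.

```python
import threading


def take_lock_twice(lock):
    """Acquire a lock, then try to acquire it again from the same thread.

    Non-blocking calls are used so the example cannot hang; with a blocking
    second acquire, a non-recursive lock would wait forever.
    """
    first = lock.acquire(blocking=False)
    second = lock.acquire(blocking=False)
    # Release once per successful acquire.
    if second:
        lock.release()
    if first:
        lock.release()
    return first, second


print(take_lock_twice(threading.Lock()))   # non-recursive lock
print(take_lock_twice(threading.RLock()))  # re-entrant lock
```

With the non-recursive lock the second attempt fails, which is the situation a caller that waits for the lock never gets out of.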

Does setting the environment variable R300_SPAN_DISABLE_LOCKING for running
racer work around this?
Comment 13 mrsteven 2006-10-21 09:42:02 UTC
(In reply to comment #12)
> Probably because it was still running at that point, in which case Ctrl-C should
> have given you a prompt.
No, Ctrl+C killed the process completely, i.e. the process no longer existed.
Comment 14 chemtech 2013-03-15 14:22:21 UTC
pmhere,
Do you still experience this issue with newer software?
Please check the status of your issue.
Comment 15 mrsteven 2013-03-15 17:30:14 UTC
Though I am not pmhere, I want to inform you that for me this problem has been gone for a long while now. Don't know exactly with which software versions. Sorry for not posting earlier.

