Bug 8818 - total system freeze in quake 3 & vtk
Status: RESOLVED INVALID
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/r300
Version: 6.5
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assigned To: Default DRI bug account
Reported: 2006-10-29 14:09 UTC by Alexey Spiridonov
Modified: 2010-02-25 00:21 UTC (History)

Attachments
X.org start-up log (43.07 KB, text/plain)
2006-10-29 14:10 UTC, Alexey Spiridonov
glxinfo &> glxinfo.txt (4.61 KB, text/plain)
2006-10-29 14:10 UTC, Alexey Spiridonov

Description Alexey Spiridonov 2006-10-29 14:09:25 UTC
Here are my system specs:

Gentoo + xorg 7.1 + Mesa 6.5.1 (vanilla compilation, no special effects)
Vanilla kernel 2.6.18
PCIE ATI Technologies Inc M22 [Radeon Mobility M300]

Here are the relevant bits of xorg.conf:

Section "Module"
   Load "dri"
   Load "glx"
   ...
EndSection

Section "Device"
   Identifier  "ATI Graphics Adapter 0"
   Driver      "radeon"
   BusID       "PCI:1:0:0"
   # The crash happens with or without this option.
   Option      "DynamicClocks" "on"
EndSection

I will attach an Xorg log, and a glxinfo dump. 

This problem doesn't happen with the proprietary drivers, or if r300_dri.so is
not loaded (moving it or disabling DRI does the trick).

glxgears runs fine, and reasonably smoothly (750 fps reported). I see small
glitches in the framerate every second or so, but I see similar, much worse
effects without DRI, so this is probably unrelated.

Many other simple demos run fine too. 

I can start the Quake3 demo, and the intro & menus work, and at a reasonable speed
too. I create a game, it loads, and at the point when the level is supposed to
be first rendered, I get a total system hang. 

Also, VTK (the Visualization Toolkit) has a pseudo-volumetric rendering demo, which
renders a fair number of polygons. This program also triggers the same hang,
although on one occasion I had to run it twice before it hung. When the program
didn't crash, the rendering was incorrect.

The same volumetric rendering program works fine (although a bit slowly) without
DRI. Quake3 is way too slow to try.

I'm willing to compile debug versions of stuff and run gdb on it, but I would
need a bit of guidance as to the best set-up and procedures.
Comment 1 Alexey Spiridonov 2006-10-29 14:10:11 UTC
Created attachment 7574 [details]
X.org start-up log
Comment 2 Alexey Spiridonov 2006-10-29 14:10:33 UTC
Created attachment 7575 [details]
glxinfo &> glxinfo.txt
Comment 3 Alexey Spiridonov 2006-10-29 16:42:22 UTC
Actually, I just recompiled my X server (I thought about the issue being caused
by switching from gcc 3.4.6 to gcc 4.1.1), and while the VTK demo still renders
wrong, I haven't been able to get it to crash now. Quake 3 crashes just as before.

The VTK demo also says this:

*********************************WARN_ONCE*********************************
File r300_vertexprog.c function t_dst_index line 184
Unknown output 13
***************************************************************************
*********************************WARN_ONCE*********************************
File radeon_mm.c function radeon_mm_alloc line 216
Ran out of GART memory!
Please consider adjusting GARTSize option.
***************************************************************************

Also, dmesg reports some information:

Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:01:00.0 to 64
[drm] Initialized radeon 1.25.0 20060524 on minor 0:
mtrr: 0xd0000000,0x8000000 overlaps existing 0xd0000000,0x2000000
mtrr: 0xd0000000,0x8000000 overlaps existing 0xd0000000,0x2000000
mtrr: 0xd0000000,0x8000000 overlaps existing 0xd0000000,0x2000000
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] writeback test succeeded in 1 usecs
[drm] Loading R300 Microcode
[drm] Loading R300 Microcode
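
For reference, the two ranges named in the mtrr lines above can be compared with a little arithmetic. This is only an illustrative sketch: it assumes each pair in the message is a base address followed by a size in bytes, and the helper names are made up for this example.

```python
def describe_range(base: int, size: int) -> str:
    """Format a (base, size) pair as an inclusive address range with its size in MiB."""
    end = base + size - 1
    return f"0x{base:08x}-0x{end:08x} ({size // (1024 * 1024)} MiB)"


def ranges_overlap(base_a: int, size_a: int, base_b: int, size_b: int) -> bool:
    """True if the two half-open ranges [base, base + size) share any address."""
    return base_a < base_b + size_b and base_b < base_a + size_a


# Values copied from the dmesg output above.
requested = (0xD0000000, 0x8000000)
existing = (0xD0000000, 0x2000000)

print("requested:", describe_range(*requested))
print("existing: ", describe_range(*existing))
print("overlap:  ", ranges_overlap(*requested, *existing))
```

Under that reading, a 128 MiB range is being requested at the same base as an existing 32 MiB range, which matches the "overlaps existing" wording. Whether this warning has anything to do with the hang is not established in this report.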
Comment 4 Alexey Spiridonov 2006-10-29 17:25:27 UTC
Sorry for all the spam; this should be the last message, because I'm out of ideas.

Firstly, Quake 3 demo still crashes :( But,

I looked up the GARTSize error message, and found advice to set it to 64. I did,
and that got rid of the error in the VTK app. It still renders wrong, probably
because of the "Unknown output" message. Searching about that message suggests
that it has to do with shaders. However, this VTK app works just fine with fglrx. 
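
For anyone hitting the same radeon_mm_alloc warning, the change described above would look roughly like the fragment below. It is a sketch based on the Device section from the original description: the value 64 (in MB) is simply the one that worked here, and the other lines are copied from that section and will differ on other machines.

```
Section "Device"
   Identifier  "ATI Graphics Adapter 0"
   Driver      "radeon"
   BusID       "PCI:1:0:0"
   # GART size in MB. 64 removed the "Ran out of GART memory!" warning
   # on this system; 128 made DRI initialization fail (see below).
   Option      "GARTSize" "64"
EndSection
```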

Increasing GARTSize also lets me start more than 2 glxgears, which is nice,
although not very useful :) 

Incidentally, fglrx sets its GART size to 128 MB. I tried doing that for the
radeon driver, but DRI failed to initialize with "out of memory (-2)". Why is it
broken, and why do I need to mess with GARTSize anyway?

Aside from larger GART sizes, the log files & glxinfo remain essentially the
same (I diffed), so I'm not uploading anything new.

Thanks for your patience :)
Comment 5 Roland Scheidegger 2006-10-30 12:36:10 UTC
(In reply to comment #4)
> I looked up the GARTSize error message, and found advice to set it to 64. I did,
> and that got rid of the error in the VTK app. It still renders wrong, probably
> because of the "Unknown output" message. Searching about that message suggests
> that it has to do with shaders. However, this VTK app works just fine with fglrx.
output 13 is back facing color 0. The driver doesn't handle that yet. Could
cause wrong colors (but only if the program also enables two-side lighting for
vertex progs).

> Increasing GARTSize also lets me start more than 2 glxgears, which is nice,
> although not very useful :) 
> 
> Incidentally, fglrx sets its GART size to 128 MB. I tried doing that for the
> radeon driver, but DRI failed to initialize with "out of memory (-2)". Why is it
> broken, and why do I need to mess with GARTSize anyway?
You need to mess with it because it's currently a static allocation. So if you
set it to 128MB, this memory is lost even if you never run a 3d app, and if you
have, say, only 256MB ram you would probably be annoyed if the driver just would
set it to 128MB... Not sure why you couldn't set it to 128MB - you can however
not set it to a larger size than what you've set in the bios (actually that
shouldn't be true on your system, only if it's agp).
Comment 6 Alexey Spiridonov 2006-10-30 13:23:48 UTC
More news:

I compiled and ran the Mesa demos. None of them crash the system, but a few of
them make it unresponsive for a few seconds during start-up. Then, these print: 
  Try R300_SPAN_DISABLE_LOCKING env var if this hangs.
Setting that variable to "1" eliminates that temporary hang in the demos, but it
doesn't help with Quake3. Also, stex3d segfaults after initialization, right at
the point when it tries to render the graphics (X remains fine).

I just discovered that the system is not _totally_ frozen by Quake3. The
keyboard, mouse & video all do freeze, but network remains responsive. It turns
out that Quake keeps running, and can be terminated by a few SIGTERM/SIGHUPs,
but X never recovers (and pegs the CPU at 100%). So, I'd like to debug this
remotely, but I need help. 
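
Since the machine stays reachable over the network, the stuck game can be cleaned up from a remote shell. The script below is a minimal sketch of that step only, assuming a Linux /proc filesystem and Python 3 on the affected machine. The process name "quake3.x86" is a guess used for illustration, and it escalates to SIGKILL rather than repeating SIGTERM/SIGHUP as described above.

```python
import os
import signal
import time
from typing import List


def find_pids(name: str) -> List[int]:
    """Return PIDs whose /proc/<pid>/comm equals name (Linux only)."""
    pids: List[int] = []
    if not os.path.isdir("/proc"):
        return pids
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as comm_file:
                if comm_file.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


def stop_process(pid: int, grace_seconds: float = 5.0) -> None:
    """Send SIGTERM, then SIGKILL if the PID still exists after the grace period."""
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return
        time.sleep(0.2)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # it went away just before the SIGKILL


if __name__ == "__main__":
    # Hypothetical process name; adjust to whatever `ps` shows for the game.
    for pid in find_pids("quake3.x86"):
        stop_process(pid)
```

This only removes the client process. As noted above, X itself does not recover afterwards, so it does nothing to diagnose the hang.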

What do I build in debug mode (I assume: xserver, Mesa)? Are there special
options I should enable, or other things to be aware of? How can I find out which
code path is triggered that causes the problem? 

(In reply to comment #5)

Thanks for your thoughts. My responses are inline.

> output 13 is back facing color 0. The driver doesn't handle that yet. Could
> cause wrong colors (but only if the program also enables two-side lighting for
> vertex progs).

All right, that makes some sense -- the proper rendering in that program
involves having some polygons be fully transparent from one side, and
semi-opaque from the other. In any case, it's probably not related to the crash.
Would this be hard for me (completely new to the codebase) to implement?

> You need to mess with it because it's currently a static allocation. So if you
> set it to 128MB, this memory is lost even if you never run a 3d app, and if you
> have, say, only 256MB ram you would probably be annoyed if the driver just would
> set it to 128MB... Not sure why you couldn't set it to 128MB - you can however
> not set it to a larger size than what you've set in the bios (actually that
> shouldn't be true on your system, only if it's agp).

Aha, fair enough. Would there be any benefit to having 128MB over 64MB, given
that my card has 32MB on-board RAM, and is labeled "64MB HyperMemory"? 
Comment 7 Roland Scheidegger 2006-10-30 14:46:03 UTC
(In reply to comment #6)
> All right, that makes some sense -- the proper rendering in that program
> involves having some polygons be fully transparent from one side, and
> semi-opaque from the other. In any case, it's probably not related to the crash.
> Would this be hard for me (completely new to the codebase) to implement?
I don't really know how r300 implements two-sided lighting. I think someone
suggested that you'd just set up another color interpolator, and use some face
register to determine which interpolator to use in the fragment program. Then
again, I haven't seen any evidence r300 actually has a face register... You'd
probably need to do some reverse engineering (things like glxtest with a test
prog using fglrx to see how things are set up). There are probably easier things
to do... (note that the r200 driver doesn't support two-sided lighting in
vertex program mode either, though for this chip this appears to be a chip
limitation; at least fglrx can't do it either, it just uses a sw fallback in
this case)
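
The idea mentioned above (two interpolated colors, with a per-primitive facing value choosing between them) can be illustrated outside the driver. The following is a conceptual sketch in Python, not r300 code: the function names are invented for illustration, and it assumes OpenGL's default convention that counter-clockwise triangles in window coordinates are front-facing.

```python
def signed_area(v0, v1, v2):
    """Twice the signed area of a 2D triangle; positive for counter-clockwise winding."""
    return (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v2[0] - v0[0]) * (v1[1] - v0[1])


def is_front_facing(v0, v1, v2):
    """Counter-clockwise winding counts as front-facing (the OpenGL default)."""
    return signed_area(v0, v1, v2) > 0


def select_color(v0, v1, v2, front_color, back_color):
    """Pick the front or back color for a triangle, as two-sided lighting would."""
    return front_color if is_front_facing(v0, v1, v2) else back_color


front = (1.0, 0.0, 0.0, 0.5)  # semi-opaque from the front
back = (0.0, 0.0, 0.0, 0.0)   # fully transparent from the back

ccw = [(0, 0), (1, 0), (0, 1)]
cw = [(0, 0), (0, 1), (1, 0)]
print(select_color(*ccw, front, back))
print(select_color(*cw, front, back))
```

The open question raised above is where the hardware exposes the facing value to the fragment program, which this sketch does not address.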

> > You need to mess with it because it's currently a static allocation. So if you
> > set it to 128MB, this memory is lost even if you never run a 3d app, and if you
> > have, say, only 256MB ram you would probably be annoyed if the driver just would
> > set it to 128MB... Not sure why you couldn't set it to 128MB - you can however
> > not set it to a larger size than what you've set in the bios (actually that
> > shouldn't be true on your system, only if it's agp).
> 
> Aha, fair enough. Would there be any benefit to having 128MB over 64MB, given
> that my card has 32MB on-board RAM, and is labeled "64MB HyperMemory"? 
That would depend on the application. AFAIK gart memory is only used for vertex
buffers currently, and 64MB is quite a lot of vertex data. So I guess the answer
in general is probably no, unless you try to run multiple instances of ut2k4 or
something.
Comment 8 Maciej Cencora 2009-04-16 10:15:26 UTC
Can you reproduce it with mesa 7.4 or master?
Comment 9 Alexey Spiridonov 2009-05-08 21:41:38 UTC
I tried reproducing the bug with Mesa 7.2 on Ubuntu 8.10 -- it crashes just as before.

I then downloaded Ubuntu 9.04, which has Mesa 7.4. However, Quake 3 simply did not run (I tried using the live CD). It failed with an X11 BadMatch error in XWindowCreate. Not sure what's going on there --- probably some sort of OpenGL version incompatibility.

glxgears works fine from the live CD. However, the fancy Compiz effects that are on by default in 9.04 are all totally broken. Moreover, some of them (fades) crash the X server.

So, at the moment, at any rate, 7.4 / r300 seems fairly unusable to me.

Would you suggest any specific tests/fixes?
Comment 10 Maciej Cencora 2009-05-21 05:31:45 UTC
(In reply to comment #9)
> I tried reproducing the bug with Mesa 7.2 on Ubuntu 8.10 -- it crashes just as
> before.
> 
> I then downloaded Ubuntu 9.04, which has Mesa 7.4. However, Quake 3 simply did
> not run (I tried using the live CD). It failed with an X11 BadMatch error in
> XWindowCreate. Not sure what's going on there --- probably some sort of OpenGL
> version incompatibility.
> 
> glxgears works fine from the live CD. However, the fancy Compiz effects that
> are on by default in 9.04 are all totally broken. Moreover, some of them
> (fades) crash the X server.
> 
> So, at the moment, at any rate, 7.4 / r300 seems fairly unusable to me.
> 
> Would you suggest any specific tests/fixes?
> 

There have been many changes in the radeon-rewrite branch of Mesa, so it's probable that your bug has been fixed there.
Comment 11 Maciej Cencora 2009-09-19 02:48:04 UTC
Closing due to lack of user input.
Comment 12 Alexey Spiridonov 2010-02-25 00:21:45 UTC
The driver indeed works better in Ubuntu 9.10, but Quake 3 remains unplayable, and there are reproducible crashes. Tracking these in this bug:

http://bugs.freedesktop.org/show_bug.cgi?id=26480