| Summary: | Lock up w/ r200, TCL, GL apps | | |
|---|---|---|---|
| Product: | Mesa | Reporter: | jor <j2o3r> |
| Component: | Drivers/DRI/r200 | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | major | | |
| Priority: | high | CC: | jaak, michel, n0nb, npeninguy, prestonbridge, sami.nieminen |
| Version: | unspecified | | |
| Hardware: | x86 (IA32) | | |
| OS: | Linux (All) | | |
| URL: | http://localhost | | |
| Whiteboard: | | | |
| i915 platform: | | i915 features: | |
| Attachments: | A trace of the X server when it is locked up | | |
Description
jor
2004-09-02 13:59:57 UTC
I am experiencing the same. X hangs when running glxgears or Enemy Territory. I am also able to log in using ssh from another computer, but it's not possible to kill X; I have to reboot. This happens for me only with 6.8.0 (it also happened with the 6.7.99 snapshots), but 6.7.0 works great. I haven't tried disabling TCL (I don't know how to do that).

You can disable TCL by exporting R200_NO_TCL, e.g. in a Bash shell you could do:

    export R200_NO_TCL=1
    glxinfo

It should show "NO-TCL" in the "OpenGL renderer string". You can then try running glxgears in that shell/terminal and see if it locks up with TCL disabled. For a more permanent solution, you can also use the driconf method to disable TCL for all OpenGL applications; see http://dri.sourceforge.net/cgi-bin/moin.cgi/ConfigurationInfrastructure for more info about that.

Thanks, I tried disabling TCL; unfortunately that did not help. Enemy Territory still hangs my machine after a few minutes.

I also have this problem. It would freeze regularly when running certain screensavers.

Created attachment 1761 [details]
A trace of the X server when it is locked up
I took this trace of the X server from another machine after the server locked
up. When I tried killing the server, the whole system froze.
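
The R200_NO_TCL check suggested earlier in the thread can be sketched as a small script. This is only an illustration: the helper names (`renderer_line`, `tcl_disabled`, `run_glxinfo_without_tcl`) are hypothetical, and it assumes, as that comment describes, that the driver puts "NO-TCL" into the "OpenGL renderer string" line printed by glxinfo.

```python
import os
import subprocess
from typing import Optional


def renderer_line(glxinfo_output: str) -> Optional[str]:
    """Return the 'OpenGL renderer string' line from glxinfo output, if present."""
    for line in glxinfo_output.splitlines():
        if line.startswith("OpenGL renderer string"):
            return line.strip()
    return None


def tcl_disabled(glxinfo_output: str) -> bool:
    """True if the renderer string carries the NO-TCL marker mentioned in the thread."""
    line = renderer_line(glxinfo_output)
    return line is not None and "NO-TCL" in line


def run_glxinfo_without_tcl() -> str:
    """Run glxinfo with R200_NO_TCL=1 set only for that child process."""
    env = dict(os.environ, R200_NO_TCL="1")
    result = subprocess.run(
        ["glxinfo"], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout
```

On an affected machine, `tcl_disabled(run_glxinfo_without_tcl())` would be expected to return True if the variable took effect; the variable is set only for the glxinfo child process, so other applications still need it exported in their own environment.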
Does disabling Render acceleration work around the problem?

I get what seems to be the same bug (at least, an strace shows it hammering getparam, and I can't reproduce it with NO-TCL), but it usually doesn't happen immediately, and I haven't yet reproduced it with glxgears (StepMania and icculus-quake2 seem to do a fine job if left running for a bit, though those are far from ideal testcases). I can regain limited control over the system with magic SysRQ (unraw + kill), but only enough to cleanly reboot. Søren, how did you get that stack trace? If I try to attach gdb to my X server when this happens, gdb just hangs.

Also, in case this is a hw-dependent bug, lspci has this to say about the card (it may be worth noting that it identifies as a Radeon LE / R200QL despite the fact that the card itself is labeled as a regular 8500...):

0000:01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE] (prog-if 00 [VGA])
    Subsystem: ATI Technologies Inc Radeon 8500
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min), Cache Line Size: 0x08 (32 bytes)
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
    Region 1: I/O ports at c000 [size=256]
    Region 2: Memory at ec020000 (32-bit, non-prefetchable) [size=64K]
    Capabilities: [58] AGP version 2.0
        Status: RQ=48 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
        Command: RQ=32 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x1
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

and the AGP bridge (VIA KT400, any way to get a revision / stepping or similar?):

0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
    Subsystem: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
    Latency: 8
    Region 0: Memory at e8000000 (32-bit, prefetchable) [size=64M]
    Capabilities: [a0] AGP version 2.0
        Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW- AGP3- Rate=x1,x2,x4
        Command: RQ=32 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x1
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

A lot has happened in the R200 driver since May 2005. Are people still able to reproduce this problem with recent drivers?

Almost instant lockup here when running Neverwinter Nights, using 6.9.0 RC 2, on Ubuntu 5.10.

    (II) Primary Device is: PCI 01:00:0
    (II) ATI: Candidate "Device" section "ATI Technologies, Inc. Radeon 9200 (RV280)".
    (WW) RADEON: No matching Device section for instance (BusID PCI:1:0:1) found
    (--) Chipset ATI Radeon 9200 5961 (AGP) found

With X.org 6.8.2, lockups happen after about 30 minutes. Will try with TCL disabled later...

Seems OK with TCL disabled, though quite a bit slower (but I suppose that's normal...)

I can still reproduce this as of 7.0rc3 and kernel 2.6.14.

I can confirm that disabling TCL through driconf makes the problem go away completely.

I also tried with Option "CCEusecTimeout" "20000", while having tcl_mode set to 3, and it seems to help quite a lot: I no longer get hangs with gl-117 (tested for 2 hours; before, it crashed after anywhere from a few seconds to, on rarer occasions, about 20-30 minutes) or with the GL screensavers (which always hung X). However, vegastrike makes X hang within a few seconds, as soon as I start spinning around; this works fine without TCL (played for at least 5 hours).
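
For readers who want to set tcl_mode without the driconf GUI, the per-user setting can be sketched as below. This is a rough illustration under stated assumptions: the element layout (`driconf` / `device` / `application` / `option`) is what driconf is assumed to write to ~/.drirc, the function name `drirc_with_tcl_mode` is hypothetical, and the only tcl_mode values taken from this thread are 0 (reported here as TCL disabled) and 3. The script only builds the text and does not touch any existing configuration file.

```python
import xml.etree.ElementTree as ET


def drirc_with_tcl_mode(tcl_mode: int, driver: str = "r200", screen: int = 0) -> str:
    """Build a minimal drirc-style XML document that sets tcl_mode.

    The element and attribute names are an assumption about the file
    driconf generates; compare with a file written by driconf itself.
    """
    root = ET.Element("driconf")
    device = ET.SubElement(root, "device", screen=str(screen), driver=driver)
    # "Default" is assumed to be the section applied to all applications.
    app = ET.SubElement(device, "application", name="Default")
    ET.SubElement(app, "option", name="tcl_mode", value=str(tcl_mode))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print rather than write, so an existing ~/.drirc is never overwritten.
    print(drirc_with_tcl_mode(0))
```

Printing the result and merging it by hand keeps any other options already stored in the file intact.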
(In reply to comment #12)
> I can confirm that disabling TCL through driconf makes the problem go away
> completely.
>
> I also tried with Option "CCEusecTimeout" "20000", while having tcl_mode set to
> 3, and it seems to help quite a lot:

Do you mean CPusecTimeout? CCEusecTimeout only seems to be recognized by the r128, but not the radeon driver.

(In reply to comment #13)
> Do you mean CPusecTimeout? CCEusecTimeout only seems to be recognized by the
> r128, but not the radeon driver.

Unfortunately not - I had a Rage 128 Pro before buying the Radeon 9250 (I was hoping to get rid of similar hangs with r128, having read that the 9250 has the best-supported open-source driver... :). I just inserted the option for r128, thinking that at least part of the code is similar. None of the CCE options are documented in the Ubuntu man pages for X.org, and the X.org site seems to have the same (old?) documentation. Sorry about this, I should have grepped for WW.

I tried CPusecTimeout, but it still hangs in vegastrike, within a couple of minutes.

The bad news is that I just got an X.org hang with tcl_mode 0 (no CPusecTimeout, same config as over the weekend), in gl-117. Maybe it happens less often with TCL disabled, or maybe I was just lucky over the weekend, as the CCE placebo seems to indicate. :) gl-117 was running in a window, not full screen, and X froze as soon as I got another window on top of the gl-117 one (mail notification). Is there anything else I could try?

I have posted a couple of comments in bug 2999 regarding lockups I have received whenever trying to run OpenGL xscreensavers on my RADEON 9200. I was advised to check this bug and after disabling TCL, it seems the immediate lockups are gone. I've changed no other settings.

lspci reports:

0000:01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] (rev 01) (prog-if 00 [VGA])
    Subsystem: ATI Technologies Inc RV280 [Radeon 9200 PRO]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 32 (2000ns min), Cache Line Size: 0x08 (32 bytes)
    Interrupt: pin A routed to IRQ 11
    Region 0: Memory at d0000000 (32-bit, prefetchable) [size=256M]
    Region 1: I/O ports at c000 [size=256]
    Region 2: Memory at e3000000 (32-bit, non-prefetchable) [size=64K]
    Expansion ROM at e2000000 [disabled] [size=128K]
    Capabilities: [58] AGP version 2.0
        Status: RQ=80 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
        Command: RQ=32 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x1
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-

I've reproduced this with another card (Radeon 9000) on the same mainboard (Soyo Dragon KT400 / VIA KT400 chipset). Not much to report other than that it's not limited to the particular card/chip.

I've tried the latest (cvs/git) Mesa + XServer + drm + ati-driver and things seem to have improved on the hardware TCL side: with or without R200_NO_TCL=1 it now takes some minutes to get the lockup. Anyway, that still makes OpenGL unusable :-( Is there any active developer able to reproduce it? Could export R200_DEBUG=XXX help to debug this? With which value?

No more lockups with the latest (cvs/git) Mesa + XServer + drm + ati-driver. Note I also updated to the latest BIOS revision of my motherboard (a VIA KT333-based board), and the old BIOS backup failed so I cannot go back to try again...

I'm now running kernel 2.6.18 (also tested with 2.6.17), Xorg 7.1.1 and v6.6.3 of the Xorg ATI driver.
The 8500LE still locks up regularly, but I can't get the 9000 to lock up anymore. The 9000 lockup I reported previously may have been an unrelated problem...

I've got this problem running neverwinter nights on ubuntu 6.10
What's driconf? Can't locate it on my system.
Any suggestions for how to work around or further debug please??
I can recover by logging in on another console and using kill -9 on the nwmain process.
lspci output below.
Using a 19" 1280*1024 screen, newer than the rest of the system - quite a lot of unsupported multiverse games don't work.
Thanks
Chris

$ lspci
00:00.0 Host bridge: Intel Corporation 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
00:01.0 PCI bridge: Intel Corporation 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. G400/G450 (rev 05)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
02:01.0 Communication controller: Agere Systems LT WinModem (rev 02)

(In reply to comment #20)
> I've got this problem running neverwinter nights on ubuntu 6.10

Please be more specific. Not every lockup is the same!

> What's driconf? Can't locate it on my system.

If your distribution doesn't offer it, install it manually.

> Any suggestions for how to work around or further debug please??

Drivers in ubuntu 6.10 are probably quite old, and an update may help.

> I can recover by logging in on another console and using kill -9 on the nwmain
> process.

So it's not really a lockup then.

> lspci output below.
> 01:00.0 VGA compatible controller: Matrox Graphics, Inc. G400/G450 (rev 05)

And apparently it's not even a remotely similar graphics chip. Don't mix this in here.

(In reply to comment #21)
> (In reply to comment #20)
> > I've got this problem running neverwinter nights on ubuntu 6.10
> Please be more specific. Not every lockup is the same!

Screen goes blank, X hangs. ctrl/alt/F1 gets me a console login which I can use to recover.

> > What's driconf? Can't locate it on my system.
> If your distribution doesn't offer it, install it manually.

Never thought of that - I expected it to be part of an existing package. Now installed, thanks. It fails to open any windows - but ctrl/C works on this one. Traceback:

$ driconf
libGL warning: 3D driver claims to not support visual 0x4b
Traceback (most recent call last):
  File "/usr/bin/driconf", line 28, in ?
    driconf.main()
  File "/usr/lib/python2.4/site-packages/driconf.py", line 52, in main
    commonui.dpy = dri.DisplayInfo ()
  File "/usr/lib/python2.4/site-packages/dri.py", line 396, in __init__
    self.getScreen (i)
  File "/usr/lib/python2.4/site-packages/dri.py", line 411, in getScreen
    screen = ScreenInfo (i, self.dpy)
  File "/usr/lib/python2.4/site-packages/dri.py", line 380, in __init__
    self.glxInfo = GLXInfo (screen, dpy)
  File "/usr/lib/python2.4/site-packages/dri.py", line 343, in __init__
    glxInfo = infopipe.read()
KeyboardInterrupt

> > Any suggestions for how to work around or further debug please??
> Drivers in ubuntu 6.10 are probably quite old, and an update may help.

ubuntu update manager shows no upgrades available.
various lib*mesa* packages are at 6.5.1~20060817-0ubuntu3

> > I can recover by logging in on another console and using kill -9 on the nwmain
> > process.
> So it's not really a lockup then.

X hangs, and only switching out of X (ctrl/alt/f1) allows me to do anything.

> > lspci output below.
> > 01:00.0 VGA compatible controller: Matrox Graphics, Inc. G400/G450 (rev 05)
> And apparently it's not even a remotely similar graphics chip. Don't mix this in
> here.

OK - it seemed very similar to #9 to me - I'll open another bug if you're sure it's not a duplicate.

Thanks

> ubuntu update manager shows no upgrades available.
> various lib*mesa* packages are at 6.5.1~20060817-0ubuntu3
As an old, stable release, Ubuntu 6.10 won't see many updates. Upgrading to 7.04 is recommended, and if you really want to help out, please try the 7.10 betas (Gutsy) in parallel, where you can more easily get up-to-date X components.
Do you still have this issue with recent Mesa & Xorg?

(In reply to comment #24)
> Do you still have this issue with recent mesa & Xorg ?

Although I still have the same graphics card, I don't recall any GL problems since my last post in bug 2999.

OK, so I'm marking it as fixed. Please reopen if you experience a similar issue with recent Mesa, kernel, and DDX.