When starting X (with DRI enabled), the screen just goes blank and the keyboard is dead. Xorg consumes 100% CPU and can't be killed. More precisely, the terminal reports it as killed, but an Xorg process with the same PID remains and keeps hogging the CPU. When run through strace (before killing), it just endlessly does:
ioctl(9, 0x6444, 0) = -1 EBUSY (Device or resource busy)
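A side note on the strace line above: the request number 0x6444 can be split into the generic Linux ioctl fields. The sketch below assumes the usual x86/x86_64 bit layout (8 bits number, 8 bits type, 14 bits size, 2 bits direction). Reading the result as the radeon "CP idle" request (DRM_COMMAND_BASE 0x40 plus DRM_RADEON_CP_IDLE 0x04) is an interpretation of the DRM headers, not something stated in this report, and should be checked against the headers of the kernel actually in use.

```python
# Decode a Linux ioctl request number into its _IOC fields.
# Assumed bit layout (generic x86/x86_64 encoding):
#   bits 0-7 number, bits 8-15 type, bits 16-29 size, bits 30-31 direction.

def decode_ioctl(request):
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,
    }

fields = decode_ioctl(0x6444)

# Assumption: driver-specific DRM ioctls start at DRM_COMMAND_BASE = 0x40,
# so the driver-local command index is nr - 0x40.
DRM_COMMAND_BASE = 0x40
driver_nr = fields["nr"] - DRM_COMMAND_BASE

print(f"type={fields['type']!r} nr={fields['nr']:#x} "
      f"size={fields['size']} dir={fields['dir']} driver_nr={driver_nr:#x}")
```

If that reading is right, the server is repeatedly asking the kernel to wait for the command processor to go idle, and each attempt comes back with EBUSY.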
FD #9 is /dev/dri/card0 (as is FD #10 FWIW). My X version is:
root@wombat:~> Xorg -version
X Window System Version 7.1.1
Release Date: 12 May 2006
X Protocol Version 11, Revision 0, Release 7.1.1
Build Operating System: Linux 2.6.9-34.ELsmp x86_64 Red Hat, Inc.
Current Operating System: Linux wombat 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 EST 2006 x86_64
Build Date: 29 January 2007
Build ID: xorg-x11-server 1.1.1-47.5.fc6
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Module Loader present
root@wombat:~> rpm -q xorg-x11-server-Xorg xorg-x11-drv-ati
I found some oddities in the log file, namely that a) the driver was somehow confused by the card's secondary bus ID, b) it didn't recognize the Radeon X800 GTO as such, but as an X850 PRO, and c) it somehow used only 128MB of the card's 256MB:
(II) Primary Device is: PCI 07:00:0
(--) Assigning device section with no busID to primary device
(WW) RADEON: No matching Device section for instance (BusID PCI:7:0:1) found
(--) Chipset ATI Radeon X850 PRO (R480) (PCIE) found
(II) resource ranges after xf86ClaimFixedResources() call:
(II) RADEON(0): Page flipping disabled
(II) RADEON(0): Will try to use DMA for Xv image transfers
(II) RADEON(0): Generation 2 PCI interface, using max accessible memory
(II) RADEON(0): Detected total video RAM=262144K, accessible=131072K (PCI BAR=131072K)
(--) RADEON(0): Mapped VideoRAM: 131072 kByte (256 bit DDR SDRAM)
(II) RADEON(0): Color tiling enabled by default
The output of lspci -v also shows that only 128MB are mapped.
I could get X to work when I disabled DRI (I used "install radeon /bin/false" in /etc/modprobe.conf to do that). I'll attach xorg.conf, Xorg.0.log, strace output (shortened) shortly.
Created attachment 8687
The commented-out parts of xorg.conf are settings I experimented with. The "Module" section wasn't there originally, but X hung without it as well (i.e. with the defaults).
Created attachment 8688
Created attachment 8689
bzip2-compressed strace output ("strace -o ... /usr/bin/Xorg")
Created attachment 8690
Output of "lspci"
Created attachment 8691
Output of "lspci -n"
Created attachment 8692
Output of "lspci -v"
Created attachment 8693
Output of "lspci -vn"
Memory at e8000000 (64-bit, prefetchable) [size=128M]
Your card only exposes 128 MB via a PCI BAR. As such that's all that can be mapped as a CPU accessible framebuffer. The rest of the vram can be used, however it is only accessible via the GPU.
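To make the numbers concrete, here is a minimal sketch that pulls the three sizes out of the "Detected total video RAM" log line quoted earlier and converts them to MB. The function name and the regular expression are illustrative only; they are not part of the driver or of any X.org tool.

```python
import re

# Log line copied from the Xorg.0.log excerpt in this report.
LOG_LINE = ("(II) RADEON(0): Detected total video RAM=262144K, "
            "accessible=131072K (PCI BAR=131072K)")

def parse_vram(line):
    """Return total, CPU-accessible and GPU-only VRAM in MB, or None."""
    match = re.search(
        r"total video RAM=(\d+)K, accessible=(\d+)K \(PCI BAR=(\d+)K\)", line)
    if match is None:
        return None
    total_k, accessible_k, bar_k = (int(g) for g in match.groups())
    return {
        "total_mb": total_k // 1024,
        "cpu_accessible_mb": accessible_k // 1024,
        "pci_bar_mb": bar_k // 1024,
        # Memory the CPU cannot map through the BAR; only the GPU can reach it.
        "gpu_only_mb": (total_k - accessible_k) // 1024,
    }

print(parse_vram(LOG_LINE))
```

For this card that comes out as 256 MB total, 128 MB behind the PCI BAR, and 128 MB reachable only by the GPU, which matches the explanation above.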
Forgot one thing: When killing the X server (with SIGKILL), it somehow releases /dev/dri/card0 (fuser shows it isn't in use anymore) and the process is left with a very short list of open files:
root@wombat:~> pgrep -lf X
root@wombat:~> lsof -p 8593
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
Xorg 8593 root cwd DIR 253,0 4096 1015809 /root
Xorg 8593 root rtd DIR 253,0 4096 2 /
Xorg 8593 root txt unknown /proc/8593/exe
The DRM kernel module is still in use, though:
root@wombat:~> lsmod|grep radeon
radeon 124257 1
drm 99049 2 radeon
PS^2: stracing the process isn't possible anymore at that point (for whatever reason; I've set SELinux to permissive, not enforcing):
root@wombat:~> strace -p 8593
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Can you try a 32 bit X server? If the same problem occurs with that, can you try a 32 bit kernel as well?
(In reply to comment #11)
> Can you try a 32 bit X server? If the same problem occurs with that, can you
> try a 32 bit kernel as well?
When I first set up the machine, I popped in a 32bit Ubuntu (6.10 I believe) which I had lying around to get it bootstrapped. This showed similar symptoms, except that the screen didn't go blank but distorted the logo Ubuntu shows when booting. Unfortunately I don't know which X version they use. If you wish, I'll download a 32bit Fedora Unity LiveCD and test it with that.
Commenting on just one part of your report: in Ubuntu 6.10, which uses X.org 7.1, the ati driver most probably does not recognize your X800 GTO (bug 6796) unless Driver "radeon" is specified, and the stupid usplash doesn't let you see the blue X.org screen that says so. Also, only 2.6.20 kernel has the fix in radeon drm driver that allows anything on my X800 GTO to work regarding 3D. As for Ubuntu, you should try http://cdimage.ubuntu.com/releases/feisty/herd-3/ which has both the 2.6.20 kernel and the patch from bug 6796 applied on top of the 6.6.2 xserver-xorg-video-ati driver. It works on my X800 GTO just fine in 64-bit, including 3D. Naturally, your system configuration might differ in other ways.
X800 GTOs are actually partly rebranded X850 PROs with some pipelines disabled, or something like that.
(In reply to comment #13)
> Also, only 2.6.20 kernel has the fix in radeon drm driver that allows anything
> on my X800 GTO to work regarding 3D.
Do you happen to have a pointer for that specific fix?
(In reply to comment #14)
> Do you happen to have a pointer for that specific fix?
Sorry if that was misleading: the fix is "only in 2.6.20" as far as released kernels go; it's also in mesa/drm git, of course. It was this "Unify radeon offset checking." (http://gitweb.freedesktop.org/?p=mesa/drm.git;a=commitdiff;h=aefc7a34431a8f1540b261e23d8b8d05d824b60a , http://git2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1d6bb8e51dba3db1c15575901022fe72d363e5a4).
Just wanted to point out various difficulties Nils could face so that he wouldn't confuse those with any non-fixed problems he's seeing.
(In reply to comment #15)
> It was this "Unify radeon offset checking."
Thanks. I'm afraid it doesn't make a difference for this bug, because the problem it fixes
* only occurs when the framebuffer lies at the very end of the card's address space, which is not the case in the log file attached here.
* should only matter when starting 3D clients, not when starting the X server.
Are you saying you had similar problems as reported in this bug without that fix?
At any rate, I guess it can't hurt to try it...
What I was saying was that the distorted screen he saw in Ubuntu 6.10 with the card mentioned in comment 12 is not the same problem he's seeing in Fedora. He could try the latest Ubuntu test release to see whether he gets the same symptoms as with Fedora or whether it works okay. Ubuntu 6.10 really looks like it hangs badly, while it is actually just in the wrong graphics mode, trying to show the screen that says "no screens found".
Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.
I tried (32bit) Knoppix 5.2 and found the same problem, i.e. this is not a 32bit vs. 64bit thing.
Can you try xf86-video-ati 6.6.191 and possibly a newer DRM as well? I'm afraid it's unlikely to make a difference though...
*** Bug 11065 has been marked as a duplicate of this bug. ***
Sorry for not responding for so long. I suppose I could check out current ATI driver code from git and wrap it up into a custom xorg-x11-drv-ati package, but I'm not entirely sure about DRM -- I'm on Fedora 7 now and have the kernel-2.6.21-1.3194.fc7 package installed. Whether that gives me a new DRM version, I don't have the faintest clue ;-). I so vote for a mandatory "driver version" field that one could query with modinfo...
(In reply to comment #22)
> I'm not entirely sure about DRM -- I'm on Fedora 7 now and have the
> kernel-2.6.21-1.3194.fc7 package installed. Whether that gives me a new DRM
> version, I don't have the faintest clue ;-).
Probably somewhat newer, but the kernel DRM always lags behind the main drm tree. Would be nice if you could try that.
Is there documentation about where to check out current drm code and how to build it?
(In reply to comment #24)
> Is there documentation about where to check out current drm code and how to
> build it?
I have just compiled the current drm.ko, radeon.ko and r300_dri.so from git (as of 28.06.2007, on Linux 2.6.21-gentoo, X.org 1.3.0). DRI works fine on my X600 (RV380), but it still does *not* on my X800 GTO (R480) card.
Does anyone have an idea by now what could be wrong there? Could I try to tune some driver settings? Can I provide any additional information?
Stephan, does your X800 hang on startup, or something else? This bug is about the startup hang, if you have some other bug, file a new bug and attach any error messages into it, as well as /var/log/Xorg.0.log, glxinfo, dmesg (as attachments, not inline).
Nils: have you tested eg. Fedora 7, does the problem still exist there?
(In reply to comment #27)
> Nils: have you tested eg. Fedora 7, does the problem still exist there?
I have tested it on Fedora 7 GA where it hung, but I'll give it a shot when I'm home again with the updated kernel and X packages.
Timo, everything mentioned here applies to the bug I experience. When starting the X server, the screen stays blank and neither mouse nor keyboard works. Xorg.0.log says that DRI was successfully initialized. I can still log in via ssh. Killing X results in a strange X process which has no memory but eats my CPU. As far as I remember, I could not strace the process before killing it; there were no messages on my screen.
I'm looking for an idea of where one could start tracing this bug. Maybe a register dump after the crash could help?
It's the usual GPU lockup. strace or a debugger won't help at all; you will most likely just see that X is busy-waiting for the GPU. All these GPU lockups are painful. Lately we got a new idea for debugging them, but some code needs to be put together for it. As I have some traveling to do this weekend, I will try to work on that a bit.
Can you name some typical reasons that cause a GPU to lock up?
I forgot to mention that the card triggers several hundred interrupts per second after the lockup.
I simply don't know of a typical reason besides AGP fast writes, but you are on PCIE, so that doesn't apply to you. The strange thing is that the DRM should acknowledge the card's IRQs. Maybe it's not the vblank interrupt; it would be nice to know what the card is sending IRQs for (vblank shouldn't give you more than a hundred interrupts per second).
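One way to put a number on the interrupt rate is to compare two snapshots of /proc/interrupts taken a known interval apart. The sketch below does that on text input; the two sample snapshots are made up for illustration (including the IRQ number 16 and the "radeon" label), so substitute real captures from the affected machine.

```python
def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for token in rest.split()[:n_cpus]:
            if not token.isdigit():
                break  # reached the controller/device description
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals

def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    first, second = parse_interrupts(before), parse_interrupts(after)
    return {irq: (second[irq] - first[irq]) / seconds
            for irq in first if irq in second}

# Hypothetical snapshots taken 5 seconds apart (not data from this report).
BEFORE = (
    "           CPU0       CPU1\n"
    "  0:     500000          0   IO-APIC-edge      timer\n"
    " 16:       1000       2000   IO-APIC-fasteoi   radeon\n"
)
AFTER = (
    "           CPU0       CPU1\n"
    "  0:     505000          0   IO-APIC-edge      timer\n"
    " 16:       1500       3000   IO-APIC-fasteoi   radeon\n"
)

rates = irq_rates(BEFORE, AFTER, seconds=5.0)
print(rates)
```

On a live system the two strings would come from reading /proc/interrupts before and after a short sleep. A rate on the card's IRQ line well above the monitor refresh rate would suggest that something other than vblank is firing.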
Judging from the last comments I believe it's not necessary to drive the machine (which is also a server) against the wall. Ping me if you think I should try it out regardless.
I hope that you'll find something out. Just to update on my experiences, I have the "same" X800 GTO (1002:5d4f), manufacturer is Sapphire, and I have no problems whatsoever. I've now run my machine with both 1GB and 2GB RAM, and there's 256MB on the X800. Both 64-bit and 32-bit are fine, as are DVI+VGA multi-monitor output and 3D desktop / AIGLX. I've also used the card in both ATI chipset (with AMD Athlon 64) and Intel chipset (with Core 2 Duo) motherboard, so I'd guess it's not about specific system configuration unless it turns out all of you have NVIDIA motherboards and it is somehow affecting. At no point have I experienced hangs on X startup.
I'm running Debian 4.0 and Ubuntu 7.04, and have also booted successfully from the Fedora 6 live CD.
(In reply to comment #34)
> motherboard, so I'd guess it's not about specific system configuration unless
> it turns out all of you have NVIDIA motherboards and it is somehow affecting.
Hmm, you might be onto something here. I've tried updating the BIOS to the latest version available as well as fiddling with the one BIOS setting I could find pertaining to how the video card is driven, both to no avail though. The current versions of relevant packages are:
Stephan, do you by any chance have a motherboard similar to mine (ASUS M2N-SLI Deluxe, NVIDIA nForce 570 SLI chipset)?
Is there anything else I should try? I'll bump severity to major as that's how I feel about it right now -- it itches me to finally try out 3D on this card ;-).
Nils, I have an MSI K8N motherboard with NVIDIA's CK804 (SLI) chipset on it. A friend has got an Intel machine... maybe I can do some testing there in order to find out if this is a chipset-specific problem.
Timo, your X800 card is a PCIe one, isn't it?
I have tested my X800 GTO card in conjunction with an Asus P5N-E SLI mainboard (NVidia NForce 650 chipset) and openSuse 10.3. It is exactly the same as on my NForce 4 machine, so either the problem is related to NForce chipsets or I have just a strange graphics card.
(In reply to comment #36)
Did you have the chance to test on your friend's Intel machine yet?
(In reply to comment #38)
> (In reply to comment #36)
> Did you have the chance to test on your friend's Intel machine yet?
Yes, the test mentioned in #37 was done on that machine, which consists of an Intel Core 2 Duo CPU and (unfortunately) an NForce chipset (like mine). Do we by now have any chance to determine the culprit? I don't know anyone with a different chipset on which I could do further testing.
can you try the latest drm git repo?
I just fixed a bug in the PCIE gart table handling that showed up playing on some newer hardware, I wonder could it affect your hardware...
(In reply to comment #40)
> can you try the latest drm git repo?
> I just fixed a bug in the PCIE gart table handling that showed up playing on
> some newer hardware, I wonder could it affect your hardware...
Unbelievable! I just built the latest mesa/drm on Ubuntu 7.10 and it works now! Unfortunately, AIGLX doesn't work anymore, but that's a separate issue which will surely get fixed soon:
(EE) AIGLX error: dlsym for __driCreateNewScreen_20050727 failed (/usr/lib/dri/r300_dri.so: undefined symbol: __driCreateNewScreen_20050727)
(EE) AIGLX: reverting to software rendering
(In reply to comment #41)
> (EE) AIGLX error: dlsym for __driCreateNewScreen_20050727 failed
You need xserver Git for AIGLX with mesa Git.