Bug 9957 - X hangs at startup with Radeon X800 GTO (PCIe) with DRI
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: 7.1 (2006.05)
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
Duplicates: 11065

Reported: 2007-02-12 15:20 UTC by Nils Philippsen
Modified: 2007-11-02 06:47 UTC

Attachments
/etc/X11/xorg.conf (1.17 KB, text/plain)
2007-02-12 15:22 UTC, Nils Philippsen
/var/log/Xorg.0.log (74.08 KB, text/plain)
2007-02-12 15:22 UTC, Nils Philippsen
bzip2-compressed strace output ("strace -o ... /usr/bin/Xorg") (103.73 KB, application/x-bzip)
2007-02-12 15:23 UTC, Nils Philippsen
Output of "lspci" (2.27 KB, text/plain)
2007-02-12 15:24 UTC, Nils Philippsen
Output of "lspci -n" (927 bytes, text/plain)
2007-02-12 15:24 UTC, Nils Philippsen
Output of "lspci -v" (11.27 KB, text/plain)
2007-02-12 15:25 UTC, Nils Philippsen
Output of "lspci -vn" (9.00 KB, text/plain)
2007-02-12 15:25 UTC, Nils Philippsen

Description Nils Philippsen 2007-02-12 15:20:23 UTC
When starting X (with DRI enabled), the screen just goes blank and the keyboard is dead. Xorg consumes 100% CPU and can't be killed; well, it doesn't really go away: the terminal shows it as killed, but an Xorg process with the same PID keeps hogging the CPU. When run through strace (before killing), it just endlessly does:

ioctl(9, 0x6444, 0)                     = -1 EBUSY (Device or resource busy)
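
The request number in that strace line can be taken apart by hand. On Linux (x86 and most other architectures) an ioctl request packs a direction, an argument size, a type byte and a command number into 32 bits. The following is a minimal Python sketch of that decoding; the helper name decode_ioctl is made up for illustration. The type byte 0x64 is ASCII 'd', the base the DRM uses for its ioctls, and command numbers from 0x40 upward are reserved for driver-specific requests. Offset 0x04 in that range appears to be the radeon "CP idle" request, which would fit an X server polling the GPU's command processor, but that mapping is an assumption here and should be checked against radeon_drm.h.

```python
# Field widths of a Linux ioctl request number (asm-generic layout, as used on x86).
IOC_NRBITS, IOC_TYPEBITS, IOC_SIZEBITS, IOC_DIRBITS = 8, 8, 14, 2

DRM_COMMAND_BASE = 0x40  # first driver-specific DRM command number


def decode_ioctl(request):
    """Split an ioctl request number into its four bit fields."""
    nr = request & ((1 << IOC_NRBITS) - 1)
    ioc_type = (request >> IOC_NRBITS) & ((1 << IOC_TYPEBITS) - 1)
    size = (request >> (IOC_NRBITS + IOC_TYPEBITS)) & ((1 << IOC_SIZEBITS) - 1)
    direction = (request >> (IOC_NRBITS + IOC_TYPEBITS + IOC_SIZEBITS)) & (
        (1 << IOC_DIRBITS) - 1
    )
    return {"dir": direction, "size": size, "type": chr(ioc_type), "nr": nr}


fields = decode_ioctl(0x6444)
print(fields)
print("driver-specific command offset:", hex(fields["nr"] - DRM_COMMAND_BASE))
```

A direction and size of zero mean the call carries no argument structure, which matches the third argument of 0 in the trace.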

FD #9 is /dev/dri/card0 (as is FD #10 FWIW). My X version is:

root@wombat:~> Xorg -version

X Window System Version 7.1.1
Release Date: 12 May 2006
X Protocol Version 11, Revision 0, Release 7.1.1
Build Operating System: Linux 2.6.9-34.ELsmp x86_64 Red Hat, Inc.
Current Operating System: Linux wombat 2.6.18-1.2869.fc6 #1 SMP Wed Dec 20 14:51:34 EST 2006 x86_64
Build Date: 29 January 2007
Build ID: xorg-x11-server 1.1.1-47.5.fc6 
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Module Loader present
root@wombat:~> rpm -q xorg-x11-server-Xorg xorg-x11-drv-ati
xorg-x11-server-Xorg-1.1.1-47.5.fc6
xorg-x11-drv-ati-6.6.3-1.fc6

I found some oddities in the log file, namely that a) the driver was somehow confused by the card's secondary bus ID, b) it didn't recognize the Radeon X800 GTO as such, but as an X850 PRO, and c) it somehow used only 128MB of the 256MB on the card:

  (II) Primary Device is: PCI 07:00:0
  (--) Assigning device section with no busID to primary device
  (WW) RADEON: No matching Device section for instance (BusID PCI:7:0:1) found
  (--) Chipset ATI Radeon X850 PRO (R480) (PCIE) found
  (II) resource ranges after xf86ClaimFixedResources() call:

  [...]

  (II) RADEON(0): Page flipping disabled
  (II) RADEON(0): Will try to use DMA for Xv image transfers
  (II) RADEON(0): Generation 2 PCI interface, using max accessible memory
  (II) RADEON(0): Detected total video RAM=262144K, accessible=131072K (PCI BAR=131072K)
  (--) RADEON(0): Mapped VideoRAM: 131072 kByte (256 bit DDR SDRAM)
  (II) RADEON(0): Color tiling enabled by default

The output of lspci -v also shows that only 128MB are mapped.

I could get X to work when I disabled DRI (I used "install radeon /bin/false" in /etc/modprobe.conf to do that). I'll attach xorg.conf, Xorg.0.log, strace output (shortened) shortly.
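
On the bus ID oddity: the log names the primary device as PCI 07:00:0 and warns about an instance at BusID PCI:7:0:1, i.e. function 1 of the same slot. lspci prints slot addresses in hexadecimal, while the BusID string in xorg.conf uses decimal numbers, so the two notations only look alike for small values. A minimal Python sketch of the conversion follows; the function name lspci_to_busid is hypothetical, and it assumes the usual "bus:device.function" lspci slot format with an optional leading PCI domain.

```python
def lspci_to_busid(slot):
    """Convert an lspci slot such as '07:00.1' (hex) to an xorg.conf BusID (decimal)."""
    bus_dev, function = slot.rsplit(".", 1)
    parts = bus_dev.split(":")
    bus, device = parts[-2], parts[-1]  # ignore an optional leading PCI domain
    return f"PCI:{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


print(lspci_to_busid("07:00.0"))  # primary function of the card in this report
print(lspci_to_busid("07:00.1"))  # the instance named in the (WW) line
```

For this machine the hex and decimal forms coincide; they diverge once a bus or device number reaches 0x0a.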
Comment 1 Nils Philippsen 2007-02-12 15:22:24 UTC
Created attachment 8687
/etc/X11/xorg.conf

The commented-out lines in xorg.conf are settings I experimented with; the "Module" section wasn't originally there (but X still hung without it, i.e. with the defaults).
Comment 2 Nils Philippsen 2007-02-12 15:22:57 UTC
Created attachment 8688
/var/log/Xorg.0.log
Comment 3 Nils Philippsen 2007-02-12 15:23:59 UTC
Created attachment 8689
bzip2-compressed strace output ("strace -o ... /usr/bin/Xorg")
Comment 4 Nils Philippsen 2007-02-12 15:24:34 UTC
Created attachment 8690
Output of "lspci"
Comment 5 Nils Philippsen 2007-02-12 15:24:56 UTC
Created attachment 8691
Output of "lspci -n"
Comment 6 Nils Philippsen 2007-02-12 15:25:20 UTC
Created attachment 8692
Output of "lspci -v"
Comment 7 Nils Philippsen 2007-02-12 15:25:48 UTC
Created attachment 8693
Output of "lspci -vn"
Comment 8 Alex Deucher 2007-02-12 15:40:45 UTC
	Memory at e8000000 (64-bit, prefetchable) [size=128M]

Your card only exposes 128 MB via a PCI BAR.  As such that's all that can be mapped as a CPU accessible framebuffer.  The rest of the vram can be used, however it is only accessible via the GPU.
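
The split between total and CPU-accessible video RAM can be read straight from the driver's log line quoted in the description. The following is a small Python sketch, not part of the driver; the function name parse_vram is made up, and the regular expression assumes the exact wording of the "Detected total video RAM" line shown in this report.

```python
import re

LOG_LINE = (
    "(II) RADEON(0): Detected total video RAM=262144K, "
    "accessible=131072K (PCI BAR=131072K)"
)


def parse_vram(line):
    """Return (total, accessible, bar) in MiB from the driver's log line."""
    match = re.search(
        r"total video RAM=(\d+)K, accessible=(\d+)K \(PCI BAR=(\d+)K\)", line
    )
    if match is None:
        raise ValueError("not a video RAM detection line")
    return tuple(int(kib) // 1024 for kib in match.groups())


total, accessible, bar = parse_vram(LOG_LINE)
print(f"total={total} MiB, CPU-accessible={accessible} MiB, BAR={bar} MiB")
print(f"reachable only by the GPU: {total - accessible} MiB")
```

With the values from this log, the accessible figure equals the BAR size, which is consistent with the explanation above rather than with a detection error.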
Comment 9 Nils Philippsen 2007-02-12 15:42:56 UTC
Forgot one thing: when killing the X server (with SIGKILL), it somehow releases /dev/dri/card0 (fuser shows it isn't in use anymore) and the process is left with a very short list of open files:

root@wombat:~> pgrep -lf X
8593 Xorg
root@wombat:~> lsof -p 8593
COMMAND  PID USER   FD      TYPE DEVICE SIZE    NODE NAME
Xorg    8593 root  cwd       DIR  253,0 4096 1015809 /root
Xorg    8593 root  rtd       DIR  253,0 4096       2 /
Xorg    8593 root  txt   unknown                     /proc/8593/exe
root@wombat:~> 

The DRM kernel module is still in use, though:

root@wombat:~> lsmod|grep radeon
radeon                124257  1 
drm                    99049  2 radeon
Comment 10 Nils Philippsen 2007-02-12 15:45:20 UTC
PS^2: stracing the process isn't possible anymore at that point (for whatever reason; I've set SELinux to permissive, not enforcing):

root@wombat:~> strace -p 8593
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
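
When ptrace can no longer attach, the state the kernel reports for the leftover process may still be worth recording. Below is a minimal Python sketch that extracts it from the contents of /proc/<pid>/stat, assuming that file is still readable for the process in question. The helper names and the sample line are invented for illustration; only the "pid (comm) state ..." layout and the state letters come from the proc(5) format.

```python
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def parse_proc_stat(stat_line):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def describe(stat_line):
    """Render the parsed fields as a short human-readable summary."""
    pid, comm, state = parse_proc_stat(stat_line)
    return f"{comm}[{pid}]: {STATE_NAMES.get(state, 'state ' + state)}"


# Made-up sample line; on a live system read /proc/<pid>/stat instead.
SAMPLE = "8593 (Xorg) R 1 8593 8593 0 -1 4194560 0 0 0 0"
print(describe(SAMPLE))
```

A process that shows as running while ignoring SIGKILL would suggest it is spinning inside the kernel, though that reading is a guess and not something this report confirms.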
Comment 11 Michel Dänzer 2007-02-12 23:52:42 UTC
Can you try a 32 bit X server? If the same problem occurs with that, can you try a 32 bit kernel as well?
Comment 12 Nils Philippsen 2007-02-13 02:23:52 UTC
(In reply to comment #11)
> Can you try a 32 bit X server? If the same problem occurs with that, can you
> try a 32 bit kernel as well?

When I first set up the machine, I popped in a 32bit Ubuntu (6.10 I believe) which I had lying around to get it bootstrapped. This showed similar symptoms, except that the screen didn't go blank but distorted the logo Ubuntu shows when booting. Unfortunately I don't know which X version they use. If you wish, I'll download a 32bit Fedora Unity LiveCD and test it with that.
Comment 13 Timo Jyrinki 2007-02-13 05:06:29 UTC
Commenting on just one part of your report: in Ubuntu 6.10, which uses X.org 7.1, the ati driver most probably does not recognize your X800 GTO (bug 6796) unless Driver "radeon" is specified, and the stupid usplash doesn't let you see the blue X.org screen telling you that. Also, 2.6.20 is the only kernel with the fix in the radeon DRM driver that allows anything 3D-related to work on my X800 GTO. Regarding Ubuntu, you should try http://cdimage.ubuntu.com/releases/feisty/herd-3/ which has both the 2.6.20 kernel and the patch from bug 6796 applied on top of the 6.6.2 xserver-xorg-video-ati driver. It works just fine on my X800 GTO in 64-bit, including 3D. Naturally, your system configuration might differ in other ways.

X800 GTOs are actually partly rebranded X850 Pros with some pipelines disabled or something like that.
Comment 14 Michel Dänzer 2007-02-13 05:12:55 UTC
(In reply to comment #13)
> Also, only 2.6.20 kernel has the fix in radeon drm driver that allows anything
> on my X800 GTO to work regarding 3D.

Do you happen to have a pointer for that specific fix?
Comment 15 Timo Jyrinki 2007-02-13 13:35:25 UTC
(In reply to comment #14)
> Do you happen to have a pointer for that specific fix?

Sorry in case you misunderstood: the fix is "only in 2.6.20" as far as released kernels go; it's also in mesa/drm git of course. It was this "Unify radeon offset checking." (http://gitweb.freedesktop.org/?p=mesa/drm.git;a=commitdiff;h=aefc7a34431a8f1540b261e23d8b8d05d824b60a , http://git2.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=1d6bb8e51dba3db1c15575901022fe72d363e5a4).

Just wanted to point out various difficulties Nils could face so that he wouldn't confuse those with any non-fixed problems he's seeing.
Comment 16 Michel Dänzer 2007-02-14 00:11:58 UTC
(In reply to comment #15)
> It was this "Unify radeon offset checking."

Thanks. I'm afraid it doesn't make a difference for this bug, because the problem it fixes

* only occurs when the framebuffer lies at the very end of the card's address space, which is not the case in the log file attached here.
* should only matter when starting 3D clients, not when starting the X server.

Are you saying you had problems similar to those reported in this bug without that fix?

At any rate, I guess it can't hurt to try it...
Comment 17 Timo Jyrinki 2007-02-14 01:29:06 UTC
What I was saying was that Ubuntu 6.10's distorted screen with the card he mentions in comment 12 is not the same problem he's seeing in Fedora, and that he could try the latest Ubuntu test release to see whether he gets symptoms similar to those in Fedora, or whether it works okay. Ubuntu 6.10 really looks like it hangs, badly, while it is really just in a wrong graphics mode, trying to show the screen that says "no screens found".
Comment 18 Daniel Stone 2007-02-27 01:36:22 UTC
Sorry about the phenomenal bug spam, guys.  Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Comment 19 Nils Philippsen 2007-03-30 00:10:10 UTC
I tried (32bit) Knoppix 5.2 and found the same problem, i.e. this is not a 32bit vs. 64bit thing.
Comment 20 Michel Dänzer 2007-03-30 00:32:03 UTC
Can you try xf86-video-ati 6.6.191 and possibly a newer DRM as well? I'm afraid it's unlikely to make a difference though...
Comment 21 Michel Dänzer 2007-06-01 10:31:16 UTC
*** Bug 11065 has been marked as a duplicate of this bug. ***
Comment 22 Nils Philippsen 2007-06-04 00:57:56 UTC
Michel,

sorry for not responding for so long. I suppose I could check out current ATI driver code from git and wrap it up into a custom xorg-x11-drv-ati package, but I'm not entirely sure about DRM -- I'm on Fedora 7 now and have the kernel-2.6.21-1.3194.fc7 package installed. Whether that gives me a new DRM version, I don't have the faintest clue ;-). I so vote for a mandatory "driver version" field that one could query with modinfo...
Comment 23 Michel Dänzer 2007-06-04 05:27:17 UTC
(In reply to comment #22)
> I'm not entirely sure about DRM -- I'm on Fedora 7 now and have the
> kernel-2.6.21-1.3194.fc7 package installed. Whether that gives me a new DRM
> version, I don't have the faintest clue ;-).

Probably somewhat newer, but the kernel DRM always lags behind the main drm tree. Would be nice if you could try that.
Comment 24 Nils Philippsen 2007-06-04 08:11:21 UTC
Is there documentation about where to check out current drm code and how to build it?
Comment 25 Michel Dänzer 2007-06-09 04:33:53 UTC
(In reply to comment #24)
> Is there documentation about where to check out current drm code and how to
> build it?

http://dri.freedesktop.org/wiki/Building
Comment 26 stephanwib 2007-06-28 11:31:21 UTC
I have just compiled the current drm.ko, radeon.ko and r300_dri.so from git (from 28.06.2007 on linux 2.6.21-gentoo, X.org 1.3.0). DRI works fine on my X600 (RV380), but it still does *not* on my X800 GTO (R480) card.

Does anyone have an idea by now what could be wrong there? Could I try to tune some driver settings? Can I provide any additional information?
Comment 27 Timo Jyrinki 2007-08-09 02:43:18 UTC
Stephan, does your X800 hang on startup, or something else? This bug is about the startup hang. If you have some other bug, file a new bug and attach any error messages to it, as well as /var/log/Xorg.0.log, glxinfo and dmesg (as attachments, not inline).

Nils: have you tested e.g. Fedora 7? Does the problem still exist there?
Comment 28 Nils Philippsen 2007-08-09 04:55:01 UTC
(In reply to comment #27)
> Nils: have you tested eg. Fedora 7, does the problem still exist there?

I have tested it on Fedora 7 GA where it hung, but I'll give it a shot when I'm home again with the updated kernel and X packages.
Comment 29 stephanwib 2007-08-09 05:07:33 UTC
Timo, everything mentioned here applies to the bug I experience. When starting the X server the screen stays blank and neither mouse nor keyboard work. Xorg.0.log says that DRI was successfully initialized. I can still log in via ssh. Killing X results in a strange X process which has no memory but eats my CPU. As far as I remember I could not strace the process before killing it; there were no messages on my screen.

I'm looking for an idea where one could start tracing this bug. Maybe a register dump after the crash could help?!
Comment 30 Jerome Glisse 2007-08-09 07:37:29 UTC
It's the usual GPU lockup; strace or a debugger won't help at all, you
will likely just see that X is busy waiting for the GPU. All those GPU
lockups are painful. Lately we got a new idea for debugging them, but some
code needs to be put together for this. As I have some traveling this
weekend, I will try to work a bit on that.
Comment 31 stephanwib 2007-08-09 08:05:17 UTC
Can you name some typical reasons that cause a GPU to lock up?

I forgot to mention that the card triggers some hundred interrupts per second after the lockup.
Comment 32 Jerome Glisse 2007-08-09 08:32:39 UTC
I simply don't know of a typical reason besides AGP fast writes, but you are
on PCIE, so this doesn't apply to you. The strange thing is that the DRM should
acknowledge the card's IRQs. Maybe it's not the vblank interrupt; it would be nice
to know what the card sends IRQs for (vblank shouldn't give you more than
a hundred interrupts per second).
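
To put a number on the interrupt rate, two snapshots of /proc/interrupts taken a known interval apart can be compared. The sketch below is one way to do that in Python. The function names, the device label and the sample counter values are invented for illustration; only the general row layout of /proc/interrupts (IRQ number, one counter per CPU, then controller and device names) is assumed.

```python
def irq_count(interrupts_text, device):
    """Sum the per-CPU counters of every /proc/interrupts row naming `device`."""
    total = 0
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":") or device not in line:
            continue
        for field in fields[1:]:
            if not field.isdigit():
                break  # counters end where the controller/device names begin
            total += int(field)
    return total


def irq_rate(before, after, seconds, device):
    """Interrupts per second for `device` between two snapshots."""
    return (irq_count(after, device) - irq_count(before, device)) / seconds


# Made-up snapshots taken two seconds apart; the device label is an assumption.
BEFORE = """           CPU0       CPU1
 16:       1000       2000   IO-APIC-fasteoi   radeon@pci:0000:07:00.0
"""
AFTER = """           CPU0       CPU1
 16:       1500       2700   IO-APIC-fasteoi   radeon@pci:0000:07:00.0
"""
print(irq_rate(BEFORE, AFTER, 2.0, "radeon"), "interrupts per second")
```

A rate well above the display's refresh rate would point away from vblank as the only interrupt source, in line with the reasoning above.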
Comment 33 Nils Philippsen 2007-08-10 01:05:57 UTC
Judging from the last comments I believe it's not necessary to drive the machine (which is also a server) against the wall. Ping me if you think I should try it out regardless.
Comment 34 Timo Jyrinki 2007-08-10 02:08:22 UTC
I hope that you'll find something out. Just to update on my experiences, I have the "same" X800 GTO (1002:5d4f), manufacturer is Sapphire, and I have no problems whatsoever. I've now run my machine with both 1GB and 2GB RAM, and there's 256MB on the X800. Both 64-bit and 32-bit are fine, as are DVI+VGA multi-monitor output and 3D desktop / AIGLX. I've also used the card in both an ATI chipset (with AMD Athlon 64) and an Intel chipset (with Core 2 Duo) motherboard, so I'd guess it's not about a specific system configuration unless it turns out all of you have NVIDIA motherboards and that somehow affects things. At no point have I experienced hangs on X startup.

I'm running Debian 4.0 and Ubuntu 7.04, and have also booted successfully from a Fedora 6 live CD.
Comment 35 Nils Philippsen 2007-08-30 15:25:12 UTC
(In reply to comment #34)

> motherboard, so I'd guess it's not about specific system configuration unless
> it turns out all of you have NVIDIA motherboards and it is somehow affecting.

Hmm, you might be onto something here. I've tried updating the BIOS to the latest version available as well as fiddling with the one BIOS setting I could find pertaining to how the video card is driven, both to no avail though. The current versions of relevant packages are:

kernel-2.6.22.4-65.fc7
xorg-x11-drv-ati-6.6.3-4.fc7

Stephan, do you by any chance have a motherboard similar to mine (ASUS M2N-SLI Deluxe, NVIDIA nForce 570 SLI chipset)?

Is there anything else I should try? I'll bump severity to major as that's how I feel about it right now -- it itches me to finally try out 3D on this card ;-).
Comment 36 stephanwib 2007-09-10 04:40:03 UTC
Nils, I have an MSI K8N motherboard with NVIDIA's CK804 (SLI) chipset on it. A friend has got an Intel machine... maybe I can do some testing there in order to find out if this is a chipset-specific problem.

Timo, your X800 card is a PCIe one, isn't it?
Comment 37 stephanwib 2007-10-28 04:29:06 UTC
I have tested my X800 GTO card in conjunction with an Asus P5N-E SLI mainboard (NVidia NForce 650 chipset) and openSuse 10.3. It is exactly the same as on my NForce 4 machine, so either the problem is related to NForce chipsets or I have just a strange graphics card.
Comment 38 Nils Philippsen 2007-10-29 07:21:14 UTC
(In reply to comment #36)

Did you have the chance to test on your friend's Intel machine yet?

Comment 39 stephanwib 2007-10-31 08:50:49 UTC
(In reply to comment #38)
> (In reply to comment #36)
> 
> Did you have the chance to test on your friend's Intel machine yet?
> 

Yes, the test mentioned in #37 was done on that machine, which consists of an Intel Core 2 Duo CPU and (unfortunately) an NForce chipset (like mine). Do we by now have any chance of determining the culprit? I don't know anyone with a different chipset on which I could do further testing.
Comment 40 Dave Airlie 2007-11-01 21:54:08 UTC
can you try the latest drm git repo?

I just fixed a bug in the PCIE gart table handling that showed up playing on some newer hardware, I wonder could it affect your hardware...
Comment 41 stephanwib 2007-11-02 06:32:03 UTC
(In reply to comment #40)
> can you try the latest drm git repo?
> 
> I just fixed a bug in the PCIE gart table handling that showed up playing on
> some newer hardware, I wonder could it affect your hardware...
> 

Unbelievable! I just built the latest mesa/drm on Ubuntu 7.10 and it works now! Unfortunately, AIGLX doesn't work anymore, but that's another point which will surely get fixed soon:


(EE) AIGLX error: dlsym for __driCreateNewScreen_20050727 failed (/usr/lib/dri/r300_dri.so: undefined symbol: __driCreateNewScreen_20050727)
(EE) AIGLX: reverting to software rendering
Comment 42 Michel Dänzer 2007-11-02 06:47:11 UTC
(In reply to comment #41)
> (EE) AIGLX error: dlsym for __driCreateNewScreen_20050727 failed

You need xserver Git for AIGLX with mesa Git.

