Bug 5853 - Radeon X600 (0x5b62) locks entire machine, hard
Summary: Radeon X600 (0x5b62) locks entire machine, hard
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: 6.8.2
Hardware: x86 (IA32) Linux (All)
Importance: high critical
Assignee: Xorg Project Team
 
Reported: 2006-02-10 08:33 UTC by James Ralston
Modified: 2006-06-22 14:42 UTC

Description James Ralston 2006-02-10 08:33:22 UTC
I have a Dell Optiplex GX620 with a Radeon X600 card installed in the PEG slot:

01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B62 [Radeon X600
(PCIE)]

It has an Intel EM64T CPU and 2GB of DDR2 PC2-5300 memory (four 512MB DIMMs).  I
am running Fedora Core 4 (Linux) on it, using kernel 2.6.15-1.1831_FC4 (the
latest) and xorg-x11-6.8.2-37.FC4.49.2.

About 5-10 seconds after starting X, the entire machine locks, hard.  And when I
say "lock", I mean "the machine even stops responding to ICMP ECHO requests". 
There are no errors and nothing unusual in any log files.

Since I'm using Fedora Core 4, I reported this bug via Red Hat's Bugzilla:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=172335

Another person with the same hardware (a Dell Optiplex GX260 with a Radeon X600)
reported the same problem:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=176826

The second reporter was able to get some console messages with netconsole:

Uhhuh. NMI received. Dazed and confused, but trying to continue
Uhhuh. NMI received. Dazed and confused, but trying to continue
You probably have a hardware problem with your RAM chips
You probably have a hardware problem with your RAM chips

I have not been able to confirm if I get the same console messages.
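
For anyone trying to capture the same messages: netconsole sends kernel log
lines as UDP datagrams, so a listener on a second machine can be sketched
roughly as below. Port 6666 is netconsole's default target port; the function
names and the datagram count are illustrative assumptions, not part of any
existing tool.

```python
import socket


def collect_datagrams(sock, count, bufsize=4096):
    """Read `count` UDP datagrams from an already-bound socket, return them as text."""
    lines = []
    for _ in range(count):
        data, _addr = sock.recvfrom(bufsize)
        # Kernel messages are normally ASCII; replace anything undecodable.
        lines.append(data.decode("utf-8", errors="replace"))
    return lines


def listen(port=6666, count=10):
    """Bind to all interfaces on `port` and print the first `count` datagrams."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        for line in collect_datagrams(sock, count):
            print(line, end="")
```

Calling listen() on the receiving machine before starting X on the affected
one should show whatever the kernel manages to send before the lock-up.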

The second reporter also reports that the proprietary ATI driver doesn't hang
the machine, but I haven't been able to test that, either.

I notice that the RV370 5B62 card wasn't supported until fairly recently:

https://bugs.freedesktop.org/show_bug.cgi?id=3602

Based on the available data, I strongly suspect that there's something special
about this particular Radeon card that the xorg radeon driver doesn't [yet] know
how to handle.

If you need additional information (other than what's available in this report
and in the reports in Red Hat's Bugzilla), please let me know.
Comment 1 James Ralston 2006-02-23 10:28:24 UTC
I'm raising the severity to critical, as this bug causes a hang/crash.

I would be happy to help debug this problem in any way I can.
Comment 2 Mike A. Harris 2006-03-08 18:00:08 UTC
If you comment out the Load "dri" line in xorg.conf and reboot and restart
X, is the problem still present?   If that works around the problem, uncomment
the line and move the radeon.ko kernel module into /root, and reboot and
test again.  Does the problem still occur?

If disabling the dri X server module and/or removing the radeon kernel module
prevents the problem, then this is probably a combination of a kernel DRM bug
and an X bug.  In X11R7, various random R300+ hardware totally hangs if the X server's
DRI module is loaded, even if DRI is not supported on the chip.  The X server
should not load the DRM in the first place if DRI is unsupported on a given
chip, but it seems to do so anyway.  The kernel DRM however should not hang
the chip either.

In FC5 development we have just disabled R300+ kernel DRM support to work
around similar problems on X300, X800, and other R300 based hardware.

Please update the report with the results of your testing.
Comment 3 James Ralston 2006-03-09 11:56:44 UTC
Unfortunately, the problem persists regardless of whether the Load "dri" line in
xorg.conf is commented out.

Just for completeness, I commented out the Load "dri" line, commented out the
Section "DRI" section, moved the radeon.ko kernel module into /root, rebooted,
and then fired up X.  The machine wedged in exactly the same way.
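
The xorg.conf edit described above could be scripted roughly as follows. This
is only a sketch: disable_dri is a hypothetical helper, and it assumes the
usual layout where Load "dri" sits on its own line and the DRI section runs
from Section "DRI" to the next EndSection.

```python
import re


def disable_dri(conf_text):
    """Return xorg.conf text with Load "dri" and any Section "DRI" block commented out."""
    out = []
    in_dri_section = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Entering a Section "DRI" block: comment out everything until EndSection.
        if re.match(r'(?i)^Section\s+"DRI"', stripped):
            in_dri_section = True
        is_load_dri = re.match(r'(?i)^Load\s+"dri"', stripped) is not None
        if in_dri_section or is_load_dri:
            out.append("#" + line)
        else:
            out.append(line)
        # Leave the DRI block after its closing EndSection has been commented out.
        if in_dri_section and re.match(r"(?i)^EndSection\b", stripped):
            in_dri_section = False
    return "\n".join(out) + "\n"
```

The function only transforms text; reading and writing the actual
configuration file (and keeping a backup) is left to the caller.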

My next testing step was to load this machine with FC5test3, but at this point,
I'm just going to wait until FC5 final is released.

If you have any other suggestions to try in the meantime, I'm all ears, but
otherwise, I'll report back once FC5 is out and I can test it.
Comment 4 James Ralston 2006-03-09 11:58:20 UTC
In comment #3, that should have read, "my next testing step was going to be to
load..."
Comment 5 Benjamin Herrenschmidt 2006-03-09 12:00:01 UTC
What if you try the radeon driver from CVS?
Comment 6 James Ralston 2006-03-11 12:01:13 UTC
I tried to backport the ati driver tree from CVS into 6.8.2 (what Fedora Core 4
has), but failed.  :(

I plan to wait until FC5 is released and try X11R7.  If that doesn't work, then
at least building the ati driver from CVS should be a straightforward process.

Either way, I'll report back.
Comment 7 Benjamin Andresen 2006-03-31 16:23:59 UTC
Hello everybody.

I have an r350 and I am having the same problems... I tried different patches /
versions of the radeon driver, but unfortunately the old ones didn't load... The
patched one (a variation from #1912, Dynamic Clock disabled) didn't solve the
problem, and I couldn't backport the newest one so that it compiles correctly.

I tried deleting libdrm and libgl-dri so that they don't get accidentally
loaded by the X server, but that didn't change anything.

I have now switched to the fglrx driver, because my machine hard-locked approx.
50 times in 3 days... This killed my file systems twice, so it's a very severe
problem.

I would gladly test out stuff for you guys, because running fglrx just for
widescreen support (the only thing I need, and 2d) is way overkill. :-)

TIA
Comment 8 Erik Andren 2006-04-19 16:34:11 UTC
James: Did FC5 resolve this issue for you?
Comment 9 Benjamin Andresen 2006-05-14 07:04:56 UTC
xf86-video-ati 6.5.8.0 fixed the issue for me...
I don't know if I can set it like this.
Comment 10 James Ralston 2006-06-22 14:41:47 UTC
I apologize for the delay, but: YES, the FC5 radeon driver fixed this problem. 
I've been using it for several months now, and have not experienced a single
lock-up.
Comment 11 James Ralston 2006-06-22 14:42:28 UTC
Changing resolution to FIXED.

