Bug 5341 - System locks up when starting X w/ DRI on a Radeon RV370 (X300)
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/other
Version: XOrg git
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assigned To: Default DRI bug account
Reported: 2005-12-14 21:52 UTC by Bernhard Rosenkraenzer
Modified: 2008-01-11 12:10 UTC (History)

Attachments
drm debug messages up to the point of the crash (5.83 KB, text/plain)
2006-01-01 00:48 UTC, Markus Niemistö
drm debug messages with gart_info.bus_addr (5.88 KB, text/plain)
2006-01-02 20:27 UTC, Markus Niemistö
new full debug messages from FreeBSD from serial console (10.65 KB, text/plain)
2006-01-28 05:08 UTC, Markus Niemistö
Fix current DRM on FreeBSD (1.31 KB, patch)
2006-02-01 06:01 UTC, Markus Niemistö

Description Bernhard Rosenkraenzer 2005-12-14 21:52:40 UTC
Using DRM modules from DRM CVS and Xorg and Mesa from CVS (all taken as of  
yesterday), the system locks up when starting X if DRI is enabled. 
 
The same DRM/Xorg/Mesa combination works perfectly on a Radeon Mobility 9600 
M10. 
 
PCI ID of graphics card triggering the lockup: 1002:5b60, Subsystem 174b:0500 
The same card with the same Xorg/Mesa works nicely (but without 3D) if I move 
the radeon.ko kernel module out of the way.
Comment 1 Aapo Tahkola 2005-12-15 02:13:04 UTC
Does changing option GARTSize to 16, 32 or 64 affect anything?
Also check that EnablePageFlip isn't true.
Comment 2 Bernhard Rosenkraenzer 2005-12-15 03:06:21 UTC
Same effect with GARTSize 16, 32 and 64 and an explicit Option 
"EnablePageFlip" "false". 
 
Last couple of lines from 
 
mount -o remount,sync / 
Xorg -verbose 9 &>X.log 
 
: 
 
(II) RADEON(0): [DRI] installation complete 
(II) RADEON(0): [drm] Added 32 65536 byte vertex/indirect buffers 
(II) RADEON(0): [drm] Mapped 32 vertex/indirect buffers 
(II) RADEON(0): [drm] dma control initialized, using IRQ 10 
(II) RADEON(0): [drm] Initialized kernel GART heap manager, 13369344 
(II) RADEON(0): Direct rendering enabled 
[HANGS] 
Comment 3 Benjamin Herrenschmidt 2005-12-15 10:58:48 UTC
Does it work if you edit radeon_driver.c, function RADEONSetFBLocation() and
comment out those 2 lines:

    OUTREG(RADEON_MC_FB_LOCATION, mc_fb_location);
    OUTREG(RADEON_MC_AGP_LOCATION, mc_agp_location);

The code that "calculates" those values is totally bogus imho and may conflict
with what the DRM is doing. I'll try to come up with a proper patch later, but
it would be interesting if that is the cause of your problem.
Comment 4 Bernhard Rosenkraenzer 2005-12-15 22:33:41 UTC
Removing those 2 lines doesn't change anything.  
  
The same problem also occurs on a different brand X300 (PCI ID 1002:5b60, 
Subsystem 196d:1070). Both machines showing this problem here are Athlon64 
boxes running in 32bit mode. 
Comment 5 Markus Niemistö 2005-12-15 23:18:22 UTC
I guess your card is a PCI Express card? I have exactly the same problem with a
PCI Express X600 (5b62), but I am running a 64-bit version of Linux on an Athlon64.
Comment 6 Bernhard Rosenkraenzer 2005-12-16 00:16:42 UTC
Yes, all X300/X600/... cards are PCI Express. 
 
The driver works perfectly on the AGP cards (at least as far as the 9600 is 
concerned). 
Comment 7 Michel Dänzer 2005-12-16 01:12:36 UTC
FWIW, no problems here with an X550 (1002:5b60 / 174b:1490) with 64-bit
X11R6.9RC2 and the rest from CVS.
Comment 8 Gernot Pansy 2005-12-16 03:14:25 UTC
Same on the Xpress 200M (RV370-based). I think the lockup only happens with cards
that have no onboard memory and therefore have to share system RAM.
Comment 9 Benjamin Herrenschmidt 2005-12-16 07:52:59 UTC
Ok, let's try another one. In RADEONSetFBLocation(), comment out this one:

    OUTREG (RADEON_BUS_CNTL, bus_cntl | RADEON_BUS_MASTER_DIS);
Comment 10 Markus Niemistö 2005-12-17 00:01:39 UTC
I tried commenting out that line, both with and without the previous two from
comment #3, and had no luck at all.

However, I found where the system crashes. On my system the crash
occurs when DRM tries to zero out the pci-gart table in function
drm_ati_pcigart_init on line 186 in ati_pcigart.c.
Comment 11 Dave Airlie 2005-12-17 10:52:01 UTC
Does your card have any onboard RAM?

I'm thinking I need to make some changes to the GART allocation for PCIE for those
types of cards.
Comment 12 Markus Niemistö 2005-12-17 19:45:34 UTC
My X600 has 128 MB DDR memory onboard.
Comment 13 Dave Airlie 2005-12-30 13:15:16 UTC
Can you attach a DRM log? Enable it with: echo 1 > /sys/module/drm/parameters/debug

You might need a serial console to capture all of it, though.
Comment 14 Markus Niemistö 2006-01-01 00:48:45 UTC
Created attachment 4206
drm debug messages up to the point of the crash
Comment 15 Dave Airlie 2006-01-02 12:51:10 UTC
(In reply to comment #14)
> Created an attachment (id=4206)
> drm debug messages up to the point of the crash
> 

Can you change the DRM_DEBUG in drivers/char/drm/radeon_cp.c:
DRM_DEBUG("Setting phys_pci_gart to %p %08lX\n", dev_priv->gart_info.addr,
dev_priv->pcigart_offset);
to also print out dev_priv->gart_info.bus_addr?

I'm wondering if there is some issue there... 
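
As a rough illustration of the requested change, the debug line could be extended with one more format specifier and argument. The sketch below builds in user space: the struct names are hypothetical stand-ins, snprintf replaces the kernel's DRM_DEBUG, and treating bus_addr as an unsigned long is an assumption (the real field's type may need a different format or cast).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-ins for the driver's private structures. */
struct gart_info_sketch {
    void *addr;              /* CPU-visible address of the GART table */
    unsigned long bus_addr;  /* bus address of the same table (type assumed) */
};

struct dev_priv_sketch {
    struct gart_info_sketch gart_info;
    unsigned long pcigart_offset;
};

/*
 * Format the extended debug message into buf.
 * In the driver this would be a DRM_DEBUG(...) call with the same
 * format string and arguments; snprintf is used here only so the
 * sketch can be built and checked outside the kernel.
 */
static int format_gart_debug(char *buf, size_t len,
                             const struct dev_priv_sketch *priv)
{
    assert(buf != NULL && priv != NULL);
    return snprintf(buf, len,
                    "Setting phys_pci_gart to %p %08lX bus_addr %08lX\n",
                    priv->gart_info.addr,
                    priv->pcigart_offset,
                    priv->gart_info.bus_addr);
}
```

Printing the bus address next to the CPU address makes it easier to spot a table that is mapped correctly for the CPU but handed to the card at an unexpected location.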
Comment 16 Markus Niemistö 2006-01-02 20:27:23 UTC
Created attachment 4214
drm debug messages with gart_info.bus_addr

(In reply to comment #15)
> Can you change the DRM_DEBUG in drivers/char/drm/radeon_cp.c 
> DRM_DEBUG("Setting phys_pci_gart to %p %08lX\n", dev_priv->gart_info.addr,
> dev_priv->pcigart_offset);
> to also printout dev_priv->gart_info.bus_addr?
> 
> I'm wondering if there is some issue there...

Here you go
Comment 17 Markus Niemistö 2006-01-28 05:05:36 UTC
I got my X600 working under Linux/i386 with Ben's latest patches. I haven't
tried it without them yet. However, DRI still doesn't work under FreeBSD/amd64.
I had my serial console attached and got quite a lot of debug data. It crashes in
radeon_cp_init_ring_buffer() with a general protection fault. I'll try to compile
a debug kernel with full debug data and find out the exact location.
Comment 18 Markus Niemistö 2006-01-28 05:08:27 UTC
Created attachment 4488
new full debug messages from FreeBSD from serial console
Comment 19 Don Wilde 2006-02-01 05:41:30 UTC
I'm afraid Ben's three patches did _not_ help my FreeBSD-6.0-STABLE (updated 
1/30/06) system with an X300 PCIe Dell dual-head. I've also tried killing the 
HyperThreading, the second head, etc., to no avail. This, on the standard X.org 
6.9 as installed through ports. I remade xorg-libraries after patching the 
radeon_driver.c file. 
 
(--) PCI:*(1:0:0) ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)] rev 0, 
Mem @ 0xd0000000/27, 0xdfde0000/16, I/O @ 0xdc00/8, BIOS @ 0xdfe00000/17 
(--) PCI: (1:0:1) ATI Technologies Inc RV370 [Radeon X300SE] rev 0, Mem @ 
0xdfdf0000/16 
 
The X log does not get written when it locks up. What else can I post that 
would help? 
Comment 20 Markus Niemistö 2006-02-01 06:01:15 UTC
Created attachment 4523
Fix current DRM on FreeBSD

The virtual field of struct drm_sg_mem_t was not initialized in drm_sg_alloc,
which caused crashes. This patch addresses this issue and also makes current
DRM compile on FreeBSD.
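
For readers unfamiliar with this class of bug, here is a minimal user-space sketch of the pattern. It is not the attached patch: the struct, its fields other than the idea of a "virtual" pointer, and the function names are all hypothetical. The point is only that an allocation routine has to set every field the rest of the code will dereference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scatter/gather bookkeeping struct. */
struct sg_mem_sketch {
    void *virtual_addr;  /* plays the role of the "virtual" field */
    size_t pages;
};

/*
 * Allocate the descriptor and its backing memory.
 * calloc() zeroes the descriptor, so no field is left holding garbage,
 * and virtual_addr is set explicitly before the entry is returned.
 */
static struct sg_mem_sketch *sg_alloc_sketch(size_t pages, size_t page_size)
{
    assert(pages > 0 && page_size > 0);

    struct sg_mem_sketch *entry = calloc(1, sizeof *entry);
    if (entry == NULL)
        return NULL;

    entry->virtual_addr = calloc(pages, page_size);
    if (entry->virtual_addr == NULL) {
        free(entry);
        return NULL;
    }
    entry->pages = pages;
    return entry;
}

/* Release the backing memory first, then the descriptor. */
static void sg_free_sketch(struct sg_mem_sketch *entry)
{
    if (entry == NULL)
        return;
    free(entry->virtual_addr);
    free(entry);
}
```

If virtual_addr were left unset, any later code zeroing or mapping the buffer through it would touch an arbitrary address, which in kernel context typically shows up as a fault or a hang.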
Comment 21 Markus Niemistö 2006-02-01 06:09:57 UTC
(In reply to comment #19)
> I'm afraid Ben's three patches did _not_ help my FreeBSD-6.0-STABLE (updated 
> 1/30/06) system with an X300 PCIe Dell dual-head. I've also tried killing the 
> HyperThreading, the second head, etc., to no avail. This, on the standard X.org 
> 6.9 as installed through ports. I remade xorg-libraries after patching the 
> radeon_driver.c file. 

Try the patch I attached with CVS versions of DRM and xorg. I also needed to
apply two of Ben's patches (radeon-memmap-7.0-2.diff and
radeon-memmap-drm-3.diff). The patch fixes an issue that only exists with
non-agp radeon cards.
  
> The X log does not get written when it locks up. What else can I post that 
> would help? 

I was only able to get the vital debugging information via a serial console.
Just compile the kernel with DDB enabled and attach another computer with a
null-modem cable. There is more help on this subject in the FreeBSD Developers'
Handbook. Oh, and don't forget to set sysctl hw.dri.0.debug to 1 to see the DRM
debug messages.
Comment 22 Don Wilde 2006-02-01 07:04:16 UTC
(In reply to Comment #21) I'm afraid my g-world firewall is blocking my direct 
CVS access. I'll have to set up a redirect through my outside server as I do 
for CVSup. Thanks for the rapid response, Markus! I understand what I need to 
do and will do it.  
 
Just one q: Where will I find Ben's memmap patches? 
Comment 23 Markus Niemistö 2006-02-01 18:07:06 UTC
(In reply to comment #22)
> Just one q: Where will I find Ben's memmap patches? 
Sorry, I thought I had mentioned this. You can find the patches at
http://gate.crashing.org/~benh/

Comment 24 Ross Vandegrift 2006-02-03 05:31:50 UTC
(In reply to comment #8)
> same on Xpress 200M (RV370 based). i think the lock only happens with cards 
> that didn't have memory on board  and so have to share the ram. 

I think I may be seeing this same issue.  I'm running Xorg/Mesa/dri/drm from CVS
today, plus Ben's Radeon memmap-3 patch.  Everything works well if I don't
enable DRI.

In order to get the radeon DRM module to recognize my chip, I had to add this to
the drm_pciids.txt file:
0x1002 0x5955 CHIP_RV350|CHIP_IS_IGP "ATI Radeon XPRESS 200M"

From searching the net, RV350 seems to be the most appropriate choice
(CHIP_RV370 doesn't appear to exist), but I could be wrong.
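
To show what an entry like the one above amounts to, here is a small sketch of a PCI ID table with a lookup. The flag values and all names are hypothetical and chosen only for illustration; the real table is generated from drm_pciids.txt and uses the driver's own chip-family constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the driver's real constants differ. */
enum {
    SKETCH_CHIP_RV350  = 0x0001,
    SKETCH_CHIP_IS_IGP = 0x0100
};

struct pci_id_sketch {
    uint16_t vendor;
    uint16_t device;
    unsigned flags;
    const char *name;
};

/* One entry, mirroring the line added to drm_pciids.txt above. */
static const struct pci_id_sketch sketch_ids[] = {
    { 0x1002, 0x5955, SKETCH_CHIP_RV350 | SKETCH_CHIP_IS_IGP,
      "ATI Radeon XPRESS 200M" },
};

#define SKETCH_ID_COUNT (sizeof sketch_ids / sizeof sketch_ids[0])

/* Return the matching table entry, or NULL if the device is unknown. */
static const struct pci_id_sketch *
lookup_pci_id(const struct pci_id_sketch *table, size_t count,
              uint16_t vendor, uint16_t device)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].vendor == vendor && table[i].device == device)
            return &table[i];
    }
    return NULL;
}
```

A device that is missing from the table is simply not claimed by the module, which is why the chip was not recognized before the line was added. The flags then decide which code paths run, so picking the wrong family could itself cause misbehavior.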

X starts to a blank screen and freezes.  Can't switch VTs, NumLock/CapsLock
don't work, and I can't kill X.

However, the box isn't locked hard.  ssh sessions to the laptop still work as
normal and I can reboot.  chvt hangs if I try to switch back to a text console.

I enabled DRM_DEBUG as described in the thread, and my dmesg gets spammed with this:

[drm:radeon_do_cp_idle] 
[drm:drm_ioctl] ret = fffffff0
[drm:drm_ioctl] pid=5587, cmd=0x6444, nr=0x44, dev 0xe200, auth=1
[drm:radeon_cp_idle] 
[drm:radeon_do_cp_idle] 

There was some init stuff at the beginning, but it scrolled out of the buffer
very quickly.
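
The numbers in that log can be decoded by hand. The sketch below assumes the Linux ioctl encoding (request number in the low byte, type character in the next byte) and a DRM driver-command base of 0x40; the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base of driver-specific DRM ioctl numbers. */
#define SKETCH_DRM_COMMAND_BASE 0x40u

/* Interpret a value printed with %x as a signed 32-bit return code. */
static int32_t decode_ret(uint32_t raw)
{
    if (raw >= 0x80000000u)
        return -(int32_t)(~raw) - 1;  /* two's complement, no overflow */
    return (int32_t)raw;
}

/* Type character of the ioctl: bits 8..15 of the command. */
static unsigned ioctl_type(uint32_t cmd)
{
    return (cmd >> 8) & 0xffu;
}

/* Request number of the ioctl: bits 0..7 of the command. */
static unsigned ioctl_nr(uint32_t cmd)
{
    return cmd & 0xffu;
}

/* Index of a driver-specific ioctl relative to the command base. */
static unsigned driver_ioctl_index(uint32_t cmd)
{
    unsigned nr = ioctl_nr(cmd);
    assert(nr >= SKETCH_DRM_COMMAND_BASE);
    return nr - SKETCH_DRM_COMMAND_BASE;
}
```

With these helpers, ret = fffffff0 is -16, which is -EBUSY on Linux, and cmd=0x6444 splits into type 0x64 ('d') and nr 0x44, that is, driver ioctl index 4. That matches the radeon_cp_idle calls in the log and would be consistent with X repeatedly asking the command processor to go idle and being told it is still busy.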
Comment 25 Dave Airlie 2006-09-21 20:50:33 UTC
No idea what the state of this bug is, modulo the addition of a comment about
something completely unrelated.

Author, please reopen if you still have problems with the latest code.
Comment 26 Benjamin Close 2008-01-11 02:36:41 UTC
Bugzilla Upgrade Mass Bug Change

NEEDSINFO state was removed in Bugzilla 3.x, reopening any bugs previously listed as NEEDSINFO.

  - benjsc
    fd.o Wrangler
Comment 27 Alex Deucher 2008-01-11 12:10:14 UTC
Closing. Please reopen if you are still having problems with a more recent version of the driver.