Bug 3914

Summary: ATI Mach64 locks up sparc post 6.8.99.3
Product: xorg Reporter: Jeremy Huddleston <eradicator>
Component: Server/General Assignee: Jesse Barnes <jbarnes>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: high CC: ajax, dberkholz, fmccor, gustavoz, jbarnes, libv
Version: 6.8.99.14   
Hardware: SPARC   
OS: Linux (All)   
Whiteboard:
Bug Depends on: 2373    
Bug Blocks:    
Attachments:
Description                                  Flags
add-altix.patch                              none
remove-altix.patch                           none
remove-altix patch for xorg-server-1.1.1     none

Description Jeremy Huddleston 2005-07-30 04:08:36 UTC
Starting with 6.8.99.5 and on, I've noticed my system locks up (even ignores
L1-A) when starting the X server with the ati display device.

6.8.2 works
6.8.99.3 works
6.8.99.5 doesn't
6.8.99.8 doesn't
6.8.99.14 doesn't

I'm investigating 6.8.99.4 now.

0000:01:02.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro 215GP
(rev 5c)

(==) ATI(0): Chipset:  "ati".
(**) ATI(0): Depth 24, (--) framebuffer bpp 32
(II) ATI(0): BIOS Data:  BIOSSize=0x0000, ROMTable=0x0000.
(II) ATI(0): BIOS Data:  ClockTable=0x0000, FrequencyTable=0x0000.
(II) ATI(0): BIOS Data:  LCDTable=0x0000, LCDPanelInfo=0x0000.
(II) ATI(0): BIOS Data:  VideoTable=0x0000, HardwareTable=0x0000.
(II) ATI(0): BIOS Data:  I2CType=0x00, Tuner=0x00, Decoder=0x00, Audio=0x0F.
(--) ATI(0): ATI 3D Rage Pro graphics controller detected.
(--) ATI(0): Chip type 4750 "GP", version 4, foundry UMC, class 0, revision 0x01.
(--) ATI(0): PCI bus interface detected.
(--) ATI(0): ATI Mach64 adapter detected.

cpu             : TI UltraSparc IIi (Sabre)
fpu             : UltraSparc IIi integrated FPU
promlib         : Version 3 Revision 25
prom            : 3.25.2
type            : sun4u
ncpus probed    : 1
ncpus active    : 1
Cpu0Bogo        : 876.54
Cpu0ClkTck      : 000000001a39de00
MMU Type        : Spitfire
Comment 1 Jeremy Huddleston 2005-07-30 07:49:09 UTC
Oh, and it's a u10, and the toolchain/kernel are:
linux-2.6.13_rc3-git4, gcc-3.4.4, glibc-2.3.5, binutils-2.16.1
Comment 2 Jeremy Huddleston 2005-07-30 14:24:04 UTC
Ok, it's happening with 6.8.99.4 as well, and it appears from the log that it's
not a bug in the ati driver like I originally thought, so reassigning.

This is a pre-release version of the The X.Org Foundation X11.
It is not supported in any way.
Bugs may be filed in the bugzilla at http://bugs.freedesktop.org/.
Select the "xorg" product for bugs you find in this release.
Before reporting bugs in pre-release versions please check the
latest version in the The X.Org Foundation "monolithic tree" CVS
repository hosted at http://www.freedesktop.org/Software/xorg/
X Window System Version 6.8.99.4
Release Date: 24 April 2005 + cvs
X Protocol Version 11, Revision 0, Release 6.8.99.4
Build Operating System: Linux 2.6.12-gentoo-r4 sparc64 [ELF] 
Current Operating System: Linux aeris 2.6.12-gentoo-r4 #6 Fri Jul 8 18:32:29 PDT
2005 sparc64
Build Date: 29 July 2005
        Before reporting problems, check http://wiki.X.Org
        to make sure that you have the latest version.
Module Loader present
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Fri Jul 29 21:21:17 2005
(==) Using config file: "/etc/X11/xorg.conf"
f(EE) end of block range 0xc0000b < begin 0x2c00000
(EE) end of block range 0xc0001b < begin 0x2c00010
f(EE) end of block range 0xc0000b < begin 0x2c00000
(EE) end of block range 0xc0001b < begin 0x2c00010
(WW) ****INVALID IO ALLOCATION**** b: 0x2c00400 e: 0x2c00500 correcting
Comment 3 Donnie Berkholz 2006-07-01 17:37:53 UTC
Created attachment 6092
add-altix.patch

A couple of ideas are the Altix support in the server, and the Theatre changes
in the ati driver. I attached the patch that added Altix support -- you could
try reversing it with `patch -R` to see whether this helps.

It doesn't apply cleanly to current xserver, so you may need to test this with
6.9 or fix up the patch.

Generated with `git diff d450a70~2 d450a70`
Comment 4 Donnie Berkholz 2006-07-01 17:53:21 UTC
(In reply to comment #2)
> f(EE) end of block range 0xc0000b < begin 0x2c00000
> (EE) end of block range 0xc0001b < begin 0x2c00010
> f(EE) end of block range 0xc0000b < begin 0x2c00000
> (EE) end of block range 0xc0001b < begin 0x2c00010
> (WW) ****INVALID IO ALLOCATION**** b: 0x2c00400 e: 0x2c00500 correcting

This also happens in my perfectly working Ultra5.

(WW) ****INVALID IO ALLOCATION**** b: 0x2c00400 e: 0x2c004ff correcting
Comment 5 Donnie Berkholz 2006-07-01 18:41:32 UTC
Created attachment 6093
remove-altix.patch

This one oughta apply against current modular, and you don't need to pass -R.
Comment 6 Ferris McCormick 2006-08-16 12:08:09 UTC
This bug (or at least the symptom --- "basically my problem is a hardlock with a
nice corrupted display") persists into modular xorg-x11-7.1 on sparc.  Until
there is a fix for it, we (gentoo/sparc) cannot release x-modular as stable
because sparc+ati is a rather common configuration.  Is there an anticipated
release date for a fix, or should we plan to pin sparc at 6.8.2?
Comment 7 Daniel Stone 2006-08-17 04:49:07 UTC
Maybe you could help us to help you and test remove-altix.patch?
Comment 8 Ferris McCormick 2006-08-17 06:10:47 UTC
Created attachment 6589
remove-altix patch for xorg-server-1.1.1

Attached patch cleans up original remove-altix patch so that (1) it applies
cleanly and (2) resulting xorg-server-1.1.1 actually builds.  Preliminary
testing indicates that it solves the mach64 lockup problem on sparc for
xorg-server-1.1.1, but testing is not complete.

I have no idea what happens on non-sparc systems if this patch is applied.
Comment 9 Ferris McCormick 2006-08-17 13:20:39 UTC
So far, Jason Wever's (weeve@gentoo.org) version of the patch (attachment 6589)
seems to work fine for us.  Or at least, so says gustavoz@gentoo.org.
Comment 10 Ferris McCormick 2006-08-18 04:14:32 UTC
The proposed fix for this in effect backs out the altix support provided as a
result of Bug 2373.  Consequently, the Bug 2373 altix support cannot be quite
complete as it stands, because it apparently has a side effect of breaking xorg
for sparc/mach64 systems.  Thus, correct resolution of this bug requires correct
resolution of the altix 2373 support request, and I am linking them.

Please also note Gustavo's comment at
https://bugs.freedesktop.org/show_bug.cgi?id=2373#c51 for more details.
Comment 11 Jesse Barnes 2006-08-23 09:20:19 UTC
The snippet below (from the revert patch) removes the new 
linuxTransAddrBusToHost routine from the PCI mapping function table.  This 
routine is *supposed* to be generic, but it may well be broken on some sparc64 
configurations.  Can you try building a tree without the revert patch but with 
that line protected #if defined(__ia64__) ... #endif instead of removed 
entirely?  That would tell us for sure if the xf86GetOSOffsetFromPCI routine 
was the culprit (it probably is, it has quite a few shortcomings).

If that works, simply protecting that line with an #ifdef __ia64__ might be a 
good short term fix...

Thanks,
Jesse

--- a/hw/xfree86/os-support/bus/linuxPci.c
+++ b/hw/xfree86/os-support/bus/linuxPci.c
@@ -63,7 +63,6 @@ #include "Pci.h"
 static CARD32 linuxPciCfgRead(PCITAG tag, int off);
 static void linuxPciCfgWrite(PCITAG, int off, CARD32 val);
 static void linuxPciCfgSetBits(PCITAG tag, int off, CARD32 mask, CARD32 bits);
-static ADDRESS linuxTransAddrBusToHost(PCITAG tag, PciAddrType type, ADDRESS addr);
 #if defined(__powerpc__)
 static ADDRESS linuxPpcBusAddrToHostAddr(PCITAG, PciAddrType, ADDRESS);
 static ADDRESS linuxPpcHostAddrToBusAddr(PCITAG, PciAddrType, ADDRESS);
@@ -84,7 +83,7 @@ #if defined(__powerpc__)
 /* pciAddrBusToHost */	linuxPpcBusAddrToHostAddr,
 #else
 /* pciAddrHostToBus */	pciAddrNOOP,
-/* pciAddrBusToHost */	linuxTransAddrBusToHost,
+/* pciAddrBusToHost */	pciAddrNOOP,
 #endif
 
 /* pciControlBridge */		NULL,
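
For readers following along, the `#if defined(__ia64__)` guard suggested in this comment could look roughly like the sketch below. It is a cut-down, standalone illustration and not the code in `linuxPci.c`: the `ADDRESS` typedef, the one-argument function signature, the `BusFuncs` struct, and the `bus_to_host`/`host_to_bus` helpers are simplified assumptions, and the body of `linuxTransAddrBusToHost` is only a placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the server's types; not the real declarations. */
typedef unsigned long ADDRESS;
typedef ADDRESS (*AddrXlateProc)(ADDRESS addr);

/* Identity mapping: the bus address is used as the host address. */
static ADDRESS pciAddrNOOP(ADDRESS addr)
{
    return addr;
}

#if defined(__ia64__)
/* Placeholder for the Altix translation routine; the real routine is
 * not reproduced here. */
static ADDRESS linuxTransAddrBusToHost(ADDRESS addr)
{
    return addr;
}
#endif

/* Hypothetical cut-down function table with only the two translation slots. */
struct BusFuncs {
    AddrXlateProc pciAddrHostToBus;
    AddrXlateProc pciAddrBusToHost;
};

static const struct BusFuncs linuxFuncs = {
    /* pciAddrHostToBus */ pciAddrNOOP,
#if defined(__ia64__)
    /* pciAddrBusToHost */ linuxTransAddrBusToHost,
#else
    /* pciAddrBusToHost */ pciAddrNOOP,
#endif
};

/* Translate a bus address through whichever routine the table selected. */
ADDRESS bus_to_host(ADDRESS addr)
{
    assert(linuxFuncs.pciAddrBusToHost != NULL);
    return linuxFuncs.pciAddrBusToHost(addr);
}

/* Translate a host address back to a bus address (always a no-op here). */
ADDRESS host_to_bus(ADDRESS addr)
{
    assert(linuxFuncs.pciAddrHostToBus != NULL);
    return linuxFuncs.pciAddrHostToBus(addr);
}
```

With this shape, every architecture other than ia64 (including sparc64) keeps the pre-Altix no-op behaviour, while ia64 builds still get the translation routine.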
Comment 12 Gustavo Zacarias 2006-08-25 12:38:45 UTC
Works nicely with your patch; that seems to have done it.
Comment 13 Jesse Barnes 2006-08-30 21:48:21 UTC
Great, I'll go ahead and check in the workaround.  I'll file a new bug on 
xf86GetOSOffsetFromPCI since it seems like that's broken (fixing that would be 
more proper).  Thanks for testing.

Jesse
Comment 14 Jesse Barnes 2006-08-31 17:51:19 UTC
Ok, I looked at xf86GetOSOffsetFromPCI and it definitely seems broken.  It 
uses PCI device structures that don't handle 64 bit BARs when compiled as a 32 
bit binary, afaict.  What's weird is that supposedly that was fixed by 6377 
(using direct PCI config space access instead of the silly wrapper struct), 
but I don't see that code in the tree.  ajax?

I'd really rather fix xf86GetOSOffsetFromPCI than kludge the PCI operations 
structure, and ideally we'd unify the PPC implementation too, but maybe the 
kludge is fine for now since the PCI rework tree should kill all this ugly 
code?
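
As an illustration of the 64-bit BAR concern raised in this comment, here is a minimal sketch of how a 64-bit memory BAR is assembled from two 32-bit config-space dwords, and what is lost if only the low dword is kept. The helper names and the sample values are made up for the example; none of this is taken from the X server tree, and it does not claim to show what `xf86GetOSOffsetFromPCI` actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of a PCI Base Address Register, per the PCI specification. */
#define BAR_IO_SPACE      0x1u        /* bit 0: 1 = I/O, 0 = memory        */
#define BAR_MEM_TYPE_MASK 0x6u        /* bits 2:1: memory decoder type     */
#define BAR_MEM_TYPE_64   0x4u        /* type 10b: 64-bit address          */
#define BAR_MEM_ADDR_MASK 0xFFFFFFF0u /* low 4 bits are flags, not address */

/* True if the low dword describes a 64-bit memory BAR. */
int bar_is_64bit(uint32_t lo)
{
    return (lo & BAR_IO_SPACE) == 0 &&
           (lo & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/* Full base address: the next BAR slot supplies the upper 32 bits. */
uint64_t bar_base64(uint32_t lo, uint32_t hi)
{
    assert(bar_is_64bit(lo));
    return ((uint64_t)hi << 32) | (lo & BAR_MEM_ADDR_MASK);
}

/* What a structure with only a 32-bit field would retain. */
uint32_t bar_base_truncated(uint32_t lo)
{
    return lo & BAR_MEM_ADDR_MASK;
}
```

For example, with a low dword of `0xE0000004` and a high dword of `0x000001FF`, `bar_base64` yields `0x000001FFE0000000`, whereas a 32-bit field keeps only `0xE0000000`.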
Comment 15 Jesse Barnes 2006-09-10 11:15:34 UTC
I've committed a workaround as b3a3020fd018df8bc5a8193d36e1a1c7ae8af8ba.  It's 
ugly (just removes the mapping routine for sparc64 compiles) but all this code 
will soon disappear so I think that's ok.  Please test the latest tree to make 
sure things work for you.

Thanks,
Jesse
