Bug 2322

Summary: Corruption of PCI config BAR1 of native PCIE boards on 64-bit archs
Product: xorg Reporter: Michael Yaroslavtsev <mike.work>
Component: Server/GeneralAssignee: Egbert Eich <eich>
Status: RESOLVED FIXED QA Contact:
Severity: critical    
Priority: high CC: dberkholz, jacksob2, james, mharris, sndirsch
Version: 6.8.1   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
[FIXED_X11R68x] Proposed fix for the issue [minimal].
roland.mainz: 6.8-branch+
Proposed fix for the issue [aggressive].
none
updated fix. none

Description Michael Yaroslavtsev 2005-01-18 15:23:31 UTC
This thread describes the syndrome:
http://www.nvnews.net/vbulletin/showthread.php?t=42597

Here are some quotes:
- HW cursor looks like white rectangle with colored dots
- some gtk-apps have graphic-errors
- on a VT switch console fonts are not restored.

An excerpt from Xorg.0.log:
(WW) ****INVALID MEM ALLOCATION**** b: 0xc9000004d8000000 e: 0xc9000004dfffffff
correcting
(EE) end of block range 0xa95660 < begin 0x3a7fa68406
Comment 1 Michael Yaroslavtsev 2005-01-18 15:41:25 UTC
Created attachment 1709 [details] [review]
[FIXED_X11R68x] Proposed fix for the issue [minimal].
Comment 2 Michael Yaroslavtsev 2005-01-18 15:42:18 UTC
Created attachment 1710 [details] [review]
Proposed fix for the issue [aggressive].
Comment 3 Michael Yaroslavtsev 2005-01-18 15:42:56 UTC
I have tracked this down to an off-by-1 error in hw/xfree86/common/xf86pciBus.c.
Function FindPCIVideoInfo(), lines ##292-409. [against CVS ver. 1.6]

When the HW claims a 64-bit BAR, the code for 64-bit platforms attemps to
construct a 64-bit long value, however, instead of reading BAR[i+1] for the
upper half, it gets BAR[i+2].

I am proposing 2 possible fixes for the problem, which are reflected in 2
attached patches. One (minimal) simply fixes the immediate problem at hand.

The other (aggressive) additionaly makes some rearrengements to the code, that,
I thought, were appropriate.

It also treats BAR5 specially in a way, that IMO is the most reasonable, as the
PCI spec. [both 2.3 and 3.0] remains silent on this issue.
Comment 4 Andy Ritger 2005-01-18 17:40:34 UTC
Comment on attachment 1709 [details] [review]
[FIXED_X11R68x] Proposed fix for the issue [minimal].

This bug affects all Linux-x86_64 users of the NVIDIA driver with
recent PCI-E graphics processors with 64-bit BARs.
Comment 5 James Lancaster 2005-01-21 03:47:50 UTC
Using the agressive patch on Gentoo (modified the 6.8.0-r4 ebuild) every issue 
I had with Xorg/Nvidia not working is gone (including): 
 
Logo shows instead of garbage when starting up 
Text console is not corrupted 
XV works correctly again 
X is not crashed by running nvidia-settings 
Render works  
Hardware cursor works.  
 
Comment 6 Daniel Stone 2005-01-21 19:13:41 UTC
Does the minimal patch work just as well?  I'd suspect the aggressive patch
would need a fairly high level of review.
Comment 7 Andy Ritger 2005-01-22 20:14:34 UTC
Yes, the minimal patch is sufficient to address the problem.  The aggresive 
patch was intended as a cleanup (rather than duplicate the same code for each 
BAR, loop over each BAR) but is not necessary.

I agree that the aggressive patch merits review, and probably is not 
appropriate for 6.8.2, while I hope the minimal patch is.
Comment 8 Roland Mainz 2005-01-24 18:44:35 UTC
Comment on attachment 1709 [details] [review]
[FIXED_X11R68x] Proposed fix for the issue [minimal].

Approval for commit into X11R6.8.x stable branch granted in the 2005-01-24
release-wranglers phone call (please don't commit yourself, I'll do that in a
few mins).

The other (larger and more complete) version of the patch needs to be reviewed
(Egbert is AFAIK looking at that) and then gets into Xorg trunk.
Comment 9 Roland Mainz 2005-01-24 18:53:04 UTC
Comment on attachment 1709 [details] [review]
[FIXED_X11R68x] Proposed fix for the issue [minimal].

Patch checked-in into X11R6.8.x stable branch:
/cvs/xorg/xc/ChangeLog,v  <--  ChangeLog
new revision: 1.365.2.136; previous revision: 1.365.2.135
cvs commit: Using deprecated info format strings.  Convert your scripts to use
the new argument format and remove '1's from your info file format strings.
/cvs/xorg/xc/programs/Xserver/hw/xfree86/common/xf86pciBus.c,v	<-- 
xf86pciBus.c
new revision: 1.4.2.2; previous revision: 1.4.2.1
cvs commit: Using deprecated info format strings.  Convert your scripts to use
the new argument format and remove '1's from your info file format strings.
Mailing the commit message to xorg-commit@lists.freedesktop.org...
Comment 10 Egbert Eich 2005-01-25 01:44:21 UTC
Both patches look OK. The changes in the aggressive version of the patch is
something I should have done long ago. The stupid spaghetti code existed for a 
long time and has been extended for 64 bit address space instead of being
replaced by a simpler loop.
I would like to do a compile time check for 64 bit architecture instead of a
runtime check. A clever compiler will optimize this out but it makes the code
more readable.
I also would like to use the PCIGETMEMORY64HIGH() macro (this time in the
correct way).
I will attach the patch.
Comment 11 Egbert Eich 2005-01-25 01:47:09 UTC
I will commit this to TRUNK therefore I assign this to me.
Comment 12 Egbert Eich 2005-01-25 02:02:26 UTC
Created attachment 1746 [details] [review]
updated fix.

Modified fix (see comment above)
Comment 13 Egbert Eich 2005-01-25 02:09:05 UTC
Committed:
2005-01-25  Egbert Eich  <eich-at-freedesktop-dot-org>

        * programs/Xserver/hw/xfree86/common/xf86pciBus.c:
        (FindPCIVideoInfo):
        * programs/Xserver/hw/xfree86/os-support/bus/xf86Pci.h:
        Fix interpretation of 64bit PCI bases: read hi long word
        from the right bar (Michael Yaroslavtsev, Bugzilla #2322).
Comment 14 Mike Frysinger 2005-01-28 15:43:26 UTC
just to say thanks, this patch fixed me up on Gentoo after banging my head
against the wall when my new amd64 PCI-E GeForce 6600GT wasnt working :)

here's a neat pic of what the bug looks like :)
http://dev.gentoo.org/~vapier/amd64-gtk2-fuck-up.png
Comment 15 Petre Agenbag 2005-01-31 05:33:54 UTC
Hi

I am a complete noob and stumbled on this forum in search of a solution to the
problem described in the Bug.

I notice a "patch" has been attached to one of the comments, but I have no idea
on how or where to apply this?

I am running FC3 64bit with AMD3500 processor and an NVidia 6600 Graphics card
on a Gigabyte K8N Ultra SLI board.


Can someone please help me by pointing to the location of the file to be patched
and also the patching ( recompiling?) process.

I did search for the xf86pciBus.c file but did not find it ( might be that I
don't have the source files installed?)

Is there a newer version of Xorg that already contains this fix? If not, plz
help me to know whjich files to install so I have the source files etc.

Thanks you very much
Comment 16 tonic 2005-01-31 17:48:14 UTC
(In reply to comment #14)
> just to say thanks, this patch fixed me up on Gentoo after banging my head
> against the wall when my new amd64 PCI-E GeForce 6600GT wasnt working :)
> 
> here's a neat pic of what the bug looks like :)
> http://dev.gentoo.org/~vapier/amd64-gtk2-fuck-up.png

I have applied the updated fix, I can use the desktop however I am stuck with
8-bit colour. I do not know if this is related to this bug or not though.

tonic

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.