Bug 2880 - PCI config space register access unprotected
Summary: PCI config space register access unprotected
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/DDX/Xorg (show other bugs)
Version: git
Hardware: x86 (IA32) Linux (All)
: high normal
Assignee: Xorg Project Team
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 1690
  Show dependency treegraph
 
Reported: 2005-04-01 12:12 UTC by Kevin E. Martin
Modified: 2006-03-02 12:31 UTC (History)
7 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Use system method to access PCI config space (1.42 KB, patch)
2005-04-01 12:15 UTC, Kevin E. Martin
no flags Details | Splinter Review
Initialize val to 0 (473 bytes, patch)
2005-04-24 15:33 UTC, Alan Hourihane
no flags Details | Splinter Review
Write 0xCF8 without value (599 bytes, patch)
2005-04-25 07:06 UTC, Alan Hourihane
no flags Details | Splinter Review
Use system method for byte and word accesses also (2.72 KB, patch)
2005-04-25 08:11 UTC, Kevin E. Martin
no flags Details | Splinter Review
Use system method for byte/word accesses (take 2) (2.76 KB, patch)
2005-04-27 07:20 UTC, Kevin E. Martin
no flags Details | Splinter Review
Use system method for byte/word accesses (take 3) (3.58 KB, patch)
2005-08-22 13:54 UTC, Kevin E. Martin
no flags Details | Splinter Review
Use system method for all accesses (5.43 KB, patch)
2005-08-22 14:00 UTC, Kevin E. Martin
no flags Details | Splinter Review
Patch to implement byte and word level pci config access (11.50 KB, patch)
2005-09-23 12:04 UTC, Kristian Høgsberg
kem: 6.8-branch?
Details | Splinter Review

Description Kevin E. Martin 2005-04-01 12:12:23 UTC
The int10 module directly accesses PCI config space via the 0xCF8 and 0xCFC
registers.  This causes system lockups on multiprocessor systems when multiple
processes are trying to access PCI config space.
Comment 1 Kevin E. Martin 2005-04-01 12:14:07 UTC
Also, we currently have code to use the platform specific access method that
coordinates access, but it was #ifdef'd out.  Egbert and I discussed this issue
and we agreed that the #ifdefs were not necessary.  The patch that I will attach
next removes them and fixes this bug.
Comment 2 Kevin E. Martin 2005-04-01 12:15:54 UTC
Created attachment 2285 [details] [review]
Use system method to access PCI config space

Credit for this patch goes to Dell.
Comment 3 Kevin E. Martin 2005-04-01 12:25:26 UTC
Comment on attachment 2285 [details] [review]
Use system method to access PCI config space

Make the int10 module use the system method to access PCI config space.  Egbert
and I reviewed this patch.  It has already been checked into HEAD.
Comment 4 Lukas Hejtmanek 2005-04-23 16:04:48 UTC
This patch breaks Video POST code in i810 driver!

xf86Int10InfoPtr pInt = xf86InitInt10(pI830->pEnt->index);

xf86DrvMsg(pScrn->scrnIndex, X_INFO, "SetDisplayDevices failed, re-trying.\n");

/* Now perform our warm boot */
if (pInt) {
        pInt->num = 0xe6;
        xf86ExecX86int10 (pInt);
        xf86FreeInt10 (pInt);
        xf86DrvMsg(pScrn->scrnIndex, X_INFO, "Re-POSTing via int10.\n");
}

This code with "Use system method to access PCI config space" messes up
registers of video card. Without that patch it works OK.
Comment 5 Kevin E. Martin 2005-04-24 15:19:19 UTC
Alan, any ideas on this one?  I'm at a loss here why this change would cause
problems with the i810.
Comment 6 Alan Hourihane 2005-04-24 15:33:56 UTC
Created attachment 2531 [details] [review]
Initialize val to 0

I'm not sure yet, but a quick glance at the previous patch on this bug shows
that maybe this will help.
Comment 7 Lukas Hejtmanek 2005-04-24 16:06:35 UTC
This needs to be done in helper_exec.c and it all works. 

void
x_outl(CARD16 port, CARD32 val)
{
#ifdef PRINT_PORT
    ErrorF(" outl(%#x, %8.8x)\n", port, val);
#endif
-   if (!pciCfg1out(port, val))
+   /* if (!pciCfg1out(port, val))*/
        outl(Int10Current->ioBase + port, val);
}
Comment 8 Kevin E. Martin 2005-04-24 16:21:06 UTC
(In reply to comment #6)
> Created an attachment (id=2531) [edit]
> Initialize val to 0
> 
> I'm not sure yet, but a quick glance at the previous patch on this bug shows
> that maybe this will help. 

I think the two cases are covered since val will be set in the pciCfg1in
function if it returns 1, and if it returns 0, then val will be set by the call
to inl().
Comment 9 Kevin E. Martin 2005-04-24 16:29:05 UTC
(In reply to comment #7)
> This needs to be done in helper_exec.c and it all works. 
> 
> void
> x_outl(CARD16 port, CARD32 val)
> {
> #ifdef PRINT_PORT
>     ErrorF(" outl(%#x, %8.8x)\n", port, val);
> #endif
> -   if (!pciCfg1out(port, val))
> +   /* if (!pciCfg1out(port, val))*/
>         outl(Int10Current->ioBase + port, val);
> }
> 

Unfortunately, this will cause lockups (especially on multiprocessor systems). 
The problem is that when the X server is trying to access PCI config space
through CF8/CFC, there may be another process that is trying to do the same
thing.  If, for example, X sets the config addr and a second process sets it to
something else, then when the X reads/writes the value with a call to
x_inl/x_outl, the it will be reading/writing from the incorrect config space
addr.  The same is true if you reverse the roles of the two processes.  Using
the pciReadLong/pciWriteLong functions make the reading/writing of PCI config
space atomic.

I'm suprised this caused problems on the i810.

Also, adding Alan back to CC list...
Comment 10 Kevin E. Martin 2005-04-24 16:34:29 UTC
Alan, might this be caused by only having x_inl/x_outl use the system method? 
We could easily make the same changes to x_inb/x_outb and x_inw/x_outw.
Comment 11 Lukas Hejtmanek 2005-04-24 16:44:43 UTC
(In reply to comment #9)

> Unfortunately, this will cause lockups (especially on multiprocessor systems). 
> The problem is that when the X server is trying to access PCI config space
> through CF8/CFC, there may be another process that is trying to do the same
> thing.  

When I add ErrorF to see which address is used in x_outl, then I have full log
of messages like this:
outl(0x1800, 00061204)
outl(0x1804, 00000001)

that is far different address then 0xcf8 or 0xcfc. So it is strange for me why
should pciCfg1out even influence the behaviour, but maybe I'm something missing.
Comment 12 Alan Hourihane 2005-04-25 07:06:41 UTC
Created attachment 2538 [details] [review]
Write 0xCF8 without value

This patch fixes the problem. In the patches case it never writes 0xCF8 until
0xCFC is written. I've not debugged exactly why, but obviously something needs
0xCF8 to be written. Lukas tested this patch and it works.
Comment 13 Kevin E. Martin 2005-04-25 08:09:59 UTC
(In reply to comment #12)
> This patch fixes the problem. In the patches case it never writes 0xCF8 until
> 0xCFC is written. I've not debugged exactly why, but obviously something needs
> 0xCF8 to be written. Lukas tested this patch and it works.

Alan, this sound suspiciously like someone is setting the pci config space
address with outl at cf8 and then using an inb/w or outb/w at cfc.  I'll create
a patch to make byte and word reads/writes of config space also use the system
method.
Comment 14 Kevin E. Martin 2005-04-25 08:11:58 UTC
Created attachment 2540 [details] [review]
Use system method for byte and word accesses also

Lukas, could you test this patch without Alan's patch to see if it also fixes
the problem?
Comment 15 Alan Hourihane 2005-04-25 08:13:12 UTC
O.k, but it's not the driver that's mucking about with config space. It's more
than likely the Video BIOS that's doing it.
Comment 16 Kevin E. Martin 2005-04-25 08:21:26 UTC
(In reply to comment #15)
> O.k, but it's not the driver that's mucking about with config space. It's more
> than likely the Video BIOS that's doing it.

Sorry, I meant the BIOS.  I'm guessing that the BIOS on Lukas' system is trying
to read/write bytes and/or words from config space instead of just longs.  This
is the first one that I've seen do this, but I'm not surprised.  Anyway, I think
we need to use the system method for all accesses to config space; otherwise, we
will eventually end up with problems.
Comment 17 Lukas Hejtmanek 2005-04-25 14:06:51 UTC
(In reply to comment #14)
> Created an attachment (id=2540) [edit]
> Use system method for byte and word accesses also
> 
> Lukas, could you test this patch without Alan's patch to see if it also fixes
> the problem? 

Just partly. Video registers are mostly restored ok (about 50 pixels stripe on
the top of screen is ok, but the rest contains garbage and seems like different
resolution)
Comment 18 Lukas Hejtmanek 2005-04-27 02:48:06 UTC
(In reply to comment #17)

> Just partly. Video registers are mostly restored ok (about 50 pixels stripe on
> the top of screen is ok, but the rest contains garbage and seems like different
> resolution)

Anyway, tomorrow I will get new notebook, so I will be unable to test this issue
any more. (It will have i915 video card, but maybe better bios)
Comment 19 Kevin E. Martin 2005-04-27 07:18:13 UTC
> Anyway, tomorrow I will get new notebook, so I will be unable to test this issue
> any more. (It will have i915 video card, but maybe better bios)

Lukas, thank you very much for testing the patches.  I've found a couple
problems with my patch.  I'll attach another one shortly.  If you still have the
laptop today, could you try the next patch out?

Also, could you let me know the make and model of your laptop so that I can see
if I can find a similar one here for testing?
Comment 20 Kevin E. Martin 2005-04-27 07:20:07 UTC
Created attachment 2571 [details] [review]
Use system method for byte/word accesses (take 2)

This one fixes the masking problems with the config address.
Comment 21 Lukas Hejtmanek 2005-04-27 08:14:06 UTC
(In reply to comment #20)
> Created an attachment (id=2571) [edit]
> Use system method for byte/word accesses (take 2)
> 
> This one fixes the masking problems with the config address.

Did not help.

I have Acer TM 242FX laptop.

Comment 22 Lukas Hejtmanek 2005-04-27 08:20:03 UTC
(In reply to comment #20)
> Created an attachment (id=2571) [edit]
> Use system method for byte/word accesses (take 2)
> 
> This one fixes the masking problems with the config address.
> 

However, it does not fail completely, after VT switch to text console and back
it is ok. But this does not happen with previous version without pciCfg patches
or with Alan's patch.
Comment 23 Kevin E. Martin 2005-04-28 07:29:06 UTC
(In reply to comment #22) 
> However, it does not fail completely, after VT switch to text console and back
> it is ok. But this does not happen with previous version without pciCfg patches
> or with Alan's patch.

That's very interesting.  I'd like to figure out what's going on here, but that
will probably not be possible until I get a system that can reproduce the problem.

In the mean time, we could either combine Alan's and my later patch, or we could
just use Alan's patch.  Alan, any preferences?
Comment 24 Alan Hourihane 2005-04-28 07:37:39 UTC
For now, I think we should probably combine them. Undoubtably the system method
is the right way, but at least my workaround still uses the system method to
avoid the lockups.

Thanks Kevin.
Comment 25 Kevin E. Martin 2005-06-03 07:50:13 UTC
Egbert asked that this be assigned to him, so that he can look into it.
Comment 26 Dr John Austin 2005-07-23 19:28:10 UTC
Just to say that it would appear that this bug is one of the
two reasons for problems with Matrox G550 cards (and others) in FC3Tx and FC4
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161242 and others!
The gcc volatile problem causes the Cont/Alt/F1-6 problem
The removal of 
#Patch9337: xorg-x11-6.8.2-use-linux-native-pciscan-by-default.patch
solves the Cont/Alt/F7 corruption problem
See comment 132 among many others
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161242#c132

I apologise if this cross reference is well known and under control
Comment 27 Olivier Baudron 2005-07-24 10:33:09 UTC
Could it be that the BIOS outl at 0xCF8 and inb at 0xCFD (or 0xCFE or 0xCFF) ?
It would explain why Alan's workaround works whereas Kevin's patch misses inb.
Comment 28 Olivier Baudron 2005-07-24 21:39:03 UTC
Created attachment 3129 [details] [review]
fixes endian problem with swapped r and b colors

This patch is based on Kevin's patch. It considers odd video bioses that access
bytes or words at 0xCF9, 0xCFA, 0xCFB and 0xCFD, 0xCFE, 0xCFF.
Unfortunatly, I dont't have such a hardware.
Can someone test this?
Comment 29 Olivier Baudron 2005-07-24 22:02:06 UTC
Created attachment 3130 [details]
Xorg.0.log shows 1600x1200 mode settings and the error message

Errr, sorry, it should be better with this one.
Comment 30 Dr John Austin 2005-07-25 00:14:46 UTC
I have tested the extra patch on Athlon64 i386 FC4 G550
I used Mike's latest xorg-x11-6.8.2-43.src.rpm
and the "standard" FC4 gcc, not the very latest modified one.
I works !!!!!!!!!!!
No screen corruption either at Cont/Alt/F7 or Cont/Alt/F1-6

I have only done superficial couple of boots/reboots

Hope other people find the same
Great work
Regards John
Comment 31 Olivier Baudron 2005-07-25 18:52:46 UTC
Created attachment 3135 [details]
Server side of SolarisIA extension

Fixed interval in pciCfg1outb().
Comment 32 Kevin E. Martin 2005-07-26 01:11:15 UTC
Oliver, great catch.  I'll see if we can test this shortly, and post the results.
Thanks!
Comment 33 Kevin E. Martin 2005-08-22 13:48:25 UTC
We've tested Olivier's updated patch in rawhide for a while now and it appears
to be working as expected.  I'm going to check his patch in to CVS to get
additional testing.  If no further problems are found, we can close this bug.

Unfortunately, it seems that fdo bugzilla has lost his patch as a result of the
recent fdo hard disk failure, so I'll attach it here shortly plus a version with
both fixes in a single patch that could be applied to the 6.8 branch

Thanks again Olivier!
Comment 34 Kevin E. Martin 2005-08-22 13:54:21 UTC
Created attachment 2987 [details] [review]
Use system method for byte/word accesses (take 3)

This is the patch from Olivier Baudron that was lost in the recent fdo disk
crash.
Comment 35 Kevin E. Martin 2005-08-22 14:00:38 UTC
Created attachment 2989 [details] [review]
Use system method for all accesses

This is the combined patch to be nominated for 6.8.3.
Comment 36 Kevin E. Martin 2005-08-22 14:30:36 UTC
Comment on attachment 2989 [details] [review]
Use system method for all accesses

Updated patch to make the int10 module use the system method to access PCI
config space.  It has already received testing in rawhide and has been checked
into CVS HEAD.	This patch combines the previous two patches into one patch to
be applied to the next 6.8 branch release.
Comment 37 Adam Jackson 2005-08-28 12:19:41 UTC
marking as FIXED since this is applied to HEAD.  for 6.8.3, search on the
approval flag, not on bug open status.
Comment 38 Kristian Høgsberg 2005-09-23 12:04:26 UTC
Created attachment 3382 [details] [review]
Patch to implement byte and word level pci config access

The PCI spec only specifies that access to the standardized header of the PCI
config space shall not have side effects when reading and writing from it. 
Data beyond that header doesn't seem to have that guarantee. The matrox vesa
bios generates byte reads and writes beyond the standard header, and when using
the linux /proc way of accessing PCI config space, everything is implemented as
32 bit reads and writes.  This apparently breaks the vesa driver on matrox
cards.	The /proc method will do outw and outb if you do 2-byte and 1-byte
reads from the /proc file, and this updated patch adds support for that.  With
this change, the vesa driver works on matrox again.
Comment 39 Kristian Høgsberg 2005-09-23 12:05:46 UTC
Reopening bug for discussion of updated patch.
Comment 40 Kristian Høgsberg 2005-10-24 15:06:16 UTC
We've had this patch in the Red Hat and Fedora RPMs for a while now and it's
working pretty well.  I'll commit this to HEAD in a few days unless somebody
sees a problem with it.

Kristian
Comment 41 Kristian Høgsberg 2005-11-08 10:58:56 UTC
Patch committed to HEAD.
Comment 42 Kevin E. Martin 2005-11-10 11:04:36 UTC
Comment on attachment 3382 [details] [review]
Patch to implement byte and word level pci config access

This is a better fix for pci config space accesses.
Comment 43 Olivier Baudron 2006-03-03 07:31:03 UTC
Kristian, it may exist 2 small typos in your final patch:

--- xorg-x11-6.8.2-use-linux-native-pciscan-by-default.patch    2006-03-02
21:19:21.000000000 +0100
+++  xorg-x11-6.8.2-use-linux-native-pciscan-by-default.patch   2006-03-02
21:25:19.000000000 +0100
@@ -121,7 +121,7 @@
 +linuxPciCfgReadWord(PCITAG tag, int off)
 +{
 +      int     fd;
-+      CARD16  val = 0xff;
++      CARD16  val = 0xffff;
 +
 +      if (-1 != (fd = linuxPciOpenFile(tag))) {
 +              lseek(fd, off, SEEK_SET);
@@ -163,7 +163,7 @@
 +
 +  if ((bus >= 0) && ((bus < pciNumBuses) || inProbe) && pciBusInfo[bus] &&
 +      pciBusInfo[bus]->funcs->pciReadWord) {
-+    CARD32 rv = (*pciBusInfo[bus]->funcs->pciReadWord)(tag, offset);
++    CARD16 rv = (*pciBusInfo[bus]->funcs->pciReadWord)(tag, offset);
 +
 +    return(rv);
 +  } else {


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.