Bug 39094

Summary: WaitFor does not handle EIO (causes 100% cpu load)
Product: xorg Reporter: Chris Wilson <chris>
Component: Server/General Assignee: Adam Jackson <ajax>
Status: NEW --- QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: high CC: jeremyhu, tiago.vignatti
Version: git   
Hardware: Other   
OS: All   
Whiteboard: 2012BRB_Reviewed
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 40982, 44202    

Description Chris Wilson 2011-07-09 07:26:40 UTC
The device goes into a permanent input-ready state causing X's WaitForSomething to never actually wait.

open("/dev/vga_arbiter", O_RDWR)        = 10
read(10, "count:1,PCI:0000:00:02.0,decodes"..., 64) = 64
write(10, "target PCI:0000:00:02.0", 23) = 23
read(10, "count:1,PCI:0000:00:02.0,decodes"..., 64) = 64
select(256, [1 5 9 10 12 13 14 15 16 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52], NULL, NULL, {89, 869000}) = 1 (in [10], left {89, 868993})
clock_gettime(CLOCK_MONOTONIC, {17445, 90572053}) = 0
ioctl(10, TCFLSH, 0x2)                  = -1 EIO (Input/output error)
<repeat ad infinitum>

Even more annoying, this is on a laptop with only a single igfx and no expansion capabilities...

Hopefully this is in fact an old bug and already fixed.
Comment 1 Jeremy Huddleston Sequoia 2011-10-12 02:17:51 UTC
This is a tad out of my area of expertise, but it smells like this might be kernel related.  When did you start seeing this issue?  What kernel version do you have?

Doing a google search for TCFLSH EIO returns some possibly similar results:

but those all look like an EIO of a TCFLSH of /dev/tty*

1) we should "deal" with the EIO elegantly in src/os/WaitFor.c

2) What can cause us to EIO /dev/vga_arbiter?  Will that occur if some other process open(2)s it?
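
To illustrate point 1, here is a minimal sketch of one way a select(2) loop could stop spinning on a descriptor that is always "ready" but fails with EIO: drop it from the watched set the first time its handler reports EIO. This is not the server's actual WaitForSomething code. The names wait_once, fd_handler and failing_handler are hypothetical and exist only for this example, and treating EIO as unrecoverable is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical handler type: returns 0 on success, or -1 with errno set. */
typedef int (*fd_handler)(int fd);

/* Stand-in handler that mimics the failing ioctl from the strace output:
 * it always fails with EIO. */
static int
failing_handler(int fd)
{
    (void) fd;
    errno = EIO;
    return -1;
}

/*
 * Wait once on the watched set and dispatch every ready descriptor.
 * A descriptor whose handler fails with EIO is removed from the watched
 * set, so it cannot keep waking select() forever (assumption: EIO is
 * treated as unrecoverable for that descriptor).
 * Returns the number of descriptors dropped, or -1 if select() fails
 * for a reason other than EINTR.
 */
static int
wait_once(fd_set *watched, int maxfd, struct timeval *timeout,
          fd_handler handle)
{
    fd_set ready = *watched;    /* select() overwrites its argument */
    int dropped = 0;
    int n;

    assert(watched != NULL && handle != NULL);

    n = select(maxfd + 1, &ready, NULL, NULL, timeout);
    if (n < 0)
        return (errno == EINTR) ? 0 : -1;

    for (int fd = 0; fd <= maxfd && n > 0; fd++) {
        if (!FD_ISSET(fd, &ready))
            continue;
        n--;
        if (handle(fd) < 0 && errno == EIO) {
            FD_CLR(fd, watched);    /* stop polling the broken descriptor */
            dropped++;
        }
    }
    return dropped;
}
```

With this shape, the second call to wait_once no longer sees the broken descriptor, so select() actually blocks until its timeout instead of returning immediately. Whether dropping the descriptor is the right policy for /dev/vga_arbiter depends on the answer to question 2.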

Also, I'm a bit confused why we're select(2)ing the /dev/vga_arbiter descriptor in X's WaitForSomething.  I don't see how that descriptor is exposed by libpciaccess.  I don't see "vgaarb_fd" in the server source, and I don't see a way in libpciaccess to add that descriptor to an fd_set.

AFAICT, that descriptor is *owned* by libpciaccess, so the server should not have it in its read fd_set.  Am I missing something here?
Comment 2 Jeremy Huddleston Sequoia 2011-10-12 02:19:45 UTC
Let's take care of #1 (handle EIO properly) in xserver-1.12, but I'm still perplexed about why that descriptor is in our fd_set ...
