Bug 2500 - i830 dualhead + power management calls = unknown exception
Summary: i830 dualhead + power management calls = unknown exception
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Alan Hourihane
Reported: 2005-02-08 12:03 UTC by David Bronaugh
Modified: 2006-01-26 09:33 UTC

Attachments
Patch which fixes it (515 bytes, patch)
2005-02-08 12:15 UTC, David Bronaugh
Do not disable all pipes patch (1.45 KB, patch)
2005-02-08 12:35 UTC, Alan Hourihane
Log with patches applied (377.72 KB, text/plain)
2005-02-08 13:40 UTC, David Bronaugh
Xorg configuration (5.16 KB, text/plain)
2005-02-08 15:14 UTC, David Bronaugh
Avoid unnecessary pipe switch (644 bytes, patch)
2005-02-08 15:18 UTC, Alan Hourihane
Dont switch pipe unnecessarily (replacement) (630 bytes, patch)
2005-02-09 00:32 UTC, Alan Hourihane
Disable GetModeSupport and GetLFPCompMode (695 bytes, patch)
2005-02-09 01:44 UTC, Alan Hourihane
Cleans up PM func (2.12 KB, patch)
2005-02-09 02:34 UTC, David Bronaugh
Xorg log, noaccel + swcursor (130.40 KB, text/plain)
2005-02-09 14:12 UTC, David Bronaugh
xf86EnableAccess traces compared (1.16 KB, text/plain)
2005-02-14 03:05 UTC, David Bronaugh

Description David Bronaugh 2005-02-08 12:03:37 UTC
On my Panasonic R1N laptop (uses an i830 chip) mplayer causes an unhandled
exception to be logged by the X server. This is the result (apparently) of a
general protection fault handled by the linux int10 module due to a BIOS call to
handle power management while the video overlay is active.
Comment 1 David Bronaugh 2005-02-08 12:15:19 UTC
Created attachment 1866
Patch which fixes it

The attached patch fixes the exception but doesn't do it in a particularly
pretty way.
Comment 2 David Bronaugh 2005-02-08 12:29:07 UTC
Just confirmed that this does -not- happen without dualhead enabled.
Comment 3 Alan Hourihane 2005-02-08 12:35:36 UTC
Created attachment 1867
Do not disable all pipes patch

Try this patch which should hopefully fix it.
Comment 4 David Bronaugh 2005-02-08 13:12:45 UTC
(In reply to comment #3)
> Created an attachment (id=1867)
> Do not disable all pipes patch
> 
> Try this patch which should hopefully fix it.

Unfortunately it fixes nothing to do with this problem. If you look at the log
after sticking a debug print at the top of the I830DisplayPowerManagementSet
function, you'll see that the error happens when I830DisplayPowerManagementSet
calls SetPipeAccess (which calls SetBIOSPipe), and later (in the same function)
where it makes its own custom int10 call with ax=4f10.

Then the error is repeated for the 2nd head.
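
For reference, the ax=4f10 call mentioned above is the VBE/PM power-management function. The following is a minimal sketch of how the registers for its "set display power state" subfunction are usually laid out, assuming the standard VBE/PM encoding (BL=01h, requested state in BH). The struct int10_regs type and both function names are hypothetical stand-ins for illustration, not the driver's or the X server's actual code.

```c
#include <stdint.h>

/* DPMS levels as numbered by the X DPMS extension. */
enum dpms_mode { DPMS_ON = 0, DPMS_STANDBY = 1, DPMS_SUSPEND = 2, DPMS_OFF = 3 };

/* Hypothetical stand-in for the int10 register file (assumption:
 * this is not the X server's real int10 structure). */
struct int10_regs {
    uint16_t ax, bx, cx, dx;
};

/* Fill in the registers for VBE/PM "set display power state".
 * Returns 0 on success, -1 if the DPMS mode is not recognised. */
int vbe_pm_prepare(struct int10_regs *r, enum dpms_mode mode)
{
    uint16_t state;

    switch (mode) {
    case DPMS_ON:      state = 0x00; break;
    case DPMS_STANDBY: state = 0x01; break;
    case DPMS_SUSPEND: state = 0x02; break;
    case DPMS_OFF:     state = 0x04; break;
    default:           return -1;
    }

    r->ax = 0x4f10;                          /* VBE/PM function */
    r->bx = (uint16_t)((state << 8) | 0x01); /* BH = state, BL = 01h (set) */
    r->cx = 0;
    r->dx = 0;
    return 0;
}

/* A VBE call succeeded if it returns AL=4Fh (function supported)
 * and AH=00h (call successful). */
int vbe_call_ok(uint16_t ax_after)
{
    return ax_after == 0x004f;
}
```

With DPMS_OFF this yields ax=0x4f10 and bx=0x0401. A real caller would then run the int10 handler and check the returned AX; the exception discussed in this bug happens inside that BIOS call, not in the register setup.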
Comment 5 David Bronaugh 2005-02-08 13:40:23 UTC
Created attachment 1868
Log with patches applied

This is the log created with both of your patches applied and a debug print
statement at the top of I830DisplayPowerManagementSet
Comment 6 Alan Hourihane 2005-02-08 13:57:42 UTC
David, just because it calls SetBIOSPipe and that's where the crash happens
doesn't mean SetBIOSPipe is at fault; something before it causes the BIOS to
get upset. The logic behind the patch is that the plane the BIOS is being set
to is being disabled.

So don't think it has nothing to do with it.
Comment 7 David Bronaugh 2005-02-08 14:05:57 UTC
(In reply to comment #6)
> David, Just because it calls SetBIOSPipe and that's where the crash happens,
> something before it causes the BIOS to get upset. The logic behind the patch is
> that the plane is being disabled in which the BIOS is being set to.

Fair enough. I just wanted to make sure you had the largest possible amount of
background information.
Comment 8 Alan Hourihane 2005-02-08 15:03:20 UTC
Right, so I need a little more information on what you are doing to trigger the
problem.

I can see you are playing a video. Do you then run 'xset dpms ...' to force a
DPMS mode, or something else?
Comment 9 David Bronaugh 2005-02-08 15:07:10 UTC
(In reply to comment #8)
Actually, I'm not doing anything. mplayer makes some kind of call (I gather when
it starts) to prevent the screen from blanking when a video is playing. I'm
guessing that when mplayer shuts down, it restores the previous blanking status,
and this is when the bug is triggered.
Comment 10 Alan Hourihane 2005-02-08 15:09:40 UTC
Presumably the video is constantly on the first head and not being moved around
to the second?
Comment 11 Alan Hourihane 2005-02-08 15:10:22 UTC
Oh, can you attach your config too?
Comment 12 David Bronaugh 2005-02-08 15:14:52 UTC
Created attachment 1869
Xorg configuration

The video's always staying on the first head. I haven't been moving it back and
forth or anything particularly strange.

Driver options in config have been trimmed down to a minimum.
Comment 13 Alan Hourihane 2005-02-08 15:18:48 UTC
Created attachment 1870
Avoid unnecessary pipe switch

This patch should avoid the problem for now, although I suspect it can re-occur
if you moved the video to the second head. 

I'll need to dig deeper into this.
Comment 14 David Bronaugh 2005-02-08 20:45:09 UTC
Hmm, that doesn't work. Instead of 2 exceptions, I now get 3. By the looks of
the stack, it first calls GetBIOSPipe (this causes an exception), then calls
SetBIOSPipe (which causes an exception), then calls the x86 emulator/vm86emu
code with 4f10 in ax.

Is there some way to set up some sane state on the card before making BIOS
calls? Also, is there any other way to cause I830DisplayPowerManagementSet to be
called? I've only found mplayer to do this... xset doesn't seem to.
Comment 15 Alan Hourihane 2005-02-09 00:32:09 UTC
Created attachment 1872
Dont switch pipe unnecessarily (replacement)

Sorry, made an error on that last patch. This one should avoid the problem.
Comment 16 Alan Hourihane 2005-02-09 00:35:11 UTC
Having said that, though, I think the BIOS got messed up well before the call to
I830DisplayPowerManagementSet, so any other BIOS call might hit this exception
as well. We might just be postponing the problem with that last patch.

So we need to trace which BIOS call was issued last before getting to this stage
and find out why it caused the BIOS to get confused.
Comment 17 David Bronaugh 2005-02-09 00:38:14 UTC
OK this definitely happens without Xv active. Finally confirmed that. 'xset dpms
force <anything>' certainly causes the same problem.

So this is definitely a dualhead/dpms interaction. I've changed the summary
appropriately.
Comment 18 David Bronaugh 2005-02-09 00:40:10 UTC
Comment on attachment 1866
Patch which fixes it

This doesn't fix the problem; it only masks it in the case of mplayer.
Comment 19 David Bronaugh 2005-02-09 00:49:17 UTC
Can you think of any other functions that make BIOS calls on i830 while the card
is in a fully initted state? I tried switching to console; but this first
restores the HW state using VESA BIOS calls and such before it calls
SetBIOSPipe. I'd like to see if there's another example of this (which I'll
test); or if perhaps this is an isolated example of a function that makes a BIOS
call during "normal" operation.
Comment 20 Alan Hourihane 2005-02-09 01:10:27 UTC
BIOS calls can happen at various times depending on what's happening.

You're going to have to trace the driver calls and find out which one is causing
the BIOS to get messed up. You've already got the debug enabled; it's just a case
of backtracking through it.
Comment 21 David Bronaugh 2005-02-09 01:35:34 UTC
Comment on attachment 1872
Dont switch pipe unnecessarily (replacement)

This causes dualhead to not function at all. The second head (in this case the
LFP) displays nothing, and the first head is useless (can't pull up menu for
window manager).
Comment 22 Alan Hourihane 2005-02-09 01:44:21 UTC
Created attachment 1873
Disable GetModeSupport and GetLFPCompMode

Try this David.

It disables two BIOS calls that are not required, but are used for
informational purposes. I've had problems with these on other systems, so I may
remove them altogether if it works for you.
Comment 23 Alan Hourihane 2005-02-09 01:45:41 UTC
Comment on attachment 1873
Disable GetModeSupport and GetLFPCompMode

Don't use this. It's completely bogus. I'll go and grab a coffee :-)
Comment 24 Alan Hourihane 2005-02-09 01:46:02 UTC
Comment on attachment 1873
Disable GetModeSupport and GetLFPCompMode

Argh, wrong patch. Definitely use this.
Comment 25 Alan Hourihane 2005-02-09 01:46:29 UTC
Comment on attachment 1872
Dont switch pipe unnecessarily (replacement)

This one is quite rightly bogus. This is where I go and get my coffee.
Comment 26 Alan Hourihane 2005-02-09 02:17:09 UTC
To clear up the confusion over the last few posts:

Please let me know what happens with this patch, David...

https://bugs.freedesktop.org/attachment.cgi?id=1873
Comment 27 David Bronaugh 2005-02-09 02:32:04 UTC
I tried with your patch to disable GetModeSupport and GetLFPCompMode; no
difference.

I disabled acceleration (Option "noaccel" on both heads) to eliminate possible
interaction due to Xv init or other such things; no difference.

This pretty much left me with the following running between the last SetBIOSPipe
call that succeeded, and the one that failed:
 - ResetState
 - SetHWOperatingState
 - SetFenceRegs
 - I830InitHWCursor
 - I830BIOSAdjustFrame
 - I830BIOSScreenInit
 - Bunch of other cursor functions
 - and right before the SaveBIOSPipe calls, etc -- I830BIOSSaveScreen

I haven't forgotten that your first patch dealt with this. What's the purpose of
this function?
Comment 28 David Bronaugh 2005-02-09 02:34:24 UTC
Created attachment 1874
Cleans up PM func

This just cleans up the power management function. It eliminates redundant code
copy-and-pasted from vbe.c and simplifies the checks for the function from
whence the code was ripped.

It's not pertinent to this per se, but it reduces the line count... and that's
good IMO :)
Comment 29 David Bronaugh 2005-02-09 03:02:29 UTC
OK more things I tried.

1) #if 0'd out the whole I830BIOSSaveScreen function -- no difference
2) put I830Sync and DO_RING_IDLE above all the BIOS calls in PM func -- no diff 
Comment 30 Alan Hourihane 2005-02-09 03:40:00 UTC
You can use

Option "swcursor"

to remove all the cursor calls.
Comment 31 Alan Hourihane 2005-02-09 04:53:16 UTC
Also,

I'd put the function call...

SetPipeAccess(pScrn);

at the start of each function that's called before the DPMS call, and see
whether that narrows down the functions in which the problem occurs.
Comment 32 David Bronaugh 2005-02-09 13:20:45 UTC
OK, I enabled swcursor, I put SetPipeAccess at the top of every function except
the init ones (sig11 if I did that). Result: Nothing blows up. No exceptions, no
nothing.

I put SetPipeAccess at the top of I830BIOSSaveScreen -- and X locks up
immediately after displaying the crosshatch pattern and the X cursor.

Nothing in the log about SetBIOSPipe either, interestingly enough, when that
happens.

This is total guesswork, but could it be something like a buffer offset being
wrong with dualhead so that some scratch RAM or something used by the BIOS is
stomped on when, say, an offscreen pixmap is allocated?
Comment 33 Alan Hourihane 2005-02-09 13:49:49 UTC
I doubt it. All my machines run dual head without this type of problem.

So, you are using "noaccel" and "swcursor" and it still emits the same exception?

Can you post another log with these two options enabled?
Comment 34 David Bronaugh 2005-02-09 14:12:51 UTC
Created attachment 1881
Xorg log, noaccel + swcursor

Xorg log with noaccel and swcursor enabled; dualhead + xinerama (though
with/without makes no difference)
Comment 35 David Bronaugh 2005-02-09 14:19:56 UTC
Oh, another question -- do you end up with Xv corruption on your setup?
Offscreen pixmaps seem to collide with the overlay.

I do here; that's why I ask.
Comment 36 Alan Hourihane 2005-02-09 14:56:08 UTC
I did get corruption - yes, and I've fixed that in my driver. But it's not the
cause of this problem.

Can you try adding a 'return;' at the very top of the function SetBIOSMemSize()?
Comment 37 David Bronaugh 2005-02-09 15:55:27 UTC
Added a 'return' at the top of SetBIOSMemSize; no difference.

Do you have a patch ready for the Xv corruption fix?
Comment 38 Alan Hourihane 2005-02-10 04:57:13 UTC
I've just committed the Xv fix.

As for the driver, can you make sure that there are DPRINTFs at the start of
every function, to ensure the trace in the log file is accurate?
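
One way to get a consistent entry trace is a small macro placed as the first statement of every function. The sketch below is only an illustration under stated assumptions: TRACE_ENTER, last_entered, enter_count and the two Example* functions are hypothetical names, and the output goes to stderr rather than through the driver's own DPRINTF or the server log.

```c
#include <stdio.h>

/* Most recently traced function and a running sequence number.
 * Both are hypothetical helpers for this sketch. */
const char *last_entered = "";
unsigned long enter_count = 0;

/* Put this as the first statement of a function to log its entry.
 * __func__ (C99) expands to the enclosing function's name. */
#define TRACE_ENTER()                                          \
    do {                                                       \
        last_entered = __func__;                               \
        fprintf(stderr, "[%lu] enter %s\n", ++enter_count,     \
                __func__);                                     \
    } while (0)

/* Placeholder functions standing in for driver entry points. */
void ExampleSaveScreen(void)
{
    TRACE_ENTER();
    /* ... real work would go here ... */
}

void ExampleAdjustFrame(void)
{
    TRACE_ENTER();
    /* ... real work would go here ... */
}
```

The sequence number makes it easy to spot gaps in the log, and last_entered can be inspected from gdb if the server stops before the log is flushed.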
Comment 39 Alan Hourihane 2005-02-10 08:03:24 UTC
Another question.

Once X is running, without having run mplayer (essentially from a fresh
startup), can you VT switch at all?
Comment 40 David Bronaugh 2005-02-10 14:21:24 UTC
I've tried switching VTs; this works fine, before and after running mplayer /
whatever else.

I'll play around with the driver some more tonight.
Comment 41 David Bronaugh 2005-02-11 04:25:07 UTC
Ugh. I just tried pretty much everything I could think of, in rapid-fire
succession. So far as I can tell, I modified every function that modified
registers or sent stuff out via the ring buffer to print out when it was running.

I got to the point where NOTHING was (apparently) running after the last
SetBIOSPipe. I defanged LidTimer just in case, and put SetBIOSPipe at the top
and bottom of I830BIOSSaveScreen. The story is -- as soon as the driver's done
running the init function, SetBIOSPipe fails with the usual exception. At the
top -and- bottom of I830BIOSSaveScreen -- so it has nothing to do with this
function.

So that leaves me with, well, nothing more to test. I even tried adjusting the
base of memory forward by 256k or so, just to see if that did anything -- didn't
seem to (pI830->LinearAddr = pI830->pEnt->device->MemBase + 0x40000). I limited
VideoRAM to 4096kb, just for fun - no difference.

Got any ideas? I'm totally out; and that's kind of remarkable.
Comment 42 David Bronaugh 2005-02-11 13:53:00 UTC
OK I modified the register dumper to dump a few more registers; maybe you can
make heads or tails out of the output. I don't know if I've thrown in i810
registers there, because i810_reg.h isn't particularly clear on what's what. I
just threw in whatever looked relevant.

As usual, this is with swcursor and noaccel

0x71400 == 0x8104008e
DISPLAY_CNTL == 0x80000000
PIPEBCONF == 0x80000000
DISPLAY_BASE == 0x00000000
SWF0 == 0x00020801
SWF1 == 0x00001508
SWF2 == 0x00200155
SWF3 == 0x45450000
SWF4 == 0xc0000000
SWF5 == 0x00008249
SWF6 == 0x00000000
ADPA == 0x80000018
DVOA == 0xc000409c
DVOA_SRCDIM == 0x00000000
DVOB == 0x00000000
DVOB_SRCDIM == 0x00000000
LVDS == 0x00000000
LP_RING TAIL == 0x00000000
LP_RING HEAD == 0x00000000
LP_RING START == 0x00000000
LP_RING LEN == 0x00000000
PGETBL_CTL == 0x0ff60001
FENCE == 0x00000000
HWSTAM == 0xffffffff

The only thing I noticed is that SWF3 changes on its own. Modifying SWF3
manually results in some funny sync effects on the CRT (wiggling); so I suppose
it has something to do with sync (duh).
Comment 43 David Bronaugh 2005-02-11 13:53:54 UTC
Oh yeah. That register dump is the same before and after SetBIOSPipe starts
causing exceptions.
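
Comparing two such dumps by eye is error-prone, so a small helper can do it. This is a minimal sketch, assuming both snapshots list the same registers in the same order; struct reg_sample and diff_regs are hypothetical names and are not part of the driver's register dumper.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One named register value from a dump (hypothetical type). */
struct reg_sample {
    const char *name;
    uint32_t value;
};

/* Print every register whose value differs between the two snapshots
 * and return how many differed. Both arrays must hold n entries that
 * describe the same registers in the same order. */
size_t diff_regs(const struct reg_sample *before,
                 const struct reg_sample *after, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (before[i].value != after[i].value) {
            printf("%-12s 0x%08" PRIx32 " -> 0x%08" PRIx32 "\n",
                   before[i].name, before[i].value, after[i].value);
            changed++;
        }
    }
    return changed;
}
```

A register that changes on its own, as SWF3 is reported to do in comment 42, would show up in every comparison and would have to be ignored when looking for the change that matters.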
Comment 44 Alan Hourihane 2005-02-12 02:03:29 UTC
Can you try this.....

In the DPMS function at the top, add this...

SetBIOSPipe(pScrn, 0);

and leave the rest of the function as is, with no patches from either of us.
Comment 45 David Bronaugh 2005-02-12 02:17:33 UTC
Just tried this. No difference.

I've been looking at figuring out what code the int10 call actually executes;
what would be the best procedure for going about this? I know enough x86
assembly to get by here; and I can always refer to a reference if I don't know.

I'm thinking if I knew what code was actually executing, I might have more of a
clue about how to go about solving the problem.
Comment 46 Alan Hourihane 2005-02-12 02:23:57 UTC
Use gdb to single step through the call from a remote ssh session.
Comment 47 David Bronaugh 2005-02-12 02:27:05 UTC
Hmm, that'll work in the case of the vm86 calls?

I was thinking more along the lines of dumping the BIOS to a file and printing
out where the int10 code is calling in it. But I'm all for better ideas.
Comment 48 Alan Hourihane 2005-02-12 02:34:50 UTC
Go with what you're comfortable with.
Comment 49 Alan Hourihane 2005-02-12 04:08:37 UTC
(In reply to comment #41)
The thing is, if you are VT switching fine after the Xserver is up and running,
then you should notice that SetBIOSPipe also gets called. That's why I asked if
VT switching was working.

So, if that is working, then it's very strange that the DPMS call into
SetBIOSPipe would cause this exception, especially after the Xserver is up and
running.

I would probably step through what happens when the Xserver has finished
initialization to the point where the DPMS function is called and see what
really is going on.

Use an Xserver with very limited clients. Starting X with 'xinit' should get
you an xterm, which is enough to run the 'xset dpms ...' command.
Comment 50 Alan Hourihane 2005-02-12 04:09:18 UTC
Also, Egbert (also monitoring this) might be able to provide help on debugging
BIOS images. He's done it before.
Comment 51 David Bronaugh 2005-02-14 03:05:28 UTC
Created attachment 1898
xf86EnableAccess traces compared

So I got my 2nd machine set up, etc, and tooled around in gdb a bit.

The output shows some stuff here that looks a bit shaky to me. The BusAccess
member of pScrn is initialized here, but access is not (in the dualhead case).
However, when pciSetBusAccess is entered, nothing is done. This wouldn't -seem-
to be the intent of these functions...

I've been doing this all on 2.6.10-rc3; tried on 2.6.7-rc2 just recently, and
no difference. This is to eliminate the possibility of a recent kernel change
mucking things up.
Comment 52 David Bronaugh 2005-02-14 03:10:46 UTC
Further explanation on the gdb output:

This function is called when the dpms handler is set up to be executed. DPMSSet
is called, it calls the xf86EnableAccess function, which calls
I830BIOSSaveScreen; then the I830DisplayPowerManagementSet function is called.
This is why I'm suspicious of this particular item. There doesn't seem to be
anything else which would directly affect this in the trace I did. Mind you it
was not exhaustive, and particularly in this "field" I seem to be more wrong
than right.
Comment 53 Alan Hourihane 2005-07-12 23:45:06 UTC
David,

Have you tried my recent driver?

What's the status at your end on this?
Comment 54 Alan Hourihane 2006-01-24 08:45:58 UTC
David, do you still have the machine for testing this?
Comment 55 Alan Hourihane 2006-01-27 04:33:27 UTC
I'm closing this as FIXED now, as I'm pretty sure the driver in CVS as of today
will have made an impact on this.

Reopen if there are still problems.
