Bug 13743 - [GM965] Display goes blank when starting X, or when switching VTs
Summary: [GM965] Display goes blank when starting X, or when switching VTs
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Hong Liu
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-19 16:32 UTC by Richard Goedeken
Modified: 2008-06-08 08:44 UTC

See Also:
i915 platform:
i915 features:


Attachments
xorg.conf (1.69 KB, text/plain)
2007-12-20 19:13 UTC, Richard Goedeken
Xorg.0.log (29.68 KB, text/plain)
2007-12-20 19:13 UTC, Richard Goedeken
Register output (7.87 KB, text/plain)
2008-02-06 05:49 UTC, Richard Goedeken
Xorg log from branch test (11.18 KB, text/plain)
2008-02-12 19:22 UTC, Richard Goedeken

Description Richard Goedeken 2007-12-19 16:32:09 UTC
 - System: AOpen MiniPC MP965-DR
 - CPU/RAM: Core 2 Duo T7500, 2 GB RAM
 - OS: Fedora 8
 - X server: xorg 1.4.99.1-0.10.fc9
 - intel_drv.so: built from git checkout 12/09
 - mesa/drm drivers: built from git checkout 12/09
 - system outputs: VGA, TMDS-1, LVDS, TV
 - TV: 32" Sony WEGA, NTSC 480i only

This bug documents an individual problem - several problems were reported at once in #13611, and I am now separating them as requested.

This bug occurs about 50% of the time when starting the Red Hat graphical boot (RHGB) screen, when starting the X server before the login screen, or when switching to a text VT or back to the X VT.  Rather than showing the correct text or graphical display, the screen remains blank.  This has occurred both with the output attached to a VGA monitor and with it connected to a TV.

When this problem occurs, it may be worked around by switching between text and graphical VTs repeatedly until the correct display appears.
Comment 1 Gordon Jin 2007-12-19 19:30:24 UTC
This looks like a duplicate of #13252. It is already fixed in git and will be in the next release.

*** This bug has been marked as a duplicate of bug 13252 ***
Comment 2 Richard Goedeken 2007-12-20 05:46:42 UTC
This bug still exists.  I did a 'git pull origin' on the morning of 12/20, rebuilt the driver, and re-tested.  I can still duplicate this problem in both VGA-only and TV-only configurations.

When running in the TV-only configuration, I switched to a text VT and back to the graphical VT 10 times.  The text display came up every time.  However the graphical display failed to come up 3 times out of 10.  I believe that this is an improvement over the previous behavior, but still a problem.
Comment 3 Gordon Jin 2007-12-20 17:36:30 UTC
So this is a new problem.
Is the startx issue gone now? If so, the only remaining issue is switching back to X, and I'd suggest opening a new bug to highlight it, with Xorg.0.log and xorg.conf attached. Thanks.
Comment 4 Richard Goedeken 2007-12-20 19:13:13 UTC
Created attachment 13261 [details]
xorg.conf
Comment 5 Richard Goedeken 2007-12-20 19:13:28 UTC
Created attachment 13262 [details]
Xorg.0.log
Comment 6 Richard Goedeken 2007-12-20 19:21:46 UTC
The startx issue remains.  The symptom that I experience (blank display) happens randomly during either driver startup or when switching Virtual Terminals.  I did more testing tonight, using the driver that I built from the git head this morning.  It appears that the intel driver is started and initialized twice during the Fedora boot process - once at Redhat Graphical Boot startup, and the second time during X startup, right before the X/Gnome login screen.  During my testing tonight, I rebooted the MiniPC 7 times without touching any keys during the boot process and recorded the results:

Test | RHGB screen | X Login screen | Display connected
---------------------------------------------------------
1    | OK          | Blank          | TV only (component)
2    | Blank       | OK             | TV only (component)
3    | Blank       | Blank          | VGA only
4    | Blank       | OK             | VGA only
5    | OK          | Blank          | VGA only
6    | OK          | OK             | VGA only
7    | OK          | OK             | VGA only
Comment 7 Jesse Barnes 2008-01-03 11:56:32 UTC
Hm, so this one is different from the pipe enable/disable bug.  But if the situation improved following that patch, it may be that we actually need to let the PLLs settle even longer than 10ms.  Can you try patching your driver to let them settle for 200ms?  You should only have to modify i830_dpll_settle() in i830_driver.c...

--- a/src/i830_driver.c
+++ b/src/i830_driver.c
@@ -1995,7 +1995,7 @@ SaveHWState(ScrnInfoPtr pScrn)
 static void
 i830_dpll_settle(void)
 {
-    usleep(10000); /* 10 ms *should* be plenty */
+    usleep(200000); /* 200 ms *should* be plenty */
 }

 static Bool
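
When several settle times need to be tried, it may help to avoid rebuilding the driver for each value. The following is only a sketch, not part of the driver: it assumes a hypothetical environment variable (called I830_DPLL_SETTLE_MS here, a made-up name) whose text is converted into a microsecond count suitable for usleep(), falling back to the original 10 ms when the value is missing or implausible.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_SETTLE_US 10000L   /* the driver's original 10 ms */
#define MAX_SETTLE_MS     1000L    /* arbitrary sanity cap chosen for this sketch */

/*
 * Convert a textual millisecond override into microseconds for usleep().
 * `text` would come from getenv("I830_DPLL_SETTLE_MS"), a hypothetical
 * variable name; NULL, junk, or out-of-range input yields the default.
 */
long settle_delay_us(const char *text)
{
    char *end = NULL;
    long ms;

    if (text == NULL || *text == '\0')
        return DEFAULT_SETTLE_US;

    errno = 0;
    ms = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || ms <= 0 || ms > MAX_SETTLE_MS)
        return DEFAULT_SETTLE_US;

    assert(ms * 1000L > 0);        /* cannot overflow given the cap above */
    return ms * 1000L;
}
```

Inside i830_dpll_settle() the constant would then become something like usleep(settle_delay_us(getenv("I830_DPLL_SETTLE_MS"))), so that 10, 50, 200 and so on can be compared without recompiling.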
Comment 8 Richard Goedeken 2008-01-03 16:32:48 UTC
I rebuilt with your i830_driver.c patch and rebooted with the new driver, but alas the problem remains.  Both RHGB and X failed to bring up the display properly. When switching between VTs a couple of dozen times, the text screen came up every time, but the graphical display failed to come up about 25% of the time.
Comment 9 Håvard Wigtil 2008-01-30 13:16:04 UTC
I'm also having this problem. Hardware is an AOpen MiniPC MP965-DR. I've tried drivers 2.1.1, 2.2.0 and git as of today, but I can never reliably get X to start on a PAL TV when it is the only output connected. Switching VTs eventually gives me X, and X always comes up if I use the VESA driver instead.

Is there anything I can do to help debug this?
Comment 10 Jesse Barnes 2008-02-05 17:57:04 UTC
This one's tough since we can't reproduce it in-house.

If we're lucky we'll see some register differences between the working and non-working cases.  Can one of you try building the intel_reg_dumper tool and dumping state from non-working and working states?
Comment 11 Richard Goedeken 2008-02-06 05:49:44 UTC
Created attachment 14175 [details]
Register output

I ran an experiment whereby I switched between text and X VTs.  Out of 5 tries, the graphical display came up 2 times and failed to come up 3 times.  After switching to X each time I dumped the registers, so I had 3 'bad' cases and 2 'good' ones.  I did an MD5sum on the resulting 5 text files and all of them were identical (1aa584177c2d15c12ecfd05fc06aa122).  I have attached one of the register dumps here.
Comment 12 Jesse Barnes 2008-02-12 14:51:31 UTC
The register dump indicates that TV is enabled but VGA is off, so you were doing these tests with just the TV connected?
Comment 13 Richard Goedeken 2008-02-12 15:25:48 UTC
Correct - this bug causes the same symptoms in either the TV-only configuration or the VGA-only configuration (the only two I've tested).  This particular test was done with only the TV connected.
Comment 14 Jesse Barnes 2008-02-12 15:48:26 UTC
Richard, I have a git tree at people.freedesktop.org/~jbarnes/xf86-video-intel that contains some startup mode setting changes.  Can you give it a try?  I went over the current code several times today trying to figure out what could be causing your startup problems and couldn't find anything, but there's an off chance that some of the ordering changes I made in my tree will make a difference...
Comment 15 Richard Goedeken 2008-02-12 19:22:23 UTC
Created attachment 14293 [details]
Xorg log from branch test


I built the driver from your branch and rebooted, and the X server crashed.  I have attached the Xorg.log, which includes a back trace.

Is it possible that this bug could be caused by the DRM kernel module and not the 2D driver?

Is there any kind of hardware signal analysis that I could do to help?  Since the MiniPC is so small and portable, I could take it in to work where we have digital scopes and spectrum analyzers.  I'm not sure what I'd be looking for, though. ;)
Comment 16 Michael Fu 2008-02-13 18:21:09 UTC
(In reply to comment #11)
> Created an attachment (id=14175) [details]
> Register output
> 
> I ran an experiment whereby I switched between text and X VTs.  Out of 5 tries,
> the graphical display came up 2 times and failed to come up 3 times.  After
> switching to X each time I dumped the registers, so I had 3 'bad' cases and 2
> 'good' ones.  I did an MD5sum on the resulting 5 text files and all of them
> were identical (1aa584177c2d15c12ecfd05fc06aa122).  I have attached one of the
> register dumps here.
> 

Richard, would you please tell us how you did the regdump, especially when the bad case happens? Thanks.
Comment 17 Richard Goedeken 2008-02-13 19:45:44 UTC
(In reply to comment #16)
> 
> Richard, would you please tell us how did you do the regdump, especially when
> the bad case happen... thanks.
> 

I do all the driver building/installing while connected remotely via SSH from my desktop PC (totally different system - Gentoo w/ Nvidia card).  I couldn't find any documentation on this register dumping tool, so I did something like 'find . |grep reg_dump' from the xf86-video-intel driver directory, and I found a binary that was built at: ./src/reg_dumper/intel_reg_dumper.  I figured this was the right program and ran it, and got the message "Couldn't probe graphics card: Permission denied".  So I ran it with sudo and it gave me reasonable-looking data.

For the experiment, the machine had been freshly booted into a working X session.  I hit Ctrl-alt-F1 on my wireless keyboard (connected to the MiniPC) to switch to a text VT, then Ctrl-alt-F7 to go back to X.  I then ran the reg dumper (with sudo) in the SSH session and redirected the output to a file named according to whether the display had come up correctly or not.  Out of 5 consecutive tries doing this, the display came up correctly twice and did not come up 3 times.

One odd thing that I've noticed about this bug.  When the X display is up properly and I switch to a text VT, the text 'pops' up instantly, like I would expect it to.  But when the X display has failed to come up, and I switch back to a text VT, the text sort of 'walks' onto the screen from the left side.  It's a very interesting effect that I haven't seen before.
Comment 18 Michael Fu 2008-02-13 22:28:04 UTC
(In reply to comment #17)
> (In reply to comment #16)
> 
> For the experiment, the machine had been freshly booted into a working X
> session.  I hit Ctrl-alt-F1 on my wireless keyboard (connected to the MiniPC)
> to switch to a text VT, then Ctrl-alt-F7 to go back to X.  I then ran the reg
> dumper (with sudo) in the SSH session and redirected the output to a file named
> according to whether the display had come up correctly or not.  Out of 5
> consecutive tries doing this, the display came up correctly twice and did not
> come up 3 times.
> 
>

Could you get a working X on all 5 of these tries after you did a fresh boot? Did you also use the key combination even if you got a blank screen after the fresh boot?

I ask because we once hit a weird case where a blank screen came back after the reg dump tool was used, and I'm wondering if your case is like that too.
Comment 19 Richard Goedeken 2008-02-14 05:10:04 UTC
(In reply to comment #18)
> 
> Can you get a working X for all these 5 tries after you did a fresh boot?

I'm not 100% sure I understand this question.  I always do a cold reboot, due to a 3D driver bug that I've filed here.  Sometimes when starting up X (maybe less than 10% of all times), the X server locks up and I have to reboot again.  Out of the remaining times, when X is starting from a cold boot, sometimes the display will come up properly, and sometimes I will get a blank screen.  I have never seen a case where I could not cause a blank screen by switching between text and graphical VTs - ie, even when the display works from a fresh boot, I can reproduce this bug by switching VTs.

> did you also do the combination key action even if you get a blank screen
> after fresh boot? 

In the cases where I have a blank screen when X is starting up (and X has not crashed), I can always eventually get a good display by switching between VT 1 and 7.

> I ask this because we once met some weird case that a blank screen comes back
> after the reg dump tool used and I'm wondering if your case is like that too..

I have not seen this behavior, but I didn't try running the reg_dumper tool too many times either.

Comment 20 Michael Fu 2008-02-14 05:25:34 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > 

> In the cases where I have a blank screen when X is starting up (and X has not
> crashed), I can always eventually get a good display by switching between VT 1
> and 7.
> 
Did you use the reg_dump tool before or after you "eventually get a good display"? That's what I want to confirm.  In the "bad" case, you should also be able to use the reg_dump tool from your remote SSH connection, and that's what we want.

Comment 21 Richard Goedeken 2008-02-14 06:19:36 UTC
(In reply to comment #20)
> did you use the reg_dump tool before or after you "eventually get a good
> display"? that's what I want to confirm.  In the "bad" case, you should also be
> able to use reg_dump tool from your remote SSH connection..and that's what we
> want.
> 

In all cases the reg dump was taken from the remote SSH login session.  For the 'bad' cases, the reg dump was taken while the main display was blank.  For the 'good' cases, it was taken while the screen was showing the X desktop.

I ran another test this morning - rather than using Ctrl-alt-F* to reproduce the blank display bug, I just rebooted several times and grabbed a reg dump (remotely) from one instance when the X display came up properly, and a reg dump from another instance when the display was blank after X startup.  Again, the 'good' and 'bad' dumps gave the same md5sum, though not the same as the register output which I previously attached to this bug report.  I did a diff between the two and attached the results here.

09:15:46  Desktop > diff intel-reg-dump-old.txt intel-reg-dump-new.txt 
8c8
< (II):       RENCLK_GATE_D1: 0x20000000
---
> (II):       RENCLK_GATE_D1: 0x70000000
93c93
< (II):         TV_CLR_KNOBS: 0x00404000
---
> (II):         TV_CLR_KNOBS: 0x00202000
160,161c160,161
< (II):                 CR0a: 0x1f
< (II):                 CR0b: 0x1e
---
> (II):                 CR0a: 0x0d
> (II):                 CR0b: 0x0e
164,165c164,165
< (II):                 CR0e: 0x01
< (II):                 CR0f: 0x4f
---
> (II):                 CR0e: 0x00
> (II):                 CR0f: 0x00
167c167
< (II):                 CR11: 0x8e
---
> (II):                 CR11: 0x0e
184c184
< (II):                 CR22: 0x20
---
> (II):                 CR22: 0x00
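
For anyone comparing dumps by hand, a small helper can make the changed bits easier to see than a textual diff. This is a sketch under assumptions: it relies only on the "(II): NAME: 0xVALUE" line layout visible in the diff above, and the function names are invented for illustration rather than taken from intel_reg_dumper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one dump line such as "(II):       RENCLK_GATE_D1: 0x20000000".
 * A leading diff marker ("< " or "> ") is tolerated.
 * Returns 1 on success, 0 if the line does not look like a register line.
 * `name` must have room for at least 32 bytes.
 */
int parse_reg_line(const char *line, char *name, unsigned long *value)
{
    const char *p;

    assert(line != NULL && name != NULL && value != NULL);

    p = strstr(line, "(II):");
    if (p == NULL)
        return 0;

    /* %31[^:] reads the register name up to the colon that follows it. */
    return sscanf(p, "(II): %31[^:]: %lx", name, value) == 2;
}

/* Bits that differ between two readings of the same register. */
unsigned long changed_bits(unsigned long old_value, unsigned long new_value)
{
    return old_value ^ new_value;
}
```

Applied to the diff above, changed_bits(0x8e, 0x0e) is 0x80 for CR11, and changed_bits(0x20000000, 0x70000000) is 0x50000000 for RENCLK_GATE_D1. Note that this diff compares the earlier attachment with the new dump, not a 'good' dump with a 'bad' one, since those were still identical.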
Comment 22 Chris Brewer 2008-02-26 09:33:45 UTC
(In reply to comment #21)
Using ssh, I started X directly (/usr/bin/X) and then stopped it with ctrl-C 100 times. Over these iterations the bug manifested itself 10 times.

I then tried switching virtual terminals 100 times (ctrl-alt-F1, ctrl-alt-F3).
Interestingly, this yielded only 3 failures.

I had earlier noted bug #13271 (Black Screen on X startup with intel driver and i965GM). In that report Hong Liu requested that the original poster try turning on the "ModeDebug" option in xorg.conf for the intel driver.

I tried setting this option to "true" to see the register values when the bug manifested itself. What I discovered is that with this flag set I was able to repeat both these tests 100 times with zero failures.

I have no idea why this is happening, but this may be a candidate for a workaround until the actual problem is resolved.
Comment 23 Chris Brewer 2008-02-26 09:38:59 UTC
(In reply to comment #22)
One other observation I forgot to post earlier: on my system, when the bug manifests itself and shows a black screen, and when I switch to another virtual terminal, the display seems to recover briefly just before it's killed and the other virtual terminal is displayed.
Comment 24 Michael Fu 2008-02-29 15:20:03 UTC
Chris, I'm wondering if we have finally found a "lucky" user who can reproduce bug #14018, which we thought might be a HW issue since it was on a prototype machine. Thanks for your persistence! :)

We should consider adding this kind of testing. CCing Gordon and reassigning to Hong.

Comment 25 Richard Goedeken 2008-03-01 15:27:24 UTC
I tried setting ModeDebug to true (as noted in bug #13271) and there was no change; the problem still persists on my AOpen MiniPC.  I get a failure rate (blank screen on X startup or VT switch) of about 60%, much higher than the rates noted for that bug.

I also previously tested with the patch given in bug #14018 (read register 0x3ccc), and that also had no effect.
Comment 26 Chris Brewer 2008-03-03 07:39:39 UTC
I've done a little research since my original posts.

Note that I started with xf86-video-intel.2.2.0.90.

After noticing that setting the "modedebug" flag to "true" seemed to work in my configuration, I attempted to find a subset of the debug code that corrected the error. To do so I copied the executed debug code from i830_debug.c into i830_driver.c and added a call to this new subroutine at the end of i830PreInit().

I then iteratively removed sections of the code and started X up to 100 times (less if it failed :)) to see if the removed code made a difference. After several days of testing I discovered that the following code seems to be effective in my configuration:

Up near the top of i830_driver.c:
<snip-------------------->
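	/* Read ST01 once, choosing the port from the MSR I/O address select bit. */
	void experimental_hack(ScrnInfoPtr pScrn)
	{
		I830Ptr pI830 = I830PTR(pScrn);	/* used implicitly by INREG8() */
		int msr = INREG8(0x3cc);	/* Miscellaneous Output Register (read port) */
		uint16_t st01;

		if (msr & 1)
			st01 = 0x3da;	/* colour I/O range */
		else
			st01 = 0x3ba;	/* monochrome I/O range */

		(void)INREG8(st01); /* make sure index/write register is in index mode */
	}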
<snip-------------------->

Call made at the end of i830PreInit():
<snip-------------------->
	experimental_hack(pScrn);
<snip-------------------->

Documentation on the 965GM indicates that the ST01 register is modal, toggling between 'index' and 'data' modes, and reading it forces it to a known state. My guess is that some code somewhere in the driver is assuming that ST01 is in the 'index' mode when it actually isn't.

If this is true, I don't understand why the driver would randomly fail. Is there some indeterminate order of execution occurring? Perhaps someone with better knowledge of X's internal workings (i.e., almost anyone reading this post!) has some notion of what's going on.

This hack isn't perfect- in my tests the driver still fails to start properly, perhaps once every 100 restarts or so. I assume that this is because the call isn't being made in the right part of the code (I know that the end of i830PreInit() is *not* correct, as this patch does not fix the associated issue with virtual terminals), and it leaves a 'timing window' in which executing driver code would be affected by the improper state of ST01.
Comment 27 Chris Brewer 2008-03-04 09:35:30 UTC
More tests.

I attempted to follow the path of execution using gdb to see why the screen is apparently restored upon exiting the VT (but before the new VT is entered). I placed a breakpoint at I830LeaveVT and cycled between VT's until the bug occurred. I then changed VT's once more, and when gdb stopped at I830LeaveVT, the screen had already been restored: apparently something else in X shook it loose before I830LeaveVT was allowed to run. I attempted to dig further into X but became discouraged by the sheer scope of the learning opportunity :) .

In another approach I removed all previous calls to 'experimental_hack' and instead placed calls to it near the top of the following routines (in each case, after pScrn is set):
 - i830AdjustFrame
 - I830LeaveVT
 - I830EnterVT
 - I830SwitchMode
 - I830CloseScreen
 - I830PMEvent
 - I830CheckDevicesTimer (after the 'if(...) return 1000;' clause)

By iteratively removing calls I discovered that the only call that really did anything was the one in I830CheckDevicesTimer. With the call placed in this function, I could both switch virtual terminals and enter/exit X 100 times without seeing the black screen once (compared to roughly 1 in 20-30 attempts without the call). However, it introduced a new artifact where the video occasionally 'glitched' as the screen was initialized. The 'glitch' rate is comparable to the earlier failure rate, and I think it's caused by I830CheckDevicesTimer periodically resetting ST01 to the 'correct'(?) state.

Note that I haven't had the chance yet to see if this change causes any other undesirable side effects.

I have no idea why resetting the state of ST01 should have any effect at all on the system. It definitely does in the configuration I'm using- removing the patch restores the 'black screen' errors.

BTW, I have seen one other rare (i.e., 1 in 200? 300?) manifestation of the 'blank screen' bug where the screen goes to red instead of black. With the modification mentioned above, I saw a 'red screen' change from red to a 'pinkish' color to a normal screen, presumably upon each call to I830CheckDevicesTimer. I have no idea why this would occur, but I definitely saw it.

Periodically resetting ST01 doesn't sound to me like the proper approach to a fix. I'll continue trying to hunt down the root cause.

Richard- I'm wondering if we have fundamentally different issues, or if the configuration you're using is failing consistently and the one I'm using fails randomly. Could you try calling the 'experimental hack' from I830CheckDevicesTimer and see if it makes a difference in your system?
Comment 28 Chris Brewer 2008-03-04 10:20:36 UTC
(In reply to comment #27)
Fooey! In my previous post(s) I mistakenly referred to ST01 as being modal. What I meant to write was that register 0x3c0 (ARX) is modal, that reading from ST01 (either 0x3ba or 0x3da) forces it back to the 'index' mode, and that the problem seems to occur when ARX is in the 'wrong'(?) state.

I apologize for the confusion.
Comment 29 Richard Goedeken 2008-03-08 13:06:39 UTC
(In reply to comment #27)
> Richard- I'm wondering if we have fundamentally different issues, or if the
> configuration you're using is failing consistently and the one I'm using fails
> randomly. Could you try calling the 'experimental hack' from
> I830CheckDevicesTimer and see if it makes a difference in your system?
> 

I don't know if we are looking at the same issue here.  I rebuilt with your suggested changes and tested a little.  I still get blank screens quite frequently.  The last time I tested it seemed to give me a blank screen about 60% of the time when switching back into X.  During the test I just informally ran, it seemed much lower - more like 20 or 30%.  This isn't scientific though so I wouldn't conclusively say that this improved the problem.  It certainly didn't fix it. :)
Comment 30 huli 2008-03-11 02:30:51 UTC
I have the same problem. My configuration:
965GME chipset
standard kernel 2.6.24
X server 1.4
xf86-video-intel-2.1.1

The X server was built entirely from the newest sources at
http://xorg.freedesktop.org/archive/X11R7.3/src/

There's no window manager, GNOME or KDE, and I've connected only one CRT monitor.
I start the X server by running X directly. I get a black screen about 50% of the time.
While the screen is black, the monitor's power LED is green, which suggests (maybe) that the sync signal is OK. The strange thing is that if I log in remotely over ssh while the screen is black, the screen becomes OK right after I read a byte from one of the I/O ports 0x3c0~0x3df. Not every register has this effect every time. I just use "inb" under bash to do this.
PS. If I turn "ModeDebug" on, X seems to come up OK every time I start it. I think this may be because the driver performs a "dumpreg" action when "ModeDebug" is true, which reads some I/O ports. I'm still testing.
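
For others who want to repeat this port-read experiment without an "inb" utility, the following is a minimal sketch. It assumes a Linux system that exposes /dev/port and a caller with root privileges; the function names are invented here. Whether a given read restores the display is exactly what is still being tested, so this only shows how such a read could be issued.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define VGA_PORT_FIRST 0x3c0u
#define VGA_PORT_LAST  0x3dfu

/* Returns 1 if `port` lies in the legacy VGA range 0x3c0..0x3df. */
int is_vga_port(unsigned int port)
{
    return port >= VGA_PORT_FIRST && port <= VGA_PORT_LAST;
}

/*
 * Read one byte from a legacy VGA I/O port through /dev/port.
 * Returns the byte (0..255), or -1 if the port is outside the VGA range
 * or the device cannot be opened or read (it normally requires root).
 */
int read_vga_port(unsigned int port)
{
    unsigned char byte = 0;
    int fd;
    int result = -1;

    if (!is_vga_port(port))
        return -1;

    fd = open("/dev/port", O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset in /dev/port is the I/O port number. */
    if (lseek(fd, (off_t)port, SEEK_SET) == (off_t)port &&
        read(fd, &byte, 1) == 1)
        result = byte;

    close(fd);
    assert(result >= -1 && result <= 255);
    return result;
}
```

Calling read_vga_port(0x3da) from the ssh session while the screen is black would correspond to one of the reads described above. Some of these ports have read side effects, so each port is best tried on its own.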
 
Comment 31 Chris Brewer 2008-03-14 09:34:15 UTC
(In reply to comment #30)
FYI, I just tried repeating my earlier start/stop X experiments, but this time using a different motherboard (BCM MX965GME) and stock Ubuntu 7.10 (gutsy gibbon) with xorg-video-intel-2:2.1.1-0ubuntu9.1. I saw a black screen in 4 out of 25 restarts.
Comment 32 Axel Thimm 2008-03-16 03:57:15 UTC
Although this seems to be the bug I see as well, I experience the opposite of what is being described here. X always comes up, but switching to VT consoles blanks the LCD 100% of the time (sometimes with a split second where the VT is displayed before being turned off). The blanking looks like a backlight switch-off. Switching back to X sometimes works (depending on the driver update from Fedora/RH).

The system is a Toshiba Satellite L40-16D, Core2 Duo T2310, 2GB RAM with

00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)

running Fedora rawhide which currently carries:
# rpm -q xorg-x11-server-Xorg mesa-libGL xorg-x11-drv-i810  kernel
xorg-x11-server-Xorg-1.4.99.901-8.20080310.fc9.x86_64
mesa-libGL-7.1-0.20.fc9.x86_64
mesa-libGL-7.1-0.20.fc9.i386
xorg-x11-drv-i810-2.2.1-12.fc9.x86_64
kernel-2.6.25-0.121.rc5.git4.fc9.x86_64

I originally filed this in Fedora's bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=436636

Despite its name the xorg-x11-drv-i810 contains the http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.2.1.tar.bz2  sources and has a couple of patches like

Patch2: intel-2.1.1-fix-vt-switch.patch
Patch3: intel-2.1.1-fix-xv-reset.patch
Patch4: intel-2.1.1-fix-xv-compiz.patch
Patch5: intel-2.1.1-efi.patch
Patch6: intel-2.1.1-pciaccess-version.patch
Patch7: intel-quirk-dvo.patch
Patch8: intel-2.1.1-disable-ttm.patch

Patch100: intel-master.patch
Patch101: intel-batchbuffer.patch
Patch102: intel-modeset.patch
Patch103: intel-disable-stepping.patch
Patch104: intel-fix-modeset-vt-switch.patch

I also tried setting ModeDebug, but that didn't change anything.
Comment 33 Hong Liu 2008-03-17 18:47:37 UTC
It seems there are two kinds of problems in this bug.

For Richard's problem, we are still not sure what's going on.

The other problem looks like a duplicate of bug 14018. Chris, thanks for your patience in testing the bug and for providing such a good report.

Would you please search for "pI830->debug_modes" in i830_driver.c and add your experimental_hack there (there are 3 places: I830PreInit, I830EnterVT, I830LeaveVT)? Enabling the ModeDebug option calls i830DumpRegs at these 3 places.

Currently we don't know why reading VGA registers solves the problem. If we can find the exact place where the VGA registers need to be read, we can add a hack there to work around the problem until we find the root cause.

BTW, I suggest you try I830EnterVT first.
Thanks again for your help.

Thanks,
Hong
Comment 34 Chris Brewer 2008-03-18 07:43:10 UTC
(In reply to comment #33)
Hong- please see comment #27, dated 2008-03-04. The hack did not restore video for me when placed in I830PreInit, I830EnterVT, or I830LeaveVT: only I830CheckDevicesTimer seemed to work.


Comment 35 Hong Liu 2008-03-18 17:32:50 UTC
(In reply to comment #34)
> (In reply to comment #33)
> Hong- please see comment #27, dated 2008-03-04. The hack did not restore video
> for me when placed in I830PreInit, I830EnterVT, or I830LeaveVT: only
> I830CheckDevicesTimer seemed to work.
> 

I've read comment #27. You said you put the hack at the top of these functions; that may not work. Putting it in I830CheckDevicesTimer should work, since that is executed periodically. You also said that enabling the ModeDebug option solves your problem, so I am suggesting you put your hack where the registers are read when ModeDebug is enabled.

Thanks,
Hong

Comment 36 Chris Brewer 2008-03-19 09:30:02 UTC
(In reply to comment #35)

Hong-

Ah, I see what you're looking for now.

As a control I first tried restarting X with the unmodified driver. For some reason the bug seemed reluctant to manifest itself. X failed once at the first startup, but then there was a streak of over 50 restarts without seeing the error. The bug began showing up somewhat more often after that, but not nearly as often as my earlier tests.

FWIW the cabinet was opened halfway through the control test (and then left open) to double-check the physical configuration. This dropped the cabinet temperature slightly. I don't know if it's meaningful that the error rate seemed to increase at this point- it could simply be coincidence. I'll watch this in later tests.

Once I determined that the error was still occurring, the driver was patched by adding a call to the hack in I830EnterVT, very close to where the registers are dumped in debug mode. Here's an excerpt- I hope this is what you had in mind:

     if (pI830->debug_modes) {
          xf86DrvMsg(pScrn->scrnIndex, X_INFO, "Hardware state at EnterVT:\n");
          i830DumpRegs (pScrn);
     }
+    experimental_hack(pScrn); // <<<<<<<<<
     i830DescribeOutputConfiguration(pScrn);

Since I was concerned about the error rate being sporadic, I alternated between starting X 25 times *without* the patch and 25 times *with* the patch. Here are the results:
   Without patch: 25 restarts, 3 errors.
   With patch:    25 restarts with no errors.
   Without patch: 25 restarts, 4 errors.
   With patch:    25 restarts with no errors.
   Without patch: 25 restarts, 1 error.

So this particular modification seems to work in my configuration. I'm surprised because I thought I had done this earlier. Apparently I had made a mistake.
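
As a rough check on whether 50 clean restarts could be luck, one can ask how likely that streak would be at the unpatched error rate. The sketch below assumes every restart is independent and fails with the pooled unpatched rate of 8 errors in 75 restarts (3 + 4 + 1). The long error-free streak in the control run above suggests independence may not hold, so treat the result as indicative only.

```c
#include <assert.h>

/*
 * Probability of observing zero failures in `trials` independent runs
 * when each run fails with probability failures/total.
 */
double prob_all_pass(int failures, int total, int trials)
{
    double pass;
    double p = 1.0;
    int i;

    assert(total > 0 && failures >= 0 && failures <= total && trials >= 0);

    pass = 1.0 - (double)failures / (double)total;
    for (i = 0; i < trials; i++)
        p *= pass;
    return p;
}
```

With these figures, prob_all_pass(8, 75, 50) comes out well under 1%, so under the stated assumption the patched result is unlikely to be chance alone.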

I will run more thorough tests and try the other locations that you suggested when I have the opportunity.
Comment 37 Axel Thimm 2008-03-20 11:54:51 UTC
A workaround for the issue I see is to "modprobe video" before X starts.

There is some discussion about this on
https://bugzilla.redhat.com/show_bug.cgi?id=436556

Comment 38 Chris Brewer 2008-03-20 12:02:21 UTC
The redhat thread seems to be referring to LCD backlights not being enabled.
I have been using a CRT.

Comment 39 Richard Goedeken 2008-03-21 17:51:24 UTC
I spent a long time testing this today and found that with builds from git going back to mid-december, it only occurs on the TV output.  I must have been doing something wrong in my previous testing methodology in comment #6.  I think I may have been installing the intel_drv.so to the wrong place in my early testing, because I was able to reproduce this problem today on the VGA output when I booted from the Fedora 8 live CD (circa Nov 8 2007).  However, I also built drm and xf86-video-intel from git snapshots going back to mid December and tested with the VGA monitor, and for each build I did at least 25 test VT switches to text and back to X, and it never failed.

So at this point this bug only occurs with the TV output, but it is still present in the git head.
Comment 40 Hong Liu 2008-03-23 19:47:50 UTC
(In reply to comment #39)
> I spent a long time testing this today and found that with builds from git
> going back to mid-december, it only occurs on the TV output.  

Would you please do a git-bisect to find which commit causes the VGA output to stop working?
It may also help in debugging the TV output.

Thanks,
Hong
Comment 41 Richard Goedeken 2008-03-26 10:09:19 UTC
(In reply to comment #40)
> Would you please do a git-bisect to find which commit makes the VGA not
> working?
> It may be helpful to debug the TV output also.

That's exactly what I was trying to do when I discovered my previous error.  The problem is that the issue only occurs on the VGA output with the drivers originally shipped with Fedora 8.  By looking at the RPM list, they appear to be version "2.1.1-7.fc8", which suggests that they're based on the 2.1.1 drivers but with other Fedora changes.

I tried building the 2.1.1 driver last night for testing but it would not build - it appears the compiler died in i810.h because there was no previous definition for the 'EntityInfoPtr' type -- missing header file maybe?  I was able to build a driver from the master branch from 08/27, which is only a few weeks away in time.  But this driver did not exhibit the problem on the VGA output.

Maybe I could get the source for the FC8 drivers and do a diff against 2.1.1 to see what changed.  But it seems like searching for this problem on the VGA output may be going in the wrong direction - we have no assurance that the two problems are related, other than similar symptoms.  Can you point me to a top-level function in the source which is responsible for handling the new display mode setup on the TV output?  Maybe I can tinker with it and figure out what's wrong.  Is there any documentation for the TV output chip?
Comment 42 Hong Liu 2008-03-27 01:32:30 UTC
(In reply to comment #41)
> Can you point me to a
> top-level function in the source which is responsible for handling the new
> display mode setup on the TV output?  Maybe I can tinker with it and figure out
> what's wrong.  Is there any documentation for the TV output chip?
> 

The TV code is in src/i830_tv.c; the main modesetting code is in the function i830_tv_mode_set.

The modesetting sequence should be
     crtc->modeset (i830_crtc_mode_set)
     output->modeset (i830_tv_mode_set)
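
When instrumenting this sequence, it can help to record the order in which the two hooks actually run. The following toy model only illustrates the ordering stated above. The struct and function names are hypothetical stand-ins, not the driver's real CRTC and output types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one step of the mode-setting sequence. */
struct toy_step {
    const char *name;
    void (*mode_set)(char *log, size_t *len);
};

static void toy_crtc_mode_set(char *log, size_t *len)   { log[(*len)++] = 'C'; }
static void toy_output_mode_set(char *log, size_t *len) { log[(*len)++] = 'O'; }

/* Run the steps in array order, recording each call in `log`. */
size_t toy_set_mode(const struct toy_step *steps, size_t n, char *log, size_t cap)
{
    size_t i;
    size_t len = 0;

    assert(steps != NULL && log != NULL && n < cap);

    for (i = 0; i < n; i++)
        steps[i].mode_set(log, &len);
    log[len] = '\0';
    return len;
}

/* The order given above: CRTC first, then the output. */
static const struct toy_step toy_sequence[] = {
    { "crtc->modeset (i830_crtc_mode_set)", toy_crtc_mode_set },
    { "output->modeset (i830_tv_mode_set)", toy_output_mode_set },
};
```

Running toy_set_mode(toy_sequence, 2, log, sizeof log) leaves "CO" in the log. In the real driver the same idea would be a pair of debug messages at the top of i830_crtc_mode_set and i830_tv_mode_set.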
Comment 43 Richard Goedeken 2008-06-08 08:44:43 UTC
Hong,

The patch in comment #21 of bug 14000 fixed this problem on my MiniPC, both when X starts and when switching between text and X VTs.  I'm closing this bug.

