Bug 16781

Summary: Lockup when starting with second screen attached
Product: xorg
Component: Driver/Radeon
Status: RESOLVED WORKSFORME
Severity: normal
Priority: medium
Version: git
Hardware: Other
OS: All
Reporter: Friedrich Gräter <graeter>
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
QA Contact: Xorg Project Team <xorg-team>
CC: felipe.contreras, johan, paulatgm
Attachments (flags: none for all):
The X server log of the crashed session
Xorg.conf
vbios.rom
Video BIOS image
Make sure crtc is enabled before attempting to blank/unblank
Xorg logfile
don't unblank until crtc is initialized
Xorg.log for logout test
Xorg.log for boot up test
set initialized to false on closescreen()

Description Friedrich Gräter 2008-07-19 05:05:51 UTC
Created attachment 17762 [details] [review]
The X server log of the crashed session

If I start X with a second screen attached to my VGA output, the X server locks up and the built-in screen of my notebook stays black. I can shut down my notebook by pressing the power button, but I can't use the keyboard. If I attach the second screen after X has started, everything works fine. Even suspend-2-ram works without problems.

FGLRX had similar problems (suspend-2-ram failed if a second head was attached).

I'm using the latest git version of the driver and xorg-server-1.4.99.905 from the Gentoo x11 overlay.
Comment 1 Friedrich Gräter 2008-07-19 05:11:00 UTC
Created attachment 17763 [details]
Xorg.conf
Comment 2 Friedrich Gräter 2008-07-20 05:52:33 UTC
Okay, I found out that the entire system also freezes when removing or attaching an external monitor during a running X session. Unfortunately I can't get any logging information because the system freezes immediately.

Sometimes there are also freezes some seconds after resuming from suspend-2-ram (but this could be related to another problem).

Also the mouse pointer is flickering near the right edge of the second screen (and sometimes randomly anywhere on the screen). This problem disappears in single-screen mode.
Comment 3 paul 2008-08-07 09:50:36 UTC
I have the same problem on my mobility X1400 (aka M54).

Like the OP, I can attach a VGA after X is running and use xrandr to enable it, but if I restart X, by logout or reboot, with the VGA attached, then I get a black screen on the laptop and VGA.  The same happens with the S-video out (see comment #4 on bug 16178).  I can attach it after X is running and use xrandr with it, but logout or reboot results in a black screen on both.

One difference from the OP: I can use Ctrl-Alt-Del to reboot.  However, the consoles do not work (i.e. Ctrl-Alt-F1).

Another difference is that I have not had any freeze when connecting or disconnecting a VGA or S-video.

Another difference is that I do not have any resume problems.

I'm running ubuntu hardy with git packages from the tormod xorg-pushers repository (xserver 1.4.99.906, mesa, libdrm, and ati from git) and I also upgraded libxrandr to 1.2.3.

By the way, I enabled S-video by undoing your disabling patch, so I could test it.
Comment 4 Friedrich Gräter 2008-10-25 00:18:11 UTC
Today I played a little with my configuration, using the recent git release.

Starting with a second head attached still doesn't work. However, the behaviour has changed: the X server fails and all screens stay black, but I can use my keyboard and I'm able to execute commands from the shell. Running dmesg doesn't reveal anything - it just looks like a normal startup, but the X server fails somewhere without any failure message (the Xorg.0.log is still the same as the one attached to the bug).

Today I tried this additional configuration, without success:

	Option		"DisplayPriority"	"High"
	Option		"DefaultTMDSPLL"	"on"
	Option		"DefaultConnectorTable"	"on"
Comment 5 Friedrich Gräter 2008-10-30 04:20:40 UTC
Okay, today I took a look inside the driver's source code and tried to debug it. Unfortunately I have no other machine to use SSH and GDB, so I tried to debug it with xf86DrvMsg() statements. (I know, it is an ugly way to do that...)

And I think I found the place where the driver crashes:

Call trace:

[...]
radeon_crtc_dpms
atombios_crtc_dpms 
atombios_blank_crtc
RHDAtomBiosFunc
req_func => rhdAtomExec
rhdAtomExec
ParseTableWrapper
ParseTable

In ParseTable (Decoder.c, line 223, latest git) there is the following loop:

if (!CD_ERROR(ParserTempData.Status))
{
    ParserTempData.Status = CD_SUCCESS;

    while (!CD_ERROR_OR_COMPLETED(ParserTempData.Status))
    {
        ...
    } // while

} // if

The program executes the loop one time and leaves it, because the condition is false. However I extended the loop by the following tracing code:

if (!CD_ERROR(ParserTempData.Status))
{
    xf86DrvMsg(0, X_ERROR, "[E5-1]\n");
    ParserTempData.Status = CD_SUCCESS;

    while (!CD_ERROR_OR_COMPLETED(ParserTempData.Status))
    {
        ...
        xf86DrvMsg(0, X_ERROR, "[E5-1-2]\n");
        xf86DrvMsg(0, X_ERROR, "[E5-1-2 %i > %i]\n", ParserTempData.Status, CD_SUCCESS);

    } // while
    xf86DrvMsg(0, X_ERROR, "[E5-1-3]\n");
} // if

The last output of the X server to Xorg.0.log is "E5-1-2 0 > 0". The output "E5-1-3" never appears in the log file. It looks like the program never leaves the loop and the X server is stuck in some kind of endless loop.

This explains why I can still shut down the system with CTRL+ALT+DEL, but have no other access to the terminal.

I just don't know enough about AtomBIOS to fix the bug myself. Could you give me some hints on where I can investigate further?
Comment 6 Friedrich Gräter 2008-10-30 06:21:42 UTC
Okay, I did some further investigations by myself. 

It seems to be a bug in the AtomBIOS code that is executed when unblanking the screen. It could also be a bug in the interpreter.

I put some trace code into the AtomBIOS byte code interpreter, which tells me what the AtomBIOS is doing.

I found out that it hangs in an endless loop, executing the following instructions:

0xEE9335BC Op-Code 0x4A | Test 0x9, AtiReg(0x1827+0x200)
0xEE9335C1 Op-Code 0x44 | Jump [Equal]

These two instructions are executed over and over.

The value of AtiReg(0x1827+0x200) is 0x2.

I'm now trying to figure out whether it is a bug in the firmware or a bug in the processing of the jump instruction by ProcessJump.

(It's a little time-consuming, because I have to reboot my computer for each test.)
Comment 7 Friedrich Gräter 2008-10-30 07:30:11 UTC
Okay, to me it looks like an issue with the AtomBIOS. The faulty code is:

0xEE9335BC Op-Code 0x4A | Test 0x9, AtiReg(0x1827+0x200)
0xEE9335C1 Op-Code 0x44 | Jump [Equal] EE9335BC

where "AtiReg(0x1827+0x200) = 0x2".

Assuming "TEST" has the same semantics as on x86, the problem is that

0x9 & 0x2 = 0 => Equal

As a consequence, the jump is always taken and execution never leaves the loop. Perhaps the AtomBIOS code reads from the wrong register, or some steps of the initialization are wrong?
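
The reasoning above can be illustrated with a small C model of the two traced instructions. This is only a sketch: it assumes, as the comment does, that TEST ANDs its operands and sets "equal" when the result is zero, and the names test_sets_equal, poll_loop, stuck_reg and ready_reg are hypothetical and do not come from the driver or the AtomBIOS interpreter. An iteration cap stands in for the real hang so the model terminates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-like TEST semantics: AND the operands and report
 * "equal" (zero flag set) when the result is 0. */
static bool test_sets_equal(uint32_t mask, uint32_t reg_value)
{
    return (mask & reg_value) == 0;
}

/* Hypothetical model of the TEST / JUMP-if-equal pair.
 * read_reg stands in for reading AtiReg(0x1827+0x200).
 * Returns the iteration on which the jump was not taken,
 * or -1 if the loop was still spinning after max_iterations. */
static int poll_loop(uint32_t mask,
                     uint32_t (*read_reg)(void),
                     unsigned max_iterations)
{
    for (unsigned n = 1; n <= max_iterations; n++) {
        if (!test_sets_equal(mask, read_reg()))
            return (int)n;   /* jump not taken: execution falls through */
    }
    return -1;               /* never left the loop within the cap */
}

/* Register stuck at the value observed in the trace. */
static uint32_t stuck_reg(void) { return 0x2; }

/* Register with one of the tested bits set (illustrative value). */
static uint32_t ready_reg(void) { return 0x1; }
```

With the observed value 0x2 and mask 0x9 the model never exits, while any value with bit 0 or bit 3 set ends the loop on the first pass. That is consistent with the loop waiting for a hardware status bit that is never raised in this state, although the trace alone does not prove it.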

It is strange that the same code works if Xorg has already started completely.

I have a lot of similar problems with my Mobility X1400. The system sometimes freezes partially (the mouse pointer is still movable and ssh is still working) or completely. Especially when resuming from disk or RAM, the system sometimes hangs in a similar way (I filed another bug report on this).

Comment 8 Alex Deucher 2009-01-05 07:23:29 UTC
Can you attach the problematic vbios images?  

(as root):
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
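
Before attaching a dump produced this way, it may be worth checking that the file looks like a ROM image at all. The C sketch below relies on one fact, that PCI expansion ROM images start with the bytes 0x55 0xAA; the function names has_rom_signature and check_rom_file are hypothetical, and the check says nothing about whether the rest of the image is intact.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* PCI expansion ROM images begin with the signature bytes 0x55 0xAA.
 * Returns 1 if the buffer starts with that signature, 0 otherwise. */
static int has_rom_signature(const uint8_t *buf, size_t len)
{
    return len >= 2 && buf[0] == 0x55 && buf[1] == 0xAA;
}

/* Reads the first two bytes of the file at path and checks them.
 * Returns 1 if the signature is present, 0 if it is absent or the
 * file is too short, and -1 if the file cannot be opened. */
static int check_rom_file(const char *path)
{
    uint8_t head[2];
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t got = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_rom_signature(head, got);
}
```

Calling check_rom_file("/tmp/vbios.rom") after the commands above would return 1 for a plausible dump. A result of 0 suggests the read went wrong, for example because the rom file was not enabled first.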
Comment 9 paul 2009-01-05 10:07:08 UTC
Created attachment 21693 [details]
vbios.rom
Comment 10 Friedrich Gräter 2009-01-05 12:05:22 UTC
Created attachment 21695 [details]
Video BIOS image
Comment 11 Alex Deucher 2009-02-17 11:27:22 UTC
Created attachment 23044 [details] [review]
Make sure crtc is enabled before attempting to blank/unblank

Does this patch help?
Comment 12 Friedrich Gräter 2009-02-17 13:40:56 UTC
Created attachment 23048 [details] [review]
Xorg logfile

Sorry, with this patch I still have the same problem. 

I patched it against the Ubuntu 8.10 sources of the radeon driver, which is "xserver-xorg-video-ati-6.9.0+git20081003.f9826a56"...

One problem I have with the secondary head is that the XRandR utility tells me that every connected display supports a resolution of 1600x1024. Normally it wants to select this resolution, which is not supported by the display - perhaps this is related to the bug?
Comment 13 Alex Deucher 2009-02-17 14:12:56 UTC
(In reply to comment #12)
> One problem I have with the secondary head is, that the XRandR utility tells me
> that any connected display supports a resolution of 1600x1024. Normally it
> wants to select this resolution, which is just not supported by the display -
> perhaps this is related to the bug?
> 

No, that's an xserver issue.  It adds default server modes, of which 1600x1024 is one.
Comment 14 paul 2009-02-17 14:28:20 UTC
I also tested it on ubuntu 9.04 with latest video-ati and a second time with today's git video-ati.  It fails here also.
Comment 15 Alex Deucher 2009-02-17 15:49:57 UTC
Created attachment 23052 [details] [review]
don't unblank until crtc is initialized

Does this patch help?
Comment 16 Alex Deucher 2009-02-17 15:53:11 UTC
*** Bug 18104 has been marked as a duplicate of this bug. ***
Comment 17 Friedrich Gräter 2009-02-17 16:39:26 UTC
Yes, this solves this problem for me. Thank you, that's great (the first time I can even use user switching...)!
Comment 18 paul 2009-02-17 16:41:19 UTC
Not here.  It does not work for me.
Comment 19 Felipe Contreras 2009-02-17 16:48:06 UTC
(In reply to comment #15)
> Created an attachment (id=23052) [details]
> don't unblank until crtc is initialized
> 
> Does this patch help?

Works fine here :) Thanks!

Comment 20 Alex Deucher 2009-02-17 16:49:37 UTC
(In reply to comment #18)
> Not here.  It does not work for me.
> 

Can you attach your xorg log with the patch applied?
Comment 21 Alex Deucher 2009-02-17 16:55:32 UTC
fix pushed:
9a108f0a0b7203458673ce6221e747a166d39617
Comment 22 paul 2009-02-17 17:05:01 UTC
Created attachment 23054 [details] [review]
Xorg.log for logout test

This is the hang that occurred when I tried to log out with the VGA attached.  It failed and I had to reboot.
Comment 23 paul 2009-02-17 17:06:59 UTC
Created attachment 23055 [details]
Xorg.log for boot up test

This is the xorg.log from the successful boot up with the vga attached.
Comment 24 paul 2009-02-17 17:11:03 UTC
I got a black screen hang when I logged out and had to reboot.  Attached is the Xorg.0.log.old that was still there after I rebooted with no vga attached.

Next, I shutdown, connected my vga, and booted.  Great, it works.  I also attach the Xorg.0.log from this successful boot.

I guess I need to test the logout some more and get back if I have any more problems.

I've also had hangs when my S-video is connected in the past.  Does this patch fix that situation also?  Should I test it again with this patch?

thanks,
Comment 25 paul 2009-02-17 17:23:29 UTC
I just repeated the same experience.  Logging out with my VGA attached results in a black-screen hang and the need to reboot (Alt-SysRq-REISUB, then Ctrl-Alt-Del), but booting up with the VGA attached works.  So there's still a logout problem here.
Comment 26 paul 2009-02-17 17:44:02 UTC
Some more feedback.  There is definitely still a problem with logout with the vga attached.  I've tested it two more times with the same result.  However, the following tests succeeded with the vga attached: 1. switch user 2. switch to console (C-A-F1) and stop / start kdm 3. bootup.

Is there any more info I can provide?

Comment 27 Alex Deucher 2009-02-18 06:55:02 UTC
(In reply to comment #26)
> Some more feedback.  There is definitely still a problem with logout with the
> vga attached.  I've tested it two more times with the same result.  However,
> the following tests succeeded with the vga attached: 1. switch user 2. switch
> to console (C-A-F1) and stop / start kdm 3. bootup.
> 
> Is there any more info I can provide?
> 

Is the hang at logout something new introduced by the patch, or is it the previous behavior that is still not fixed by the patch?  Also, are you using the patch I committed to git or the one on the bug?  Finally, is the box and/or xserver hung, or is the console just not restored properly?
Comment 28 Alex Deucher 2009-02-18 07:12:16 UTC
Created attachment 23081 [details] [review]
set initialized to false on closescreen()

Paul, does this patch help (on top of git master)?
Comment 29 paul 2009-02-18 07:43:02 UTC
Great job, that fixed it!
Comment 30 Alex Deucher 2009-02-18 08:47:52 UTC
fix pushed: 1a237a40958c006c56b80850bd77b2ac6c17e030
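
The two patches are only attached, not quoted in this thread, so the following C sketch is inferred purely from their titles ("don't unblank until crtc is initialized" and "set initialized to false on closescreen()"). It is not the committed code: struct crtc_state, hw_blank, crtc_mode_set, crtc_dpms_on and close_screen are hypothetical names chosen to show the idea of guarding the blank/unblank call with a flag and clearing that flag when the screen is closed.

```c
#include <stdbool.h>

/* Hypothetical per-CRTC state; not the driver's real structure. */
struct crtc_state {
    bool initialized;      /* true once a mode has been programmed */
    bool blanked;          /* current blanking state */
    unsigned blank_calls;  /* how often the hardware routine ran */
};

/* Stand-in for the hardware blank/unblank routine that hung. */
static void hw_blank(struct crtc_state *c, bool blank)
{
    c->blanked = blank;
    c->blank_calls++;
}

/* Mode programming marks the CRTC as initialized. */
static void crtc_mode_set(struct crtc_state *c)
{
    c->initialized = true;
}

/* DPMS "on": skip the unblank while the CRTC is uninitialized. */
static void crtc_dpms_on(struct crtc_state *c)
{
    if (!c->initialized)
        return;
    hw_blank(c, false);
}

/* Closing the screen clears the flag, so the next server
 * generation starts from the uninitialized state again. */
static void close_screen(struct crtc_state *c)
{
    c->initialized = false;
}
```

In this model the logout case reported by paul corresponds to calling crtc_dpms_on after close_screen without a new crtc_mode_set. Without the reset, the flag would stay true across the server restart and the unblank would run against unprogrammed hardware again.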
Comment 31 Vince C. 2009-02-18 10:44:20 UTC
It doesn't work for me. I attached xorg log file in the wrong bug report, sorry. Here's the link to the attachment:

http://bugs.freedesktop.org/attachment.cgi?id=23084

I still get black screens when X starts while my external monitor is plugged in the DVI-0 output. As usual, X uses 95-100% CPU when locked.
Comment 32 Felipe Contreras 2009-02-18 10:51:49 UTC
(In reply to comment #31)
> It doesn't work for me. I attached xorg log file in the wrong bug report,
> sorry. Here's the link to the attachment:
> 
> http://bugs.freedesktop.org/attachment.cgi?id=23084
> 
> I still get black screens when X
> starts while my external monitor is plugged in the DVI-0 output. As usual, X
> uses 95-100% CPU when locked.

It works fine on my DVI-0 and VGA.
