Bug 7154

Summary: Using MergedFB freezes system reproducibly
Product: xorg Reporter: Michael <auslands-kv>
Component: Driver/Radeon Assignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: high CC: alexdeucher, benh, bo2hansen, jonas, marius
Version: 7.0.0 Keywords: regression
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description Flags
Xorg.log file of freeze none
Restore some checks before calling RADEONRestoreCommonRegisters() none

Description Michael 2006-06-08 07:35:23 UTC
I was trying a new xorg.conf file I found on the net that utilizes a MergedFB
setup (http://mg.pov.lt/xorg.conf).

However, when switching through the metamodes the system always freezes
completely and without any chance to save any error log message (even SysRq
doesn't work).

The metamodes are "1024x768-1280x1024 1024x768-1024x768
1024x768+1280x1024 1280x1024 1024x768 800x600 640x480".

If I am not mistaken, the freeze happens when switching to clone mode, that is
from "1024x768+1280x1024" to "1280x1024" (with ALT-CTRL-NUM+) or from
"1024x768-1280x1024" to "640x480" (with ALT-CTRL-NUM-).

Here is the xorg.conf:

Section "Files"
	FontPath	"/usr/share/X11/fonts/misc"
##	FontPath	"/usr/share/X11/fonts/cyrillic"
	FontPath	"/usr/share/X11/fonts/100dpi/:unscaled"
	FontPath	"/usr/share/X11/fonts/75dpi/:unscaled"
	FontPath	"/usr/share/X11/fonts/Type1"
	FontPath	"/usr/share/X11/fonts/CID"
	FontPath	"/usr/share/X11/fonts/100dpi"
	FontPath	"/usr/share/X11/fonts/75dpi"
        # paths to defoma fonts
	FontPath	"/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType"
	FontPath	"/var/lib/defoma/x-ttcidfont-conf.d/dirs/CID"
EndSection

Section "Module"
	Load	"GLcore"
	Load	"bitmap"
	Load	"ddc"
	Load	"dri"
	Load	"extmod"
	Load	"freetype"
	Load	"glx"
	Load	"int10"
	Load	"type1"
	Load	"vbe"
EndSection

Section "InputDevice"
	Identifier	"Generic Keyboard"
	Driver		"kbd"
	Option		"CoreKeyboard"
	Option		"XkbRules"	"xorg"
	Option		"XkbModel"	"pc105"
	Option		"XkbLayout"	"us"
EndSection

Section "InputDevice"
	Identifier	"Configured Mouse"
	Driver		"mouse"
	Option		"CorePointer"
	Option		"Device"		"/dev/input/mice"
	Option		"Protocol"		"ImPS/2"
	Option		"Emulate3Buttons"	"true"
	Option		"ZAxisMapping"		"4 5"
EndSection

Section "Device"
	Identifier	"MergedFB2 ATI Technologies, Inc. Radeon Mobility M6 LY"
	Driver		"ati"
	BusID		"PCI:1:0:0"
	Option		"DynamicClocks"	"on"
	Option		"MergedFB"	"true"
	Option		"CRT2Position"	"RightOf"
    # This allows X to use MergedFB if the external monitor is not connected
    # when I start X.  The ranges are taken from DDC values of the CTX monitor
    # I use at the office; as listed in Xorg.log.
	Option		"CRT2HSync"	"30-81"
	Option		"CRT2VRefresh"	"56-76"
    # The next line lets me switch between dual-head and several clone modes
    # of varying resolutions with xrandr.
	Option		"MetaModes"	"1024x768-1280x1024 1024x768-1024x768 1024x768+1280x1024 1280x1024 1024x768 800x600 640x480"
    # A newer version of the radeon driver has an option that disables vertical
    # scrolling for the 1024x768 part.
	Option		"MergedNonRectangular"	"true"
    # In 1024x768-1280x1024 mode the DPI is correct (100), but in all other
    # modes it is weird.  Try to override
	Option		"MergedDPI"	"100 100"
EndSection

Section "Device"
	Identifier	"Screen0 ATI Technologies, Inc. Radeon Mobility M6 LY"
	Driver		"ati"
	BusID		"PCI:1:0:0"
	Option		"DynamicClocks"	"on"
	Screen		0
EndSection

Section "Device"
	Identifier	"Screen1 ATI Technologies, Inc. Radeon Mobility M6 LY"
	Driver		"ati"
	BusID		"PCI:1:0:0"
	Option		"DynamicClocks"	"on"
	Screen		1
EndSection

Section "Monitor"
	Identifier	"Generic Monitor"
	Option		"DPMS"
EndSection

Section "Monitor"
	Identifier	"Second Monitor"
	Option		"DPMS"
EndSection

Section "Screen"
	Identifier	"MergedFB2 Screen"
	Device		"MergedFB2 ATI Technologies, Inc. Radeon Mobility M6 LY"
	Monitor		"Generic Monitor"
	DefaultDepth	24
	SubSection "Display"
		Depth		24
		Modes		"1280x1024" "1024x768"
	EndSubSection
EndSection

Section "ServerLayout"
	Identifier	"MergedFB2Layout"
	Screen		"MergedFB2 Screen"
	InputDevice	"Generic Keyboard"
	InputDevice	"Configured Mouse"
EndSection

Section "DRI"
	Mode	0666
EndSection

Section "ServerFlags"
  	Option		"DefaultServerLayout"	"MergedFB2Layout"
EndSection


The computer is an IBM Thinkpad X31, Debian SID, Kernel 2.6.16, xorg 7.0.20,
xserver-xorg-video-ati-6.5.0beta as well as xserver-xorg-video-ati-6.4.2.

It doesn't matter whether XAA or EXA is used.

Best regards,

Michael
Comment 1 Alex Deucher 2006-06-13 13:18:18 UTC
Does removing the dynamicclocks option help (make sure you power off the
computer after removing the option to make sure the clocks come up as the bios
programs them)?
Comment 2 Michael 2006-06-13 13:44:10 UTC
Hi Alex,

I have tried it without the dynamicclocks option and also without the dri
modules, but I'm not sure that I restarted the computer.

Are you certain, that a restart is needed? 

I did a test a few days ago. I started an X server with DynamicClocks enabled
and measured battery consumption. Then I exited the server and restarted it with
DynamicClocks disabled (The Xorg.0.log file stated, "DynamicClocks disabled".).
I again measured battery consumption and it seemed to be higher again, as it
should be when dynamicclocks is off...

But I will try tomorrow with the posted xorg.conf file.

Cheers,

Michael
Comment 3 Marius Gedminas 2006-06-13 15:27:57 UTC
Confirming.  X.org 7.0.0 often freezes my laptop (T42, Radeon Mobility 7500) if
I use MergedFB and switch from dual-head mode to clone mode.  I use xrandr for
the switch rather than Ctrl+Alt+[+/-].  The freeze is solid -- even Alt+SysRq
doesn't work.  When the machine is frozen, I see a strange picture on the
external monitor: http://mg.pov.lt/P6010010.JPG.  Not every xrandr call causes a
lockup, but three or four tries are enough to reproduce the problem.  Sometimes
xrandr succeeds, but the laptop locks up suddenly after 10-15 minutes when I'm
not doing anything in particular.

My xorg.conf is already linked from this bug report ;)  I've used it in the past
with X.org 6.9 for quite a long time with no problems (well, with only minor
problems, no lockups).

I've also filed this bug in Ubuntu as https://launchpad.net/bugs/47775.

I'll try to disable DynamicClocks and see if it helps.
Comment 4 Alex Deucher 2006-06-13 15:51:03 UTC
(In reply to comment #2)
> Are you certain, that a restart is needed? 
> 
>

A full shutdown and boot up would be preferred.  Enabling and disabling
dynamicclocks just enables and disables the bits in the appropriate registers.
It doesn't save and restore those regs to how the bios set them up.  If it is
causing problems you'll want to start up with a freshly initialized (by the
bios) chip.
Comment 5 Marius Gedminas 2006-06-13 16:01:10 UTC
I disabled DynamicClocks, turned the laptop off and back on, and started
clicking on my panel launchers that run xrandr.  The system froze after a few
clicks.
Comment 6 Alex Deucher 2006-06-13 16:36:54 UTC
(In reply to comment #5)
> I disabled DynamicClocks, turned the laptop off and back on, and started
> clicking on my panel launchers that run xrandr.  The system froze after a few
> clicks.

when you say disable, did you set the option to false or did you just remove the
line?  if you set the option to false, try again, but just remove the
dynamicclocks line altogether.
Comment 7 Michael 2006-06-13 23:17:22 UTC
O.K. so I have disabled DynamicClocks (by commenting out the line) and I have
experienced four freezes since then.

I had the feeling that it took a bit more switching between the modes until a
freeze occurred (approx. 5 to 10 switches, while with DynamicClocks it often
happened on the second or third switch), but this is probably not significant.

One time I saw a very very nice color pattern on the external monitor such as
mentioned in comment 3. The other times the external monitor just showed that
the video frequency has changed and is "out of range" now. So no picture.

I used "xrandr -s X" to switch between the modes as well as CTRL-ALT-Keypad+/- .
Both methods lead to freezes.

Hope this helps. Anything else I can test?

Michael


----------------------------------------
Two (more or less) relevant additions:

1.) I simplified the MetaModes to 
	Option		"MetaModes"	"1024x768 1280x1024 1024x768+1280x1024"
as these are the options that I really need: 1024x768 clone mode, when working
on the laptop and maybe a beamer attached. 1280x1024 clone mode or
1024x768-1280x1024 when working on the docking station.

What I found strange:
a.) X always started in 1280x1024 clone mode, however only showing 1024x768 on
both monitors (with a virtual screen of 1280x1024)
b.) xrandr only gave me the possibility to switch between 1024x768 and 1280x1024
clone mode (which at least gave me the expected results), not the xinerama-like
setup. All windows were repositioned correctly.
c.) switching with CTRL-ALT-Keypad+/- switched through all three modes but in
the clone modes there was always a 1280x1024 virtual screen area. No windows
were repositioned (and the KDE kicker did not adapt either)

2.) What I always hate with such freezes is that data is lost on the harddisk.
I'm using ext3 and in the boot process you first see the journal recovering with
lots of inodes deleted, then the fsck yielding lots of errors, trying to correct
them and then finally a reboot. Awful :(

I already tried remounting the disk read-only via ALT-SYSRQ-u before switching
resolutions (and freezing the system), but I got the same disk errors on the
next boot. I guess I should find a ram disk based test environment. If anybody
has an idea?
Comment 8 Michael 2006-06-14 01:14:53 UTC
One more test: I disabled the dri module (as this leads to some system freezes
in other cases), but no change.
Comment 9 Marius Gedminas 2006-06-14 02:07:35 UTC
(In reply to comment #6)
> when you say disable, did you set the option to false or did you just remove
> the line?

I set the option to "off"

> if you set the option to false, try again, but just remove the
> dynamicclocks line altogether.

Michael tried that already (comment #7).  I can also try if you wish.
Comment 10 Alex Deucher 2006-06-14 06:18:50 UTC
(In reply to comment #7)

> 
> 
> ----------------------------------------
> Two (more or less) relevant additions:
> 
> 1.) I simplified the MetaModes to 
> 	Option		"MetaModes"	"1024x768 1280x1024 1024x768+1280x1024"
> as these are the options that I really need: 1024x768 clone mode, when working
> on the laptop and maybe a beamer attached. 1280x1024 clone mode or
> 1024x768-1280x1024 when working on the docking station.
> 
> Strange I found, that
> a.) X always started in 1280x1024 clone mode, however only showing 1024x768 on
> both monitors (with a virtual screen of 1280x1024)

This is because your first metamode is 1024x768, which is the mode X starts in,
and the largest metamode is 1280x1024, which sets the size of the virtual desktop.

> b.) xrand only gave me the possibility to switch between 1024x768 and 1280x1024
> clone mode (which at least gave me the expected results), not the xinerama like
> setup. All Windows were repositioned correctly.

You didn't specify a dualhead metamode. 1024x768+1280x1024 is a clone mode. 
1024x768-1280x1024 is a dualhead mode.
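
As a rough illustration of that naming rule, a single metamode token could be classified as below. This is a minimal sketch in C, not the driver's actual MetaModes parser; `classify_metamode` and the enum names are hypothetical, and it assumes that a lone mode such as "1280x1024" also counts as a clone mode, which is how the reporter uses the term in this thread.

```c
#include <string.h>

enum metamode_kind { METAMODE_CLONE, METAMODE_DUALHEAD };

/* Hypothetical classifier for one metamode token such as
 * "1024x768-1280x1024". It follows the rule stated in this thread:
 * a '-' between the two modes means dualhead, while '+' or a lone
 * mode means clone. It does not validate the resolutions themselves. */
enum metamode_kind classify_metamode(const char *token)
{
    return strchr(token, '-') != NULL ? METAMODE_DUALHEAD : METAMODE_CLONE;
}
```

With this rule the simplified list "1024x768 1280x1024 1024x768+1280x1024" contains clone modes only, which would be consistent with xrandr not offering a xinerama-like layout.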

> c.) switching with CTRL-ALT-Keypad+/- switched through all three modes but in
> the clone modes there was always a 1280x1024 virtual screen area. No windows
> were repositioned (and the the KDE kicker did not adapt either)

only xrandr resizes the desktop.  CTRL-ALT-Keypad+/- just changes the mode; the
desktop remains the same size.

> 
> 2.) What I always hate with such freezes is that data is lost on the harddisk.
> I'm using ext3 and in the boot process you first see the journal recovering with
> lots of inodes deleted, then the fsck yielding lots of errors, trying to correct
> them and then finally a reboot. Awful :(

save your data and type 'sync' to flush to the HD before you test something
likely to crash.

What version of the radeon driver are you using?

Comment 11 Michael 2006-06-14 06:45:16 UTC
(In reply to comment #10)
> > Strange I found, that
> > a.) X always started in 1280x1024 clone mode, however only showing 1024x768 on
> > both monitors (with a virtual screen of 1280x1024)
> 
> This is because your first metamode is 1024x768. and the largest metamode is
> 1280x1024.
> 
Uh, OK. How can I then start X in 1024x768 clone mode without any virtual
screen? This would be my default setup. Only when I connect to the docking
station with the external monitor, I would like to switch to 1280x1024 clone
mode (with the virtual screen on the laptop LCD) or the dual-head-setup.

> > b.) xrand only gave me the possibility to switch between 1024x768 and 1280x1024
> > clone mode (which at least gave me the expected results), not the xinerama like
> > setup. All Windows were repositioned correctly.
> 
> You didn't specify a dualhead metamode. 1024x768+1280x1024 is a clone mode. 
> 1024x768-1280x1024 is a dualhead mode.
> 

Oh, sh... Typo. I will correct that.

> > c.) switching with CTRL-ALT-Keypad+/- switched through all three modes but in
> > the clone modes there was always a 1280x1024 virtual screen area. No windows
> > were repositioned (and the the KDE kicker did not adapt either)
> 
> only xrandr resizes the desktop.  CTRL-ALT-Keypad+/- just changes the mode; the
> desktop remains the same size.
> 

O.K. so one needs a combination of xrandr and the CTRL-ALT-Keypad in order to
switch between these three modes?

> > 
> > 2.) What I always hate with such freezes is that data is lost on the harddisk.
> > I'm using ext3 and in the boot process you first see the journal recovering with
> > lots of inodes deleted, then the fsck yielding lots of errors, trying to correct
> > them and then finally a reboot. Awful :(
> 
> save your data and type 'sync' to flush to the HD before you try test
> something likely to crash.
> 

I thought ALT-SYSRQ-S + ALT-SYSRQ-U should do (sync and remount ro). But that
did not work. Next time I try to issue "sync" and see what happens...

> What version of the radeon driver are you using?
>

6.5.8.0-1 from Debian unstable.

Xorg.0.log says:

(II) LoadModule: "radeon"
(II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so
(II) Module radeon: vendor="X.Org Foundation"
        compiled for 7.0.0, module version = 4.0.3
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 0.8
(II) LoadModule: "ati"
(II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so
(II) Module ati: vendor="X.Org Foundation"
        compiled for 7.0.0, module version = 6.5.8
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 0.8
 
(II) ATI: ATI driver (version 6.5.8) for chipsets: ati, ativga

Does that help or do you need other info?

Michael
Comment 12 Alex Deucher 2006-06-14 07:06:19 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > > Strange I found, that
> > > a.) X always started in 1280x1024 clone mode, however only showing 1024x768 on
> > > both monitors (with a virtual screen of 1280x1024)
> > 
> > This is because your first metamode is 1024x768. and the largest metamode is
> > 1280x1024.
> > 
> Ähh, o.k. How can I start X then in 1024x768 clone mode without any virtual
> screen. This would be my default setup. Only when I connect to the docking
> station with the external monitor, I would like to switch to 1280x1024 clone
> mode (with the virtual screen on the laptop LCD) or the dual-head-setup.

Unfortunately that is a limitation of mergedfb, which is a big hack to begin
with.  You'd either need to hack the driver to pre-reserve the additional
desktop space at the beginning or add xrandr -s 0 to your .xinitrc to resize the
desktop when you login.  I think keithp is working on an improved version of
mergedfb that may address this.

> 
> > > c.) switching with CTRL-ALT-Keypad+/- switched through all three modes but in
> > > the clone modes there was always a 1280x1024 virtual screen area. No windows
> > > were repositioned (and the the KDE kicker did not adapt either)
> > 
> > only xrandr resizes the desktop.  CTRL-ALT-Keypad+/- just changes the mode; the
> > desktop remains the same size.
> > 
> 
> O.K. so one needs a combination of xrandr and the CTRL-ALT-Keypad in order to
> switch between these three modes?
> 

depending on what you want to do xrandr should be fine.  CTRL-ALT-Keypad uses
the vidmode extension to change the mode.  It came along before xrandr could
resize the desktop to match the mode.

> > > 
> > > 2.) What I always hate with such freezes is that data is lost on the harddisk.
> > > I'm using ext3 and in the boot process you first see the journal
> > > recovering with lots of inodes deleted, then the fsck yielding lots of
> > > errors, trying to correct them and then finally a reboot. Awful :(
> > 
> > save your data and type 'sync' to flush to the HD before you try test
> something likely to crash.
> > 
> 
> I thought ALT-SYSRQ-S + ALT-SYSRQ-U should do (sync and remount ro). But that
> did not work. Next time I try to issue "sync" and see what happens...

that won't work if your system is already locked solid ;)  sync is a preemptive
method. 

> 
> > What version of the radeon driver are you using?
> >
> 
> 6.5.8.0-1 from Debian unstable.
> 
> Xorg.0.log says:
> 
> (II) LoadModule: "radeon"
> (II) Loading /usr/lib/xorg/modules/drivers/radeon_drv.so
> (II) Module radeon: vendor="X.Org Foundation"
>         compiled for 7.0.0, module version = 4.0.3
>         Module class: X.Org Video Driver
>         ABI class: X.Org Video Driver, version 0.8
> (II) LoadModule: "ati"
> (II) Loading /usr/lib/xorg/modules/drivers/ati_drv.so
> (II) Module ati: vendor="X.Org Foundation"
>         compiled for 7.0.0, module version = 6.5.8
>         Module class: X.Org Video Driver
>         ABI class: X.Org Video Driver, version 0.8
>  
> (II) ATI: ATI driver (version 6.5.8) for chipsets: ati, ativga
> 

That should be the most recent stable 7.0 release, IIRC.

Comment 13 Michael 2006-06-14 07:19:46 UTC
(In reply to comment #12)
> 
> Unforunately that is a limitation of mergedfb, which is a big hack to begin
> with.  you'd either need to hack the driver to pre-reserve the additional
> desktop space at the beginning or and xrandr -s 0 to your .xinitrc to resize the
> desktop when you login.  I think keithp is working on an improved version of
> mergedfb that may address this.
> 

Well, o.k., this will certainly do. I'd be happy to use MergedFb at all. :)

> depending on what you want to do xrandr should be fine.  CTRL-ALT-Keypad uses
> the vidmode extension to change the mode.  It came along before xrandr could
> resize the desktop to match the mode.

I'll try again with my typo corrected. That is if we can get rid of the freezing
bug...

> 
> That should be the most recent stable 7.0 release, IIRC.
> 

O.K. What can I do to find the bug? 

Cheers,

Michael

Comment 14 Alex Deucher 2006-06-14 08:48:44 UTC
(In reply to comment #13)

> 
> O.K. What can I do to find the bug? 
> 

If possible, using ErrorF() or GDB, try and find out what function in the radeon
driver the crash is happening in.
Comment 15 Michael 2006-06-14 08:56:56 UTC
O.K. I will do this. Will, however, take some time, as I have no idea how to do
this. :-/

Is there somewhere a tutorial about using ErrorF() or GDB which could function
as a start for me?

Thanks,

Michael
Comment 16 Alex Deucher 2006-06-14 09:09:52 UTC
(In reply to comment #15)
> O.K. I will do this. Will, however, take some time, as I have no idea how to do
> this. :-/
> 

ErrorF() is easy.  Just add calls to the source code where you want to print
something, so you might do something like this
in RADEONRestoreCrtc2():

ErrorF("RestoreCrtc2 called\n");
...
ErrorF("about to write crtc2 regs\n");
...
ErrorF("RestoreCrtc2 finished\n");

etc.

the messages will show up in your log. then you can look at your log and get an
idea of where the problem is.

> Is there somewhere a tutorial about using ErrorF() or GDB which could function
> as a start for me?

Using GDB with the X server:
http://wiki.x.org/wiki/DebuggingTheXserver
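
To make such trace points easier to read in the log, each message can carry its function name, its line number and a trailing newline. The following is only a minimal sketch: `format_trace()` and `TRACE_POINT()` are hypothetical names that are not part of the driver, and the macro prints to stderr so that the snippet compiles on its own. Inside the driver the formatted line would presumably be handed to ErrorF() instead.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the radeon driver: formats one trace
 * line as "function:line: message\n". The trailing newline keeps every
 * entry on its own line in the log. Returns what snprintf() returns. */
int format_trace(char *buf, size_t size,
                 const char *func, int line, const char *msg)
{
    return snprintf(buf, size, "%s:%d: %s\n", func, line, msg);
}

/* Hypothetical macro: records where the trace point sits, so the last
 * entry in the log names the last statement that was reached. In the
 * driver, fputs(..., stderr) would be replaced by ErrorF("%s", ...). */
#define TRACE_POINT(msg)                                        \
    do {                                                        \
        char trace_buf_[256];                                   \
        format_trace(trace_buf_, sizeof trace_buf_,             \
                     __func__, __LINE__, (msg));                \
        fputs(trace_buf_, stderr);                              \
    } while (0)
```

A call such as TRACE_POINT("about to write crtc2 regs"); placed before each register write then shows how far the function got before the freeze.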

Comment 17 Michael 2006-06-14 09:43:21 UTC
O.K. Thanks for the info.

I had a short look over it. I will try whatever I can, but I guess it will take
some time...

I'm not absolutely sure that I can come up with some results. Both methods seem
to have some hurdles that I might not overcome:

1.) ErrorF(): It is necessary to understand the source code. I guess that's
beyond my capabilities. :(
But I will have a look at it (I'm not even sure that I can successfully compile
a complete xserver. It might be easier with apt-src install, though...). I guess
I need to find the code that gets executed when the modes are switched. uh, uh...

2.) GDB: A typical debugger it seems. However, if I am not mistaken, this
approach only works with crashed xservers, but not with a completely frozen
system. The relevant data is taken after the server gets a signal e.g. SIGSEGV.
But when the whole system is frozen, even a remote gdb session via ssh won't
work, will it?
Comment 18 Michael 2006-06-17 14:17:57 UTC
So, I had a longer look at GDB. To my mind, this won't work, if I'm not
overlooking something.

What really might be a good idea is to insert the ErrorF() calls in some
prominent functions, just to identify in which function the freeze happens.

With a reconfigured syslogd, one could broadcast the messages and log them on
another computer on the net, so that all messages are logged for sure and won't
get lost if there is no time left for a sync.

What would be needed is an educated guess from the developers as to which
functions are good candidates for inserting such ErrorF() calls.
I have no chance to understand the code and to identify places where such calls
would make sense for this debug approach.

Would anyone who understands the code be interested in helping in this approach?

Cheers,

Michael
Comment 19 Michel Dänzer 2006-06-18 10:48:30 UTC
(In reply to comment #18)
> 
> With a reconfigured syslogd, one could broadcast the messages and log them on
> another computer on the net, so that all messages are logged for sure and won't
> get lost if there is no time left for a sync.

As Roland pointed out in another bug, re-mounting the filesystem containing the
log file with -o sync should be easier and do the trick.
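
Independently of the mount option, a logging helper can also push each line out itself before the next step runs. This is a minimal sketch, assuming a POSIX system and a hypothetical function `log_line_durable()`; the X server's own log handling is not shown here and may behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: appends one line to the log stream and pushes it
 * towards the disk before returning. Returns 0 on success, -1 on error. */
int log_line_durable(FILE *log, const char *msg)
{
    if (fprintf(log, "%s\n", msg) < 0)
        return -1;
    if (fflush(log) != 0)          /* drain the stdio buffer to the kernel */
        return -1;
    if (fsync(fileno(log)) != 0)   /* ask the kernel to write to the device */
        return -1;
    return 0;
}
```

Each call is slow, which is acceptable for a handful of trace points. A line can still be lost if the machine locks up before the call returns.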


> What would be necessary, would be some educated guess from the developers what
> functions are good candidates, in which such errorF() calls should be inserted.

I'd start in RADEONSwitchMode().
Comment 20 Michael 2006-06-18 22:43:08 UTC
(In reply to comment #19)
> 
> I'd start in RADEONSwitchMode().

O.K. I'll see if I can find this, insert some meaningful ErrorF() messages and
hopefully get some sensible information back :-)  (However, not this week, I'm
on a business trip. Next week... )

Cheers,

Michael
Comment 21 Michael 2006-06-20 10:03:02 UTC
Created attachment 5994 [details]
Xorg.log file of freeze

So, here are the first results. (I'm abroad at the moment, so I have limited
possibilities for testing here.)

Setup: The Radeon driver was modified with ErrorF() function calls as shown
here:

_X_EXPORT Bool RADEONSwitchMode(int scrnIndex, DisplayModePtr mode, int flags)
{
    ScrnInfoPtr    pScrn       = xf86Screens[scrnIndex];
    RADEONInfoPtr  info        = RADEONPTR(pScrn);
    Bool	   tilingOld   = info->tilingEnabled;
    Bool	   ret;
#ifdef XF86DRI
    Bool	   CPStarted   = info->CPStarted;

    if (CPStarted) {
	DRILock(pScrn->pScreen, 0);
	RADEONCP_STOP(pScrn, info);
    }
#endif

    RADEONTRACE(("RADEONSwitchMode() !n"));
    ErrorF ("RADEONSwitchMode entered\n");

    if (info->allowColorTiling) {
	if (info->MergedFB) {
	    if ((((RADEONMergedDisplayModePtr)mode->Private)->CRT1->Flags &
		(V_DBLSCAN | V_INTERLACE)) ||
		(((RADEONMergedDisplayModePtr)mode->Private)->CRT2->Flags &
		(V_DBLSCAN | V_INTERLACE)))
		info->tilingEnabled = FALSE;
	    else info->tilingEnabled = TRUE;
	}
	else {
	    info->tilingEnabled = (mode->Flags & (V_DBLSCAN | V_INTERLACE)) ? FALSE : TRUE;
	}
#ifdef XF86DRI	
	if (info->directRenderingEnabled && (info->tilingEnabled != tilingOld)) {
	    RADEONSAREAPrivPtr pSAREAPriv;
	    drmRadeonSetParam  radeonsetparam;
	    memset(&radeonsetparam, 0, sizeof(drmRadeonSetParam));
	    radeonsetparam.param = RADEON_SETPARAM_SWITCH_TILING;
	    radeonsetparam.value = info->tilingEnabled ? 1 : 0;
	    if (drmCommandWrite(info->drmFD, DRM_RADEON_SETPARAM,
		&radeonsetparam, sizeof(drmRadeonSetParam)) < 0)
		xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
		    "[drm] failed changing tiling status\n");
	    pSAREAPriv = DRIGetSAREAPrivate(pScrn->pScreen);
	    info->tilingEnabled = pSAREAPriv->tiling_enabled ? TRUE : FALSE;
	}
#endif
    }
    ErrorF ("RADEON tiling finished\n");
    
    if (info->accelOn)
	RADEON_SYNC(info, pScrn);
    ErrorF ("RADEON_SYNC finished\n");
    
    if (info->FBDev) {
	RADEONSaveFBDevRegisters(pScrn, &info->ModeReg);

	ret = fbdevHWSwitchMode(scrnIndex, mode, flags);

	RADEONRestoreFBDevRegisters(pScrn, &info->ModeReg);
    } else {
	info->IsSwitching = TRUE;
	ret = RADEONModeInit(xf86Screens[scrnIndex], mode);
	info->IsSwitching = FALSE;
    }
	ErrorF ("RADEON info->FBdev finished\n");

    if (info->tilingEnabled != tilingOld) {
	/* need to redraw front buffer, I guess this can be considered a hack ? */
	xf86EnableDisableFBAccess(scrnIndex, FALSE);
	RADEONChangeSurfaces(pScrn);
	xf86EnableDisableFBAccess(scrnIndex, TRUE);
	/* xf86SetRootClip would do, but can't access that here */
    }
    ErrorF ("RADEON ENABLEDISABLE FB Access finished\n");


    if (info->accelOn) {
	RADEON_SYNC(info, pScrn);
	RADEONEngineRestore(pScrn);
    }
    ErrorF ("RADEON Engine Restore finished\n");

#ifdef XF86DRI
    if (CPStarted) {
	RADEONCP_START(pScrn, info);
	DRIUnlock(pScrn->pScreen);
    }
	ErrorF ("RADEON DRIUnlock finished\n");
#endif

    /* Since RandR (indirectly) uses SwitchMode(), we need to
     * update our Xinerama info here, too, in case of resizing
     */
    if(info->MergedFB) {
       RADEONUpdateXineramaScreenInfo(pScrn);
    }
	ErrorF ("RADEON UpdateXinerama finished\n");

    return ret;
}

The disk was remounted with the sync option (and verified via /proc/mounts).
Screen Depth was changed to 24 bits and X restarted.

Lbreakout2 was started and - using the "f" key - resolution was switched approx.
5-6 times between fullscreen (640x480) and window mode (1024x768). Switching
was stopped when the system froze. The Xorg.log file is attached. (Sorry, the
output is not nicely ordered, as I forgot the newlines at first...)

Interestingly, the Xorg.log file gives no indication that the freeze
happens between any function calls in RADEONSwitchMode(). To my mind, there
are four possible explanations:

1.) Even with the "sync" option, ErrorF messages are written to disk with a
short delay, resulting in lost messages when the freeze occurs.

or

2.) The freeze "takes some time", e.g. one millisecond, so that, although
initiated in RADEONSwitchMode, the function may finish before the machine
actually locks-up.

or

3.) RADEONSwitchMode does something wrong, so that registers or some settings
are screwed up, but the freeze only happens when one other function somewhere
else tries to use these screwed values.

or 

4.) The freeze has nothing to do with RADEONSwitchMode()

Any ideas? How should I continue best?

I have done the lbreakout test as Michel suggested that it's probably due to
the same bug and it's easier to test. If it makes a big difference I can repeat
the test but use the MergedFB setup and try switching resolutions with xrandr.

Cheers,

Michael
Comment 22 Michel Dänzer 2006-06-21 03:23:24 UTC
I suspect the log writes may get buffered somewhere, so we're missing the last
ones. As for where to go from here, the best suggestion I can make is for
someone to do a git bisect, as someone on 
https://launchpad.net/distros/ubuntu/+source/xserver-xorg-driver-ati/+bug/47775
claims this only started happening recently. Michael, you could try an older
version of xserver-xorg-video-ati first, e.g. from
http://snapshot.debian.net/archive/2006/05/03/debian/pool/main/x/xserver-xorg-video-ati/


PS: Please try not to clutter up your comments. For example, attaching a diff
would be more useful than pasting code. Also, it's not clear to someone who just reads
this bug how the 'lbreakout test' relates to this bug, and until we've confirmed
that other bug and this one are indeed one and the same, it's better not to mix
up comments between them.
Comment 23 Michael 2006-06-23 00:49:08 UTC
Well, having read Marius' bug entry I actually thought that he meant with xorg
6.8.2 there were no problems, but now, after I have tried 6.5.7.3-3, I am happy
to say:

Yes, it is a regression! Works flawlessly with MergedFB -> In an extensive test
last night I did not manage to freeze the system at all.

And yes, bug #7251 seems to have the same cause.
Comment 24 Michel Dänzer 2006-06-23 01:08:03 UTC
*** Bug 7251 has been marked as a duplicate of this bug. ***
Comment 25 Michel Dänzer 2006-06-23 01:16:16 UTC
Would be really nice if someone could do a git bisect.

It's most likely related to Ben's memmap changes though - Ben, any other ideas
for tracking this down?
Comment 26 Daniel Dorau 2006-07-09 02:35:07 UTC
Same here with Thinkpad X31 (Radeon M6) and MergedFB (clone mode), 1280x1024
virtual desktop size, 1280x1024 resolution on CRT and 1024x768 res on LCD.

With xserver-xorg-video-ati 6.5.8.0-1 as of Debian xorg 7.0.22 I consistently
get random system freezes (not even SysRq works anymore) and I don't even have
to switch modes to trigger it. Just use the system and after a random time it
freezes.

I already disabled DRI to rule that out.

With xserver-xorg-video-ati 6.5.7.3-3 as suggested above, it runs stable again
(thanks for that link!).
Comment 27 Michel Dänzer 2006-07-17 02:36:19 UTC
(In reply to comment #26)
> With xserver-xorg-video-ati 6.5.8.0-1 as of Debian xorg 7.0.22 I consistently
> get random system freezes (even no SysRq anymore) and I even don't have to
> switch modes to trigger it. Just use the system and after random time it freezes.

Please try current xf86-video-ati git. Both the master and ati-1-0-branch
branches have a fix that might help with this.

Other than that, a git bisect might still be useful.
Comment 28 Michael 2006-07-24 02:38:51 UTC
(In reply to comment #27)
> 
> Please try current xf86-video-ati git. Both the master and ati-1-0-branch
> branches have a fix that might help with this.
>
Hmm, how to use git? What address to use? I guess, it is something like

git clone XXX ??

I'd like to try it. I'm using xorg 7.0, though, as this is the latest in Debian.
 
> Other than that, a git bisect might still be useful.

How can that be done?

Thanks,

Michael
Comment 29 Michel Dänzer 2006-07-25 02:03:39 UTC
(In reply to comment #28)
> Hmm, how to use git? What address to use? I guess, it is something like
> git clone XXX ??

Yes,

git clone git://git.freedesktop.org/git/xorg/driver/xf86-video-ati

> I like to try it. I'm using xorg 7.0, though, as this is the latest in Debian.

Then you may have to do

git checkout ati-1-0-branch

in the xf86-video-ati directory before building.

> > Other than that, a git bisect might still be useful.
> 
> How can that be done?

man git-bisect, or google for a HOWTO.
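
For orientation: on a linear history, git bisect amounts to a binary search between a known-good and a known-bad commit. Below is a minimal sketch of that idea in C. Commits are modelled as integer indices, and `example_is_bad()` is a made-up stand-in for "build the driver at this commit, switch modes a few times and see whether the machine freezes"; neither name comes from git.

```c
/* Hypothetical model: commits are numbered in history order, `good` is a
 * commit known to work and `bad` a later commit known to freeze. is_bad()
 * reports the test result for one commit. Returns the index of the first
 * bad commit, assuming every commit after it is bad as well. */
int first_bad_commit(int good, int bad, int (*is_bad)(int commit))
{
    while (bad - good > 1) {
        int mid = good + (bad - good) / 2;
        if (is_bad(mid))
            bad = mid;    /* the regression is at mid or earlier */
        else
            good = mid;   /* the regression is after mid */
    }
    return bad;
}

/* Made-up test result for illustration: the regression enters at commit 7. */
int example_is_bad(int commit)
{
    return commit >= 7;
}
```

Each round halves the remaining range, so even a few hundred commits between two driver releases need fewer than ten build-and-test cycles.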
Comment 30 Michael 2006-07-25 02:57:46 UTC
O.K. done that. The git drivers, however, still freeze the system when switching
resolution via xrandr.

Now I am going to have a look at git-bisect and see if I understand enough to
make any use of it.

Cheers,

Michael
Comment 31 Michael 2006-07-25 08:49:31 UTC
O.K. did the git-bisect, although I now ask myself if this was really necessary
at all, as 1.) the culprit is really quite obvious, if one knows the code and
the changes that have happened between 6.5.7.3 and 6.5.8.0 and 2.) it is such a
big patch that we're more or less at the beginning again. The patch is:

http://gitweb.freedesktop.org/?p=xorg-driver-xf86-video-ati;a=commit;h=5c141bb15d1163e04c012a0cdf0699d534f0be37

So, how do we continue?
Comment 32 Michel Dänzer 2006-07-25 23:53:03 UTC
Ben, any ideas why the memmap changes could cause freezes on setting a mode?
AFAICT the only major difference is that RADEONRestoreMemMapRegisters() gets
called in the process, maybe the unconditional writes to DISPLAY_BASE_ADDR are
problematic?
Comment 33 Michel Dänzer 2006-07-26 00:06:40 UTC
I meant to say 'DISPLAY_BASE_ADDR and friends', of course.
Comment 34 Alex Deucher 2006-07-26 07:50:31 UTC
(In reply to comment #32)
> Ben, any ideas why the memmap changes could cause freezes on setting a mode?
> AFAICT the only major difference is that RADEONRestoreMemMapRegisters() gets
> called in the process, maybe the unconditional writes to DISPLAY_BASE_ADDR are
> problematic?

The patch also explicitly disables crtc memory access when updating those
registers, which one would assume would be the right thing to do.  However, IIRC,
benh had some hangs with regard to this, which were worked around at the time via
some usleeps (which we later worked around some other way).  Perhaps it needs to
be revisited.
Comment 35 Michel Dänzer 2006-07-26 08:04:57 UTC
(In reply to comment #34)
> The patch also explicitly disables crtc memory access when updating those
> registers [...]

Only when updating MC_FB_LOCATION and friends, which should only happen on
server startup, not when switching modes during a generation.
Comment 36 Michael 2006-08-07 01:27:37 UTC
So, is there nothing that can be done?

Is there a chance that this megapatch can be divided into two halves that can
both be checked to identify the code responsible for the freeze? Or anything else?

(Seems not too many people use MergedFB on an ATI setup anyway, considering that
not really a lot of people are complaining about these crashes ... )
Comment 37 Jan Ask 2006-08-07 23:40:51 UTC
Well, it looks like a lot of people using Ubuntu Dapper are having this problem
(it seems to be especially bad on IBM Thinkpads with ATI cards).
https://launchpad.net/distros/ubuntu/+source/xserver-xorg-driver-ati/+bug/47775

Being one of the sufferers, I find the bug very critical as it causes a total
freeze and data loss. Please don't give up guys!

If there is something I can do to help, please let me know.
Comment 38 Benjamin Herrenschmidt 2006-08-11 01:29:24 UTC
We shouldn't hit any of that mem map change code in that case since FB_LOCATION
and AGP_LOCATION aren't changed... Not sure what's being hit. I suspect it may
be some "safety" bits I added that trigger one of the 18902734897324 bugs in the 
radeon chips ....

Try commenting out that bit in RestoreCrtc2Registers and let me know:

    /* We prevent the CRTC from hitting the memory controller until
     * fully programmed
     */
    OUTREG(RADEON_CRTC2_GEN_CNTL,
	   crtc2_gen_cntl | RADEON_CRTC2_DISP_REQ_EN_B);
Comment 39 Michael 2006-08-11 02:35:34 UTC
No, pity, that did not help. Still freezes.

Any other idea?

What's for sure is that the error was newly introduced in this one megapatch. So
can we divide that somehow into two or more parts for testing?

Cheers,

Michael

Btw. I've upgraded to xserver-xorg 1.1.0 (as it is now in Debian experimental).
Comment 40 Michel Dänzer 2006-09-04 04:12:56 UTC
(In reply to comment #38)
> We shouldn't hit any of that mem map change code in that case since FB_LOCATION
> and AGP_LOCATION aren't changed... 

As I pointed out in comment #34, it always writes to *some* registers. I
wouldn't expect those to cause problems, but...

> I suspect it may be some "safety" bits I added 

Were those already in your very first memmap commit?
Comment 41 Benjamin Herrenschmidt 2006-09-04 15:09:02 UTC
No. The whole thing to properly stop CRTCs (and wait for them to be stopped)
etc... is more recent and did actually fix lockups on some machines here. I'm
not sure what's going on at this point. There is something specific to 7000s, it
seems (and possibly M6 and M7). I'm getting loads of reports from Ubuntu that
they lock up randomly, and I have a co-worker here where it will lock up right away
at startup if trying to enable MergedFB (before anything gets displayed, the
whole machine is down, hard locked).
Comment 42 Michel Dänzer 2006-09-06 00:39:35 UTC
(In reply to comment #41)

So Ben, any idea what in your very first memmap commit could cause this, or how
to track it down?
Comment 43 Benjamin Herrenschmidt 2006-09-06 02:01:13 UTC
Well, there were issues with the first memmap commit, typically related to the
chip still fetching from a mixed bag of the old and new locations, thus
causing crazy PCI accesses etc... That's what my subsequent commits have been
attempting to fix by shutting down as much as possible of the things that hit the MC
(like CRTCs etc...).

At this point, the best would be to take a snapshot of all the relevant values
(MC*, *OFFSET, *BASE_ADDR...) around the mode settings and that might light a
bulb... though the problem seems to be quite specific to M7s and earlier, so I
wonder if we might be hitting some other (but related) issue (like some bit of
the chip caching the old address or whatever)

At this point, I'm desperate for some help from ATI and/or somebody who has a
reproducible lockup and a PCI analyzer between the card and the host.
Comment 44 Michel Dänzer 2006-09-06 03:00:15 UTC
(In reply to comment #43)

I suspect you're still looking at this too broadly, and not taking into account
all the information we have. E.g., we're talking about runtime mode changes, so
the values the memmap changes were about shouldn't change. So, keeping this in
mind and looking only at the very first memmap commit, which part(s) do you
think could lead to significantly different behaviour (different order of
register access, ...) during a runtime modeswitch?
Comment 45 Michel Dänzer 2006-09-07 02:53:58 UTC
Created attachment 6862
Restore some checks before calling RADEONRestoreCommonRegisters()

Michael, can you try this patch? It looks like we lost some checks before
calling RADEONRestoreCommonRegisters().
Comment 46 Michel Dänzer 2006-09-07 02:56:54 UTC
Marking as valid bug, per http://wiki.x.org/wiki/XorgTriage .
Comment 47 Michael 2006-09-08 00:46:23 UTC
(In reply to comment #45)
> Created an attachment (id=6862)
> Restore some checks before calling RADEONRestoreCommonRegisters()
> 
> Michael, can you try this patch? It looks like we lost some checks before
> calling RADEONRestoreCommonRegisters().

Sorry, took some time. I first tried with git, but somehow the driver did not
work. I then used the 6.6.2 driver code from debian experimental.

However, I'm really sorry. System still freezes when changing resolution in
MergedFB mode :( :(
Comment 48 Dave Airlie 2006-09-18 02:45:55 UTC
Can you try xf86-video-ati git head for this?
Comment 49 Michael 2006-09-20 05:35:58 UTC
Dave,

this looks VERY promising! I don't know what you have done (there are quite a
lot of changes in git after 6.6.2), but whatever it was, it seems to be exactly
what was needed!!

So far no freezes using MergedFB on a Radeon M6 (IBM Thinkpad X31), and I have
tried quite a lot of resolution changes!

Now,..., next step will be to get the KDE guys to eliminate some annoying bugs
on the desktop when MergedFB is enabled :-D ;-)

Greetings from Switzerland

Michael
Comment 50 Michael 2006-09-25 12:23:03 UTC
I'm not sure if I can/should change the bug status, but I'll try to set it to
"Resolved" and "Fixed".

Cheers,

Michael
Comment 51 Marius Gedminas 2006-09-28 11:33:23 UTC
I'd like to test xf86-video-ati git head too.  Could someone gently point me to
the relevant documentation for building it, as I can't seem to get past
autogen.sh?  Also, can I build just the driver, or do I need to build the whole
server?  I currently use X.org 7.0.0 from Ubuntu Dapper.
Comment 52 Michael 2006-09-28 12:39:06 UTC
Hi Marius,

I haven't found much documentation. Everything I know is basically described by
Michel in comment 29.

This only builds the ati driver. After the build you find the driver files
somewhere in src/.libs/ . So, it's a hidden directory. The files are all named
*.so like ati_drv.so .

I searched for them in my root dir, found them in /lib/xorg/drivers/modules and
simply exchanged them with the newly built ones.
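
A minimal sketch of that swap, with a backup of each file it replaces so the distribution's driver can be put back if the new one misbehaves. The function name `install_driver` and its two arguments are made up for this example, and the real module directory differs between distributions, so both paths are left to the caller.

```shell
#!/bin/sh
# Sketch: copy freshly built driver modules into the X module directory,
# keeping a backup of every file that gets replaced.
# Usage: install_driver BUILD_DIR MODULE_DIR   (hypothetical helper)
install_driver() {
    build_dir=$1
    module_dir=$2
    for new in "$build_dir"/*.so; do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$new" ] || continue
        name=$(basename "$new")
        # Back up only once, so a second install does not overwrite
        # the original distribution file with an earlier test build.
        if [ -e "$module_dir/$name" ] && [ ! -e "$module_dir/$name.orig" ]; then
            cp -p "$module_dir/$name" "$module_dir/$name.orig"
        fi
        cp "$new" "$module_dir/$name"
    done
}
```

It would be called as root from the xf86-video-ati directory, for example `install_driver src/.libs <your module directory>`, followed by a restart of X.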

If you are having problems with autogen, check whether you have all the corresponding
packages (I think they are called autoconf and automake; automake1.7 if I
remember correctly). You also need all the relevant xorg dev packages.

It takes some time to set this up. If you don't like to install all these
packages on your system (e.g. because it's a production system like mine), then
you can also set up a virtual machine with vmware and use this as a build system
(this is what I do).

Hope this helps a bit.

Cheers,

Michael
