Bug 13376 - Intel 2.2 lockup when virtual size exceeds 2048
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.3 (2007.09)
Hardware: Other All
Importance: medium normal
Assignee: Jesse Barnes
QA Contact: Xorg Project Team
URL: http://bugs.debian.org/cgi-bin/bugrep...
Whiteboard:
Keywords: NEEDINFO
Duplicates: 14476
Depends on:
Blocks: 13493 15000
Reported: 2007-11-24 04:24 UTC by Brice Goglin
Modified: 2008-03-29 07:39 UTC

Attachments
Possible fix (2.44 KB, patch)
2007-12-03 00:44 UTC, Michel Dänzer
fix bo_list corruption (689 bytes, patch)
2008-02-04 00:55 UTC, Hong Liu
Log file of Xorg failing (28.45 KB, text/plain)
2008-02-06 02:31 UTC, Andrew McMillan
Only limit CRTC size if EXA is enabled (1.94 KB, patch)
2008-02-20 16:15 UTC, Jesse Barnes
A log of an X session which locked up (4.51 KB, text/plain)
2008-03-05 00:22 UTC, Andrew McMillan
disable DRI if width > 2048 before dri init (1.08 KB, patch)
2008-03-25 02:07 UTC, Wang Zhenyu

Description Brice Goglin 2007-11-24 04:24:44 UTC
This bug has been reported by Dylan Thurston on the Debian BTS and several people confirmed the same behavior. It also got mentioned at the end of #11453.

Since Intel driver 2.2 (actually 2.1.99 too), enabling a virtual screen larger than 2048 results in a lockup of the server at startup. Removing the Virtual line in the config solves the problem. Option NoAccel works around it too.

These people were OK with having DRI disabled with earlier drivers because of a too large virtual screen. Having to disable 2D acceleration now is far more annoying :)

There's one log exhibiting the problem at http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;filename=Xorg.bug.log;att=1;bug=451570

It has been observed at least on 915GM/GMS/910GML and 945GM/GMS/940GML.
Comment 1 Gordon Jin 2007-11-26 00:22:22 UTC
This seems like a big issue. However, I'm not seeing it on my 915GM with intel 2.2 + server 1.4: DRI is disabled, but X works with EXA.

Are the intel driver and server in Debian the same as those in fd.o?
Comment 2 Brice Goglin 2007-11-26 00:53:17 UTC
Debian's 2:2.2.0-1 driver contains fd.o 2.2 + the next commit (4a2b0f340357c4ca58dc9586fad1337b83966362, "Fix typo in 1920x1080 resolution entry").

On the Xserver side, we currently have a snapshot of 1.4.1, but some people also reproduced with 1.4 and only a couple patches.

FWIW, the guy that observed the problem on 915 had "Virtual 3000 1600" (xorg.conf available in http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=5;bug=451570)
Comment 3 Gordon Jin 2007-11-26 01:36:30 UTC
"Virtual 3000 1600" also works for me.
Comment 4 Andrew McMillan 2007-11-26 19:05:11 UTC
I also reported this problem to the Debian BTS.  I am using a 945GM (Compaq nx6320 laptop).

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=452357

You can see my xorg.conf in that report, and a Xorg.log of the failure case when I uncomment the "Virtual 2880 2048" line in my xorg.conf (an attachment to one of the messages).

If EXA simply will not work with Virtual > 2048 then perhaps it should not be enabled in this situation.  Unfortunately, though it would be nice to have DRI, it is more important to me to have both screens - and many other people as well, I'm sure.

Thanks :-)

Andrew McMillan.
Comment 5 Michel Dänzer 2007-12-03 00:44:18 UTC
Created attachment 12911
Possible fix

Does this xserver patch fix it?
Comment 6 Andrew McMillan 2007-12-03 21:16:10 UTC
It is entirely possible that I have screwed up somewhere, but when I apply your patch to the current Debian xserver-xorg-core and rebuild it I unfortunately don't have a lot of luck :-(

Regards,
Andrew McMillan.
Comment 7 Hong Liu 2008-02-04 00:55:22 UTC
Created attachment 14132
fix bo_list corruption

Please give this patch a try.

Thanks,
Hong
Comment 8 Jesse Barnes 2008-02-05 15:55:30 UTC
This patch is upstream now...  Andrew or Brice, can you test again?
Comment 9 Brice Goglin 2008-02-05 21:36:44 UTC
I pinged my users when Hong Liu posted the patch. Only one of them has tested it so far. He said it *didn't* help: no change in the behavior (no change in the log either). I will report here when the others reply.
Comment 10 Andrew McMillan 2008-02-06 02:31:03 UTC
Created attachment 14171
Log file of Xorg failing

I didn't have a chance to look at this until tonight, unfortunately, but it seems that the patch is included in the latest Debian xserver-xorg-video-intel, so I was saved the effort of applying it myself :-)

With *no* AccelMethod specified in my xorg.conf, but *with* Virtual 2880 2048 present the XServer fails to start, with the log file attached.

It is also worth mentioning that I am now able to set Option NoAccel and restart X without having to reboot.  Previously the lockup left the video in such a state that a reboot was required.

Thanks,
Andrew McMillan.
Comment 11 Julien Cristau 2008-02-06 02:34:26 UTC
(In reply to comment #10)
> Created an attachment (id=14171) [details]
> Log file of Xorg failing
> 
> I didn't have a chance to look at this until tonight, unfortunately, but it
> seems that the patch is included in the latest Debian xserver-xorg-video-intel,
> so I was saved the effort of applying it myself :-)
> 
Actually it isn't: the uploaded 2:2.2.0.90-1 has an older version of the code. Please try again with 2:2.2.0.90-2 (which I'm uploading now).
Comment 12 Andrew McMillan 2008-02-06 02:39:02 UTC
Ah.  Bugger :-)

I checked the Git repo and it was in there, so I assumed it was in the new update I had just installed a few minutes previously.

Sorry.
Comment 13 Brice Goglin 2008-02-06 16:12:49 UTC
Chung-chieh Shan confirmed that 2.2.0.90 (the right one) still has the problem. Config and log available at http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=46;bug=451570
Comment 14 Brice Goglin 2008-02-09 13:37:58 UTC
I guess there were at least two different issues here. Dylan Thurston says his problem is fixed now (config and log at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=451570#61) while comment #13 shows that some people still have a lockup.
Comment 15 Brice Goglin 2008-02-10 03:42:15 UTC
After talking with my users, it looks like the lockup *at startup* is fixed. The other users are experiencing a lockup a bit after the startup:
* 2 users report a lockup about 10s after startup
* 1 user gets a lockup when 'mail-notification' starts and puts up a pop-up window
I have no idea how EXA + virtual>2048 could cause such lockups...
Comment 16 Jesse Barnes 2008-02-20 15:50:41 UTC
So does using the XAA AccelMethod also work around the problem?
Comment 17 Jesse Barnes 2008-02-20 16:15:00 UTC
Created attachment 14465
Only limit CRTC size if EXA is enabled

Even back in old versions (2.1) we were limiting the CRTC range to 2048x2048, so I'm not sure why this problem would suddenly start biting you.

However, this patch applies the 2048 limit only when EXA is in use, otherwise it uses an 8192 limit, maybe it'll prevent the hangs you're seeing.  It also falls back to XAA if the width is > 2048; hopefully I got the logic right.
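
The selection logic described in this comment can be sketched roughly as follows. This is only an illustration and not the attached patch: the enum, the struct, and the function name `choose_accel` are hypothetical, and only the 2048 and 8192 limits and the EXA-to-XAA fallback are taken from the comment.

```c
/* Hypothetical names for illustration; not the driver's real types. */
enum accel_method { ACCEL_NONE, ACCEL_XAA, ACCEL_EXA };

struct accel_choice {
    enum accel_method method; /* acceleration method actually used */
    int max_crtc_size;        /* limit applied to the CRTC range */
};

/* Fall back from EXA to XAA when the virtual width exceeds 2048, then
 * apply the 2048 limit only if EXA is still in use (8192 otherwise). */
static struct accel_choice choose_accel(enum accel_method requested,
                                        int virtual_width)
{
    struct accel_choice c;

    c.method = requested;
    if (c.method == ACCEL_EXA && virtual_width > 2048)
        c.method = ACCEL_XAA;
    c.max_crtc_size = (c.method == ACCEL_EXA) ? 2048 : 8192;
    return c;
}
```

Under these assumptions, `choose_accel(ACCEL_EXA, 2880)` ends up with XAA and an 8192 limit, while a 1280-wide screen keeps EXA and the 2048 limit.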
Comment 18 chantra 2008-02-24 06:55:10 UTC
As per
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/188178/comments/6

Option "MigrationHeuristic" "greedy"

In the device section solves the issue on both debian sid and ubuntu hardy
Comment 19 Michel Dänzer 2008-02-24 07:13:53 UTC
(In reply to comment #18)
> Option "MigrationHeuristic" "greedy"
> 
> In the device section solves the issue on both debian sid and ubuntu hardy

That's probably just luck, due to greedy never actually attempting acceleration of a problematic pixmap.

Has it been verified whether this problem happens with xserver Git master? If it doesn't, can the patch I attached be tested again against the 1.4 branch, making sure it really is used?
Comment 20 Michael Fu 2008-03-04 16:45:31 UTC
chantra and other bug reporters, please respond to comment #17 and comment #19. Thanks.
Comment 21 Andrew McMillan 2008-03-05 00:22:54 UTC
Created attachment 14853
A log of an X session which locked up

In reply to comment #17, when I apply that patch to the current Debian xserver-xorg-video-intel package from unstable I still see the previous behaviour, to wit:
- X starts fine and displays the GDM login
- I log in and about ten seconds later (while everything is still starting up) the lockup occurs.

At one time I thought the lockup was happening exactly as the theming was being applied to the desktop items, but after watching this a number of times it is clear that it happens somewhat later than that.

On this occasion I logged in and started X with 'startx' in order to not get my X log overwritten by GDM restarting and other such stuff.  The attached log is the output of startx in that situation.

Finally, regarding "NoAccel" vs. "XAA": either works fine for me.  The lockup only occurs if I choose "EXA" or have all of them commented out.

As you can see from the log file this is:
X.Org X Server 1.4.0.90 (i.e. xorg-server 2:1.4.1~git20080131-1)

From the main Xorg log file, the intel driver version seems to be:
(II) LoadModule: "intel"
(II) Loading /usr/lib/xorg/modules/drivers//intel_drv.so
(II) Module intel: vendor="X.Org Foundation"
        compiled for 1.4.0.90, module version = 2.2.1
        Module class: X.Org Video Driver
        ABI class: X.Org Video Driver, version 2.0

(except with that patch 14465 applied).

Thanks,
Andrew McMillan.
Comment 22 Jesse Barnes 2008-03-05 09:29:25 UTC
Ok, so I must not have the fallback code quite right.  I'll take a look at your log and see if I can fix the patch.
Comment 23 unggnu 2008-03-14 16:57:33 UTC
*** Bug 14476 has been marked as a duplicate of this bug. ***
Comment 24 Jesse Barnes 2008-03-18 19:03:26 UTC
God, our startup logic is a mess... can you attach the log when you run with the '-verbose' option?  I want to be sure the code I think is running actually is.
Comment 25 Wang Zhenyu 2008-03-25 02:06:22 UTC
The xserver is already aware of the driver's maxX/maxY limit, and the intel driver has no UTS implementation.

We can do the 2048 limit check earlier, before DRI init; that seems to be the only fix we need for this.
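
A minimal sketch of the ordering suggested here, assuming a simplified stand-in for the driver state. The struct, its fields, and `pre_dri_init_check` are hypothetical names; the only detail taken from the comment is that the 2048 width check runs before any DRI initialization.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real structures. */
struct screen_state {
    int display_width;  /* virtual width in pixels */
    bool dri_requested; /* configuration asked for DRI */
    bool dri_enabled;   /* result of the pre-init check */
};

#define MAX_DRI_WIDTH 2048 /* limit taken from the comment */

/* Run the width check before any DRI setup, so that DRI is never
 * initialized for a screen wider than the limit. */
static void pre_dri_init_check(struct screen_state *s)
{
    s->dri_enabled = s->dri_requested;
    if (s->dri_enabled && s->display_width > MAX_DRI_WIDTH)
        s->dri_enabled = false;
}
```

A caller would then start DRI setup only when `dri_enabled` is still true after this check, instead of tearing DRI down after it has already been initialized.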

Comment 26 Wang Zhenyu 2008-03-25 02:07:29 UTC
Created attachment 15447
disable DRI if width > 2048 before dri init
Comment 27 Wang Zhenyu 2008-03-28 00:43:25 UTC
Closing this, as the patches are upstream. Please test and verify.

Comment 28 chantra 2008-03-28 10:30:36 UTC
@wang:

I am wondering: does the issue also happen when height > 2048?
In that case, shouldn't the if condition be
if (!IS_I965G(pI830) && (pScrn->displayWidth > 2048 || pScrn->displayHeight > 2048)) 

?
I don't believe there are many people running one screen above the other, but the issue will still happen for them with the patch, innit?
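
To make the difference concrete, here is a small sketch comparing a width-only test with the width-or-height test proposed in this comment. The helper names and the `is_i965` parameter (standing in for `IS_I965G(pI830)`) are hypothetical, and whether the height really needs limiting is exactly the open question.

```c
#include <stdbool.h>

#define DRI_MAX_DIM 2048 /* limit discussed in this bug */

/* Width-only test, following the patch title "disable DRI if width > 2048".
 * is_i965 stands in for the IS_I965G() check in the quoted condition. */
static bool dri_blocked_width_only(bool is_i965, int width, int height)
{
    (void)height; /* deliberately ignored in this variant */
    return !is_i965 && width > DRI_MAX_DIM;
}

/* Variant proposed in this comment: also reject a too-large height. */
static bool dri_blocked_width_or_height(bool is_i965, int width, int height)
{
    return !is_i965 && (width > DRI_MAX_DIM || height > DRI_MAX_DIM);
}
```

For two 1600x1200 screens stacked vertically (a 1600x2400 virtual size), the width-only test lets DRI through while the combined test blocks it; for a side-by-side layout such as 2880x2048, both tests block it.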

Comment 29 chantra 2008-03-29 07:39:01 UTC
@wang:
I applied it and can verify that X is not crashing anymore.
I made a debdiff against xserver-xorg-video-intel-2.2.1-1ubuntu6:

https://bugs.launchpad.net/debian/+source/xserver-xorg-video-intel/+bug/188178/comments/17

thanks

