OS: Ubuntu 12.10 (Quantal Quetzal)
Version of xserver-xorg-video-intel: 2:2.20.9-0ubuntu2
Computer: HP Mini 110
Video chips:
  VGA compatible controller: Intel Corporation Mobile 945GSE Express Integrated Graphics Controller (rev 03)
  Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

I can set up dual screens and everything works pretty nicely for a while (with the restriction that no resolution above 800x600 can be used on the VGA). However, before long everything freezes up. The mouse pointer can still wander freely across both screens, and I can prove by careful preparation that the keyboard is still sending keys to the selected window. I suspect that the actual event is triggered by a keystroke or mouse movement because it doesn't seem to happen if you just set it up and wait for failure.

Only two things can change the video display:
(1) CTL-ALT-F1 blanks the VGA display but then the usual login message is not written to it.
(2) The inevitable "press and hold the power button" causes blank screens when the power finally goes off.

Note that this happens with either the installed 12.10 or a live DVD of 12.10. It does not appear to be a problem with Ubuntu 12.04 (tested using a live CD). 12.04 uses xserver-xorg-video-intel version 2:2.17.0-1ubuntu4.

I'd like to try swapping drivers but am unwilling to go down that path without advice. Thanks.
This is also reported in Launchpad bug 1079440.
To be frank, we fixed a lot of issues in the upstream kernel, so please try the current set of drivers from the xorg-edgers PPA together with either a mainline 3.7 kernel or drm-intel-experimental (which tracks our upstream branch).
Okay, I've loaded up a nightly build of 3.7.0 (http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/ -- the one I got was "3.7.0-994.201212210409"). That didn't change things. Then I set up the xorg-edgers PPA as described in "https://launchpad.net/~xorg-edgers/+archive/ppa/+index?batch=75". [That brings in yet another 3.7.0 kernel.]

With all the new xorg software, using the current kernel or either of the 3.7.0 kernels, nothing seems to be broken that didn't used to be broken, but the problem still occurs after a few minutes.

The effect of CTL-ALT-F1 (virtual terminal) is inconsistent, so I can't rely on that to upload more details. I could probably set something up on a regular terminal that could be sent by a few simple keystrokes after the freeze occurs. I don't know what that would be, perhaps some form of apport. If someone would like me to do this, please suggest a method.

If developers would like to try to find this on their own systems, let me point out that both of the reports in launchpad bug 1079440 concern computers with the same combination of i945 chips. My setup is simple:
(1) use a terminal to run a perl one-liner that squirts a running count to the screen, then position that on the boundary of the two displays;
(2) set up a system monitor so that the activity graph is scrolling across both monitors;
(3) set up an emacs editor and select its window;
(4) wait a few minutes, typing into emacs as you please -- then save your scribblings after the freeze.

Let me know what more I can do to help you find this.
Created attachment 72479: Reg dump after dual monitor hang (w/ AccelMethod SNA)
I see a very similar problem on a Thinkpad T400. Using the laptop panel everything is fine, but plugging in a monitor eventually leads to a very hard hang. The monitor's resolution is correctly detected in my case, and the GPU is identified as "Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])".

I can hit the problem within 5 minutes using AccelMethod SNA; using UXA can give me several hours on average, but eventually it too dies. In both cases:

    cat /sys/kernel/debug/dri/0/i915_error_state
    no error state collected

I did grab a register dump after it died using SNA, if that's of any help.
I've made a breakthrough in understanding this problem. It seems to give me a 100% workaround but of course I can't be sure. The solution is to use "taskset" to force the "Xorg" and "compiz" processes to always be run on the same CPU. The simple terminal commands are:

    taskset -pa 1 $(pgrep -x compiz)
    sudo taskset -pa 1 $(pgrep -x Xorg)

You might need to do this before the second monitor is connected.

I have run for several hours both in a quiet state and with as much complexity and CPU bashing as the computer can reasonably handle. The high CPU loading makes the system sluggish, but it does not fail. Good luck turning this workaround into a real fix!
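For anyone who wants to script the same workaround, here is a minimal Python sketch of what the two taskset commands above do, assuming a Linux system with /proc mounted. The helper names `pids_by_name` and `pin_all_threads` are made up for this illustration; only `os.sched_setaffinity` and the /proc layout are real interfaces. It is not a fix, just the workaround in another form.

```python
import os


def pids_by_name(name):
    """Return PIDs whose /proc/<pid>/comm equals name (similar to `pgrep -x`)."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


def pin_all_threads(pid, cpus):
    """Pin every thread of pid to the CPU set cpus (similar to `taskset -a -p`)."""
    for tid in os.listdir(f"/proc/{pid}/task"):
        os.sched_setaffinity(int(tid), cpus)


if __name__ == "__main__":
    # Logical CPU 0 is what the mask value 1 in the taskset commands selects.
    for name in ("Xorg", "compiz"):
        for pid in pids_by_name(name):
            try:
                pin_all_threads(pid, {0})
            except OSError as err:
                print(f"could not pin {name} ({pid}): {err}")
```

Pinning Xorg normally requires root, just as the taskset command for Xorg needs sudo.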
Note that the related launchpad bug is beginning to get some attention based on finding that even kernel 3.8-rc5 doesn't help.

I have a refinement for my workaround. I had supposed that the important thing was to keep Xorg and compiz on the same CPU. But that is not the case. The most important thing seems to be to run Xorg on CPU 1. Note that my computer is an Intel Atom N280 whose two cores are not advertised to be identical. I'd be suspicious of hairy signal timing based on this table ("Xorg 1" means "taskset" Xorg to CPU 1, etc.):

||                 || Xorg 1 || Xorg 2 || Xorg unpinned ||
|| Compiz 1        || WORKS  || FAILS  || WORKS         ||
|| Compiz 2        || WORKS  || FAILS  || FAILS         ||
|| Compiz unpinned || WORKS  || FAILS  || FAILS         ||

All the cases that "WORK" have run with dual monitors and heavy use for more than an hour; many hours for the Xorg 1 cases. The ones that "FAIL" have never run more than a half hour or so under the same conditions.

Hope this helps.
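One caveat when reading the table: if the commands were run exactly as quoted in this thread (taskset without the -c option), the number after -pa is a hexadecimal bitmask, not a CPU index. Under that assumption "Xorg 1" corresponds to logical CPU 0, "Xorg 2" to logical CPU 1, and a mask of 3 allows both. The small sketch below decodes such masks; `mask_to_cpus` is a name invented for this example.

```python
def mask_to_cpus(mask):
    """Decode a taskset-style affinity bitmask into logical CPU numbers."""
    if isinstance(mask, str):
        mask = int(mask, 16)  # taskset reads the mask as hexadecimal
    return [bit for bit in range(mask.bit_length()) if mask >> bit & 1]


for mask in (1, 2, 3):
    print(f"mask {mask} -> logical CPUs {mask_to_cpus(mask)}")
```

This prints `[0]` for mask 1, `[1]` for mask 2 and `[0, 1]` for mask 3.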
Younes Manton, can you please file a separate bug for your issue? You have a completely different platform, so you have most likely hit a different bug.
LP link for reference: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1079440

Keith, can you try to log in via ssh and grab logfiles once the system freezes up like you describe? Since the mouse still works, it's likely just a graphics rendering issue and everything else is still running.
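If it helps, the log grab over ssh can be reduced to one command with a small script. The sketch below is only a suggestion: the file list is an assumption based on the usual Ubuntu log locations plus the i915 error state file, and `bundle_logs` is a made-up helper name. It reads each file fully before archiving, because debugfs files report a size of zero and would otherwise be stored empty.

```python
import io
import os
import tarfile
import tempfile
import time

# Assumed locations; adjust to whatever exists on the affected machine.
DEFAULT_PATHS = [
    "/var/log/Xorg.0.log",
    "/var/log/syslog",
    "/var/log/kern.log",
    "/sys/kernel/debug/dri/0/i915_error_state",
]


def bundle_logs(paths, archive):
    """Store every readable file in paths in a gzip tarball; return those added."""
    added = []
    with tarfile.open(archive, "w:gz") as tar:
        for path in paths:
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue  # missing or unreadable, skip it
            info = tarfile.TarInfo(name=path.lstrip("/"))
            info.size = len(data)
            info.mtime = int(time.time())
            tar.addfile(info, io.BytesIO(data))
            added.append(path)
    return added


if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "freeze-logs.tar.gz")
    print(out, bundle_logs(DEFAULT_PATHS, out))
```

Run it as root from the ssh session after the freeze (the debugfs file is normally readable only by root), then copy the resulting tarball off the machine and attach the relevant files here.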
Created attachment 73851: auth.log
Okay, here are all the logs I can find that cover the time period of this failure.

Background. After I hooked up the monitor, I started the following perl one-liner:

    perl -e'my $x = 0; int $x;while(1){++$x;print"$x\n" unless $x % 1000000}'

This spits out a new million about twice a second. I then started the gnome-system-monitor and positioned the terminal window and the monitor window so that both of them were regularly updating both monitors.

The video hung at 8:26 with a count of 2.2 billion. I was doing other things and didn't get back to issue this report until after 11:00. By that time the perl process had run at almost 100% CPU usage for more than 200 minutes and thus many billions more counts. Xorg was moving, but slowly, only 12 minutes total. Compiz was stopped dead at 5 minutes.
Created attachment 73852: syslog
Created attachment 73854: wtmp
Created attachment 73855: lastlog
Created attachment 73856: Xorg.0.log
Created attachment 73857: pm-powersave.log
Created attachment 73858: syslog.1
Created attachment 73859: udev
Created attachment 73860: kern.log
Created attachment 73861: boot.log
Created attachment 73862: dmesg
One more thing I should add about this sequence. I have added 'taskset -pa 1 $(pgrep -x Xorg)' to the rcX.d set of things. So initially the computer came up with my workaround in place. I reversed that with 'sudo taskset -pa 3 $(pgrep -x Xorg)' before connecting the external monitor. All of this was about an hour after the computer was booted up at 6:57.
Can you please try drm-intel-next with

commit 21ad833075801a7cd81b5ef1604ffc6c600e5ff9
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Feb 19 15:16:39 2013 +0200

    drm/i915: Fix races in gen4 page flip interrupt handling
We have good indication from other bugs that the race fix is indeed good. Preemptively closing...
I couldn't find a "commit" with the number Chris gave. But I loaded up the "3.8.0-997_3.8.0-997.201302180432" kernel from "drm-intel-next". However, despite the fact that it seems to have built and installed okay, it dies with an almost-all-black screen a few seconds after it starts initializing the ramfs. I repeated it all and got the same results. Both downloads of the parts gave the same md5sums as follows:

8b88e669a4117f72e58d57481247e936  linux-headers-3.8.0-997_3.8.0-997.201302180432_all.deb
86c4099dfb41e40d7e976493681a2af9  linux-headers-3.8.0-997-generic_3.8.0-997.201302180432_i386.deb
ec53ae004b3c74f097d31dfc61ffeb6c  linux-image-3.8.0-997-generic_3.8.0-997.201302180432_i386.deb
057040a505e5c463cc99b42f8c814d4b  linux-image-extra-3.8.0-997-generic_3.8.0-997.201302180432_i386.deb

???
(In reply to comment #25)
> I couldn't find a "commit" with the number Chris gave. But I loaded up the
> "3.8.0-997_3.8.0-997.201302180432" kernel from "drm-intel-next". However,
> despite the fact that it seems to have built and installed okay, it dies
> with an almost-all-black screen a few seconds after it starts initializing
> the ramfs. I repeated it all and got the same results. Both downloads of the
> parts gave the same md5sums as follows:

That's a completely different problem. And a critical one to boot. Normally if it breaks that early it is because the initramfs is broken and needs to be rebuilt. Try passing nomodeset to your kernel as it boots and see if that makes diagnosing the problem easier.

Besides, the date on that kernel is earlier than the patch I referenced to fix the original issue.
Since Chris pointed out that the drm-intel-next build (Feb 18) is too old for this fix (never mind that it also seems to exhibit a critical bug on my computer), I tried the latest build from drm-intel-nightly (Feb 23). No critical bug there. However, it wastes very little time -- a few seconds -- before freezing up. This may be a slightly different freeze as well because the cursor was not responsive. It also didn't do CTL-ALT-F1 though that was never a reliable thing in the past during this freeze.
Yikes. You're not having much fun are you? :( Do you happen to be able to set up a netconsole and grab the dmesg leading to the hard hang?
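On the receiving side, anything that listens for UDP datagrams will do, since netconsole simply sends kernel log lines as UDP packets. Below is a minimal listener sketch for a second machine, assuming the sender is pointed at that machine on port 6666 (the usual netconsole target port; configure the sender as described in the kernel's netconsole documentation). The function name `receive_netconsole` and its parameters are invented for this example.

```python
import socket


def receive_netconsole(port=6666, host="0.0.0.0", max_packets=None, sink=print):
    """Listen for UDP datagrams and pass each decoded message to sink.

    max_packets=None means run until interrupted.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        count = 0
        while max_packets is None or count < max_packets:
            data, _addr = sock.recvfrom(65535)
            sink(data.decode("utf-8", errors="replace").rstrip("\n"))
            count += 1
```

Calling `receive_netconsole()` with no arguments prints every received line until interrupted; redirect the output to a file so the last messages before the hang are kept.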
Fun has returned. I don't know why the Feb 18 drm-intel-next insists on crashing and I have no idea why the first attempt at the Feb 23 drm-intel-nightly was a failure, but I returned to using it (3.8.0-994-generic) with complete success later on. A 5-hour run, an 8-hour run, and it is currently running and working fine with 2 monitors. So I agree that the problem is both resolved and fixed. Thanks!