Bug 23969 - [945GM] Hard lockups with KMS on intel graphics with Fedora Rawhide
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium critical
Assignee: Wang Zhenyu
QA Contact: Xorg Project Team
Keywords: NEEDINFO
Reported: 2009-09-15 23:37 UTC by Kjartan Maraas
Modified: 2010-07-24 04:01 UTC

Attachments
syslog output (401.50 KB, text/plain)
2009-09-23 04:47 UTC, Kjartan Maraas
output from intel_gpu_dump on battery (139.50 KB, application/x-gzip)
2009-10-05 14:45 UTC, Kjartan Maraas
dump when plugged in (170.96 KB, application/x-gzip)
2009-10-05 14:45 UTC, Kjartan Maraas

Description Kjartan Maraas 2009-09-15 23:37:13 UTC
Ever since KMS was enabled in Fedora rawhide I've been experiencing random lockups on my HP Compaq nc6400 laptop with this graphics controller:

00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

I've filed a bug against fedora with more details here:

https://bugzilla.redhat.com/show_bug.cgi?id=492686

There's dmesg output and xorg logs etc there.
Comment 1 Kjartan Maraas 2009-09-17 03:40:42 UTC
Changing this to critical
Comment 2 Kjartan Maraas 2009-09-17 03:41:25 UTC
Last kernel I reproduced this on was
Linux localhost.localdomain 2.6.31-0.204.rc9.fc12.i686.PAE #1 SMP Sat Sep 5 20:45:47 EDT 2009 i686 i686 i386 GNU/Linux

Comment 3 Kjartan Maraas 2009-09-17 03:42:10 UTC
Sorry for all the spam.

This is the driver version:
xorg-x11-drv-intel-2.8.0-13.20090909.fc12.i686
Comment 4 Gordon Jin 2009-09-22 18:11:07 UTC
We can't reproduce this with upstream code.
Comment 5 Kjartan Maraas 2009-09-22 22:44:11 UTC
Well, something is triggering this for me at least. It could be hardware or some app doing the wrong thing, but the fact remains that this started as soon as KMS was enabled in Fedora. I spoke briefly to Warren Togami on IRC yesterday and he indicated that he'd seen freezes with 945GM too, but thought they had been fixed very recently.

In fact I thought so too, but after a suspend / resume cycle coming home from work yesterday it hung on me while working in gnome-terminal. I have firefox, evolution, xchat-gnome running most of the time and do compilation/editing etc. Some in vi in gnome-terminal and some in emacs.

Is there anything I can do to get more debugging from the driver itself or maybe Xorg somewhere?

Thanks a lot for following up on this.
Comment 6 Kjartan Maraas 2009-09-23 01:44:45 UTC
Got another lockup just now. Newer packages than last time:

2.6.31-33.fc12.i686.PAE
xorg-x11-drv-intel-2.8.0-15.20090909.fc12.i686
xorg-x11-server-Xorg-1.6.99.902-1.fc12.i686
Comment 7 Wang Zhenyu 2009-09-23 01:57:07 UTC
So this is related to suspend/resume? I think we've had some reports about that. Could you still get the dmesg kernel log in the failure case? I'll try to see whether this also happens on my side.
Comment 8 Kjartan Maraas 2009-09-23 04:46:57 UTC
I had another one now without suspend / resume in between so it's definitely not related to that (at least not only that).

Attaching syslog output for all of Sep 23 so far. I had a hang at 09.59, just after resuming after going to work, and another around 13.25 after trying to connect to a projector at work to display a presentation. I see some messages there that suggest it didn't like the projector at least :-)
Comment 9 Kjartan Maraas 2009-09-23 04:47:26 UTC
Created attachment 29795
syslog output
Comment 10 Kjartan Maraas 2009-09-29 02:29:39 UTC
I still see these hangs several times a day. Is there nothing I can do to get more data here?
Comment 11 Kjartan Maraas 2009-10-05 02:20:03 UTC
Still seeing this with the 2.9.0 release of the driver and a very recent kernel. One "datapoint" is that I see these hangs most often when running on battery. I had the machine running for several days without problems this weekend but shortly after unplugging the power I had a hang... Not very quantifiable but still worth mentioning I think.
Comment 12 Kjartan Maraas 2009-10-05 14:41:43 UTC
So, going with the assumption that this is related to deeper C-states when running on battery I'm going to attach output from intel_gpu_dump in both cases just in case that is helpful to anyone.
Comment 13 Kjartan Maraas 2009-10-05 14:45:12 UTC
Created attachment 30096
output from intel_gpu_dump on battery
Comment 14 Kjartan Maraas 2009-10-05 14:45:43 UTC
Created attachment 30097
dump when plugged in
Comment 15 Wang Zhenyu 2009-10-12 20:06:14 UTC
Interesting. We have a new CxSR feature in KMS that will try to downclock the PLL in C-states; might that trigger this? Have you tried the 'i915.powersave=0' option for the drm/i915 module?
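
When testing an option like this, it can help to confirm that it actually reached the kernel before drawing conclusions from a lockup. The following is a minimal Python sketch, not part of the original report. It assumes the option was passed on the kernel command line in the `module.param=value` form, and that the live value, if exposed at all, follows the general `/sys/module/<module>/parameters/<param>` layout. The helper names `module_param_from_cmdline` and `read_live_param` are made up for illustration.

```python
from pathlib import Path


def module_param_from_cmdline(cmdline, module, param):
    """Return the value given for `module.param` on a kernel command line.

    If the option appears more than once, the last occurrence is returned.
    Returns None when the option is not present.
    """
    key = f"{module}.{param}"
    value = None
    for token in cmdline.split():
        name, sep, val = token.partition("=")
        if sep and name == key:
            value = val  # keep scanning so the last occurrence is kept
    return value


def read_live_param(module, param):
    """Return the parameter value exposed under /sys/module, or None.

    None covers every failure case: module not loaded, parameter not
    exported by this kernel, or file not readable by the current user.
    """
    path = Path("/sys/module") / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

On the affected machine one would call `module_param_from_cmdline(Path("/proc/cmdline").read_text(), "i915", "powersave")` and `read_live_param("i915", "powersave")`. If both return None, the option most likely never took effect, which would make the result in the next comment inconclusive.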
Comment 16 Kjartan Maraas 2009-10-13 00:26:41 UTC
Tried this and got a new lockup about two minutes after logging in... Anything else we can try?
Comment 17 Matej Cepl 2009-10-13 10:39:42 UTC
(In reply to comment #0)
> https://bugzilla.redhat.com/show_bug.cgi?id=492686
> 
> There's dmesg output and xorg logs etc there.

That's the wrong bug number ... https://bugzilla.redhat.com/show_bug.cgi?id=492686 is "Evolution hangs in uninterruptible sleep state and grinds the machine to a halt", which is now a kernel bug.
Comment 18 Kjartan Maraas 2009-10-13 10:59:07 UTC
Didn't you just paste a link to the very same bug I linked to in comment #0? :-)
Comment 19 Wang Zhenyu 2009-12-21 21:57:36 UTC
Does this still happen with upstream?
Comment 20 Chris Wilson 2010-07-24 04:01:01 UTC
(In reply to comment #12)
> So, going with the assumption that this is related to deeper C-states when
> running on battery I'm going to attach output from intel_gpu_dump in both cases
> just in case that is helpful to anyone.

Yes! Apparently there was a debugging bit that needed to be set in order to stop the machine killing itself when using the blitter with deep C-states. GAH!

commit 944001201ca0196bcdb088129e5866a9f379d08c
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jul 20 13:15:31 2010 +1000

    drm/i915: enable low power render writes on GEN3 hardware.

