Bugzilla – Bug 22904
865G freeze with Intel driver 2.8.0
Last modified: 2009-10-20 07:31:15 UTC
Created attachment 27935 [details]
Configuration that works on my 865G
Using OpenSUSE 11.2 Factory M3+, kernel 22.214.171.124 and latest updates from the X11:Xorg repository for it, X server freezes after a few seconds using the configuration file produced by "X -configure" (loading dri2 and so on).
After the freeze, only the mouse works; the X server is unusable and doesn't respond to any action. This has also already happened in KDM when I repeatedly performed some actions there. The underlying system is stable, and if I switch to console mode in time, no crash or freeze is noticeable. What I did last time was:
1. KDM logon starting ICEWM (not KDE) as a normal user
2. Launch XTerm
3. 'su -' in xterm
4. From that XTerm, start KDE systemsettings (for the root user)
The systemsettings tool opens, but after that there is no more response to input, although the windows and the look of the X server aren't broken.
If I use the xorg.conf in the attachment, which is produced while booting the OpenSUSE installation CD, the X server is stable, but acceleration does not seem to work; moving windows in particular is slow.
I'll add some attachments for this.
Created attachment 27936 [details]
Old X server log with the latest "freezing" start at the end
You see it's loading DRI2 and other modules which might cause that freeze.
Created attachment 27938 [details]
X server log for the start that works using configuration in attachment #27935 [details]
Sorry, it loads dri2, too, so the freeze must come from another difference.
Please provide intel_gpu_dump output, following the instructions at http://intellinuxgraphics.org/intel-gpu-dump.html.
What's the previous intel driver version ever working for you? (the first attachment from installation CD uses vesa driver instead of intel driver)
Created attachment 27940 [details]
Configuration that makes server freeze
(In reply to comment #3)
> What's the previous intel driver version ever working for you? (the first
> attachment from installation CD uses vesa driver instead of intel driver)
2.7.1 using EXA acceleration worked for me.
(In reply to comment #5)
> (In reply to comment #3)
> > What's the previous intel driver version ever working for you? (the first
> > attachment from installation CD uses vesa driver instead of intel driver)
> 2.7.1 using EXA acceleration worked for me.
How about 2.7.1 with UXA?
Created attachment 27943 [details]
intel_gpu_dump.txt for the frozen server
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #3)
> > > What's the previous intel driver version ever working for you? (the first
> > > attachment from installation CD uses vesa driver instead of intel driver)
> > >
> > 2.7.1 using EXA acceleration worked for me.
> How about 2.7.1 with UXA?
It freezes as well; I'll attach the diagnostic and configuration files. With 2.7.1 and EXA the server works, but it feels slow to me.
Created attachment 27945 [details]
Diagnostic and configuration files for the UXA freeze with driver 2.7.1
Created attachment 27946 [details]
Diagnostic and configuration files for the EXA freeze with driver 2.7.1
There is some interesting news:
Still using driver 2.7.1, I experimented with the server configuration and got a freeze of the same kind (outwardly the same: after a few seconds and actions) with EXA acceleration as well.
I exchanged only the x11-xorg-driver-video package but left the other packages, such as libdrm, libpciaccess and so on, as they were before (OpenSUSE 11.2 factory versions). I'm not sure whether there is a compatibility problem. I just wanted to give a hint that the problem might even be somewhere outside the driver, but I will leave this decision to you :-)
See attachments for log and configuration.
I continued some testing with this:
I'm now on self-compiled driver 2.7.1 and libdrm 2.4.12.
It seems definitive from my point of view - the freeze happens with both acceleration methods, UXA and EXA, in driver 2.7.1, and UXA in 2.8.0.
(I also wanted to try XAA, but XAA doesn't come up at all with 2.7.1 because of some buffer allocation problem.)
IMHO, the problem seems to come rather from the kernel part - drm-intel in 126.96.36.199, right?
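For anyone repeating the EXA/UXA/XAA comparison, here is a minimal sketch that renders an xorg.conf Device section for a chosen acceleration method. It assumes the intel driver of the 2.7.x era honors the "AccelMethod" option with these three values; whether the attached configurations set it the same way is not confirmed here. The identifier "Card0" is a hypothetical placeholder and has to match your Screen section.

```python
# Acceleration methods discussed in this report (assumed valid AccelMethod values).
VALID_ACCEL_METHODS = ("XAA", "EXA", "UXA")


def device_section(accel_method, identifier="Card0"):
    """Return the text of an xorg.conf Device section for the intel driver.

    `identifier` is a hypothetical name; use the one your Screen section refers to.
    """
    method = accel_method.upper()
    if method not in VALID_ACCEL_METHODS:
        raise ValueError("unknown acceleration method: %r" % (accel_method,))
    lines = [
        'Section "Device"',
        '    Identifier  "%s"' % identifier,
        '    Driver      "intel"',
        '    Option      "AccelMethod" "%s"' % method,
        'EndSection',
    ]
    return "\n".join(lines) + "\n"
```

Calling device_section("uxa") gives the section text to paste into xorg.conf; only the AccelMethod line changes between test runs, which keeps the comparison limited to one variable.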
For completing this:
The problem couldn't even be solved with the latest kernel 2.6.31-rc4 (OpenSUSE Kernel:/Head repository for OpenSUSE factory), driver 2.8.0 and libdrm 2.4.12.
It seems I'm stuck with the vesa driver until this gets solved.
Thank you very much for your bug report. I greatly appreciate you trying the various versions of the driver, the various acceleration architectures, and the various kernels.
The bug seems pervasive enough that I should hopefully not have any difficulty reproducing it with an 865 here. I'll try that as soon as I can, and let you know what I can come up with.
Created attachment 28301 [details]
Latest configuration and log that works - 2.7.1 with XAA, kernel 188.8.131.52
I thought it could be helpful if you also had a configuration that runs:
I gave XAA with driver 2.7.1, kernel 184.108.40.206 a second try. Now it works and this is the latest driver version and configuration I get working at the moment.
I have to retract what I said about 2.7.1 and XAA using kernel 220.127.116.11. Even this combination is fairly unstable. I get a black screen instead of kdm, and only every second start (using the reset button) worked.
After the latest update to OpenSUSE 11.2 Milestone 5 there is kernel 2.6.31-rc4. Using XAA with this kernel, it gets more unstable than before and also freezes as initially reported for UXA.
To me it seems that this problem is independent of the acceleration method used and is most probably somewhere in the kernel part of the driver.
If you want to reproduce it, try KDM and KDE4 (it is not necessary to enable desktop effects à la Compiz). KDE4 does something that makes X freeze with the 2.8.0 or 2.7.1 driver after a few seconds or sometimes a few minutes.
There is just one driver working for me again now: vesa.
*** Bug 23088 has been marked as a duplicate of this bug. ***
I got the same kind of random freezes with my 865G under 2.6.30 using intel 2.7.1 in exa mode.
same symptoms, like screen hangs but ssh still works.
seems very hard to locate the issue:
- intel driver?
- KDE3? / KDE4?
- anything else?
helping ideas would be really appreciated.
Hi Thomas,
if you can also ssh in and grab a gpu dump when the server hangs, that would be useful. See http://intellinuxgraphics.org/intel-gpu-dump.html for instructions. The gpu dump René captured is most curious...
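Grabbing the dump from a second machine can be scripted. The following is only a sketch under assumptions: intel_gpu_dump is installed on the hung machine and on root's PATH there, it writes the dump to standard output, and root login over ssh is allowed. The host name in the usage note is a placeholder, and the function names are made up for this example.

```python
import subprocess


def gpu_dump_command(host, user="root"):
    """Build the ssh command line that runs intel_gpu_dump on the hung machine."""
    return ["ssh", "%s@%s" % (user, host), "intel_gpu_dump"]


def capture_gpu_dump(host, out_path="intel_gpu_dump.txt", timeout=60.0):
    """Run intel_gpu_dump over ssh and save its standard output locally.

    Raises subprocess.CalledProcessError if ssh or the tool exits non-zero,
    and subprocess.TimeoutExpired if the remote side does not answer in time.
    """
    with open(out_path, "wb") as out:
        subprocess.run(gpu_dump_command(host), stdout=out,
                       check=True, timeout=timeout)
    return out_path
```

For example, capture_gpu_dump("hung-865g-box") run from the second machine while X is frozen leaves a file that can be attached to this bug.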
(In reply to comment #18)
> Hi Thomas,
> if you can also ssh in and grab a gpu dump when the server hangs, that would
> be useful. See http://intellinuxgraphics.org/intel-gpu-dump.html for
> instructions. The gpu dump René captured is most curious...
... but it was done in the same way - logged on to ssh on the host with the hanging X from a different machine and made the dump.
BTW: Nothing new even with the combination of driver 2.8.1 and kernel 2.6.31-rc7; the hang still occurs "reliably" after a few seconds and a few actions in the KDE4 systemsettings tool, started from within the ICEWM window manager (which is my "reference test").
Created attachment 29116 [details]
gpu dump by ssh using>>>
Result: Screen hangs except for the mouse pointer. (Even Ctrl+Alt+Backspace has no effect.)
Thanks for any helping hints.
Created attachment 29117 [details]
and this was logged.
Created attachment 29119 [details]
and right here another dump.
Created attachment 29180 [details]
Created attachment 29181 [details]
and the corresponding backtrace...
dmesg shows nothing exceptional.
Xorg.log same as above...
Does anyone have a workaround or a good xorg.conf for i865G on a 2.6.31 kernel except using the VESA driver until the bug gets fixed? The VESA driver comes with many other rendering problems and side effects, for instance in Eclipse, and is, of course, very slow.
I'm very happy to report that thanks to Eric Anholt, we now have what we believe
is a fix for these annoying 865 hangs.
René, can you attempt the kernel patch posted by Eric here and report back
whether the bug is fixed?
*** Bug 23365 has been marked as a duplicate of this bug. ***
*** Bug 22947 has been marked as a duplicate of this bug. ***
We've gotten several out-of-band confirmations of Eric's patch fixing
865 GPU hangs. So I'm going to mark this bug fixed now.
René, if you test the newer kernel and find the bug still present, please
feel free to reopen the bug.
Thanks for all the help!
This patch does fix the problem for me. Thanks.
BUT: If I compile the i915 driver as a module, clflush_cache_range is an unresolved symbol ??
I had to compile the driver (and agp driver) in to the kernel for this to resolve.
Should I open a new bug for this?
(In reply to comment #30)
> This patch does fix the problem for me. Thanks.
Excellent. Thanks for the confirmation.
> BUT: If I compile the i915 driver as a module, clflush_cache_range is an
> unresolved symbol ??
> I had to compile the driver (and agp driver) in to the kernel for this to
> resolve.
> Should I open a new bug for this?
No need to open a new bug. Brice noticed this as well, reported it, and proposed a fix here:
I believe Eric has already implemented this so it will be going out this way with kernel releases (if it hasn't started appearing already).
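An unresolved clflush_cache_range when i915 is built as a module suggests the symbol was not exported to modules by the kernel being built. One way to check a given build tree is to look the symbol up in its Module.symvers file. The sketch below assumes the whitespace-separated layout "CRC, symbol, module, export type" used by kernels of this era; the function names are made up for this example, and the sample line in the usage note is illustrative, not taken from a real build.

```python
def exported_symbols(symvers_text):
    """Map symbol name -> (module, export type) from Module.symvers content.

    Assumes at least four whitespace-separated columns per line:
    CRC, symbol, module, export type. Shorter lines are skipped.
    """
    symbols = {}
    for line in symvers_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name, module, export_type = fields[1], fields[2], fields[3]
        symbols[name] = (module, export_type)
    return symbols


def is_exported(symvers_text, symbol):
    """Return True if `symbol` is listed as exported in the given content."""
    return symbol in exported_symbols(symvers_text)
```

Reading Module.symvers from the top of the kernel build directory and calling is_exported(text, "clflush_cache_range") tells you whether a modular i915 can resolve the symbol against that kernel. An illustrative line that would make it return True is "0x00000000<TAB>clflush_cache_range<TAB>vmlinux<TAB>EXPORT_SYMBOL".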
Apparently it has been committed to Linus' git kernel tree
by the following commit:
Author: Eric Anholt <email@example.com>
Date: Thu Sep 10 17:48:48 2009 -0700
agp/intel: Fix the pre-9xx chipset flush.
Ever since we enabled GEM, the pre-9xx chipsets (particularly 865) have had
serious stability issues. Back in May a wbinvd was added to the DRM to
work around much of the problem. Some failure remained -- easily visible
by dragging a window around on an X -retro desktop, or by looking at
The chipset flush was on the right track -- hitting the right amount of
memory, and it appears to be the only way to flush on these chipsets, but the
flush page was mapped uncached. As a result, the writes trying to clear the
writeback cache ended up bypassing the cache, and not flushing anything! The
wbinvd would flush out other writeback data and often cause the data we wanted
to get flushed, but not always. By removing the setting of the page to UC
and instead just clflushing the data we write to try to flush it, we get the
desired behavior with no wbinvd.
This exports clflush_cache_range(), which was laying around and happened to
basically match the code I was otherwise going to copy from the DRM.
Signed-off-by: Eric Anholt <firstname.lastname@example.org>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Sorry for the late answer:
Eric's fix hit the bull's-eye; the following combination works for me in OpenSUSE 11.2 RC1 factory:
- kernel 18.104.22.168
- xorg-server 1.6.5
- intel video driver 22.214.171.1242
Previous xorg server and video driver versions back to 2.7.1 might also work; it was the kernel patch that made the difference.