Created attachment 19371 [details]
xorg log with info about overflow in ring buffer or whatever
Both 2.4.1 (as packaged by Ubuntu in intrepid) and 2.5.96 from git head flash the screen several times (like the flashes seen during mode setting) and then freeze Xorg about 1 second after login (same thing with DRI disabled). If I set AccelMethod XAA, it starts and _mostly_ works fine (unless I launch, for instance, GIMP; then Xorg freezes again, 100% reproducibly, even in XAA mode).
While I've not seen it myself I've heard on IRC that some other users actually got G45 machines running ubuntu with 2.4.1 intel driver semi-working. No freeze, just strangely colored lines as reported here:
All of the people who commented on the launchpad bug have a different motherboard than mine though. My motherboard is a Gigabyte GA-EG45M-DS2H.
In my xorg log I see what looks, to my untrained eye, like a filled-up ring buffer (the xserver giving up because it waited more than 2 seconds and had nowhere to put the data that did not fit inside the ring buffer). I will attach the xorg log.
Thanks for reporting. But this seems to be a duplicate of bug#17235.
*** This bug has been marked as a duplicate of bug 17235 ***
I can still repro this bug and because bug 17235 was closed due to "no repro" I will reopen this one. Also note that I got a very different stacktrace/xorg.log and also this bug repros _every_ time I login to X.
Gordon, if you need additional information, please specify exactly what config you want me to run and I will try it and then attach stacks/xorg.log etc.
Currently, X still freezes using git head 2.4.97 taken from here:
For all other components I'm using what is in Ubuntu intrepid right now (should be semi-recent stuff afaik).
If I go "NoAccel" it does not freeze. I'm currently using XAA which works if I only use terminal+firefox. If I launch gimp or openoffice then even X in XAA mode freezes (but this is most likely another bug because it's a 100% CPU spin).
PS. I'm using an LCD flat panel (SyncMaster 2253BW from Samsung) connected through DVI.
OK. Let's stick to the xf86-video-intel master tip.
Zhenyu, we now have a G45-freeze reporter who is able to use git tip. Please work with him to find the root cause. We also have Intrepid but are not able to reproduce.
Keith has fixed this non-dri bug in upstream drm/xf86-video-intel, please test against that.
Okay, I just git pulled the following repo:
(at the time I got it keithp had two commits at the top and the tip was the one with description "For non-DRM, add NOOPs after BATCH_BUFFER_START to verify completion").
I did not change kernel, drm, mesa or anything from what is already in intrepid.
For xorg.conf I used AccelMethod EXA and ModeDebug true (and no other options in xorg.conf in this test run).
X.org still froze directly at login. I logged in through ssh and saw a tiny amount of CPU activity in dd, klogd and syslog. X itself was not taking any CPU.
I will attach a gdb session where I did "bt, bt full, c, CTRL-C, bt, bt full" and it shows basically that X is stuck doing ioctl like this:
#0 0x00007fe9c7325a17 in ioctl () from /lib/libc.so.6
#1 0x00007fe9c5f00c43 in drmIoctl (fd=10, request=1074029637, arg=0x7fffd14f9cf0) at xf86drm.c:183
#2 0x00007fe9c5f00ccb in drmCommandWrite (fd=10, drmCommandIndex=<value optimized out>, data=0x7fffd14f9cf0, size=18446744073709551615)
#3 0x00007fe9c5c821a8 in I830Sync (pScrn=0x1e46a70) at i830_accel.c:214
#4 0x00007fe9c5222f6c in exaWaitSync (pScreen=0x1e78f50) at ../../exa/exa.c:1051
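For readers following the trace, the `request=1074029637` value in frame #1 can be decoded to see which ioctl X is blocked in. This is a minimal sketch that assumes the generic Linux `_IOC` bit layout used on x86 and that driver-private DRM commands start at 0x40 (`DRM_COMMAND_BASE` in drm.h); the helper name `decode_ioctl` is made up for illustration.

```python
# Decode a Linux ioctl request number using the generic _IOC bit layout
# (nr: bits 0-7, type: bits 8-15, size: bits 16-29, dir: bits 30-31).
# Assumption: x86/x86-64 layout; some other architectures differ.

DRM_COMMAND_BASE = 0x40  # driver-private DRM ioctls start here (drm.h)

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into its _IOC fields."""
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "dir": DIRECTIONS[(request >> 30) & 0x3],
    }


request = 1074029637  # request= value from frame #1 of the backtrace
fields = decode_ioctl(request)
command_index = fields["nr"] - DRM_COMMAND_BASE

print(hex(request), fields)
print("driver command index:", command_index)
```

The result is a write-direction ioctl of type 'd' with a 4-byte argument and driver command index 5. In i915_drm.h index 0x05 is DRM_I915_IRQ_WAIT, which fits the I830Sync frame directly above it in the backtrace.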
Further, since the CPU activity seemed to suggest that there was some logging going on I looked at dmesg (will attach that as well) and it was completely filled with repeating entries like this:
[ 1249.152162] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.152164] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.172134] [drm:drm_ioctl] ret = fffffffc
[ 1249.172160] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.172163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.192134] [drm:drm_ioctl] ret = fffffffc
[ 1249.192160] [drm:drm_ioctl] pid=12668, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
[ 1249.192163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
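The repeating entries above can be summarised mechanically. The sketch below (an illustration only, with the regular expressions written for exactly the line format shown above) extracts the wait sequence number, the breadcrumb and the ioctl return code, then checks whether the breadcrumb ever advances. The errno lookup assumes Linux numbering.

```python
import errno
import re

# Sample lines copied from the dmesg excerpt above.
DMESG = """\
[ 1249.152164] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.172134] [drm:drm_ioctl] ret = fffffffc
[ 1249.172163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
[ 1249.192134] [drm:drm_ioctl] ret = fffffffc
[ 1249.192163] [drm:i915_wait_irq] irq_nr=2836 breadcrumb=2800
"""

WAIT_RE = re.compile(
    r"\[\s*([\d.]+)\] \[drm:i915_wait_irq\] irq_nr=(\d+) breadcrumb=(\d+)"
)
RET_RE = re.compile(r"\[drm:drm_ioctl\] ret = ([0-9a-f]{8})")


def to_signed32(hex_text):
    """Interpret an 8-digit hex string as a signed 32-bit integer."""
    value = int(hex_text, 16)
    return value - (1 << 32) if value & 0x80000000 else value


# (timestamp, sequence number waited for, sequence number the hardware reports)
waits = [(float(t), int(want), int(seen)) for t, want, seen in WAIT_RE.findall(DMESG)]
return_codes = sorted({to_signed32(h) for h in RET_RE.findall(DMESG)})
error_names = [errno.errorcode.get(-code, "?") for code in return_codes]

# The wait is stalled if every entry shows the same (wanted, seen) pair.
stalled = len({(want, seen) for _, want, seen in waits}) == 1

print("waits:", waits)
print("return codes:", return_codes, error_names)
print("breadcrumb stalled:", stalled)
```

Read this way, every wait asks for sequence 2836 while the breadcrumb stays at 2800, and each attempt returns -4 (EINTR on Linux) roughly 20 ms after it starts. As far as I know, libdrm's drmIoctl retries the call on EINTR, which would explain the endless repetition and why X itself uses almost no CPU.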
Finally, I also saved (and will attach) xorg.log even though I couldn't see anything super interesting in there (but you might!).
Created attachment 19567 [details]
gdb bt for EXA+ModeDebug freeze
Created attachment 19568 [details]
dri spam in dmesg for EXA+ModeDebug freeze on login
Created attachment 19569 [details]
xorg.log for EXA+ModeDebug freeze on login
Keith's fixes are also in drm, so you have to pull drm from git; just build libdrm and test with it for now.
I've deleted /lib/libdrm* and /usr/local/libdrm* and then ran "sudo ldconfig".
Then I pulled git://anongit.freedesktop.org/git/mesa/drm and from that root I did:
sudo make install
(Note: I didn't copy any .ko files since [keithp told me that] the .ko files shipping with libdrm are out of date. I also did NOT pull this tree: http://git.kernel.org/?p=linux/kernel/git/anholt/drm-intel.git;a=summary and I assume it's okay to ignore those bits since I'm using DRI=false, right?)
After building libdrm as described above I verified that there was a recently written version of libdrm.so* written into /usr/lib/libdrm.so* (seems to have worked afaik).
The libdrm "git log" at this point certainly had keithp and eric's fixes from last week plus four (mostly freebsd) fixes at the HEAD.
After that I pulled git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel
and built it using:
sudo make install
Then I configured EXA+DRIfalse+ModeDebug and rebooted. I'm attaching xorg_log+dmesg+gdb trace showing a SEGV (even though it's on the error handling path triggered by some lockup).
Created attachment 19598 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (XORG LOG)
Created attachment 19599 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (DMESG)
Created attachment 19600 [details]
libdrm/xf86-intel GIT masters with EXA+DRI_false+ModeDebug (GDB BT SHOWING SEGV)
Again please let me know if you need more/different info (or if I didn't get libdrm installed correctly). It's a shame that the loaded libdrm version number isn't printed into xorg_log :-(
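Since xorg.log does not show which libdrm was loaded, one way to check is to look at the running X server's memory map. The sketch below parses text in the /proc/<pid>/maps format and lists the libdrm shared objects it finds. The function name and the sample data are hypothetical (addresses, inode numbers and file versions are made up for illustration).

```python
import re


def loaded_libdrm_paths(maps_text):
    """Return the distinct libdrm shared-object paths found in /proc/<pid>/maps text."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, pathname
        parts = line.split(None, 5)
        if len(parts) == 6 and re.search(r"/libdrm[^/]*\.so", parts[5]):
            path = parts[5].strip()
            if path not in paths:
                paths.append(path)
    return paths


# Hypothetical sample in the /proc/<pid>/maps format.
SAMPLE = """\
7fe9c5efd000-7fe9c5f06000 r-xp 00000000 08:01 123456 /usr/lib/libdrm.so.2.4.0
7fe9c5f06000-7fe9c6105000 ---p 00009000 08:01 123456 /usr/lib/libdrm.so.2.4.0
7fe9c7250000-7fe9c73a8000 r-xp 00000000 08:01 654321 /lib/libc-2.8.90.so
"""

print(loaded_libdrm_paths(SAMPLE))
```

On a live system the input would come from reading /proc/<pid>/maps for the X server's pid (this normally requires root), and the modification time of the reported file can then be compared with the time of the "make install".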
On a positive note, using libdrm/xf86-intel GIT masters now allows me to do VT switching in XAA mode (I couldn't do that before). Xorg still freezes when I launch GIMP or OpenOffice in XAA mode (I haven't filed a bug on that because I assume XAA is lower priority).
Martin, exactly which distribution are you testing? I tried the Ubuntu 8.10 Beta ( build date is 8th, Oct 2008) on a GA-EG43M-DS2H ( not 5 ) and it works fine... could you burn a live cd and test with it, so that we can make sure the SW environment we use to reproduce this bug is the same?
Good idea Michael. To test this I burned the following image to disk and booted from it:
It froze on startup just like my main configuration (which is intrepid bleeding edge, installing updates as they come). This narrows it down to the HW difference (I have G45 and you have G43, right?)
Beyond software and hardware there are also the BIOS settings, even though I assume they are very unlikely to be the culprit. In my BIOS I currently have:
"On board VGA" set to "Enable if no ext PEG"
(and I don't have any graphics card in the machine beyond the on board G45)
and I also have:
"On-chip Frame Buffer Size" set to "32MB+2MB for GTT".
(I've never tried to boot with any other settings; these are the factory defaults I think).
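To put the "32MB+2MB for GTT" setting in perspective, here is a small arithmetic sketch. It assumes 4-byte GTT entries and 4 KiB pages; neither figure is stated in this report, so treat the numbers as an estimate rather than a hardware reference.

```python
# Assumptions (not stated in the thread): 4-byte GTT entries, 4 KiB pages.
PTE_BYTES = 4
PAGE_BYTES = 4 * 1024
MIB = 1024 * 1024


def gtt_entries(gtt_bytes):
    """Number of page-table entries that fit in a GTT of the given size."""
    return gtt_bytes // PTE_BYTES


def mappable_bytes(gtt_bytes):
    """Graphics address space that a GTT of the given size can map."""
    return gtt_entries(gtt_bytes) * PAGE_BYTES


def pages_in(region_bytes):
    """Number of pages (and so GTT entries) needed to cover a memory region."""
    return region_bytes // PAGE_BYTES


gtt = 2 * MIB      # "2MB for GTT" from the BIOS setting
stolen = 32 * MIB  # "32MB" on-chip frame buffer from the BIOS setting

print("GTT entries:", gtt_entries(gtt))
print("address space covered (MiB):", mappable_bytes(gtt) // MIB)
print("entries needed for stolen memory:", pages_in(stolen))
```

Under those assumptions the 2MB table holds 524288 entries, enough to map 2048 MiB, and only 8192 of them are needed to cover the 32MB of stolen memory.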
Since this bug is also reproducible in the XAA case, I'm removing the EXA tag...
Martin, please check out Keith's post on email@example.com.
Great news, when I apply the intel-agp patch Keith posted to intel-gfx recently I can actually login to my machine in EXA mode. The patch I applied was posted with subject line "[Intel-gfx] G45 BIOS mis-initializes stolen GTT PTEs"
This needs to be retested against current drm-intel-next kernel and master xf86-video-intel.
Martin, the patch on drm-intel-next is a bit different from Keith's original one, so your retesting is important for this fix. Thanks!
I cloned drm-intel-next today and then I git-format-patched just this single commit:
I then applied that patch (using "patch -p1 < file.patch") onto the current ubuntu intrepid intel-agp.ko and that was sufficient for me to be able to boot into EXA.
I had no special stuff in xorg.conf except ModeDebug and I also got direct rendering, I was able to launch glxgears, gimp, frets on fire and some other random apps. So far so good.
I tried (but didn't succeed, because I don't really understand how all this stuff works) to build the full config with:
masters of xf86intel,libdrm.so,i915.ko,drm.ko,intel-agp.ko etc
I think the problem was that I tried to build the modules against my ubuntu .27 kernel and there was a bunch of GEM stuff missing in that kernel. So I got a bunch of errors like these:
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c: In function ‘drm_gem_init’:
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c:74: error: ‘struct drm_device’ has no member named ‘object_name_lock’
/home/mnemo/src/intel_driver/drm-intel/drivers/gpu/drm/drm_gem.c:75: error: ‘struct drm_device’ has no member named ‘object_name_idr’
Is it possible to build eric's tree without GEM, I mean is there a define I can use to disable it or something?
Otherwise, if I must build with GEM should I build the kernel itself with all modules and then install it into GRUB in order to test it or how is this stuff usually done?
Thanks for testing! I think that agp patch is the only sensible one for this bug, and it clearly fixes your problem. I'm closing this one.
Yeah, the GEM kernel needs some other changes to build, but it's easier now that it's in the upstream kernel, so just pull Linus's tree and build it.