Created attachment 14373 [details]
dmesg, xorg.conf, Xorg.0.log/old

I got a crash with 2.2.0.90 when my laptop was under high memory pressure and load. I've attached the two Xorg.log files from /var/log as they both had the same timestamp. I think it is Xorg.log rather than Xorg.log.old though. I didn't have any GL apps running at the time, just xchat, gnome-terminal, liferea, galeon and evolution. I was compiling a big C++ app (synfig) with -j2 at the time, which caused the memory pressure.

I occasionally get crashes, the result is usually that the screen is messed up (purple mess at the top) but the kernel is fine, so pressing the power button results in a graceful shutdown.

I'm using a Dell Inspiron 6400 laptop.

$ lspci -n | grep 00:02
00:02.0 0300: 8086:27a2 (rev 03)
00:02.1 0380: 8086:27a6 (rev 03)

$ lspci | grep 00:02
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

$ uname -m
i686

$ dpkg -s xserver-xorg-video-intel xserver-xorg xserver-xorg-core libdrm2 libgl1-mesa-dri | egrep '(Version|Package)'
Package: xserver-xorg-video-intel
Version: 2:2.2.0.90-3
Package: xserver-xorg
Version: 1:7.3+10
Package: xserver-xorg-core
Version: 2:1.4.1~git20080131-1
Package: libdrm2
Version: 2.3.0-4
Package: libgl1-mesa-dri
Version: 7.0.2-4

$ uname -r
2.6.24-1-686

$ cat /etc/lsb-release
DISTRIB_ID=Debian
DISTRIB_RELEASE=
DISTRIB_CODENAME=sid
DISTRIB_DESCRIPTION="Debian GNU/Linux"

$ xrandr --verbose
Screen 0: minimum 320 x 200, current 1280 x 800, maximum 2048 x 2048
VGA disconnected (normal left inverted right x axis y axis)
    Identifier: 0x4b
    Timestamp: 174813
    Subpixel: unknown
    Clones:
    CRTCs: 0 1
LVDS connected 1280x800+0+0 (0x4e) normal (normal left inverted right x axis y axis) 331mm x 207mm
    Identifier: 0x4c
    Timestamp: 174813
    Subpixel: horizontal rgb
    Clones:
    CRTC: 1
    CRTCs: 1
    EDID_DATA:
        00ffffffffffff004ca3000000000000
        00100103802115780a87f594574f8c27
        27505400000001010101010101010101
        010101010101c71b00a0502017303020
        26004bcf100000190000000f00000000
        00000000002387026400000000fe0044
        463035360331353458330a20000000fe
        002740505a81b0d9ff01010a2020009d
    BACKLIGHT_CONTROL: kernel
        supported: native legacy combination kernel
    BACKLIGHT: 100 (0x00000064) range: (0,100)
  1280x800 (0x4e) 71.1MHz -HSync -VSync
        h: width 1280 start 1328 end 1360 total 1440 skew 0 clock 49.4KHz
        v: height 800 start 802 end 808 total 823 clock 60.0Hz
  1280x800 (0x4f) 83.5MHz
        h: width 1280 start 1344 end 1480 total 1680 skew 0 clock 49.7KHz
        v: height 800 start 801 end 804 total 828 clock 60.0Hz
  1280x768 (0x50) 80.1MHz
        h: width 1280 start 1344 end 1480 total 1680 skew 0 clock 47.7KHz
        v: height 768 start 769 end 772 total 795 clock 60.0Hz
  1024x768 (0x51) 65.0MHz -HSync -VSync
        h: width 1024 start 1048 end 1184 total 1344 skew 0 clock 48.4KHz
        v: height 768 start 771 end 777 total 806 clock 60.0Hz
  800x600 (0x52) 40.0MHz +HSync +VSync
        h: width 800 start 840 end 968 total 1056 skew 0 clock 37.9KHz
        v: height 600 start 601 end 605 total 628 clock 60.3Hz
  640x480 (0x53) 25.2MHz -HSync -VSync
        h: width 640 start 656 end 752 total 800 skew 0 clock 31.5KHz
        v: height 480 start 490 end 492 total 525 clock 59.9Hz
TV disconnected (normal left inverted right x axis y axis)
    Identifier: 0x4d
    Timestamp: 174813
    Subpixel: unknown
    Clones:
    CRTCs: 0 1
    BOTTOM: 37 (0x00000025) range: (0,100)
    RIGHT: 46 (0x0000002e) range: (0,100)
    TOP: 36 (0x00000024) range: (0,100)
    LEFT: 54 (0x00000036) range: (0,100)
    TV_FORMAT: NTSC-M
        supported: NTSC-M NTSC-443 NTSC-J PAL-M PAL-N PAL
Oh, please don't zip attachments. This has been mentioned in http://intellinuxgraphics.org/how_to_report_bug.html. Has this ever worked with the older driver (e.g. 2.2.0 or 2.1.x)?
Woops, missed that part of the howto.

> Has this ever worked with the older driver (e.g. 2.2.0 or 2.1.x)?

You mean has it ever crashed on random occasions before? Yes, probably once every one to three months with 2.2.0. Before 2.2.0, it crashed when switching VTs though.
Does the crash happen if you disable framebuffer compression?
Created attachment 14430 [details] [review] let FBC assert immediate idle Assuming disabling FBC fixes your problem, I'm curious if this patch will help. It should change the way the chip deals with the compressed frame buffer, and hopefully be less intrusive to your memory traffic.
How do I disable framebuffer compression? Google wasn't too helpful with that unfortunately. Anyway, since it isn't something that is easily reproducible, I doubt I would be able to say that disabling FBC or applying the patch helps.
The man page should have what you need. IIRC it's Option "FramebufferCompression" "false" in the 'intel' driver section of your xorg.conf. Just run for awhile with it and see if it helps...
Created attachment 14965 [details]
another crash

I got another crash with framebuffer compression off.

Section "Device"
    Identifier "Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller"
    Driver "intel"
    BusID "PCI:0:2:0"
    #Testing if it helps this: http://bugs.freedesktop.org/show_bug.cgi?id=14539
    Option "FramebufferCompression" "false"
EndSection
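For anyone repeating this test, it is worth confirming that the Option line is actually active and has not been swallowed by the "#" comment in front of it. The following is only a minimal Python sketch for that check; the helper name active_options and the naive comment stripping are assumptions made for illustration, not part of any Xorg tooling.

```python
import re

# Matches: Option "Name" "value"  (the value part is optional)
OPTION_RE = re.compile(r'^\s*Option\s+"([^"]+)"(?:\s+"([^"]*)")?', re.IGNORECASE)


def active_options(section_text):
    """Return {lowercased option name: value} for Option lines that are not commented out."""
    options = {}
    for line in section_text.splitlines():
        # Naive comment stripping: assumes no '#' appears inside a quoted value.
        code = line.split("#", 1)[0]
        match = OPTION_RE.match(code)
        if match:
            options[match.group(1).lower()] = match.group(2)
    return options


# Device section as posted in the comment above, one directive per line.
SAMPLE = '''
Section "Device"
    Identifier "Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller"
    Driver "intel"
    BusID "PCI:0:2:0"
    #Testing if it helps this: http://bugs.freedesktop.org/show_bug.cgi?id=14539
    Option "FramebufferCompression" "false"
EndSection
'''

print(active_options(SAMPLE))
```

If the Option directive ended up on the same line as the comment, the helper would return an empty dictionary, which is a quick hint that the setting never reached the driver. Xorg.0.log remains the authoritative place to see which options the driver really picked up.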
Created attachment 14966 [details] .old log from second crash
Paul, the last log you attached has a GPU lockup, but I don't see much else in the log that would be correlated, except for maybe the PM events. Does this crash happen more often when you open/close the lid or press display hot switch keys? Would it be possible for you to follow the info at intellinuxgraphics.org to get a backtrace of your crash?
I assume you mean this:

http://www.x.org/wiki/Development/Documentation/ServerDebugging

I couldn't get the Xgdb setup working, but I was eventually able to enable core files and NoTrapSignals, so next time I get a crash I'll be able to provide a backtrace.

The crashes happen only very occasionally. The only thing they seem to be correlated with is high load.
Paul, have you seen any crashes lately with the latest driver bits? This one is going to be really hard to fix without more info. :) Thanks, Jesse
Created attachment 16678 [details] log from crash with 2.3.1 I had another crash with 2.3.1 shortly after it was uploaded to Debian. Attached the main xorg log, will attach the .log.old in the next message.
Created attachment 16679 [details] log.old from crash with 2.3.1
Created attachment 16841 [details] another 2.3.1 crash I just got another crash with 2.3.1, attached.
Created attachment 16842 [details] log.old from 2.3.1 crash
I should mention about this latest crash that Xorg didn't start correctly when gdm brought it back up - the screen stayed black and I couldn't switch to another VT. I had to login via ssh and reboot to get it to work again.
This bug looks similar: http://bugs.debian.org/484049 I guess the issue is something to do with this ring buffer, whatever that is.
I just got another ring buffer issue, crash, start a dead X, reboot. Is it useful to add the logs for this latest crash?

As I see it, there are two problems in this bug:

- The ring buffer gets filled up and the driver doesn't know how to deal with it, so it kills X.
- The driver doesn't know how to reset the ring buffer when it is in a fucked up situation, so a reboot is needed to make the screen useful again.

Obviously fixing the first issue would be great, but really the second issue needs fixing first.
Created attachment 17073 [details]
backtrace

I got another crash with the same symptoms. Is there some easy way to change the size of the ring buffer so that I can get this more often, or preferably less often?

For some reason Xorg dumped a core file this time, so I have attached the backtrace.
The ring buffer error is just how crashes show up, the real problem is some sort of chip programming bug. bugs.fdo isn't being very co-operative right now, but I'll look at the logs you posted later today; hopefully they'll have something for us to go on (the one with the backtrace sounds promising).
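To illustrate why a "ring buffer" error is only the symptom, here is a toy Python model of a producer waiting for space in a command ring. It is a sketch under stated assumptions and not the driver's code: the names ToyRing, wait_for_space and Lockup are hypothetical, and the 8-byte reserve and the "reset the clock on progress" rule are choices made for the example. The point is that the wait only fails when the consumer (the GPU, in the real case) stops advancing its head pointer, whatever the original cause was.

```python
import time


class Lockup(RuntimeError):
    """Raised when the consumer makes no progress before the timeout expires."""


class ToyRing:
    RESERVED = 8  # bytes kept unused so that head == tail can only mean "empty"

    def __init__(self, size):
        self.size = size
        self.head = 0  # next byte the consumer will read
        self.tail = 0  # next byte the producer will write

    def space(self):
        free = self.head - (self.tail + self.RESERVED)
        return free + self.size if free < 0 else free

    def emit(self, nbytes):
        if nbytes > self.space():
            raise ValueError("not enough space; wait first")
        self.tail = (self.tail + nbytes) % self.size

    def consume(self, nbytes):
        pending = (self.tail - self.head) % self.size
        self.head = (self.head + min(nbytes, pending)) % self.size


def wait_for_space(ring, nbytes, timeout_s, poll=lambda: None):
    """Block until `nbytes` are free; raise Lockup if head stalls for timeout_s."""
    deadline = time.monotonic() + timeout_s
    last_head = ring.head
    while ring.space() < nbytes:
        poll()  # hypothetical hook that lets the consumer run in this toy model
        if ring.head != last_head:
            last_head = ring.head
            deadline = time.monotonic() + timeout_s  # progress resets the clock
        elif time.monotonic() > deadline:
            raise Lockup("consumer stopped draining the ring")


ring = ToyRing(1024)
ring.emit(100)
# A healthy consumer drains the ring, so waiting for it to empty succeeds.
wait_for_space(ring, ring.size - ToyRing.RESERVED, 0.5, poll=lambda: ring.consume(50))
```

With a stalled consumer (the default no-op poll), the same call raises Lockup once the timeout passes. Making the ring larger in this model would only delay that moment, which fits the remark above that the ring error is just how the crash shows up.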
Well, the backtrace indicates that things went bad at some point when the server tried to do a putimage, but doesn't give us much else. It's likely that the driver or a DRI client broke things earlier and the backtrace just indicates when we noticed there was a problem.

A couple of things to try:
- does the crash occur if you use the ExaNoComposite option set to "true"?
- does the crash occur only after you've run 3D applications/screensavers?
- is it correlated at all with DPMS? i.e. only happens after your screen turns back on or something?

It would be really helpful if you could provide a reliable way of reproducing this, ideally with just a .xinitrc & startx so we could try to reproduce the problem ourselves...
Oh, and it looks like you may be running compiz or some other compositing manager? Is that true? Can you reproduce the problem without?
The latest crash was with metacity 2.22.0, which I don't think has compositing yet. The first crash was with metacity 2.20.2. I've used compiz occasionally but it was too flashy for my liking so I went back to metacity ages ago.

It didn't seem to be correlated with anything, just happened randomly when I was in the middle of compiling, or when I pressed a key in xchat, or when browsing websites.

Until now, I had no reliable way to reproduce it. After reading the Debian bug I mentioned before more closely, I can reliably reproduce it by just putting 'scorched3d' in the .xinitrc file, running startx, then clicking "Settings", then "Normal Settings", then "OK", then "Play". This is the scorched3d and intel driver from Debian sid.

I did some switching of ExaNoComposite, XAA, DPMS, Composite and XAANoOffscreenPixmaps and couldn't find any combination where the crash did not occur.
PS: If you cannot reproduce it on your hardware, I'll be at DebConf/DebCamp in August with the laptop where this is happening.
Ah great, that helps a lot. I'll try scorched3d (maybe sometime next week) and see if I can reproduce the issue.
Btw, Nian can you reproduce this crash after running scorched3d?
I can reproduce it against the tip of the intel 2D driver. My backtrace is similar to Paul's:

#0 0xffffe424 in __kernel_vsyscall ()
#1 0x4fa84fa0 in raise () from /lib/libc.so.6
#2 0x4fa868b1 in abort () from /lib/libc.so.6
#3 0x081b7d5c in FatalError (f=0xb7db753c "lockup\n") at log.c:554
#4 0xb7d7e7fe in I830WaitLpRing (pScrn=0x8211f20, n=131064, timeout_millis=2000) at i830_accel.c:150
#5 0xb7d7ec15 in I830Sync (pScrn=0x8211f20) at i830.h:869
#6 0xb7d8bbc5 in i830_stop_ring (pScrn=0x8211f20, flush=<value optimized out>) at i830_driver.c:1841
#7 0xb7d8bd10 in I830LeaveVT (scrnIndex=0, flags=0) at i830_driver.c:3281
#8 0x080bbcc9 in xf86XVLeaveVT (index=0, flags=0) at xf86xv.c:1278
#9 0xb7e232ff in glxDRILeaveVT (index=0, flags=0) at glxdri.c:1001
#10 0x080a21f7 in AbortDDX () at xf86Init.c:1102
#11 0x081b7828 in AbortServer () at log.c:406
#12 0x081b7d47 in FatalError (f=0xb7db753c "lockup\n") at log.c:552
#13 0xb7d7e7fe in I830WaitLpRing (pScrn=0x8211f20, n=131064, timeout_millis=2000) at i830_accel.c:150
#14 0xb7d7ec15 in I830Sync (pScrn=0x8211f20) at i830.h:869
#15 0xb7da538a in I830EXASync (pScreen=0x821eed8, marker=0) at i830_exa.c:154
#16 0xb7c58702 in exaWaitSync (pScreen=0x11d7) at exa.c:806
#17 0xb7c590b2 in exaPrepareAccess (pDrawable=0x85a13d8, index=0) at exa.c:352
#18 0xb7c6111f in ExaCheckPutImage (pDrawable=0x85a13d8, pGC=0x8552230, depth=32, x=0, y=0, w=8, h=25, leftPad=0, format=2, bits=0x85e9514 "âèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿßæêÿßæêÿ"...) at exa_unaccel.c:95
#19 0xb7c5a3cb in exaPutImage (pDrawable=0x85a13d8, pGC=0x8552230, depth=32, x=0, y=0, w=8, h=25, leftPad=0, format=2, bits=0x85e9514 "âèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿßæêÿßæêÿ"...) at exa_accel.c:246
#20 0x0816b664 in damagePutImage (pDrawable=0x85a13d8, pGC=0x8552230, depth=32, x=0, y=0, w=8, h=25, leftPad=0, format=2, pImage=0x85e9514 "âèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿâèìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçìÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿáçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàçëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿàæëÿßæêÿßæêÿ"...) at damage.c:790
#21 0x0808382e in ProcPutImage (client=0x8457610) at dispatch.c:2144
#22 0x08148f91 in XaceCatchDispatchProc (client=0x8457610) at xace.c:281
#23 0x0808726b in Dispatch () at dispatch.c:502
#24 0x0806e5e6 in main (argc=4, argv=0xbf87e0c4, envp=Cannot access memory at address 0x11df ) at main.c:452
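One thing that stands out in the backtrace above is that FatalError appears twice: once from the PutImage path (frame #12) and again while the server is already aborting and leaving the VT (frame #3). A small Python sketch for spotting such repeats in a pasted gdb backtrace follows. The regular expression and the helper name frame_functions are assumptions for illustration and only cover frames shaped like the ones in this thread.

```python
import re
from collections import Counter

# Matches "#N 0xADDR in func (" as well as "#N func (" for frames without an address.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)\s*\(")


def frame_functions(backtrace):
    """Return [(frame_number, function_name), ...] in the order they appear."""
    return [(int(num), name) for num, name in FRAME_RE.findall(backtrace)]


# A few frames copied from the backtrace above (arguments left as posted).
SAMPLE = (
    '#3 0x081b7d5c in FatalError (f=0xb7db753c "lockup\\n") at log.c:554 '
    "#4 0xb7d7e7fe in I830WaitLpRing (pScrn=0x8211f20, n=131064, timeout_millis=2000) at i830_accel.c:150 "
    "#7 0xb7d8bd10 in I830LeaveVT (scrnIndex=0, flags=0) at i830_driver.c:3281 "
    "#10 0x080a21f7 in AbortDDX () at xf86Init.c:1102 "
    '#12 0x081b7d47 in FatalError (f=0xb7db753c "lockup\\n") at log.c:552 '
    "#13 0xb7d7e7fe in I830WaitLpRing (pScrn=0x8211f20, n=131064, timeout_millis=2000) at i830_accel.c:150"
)

frames = frame_functions(SAMPLE)
counts = Counter(name for _num, name in frames)
repeated = sorted(name for name, count in counts.items() if count > 1)
print(repeated)
```

Because the pattern does not depend on line breaks, it works on a backtrace that has been flattened onto one line as well as on normal gdb output. The repeated frames suggest the second lockup is a consequence of the abort path trying to sync the same stuck ring, not a separate failure.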
I can't remember clearly what the problem with scorched3d is, but the last time I tried, keithp told me there was a problem with some texture setting (I don't remember which) that crashed on 945. That should be a separate 3D DRI bug, though, and I'm still not sure it is the real reason for Paul's crash.

Paul, what's your mesa version? EXANoComposite still crash for you?
(In reply to comment #28)
> what's your mesa version?

My mesa version is Debian's 7.03-2, which seems to be taken from mesa git, commit id 03447de3:

http://packages.debian.org/changelogs/pool/main/m/mesa/current/changelog

> EXANoComposite still crash for you?

As I stated in #23 comment 3, I couldn't find any situation that prevented the crash, including turning EXANoComposite on and off.
Sorry I've totally failed on this one, reassigning to the DRI guys, maybe they can come up with a fix.
(In reply to comment #29)
> (In reply to comment #28)
> > what's your mesa version?
>
> My mesa version is Debian's 7.03-2, which seems to be taken from mesa git,
> commit id 03447de3:
>

There is a regression bug with 7.03-2. Could you try with the latest stable version 7.0.4, which includes a lot of 3D fixes?
mesa 7.0.4 isn't yet available in Debian and Debian lenny will be released with 7.0.3.
Paul, are you able to get Mesa 7.0.4 now?

BTW, this sounds pretty much like several other 945GM "intermittent crash" bugs. What if you turn DRI off in the config file? Search for "intermittent" in the bug summary field to see the others.

Thanks.
No, Debian is still in freeze, usually no new upstreams are allowed in during freeze.

This bug began with the intermittent crash* but got re-purposed to be about the immediate crash when scorched3d enters a game. The symptoms of the scorched3d crash are the same as the intermittent crash - not enough space in the ring of death/etc.

Disabling DRI prevents the scorched3d crash from happening with the 7.0.3 in Debian lenny.

Upgrading to mesa 7.1 from Debian experimental doesn't prevent the scorched3d crash. I would try 7.2 but the amd64 build isn't on the Debian mirrors yet.

*The intermittent crash still exists; I recently switched to amd64 on the same hardware and got the same intermittent crash yesterday. As usual, xorg failed to start because the ring buffer was fucked and I had to reboot. It would be nice if xorg knew how to clear the hardware state when it gets messed up so reboots aren't needed.
Since about July 2008 my Xorg session has crashed intermittently, sometimes daily, and actually just crashed as I was writing this comment. My system is a Dell Inspiron 6000 with an Intel Mobile 915GM Express graphics controller and runs Debian Lenny with a 2.6.26-1-686 kernel. I've been watching the scorched3d Debian bug at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=484049 but recently had time to investigate further and followed Paul Wise's link on that page to this bug upstream. Upgrading Mesa from 7.0.3-6 in Lenny to 7.2-1 in the experimental repositories doesn't seem to prevent the crashes, nor does using Metacity instead of Compiz Fusion. The crashes don't seem to correlate with ACPI events such as opening the lid after a suspend, but it seems all crashes have occurred while I was using the computer. I haven't tried scorched3d on my computer, and haven't found any other application that reliably causes crashes, but I do notice that most crashes seem to occur just before quick redraws of large sections of the screen, such as navigating to a new web page. I'm willing to collect logs and backtraces, but the logs already posted seem very similar to my own.
Comment on attachment 14430 [details] [review] let FBC assert immediate idle Marking the patch obsolete since it didn't prove relevant to the bug, and it shows up in my scan for open patches. I think this bug would be better served returning it to its original topic, and making scorched3d a separate issue, as there's no evidence that the two are connected. We're really interested in the "lockups in normal 2d activities", and this was one of the most informative bugs I've seen on that topic.
> I think this bug would be better served returning it to its original topic,
> and making scorched3d a separate issue, as there's no evidence that the two
> are connected. We're really interested in the "lockups in normal 2d
> activities", and this was one of the most informative bugs I've seen on that
> topic.

My Xorg session just crashed a few minutes ago, due to "lockups in normal 2d activities." However, about a week ago I tried running scorched3d, following the instructions from Paul Wise, and am able to reproduce the crash on demand. Within the next few weeks I hope to compile the mesa source from the git repository and do some testing myself. I'm willing to collect logs and debug if someone wants to guide me.
3 failures with scorched3d on G45:

[ 1665.235642] [drm] PGTBL_ER: invalid sampler PTE
[ 1665.235643] [drm] PGTBL_ER: invalid instruction/state cache PTE
[ 1964.802353] [drm] PGTBL_ER: invalid sampler PTE
[ 1964.802355] [drm] PGTBL_ER: invalid instruction/state cache PTE
[ 1964.802356] [drm] PGTBL_ER: invalid command data PTE
[ 1978.801296] [drm] PGTBL_ER: invalid command data PTE

No clear results yet, but we're getting things smashed with this game across multiple chipsets at least.
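The kernel messages above can be tallied with a few lines of Python, which may help when comparing longer dmesg captures between runs. This is only a sketch: the helper name summarize_pgtbl_errors is hypothetical, and grouping messages into "bursts" by whole second is an assumption made here to approximate separate failures.

```python
import re
from collections import Counter

# One dmesg line: "[ 1665.235642] [drm] PGTBL_ER: invalid sampler PTE"
PGTBL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[drm\]\s+PGTBL_ER:\s+(.+)")

DMESG = """\
[ 1665.235642] [drm] PGTBL_ER: invalid sampler PTE
[ 1665.235643] [drm] PGTBL_ER: invalid instruction/state cache PTE
[ 1964.802353] [drm] PGTBL_ER: invalid sampler PTE
[ 1964.802355] [drm] PGTBL_ER: invalid instruction/state cache PTE
[ 1964.802356] [drm] PGTBL_ER: invalid command data PTE
[ 1978.801296] [drm] PGTBL_ER: invalid command data PTE
"""


def summarize_pgtbl_errors(dmesg_text):
    """Return (Counter of error kinds, sorted list of whole-second burst times)."""
    kinds = Counter()
    bursts = set()
    for line in dmesg_text.splitlines():
        match = PGTBL_RE.search(line)
        if match:
            timestamp, message = match.groups()
            kinds[message.strip()] += 1
            bursts.add(int(float(timestamp)))  # assumption: one burst per second
    return kinds, sorted(bursts)


kinds, bursts = summarize_pgtbl_errors(DMESG)
for message, count in kinds.most_common():
    print(count, message)
print(len(bursts), "bursts at", bursts)
```

On the six lines posted here, the whole-second grouping yields three bursts, which lines up with the "3 failures" mentioned above, though that correspondence is an inference and not something the log states.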
The scorched3d issue on 945GM should be fixed by this commit:

commit c8b505d8260cccf289c947c629471df8f5c81c0d
Author: Xiang, Haihao <haihao.xiang@intel.com>
Date:   Thu Dec 11 14:03:00 2008 +0800

    i915: fallback for cube map texture.

But the original issue (lockups in normal 2d activities) should still exist.
I got another random lockup today, same symptoms. I imagine the xorg log files aren't useful since so many have already been posted, please say so if I should upload them.
Did you reproduce this issue with scorched3d on 945GM after applying commit c8b505d8260cccf289c947c629471df8f5c81c0d ?
Didn't try it yet, will have a go tonite, hopefully the patch applies to mesa in lenny.
The scorched3d issue on 945GM should be fixed. Too much information is mixed together here, so I am marking this bug as fixed and cloning a new bug, 19149, to track the original issue, "lockups in 2D activities".
Mass version move, cvs -> git