Summary: | [915GM/945GM] ring hang during fbo_firecube demo | ||
---|---|---|---|
Product: | Mesa | Reporter: | Tobias Jakobi <liquid.acid> |
Component: | Drivers/DRI/i915 | Assignee: | haihao <haihao.xiang> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | major | ||
Priority: | medium | CC: | des, dr-ru, haien.liu, jiewen.lin, pierre, sa, tom.gl, wael.nasreddine, xorgbugs.philipl, zdenek.kabelac |
Version: | git | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Bug Depends on: | |||
Bug Blocks: | 18841 | ||
Attachments: |
old xorg log saved after the x crash (and reboot of the system)
log after crash
new log from the crash (happened today)
font after setting EXANoComposite=true
latest lockup log (xorg-server-1.5 + intel-2.4.2-r1)
Patch making glBitmap fall back to software on fbo |
Description
Tobias Jakobi
2008-06-25 09:57:41 UTC
Created attachment 17377 [details]
old xorg log saved after the x crash (and reboot of the system)
Can you find a way to steadily reproduce it?

I have the same problem; log file attached. I tried restarting the X server several times, but it exits with the same message each time. It only runs again after a 'reboot' command.

Created attachment 17534 [details]
log after crash
(In reply to comment #2)
> Can you find a way to steadily reproduce it?

Not really. Also, I haven't encountered this problem since the last time it appeared (that was when I reported it here).

I updated libdrm and the DRM kernel module in the meantime though, so maybe this was fixed already.

@Dmitry: Are you also using GIT master?

On Mon, 07/07/2008 at 01:50 -0700, bugzilla-daemon@freedesktop.org wrote:
> http://bugs.freedesktop.org/show_bug.cgi?id=16521
>
> --- Comment #5 from Tobias Jakobi <liquid.acid@gmx.net> 2008-07-07 01:50:49 PST ---
> (In reply to comment #2)
> > Can you find a way to steadily reproduce it?
>
> Not really. Also I didn't encounter this problem since the last time it
> appeared (that was when I reported it here).
>
> I updated libdrm and DRM kernel module in the meantime though, so maybe this
> was fixed already.
>
> @Dmitry: You're also using GIT master?

I'm using Debian testing, and the libdrm version on it is 2.3.0.

Hmm, and there is another similar report related to my post #16664. Is anyone aware of some recent change which could lead to this problem? It looks like it occurs across distributions and different recent Xorg versions.

Created attachment 17644 [details]
new log from the crash (happened today)
Hi again,
the bug is not fixed for me. It happened again today without any kind of warning.
...attaching xorg log from after the "crash", nothing in the kernel log.
Greets,
Tobias
I'm also seeing this bug. X crashed with SIGSEGV and the following log message/backtrace.

I'm using Xorg 1.4.2 on Linux 2.6.25-gentoo-r6, on x86_64. I have version 2.3.2 of the Intel video driver installed. Graphics card lists itself as "Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)"

I was using KDE's trunk (post-4.1) at the time, and I had compositing enabled (via OpenGL). The X server itself may have been running for a few days (although it was reset via logout/login not more than 24 hours ago), but the login session was not more than a few hours.

I wasn't doing anything fancy at the time -- I had KDE's System Settings open, and had just clicked on an icon to open one of the control panel modules. Earlier I had been watching video in Xine (using the opengl VO plugin).

Here's the relevant snippet of log from the crash (from my kdm.log):
-------------------------------------
Error in I830WaitLpRing(), timeout for 2 seconds
pgetbl_ctl: 0x7ffc0001 getbl_err: 0x00000010
ipeir: 0x00000000 iphdr: 0x02000011
LP ring tail: 0x00001060 head: 0x0001fa14 len: 0x0001f001 start 0x00000000
eir: 0x0000 esr: 0x0010 emr: 0xffff
instdone: 0xfa41 instpm: 0x0000
memmode: 0x00000306 instps: 0x800f00c4
hwstam: 0xfffe ier: 0x0082 imr: 0x0000 iir: 0x0050
Ring at virtual 0x7fd0eff13000 head 0x1fa14 tail 0x1060 count 1427
0001f994: 00000000
0001f998: 02203c00
0001f99c: 01230044
0001f9a0: 44004444
0001f9a4: 00000000
0001f9a8: 00000000
0001f9ac: 00000000
0001f9b0: 00000000
0001f9b4: 00000000
0001f9b8: 00000000
0001f9bc: 7f1c000b
0001f9c0: 41ff0000
0001f9c4: 40380000
0001f9c8: 3f800000
0001f9cc: 3f800000
0001f9d0: 41ef0000
0001f9d4: 40380000
0001f9d8: 00000000
0001f9dc: 3f800000
0001f9e0: 41ef0000
0001f9e4: be000000
0001f9e8: 00000000
0001f9ec: 00000000
0001f9f0: 54f00006
0001f9f4: 03cc0010
0001f9f8: 00000000
0001f9fc: 00030003
0001fa00: 028fd010
0001fa04: 00030003
0001fa08: 00000020
0001fa0c: 028fcf50
0001fa10: 02000011
0001fa14: 00000000
Ring end
space: 125356 wanted 131064

Fatal server error:
lockup

Backtrace:
0: /usr/bin/X(xf86SigHandler+0x6a) [0x495def]
1: /lib/libc.so.6 [0x7fd102b0a2c0]
2: /usr/bin/X(XkbEnableDisableControls+0x11) [0x55e46c]
3: /usr/bin/X(XkbRemoveResourceClient+0xaf) [0x55ff18]
4: /usr/bin/X [0x4498f4]
5: /usr/bin/X(CloseDownDevices+0x1f) [0x449af2]
6: /usr/bin/X(AbortServer+0x13) [0x593e9b]
7: /usr/bin/X(FatalError+0xd5) [0x59442f]
8: /usr/lib64/xorg/modules/drivers//intel_drv.so(I830WaitLpRing+0x181) [0x7fd100cbf858]
9: /usr/lib64/xorg/modules/drivers//intel_drv.so(I830Sync+0x1b3) [0x7fd100cbfc2d]
10: /usr/lib64/xorg/modules//libexa.so(exaWaitSync+0x35) [0x7fd0fff178ae]
11: /usr/lib64/xorg/modules//libexa.so(exaPrepareAccess+0x51) [0x7fd0fff18165]
12: /usr/lib64/xorg/modules//libexa.so(ExaCheckPutImage+0x3b) [0x7fd0fff1fdf7]
13: /usr/lib64/xorg/modules//libexa.so [0x7fd0fff19475]
14: /usr/bin/X [0x53e563]
15: /usr/bin/X(ProcPutImage+0x188) [0x44d284]
16: /usr/bin/X(Dispatch+0x33c) [0x4508c6]
17: /usr/bin/X(main+0x4a4) [0x4373cd]
18: /lib/libc.so.6(__libc_start_main+0xe6) [0x7fd102af6486]
19: /usr/bin/X(FontFileCompleteXLFD+0x291) [0x4366b9]

FatalError re-entered, aborting
Caught signal 11. Server aborting
-------------------------------------

My X server now refuses to start. (I have tried unloading and reloading the i915 kernel module, to no avail).
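As a reading aid for the "space: ... wanted ..." line in these dumps, here is a minimal sketch of how the free-space figure appears to follow from the head and tail offsets. The formula (head minus tail, minus an 8-byte reserve, wrapped by the ring size) and the 128 KiB ring size are assumptions inferred from the numbers in the logs (131064 = 131072 - 8), not copied from the driver source. The names `ring_space` and `RING_SIZE` are hypothetical.

```python
# Assumed ring size: 128 KiB, inferred from "wanted 131064" (131072 - 8).
RING_SIZE = 0x20000


def ring_space(head: int, tail: int, size: int = RING_SIZE) -> int:
    """Return the free bytes between tail and head in a circular ring.

    Assumption: 8 bytes are kept in reserve so the tail never catches
    up with the head; a negative result wraps around by the ring size.
    """
    space = head - (tail + 8)
    if space < 0:
        space += size
    return space


if __name__ == "__main__":
    # Head/tail from the crash dump: head 0x1fa14, tail 0x1060.
    print(ring_space(0x1FA14, 0x1060))
    # Head/tail from the dump of the failed restart attempts: tail 0x1f9f0.
    print(ring_space(0x1FA14, 0x1F9F0))
```

With the values from the two dumps in this comment, the sketch yields 125356 and 28, matching the reported "space" figures. In both dumps the head stays at 0x1fa14, which is consistent with the GPU no longer consuming commands.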
Subsequent attempts to start the X server fail with the following message:
-------------------------------------
Error in I830WaitLpRing(), timeout for 2 seconds
pgetbl_ctl: 0x7ffc0001 getbl_err: 0x00000010
ipeir: 0x00000000 iphdr: 0x02000011
LP ring tail: 0x0001f9f0 head: 0x0001fa14 len: 0x0001f001 start 0x00000000
eir: 0x0000 esr: 0x0010 emr: 0xffff
instdone: 0xfa41 instpm: 0x0000
memmode: 0x00000306 instps: 0x800f04c4
hwstam: 0xfffe ier: 0x0002 imr: 0x0000 iir: 0x00f0
Ring at virtual 0x7fa6b52eb000 head 0x1fa14 tail 0x1f9f0 count 32759
0001f994: 03cc2000
0001f998: 00280098
0001f99c: 002c009c
0001f9a0: 01000000
0001f9a4: 00000000
0001f9a8: 00000010
0001f9ac: 02000000
0001f9b0: 54f00006
0001f9b4: 03cc2000
0001f9b8: 0028009c
0001f9bc: 002c00a0
0001f9c0: 01000000
0001f9c4: 00000000
0001f9c8: 00000010
0001f9cc: 02000000
0001f9d0: 54f00006
0001f9d4: 03cc2000
0001f9d8: 002800a0
0001f9dc: 002c00a4
0001f9e0: 01000000
0001f9e4: 00000000
0001f9e8: 00000010
0001f9ec: 02000000
0001f9f0: 54f00006
0001f9f4: 03cc0010
0001f9f8: 00000000
0001f9fc: 00030003
0001fa00: 028fd010
0001fa04: 00030003
0001fa08: 00000020
0001fa0c: 028fcf50
0001fa10: 02000011
0001fa14: 00000000
Ring end
space: 28 wanted 32

Fatal server error:
lockup
-------------------------------------

Repeated attempts to restart X result in the same error. I haven't yet tried rebooting, as the machine has some other tasks to complete (compile jobs, backups, etc.). If rebooting doesn't help, I'll leave another note.

(I should note the machine has been up--and running X, with resets every day--for 7 days now.)

Hope this helps.

I tend to reassign such tough bugs to Jesse.

Gee, thanks Gordon. :p

Yeah, often you have to reboot after a chip lockup since we don't include code to do a full reset yet.

So Joshua is using KDE's compositing features, Joshua are you using GL or Render based composition (I think there's a KDE setting for that).

Tobias, does the crash happen after you've run 3D applications?
Often ring hangs like this are due to bad programming on the Mesa side, but they can also be due to the render acceleration code (most of the other code is simple enough not to trigger these problems), so you could try disabling render accel with Option "ExaNoComposite" "true" in the intel driver section of xorg.conf. It would be good if we could narrow things down to one or the other; generic ring hangs are hard to debug.

(In reply to comment #11)
> So Joshua is using KDE's compositing features, Joshua are you using GL or
> Render based composition (I think there's a KDE setting for that).

I'm using OpenGL, in "Texture from Pixmap" mode. Both "Direct rendering" and "Use VSync" are turned on.

Ok, that's a good data point. Can you try reproducing with XRender based compositing instead?

(In reply to comment #13)
> Ok, that's a good data point. Can you try reproducing with XRender based
> compositing instead?

I'll try, but it happens very infrequently, so I don't know how much luck I'll have.

First of all, I'm using a standard xfce4 setup. No composite, no compiz and other fancy stuff.

(In reply to comment #11)
> Tobias, does the crash happen after you've run 3D applications?

Not really, the last time X locked up this way it was minutes after I had started the system. I was just doing regular browsing on the net, and opened a virtual terminal to sync my gentoo portage tree. After typing in the command X went blank... and you know the rest :)

> Often ring hangs like this are due to bad programming on the Mesa side, but
> they can also be due to the render acceleration code (most of the other code is
> simple enough not to trigger these problems), so you could try disabling render
> accel with Option "ExaNoComposite" "true" in the intel driver section of
> xorg.conf. It would be good if we could narrow things down to one or the
> other; generic ring hangs are hard to debug.

So you suggest switching render accel off and hoping that the hang doesn't appear anymore?
Greets,
Tobias

Thanks for the update Tobias, yeah I'm just curious if the hang will happen with render accel disabled. That's not a real fix of course (we want render accel to work) but it will tell us if the problem is likely there or not...

Hi there, reporting back.

I haven't yet disabled EXA compositing in xorg.conf, but I encountered this. As already said, I don't have anything related to composite enabled in xfce. Support is compiled in though, so I just tried to turn composite on in the settings manager. It worked for some seconds, but as soon as I opened Seamonkey and moved the window around, X locked up and restarted (the restart fails though).

So this seems to be a way to reproduce the problem. I'm retesting this and then I'm going to disable EXA composite in xorg.conf, maybe this helps.

Greets,
Tobias

Created attachment 17787 [details]
font after setting EXANoComposite=true
OK, disabling EXA composite isn't an option for me. It simply leaves all text garbled. It seems to be used somewhere even if not explicitly activated in the settings menu.
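For reference, the setting tested in this comment would typically sit in the device section of xorg.conf, roughly as sketched below. The option name and value are the ones quoted earlier in this thread; the `Identifier` string is a hypothetical placeholder, and the `Driver` name "intel" is an assumption about the setup (some installations of that era still referred to the driver as "i810").

```
Section "Device"
    # Hypothetical identifier; use whatever the existing section is called.
    Identifier "Intel Graphics"
    # Assumed driver name for the xf86-video-intel driver.
    Driver     "intel"
    # Disables EXA render (composite) acceleration, as suggested in comment #11.
    Option     "ExaNoComposite" "true"
EndSection
```

The X server has to be restarted for the change to take effect. As noted in the thread, this is a diagnostic step to narrow down the hang, not a fix.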
Found a way to reproduce the crash. Just use the new fbo_firecube demo from the mesa git repository. It crashes X in under 5 seconds, at least for me...

Greets,
Tobias

Ooh good, that means I get to reassign to one of the 3D guys :)

Reconfirming with a different software setup:
xf86-video-i810-2.4.2-r1
libdrm-2.3.1
mesa-7.1
xorg-server-1.5.0
gentoo-sources-2.6.26-r1

The DRM kernel module was built from the kernel sources.

With this setup I don't have any FBO caps exported in the GL extension string, so reproducing the lockup with this setup and fbo_firecube is not possible.

I'm now attaching the new lockup xorg.log...

Created attachment 18889 [details]
latest lockup log (xorg-server-1.5 + intel-2.4.2-r1)
Reconfirming with yet another setup (the one I'm currently using):
xf86-video-i810-9999 (git intel-2.5 branch)
libdrm-9999 (git master branch)
mesa-9999 (git master branch)
xorg-server-1.5.1
anholt's linux 2.6 tree (GEM enabled)

This is a GEM powered setup (and I can confirm GEM works with other 3D applications).

When starting fbo_firecube with this setup X just stops responding and the graphics freeze. There is still some hard drive activity, but I can't seem to be able to shut down the system with ACPI buttons or magic SysRq. Turning it off by holding the power button is the only working method.

Notice that the behaviour is not the same as before. X doesn't crash (and try to restart) this time, it simply freezes. Everything: graphics + input.

I also can't find any good information in the system logs. Nothing there...

For me, fbo_firecube works with glutBitmapCharacter commented out.

Further testing revealed the glBitmap call used by glutBitmapCharacter does not play nicely with fbos. It does not honour the borders of the fbo. For example, coordinates wrap at the right border to the left, and seem to be clipped at the width of the window. I don't know if the y-coordinates grow upwards, but if they do, glBitmap probably does the same in that direction, with the difference that the fbo's data storage ends there and some other object begins, and is corrupted. Either way, glBitmap drawing beyond the top of the fbo is an easy way to make my graphics hang.

(Additionally, the position where glBitmap renders seems to be a bit erratic when there are no other primitives drawn previously.)

I'm using a G45 and fbo_firecube is one of the many 3D apps which result in a complete X hang. I have previously reported this in bug 18081. I'm not sure if it's the same problem or simply a symptom of something else.

(In reply to comment #18)
> Created an attachment (id=17787) [details]
> font after setting EXANoComposite=true
>
> OK, disabling EXA composite isn't an option for me. It simply leaves all text
> garbled. Seems like it's used somewhere even if no explicitly activated in the
> settings menu.

What if you use XAA instead of EXA?

I suspect the fbo_firecube bug isn't the same issue as the one this bug was originally opened for...

Haien/Jiewen, do you see this issue when running mesa demo fbo_firecube with the latest code?

(In reply to comment #27)
> Haien/Jiewen, do you see this issue when running mesa demo fbo_firecube with
> the latest code?

It works well with the latest code:
xf86_video_intel xf86-video-intel-2.6-branch commit b156b3165e1aae5df0353737d0335ac2e653f5fd
mesa intel-2008-q4 branch commit 39091cc6385e6253464900e436cd7e9c04409ce6
drm shipped with kernel
libdrm master branch commit b0d93c74d884b40bd94469a5ef75fdb2fef17680
GEM_kernel: (for-airlied)66647dc60d16fae9f6963fd98b6d9baa1a8dac69

Haien,

Can you elaborate on the configuration you used to test this successfully? I have updated my trees to match yours but cannot get any gl programs to run successfully except glxinfo. fbo_firecube segfaults and glxgears asserts at 'vbo/vbo_save_api.c' which I believe is in intel_drm.so

This is with a 'master' xserver build from two days ago and I've tried with and without uxa - it makes no difference.

Created attachment 20843 [details] [review]
Patch making glBitmap fall back to software on fbo

Since there is definitely a bug in Intel's glBitmap handling code (using display coordinates on an fbo is definitely wrong), this patch makes it fall back to software if used on fbos. So, this should get that issue out of the way when debugging anything else that may be left.

Ah, found the problem.

Haien, you mentioned the intel-2008-q4 branch but the commit you gave was off 'master'. I first tried with the branch and that gave me the error I mentioned so I switched back to master and it now works.

Thanks.

(In reply to comment #31)
> Ah, found the problem.
> Haien, you mentioned the intel-2008-q4 branch but the commit you gave was off
> 'master'. I first tried with the branch and that gave me the error I mentioned
> so I switched back to master and it now works.

Good to know it works now, though it's interesting that master works while intel-2008-q4 is said not to work, as it's just a snapshot of master from about 2 days ago.

It seems this bug can be closed?

Comment on attachment 20843 [details] [review]
Patch making glBitmap fall back to software on fbo

My patch did not even compile. A corrected version can be found in bug #18914 if anyone is still interested. Sorry for the noise.

Still not fixed for me. Using Intel i915 with recent mesa git master, libdrm git master and drm-intel-next kernel branch.

I start some music, then start fbo_firecube. X freezes but the music continues to play. VT switching doesn't work, ACPI buttons seem to trigger the shutdown process, but it doesn't finish so the system never goes off. MagicSysRq doesn't work at all.

Still a problem for me on G45, using:
-- xf86-video-intel: 2e3c098c5ed9a8451713dc754a5f086992249336
-- xserver: 1.5.2
-- mesa: 6e0f8b174dddeb743b4bdc0d831eb1121f62ff50
-- drm: b0d93c74d884b40bd94469a5ef75fdb2fef17680
-- kernel: for-airlied 66647dc60d16fae9f6963fd98b6d9baa1a8dac69

Haien/Jiewen, please test on more platforms to see if we can reproduce it.

All, we are using server-1.6-branch, and for-airlied kernel.

(In reply to comment #31)
> Ah, found the problem.
>
> Haien, you mentioned the intel-2008-q4 branch but the commit you gave was off
> 'master'. I first tried with the branch and that gave me the error I mentioned
> so I switched back to master and it now works.
>
> Thanks.

Sorry, my fault.

The fbo_firecube glBitmap problem is now fixed in master, but I'm reasonably certain that fbo_firecube has nothing to do with the original issue reported (unless you were running 3D applications using FBOs)

Thanks Eric!
(In reply to comment #39)
> The fbo_firecube glBitmap problem is now fixed in master, but I'm reasonably
> certain that fbo_firecube has nothing to do with the original issue reported
> (unless you were running 3D applications using FBOs)

Well, if that's so I think we can resolve this one as FIXED. The original issue, which also happened during normal work inside X, hasn't happened for some time now.

The fbo_firecube thing was only brought up by me because Gordon Jin was asking for some way to reproduce the hang. At least fbo_firecube did cause the same error messages in the logs back then.

So, what should I resolve it as?

Both the original issue and the fbo_firecube failure are apparently gone now, so marking it fixed.

Mass version move, cvs -> git |