Summary: | [855GM] X crashes when moving the mouse cursor back into the LVDS screen area for the third time
---|---
Product: | DRI
Reporter: | CarlEitsger <4607vrfcr84spd21f08>
Component: | DRM/Intel
Assignee: | Chris Wilson <chris>
Status: | CLOSED FIXED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal
Priority: | medium
CC: | debian, intel-gfx-bugs
Version: | unspecified
Hardware: | x86 (IA32)
OS: | Linux (All)

Description
CarlEitsger
2014-04-27 13:59:45 UTC
Hmm, I broke something in the saving of malloc errors then. Is there anything in the crashing Xorg.0.log or Xorg.0.log.old?

(In reply to comment #1)
> Is there
> anything in the crashing Xorg.0.log or Xorg.0.log.old?

The .old log (whose modification date matches the previous X session) ends with these entries, which are from startup:
> [ 23.934] (II) config/udev: Adding input device PC Speaker (/dev/input/event11)
> [ 23.934] (II) No input driver specified, ignoring this device.
> [ 23.934] (II) This device may have been added with another device file.

So, no, there is nothing relating to the X restart.
> $ less Xorg.0.log.old | grep -i cursor
> [ 22.221] (--) intel(0): Using a maximum size of 64x64 for hardware cursors
> [ 22.316] (II) intel(0): HW Cursor enabled

Nothing special.

Is the stderr captured anywhere? Usually as xdm.log or gdm/*.log etc.

I am using KDM. `journalctl _SYSTEMD_UNIT=kdm.service` did not output any log entries, and there is nothing interesting in /var/log/kdm.log. I am not sure where the stderr of kdm and its children goes.

I will try without kdm: with a plain `startx 2> stderr.log` (kdm not enabled in systemd) and an `exec xterm`, X crashes on the third re-entry of the mouse cursor. Output generated by the crash:
> xterm: xinit: connection to X server lost
> fatal IO error 11 (Die Ressource ist zur Zeit nicht verfügbar) or KillClient on X server ":0"

With lxde instead of xterm it crashes even on the first reappearance of the mouse cursor:
> XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
> after 31 requests (31 known processed) with 0 events remaining.
> xinit: connection to X server lost
>
> waiting for X server to shut down XIO: fatal IO error 11 (Die Ressource ist zur Zeit nicht
> verfügbar) on X server ":0"
> after 3514 requests (3511 known processed) with 0 events remaining.
> pcmanfm: Fatal IO error 11 (Die Ressource ist zur Zeit nicht verfügbar) on X server :0.

By the way, X does not "restart" (as I first wrote in the bug title) when KDM is not in use. Apparently the restarting is a feature of KDM. With just startx on tty1, X crashes and the computer stops responding to any input. Access by ssh (and rebooting) still works. However, I guess this stderr output is not really useful to you. Please tell me how to investigate better.

Hmm, if we can't see a reason for the crash in either the log file or stderr, we need to attach gdb. This is easiest with a second machine and sshing in. So far, I haven't spotted anything in either code review or running with valgrind - but I still haven't tried hooking up a VGA monitor to the 855gm.

(In reply to comment #5)
> Hmm, if we can't see a reason for the crash in either the log file or
> stderr, we need to attach gdb. This is easiest with a second machine and
> sshing in.

Sure, this is really no problem. I tried it before, but as soon as I attach gdb (by specifying the PID), X's execution seems to stop, and I need it to keep running to trigger the error. Probably I just need to tell it to "continue" or something, but I did not find anything on a quick read of the man page. If you could tell me how to attach, let X continue to run, and then get a backtrace ...

Hmm, I just found http://visualgdb.com/gdbreference/commands/continue which tells me that I just need to issue the command "continue". Does it work this way, given that we have no breakpoints?

> but I still haven't tried hooking up a VGA monitor to the 855gm.

Well, I guess it is related to the different monitor sizes, so you can likely only reproduce it with a bigger or smaller external VGA monitor.

Right, you need to hit continue. So connect gdb, using gdb --pid=`pidof Xorg` (or perhaps pidof X), then enter 'c' when it finishes loading the symbols and then trigger the error. When X dies, gdb will hopefully capture the error and present a command prompt again. Type 'bt'.
Getting a good backtrace typically requires the debugging symbols to be installed. Another thing to check is whether there is a tell-tale in dmesg for why X died.

Nothing interesting or new in dmesg. I have no debugging symbols installed, but I think this is enough:
Tested with your newest version 2a993c8.
> Program received signal SIGSEGV, Segmentation fault.
> __sna_create_cursor (sna=0xb6aec000) at sna_display.c:3112
> 3112 c->alloc = ALIGN(size, 4096);
Also see attachment.
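
A note on reading the faulting line: the right-hand side of `c->alloc = ALIGN(size, 4096);` is plain integer arithmetic, so the segfault most plausibly comes from the store through `c`, that is, `c` not pointing at a valid cursor struct. The sketch below only illustrates the rounding step. The macro body is the usual power-of-two round-up idiom and is an assumption here, not copied from the driver, and `page_aligned` is a hypothetical helper name.

```c
#include <stddef.h>

/* Assumed definition: round i up to the next multiple of m.
 * Only valid when m is a power of two. */
#define ALIGN(i, m) (((i) + (m) - 1) & ~((m) - 1))

/* Hypothetical helper: round a byte count up to a 4096-byte page. */
size_t page_aligned(size_t size)
{
    return ALIGN(size, (size_t)4096);
}
```

For instance, a 64x64 cursor at 4 bytes per pixel occupies 16384 bytes, which is already a multiple of 4096 and is returned unchanged. None of this can fault on its own, which points the suspicion at `c`.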
Created attachment 98129 [details]
X crash message in gdb and backtrace
Ah, that makes sense at least. It should be impossible... Do you mind applying the

    index 1520533..854ee55 100644
    --- a/src/sna/sna_display.c
    +++ b/src/sna/sna_display.c
    @@ -87,7 +87,7 @@ union compat_mode_get_connector{
     #define DEFAULT_DPI 96
     #endif
    -#if 0
    +#if 1
     #define __DBG(x) ErrorF x
     #else
     #define __DBG(x)

debugging patch and attaching the tail of the Xorg.0.log? I need to work out why we end up using more cursors than I preallocate.

hmm, the patch does not work (if applied to version 2a993c8aa9e8594c32d5e67329b0dbed0d92c761 or 0b23011c27736d0ae2b33d8ea147c16b909baa57)
> $ patch sna_display.c /tmp/debugpatch
> patching file sna_display.c
> patch unexpectedly ends in middle of line
> Hunk #1 FAILED at 87.
> 1 out of 1 hunk FAILED -- saving rejects to file sna_display.c.rej

But I could apply the one from https://bugs.freedesktop.org/show_bug.cgi?id=77351#c4
> patch unexpectedly ends in middle of line
> Hunk #1 succeeded at 87 with fuzz 1.

Trying this one...

Created attachment 98137 [details]
gdb for comment 12

...
[ 107.955] (II) intel(0): SNA compiled from 2.99.911-113-g0b23011

Created attachment 98138 [details]
part of Xorg log while the debug patch was active - for comment 12

likely uninteresting info: ... actually this was Xorg.1.log because I did a startx on tty1 while there was kdm running on tty7.

Ugh. It's the cursor sharing avoidance for the pwrite paths.

commit 94e39323772ef6561efcc0620f67cabd2462a0d0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 29 09:02:50 2014 +0100

    sna: Recycle physical cursors

    A side-effect of the workaround for incoherent physical cursors is that
    we never reused a cursor after disabling. As such moving the cursor off
    the pipe and back on would eventually consume all the preallocated
    structs leading to a segfault.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78002
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Thanks! Confirming, seems to be fixed in
> intel(0): SNA compiled from 2.99.911-114-g94e3932
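
The failure mode described in the commit message (a fixed set of preallocated cursor structs that are never handed back after the cursor leaves the pipe) can be sketched roughly as follows. This is a toy model under stated assumptions, not the driver's code: every name below (`cursor_pool`, `pool_acquire`, `pool_release`, `simulate_reentries`) is hypothetical, and `POOL_SIZE` is an illustrative value, not the number of cursors the driver actually preallocates.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 2 /* hypothetical number of preallocated cursor structs */

struct cursor_slot {
    struct cursor_slot *next; /* link in the free list */
};

struct cursor_pool {
    struct cursor_slot slots[POOL_SIZE];
    struct cursor_slot *free_list;
};

/* Put every preallocated slot on the free list. */
void pool_init(struct cursor_pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Take a slot, or return NULL when the pool is exhausted.
 * A caller that skips the NULL check and writes through the
 * result would crash, which is the kind of fault modelled here. */
struct cursor_slot *pool_acquire(struct cursor_pool *p)
{
    struct cursor_slot *c = p->free_list;
    if (c != NULL)
        p->free_list = c->next;
    return c;
}

/* Hand a slot back. With recycle == false the slot is simply
 * dropped, modelling "never reused a cursor after disabling". */
void pool_release(struct cursor_pool *p, struct cursor_slot *c, bool recycle)
{
    if (recycle) {
        c->next = p->free_list;
        p->free_list = c;
    }
}

/* Move the cursor onto and off the screen `cycles` times and
 * return how many times a cursor struct could be obtained. */
int simulate_reentries(int cycles, bool recycle)
{
    struct cursor_pool pool;
    int shown = 0;

    pool_init(&pool);
    for (int i = 0; i < cycles; i++) {
        struct cursor_slot *c = pool_acquire(&pool);
        if (c == NULL)
            break; /* pool exhausted */
        shown++;
        pool_release(&pool, c, recycle);
    }
    return shown;
}
```

In this model, `simulate_reentries(3, false)` obtains a slot only twice before the pool runs dry, while with recycling enabled any number of re-entries succeeds, which is the behaviour change the commit title describes.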