Created attachment 99492 [details]
X server log
Hard lock? Not even ssh'able? It would be really useful to get the kernel and userspace stack traces of the stuck processes. After a missed irq, we start busy-waiting for the GPU rather than rely on any more interrupts being received. Obviously that should not lead to a lock up.

(In reply to comment #2)
> Hard lock?

It hard locks if I (1) switch from X to the console and then (2) switch from the console back to X. If I stay on the console there isn't a problem.

> Not even ssh'able?

Not even ssh'able.

> It would be really useful to get the kernel
> and userspace stack traces of the stuck processes.

What do you mean? I'm not sure which processes you're referring to.

(In reply to comment #3)
> > It would be really useful to get the kernel
> > and userspace stack traces of the stuck processes.
>
> What do you mean? I'm not sure which processes you're referring to.

All of them. sys-rq-t (echo t > /proc/sysrq-trigger)

I realise now that the behaviour I reported previously was mis-remembered. It is not the case that the machine locks hard after X hangs and I switch to the console and back. I tried to do this recently and no hard-lock occurred. Instead, it was the case that the machine would lock hard if X hung, I switched to the console and then *restarted* X.

Recently, there was a serendipitous event which may shed light on the problem. There was a hang while two user sessions were running. That is, one user had been logged in and another user came along, selected "New Login" from the xscreensaver password dialog and then logged in to another GNOME 3 session themselves. In the process list, there were two processes named X, running on different VTs.

In this state, there was the usual hang. However, I managed to switch between the different X sessions. After the hang, I switched from the active X session to the console and then switched again to the other X session. After switching to the other X session, the display was still active but only for a few moments. That is, IIRC, some animated parts of web pages were moving on the screen but then after a few moments the display froze.

(In reply to comment #4)
> (In reply to comment #3)
> > > It would be really useful to get the kernel
> > > and userspace stack traces of the stuck processes.
> >
> > What do you mean? I'm not sure which processes you're referring to.
>
> All of them. sys-rq-t (echo t > /proc/sysrq-trigger)

The next time there is a hang, I will do this.

Created attachment 101048 [details]
Kernel log from boot to first hang, including complete stack trace from Alt-SysRq-t
I managed to capture a complete stack trace for all processes after a hang, and a second complete stack trace from the hang after restarting X.
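For anyone who needs to collect the same information over ssh rather than from the keyboard, the dump requested in comment #4 can also be scripted. The following is only a rough sketch, not what was used for the attachments here: the function names and the output path are hypothetical, writing to /proc/sysrq-trigger needs root, and it assumes the `dmesg` utility is installed and readable by the caller.

```python
from pathlib import Path
import subprocess


def request_task_dump(trigger="/proc/sysrq-trigger"):
    """Ask the kernel to log the state and stack of every task (SysRq 't').

    Equivalent to `echo t > /proc/sysrq-trigger`. The path is a parameter
    only so the function can be exercised against an ordinary file.
    """
    Path(trigger).write_text("t")


def save_kernel_log(dest):
    """Copy the current kernel ring buffer to `dest` (hypothetical path).

    Assumes `dmesg` is available; a large task dump may overflow a small
    ring buffer, in which case the saved log will be truncated.
    """
    log = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
    Path(dest).write_text(log)
```

Calling `request_task_dump()` followed by `save_kernel_log("sysrq-t.log")` as root, while X is hung, should leave a file that can be attached to the bug.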
Created attachment 101049 [details]
Kernel log after second hang, of complete stack trace from Alt-SysRq-t
Created attachment 101538 [details]
Kernel log of stack trace from Alt-SysRq-t after first hang but before hang notification
Created attachment 101539 [details]
Kernel log of stack trace from Alt-SysRq-t after second hang after hang notification
Note that the hang notification occurred only after restarting X.
Please note that the hang does not lead to a hard lock anymore. It certainly did in the past, presumably with earlier kernels/drivers but I no longer get a hard lock even after restarting X a number of times. Created attachment 101696 [details]
GDB backtrace of gnome-shell after the kernel's hangcheck notification
The second bug here is a frozen display due to a bad vblank counter. Ville has recently fixed many bugs with our tracking of vblanks, so there is a good chance the latter bug is fixed. Of course we still have the earlier issue - but it would be good if we could confirm that it no longer freezes at least.

(In reply to comment #12)
> Ville has
> recently fixed many bugs with our tracking of vblanks, so there is a good
> chance the latter bug is fixed. Of course we still have the earlier issue -
> but it would be good if we could confirm that it no longer freezes at least.

Are these changes in a specific release, or are they only in git? And if only in git, which repository and branch? Thanks.

Bob, sorry we've neglected this bug a bit. Please try drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel/ and report back. Thanks.

Bob, any update?

Sorry, the crashes got too much so I ditched the motherboard. I can't help with debugging this issue anymore.

Ok sorry for the trouble, thanks for the effort anyway.
Created attachment 99491 [details]
Kernel log of two repeated hangs, the last part of the kernel log before the first hang and the entirety of the log before the second hang

Using vanilla Linux 3.14, X server 1.12.4 and Intel driver 2.21.15 and GNOME 3 on a G43 chipset, I get random lockups of the X desktop. The mouse cursor moves but the rest of X is dead; nothing responds. I can switch to the console (in order to reboot the computer) but if I switch back to X then the machine will hard lock within a short period. After switching to the console, after a short delay the kernel will report:

[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... render ring idle

This definitely happens a lot while composing emails in Evolution (and is therefore EXTREMELY ANNOYING :-). I get the feeling it may happen some time after having dragged a window to a side edge of the screen to half-maximise it.

The lockups seem to happen in clumps. That is, for weeks the machine will be stable with no problems but then there will be one lockup followed by another lockup, sometimes within minutes of rebooting, and sometimes repeating a number of times, eventually ending by my giving up using the computer.
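Since the hangs come in clumps, it can help to pull every hangcheck report out of a saved kernel log and see where they fall. Below is a minimal sketch, assuming the log has been saved as plain text and that the message matches the wording quoted above; other kernel versions may phrase it differently, and the function name is purely illustrative.

```python
import re

# Pattern taken from the message quoted in this report; treat it as an
# assumption for any other kernel version.
HANGCHECK = re.compile(r"\[drm:i915_hangcheck_elapsed\] \*ERROR\* (?P<msg>.*)")


def find_hangchecks(log_text):
    """Return (line_number, message) for each hangcheck error in log_text."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = HANGCHECK.search(line)
        if match:
            hits.append((number, match.group("msg").strip()))
    return hits
```

Running it over the text of a kernel log attachment gives the line numbers of each report, which makes it easier to compare the first and second hangs.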