My GM45-based system has been locking up randomly in X ever since I upgraded to 2.20. A VT switch sometimes works and sometimes doesn't. Attempts to restart X lock up the system hard. I was finally able to capture *some* information in the logs. intel_gpu_abrt output is attached. Unfortunately, cat /sys/kernel/debug/dri/0/i915_error_state failed with a memory allocation error. It did contain some information after a reboot, although I'm not sure if it's useful. That is attached as well.
Created attachment 71298 [details] Logs
Impossible to tell without the error state, but at first glance that looks like a kernel bug.
The trouble is that the error state is empty in most cases, just like all the logs. In the rare case where something suggested it wasn't empty, I got a memory allocation error when trying to cat it. Are there any suggestions as to how to debug this the next time it happens?
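One way to avoid losing a non-empty error state to a later hard lockup is to poll the debugfs file and save a copy as soon as it has content. The following is only a minimal sketch: the file path is the one from the report, while the helper names, the destination directory and the "no error state collected" marker are assumptions made for illustration. It does not address the memory allocation failure itself, which happens on the kernel side of the read.

```python
import os
import time
from pathlib import Path

# Path taken from the report above; reading it normally requires root.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")
# Assumption: the text the driver prints when nothing has been captured.
EMPTY_MARKER = "no error state collected"


def snapshot_if_present(src, dest_dir):
    """Copy src into dest_dir if it holds an error state.

    Returns the path of the copy, or None if there was nothing to save
    or the read failed (for example with ENOMEM).
    """
    try:
        text = Path(src).read_text(errors="replace")
    except OSError:
        return None
    if not text.strip() or text.lstrip().startswith(EMPTY_MARKER):
        return None
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"i915_error_state.{int(time.time())}"
    with open(dest, "w") as out:
        out.write(text)
        out.flush()
        # Push the data to disk in case the machine locks up right after.
        os.fsync(out.fileno())
    return dest


def watch(src=ERROR_STATE, dest_dir="/var/tmp/i915-dumps", interval=5.0):
    """Poll until an error state shows up, save it, and return its path."""
    while True:
        saved = snapshot_if_present(src, dest_dir)
        if saved is not None:
            return saved
        time.sleep(interval)
```

Running watch() as root in the background before reproducing the hang would leave a timestamped copy under the (hypothetical) /var/tmp/i915-dumps directory if the driver records anything.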
Created attachment 71347 [details] Another set of dumps, with error_info this time.
OK, I managed to capture i915_error_info. You can find it attached along with reg_dumper output and a vbios dump. dmesg is clean.
Looks like a pageflip versus state race. Are you able to grab a 3.7 kernel? We fixed a few issues there recently.
I'm running 3.7.0-rc8-00014-g27d7c2a. The only drm-related commit I'm missing is caf491916b1c1e939a2c7575efb7a77f11fc9bdf
If you are sure you can hit it again, you can try the mb() patches included in http://cgit.freedesktop.org/~ickle/linux-2.6/ #master
I suspect we have a dupe of bug 53385
That branch doesn't boot for me as is, but I cherry-picked the commits that seemed to contain mb() insertions on top of 3.7:
* drm/i915: Insert a full mb() before reading the seqno from the status page
* drm/i915: Review the memory barriers around CPU access to buffers
* overlay mb()
Is that it, or is there something else in that branch that may be relevant?
Heh, there were a couple more important ones. Something has broken in the fastboot patches; at least it is causing problems on one of my machines, which can be avoided with:

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_
index 64a8079..78bd6f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -319,6 +319,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_de
        struct drm_i915_gem_object *obj;
        struct drm_mm_node *stolen;
 
+       return NULL;
+
        if (dev_priv->mm.stolen_base == 0)
                return NULL;
Also be advised that xf86-video-intel-2.20.16 fixed a race in SNA for resizing DRI buffers.
Unfortunately, 2.20.16 does not fix this issue. And I'm still unable to boot the ickle/master branch, even with the fix you posted. I'll try the 3.8-rc0 and see if the latest drm pull has changed anything.
Locks up under 3.8 as well. I booted with drm.debug=0x06 and dmesg is full of this:

[drm:i915_pageflip_stall_check], Pageflip stall detected
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:intel_prepare_page_flip], preparing flip with no unpin work?
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:intel_update_fbc], more than one pipe active, disabling compression
[drm:i915_pageflip_stall_check], Pageflip stall detected
[drm:intel_update_fbc], more than one pipe active, disabling compression

I was also hoping to be able to bisect the driver, but the whole xorg-server API thing doesn't make it easy.
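When dmesg is flooded like this, a per-function tally makes it easier to see how often the pageflip messages occur relative to the FBC noise. Below is a minimal sketch that assumes only the "[drm:function_name]" prefix visible in the excerpt above; the helper name tally_drm_messages is made up for illustration.

```python
import re
from collections import Counter

# Matches the "[drm:function_name]" prefix seen in the dmesg excerpt.
DRM_TAG = re.compile(r"\[drm:([A-Za-z0-9_]+)\]")


def tally_drm_messages(log_text):
    """Count drm debug messages per emitting function.

    Lines without a "[drm:...]" tag are ignored.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = DRM_TAG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing the saved dmesg text to tally_drm_messages() and printing the result's most_common() lists the emitting functions in order of frequency.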
A small update here. I reverted to xf86-video-intel-2.19.0 and xorg-server-1.12.4, and that combination is rock solid. 2.20.16 still locks up. I will try to "bisect" through the public releases first, which will hopefully narrow down the regression range.
Just FYI, it's usually quicker to just do a git bisect, since commits are not evenly distributed between releases ...
Hah, I think we have a solution:

commit 21ad833075801a7cd81b5ef1604ffc6c600e5ff9
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Feb 19 15:16:39 2013 +0200

    drm/i915: Fix races in gen4 page flip interrupt handling