Using Fedora 21 with kernel 3.18.9-100 on x86_64, and after running for an extended period, I see a GPU hang. Userspace components in use are:

From Fedora:
xorg-x11-drv-intel-2.21.15-9.fc20.x86_64
mesa-dri-drivers-10.3.3-1.20141110.fc20.x86_64
libdrm-2.4.58-1.fc20.x86_64

Compiled by me:
libva-1.4.1
libva-intel-driver-1.4.2-0.git8e34fb34
Created attachment 116454 [details]
Error state collected by the kernel

Error state collected by the kernel; the GPU is whatever's inside an Intel(R) Celeron(R) CPU 1037U @ 1.80GHz.
Yes, looks like libva jumps into a malformed batch. Over to libva for better analysis.
What are you actually executing with libva/intel-driver?

Also, retest with libva/intel-driver master.
(In reply to Sean V Kelley from comment #3)
> What are you actually executing with libva/intel-driver?
>
> Also, retest with libva/intel-driver master.

I'm using gstreamer1-vaapi to decode and deinterlace a 1920x1080 MPEG-2 MP@HL video. I can get the gstreamer1-vaapi version on Monday, together with any other data you need; we're negotiating with the customer to get permission to send you the video, too.

I'll retest with git master as well, and see if the same fault occurs; it's non-deterministic, and takes 24-48 hours to trigger.
(In reply to Sean V Kelley from comment #3)
> What are you actually executing with libva/intel-driver?
>
> Also, retest with libva/intel-driver master.

I can't easily test with just intel-driver.

(In reply to Simon Farnsworth from comment #4)
> (In reply to Sean V Kelley from comment #3)
> > What are you actually executing with libva/intel-driver?
> >
> > Also, retest with libva/intel-driver master.
>
> I'm using gstreamer1-vaapi to decode and deinterlace a 1920x1080 MPEG-2
> MP@HL video. I can get the gstreamer1-vaapi version on Monday, together with
> any other data you need; we're negotiating with the customer to get
> permission to send you the video, too.
>
> I'll retest with git master as well, and see if the same fault occurs; it's
> non-deterministic, and takes 24-48 hours to trigger.

GStreamer is at 1.4.3; gstreamer-vaapi is at 0.5.10.
(In reply to Sean V Kelley from comment #3)
> What are you actually executing with libva/intel-driver?
>
> Also, retest with libva/intel-driver master.

I've brought libva to:

commit 5d07b29687db6d17811b7ecf9b779377e9851a27
Author: Xiang, Haihao <haihao.xiang@intel.com>
Date:   Wed Jun 10 14:41:14 2015 +0800

    test/decode/tinyjpeg: make sure the pointer is valid before dereferencing it

    Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
    Reviewed-by: Sean V Kelley <seanvk@posteo.de>
    (cherry picked from commit 8455834161bab3374fe9756fd4a28d919027daf7)

and intel-driver to:

commit e797089446c1f5b71b239b9046d76e054dfcba59
Author: Zhong Li <zhong.li@intel.com>
Date:   Mon Jun 8 12:42:21 2015 +0800

    VP8 HWEnc: Modify qp threshold value for mode cost calculatation

    The patch is helpful to improve quality when qp is lower than the threshold value.

    Signed-off-by: Zhong Li <zhong.li@intel.com>

I've seen a hang - I'll attach the error state in case it's different.

I'm using vaapidecode ! queue ! vaapipostproc ! vaapisink with suitable parameters set (MCDI if possible) to handle video. The queue is set to take up to 6 frames of video from vaapidecode.
Created attachment 116522 [details]
Error state with libva and intel-driver git master
Could you provide the full command line and the sample video if possible?
(In reply to haihao from comment #8)
> Could you provide the full command line and the sample video if possible?

It reproduces erratically with:

gst-launch-1.0 filesrc location=01_Work4U_Finding.mpeg ! \
  decodebin name=db max-size-bytes=$((16 * 1024 * 1024)) expose-all-streams=false ! \
  queue max-size-buffers=0 max-size-bytes=0 max-size-time=$((200 * 1000 * 1000)) ! \
  audioconvert dithering=tpdf-hf ! \
  audioresample quality=10 sinc-filter-mode=full ! \
  alsasink qos=true max-lateness=$((20 * 1000 * 1000)) \
  db. ! \
  queue max-size-buffers=6 max-size-bytes=0 max-size-time=$((200 * 1000 * 1000)) ! \
  vaapipostproc force-aspect-ratio=false deinterlace-method=motion-compensated deinterlace-mode=auto ! \
  vaapisink force-aspect-ratio=false show-preroll-frame=false max-lateness=$((20 * 1000 * 1000))

If I change deinterlace-mode on vaapipostproc to deinterlace-mode=disabled, the error disappears. Similarly, if I re-encode the video as progressive instead of interlaced, the error disappears.

I'm getting customer permission to send you 01_Work4U_Finding.mpeg - I may have to send a link to your intel.com e-mail privately, depending on the customer's attitude.
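In case a programmatic reproduction is easier to automate, here is a minimal C sketch of the same pipeline via gst_parse_launch(). This is only a sketch: the file name sample.mpeg is a placeholder, the audio branch is simplified to autoaudiosink, and the property values are copied from the command line above rather than from any known-good configuration.

#include <gst/gst.h>

int main(int argc, char *argv[])
{
    GstElement *pipeline;
    GstBus *bus;
    GstMessage *msg;
    GError *error = NULL;

    gst_init(&argc, &argv);

    /* Same decode + deinterlace path as the gst-launch-1.0 command above,
     * with the audio branch reduced to autoaudiosink for brevity. */
    pipeline = gst_parse_launch(
        "filesrc location=sample.mpeg ! decodebin name=db expose-all-streams=false ! "
        "queue ! audioconvert ! audioresample ! autoaudiosink "
        "db. ! queue max-size-buffers=6 max-size-bytes=0 max-size-time=200000000 ! "
        "vaapipostproc force-aspect-ratio=false "
        "deinterlace-method=motion-compensated deinterlace-mode=auto ! "
        "vaapisink force-aspect-ratio=false show-preroll-frame=false",
        &error);
    if (!pipeline) {
        g_printerr("Failed to build pipeline: %s\n", error->message);
        g_clear_error(&error);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    /* Block until end of stream or an error is posted on the bus. */
    bus = gst_element_get_bus(pipeline);
    msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
                                     GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
    if (msg)
        gst_message_unref(msg);

    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(bus);
    gst_object_unref(pipeline);
    return 0;
}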
(In reply to haihao from comment #8)
> Could you provide the full command line and the sample video if possible?

Sample video link sent by e-mail; my customer does not want it shared publicly.
(In reply to haihao from comment #8)
> Could you provide the full command line and the sample video if possible?

Have you got anywhere investigating this?

I'm stalled at my end - the only thing I can envisage, given what I've found so far, is some form of erratum with large batch buffers.
(In reply to Simon Farnsworth from comment #11)
> (In reply to haihao from comment #8)
> > Could you provide the full command line and the sample video if possible?
>
> Have you got anywhere investigating this?

No. I can't reproduce this issue on my machine. I remember I replied to you in an email.

> I'm stalled at my end - the only thing I can envisage given what I've found
> so far is some form of erratum with large batch buffers.
As Haihao mentioned, we are unable to reproduce your issue, which appears to be specific to your configuration. We see no such GPU hangs on decode and deinterlace of a 1920x1080 MPEG-2 MP@HL video for IVB.

Sean
(In reply to Sean V Kelley from comment #13)
> As Haihao mentioned, we are unable to reproduce your specific issue, which
> appears to be specific to your configuration. We see no such GPU hangs on
> decode and deinterlace of a 1920x1080 MPEG-2 MP@HL video for IVB.
>
> Sean

And as I've mentioned to Haihao, I'm happy to send a complete system with a driver build environment that reproduces the problem to an address of Intel's choosing.
I've had the chance to do a bit more work on this. It looks like VA-API doesn't give the kernel the right hints for tracking GEM object dirty state; in the kernel, if I change i915_gem_execbuffer_move_to_active to unconditionally set obj->dirty = 1 (instead of only setting it if obj->base.write_domain is true), the hang goes away:

--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1032,6 +1032,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
         u32 old_read = obj->base.read_domains;
         u32 old_write = obj->base.write_domain;
 
+        obj->dirty = 1;
         obj->base.write_domain = obj->base.pending_write_domain;
         if (obj->base.write_domain == 0)
             obj->base.pending_read_domains |= obj->base.read_domains;
@@ -1039,7 +1040,6 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 
         i915_vma_move_to_active(vma, req);
         if (obj->base.write_domain) {
-            obj->dirty = 1;
             i915_gem_request_assign(&obj->last_write_req, req);
 
             intel_fb_obj_invalidate(obj, ORIGIN_CS);

I assume that this means that you're not calling the SET_DOMAIN ioctl() at appropriate points, but relying on the kernel doing the right thing for you anyway.
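For anyone following along, this is roughly what an explicit dirty/domain hint from userspace looks like at the ioctl level. A minimal sketch only, assuming an already-open DRM fd and a GEM handle; the helper name is made up and this is not code from libva or the intel driver:

#include <stdint.h>
#include <xf86drm.h>   /* drmIoctl() */
#include <i915_drm.h>  /* DRM_IOCTL_I915_GEM_SET_DOMAIN, I915_GEM_DOMAIN_* */

/* Hypothetical helper: declare that 'handle' is about to be written through
 * the CPU domain, so the kernel tracks the object's state accordingly. */
static int set_cpu_write_domain(int fd, uint32_t handle)
{
    struct drm_i915_gem_set_domain sd = {
        .handle       = handle,
        .read_domains = I915_GEM_DOMAIN_CPU,
        .write_domain = I915_GEM_DOMAIN_CPU,
    };

    return drmIoctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &sd);
}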
Yes, the SET_DOMAIN ioctl() is called explicitly in the driver. The driver calls drm_intel_gem_bo_map()/drm_intel_gem_bo_map_gtt()/drm_intel_bo_emit_reloc() to change the DOMAIN setting.
(In reply to haihao from comment #16)
> Yes, the SET_DOMAIN ioctl() is called explicitly in the driver. The driver
> calls
> drm_intel_gem_bo_map()/drm_intel_gem_bo_map_gtt()/drm_intel_bo_emit_reloc()
> to change the DOMAIN setting.

Sorry, the SET_DOMAIN ioctl() *isn't* called explicitly in the driver. The driver calls drm_intel_gem_bo_map()/drm_intel_gem_bo_map_gtt()/drm_intel_bo_emit_reloc() to change the DOMAIN setting.
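For reference, a rough sketch of how those libdrm calls carry domain information. This is illustrative only - the helper names and argument values are made up, not taken from libva-intel-driver. As far as I understand libdrm_intel, drm_intel_bo_map() with write_enable set (and drm_intel_gem_bo_map_gtt()) issue SET_DOMAIN on the application's behalf, and drm_intel_bo_emit_reloc() records the read/write domains that the kernel later sees as the pending write domain at execbuffer time:

#include <stdint.h>
#include <i915_drm.h>      /* I915_GEM_DOMAIN_* */
#include <intel_bufmgr.h>  /* libdrm_intel buffer manager API */

/* Hypothetical helper: CPU-write access to a buffer object.  write_enable=1
 * is what makes libdrm set a CPU write domain before the map. */
static void write_bo_contents(drm_intel_bo *bo)
{
    drm_intel_bo_map(bo, 1 /* write_enable */);
    /* ... fill bo->virtual here ... */
    drm_intel_bo_unmap(bo);
}

/* Hypothetical helper: a relocation whose non-zero write_domain is what the
 * kernel turns into a pending write domain at execbuffer time; a reloc with
 * write_domain == 0 leaves the target treated as read-only. */
static void emit_render_target_reloc(drm_intel_bo *batch, uint32_t batch_offset,
                                     drm_intel_bo *target)
{
    drm_intel_bo_emit_reloc(batch, batch_offset,
                            target, 0 /* target_offset */,
                            I915_GEM_DOMAIN_RENDER,  /* read_domains */
                            I915_GEM_DOMAIN_RENDER); /* write_domain */
}

If a GPU-written surface (for example the deinterlace output) only ever gets relocations with write_domain == 0, that would match the symptom the kernel-side change in comment #15 papers over - though whether that is actually what the driver emits here is exactly what needs checking.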