I have a Lenovo ThinkPad X201 Tablet laptop with an Intel Core i7 L-620 CPU. I use Debian jessie (testing) 64-bit with KDE 4.10.5.

The problem appears after some minutes of playback when I use hardware video decoding via vaapi (I use the vlc player). The error appears on kernel 3.11.7 and not on 3.11.6. I don't understand why, because there are no changes in the GPU driver between these versions (if I read the kernel changelog correctly)...

When the GPU has hung I connect to the laptop via ssh and collect the errors. Please let me know if you need more information.

$ uname -a
Linux h13 3.11.7-zen+ #1 ZEN SMP Wed Nov 6 16:42:14 FET 2013 x86_64 GNU/Linux

Package versions:
i965-va-driver:amd64 1.2.1-2.1
libdrm-intel1:amd64 2.4.46-3
libdrm-nouveau2:amd64 2.4.46-3
libdrm-radeon1:amd64 2.4.46-3
libdrm2:amd64 2.4.46-3
libva-drm1:amd64 1.2.1-2.1
libva-glx1:amd64 1.2.1-2.1
libva-x11-1:amd64 1.2.1-2.1
libva1:amd64 1.2.1-2.1
xserver-xorg-core 2:1.14.3-4
xserver-xorg-video-intel 2:2.99.905+git1383858000.b46d0d3

The syslog messages:
Nov 9 18:23:37 localhost kernel: [ 2623.487215] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Nov 9 18:23:37 localhost kernel: [ 2623.487225] [drm] capturing error event; look for more information in /sys/kernel/debug/dri/0/i915_error_state
Nov 9 18:23:43 localhost kernel: [ 2629.459913] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Nov 9 18:23:49 localhost kernel: [ 2635.468525] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Nov 9 18:23:55 localhost kernel: [ 2641.441165] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Nov 9 18:24:01 localhost kernel: [ 2647.461769] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring
Nov 9 18:24:07 localhost kernel: [ 2653.446329] [drm:i915_hangcheck_elapsed] *ERROR* stuck on render ring

The intel_gpu_top output when the GPU has hung (in an ssh session):

    render busy:    100%: ████████████████████   render space: 898/131072
    bitstream busy:   1%: ▎                      bitstream space: 0/131072

    task  percent busy
    CS:  100%: ████████████████████   vert fetch: 0 (0/sec)
    URB: 100%: ████████████████████   prim fetch: 0 (0/sec)
    VFE: 100%: ████████████████████   VS invocations: 0 (0/sec)
    BCS:   1%: ▎                      GS invocations: 0 (0/sec)
    AI:    1%: ▎                      GS prims: 0 (0/sec)
    AC:    1%: ▎                      CL invocations: 0 (0/sec)
    AM:    1%: ▎                      CL prims: 0 (0/sec)
                                      PS invocations: 0 (0/sec)
                                      PS depth pass: 0 (0/sec)
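For anyone triaging a similar log, a minimal sketch of how the hangcheck messages could be summarised per ring. The function name count_ring_hangs is hypothetical, and the pattern assumes the "stuck on <ring> ring" wording shown in the syslog excerpt of this report; other kernel versions may word the message differently.

```shell
#!/bin/sh
# Summarise i915 hangcheck errors per ring from kernel log text on stdin.
# Hypothetical helper: it assumes the "stuck on <ring> ring" message format
# seen in this report.
count_ring_hangs() {
    grep -o 'stuck on [a-z]* ring' | sort | uniq -c | awk '{ print $4, $1 }'
}

# Possible usage (may need root, depending on dmesg restrictions):
#   dmesg | count_ring_hangs
```

Each output line has the form "ring count", so the excerpt above would be reported as a single render entry.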
Created attachment 88940 [details] Information collected by intel_gpu_abrt tool.
Hmmm. The bug appears on 3.11.6 too...

Possible steps to reproduce:
1. Download the video "Planet" (file "h264_720p_hp_5.1_6mbps_ac3_unstyled_subs_planet.mkv") from this page: http://www.auby.no/files/video_tests/
2. Check these settings in VLC:
   [postproc] # Video post processing filter
   postproc-q=0
   [avcodec] # FFmpeg audio/video decoder
   ffmpeg-hw=1
   [main] # main program
   fullscreen=0
   skip-frames=0
   vout=xcb_x11
3. Open the video file and start playback. You may need to switch to fullscreen after some seconds.
4. The GPU hangs.

I use this script to collect the errors automatically (it is run by cron every 2 minutes):

--- cut ---
# cat get_i915_error_state.sh
#!/bin/sh
TS=$(date +%Y-%m-%d_%H-%M-%S)
cd /opt/intel_bugs
[ -f /tmp/intel_gpu_abrt.lock ] && exit 0
[ -f /sys/kernel/debug/dri/0/i915_error_state ] && \
    grep -q "no error state collected" /sys/kernel/debug/dri/0/i915_error_state
if [ 0 -ne $? ]; then
    echo "i915 error detected at $TS"
    intel_gpu_abrt
    [ -f intel_gpu_abrt.tar ] && \
        mv intel_gpu_abrt.tar intel_gpu_abrt_${TS}.tar && \
        gzip intel_gpu_abrt_${TS}.tar
    echo "Trying to reset GPU..."
    echo 1 > /sys/kernel/debug/dri/0/i915_wedged
    touch /tmp/intel_gpu_abrt.lock
fi
--- cut ---

System environment:
-- chipset: Intel Corporation 5 Series/3400 Series Chipset
-- system architecture: x86_64
-- xf86-video-intel: 2.99.905 git b46d0d3
-- xserver: 1.14.3
-- mesa: 10.0.0-devel
-- libdrm: 2.4.46
-- kernel: 3.11.6-zen+
-- Linux distribution: Debian/jessie
-- Machine or mobo model: ThinkPad X201t
-- Display connector: LVDS
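One detail of the cron script is worth noting: when the i915_error_state file does not exist (for example, debugfs is not mounted), the "[ -f ... ] && grep" line also leaves a non-zero status, so the script treats a missing file like a hang. A possible variation is sketched below. The function name has_error_state is hypothetical and this is only one way to separate the two cases, not the reporter's method.

```shell
#!/bin/sh
# Return 0 only when FILE is readable AND does not contain the
# "no error state collected" placeholder. A missing or unreadable file is
# reported as "no error state" rather than as a hang.
# has_error_state is a hypothetical helper name.
has_error_state() {
    file=$1
    [ -r "$file" ] || return 1
    ! grep -q 'no error state collected' "$file"
}

# Possible usage inside a collector script:
#   if has_error_state /sys/kernel/debug/dri/0/i915_error_state; then
#       echo "i915 error detected"
#   fi
```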
Can you reproduce this issue with other video files?
Yes. I have a lot of videos in .flv and .mp4 format downloaded from YouTube. These videos have different resolutions, such as 720p, 480p and lower. Is this bug similar to bug 59050? The symptoms are similar.
What I have tried so far:
- switched to the previous kernel 3.10.12: no improvement;
- changed the kwin render backend from OpenGL to XRender: no improvement;
- changed mesa from 10.0-dev to 9.2: no improvement;
- switched off the desktop environment (disabled kdm at startup) and ran plain X with xterm: no improvement.
One more observation: if I use the Xvideo output in vlc, I get the bug immediately. If I use the OpenGL output in vlc, I get the bug after switching to fullscreen mode.
> Is this bug similar to bug 59050? The symptoms are similar.

For 59050, the issue is only reproduced with some videos with specific resolutions, so you mean you can play back some videos without a GPU hang, right?
No. I just suggested that the problem may be in the buffer size. Sorry if that was a mistake.
Some ILK-related fixes are included in the following tarball, could you give it a try? http://www.freedesktop.org/software/vaapi/testing/libva-intel-driver/libva-intel-driver-1.2.2.pre1.tar.bz2
It works. And I've built this driver from git - it works too. Thanks.
Unfortunately, the problem has appeared again. I am not sure that the cause of the bug lies only in the libva component. I suspect the bug appears when OpenGL and libva are used simultaneously. For example, the hang becomes more likely when I run a simple 3D game in one window and vlc in another.
Created attachment 90950 [details] fresh collected error state
Hi Mihail Kasadjikov, can you still reproduce this issue? On ILK I ran a 3D game in one window and mplayer in another, and I could not reproduce the problem.
Hi. I noticed the problem appears when I use "i915.i915_enable_rc6=1" in kernel cmdline. Because of this I can't use the power saving for Intel's GPU. Please see this bug: https://bugzilla.kernel.org/show_bug.cgi?id=77691#c1

Generally I don't play 3D games on my laptop, but modern desktop environments like KDE or Unity use OpenGL acceleration. I usually get a GPU hang when I watch YouTube using Flash Player with video acceleration and with RC6 power saving enabled. VLC triggers this error too. Now I'm not sure that this error is in libva. Maybe it is in the kernel module...
Can you please escalate this bug to the kernel developers?
If you want to escalate this bug to the kernel developers, you can change the Product from libva to DRI.
(In reply to comment #14)
> I noticed the problem appears when I use "i915.i915_enable_rc6=1" in kernel
> cmdline. Because of this I can't use the power saving for Intel's GPU.
> Please see this bug: https://bugzilla.kernel.org/show_bug.cgi?id=77691#c1

Let's track your bug here.

First, do not change the enable_rc6 module parameter from its platform-specific defaults, or all bets are off. Please see if you can reproduce the bug without it (though AFAICT what you set there should be the same as the default).

Please also try a more recent kernel.
I tested kernel 3.15.10. I can't use 3.16 because of some issues related to reiserfs.

I found that RC6 is disabled by default for Ironlake (gen 5). In the file «drivers/gpu/drm/i915/intel_pm.c»:

int intel_enable_rc6(const struct drm_device *dev)
{
    …
    /* Disable RC6 on Ironlake */
    if (INTEL_INFO(dev)->gen == 5)
        return 0;
    …
}

When I try to force enable_rc6 to 1, it still falls back to the default (disabled) value:

$ cat /proc/cmdline
root=/dev/mapper/h13ssd-root ro ipv6.disable=1 elevator=deadline no_console_suspend=1 pcie_aspm=powersave video=inteldrmfb:1280x800R-8 quiet intel_iommu=igfx_off resume=/dev/mapper/h13ssd-swap zswap.enabled=1 drm.debug=0x04 i915.enable_rc6=1

$ dmesg | egrep -i "rc6"
[ 1.913863] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off

And now I've got one error, but without an overall system hang:

$ dmesg | egrep -i "hangcheck"
[ 350.800043] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... bsd ring idle
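When comparing boots with different module parameters, it can help to pull only the i915 settings out of the kernel command line. A minimal sketch follows; i915_cmdline_params is a hypothetical helper name, and it only reports what was requested on the command line, not what the driver finally applied (the dmesg line above shows the effective RC6 state).

```shell
#!/bin/sh
# Print every i915.* token found in a kernel command line read from stdin,
# one per line. i915_cmdline_params is a hypothetical helper name.
i915_cmdline_params() {
    tr ' ' '\n' | grep '^i915\.'
}

# Possible usage:
#   i915_cmdline_params < /proc/cmdline
```

With the command line quoted in the comment above, this would print only i915.enable_rc6=1.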
I found and removed «semaphores=0» from «/etc/modprobe.d/options.conf». It had been added as a workaround for bug #54226. Right now I have no parameters set for the i915 kernel module:

$ cat /proc/cmdline
root=/dev/mapper/h13ssd-root ro ipv6.disable=1 elevator=deadline no_console_suspend=1 pcie_aspm=powersave video=inteldrmfb:1280x800R-8 quiet intel_iommu=igfx_off resume=/dev/mapper/h13ssd-swap zswap.enabled=1

But after some YouTube videos played with Flash Player I got the «Hangcheck timer elapsed» message, without an overall system hang:

$ dmesg | grep Hang
[ 4121.662268] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... bsd ring idle

Interestingly, after this message in dmesg and a short pause in video playback, other videos on YouTube play normally with the HW decoder and without errors for a long time...

And this is the error state after the message in dmesg:

# cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected
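Leftover workarounds like this one are easy to forget, so a quick check of the modprobe configuration can be useful before testing. The sketch below assumes the usual "options i915 ..." line syntax and the *.conf naming used under /etc/modprobe.d; the exact line that was in options.conf is not shown in this report, and list_i915_options is a hypothetical helper name.

```shell
#!/bin/sh
# List "options i915 ..." lines from the *.conf files in a modprobe.d
# directory (default: /etc/modprobe.d).
# list_i915_options is a hypothetical helper name.
list_i915_options() {
    dir=${1:-/etc/modprobe.d}
    grep -h '^[[:space:]]*options[[:space:]]\{1,\}i915' "$dir"/*.conf 2>/dev/null
}

# Possible usage:
#   list_i915_options
```

Empty output means that no i915 options are set through that directory.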
Could you please try latest drm-intel-nightly from cgit.freedesktop.org/drm-intel?
I can't use the latest kernel because I use the Reiser4 filesystem for my /home.

$ git clone --depth=1 --branch="drm-intel-nightly" git://anongit.freedesktop.org/drm-intel drm-intel-nightly
Cloning into 'drm-intel-nightly'...
remote: Counting objects: 50283, done.
remote: Compressing objects: 100% (47765/47765), done.
remote: Total 50283 (delta 4384), reused 12857 (delta 1911)
Receiving objects: 100% (50283/50283), 134.94 MiB | 692.00 KiB/s, done.
Resolving deltas: 100% (4384/4384), done.
Checking connectivity... done.
Checking out files: 100% (47560/47560), done.

$ cd drm-intel-nightly/
$ zcat ~/dev/kernel/Reiser4/reiser4-for-3.16.2.patch.gz | patch -p 1 --dry-run | grep -B 1 ^Hunk
checking file fs/fs-writeback.c
Hunk #2 succeeded at 575 (offset 1 line).
Hunk #3 succeeded at 618 (offset 1 line).
Hunk #4 succeeded at 651 (offset 1 line).
Hunk #5 succeeded at 675 (offset 1 line).
Hunk #6 succeeded at 683 (offset 1 line).
Hunk #7 succeeded at 1012 (offset 1 line).
Hunk #8 succeeded at 1048 (offset 1 line).
--
checking file include/linux/fs.h
Hunk #4 succeeded at 1578 (offset 26 lines).
Hunk #5 succeeded at 2250 (offset 26 lines).
Hunk #6 succeeded at 2380 with fuzz 2 (offset 27 lines).
Hunk #7 succeeded at 2388 (offset 27 lines).
--
checking file include/linux/sched.h
Hunk #1 succeeded at 1881 (offset -11 lines).
checking file include/linux/writeback.h
Hunk #2 succeeded at 93 with fuzz 2.
checking file mm/filemap.c
Hunk #1 succeeded at 1441 (offset 6 lines).
checking file mm/page-writeback.c
Hunk #1 succeeded at 2196 (offset -3 lines).
checking file mm/vmscan.c
Hunk #1 FAILED at 2490.
Hunk #2 succeeded at 2538 with fuzz 2 (offset 8 lines).

Sorry. I'm not a software developer and don't know how to backport the new drm module into kernel 3.16.
On kernel 3.16.6 the behavior is as described in comment 19. So now it works almost ideally, except for this one error in dmesg and one short pause after some minutes of video playback.
What about a newer kernel? It is strange to have the error check fire and no error state collected. Could you please try to reproduce and grab the error state from the latest state where it works properly but you still see warnings? Also, it would be good if you could try a newer kernel. If you use Ubuntu you can try the nightly deb from ppa:http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/
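In a long dry-run log the single failed hunk is easy to miss among the offset and fuzz notices. A small sketch that prints only the failures together with the file they belong to; failed_hunks is a hypothetical helper name, and it relies on the "checking file" / "patching file" and "Hunk #N FAILED" wording that GNU patch prints in the output above.

```shell
#!/bin/sh
# Read `patch` output on stdin and print each failed hunk prefixed with the
# file it applies to. failed_hunks is a hypothetical helper name.
failed_hunks() {
    awk '/^(checking|patching) file / { f = $3 }
         /^Hunk #[0-9]+ FAILED/       { print f ": " $0 }'
}

# Possible usage:
#   zcat reiser4-for-3.16.2.patch.gz | patch -p 1 --dry-run | failed_hunks
```

For the dry run quoted above, the only line reported would be the failed hunk in mm/vmscan.c.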
Hello.

$ uname -a
Linux h13 3.17.7 #1 SMP Sun Jan 18 01:51:42 MSK 2015 x86_64 GNU/Linux

Test with vlc 1.

config:
[avcodec] # FFmpeg audio/video decoder
avcodec-hw=vaapi_x11
[postproc] # Video post processing filter
postproc-q=0
[core] # core program
skip-frames=0
quiet-synchro=1
deinterlace=-1
vout=xcb_xv

stderr:
$ vlc Kizomba\ Isabelle\ and\ Felicien\ Asty\ -\ Curti\ ma\ mi.mp4
VLC media player 2.2.0-rc2 Weatherwax (revision 2.2.0-rc1-118-g22fda39)
[0000000000a67118] core libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007f7becc328d8] avcodec decoder: Using Intel i965 driver for Intel(R) Ironlake Mobile - 1.4.1 for hardware decoding.
libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007f7becc328d8] avcodec decoder: Using Intel i965 driver for Intel(R) Ironlake Mobile - 1.4.1 for hardware decoding.

Vlc hung after about 10-20 seconds and ignored the TERM signal. It was killed with "kill -9". No errors in dmesg.

Test with vlc 2.

config:
[avcodec] # FFmpeg audio/video decoder
avcodec-hw=vaapi_drm
[postproc] # Video post processing filter
postproc-q=0
[core] # core program
skip-frames=0
quiet-synchro=1
deinterlace=-1
vout=xcb_xv

stderr:
$ vlc Kizomba\ Isabelle\ and\ Felicien\ Asty\ -\ Curti\ ma\ mi.mp4
VLC media player 2.2.0-rc2 Weatherwax (revision 2.2.0-rc1-118-g22fda39)
[00000000015b5118] core libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007f0110c32938] avcodec decoder: Using Intel i965 driver for Intel(R) Ironlake Mobile - 1.4.1 for hardware decoding.
libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007f0110c32938] avcodec decoder: Using Intel i965 driver for Intel(R) Ironlake Mobile - 1.4.1 for hardware decoding.

The video file played normally. Flash video works normally.

So right now it seems OK. The problem with "avcodec-hw=vaapi_x11" may be in the X11 driver or in vlc, but I think the DRM driver works normally.
Kernel 3.18.4. Another test with vlc.

config:
[avcodec] # FFmpeg audio/video decoder
avcodec-hw=vaapi_drm
[postproc] # Video post processing filter
postproc-q=0
[core] # core program
skip-frames=0
quiet-synchro=1
deinterlace=-1
vout=xcb_xv

After many stops and rewinds of a few seconds each, vlc hung as in "test 1" in the previous post. No errors in dmesg, but in htop I saw 99% IOwait on one CPU core. Vlc was killed with "-9", and I can observe no side effects.
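High IOwait together with a process that ignores TERM may indicate a thread stuck in uninterruptible sleep (state D), although this report does not confirm that. A minimal sketch for checking it while the player is hung; uninterruptible_procs is a hypothetical helper name, and the input is assumed to be in the "pid stat comm" column order produced by the ps invocation in the usage comment.

```shell
#!/bin/sh
# Read `ps -eo pid,stat,comm` output on stdin and print the PID and command
# of every process whose state starts with D (uninterruptible sleep).
# uninterruptible_procs is a hypothetical helper name.
uninterruptible_procs() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Possible usage while vlc is hung:
#   ps -eo pid,stat,comm | uninterruptible_procs
```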
I have switched the HW decoder in vlc to VDPAU via the libvdpau-va-gl1 library:

[avcodec] # FFmpeg audio/video decoder
avcodec-hw=vdpau_avcodec

And I get some additional errors:

libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007fc750f78278] avcodec decoder: Using OpenGL/VAAPI/libswscale backend for VDPAU for hardware decoding.
libva info: VA-API version 0.36.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_36
libva info: va_openDriver() returns 0
[00007fc750f78278] avcodec decoder: Using OpenGL/VAAPI/libswscale backend for VDPAU for hardware decoding.
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpDecoderRender_h264): no surfaces left in buffer
[VS] error (vdpVideoSurfaceGetBitsYCbCr): not implemented conversion VA FOURCC � -> VDP_YCBCR_FORMAT_YV12
[00007fc721499778] vdpau_chroma filter error: video surface export failure: VDP_STATUS_INVALID_Y_CB_CR_FORMAT

$ echo -n "FOURCC � ->" | hd
00000000  46 4f 55 52 43 43 20 ef bf bd 20 2d 3e  |FOURCC ... ->|

After about 2 minutes vlc hung. No errors in dmesg. In htop I saw 99% IOwait on one CPU core. Vlc was killed with "-9", and I can observe no side effects on the overall system.

This behaviour does not occur with all video files. So it looks like some video files (and YouTube streams) contain frames that the HW decoder can't understand, and this is fatal for the HW decoder. Or does the decoder generate some error code that libva can't parse? At this stage I can't tell which component is broken.
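A note on the hexdump above: the bytes ef bf bd are the UTF-8 encoding of U+FFFD, the Unicode replacement character. This suggests that the original FOURCC bytes had already been replaced by the time the text was pasted into echo, so the dump does not show the real FOURCC value. The sketch below only reproduces that byte sequence; bytes_hex is a hypothetical helper name.

```shell
#!/bin/sh
# Print the bytes read from stdin as one lowercase hex string.
# bytes_hex is a hypothetical helper name.
bytes_hex() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# U+FFFD written as its three UTF-8 bytes (octal escapes 357 277 275).
printf '\357\277\275' | bytes_hex   # prints efbfbd
```

To see the real value, the raw stderr of vlc would have to be captured to a file and dumped directly, for example by redirecting it with 2> and running hd on that file.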
The mpv player works fine and without freezes. So I think the freezes are not in the hardware but in libva/X11/mesa/vlc.
(In reply to Mihail Kasadjikov from comment #27)
> The mpv player works fine and without freezes.
>
> So I think the freezes are not in the hardware but in libva/X11/mesa/vlc.

Thanks, closing. Please reopen or file a new bug if the problem reappears.