Created attachment 111903 [details]
dmesg output

Running Debian unstable x86/64 on AMD64 (4 cores) using a Radeon 3850HD graphics card with Linus git head kernel. H.264 videos at 720p or 1080p resolution cause lock-up with VLC (but not with mpv --vo vdpau).

Using:

mesa-vdpau-drivers:amd64 10.3.2-1
libdrm-radeon1:amd64 2.4.58-2
xserver-xorg-video-radeon 1:7.5.0-1
vlc 2.2.0~rc2-1

I don't have any short non-commercial videos to demonstrate the problem, but it does appear more likely with encodes from broadcast sources than blu-rays, suggesting that corrupt video gets fed through to where it causes a gpu lock-up while audio playback and other processes continue.

$ lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (ext gfx port 0)
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:12.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
00:13.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 IDE Controller
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia (Intel HDA)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 LPC host controller
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI Bridge
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 10h Processor Link Control
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV670 [Radeon HD 3690/3850]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RV670/680 HDMI Audio [Radeon HD 3690/3800 Series]
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 02)

# vdpauinfo
display: :0   screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0

Video surface:

name   width height types
-------------------------------------------
420     8192   8192  NV12 YV12
422     8192   8192  UYVY YUYV
444     8192   8192  Y8U8V8A8 V8U8Y8A8

Decoder capabilities:

name            level macbs width height
-------------------------------------------
MPEG1               0  9216  2048   1152
MPEG2_SIMPLE        3  9216  2048   1152
MPEG2_MAIN          3  9216  2048   1152
H264_BASELINE      41  9216  2048   1152
H264_MAIN          41  9216  2048   1152
H264_HIGH          41  9216  2048   1152
VC1_ADVANCED        4  9216  2048   1152

Output surface:

name          width height nat types
----------------------------------------------------
B8G8R8A8       8192   8192   y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R8G8B8A8       8192   8192   y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R10G10B10A2    8192   8192   y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
B10G10R10A2    8192   8192   y  NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8

Bitmap surface:

name          width height
------------------------------
B8G8R8A8       8192   8192
R8G8B8A8       8192   8192
R10G10B10A2    8192   8192
B10G10R10A2    8192   8192
A8             8192   8192

Video mixer:

feature name                  sup
------------------------------------
DEINTERLACE_TEMPORAL           y
DEINTERLACE_TEMPORAL_SPATIAL   -
INVERSE_TELECINE               -
NOISE_REDUCTION                y
SHARPNESS                      y
LUMA_KEY                       -
HIGH QUALITY SCALING - L1      -
HIGH QUALITY SCALING - L2      -
HIGH QUALITY SCALING - L3      -
HIGH QUALITY SCALING - L4      -
HIGH QUALITY SCALING - L5      -
HIGH QUALITY SCALING - L6      -
HIGH QUALITY SCALING - L7      -
HIGH QUALITY SCALING - L8      -
HIGH QUALITY SCALING - L9      -

parameter name                sup      min      max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH            y        48     2048
VIDEO_SURFACE_HEIGHT           y        48     1152
CHROMA_TYPE                    y
LAYERS                         y         0        4

attribute name                sup      min      max
-----------------------------------------------------
BACKGROUND_COLOR               y
CSC_MATRIX                     y
NOISE_REDUCTION_LEVEL          y      0.00     1.00
SHARPNESS_LEVEL                y     -1.00     1.00
LUMA_KEY_MIN_LUMA              y
LUMA_KEY_MAX_LUMA              y

I am happy to supply additional information or try things to narrow down the source of the problem.
Created attachment 111904 [details] dmesg output
Created attachment 111905 [details] 2015010719dmesg.txt
Created attachment 111906 [details] 20150107dmesg.txt
The commit: https://github.com/torvalds/linux/commit/dd5a74f2f982193620cfa1ef609df1ee805781d4 appears to at least reduce the problem. Is there any (semi-)automated way to check for any more occurrences of signed variables that should be unsigned?
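For readers unfamiliar with the class of bug the question refers to, here is a minimal sketch, not taken from the commit itself. It uses Python's `ctypes` to mimic 32-bit C integers; `MAX_LEN` and both function names are hypothetical. It shows how an upper-bound check on a signed value lets a negative length through, while the same check on an unsigned value rejects it. Compilers can flag some of these cases (GCC, for example, has `-Wsign-compare` and `-Wsign-conversion`), though whether those warnings would have caught this particular case is not something the thread establishes.

```python
import ctypes

MAX_LEN = 4096  # hypothetical upper bound on a buffer length


def passes_signed_check(length: int) -> bool:
    """Upper-bound check on a signed 32-bit value: negatives slip through."""
    signed = ctypes.c_int32(length).value
    return signed <= MAX_LEN


def passes_unsigned_check(length: int) -> bool:
    """Same check on an unsigned 32-bit value: -1 wraps to 4294967295."""
    unsigned = ctypes.c_uint32(length).value
    return unsigned <= MAX_LEN


if __name__ == "__main__":
    for value in (100, -1):
        print(value, passes_signed_check(value), passes_unsigned_check(value))
```

With a length of -1 the signed check passes and the unsigned check fails, which is why declaring such a variable unsigned closes the hole.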
Created attachment 112092 [details]
20150112dmesg.txt

After updating to the Linus git head kernel with https://github.com/torvalds/linux/commit/dd5a74f2f982193620cfa1ef609df1ee805781d4 applied, and applying the patch at http://article.gmane.org/gmane.linux.kernel.mm/127052, vlc still locked up on one video; see the 20150112dmesg.txt dump.

mediainfo reports the video format as:

Format : MPEG-4
Format profile : Base Media
Codec ID : isom
File size : 2.54 GiB
Duration : 48mn 0s
Overall bit rate : 7 579 Kbps
Encoded date : UTC 2010-01-10 03:49:24
Tagged date : UTC 2010-01-10 03:49:24

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4.0
Format settings, CABAC : Yes
Format settings, ReFrames : 3 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 48mn 0s
Source duration : 47mn 59s
Bit rate : 7 110 Kbps
Width : 1 440 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 29.970 fps
Standard : NTSC
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.153
Stream size : 2.38 GiB (94%)
Source stream size : 2.49 GiB (98%)
Language : English
Encoded date : UTC 2010-01-10 03:49:24
Tagged date : UTC 2010-01-10 03:49:24
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
mdhd_Duration : 2880936
(In reply to Arthur Marsh from comment #4)
> The commit:
>
> https://github.com/torvalds/linux/commit/dd5a74f2f982193620cfa1ef609df1ee805781d4
>
> appears to at least reduce the problem.

The patch can't affect the issue, because it only applies to the non-KMS (UMS) mode, which is completely deprecated and doesn't support UVD at all.
Created attachment 112109 [details] 2015011216dmesg.txt - dmesg output with 3.19.0-rc4+ problem further reduced but not eliminated.
The previous 2015011216dmesg.txt was inadvertently taken with 3.19.0-rc3+. When playing the same video under kernel 3.19.0-rc4+ it played for longer before locking up.
Created attachment 112129 [details] 20150113dmesg.txt - video run under kernel 3.19.0-rc4+
When I ran vlc on the file with VLC_VERBOSE=3 I had no GPU lockup. When I ran vlc on the file with VLC_VERBOSE=2 the GPU locked up again around the same time as the previous test with kernel 3.19.0-rc4+. Might that suggest a timing issue?
Created attachment 112166 [details] vlcdebug2.log, output from running vlc with VLC_VERBOSE=2
Created attachment 112167 [details] 2015011322dmesg.txt dmesg output with GPU lockup when VLC_VERBOSE=2
Upgraded mesa-related packages to 10.4.2-1. Still seeing a lockup a few minutes into video playback.
Created attachment 112175 [details] 2015011404dmesg.txt dmesg output with mesa 10.4.2
Upgraded to current Linus git head and tried again. This time there was no GPU reset associated with starting kdm (which had happened over the last few days), but the lock-up when playing the same video came less than 2 minutes into the video, much sooner than before.
Created attachment 112211 [details] 2015011421dmesg.txt - GPU lock-up less than 2 minutes into video play-back.
With kernel 3.19.0-rc5 the same video played right through with vlc without locking up.
Created attachment 112426 [details] 20150119dmesg.txt 3.19.0-rc5 dmesg
A further run of the same video with kernel 3.19.0-rc5, doing some skipping of the video.
Created attachment 112447 [details] 2015011907dmesg.txt - lockup with vlc and 3.19.0-rc5
Created attachment 112458 [details] 2015011922dmesg.txt - lockup with 3.19.0-rc5 and 848x480 resolution video First lock-up with lower than 720p resolution video playback
Created attachment 112606 [details] 20150122dmesg.txt lock-up with same video after radeon updates to 3.19.0-rc5 Rebuilt the kernel after the latest Radeon updates to Linus' 3.19.0-rc5, lock-up occurred sooner.
Created attachment 112660 [details] 2015012222dmesg.txt test after upgrading vlc. I upgraded vlc to 2.2.0~rc2-2 and re-tested against the same video running under the current Linus' git head kernel. There was a gpu lock-up again - each different test seems to have the lock-up happen at a different stage in play-back, as if there is a non-deterministic event leading to the lock-up.
Created attachment 112868 [details] 20150127dmesg.txt with 3.19.0-rc6 - lock-up after a few seconds of video play With kernel 3.19.0-rc6, the GPU locked up after a few seconds of playing the same video.
Created attachment 112889 [details] 2015012718dmesg.txt lock-up with first post 3.19.0-rc6 patches applied The lock-up occurred within the first 20 seconds of playing the video, but slightly later than with plain 3.19.0-rc6.
Created attachment 112901 [details] 2015012814dmesg.txt - lockup after updating kernel to latest radeon patches With the latest radeon patches in the 3.19.0-rc6+ kernel, I still experienced a lockup.
Created attachment 112955 [details] 20150130dmesg.txt - lock-up with the latest git head patches vlc behaved differently - going green and stalling before finally causing a gpu lock-up.
Created attachment 113239 [details] 20150207dmesg.txt - lock-up about 5 and a half minutes into video playback After latest radeon and mm updates to Linus git head, the same video played back fine until about 5 and a half minutes into playback, then locked-up all video.
Created attachment 113270 [details] 20150209dmesg.txt didn't lock up on usual video but did on another with 3.19 kernel With the 3.19 kernel, I didn't get a lock-up with the usual test video but did eventually with another video.
Created attachment 113394 [details] 2015021218dmesg.txt - lockup of screen except for mouse, was able to restart kdm For the first time after a lock-up caused by running vlc with vdpau, I was able to switch to a console with Ctrl-Alt-F1 and restart kdm successfully, although the desktop was locked up apart from the mouse cursor. Using current Linus' git head.
After the last lock-up, although I could restart kdm, vdpau didn't work until I'd powered off and restarted the machine (vdpau failed even after a kexec restart):

Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
[vo/vdpau] Error when calling vdp_device_create_x11: 1
Error opening/initializing the selected video_out (-vo) device.
Video: no video

Ironically I was getting that message, even though I've only had Radeon hardware in this machine.
(In reply to Arthur Marsh from comment #31)
> Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object
> file: No such file or directory
> [vo/vdpau] Error when calling vdp_device_create_x11: 1
> Error opening/initializing the selected video_out (-vo) device.
> Video: no video
>
> Ironically I was getting that message, even though I've only had Radeon
> hardware in this machine.

I guess the Xorg radeon driver couldn't initialize hardware acceleration, so it didn't advertise the VDPAU driver name, and libvdpau fell back to its hardcoded default 'nvidia'.

As for the hangs, I suspect they just happen randomly regardless of kernel version. Attaching even more dmesg files just clutters up this report and makes it harder for anyone to make sense of it.
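The fallback described in this reply can be sketched as follows. This is an illustration of the lookup order only, not libvdpau's actual code: the function name is hypothetical, the `VDPAU_DRIVER` environment override is included on the assumption that it is consulted first, and the library file name simply follows the pattern seen in the error message above.

```python
import os
from typing import Mapping, Optional


def vdpau_driver_library(advertised_by_x: Optional[str],
                         env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the backend library name the way the reply above describes.

    advertised_by_x is the driver name the X driver reports, or None when
    hardware acceleration failed to initialize and nothing is advertised.
    """
    env = os.environ if env is None else env
    # Order: explicit override, then the advertised name, then the
    # hardcoded default mentioned in the reply.
    name = env.get("VDPAU_DRIVER") or advertised_by_x or "nvidia"
    return f"libvdpau_{name}.so"


if __name__ == "__main__":
    # "r600" is used here purely as an example driver name.
    print(vdpau_driver_library("r600", {}))
    print(vdpau_driver_library(None, {}))
```

When nothing is advertised, the sketch returns `libvdpau_nvidia.so`, which matches the library named in the quoted error even on a machine that has only ever had Radeon hardware.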
Where to from here? I'm happy to build kernels and other packages from source, test and bisect, but I'm not a C programmer.
(In reply to Arthur Marsh from comment #33)
> Where to from here? I'm happy to build kernels and other packages from
> source, test and bisect, but I'm not a C programmer.

Unfortunately we don't really have time to take care of the older hardware generations. Getting the new generations working usually has priority.

I have a couple of ideas about what could cause this, but you clearly need to hack into the code to figure out what it is.

Sorry that I can't help here much,
Christian.
Created attachment 114333 [details]
multiple lockup errors immediately after starting video 2015031613dmesg.txt

After upgrading the kernel to 4.0.0-rc4, and vdpau / libdrm to:

libvdpau1:amd64 0.9-1
libdrm-radeon1:amd64 2.4.59-1

I saw 4 GPU lockup error messages in dmesg (attached) within about half a second. The video itself locked up a few seconds into playback. (I could supply the first minute of the video if anyone is willing to look at it.)
With the first post-4.0.0-rc7 drm update, I am no longer seeing the error, but have been unable to git-bisect to find the commit that fixed the problem.
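One reason bisection is hard here is that the lock-up is intermittent, so a single clean playback does not prove a build is good. The sketch below is an illustration under stated assumptions, not a description of how `git bisect` works internally: all function names are hypothetical, the per-run lock-up probability is assumed known and independent between runs, and the commits are assumed ordered oldest to newest with the last one fixed. It estimates how many clean runs are needed before trusting a build, then binary-searches for the first commit that looks fixed.

```python
import math
from typing import Callable, Sequence


def runs_needed(p_lockup: float, confidence: float = 0.95) -> int:
    """Smallest n with (1 - p_lockup)**n <= 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_lockup))


def looks_fixed(locks_up: Callable[[str], bool], commit: str, runs: int) -> bool:
    """Treat a commit as fixed only if none of `runs` playbacks locks up."""
    return not any(locks_up(commit) for _ in range(runs))


def first_fixed_commit(commits: Sequence[str],
                       locks_up: Callable[[str], bool],
                       runs: int) -> str:
    """Binary search for the oldest commit that survives `runs` playbacks."""
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if looks_fixed(locks_up, commits[mid], runs):
            hi = mid          # fix is at mid or earlier
        else:
            lo = mid + 1      # fix is after mid
    return commits[lo]


if __name__ == "__main__":
    # If a broken build locks up in half of all playbacks, this many clean
    # runs are needed for 95% confidence that a build is not broken.
    print(runs_needed(0.5))
```

The lower the per-run lock-up rate, the more playbacks each bisection step needs, which quickly makes a manual bisect impractical.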
Are you sure this bug is not related to bug #85320? Also, are you certain it is fixed in 4.0.0-rc7+ (linux git)? In the mentioned report, users of RV620/630 and RS780/880 (3450/2600 and 3200/4200 respectively) report GPU resets and lockups when using vdpau hardware decoding.

Do you also use mesa git, and might the fix have been introduced by a mesa git pull instead? That would explain why you could not bisect it in linux git. I for my part have a Radeon HD 3200 Mobility (RS780M) and could still reproduce it with linux git.

Did you test the fix thoroughly? For example, I could start a video with hardware accelerated video decoding in mpv 40 times without a GPU reset, but seeking in the video or disabling and re-enabling the video track could cause it, while normal playback usually did not trigger it. Stable VLC, however, caused the GPU reset on the first try using vaapi decoding with the vdpau wrapper. It would at least be a good sign if VLC can't reproduce this anymore for you, and maybe for the others at bug #85320 as well.

So basically I'm asking whether the described methods still cause a GPU reset, and which libraries you use in their git versions.
Sorry for the delay in replying, I tried a few more tests first.

The bad news is that even with a kernel build that didn't lock up on one complete playback of the video, even when skipping during the video, I'd experience a lock-up on a subsequent reboot and run.

The good news is that with current git head, even though I can experience lock-ups, I can restart kdm successfully following a lock-up. Prior to the 4.0.0 kernel I needed to power down and restart to undo the lock-up.

Besides the Linus git head kernel built with current Debian experimental gcc-5, the other packages installed include:

libdrm-related packages at version 2.4.60-2
libvdpau1:amd64 0.9-1
mesa packages 10.4.2-2
xserver-xorg packages 1:7.5.0-1

At one stage I wanted to be sure that it wasn't a problem with the 3850HD video card, so I removed it and used the onboard Radeon 3200HD, and experienced the same problems.
Somehow, for the most part I'm no longer experiencing lock-ups.

Kernel is Linus' current git head:

Linux version 4.1.0-rc2+ (root@am64) (gcc version 5.1.1 (Debian 5.1.1-4) ) #1700 SMP PREEMPT Sat May 9 14:01:46 ACST 2015

DDX is 1:7.5.0-1+b1
libdrm is 2.4.60-3
mesa is 10.4.2-2
libc6 is 2.19-18
vlc is 2.2.1-1+b1
The "disable semaphores" patch:

http://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-fixes-4.1&id=013ead48a843442e63b9426e3bd5df18ca5d054a

appears to stop the lock-ups from happening. It also appears to prevent multiple vdpau-enabled vlc sessions from working at the same time, and a video sometimes shows a black screen when its resolution differs from that of the previous video played. Playing a few different videos with vlc appears to reset things so that a video that previously showed a black screen plays fine again.
*** This bug has been marked as a duplicate of bug 85320 ***