Summary: | [RV620][RV630][RS880] GPU hangs using UVD hardware acceleration | ||
---|---|---|---|
Product: | Mesa | Reporter: | Eugene <ken20001> |
Component: | Drivers/Gallium/r600 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | major | ||
Priority: | high | CC: | arthur.marsh, ckoenig.leichtzumerken, daniele.rogora, freedesktop.jim-j, ken20001, nicolamori, russianneuromancer, tanertas, zima |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
Attachments: |
Xorg.log
dmesg
Xorg.log with backtrace
dmesg
syslog
dmesg log, gpu hangs
UVD hang on RV620
journalctl dump for uvd/vdpau crash
Failed UVD playback session with RS780 mainboard
Something to test
More to test. |
Description
Eugene
2014-10-22 09:31:36 UTC
If this is a regression, can you narrow down which component (kernel, Mesa, etc.) caused the problem and bisect? Please also attach your Xorg log and dmesg output.

I suspect this is the same as: https://bugs.freedesktop.org/show_bug.cgi?id=85323

(In reply to Alex Deucher from comment #1)
> If this is a regression can you narrow down which component (kernel, mesa,
> etc.) caused the problem and bisect? Please also attach your xorg log and
> dmesg output.

Yes, I would bisect if somebody explained how to, but I don't know how.

Created attachment 108252 [details]
Xorg.log
Created attachment 108253 [details]
dmesg
Google for "git bisect howto". There are lots of good tutorials.

(In reply to Alex Deucher from comment #6)
> Google for "git bisect howto". There are lots of good tutorials.

What exactly should I bisect, Mesa? Where do I get it?

(In reply to Eugene from comment #7)
> (In reply to Alex Deucher from comment #6)
> > Google for "git bisect howto". There are lots of good tutorials.
> What exactly should I bisect, Mesa? Where do I get it?

Can you narrow down whether it was a Mesa update or a kernel update that caused the regression? You'll need to do that first. Once you've figured that out, you can bisect the appropriate component (Mesa or kernel).

Mesa git info is here: http://cgit.freedesktop.org/mesa/mesa/
Kernel git info: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/

The thing is that Linux 3.18 is the first kernel with which hardware acceleration for the HD2600 became possible, so I don't think this is a regression. I recently checked Mesa 10.1.3 stable, with the same result: the system hangs and the display goes dark, with no possibility of switching to any VT.

So, as I understand it, this is not a regression. It is just a new feature (for my graphics adapter) that comes with Linux 3.18, and it is very unstable.

You need the bleeding-edge Mesa code to get video acceleration working on the HD2600. But even then the hardware on the HD2600 is so buggy that it is really tricky to get this working right.

Anyway, I'm ready to test anything that is possible.

I was able to save logs after playing a video in VLC with VDPAU and hardware decoding turned on. There is also a backtrace in the Xorg log file. Please look at the attachments.

Created attachment 108772 [details]
Xorg.log with backtrace
Created attachment 108773 [details]
dmesg
Created attachment 108774 [details]
syslog
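For readers who, like the reporter, have not used bisection before: the idea behind the "git bisect" suggestion earlier in the thread is a binary search over the commit history between a known-good and a known-bad version. The following is only a toy sketch of that search, not git itself. The function name `bisect_first_bad`, the placeholder commit names and the `is_bad` test are all hypothetical, and it assumes that every commit after the first bad one is also bad.

```python
def bisect_first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    that once a commit is bad every later commit is bad as well.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return bad


# Hypothetical history: placeholder names, not real Mesa or kernel commits.
commits = [f"commit-{i}" for i in range(16)]
# Hypothetical test: pretend the hang first appears in commit-11.
first_bad = bisect_first_bad(commits, lambda c: int(c.split("-")[1]) >= 11)
print(commits[first_bad])
```

With 16 commits this needs four build-and-test rounds instead of up to fourteen, which is why bisection is practical even for a long kernel or Mesa history. In practice `is_bad` is the manual step: build that revision, play the video, and note whether the GPU hangs.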
I see the same happening here with an HD4200 (RS880) GPU. I tried with:
- kernel 3.18rc3
- latest radeon ucode firmware files from the 23rd of August
- updated mesa, xserver and libs from the oibaf ubuntu ppa

I observed the same behavior: the video starts well for a second, then the image freezes (but the mouse cursor is still alive) and I can't do anything, not even switch to another VT. After a while the screen goes black. In the meantime I can hear the sound of the video correctly, and indeed if I log in with ssh from another PC everything works well; still, the X server doesn't work until a reboot is performed. I attach a piece of the dmesg log, taken from the ssh session right after the video was played, where the GPU hang is reported. Xorg doesn't log any error.

Created attachment 109807 [details]
dmesg log, gpu hangs
This is quite strange, because my Xorg log shows a lot of error messages.

There was a VDPAU fix in MPV 0.6.1 that fixed this kind of lockup on my hardware: RS880G. My videos are all 1080i50 H.264 recordings from DVB-S2 broadcasts of various channels. Since they are just chunks of transport stream, I think that lockups when starting a video and when seeking would be similar. I also have ~oibaf Mesa from 4th October, the first with the VAAPI state tracker, and I tested that too.

Here is a summary of what I found with MPV 0.6.1:
--hwdec=vdpau --vo=vdpau does not lock up the GPU, but...
--hwdec=vdpau --vo=vdpau --deinterlace locks the GPU immediately. I assume deinterlace will not work until there is a proper workaround for the frame based output. UVD is not useful for me without deinterlace.
--hwdec=vaapi --vo=vaapi locks up the GPU quite often when starting to play a video, similar to vdpau on MPV 0.6.0.
--hwdec=vaapi --vo=opengl does not lock up the GPU, but there is quite a lot of coloured or speckled picture corruption on my video.

There is also VLC; using it with HW acceleration turned on locks up the GPU as well.

My previous tests were done with VLC and GStreamer. I tried with MPV 0.6.2 as well, but the GPU still hangs for me. The video I use is 1080p, so no deinterlacing is needed; I never tried to use it.

I also tried mpv 0.6.2, and the GPU hangs here as well.

OK, I've just learnt something interesting: suspending kwin desktop effects makes everything work flawlessly here. I tried mpv, vlc and even the flash plugin, and hw accel works well; I also tried seeking in the video without any problem. Now we should find out whether the problem is only there with kwin or also with other desktop environments.

Update: there are still some videos (taken with my smartphone) causing the GPU to hang as it did before, both with VLC and mpv, but I confirm that I can now play YouTube with hw acceleration.

Linux 3.19RC1. Nothing's changed.

Created attachment 111184 [details]
UVD hang on RV620
Can confirm the same thing on RV620.
dmesg log attached
As for me, disabling kwin effects does not change anything.

I also looked briefly through the attachments; it seems that Daniele and I are reporting a different issue than the OP.

(In reply to Eugene from comment #26)
> Linux 3.19RC1. Nothing's changed.

You don't need to test every new kernel version. I'm going to leave a note here if I find time to work on this issue.

On the other hand, if you want to get your hands dirty and try a few things, then I can give you directions on what it could be (but you need to get into the code yourself).

(In reply to Christian König from comment #29)
> (In reply to Eugene from comment #26)
> > Linux 3.19RC1. Nothing's changed.
>
> You don't need to test every new kernel version. I'm going to leave a note
> here if I find time to work on this issue.
>
> On the other hand if you want to get your hands dirty and try a few things
> then I can give you directions on what it could be (but you need to get
> into the code yourself).

Thanks, it would be great if you decide to work on this issue. If you need any additional info or any tests, I'm ready to help. It's a pity, but I'm not a programmer, so I can't write code. For anything else that would help, I'll do all I can.

I think I have the same on RS880 here (HD4290). Mesa 10.3.5, libdrm 2.4.58, kernel 3.18.1, R600_rlm.bin + RS780_uvd.bin firmware from August 2014. Testing with mplayer + vdpau.

Created attachment 111799 [details]
journalctl dump for uvd/vdpau crash
Same issue with RS880M. Sometimes it hangs at the very beginning (before the first frame appears on the screen) and sometimes after one/few/many attempts to rewind. Tested with Linux 3.19, the latest Mesa snapshot from the Oibaf PPA, and mpv 0.8.

Same issue with an HD4290 (RS880). My motherboard has an AM3+ socket, so it is not obsolete hardware; it supports UVD2 and the latest AMD processors. It would be nice to fix it. If you need some help, I can get my hands dirty too :)

Created attachment 114623 [details]
Failed UVD playback session with RS780 mainboard
Comment on attachment 114623 [details]
Failed UVD playback session with RS780 mainboard
I can confirm the same happening on my bleeding-edge Arch Linux system with my RS780/HD3200 mainboard.
Kernel 3.19.2
Mesa 10.5.1
libdrm 2.4.60
xf86-video-ati 7.5.0
I tried UVD acceleration with mpv, vdr/softhddevice and flashplugin with vdpau output enabled. The screen freezes if I seek forward/back and if I adjust the playback window size (e.g. going to fullscreen).
I can switch between the VT consoles after the X session freezes. I can create a new working X session after killing the previous frozen X session, but video playback using the GPU is not possible anymore. I have to reboot the system.
dmesg attached.
Created attachment 115398 [details] [review]
Something to test

Just an idea I had recently about what this issue could be. Please test the attached patch and see if it works or not.

I patched a 4.0.1 kernel on Debian 8 for testing ..... I will test it with several movies :)

(In reply to Christian König from comment #37)
> Created attachment 115398 [details] [review] [review]
> Something to test
>
> Just an idea I had recently what this issue could be. Please test the
> attached patch and see if it works or not.

The patch made things on my system even worse. Before, playing a movie with VLC and VDPAU enabled resulted in random screen freezes after some time, while with the patched kernel the freeze happens immediately as I start playing, every time. Tested with a Mobility Radeon HD3400 (RV620) on ArchLinux 64 bit with linux-ck 4.0.1, mesa 10.5.4, mesa-vdpau 10.5.4 and libvdpau 1.1.

I got a black screen at the beginning, and the console shows this:
Radeon 0000:01:05.0 ring 5 stalled for more than .....secs

Created attachment 115475 [details]
More to test.
Interesting, attached is another patch you could test.
It just disables using UVD semaphores for now.
Works like a charm :)

I confirm that your last patch makes things work here too.

The new patch also works for me. A couple of questions, Christian: does the patch remove some features? Do you plan to mainline it, or rather implement a different fix now that the problem seems to be better defined? Thanks.

(In reply to Nicola Mori from comment #44)
> The new patch works also for me. A couple of questions, Christian: does the
> patch remove some features? Do you think to mainline it or rather implement
> a different fix now that the problem seems to be better defined? Thanks.

It disables hw semaphores for UVD1, but it's likely they were buggy on that early hw anyway. It shouldn't affect UVD functionality. The driver just uses a different method for synchronizing between rings.

(In reply to Nicola Mori from comment #44)
> The new patch works also for me. A couple of questions, Christian: does the
> patch remove some features? Do you think to mainline it or rather implement
> a different fix now that the problem seems to be better defined? Thanks.

Instead of submitting the commands to the hardware directly, with semaphores to sync between the GFX and UVD engines, we block until the dependent task is completed.

That's rather bad in a couple of different cases, for example doing 3D gaming and video playback at the same time.

What essentially happens is that instead of keeping UVD and GFX busy all the time (and only occasionally blocking one engine while it waits for the other one), you do it more like this:
1. Run UVD job.
2. Wait for UVD to finish.
3. Run GFX.
4. Wait for GFX to finish.
5. Run UVD.
6. Wait for UVD to finish.
....

Thanks for the clarification, Christian. Could it also impact other cases, e.g. a desktop environment with hardware accelerated visual effects? Playing a movie while resizing or dragging a window should result in a usage pattern of GFX and UVD that is similar to your example with movie and 3D game.
I did some experiments with my KDE desktop with OpenGL 2.0 and 3.1 visual effects, but I didn't notice any lag (maybe it's too light a workload to show any issue). Given the comment by Alex about the possibly buggy UVD1 hardware semaphores, the overall satisfactory performance of the patch and the old hardware affected by the bug, I would guess that this patch is likely the final fix for this bug. If so, when will it be mainlined (approx.)? Thanks.

(In reply to Nicola Mori from comment #47)
> Thanks for the clarification, Christian. Could it also impact other cases,
> e.g. a desktop environment with hardware accelerated visual effects?

Not really, that is way too little load to cause any real trouble. 3D games, on the other hand, are a different story.

> Given the comment by Alex about the possibly buggy UVD1 hardware semaphores,
> the overall satisfactory performance of the patch and the old hardware
> affected by the bug I would guess that this patch likely is the final fix
> for this bug. If so, when will it be mainlined (approx.)?

Alex merged it into his drm-fixes-4.1 branch and I put a CC stable on it. So if everything works well it will show up in 4.1 and will then be backported to the stable kernel versions used by distributions.

Hello, I have the same problem with RV730. The "disable semaphores ..." patch makes it better, but not completely; UVD is still not usable. Please follow me to bug #67994, which seems more appropriate for RV730.

*** Bug 88152 has been marked as a duplicate of this bug. ***

(In reply to zimous from comment #49)
> Hello, I have the same problem with RV730. The "disable semaphores ..."
> patch makes it better, but not completely, UVD is still not usable. Please
> follow me to bug #67994 which seems more approprite for RV730.

Please add a new bug report for this, because as you already wrote on bug #67994 your bug has different symptoms than this one here.

Closing this bug as the problem seems to be solved now. |
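The cost of the blocking workaround that Christian describes (run UVD, wait, run GFX, wait, and so on) can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not the radeon driver's logic: the function names, the job durations and the idealized "overlapped" schedule are all hypothetical, and the numbers are not measurements.

```python
def serialized_time(jobs):
    """Blocking sync: each engine waits for the other, so nothing overlaps."""
    return sum(uvd + gfx for uvd, gfx in jobs)


def overlapped_time(jobs):
    """Idealized semaphore-style sync.

    UVD decodes frames back to back; GFX only waits until the frame it
    needs has been decoded, so the two engines can work in parallel.
    """
    uvd_free = 0.0  # time at which the UVD engine finishes its latest job
    gfx_free = 0.0  # time at which the GFX engine finishes its latest job
    for uvd, gfx in jobs:
        uvd_free += uvd
        gfx_free = max(gfx_free, uvd_free) + gfx
    return gfx_free


# Hypothetical workload: 5 frames, 4 time units of decode and 6 of
# rendering per frame (a heavy GFX load, as in the 3D game example).
jobs = [(4.0, 6.0)] * 5
print(serialized_time(jobs))  # 50.0
print(overlapped_time(jobs))  # 34.0
```

In this model the gap only matters when both engines have substantial work per frame. With a light GFX load, such as the desktop effects discussed above, the two totals are nearly the same, which is consistent with the reports of no noticeable lag.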