Summary: | [radeonsi] Playing videos with vdpau or vaapi hardware acceleration crashes my pc | ||
---|---|---|---|
Product: | Mesa | Reporter: | snpidek <snpidek> |
Component: | Drivers/Gallium/radeonsi | Assignee: | Default DRI bug account <dri-devel> |
Status: | CLOSED FIXED | QA Contact: | Default DRI bug account <dri-devel> |
Severity: | normal | ||
Priority: | medium | CC: | john.ettedgui |
Version: | 12.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Attachments: |
A quick script I wrote to trigger the issue.
dmesg written by the script before I restart the machine
another dmesg from another run
Xorg log |
Description
snpidek
2016-10-15 08:13:21 UTC
I may have the same problem, and I know how to trigger it *easily*: when watching videos with mpv using vdpau output, if I quickly jump back and forth in the video, the system eventually freezes, the screen goes blank, and I have to reset the machine manually. If I wait maybe a second between jumps, I get no issue. I tried it with xv and got no issue, no matter how many jumps.

I'm on mesa-git and have had this issue for a while, but I'm no longer sure for how long, so it may or may not be the same as the one reported here. The card is a 280X. Since snpidek gave an entry point, I guess we should try bisecting this.

(In reply to John from comment #1)
> I may have the same problem, and I know how to trigger it *easily*:

Thanks, that is very valuable information. I'm going to try to reproduce this, because previously it sounded like a bug we would never get a grip on.

> Since snpidek gave an entry point, I guess we should try bisecting this.

Yeah, completely agree. If you can reproduce it more or less reliably, please try to bisect the issue between 11.2.2-1 and 12.0.3.

Well, I tried bisecting it today, assuming 11.2.2 was good, and got nowhere, so I tried commit 3a9f6283f435f90ca1a2901be39ec9d629c95bb6 and it still froze. Because of that I am not sure whether this is the same problem or not... I'll attach a few things just in case.

Created attachment 127357 [details]
A quick script I wrote to trigger the issue.
It takes a video file as an input (I used an X264 mkv movie file if it matters).
It doesn't happen as quickly as I originally thought: I've had runs of up to 25 minutes (and some of only seconds...).
I added the 2nd sleep to better simulate the speed at which I would usually press keys, but maybe it just delays the whole thing, I'm not sure.
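For readers without access to the attachment, a stress loop of this kind could look roughly like the sketch below. This is not the attached script: the socket path, the seek step, and the delay are made-up values, and the hwdec/vo options are passed on the command line instead of coming from the mpv config reported later in this thread. It drives mpv over its JSON IPC socket (`--input-ipc-server`; older mpv builds may name this option differently) and alternates forward and backward relative seeks.

```python
import json
import select
import socket
import subprocess
import sys
import time

SOCKET_PATH = "/tmp/mpv-seek-stress.sock"  # hypothetical path, any writable location works


def mpv_command(video, socket_path=SOCKET_PATH):
    """Build the mpv command line with the hwdec/vo values quoted in this thread."""
    return [
        "mpv",
        "--hwdec=vdpau",
        "--vo=opengl-hq",
        f"--input-ipc-server={socket_path}",
        video,
    ]


def seek_message(seconds):
    """Encode one relative seek as a newline-terminated JSON IPC command."""
    return (json.dumps({"command": ["seek", seconds, "relative"]}) + "\n").encode()


def seek_pattern(step=10):
    """Yield +step, -step, +step, ... forever (step is an assumed value)."""
    sign = 1
    while True:
        yield sign * step
        sign = -sign


def connect(socket_path, attempts=50, pause=0.1):
    """Retry until mpv has created its IPC socket."""
    for _ in range(attempts):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(socket_path)
            return sock
        except OSError:
            sock.close()
            time.sleep(pause)
    raise RuntimeError("mpv IPC socket did not appear")


def drain(sock):
    """Discard pending replies so the socket buffer never fills up."""
    while select.select([sock], [], [], 0)[0]:
        if not sock.recv(4096):
            break


def run(video, delay=0.1):
    """Start mpv and seek back and forth until mpv exits or the machine hangs."""
    player = subprocess.Popen(mpv_command(video))
    try:
        sock = connect(SOCKET_PATH)
        try:
            for offset in seek_pattern():
                if player.poll() is not None:
                    break
                sock.sendall(seek_message(offset))
                drain(sock)
                time.sleep(delay)
        except OSError:
            pass  # mpv closed the socket
        finally:
            sock.close()
    finally:
        player.terminate()


if __name__ == "__main__" and len(sys.argv) > 1:
    run(sys.argv[1])
```

The `delay` parameter stands in for the sleeps described above: raising it towards one second should, per the earlier report, make the freeze stop occurring.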
Created attachment 127358 [details]
dmesg written by the script before I restart the machine
Since there are quite a few lines in dmesg about the issue, the computer is obviously not fully dead.
Created attachment 127359 [details]
another dmesg from another run
Not sure if it helps, but I'm attaching it in case its information is a bit different.
Created attachment 127360 [details]
Xorg log
I'll try today to go a bit further back than 11.2. If anything in the logs gives you an idea of a good starting point, please do share.

(In reply to John from comment #4)
> Created attachment 127357 [details]
> A quick script I wrote to trigger the issue.

For me this would use s/w dec + --vo=opengl with current mpv. I guess you have a config or something that changes the mpv defaults? If so, maybe specify what they are, though I don't think I can reproduce with TONGA using amdgpu anyway.

Correct, I have an mpv config with:

hwdec=vdpau
hwdec-codecs=all
vo=opengl-hq

The rest shouldn't matter, I believe.

I tried going back to the commit of 11.0 (so with llvm 3.7) but I still got the issue. I'd guess the bug is in the kernel, not mesa, because I don't think I've had the issue for that long. I could be wrong though.

Thankfully, having amdgpu working with SI gave me something else to try. So I ran the same script with amdgpu instead of radeon (alas, also with a 4.9 kernel instead of a 4.8...), back on the latest code from mesa's git: the script ran for 2 hours before I killed it. Since 2 hours is not that much longer than the maximum of half an hour before a crash that I've seen so far, I won't say that's it yet. I'll run the script overnight again, and if it still doesn't crash, that should be good enough to know.

John, please double check that you are actually correctly installing VDPAU. E.g. add something like "while(1);" into the VDPAU driver create function or something like this. vdp_imp_device_create_x11() would be a good place for that.

That it's a kernel issue came to my mind as well, but we haven't changed anything on UVD in the radeon module in quite a while. So this is a bit unlikely.

> John, please double check that you are actually correctly installing VDPAU.
> E.g. add something like "while(1);" into the VDPAU driver create function or something like this. vdp_imp_device_create_x11() would be a good place for that.
Alright, I've just tried that and mpv seems to be waiting, with no error in its output nor in dmesg, which is what should be expected, I guess. Please tell me if you can think of anything else I can test.

> That it's a kernel issue came to my mind as well, but we haven't changed anything on UVD in the radeon module in quite a while. So this is a bit unlikely.

I looked quickly at radeon_uvd.c's history and there were a few changes in the time period I'd think of. Based on the date, I'd guess possibly either of the kernel commits on 2016-05-05, probably nothing later, and anything before is too far back. Is amdgpu using the same firmware as radeon for SI? If not, maybe that's another candidate for the culprit.

(In reply to John from comment #10)
> Correct, I have an mpv config with:
>
> hwdec=vdpau
> hwdec-codecs=all
> vo=opengl-hq
>
> The rest shouldn't matter I believe.

If your mpv is not too old then there is an issue with vo=opengl-hq + hwdec that means you only get half vrez. May or may not affect this issue - I don't know.

https://bugs.freedesktop.org/show_bug.cgi?id=97988

Do you crash with vo=vdpau with radeon?

I actually had to use -vo vdpau when I tried against mesa 11.0 (somehow mpv didn't work with ogl-hq on that version), so I know it is problematic. But the bug you link is still interesting to me, as I was wondering why my movies looked aliased lately, so thanks for that!

amdgpu does not support UVD or VCE on SI parts yet.

> amdgpu does not support UVD or VCE on SI parts yet.
Oh, I should have verified that in dmesg.
Sorry about that.
What should I try next?
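The dmesg check mentioned here is easy to script. The sketch below is only an illustration: it filters kernel log text for lines that mention UVD or VCE, so their absence (as with amdgpu on SI parts at the time) is visible at a glance. The helper name and the idea of matching on the bare words "UVD"/"VCE" are assumptions, not something taken from the attachments, and reading dmesg may require elevated privileges on some systems.

```python
import re
import subprocess


def uvd_lines(log_text):
    """Return the kernel log lines that mention UVD or VCE (case-insensitive)."""
    pattern = re.compile(r"\b(uvd|vce)\b", re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]


def read_dmesg():
    """Capture the current kernel log as text."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


if __name__ == "__main__":
    matches = uvd_lines(read_dmesg())
    if matches:
        print("\n".join(matches))
    else:
        print("no UVD/VCE lines found: hardware decode is probably not in use")
```

An empty result with amdgpu and a non-empty one with radeon would have shown right away that the 2-hour amdgpu run was not exercising hardware decoding.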
(In reply to John from comment #18)
> What should I try next?

Installing an older kernel, see if that works with 12.0 mesa. If yes, we have narrowed it down to the kernel; if not, we need to dig a bit more into mesa.

Another possibility which came to my mind is that this might not be an issue with UVD decoding, but rather with presenting it. E.g. install both VDPAU and OpenGL from a certain Mesa version *AND* make sure that you restart X after that, so that the X acceleration uses the new library versions as well.

> Installing an older kernel, see if that works with 12.0 mesa.
> If yes we have narrowed it down to the kernel, if not we
> need to stick a bit more into mesa.

I've tried with a 3.18 kernel and still got the issue, so the issue is not in the kernel. I used the firmware files from that date as well, to eliminate that possibility.

> Another possibility which came to my mind is that this might not
> we an issue with UVD decoding, but rather presenting it.
> E.g. install both VDPAU and OpenGL from a certain Mesa version
> *AND* make sure that you restart X after that so that the
> X acceleration uses the new library versions as well.

Now this is interesting, as my reboots were only ever post-freeze, never done in order to test a certain mesa version. I've rolled back to 11 and restarted the computer and will try. Since you mentioned presenting, could it be the DDX?

New information: I don't need to have the video on screen for the issue to happen. I can alt-tab or switch to another virtual desktop while the script runs and it still freezes.

Sorry for the late update, I wanted to test a few more things first. So I went back to mesa 11 and rebooted before testing: no difference. I tried reverting to the DDX from back then, and disabling DRI3 (which I don't think the DDX supported anyway), and still no difference. Then I thought a bit more about what Andy wrote, updated mesa back to latest git with a 4.8 kernel, but downgraded mpv to 0.10.0 (about the same date as mesa 11).
Now I was able to do a 4-hour run without any issue, and then a 7-hour run, still without issues. So maybe this is it after all. I'll try one last, longer run, and then maybe try bisecting mpv. Should I keep posting results here, or would that be more of an mpv issue?

Took me a while, but it seems to come from:

commit 6b22b216514ee2eb784711f4539410d3b312a4fd
Author: wm4 <wm4@nowhere>
Date: Mon Nov 16 16:22:23 2015 +0100

vo_opengl: attempt to improve GLX vs. EGL backend detection

For the sake of vaapi interop, we want to use EGL, but on the other hand, but because driver developers are full of shit, vdpau interop will not work on EGL (even if the driver supports EGL). The latter happens with both nvidia and AMD Mesa drivers.

Additionally, EGL vaapi interop support can apparently only detected at runtime by actually using it. While hwdec_vaegl.c already does this, it would require initializing libva on _every_ system, which will cause libav to print an unpreventable bullshit message to the terminal.

Try to counter these huge loads of bullshit by adding more fucking bullshit.
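The bisection used throughout this thread (first on mesa, then on mpv) is a binary search over an ordered history. A minimal sketch of the idea follows; `first_bad` and `is_bad` are hypothetical names, and in practice `is_bad` would mean "build that revision, run the stress script for several hours, and record whether the machine froze", which is the expensive part.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit.

    commits: revisions ordered oldest to newest, where commits[0] is known
             good and commits[-1] is known bad.
    is_bad:  callable returning True if the given revision shows the bug.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the bug was introduced at mid or earlier
        else:
            good = mid  # the bug was introduced after mid
    return commits[bad]


if __name__ == "__main__":
    # Toy history: revisions r0..r9, with the bug introduced in r6.
    history = [f"r{i}" for i in range(10)]
    print(first_bad(history, lambda rev: int(rev[1:]) >= 6))
```

Because the freeze is intermittent (anywhere from seconds to about half an hour in the runs reported above), a "good" verdict is only as trustworthy as the soak time behind it, and a single premature "good" sends the search into the wrong half. That would also explain why the earlier mesa bisect "got nowhere": the assumed good endpoint still froze.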
Well, now I get the same problem in Kodi as well :/

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1238.