Bug 91322

Summary: Usage of 'gallium' vaapi driver crashes radeon with inability to reset itself and scary pictures as if card has burned out
Product: Mesa
Reporter: Sergey Kondakov <virtuousfox>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact: Default DRI bug account <dri-devel>
Severity: major    
Priority: medium    
Version: 10.6   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments: Xorg.0.log.old
dmesg-from-crash-on-flash-video-with-vaapi-gallium-#1
dmesg-from-crash-on-flash-video-with-vaapi-gallium-#2
dmesg-from-X-hang
mpv-vdpau-via-va_gl=X-hang
mpv-vaapi=slideshow
mpv-vdpau=slideshow

Description Sergey Kondakov 2015-07-13 06:23:29 UTC
Created attachment 117082 [details]
Xorg.0.log.old

If I try to play any video anywhere with 'LIBVA_DRIVER_NAME=gallium' set (I've tested mpv/bomi and freshplayerplugin/PPAPI-flash/Firefox, and the same plus VDPAU_DRIVER=va_gl), then after a few seconds the picture goes dark, shows a mix of chessboard and white noise for a while, and then either blanks out or hangs on the last normal frame without the card even resetting itself. The keyboard is locked during this too; fortunately the "shutdown" button on the chassis is still recognized by the system.

With 'VDPAU_DRIVER=r600' decoding runs at about 2 frames per second, but at least it doesn't seem to kill the card. And I'm not sure whether gst-omx + bellagio + the OMX state tracker really does anything anywhere.

`inxi -F`
System:    Host: arsenal.patriots Kernel: 4.1.1-7.gcac28b3-desktop x86_64 (64 bit) Desktop: LXQt
           Distro: Hackeurs Sans Frontieres (openSUSE 13.2)
Machine:   Mobo: Gigabyte model: GA-990XA-UD3 Bios: Award v: F14e date: 09/09/2014
CPU:       Hexa core AMD FX-6100 Six-Core (-MCP-) cache: 12288 KB 
           clock speeds: max: 4026 MHz 1: 4026 MHz 2: 4026 MHz 3: 4026 MHz 4: 4026 MHz 5: 4026 MHz 6: 4026 MHz
Graphics:  Card: Advanced Micro Devices [AMD/ATI] Barts XT [Radeon HD 6870]
           Display Server: X.Org 1.17.2 driver: radeon Resolution: 1920x1080@60.00hz
           GLX Renderer: Gallium 0.4 on AMD BARTS GLX Version: 3.0 Mesa 10.6.1
Audio:     Card-1 Advanced Micro Devices [AMD/ATI] Barts HDMI Audio [Radeon HD 6800 Series] driver: snd_hda_intel
           Card-2 Advanced Micro Devices [AMD/ATI] SBx00 Azalia (Intel HDA) driver: snd_hda_intel
           Sound: Advanced Linux Sound Architecture v: k4.1.1-7.gcac28b3-desktop
Network:   Card-1: Qualcomm Atheros AR5418 Wireless Network Adapter [AR5008E 802.11(a)bgn] (PCI-Express)
           driver: ath9k
           IF: wlp3s0 state: down mac: <censored>
           Card-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller driver: r8169
           IF: enp4s0 state: up speed: 100 Mbps duplex: full mac: <censored>
Drives:    HDD Total Size: 5879.2GB (82.4% used) ID-1: /dev/sda model: SAMSUNG_MZ7TD128 size: 128.0GB
           ID-2: /dev/sdb model: WDC_WD2500AAJS size: 250.1GB ID-3: /dev/sdd model: WDC_WD15EARS size: 1500.3GB
           ID-4: /dev/sde model: WDC_WD10EADS size: 1000.2GB ID-5: /dev/sdf model: WDC_WD10EADS size: 1000.2GB
           ID-6: /dev/sdc model: WDC_WD20EARS size: 2000.4GB
Partition: ID-1: / size: 25G used: 6.8G (28%) fs: btrfs dev: /dev/sda3
           ID-2: /boot size: 976M used: 97M (11%) fs: ext4 dev: /dev/sda2
           ID-3: /home size: 50G used: 29G (60%) fs: ext4 dev: /dev/sda4
           ID-4: swap-1 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram0
           ID-5: swap-2 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram1
           ID-6: swap-3 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram2
           ID-7: swap-4 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram3
           ID-8: swap-5 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram4
           ID-9: swap-6 size: 0.34GB used: 0.00GB (0%) fs: swap dev: /dev/zram5
Sensors:   System Temperatures: cpu: 39.0C mobo: 34.0C gpu: 42.0
           Fan Speeds (in rpm): cpu: 1061 fan-1: 1869 fan-3: 1240 fan-5: 1558
Info:      Processes: 302 Uptime: 0:18 Memory: 2860.2/7895.6MB Client: Shell (zsh) inxi: 2.2.25

`dmesg | grep -i drm`
[    3.278456] [drm] Initialized drm 1.1.0 20060810
[    3.291615] [drm] radeon kernel modesetting enabled.
[    3.294444] fb: switching to radeondrmfb from EFI VGA
[    3.297693] [drm] initializing kernel modesetting (BARTS 0x1002:0x6738 0x1787:0x2305).
[    3.297710] [drm] register mmio base: 0xFD5C0000
[    3.297714] [drm] register mmio size: 131072
[    3.298273] [drm] Detected VRAM RAM=1024M, BAR=256M
[    3.298278] [drm] RAM width 256bits DDR
[    3.298537] [drm] radeon: 1024M of VRAM memory ready
[    3.298541] [drm] radeon: 1024M of GTT memory ready.
[    3.298556] [drm] Loading BARTS Microcode
[    3.298640] [drm] Internal thermal controller with fan control
[    3.303816] [drm] radeon: dpm initialized
[    3.303912] [drm] GART: num cpu pages 262144, num gpu pages 262144
[    3.304861] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[    3.339086] [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
[    3.340717] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    3.340722] [drm] Driver supports precise vblank timestamp query.
[    3.340822] [drm] radeon: irq initialized.
[    3.357473] [drm] ring test on 0 succeeded in 3 usecs
[    3.357499] [drm] ring test on 3 succeeded in 5 usecs
[    3.534484] [drm] ring test on 5 succeeded in 2 usecs
[    3.534497] [drm] UVD initialized successfully.
[    3.534877] [drm] ib test on ring 0 succeeded in 0 usecs
[    3.534934] [drm] ib test on ring 3 succeeded in 0 usecs
[    4.185851] [drm] ib test on ring 5 succeeded
[    4.187501] [drm] Radeon Display Connectors
[    4.187515] [drm] Connector 0:
[    4.187525] [drm]   DP-1
[    4.187535] [drm]   HPD4
[    4.187546] [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[    4.187565] [drm]   Encoders:
[    4.187575] [drm]     DFP1: INTERNAL_UNIPHY2
[    4.187586] [drm] Connector 1:
[    4.187595] [drm]   DP-2
[    4.187604] [drm]   HPD5
[    4.187615] [drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[    4.187633] [drm]   Encoders:
[    4.187642] [drm]     DFP2: INTERNAL_UNIPHY2
[    4.187653] [drm] Connector 2:
[    4.187663] [drm]   HDMI-A-1
[    4.187672] [drm]   HPD3
[    4.187683] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[    4.187701] [drm]   Encoders:
[    4.187710] [drm]     DFP3: INTERNAL_UNIPHY1
[    4.187721] [drm] Connector 3:
[    4.187749] [drm]   DVI-D-1
[    4.187759] [drm]   HPD1
[    4.187770] [drm]   DDC: 0x6480 0x6480 0x6484 0x6484 0x6488 0x6488 0x648c 0x648c
[    4.187789] [drm]   Encoders:
[    4.187798] [drm]     DFP4: INTERNAL_UNIPHY1
[    4.187809] [drm] Connector 4:
[    4.187819] [drm]   DVI-I-1
[    4.187828] [drm]   HPD6
[    4.187838] [drm]   DDC: 0x6470 0x6470 0x6474 0x6474 0x6478 0x6478 0x647c 0x647c
[    4.187857] [drm]   Encoders:
[    4.187867] [drm]     DFP5: INTERNAL_UNIPHY
[    4.187877] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[    4.277370] [drm] fb mappable at 0xD0475000
[    4.277385] [drm] vram apper at 0xD0000000
[    4.277396] [drm] size 8294400
[    4.277406] [drm] fb depth is 24
[    4.277416] [drm]    pitch is 7680
[    4.277569] fbcon: radeondrmfb (fb0) is primary device
[    4.352076] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[    4.358652] [drm] Initialized radeon 2.42.0 20080528 for 0000:01:00.0 on minor 0
[  207.549351] [drm:btc_dpm_set_power_state [radeon]] *ERROR* rv770_restrict_performance_levels_before_switch failed
[  243.388271] [drm:btc_dpm_set_power_state [radeon]] *ERROR* rv770_restrict_performance_levels_before_switch failed
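
The two lines at the end of the log above are the only ones carrying the kernel's `*ERROR*` marker. A minimal sketch of a filter that pulls such lines out of a longer log follows; the function name `drm_errors` is made up for this example and it reads the log from standard input.

```shell
# Keep only kernel log lines that carry the "*ERROR*" marker,
# as seen in the two btc_dpm_set_power_state lines above.
# grep -F treats the pattern as a fixed string, so the asterisks are literal.
drm_errors() {
    grep -F '*ERROR*'
}
```

Typical use would be `dmesg | drm_errors`, or `drm_errors < saved-dmesg.txt` for a saved log.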
Comment 1 Alex Deucher 2015-07-13 14:01:28 UTC
Please attach your full dmesg output.  Can you also provide the application log from and options used to decode?
Comment 2 Sergey Kondakov 2015-07-14 16:01:16 UTC
Created attachment 117109 [details]
dmesg-from-crash-on-flash-video-with-vaapi-gallium-#1
Comment 3 Sergey Kondakov 2015-07-14 16:02:08 UTC
Created attachment 117110 [details]
dmesg-from-crash-on-flash-video-with-vaapi-gallium-#2

Same situation as the previous one, but with a new message at the end.
Comment 4 Sergey Kondakov 2015-07-14 16:03:29 UTC
Created attachment 117111 [details]
dmesg-from-X-hang

Not a crash but a hanging X. It happened while running the mpv video player with va_gl->vdpau, not the Flash plugin as in the case of the full crash.
Comment 5 Sergey Kondakov 2015-07-14 16:04:03 UTC
Created attachment 117112 [details]
mpv-vdpau-via-va_gl=X-hang
Comment 6 Sergey Kondakov 2015-07-14 16:05:13 UTC
Created attachment 117113 [details]
mpv-vaapi=slideshow

Using vaapi=gallium or vdpau=r600 results in a pure slideshow, but no crash this time around.
Comment 7 Sergey Kondakov 2015-07-14 16:05:32 UTC
Created attachment 117114 [details]
mpv-vdpau=slideshow

Using vaapi=gallium or vdpau=r600 results in a pure slideshow, but no crash this time around.
Comment 8 Sergey Kondakov 2015-07-14 16:08:41 UTC
(In reply to Alex Deucher from comment #1)
> Please attach your full dmesg output.  Can you also provide the application
> log from and options used to decode?

For some reason I couldn't reproduce the complete crash with a player this time, but I've managed to consistently and immediately crash the driver by playing http://thedailyshow.cc.com/videos/vbp8i5/living-in-denali in Firefox with chromium-pepper-flash-18.0.0.160 + freshplayerplugin-0.3.1 while its "enable_hwdec" option is set. All with LIBVA_DRIVER_NAME=gallium.

However, I also managed to hang X (the picture is stuck, but the zapping combination works and brings a new X session back) by running 'env LIBVA_DRIVER_NAME=gallium VDPAU_DRIVER=va_gl mpv -vo opengl --hwdec=vdpau', which needs https://github.com/i-rinat/libvdpau-va-gl from the creator of freshplayerplugin.

Here are some logs.
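
For reference, the player configurations discussed in this report can be summarized as command lines. This is only a sketch: the function `hwdec_cmd` and its configuration names are invented for this example, and only the `vdpau-via-va_gl` line is quoted from the comment above. The other two are reconstructed from the environment variables named in the report and assume mpv's `--hwdec=vaapi` and `--hwdec=vdpau` options. The function prints the command instead of running it.

```shell
# Print (do not run) the mpv command line for one test configuration.
# Configuration names are hypothetical labels for this sketch.
hwdec_cmd() {
    case "$1" in
        vaapi-gallium)
            # Reconstructed: VA-API through the gallium driver.
            echo "env LIBVA_DRIVER_NAME=gallium mpv -vo opengl --hwdec=vaapi" ;;
        vdpau-r600)
            # Reconstructed: native VDPAU through the r600 driver.
            echo "env VDPAU_DRIVER=r600 mpv -vo opengl --hwdec=vdpau" ;;
        vdpau-via-va_gl)
            # Quoted from the report: VDPAU wrapped onto VA-API by libvdpau-va-gl.
            echo "env LIBVA_DRIVER_NAME=gallium VDPAU_DRIVER=va_gl mpv -vo opengl --hwdec=vdpau" ;;
        *)
            echo "unknown configuration: $1" >&2
            return 1 ;;
    esac
}
```

Appending a video file name to the printed line gives the command to try for that configuration.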
Comment 9 Sergey Kondakov 2015-07-14 16:26:10 UTC
Ah, and I don't know whether it's normal, or whether it pertains to this issue (maybe it confuses something else in the driver), but I can't seem to get vsync working. Earlier I used "EXAVsync" and that was that. Now even that doesn't help, and I'm getting output like:
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
20074 frames in 5.0 seconds = 4014.652 FPS

It doesn't work under either EXA or glamor. I don't quite know when it changed.
It does work, however, with a bleeding-edge X stack + mesa 10.7~git and the "TearFree" option, but with it
1) clock speed and voltage are always at maximum
2) mpv/bomi slowly desynchronize audio and video
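
The vsync symptom above can be stated as a simple check: a synchronized run should report a frame rate close to the 60 Hz refresh rate shown in the `inxi` output, whereas 4014.652 FPS is far off. A rough sketch of that comparison follows; the function name `vsync_ok` and the 5 % tolerance are arbitrary choices for this example, not something taken from the report.

```shell
# Succeed when a measured frame rate is within 5 % of the refresh rate.
# $1 = measured FPS, $2 = monitor refresh rate in Hz.
# The 5 % tolerance is an assumption made for this sketch.
vsync_ok() {
    awk -v fps="$1" -v hz="$2" 'BEGIN {
        diff = fps - hz
        if (diff < 0) diff = -diff
        # awk exit status: 0 when within tolerance, 1 otherwise
        exit !(diff <= hz * 0.05)
    }'
}
```

With the numbers from this comment, `vsync_ok 4014.652 60` fails, which matches the observation that the frame rate is not being limited to the refresh rate.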
Comment 10 Christian König 2015-07-14 16:45:44 UTC
Well first of all try to avoid the VDPAU wrapper for VA-API or the VA-API backend for VDPAU. Those transition layers seem to have the tendency to add quite a bit of instability.

If you can try to use the native VDPAU implementation. We only did the VA-API implementation as a drop in replacement when the user has no other choice than to use VA-API.

So can you reproduce the issues with VDPAU as well, or is that limited to VA-API only?
Comment 11 Sergey Kondakov 2015-07-14 17:01:04 UTC
(In reply to Christian König from comment #10)
> Well first of all try to avoid the VDPAU wrapper for VA-API or the VA-API
> backend for VDPAU. Those transition layers seem to have the tendency to add
> quite a bit of instability.

It would be easier if there was only one damn standard for GPU decoding. I can do this for myself, but I would also like to make a livecd where it works to some degree automatically. LIBVA_DRIVER_NAME=gallium + VDPAU_DRIVER=va_gl would solve that perfectly. Or rather, it would work around the fact that those libs need to learn to autodetect the appropriate implementation, and that either there has to be only one of them or one has to use a wrapper.

And don't get me started on what I had to do to make gst-omx load with the mesa ST. I still haven't tested that. But no apps use it anyway.

> If you can try to use the native VDPAU implementation. We only did the
> VA-API implementation as a drop in replacement when the user has no other
> choice than to use VA-API.
> 
> So can you reproduce the issues with VDPAU as well, or is that limited to
> VA-API only?

The crash? No. No crash or X hang on pure vdpau=r600, BUT it's quite useless anyway since it only ever shows a slideshow: the player's stats show no more than ~10 frames (when it's ~150 on CPU), but on screen it looks more like one per second.

And the driver dying like that still can't be appropriate.
Comment 12 GitLab Migration User 2019-09-18 19:19:27 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/551.
