Bug 95282 - G96: system hang on video playback via vdpau
Summary: G96: system hang on video playback via vdpau
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Nouveau Project
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-05 09:21 UTC by Elmar Stellnberger
Modified: 2018-11-17 10:39 UTC

Attachments
screenshot of hanging system on start of playback (448.00 KB, image/jpeg)
2016-05-05 09:23 UTC, Elmar Stellnberger
screenshot of backtrace generated by Alt-PrnScr-C (504.00 KB, image/jpeg)
2016-05-05 09:24 UTC, Elmar Stellnberger
journal of the day (574.55 KB, text/plain)
2016-05-05 09:25 UTC, Elmar Stellnberger
Xorg.0.log: nothing in here (42.49 KB, text/plain)
2016-05-05 09:25 UTC, Elmar Stellnberger
dmesg while trying to play with vdpau by xine (4.80 KB, text/plain)
2016-09-06 15:25 UTC, Elmar Stellnberger
screenshot trying to play an iso6avc1mp41 file - hanging on startup (414.00 KB, image/jpeg)
2017-02-16 17:34 UTC, Elmar Stellnberger

Description Elmar Stellnberger 2016-05-05 09:21:24 UTC
Yesterday I installed the proprietary firmware for nouveau as described under wiki/VideoAcceleration/#firmware, and now my machine hangs reliably at the start of video playback. If vdpau is not used (i.e. by specifying xine -V XShm xy.mkv), it does not hang. At first the system freezes, then there is a short moment in which it reacts to Num Lock and playback continues, and then it hangs for good.
  When switching to vt02 before playback starts you can see several messages. The system also reacts to Alt-PrnScr-l/m/p by announcing that it will now print the memory/registers/backtraces, but it does not actually print them unless you press Alt-PrnScr-C. There appears to be nothing in the log ("switched to vt02" is the last message in the Xorg.0.log).

Current Operating System: Linux AmiloXi3650 4.6.0-rc6-ARCH-00006-g7d92f59 #8 SMP PREEMPT Tue May 3 14:17:04 CEST 2016 x86_64
Xorg Build Date: 05 April 2016  05:24:02PM
Comment 1 Elmar Stellnberger 2016-05-05 09:23:00 UTC
Created attachment 123489 [details]
screenshot of hanging system on start of playback
Comment 2 Elmar Stellnberger 2016-05-05 09:24:27 UTC
Created attachment 123490 [details]
screenshot of backtrace generated by Alt-PrnScr-C

Unfortunately, none of that seems to have made it into the logs.
Comment 3 Elmar Stellnberger 2016-05-05 09:25:12 UTC
Created attachment 123491 [details]
journal of the day
Comment 4 Elmar Stellnberger 2016-05-05 09:25:36 UTC
Created attachment 123492 [details]
Xorg.0.log: nothing in here
Comment 5 Ilia Mirkin 2016-05-06 16:12:06 UTC
Interesting. Looks like we messed up and somehow a rendertarget was not mapped into vram, and then the bsp/vp engines hung (and I never figured out how to reset them).

Does xine use GL + VDPAU in separate threads? If so, that could definitely be the cause. I know kodi does that, and I think mpv does as well. mplayer is what I developed VP2 support with, and was pretty reliable.
Comment 6 Elmar Stellnberger 2016-05-06 17:18:40 UTC
Unfortunately I don't know; perhaps ask one of the xine developers.
Comment 7 poma 2016-05-09 17:36:16 UTC
VDPAU enabled in xine-ui:
...
# video driver to use
# { auto  vdpau  aadxr3  dxr3  xv  vaapi  raw  opengl2  opengl  aa  xshm  caca  none  xxmc  sdl  fb  xvmc }, default: 0
video.driver:vdpau

...
# vdpau: color of none video area in output window
# numeric, default: 0
#video.output.vdpau_background_color:0

# vdpau: HD deinterlace method
# { bob  half temporal  temporal }, default: 2
#video.output.vdpau_hd_deinterlace_method:temporal

# vdpau: disable deinterlacing when progressive_frame flag is set
# bool, default: 0
#video.output.vdpau_honor_progressive:0

# vdpau: SD deinterlace method
# { bob  half temporal  temporal }, default: 2
#video.output.vdpau_sd_deinterlace_method:temporal

# vdpau: restrict enabling video properties for SD video only
# { none  noise  sharpness  noise+sharpness }, default: 0
#video.output.vdpau_sd_only_properties:none

# vdpau: disable advanced deinterlacers chroma filter
# bool, default: 0
#video.output.vdpau_skip_chroma_deinterlace:0

...
# default length of display queue
# numeric, default: 3
#video.output.vdpau_display_queue_length:3

# maximum number of output surfaces buffered for reuse
# numeric, default: 10
#video.output.vdpau_output_surface_buffer_size:10

...
# priority for vdpau_h264 decoder
# numeric, default: 0
#engine.decoder_priorities.vdpau_h264:0

# priority for vdpau_h264_alter decoder
# numeric, default: 0
#engine.decoder_priorities.vdpau_h264_alter:0

# priority for vdpau_mpeg12 decoder
# numeric, default: 0
#engine.decoder_priorities.vdpau_mpeg12:0

# priority for vdpau_mpeg4 decoder
# numeric, default: 0
#engine.decoder_priorities.vdpau_mpeg4:0

# priority for vdpau_vc1 decoder
# numeric, default: 0
#engine.decoder_priorities.vdpau_vc1:0

...
Comment 8 poma 2016-05-09 17:42:28 UTC
https://kodi.tv/media-samples
Tested with which of these media samples?
Comment 9 poma 2016-05-09 17:58:19 UTC
Besides, it is enough to just run the application itself, without starting the video,

$ xine
This is xine (X11 gui) - a free video player v0.99.9.
(c) 2000-2014 The xine Team.
vo_vdpau: vdpau API version : 1
vo_vdpau: vdpau implementation description : G3DVL VDPAU Driver Shared Library version 1.0
vo_vdpau: maximum video surface size for chroma type 4:2:2 is 8192x8192
vo_vdpau: maximum video surface size for chroma type 4:2:0 is 8192x8192
vo_vdpau: maximum output surface size is 8192x8192
vo_vdpau: hold a maximum of 10 video output surfaces for reuse
vo_vdpau: using 3 output surfaces of size 1920x1080 for display queue
vo_vdpau: this hardware doesn't support mpeg4-part2.
vdpau_set_property: property=1, value=0
vo_vdpau: deinterlace: none
vo_vdpau: set_scaling_level=0
vo_vdpau: disable noise reduction.
vo_vdpau: disable sharpness.
vo_vdpau: skip_chroma = 0


... and this shows up in the kernel log:
$ dmesg -t
...
nouveau 0000:02:00.0: gr: TRAP_PROP - TP 0 - 00000040 [RT_FAULT] - Address 0041a80000
nouveau 0000:02:00.0: gr: TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 00800080, e20: 00001800, e24: 00030000
nouveau 0000:02:00.0: gr: 00200000 [] ch 5 [001f7bd000 xine[8158]] subc 3 class 8297 mthd 1558 data 00000001
nouveau 0000:02:00.0: fb: trapped write at 0041a80000 on channel 5 [1f7bd000 xine[8158]] engine 00 [PGRAPH] client 0b [PROP] subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]


-NVIDIA G98-
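
For anyone comparing these fault reports across machines, the "fb: trapped write" line can be split into its fields mechanically. The following is only a sketch in Python: the regular expression is modelled on the one message format quoted above, and the names TRAP_RE and parse_trap are made up for illustration. Other kernel versions may print the message differently.

```python
import re

# Pattern modelled on the "fb: trapped write" line quoted in this report.
# Assumption: the field order and bracket layout match that single sample.
TRAP_RE = re.compile(
    r"nouveau (?P<pci>[0-9a-f:.]+): fb: trapped (?P<access>read|write) "
    r"at (?P<addr>[0-9a-f]+) on channel (?P<chan>\d+) "
    r"\[(?P<inst>[0-9a-f]+) (?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>\w*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>\w*)\] "
    r"subclient (?P<subclient_id>[0-9a-f]+) \[(?P<subclient>\w*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>\w*)\]"
)


def parse_trap(line):
    """Return the fields of a trapped-access line as a dict, or None."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields; the address is printed in hex without "0x".
    fields["addr"] = int(fields["addr"], 16)
    fields["chan"] = int(fields["chan"])
    fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    sample = (
        "nouveau 0000:02:00.0: fb: trapped write at 0041a80000 on channel 5 "
        "[1f7bd000 xine[8158]] engine 00 [PGRAPH] client 0b [PROP] "
        "subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]"
    )
    trap = parse_trap(sample)
    print(trap["proc"], hex(trap["addr"]), trap["subclient"], trap["reason"])
```

Feeding it the output of dmesg -t line by line makes it easy to see whether two reports hit the same engine, subclient and fault reason.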
Comment 10 Elmar Stellnberger 2016-05-13 19:19:06 UTC
Could anyone perhaps have a look at Bug 95390 and check whether it could also be related to nouveau?
Comment 11 Elmar Stellnberger 2016-05-18 09:08:00 UTC
Much better with 4.6.0-ARCH-00466-ge80ac9b. However, I still get a hang sometimes; I will have to check whether the occasional hangs really stem from video playback ...
Comment 12 Elmar Stellnberger 2016-05-24 15:42:01 UTC
System/kernel hangs still occur about 20-30% of the time, but only when xine initializes playback. Once it has successfully started to play, it continues until it finishes (4.6.0-ARCH-00466-ge80ac9b). The bug still depends on vdpau, since there are no hangs when starting xine with -V XShm.
Comment 13 Elmar Stellnberger 2016-06-04 14:47:19 UTC
  Worse than ever with 4.7.0-rc1-ARCH-11985-gf758c64. Now xine has crashed 3 out of 3 times on startup using vdpau for video playback. Last known better was 4.6.0-ARCH-00466-ge80ac9b.
Comment 14 Elmar Stellnberger 2016-06-13 11:00:15 UTC
Same with 4.7.0-rc3-ARCH-12588-g39543dd. Any video playback with xine/vdpau hangs the whole system at the point where playback should start.
Comment 15 Elmar Stellnberger 2016-07-30 10:53:11 UTC
  Same problem with 4.7.0-ARCH-13902-gd491e80. Immediate crash on starting xine video playback when started with vdpau; the only escape is via SysRQ-keys. Anyone here who would care?
Comment 16 Elmar Stellnberger 2016-08-29 16:36:41 UTC
still 100%-hanging on start of playback with 4.8.0-rc4-ARCH-00609-g6db4082 - not even NumLock works - hard reset necessary.
Comment 17 Elmar Stellnberger 2016-09-06 15:25:00 UTC
Created attachment 126248 [details]
dmesg while trying to play with vdpau by xine

 There is not much in the logs except:
nouveau 0000:01:00.0: bsp: Watchdog interrupt, engine hung.

 tested with kernel 9ca581b50dab6103183396852cc08e440fcda18e and a freshly downloaded firmware.

  What can I do to get more meaningful debug information? Is vdpau playback an issue that is being worked on?
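
One low-effort way to make such reports easier to compare across kernels is to pull only the nouveau error lines out of a saved dmesg or journal dump. The Python sketch below does that. The marker strings are simply the message fragments quoted in this report, not a complete list of nouveau errors, and the names HANG_MARKERS, NOUVEAU_RE and find_hang_lines are hypothetical.

```python
import re

# Message fragments copied from kernel lines quoted in this bug report.
# Assumption: this list is illustrative, not exhaustive.
HANG_MARKERS = (
    "Watchdog interrupt, engine hung",
    "TRAP_PROP",
    "trapped write",
)

# Matches "nouveau <pci address>: <unit>: <message>" anywhere in a log line,
# so lines with a leading "[ timestamp]" are handled as well.
NOUVEAU_RE = re.compile(
    r"nouveau (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): "
    r"(?P<unit>\w+): (?P<msg>.*)"
)


def find_hang_lines(log_text):
    """Return (unit, message) pairs for nouveau lines containing a marker."""
    hits = []
    for line in log_text.splitlines():
        match = NOUVEAU_RE.search(line)
        if match and any(marker in match.group("msg") for marker in HANG_MARKERS):
            hits.append((match.group("unit"), match.group("msg").strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for unit, msg in find_hang_lines(sys.stdin.read()):
        print(f"{unit}: {msg}")
```

This does not add any new debug information by itself; it only narrows a long log down to the lines that name the hung or faulting engine.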
Comment 18 David Kredba 2017-01-18 21:51:35 UTC
The same with VLC-2.2.4 (DVB-T stream), VDPAU, kernel 4.9.4, mesa-13.0.3, xf86-video-nouveau-1.0.13. Any web browser crashes KDE Plasma / the Xorg server the same way: it does not react to anything, or it takes ages to react. Mpv has been absolutely stable so far using the opengl-hq profile.
Comment 19 Elmar Stellnberger 2017-02-16 17:30:12 UTC
VDPAU playback is still crashing / not implemented with 4.10.0-rc8+ (Wed Feb 15 13:10:17 CET 2017) for the NV50 family (tested with G96GLM [Quadro FX 770M]). As vdpau support for this card is on your to-do list at nouveau.freedesktop.org/wiki/VideoAcceleration, at least for the iso6avc1mp41 and H.264 codecs, I tested videos with these codecs, and both crashed with xine -V vdpau right after xine started, having completed some init steps first. I will add a screenshot of these messages shortly.
Comment 20 Elmar Stellnberger 2017-02-16 17:34:43 UTC
Created attachment 129680 [details]
screenshot trying to play an iso6avc1mp41 file - hanging on startup

You can see some messages, including one saying that avc1 has not yet been implemented. Still, I wonder why it crashes if it cannot use VDPAU directly for video decoding. Although I ran nice -20 dmesg --follow in another terminal at the same time, no messages appeared while xine was hanging the system. Quite a while after startup the music (and only the music) played briefly, but that was all that happened after the given messages appeared on the xine console. Escaping with SysRq-S-U-B was still possible.
Comment 21 Elmar Stellnberger 2017-02-16 17:37:53 UTC
  Is there anyone here who could assess whether the hang problem with vdpau could be related to GART/IOMMU? Here is what I have in my dmesg:

[    5.186566] Linux agpgart interface v0.103
[    5.186765] AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    5.186766] AMD IOMMUv2 functionality not available on this system
[    8.857000] nouveau 0000:01:00.0: DRM: VRAM: 256 MiB
[    8.857001] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
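
To put the two DRM sizes above into perspective, here is a small Python sketch that extracts them and converts MiB to GiB. The helper name drm_sizes_gib is made up, and the pattern only covers the "DRM: VRAM/GART: N MiB" lines as they appear in this dmesg excerpt.

```python
import re

# Assumption: the sizes are always printed as "<kind>: <number> MiB",
# as in the two lines quoted above.
SIZE_RE = re.compile(
    r"nouveau [0-9a-f:.]+: DRM: (?P<kind>VRAM|GART): (?P<mib>\d+) MiB"
)


def drm_sizes_gib(dmesg_text):
    """Map 'VRAM' / 'GART' to the reported size in GiB."""
    return {
        match.group("kind"): int(match.group("mib")) / 1024
        for match in SIZE_RE.finditer(dmesg_text)
    }


if __name__ == "__main__":
    excerpt = (
        "[    8.857000] nouveau 0000:01:00.0: DRM: VRAM: 256 MiB\n"
        "[    8.857001] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB\n"
    )
    for kind, gib in drm_sizes_gib(excerpt).items():
        print(f"{kind}: {gib} GiB")
```

The GART figure works out to 1024 GiB (1 TiB), which would be more consistent with an address-space size than with physical memory. That reading is an inference from the number alone; nothing in this report confirms it.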
Comment 22 Maris Nartiss 2018-11-17 10:39:33 UTC
(In reply to poma from comment #9)
> Besides, it is enough to just run the application itself, without starting
> the video,

I confirm: starting xine is enough, without any need to play a video. I get the same error as in comment #9. Still, the error looks a bit different from the one in the log from comment #3.

G98M [Quadro NVS 160M]
sys-kernel/gentoo-sources-4.18.0
media-libs/mesa-18.2.2-r1

$ xine --verbose
This is xine (X11 gui) - a free video player v0.99.10.
Built with xine library 1.2.9 (1.2.9)
Found xine library version: 1.2.9 (1.2.9).
main: probing <vdpau> video output plugin
vo_vdpau: vdpau API version : 1
vo_vdpau: vdpau implementation description : G3DVL VDPAU Driver Shared Library version 1.0
vo_vdpau: maximum video surface size for chroma type 4:2:2 is 8192x8192
vo_vdpau: maximum video surface size for chroma type 4:2:0 is 8192x8192
vo_vdpau: maximum output surface size is 8192x8192
vo_vdpau: hold a maximum of 10 video output surfaces for reuse
vo_vdpau: using 3 output surfaces of size 1440x900 for display queue
vo_vdpau: this hardware doesn't support mpeg4-part2.
video_out_vdpau: b 0 c 128 s 128 h 0 [full range ITU-R 470 BG / SDTV]
video_out: max frames used: 1 of 22
video_out: early wakeups: 1 of 127
video_out: max frames used: 0 of 22
video_out: early wakeups: 0 of 128

$ dmesg -t
WARNING: CPU: 0 PID: 28 at drivers/gpu/drm/nouveau/nvif/vmm.c:71 nvif_vmm_put+0x65/0x70 [nouveau]
Modules linked in: ctr ccm uvcvideo videobuf2_vmalloc nouveau videobuf2_memops videobuf2_v4l2 videodev videobuf2_common i2c_algo_bit drm_kms_helper dell_smm_hwmon wmi_bmof dell_wmi iwldvm sparse_keymap coretemp hwmon snd_hda_codec_idt cfbfillrect syscopyarea kvm_intel mac80211 cfbimgblt dell_laptop kvm sysfillrect sysimgblt dell_smbios fb_sys_fops cfbcopyarea dell_wmi_descriptor dcdbas fb snd_hda_codec_generic irqbypass font sdhci_pci fbdev snd_hda_intel cqhci ttm snd_hda_codec sdhci iwlwifi pcspkr drm mmc_core firewire_ohci lpc_ich firewire_core snd_hwdep cfg80211 mfd_core crc_itu_t drm_panel_orientation_quirks vboxpci(O) snd_hda_core vboxnetadp(O) vboxnetflt(O) rfkill e1000e wmi tpm_tis video tpm_tis_core ptp tpm backlight pps_core vboxdrv(O) pcc_cpufreq
CPU: 0 PID: 28 Comm: kworker/0:1 Tainted: G           O      4.18.0-gentoo #1
Hardware name: Dell Inc. Latitude E6500                  /, BIOS A27 12/06/2011
Workqueue: events nouveau_cli_work [nouveau]
RIP: 0010:nvif_vmm_put+0x65/0x70 [nouveau]
Code: 00 00 48 89 e2 be 02 00 00 00 48 c7 04 24 00 00 00 00 48 89 44 24 08 e8 a9 e6 ff ff 85 c0 75 0a 48 c7 43 08 00 00 00 00 eb b7 <0f> 0b eb f2 e8 02 90 9f e0 66 90 53 48 83 ec 20 65 48 8b 04 25 28 
RSP: 0018:ffffc90000723de8 EFLAGS: 00010282
RAX: 00000000fffffffe RBX: ffffc90000723e10 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc90000723d58 RDI: ffffc90000723df8
RBP: ffffc90000723e40 R08: 000000000042a000 R09: 0000000000000000
R10: 0000000000000000 R11: fefefefefefefeff R12: ffff8800d10d1e60
R13: dead000000000200 R14: dead000000000100 R15: ffff8800d10d1e70
FS:  0000000000000000(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd7bd3dc000 CR3: 00000000da5b4000 CR4: 00000000000006f0
Call Trace:
 nouveau_vma_del+0x6b/0xb0 [nouveau]
 nouveau_gem_object_delete_work+0x31/0x60 [nouveau]
 nouveau_cli_work+0x71/0x100 [nouveau]
 process_one_work+0x1cf/0x3f0
 worker_thread+0x28/0x3c0
 ? process_one_work+0x3f0/0x3f0
 kthread+0x10e/0x130
 ? kthread_flush_work_fn+0x10/0x10
 ret_from_fork+0x35/0x40
---[ end trace be79f07b609be4df ]---
CE: hpet increased min_delta_ns to 20115 nsec
nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - 00000040 [RT_FAULT] - Address 0021340000
nouveau 0000:01:00.0: gr: TRAP_PROP - TP 0 - e0c: 00000000, e18: 00000000, e1c: 01000080, e20: 00001800, e24: 00030000
nouveau 0000:01:00.0: gr: 00200000 [] ch 24 [000dcf9000 xine[5772]] subc 3 class 8297 mthd 1558 data 00000001
nouveau 0000:01:00.0: fb: trapped write at 0021340000 on channel 24 [0dcf9000 xine[5772]] engine 00 [PGRAPH] client 0b [PROP] subclient 00 [RT0] reason 00000002 [PAGE_NOT_PRESENT]

