Bug 63730

Summary: UVD broken on HD5470 by "drm/radeon: raise UVD clocks only on demand"
Product: DRI
Reporter: Johannes Hirte <johannes.hirte>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: jharbestonus
Version: DRI git
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                 Flags
Possible fix                                none
Possible fix v2                             none
Possible fix v3                             none
Raise UVD clocks while booting the VCPU.    none
Possible fix                                none

Description Johannes Hirte 2013-04-19 13:50:55 UTC
Commit 6b1932bd23b6f3b3d46a2e23b2fdad4f58142b55 breaks UVD on my HD5470. This is the dmesg output with the commit applied:

Apr 19 15:32:10 localhost kernel: Linux agpgart interface v0.103
Apr 19 15:32:10 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Apr 19 15:32:10 localhost kernel: [drm] radeon kernel modesetting enabled.
Apr 19 15:32:10 localhost kernel: [drm] initializing kernel modesetting (CEDAR 0x1002:0x68E0 0x1025:0x0489).
Apr 19 15:32:10 localhost kernel: [drm] register mmio base: 0xF2100000
Apr 19 15:32:10 localhost kernel: [drm] register mmio size: 131072
Apr 19 15:32:10 localhost kernel: ATOM BIOS: Acer
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
Apr 19 15:32:10 localhost kernel: [drm] Detected VRAM RAM=512M, BAR=256M
Apr 19 15:32:10 localhost kernel: [drm] RAM width 64bits DDR
Apr 19 15:32:10 localhost kernel: [TTM] Zone  kernel: Available graphics memory: 2022516 kiB
Apr 19 15:32:10 localhost kernel: [TTM] Initializing pool allocator
Apr 19 15:32:10 localhost kernel: [TTM] Initializing DMA pool allocator
Apr 19 15:32:10 localhost kernel: [drm] radeon: 512M of VRAM memory ready
Apr 19 15:32:10 localhost kernel: [drm] radeon: 512M of GTT memory ready.
Apr 19 15:32:10 localhost kernel: [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Apr 19 15:32:10 localhost kernel: [drm] Driver supports precise vblank timestamp query.
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: irq 43 for MSI/MSI-X
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: radeon: using MSI.
Apr 19 15:32:10 localhost kernel: [drm] radeon: irq initialized.
Apr 19 15:32:10 localhost kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Apr 19 15:32:10 localhost kernel: [drm] probing gen 2 caps for device 1022:9603 = 300d02/0
Apr 19 15:32:10 localhost kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
Apr 19 15:32:10 localhost kernel: [drm] Loading CEDAR Microcode
Apr 19 15:32:10 localhost kernel: [drm] PCIE GART of 512M enabled (table at 0x000000000025D000).
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: WB enabled
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88011a16bc00
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff88011a16bc0c
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90001e9c418
Apr 19 15:32:10 localhost kernel: [drm] ring test on 0 succeeded in 1 usecs
Apr 19 15:32:10 localhost kernel: [drm] ring test on 3 succeeded in 1 usecs
Apr 19 15:32:10 localhost kernel: ACPI: Deprecated procfs I/F for battery is loaded, please retry with CONFIG_ACPI_PROCFS_POWER cleared
Apr 19 15:32:10 localhost kernel: ACPI: Battery Slot [BAT1] (battery present)
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
Apr 19 15:32:10 localhost kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, giving up!!!
Apr 19 15:32:10 localhost kernel: [drm:evergreen_startup] *ERROR* radeon: error initializing UVD (-1).
Apr 19 15:32:10 localhost kernel: [drm] Enabling audio support
Apr 19 15:32:10 localhost kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Apr 19 15:32:10 localhost kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Apr 19 15:32:10 localhost kernel: Switching to clocksource tsc
Apr 19 15:32:10 localhost kernel: [drm] radeon atom DIG backlight initialized
Apr 19 15:32:10 localhost kernel: [drm] Radeon Display Connectors
Apr 19 15:32:10 localhost kernel: [drm] Connector 0:
Apr 19 15:32:10 localhost kernel: [drm]   LVDS-1
Apr 19 15:32:10 localhost kernel: [drm]   DDC: 0x6560 0x6560 0x6564 0x6564 0x6568 0x6568 0x656c 0x656c
Apr 19 15:32:10 localhost kernel: [drm]   Encoders:
Apr 19 15:32:10 localhost kernel: [drm]     LCD1: INTERNAL_UNIPHY
Apr 19 15:32:10 localhost kernel: [drm] Connector 1:
Apr 19 15:32:10 localhost kernel: [drm]   HDMI-A-1
Apr 19 15:32:10 localhost kernel: [drm]   HPD1
Apr 19 15:32:10 localhost kernel: [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
Apr 19 15:32:10 localhost kernel: [drm]   Encoders:
Apr 19 15:32:10 localhost kernel: [drm]     DFP1: INTERNAL_UNIPHY1
Apr 19 15:32:10 localhost kernel: [drm] Connector 2:
Apr 19 15:32:10 localhost kernel: [drm]   VGA-1
Apr 19 15:32:10 localhost NetworkManager[1538]: <info> (wlan0): supplicant interface state: disconnected -> inactive
Apr 19 15:32:10 localhost kernel: [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
Apr 19 15:32:10 localhost kernel: [drm]   Encoders:
Apr 19 15:32:10 localhost kernel: [drm]     CRT1: INTERNAL_KLDSCP_DAC1
Apr 19 15:32:10 localhost kernel: [drm] Internal thermal controller with fan control
Apr 19 15:32:10 localhost kernel: [drm] radeon: power management initialized
Apr 19 15:32:10 localhost kernel: [drm] fb mappable at 0xE035F000
Apr 19 15:32:10 localhost kernel: [drm] vram apper at 0xE0000000
Apr 19 15:32:10 localhost kernel: [drm] size 4325376
Apr 19 15:32:10 localhost kernel: [drm] fb depth is 24
Apr 19 15:32:10 localhost kernel: [drm]    pitch is 5632
Apr 19 15:32:10 localhost kernel: fbcon: radeondrmfb (fb0) is primary device
Apr 19 15:32:10 localhost kernel: Console: switching to colour frame buffer device 170x48
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
Apr 19 15:32:10 localhost kernel: radeon 0000:01:00.0: registered panic notifier
Apr 19 15:32:10 localhost kernel: [drm] Initialized radeon 2.33.0 20080528 for 0000:01:00.0 on minor 0
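The failure signature in the log above is ten "trying to reset the VCPU" messages followed by a single "giving up" message. Below is a minimal Python sketch for spotting that pattern in saved dmesg or syslog text. The helper name `summarize_uvd_init` and the returned dictionary keys are illustrative assumptions; the matched strings are copied from the logs in this report.

```python
import re

# Message fragments copied from the kernel logs in this report.
RESET_RE = re.compile(
    r"\[drm:r600_uvd_init\] \*ERROR\* UVD not responding, trying to reset the VCPU"
)
GIVE_UP_RE = re.compile(
    r"\[drm:r600_uvd_init\] \*ERROR\* UVD not responding, giving up"
)
OK_RE = re.compile(r"\[drm\] UVD initialized successfully")


def summarize_uvd_init(log_text):
    """Count UVD reset attempts and report whether init gave up or succeeded.

    Hypothetical triage helper: it only inspects log text, not the hardware.
    """
    lines = log_text.splitlines()
    return {
        "reset_attempts": sum(1 for line in lines if RESET_RE.search(line)),
        "gave_up": any(GIVE_UP_RE.search(line) for line in lines),
        "initialized": any(OK_RE.search(line) for line in lines),
    }


if __name__ == "__main__":
    sample = "\n".join(
        ["kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!"] * 10
        + ["kernel: [drm:r600_uvd_init] *ERROR* UVD not responding, giving up!!!"]
    )
    print(summarize_uvd_init(sample))
```

Feeding it the output of `dmesg` (or the relevant part of the syslog) should make it easy to compare a failing boot against a patched one.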
Comment 1 Christian König 2013-04-19 14:15:43 UTC
Created attachment 78243 [details] [review]
Possible fix

Does this patch fix the issue?
Comment 2 Johannes Hirte 2013-04-19 18:18:35 UTC
No, still the same error.
Comment 3 Christian König 2013-04-20 12:54:59 UTC
Created attachment 78270 [details] [review]
Possible fix v2

Please try this one instead.
Comment 4 Johannes Hirte 2013-04-20 16:52:19 UTC
I get the same error with v2 as well.
Comment 5 Jrh 2013-04-21 19:42:25 UTC
I am also seeing this on a Foxconn nt-a3500 that has a Radeon HD6310, which is part of AMD Fusion. If you want additional information, let me know.

Regards.
Comment 6 Christian König 2013-04-22 09:05:20 UTC
Created attachment 78323 [details] [review]
Possible fix v3

This patch reverts to the original behavior, just to make sure that it's indeed this problem.
Comment 7 Johannes Hirte 2013-04-22 14:05:46 UTC
I've already tested this myself and can confirm that it fixes the problem.
Comment 8 Christian König 2013-04-23 08:47:13 UTC
Created attachment 78357 [details] [review]
Raise UVD clocks while booting the VCPU.

> I've already tested this myself and can confirm that it fixes the problem.

OK, then let's try to narrow this down further. The attached patch tries to raise the clocks only while the VCPU is booting.

Please test on your hardware.
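The idea behind this patch can be pictured as a small control-flow sketch: raise the clocks, try to boot the VCPU with a bounded number of resets, then drop the clocks again whether or not the boot succeeded. The Python below is only a toy model of that flow and is not the driver code. `FakeUvd`, `boot_vcpu_with_raised_clocks`, the clock values, and the retry count of 10 (chosen to match the ten reset messages in the log) are all assumptions made for illustration.

```python
class FakeUvd:
    """Toy stand-in for the UVD block; hypothetical, not a real driver object."""

    def __init__(self, min_boot_clock):
        self.clock = 0
        self.min_boot_clock = min_boot_clock

    def set_clock(self, value):
        self.clock = value

    def try_boot(self):
        # In this model the VCPU only responds when the clock is high enough.
        return self.clock >= self.min_boot_clock


def boot_vcpu_with_raised_clocks(uvd, boot_clock, idle_clock, max_resets=10):
    """Raise clocks, attempt the boot up to max_resets times, then restore clocks.

    Returns the number of attempts needed, or None if every attempt failed.
    """
    uvd.set_clock(boot_clock)
    try:
        for attempt in range(1, max_resets + 1):
            if uvd.try_boot():
                return attempt
        return None
    finally:
        # Always return to the idle clock, even when the boot failed.
        uvd.set_clock(idle_clock)


if __name__ == "__main__":
    uvd = FakeUvd(min_boot_clock=100)
    print(boot_vcpu_with_raised_clocks(uvd, boot_clock=100, idle_clock=0))
    print(boot_vcpu_with_raised_clocks(uvd, boot_clock=50, idle_clock=0))
```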
Comment 9 Johannes Hirte 2013-04-23 10:35:15 UTC
With the latest patch I get the following output:

[    2.137767] Linux agpgart interface v0.103
[    2.138046] [drm] Initialized drm 1.1.0 20060810
[    2.138309] [drm] radeon kernel modesetting enabled.
[    2.138962] [drm] initializing kernel modesetting (CEDAR 0x1002:0x68E0 0x1025:0x0489).
[    2.139136] [drm] register mmio base: 0xF2100000
[    2.139284] [drm] register mmio size: 131072
[    2.144757] ATOM BIOS: Acer
[    2.145053] radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
[    2.145343] radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
[    2.145722] [drm] Detected VRAM RAM=512M, BAR=256M
[    2.145897] [drm] RAM width 64bits DDR
[    2.146136] [TTM] Zone  kernel: Available graphics memory: 2022516 kiB
[    2.146285] [TTM] Initializing pool allocator
[    2.146436] [TTM] Initializing DMA pool allocator
[    2.146621] [drm] radeon: 512M of VRAM memory ready
[    2.146785] [drm] radeon: 512M of GTT memory ready.
[    2.146949] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[    2.149945] [drm] Driver supports precise vblank timestamp query.
[    2.150136] radeon 0000:01:00.0: irq 43 for MSI/MSI-X
[    2.150147] radeon 0000:01:00.0: radeon: using MSI.
[    2.150325] [drm] radeon: irq initialized.
[    2.152070] [drm] GART: num cpu pages 131072, num gpu pages 131072
[    2.153407] [drm] probing gen 2 caps for device 1022:9603 = 300d02/0
[    2.153596] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[    2.153909] [drm] Loading CEDAR Microcode
[    2.168856] [drm] PCIE GART of 512M enabled (table at 0x000000000025D000).
[    2.169128] radeon 0000:01:00.0: WB enabled
[    2.169283] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88011a174c00
[    2.169570] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff88011a174c0c
[    2.170039] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90001e9c418
[    2.186632] [drm] ring test on 0 succeeded in 1 usecs
[    2.186878] [drm] ring test on 3 succeeded in 1 usecs
[    2.363300] [drm] ring test on 5 succeeded in 1 usecs
[    2.363484] [drm] UVD initialized successfully.
[    2.363745] [drm] ib test on ring 0 succeeded in 0 usecs
[    2.363927] [drm] ib test on ring 3 succeeded in 0 usecs
[    2.744243] ACPI: Deprecated procfs I/F for battery is loaded, please retry with CONFIG_ACPI_PROCFS_POWER cleared
[    2.744539] ACPI: Battery Slot [BAT1] (battery present)
[    3.046208] tsc: Refined TSC clocksource calibration: 2094.754 MHz
[    3.046364] Switching to clocksource tsc
[   12.648332] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
[   12.648493] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000000)
[   12.648781] [drm:r600_uvd_ib_test] *ERROR* radeon: fence wait failed (-35).
[   12.648932] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
[   12.703326] [drm] radeon atom DIG backlight initialized
[   12.703481] [drm] Radeon Display Connectors
[   12.703630] [drm] Connector 0:
[   12.703777] [drm]   LVDS-1
[   12.703925] [drm]   DDC: 0x6560 0x6560 0x6564 0x6564 0x6568 0x6568 0x656c 0x656c
[   12.704074] [drm]   Encoders:
[   12.704221] [drm]     LCD1: INTERNAL_UNIPHY
[   12.704374] [drm] Connector 1:
[   12.704522] [drm]   HDMI-A-1
[   12.704668] [drm]   HPD1
[   12.704815] [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[   12.704964] [drm]   Encoders:
[   12.705111] [drm]     DFP1: INTERNAL_UNIPHY1
[   12.705260] [drm] Connector 2:
[   12.705408] [drm]   VGA-1
[   12.705555] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[   12.705703] [drm]   Encoders:
[   12.705851] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   12.706045] [drm] Internal thermal controller with fan control
[   12.706343] [drm] radeon: power management initialized
[   13.091252] [drm] fb mappable at 0xE0361000
[   13.091443] [drm] vram apper at 0xE0000000
[   13.091591] [drm] size 4325376
[   13.091738] [drm] fb depth is 24
[   13.091885] [drm]    pitch is 5632
[   13.092233] fbcon: radeondrmfb (fb0) is primary device
[   13.685775] Console: switching to colour frame buffer device 170x48
[   13.690367] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[   13.690403] radeon 0000:01:00.0: registered panic notifier
[   13.690556] [drm] Initialized radeon 2.33.0 20080528 for 0000:01:00.0 on minor 0
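In this log the UVD ring test passes, but the IB test on ring 5 then times out. The roughly ten-second hole in the timestamps just before the "GPU lockup" message makes the stall easy to spot. Below is a minimal Python sketch for finding the largest gap between consecutive dmesg timestamps. `largest_gap` is a hypothetical helper name, and the sketch assumes the `[seconds.microseconds]` line prefix shown above.

```python
import re

# Matches the "[   12.648332] " prefix used in the dmesg output above.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gap(log_text):
    """Return (gap_seconds, message_before, message_after) for the biggest jump.

    Returns None when fewer than two timestamped lines are present.
    """
    stamped = []
    for line in log_text.splitlines():
        match = TS_RE.match(line)
        if match:
            stamped.append((float(match.group(1)), match.group(2)))

    best = None
    for (t0, msg0), (t1, msg1) in zip(stamped, stamped[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, msg0, msg1)
    return best


if __name__ == "__main__":
    sample = """\
[    2.363927] [drm] ib test on ring 3 succeeded in 0 usecs
[    3.046364] Switching to clocksource tsc
[   12.648332] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
[   12.648493] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000002 last fence id 0x0000000000000000)
"""
    gap, before, after = largest_gap(sample)
    print(f"{gap:.3f}s between {before!r} and {after!r}")
```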
Comment 10 Johannes Hirte 2013-04-23 11:10:31 UTC
I've also had an X crash now. The output is from kdm.log, since Xorg.log was already overwritten:

(EE) 
(EE) Backtrace:
(EE) 0: /usr/bin/X (xorg_backtrace+0x3d) [0x59ed1d]
(EE) 1: /usr/bin/X (0x400000+0x1a3409) [0x5a3409]
(EE) 2: /lib64/libpthread.so.0 (0x7f43cdf07000+0x10ed0) [0x7f43cdf17ed0]
(EE) 3: /lib64/libc.so.6 (0x7f43cd20c000+0x907bb) [0x7f43cd29c7bb]
(EE) 4: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f43ca808000+0x3e436) [0x7f43ca846436]
(EE) 5: /usr/lib64/xorg/modules/libexa.so (0x7f43cea5a000+0x970c) [0x7f43cea6370c]
(EE) 6: /usr/bin/X (0x400000+0xdd521) [0x4dd521]
(EE) 7: /usr/bin/X (0x400000+0xde2e5) [0x4de2e5]
(EE) 8: /usr/bin/X (0x400000+0x3587f) [0x43587f]
(EE) 9: /usr/bin/X (0x400000+0x23fed) [0x423fed]
(EE) 10: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7f43cd230c15]
(EE) 11: /usr/bin/X (0x400000+0x23b09) [0x423b09]
(EE) 
(EE) Bus error at address 0x7f43bddca000

Fatal server error:
Caught signal 7 (Bus error). Server aborting

(EE) 
Please consult the The X.Org Foundation support 
         at http://wiki.x.org
 for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE)

Don't know if this is related to the UVD problem.
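Most frames in the backtrace above have the form `module (base+offset) [absolute address]`, so the absolute address should equal base plus offset. For a shared library, that offset is typically the value to hand to a symbolizer such as `addr2line`. Below is a minimal Python sketch that parses such frames and checks the arithmetic. `parse_frame` is a hypothetical helper, and frames that print a symbol name instead of a numeric base (for example `xorg_backtrace+0x3d`) are deliberately skipped.

```python
import re

# Frame format as printed by the X server backtrace in this report, e.g.
# (EE) 4: /path/radeon_drv.so (0x7f43ca808000+0x3e436) [0x7f43ca846436]
FRAME_RE = re.compile(
    r"\(EE\) (\d+): (\S+) \((0x[0-9a-f]+)\+(0x[0-9a-f]+)\) \[(0x[0-9a-f]+)\]"
)


def parse_frame(line):
    """Return (index, module, offset, consistent), or None if the line does not match.

    `consistent` is True when base + offset equals the absolute address.
    """
    match = FRAME_RE.search(line)
    if match is None:
        return None
    index = int(match.group(1))
    module = match.group(2)
    base, offset, absolute = (int(match.group(i), 16) for i in (3, 4, 5))
    return index, module, offset, base + offset == absolute


if __name__ == "__main__":
    frame = (
        "(EE) 4: /usr/lib64/xorg/modules/drivers/radeon_drv.so "
        "(0x7f43ca808000+0x3e436) [0x7f43ca846436]"
    )
    index, module, offset, consistent = parse_frame(frame)
    print(index, module, hex(offset), consistent)
```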
Comment 11 Michel Dänzer 2013-04-23 11:14:50 UTC
(In reply to comment #10)
> Caught signal 7 (Bus error). Server aborting

Could be bug 61182.
Comment 12 Johannes Hirte 2013-04-23 11:49:13 UTC
It's related to the patch "[PATCH] drm/radeon: raise UVD clocks while booting the VCPU". With it applied, X crashes when starting the Chromium browser.
Comment 13 Christian König 2013-04-23 12:07:49 UTC
I don't think that X crashing is related to a non-working UVD ring; at least it shouldn't be.

Anyway, I've got good news: I was able to reproduce the problem with an HD5670, so a final patch fixing this is on the way.
Comment 14 Johannes Hirte 2013-04-23 14:54:20 UTC
(In reply to comment #13)
> I don't think that X crashing is related to a non-working UVD ring; at least
> it shouldn't be.

I can only say what I'm observing. With this patch, X crashes as soon as I start Chromium. Without it, X doesn't crash.

> Anyway, I've got good news: I was able to reproduce the problem with an
> HD5670, so a final patch fixing this is on the way.

Sounds good, I'm waiting to test it.
Comment 15 Christian König 2013-04-23 15:39:53 UTC
Created attachment 78377 [details] [review]
Possible fix

This one should do it. The problem indeed seems to be that the VCPU on Evergreen doesn't work with the lower clocks.

Please test.
Comment 16 Johannes Hirte 2013-04-23 19:21:09 UTC
Looks good. System booted as expected, dmesg shows

[    2.167883] radeon 0000:01:00.0: WB enabled
[    2.168038] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff88011a165c00
[    2.168325] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff88011a165c0c
[    2.168817] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90001e9c418
[    2.185378] [drm] ring test on 0 succeeded in 1 usecs
[    2.185625] [drm] ring test on 3 succeeded in 1 usecs
[    2.362042] [drm] ring test on 5 succeeded in 1 usecs
[    2.362226] [drm] UVD initialized successfully.
[    2.362489] [drm] ib test on ring 0 succeeded in 0 usecs
[    2.362671] [drm] ib test on ring 3 succeeded in 0 usecs
[    2.514356] [drm] ib test on ring 5 succeeded
[    2.568759] [drm] radeon atom DIG backlight initialized
[    2.568914] [drm] Radeon Display Connectors

and no errors. Video playback with UVD (mplayer) works, and there are no crashes with Chromium. Perhaps the crash with the last patch came from the Flash plugin trying to use VDPAU/UVD? Anyhow, it works now. Feel free to add my Tested-by.
Comment 17 Christian König 2013-04-24 08:14:45 UTC
Thanks for testing, good to know that it fixes the problem.

The comment about flash sounds valid, but I still don't get why the heck that should crash X.

Anyway this bug seems to be fixed, please open up another one if you can reproduce the X crash with the newest drm-next-3.10.
Comment 18 Parag 2013-05-04 17:58:16 UTC
I am getting the same error on HD6750M - I verified the kernel (Linus git from today) I am running has the raise clocks patch. I've also replaced everything in /lib/firmware/radeon from http://people.freedesktop.org/~agd5f/radeon_ucode/. 

lspci | grep -i radeon
----------------------
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Whistler [Radeon HD 6600M/6700M/7600M Series]
01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Turks/Whistler HDMI Audio [Radeon HD 6000 Series]

dmesg
-----
dmesg |grep -i uvd
[   17.676387] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   18.697587] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   19.718765] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   20.739957] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   21.761139] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   22.782347] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   23.803569] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   24.824728] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   25.845904] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   26.867070] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   26.887096] [drm:r600_uvd_init] *ERROR* UVD not responding, giving up!!!
[   26.887102] [drm:evergreen_startup] *ERROR* radeon: error initializing UVD (-1).
Comment 19 Parag 2013-05-04 18:01:05 UTC
Ugh, wrong status and wrong bug, it looks like. I will leave this closed and comment on 63935, which seems more appropriate. Sorry about the noise.
