93288 – dpm malfunction on radeon grenada r9 390X

Summary: dpm malfunction on radeon grenada r9 390X

Status:	RESOLVED DUPLICATE of bug 91880

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/radeonsi
Version:	git
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Default DRI bug account
QA Contact:	Default DRI bug account

Reported:	2015-12-08 01:13 UTC by Thomas DEBESSE
Modified:	2016-08-18 00:32 UTC
CC List:	0 users

Attachments
2015-10-21 dmesg (linux 4.2, mesa git, radeon hawaii r9 390X) (97.37 KB, text/plain) 2015-12-08 01:18 UTC, Thomas DEBESSE
2015-10-21 lspci (linux 4.2, mesa git, radeon hawaii r9 390X) (3.22 KB, text/plain) 2015-12-08 01:18 UTC, Thomas DEBESSE
2015-10-21 screen glitch photo (linux 4.2, mesa git, radeon hawaii r9 390X) (2.61 MB, image/jpeg) 2015-12-08 01:20 UTC, Thomas DEBESSE
syslog (Linux 4.3, mesa git, radeon hawaii r9 390X) (5.37 KB, text/plain) 2015-12-08 01:23 UTC, Thomas DEBESSE
dmesg (Linux 4.3, mesa git, radeon hawaii r9 390X) (128.48 KB, text/plain) 2015-12-08 02:21 UTC, Thomas DEBESSE
Xorg.0.log (Linux 4.3, mesa git, radeon hawaii r9 390X) (90.22 KB, text/plain) 2015-12-08 02:21 UTC, Thomas DEBESSE
screen video corruption photo (Linux 4.3, mesa git, radeon hawaii r9 390X) (1.26 MB, image/jpeg) 2015-12-08 02:34 UTC, Thomas DEBESSE
screen corruption linux 4.7 mesa git , radeon ci r7 260x (500.13 KB, image/png) 2016-08-18 00:04 UTC, Jacinto
syslog linux 4.8 rc2 mesa git radeon ci r7 260x (42.38 KB, text/plain) 2016-08-18 00:32 UTC, Jacinto

Description Thomas DEBESSE 2015-12-08 01:13:58 UTC

Hi, I have many problems with radeonsi and DPM (I have never gotten it working, even with Radeon cards I owned previously).

While running Linux 4.2 two months ago, I got a graphical hang with a "[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed" message.

To trigger this bug I just had to boot my system and wait; it hung in less than 10 minutes. The system was still usable over ssh, but "reboot" never completed.

I was running the 4.2 kernel on Ubuntu Wily with Mesa git (I'm using nightly build packages). I have a Radeon R9 390X (Hawaii); lspci reports:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon R9 290X] [1002:67b0] (rev 80) (prog-if 00 [VGA controller])

dmesg reports:
[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed

I'll attach a detailed lspci entry, a complete dmesg log and a photo of the screen.

The only way to run the Radeon R9 390X is to boot with radeon.dpm=0, which was also needed for a Radeon HD 7970 I had in the past.

Just now I tried rebooting with radeon.dpm=1 and got a graphical hang, then some garbage printed on screen. I rebooted my computer to avoid burning my CPU (see https://bugs.freedesktop.org/show_bug.cgi?id=93101 ), so currently the only information I have is what was written to syslog.

It logged messages like these:

Dec  8 01:32:07 gollum kernel: [   82.262780] radeon 0000:01:00.0: ring 3 stalled for more than 10368msec
Dec  8 01:32:07 gollum kernel: [   82.262790] radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000155c last fence id 0x0000000000001655 on ring 3)
Dec  8 01:32:08 gollum kernel: [   82.366721] radeon 0000:01:00.0: ring 4 stalled for more than 10472msec
Dec  8 01:32:08 gollum kernel: [   82.366729] radeon 0000:01:00.0: ring 0 stalled for more than 10472msec
Dec  8 01:32:08 gollum kernel: [   82.366737] radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000089a last fence id 0x000000000000089b on ring 4)
Dec  8 01:32:08 gollum kernel: [   82.366744] radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000000ce4 last fence id 0x0000000000000d0e on ring 0)
Dec  8 01:32:27 gollum kernel: [  101.898853] radeon 0000:01:00.0: Saved 3901 dwords of commands on ring 0.
Dec  8 01:32:27 gollum kernel: [  101.898870] radeon 0000:01:00.0: GPU softreset: 0x00000009
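
For anyone comparing lockup reports, here is a minimal Python sketch that pulls the fence ids out of "GPU lockup" lines like the ones above and computes the gap per ring. The function name `pending_fences` is hypothetical, and reading "current fence id" as the last completed fence and "last fence id" as the last submitted one is an assumption, not something this report confirms.

```python
import re

# Matches the radeon "GPU lockup" message format quoted in this report.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def pending_fences(log_text):
    """Return {ring: last_fence_id - current_fence_id} for each lockup line.

    Assumption: the difference approximates how many fences were still
    outstanding on that ring when the lockup was reported.
    """
    result = {}
    for match in LOCKUP_RE.finditer(log_text):
        current, last, ring = match.groups()
        result[int(ring)] = int(last, 16) - int(current, 16)
    return result


sample = """\
radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000155c last fence id 0x0000000000001655 on ring 3)
radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000089a last fence id 0x000000000000089b on ring 4)
radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000000ce4 last fence id 0x0000000000000d0e on ring 0)
"""

print(pending_fences(sample))  # {3: 249, 4: 1, 0: 42}
```

Under that assumption, ring 3 had by far the largest backlog in this log.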

Right now I'm running:

- Ubuntu Wily 15.10 + Oibaf PPA
- Linux 4.3
- libdrm-radeon1 2.4.65+git1512011830.42f2f9~gd~w
- xserver-xorg-video-radeon 7.6.99+git1512040732.78fbca~gd~w
- xserver-xorg 7.7+7ubuntu4
- libgl1-mesa-dri 11.2~git1512061930.d108b6~gd~w

And `dmesg | grep radeon` reports:

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic root=/dev/mapper/gollum--vg-gollum--disk ro quiet splash nomdmonddf nomdmonisw libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic root=/dev/mapper/gollum--vg-gollum--disk ro quiet splash nomdmonddf nomdmonisw libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7
[    1.958053] [drm] radeon kernel modesetting enabled.
[    1.962224] fb: switching to radeondrmfb from VESA VGA
[    1.962529] radeon 0000:01:00.0: Invalid ROM contents
[    1.962629] radeon 0000:01:00.0: VRAM: 8192M 0x0000000000000000 - 0x00000001FFFFFFFF (8192M used)
[    1.962631] radeon 0000:01:00.0: GTT: 2048M 0x0000000200000000 - 0x000000027FFFFFFF
[    1.962710] [drm] radeon: 8192M of VRAM memory ready
[    1.962711] [drm] radeon: 2048M of GTT memory ready.
[    1.962888] [drm] radeon: power management initialized
[    1.968997] radeon 0000:01:00.0: WB enabled
[    1.969003] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000200000c00 and cpu addr 0xffff8808126b5c00
[    1.969005] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000200000c04 and cpu addr 0xffff8808126b5c04
[    1.969007] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000200000c08 and cpu addr 0xffff8808126b5c08
[    1.969009] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000200000c0c and cpu addr 0xffff8808126b5c0c
[    1.969011] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000200000c10 and cpu addr 0xffff8808126b5c10
[    1.969410] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000076c98 and cpu addr 0xffffc90004036c98
[    1.969558] radeon 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000200000c18 and cpu addr 0xffff8808126b5c18
[    1.969560] radeon 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000200000c1c and cpu addr 0xffff8808126b5c1c
[    1.969631] radeon 0000:01:00.0: radeon: using MSI.
[    1.969660] [drm] radeon: irq initialized.
[    3.762178] fbcon: radeondrmfb (fb0) is primary device
[    3.780407] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[    3.791974] [drm] Initialized radeon 2.43.0 20080528 for 0000:01:00.0 on minor 0
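
To confirm which DPM setting a given boot actually used, the kernel command line shown in the first lines of this output can be checked programmatically. The following is a minimal Python sketch; the helper name `radeon_dpm_setting` is hypothetical, and keeping the last occurrence when the option is repeated is a choice made for this sketch, not a documented kernel guarantee.

```python
def radeon_dpm_setting(cmdline):
    """Return the value given to radeon.dpm on a kernel command line.

    Returns None when the option is absent. If the option appears more
    than once, this sketch keeps the last value seen (an assumption).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "radeon.dpm" and sep:
            value = val
    return value


# Command line copied from the dmesg output in this report.
cmdline = (
    "BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic "
    "root=/dev/mapper/gollum--vg-gollum--disk ro quiet splash "
    "nomdmonddf nomdmonisw libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7"
)

print(radeon_dpm_setting(cmdline))  # 0
```

On a running system the same string can be read from /proc/cmdline instead of being pasted in.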

Comment 1 Thomas DEBESSE 2015-12-08 01:18:14 UTC

Created attachment 120402
2015-10-21 dmesg (linux 4.2, mesa git, radeon hawaii r9 390X)

Comment 2 Thomas DEBESSE 2015-12-08 01:18:38 UTC

Created attachment 120403
2015-10-21 lspci (linux 4.2, mesa git, radeon hawaii r9 390X)

Comment 3 Thomas DEBESSE 2015-12-08 01:20:44 UTC

Created attachment 120404
2015-10-21 screen glitch photo (linux 4.2, mesa git, radeon hawaii r9 390X)

Comment 4 Thomas DEBESSE 2015-12-08 01:23:14 UTC

Created attachment 120405
syslog (Linux 4.3, mesa git, radeon hawaii r9 390X)

Comment 5 Thomas DEBESSE 2015-12-08 02:20:26 UTC

I just reproduced it. For the first time, it did not happen while the GNOME session was loading, so I started the Unigine Valley Benchmark to stress the GPU a little.

`dmesg | grep ERROR` said:

[  425.896560] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
[  425.896593] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BD42 (len 254, WS 0, PS 4) @ 0xBD6C
[  425.896621] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B2B6 (len 145, WS 0, PS 8) @ 0xB341
[  426.113327] [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[  426.113372] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  426.903401] [drm:cik_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x3010C)=0xCAFEDEAD)
[  426.903456] [drm:cik_resume [radeon]] *ERROR* cik startup failed on resume
[  426.903573] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  436.916775] [drm:radeon_vce_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[  436.916816] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 6 (-35).
[  436.952672] [drm:radeon_vce_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[  436.952712] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 7 (-35).

See the full dmesg log, which I will attach.
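
For longer logs, a small script can group these `*ERROR*` lines by the reporting driver function, which makes the failure sequence easier to compare across kernels. This is a minimal Python sketch; `errors_by_function` is a hypothetical helper, and the pattern only covers the `[drm:<function> [radeon]] *ERROR* <message>` shape seen in the lines above.

```python
import re

# Matches lines such as:
# [  426.113327] [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
ERROR_RE = re.compile(r"\[drm:(\w+) \[radeon\]\] \*ERROR\* (.*)")


def errors_by_function(log_text):
    """Map each reporting function to its error messages, in log order."""
    grouped = {}
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            function, message = match.groups()
            grouped.setdefault(function, []).append(message.strip())
    return grouped


# A few lines copied from the dmesg output quoted in this comment.
sample = """\
[  426.113327] [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[  426.113372] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  426.903456] [drm:cik_resume [radeon]] *ERROR* cik startup failed on resume
[  426.903573] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
"""

errors = errors_by_function(sample)
for function, messages in errors.items():
    print(function, len(messages))
```

The script can be fed the full log by saving `dmesg` output to a file and passing its contents to the function.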

Comment 6 Thomas DEBESSE 2015-12-08 02:21:08 UTC

Created attachment 120407
dmesg (Linux 4.3, mesa git, radeon hawaii r9 390X)

Comment 7 Thomas DEBESSE 2015-12-08 02:21:35 UTC

Created attachment 120408
Xorg.0.log (Linux 4.3, mesa git, radeon hawaii r9 390X)

Comment 8 Thomas DEBESSE 2015-12-08 02:34:23 UTC

Created attachment 120409
screen video corruption photo (Linux 4.3, mesa git, radeon hawaii r9 390X)

Comment 9 Alex Deucher 2015-12-09 16:37:42 UTC

May be a duplicate of bug 91880.

Comment 10 Thomas DEBESSE 2016-01-14 13:34:44 UTC

Yes, definitely a duplicate.

*** This bug has been marked as a duplicate of bug 91880 ***

Comment 11 Jacinto 2016-08-18 00:04:51 UTC

Created attachment 125858
screen corruption linux 4.7 mesa git , radeon ci r7 260x

Comment 12 Jacinto 2016-08-18 00:07:10 UTC

I added an attachment; I might be hitting the same bug. With 4.7 the screen shows some white square artifacts and then hard locks, and I have to hard reboot. With 4.8 there are no artifacts, but it freezes, and after a while I can get to a tty, where dmesg shows the same messages:

[drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed

Comment 13 Jacinto 2016-08-18 00:32:19 UTC

Created attachment 125859
syslog linux 4.8 rc2 mesa git radeon ci r7 260x

Syslog from Linux kernel 4.8-rc2, Mesa git.
