Bug 93288 - dpm malfunction on radeon grenada r9 390X
Summary: dpm malfunction on radeon grenada r9 390X
Status: RESOLVED DUPLICATE of bug 91880
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-08 01:13 UTC by Thomas DEBESSE
Modified: 2016-08-18 00:32 UTC (History)
CC List: 0 users

See Also:
i915 platform:
i915 features:


Attachments
2015-10-21 dmesg (linux 4.2, mesa git, radeon hawaii r9 390X) (97.37 KB, text/plain)
2015-12-08 01:18 UTC, Thomas DEBESSE
2015-10-21 lspci (linux 4.2, mesa git, radeon hawaii r9 390X) (3.22 KB, text/plain)
2015-12-08 01:18 UTC, Thomas DEBESSE
2015-10-21 screen glitch photo (linux 4.2, mesa git, radeon hawaii r9 390X) (2.61 MB, image/jpeg)
2015-12-08 01:20 UTC, Thomas DEBESSE
syslog (Linux 4.3, mesa git, radeon hawaii r9 390X) (5.37 KB, text/plain)
2015-12-08 01:23 UTC, Thomas DEBESSE
dmesg (Linux 4.3, mesa git, radeon hawaii r9 390X) (128.48 KB, text/plain)
2015-12-08 02:21 UTC, Thomas DEBESSE
Xorg.0.log (Linux 4.3, mesa git, radeon hawaii r9 390X) (90.22 KB, text/plain)
2015-12-08 02:21 UTC, Thomas DEBESSE
screen video corruption photo (Linux 4.3, mesa git, radeon hawaii r9 390X) (1.26 MB, image/jpeg)
2015-12-08 02:34 UTC, Thomas DEBESSE
screen corruption linux 4.7 mesa git , radeon ci r7 260x (500.13 KB, image/png)
2016-08-18 00:04 UTC, Jacinto
syslog linux 4.8 rc2 mesa git radeon ci r7 260x (42.38 KB, text/plain)
2016-08-18 00:32 UTC, Jacinto

Description Thomas DEBESSE 2015-12-08 01:13:58 UTC
Hi, I am encountering many problems with radeonsi and dpm (I have never gotten it working, even with the Radeon cards I owned previously).

While running Linux 4.2 two months ago I got a graphical hang with a "[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed" message.

To trigger this bug I just had to boot my system and wait; it hung in less than 10 minutes. The system was still usable over ssh, but "reboot" never completed.

I was running the 4.2 kernel on Ubuntu Wily with mesa git (using nightly build packages). I have a Radeon R9 390X (Hawaii); lspci reports:

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon R9 290X] [1002:67b0] (rev 80) (prog-if 00 [VGA controller])

dmesg says that:
[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed

I'll attach a detailed lspci entry, a complete dmesg log, and a photo of the screen.

The only way to run the Radeon R9 390X is to boot with radeon.dpm=0, which was also needed for a Radeon HD 7970 I had in the past.
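
To confirm which dpm setting a given boot actually used, the kernel command line can be inspected. The following is a minimal Python sketch, not part of the radeon driver or any existing tool; the function name `radeon_dpm_setting` is hypothetical, and it assumes that when the parameter appears more than once the last occurrence is the one that takes effect.

```python
def radeon_dpm_setting(cmdline):
    """Return the value given to radeon.dpm on a kernel command line.

    Returns None when the parameter is absent. Assumption: if the
    parameter is repeated, the last occurrence is the effective one.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("radeon.dpm="):
            value = token.split("=", 1)[1]
    return value


# Command line shortened from the dmesg output quoted in this report.
example_cmdline = (
    "BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic ro quiet splash "
    "libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7"
)
print(radeon_dpm_setting(example_cmdline))
```

On a running system the same function could be fed the contents of /proc/cmdline instead of the example string.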

Just now, I tried rebooting with radeon.dpm=1 and got a graphical hang, then some garbage printed on screen. I rebooted my computer to avoid burning my CPU (see https://bugs.freedesktop.org/show_bug.cgi?id=93101 ), so currently the only information I have is what was written to syslog.

It logged things like this:

Dec  8 01:32:07 gollum kernel: [   82.262780] radeon 0000:01:00.0: ring 3 stalled for more than 10368msec
Dec  8 01:32:07 gollum kernel: [   82.262790] radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000155c last fence id 0x0000000000001655 on ring 3)
Dec  8 01:32:08 gollum kernel: [   82.366721] radeon 0000:01:00.0: ring 4 stalled for more than 10472msec
Dec  8 01:32:08 gollum kernel: [   82.366729] radeon 0000:01:00.0: ring 0 stalled for more than 10472msec
Dec  8 01:32:08 gollum kernel: [   82.366737] radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000000089a last fence id 0x000000000000089b on ring 4)
Dec  8 01:32:08 gollum kernel: [   82.366744] radeon 0000:01:00.0: GPU lockup (current fence id 0x0000000000000ce4 last fence id 0x0000000000000d0e on ring 0)
Dec  8 01:32:27 gollum kernel: [  101.898853] radeon 0000:01:00.0: Saved 3901 dwords of commands on ring 0.
Dec  8 01:32:27 gollum kernel: [  101.898870] radeon 0000:01:00.0: GPU softreset: 0x00000009
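
For longer logs, the stalled rings can be summarized automatically. This is a rough Python sketch that assumes the message wording shown above ("ring N stalled for more than Xmsec"); the regular expression and the helper name `stalled_rings` are illustrative and not taken from any existing tool.

```python
import re

# Matches the stall messages as they appear in the syslog excerpt above.
STALL_RE = re.compile(r"ring (\d+) stalled for more than (\d+)msec")


def stalled_rings(log_text):
    """Map each stalled ring number to its longest reported stall in msec."""
    stalls = {}
    for match in STALL_RE.finditer(log_text):
        ring = int(match.group(1))
        msec = int(match.group(2))
        stalls[ring] = max(msec, stalls.get(ring, 0))
    return stalls


stall_sample = """\
[   82.262780] radeon 0000:01:00.0: ring 3 stalled for more than 10368msec
[   82.366721] radeon 0000:01:00.0: ring 4 stalled for more than 10472msec
[   82.366729] radeon 0000:01:00.0: ring 0 stalled for more than 10472msec
"""
print(stalled_rings(stall_sample))
```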

Right now I'm running:

- Ubuntu Wily 15.10 + Oibaf PPA
- Linux 4.3
- libdrm-radeon1 2.4.65+git1512011830.42f2f9~gd~w
- xserver-xorg-video-radeon 7.6.99+git1512040732.78fbca~gd~w
- xserver-xorg 7.7+7ubuntu4
- libgl1-mesa-dri 11.2~git1512061930.d108b6~gd~w

And dmesg | grep radeon says:

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic root=/dev/mapper/gollum--vg-gollum--disk ro quiet splash nomdmonddf nomdmonisw libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-040300-generic root=/dev/mapper/gollum--vg-gollum--disk ro quiet splash nomdmonddf nomdmonisw libata.atapi_passthru16=0 radeon.dpm=0 vt.handoff=7
[    1.958053] [drm] radeon kernel modesetting enabled.
[    1.962224] fb: switching to radeondrmfb from VESA VGA
[    1.962529] radeon 0000:01:00.0: Invalid ROM contents
[    1.962629] radeon 0000:01:00.0: VRAM: 8192M 0x0000000000000000 - 0x00000001FFFFFFFF (8192M used)
[    1.962631] radeon 0000:01:00.0: GTT: 2048M 0x0000000200000000 - 0x000000027FFFFFFF
[    1.962710] [drm] radeon: 8192M of VRAM memory ready
[    1.962711] [drm] radeon: 2048M of GTT memory ready.
[    1.962888] [drm] radeon: power management initialized
[    1.968997] radeon 0000:01:00.0: WB enabled
[    1.969003] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000200000c00 and cpu addr 0xffff8808126b5c00
[    1.969005] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000200000c04 and cpu addr 0xffff8808126b5c04
[    1.969007] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000200000c08 and cpu addr 0xffff8808126b5c08
[    1.969009] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000200000c0c and cpu addr 0xffff8808126b5c0c
[    1.969011] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000200000c10 and cpu addr 0xffff8808126b5c10
[    1.969410] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000076c98 and cpu addr 0xffffc90004036c98
[    1.969558] radeon 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000200000c18 and cpu addr 0xffff8808126b5c18
[    1.969560] radeon 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000200000c1c and cpu addr 0xffff8808126b5c1c
[    1.969631] radeon 0000:01:00.0: radeon: using MSI.
[    1.969660] [drm] radeon: irq initialized.
[    3.762178] fbcon: radeondrmfb (fb0) is primary device
[    3.780407] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[    3.791974] [drm] Initialized radeon 2.43.0 20080528 for 0000:01:00.0 on minor 0
Comment 1 Thomas DEBESSE 2015-12-08 01:18:14 UTC
Created attachment 120402 [details]
2015-10-21 dmesg (linux 4.2, mesa git, radeon hawaii r9 390X)
Comment 2 Thomas DEBESSE 2015-12-08 01:18:38 UTC
Created attachment 120403 [details]
2015-10-21 lspci (linux 4.2, mesa git, radeon hawaii r9 390X)
Comment 3 Thomas DEBESSE 2015-12-08 01:20:44 UTC
Created attachment 120404 [details]
2015-10-21 screen glitch photo (linux 4.2, mesa git, radeon hawaii r9 390X)
Comment 4 Thomas DEBESSE 2015-12-08 01:23:14 UTC
Created attachment 120405 [details]
syslog (Linux 4.3, mesa git, radeon hawaii r9 390X)
Comment 5 Thomas DEBESSE 2015-12-08 02:20:26 UTC
I just reproduced it. For the first time, it did not hang while the GNOME session was loading, so I started the Unigine Valley Benchmark to stress-test the GPU a little.

`dmesg | grep ERROR` said:

[  425.896560] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
[  425.896593] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BD42 (len 254, WS 0, PS 4) @ 0xBD6C
[  425.896621] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing B2B6 (len 145, WS 0, PS 8) @ 0xB341
[  426.113327] [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[  426.113372] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  426.903401] [drm:cik_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x3010C)=0xCAFEDEAD)
[  426.903456] [drm:cik_resume [radeon]] *ERROR* cik startup failed on resume
[  426.903573] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  436.916775] [drm:radeon_vce_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[  436.916816] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 6 (-35).
[  436.952672] [drm:radeon_vce_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
[  436.952712] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on ring 7 (-35).

See the full dmesg log, which I will attach.
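
To see which driver function reported a failure first, the `*ERROR*` entries can be split into function name and message. The sketch below is only an illustration in Python; it assumes the "[drm:function [radeon]] *ERROR* message" layout seen in the excerpt above, and the name `drm_errors` is hypothetical.

```python
import re
from collections import Counter

# One match per log line: the reporting function and its error message.
ERROR_RE = re.compile(r"\[drm:(\w+) \[radeon\]\] \*ERROR\* (.+)")


def drm_errors(log_text):
    """Return (function, message) pairs for radeon DRM errors, in log order."""
    return [(m.group(1), m.group(2).strip()) for m in ERROR_RE.finditer(log_text)]


error_sample = """\
[  426.113327] [drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[  426.113372] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[  426.903456] [drm:cik_resume [radeon]] *ERROR* cik startup failed on resume
[  426.903573] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
"""
errors = drm_errors(error_sample)
error_counts = Counter(function for function, _ in errors)
print(errors[0])
print(error_counts.most_common())
```

The same function could be applied to the full attached dmesg log by reading the file into a string first.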
Comment 6 Thomas DEBESSE 2015-12-08 02:21:08 UTC
Created attachment 120407 [details]
dmesg (Linux 4.3, mesa git, radeon hawaii r9 390X)
Comment 7 Thomas DEBESSE 2015-12-08 02:21:35 UTC
Created attachment 120408 [details]
Xorg.0.log (Linux 4.3, mesa git, radeon hawaii r9 390X)
Comment 8 Thomas DEBESSE 2015-12-08 02:34:23 UTC
Created attachment 120409 [details]
screen video corruption photo (Linux 4.3, mesa git, radeon hawaii r9 390X)
Comment 9 Alex Deucher 2015-12-09 16:37:42 UTC
May be a duplicate of bug 91880.
Comment 10 Thomas DEBESSE 2016-01-14 13:34:44 UTC
Yes, definitely a duplicate.

*** This bug has been marked as a duplicate of bug 91880 ***
Comment 11 Jacinto 2016-08-18 00:04:51 UTC
Created attachment 125858 [details]
screen corruption linux 4.7 mesa git , radeon ci r7 260x
Comment 12 Jacinto 2016-08-18 00:07:10 UTC
I added an attachment; I might be hitting the same bug. With 4.7 the screen shows some white square artifacts and then hard locks, so I had to hard reboot. With 4.8 there are no artifacts; it freezes, and after a while I can get to a tty, where dmesg shows the same message.

[drm:ci_dpm_enable [radeon]] *ERROR* ci_start_dpm failed
[drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
Comment 13 Jacinto 2016-08-18 00:32:19 UTC
Created attachment 125859 [details]
syslog linux 4.8 rc2 mesa git radeon ci r7 260x

syslog, Linux kernel 4.8 rc2, mesa git.

