Bug 66945 - Heavy artifacts and unusable graphics system with the latest DPM changes
Status: RESOLVED DUPLICATE of bug 66932
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium blocker
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-16 00:57 UTC by tobi
Modified: 2013-07-17 16:42 UTC
CC List: 2 users



Attachments
dmesg output (75.60 KB, text/plain)
2013-07-16 00:58 UTC, tobi
lspci output (3.18 KB, text/plain)
2013-07-16 00:59 UTC, tobi
add module parameter to disable aspm (4.80 KB, patch)
2013-07-16 20:51 UTC, Alex Deucher
dmesg w/ dpm enabled & aspm disabled (75.83 KB, text/plain)
2013-07-16 23:56 UTC, queryv+fd
lspci output different user (2.57 KB, text/plain)
2013-07-16 23:58 UTC, queryv+fd
debugging output (1.59 KB, patch)
2013-07-17 01:40 UTC, Alex Deucher
dmesg w/ dpm enabled & debug patch (76.04 KB, text/plain)
2013-07-17 02:51 UTC, queryv+fd
dmesg with dpm=1 and aspm=0 after patch from comment #12 (75.87 KB, text/plain)
2013-07-17 03:05 UTC, tobi

Description tobi 2013-07-16 00:57:49 UTC
Using the latest drm-fixes-3.11 and drm-next-3.11, I get heavy artifacts as soon as the radeon module is loaded, leaving my system unusable. Only the graphics system stops working; SSH connections to the system still work.

Hardware: Radeon HD6870 (01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Barts XT [ATI Radeon HD 6800 Series])
Software: Slackware -current

I got these error messages from dmesg:
[   26.514278] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
[   26.514291] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000001)
[   26.649947] radeon 0000:01:00.0: Saved 119 dwords of commands on ring 0.
[   26.649969] radeon 0000:01:00.0: GPU softreset: 0x00000008
[   26.649977] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003828
[   26.649983] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
[   26.649989] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[   26.649994] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[   26.649999] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[   26.650004] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[   26.650010] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00004100
[   26.650015] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020180
[   26.650020] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80028042
[   26.650026] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[   26.661433] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001
[   26.661490] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
[   26.662652] radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003828
[   26.662657] radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000007
[   26.662663] radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000007
[   26.662668] radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
[   26.662673] radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
[   26.662679] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[   26.662684] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[   26.662689] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[   26.662694] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
[   26.662700] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[   26.662711] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
[   26.686572] radeon 0000:01:00.0: WB enabled
[   26.686581] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880429b6dc00
[   26.686588] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880429b6dc0c
[   26.688117] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc900112b2118
[   26.920227] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)

I am attaching the complete dmesg output and the lspci output. If anything else is needed, just ask and I will do my best to get you the info.
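
For anyone who wants to compare the two register dumps in the excerpt above (one printed before the soft reset, one after), here is a minimal Python sketch. It is not part of the driver or any radeon tooling; the function names `split_register_dumps` and `changed_registers` are made up for illustration, and the regular expression assumes the exact line layout shown in this log.

```python
import re

# Matches register dump lines such as:
# [   26.649977] radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003828
# Lines like "GRBM_SOFT_RESET=0x00004001" (no spaces around "=") are skipped.
REG_LINE = re.compile(r"radeon \S+:\s+(\w+)\s+= (0x[0-9A-Fa-f]+)")


def split_register_dumps(dmesg_text):
    """Return a list of dumps; each dump maps register name -> integer value.

    Assumption: a new dump starts as soon as a register name repeats.
    """
    dumps, current = [], {}
    for line in dmesg_text.splitlines():
        match = REG_LINE.search(line)
        if not match:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        if name in current:
            dumps.append(current)
            current = {}
        current[name] = value
    if current:
        dumps.append(current)
    return dumps


def changed_registers(before, after):
    """Return {name: (old, new)} for registers whose value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Applied to the excerpt above, this should report GRBM_STATUS, R_008678_CP_STALLED_STAT2, R_00867C_CP_BUSY_STAT and R_008680_CP_STAT as changed across the soft reset, while the SE0/SE1, SRBM and DMA status values are identical in both dumps.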
Comment 1 tobi 2013-07-16 00:58:50 UTC
Created attachment 82464
dmesg output
Comment 2 tobi 2013-07-16 00:59:27 UTC
Created attachment 82465
lspci output
Comment 3 Alex Deucher 2013-07-16 01:01:45 UTC
Are you using radeon as a module or built into the kernel?
Comment 4 tobi 2013-07-16 01:16:58 UTC
I use it as a module. I will test it built in ASAP; it needs a few minutes to compile.
Comment 5 tobi 2013-07-16 02:10:10 UTC
OK, it took me some time to figure out that I also need the firmware built in (and which files: it isn't obvious that I need the firmware for BARTS, BTC and SUMO to boot).
Anyway, the problem persists, with the same errors in dmesg. If you want to see how it actually looks, I have made a short video (sorry for the bad quality, I need a better camera): http://slackeee.de/public/drm-next-3.11.mp4
Comment 6 Alex Deucher 2013-07-16 02:26:24 UTC
(In reply to comment #4)
> I use it as a module. Will test it inbuilt ASAP, needs some minutes to
> compile.

Sorry, no need to do that; modules should be fine. Sometimes there are problems with firmware when the driver is built in.
Comment 7 queryv+fd 2013-07-16 04:21:07 UTC
I have this issue too (same card), and I commented on it in one of the Phoronix threads, mentioning that the problem started after the wip-5 patch set.

Having read most of the other related threads, I've noticed that most/all owners who try enabling DPM with a 6870 run into this problem.

If any other information is needed to find a common cause of this issue (separate from the OP's), let me know and I'll attach the needed logs.
Comment 8 Alex Deucher 2013-07-16 20:51:07 UTC
Created attachment 82504
add module parameter to disable aspm

Try this patch which adds a new module parameter to disable aspm.  Add radeon.aspm=0 to your kernel command line in grub to disable aspm support.
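
To double-check that the option actually reached the kernel after rebooting, something like the following Python sketch could be used. It only parses a command line string; the helper names `radeon_params` and `aspm_disabled` are hypothetical, and `radeon.aspm` only exists on kernels carrying the patch attached here.

```python
def radeon_params(cmdline):
    """Collect radeon.<name>=<value> options from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("radeon.") and "=" in token:
            key, value = token[len("radeon."):].split("=", 1)
            params[key] = value
    return params


def aspm_disabled(cmdline):
    """Return True only if radeon.aspm=0 appears on the given command line."""
    return radeon_params(cmdline).get("aspm") == "0"


def read_booted_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the test machine, `aspm_disabled(read_booted_cmdline())` should return True once the grub entry has been edited and the system rebooted. It says nothing about whether the patched module honours the parameter; it only confirms the option was passed.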
Comment 9 queryv+fd 2013-07-16 23:55:16 UTC
(In reply to comment #8)
> Created attachment 82504
> add module parameter to disable aspm
> 
> Try this patch which adds a new module parameter to disable aspm.  Add
> radeon.aspm=0 to your kernel command line in grub to disable aspm support.

The issue persists for me after applying that patch and disabling aspm. I've attached a dmesg log taken after booting (to console, no X) with a 3.10 kernel patched with the latest changes from the drm-fixes-3.11 branch (a01c34) and your aspm patch. I've also attached the output from lspci, since mine differs from the OP's.

I'm still fairly new to patching (I'm a newb), so there's a chance that I did something wrong, but everything seemed to go OK.
Comment 10 queryv+fd 2013-07-16 23:56:16 UTC
Created attachment 82509
dmesg w/ dpm enabled & aspm disabled
Comment 11 queryv+fd 2013-07-16 23:58:02 UTC
Created attachment 82510
lspci output different user
Comment 12 Alex Deucher 2013-07-17 01:40:12 UTC
Created attachment 82517
debugging output

Can you attach dmesg output with dpm enabled and this patch applied?
Comment 13 queryv+fd 2013-07-17 02:51:21 UTC
Created attachment 82520
dmesg w/ dpm enabled & debug patch

dmesg output after booting to console.
Comment 14 tobi 2013-07-17 03:05:36 UTC
Created attachment 82521
dmesg with dpm=1 and aspm=0 after patch from comment #12

Same here: the problem is not fixed by the patch from comment #8. Attached is the dmesg output with dpm=1 and aspm=0 after applying the patch from comment #12.
Comment 15 Alex Deucher 2013-07-17 16:42:32 UTC

*** This bug has been marked as a duplicate of bug 66932 ***

