Bug 59672 - Problems initializing Radeon driver: lockup during IB test
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: PowerPC Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2013-01-21 18:06 UTC by Lucas Kannebley Tavares
Modified: 2013-02-05 02:38 UTC

Attachments
modprobe radeon with drm.debug=1 (110.18 KB, text/plain)
2013-01-21 18:06 UTC, Lucas Kannebley Tavares
Backtrace upon reboot (20.07 KB, text/plain)
2013-01-21 18:20 UTC, Lucas Kannebley Tavares
Enable more DMA byte swapping bits on big endian hosts (1.93 KB, patch)
2013-01-22 10:26 UTC, Michel Dänzer
use sw swap for dma rings (1.15 KB, patch)
2013-01-22 15:07 UTC, Alex Deucher

Description Lucas Kannebley Tavares 2013-01-21 18:06:44 UTC
Created attachment 73396
modprobe radeon with drm.debug=1

Hi all,

I've been trying to get an Evergreen adapter to work with the Radeon driver on a ppc64 machine. While attempting that, I'm running into what seems to be an infinite loop while running the IB test on ring 3.
I'm using a 3.8.0-rc4 kernel from today.
An excerpt from the logs follows; the entire modprobe log is attached.

[  171.975487] [drm:evergreen_blit_init], evergreen blit allocated bo 00000600 vs 00000400 ps 00000500
[  171.975631] radeon 0001:01:00.0: WB enabled
[  171.975636] radeon 0001:01:00.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xc0000001d32b0c00
[  171.975642] radeon 0001:01:00.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xc0000001d32b0c0c
[  171.992732] [drm] ring test on 0 succeeded in 0 usecs
[  171.992799] [drm] ring test on 3 succeeded in 1 usecs
[  171.993112] [drm:evergreen_irq_set], evergreen_irq_set: sw int gfx
[  171.993154] [drm] ib test on ring 0 succeeded in 0 usecs
[  171.993197] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[  172.419617] [drm:evergreen_irq_set], r600_irq_set: sw int dma
....
[  182.399612] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[  182.409618] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[  182.419604] radeon 0001:01:00.0: GPU lockup CP stall for more than 10000msec
[  182.419615] radeon 0001:01:00.0: GPU lockup (waiting for 0x0000000000000001 last fence id 0x0000000000000000)
[  182.419626] [drm:r600_dma_ib_test] *ERROR* radeon: fence wait failed (-35).
[  182.419634] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 3 (-35).

Do you guys have any idea what could be wrong, or what should be looked into? 

Thanks
Comment 1 Lucas Kannebley Tavares 2013-01-21 18:20:36 UTC
Created attachment 73398
Backtrace upon reboot

I can't remove the module after the modprobe (some resource got stuck), but a backtrace is printed if I attempt to reboot the machine.
Comment 2 Alex Deucher 2013-01-21 21:16:05 UTC
Is this still an issue with Dave's latest drm pull request:
http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes
Comment 3 Michel Dänzer 2013-01-22 10:26:38 UTC
Created attachment 73440
Enable more DMA byte swapping bits on big endian hosts

Does this patch help?
Comment 4 Alex Deucher 2013-01-22 15:07:26 UTC
Created attachment 73455
use sw swap for dma rings

If the hw swapper patch doesn't work, you can try this patch.
Comment 5 Alex Deucher 2013-01-22 15:12:35 UTC
In attachment 73440, I don't think we want to enable fence swapping:

+#ifdef __BIG_ENDIAN
+	dma_cntl |= FENCE_SWAP_ENABLE;
+#endif

since we already swap the fence in radeon_fence_read().
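
The concern about swapping twice can be illustrated with a small stand-alone model. This is only a sketch, not driver code: swap32() and fence_as_seen_by_driver() are hypothetical names, and the model assumes a big-endian CPU reading a 32-bit fence value that a little-endian device wrote to memory.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. Hypothetical stand-in for
 * either a hardware swapper or a software swap; not a kernel helper. */
uint32_t swap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Hypothetical model of reading back fence sequence number `seq`.
 * hw_swap: the device swaps bytes when writing the fence
 *          (the effect FENCE_SWAP_ENABLE is assumed to have).
 * sw_swap: the CPU swaps bytes when reading the fence
 *          (what comment 5 says radeon_fence_read() already does). */
uint32_t fence_as_seen_by_driver(uint32_t seq, int hw_swap, int sw_swap)
{
	/* A big-endian CPU loading little-endian bytes sees them reversed. */
	uint32_t value = swap32(seq);

	if (hw_swap)
		value = swap32(value);
	if (sw_swap)
		value = swap32(value);

	/* Sanity check: the model only ever returns seq or its byte reverse. */
	assert(value == seq || value == swap32(seq));
	return value;
}
```

In this model, exactly one swap returns the sequence number the device wrote, for example fence_as_seen_by_driver(1, 0, 1) gives 1. Enabling both gives 0x01000000 instead, because the two swaps cancel and the value stays in the wrong byte order.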
Comment 6 Alex Deucher 2013-01-22 15:25:34 UTC
Ignore my patch and comments. After thinking about this more, they won't work. See if Michel's patch fixes it for you.
Comment 7 Michel Dänzer 2013-01-22 15:35:53 UTC
(In reply to comment #5)
> I don't think we want to enable fence swapping [...] since we already swap the
> fence in radeon_fence_read().

Ah, right. Lucas, so it's probably better if you try without FENCE_SWAP_ENABLE first.
Comment 8 Lucas Kannebley Tavares 2013-01-22 16:27:52 UTC
Hi Michel, the patch you provided did indeed get rid of all errors, but now modprobe enters an infinite loop on the test that previously failed:

[   62.294510] [drm] ring test on 0 succeeded in 0 usecs
[   62.294578] [drm] ring test on 3 succeeded in 1 usecs
[   62.294901] [drm:evergreen_irq_set], evergreen_irq_set: sw int gfx
[   62.294943] [drm] ib test on ring 0 succeeded in 0 usecs
[   62.294989] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[   62.721958] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[   62.731965] [drm:evergreen_irq_set], r600_irq_set: sw int dma
[   62.741979] [drm:evergreen_irq_set], r600_irq_set: sw int dma
...
[  434.061917] [drm:evergreen_irq_set], r600_irq_set: sw int dma

I just saw your last comment, so I'm off to remove FENCE_SWAP_ENABLE.
Thanks
Comment 9 Lucas Kannebley Tavares 2013-01-24 16:00:07 UTC
Ok, so the patch with FENCE_SWAP_ENABLE removed worked perfectly!

Thanks
Comment 10 Michel Dänzer 2013-01-24 17:34:47 UTC
(In reply to comment #9)
> Ok, so the patch with FENCE_SWAP_ENABLE removed worked perfectly!

Great, thanks for testing! However, bugs should only be resolved once the fix has landed in mainline.
Comment 11 Florian Mickler 2013-02-05 00:08:38 UTC
A patch referencing this bug report has been merged in Linux v3.8-rc6:

commit b3dfcb207e550dffb8680cab7afaf6b4fb6eae33
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Thu Jan 24 19:02:01 2013 +0100

    drm/radeon: Enable DMA_IB_SWAP_ENABLE on big endian hosts.
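
As a rough illustration of what the commit title describes, the sketch below sets an IB swap bit only when the host is big endian. It is not the merged code: the SKETCH_* bit positions and both function names are placeholders invented for this example, and the endianness is passed in as a parameter, whereas the hunk quoted in comment 5 uses a compile-time #ifdef __BIG_ENDIAN.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only; the real definitions
 * live in the radeon register headers and may differ. */
#define SKETCH_IB_ENABLE      (1u << 0)
#define SKETCH_IB_SWAP_ENABLE (1u << 4)

/* Build a hypothetical DMA IB control value: always enable IBs, and
 * additionally enable byte swapping of IB contents on big-endian hosts. */
uint32_t sketch_dma_ib_cntl(int host_is_big_endian)
{
	uint32_t ib_cntl = SKETCH_IB_ENABLE;

	if (host_is_big_endian)
		ib_cntl |= SKETCH_IB_SWAP_ENABLE;
	return ib_cntl;
}

/* Runtime endianness probe: returns 1 if the most significant byte of a
 * 16-bit value is stored first in memory, 0 otherwise. */
int sketch_host_is_big_endian(void)
{
	const uint16_t probe = 0x0102;
	int big = *(const unsigned char *)&probe == 0x01;

	assert(big == 0 || big == 1);
	return big;
}
```

Calling sketch_dma_ib_cntl(sketch_host_is_big_endian()) would leave the swap bit clear on a little-endian machine and set it on a big-endian one such as the ppc64 system in this report.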

