Bug 28375

Summary: [KMS][RV620] Lockup on PM reclock
Product: DRI
Reporter: Rafał Miłecki <zajec5>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                         Flags
dmesg with WARNING no. 1                            none
dmesg with WARNING no. 2                            none
dmesg with WARNING no. 3 (with drm:drm_ioctl)       none
dmesg with GPU lockup CP stall detected no. 1       none
dmesg with GPU lockup CP stall detected no. 2       none
dmesg with GPU lockup CP stall detected no. 3       none
Xorg.0.log                                          none
step through voltage levels                         none
less revoltaging and more debugging messages        none
dmesg with less revoltaging patch applied           none

Description Rafał Miłecki 2010-06-03 16:26:52 UTC
I am running today's drm-radeon-testing (d-r-t) on my notebook.

1) Booting into init 3 and switching between the high and low profiles many times (about 50)
No problems.

2) Booting into init 5 and switching to the high profile
Some kind of [black screen|lockup|reclock] loop. I get a black screen for a second, then a corrupted and frozen display for a moment, then a black screen again, and so on. SysRq works.

3) Booting into init 5, switching to dynpm, and provoking a short upclock
Black screen for a second, then just a WARNING or a lockup detection (dmesgs attached).
Comment 1 Rafał Miłecki 2010-06-03 16:28:41 UTC
Created attachment 36046 [details]
dmesg with WARNING no. 1

Most important part:
[  512.026242] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x236/0x290 [radeon]()

Backtrace quite useless :|
[  512.026794]  [<fac7a34d>] drm_ioctl+0x12d/0x350 [drm]
[  512.026919]  [<fac7a220>] ? drm_ioctl+0x0/0x350 [drm]
Comment 2 Rafał Miłecki 2010-06-03 16:30:22 UTC
Created attachment 36047 [details]
dmesg with WARNING no. 2

Most important part:
[  743.244277] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x236/0x290 [radeon]()

Backtrace quite useless :|
[  743.244819]  [<fac7a34d>] drm_ioctl+0x12d/0x350 [drm]
[  743.244929]  [<fac7a220>] ? drm_ioctl+0x0/0x350 [drm]
Comment 3 Rafał Miłecki 2010-06-03 16:32:45 UTC
Created attachment 36048 [details]
dmesg with WARNING no. 3 (with drm:drm_ioctl)

In the previous dmesgs I had disabled drm:drm_ioctl debugging. This one keeps that debugging enabled.

Most important part:
[  101.033181] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x236/0x290 [radeon]()

Backtrace quite useless :|
[  101.033712]  [<fac7d3c2>] drm_ioctl+0x1a2/0x3f0 [drm]
[  101.033828]  [<fac7d220>] ? drm_ioctl+0x0/0x3f0 [drm]
Comment 4 Alex Deucher 2010-06-03 16:39:21 UTC
Also are you using the ddx tiling patches?  If so, do you get the same lock up with those removed?  This might be a dupe of bug 28342.
Comment 5 Rafał Miłecki 2010-06-03 16:42:37 UTC
Created attachment 36049 [details]
dmesg with GPU lockup CP stall detected no. 1
Comment 6 Rafał Miłecki 2010-06-03 16:47:24 UTC
Created attachment 36050 [details]
dmesg with GPU lockup CP stall detected no. 2
Comment 7 Rafał Miłecki 2010-06-03 16:47:48 UTC
Created attachment 36051 [details]
dmesg with GPU lockup CP stall detected no. 3
Comment 8 Rafał Miłecki 2010-06-03 16:51:51 UTC
Created attachment 36052 [details]
Xorg.0.log
Comment 9 Rafał Miłecki 2010-06-03 16:53:25 UTC
(In reply to comment #4)
> Also are you using the ddx tiling patches?  If so, do you get the same lock up
> with those removed?  This might be a dupe of bug 28342.

Today's drm-radeon-testing means:

commit 9e67e5b1a6fd4bdca48a9c267386afb236d08783
Merge: b55ad86 e40152e
Author: Dave Airlie <airlied@redhat.com>
Date:   Tue Jun 1 13:04:52 2010 +1000

    Merge tag 'v2.6.34' into drm-radeon-testing



As for the DDX, I use quite an old one:

commit 488c9fd8300505cc6c0c2f8f0f00849f27cc5d63
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Mon Mar 15 12:25:57 2010 -0400

    r6xx/r7xx: fix domain handling in accel code

    Noticed by Pauli and Michel on IRC.

    Improves GetImage performace by a factor of ~10.


Let me try updating it.
Comment 10 Rafał Miłecki 2010-06-03 17:20:47 UTC
Ohh, that was silly. I didn't understand why the first 3 dmesgs were so meaningless... I had grepped them for "drm"! :|

Before updating xf86-video-ati I tried the patch from attachment #36012, but it did not help.
Comment 11 Rafał Miłecki 2010-06-03 17:40:09 UTC
Updating the DDX didn't help. I went back in the drm tree to:

commit 94f2983b1dc64a4a90a1ac9a6da6d7a0ec2f06a8
Author: Tiago Vignatti <tiago.vignatti@nokia.com>
Date:   Mon May 24 18:24:31 2010 +0300

    vgaarb: use MIT license

which seems to be one commit before the first tiling patches. It also has the problems I described in this bug report.

As I didn't apply any patches to the DDX manually, I guess there is nothing I can revert, right?
Comment 12 Rafał Miłecki 2010-06-03 18:14:29 UTC
 bad: 94f2983b1dc64a4a90a1ac9a6da6d7a0ec2f06a8
good: 36d1701c502d4f46386e1000ad58d9497a11688d

Suspected commits:
drm/radeon/kms/pm: add support for SetVoltage cmd table (V2)
drm/radeon/kms/pm: voltage fixes
drm/radeon/kms/pm: radeon_set_power_state fixes
drm/radeon/kms/pm: patch default power state with default clocks/voltages on r6xx+
Comment 13 Alex Deucher 2010-06-03 19:42:12 UTC
If I had to guess, I would say:
drm/radeon/kms/pm: add support for SetVoltage cmd table (V2)

Changing the voltage could be problematic on your card.  We should probably step up/down the voltage rather than changing it directly.  E.g., when going from 1200 to 900 millivolts, step down to 1100, 1000, then 900.  I'll see about hacking something up tomorrow if you can verify that is the problematic commit.
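The stepping idea above can be sketched roughly as follows. This is only an illustration, not the attached patch and not the radeon driver's code: `voltage_steps` and its parameters are hypothetical names, and the values are assumed to be millivolts as in the 1200 to 900 example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the radeon driver): write into out[] the
 * voltages to program, in order, when moving from cur_mv to target_mv in
 * increments of at most step_mv. Returns the number of entries written.
 */
static size_t voltage_steps(int cur_mv, int target_mv, int step_mv,
                            int *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    if (step_mv <= 0)
        return 0; /* refuse a non-positive step instead of looping forever */

    while (cur_mv != target_mv && n < max) {
        if (cur_mv < target_mv) {
            cur_mv += step_mv;
            if (cur_mv > target_mv)
                cur_mv = target_mv; /* clamp the last step up */
        } else {
            cur_mv -= step_mv;
            if (cur_mv < target_mv)
                cur_mv = target_mv; /* clamp the last step down */
        }
        out[n++] = cur_mv;
    }
    return n;
}
```

Called with 1200, 900 and a step of 100, it fills the array with 1100, 1000, 900, matching the sequence described above. A real implementation would program the hardware at each stop, and would probably wait between stops, instead of filling an array.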
Comment 14 Alex Deucher 2010-06-03 21:38:38 UTC
Created attachment 36057 [details] [review]
step through voltage levels

If the SetVoltage stuff is indeed causing problems, this patch might help.
Comment 15 Rafał Miłecki 2010-06-04 13:48:12 UTC
I can confirm it's the voltage setting that causes the problem. Dropping it from r600.c::r600_pm_misc makes d-r-t work again.

I didn't test your patch, Alex, but it does not make much sense in my case:

1) Looking through all power modes I noticed only 2 different voltages: 950 and 1200.
2) Step is 250 anyway: 0018: USHORT usVoltageStep = 0x00fa (250)
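A quick arithmetic check of the two points above, as a sketch: with only 950 and 1200 in the tables and a step of 0x00fa (250), the whole change is a single step, so stepping cannot insert any intermediate level. `voltage_writes_needed` is a hypothetical name, and it assumes the step and the two voltages are in the same unit.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many voltage writes a stepped transition needs,
 * i.e. ceil(|to_mv - from_mv| / step_mv).
 */
static int voltage_writes_needed(int from_mv, int to_mv, int step_mv)
{
    int delta = from_mv > to_mv ? from_mv - to_mv : to_mv - from_mv;

    assert(step_mv > 0);
    return (delta + step_mv - 1) / step_mv; /* integer ceiling division */
}
```

For 1200 to 950 with a step of 0x00fa this returns 1, the same single write a direct change would do. For comparison, 1200 to 900 with a step of 100 returns 3.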
Comment 16 Rafał Miłecki 2010-06-06 03:31:16 UTC
Created attachment 36081 [details]
less revoltaging and more debugging messages
Comment 17 Rafał Miłecki 2010-06-06 03:34:55 UTC
Created attachment 36082 [details]
dmesg with less revoltaging patch applied

Here are some results of my debugging attempts.

The voltage was changed at 86.386427 and that didn't cause any problem. When starting glxgears I expected the voltage to be raised, but it wasn't. Maybe this is the problem?
Comment 18 Rafał Miłecki 2010-06-06 05:43:20 UTC
Patch posted:

[PATCH V3] drm/radeon/kms/r600+: use voltage from requested clock mode
http://lists.freedesktop.org/archives/dri-devel/2010-June/001139.html
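Going only by the patch subject, the idea is to take the voltage from the clock mode being switched to rather than from the one currently active. A minimal sketch of that idea follows. The structures and names are hypothetical and much simpler than the driver's real ones, so see the linked patch for the actual change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the radeon driver's structures. */
struct clock_mode {
    int sclk_khz;   /* engine clock of this mode */
    int voltage_mv; /* voltage this mode expects */
};

struct pm_state {
    const struct clock_mode *modes;
    size_t num_modes;
    size_t current;   /* mode the hardware is in now */
    size_t requested; /* mode we are about to switch to */
};

/* Pick the voltage to program for a reclock: the requested mode's voltage. */
static int voltage_for_reclock(const struct pm_state *pm)
{
    assert(pm != NULL && pm->requested < pm->num_modes);
    return pm->modes[pm->requested].voltage_mv;
}
```

With a low mode at 950 and a high mode at 1200, an upclock request made while the low mode is active yields 1200, whereas indexing by the current mode would have kept 950.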
