| Summary: | AMD Radeon HD 6950 (Cayman): Power profile has no effect after resume from hibernation | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Harald Judt <h.judt> |
| Component: | DRM/Radeon | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED FIXED | QA Contact: | |
| Severity: | normal | | |
| Priority: | medium | | |
| Version: | XOrg git | | |
| Hardware: | x86-64 (AMD64) | | |
| OS: | Linux (All) | | |

Description
Harald Judt
2011-06-20 01:32:26 UTC
Are you sure you can't change clocks after suspend? I know that after a gpu crash the clocks will be at default, but profile still incorrectly showing it's at low, could be the same after resume. Which means to actually get it to switch to low again you need to first switch it to mid/high.

Thanks for your response. Yes, I'm sure. I've already tried what you suggested, it doesn't matter. Additionally, the display will not flicker when changing the power profile after resume, in contrast to before hibernation.

low:
default engine clock: 800000 kHz
current engine clock: 249990 kHz
default memory clock: 1250000 kHz
current memory clock: 150000 kHz
voltage: 900 mV

medium:
default engine clock: 800000 kHz
current engine clock: 499990 kHz
default memory clock: 1250000 kHz
current memory clock: 1250000 kHz
voltage: 1000 mV

default:
default engine clock: 800000 kHz
current engine clock: 799940 kHz
default memory clock: 1250000 kHz
current memory clock: 1250000 kHz
voltage: 1000 mV

high:
Not possible, machine freezes completely. Not even magic sysreq keys work anymore, hard reset required.

Is there any other way to change the clock speeds, without using power_profile?

As explained before, something seems to get messed up by standby/resume (see strange voltage readings in my initial report). This *could* be caused by me applying self-modified tuxonice patch. I will have to test hibernation in vanilla kernel to exclude this possibility.

On the other hand, power_profile 'high' doesn't work at all, and I don't think this has anything to do with suspend/hibernation or the tuxonice patch. Shall I open another bug report for this? I understand that cayman support is still in development.

The funny voltage value isn't a real value it's a flag for the driver.
This patch should fix up that issue:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a377e187df725fe7e62d2cec59ec290c5a605d93

> The funny voltage value isn't a real value it's a flag for the driver.
> This patch should fix up that issue:
[...]
I had this patch already applied, but this seems to fix a different issue not related to resuming.
I reproduced the problem on git vanilla kernel now (bccaeafd7c117acee36e90d37c7e05c19be9e7bf) using in-kernel suspend. I didn't check whether this has your patch already applied, but I don't think it would change anything here.
So there are two things to conclude:
* The strange voltage reading and the power_profile malfunction are both caused by doing hibernate/resume.
* The self-modified tuxonice patch is not the culprit, as the symptoms are the same with in-kernel suspend.
Therefore, something is not right with the cayman power management code.
Would it help if I provide drm.debug information? Any special parameters required for drm.debug?
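
When comparing readings before and after a resume, it can help to turn the clock dumps into structured values instead of eyeballing them. The following is only a minimal sketch: it assumes the dump uses the same "name: value unit" line layout as the readings quoted in this thread, and it assumes the text comes from the radeon_pm_info debugfs file (typically something like /sys/kernel/debug/dri/0/radeon_pm_info, though the exact path depends on the system). The function name parse_pm_info is made up for illustration.

```python
import re

# Matches lines such as "current engine clock: 249990 kHz" or "voltage: 900 mV".
# The layout is assumed from the readings quoted in this report.
_FIELD = re.compile(
    r"^\s*(?P<name>[a-z ]+?):\s*(?P<value>\d+)\s*(?P<unit>kHz|mV)\s*$"
)


def parse_pm_info(text):
    """Return {field name: (value, unit)} for every recognised line.

    Hypothetical helper; unrecognised lines are silently skipped.
    """
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            fields[match.group("name")] = (
                int(match.group("value")),
                match.group("unit"),
            )
    return fields


# The "low" profile readings as reported in this thread.
LOW_PROFILE = """\
default engine clock: 800000 kHz
current engine clock: 249990 kHz
default memory clock: 1250000 kHz
current memory clock: 150000 kHz
voltage: 900 mV
"""

info = parse_pm_info(LOW_PROFILE)
print(info["current engine clock"])
```

Running the same parser on a dump taken after hibernate/resume would make it easy to diff the two dictionaries field by field.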
Please attach a copy of your vbios.
(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom

Created attachment 48350
Sapphire Radeon HD6950 2GiB Video BIOS
lspci:
01:00.0 VGA compatible controller: ATI Technologies Inc Device 6719 (prog-if 00 [VGA controller])
Subsystem: ATI Technologies Inc Device 0b00
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 54
Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
Region 2: Memory at fe620000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at e000 [size=256]
Expansion ROM at fe600000 [disabled] [size=128K]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee0100c Data: 41c9
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [150 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Kernel driver in use: radeon
A strange thing: Accidentally, I put the computer into suspend mode instead of hibernation. Resume was successful, and the power profile and clock readings were ok this time. So suspend-to-ram seems to work correctly, while suspend-to-disk does not.

(In reply to comment #7)
> So suspend-to-ram seems to work correctly, while suspend-to-disk does not.
Strange. The driver doesn't differentiate between the two.

Yes, but it gets even stranger:
Do hibernate & resume --> power_profile does not work, radeon_pm_info messed up.
*Now* do suspend & resume --> power_profile works again, radeon_pm_info too!
So what can be wrong here?

I've updated the kernel to current tuxonice-head which is somewhere in between linux-3.0-rc4 and linux-3.0-rc5, and the strange voltage issue has been fixed. Resuming from hibernation still produces the same problem with not being able to change the clock speeds, though. Furthermore, the machine will not awake correctly after the second or third suspend attempt; the screen stays black.

Ok, another update, including some positive results:
Compiled updated kernel based on 3.0 final:
http://git.kernel.org/?p=linux/kernel/git/nigelc/tuxonice-head.git;a=summary
merged branch
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-core-next @5a96a899bbdee86024ab9ea6d02b9e242faacbed
* Hibernation & resume still does not work correctly, leaving the power profiles nonfunctional
* Suspend & resume now works reliably and can be performed successfully multiple times after hibernation & resume, fixing the power profile issue caused by resume from hibernation

> Strange. The driver doesn't differentiate between the two.
Do you think it's a bug in the general hibernation / resume code, meaning I should ask somewhere else? What speaks against this is that the same config works on a laptop (ATI Mobility Radeon HD 3400) where there are no such issues, hence I concluded that this *might* be specific to HD6950/Cayman.
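
The symptom discussed here ("power profile has no effect") can be expressed as a simple check: after switching through the profiles, the current engine clock should differ between at least two of them. The sketch below is an illustration only, not part of the driver or any existing tool. The function name is made up, and the "stuck" readings are hypothetical values invented to show the failing case; only the first set of numbers is taken from this report.

```python
def profiles_have_effect(engine_clock_by_profile):
    """Return True if at least two profiles report different engine clocks.

    engine_clock_by_profile maps a profile name to the "current engine
    clock" value (in kHz) read while that profile was selected.
    """
    return len(set(engine_clock_by_profile.values())) > 1


# Current engine clocks (kHz) as reported earlier in this thread.
reported = {"low": 249990, "medium": 499990, "default": 799940}

# Hypothetical readings for a resume where switching profiles changes
# nothing; these numbers are made up for illustration.
stuck = {"low": 799940, "medium": 799940, "default": 799940}

print(profiles_have_effect(reported))
print(profiles_have_effect(stuck))
```

Running such a check once after hibernate/resume and once after suspend/resume would give a repeatable way to confirm the difference between the two paths.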
I'm still using 3.1-rc2, but this issue seems solved after applying some patches I grabbed from the DRI mailing list, though I don't know exactly which one(s). So everything looks fine now, but if the problem returns when 3.1 gets released, I will report back here. For the time being, consider it solved. Thank you.

Definitely solved in kernel-3.2-rc1, therefore setting resolved fixed.