Bug 87992

Summary: X crash because si_restrict_performance_levels_before_switch(?)
Product: xorg Reporter: Öyvind Saether <oyvinds>
Component: Driver/Radeon Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED WORKSFORME QA Contact: Xorg Project Team <xorg-team>
Severity: trivial    
Priority: low CC: porcelain_mouse
Version: 7.4 (2008.09)   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
dmesg none

Description Öyvind Saether 2015-01-03 12:56:19 UTC
Xorg crashed, dmesg had this message afterwards:

[191856.414365] [drm:si_dpm_set_power_state] *ERROR* si_restrict_performance_levels_before_switch failed

total scandal.

Linux meilong 3.18.1-gentoo #1 SMP PREEMPT Fri Dec 26 12:45:10 CET 2014 x86_64 Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz GenuineIntel GNU/Linux

x11-drivers/xf86-video-ati 7.5.0
x11-libs/libdrm 2.4.58
media-libs/mesa 10.3.5
x11-base/xorg-server 1.16.2.901

I cannot remember this happening with kernel 3.17.x. The crash happened after about 2 days of uptime, during which the machine had been suspended to RAM a few times.

Sorry, I was not running with "options drm debug=0x06" at the time of the crash, so I will have to come back with additional information if/when this happens again.
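
For readers unfamiliar with the option: the drm debug value is a bitmask selecting categories of debug output. A minimal Python sketch of decoding it follows. The bit-to-category table is an assumption based on the kernel's DRM headers of that era (core, driver, KMS, PRIME) and should be checked against the kernel version actually in use. The `decode_drm_debug` helper is hypothetical.

```python
# Assumed mapping of drm.debug bits to message categories (verify against
# the DRM headers of the running kernel; later kernels add more bits).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
}

def decode_drm_debug(mask):
    """Return the category names whose bits are set in mask, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]

# The value used in this report:
print(decode_drm_debug(0x06))
```

Under that assumed mapping, 0x06 enables driver and KMS messages while leaving the noisier core messages off.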

The story at http://www.phoronix.com/scan.php?page=news_item&px=MTgzNzI says that 3.19 will have all sorts of fixes, so this bug may not be that relevant. Sorry for filing it if this is already fixed.

Please let me know if/how I can provide additional information.

Card in question:

01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin B routed to IRQ 29
	Region 0: Memory at f7e60000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0 <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0f00c  Data: 4182
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel
Comment 1 Alex Deucher 2015-01-04 17:31:43 UTC
Please attach your xorg log and dmesg output.  I think the error message is a result of a GPU hang, probably caused by a problem in the mesa driver.  If you can reliably reproduce this by only changing the kernel (not the userspace components), can you bisect?
Comment 2 Öyvind Saether 2015-01-05 14:38:45 UTC
"[191856.414365] [drm:si_dpm_set_power_state] *ERROR* si_restrict_performance_levels_before_switch failed" was the only interesting line in dmesg and I had restarted X twice before I noticed it, so sorry (Gentoo only keeps current and last log).

I changed to 3.19rc2 on the off-chance that its dpm changes would solve this, and I have not yet had any crash or seen this error again. I have turned on and am now running with "options drm debug=0x06", so I will be able to post more useful information if/when this happens again.

I am using the same mesa version so if the problem lies therein then it may show up again.

I guess it would be best to ignore this bug and let it sit here for a few weeks. I (or someone else) will need to get another X crash with this error to make any progress. If no such crash happens with 3.19rc2 (or whatever later version I may switch to) within a few weeks, then perhaps it is solved.
Comment 3 PMouse 2016-05-23 06:07:15 UTC
I've been getting this since January, when I switched to open source drivers.  DPMS is completely broken (at least on the DisplayPort output, which is all I've tried) and I've had to turn off all display power saving just to use it.

My dmesg shows two additional lines, too.

[    4.583144] [drm:si_dpm_set_power_state [radeon]] *ERROR* si_restrict_performance_levels_before_switch failed
[    4.624808] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery reached max voltage
[    4.624835] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery failed
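
The lines above follow the kernel's `[timestamp] [drm:function [module]] *ERROR* message` layout. A minimal Python sketch for pulling such lines out of a saved dmesg dump might look like this. The regular expression and the `drm_errors` helper are illustrative assumptions and not part of any radeon or kernel tooling. The sample lines are copied from this report, plus one made-up non-error line.

```python
import re

# Matches lines such as:
#   [    4.583144] [drm:si_dpm_set_power_state [radeon]] *ERROR* message
# The " [module]" part is optional because the 3.18 line in the
# description does not include it.
DRM_ERROR = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+)(?:\s+\[(?P<module>\w+)\])?\]\s+"
    r"\*ERROR\*\s+(?P<msg>.*)$"
)

def drm_errors(lines):
    """Yield (timestamp, function, module, message) for each DRM error line."""
    for line in lines:
        m = DRM_ERROR.match(line.strip())
        if m:
            yield float(m["ts"]), m["func"], m["module"], m["msg"]

sample = [
    "[191856.414365] [drm:si_dpm_set_power_state] *ERROR* si_restrict_performance_levels_before_switch failed",
    "[    4.583144] [drm:si_dpm_set_power_state [radeon]] *ERROR* si_restrict_performance_levels_before_switch failed",
    "[    4.624808] [drm:radeon_dp_link_train [radeon]] *ERROR* clock recovery reached max voltage",
    "[    4.700000] usb 1-1: new high-speed USB device",  # made-up non-error line
]

for entry in drm_errors(sample):
    print(entry)
```

To use it on a real log, pass the lines of a file saved with `dmesg > dmesg.txt` instead of `sample`.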


I would love to contribute to the resolution of this issue.  Please let me know how I can help.
Comment 4 PMouse 2016-05-23 06:09:59 UTC
Created attachment 123976 [details]
dmesg

I have no xorg.conf file.  Only 00-keyboard.conf in the .d dir.
Comment 5 Öyvind Saether 2018-05-04 23:53:04 UTC
This 3-year-old bug was probably fixed years ago, given that it involves a 3.x series kernel and mesa 10.x. I don't have a Radeon 7800 anymore, so I couldn't provide any information even if it's not.
