Bug 71930 - Kernel Bug and X fails to start when using radeon.runpm=1
Summary: Kernel Bug and X fails to start when using radeon.runpm=1
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-23 01:11 UTC by Mike Lothian
Modified: 2014-01-02 14:14 UTC

See Also:
i915 platform:
i915 features:


Attachments
Dmesg of bug (66.41 KB, text/plain)
2013-11-23 01:11 UTC, Mike Lothian
Dmesg of switcheroo (3.91 KB, text/plain)
2013-11-26 00:02 UTC, Mike Lothian
Dmesg dynpm=1 (234.64 KB, text/plain)
2013-11-26 00:14 UTC, Mike Lothian
Dmesg from drm-fixes-3.12-radeon-poweroff (69.21 KB, text/plain)
2013-11-30 13:33 UTC, Mike Lothian
Dmesg from 3.13-rc2 (234.58 KB, text/plain)
2013-11-30 13:44 UTC, Mike Lothian
dmesg with radeon.runpm=1 (232.75 KB, text/plain)
2013-12-03 00:23 UTC, Hohahiu
Xorg.0.log (62.99 KB, text/plain)
2013-12-03 00:23 UTC, Hohahiu

Description Mike Lothian 2013-11-23 01:11:04 UTC
Created attachment 89664 [details]
Dmesg of bug

During 3.13-rc0 I was unable to boot my system to capture the logs. Since 3.13-rc1 was cut there have been improvements: I can now boot the system, but X won't start while radeon.runpm=1 is set.

I'll attach my dmesg
Comment 1 Mike Lothian 2013-11-23 01:14:24 UTC
I should add that I think this bug is also triggered during shutdown, but systemd is a little too quick for me to see it.

If required I can do a git bisect testing the shutdown rather than the boot with runpm, as I don't think the runpm code is at fault; it only exposes the issue.
Comment 2 Mike Lothian 2013-11-25 17:16:04 UTC
I think the same issue has been reported upstream too

https://bugzilla.kernel.org/show_bug.cgi?id=65761
Comment 3 Alex Deucher 2013-11-25 18:51:52 UTC
Does manually turning off the dGPU using switcheroo still work with radeon.runpm=0?  If not, can you bisect?  runpm and switcheroo use the same ACPI mechanism to turn off the dGPU.
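
A minimal sketch of the manual switcheroo test described above, assuming debugfs is mounted at /sys/kernel/debug and the script runs as root. The helper names (switcheroo_command, switcheroo_status) are hypothetical and only wrap a plain write to or read from the vga_switcheroo debugfs file. On an affected kernel, writing OFF may hang the machine, so it is worth capturing logs from another host first.

```python
from pathlib import Path

# Assumed default location; needs debugfs mounted and root privileges.
SWITCH_PATH = Path("/sys/kernel/debug/vgaswitcheroo/switch")

# A subset of the commands vga_switcheroo accepts on this file.
VALID_COMMANDS = {"ON", "OFF", "IGD", "DIS", "DIGD", "DDIS"}


def switcheroo_command(command: str, path: Path = SWITCH_PATH) -> None:
    """Write a command such as OFF or ON to the switcheroo control file."""
    if command not in VALID_COMMANDS:
        raise ValueError(f"unsupported switcheroo command: {command!r}")
    with open(path, "w") as f:
        f.write(command)


def switcheroo_status(path: Path = SWITCH_PATH) -> str:
    """Return the current client list exactly as the kernel reports it."""
    return Path(path).read_text()


# Example (hypothetical usage, run as root):
#   switcheroo_command("OFF")   # power down the inactive GPU
#   print(switcheroo_status())
#   switcheroo_command("ON")    # power it back up
```

The path is a parameter so the helper can be pointed at a scratch file when trying it out without touching the real device.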
Comment 4 Mike Lothian 2013-11-26 00:00:45 UTC
It locks up the machine

I'll attach the output from journalctl -f, which was captured over ssh from another machine

It froze up the laptop completely and the connection died
Comment 5 Mike Lothian 2013-11-26 00:02:11 UTC
Created attachment 89797 [details]
Dmesg of switcheroo
Comment 6 Mike Lothian 2013-11-26 00:13:08 UTC
X is now starting with runpm=1 but I get a kworker eating lots of cpu and a slight freeze / stutter every few seconds

Looks like the card is initialised every few seconds in the dmesg
Comment 7 Mike Lothian 2013-11-26 00:14:28 UTC
Created attachment 89800 [details]
Dmesg dynpm=1
Comment 8 Alex Deucher 2013-11-26 15:41:16 UTC
(In reply to comment #4)
> It locks up the machine

Can you bisect to see what broke vgaswitcheroo on your system?  Both runpm and switcheroo use the same ACPI method, so fixing one will likely fix both.
Comment 9 Mike Lothian 2013-11-26 19:30:57 UTC
I bisected back to [bbd34fcdd1b201e996235731a7c98fd5197d9e51] ACPI / hotplug / PCI: Register all devices under the given bridge

Which was during the 3.11 cycle, I believe. There's a chance I picked up on a previous bug, so I'll do some more testing
Comment 10 Alex Deucher 2013-11-26 19:57:48 UTC
(In reply to comment #9)
> I bisected back to [bbd34fcdd1b201e996235731a7c98fd5197d9e51] ACPI / hotplug
> / PCI: Register all devices under the given bridge
> 
> Which was during the 3.11 cycle I believe - I guess there's a chance I
> picked up on a previous bug I'll do some more testing

Does reverting that commit fix the issues?
Comment 11 Mike Lothian 2013-11-26 21:15:59 UTC
I've just tried but it doesn't revert cleanly - not even on 3.11
Comment 12 Mike Lothian 2013-11-26 22:06:34 UTC
I think that was when I was playing around with the dynamic shutdown patches in one of Dave's git branches, which explains why I haven't really noticed this until now: I rarely used switcheroo.

This is also possibly related to https://bugzilla.kernel.org/show_bug.cgi?id=61891
Comment 13 sotiris papadimitriou 2013-11-29 06:47:51 UTC
Same problem.
My graphics cards:
lspci | grep VGA
01:05.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RS880M [Mobility Radeon HD 4225/4250]
02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Park [Mobility Radeon HD 5430/5450/5470]
With radeon.runpm=0 everything is o.k.
Manually turning off the dGPU using switcheroo freezes the machine.
 
I use Ubuntu 14.04 with kernel 3.13-rc1
Comment 14 sotiris papadimitriou 2013-11-29 06:59:04 UTC
(In reply to comment #13)
> Same problem.
> My graphics cards:
> lspci | grep VGA
> 01:05.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI
> RS880M [Mobility Radeon HD 4225/4250]
> 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Park
> [Mobility Radeon HD 5430/5450/5470]
> If radeon.runpm=0 all o.k.
> If manually turning off the dGPU using switcheroo then freeze
>  
> I use Ubuntu 14.04 with kernel 3.13-rc1
Comment 15 sotiris papadimitriou 2013-11-30 05:45:01 UTC
$ sudo cat /sys/kernel/debug/vgaswitcheroo/switch
0:IGD:+:DynPwr:0000:01:05.0
1:DIS: :DynPwr:0000:02:00.0
Should DynPwr appear on both cards, or only on the dGPU?
Does runpm power off both cards, so that the computer freezes?
Comment 16 Mike Lothian 2013-11-30 13:21:15 UTC
Switching back to airlied's drm-radeon-poweroff branch, which is based on 3.11-rc3 (and doesn't show my radeon card as a provider), gives the following in switcheroo

0:DIS: :DynOff:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0

and this is what it shows with runpm=0, but with this set I'm able to see my radeon card as a provider
Comment 17 Mike Lothian 2013-11-30 13:33:24 UTC
Created attachment 90020 [details]
Dmesg from drm-fixes-3.12-radeon-poweroff

This is the dmesg from agd5f's drm-fixes-3.12-radeon-poweroff branch when I do a switcheroo OFF then ON with runpm=0
Comment 18 Mike Lothian 2013-11-30 13:39:47 UTC
With runpm=1 on agd5f's drm-fixes-3.12-radeon-poweroff, which is based on 3.11.0-rc7, I get the following in switcheroo

0:DIS: :DynOff:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0

It also lists providers, and DRI_PRIME works great, powering up the card as and when I fire something up, at which point switcheroo shows this

0:DIS: :DynPwr:0000:01:00.0
1:IGD:+:Pwr:0000:00:02.0
Comment 19 Mike Lothian 2013-11-30 13:44:05 UTC
This is currently what's in switcheroo in 3.13-rc2 with runpm=1

0:IGD:+:Pwr:0000:00:02.0
1:DIS: :DynPwr:0000:01:00.0

The order seems to be reversed, and the system keeps powering the radeon card up and down
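
To make the switcheroo listings in this report easier to compare, here is a small parsing sketch. It assumes each line has the shape index:client:active:power:pci-address, as in the outputs quoted above; the field names and the SwitcherooClient type are my own labels, not kernel terminology.

```python
from typing import List, NamedTuple


class SwitcherooClient(NamedTuple):
    index: int        # position in the kernel's client list
    name: str         # "IGD" (integrated) or "DIS" (discrete)
    active: bool      # True when the third field is "+"
    power: str        # e.g. "Pwr", "Off", "DynPwr", "DynOff"
    pci_address: str  # e.g. "0000:01:00.0"


def parse_switch_line(line: str) -> SwitcherooClient:
    # Split on the first four colons only; the PCI address contains colons.
    index, name, active, power, pci = line.rstrip("\n").split(":", 4)
    return SwitcherooClient(int(index), name, active == "+", power, pci)


def parse_switch(text: str) -> List[SwitcherooClient]:
    return [parse_switch_line(l) for l in text.splitlines() if l.strip()]


# Listing reported for 3.13-rc2 with runpm=1.
sample = "0:IGD:+:Pwr:0000:00:02.0\n1:DIS: :DynPwr:0000:01:00.0\n"
for client in parse_switch(sample):
    print(client)
```

Comparing the parsed name and index fields across kernels shows the ordering change directly (DIS first on the 3.11-based branches, IGD first on 3.13-rc2).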
Comment 20 Mike Lothian 2013-11-30 13:44:55 UTC
Created attachment 90021 [details]
Dmesg from 3.13-rc2
Comment 21 Mike Lothian 2013-11-30 13:47:11 UTC
I'm only seeing the intel provider out of xrandr --listproviders, and the system keeps stuttering when the card is powered up (it looks like it happens while it's printing the dpm tables)
Comment 22 Hohahiu 2013-12-03 00:21:47 UTC
I also experience issues with radeon.runpm=1 on 3.13-rc2.
My system is intel hd4000 + Mobile radeon 7750M. However lspci shows the following:
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Chelsea LP [Radeon HD 7730M]

Attached below are the dmesg with radeon.runpm=1 and the Xorg.0.log.
Comment 23 Hohahiu 2013-12-03 00:23:18 UTC
Created attachment 90126 [details]
dmesg with radeon.runpm=1
Comment 24 Hohahiu 2013-12-03 00:23:48 UTC
Created attachment 90127 [details]
Xorg.0.log
Comment 25 Mike Lothian 2014-01-02 14:14:37 UTC
Fixed in https://bugzilla.kernel.org/show_bug.cgi?id=61891

