Bug 67043

Summary: Atombios stuck in a loop during resume
Product: DRI
Reporter: Parag <parag.warudkar>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: daniel, johannes.hirte, mike
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                          Flags
dmesg with dpm enabled                               none
dmesg from 3.10.1 kernel without the timeout issue   none
possible fix                                         none
Dmesg                                                none

Description Parag 2013-07-18 13:17:49 UTC
With a mainline git kernel built yesterday (3.11.0-rc1+ #7 SMP Wed Jul 17 11:34:34 EDT), resume takes longer and the following is printed in the system log:

[ 7647.285017] sd 0:0:0:0: [sda] Starting disk
[ 7647.374821] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 7647.374822] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD3E (len 55, WS 0, PS 0) @ 0xCD61

I think this started after DPM commits. Prior kernels don't show this error.
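
The two error lines above follow a fixed layout, so they can be picked out of a saved dmesg capture mechanically. The sketch below is illustrative only: the function name `find_stuck_tables` and the field names are hypothetical, and reading the first hex value as a table identifier and the value after `@` as the position where execution stopped is an assumption based on the message text, not on the driver source.

```python
import re

# Matches the "atombios stuck executing ..." line as it appears in this report.
# Field meanings are assumed from the message wording (see the note above).
STUCK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:atom_execute_table_locked\]\s+"
    r"\*ERROR\*\s+atombios stuck executing (?P<table>[0-9A-Fa-f]+) "
    r"\(len (?P<length>\d+), WS (?P<ws>\d+), PS (?P<ps>\d+)\) "
    r"@ 0x(?P<pos>[0-9A-Fa-f]+)"
)


def find_stuck_tables(dmesg_text):
    """Return one dict per 'atombios stuck executing' line in dmesg_text."""
    hits = []
    for line in dmesg_text.splitlines():
        match = STUCK_RE.search(line)
        if match is None:
            continue
        hits.append(
            {
                "timestamp": float(match.group("ts")),
                "table": int(match.group("table"), 16),
                "length": int(match.group("length")),
                "ws": int(match.group("ws")),
                "ps": int(match.group("ps")),
                "position": int(match.group("pos"), 16),
            }
        )
    return hits
```

Running this over captures from a working and a failing kernel gives a quick way to confirm whether the same table is involved each time.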

Hardware is 

lspci |grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Turks PRO [Radeon HD 7570]
Comment 1 Alex Deucher 2013-07-18 13:29:23 UTC
Please attach your dmesg output.  Is this with dpm enabled?  Do you still have the issue without dpm enabled?  If so can you bisect?
Comment 2 Parag 2013-07-18 14:33:33 UTC
Created attachment 82603
dmesg with dpm enabled
Comment 3 Parag 2013-07-18 15:00:24 UTC
Created attachment 82613
dmesg from 3.10.1 kernel without the timeout issue
Comment 4 Parag 2013-07-18 15:02:39 UTC
Tried with radeon.dpm=0 on current mainline: no difference, I still got the stuck message and the delayed resume. Also tried the stock Ubuntu 3.8 kernel and the 3.10.1 kernel from the Ubuntu daily mainline builds; the issue doesn't exist on either kernel. So it matches my vague memory that it started in the 3.11 cycle, though I'm unsure whether that was after dpm or some other patchset prior to that.
Comment 5 Alex Deucher 2013-07-18 15:20:05 UTC
If it started in 3.11 can you bisect?
Comment 6 Parag 2013-07-18 15:23:52 UTC
I suck at bisecting, but yes, I will give it a shot.
Comment 7 Parag 2013-07-18 21:32:43 UTC
After a not-so-clean git bisect (compilation failures in vgacon.c and in some TI driver I didn't need), this was the result. At least it is related to drm.

git bisect good
372835a8527f85b3eff20a18c2c339e827dfd4e4 is the first bad commit
commit 372835a8527f85b3eff20a18c2c339e827dfd4e4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Jun 15 00:13:13 2013 +0200

    drm/crtc-helper: explicit DPMS on after modeset
    
    Atm the crtc helper implementation of set_config has really
    inconsisten semantics: If just an fb update is good enough, dpms state
    will be left as-is, but if we do a full modeset we force everything to
    dpms on.
    
    This change has already been applied to the i915 modeset code in
    
    commit e3de42b68478a8c95dd27520e9adead2af9477a5
    Author: Imre Deak <imre.deak@intel.com>
    Date:   Fri May 3 19:44:07 2013 +0200
    
        drm/i915: force full modeset if the connector is in DPMS OFF mode
    
    which according to Greg KH seems to aim for a new record in most
    Bugzilla: links in a commit message.
    
    The history of this dpms forcing is pretty interesting. This patch
    here is an almost-revert of
    
    commit 811aaa55ba21ab37407018cfc01770d6b037d3fb
    Author: Keith Packard <keithp@keithp.com>
    Date:   Thu Feb 3 16:57:28 2011 -0800
    
        drm: Only set DPMS ON when actually configuring a mode
    
    which fixed the bug of trying to dpms on disabled outputs, but
    introduced the new discrepancy between an fb update only and full
    modesets. The actual introduction of this goes back to
    
    commit bf9dc102e284a5aa78c73fc9d72e11d5ccd8669f
    Author: Keith Packard <keithp@keithp.com>
    Date:   Fri Nov 26 10:45:58 2010 -0800
    
        drm: Set connector DPMS status to ON in drm_crtc_helper_set_config
    
    And if you'd dig around in the i915 driver code there's even more fun
    around forcing dpms on and losing our heads and temper of the
    resulting inconsistencies. Especially the DP re-training code had tons
    of funny stuff in it.
    
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
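
A bisect like the one in this comment comes down to three outcomes per step: the kernel fails to build, the resume error shows up, or it does not. The sketch below is one hypothetical way to write that decision down; the helper name `bisect_verdict` and the choice of marker string are assumptions for illustration. It only picks which `git bisect` subcommand (`good`, `bad`, or `skip`) applies. Building, booting, and the suspend/resume cycle still have to be done by hand.

```python
# Substring taken from the error line in the original report.
ERROR_MARKER = "atombios stuck in loop"


def bisect_verdict(build_ok, dmesg_text):
    """Map one bisect step to a 'git bisect' subcommand.

    build_ok:   False if this revision could not be compiled.
    dmesg_text: dmesg output captured after a suspend/resume cycle.
    """
    if not build_ok:
        # An untestable revision should be skipped, not guessed at.
        return "skip"
    if ERROR_MARKER in dmesg_text:
        return "bad"
    return "good"
```

For example, after resuming on a candidate kernel and saving `dmesg` to a file, `print("git bisect", bisect_verdict(True, open("dmesg.txt").read()))` prints the command to run next (`dmesg.txt` is a placeholder file name).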
Comment 8 Parag 2013-07-18 21:59:55 UTC
Verified: reverting 372835a8527f85b3eff20a18c2c339e827dfd4e4 on top of HEAD does make the problem go away.
Comment 9 Alex Deucher 2013-07-19 17:02:01 UTC
Created attachment 82700
possible fix

Does the attached patch fix it?
Comment 10 Parag 2013-07-19 19:10:49 UTC
Tested a couple of times with HEAD plus the patch from the previous comment, and it seems to have fixed the issue. Thanks!
Comment 11 Alex Deucher 2013-07-22 21:04:59 UTC
*** Bug 66767 has been marked as a duplicate of this bug. ***
Comment 12 Mike Lothian 2013-08-03 13:17:50 UTC
This is still happening on a Trinity 7500G, even with this patch:

00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Trinity [Radeon HD 7500G] [1002:990a]
Comment 13 Mike Lothian 2013-08-03 13:19:37 UTC
Created attachment 83581
Dmesg

This is running the latest drm-fixes-3.11 branch with the above fix.
Comment 14 Parag 2013-08-03 13:46:05 UTC
Note that the original bug and its duplicate, bug 66767, both occurred on suspend/resume, whereas this one doesn't seem to be suspend/resume related.
Comment 15 Mike Lothian 2013-08-03 13:53:40 UTC
I was actually in a VT at the time, I think. When this normally happens, X is up and the whole laptop appears to freeze up.
Comment 16 Alex Deucher 2013-08-03 13:55:10 UTC
I think you may be seeing a different issue that just happens to have a similar symptom. Please open a new bug. This particular issue is fixed by the referenced patch.
