Bug 49025 - [SNB] gtfifodbg WARN_ON triggered on suspend
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium minor
Assignee: Ben Widawsky
 
Reported: 2012-04-20 13:55 UTC by Ben Widawsky
Modified: 2017-07-24 23:02 UTC (History)

Attachments
only enable ips polling on ilk (2.40 KB, patch)
2012-05-02 01:19 UTC, Daniel Vetter
Don't read the non-existent DPLL register on ILK+ (1.92 KB, patch)
2012-05-02 04:07 UTC, Chris Wilson

Description Ben Widawsky 2012-04-20 13:55:08 UTC
I can reproduce this by running glxgears and then initiating a suspend to RAM:
rtcwake -s 10 -m mem


[  103.641899] ------------[ cut here ]------------
[  103.641927] WARNING: at drivers/gpu/drm/i915/i915_drv.c:465 gen6_gt_check_fifodbg.isra.3+0x40/0x50 [i915]()
[  103.641932] Hardware name: 1286CTO
[  103.641936] MMIO read or write has been dropped ffffffff
[  103.641939] Modules linked in: fuse i915 fbcon bitblit softcursor font tileblit i2c_algo_bit drm_kms_helper drm uvcvideo videobuf2_vmalloc videobuf2_memops coretemp videobuf2_core i2c_i801 videodev media thinkpad_acpi sha256_generic aesni_intel aes_x86_64 crc32c_intel iwlwifi e1000e dm_crypt dm_mod ext3 jbd ext2 mbcache ehci_hcd xhci_hcd usbcore usb_common sd_mod ahci libahci libata
[  103.642003] Pid: 25, comm: kworker/u:1 Not tainted 3.4.0-rc3+ #10
[  103.642007] Call Trace:
[  103.642019]  [<ffffffff8103325f>] warn_slowpath_common+0x7f/0xc0
[  103.642027]  [<ffffffff81033356>] warn_slowpath_fmt+0x46/0x50
[  103.642044]  [<ffffffffa02ab3e0>] gen6_gt_check_fifodbg.isra.3+0x40/0x50 [i915]
[  103.642060]  [<ffffffffa02ab87e>] __gen6_gt_force_wake_put+0x1e/0x20 [i915]
[  103.642078]  [<ffffffffa02abd90>] i915_read32+0x130/0x140 [i915]
[  103.642096]  [<ffffffffa02aeb5d>] i915_update_gfx_val+0x8d/0xf0 [i915]
[  103.642121]  [<ffffffffa02ce5af>] intel_idle_update+0x6f/0x1a0 [i915]
[  103.642132]  [<ffffffff8104ed96>] process_one_work+0x196/0x500
[  103.642139]  [<ffffffff8104ed30>] ? process_one_work+0x130/0x500
[  103.642161]  [<ffffffffa02ce540>] ? intel_enable_pipe+0x140/0x140 [i915]
[  103.642170]  [<ffffffff810506a6>] worker_thread+0x126/0x2d0
[  103.642178]  [<ffffffff81050580>] ? manage_workers.isra.23+0x1f0/0x1f0
[  103.642184]  [<ffffffff81055e1e>] kthread+0xae/0xc0
[  103.642196]  [<ffffffff815b2454>] kernel_thread_helper+0x4/0x10
[  103.642206]  [<ffffffff815a91d9>] ? retint_restore_args+0xe/0xe
[  103.642213]  [<ffffffff81055d70>] ? __init_kthread_worker+0x70/0x70
[  103.642221]  [<ffffffff815b2450>] ? gs_change+0xb/0xb
[  103.642226] ---[ end trace 7ba795e35b4bfd88 ]---
Comment 1 Daniel Vetter 2012-04-22 05:57:12 UTC
I've managed to hit this once on my SNB, but now fail to reproduce it.
Comment 2 Andrey Rahmatullin 2012-04-24 04:22:09 UTC
Same here on my ASUS K53E: it happens every time I suspend with X running, even if it's just the kdm login prompt. No visible problems after resuming.
Comment 3 Chris Wilson 2012-05-01 09:45:12 UTC
Is it always the IPS function i915_update_gfx_val? Could it just be that the chipset is unhappy with us writing to this bogus register, or is it merely just frequent enough to be blamed most of the time?
Comment 4 Ben Widawsky 2012-05-01 11:24:10 UTC
(In reply to comment #3)
> Is it always the IPS function i915_update_gfx_val? Could it just be that the
> chipset is unhappy with us writing to this bogus register, or is it merely just
> frequent enough to be blamed most of the time?

For me, at least, it always seems to be this backtrace. I only get two per suspend/resume, so I doubt it's a frequency thing.
Comment 5 Andrey Rahmatullin 2012-05-01 14:56:34 UTC
(In reply to comment #3)
> Is it always the IPS function i915_update_gfx_val? Could it just be that the
> chipset is unhappy with us writing to this bogus register, or is it merely just
> frequent enough to be blamed most of the time?

I get two WARNINGs in each suspend/resume. The first one looks like the one included in the first message and contains i915_update_gfx_val:

[<ffffffff8102a5c1>] warn_slowpath_common+0x7e/0x96
[<ffffffff8102a66d>] warn_slowpath_fmt+0x41/0x43
[<ffffffffa01e03b7>] gen6_gt_check_fifodbg.isra.5+0x31/0x44 [i915]
[<ffffffffa01e062e>] __gen6_gt_force_wake_put+0x19/0x1b [i915]
[<ffffffffa01e0884>] i915_read32+0x61/0x82 [i915]
[<ffffffffa01fa1f6>] ? intel_disable_plane+0x60/0x60 [i915]
[<ffffffffa01e27ea>] i915_update_gfx_val+0x61/0xb9 [i915]
[<ffffffffa01fa23b>] intel_idle_update+0x45/0x18b [i915]
[<ffffffff810463e9>] ? need_resched+0x1e/0x28
[<ffffffffa01fa1f6>] ? intel_disable_plane+0x60/0x60 [i915]
[<ffffffff8103c29e>] process_one_work+0x13c/0x21e
[<ffffffff8103cb9f>] worker_thread+0xce/0x152
[<ffffffff8103cad1>] ? manage_workers.isra.28+0x16c/0x16c
[<ffffffff8103ffb7>] kthread+0x86/0x8e
[<ffffffff812f2054>] kernel_thread_helper+0x4/0x10
[<ffffffff8103ff31>] ? kthread_freezable_should_stop+0x3e/0x3e
[<ffffffff812f2050>] ? gs_change+0xb/0xb

The second one is different but occurs at the same source line:


[<ffffffff8102a5c1>] warn_slowpath_common+0x7e/0x96
[<ffffffff8102a66d>] warn_slowpath_fmt+0x41/0x43
[<ffffffffa01e03b7>] gen6_gt_check_fifodbg.isra.5+0x31/0x44 [i915]
[<ffffffffa01e062e>] __gen6_gt_force_wake_put+0x19/0x1b [i915]
[<ffffffffa01e0884>] i915_read32+0x61/0x82 [i915]
[<ffffffffa01fa293>] intel_idle_update+0x9d/0x18b [i915]
[<ffffffffa01fa1f6>] ? intel_disable_plane+0x60/0x60 [i915]
[<ffffffff8103c29e>] process_one_work+0x13c/0x21e
[<ffffffff8103cb9f>] worker_thread+0xce/0x152
[<ffffffff8103cad1>] ? manage_workers.isra.28+0x16c/0x16c
[<ffffffff8103ffb7>] kthread+0x86/0x8e
[<ffffffff812f2054>] kernel_thread_helper+0x4/0x10
[<ffffffff8103ff31>] ? kthread_freezable_should_stop+0x3e/0x3e
[<ffffffff812f2050>] ? gs_change+0xb/0xb
Comment 6 Daniel Vetter 2012-05-02 01:19:48 UTC
Created attachment 60883
only enable ips polling on ilk

Please test this patch.
Comment 7 Andrey Rahmatullin 2012-05-02 03:48:03 UTC
(In reply to comment #6)
> Created attachment 60883
> only enable ips polling on ilk
> 
> Please test this patch.

The first WARNING disappeared but the second one still happens.
Comment 8 Chris Wilson 2012-05-02 04:07:40 UTC
Created attachment 60890
Don't read the non-existent DPLL register on ILK+
Comment 9 Andrey Rahmatullin 2012-05-02 04:23:05 UTC
(In reply to comment #8)
> Created attachment 60890
> Don't read the non-existent DPLL register on ILK+

This patch removes the second WARNING.
Comment 10 Daniel Vetter 2012-05-04 01:34:20 UTC
Both patches are merged to -fixes and should land in 3.4-rc6 soon:

commit e90f3b61f4432e3c5bb6b57f4b3e8d8cba747541
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 30 19:35:02 2012 +0100

    drm/i915: Only enable IPS polling for gen5

    On SandyBridge IPS was entirely implemented in hardware and not reliant
    on the driver monitoring power consumption and feeding back desired run
    states, so the hardware is able to adapt quicker and more flexibly. Which
    is a huge relief for us as we no longer have to carry empirically
    derived magic algorithms.

    Yet despite the advance in technology, the driver was still doing its
    IPS polling on all machines. Restrict it to the only supported hardware,
    Clarkdale/Arrandale.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Tested-by: Andrey Rahmatullin <wrar@wrar.name>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49025
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
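
As a rough illustration of the change described in the commit message above, the following C sketch gates the IPS poll on the hardware generation. It is a standalone model and not the driver code: struct ips_device, needs_ips_polling() and idle_update() are hypothetical names, and the counter only stands in for the call to i915_update_gfx_val seen in the backtraces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's device state; not the real i915 struct. */
struct ips_device {
    int gen;        /* 5 = Clarkdale/Arrandale, 6 = SandyBridge */
    int ips_polls;  /* how many times the software IPS poll ran */
};

/* Assumption for this sketch: software IPS polling is needed on gen5 only. */
static bool needs_ips_polling(const struct ips_device *dev)
{
    assert(dev != NULL);
    return dev->gen == 5;
}

/* Idle-work handler sketch: skip the IPS register access unless it is supported. */
static void idle_update(struct ips_device *dev)
{
    if (!needs_ips_polling(dev))
        return;

    dev->ips_polls++;  /* stands in for the i915_update_gfx_val() call */
}
```

With this shape, a gen6 (SandyBridge) device passes through the idle handler without touching the IPS registers, while a gen5 device is still polled.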

commit 074b5e1a99fb5017122591d70098601e0484ca6a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed May 2 12:07:06 2012 +0100

    drm/i915: Do not read non-existent DPLL registers on PCH hardware
    
    We only execute intel_decrease_pllclock for pre-PCH hardware, typically
    gen4 mobiles. However, in the variable declaration we did read from the
    non-PCH DPLL register, quite naughty and detected by SandyBridge.
    
    Reported-and-tested-by: Andrey Rahmatullin <wrar@wrar.name>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49025
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
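
The pattern this second commit message describes, a register read hidden in a variable initializer that runs before the platform check, can be sketched in plain C as follows. This is a simplified model under stated assumptions, not the actual intel_decrease_pllclock code: struct pll_device, read_legacy_dpll() and the two decrease_pllclock_* variants are hypothetical, and the bad_reads counter stands in for the dropped MMIO access that the hardware reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; not the real i915 structures. */
struct pll_device {
    bool has_pch_split;  /* true on PCH hardware (ILK and later) */
    int  bad_reads;      /* reads of a register that does not exist here */
};

/* Models a read of the pre-PCH DPLL register, which is invalid on PCH hardware. */
static uint32_t read_legacy_dpll(struct pll_device *dev)
{
    assert(dev != NULL);
    if (dev->has_pch_split)
        dev->bad_reads++;
    return 0;
}

/* Problematic shape: the initializer reads the register before the guard runs. */
static void decrease_pllclock_before(struct pll_device *dev)
{
    uint32_t dpll = read_legacy_dpll(dev);

    if (dev->has_pch_split)
        return;
    (void)dpll;  /* the real code would go on to reprogram the clock */
}

/* Corrected shape: check the platform first, read the register afterwards. */
static void decrease_pllclock_after(struct pll_device *dev)
{
    uint32_t dpll;

    if (dev->has_pch_split)
        return;
    dpll = read_legacy_dpll(dev);
    (void)dpll;
}
```

In the first variant the early return comes too late, because the read has already been issued by the time the guard is evaluated. Moving the read below the guard leaves PCH hardware untouched.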

