Bug 76544 - [HSW Regression]igt/kms_cursor_crc causes calltrace
Summary: [HSW Regression]igt/kms_cursor_crc causes calltrace
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: high major
Assignee: Todd Previte
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-24 08:42 UTC by lu hua
Modified: 2016-10-07 10:27 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (115.14 KB, text/plain)
2014-03-24 08:42 UTC, lu hua
dmesg(patch) (125.03 KB, text/plain)
2014-03-27 05:23 UTC, lu hua

Description lu hua 2014-03-24 08:42:28 UTC
Created attachment 96274
dmesg

System Environment:
--------------------------
Platform: Haswell
Kernel:   drm-intel-next-queued/56da08b36ee58a58723fe1bfb5fbe624a7442689/

Bug detailed description:
-----------------------------
Running ./kms_cursor_crc on Haswell with the -queued kernel produces a call trace in dmesg.

The latest known good commit: 484b41dd70a9fbea894632d8926bbb93f05021c7
The latest known bad commit: 56da08b36ee58a58723fe1bfb5fbe624a7442689

[   89.802315] WARNING: CPU: 4 PID: 781 at drivers/gpu/drm/i915/intel_display.c:6882 hsw_enable_pc8+0xcf/0x626 [i915]()
[   89.802344] CRTC for pipe A enabled
[   89.802354] Modules linked in: dm_mod snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi pcspkr serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support lpc_ich mfd_core snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore battery tpm_infineon tpm_tis tpm wmi acpi_cpufreq i915 video button drm_kms_helper drm
[   89.802470] CPU: 4 PID: 781 Comm: kworker/4:1 Not tainted 3.14.0-rc7_drm-intel-nightly_ef4aa4_20140323+ #942
[   89.802496] Hardware name: ASUS All Series/Z87-EXPERT, BIOS 1008 05/17/2013
[   89.802517] Workqueue: pm pm_runtime_work
[   89.802529]  0000000000000000 0000000000000009 ffffffff81716de3 ffff880255e7bc88
[   89.802553]  ffffffff81035052 ffffc900090c4024 ffffffffa009e9d6 0000000000000000
[   89.802578]  ffff88025114c000 ffff88025586d000 ffff8802514a2b00 ffff880256317098
[   89.802601] Call Trace:
[   89.802612]  [<ffffffff81716de3>] ? dump_stack+0x41/0x51
[   89.802629]  [<ffffffff81035052>] ? warn_slowpath_common+0x73/0x8b
[   89.802650]  [<ffffffffa009e9d6>] ? hsw_enable_pc8+0xcf/0x626 [i915]
[   89.802669]  [<ffffffff81035102>] ? warn_slowpath_fmt+0x45/0x4a
[   89.802689]  [<ffffffffa009e9d6>] ? hsw_enable_pc8+0xcf/0x626 [i915]
[   89.802710]  [<ffffffffa00600f3>] ? i915_runtime_suspend+0x62/0xb1 [i915]
[   89.802732]  [<ffffffff812f8728>] ? pci_pm_runtime_suspend+0x60/0x119
[   89.802750]  [<ffffffff81383e2e>] ? __rpm_callback+0x28/0x4c
[   89.802766]  [<ffffffff81383e9d>] ? rpm_callback+0x4b/0x69
[   89.802781]  [<ffffffff81384449>] ? rpm_suspend+0x2a9/0x3f9
[   89.802798]  [<ffffffff8103c75e>] ? internal_add_timer+0xd/0x28
[   89.802814]  [<ffffffff8103cf3f>] ? mod_timer+0x125/0x13d
[   89.802829]  [<ffffffff813850e9>] ? pm_runtime_work+0x65/0x7b
[   89.802847]  [<ffffffff81046660>] ? process_one_work+0x1bc/0x2ed
[   89.802863]  [<ffffffff810445c0>] ? pwq_activate_delayed_work+0x1e/0x28
[   89.802883]  [<ffffffff81046bce>] ? worker_thread+0x1c7/0x2bc
[   89.802899]  [<ffffffff81046a07>] ? rescuer_thread+0x251/0x251
[   89.802916]  [<ffffffff8104b53e>] ? kthread+0xc5/0xcd
[   89.802932]  [<ffffffff8104b479>] ? kthread_freezable_should_stop+0x40/0x40
[   89.802951]  [<ffffffff817213fc>] ? ret_from_fork+0x7c/0xb0
[   89.802968]  [<ffffffff8104b479>] ? kthread_freezable_should_stop+0x40/0x40
[   89.802987] ---[ end trace 7da3cc66c49a1c38 ]---


Reproduce steps:
---------------------------- 
1. ./kms_cursor_crc
Comment 1 Chris Wilson 2014-03-24 09:52:32 UTC

*** This bug has been marked as a duplicate of bug 76543 ***
Comment 2 Daniel Vetter 2014-03-24 10:16:13 UTC
The calltrace doesn't look specific to kms_cursor_crc; I think this is just due to modesets in general.

Can you please figure out how else to reproduce this with some other testcase or sequence of operations and then attempt a bisect? Unfortunately we can't bisect with kms_cursor_crc easily ... Note that Paulo's pm_pc8 has tons of special-purpose testcases for runtime PM - and the backtraces are all in runtime PM code.

If you can't find a different way to reproduce this backtrace I'll explain how to work around the now resolved issue with kms_cursor_crc and still be able to bisect.
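
For illustration only, here is a minimal Python sketch of how a bisect step could be judged automatically from dmesg output. It assumes the WARNING header keeps the format shown in the description above; the names `find_warnings` and `bisect_exit_code` are hypothetical and not part of igt or the kernel tree.

```python
import re

# Matches a kernel WARNING header in the format seen in the dmesg excerpt,
# e.g. "WARNING: CPU: 4 PID: 781 at <file>:<line> <func>+0x.../0x...".
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def find_warnings(dmesg_text):
    """Return (file, line, function) for every WARNING header in the text."""
    return [
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in WARN_RE.finditer(dmesg_text)
    ]


def bisect_exit_code(dmesg_text, func="hsw_enable_pc8"):
    """Return 1 (bad commit) if a WARNING fired in `func`, else 0 (good)."""
    fired = any(f == func for _, _, f in find_warnings(dmesg_text))
    return 1 if fired else 0
```

A wrapper script that boots the kernel under test, runs the reproducer, and feeds the dmesg text to `bisect_exit_code` could pass the result to `git bisect run`, which treats exit code 0 as good and 1 as bad.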
Comment 3 lu hua 2014-03-25 07:26:56 UTC
Bisect shows: 3e09dcd5bde5c1c3bf6aa3f848fe065f0c8fae9c is the first bad commit.
commit 3e09dcd5bde5c1c3bf6aa3f848fe065f0c8fae9c
Merge: 6ba6b7c b8a5ff8
Author: Dave Airlie <airlied@redhat.com>
Date:   Thu Feb 27 14:36:01 2014 +1000

    Merge tag 'drm-intel-next-2014-02-07' of ssh://git.freedesktop.org/git/drm-i

    - Yet more steps towards atomic modeset from Ville.
    - DP panel power sequencing improvements from Paulo.
    - irq code cleanups from Ville.
    - 5.4 GHz dp lane clock support for bdw/hsw from Todd.
    - Clock readout support for hsw/bdw (aka fastboot) from Jesse.
    - Make pipe underruns report at ERROR level (Ville). This is to check our
      improved watermarks code.
    - Full ppgtt support from Ben for gen7.
    - More fbc fixes and improvements from Ville all over the place, unfortunate
      not yet enabled by default on more platforms.
    - w/a cleanups from Ville.
    - HiZ stall optimization settings (Chia-I Wu).
    - Display register mmio offset refactor patch from Antti.
    - RPS improvements for corner-cases from Jeff McGee.

    * tag 'drm-intel-next-2014-02-07' of ssh://git.freedesktop.org/git/drm-intel
      drm/i915: Update rps interrupt limits
      drm/i915: Restore rps/rc6 on reset
      drm/i915: Prevent recursion by retiring requests when the ring is full
      drm/i915: Generate a hang error code
      drm/i915: unify FLIP_DONE macro names
      drm/i915: vlv: s/spin_lock_irqsave/spin_lock/ in irq handler
      drm/i915: factor out valleyview_pipestat_irq_handler
      drm/i915: vlv: don't unmask IIR[DISPLAY_PIPE_A/B_VBLANK] interrupt
      drm/i915: Reorganize display pipe register accesses
      drm/i915: Treat using a purged buffer as a source of EFAULT
      drm/i915: Convert EFAULT into a silent SIGBUS
      drm/i915: release mutex in i915_gem_init()'s error path
      drm/i915: check for oom when allocating private_default_ctx
      drm/i915/vlv: WA to fix Voltage not getting dropped to Vmin when Gfx is po
      drm/i915: Get rid of acthd based guilty batch search
      drm/i915: Use hangcheck score to find guilty context
      drm/i915: Drop WaDisablePSDDualDispatchEnable:ivb for IVB GT2
      drm/i915: Fix IVB GT2 WaDisableDopClockGating and WaDisablePSDDualDispatch
      drm/i915: Don't access snooped pages through the GTT (even for error captu
      drm/i915: Only print information for filing bug reports once
      ...

    Conflicts:
        drivers/gpu/drm/i915/intel_dp.c
Comment 4 Daniel Vetter 2014-03-26 13:20:36 UTC
This bisect result looks funny: runtime PM should have been in a decent state before Paulo left, and that merge commit is 1 month old. Are you sure you've chased the same warning?
Comment 5 Paulo Zanoni 2014-03-26 13:35:47 UTC
Hi

Can you please apply "[PATCH 1/6] drm/i915: don't schedule force_wake_timer at gen6_read" from the mailing list and test the tree again? It fixes the problem on my machine.

Here's the link to the patch: http://patchwork.freedesktop.org/patch/21687/

Thanks,
Paulo
Comment 6 lu hua 2014-03-27 05:22:43 UTC
(In reply to comment #5)
> Hi
> 
> Can you please apply "[PATCH 1/6] drm/i915: don't schedule force_wake_timer
> at gen6_read" from the mailing list and test the tree again? It fixes the
> problem on my machine.
> 
> Here's the link to the patch: http://patchwork.freedesktop.org/patch/21687/
> 
> Thanks,
> Paulo

Tested this patch. The call trace goes away, but the following error appears:
<3>[  658.042450] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 193
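
As background on the message above: an EDID block is 128 bytes long, and its last byte is chosen so that all 128 bytes sum to 0 modulo 256. The reported "remainder" appears to be that sum when it is non-zero. The following Python sketch shows the arithmetic; the function names are hypothetical and this is not the kernel's code.

```python
def edid_block_remainder(block):
    """Sum all 128 bytes of an EDID block modulo 256; 0 means a valid checksum."""
    if len(block) != 128:
        raise ValueError("an EDID block is exactly 128 bytes")
    return sum(block) % 256


def fix_checksum(block):
    """Return a copy whose last byte makes the block sum to 0 modulo 256."""
    body = bytes(block[:127])
    return body + bytes([(-sum(body)) % 256])


# Example block: the fixed 8-byte EDID header followed by zeros (illustrative
# data only, not taken from the attached dmesg).
raw = bytes([0x00] + [0xFF] * 6 + [0x00]) + bytes(120)
fixed = fix_checksum(raw)
```

With this illustrative data, `edid_block_remainder(raw)` is non-zero because the checksum byte is missing, while `edid_block_remainder(fixed)` is 0.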
Comment 7 lu hua 2014-03-27 05:23:22 UTC
Created attachment 96441
dmesg(patch)
Comment 8 lu hua 2014-03-27 05:39:55 UTC
(In reply to comment #6)
> Test this patch, The call trace goes away. but has following error:
> <3>[  658.042450] [drm:drm_edid_block_valid] *ERROR* EDID checksum is
> invalid, remainder is 193

This error isn't caused by this test case; it already appears after the system boots.
Comment 9 Daniel Vetter 2014-03-27 07:24:06 UTC
(In reply to comment #8)
> (In reply to comment #6)
> > Test this patch, The call trace goes away. but has following error:
> > <3>[  658.042450] [drm:drm_edid_block_valid] *ERROR* EDID checksum is
> > invalid, remainder is 193
> 
> This error isn't caused by this case, it appears after boot system.

This is a separate issue, please file a new bug.
Comment 10 lu hua 2014-03-28 07:34:52 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #6)
> > > Test this patch, The call trace goes away. but has following error:
> > > <3>[  658.042450] [drm:drm_edid_block_valid] *ERROR* EDID checksum is
> > > invalid, remainder is 193
> > 
> > This error isn't caused by this case, it appears after boot system.
> 
> This is a separate issue, please file a new bug.

Tested on the latest -nightly kernel; the error goes away. As for the call trace, I reported Bug 76723.
Comment 11 Paulo Zanoni 2014-04-02 14:33:37 UTC
(In reply to comment #6)
> Test this patch, The call trace goes away. but has following error:
> <3>[  658.042450] [drm:drm_edid_block_valid] *ERROR* EDID checksum is
> invalid, remainder is 193

Patch merged, closing bug. Thanks for reporting and testing!

If the same original bug still happens, please reopen this bug report. If new error messages happen, please open a new bug.

Thanks,
Paulo
Comment 12 lu hua 2014-04-03 05:30:55 UTC
Verified fixed.
Comment 13 Jari Tahvanainen 2016-10-07 10:27:25 UTC
Closing verified+fixed.