Bug 100770 - [BDW][EXT] igt@kms_fbcon_fbt@fbc-suspend cause the following kms_flip tests to fail on consecutive runs
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Imre Deak
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: PatchMerged
Keywords:
Depends on:
Blocks:
 
Reported: 2017-04-24 09:58 UTC by Marta Löfstedt
Modified: 2017-07-27 16:54 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (1.04 MB, text/plain)
2017-04-26 11:05 UTC, Marta Löfstedt

Description Marta Löfstedt 2017-04-24 09:58:54 UTC
While running the extended list several times in a row, I have noticed that the kms_flip subtests following igt@kms_flip@bo-too-big are failing on my BDW NUCi5. When this happens I typically get a vblank timeout, as below:


"[ 6408.016000] WARNING: CPU: 2 PID: 8961 at drivers/gpu/drm/i915/intel_display.c:12616 intel_atomic_commit_tail+0x1028/0x1030 [i915]
[ 6408.016002] pipe B vblank wait timed out
[ 6408.016003] Modules linked in: rfcomm bnep arc4 iwlmvm binfmt_misc nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp mac80211 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_hdmi snd_soc_ssm4567 aes_x86_64 iwlwifi crypto_simd cryptd glue_helper snd_soc_rt5640 snd_hda_codec_realtek snd_soc_rl6231 snd_soc_core snd_hda_codec_generic intel_cstate cfg80211 snd_hda_intel snd_hda_codec snd_compress snd_seq_midi snd_seq_midi_event intel_rapl_perf btusb snd_hda_core btrtl btbcm btintel input_leds snd_rawmidi bluetooth snd_pcm mei_me snd_seq lpc_ich mei snd_hwdep shpchp snd_timer snd_seq_device acpi_als ir_lirc_codec snd lirc_dev rc_rc6_mce elan_i2c kfifo_buf nuvoton_cir soundcore 8250_dw rc_core acpi_pad i2c_designware_platform industrialio mac_hid dw_dmac
[ 6408.016048]  snd_soc_sst_acpi i2c_designware_core snd_soc_sst_match spi_pxa2xx_platform parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 hid_generic usbhid i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm e1000e ahci ptp libahci pps_core sdhci_acpi sdhci i2c_hid hid video
[ 6408.016070] CPU: 2 PID: 8961 Comm: kms_flip Tainted: G     U  W       4.11.0-rc7+ #33
[ 6408.016071] Hardware name:                  /NUC5i5RYB, BIOS RYBDWi35.86A.0249.2015.0529.1640 05/29/2015
[ 6408.016072] Call Trace:
[ 6408.016079]  dump_stack+0x63/0x81
[ 6408.016082]  __warn+0xcb/0xf0
[ 6408.016085]  warn_slowpath_fmt+0x5a/0x80
[ 6408.016117]  intel_atomic_commit_tail+0x1028/0x1030 [i915]
[ 6408.016120]  ? wake_atomic_t_function+0x60/0x60
[ 6408.016147]  intel_atomic_commit+0x3b7/0x4c0 [i915]
[ 6408.016149]  ? wake_atomic_t_function+0x60/0x60
[ 6408.016166]  drm_atomic_commit+0x4b/0x50 [drm]
[ 6408.016175]  drm_atomic_helper_set_config+0x80/0xc0 [drm_kms_helper]
[ 6408.016188]  __drm_mode_set_config_internal+0x65/0x110 [drm]
[ 6408.016200]  drm_mode_setcrtc+0x4f1/0x660 [drm]
[ 6408.016212]  drm_ioctl+0x218/0x4b0 [drm]
[ 6408.016222]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 6408.016226]  ? kmem_cache_free+0x1b6/0x1e0
[ 6408.016230]  do_vfs_ioctl+0xa3/0x600
[ 6408.016233]  SyS_ioctl+0x79/0x90
[ 6408.016236]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 6408.016237] RIP: 0033:0x7fcdeaa658b7
[ 6408.016239] RSP: 002b:00007ffe58ce8838 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 6408.016241] RAX: ffffffffffffffda RBX: 000056111e4fa80c RCX: 00007fcdeaa658b7
[ 6408.016242] RDX: 00007ffe58ce8870 RSI: 00000000c06864a2 RDI: 0000000000000003
[ 6408.016243] RBP: 000056111e4fbd90 R08: 0000000000000000 R09: 00007ffe58ce8d28
[ 6408.016244] R10: 00007ffe58ce8bd4 R11: 0000000000000246 R12: 00007ffe58ce1670
[ 6408.016245] R13: 00000000000005df R14: 00007ffe58ce1560 R15: 0000000000000000
[ 6408.016247] ---[ end trace 0491a6d0dd9e81a4 ]---
[ 6418.127974] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:39:pipe B] flip_done timed out"

However, if I remove the kms_fbcon_fbt subtest fbc-suspend from the list, the kms_flip tests are fine.

I am currently running drm-tip git@bc781a3cf, but judging from extended results from farm2 BDWs this has been going on for quite a while.
Comment 1 Paulo Zanoni 2017-04-24 12:37:12 UTC
What if you replace the fbc-suspend subtest with a normal suspend/resume cycle? Just manually suspend/resume the machine, then run the tests.
Comment 2 Paulo Zanoni 2017-04-24 12:39:33 UTC
(In reply to Paulo Zanoni from comment #1)
> What if you replace the fbc-suspend subtest with a normal suspend/resume
> cycle? Just manually suspend/resume the machine, then run the tests.

If that still causes the error, the next step would be to do the same thing but with i915.enable_fbc=0 and see if it still happens.
Comment 3 Marta Löfstedt 2017-04-25 10:21:40 UTC
With i915.enable_fbc=0 I cannot reproduce the issue.
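
When repeating this check, it may be worth confirming that the parameter really took effect before drawing conclusions. A minimal Python sketch, assuming the running kernel exposes the value under /sys/module/i915/parameters/enable_fbc (the usual sysfs location for module parameters); the helper name read_module_param is made up for illustration:

```python
from pathlib import Path
from typing import Optional


def read_module_param(name: str, module: str = "i915",
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the current value of a kernel module parameter as a string.

    Returns None if the file does not exist or cannot be read, e.g. when
    the module is not loaded or does not have that parameter.
    """
    path = Path(sysfs_root) / module / "parameters" / name
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Print whatever the kernel reports; a value of "0" means FBC is disabled.
    print("i915.enable_fbc =", read_module_param("enable_fbc"))
```

The sysfs_root argument is only there so the helper can be pointed at a scratch directory for testing.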
Comment 4 Marta Löfstedt 2017-04-26 11:05:35 UTC
Created attachment 131050
dmesg

Added dmesg captured by spinning a list with these 2 tests:

igt@kms_fbcon_fbt@fbc-suspend
igt@kms_flip@vblank-vs-modeset-rpm-interruptible

In this case I get:

"[  154.166951] WARNING: CPU: 0 PID: 274 at drivers/gpu/drm/i915/intel_display.c:8623 hsw_enable_pc8+0x6be/0x710 [i915]"

instead of the previous vblank WARN. This starts on the second run of igt@kms_flip@vblank-vs-modeset-rpm-interruptible, and the WARN then keeps spamming for whatever (kms) test I run afterwards.
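
For reference, "spinning a list" can be approximated without the CI runner. The Python sketch below is only an illustration under a few assumptions: that each igt@<binary>@<subtest> name maps to an IGT test binary invoked with --run-subtest, and that the binaries live in a directory called "tests" (adjust test_dir to your build). The names igt_command and spin are hypothetical helpers, not part of IGT.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def igt_command(test_name: str, test_dir: str = "tests") -> List[str]:
    """Turn 'igt@<binary>[@<subtest>]' into an argv list for the test binary."""
    parts = test_name.split("@")
    if len(parts) not in (2, 3) or parts[0] != "igt" or not all(parts):
        raise ValueError(f"unexpected test name: {test_name!r}")
    argv = [f"{test_dir}/{parts[1]}"]
    if len(parts) == 3:
        argv += ["--run-subtest", parts[2]]
    return argv


def spin(test_names: Sequence[str], rounds: int = 2, test_dir: str = "tests",
         runner: Callable[[List[str]], int] = subprocess.call
         ) -> List[Tuple[int, str, int]]:
    """Run the whole list `rounds` times, in order.

    Returns (round, test name, exit code) tuples so it is easy to see in
    which round a test starts failing.
    """
    results = []
    for rnd in range(1, rounds + 1):
        for name in test_names:
            rc = runner(igt_command(name, test_dir))
            results.append((rnd, name, rc))
    return results
```

Calling spin(["igt@kms_fbcon_fbt@fbc-suspend", "igt@kms_flip@vblank-vs-modeset-rpm-interruptible"]) runs the two tests back to back twice, which matches the point at which the WARN first showed up here.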
Comment 5 Marta Löfstedt 2017-04-27 06:08:02 UTC
I noticed some hda involvement in the dmesg at the point where the problem starts to appear, so I tested with Imre's patch:

https://patchwork.freedesktop.org/patch/149696/

and I can no longer reproduce this issue.
Comment 6 Marta Löfstedt 2017-04-27 13:25:36 UTC
I have seen a similar sequence of issues going on with the pm_rpm tests.

i.e.
pm_runtime_get_sync() failed: -13
WARNING: CPU: 2 PID: 2308 at drivers/gpu/drm/i915/intel_drv.h:1755 intel_runtime_pm_get+0x9a/0xd0 [i915]
WARNING: CPU: 2 PID: 2308 at drivers/gpu/drm/i915/intel_drv.h:1755 gen6_read32+0x153/0x1d0 [i915]
WARNING: CPU: 3 PID: 208 at drivers/gpu/drm/i915/intel_display.c:8623 hsw_enable_pc8+0x6be/0x710 [i915]

and then the hsw_enable_pc8 WARN keeps spamming for a very long time after this has happened.

The issue is reproducible by spinning:
igt@pm_rpm@debugfs-read
igt@pm_rpm@i2c
igt@pm_rpm@pc8-residency
igt@pm_rpm@sysfs-read
igt@pm_rpm@system-suspend
igt@pm_rpm@system-suspend-execbuf
igt@pm_rpm@system-suspend-modeset
igt@pm_rpm@universal-planes

This is also not reproducible if I apply Imre's patch.
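
Since the hsw_enable_pc8 WARN floods the log, a per-function count is easier to compare between runs than the raw dmesg. A minimal Python sketch, assuming the WARNING line format seen in the logs quoted in this report; the function name count_warns is made up for illustration:

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g.
#   WARNING: CPU: 3 PID: 208 at drivers/gpu/drm/i915/intel_display.c:8623 hsw_enable_pc8+0x6be/0x710 [i915]
# with or without a leading "[ timestamp]" prefix.
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+0x"
)


def count_warns(lines: Iterable[str]) -> Counter:
    """Count kernel WARNING lines, keyed by the function that triggered them."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            counts[match.group("func")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "pm_runtime_get_sync() failed: -13",
        "WARNING: CPU: 2 PID: 2308 at drivers/gpu/drm/i915/intel_drv.h:1755 intel_runtime_pm_get+0x9a/0xd0 [i915]",
        "WARNING: CPU: 2 PID: 2308 at drivers/gpu/drm/i915/intel_drv.h:1755 gen6_read32+0x153/0x1d0 [i915]",
        "WARNING: CPU: 3 PID: 208 at drivers/gpu/drm/i915/intel_display.c:8623 hsw_enable_pc8+0x6be/0x710 [i915]",
    ]
    for func, n in count_warns(sample).most_common():
        print(f"{n:6d}  {func}")
```

To use it on a real log, pass the lines of a saved dmesg file (for example an open file object) to count_warns instead of the sample list.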
Comment 7 Marta Löfstedt 2017-05-03 09:54:31 UTC
Imre's patches have been merged to topic/core_for_CI.
So, let's see if the results for the extended list are improved by this.
Comment 8 Marta Löfstedt 2017-05-04 06:09:04 UTC
As anticipated, Imre's patches had a massive impact on the number of passing tests in the extended list.
So, what state should this bug be in now?
The patches are on:  topic/core_for_CI.
We see that the issue is fixed from the results of the CI extended list.
So, is this bug resolved or should we wait for this until the patches are moved from topic/core_for_CI to a "real" branch?
Comment 9 Jani Nikula 2017-05-04 09:17:03 UTC
(In reply to Marta Löfstedt from comment #8)
> So, is this bug resolved or should we wait for this until the patches are
> moved from topic/core_for_CI to a "real" branch?

Please wait. You could assign the bug to Imre so that he can close it once the patches have been merged to a proper upstream branch.
Comment 10 Marta Löfstedt 2017-05-09 06:07:41 UTC
Assigned to Imre to close when the patch set is merged upstream.
Comment 11 Imre Deak 2017-05-31 14:12:31 UTC
The fix is now merged to drm-tip.

