Bug 73037 - [BYT]igt/many kms_flip subcases randomly fail with *ERROR* failed to train DP, aborting
Summary: [BYT]igt/many kms_flip subcases randomly fail with *ERROR* failed to train DP...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: unspecified
Hardware: All Linux (All)
: highest critical
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
: 80701 (view as bug list)
Depends on:
Blocks: 80701
  Show dependency treegraph
 
Reported: 2013-12-26 07:27 UTC by lu hua
Modified: 2016-09-28 13:30 UTC (History)
3 users (show)

See Also:
i915 platform:
i915 features:


Attachments
dmesg (125.31 KB, text/plain)
2013-12-30 07:48 UTC, lu hua
no flags Details
dmesg (127.49 KB, text/plain)
2014-02-26 02:20 UTC, lu hua
no flags Details
dmesg (124.89 KB, text/plain)
2014-06-25 06:15 UTC, lu hua
no flags Details
dmesg(patch) (124.87 KB, text/plain)
2014-07-03 05:33 UTC, lu hua
no flags Details
Patch for link training (9.90 KB, text/plain)
2014-07-15 17:18 UTC, Todd Previte
no flags Details
Patch for check_link_status() (2.96 KB, text/plain)
2014-07-15 17:18 UTC, Todd Previte
no flags Details
dmesg (124.95 KB, text/plain)
2014-07-17 06:37 UTC, lu hua
no flags Details

Description lu hua 2013-12-26 07:27:32 UTC
System Environment:
--------------------------
Platform: Baytrail
kernel:   (drm-intel-nightly)5f7867f14f8b3688584f4bf589fe4b8aa8c718b3

Bug detailed description:
---------------------------
It fails on baytrail with -nightly, -queued, -fixes kernel.

It also fails on earlier igt commit and earlier queued commit.

output:
IGT-Version: 1.5-g72fe2b6 (x86_64) (Linux: 3.13.0-rc4_drm-intel-nightly_5f7867_20131226+ x86_64)
Using monotonic timestamps
kms_flip: drmtest.c:1124: igt_exit: Assertion `skipped_one || succeeded_one || failed_one' failed.
Aborted (core dumped)

Reproduce steps:
-------------------------
1. ./kms_flip --run-subtest --run-subtest wf_vblank-vs-dpms-interruptible
Comment 1 lu hua 2013-12-30 07:48:58 UTC
Created attachment 91317 [details]
dmesg

Sorry, It fails with "*ERROR* failed to train DP, aborting" not "core dumped".
It randomly causes *ERROR* failed to train DP, aborting. It happens 1 in 15 runs.

output:
IGT-Version: 1.5-g72fe2b6 (x86_64) (Linux: 3.13.0-rc4_drm-intel-nightly_5f7867_20131230+ x86_64)
Using monotonic timestamps
Beginning wf_vblank-vs-dpms-interruptible on crtc 3, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...................
wf_vblank-vs-dpms-interruptible on crtc 3, connector 20: PASSED

Beginning wf_vblank-vs-dpms-interruptible on crtc 6, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
.....
wf_vblank-vs-dpms-interruptible on crtc 6, connector 20: PASSED

Subtest wf_vblank-vs-dpms-interruptible: SUCCESS

Call trace
[  324.889970] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[  324.889976] [drm:intel_dp_link_down],
[  324.941966] [drm:g4x_wait_for_vblank], vblank wait timed out
[  324.993102] [drm:valleyview_irq_handler], hotplug event received, stat 0x10000020
[  326.005962] ------------[ cut here ]------------
[  326.005992] WARNING: CPU: 0 PID: 3924 at drivers/gpu/drm/i915/intel_display.c:1535 vlv_pre_enable_dp+0x109/0x118 [i915]()
[  326.005996] timed out waiting for port C ready: 0x000000ff
[  326.005998] Modules linked in: ip6table_filter ip6_tables ipv6 iptable_filter ip_tables ebtable_nat ebtables x_tables dm_mod snd_hda_codec_hdmi snd_hda_codec_realtek pcspkr s
nd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore battery ac acpi_cpufreq i915 video button drm_kms_helper drm
[  326.006026] CPU: 0 PID: 3924 Comm: kms_flip Tainted: G        W    3.13.0-rc4_drm-intel-nightly_5f7867_20131230+ #42
[  326.006029]  0000000000000000 0000000000000009 ffffffff81711fd2 ffff8800025e7c08
[  326.006034]  ffffffff8103312a 0000000000000286 ffffffffa009977b dead000000200200
[  326.006039]  ffff880002a0c000 0000000000000001 ffff880002949800 0000000000000200
[  326.006044] Call Trace:
[  326.006061]  [<ffffffff81711fd2>] ? dump_stack+0x41/0x51
[  326.006069]  [<ffffffff8103312a>] ? warn_slowpath_common+0x73/0x8b
[  326.006091]  [<ffffffffa009977b>] ? vlv_pre_enable_dp+0x109/0x118 [i915]
[  326.006096]  [<ffffffff810331da>] ? warn_slowpath_fmt+0x45/0x4a
[  326.006114]  [<ffffffffa00821c3>] ? vlv_wait_port_ready+0xcb/0x104 [i915]
[  326.006132]  [<ffffffffa009977b>] ? vlv_pre_enable_dp+0x109/0x118 [i915]
[  326.006150]  [<ffffffffa00861bb>] ? valleyview_crtc_enable+0x298/0x375 [i915]
[  326.006167]  [<ffffffffa00878dd>] ? intel_crtc_update_dpms+0x4f/0x66 [i915]
[  326.006193]  [<ffffffffa008b3c1>] ? intel_connector_dpms+0x3f/0x4a [i915]
[  326.006205]  [<ffffffffa000fbcb>] ? drm_mode_obj_set_property_ioctl+0x15c/0x1e6 [drm]
[  326.006216]  [<ffffffffa000fc81>] ? drm_mode_connector_property_set_ioctl+0x2c/0x31 [drm]
[  326.006226]  [<ffffffffa0002f37>] ? drm_ioctl+0x263/0x385 [drm]
[  326.006236]  [<ffffffffa000fc55>] ? drm_mode_obj_set_property_ioctl+0x1e6/0x1e6 [drm]
[  326.006243]  [<ffffffff8108ec07>] ? __audit_syscall_exit+0x209/0x225
[  326.006248]  [<ffffffff810360fd>] ? timespec_add_safe+0x1e/0x4d
[  326.006254]  [<ffffffff810e90e8>] ? do_vfs_ioctl+0x3f1/0x43a
[  326.006258]  [<ffffffff810e917f>] ? SyS_ioctl+0x4e/0x7b
[  326.006264]  [<ffffffff8171c5e2>] ? system_call_fastpath+0x16/0x1b
[  326.006267] ---[ end trace e75f891d266e7a18 ]---

Reproduce steps:
-------------------------
1. ./kms_flip  --run-subtest wf_vblank-vs-dpms-interruptible
Comment 2 lu hua 2014-01-23 08:28:59 UTC
Many kms_flip subcases randomly fail with *ERROR* failed to train DP, aborting" not "core dumped on Baytrail.
Comment 3 Gordon Jin 2014-01-24 05:57:40 UTC
this bug need to be triaged.
Comment 4 Imre Deak 2014-01-31 07:50:34 UTC
Could you try the patch at http://lists.freedesktop.org/archives/intel-gfx/2014-January/039252.html ?
Comment 5 Imre Deak 2014-01-31 08:02:30 UTC
(In reply to comment #4)
> Could you try the patch at
> http://lists.freedesktop.org/archives/intel-gfx/2014-January/039252.html ?

Sorry you have an eDP panel, so it's a separate issue.
Comment 6 Jesse Barnes 2014-02-25 01:11:16 UTC
Does this still occur with current bits?  Quite a few changes to the DP code have landed recently...
Comment 7 lu hua 2014-02-26 01:52:58 UTC
(In reply to comment #6)
> Does this still occur with current bits?  Quite a few changes to the DP code
> have landed recently...

Test on latest igt and -nightly kernel. It works well.
Close it.
Comment 8 lu hua 2014-02-26 01:53:19 UTC
Verified.Fixed.
Comment 9 lu hua 2014-02-26 02:19:21 UTC
Sorry, Reopen it.
run ./kms_flip --run-subtest wf_vblank-vs-dpms-interruptible, this error goes away.

run ./kms_flip  --run-subtest absolute-wf_vblank-interruptible, it still has this issue.
output:
IGT-Version: 1.5-g81ade89 (x86_64) (Linux: 3.13.0_drm-intel-nightly_1be8f2_20140226+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 3, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 3, connector 20: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 6, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 6, connector 20: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS

dmesg -r | egrep "<[1-3]>" |grep drm
<3>[  346.945826] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[  348.046878] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[  349.065782] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun
<3>[  349.068277] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
Comment 10 lu hua 2014-02-26 02:20:03 UTC
Created attachment 94745 [details]
dmesg
Comment 11 Jesse Barnes 2014-03-04 18:39:24 UTC
I think Todd has been buried in this area recently.  Maybe applying some of the mode set sequence changes will help here...
Comment 12 Gordon Jin 2014-03-20 23:57:14 UTC
Todd, can you reproduce this?
Comment 13 Todd Previte 2014-03-21 01:05:06 UTC
Sure, I'll see what I can do.
Comment 14 Todd Previte 2014-04-22 16:41:11 UTC
From the log on 2/26, it looks like the AUX write is failing at the beginning of link training. So the sink device wouldn't be aware that the source device was attempting to train it.

>>[  346.929303] [drm:intel_dp_aux_ch], dp_aux_ch timeout status 0x70150064

It's unclear why we try to continue link training when clearly the sink device hasn't responded. I'll see if I can track this down on a ByT here in Oregon since mine doesn't have the eDP display on it.

-T
Comment 15 Guang Yang 2014-04-25 05:16:43 UTC
Up the priority because almost every subcase of Kms_flip will have this issue randomly, caused lots of failure in nightly and hard for us to analysis the result.
If the randomly failure occur one day and disappear the next day, then occur again. Our guys feel so confused  for such a ugly behavior.
Comment 16 Todd Previte 2014-05-12 23:08:29 UTC
On my ByT system I can't reproduce this with an external display. I just acquired a ByT system with an eDP panel on it, so when that arrives, I'll see if I can reproduce it.

It looks like the call trace is the result of intel_dp_set_link_train() failing. I'll be able to figure out more when I can reproduce this issue on the inbound ByT system.
Comment 17 Todd Previte 2014-05-27 19:28:12 UTC
I can unreliably reproduce this problem as well on the ByT system I received last week (It occurred once in approximately runs). It looks like something is going wrong in the panel power up sequence  such that the sink is unresponsive. I'll dig into it more and see if I can figure out what's going wrong.
Comment 18 Guang Yang 2014-05-28 00:58:29 UTC
(In reply to comment #17)
> I can unreliably reproduce this problem as well on the ByT system I received
> last week (It occurred once in approximately runs). It looks like something
> is going wrong in the panel power up sequence  such that the sink is
> unresponsive. I'll dig into it more and see if I can figure out what's going
> wrong.
Good news, Todd, this issue blocked QA hardly, so if you have any updates or fixed patch, feel free to ping QA and add comments to this bug.
Comment 19 Daniel Vetter 2014-06-18 15:35:59 UTC
phy reset might help, please retest with latest drm-intel-nightly.
Comment 20 lu hua 2014-06-19 07:27:08 UTC
Run ./kms_flip  --run-subtest absolute-wf_vblank-interruptible 15 cycles on latest -nightly kernel, It works well. I will check all kms_flip subcases, If they all work well, I will close it.
Comment 21 lu hua 2014-06-25 06:15:51 UTC
Created attachment 101717 [details]
dmesg

It still happens on latest -nightly kernel.

# ./kms_flip
IGT-Version: 1.7-gfedb9b6 (x86_64) (Linux: 3.16.0-rc2_drm-intel-nightly_891f8a_20140625+ x86_64)
Using monotonic timestamps
Beginning wf_vblank on crtc 6, connector 25
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
..........................................................................................
wf_vblank on crtc 6, connector 25: PASSED

Beginning wf_vblank on crtc 11, connector 25
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
......................................................................^C

# dmesg -r | egrep "<[1-3]>" |grep drm
<3>[   94.616157] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[   95.504899] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[   95.507159] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[   95.507835] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe B underrun
<3>[   95.507839] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe B underrun
<3>[  107.666500] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[  108.789734] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[  109.794928] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[  109.795557] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun
Comment 22 Todd Previte 2014-07-02 17:49:36 UTC
So this:
[   94.380058] [drm:_edp_panel_vdd_on] PP_STATUS: 0x00000000 PP_CONTROL: 0xabcd0008
[   94.380061] [drm:_edp_panel_vdd_on] eDP was not running
[   94.584341] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.588878] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.592886] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.596936] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.599478] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.613493] [drm:intel_dp_set_signal_levels] Using signal levels 00000000
[   94.616148] [drm:intel_dp_aux_ch] dp_aux_ch timeout status 0x71450064
[   94.616157] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training

shows that we're trying to train the link when the sink isn't ready. The timeout status for the AUX transactions is expected in this case. Further down in the log is this:

[   94.904063] [drm:intel_dp_set_signal_levels] Using signal levels 00000000
[   94.904882] ------------[ cut here ]------------
[   94.904914] WARNING: CPU: 0 PID: 4020 at drivers/gpu/drm/i915/intel_dp.c:3170 intel_dp_complete_link_train+0x100/0x27f [i915]()

caused by calling intel_dp_complete_link_train() when DP has not been enabled in the first place. Which makes sense considering the previous failure to begin link training. I believe the root cause of this is that we're continuing the training process even though previous, required steps have failed. I think I can clean this up with a little bit of rework to the link training functions and how they're called.
Comment 23 Jani Nikula 2014-07-02 18:55:20 UTC
In plenty of cases we go on regardless of errors because sometimes it will work anyway, and we can avoid a black screen.
Comment 24 lu hua 2014-07-03 05:33:04 UTC
Created attachment 102179 [details]
dmesg(patch)

Applay Todd's patch.
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip  --run-subtest absolute-wf_vblank-interruptible
IGT-Version: 1.7-g67e29a3 (x86_64) (Linux: 3.16.0-rc2_kcloud_dc05ed_20140703+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 6, connector 25
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 6, connector 25: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 11, connector 25
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
................................................................................................................................................................................
absolute-wf_vblank-interruptible on crtc 11, connector 25: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r | egrep "<[1-3]>" |grep drm
<3>[   66.649840] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[   66.657152] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe B underrun
<3>[   83.791251] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[   83.791896] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun
Comment 25 Todd Previte 2014-07-15 17:18:02 UTC
Created attachment 102863 [details]
Patch for link training
Comment 26 Todd Previte 2014-07-15 17:18:25 UTC
Created attachment 102864 [details]
Patch for check_link_status()
Comment 27 Todd Previte 2014-07-15 17:18:57 UTC
Interesting. That's similar behavior to what I saw when the PHYs weren't coming up on VLV. I'd like to see the results of the two patches I've attached. The first one is an updated, cleaned up version of the patch I sent previously. The second one changes check_link_status() such that we don't try and configure the link when it's in a disconnected state.

Please apply, test and post the results.

-T
Comment 28 lu hua 2014-07-17 06:13:49 UTC
(In reply to comment #27)
> Interesting. That's similar behavior to what I saw when the PHYs weren't
> coming up on VLV. I'd like to see the results of the two patches I've
> attached. The first one is an updated, cleaned up version of the patch I
> sent previously. The second one changes check_link_status() such that we
> don't try and configure the link when it's in a disconnected state.
> 
> Please apply, test and post the results.
> 
> -T

Apply these 2 patches, run ./kms_flip  --run-subtest absolute-wf_vblank-interruptible crash due to bug 81320. 

./kms_flip  --run-subtest absolute-wf_vblank-interruptible
output:
IGT-Version: 1.7-g428060c (x86_64) (Linux: 3.16.0-rc5_kcloud_1f014f_20140717+ x86_64)
Using monotonic timestamps
Test assertion failure function igt_install_exit_handler, file igt_core.c:1086:
Failed assertion: exit_handler_count < MAX_EXIT_HANDLERS
kms_flip: igt_core.c:657: igt_fail: Assertion `!test_with_subtests || in_fixture' failed.
Aborted
Comment 29 lu hua 2014-07-17 06:37:54 UTC
Created attachment 102963 [details]
dmesg

Apply these 2 patches and test it on igt commit b6ed9fb0d9a68.
Run 4 cycles, "*ERROR* failed to train DP, aborting" doesn't appear. But reports "*ERROR* mismatch in has_dp_encoder (expected 1, found 0)".

root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools# dmesg -r | egrep "<[1-3]>" |grep drm
<3>[  182.656611] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[  182.663755] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun
<3>[  221.645594] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[  221.652742] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun

output
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip  --run-subtest absolute-wf_vblank-interruptible
IGT-Version: 1.7-gb6ed9fb (x86_64) (Linux: 3.16.0-rc5_kcloud_1f014f_20140717+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 7, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 7, connector 27: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 12, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 12, connector 27: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip  --run-subtest absolute-wf_vblank-interruptible
IGT-Version: 1.7-gb6ed9fb (x86_64) (Linux: 3.16.0-rc5_kcloud_1f014f_20140717+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 7, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
................................................................................................................................................................................
absolute-wf_vblank-interruptible on crtc 7, connector 27: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 12, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 12, connector 27: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip  --run-subtest absolute-wf_vblank-interruptible
IGT-Version: 1.7-gb6ed9fb (x86_64) (Linux: 3.16.0-rc5_kcloud_1f014f_20140717+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 7, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
................................................................................................................................................................................
absolute-wf_vblank-interruptible on crtc 7, connector 27: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 12, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 12, connector 27: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip  --run-subtest absolute-wf_vblank-interruptible
IGT-Version: 1.7-gb6ed9fb (x86_64) (Linux: 3.16.0-rc5_kcloud_1f014f_20140717+ x86_64)
Using monotonic timestamps
Beginning absolute-wf_vblank-interruptible on crtc 7, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
................................................................................................................................................................................
absolute-wf_vblank-interruptible on crtc 7, connector 27: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 12, connector 27
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...........................................................................................
absolute-wf_vblank-interruptible on crtc 12, connector 27: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS
Comment 30 Todd Previte 2014-07-21 21:48:44 UTC
From the update here, it appears that the tests are passing but there are still errors in the log. Looking over the log, it seems like the eDP panel isn't coming up fast enough and the driver is trying to train the panel before it's ready. That seems to be what causes the out-of-sync issues between the driver and the hardware. That might be fixable by preventing the start of link training until the panel is fully running and putting some checks in the code to verify the responses received.

-T
Comment 31 Todd Previte 2014-07-25 21:52:45 UTC
I think I need some clarification here. According to the log posted 17JULY, the tests appear to be passing. Yet this bug is still flagged as a highest priority, blocking bug. That does not seem accurate to me. The WARNs and errors that are presenting now are quite different and arguably less severe than those of the original problem (which was "*ERROR* failed to train DP, aborting"). How is this still highest priority and blocking?
Comment 32 lu hua 2014-07-29 01:53:39 UTC
(In reply to comment #31)
> I think I need some clarification here. According to the log posted 17JULY,
> the tests appear to be passing. Yet this bug is still flagged as a highest
> priority, blocking bug. That does not seem accurate to me. The WARNs and
> errors that are presenting now are quite different and arguably less severe
> than those of the original problem (which was "*ERROR* failed to train DP,
> aborting"). How is this still highest priority and blocking?

Comment 29's result is from testing patches, are they merged into nightly?
Bug 81662 blocks testing on latest nightly kernel.
If "*ERROR* failed to train DP, aborting" goes away, we will close this bug.
Comment 33 Todd Previte 2014-08-04 22:00:06 UTC
Sounds good. No they are not integrated yet. Review feedback from the mailing list indicates some work is necessary to get those patches cleaned up and ready for submission outside of their original context.
Comment 34 lu hua 2014-08-29 05:09:52 UTC
It still happens on latest -nightly kernel.
run ./kms_flip  --run-subtest absolute-wf_vblank-interruptible

output:
IGT-Version: 1.7-gd0a41b4 (x86_64) (Linux: 3.17.0-rc2_drm-intel-nightly_519a43_20140829+ x86_64)
Using monotonic timestamps

Beginning absolute-wf_vblank-interruptible on crtc 8, connector 28
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
..............................................
absolute-wf_vblank-interruptible on crtc 8, connector 28: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 13, connector 28
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
..............................................
absolute-wf_vblank-interruptible on crtc 13, connector 28: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 8, connector 16
  1024x768 60 1024 1048 1184 1344 768 771 777 806 0xa 0x40 65000
..............................................
absolute-wf_vblank-interruptible on crtc 8, connector 16: PASSED

Beginning absolute-wf_vblank-interruptible on crtc 13, connector 16
  1024x768 60 1024 1048 1184 1344 768 771 777 806 0xa 0x40 65000
..............................................
absolute-wf_vblank-interruptible on crtc 13, connector 16: PASSED

Subtest absolute-wf_vblank-interruptible: SUCCESS


# dmesg -r|egrep "<[1-3]>"|grep drm
<3>[  235.727747] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[  236.630574] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[  236.632712] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[  236.633414] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe B underrun
<3>[  236.633418] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe B underrun
<3>[  261.291479] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[  262.176234] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[  263.181448] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
<3>[  263.182083] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
<3>[  263.182085] [drm:valleyview_pipestat_irq_handler] *ERROR* pipe A underrun
Comment 35 Rodrigo Vivi 2014-10-08 20:48:14 UTC
*** Bug 80701 has been marked as a duplicate of this bug. ***
Comment 36 lu hua 2014-10-10 07:13:52 UTC
Test on latest -nightly kernel, many kms_flip sub cases randomly has this issue:
igt/kms_flip/bcs-flip-vs-modeset
igt/kms_flip/bcs-flip-vs-modeset-interruptible
igt/kms_flip/bcs-wf_vblank-vs-dpms
igt/kms_flip/bcs-wf_vblank-vs-dpms-interruptible
igt/kms_flip/bcs-wf_vblank-vs-modeset
igt/kms_flip/bcs-wf_vblank-vs-modeset-interruptible
igt/kms_flip/bo-too-big
igt/kms_flip/bo-too-big-interruptible
igt/kms_flip/flip-vs-dpms-off-vs-modeset
igt/kms_flip/flip-vs-dpms-off-vs-modeset-interruptible
igt/kms_flip/flip-vs-expired-vblank
igt/kms_flip/rcs-flip-vs-dpms
igt/kms_flip/rcs-flip-vs-dpms-interruptible
igt/kms_flip/rcs-flip-vs-modeset
igt/kms_flip/rcs-flip-vs-modeset-interruptible
igt/kms_flip/rcs-flip-vs-panning
igt/kms_flip/rcs-flip-vs-panning-interruptible
igt/kms_flip/rcs-wf_vblank-vs-dpms
igt/kms_flip/rcs-wf_vblank-vs-dpms-interruptible
igt/kms_flip/rcs-wf_vblank-vs-modeset
igt/kms_flip/rcs-wf_vblank-vs-modeset-interruptible

run ./kms_flip --run-subtest bcs-flip-vs-modeset
dmesg:
[ 1275.022998] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[ 1276.056699] [drm:edp_panel_vdd_off_sync] Turning eDP VDD off
[ 1276.056707] [drm:edp_panel_vdd_off_sync] PP_STATUS: 0x80000008 PP_CONTROL: 0xabcd0003
[ 1276.073848] ------------[ cut here ]------------
[ 1276.073887] WARNING: CPU: 1 PID: 5003 at drivers/gpu/drm/i915/intel_display.c:1748 valleyview_crtc_enable+0x9cb/0xad2 [i915]()
[ 1276.073891] timed out waiting for port C ready: 0x0000201f
[ 1276.073893] Modules linked in: dm_mod snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant snd_hda_codec_generic pcspkr r8169 snd_hda_intel snd_hda_controller snd_hda_codec i2c_i801 snd_hwdep lpc_ich snd_pcm mfd_core snd_timer snd soundcore iosf_mbi battery wmi ac acpi_cpufreq uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev i915 button video drm_kms_helper drm cfbfillrect cfbimgblt cfbcopyarea
[ 1276.073933] CPU: 1 PID: 5003 Comm: kms_flip Not tainted 3.17.0_drm-intel-nightly_ea4bec_20141009+ #326
[ 1276.073937]  0000000000000000 0000000000000009 ffffffff8178032c ffff880002bbbab8
[ 1276.073942]  ffffffff8103a8ec dead000000200200 ffffffffa00c9725 0000000000000246
[ 1276.073948]  ffff8800791a8000 0000000000186018 ffff88017a45db30 00000000f0006000
[ 1276.073953] Call Trace:
[ 1276.073964]  [<ffffffff8178032c>] ? dump_stack+0x41/0x51
[ 1276.073972]  [<ffffffff8103a8ec>] ? warn_slowpath_common+0x78/0x90
[ 1276.073999]  [<ffffffffa00c9725>] ? valleyview_crtc_enable+0x9cb/0xad2 [i915]
[ 1276.074005]  [<ffffffff8103a99c>] ? warn_slowpath_fmt+0x45/0x4a
[ 1276.074032]  [<ffffffffa00c4af0>] ? vlv_wait_port_ready+0xdc/0x114 [i915]
[ 1276.074059]  [<ffffffffa00c9725>] ? valleyview_crtc_enable+0x9cb/0xad2 [i915]
[ 1276.074086]  [<ffffffffa00cc182>] ? __intel_set_mode+0x1115/0x121d [i915]
[ 1276.074092]  [<ffffffff8177d404>] ? printk+0x4f/0x54
[ 1276.074119]  [<ffffffffa00ce448>] ? intel_set_mode+0xd/0x27 [i915]
[ 1276.074147]  [<ffffffffa00cf05b>] ? intel_crtc_set_config+0x757/0xa99 [i915]
[ 1276.074163]  [<ffffffffa0017fe9>] ? drm_mode_set_config_internal+0x48/0xbd [drm]
[ 1276.074176]  [<ffffffffa001ba12>] ? drm_mode_setcrtc+0x3e6/0x48b [drm]
[ 1276.074188]  [<ffffffffa0010760>] ? drm_ioctl+0x279/0x3bc [drm]
[ 1276.074202]  [<ffffffffa001b62c>] ? drm_mode_setplane+0x1ce/0x1ce [drm]
[ 1276.074208]  [<ffffffff81108374>] ? kmem_cache_free+0xf7/0x135
[ 1276.074215]  [<ffffffff8111cda3>] ? do_vfs_ioctl+0x3ec/0x435
[ 1276.074221]  [<ffffffff8104e66d>] ? task_work_run+0x78/0x87
[ 1276.074226]  [<ffffffff8111ce35>] ? SyS_ioctl+0x49/0x78
[ 1276.074232]  [<ffffffff8178601a>] ? int_signal+0x12/0x17
[ 1276.074237]  [<ffffffff81785d92>] ? system_call_fastpath+0x16/0x1b
[ 1276.074241] ---[ end trace d5b4ab87e33aca0d ]---

Increasing priority, It impacts so many sub cases and the result is unstable.
Comment 37 lu hua 2014-10-10 07:27:36 UTC
Test on latest -nightly kernel, kms_flip, kms_mmio_vs_cs_flip, kms_flip_event_leak, kms_plane and kms_rotation_crc subcases randomly have this issue, it impacts 100+ sub cases. It's difficult to track test result.
Comment 38 Ville Syrjala 2014-12-02 12:29:45 UTC
Can you retest with the latest nightly now that all the pps fixes have gone in?

In addition you may want to test this patch: http://patchwork.freedesktop.org/patch/37945/
It should eliminate all self inflicted link training failures, but it may make some tests fail outright since the kernel will reject the invalid connector combination instead of pretending that it's supported.
Comment 39 lu hua 2014-12-03 07:48:18 UTC
(In reply to Ville Syrjala from comment #38)
> Can you retest with the latest nightly now that all the pps fixes have gone
> in?
> 
> In addition you may want to test this patch:
> http://patchwork.freedesktop.org/patch/37945/
> It should eliminate all self inflicted link training failures, but it may
> make some tests fail outright since the kernel will reject the invalid
> connector combination instead of pretending that it's supported.

Run kms_flip on latest -nightly kernel, execute about 80 subcases. I don't meet this error, some subcases have "[ 5086.662683] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack."

Run kms_plane 5 cycles, this error doesn't exist.
Comment 40 lu hua 2014-12-12 01:35:24 UTC
Run on cycle full igt, this issue goes away. Close it.
Comment 41 Jari Tahvanainen 2016-09-28 13:30:15 UTC
Closing verified+fixed.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.