Bug 69693 - [BYT]igt/kms_flip: DP link train fail due to IOSF sideband fabric timeout
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: high major
Assignee: Jesse Barnes
QA Contact: Intel GFX Bugs mailing list
Reported: 2013-09-23 06:56 UTC by lu hua
Modified: 2016-09-28 13:21 UTC

Attachments
dmesg (128.48 KB, text/plain)
2013-09-23 06:56 UTC, lu hua
dmesg nightly(24c8329416) (126.83 KB, text/plain)
2013-09-26 05:47 UTC, lu hua
dmesg nightly(nightlytop_c53fbc_20131015) (95.89 KB, text/plain)
2013-10-15 03:18 UTC, Qingshuai Tian
dmesg (commit 2f29640) (126.13 KB, text/plain)
2013-11-08 05:10 UTC, lu hua

Description lu hua 2013-09-23 06:56:04 UTC
Created attachment 86333 [details]
dmesg

System Environment:
--------------------------
Platform:  Baytrail
Kernel: (drm-intel-nightly)565d170f33cc46ec4b33911c9302a506fe72fd60

Bug detailed description:
-----------------------------
Running ./kms_flip --run-subtest plain-flip on BYT produces this error in dmesg: <3>[  537.365605] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
It fails on Baytrail with the -nightly kernel.
Many other kms_flip subtests also fail.

output:
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
select timed out or error (ret 0)
Subtest plain-flip: FAIL
DPMS property not found on 16

Reproduce steps:
----------------------------
1. ./kms_flip --run-subtest plain-flip
Comment 1 lu hua 2013-09-25 07:32:18 UTC
On the 1st cycle the output shows success, but <3>[  537.365605] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting appears in dmesg.
output(the first cycle):
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780

plain-flip on crtc 3, connector 16: PASSED

Beginning plain-flip on crtc 6, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780

plain-flip on crtc 6, connector 16: PASSED

Subtest plain-flip: SUCCESS

On the 2nd cycle the output shows a failure and the test does not exit. Running reboot afterwards leaves the system unresponsive.
output:
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
select timed out or error (ret 0)
Subtest plain-flip: FAIL
DPMS property not found on 16
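
For reference, the mode line in the output above can be turned into a refresh rate and a per-frame interval, which is roughly how often a flip event should arrive when the pipe is running. This is a minimal sketch that assumes the field order is name, vrefresh, four horizontal timings, four vertical timings, flags, type, and pixel clock in kHz. That layout is an assumption about how kms_flip prints modes, and `refresh_hz` is a hypothetical helper, not part of i-g-t.

```python
def refresh_hz(mode_line):
    """Compute the refresh rate in Hz from a kms_flip mode line.

    Assumed field layout (not confirmed by the bug report):
    name vrefresh hdisplay hsync_start hsync_end htotal
    vdisplay vsync_start vsync_end vtotal flags type clock_khz
    """
    fields = mode_line.split()
    htotal = int(fields[5])
    vtotal = int(fields[9])
    clock_khz = int(fields[12])
    # Pixel clock divided by the total pixels per frame gives frames per second.
    return clock_khz * 1000 / (htotal * vtotal)


line = "1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780"
hz = refresh_hz(line)
print(f"{hz:.3f} Hz, {1000 / hz:.2f} ms per frame")
```

If that layout holds, a "select timed out" result means no flip event arrived for far longer than one frame interval, which fits a pipe that never came up after the failed link training.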
Comment 2 Daniel Vetter 2013-09-25 10:24:00 UTC
Can you please retest with latest -nightly? I've just merged a patch from Chon Lee to fix the DP dpll settings.
Comment 3 lu hua 2013-09-26 05:43:16 UTC
Tested on -nightly (24c8329416b54b79655afe45370cf3d46f41e283). It randomly causes a system hang; this happens in 3 of 5 cycles.
output:
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780

plain-flip on crtc 3, connector 16: PASSED

Beginning plain-flip on crtc 6, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
.....................................................................................................................................................................................................................................................................................................................................................................................................................................unexpected flip seq 422, should be >= 423
Subtest plain-flip: FAIL
DPMS property not found on 16

dmesg:
[  127.059316] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
[  127.059318] [drm:intel_dp_link_down],
[  127.110743] [drm:ironlake_wait_for_vblank], vblank wait timed out
[  128.163737] ------------[ cut here ]------------
[  128.163817] WARNING: CPU: 0 PID: 3474 at drivers/gpu/drm/i915/intel_display.c:1479 vlv_pre_enable_dp+0x112/0x121 [i915]()
[  128.163822] timed out waiting for port C ready: 0xf00020ff
[  128.163882] Modules linked in: ip6table_filter ip6_tables ipv6 iptable_filter ip_tables ebtable_nat ebtables x_tables dm_mod snd_hda_codec_realtek serio_raw pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore battery ac acpi_cpufreq i915 video button drm_kms_helper drm freq_table
[  128.163886] CPU: 0 PID: 3474 Comm: kms_flip Not tainted 3.12.0-rc2_nightlytop_24c832_20130926_+ #286
[  128.163890]  0000000000000000 0000000000000009 ffffffff817041c4 ffff880073d85848
[  128.163893]  ffffffff8103319e 0000000000000282 ffffffffa009b1ff dead000000200200
[  128.163895]  ffff88006ff44000 0000000000000000 0000000000000001 ffff880004d11800
[  128.163896] Call Trace:
[  128.163904]  [<ffffffff817041c4>] ? dump_stack+0x41/0x51
[  128.163910]  [<ffffffff8103319e>] ? warn_slowpath_common+0x73/0x8b
[  128.163926]  [<ffffffffa009b1ff>] ? vlv_pre_enable_dp+0x112/0x121 [i915]
[  128.163930]  [<ffffffff8103324e>] ? warn_slowpath_fmt+0x45/0x4a
[  128.163946]  [<ffffffffa0084d45>] ? vlv_wait_port_ready+0xc9/0xfc [i915]
[  128.163962]  [<ffffffffa009b1ff>] ? vlv_pre_enable_dp+0x112/0x121 [i915]
[  128.163977]  [<ffffffffa00854ac>] ? valleyview_crtc_enable+0x2b8/0x395 [i915]
[  128.163993]  [<ffffffffa008b0f5>] ? __intel_set_mode+0xfbc/0x10c7 [i915]
[  128.163998]  [<ffffffff810635b9>] ? console_trylock+0xf/0x47
[  128.164014]  [<ffffffffa008ca94>] ? intel_set_mode+0xd/0x27 [i915]
[  128.164029]  [<ffffffffa008d0df>] ? intel_crtc_set_config+0x631/0x838 [i915]
[  128.164034]  [<ffffffff812c719b>] ? cpumask_next_and+0x2b/0x3b
[  128.164045]  [<ffffffffa000f4a8>] ? drm_mode_set_config_internal+0x44/0xac [drm]
[  128.164050]  [<ffffffffa0048c6e>] ? drm_fb_helper_set_par+0x55/0x9a [drm_kms_helper]
[  128.164055]  [<ffffffff812f94f6>] ? fb_set_var+0x246/0x32c
[  128.164059]  [<ffffffff8130333d>] ? fbcon_blank+0x71/0x230
[  128.164064]  [<ffffffff81352368>] ? do_unblank_screen+0xef/0x169
[  128.164068]  [<ffffffff8134a798>] ? vt_ioctl+0x4af/0xf87
[  128.164072]  [<ffffffff810e68ce>] ? path_openat+0x22d/0x5b9
[  128.164075]  [<ffffffff81342f1d>] ? tty_ioctl+0x8b4/0x923
[  128.164078]  [<ffffffff810e6f26>] ? do_filp_open+0x2b/0x6f
[  128.164080]  [<ffffffff810e81f2>] ? vfs_ioctl+0x1e/0x31
[  128.164083]  [<ffffffff810e89c8>] ? do_vfs_ioctl+0x3ad/0x3ef
[  128.164087]  [<ffffffff810f0382>] ? __fd_install+0x15/0x39
[  128.164090]  [<ffffffff810e8a58>] ? SyS_ioctl+0x4e/0x7e
[  128.164093]  [<ffffffff8170e722>] ? system_call_fastpath+0x16/0x1b
[  128.164095] ---[ end trace d48407b80d740924 ]---

It fails in 2 of 5 runs.
output:
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 16
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
.......................unexpected flip seq 23, should be >= 24
Subtest plain-flip: FAIL
Comment 4 lu hua 2013-09-26 05:47:52 UTC
Created attachment 86613 [details]
dmesg nightly(24c8329416)
Comment 5 Jani Nikula 2013-10-11 11:13:37 UTC
The dmesg is filled with:

[  114.978739] [drm:vlv_sideband_rw], IOSF sideband finish wait (read) timed out
[  114.986739] [drm:vlv_sideband_rw], IOSF sideband idle wait (write) timed out
[  114.994738] [drm:vlv_sideband_rw], IOSF sideband idle wait (read) timed out

which means the DP doesn't have a chance to pass link training, as sideband is used for setting up the DPLL, port, etc.

Please provide a full dmesg from boot. Use log_buf_len=4M or similar if necessary to catch everything.
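
To judge how badly the sideband is stuck in a given log, the timeout messages can be tallied by wait kind and direction. The sketch below is only an illustration: the regular expression is modelled on the three lines quoted above, and `count_sideband_timeouts` is a hypothetical helper, not something in the i915 tree or in i-g-t.

```python
import re
from collections import Counter

# Pattern modelled on the three log lines quoted in this comment;
# other kernel versions may word the message differently.
SIDEBAND_RE = re.compile(
    r"\[drm:vlv_sideband_rw\],? IOSF sideband (finish|idle) wait "
    r"\((read|write)\) timed out"
)


def count_sideband_timeouts(dmesg_text):
    """Return a Counter keyed by (wait kind, direction) per timeout line."""
    counts = Counter()
    for log_line in dmesg_text.splitlines():
        match = SIDEBAND_RE.search(log_line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
[  114.978739] [drm:vlv_sideband_rw], IOSF sideband finish wait (read) timed out
[  114.986739] [drm:vlv_sideband_rw], IOSF sideband idle wait (write) timed out
[  114.994738] [drm:vlv_sideband_rw], IOSF sideband idle wait (read) timed out
"""

for key, n in sorted(count_sideband_timeouts(sample).items()):
    print(key, n)
```

In practice the text would come from a saved dmesg attachment rather than the inline sample.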
Comment 6 Qingshuai Tian 2013-10-15 03:18:45 UTC
Created attachment 87641 [details]
dmesg nightly(nightlytop_c53fbc_20131015)

Ran test case kms_flip/plain-flip with the latest -nightly kernel (nightlytop_c53fbc_20131015).
Output:
  Using monotonic timestamps
  Beginning plain-flip on crtc 3, connector 16
    1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
  .......flip ts before the flip was issued!
  timerdiff -1s, 999959us
  Subtest plain-flip: FAIL
  Test assertion failure function set_connector_dpms, file kms_flip.c:269:
  Failed assertion: found_it
  DPMS property not found on 16
Comment 7 Gordon Jin 2013-10-18 08:43:34 UTC
The error message seems similar to Bug#67245. Are they dups?
Comment 8 Daniel Vetter 2013-10-18 14:11:36 UTC
(In reply to comment #7)
> The error message seems similar to Bug#67245. Are they dups?

Yeah I think so, good catch.

*** This bug has been marked as a duplicate of bug 67245 ***
Comment 9 Daniel Vetter 2013-10-18 14:14:13 UTC
Actually here we seem to have a stuck IO sideband fabric, which is not the case on the other bug. Better track them separately for now.
Comment 10 Gordon Jin 2013-11-01 07:58:05 UTC
Can we take this bug as high priority, as it impacts the i-g-t passrate a lot?
Comment 11 Daniel Vetter 2013-11-05 08:21:02 UTC
Jesse, ping?
Comment 12 Daniel Vetter 2013-11-05 08:29:52 UTC
To qa: Can you pls try to unplug the dp/edp screen and use something else (hdmi or vga) to unblock nightly testing while we try to debug this issue?
Comment 13 lu hua 2013-11-05 09:07:11 UTC
(In reply to comment #12)
> To qa: Can you pls try to unplug the dp/edp screen and use something else
> (hdmi or vga) to unblock nightly testing while we try to debug this issue?

OK, attaching VGA.
Comment 14 Daniel Vetter 2013-11-05 15:50:24 UTC
Do you see this only with kms_flip or also with other modeset tests like testdisplay or kms_cursor_crc? I mean the DP IOSF failure on first run, all subsequent failures until reboot look like they're only a consequence of the hw getting confused.
Comment 15 Jesse Barnes 2013-11-06 16:12:14 UTC
Based on comment #6, it looks like you might be past the link training/IOSF failures?  If so, then the timestamp/seqno bugs should be fixed by:

commit 7b5562d401c760814110385baa574480e4ce12f9
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Nov 5 15:48:01 2013 -0800

    drm/i915/vlv: use PIPE_START_VBLANK interrupts on VLV

Can you try?
Comment 16 lu hua 2013-11-08 05:09:57 UTC
Tested on the -nightly branch (commit 2f29640dfa2). The ERROR still exists, but the failure from comment #1 goes away.
output:
Using monotonic timestamps
Beginning plain-flip on crtc 3, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780

plain-flip on crtc 3, connector 20: PASSED

Beginning plain-flip on crtc 6, connector 20
  1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780

plain-flip on crtc 6, connector 20: PASSED

Subtest plain-flip: SUCCESS


dmesg -r | egrep "<[1-3]>" |grep drm
<3>[  126.266259] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[  127.154303] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[  128.263739] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
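
The pipeline above keeps raw dmesg lines whose leading "<N>" priority is 1, 2 or 3 (alert, critical, error in the syslog severity scale) and that mention drm. A rough Python equivalent is sketched below for anyone post-processing saved logs. `drm_errors` is a hypothetical helper, and the "<7>" line in the sample is invented purely to show a line being filtered out.

```python
import re

# `dmesg -r` prefixes each line with its syslog priority as "<N>".
PRIORITY_RE = re.compile(r"^<(\d+)>")
LEVEL_NAMES = {1: "alert", 2: "crit", 3: "err"}


def drm_errors(raw_dmesg):
    """Return (level name, line) pairs for drm lines at priority 1 to 3."""
    hits = []
    for log_line in raw_dmesg.splitlines():
        match = PRIORITY_RE.match(log_line)
        if match is None:
            continue
        level = int(match.group(1))
        if level in LEVEL_NAMES and "drm" in log_line:
            hits.append((LEVEL_NAMES[level], log_line))
    return hits


sample = """\
<3>[  126.266259] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<7>[  126.300000] [drm:example_debug_line] hypothetical debug message
<3>[  127.154303] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
"""

for name, log_line in drm_errors(sample):
    print(name, log_line)
```

Unlike the egrep version, this sketch only looks at the priority tag at the start of the line, so a stray "<2>" inside a message body does not count as a match.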
Comment 17 lu hua 2013-11-08 05:10:58 UTC
Created attachment 88873 [details]
dmesg (commit 2f29640)
Comment 18 Jesse Barnes 2013-11-08 20:53:30 UTC
Can you try the patches in:
https://bugs.freedesktop.org/show_bug.cgi?id=67245

in particular I wonder if the backlight VDD patch might help.
Comment 19 lu hua 2013-11-11 08:31:29 UTC
(In reply to comment #18)
> Can you try the patches in:
> https://bugs.freedesktop.org/show_bug.cgi?id=67245
> 
> in particular I wonder if the backlight VDD patch might help.

Tested with the patches from http://lists.freedesktop.org/archives/intel-gfx/2013-November/035276.html and https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29.

The ERROR goes away.
Comment 20 Daniel Vetter 2013-11-11 08:44:44 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > Can you try the patches in:
> > https://bugs.freedesktop.org/show_bug.cgi?id=67245
> > 
> > in particular I wonder if the backlight VDD patch might help.
> 
> Test patches
> http://lists.freedesktop.org/archives/intel-gfx/2013-November/035276.html

These patches are already merged to -nightly.

> and https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29.

This one isn't yet. Do you need it to fix the bug or is plain -nightly enough?
Comment 21 lu hua 2013-11-15 02:05:58 UTC
> 
> > and https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29.
> 
> This one isn't yet. Do you need it to fix the bug or is plain -nightly
> enough?

Need it to fix the bug.
Comment 22 Daniel Vetter 2013-11-16 12:30:27 UTC
Jesse, please make that little diff into a real patch and submit it to intel-gfx. Spending some brain-cycles and time on a good commit message would be highly appreciated. We've had countless bugs in this code, so some history digging to assure me we don't rebreak any old issues should be done.
Comment 23 Jesse Barnes 2013-11-19 01:05:30 UTC
Can I get extra confirmation that current -nightly doesn't work by itself?

I don't like that patch that I posted to the other bug, it theoretically shouldn't be needed...
Comment 24 lu hua 2013-11-20 03:24:35 UTC
(In reply to comment #23)
> Can I get extra confirmation that current -nightly doesn't work by itself?
> 
> I don't like that patch that I posted to the other bug, it theoretically
> shouldn't be needed...

Tested on the latest -nightly kernel.

With VGA attached, it works well.

With eDP attached, the error still exists.
dmesg -r | egrep "<[1-3]>" |grep drm
<3>[   60.106871] [drm:intel_dp_start_link_train] *ERROR* failed to enable link training
<3>[   60.995917] [drm:intel_dp_complete_link_train] *ERROR* failed to train DP, aborting
<3>[   62.019355] [drm:intel_pipe_config_compare] *ERROR* mismatch in has_dp_encoder (expected 1, found 0)
Comment 25 Paulo Zanoni 2013-12-04 21:23:59 UTC
Hi

Can you please test this tree: http://cgit.freedesktop.org/~pzanoni/linux/?h=vdd-rework

Make sure you're on branch vdd-rework.

Thanks,
Paulo
Comment 26 lu hua 2013-12-05 07:17:32 UTC
(In reply to comment #25)
> Hi
> 
> Can you please test this tree:
> http://cgit.freedesktop.org/~pzanoni/linux/?h=vdd-rework
> 
> Make sure you're on branch vdd-rework.
> 
> Thanks,
> Paulo


It works well on this branch.
Comment 27 Daniel Vetter 2013-12-05 07:35:00 UTC
Can we please have the precise measurements so we can compare how much better it works on Paulo's branch?
Comment 28 lu hua 2013-12-06 06:29:48 UTC
(In reply to comment #27)
> Can we please have the precise measurements so we can compare how much
> better it works on Paulo's branch?

With eDP attached, both the failure and the ERROR go away.
Comment 29 Paulo Zanoni 2013-12-11 19:42:23 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > Can we please have the precise measurements so we can compare how much
> > better it works on Paulo's branch?
> 
> With eDP attached, both the failure and the ERROR go away.

Comment #21 leads me to believe the commit that fixes this bug is "drm/i915: don't touch the VDD when disabling the panel". This was merged today on -nightly, so closing bug. Please reopen if it still happens.

Thanks,
Paulo
Comment 30 lu hua 2013-12-13 02:41:58 UTC
Verified. Fixed.
Comment 31 Jari Tahvanainen 2016-09-28 13:21:22 UTC
Hygienic: Closing verified+fixed bug.

