Bug 91579 - [HSW] kms_flip atomic warnings
Summary: [HSW] kms_flip atomic warnings
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: highest normal
Assignee: Maarten Lankhorst
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-08-06 20:42 UTC by Jesse Barnes
Modified: 2016-06-08 09:47 UTC (History)
6 users (show)

See Also:
i915 platform: HSW
i915 features: display/atomic


Attachments
Make update stricter (553 bytes, patch)
2015-08-10 12:10 UTC, Maarten Lankhorst
no flags Details | Splinter Review
Make spam include timestamp (5.16 KB, patch)
2015-08-11 12:23 UTC, Maarten Lankhorst
no flags Details | Splinter Review
dmesg (85.89 KB, text/plain)
2016-03-26 15:43 UTC, Hohahiu
no flags Details
Xorg.0.log (46.00 KB, text/plain)
2016-03-26 15:44 UTC, Hohahiu
no flags Details
broadwell dmesg debug=0x1f kms_flip render-flip-vs-modeset-interruptible, failing setcrtc (12.35 KB, text/plain)
2016-06-06 11:31 UTC, Maarten Lankhorst
no flags Details

Description Jesse Barnes 2015-08-06 20:42:14 UTC
The kms_flip basic flip tests all cause this spew in the logs:

[13127.815244] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=56177 end=56178)
[13128.231445] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=56201 end=56202)

These and some pipe_crc failures are the only problems with the BAT list on HSW, so I'm marking them high priority so we can start using the BAT results as a merge criteria.
Comment 1 Maarten Lankhorst 2015-08-10 10:00:37 UTC
Which test?
Comment 2 Maarten Lankhorst 2015-08-10 12:10:09 UTC
Created attachment 117612 [details] [review]
Make update stricter

Just curious, does this fix your issue?

Also I'm having troubles reproducing it. Do you have a more reliable way to do so?
Comment 3 Maarten Lankhorst 2015-08-10 12:41:11 UTC
Oh I guess I misunderstood min/max. Try bumping 100 to 200?
Comment 4 Maarten Lankhorst 2015-08-11 12:23:13 UTC
Created attachment 117627 [details] [review]
Make spam include timestamp

Just curious, how long does updating take? See attached patch.
Comment 5 Jesse Barnes 2015-08-27 16:23:23 UTC
doesn't seem to be an issue anymore...
Comment 6 Jesse Barnes 2015-08-27 16:33:09 UTC
Arg, no, still happens during a full basic test run:

[  492.955073] kms_flip: starting subtest basic-flip-vs-modeset
[  526.137521] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=1737 end=1738)
[  526.402477] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=1752 end=1753)
[  531.509458] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=2042 end=2043)
[  532.301956] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=2087 end=2088)
[  533.094934] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=2132 end=2133)


That was after some suspend/resume cycles and other testing.
Comment 7 Jesse Barnes 2015-08-27 16:54:12 UTC
Just using ./scripts/run_tests.sh -t basic from igt.
Comment 8 Jesse Barnes 2015-08-27 17:16:24 UTC
One with the timestamp patch added.  6us doesn't seem too long, we must be starting really late to cross a vblank though...  Unless we somehow have a tiny frame time in some configuration we test.

[  363.331046] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15621 end=15622) time 6 us
Comment 9 Jesse Barnes 2015-08-27 17:16:38 UTC
More results:

[  300.811492] kms_flip: starting subtest basic-flip-vs-modeset
[  332.684685] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15044 end=15045) time 2 us
[  332.858676] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15054 end=15055) time 7 us
[  333.034304] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15064 end=15065) time 7 us
[  333.121409] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15069 end=15070) time 7 us
[  339.955108] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15459 end=15460) time 2 us
[  341.450044] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=15544 end=15545) time 6 us
Comment 10 Ville Syrjala 2015-08-27 17:35:51 UTC
I wonder if someone broke the frame counter magic. Is crtc->scanline_offset sane wrt. your platform and output type when the error happens?
Comment 11 Maarten Lankhorst 2015-09-03 09:44:54 UTC
I can't reproduce it, can you give me more information about your system?
Comment 12 Ville Syrjala 2015-09-09 15:06:41 UTC
(In reply to Ville Syrjala from comment #10)
> I wonder if someone broke the frame counter magic. Is crtc->scanline_offset
> sane wrt. your platform and output type when the error happens?

Also can try to catch this with the pipe_update tracepoints? Those would show the current scanline before and after the update.
Comment 13 Jesse Barnes 2015-09-10 15:39:01 UTC
Update from yesterday's experiments.  I confirmed that the frame counter is incrementing correctly and at the right rate, but the scanline register read returns 0 when read around the start of vblank.  A sequence of reads might return:
  0
  0
  0
  <delay 1us>
  1600

This happens right around the time the frame counter increments, so when this happens we trigger the warning in the atomic update end function.  I posted a patch to workaround this by ignoring scanline read results for the first couple of lines in the display; that should be safe everywhere and lets us schedule things correctly when we hit this case.
Comment 14 Jesse Barnes 2015-09-23 14:52:16 UTC
commit 41b578fb0e8b930f2470d3f673b0fa279e77a7b8
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Sep 22 12:15:54 2015 -0700

    drm/i915: workaround bad DSL readout v3
Comment 15 Maarten Lankhorst 2015-09-24 13:47:09 UTC
*** Bug 91883 has been marked as a duplicate of this bug. ***
Comment 16 Maarten Lankhorst 2015-10-27 14:30:21 UTC
This can be triggered with kms_plane too, and happens on BDW and SKL as well.

Changing the IS_HASWELL to HAS_DDI fixes the bug for me.
Comment 17 Maarten Lankhorst 2015-11-09 08:58:41 UTC
Fixed upstream for other platforms too with

commit b291681926a142958112eedde62823230d6afb84
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Tue Nov 3 08:31:41 2015 +0100

    drm/i915: Extend DSL readout fix to BDW and SKL.
Comment 18 Ritesh Raj Sarraf 2016-02-28 07:23:38 UTC
Is it okay to re-open this bug?


I have a Haswell based machine, and I've run into similar symptoms. But surprisingly the kernel is 4.4

Linux learner 4.4.3+ #3 SMP Sat Feb 27 12:54:36 IST 2016 x86_64 GNU/Linux

[18018.773603] done.
[18018.841281] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.841455] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.842171] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.842220] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.842441] acpi device:30: Cannot transition to power state D3hot for parent in (unknown)
[18018.843116] pci_bus 0000:01: Allocating resources
[18018.843148] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.843329] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18018.843708] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
[18019.117188] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18019.840893] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18019.922415] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18021.632556] wlan0: authenticate with c4:6e:1f:d0:67:26
[18022.258600] wlan0: send auth to c4:6e:1f:d0:67:26 (try 1/3)
[18022.262324] wlan0: authenticated
[18022.264864] wlan0: associate with c4:6e:1f:d0:67:26 (try 1/3)
[18022.270164] wlan0: RX AssocResp from c4:6e:1f:d0:67:26 (capab=0x411 status=0 aid=4)
[18022.270882] wlan0: associated
[18022.270928] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[20173.215136] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=129514 end=129515) time 490 us, min 1073, max 1079, scanline start 1059, end 1092
Comment 19 Hohahiu 2016-03-26 15:43:01 UTC
Hello guys,

I do see this error on my Ivy bridge HD4000. In addition, I have some freezes (not sure if they are related to these errors). The screen and mouse may be frozen for 10-100 seconds. During these freezes computer does not response at all. The sound plays. If I watch video then after the freeze I see the video I skipped during the freeze at fast speed and then it plays normally. I noticed this bug since kernel 3.4 and it is still present in 4.5. But it could be also userspace bug (I do use userspace from git). FWIW I use plasma 5.6.

/sbin/lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Chelsea LP [Radeon HD 7730M] (rev ff)

dmesg and Xorg.0.org are attached below. If you think this is a different bug, I can open new one.

Thanks.
Comment 20 Hohahiu 2016-03-26 15:43:44 UTC
Created attachment 122575 [details]
dmesg
Comment 21 Hohahiu 2016-03-26 15:44:16 UTC
Created attachment 122576 [details]
Xorg.0.log
Comment 22 alium 2016-05-22 20:02:08 UTC
I have problem with Atomic again:


[drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=301707 end=301708) time 1 us, min 1073, max 1079, scanline start 1088, end 1080


kernel: linux-zen 4.6-1
driver: xf86-video-intel 1:2.99.917+654+ga508b11
DRM: libdrm 2.4.68

on Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)
Comment 23 Maarten Lankhorst 2016-06-06 11:31:27 UTC
Created attachment 124356 [details]
broadwell dmesg debug=0x1f kms_flip render-flip-vs-modeset-interruptible, failing setcrtc

This is on my DP-MST Broadwell, same error. However it does end with 'Channel EQ done. DP Training successful'. So I'm uncertain if link training failed.
Comment 24 Maarten Lankhorst 2016-06-06 11:42:21 UTC
Oops, wrong bug!
Comment 25 Maarten Lankhorst 2016-06-08 09:47:47 UTC
Closing bug, please file a new bug and retest with -nightly?


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.