Bug 69869

Summary: [byt chv] VSync (for windowed updates) dysfunctional
Product: xorg
Reporter: cancan,feng <cancan.feng>
Component: Driver/intel
Assignee: Chris Wilson <chris>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: low
CC: cancan.feng, christophe.prigent, guang.a.yang, joakim, lei.a.liu, mengmeng.meng, qingshuai.tian, yangweix.shui
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                Flags
Xorg.0.log                                 none
dmesg log                                  none
dmesg.log                                  none
i915_error_state.log                       none
Xorg.0.log                                 none
call trace on BYT-M when run dri2-test     none
HSW_dmesg.log                              none

Description cancan,feng 2013-09-27 05:26:08 UTC
Created attachment 86705 [details]
Xorg.0.log

Environment:
----------------------------
 Libdrm:		(master)libdrm-2.4.46-42-gbf4a7cd4b2456d4dc93a86bbcc51eba4ae73390a
 Mesa:		(master)fe2528c0b69d5719b15d926ada9424cac7569b9c
 Xserver:		(master)xorg-server-1.14.99.1-215-g7d3d4ae55dd6ee338439e2424ac423b1df80501b
 Xf86_video_intel:		(master)2.99.902-54-gd7eb40efa79dd9ea720606b94a20179c7dd18e03
 Cairo:		(master)337ab1f8d9e29086bfb4001508b28835b41c6390
 Libva:		(staging)f5c913765e6af38d835cacf339054ccc60bddefb
 Libva_intel_driver:		(staging)f6685c309d94fb7679c9772703c8790cb71cdd73
 Kernel:	(drm-intel-nightly) 24c8329416b54b79655afe45370cf3d46f41e283

Bug Details Description:
----------------------------
MPlayer cannot play video smoothly after starting X. The issue occurs with SNA enabled and disappears with SNA disabled. This is an xf86_video_intel regression:

dbe75982457cfe6bb1f7422a517ced32cc74f909 is the first bad commit
commit dbe75982457cfe6bb1f7422a517ced32cc74f909
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Sep 9 15:35:42 2013 +0100

    sna/hsw: Fix the event selection for scanline waits on pipe A

Reproduce Steps:
------------------------------
1. xinit&
2. mplayer -vo xv /home/testframework/MPEG-2.mpeg
Comment 1 Chris Wilson 2013-09-27 07:52:26 UTC
So... You are complaining because you ask for vsync and you get vsync...
Comment 2 cancan,feng 2013-09-27 08:14:23 UTC
I think this really is an issue. A frame may be stuck for several seconds, even several minutes, with the screen blocked. The same video file plays smoothly under a good xf86_video_intel. The issue is easy to reproduce, so I think you can try to reproduce it and see how it goes. :)
Comment 3 Chris Wilson 2013-09-27 08:19:37 UTC
You didn't say that originally :)

If it sticks for several seconds, you should have hangcheck warnings - which is what should be unsticking it.
Comment 4 meng 2013-09-27 08:38:55 UTC
Created attachment 86710 [details]
dmesg log
Comment 5 Chris Wilson 2013-09-27 08:45:58 UTC
Hmm, we will have to wait for that BUG to be resolved first - it should be fixed already in -nightly.
Comment 6 Daniel Vetter 2013-09-27 09:21:57 UTC
Nope, BUG fix isn't merged yet.
Comment 7 Chris Wilson 2013-10-04 10:12:52 UTC
The offending kernel patches have been reverted, so this should be working again...
Comment 8 meng 2013-10-08 01:46:19 UTC
(In reply to comment #7)
> The offending kernel patches have been reverted, so this should be working
> again...

Yes, it works now.
-----------------------
Libdrm:	(master)libdrm-2.4.46-46-gddbbdb13d80ea7f60e6f71356a444995b905366b
Mesa:	(master)373f8670d1c670003674e1eaa7c1f0cd823a0431
Xserver:	(master)xorg-server-1.14.99.2-2-gccbe17b1c6da1ad9d08 
Xf86_video_intel:	(master)2.99.903-38-g7284e7f48b812948b40d67396214f 
Cairo:	(master)49366c5e9e7d5afd0daef4c53a41472e020145eb
Kernel:	(drm-intel-nightly) ee3204556365264795ea557685b353bd2a271ce7
Comment 9 Daniel Vetter 2013-10-08 06:01:07 UTC
Progress on ppgtt patches is a bit on hold, so let's close this for now.
Comment 10 meng 2013-10-08 06:06:50 UTC
Verified.
Comment 11 meng 2013-10-10 07:22:27 UTC
Hi, I was wrong to verify the bug.
Last time I tested on IVB and it was OK, but the problem does not reproduce on IVB.
The bug still exists on HSW and BYT-M; please see Xorg.0.log, dmesg.log and i915_error_state.log from HSW.
------------------------------------
Libdrm:	(master)libdrm-2.4.46-46-gddbbdb13d80ea7f60e6f71356a444995b905366b
Mesa:	(master)1176a3aac65dfe935809182bfac883708def8046
Xf86_video_intel:(master)2.99.903-47-g082c08789cf9a8c0cc2bf44d0fee579b96c0798f
Cairo:		(master)f1eefee985b4361386a167e80d9836593ade59b9
Libva:		(staging)f5c913765e6af38d835cacf339054ccc60bddefb
Kernel:	(drm-intel-nightly) git-4e1044
Comment 12 meng 2013-10-10 07:24:34 UTC
Created attachment 87369 [details]
dmesg.log
Comment 13 meng 2013-10-10 07:25:11 UTC
Created attachment 87370 [details]
i915_error_state.log
Comment 14 meng 2013-10-10 07:25:44 UTC
Created attachment 87371 [details]
Xorg.0.log
Comment 15 Chris Wilson 2013-10-10 21:46:21 UTC
There's a test-case in xf86-video-intel/tests/dri2-test to exercise swaps (flips and vblank waits) on each connector. Can you please test this on your system and see if that reproduces the hang?
Comment 16 meng 2013-10-11 09:23:41 UTC
(In reply to comment #15)
> There's a test-case in xf86-video-intel/tests/dri2-test to exercise swaps
> (flips and vblank waits) on each connector. Can you please test this on your
> system and see if that reproduces the hang?

No, there is no hang. On BYT, dri2-test runs with a blank screen; dmesg:
------
[   44.428953] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   44.428987] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
[   44.429074] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   44.429116] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
[   47.149407] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   47.149442] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
[   47.149499] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   47.149531] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
[   49.769930] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   49.769944] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
[   49.769990] ALSA sound/pci/hda/hda_eld.c:334 HDMI: ELD buf size is 0, force 128
[   49.770012] ALSA sound/pci/hda/hda_eld.c:351 HDMI: invalid ELD data byte 0
Comment 17 Chris Wilson 2013-10-11 09:28:46 UTC
And let's test hsw as well. If that also passes, please retest with mplayer (and update any error logs).
Comment 18 meng 2013-10-11 09:48:41 UTC
(In reply to comment #17)
> And let's test hsw as well. If that also passes, please retest with mplayer
> (and update any error logs).

Hi, the test in comment #16 was run on HSW; "BYT" was a typo :( On HSW, mplayer still hangs.

On BYT-M, running dri2-test produces a call trace in dmesg (attached).
Comment 19 meng 2013-10-11 09:49:31 UTC
Created attachment 87446 [details]
call trace on BYT-M when run dri2-test
Comment 20 Chris Wilson 2013-10-11 10:05:09 UTC
It only flips between two black windows, all that we care about is whether or not it generates an IO error or a GPU hang. Neither of which appear to be the case, so it looks to have passed.
Comment 21 Chris Wilson 2013-10-16 12:05:24 UTC
Can you please test mplayer on all suspect machines (ivb/byt/hsw) and report whether any of those have a GPU hang (and only if they have a GPU hang)? All other warnings should be filed as a fresh bug report.
Comment 22 meng 2013-10-17 08:48:48 UTC
(In reply to comment #21)
> Can you please test mplayer on all suspect machines (ivb/byt/hsw) and report
> whether any of those have a GPU hang (and only if they have a GPU hang)? All
> other warnings should be filed as a fresh bug report.


A GPU hang occurs on BYT-M and HSW (desktop, mobile, ULT). BTW, I attached the dmesg for HSW.
IVB is OK.
Comment 23 meng 2013-10-17 08:49:14 UTC
Created attachment 87783 [details]
HSW_dmesg.log
Comment 24 Chris Wilson 2013-11-07 13:15:46 UTC
Found the Haswell bug,

commit 68cef6cd281572fcfb76a341dc45b7c8e5baffe6
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 7 13:09:25 2013 +0000

    sna/gen7: Request secure batches for Haswell vsync
    
    Since commit 8ff8eb2b38dc705f5c86f524c1cd74a811a7b04c
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Mon Sep 9 16:23:04 2013 +0100
    
        sna/hsw: Scanline waits require both DERRMR and forcewake
    
    we have been emitting LRI to enable vsync on the render ring. This
    requires a privileged batch buffer, and whilst we were checking for
    kernel support, we forgot to actually tell the kernel to submit the
    batch with the right privileges.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71328
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The Baytrail symptoms are very similar (the LRIs are not landing), but the hardware implementation is completely different (and the software unfortunately doesn't have the same bug as on Haswell).
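
The Haswell fix described in the commit message above comes down to a two-step rule: first check that the kernel supports privileged ("secure") batches, then actually set the corresponding flag when submitting a batch that contains the vsync LRIs. A minimal sketch of that rule follows. The struct, function names and flag values are hypothetical stand-ins chosen for illustration; they are not the driver's or the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration only; the real values live in
 * the kernel's i915 uapi header and are not reproduced here. */
#define EXEC_RENDER_RING  (1u << 0)
#define EXEC_SECURE       (1u << 1)

/* Hypothetical summary of one batch about to be submitted. */
struct batch_info {
	bool kernel_has_secure_batches; /* result of the capability query */
	bool uses_privileged_lri;       /* batch contains LRIs for vsync */
};

/* Vsync via LRI should only be emitted if the kernel can run the batch
 * with elevated privileges. This is the check that was already present. */
static bool can_emit_vsync_lri(const struct batch_info *b)
{
	return b->kernel_has_secure_batches;
}

/* Behaviour as described for the bug: support was checked elsewhere, but
 * the submission flags never asked for a secure batch. */
static uint32_t exec_flags_buggy(const struct batch_info *b)
{
	(void)b; /* the batch contents were not consulted */
	return EXEC_RENDER_RING;
}

/* Behaviour as described for the fix: request a secure batch whenever
 * privileged LRIs are present and the kernel supports them. */
static uint32_t exec_flags_fixed(const struct batch_info *b)
{
	uint32_t flags = EXEC_RENDER_RING;

	if (b->uses_privileged_lri && b->kernel_has_secure_batches)
		flags |= EXEC_SECURE;
	return flags;
}
```

With both fields set to true, `exec_flags_fixed` includes `EXEC_SECURE` while `exec_flags_buggy` does not, which is the difference the commit message describes.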
Comment 25 meng 2013-11-08 02:03:00 UTC
(In reply to comment #24)
Yes, it was fixed on HSW with the above commit git-68cef6cd28.
Comment 26 Gordon Jin 2013-11-15 00:28:44 UTC
This bug looks critical to me.

Chris, do you agree this is a high-priority bug for you?
Comment 27 Chris Wilson 2013-11-15 09:00:43 UTC
(In reply to comment #26)
> This bug looks critical to me.
>
> Chris, do you agree this is a high-priority bug for you?

It is. The only answer I have found so far has been to turn off vsync for byt. I have found no explanation as to why the LRIs seem not to work.
Comment 28 Chris Wilson 2013-11-19 14:09:55 UTC
This goes against everything the render bspec says, but it doesn't appear to hang:


diff --git a/src/sna/sna_display.c b/src/sna/sna_display.c
index 8982d6a..c31a501 100644
--- a/src/sna/sna_display.c
+++ b/src/sna/sna_display.c
@@ -3923,6 +3923,8 @@ sna_wait_for_scanline(struct sna *sna,
                ret = false;
        else if (sna->kgem.gen >= 075)
                ret = sna_emit_wait_for_scanline_hsw(sna, crtc, pipe, y1, y2, full_height);
+       else if (sna->kgem.gen == 071)
+               ret = sna_emit_wait_for_scanline_gen4(sna, crtc, pipe, y1, y2, full_height);
        else if (sna->kgem.gen >= 070)
                ret = sna_emit_wait_for_scanline_ivb(sna, crtc, pipe, y1, y2, full_height);
        else if (sna->kgem.gen >= 060)
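
The order of the branches in the patch above matters: the exact match on 071 has to come before the `>= 070` range check, otherwise it would never be reached. A self-contained sketch of that dispatch is below. It assumes, from the surrounding discussion, that 071 is the Baytrail generation code and 075 is Haswell. The enum and function name are hypothetical, and everything below 070 is collapsed into a single placeholder.

```c
/* Which scanline-wait path is picked; names mirror the function suffixes
 * that appear in the diff. */
enum wait_path {
	WAIT_HSW,
	WAIT_GEN4,
	WAIT_IVB,
	WAIT_OLDER /* placeholder for all branches below 070 (not modelled) */
};

/* gen uses the octal encoding seen in the diff (075, 071, 070, 060). */
static enum wait_path pick_wait_path(int gen)
{
	if (gen >= 075)
		return WAIT_HSW;
	else if (gen == 071)   /* must precede the >= 070 range check */
		return WAIT_GEN4;
	else if (gen >= 070)
		return WAIT_IVB;
	else
		return WAIT_OLDER;
}
```

Because the literals are octal, 071 is decimal 57 and sits between 070 (56) and 075 (61), so only the explicit equality test separates it from the `>= 070` path.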
Comment 29 Chris Wilson 2013-11-19 14:18:30 UTC
But though it doesn't hang, it is not sync'ed to vrefresh either.
Comment 30 Chris Wilson 2013-11-19 16:42:18 UTC
Replacing the LOAD_SCANLINE_INCL with a LRI to 0x70004 is also ineffective. (No hang, still tears.)
Comment 31 meng 2013-11-20 05:10:20 UTC
(In reply to comment #28)
Yes, with the above xf86 patch, the bug was fixed.
Comment 32 Chris Wilson 2013-12-04 22:50:02 UTC
Haven't been able to resolve how to make vsync work on Baytrail, so applied

commit bd22abee8f33b20ff6bc7297b0a9ae8708d18727
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 7 22:27:50 2013 +0000

    sna: Update Baytrail VSync logic
    
    My current best guess at the glaring hole in the spec that is
    synchronisation to vertical refresh.
    
    Note that this leaves VSync disabled for BYT for now as it is
    ineffective - but at least it now doesn't hang!
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69869
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

to prevent the hangs. Downgrading priority.
Comment 33 Martin Peres 2019-11-27 13:32:56 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-intel/issues/25.
