Bug 100577 - DC + TearFree display lock
Status: RESOLVED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2017-04-05 10:52 UTC by Andy Furniss
Modified: 2017-08-16 12:54 UTC

Attachments
dmesg with errors and trace (68.57 KB, text/plain)
2017-04-05 10:52 UTC, Andy Furniss
dmesg (81.93 KB, text/plain)
2017-05-04 09:10 UTC, Nikola Forró
dmesg on current staging memclk is stuck high + some errors (64.03 KB, text/plain)
2017-07-05 20:02 UTC, Andy Furniss
xrandr --verbose higher clocks missing (7.70 KB, text/plain)
2017-07-05 20:03 UTC, Andy Furniss
xrandr --verbose on 4.14-wip showing all modes (8.61 KB, text/plain)
2017-07-05 20:37 UTC, Andy Furniss

Description Andy Furniss 2017-04-05 10:52:40 UTC
Created attachment 130688
dmesg with errors and trace

This one may be hard to reproduce for some as it seems to rely a bit on cpufreq ondemand being a bit rubbish on my old cpu. Lowest is 800MHz for me (high 3.4 GHz).

I don't use a compositing desktop and vids played in my browser (seamonkey) don't get vsync.

Turning on TearFree gets me vsync, but while testing I got a screen lock. Seems to involve timing luck - I couldn't reproduce with the browser in a sane timescale so invented a different test that could run while AFK.

mpv -fs --vo=x11 a720p60video --loop=inf

Will usually lock in < 15 minutes with cpufreq ondemand.

It was still going after an hour with the cpufreq governor set to performance.
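
For anyone retrying the governor comparison, here is a minimal sketch of switching every core between ondemand and performance through the cpufreq sysfs files. The function names and the optional base-directory argument are hypothetical helpers, not something from the original report, and writing to the real sysfs files needs root.

```shell
# Hypothetical helper: write a cpufreq governor name to every CPU's
# scaling_governor file. The optional second argument overrides the sysfs
# base directory (an assumption added so the function can be dry-run).
set_governor() {
    gov=$1
    base=${2:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip cores with no cpufreq directory or no write access.
        [ -w "$f" ] || continue
        printf '%s\n' "$gov" > "$f"
    done
}

# Hypothetical helper: print the current governor of each CPU.
show_governors() {
    base=${1:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        fi
    done
}
```

Running "set_governor performance" as root before starting mpv, and "set_governor ondemand" afterwards, should give the two cases compared here, assuming the kernel offers both governors.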

Also seems OK with amdgpu.dc=0 on the same kernel (amd-staging-4.9).

As I've only recently started running DC, and never tried this test before, I have no idea whether it ever worked.

xserver is the latest release; ddx is git.

dmesg attached shows errors and a trace starting with

[ 2009.442985] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:39:crtc-0] flip_done timed out
[ 2013.027575] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:39:crtc-0] flip_done timed out
[ 2013.027727] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* dm_dc_surface_commit: acrtc 0, already busy
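
To check whether another log shows the same signature, something like the following could count the flip_done timeouts per CRTC. This is only a sketch: the function name is made up for illustration, and it assumes the message format matches the lines quoted above.

```shell
# Hypothetical helper: read kernel log text on stdin and print how many
# "flip_done timed out" errors were logged for each CRTC.
count_flip_timeouts() {
    grep 'flip_done timed out' |
        # Keep only the "CRTC:<id>:<name>" tag from each matching line.
        sed -n 's/.*\[\(CRTC:[0-9]*:[^]]*\)\] flip_done timed out.*/\1/p' |
        sort | uniq -c | awk '{ print $2 ": " $1 }'
}
```

It can be fed from a live system with "dmesg | count_flip_timeouts" or from a saved attachment with "count_flip_timeouts < dmesg.txt" (file name assumed).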
Comment 1 Michel Dänzer 2017-04-06 02:29:30 UTC
Removing myself from CC, I get notifications via the mailing list.
Comment 2 Nikola Forró 2017-05-04 09:08:41 UTC
Same problem here. It was definitely working fine several weeks ago.
I'll try to bisect the issue if I find a reliable reproducer.
Comment 3 Nikola Forró 2017-05-04 09:10:53 UTC
Created attachment 131193
dmesg
Comment 4 Nikola Forró 2017-05-06 17:54:57 UTC
Bisected to this commit:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-4.9&id=afbeb2d0961b2139bcf6553a710e6a8ae5d09d34

Which makes me think I'm experiencing a different issue, because this bug was reported long before that commit emerged.

My reproducer is to fill the screen with two xterm windows next to each other and generate random lines of text in both of them, using a command like this, for example:

tr -dc a-z1-4 </dev/urandom | tr 1-2 ' \n' | awk 'length==0 || length>50' | tr 3-4 ' ' | sed 's/^ *//' | cat -s | sed 's/ / /g' | fmt

The screen lock will occur in a couple of minutes.
Comment 5 Michel Dänzer 2017-05-07 07:38:25 UTC
(In reply to Nikola Forró from comment #4)
> Which makes me think I'm experiencing a different issue, because this bug
> was reported long before that commit emerged.

Indeed; please file your own report.
Comment 6 Andy Furniss 2017-05-16 21:47:08 UTC
Testing current amd-staging-4.9 + the patch from Bug 101053, I can't reproduce this anymore.

It was a couple of hours running, I'll try longer and with 4.11 over the next few days and either reconfirm or close this.
Comment 7 Harry Wentland 2017-05-17 13:16:03 UTC
This might've been fixed by Mario's patches plus Andrey's scanline change

d58f8724e636 Mario Kleiner
    drm/amd/display: Prevent premature pageflip when comitting in vblank. (v3)   
8494f61e14ff Mario Kleiner
    drm/amd/display: Fix race between vblank irq and pageflip irq. (v2)
Comment 8 Andy Furniss 2017-05-17 22:17:39 UTC
I can still reproduce this.

Seems I may just get lucky runs - one 90 mins no issue, stopped and started again later and it locked after 20.
Comment 9 Andy Furniss 2017-07-05 18:44:34 UTC
Can't reproduce this on current amd-staging-4.11. I will keep trying and close soon if OK.

One nit I notice is I may not be comparing like with like as current and some older 4.11s I still have around all seem to peg memory clock high. 3.7s I still have don't do this, neither does non DC current kernel.

I'll search/file a separate bug for this one in coming days, it causes +10 degrees idle GPU temp.
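
A rough way to check for a pegged memory clock is to look at which level amdgpu marks as active in its pp_dpm_mclk sysfs file. The sketch below assumes the usual "index: clock" layout with a trailing "*" on the active level; both function names are hypothetical.

```shell
# Hypothetical helper: read pp_dpm_mclk-style text on stdin and print the
# clock of the level marked active with a trailing "*".
active_mclk() {
    awk '$NF == "*" { print $2 }'
}

# Hypothetical helper: succeed only if the active level is the last one
# listed, i.e. the memory clock is sitting at its top state.
mclk_at_top() {
    awk 'NF > 0 { last = $1; n++ }
         $NF == "*" { active = $1 }
         END { exit !(n > 0 && active == last) }'
}
```

For example, "active_mclk < /sys/class/drm/card0/device/pp_dpm_mclk", where card0 is an assumption about which DRM device is the GPU. Running mclk_at_top on the same file while the desktop is idle would flag the stuck-high case.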
Comment 10 Andy Furniss 2017-07-05 18:59:11 UTC
(In reply to Andy Furniss from comment #9)

> One nit I notice is I may not be comparing like with like as current and
> some older 4.11s I still have around all seem to peg memory clock high. 3.7s

Ugh, not 3.7s, I mean amd-staging-4.9s didn't peg memclk high. I don't still have all the 4.11s I've ever built so don't know when this started for them.
Comment 11 Alex Deucher 2017-07-05 19:19:54 UTC
(In reply to Andy Furniss from comment #10)
> Ugh, not 3.7s, I mean amd-staging-4.9s didn't peg memclk high. I don't still
> have all the 4.11s I've ever built so don't know when this started for them.

What sort of display(s) do you have and what modes are you using?  Mclk dpm is disabled if there are multiple monitors or if the vblank period is too short to support mclk switching.
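
As a rough illustration of the vblank period in question, the blanking time of a mode can be worked out from its timings as (vtotal - vdisplay) * htotal / pixel clock. The helper name below is hypothetical, and the threshold the driver actually applies is not stated here, so this only computes the figure.

```shell
# Hypothetical helper: vertical blanking time in microseconds for a mode.
# Arguments: pixel clock in MHz, htotal, vdisplay, vtotal.
# Blanking lines times pixels per line gives pixels; dividing by MHz
# (pixels per microsecond) gives microseconds.
vblank_us() {
    awk -v clk="$1" -v ht="$2" -v vd="$3" -v vt="$4" \
        'BEGIN { printf "%.1f\n", (vt - vd) * ht / clk }'
}
```

With the standard CEA-861 1080p60 timing (148.5 MHz, htotal 2200, vtotal 1125, used here as an assumption rather than taken from the attached xrandr output), "vblank_us 148.5 2200 1080 1125" prints 666.7. The real numbers for a given mode are in its "xrandr --verbose" modeline.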
Comment 12 Andy Furniss 2017-07-05 20:01:43 UTC
Hmm, so I just booted back into current staging to get a dmesg and xrandr and ended up noticing 2 more issues.

My monitor is 1920x1080 and can do 120Hz, but the preferred mode, used by default, is 60Hz.

Issue 1 - I can't see 120Hz with xrandr, I normally can - but then I normally use non DC kernels.

Issue 2 - I see a few errors at the end of the dmesg:

[drm] dc_get_validate_context:resource validation failed, dc_status:6
Comment 13 Andy Furniss 2017-07-05 20:02:52 UTC
Created attachment 132465
dmesg on current staging memclk is stuck high + some errors
Comment 14 Andy Furniss 2017-07-05 20:03:29 UTC
Created attachment 132466
xrandr --verbose higher clocks missing
Comment 15 Andy Furniss 2017-07-05 20:11:36 UTC
It seems the errors in the dmesg are created when I do xrandr --verbose.
Comment 16 Andy Furniss 2017-07-05 20:37:20 UTC
Created attachment 132467
xrandr --verbose on 4.14-wip showing all modes

This is xrandr on 4.14-wip - the higher modes are shown, though the clocks aren't precisely 100/110/120 - it's always been like this, I live in hope that it may get better one day!

I forgot to mention earlier, WRT 2 monitors, that HDMI-A-0 is physically connected but not seen, as the TV is off. As I said, 4.9 DC and all non DC kernels for ages don't peg memclk high with the same setup.
Comment 17 Andy Furniss 2017-08-16 12:54:42 UTC
Testing with amd-staging-drm-next, which doesn't peg memclk high, I can't reproduce this anymore.

