Bug 99173

Summary: Xorg waits for long time(200ms+) to invoke VT_RELDISP
Product: DRI Reporter: yu.c.chen <yu.c.chen>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED INVALID QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: low CC: intel-gfx-bugs, lenb, rui.zhang
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard: ReadyForDev
i915 platform: BDW i915 features:
Attachments:
Description Flags
pm_console_suspend delay during suspend
none
bootup dmesg with drm debug =14 appended none

Description yu.c.chen@intel.com 2016-12-22 03:37:32 UTC
Hi all,
Currently I'm evaluating the suspend/resume speed on my laptop,
and it was found the switching from Graphic Mode to text mode vt has taken more
than 200ms, and according to the ftrace result, after Xorg received
the signal from kernel, it waits for 200ms+ until finally issue the
ioctl VT_RELDISP, which makes the kernel blocking at this flag to be
set.
As people replied to me in maillist: Because it's been asked to set the hardware back to text mode, which requires reprogramming the GPU, which takes time.

The GPU is:
00:02.0 VGA compatible controller: Intel Corporation Broadwell-U
Integrated Graphics (rev 09) (prog-if 00 [VGA controller])
    Subsystem: Apple Inc. Broadwell-U Integrated Graphics
driver in use: i915
Xorg version:
X.Org X Server 1.18.3
Release Date: 2016-04-04

Any idea how to track this ? thanks very much.

More info please refer to:
https://bugzilla.kernel.org/show_bug.cgi?id=186211
Comment 1 Chris Wilson 2016-12-22 07:21:22 UTC
Why are you switching vt and/or changing mode on the vt in the first place?
Comment 2 yu.c.chen@intel.com 2016-12-22 07:26:33 UTC
Hi,
Because during suspend, the system has to switch to another 'safe' vt for reliable output, when no_console_suspend is appended in grub.
Comment 3 yann 2017-01-23 16:54:15 UTC
(In reply to yu.c.chen@intel.com from comment #0)
> Hi all,
> Currently I'm evaluating the suspend/resume speed on my laptop,
> and it was found the switching from Graphic Mode to text mode vt has taken
> more
> than 200ms, and according to the ftrace result, after Xorg received
> the signal from kernel, it waits for 200ms+ until finally issue the
> ioctl VT_RELDISP, which makes the kernel blocking at this flag to be
> set.
> As people replied to me in maillist: Because it's been asked to set the
> hardware back to text mode, which requires reprogramming the GPU, which
> takes time.
> 
> The GPU is:
> 00:02.0 VGA compatible controller: Intel Corporation Broadwell-U
> Integrated Graphics (rev 09) (prog-if 00 [VGA controller])
>     Subsystem: Apple Inc. Broadwell-U Integrated Graphics
> driver in use: i915
> Xorg version:
> X.Org X Server 1.18.3
> Release Date: 2016-04-04
> 
> Any idea how to track this ? thanks very much.
> 
> More info please refer to:
> https://bugzilla.kernel.org/show_bug.cgi?id=186211

Reassign to Xorg product based on the description
Comment 4 Chris Wilson 2017-01-23 17:06:04 UTC
(In reply to yann from comment #3)
> Reassign to Xorg product based on the description

Pardon? The description is that modesetting is the ratelimiting step. Though that modesetting is requested by the user, not by the kernel, and so we might classify this as NOTOURBUG. Notwithstanding that the complaint here is a modeswitch is still far too slow.
Comment 5 yann 2017-01-23 17:20:21 UTC
(In reply to Chris Wilson from comment #4)
> (In reply to yann from comment #3)
> > Reassign to Xorg product based on the description
> 
> Pardon? The description is that modesetting is the ratelimiting step. Though
> that modesetting is requested by the user, not by the kernel, and so we
> might classify this as NOTOURBUG. Notwithstanding that the complaint here is
> a modeswitch is still far too slow.

Thanks for the explanation Chris. I reassigned to Xorg because I didn't see fault of i915 but wasn't sure for Xorg.

So following Chris'advice, resolving as NOTOURBUG
Comment 6 zhang rui 2017-01-27 01:52:53 UTC
Hi, Chris and Yann,

the problem was found when we trying to optimize the suspend speed, together with Len, and we found that the pm_prepare_console step in kernel is quite slow(some times it takes only 10ms, but in some cases/platforms, it takes 200~1500ms), on several laptops we've tested, and Yu found that the time is cost when we do a vt switch in pm_prepare_console, and further more, this latency also applied to all the cases when we want to switch from graphics mode to text mode. 
And that's why this bug is filed, first in kernel bugilla and then here.

This may or may not be a bug, but
if it is not a bug, we want to know where this 200~1500ms is spent.
if it is a bug, we should keep it open, although I don't know which component it belongs to.
Comment 7 zhang rui 2017-02-06 06:45:10 UTC
Bug reopened. :)
Comment 8 Len Brown 2017-02-16 13:30:15 UTC
Is there any doubt that this is a bug?

If no, do we all agree that it may not be the fault of the i915 driver,
but may be an issue rooted in the Xorg user space?*

If yes, how to get somebody who has a clue about Xorg to fix it?

*Note, this is not slow on all Xorg machines,
there is a machine dependency, so the driver
or some other hardware dependent code
may not be totally off the hook...
Comment 9 Jari Tahvanainen 2017-04-19 15:48:05 UTC
Can you please provide new data from the latest kernel (preferable from drm-tip). Attach dmesg etc. logs according https://01.org/linuxgraphics/documentation/how-report-bugs.
Comment 10 zhang rui 2017-05-18 17:04:00 UTC
Created attachment 131408 [details]
pm_console_suspend delay during suspend

Don't know what kind of info is needed.
This problem can be reproduced on many platforms. here is the analyze_suspend output on Dell XPS 13 9360, which is a KBL platform, with 4.11 kernel.

As you can see in the ftrace output
          python-2517  [003] d...   267.793000: pm_prepare_console_cal: (pm_prepare_console+0x0/0x40)
          python-2517  [000] d...   268.065713: pm_prepare_console_ret: (pm_suspend+0x1dd/0x3a0 <- pm_prepare_console) arg1=0x0

this is the time cost by switching console during suspend.
Comment 11 Imre Deak 2017-05-18 18:07:16 UTC
(In reply to zhang rui from comment #10)
> Created attachment 131408 [details]
> pm_console_suspend delay during suspend
> 
> Don't know what kind of info is needed.
> This problem can be reproduced on many platforms. here is the
> analyze_suspend output on Dell XPS 13 9360, which is a KBL platform, with
> 4.11 kernel.
> 
> As you can see in the ftrace output
>           python-2517  [003] d...   267.793000: pm_prepare_console_cal:
> (pm_prepare_console+0x0/0x40)
>           python-2517  [000] d...   268.065713: pm_prepare_console_ret:
> (pm_suspend+0x1dd/0x3a0 <- pm_prepare_console) arg1=0x0
> 
> this is the time cost by switching console during suspend.

Could you attach the dmesg starting from bootup booting with drm.debug=14? Also is it possible to get a more detailed ftrace during suspend with i915 included in the ftrace module list?
Comment 12 Ricardo 2017-07-07 18:31:17 UTC
Based on lack of recent activity and also with comments looks like is not a bug directed to i915.. closing
Comment 13 yu.c.chen@intel.com 2017-07-09 10:03:33 UTC
Sorry for my late response, I've just confirmed this issue  can be still reproduced on 4.12 vanilla kernel.  Here's the dmesg boot up log with drm.debug=14 appended. Could you please point to me which part/code/function
in the i915 should I track when  using ftrace? 
Here's the history discussion on this issue:
https://lists.freedesktop.org/archives/xorg-devel/2016-December/052064.html
Comment 14 yu.c.chen@intel.com 2017-07-09 10:04:23 UTC
Created attachment 132568 [details]
bootup dmesg with drm debug =14 appended
Comment 15 zhang rui 2017-07-10 01:58:24 UTC
Plus, this one was discussed via email, together with a i915 suspend performance regression. All the information is provided via email.
Imre has fixed the i915 regression, but this one is still open...
Comment 16 Elizabeth 2017-08-11 21:15:47 UTC
(In reply to zhang rui from comment #15)
> Plus, this one was discussed via email, together with a i915 suspend
> performance regression. All the information is provided via email.
> Imre has fixed the i915 regression, but this one is still open...
Hello Zhang,
Any update that can be shared on this case? Is this still going be open at FDO or is going to be followed internally?
Thank you.
Comment 17 zhang rui 2017-08-14 00:48:19 UTC
Hi,

it's true that we did not make any progress on this thread.
But I will re kick off the discussion this week.
Comment 18 Elizabeth 2017-10-25 16:13:17 UTC
(In reply to zhang rui from comment #17)
> Hi,
> 
> it's true that we did not make any progress on this thread.
> But I will re kick off the discussion this week.
Hello everybody, any update in this case? Thank you.
Comment 19 zhang rui 2017-10-26 03:09:00 UTC
Hi, the bug still exists.
But I don't have time to track this issue recently, so I think we can close it for now.
Comment 20 Elizabeth 2017-10-26 19:12:12 UTC
(In reply to zhang rui from comment #19)
> Hi, the bug still exists.
> But I don't have time to track this issue recently, so I think we can close
> it for now.
I'll change to low priority for now and I agree to close it, and when possible from you side, to push it by IRC to speed it up. If there are no objections I'll close it in the following days. Thank you.
Comment 21 zhang rui 2017-10-27 03:14:56 UTC
yes, please close it.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.