Bug 108577

Summary: Black X laptop screen with cursor if HDMI not plugged in, bisected
Product: DRI Reporter: Duncan Roe <duncan_roe>
Component: DRM/AMDgpu Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact:
Severity: blocker    
Priority: medium CC: harry.wentland, nicholas.kazlauskas, roman.li, sunpeng.li
Version: DRI git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Git diff specifying 2 commits (GD is aliased to git diff)
none
Diff of cursor-only and normal Xorg logs
none
Help for CONFIG_DRM_AMD_DC_FBC
none
O/P from dmesg
none
Xorg log
none
Patch for Linux-19.0 to revert 5099114 & reinstate DRM_AMD_DC_FBC kconfig option
none
possible fix 1/2
none
possible fix 2/2
none
possible fix 1/2 updated for Linux 19.0
none
possible fix 2/2 updated for Linux 19.0
none
the latest FBC fixes
none
One-line patch as Alex requested
none
Log of applying FBC patchset
none
series/52117 patches as migrated to Linux 4.19.1
none
Diagnostic patches to determine which pointer is null
none

Description Duncan Roe 2018-10-28 01:53:53 UTC
Created attachment 142239 [details]
Git diff specifying 2 commits (GD is aliased to git diff)

(High severity because laptop is useless when travelling)
With HDMI unplugged, X laptop screen is black with only a cursor. Cursor changes shape as it moves over xterm windows put there by xinitrc.
Bisecting in git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git finds 2bfb0b678e48dee76543bfa2079b7e42609332fb has the problem but adjacent commit 9c24c10a2c1e1bb478b6bb70612d9e885aee044f does not.
There is an oddity with git diff between these two: git diff <commit1> <commit2> produces copious output and a git warning (see attachment), but git checkout <commit2>; git diff <commit1> only shows one patched file (next attachment). You might want to try this - I re-cloned the repo in case the old one was corrupted but it made no difference.
Comment 1 Duncan Roe 2018-10-28 01:57:18 UTC
Hardware is as in Bug 108139
Comment 2 Duncan Roe 2018-10-28 02:09:12 UTC
Sorry guys, just noticed that after git checkout 2bfb0b678e48dee76543bfa2079b7e42609332fb I see a bunch of new commits that weren't there before - will bisect
Comment 3 Duncan Roe 2018-10-28 22:48:51 UTC
The second bisect identifies 5099114ba3b2e5ae9fb487aeb3ae0434fe38a7da as the oldest commit showing the problem, which does not show in adjacent commit b646c1dc835b6b73884a88643c2534f1a4a1928f.
In order to see either of these commits in git list, one first needs to git checkout 2bfb0b678e48dee76543bfa2079b7e42609332fb.
(In reply to Duncan Roe from comment #0)
> Created attachment 142239 [details]
> Git diff specifying 2 commits (GD is aliased to git diff)
> 
> (High severity because laptop is useless when travelling)
> With HDMI unplugged, X laptop screen is black with only a cursor. Cursor
> changes shape as it moves over xterm windows put there by xinitrc.
> Bisecting in git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git finds
> 2bfb0b678e48dee76543bfa2079b7e42609332fb has the problem but adjacent commit
> 9c24c10a2c1e1bb478b6bb70612d9e885aee044f does not.
> There is an oddity with git diff between these two: git diff <commit1>
> <commit2> produces copious output and a git warning (see attachment), but
> git checkout <commit2>; git diff <commit1> only shows one patched file (next
> attachment). You might want to try this - I re-cloned the repo in case the
> old one was corrupted but it made no difference.

Please ignore the git diff stuff above. After checking out 2bfb0b678e48dee76543bfa2079b7e42609332fb, I diffed the second commit in git list not realising that this was no longer 9c24c10a2c1e1bb478b6bb70612d9e885aee044f.
The new bisect is the valid one
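
For readers unfamiliar with bisecting, the idea behind the search above can be sketched as a binary search for the first bad commit. This is only an illustration under simplifying assumptions: the commit labels and the `is_bad` check are hypothetical, and the history is treated as a straight oldest-to-newest list. Real git history is a graph with merges, which `git bisect` handles and this sketch does not.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is good, commits[-1] is bad, and that the
    problem, once introduced, is present in every later commit.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # each test stands for one build-and-boot cycle
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests


# Hypothetical linear history: c0 (good) ... c9 (bad), broken from c6 onward.
history = ["c%d" % i for i in range(10)]
culprit, builds = first_bad(history, lambda c: int(c[1:]) >= 6)
```

Each step halves the range of suspect commits, so even a long history needs only a handful of kernel builds.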
Comment 4 Duncan Roe 2018-10-28 22:55:03 UTC
After stripping the timestamps from good and bad copies of Xorg.0.log, the only significant difference between them appears to be gart size. See attachment 2 [details] [review]
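
For anyone repeating this comparison, a minimal sketch of stripping the timestamps and diffing two logs follows. It assumes each log line starts with the usual bracketed uptime stamp such as `[    12.345]`. The file labels and the sample lines are hypothetical and are not taken from the attached logs.

```python
import difflib
import re

# Leading "[  secs.msecs] " stamp as normally written at the start of Xorg.0.log lines.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def strip_timestamps(lines):
    """Return the lines with any leading bracketed timestamp removed."""
    return [TIMESTAMP.sub("", line) for line in lines]


def diff_logs(good_lines, bad_lines):
    """Unified diff of two logs once their timestamps are stripped."""
    return list(difflib.unified_diff(
        strip_timestamps(good_lines),
        strip_timestamps(bad_lines),
        fromfile="Xorg.0.log.good",  # hypothetical labels
        tofile="Xorg.0.log.bad",
        lineterm="",
    ))


# Hypothetical sample lines, only to show the mechanics.
good = ["[    10.001] (II) example: unchanged line", "[    10.002] (II) example: value=1"]
bad = ["[    11.501] (II) example: unchanged line", "[    11.502] (II) example: value=2"]
changes = diff_logs(good, bad)
```

Without the stripping step every line would differ, because the stamps never match between two boots.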
Comment 5 Duncan Roe 2018-10-28 22:57:11 UTC
Created attachment 142250 [details]
Diff of cursor-only and normal Xorg logs
Comment 6 Duncan Roe 2018-10-28 23:00:23 UTC
(In reply to Duncan Roe from comment #4)
> After stripping the timestamps from good and bad copies of Xorg.0.log, the
> only significant difference between them appears to be gart size. See
> attachment 2 [details] [review]

Sorry that is now attachment 1 [details] [review] after obsoleting the old one
Comment 7 Duncan Roe 2018-10-28 23:02:05 UTC
(In reply to Duncan Roe from comment #6)
> (In reply to Duncan Roe from comment #4)
> > After stripping the timestamps from good and bad copies of Xorg.0.log, the
> > only significant difference between them appears to be gart size. See
> > attachment 2 [details] [review]
> 
> Sorry that is now attachment 1 [details] [review] after obsoleting the old
> one

In any case, don't follow the auto-generated attachment hyperlink. Just open the attachment
Comment 8 Duncan Roe 2018-10-28 23:03:51 UTC
(In reply to Duncan Roe from comment #7)
> (In reply to Duncan Roe from comment #6)
> > (In reply to Duncan Roe from comment #4)
> > > After stripping the timestamps from good and bad copies of Xorg.0.log, the
> > > only significant difference between them appears to be gart size. See
> > > attachment 2 [details] [review]
> > 
> > Sorry that is now attachment 1 [details] [review] after obsoleting the old
> > one
> 
> In any case, don't follow the auto-generated attachment hyperlink. Just open
> the attachment

Ah, there you go - attachment numbers are global. See attachment 142250 [details]
Comment 9 Michel Dänzer 2018-10-29 11:28:28 UTC
Commit 5099114ba3b2e5ae9fb487aeb3ae0434fe38a7da is "drm/amdgpu/display: drop DRM_AMD_DC_FBC kconfig option".
Comment 10 Duncan Roe 2018-10-29 21:42:39 UTC
DRM_AMD_DC_FBC is not in my .config and never has been (from the RCS tree)
Comment 11 Duncan Roe 2018-10-30 03:08:44 UTC
(and the patch enables DRM_AMD_DC_FBC)
Comment 12 Duncan Roe 2018-10-30 21:18:30 UTC
Created attachment 142286 [details]
Help for CONFIG_DRM_AMD_DC_FBC
Comment 13 Duncan Roe 2018-10-30 21:30:30 UTC
Rebuilt b646c1dc835b6b73884a88643c2534f1a4a1928f (previously OK) with CONFIG_DRM_AMD_DC_FBC turned on and got a black X display.
The Help for this option says to check hardware availability before enabling. Looks like my hardware doesn't have it.
I can see no way this option should be forced on. What I think was intended was to remove CONFIG_DRM_AMD_DC as an option instead.
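
To confirm how a given build has the option set, a small check of the kernel `.config` may help. This is a sketch that assumes the standard `.config` line forms (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`). The sample fragments below are hypothetical, not copied from my tree.

```python
import re


def kconfig_state(config_text, symbol):
    """Report how a Kconfig symbol appears in .config text.

    Returns the assigned value (for example "y" or "m"), "not set" for an
    explicit "# CONFIG_... is not set" line, or "absent" if the symbol is
    not mentioned at all.
    """
    set_re = re.compile(r"^CONFIG_%s=(.*)$" % re.escape(symbol))
    unset_re = re.compile(r"^# CONFIG_%s is not set$" % re.escape(symbol))
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return "absent"


# Hypothetical fragments covering the three cases.
enabled = "CONFIG_DRM_AMD_DC=y\nCONFIG_DRM_AMD_DC_FBC=y\n"
disabled = "CONFIG_DRM_AMD_DC=y\n# CONFIG_DRM_AMD_DC_FBC is not set\n"
missing = "CONFIG_DRM_AMD_DC=y\n"
states = [kconfig_state(text, "DRM_AMD_DC_FBC") for text in (enabled, disabled, missing)]
```

The "absent" case is the one that matters after a Kconfig option is dropped: the symbol no longer appears in `.config`, so the file alone cannot tell you whether the code behind it is compiled in.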
Comment 14 Duncan Roe 2018-10-30 21:40:43 UTC
Commit 5099114ba3b2e5ae9fb487aeb3ae0434fe38a7da was authored by alexander.deucher@amd.com but Bugzilla says "CC: alexander.deucher@amd.com did not match anything" when I try to add his email
Comment 15 Alex Deucher 2018-10-30 21:42:05 UTC
What chip do you have?  Please attach your xorg log and dmesg output.
Comment 16 Duncan Roe 2018-10-31 00:38:41 UTC
Created attachment 142292 [details]
O/P from dmesg
Comment 17 Duncan Roe 2018-10-31 00:44:06 UTC
Created attachment 142293 [details]
Xorg log

See also Comment 4
This log and the dmesg o/p are from a build of commit 5099114ba3b2e5ae9fb487aeb3ae0434fe38a7da in git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Comment 18 Duncan Roe 2018-11-02 02:31:41 UTC
Created attachment 142334 [details] [review]
Patch for Linux-19.0 to revert 5099114 & reinstate DRM_AMD_DC_FBC kconfig option
Comment 19 Alex Deucher 2018-11-02 15:56:15 UTC
Created attachment 142347 [details] [review]
possible fix 1/2

Do these patches fix the issue?
Comment 20 Alex Deucher 2018-11-02 15:56:32 UTC
Created attachment 142348 [details] [review]
possible fix 2/2
Comment 21 Duncan Roe 2018-11-03 00:23:16 UTC
Yes they fix Black X. I tested at Linux 4.19 (commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d) with updated patches that apply cleanly (to be attached).
I wonder if an approach like yours (checking dc->fbc_compressor or some other flag in dc) could be used to resolve Bug 108139?
Comment 22 Duncan Roe 2018-11-03 00:28:28 UTC
Created attachment 142350 [details] [review]
possible fix 1/2 updated for Linux 19.0
Comment 23 Duncan Roe 2018-11-03 00:30:51 UTC
Created attachment 142351 [details] [review]
possible fix 2/2 updated for Linux 19.0
Comment 24 Alex Deucher 2018-11-05 19:27:11 UTC
Does just removing this line from the code:
value |= 0x84;
also fix the issue?
Comment 25 Roman Li 2018-11-05 21:02:29 UTC
Created attachment 142374 [details]
the latest FBC fixes

Duncan, you can also try 'the latest FBC fixes' to check whether they fix your issue.
Comment 26 Duncan Roe 2018-11-06 03:45:53 UTC
Created attachment 142382 [details] [review]
One-line patch as Alex requested
Comment 27 Duncan Roe 2018-11-06 03:50:28 UTC
(In reply to Alex Deucher from comment #24)
> Does just removing this line from the code:
> value |= 0x84;
> also fix the issue?

No, I get the black screen again. To be clear, I applied the patch in attachment 142382 [details] to Linux 4.19 stable commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
Comment 28 Duncan Roe 2018-11-06 06:12:44 UTC
Created attachment 142384 [details]
Log of applying FBC patchset
Comment 29 Duncan Roe 2018-11-06 09:34:58 UTC
(In reply to Roman Li from comment #25)
> Created attachment 142374 [details]
> the latest FBC fixes
> 
> Duncan, you can also try 'the latest FBC fixes' to check whether they fix
> your issue.

I applied the patchset to Linux 4.19 (stable commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d). Some lines were offset as per attachment 142384 [details].
The black screen remained.
Comment 30 Alex Deucher 2018-11-06 21:43:01 UTC
Can you try this patch set?
https://patchwork.freedesktop.org/series/52117/
Comment 31 Duncan Roe 2018-11-07 08:21:41 UTC
Created attachment 142396 [details] [review]
series/52117 patches as migrated to Linux 4.19.1
Comment 32 Duncan Roe 2018-11-07 22:12:15 UTC
X displays normally with this patch set (as amended) applied to Linux 4.19.1 commit 07a03b97b9ce2a6430344386eeab9b16283b893f on branch linux-4.19.y of the stable tree.
There was an oddity the first time I tried it which I have not been able to reproduce since: the X screen went dark after timing out (as normal) but it wouldn't light up again for keyboard presses, mouse clicks etc. I have tried numerous times to reproduce this without success, so I have to say the patch is good.
Comment 33 Alex Deucher 2018-11-29 16:26:39 UTC
Does this patch fix the issue with fbc enabled?
https://patchwork.freedesktop.org/patch/264806/
Comment 34 Duncan Roe 2018-11-30 11:18:32 UTC
(In reply to Alex Deucher from comment #33)
> Does this patch fix the issue with fbc enabled?
> https://patchwork.freedesktop.org/patch/264806/

No. I get the black screen after entering *startx* (system boots to command-line prompt). 
However, I let the screen time out for a long time while I went checking for stuff. When I went to Ctrl-Alt-F1 to get back to command mode, a proper X display appeared.
I rebooted (Had to power off owing to Bug 108464). This time I tried touching keys as soon as I noticed X had timed out but nothing would wake it up (like the "oddity" mentioned in Comment 32).
Trying again, last time for tonight. Have taken the precaution of starting an xterm using my desktop's DISPLAY before starting X. Am following /var/log/debug (set up to log everything).

    Nov 30 22:09:46 smallstar kernel: [  697.279929] [drm:dc_commit_state [amdgpu]] dc_commit_state: 0 streams
    Nov 30 22:09:46 smallstar kernel: [  697.280185] [drm:hwss_edp_backlight_control [amdgpu]] hwss_edp_backlight_control: backlight action: Off
    Nov 30 22:09:46 smallstar kernel: [  697.312816] [drm:generic_reg_wait [amdgpu]] REG_WAIT taking a while: 5ms in dce110_stream_encoder_dp_blank line:922
    Nov 30 22:09:46 smallstar kernel: [  697.313269] [drm:hwss_edp_power_control [amdgpu]] hwss_edp_power_control: Panel Power action: Off
    Nov 30 22:09:46 smallstar kernel: [  697.329572] [drm:drm_mode_setcrtc [drm]] [CRTC:40:crtc-0]
    Nov 30 22:09:46 smallstar kernel: [  697.329617] [drm:drm_mode_setcrtc [drm]] [CRTC:42:crtc-1]
    Nov 30 22:09:46 smallstar kernel: [  697.329667] [drm:drm_mode_setcrtc [drm]] [CRTC:40:crtc-0]

Touch the Ctrl key at 22:15:08 (I was called away) and X display shows:

    Nov 30 22:15:08 smallstar kernel: [ 1019.371568] [drm:drm_mode_setcrtc [drm]] [CRTC:40:crtc-0]
    Nov 30 22:15:08 smallstar kernel: [ 1019.371601] [drm:drm_mode_setcrtc [drm]] [CONNECTOR:44:eDP-1]
    Nov 30 22:15:08 smallstar kernel: [ 1019.378652] [drm:dc_commit_state [amdgpu]] dc_commit_state: 1 streams
    Nov 30 22:15:08 smallstar kernel: [ 1019.378727] [drm:dc_stream_log [amdgpu]] core_stream 0x00000000f8a1cb08: src: 0, 0, 1366, 768; dst: 0, 0, 1366, 768, colorSpace:1
    Nov 30 22:15:08 smallstar kernel: [ 1019.378797] [drm:dc_stream_log [amdgpu]] ^Ipix_clk_khz: 76300, h_total: 1558, v_total: 816, pixelencoder:1, displaycolorDepth:1
    Nov 30 22:15:08 smallstar kernel: [ 1019.378867] [drm:dc_stream_log [amdgpu]] ^Isink name: , serial: 0
    Nov 30 22:15:08 smallstar kernel: [ 1019.378938] [drm:dc_commit_state [amdgpu]] ^Ilink: 0
    Nov 30 22:15:08 smallstar kernel: [ 1019.379244] [drm:dce110_compressor_disable_fbc [amdgpu]] FBC status changed to 0
    Nov 30 22:15:08 smallstar kernel: [ 1019.379501] [drm:hwss_edp_power_control [amdgpu]] hwss_edp_power_control: Panel Power action: On
    Nov 30 22:15:08 smallstar kernel: [ 1019.500815] [drm:enable_link_dp [amdgpu]] Link: 0 eDP panel mode supported: 1 eDP panel mode enabled: 1 
    Nov 30 22:15:08 smallstar kernel: [ 1019.503669] [drm:dc_link_dp_perform_link_training [amdgpu]] HBRx1 pass VS=0, PE=0
    Nov 30 22:15:08 smallstar kernel: [ 1019.504331] [drm:hwss_edp_backlight_control [amdgpu]] hwss_edp_backlight_control: backlight action: On
    Nov 30 22:15:08 smallstar kernel: [ 1019.516877] [drm:dc_commit_state [amdgpu]] {1366x768, 1558x816@76300Khz}
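
To measure how long the panel stayed dark from a log like the one above, the backlight lines can be paired up by their kernel uptime stamps. The following is a rough sketch. The regular expression is an assumption fitted to the line format shown in this comment, and the two sample lines are copied from the excerpts above.

```python
import re

# Kernel uptime stamp plus the action reported by hwss_edp_backlight_control.
BACKLIGHT = re.compile(r"kernel: \[\s*(\d+\.\d+)\].*backlight action: (On|Off)")


def backlight_events(log_lines):
    """Return (uptime_seconds, action) for every backlight line found."""
    events = []
    for line in log_lines:
        match = BACKLIGHT.search(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def dark_intervals(events):
    """Pair each Off with the next On and return the gaps in seconds."""
    intervals = []
    off_at = None
    for stamp, action in events:
        if action == "Off":
            off_at = stamp
        elif off_at is not None:
            intervals.append(stamp - off_at)
            off_at = None
    return intervals


sample = [
    "Nov 30 22:09:46 smallstar kernel: [  697.280185] [drm:hwss_edp_backlight_control [amdgpu]] hwss_edp_backlight_control: backlight action: Off",
    "Nov 30 22:15:08 smallstar kernel: [ 1019.504331] [drm:hwss_edp_backlight_control [amdgpu]] hwss_edp_backlight_control: backlight action: On",
]
events = backlight_events(sample)
gaps = dark_intervals(events)
```

For the two lines above the gap comes to about 322 seconds, which matches the wall-clock difference between 22:09:46 and 22:15:08.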
Comment 35 Duncan Roe 2018-11-30 23:31:07 UTC
Tried to add a comment just now but got server failure
Comment 36 Duncan Roe 2018-11-30 23:33:28 UTC
Tried to add a comment just now but got server failure
Comment 37 Duncan Roe 2018-11-30 23:44:43 UTC
Control key straight away after timeout also brings up X display.
Comment 38 Duncan Roe 2018-11-30 23:45:48 UTC
Control key straight away after timeout also brings up X display.
Comment 39 Duncan Roe 2018-12-03 22:28:58 UTC
Regarding Comment 32, I have had the "dead screen" once more but noticed that caps lock light would not come on. Rather looks like the keyboard has become detached, so is a separate issue (I *am* using the laptop keyboard)
Comment 40 Duncan Roe 2019-01-08 04:07:41 UTC
Created attachment 143006 [details]
Diagnostic patches to determine which pointer is null

These patches are against Linux 4.19.12, commit 2a7cb228d29c3882c1414c10a44c5f3f59bfa44d in git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Comment 41 Duncan Roe 2019-01-08 04:19:36 UTC
Comment on attachment 143006 [details]
Diagnostic patches to determine which pointer is null

Sorry everyone - attachment 143006 [details] belongs to bug 108464
Comment 42 Duncan Roe 2019-08-05 07:59:50 UTC
Since Linux 5.1, I do not see this bug any more.
So I guess it is "fixed".
