Summary: | drm:intel_dp_start_link_train [i915] *ERROR* failed to enable link training | ||
---|---|---|---|
Product: | DRI | Reporter: | Chris Murphy <bugzilla> |
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Status: | RESOLVED MOVED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | howaboutsynergy, intel-gfx-bugs, james.ausmus, manasi.d.navare |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | Triaged, ReadyForDev | ||
i915 platform: | SKL | i915 features: | display/Other |
Attachments: |
Description
Chris Murphy
2019-01-31 16:20:55 UTC
Created attachment 143263 [details]
journalctl
kernel 5.0.0-0.rc4.git1.1.fc30.x86_64 is git 4aa9fc2a435a
drm.debug=0x1e is applied
These times are based on a cellphone; the laptop hasn't yet raised wifi to sync with NTP so its clock could be off by multiple seconds.
8:37:31 - approximate time spacebar is pushed to wake laptop
8:37:42 - approximate time of flicker
This journal contains everything from the time of wake through login; about 90 seconds.
Created attachment 143264 [details]
acpidump
acpidump > acpidump.bin
Created attachment 143265 [details]
dmidecode
dmidecode > dmidecode.txt
Created attachment 143266 [details]
lspci
lspci -vvnn > lspcivvnn.txt
The reported flicker happens on the built-in display. At the time of the flicker the external display is still in powersave, its amber light hasn't yet turned green. Is this issue duplicate to Bug 105998? It is a dup of 105998 comment 3. But I'm not certain if it's a dup of that bug report. Actually in bug 105998 comment 4 suggests I open a new bug, as it's unclear if what I'm seeing in comment 3 is really the same problem as that bug. Anyway, I can test with i915.disable_power_well=0 if that's helpful. Chris, can you add the full dmesg with kernal parameters drm.debug=0x1e log_buf_len=4M from boot? That might more information about the issue. Created attachment 143444 [details]
dmesg
5.0.0-0.rc7.git2.1 - This is linux git 2137397c92ae
wakeup and error happens at
[ 362.025567] flap.local kernel: [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training
Getting this on ArchLinux with kernel 5.1.3-g7cb9c5d341b9, i7-8700k CPU, no discrete graphics card, motherboard Asus Prime Z370-A, using BIOS version Version 1002 2018/07/06 8.76 MBytes (latest BIOS being Version 2001 2019/05/15 10.31 MBytes, please don't ask me to update), intel-ucode 20190514.a-1 (ie. the MDS one), mesa 19.0.4-1, NOTE: NOT using: xf86-video-intel, libva-intel-driver, libva1-intel-driver. ``` 00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 630 (Desktop) 00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller ``` I think this is why I also get: ``` $ xrandr --verbose|grep -C2 -w 85|grep --color=always Bad|head -1 link-status: Bad $ grep -A1 DP-1 /home/user/.config/xfce4/xfconf/xfce-perchannel-xml/displays.xml|grep -e 'Active' -e 'false' <property name="Active" type="bool" value="false"/> ``` seemingly at/after the times when monitor was turned off(or went to sleep?): ``` [ 4373.521845] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 6979.762718] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training ``` Something from `Xorg.0.log`: ``` ... [ 32.021] (II) modeset(0): glamor X acceleration enabled on Mesa DRI Intel(R) UHD Graphics 630 (Coffeelake 3x8 GT2) ... [ 2974.053] (II) modeset(0): Modeline "1280x800"x0.0 83.50 1280 1352 1480 1680 800 803 809 831 -hsync +vsync (49.7 kHz e) [ 4374.136] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... [ 4374.719] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... [ 4378.425] (II) modeset(0): EDID vendor "GSM", prod id 23305 ... [ 4442.760] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... ... [ 4445.432] (II) modeset(0): Modeline "1280x800"x0.0 83.50 1280 1352 1480 1680 800 803 809 831 -hsync +vsync (49.7 kHz e) [ 6980.360] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... [ 6980.926] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... [ 6980.976] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... [ 6981.172] (II) modeset(0): EDID vendor "GSM", prod id 23305 ... [ 6990.753] (WW) modeset(0): hotplug event: connector 85's link-state is BAD, tried resetting the current mode. You may be leftwith a black screen if this fails... ``` Can't manually reproduce it, I tried turning monitor off, then back on; tried: ``` sleep 1; xset dpms force suspend; sleep 1; xset dpms force off; sleep 1; xset dpms force standby setterm --blank force ``` They all seemed to work so far, so the dmesg output with this patch: ```diff $ git diff drivers/gpu/drm/i915/intel_dp_link_training.c|tee diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c index b59c87daa4f7..e69b2198af89 100644 --- a/drivers/gpu/drm/i915/intel_dp_link_training.c +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c @@ -87,6 +87,7 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, ret = drm_dp_dpcd_write(&intel_dp->aux, DP_TRAINING_PATTERN_SET, buf, len); + DRM_ERROR("ret=%d len=%d\n", ret, len); return ret == len; } ``` apparently right after the monitor turns on, I get these 5,5,1 each time on dmesg: ``` $ dmesg --level err [ 3.844322] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 3.846020] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 3.848732] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 56.871626] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 56.873573] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 56.874693] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 58.212665] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 58.214381] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 58.217124] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 103.393851] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 103.395610] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 103.398358] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 ``` So this is when it works normally! When it doesn't work..., I'll have to update in a next comment when I hit it again. Curious what ret&len I'll get. ok I got it: 1. first turn off monitor 2. while monitor is off, run that `monoff` script from the prev. comment, yes this: ``` sleep 1; xset dpms force suspend; sleep 1; xset dpms force off; sleep 1; xset dpms force standby setterm --blank force ``` 3. wait a few seconds to be sure 4. press Ctrl key to wake up desktop 5. turn on monitor got: ``` [ 694.869217] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=5 [ 694.869243] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 695.138593] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=1 [ 696.768921] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 696.770645] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 696.773386] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 ``` Ok I tried it again, but left 10 seconds before actually doing step 5: ``` [ 887.319215] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=5 [ 887.319254] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 887.588579] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=1 [ 898.050275] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 898.052012] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 898.054707] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 ``` So looks like link is Good afterwards(after step 5 that is), but something keeps track of link as being still Bad (via xrandr) if it once was Bad (10 seconds ago). linux-stable/usr/include/asm-generic/errno.h #define ETIMEDOUT 110 /* Connection timed out */ that's the -110 for the ret it obviously times out because the monitor is physically(from the button) off. But as soon as it's turned on, all is good again, however that `xrandr --verbose` still reports connection as Bad... unsure if this is an issue in something else. ok, this is weird, I can't reproduce it anymore using Comment 13 steps, after the first boot since the PC was turned off completely. tl;dr: for some reason running xrandr every second prevents this from happening. ok the only reason that I couldn't reproduce it is because(/when) I was running the following in another terminal: ``` $ while true; do (date; cat /proc/uptime; xrandr --verbose|grep -w 85 -C 2|grep 'link-status: Bad'|| echo "good"); sleep 1; done ``` So while that is running, there's no way I can reproduce it with Comment 13 steps! As soon as I C-c (aka stop) it, bam, steps are good to reproduce. Start that while again and again can't reproduce! So, I guess that's a workaround? @Manasi, can you help here? ok I have another way to reproduce that bypasses the xrandr run, and it involves switching step 1 with step 2 in Comment 13. So with this, it doesn't matter if xrandr is ran in a `while` loop, the -110 will happen nonetheless. Kernel tested 5.1.10 The monitor is LG(something) and is connected via the DisplayPort to the ASUS Prime Z370-A motherboard connector and I've no dedicated gfx card, it's using the integrated one from Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz (oh I see I've mentioned this in Comment 11 already, the BIOS is the same version still because the next one is probably trojaned xD) Ok so to summarize, steps from Comment 13 will only work(to reproduce the issue) if you're not running xrandr in a while loop like: $ while true; do (date; cat /proc/uptime; xrandr --verbose --dryrun --nograb|grep -w 85 -C 2|grep 'link-status: Bad'|| echo "good"); sleep 1; done but, if you insist on running xrandr like that(or even without --dryrun --nograb) then simply switching step 1 with step 2(aka run 'monoff' first then turn off monitor) then xrandr will have no effect and the issue can be reproduced 100% (or so it seems, to me). By reproduce the issue I mean: [ 3946.388708] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=5 [ 3946.388736] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 3946.658135] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=1 ... [ 3950.951159] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 3950.952869] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 3950.955573] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 And in the case when xrandr is ran in a `while` loop using Comment 13 steps (without swapping steps 1 and 2), only the last 3 dmesg messages from above appear, meaning the issue doesn't occurr. I'm up for trying any patches/tries that you want me to test, even if they lockup/oops/panic, I have kdump set up and we'll get nice stacktrace and stuff;-) just like in this bug for example: https://bugzilla.kernel.org/show_bug.cgi?id=203849 Cheers! `i915.disable_power_well=0` does not help, ie.: ``` $ dmesg|grep -e '0000:00:02.0' -e 'i915' [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux-stable root=UUID=2b8b9ab8-7ac5-4586-aa42-d7ffb12de92a rw root_trim=yes rd.luks.allow-discards rd.luks.options=discard ipv6.disable=1 ipv6.disable_ipv6=1 ipv6.autoconf=0 loglevel=15 log_buf_len=16M ignore_loglevel printk.always_kmsg_dump=y printk.time=y printk.devkmsg=on mminit_loglevel=4 memory_corruption_check=1 fbcon=scrollback:4096k fbcon=font:ProFont6x11 net.ifnames=0 nolvm dobtrfs console=tty1 earlyprintk=vga audit=0 systemd.log_target=kmsg systemd.journald.forward_to_console=1 enforcing=0 udev.children-max=1256 rd.udev.children-max=1256 nohz=on oops=panic crashkernel=256M panic=0 page_poison=1 psi=1 sysrq_always_enabled random.trust_cpu=off logo.nologo lpj=0 mce=bootlog reboot=force,cold noexec=on nohibernate scsi_mod.use_blk_mq=1 consoleblank=120 mitigations=off nospectre_v1 nospectre_v2 spectre_v2=off nospec_store_bypass_disable kvm-intel.vmentry_l1d_flush=never l1tf=off nopti pti=off no_stf_barrier noibrs noibpb ssbd=force-off spectre_v2_user=off noretpoline mds=off rd.log=all noefi cpuidle.governor=menu zram.num_devices=3 zswap.enabled=0 zswap.same_filled_pages_enabled=1 zswap.compressor=zstd zswap.max_pool_percent=40 zswap.zpool=z3fold i915.alpha_support=1 i915.fastboot=1 i915.disable_power_well=0 [ 0.351304] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-linux-stable root=UUID=2b8b9ab8-7ac5-4586-aa42-d7ffb12de92a rw root_trim=yes rd.luks.allow-discards rd.luks.options=discard ipv6.disable=1 ipv6.disable_ipv6=1 ipv6.autoconf=0 loglevel=15 log_buf_len=16M ignore_loglevel printk.always_kmsg_dump=y printk.time=y printk.devkmsg=on mminit_loglevel=4 memory_corruption_check=1 fbcon=scrollback:4096k fbcon=font:ProFont6x11 net.ifnames=0 nolvm dobtrfs console=tty1 earlyprintk=vga audit=0 systemd.log_target=kmsg systemd.journald.forward_to_console=1 enforcing=0 udev.children-max=1256 rd.udev.children-max=1256 nohz=on oops=panic crashkernel=256M panic=0 page_poison=1 psi=1 sysrq_always_enabled random.trust_cpu=off logo.nologo lpj=0 mce=bootlog reboot=force,cold noexec=on nohibernate scsi_mod.use_blk_mq=1 consoleblank=120 mitigations=off nospectre_v1 nospectre_v2 spectre_v2=off nospec_store_bypass_disable kvm-intel.vmentry_l1d_flush=never l1tf=off nopti pti=off no_stf_barrier noibrs noibpb ssbd=force-off spectre_v2_user=off noretpoline mds=off rd.log=all noefi cpuidle.governor=menu zram.num_devices=3 zswap.enabled=0 zswap.same_filled_pages_enabled=1 zswap.compressor=zstd zswap.max_pool_percent=40 zswap.zpool=z3fold i915.alpha_support=1 i915.fastboot=1 i915.disable_power_well=0 [ 2.757573] pci 0000:00:02.0: [8086:3e92] type 00 class 0x030000 [ 2.757741] pci 0000:00:02.0: reg 0x10: [mem 0xde000000-0xdeffffff 64bit] [ 2.757906] pci 0000:00:02.0: reg 0x18: [mem 0xc0000000-0xcfffffff 64bit pref] [ 2.758133] pci 0000:00:02.0: reg 0x20: [io 0xf000-0xf03f] [ 2.758318] device: '0000:00:02.0': device_add [ 2.758446] bus: 'pci': add device 0000:00:02.0 [ 2.758609] PM: Adding info for pci:0000:00:02.0 [ 2.904428] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff] [ 5.653546] bus: 'pci': add driver i915 [ 5.653711] bus: 'pci': driver_probe_device: matched device 0000:00:02.0 with driver i915 [ 5.653938] bus: 'pci': really_probe: probing driver i915 with device 0000:00:02.0 [ 5.654166] i915 0000:00:02.0: no default pinctrl state [ 5.654329] i915 0000:00:02.0: runtime IRQ mapping not provided by arch [ 5.657558] i2c i2c-1: adapter [i915 gmbus dpc] registered [ 5.658289] i2c i2c-2: adapter [i915 gmbus dpb] registered [ 5.658975] i2c i2c-3: adapter [i915 gmbus dpd] registered [ 5.659613] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4) [ 5.944140] device: 'i915': device_add [ 5.944304] bus: 'event_source': add device i915 [ 5.944466] PM: Adding info for event_source:i915 [ 5.948208] [drm] Initialized i915 1.6.0 20190207 for 0000:00:02.0 on minor 0 [ 5.952289] i915 0000:00:02.0: adding component (ops i915_audio_component_bind_ops [i915]) [ 5.952878] snd_hda_intel 0000:00:1f.3: found component 0000:00:02.0, duplicate 0 [ 5.953119] snd_hda_intel 0000:00:1f.3: binding 0000:00:02.0 (ops i915_audio_component_bind_ops [i915]) [ 5.953732] snd_hda_intel 0000:00:1f.3: Linked as a consumer to 0000:00:02.0 [ 5.953909] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915]) [ 5.954339] driver: 'i915': driver_bound: bound to device '0000:00:02.0' [ 5.954507] bus: 'pci': really_probe: bound device 0000:00:02.0 to driver i915 [ 5.995964] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 5.997663] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 6.000374] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 6.062740] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device ``` ``` ... [ 5.648405] Setting dangerous option disable_power_well - tainting kernel ... [ 140.419975] gpg-agent[1102]: handler 0x7a423a6ff700 for fd 10 started [ 140.477095] gpg-agent[1102]: handler 0x7a423a6ff700 for fd 10 terminated [ 200.482440] gpg-agent[1102]: handler 0x7a4239efe700 for fd 10 started [ 200.543903] gpg-agent[1102]: handler 0x7a4239efe700 for fd 10 terminated [ 201.779269] user[2626]: '/home/user/bin/monoff' is executing: 'sleep 1' [ 202.782765] user[2633]: '/home/user/bin/monoff' is executing: 'xset dpms force suspend' [ 202.886388] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0 [ 202.898494] snd_hda_intel 0000:00:1f.3: PME# disabled [ 202.911455] snd_hda_intel 0000:00:1f.3: PME# enabled [ 202.923592] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 202.946158] user[2635]: '/home/user/bin/monoff' is executing: 'sleep 1' [ 203.900374] user[2642]: '/home/user/bin/monoff' is executing: 'xset dpms force off' [ 204.009111] user[2644]: '/home/user/bin/monoff' is executing: 'sleep 1' [ 205.015169] user[2651]: '/home/user/bin/monoff' is executing: 'xset dpms force standby' [ 205.124020] user[2653]: '/home/user/bin/monoff' is executing: 'setterm --blank force' [ 220.715002] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=5 [ 220.715016] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link training [ 220.984404] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=1 [ 220.984787] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0 [ 220.996593] snd_hda_intel 0000:00:1f.3: PME# disabled [ 221.009573] snd_hda_intel 0000:00:1f.3: PME# enabled [ 221.022464] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 221.320977] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 221.324827] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 221.328677] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 221.600815] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 221.604597] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 221.608409] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 221.904445] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 221.908255] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 221.912065] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 222.183955] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 222.187592] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 222.191224] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 222.544129] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 222.547899] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 222.551550] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 222.585170] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0 [ 222.596599] snd_hda_intel 0000:00:1f.3: PME# disabled [ 222.609442] snd_hda_intel 0000:00:1f.3: PME# enabled [ 222.621440] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 232.057307] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 232.366672] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 232.378634] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 232.382283] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 232.385931] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 232.397218] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 232.400973] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 232.404723] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 232.443129] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 232.446689] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 232.450356] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 232.457487] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 232.459178] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 232.461879] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 232.462209] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0 [ 232.473474] snd_hda_intel 0000:00:1f.3: PME# disabled [ 232.486275] snd_hda_intel 0000:00:1f.3: PME# enabled [ 232.497460] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 260.543548] gpg-agent[1102]: handler 0x7a423a6ff700 for fd 10 started [ 260.627503] gpg-agent[1102]: handler 0x7a423a6ff700 for fd 10 terminated [ 301.490986] user[3150]: okiedokie ``` at [ 220.715002] I pressed Ctrl key to wake up the software but the monitor was still off! got the ret=-110 messages. 10 seconds later(so at [ 232.057307] I'm guessing) I turned on monitor, got the normal ret=5 messages. `drm.debug=0x1e` spews just too many messages, not sure what to do with it... not even sure if it would log all of them during the test or they would just get overwritten due to the ring property of the dmesg(kmsg?) buffer. Maybe I'll retry. Created attachment 144603 [details] dmesg with drm.debug=0x1e on both just console and within X only while inside X (ie. startx) `drm.debug=0x1e` constantly generates dmesg spam I'm attaching full dmesg log (ring buffer didn't wrap around and thus no overwrites) with drm.debug=0x1e First I tried in console only (tty1) and the issue doesn't happen there, so this is good: [ 6.032810] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 however on console/tty1 , it requires running `monon` in order to unblank! Pressing enter didn't do anything, so between these two messages I couldn't see what I was typing: [ 222.564063] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 229.562650] user[1383]: '/home/user/bin/monon'(on /dev/tty1) is executing: 'sleep 1' I guess at [ 395.795237] [drm:drm_atomic_state_init [drm]] Allocated atomic state 0000000023bfb3ad I ran `startx` and that's when the spam started. At about [ 415.457985] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 I ran a normal monoff then turned on monitor test, so this wouldn't have caused the -110 (link training issue) - I actually forgot (sort of) the steps I was supposed to do here, so ignore this. I did it right at: [ 443.388949] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=-110 len=5 so between these two I waited: after pressing the key(433) to then turning on monitor(441, 8 seconds later): [ 433.005944] [drm:__drm_atomic_state_free [drm]] Freeing atomic state 0000000076021904 [ 441.745985] [drm:drm_atomic_state_init [drm]] Allocated atomic state 0000000076021904 and like 2 sec later is the -110 mentioned above. Looks like it's those: [ 441.820989] [drm:intel_dp_aux_xfer [i915]] dp_aux_ch timeout status 0x7d4003ff that Manasi mentions in https://bugs.freedesktop.org/show_bug.cgi?id=105998#c5 however since I've tried with `i915.disable_power_well=0` in my previous comment(albeit without the `drm.debug=0x1e` - does it matter though?!) and it had no effect, I'm out of ideas.. ------- monon is essentially this: #!/bin/bash roon() { local ec logger "'$0'(on $(tty)) is executing: '$*'" "$@" ; ec="$?" if test "$ec" != "0"; then logger "'$0' failed(ec='$ec') executing: '$*'" fi return "$ec" } roon sleep 1 roon xset dpms force on roon setterm --blank poke #to also get back from blanking roon xset s reset ------- monoff is this: #!/bin/bash roon() { local ec logger "'$0'(on $(tty)) is executing: '$*'" "$@" ; ec="$?" if test "$ec" != "0"; then logger "'$0' failed(ec='$ec') executing: '$*'" fi return "$ec" } roon sleep 1 roon xset dpms force suspend roon sleep 1 roon xset dpms force off roon sleep 1 roon xset dpms force standby roon setterm --blank force ------- correction: [ 6.032810] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 should be: [ 222.488296] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 I guess it just needs a little more time: [ 125.566788] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 125.570588] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 125.574389] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 130.895667] user[1498]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'sleep 1' [ 131.904024] user[1506]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'xset dpms force suspend' [ 132.011136] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 132.019006] user[1509]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'sleep 1' [ 133.027453] user[1517]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'xset dpms force off' [ 133.138670] user[1520]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'sleep 1' [ 134.147175] user[1528]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'xset dpms force standby' [ 134.258545] user[1531]: '/home/user/bin/monoff'(on /dev/pts/1) is executing: 'setterm --blank force' [ 138.779003] snd_hda_intel 0000:00:1f.3: PME# enabled [ 138.790413] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 151.416744] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 887 retries [ 151.417491] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(2) after 0 retries [ 151.417637] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(2) after 0 retries [ 151.417806] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(5) after 0 retries [ 151.417822] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 151.418056] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.418230] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.418389] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(4) after 0 retries [ 151.418623] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.418797] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.418956] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(4) after 0 retries [ 151.419189] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.419363] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.419531] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(5) after 0 retries [ 151.419547] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=5 len=5 [ 151.420080] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.420254] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.420413] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(4) after 0 retries [ 151.420948] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.421121] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.421279] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(4) after 0 retries [ 151.421814] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.421987] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.422268] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.422282] [drm:intel_dp_set_link_train [i915]] *ERROR* ret=1 len=1 [ 151.422634] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D0 [ 151.434403] snd_hda_intel 0000:00:1f.3: PME# disabled [ 151.447335] snd_hda_intel 0000:00:1f.3: PME# enabled [ 151.459283] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.460351] snd_hda_intel 0000:00:1f.3: power state changed by ACPI to D3hot [ 151.488549] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.492329] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 151.496143] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 151.500188] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.504014] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 151.507808] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 151.620812] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.621097] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(15) after 0 retries [ 151.621265] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.621468] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.621632] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.621873] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(12) after 0 retries [ 151.622022] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.622172] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.628561] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.628726] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.628901] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.629110] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.632209] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.638502] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.638779] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(15) after 0 retries [ 151.638933] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.639086] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.639242] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.639497] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(12) after 0 retries [ 151.639664] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.639826] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.639988] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.640183] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.646707] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.646863] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.650080] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.653663] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 151.657264] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 151.658088] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.658377] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(15) after 0 retries [ 151.658542] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.658704] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.658860] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.659104] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(12) after 0 retries [ 151.659258] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.659439] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.659626] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.659830] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(6) after 0 retries [ 151.666278] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.666455] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 0 retries [ 151.670324] i2c i2c-2: NAK from device addr 0x50 msg #0 [ 151.673941] i2c i2c-1: NAK from device addr 0x50 msg #0 [ 151.677536] i2c i2c-3: NAK from device addr 0x50 msg #0 [ 186.356944] gpg-agent[1081]: handler 0x7b7cdd4c7700 for fd 10 started [ 186.406135] gpg-agent[1081]: handler 0x7b7cdd4c7700 for fd 10 terminated [ 235.049835] user[2041]: okiedo 887 retries instead of just 32 however, if I use 3200 retries, there's a long(21sec?) blank screen at system startup due to something else timing out(also seen in attached dmesg in Comment 20): [ 11.444603] e1000e: net0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx [ 32.961810] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries, giving up. First error: -110 [ 32.962048] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(-110) after 3200 retries [ 32.962268] [drm] failed to retrieve link info, disabling eDP [ 32.965708] device: 'i915': device_add [ 32.965904] bus: 'event_source': add device i915 [ 32.966060] PM: Adding info for event_source:i915 the "patch" I used was: diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c index b59c87daa4f7..e69b2198af89 100644 --- a/drivers/gpu/drm/i915/intel_dp_link_training.c +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c @@ -87,6 +87,7 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, ret = drm_dp_dpcd_write(&intel_dp->aux, DP_TRAINING_PATTERN_SET, buf, len); + DRM_ERROR("ret=%d len=%d\n", ret, len); return ret == len; } diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 54a6414c5d96..a6baf92dba8f 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -234,7 +234,7 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request, * aux i2c transactions but real world devices this wasn't * sufficient, bump to 32 which makes Dell 4k monitors happier. */ - for (retry = 0; retry < 32; retry++) { + for (retry = 0; retry < 3200; retry++) { if (ret != 0 && ret != -ETIMEDOUT) { usleep_range(AUX_RETRY_INTERVAL, AUX_RETRY_INTERVAL + 100); @@ -262,10 +262,11 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request, err = ret; } - DRM_DEBUG_KMS("Too many retries, giving up. First error: %d\n", err); + DRM_ERROR("Too many retries, giving up. First error: %d\n", err); ret = err; unlock: + DRM_ERROR("Success(%d) after %u retries\n", ret, retry); mutex_unlock(&aux->hw_mutex); return ret; } I guess this will vary... [ 1410.660548] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 1189 retries [ 1495.555416] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 1481 retries [ 1533.515404] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 1381 retries [ 1597.437236] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Success(1) after 965 retries but if I don't wake up the software(or should I say video card?) by pressing a key before I turn on the monitor, the success is after 0 retries! actually the variation is strictly due to the time I spent until I turned on the monitor! if I don't turn it on, it will still -110 the only problem is that it keeps retrying for about 220 seconds(while the monitor is still off): [ 117.970253] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* drm_dp_dpcd_access starting, stack: ... [ 134.801604] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 151.638435] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 168.473882] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 185.388476] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 202.299276] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 219.451539] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 236.282573] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 253.066295] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 269.861933] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 286.623649] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 303.417173] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 320.280230] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 ... [ 337.188082] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Too many retries(2000), giving up. First error: -110 [ 337.188086] [drm:drm_dp_dpcd_access [drm_kms_helper]] *ERROR* Failed ret='-110' after 2000 retries ... [ 337.188284] i915 Wakeref count: 62 ... looking at the caller stack, it's for different callers, but it gives up after 13 such attempts, apparently (unless that's the time the video card went to sleep again, since I went away from the computer - monitor was still turned off) Apart from from the log messages, is there any other impact? Have you verified the issue with drmtip? > Apart from from the log messages, is there any other impact? Yes, xrandr will report(until reboot, apparently) that link-status is Bad (previously, was Good) as soon as this issue happens once in the current session/boot. and file contents of /home/user/.config/xfce4/xfconf/xfce-perchannel-xml/displays.xml change from: - <property name="Active" type="bool" value="true"/> to + <property name="Active" type="bool" value="false"/> in context: <property name="DP-1" type="string" value="LG 27""> <property name="Active" type="bool" value="false"/> <property name="EDID" type="string" value="0fd458f22dac5decf2a8543c7190b329e7cf113c"/> <property name="Resolution" type="string" value="3840x2160"/> <property name="RefreshRate" type="double" value="59.996625"/> <property name="Rotation" type="int" value="0"/> <property name="Reflection" type="string" value="0"/> <property name="Primary" type="bool" value="true"/> <property name="Position" type="empty"> <property name="X" type="int" value="0"/> <property name="Y" type="int" value="0"/> </property> </property> Here's full xrandr output for the aforementioned `link-status: Bad`: $ xrandr --verbose Screen 0: minimum 320 x 200, current 3840 x 2160, maximum 8192 x 8192 DP-1 connected 3840x2160+0+0 (0x47) normal (normal left inverted right x axis y axis) 600mm x 340mm Identifier: 0x42 Timestamp: 372563 Subpixel: unknown Gamma: 1.0:1.0:1.0 Brightness: 1.0 Clones: HDMI-1 CRTC: 0 CRTCs: 0 1 2 Transform: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 filter: EDID: 00ffffffffffff001e6d095b1f2c0400 011c0104b53c22789e3035a7554ea326 0f50542108007140818081c0a9c0d1c0 8100010101014dd000a0f0703e803020 650c58542100001a286800a0f0703e80 0890650c58542100001a000000fd0038 3d1e8738000a202020202020000000fc 004c4720556c7472612048440a20014f 02031171449004030123090707830100 00023a801871382d40582c4500585421 00001e565e00a0a0a029503020350058 542100001a0000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 000000000000000000000000000000c8 Content Protection: Undesired supported: Undesired, Desired, Enabled max bpc: 12 range: (6, 12) Broadcast RGB: Automatic supported: Automatic, Full, Limited 16:235 audio: auto supported: force-dvi, off, auto, on link-status: Bad supported: Good, Bad CONNECTOR_ID: 85 supported: 85 non-desktop: 0 range: (0, 1) 3840x2160 (0x47) 533.250MHz +HSync -VSync *current +preferred h: width 3840 start 3888 end 3920 total 4000 skew 0 clock 133.31KHz v: height 2160 start 2214 end 2219 total 2222 clock 60.00Hz 3840x2160 (0x48) 266.640MHz +HSync -VSync h: width 3840 start 3848 end 3992 total 4000 skew 0 clock 66.66KHz v: height 2160 start 2214 end 2219 total 2222 clock 30.00Hz 2560x1440 (0x49) 241.500MHz +HSync -VSync h: width 2560 start 2608 end 2640 total 2720 skew 0 clock 88.79KHz v: height 1440 start 1443 end 1448 total 1481 clock 59.95Hz 1920x1080 (0x4a) 148.500MHz +HSync +VSync h: width 1920 start 2008 end 2052 total 2200 skew 0 clock 67.50KHz v: height 1080 start 1084 end 1089 total 1125 clock 60.00Hz 1920x1080 (0x4b) 148.500MHz +HSync +VSync h: width 1920 start 2008 end 2052 total 2200 skew 0 clock 67.50KHz v: height 1080 start 1084 end 1089 total 1125 clock 60.00Hz 1920x1080 (0x4c) 148.352MHz +HSync +VSync h: width 1920 start 2008 end 2052 total 2200 skew 0 clock 67.43KHz v: height 1080 start 1084 end 1089 total 1125 clock 59.94Hz 1600x900 (0x4d) 108.000MHz +HSync +VSync h: width 1600 start 1624 end 1704 total 1800 skew 0 clock 60.00KHz v: height 900 start 901 end 904 total 1000 clock 60.00Hz 1280x1024 (0x4e) 108.000MHz +HSync +VSync h: width 1280 start 1328 end 1440 total 1688 skew 0 clock 63.98KHz v: height 1024 start 1025 end 1028 total 1066 clock 60.02Hz 1280x800 (0x4f) 83.500MHz -HSync +VSync h: width 1280 start 1352 end 1480 total 1680 skew 0 clock 49.70KHz v: height 800 start 803 end 809 total 831 clock 59.81Hz 1152x864 (0x50) 81.579MHz -HSync +VSync h: width 1152 start 1216 end 1336 total 1520 skew 0 clock 53.67KHz v: height 864 start 865 end 868 total 895 clock 59.97Hz 1280x720 (0x51) 74.250MHz +HSync +VSync h: width 1280 start 1390 end 1430 total 1650 skew 0 clock 45.00KHz v: height 720 start 725 end 730 total 750 clock 60.00Hz 1280x720 (0x52) 74.250MHz +HSync +VSync h: width 1280 start 1390 end 1430 total 1650 skew 0 clock 45.00KHz v: height 720 start 725 end 730 total 750 clock 60.00Hz 1280x720 (0x53) 74.176MHz +HSync +VSync h: width 1280 start 1390 end 1430 total 1650 skew 0 clock 44.96KHz v: height 720 start 725 end 730 total 750 clock 59.94Hz 1024x768 (0x54) 65.000MHz -HSync -VSync h: width 1024 start 1048 end 1184 total 1344 skew 0 clock 48.36KHz v: height 768 start 771 end 777 total 806 clock 60.00Hz 800x600 (0x55) 40.000MHz +HSync +VSync h: width 800 start 840 end 968 total 1056 skew 0 clock 37.88KHz v: height 600 start 601 end 605 total 628 clock 60.32Hz 720x480 (0x56) 27.027MHz -HSync -VSync h: width 720 start 736 end 798 total 858 skew 0 clock 31.50KHz v: height 480 start 489 end 495 total 525 clock 60.00Hz 720x480 (0x57) 27.000MHz -HSync -VSync h: width 720 start 736 end 798 total 858 skew 0 clock 31.47KHz v: height 480 start 489 end 495 total 525 clock 59.94Hz 640x480 (0x58) 25.200MHz -HSync -VSync h: width 640 start 656 end 752 total 800 skew 0 clock 31.50KHz v: height 480 start 490 end 492 total 525 clock 60.00Hz 640x480 (0x59) 25.175MHz -HSync -VSync h: width 640 start 656 end 752 total 800 skew 0 clock 31.47KHz v: height 480 start 490 end 492 total 525 clock 59.94Hz 640x480 (0x5a) 25.175MHz -HSync -VSync h: width 640 start 656 end 752 total 800 skew 0 clock 31.47KHz v: height 480 start 490 end 492 total 525 clock 59.94Hz HDMI-1 disconnected (normal left inverted right x axis y axis) Identifier: 0x43 Timestamp: 372563 Subpixel: unknown Clones: DP-1 CRTCs: 0 1 2 Transform: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 filter: Content Protection: Undesired supported: Undesired, Desired, Enabled max bpc: 12 range: (8, 12) content type: No Data supported: No Data, Graphics, Photo, Cinema, Game aspect ratio: Automatic supported: Automatic, 4:3, 16:9 Broadcast RGB: Automatic supported: Automatic, Full, Limited 16:235 audio: auto supported: force-dvi, off, auto, on link-status: Good supported: Good, Bad CONNECTOR_ID: 93 supported: 93 non-desktop: 0 range: (0, 1) HDMI-2 disconnected (normal left inverted right x axis y axis) Identifier: 0x44 Timestamp: 372563 Subpixel: unknown Clones: CRTCs: 0 1 2 Transform: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 filter: Content Protection: Undesired supported: Undesired, Desired, Enabled max bpc: 12 range: (8, 12) content type: No Data supported: No Data, Graphics, Photo, Cinema, Game aspect ratio: Automatic supported: Automatic, 4:3, 16:9 Broadcast RGB: Automatic supported: Automatic, Full, Limited 16:235 audio: auto supported: force-dvi, off, auto, on link-status: Good supported: Good, Bad CONNECTOR_ID: 99 supported: 99 non-desktop: 0 range: (0, 1) HDMI-3 disconnected (normal left inverted right x axis y axis) Identifier: 0x45 Timestamp: 372563 Subpixel: unknown Clones: CRTCs: 0 1 2 Transform: 1.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 1.000000 filter: Content Protection: Undesired supported: Undesired, Desired, Enabled max bpc: 12 range: (8, 12) content type: No Data supported: No Data, Graphics, Photo, Cinema, Game aspect ratio: Automatic supported: Automatic, 4:3, 16:9 Broadcast RGB: Automatic supported: Automatic, Full, Limited 16:235 audio: auto supported: force-dvi, off, auto, on link-status: Good supported: Good, Bad CONNECTOR_ID: 103 supported: 103 non-desktop: 0 range: (0, 1) I'm not sure what other side-effects these cause. > Have you verified the issue with drmtip? I haven't tried drmtip, but I'm running most recent kernel stable 5.1.15-gf0fae702de30 (released 40 hours ago) If drmtip would've fixed it, I think they would've mentioned it here, no? Thanks! looks like full reboot isn't necessary but instead restarting X twice will clear both changes: first X restart will clear the xrandr Bad link status, and the second X restart will update/clear the change in ~/.config/xfce4/xfconf/xfce-perchannel-xml/displays.xml (it becomes `true` again) > > > Have you verified the issue with drmtip? > > I haven't tried drmtip, but I'm running most recent kernel stable > 5.1.15-gf0fae702de30 (released 40 hours ago) > If drmtip would've fixed it, I think they would've mentioned it here, no? Drmtip (https://cgit.freedesktop.org/drm-tip) is a separate branch where you will find latest changes. It will take some time to get these changes to stable kernel. Logs from drmtip will help from investigation point of view. Also, we need to know if this issue is same as the original bug comment 1, even though the error is same this could be a different issue. Anyways logs from drmtip would be helpful. Created attachment 144653 [details] dmesg with drm.debug-0x1e on drm-tip (only within X) Used this drm-tip: commit cc7ffe2a301509ba233652ec17d0f74c18b5c8a9 (HEAD -> drm-tip, origin/drm-tip, origin/HEAD) Date: Thu Jun 27 10:28:10 2019 +0200 drm-tip: 2019y-06m-27d-08h-26m-31s UTC integration manifest Had the following changes in `drivers/gpu/drm/` so that hopefully dumping the stack each time -110 was hit(1+13 times) is helpful to see in the dmesg, however this does increase the time spent(by about 4-5 seconds total), so let me know if you want me to undo this and repost one clean-ish without it. ```patch diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index b9362b4f6353..c5f7f4c37565 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -76,7 +76,7 @@ config DRM_KMS_FB_HELPER bool depends on DRM_KMS_HELPER select FB - select FRAMEBUFFER_CONSOLE if !EXPERT + select FRAMEBUFFER_CONSOLE select FRAMEBUFFER_CONSOLE_DETECT_PRIMARY if FRAMEBUFFER_CONSOLE select FB_SYS_FOPS select FB_SYS_FILLRECT diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 0b994d083a89..92b2199af2ad 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -264,10 +264,12 @@ static int drm_dp_dpcd_access(struct drm_dp_aux *aux, u8 request, err = ret; } - DRM_DEBUG_KMS("Too many retries, giving up. First error: %d\n", err); + DRM_ERROR("Too many retries, giving up. First error: %d\n", err); + dump_stack(); ret = err; unlock: + DRM_ERROR("Ret '%d' after %u retries\n", ret, retry); mutex_unlock(&aux->hw_mutex); return ret; } diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 1984e5c54d58..60004f8ae03a 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -2379,7 +2379,7 @@ EXPORT_SYMBOL(drm_fbdev_generic_setup); */ int __init drm_fb_helper_modinit(void) { -#if defined(CONFIG_FRAMEBUFFER_CONSOLE_MODULE) && !defined(CONFIG_EXPERT) +#if defined(CONFIG_FRAMEBUFFER_CONSOLE_MODULE) && ( !defined(CONFIG_EXPERT) || defined(CONFIG_DRM_MODULE) || defined(CONFIG_DRM) ) const char name[] = "fbcon"; struct module *fbcon; diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c b/drivers/gpu/drm/i915/display/intel_dp_link_training.c index 9b1fccea966b..1242cf9e9828 100644 --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c @@ -89,6 +89,7 @@ intel_dp_set_link_train(struct intel_dp *intel_dp, ret = drm_dp_dpcd_write(&intel_dp->aux, DP_TRAINING_PATTERN_SET, buf, len); + DRM_ERROR("ret=%d len=%d\n", ret, len); return ret == len; } ``` For full set of changes(including actual .config file used) see: https://gist.github.com/howaboutsynergy/9b6e890ec15592a55a0c07b5db234ce6 Created attachment 144654 [details]
cleaner dmesg with drm.debug=0x1e on drm-tip (only within X)
this one undid the patches to the 2 files, ie.
git checkout drivers/gpu/drm/i915/display/intel_dp_link_training.c drivers/gpu/drm/drm_dp_helper.c
so there's no stack dumping or needless logging, thus less time is spent in those functions
the log is cleaner this way, but if you want to see the caller stack for when -110 happens just look at prev. comment's attachment :D
What else can I do?
Thanks!
Created attachment 144655 [details]
dmesg with drm.debug=0x1e and i915.disable_power_well=0 (only within X)
`only within X` means that I reproduce the issue only within X(ie. `startx` command)
X.Org X Server 1.20.5
X Protocol Version 11, Revision 0
this attachment is just like the previous one but I've added kernel cmdline i915.disable_power_well=0 (which had no effect on the issue)
The Xserver logs report the link status as BAD because the kernel sends a uevent reporting this as soon as the link training fails or its unable to link train. The aux timeouts that we have seen previously in case of eDP panels could be because the panel is not really ON when we try to link train and that can happen because the panel power cycle delay is not long enough. Could you please try the attached patch that increases the panel power cycle delay to see if that fixes the issue and also send a dmesg log with drm.debug = 0xe? Manasi Created attachment 144729 [details] [review] Increase panel power cycle delay test Created attachment 144741 [details] dmesg with patch from comment 33 and drm.debug=0xe tl;dr: the issue persists even with that patch. I tried these steps inside `startx`, twice: turn off monitor, then ran `monoff` to put card(?) to sleep, then typed something to wake up card(while monitor is still off), then turned on the monitor. (there's some intentional delay(few seconds) between each step) The second time I tried it, I failed to properly execute `logger` to add my own message(s) to `dmesg`, as follows: ``` user@i87k 2019/07/09 14:42:04 -bash5.0.7 t:4 j:0 d:3 pp:1149 p:1269 ut217 !36219 7 0 5.2.0-g0ecfebd2b524 #19 SMP Tue Jul 9 14:35:54 CEST 2019 /home/user $ logger "ok, monior is off, about to execute monoff" ----------- user@i87k 2019/07/09 14:42:19 -bash5.0.7 t:4 j:0 d:3 pp:1149 p:1269 ut232 !36220 8 0 5.2.0-g0ecfebd2b524 #19 SMP Tue Jul 9 14:35:54 CEST 2019 /home/user $ monoff setterm: terminal xterm-256color does not support --blank ----------- user@i87k 2019/07/09 14:42:27 -bash5.0.7 t:4 j:0 d:3 pp:1149 p:1269 ut240 !36221 9 0 5.2.0-g0ecfebd2b524 #19 SMP Tue Jul 9 14:35:54 CEST 2019 /home/user $ ld someething to wake up the card" > logger "monitor is still off, btw" > logger "about to tun on the monitor" > ^C ----------- user@i87k 2019/07/09 14:43:31 -bash5.0.7 t:4 j:0 d:3 pp:1149 p:1269 ut303 !36222 9 0 5.2.0-g0ecfebd2b524 #19 SMP Tue Jul 9 14:35:54 CEST 2019 /home/user $ ``` in the above, the command: `$ ld someething to wake up the card"` is supposed to be: `$ logger "typed someething to wake up the card"` but somehow those keys(`ogger "type`) got eaten up, probably due to drm doing its thing after my typing woke up the card The 'utNUM' at the end of my prompt is uptime(seconds), it might help correlate the events with the ones in dmesg. (In reply to Manasi from comment #32) the panel is definitely not on when drm tries to link train, ie. the monitor is turned off. Given this, I wonder if this is even an issue anymore? In other words, since the monitor is off when I type a few keys on the keyboard to wake up the card(which it thinks it already put the monitor to sleep), then drm trying link-training with a monitor that it turned off(from the monitor's button that is, not unplugged from power though) would always fail, right? But as soon as monitor is turned on, then it re-linktrains and works fine. So, is it even an issue? I guess Comment 25 was right, it's just the log messages saying when link training failed. If it weren't for the other two side-effects (see Comment 26), it wouldn't really be an issue and I guess there's nothing you can do, unless the kernel does not send a uevent reporting that the link training succeeded, but if it already does, well then I'm stuck with those side effects. is issue recently reproduced on latest drm tip? one similar issue on ICL -lp got resolved with 5.2.0 kernel. any update here? -- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/224. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.