Description
oliver.grafe
2014-11-05 17:12:08 UTC
==System Environment==
--------------------------
Supermicro X10SAE (C226 Lynxpoint)
Intel(R) Xeon(R) CPU E3-1268L
old hp1530 Monitor attached via Display-Port->DVI Adapter Cable
Distribution: OpenSuse 13.2 RC2

==kernel==
--------------------------
Latest OpenSuse "Kernel of the day":
linux-bc55:/etc # uname -sovr
Linux 3.18.0-rc3-3.ge706e91-default #1 SMP Tue Nov 4 12:43:57 UTC 2014 (e706e91) GNU/Linux

==Bug detailed description==
When the system is booted with the monitor attached, hot-plugging works fine:
Pull the DP adapter, plug back in -> monitor image restored.

This does not work when the system is cold-booted without any DP cable attached:
Boot system without monitor, plug DP cable in -> no monitor image.

dmesg with drm.debug=6 attached, hot-insert at timecode 163.

==Reproduce steps==
----------------------------
1. Boot up the system without a monitor attached
2. Hot-Plug Display Port Cable
3. Monitor shows no image

Created attachment 108975 [details]
dmesg
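For anyone repeating the steps above, the following is a minimal sketch (not part of the original report) of two small helpers. The first decodes the `drm.debug` value into its category names, assuming the bit assignment used by the DRM core in kernels of this period (core=0x01, driver=0x02, kms=0x04, prime=0x08). The second reads the connector state the kernel exposes under `/sys/class/drm/<connector>/status`, which makes it possible to see whether the hot-plug was registered at all. The function names are hypothetical.

```python
from pathlib import Path

# Assumed bit layout of the drm.debug module parameter (DRM_UT_* flags).
DRM_DEBUG_BITS = {0x01: "core", 0x02: "driver", 0x04: "kms", 0x08: "prime"}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug value."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


def connector_status(drm_root="/sys/class/drm"):
    """Return {connector_name: status} for each connector with a status file.

    On a running system the status is one of "connected", "disconnected"
    or "unknown". A missing drm_root simply yields an empty dict.
    """
    result = {}
    for status_file in sorted(Path(drm_root).glob("card*-*/status")):
        result[status_file.parent.name] = status_file.read_text().strip()
    return result
```

With these, `decode_drm_debug(6)` gives the driver and kms categories used for the attached dmesg, and calling `connector_status()` before and after plugging in the cable shows whether the DP connector changes state.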
Does hotplug work if you're using a native DP monitor without a DP->DVI adapter? Please attach intel_reg_dumper output for the two cases. It's part of the intel-gpu-tools package.

Hello Jani,

Let me summarize some ideas and my thoughts here. It seems that on a headless boot the vBIOS, using mode 50h (assuming this mode is set), waits for some sync signals it sent to the monitor if a HW I/F (DP, eDP, HDMI, DVI, VGA) is present. A timer is set. Once the timer expires, the vBIOS concludes that the monitor is NOT present, then it disables the host/monitor sync protocol and continues booting. If the vBIOS does not detect any HW I/F, it automatically disables the host/monitor sync protocol, then continues booting.

So, while the kernel is booting, the GFX driver in its init function should retest HW presence and, regardless of the result, enable the host/monitor sync protocol again, or otherwise make sure that monitor hot plug works, and not blindly accept any vBIOS assumptions. My best guess.

Thank you,
Zoran

Does hotplug work if you're using a native DP monitor without a DP->DVI adapter? Please attach intel_reg_dumper output *and* dmesg with drm.debug=14 module parameter set for the two cases:

1) boot with the monitor attached at boot
2) boot without monitor attached at boot, and attach monitor after i915 loaded

Created attachment 110305 [details]
dmesg/intel reg dumper for both cases
Hello All,

please find attached an archive with the outputs of ‘dmesg’ and ‘intel_reg_dumper’ with drm.debug=14 set:

1) Boot with no monitor attached
2) Boot with no monitor attached initially, but DP plugged in later
3) Boot with monitor attached via DP-to-DVI adapter
4) Boot with same monitor attached via DVI natively (for reference)

(In reply to oliver.grafe from comment #7)
> please find attached an archive ...

For future reference, please attach the plain files. Having to download and open the archive is an unnecessary extra hurdle for us. Thanks.

Jani,

Is there anything else you need from my side? Please let me know, I can get more debug output if you need...

Greetings,
Oliver

Ville, do we perhaps have a similar issue here as with BYT in bug 89008?

(In reply to Jani Nikula from comment #10)
> Ville, do we perhaps have a similar issue here as with BYT in bug 89008?

Scratch that, we do get the hotplug per dmesg.

So we get the hotplug, but we are unable to do DDC communication with the display.

Hello Oliver,

I have another idea for you to try. What I am hearing is that you use vBIOS (you have set the vBIOS option in CMOS instead of the GOP driver). And I think what you need to try is to use the GOP driver, as the proper way to use a UEFI BIOS with UEFI compliant OSes.

And, yes, you do use OpenSuse 13.2 RC2, which is a 100% UEFI compliant OS. Also, to the best of my understanding, you are using a 64 bit HSW BIOS - all 64 bit BIOSes are UEFI compliant by definition, but regardless, all 32/64 bit BIOSes for HSW have been UEFI compliant for the past two years and will remain so in the future.

vBIOS is good ONLY for legacy OSes (all WIN7 flavors and older). Could you, please, switch from vBIOS to the GOP driver, and retest these scenarios? Please, let us know. :-)

Thank you,
Zoran

Would be interesting to get intel_reg_dumper output for both legacy and UEFI boot. It's part of intel gpu tools http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/

Hi,

I can re-test with the GOP driver instead of vBIOS. So this might help us to figure out what's going on, but in the end I think we need to find a way to have this working in legacy boot...

I'll get you this:
- intel_reg_dumper output of DP hot plug when booted with monitor attached using vBIOS
- intel_reg_dumper output of DP hot plug when booted without monitor attached using vBIOS
- intel_reg_dumper output of DP hot plug when booted with monitor attached using GOP
- intel_reg_dumper output of DP hot plug when booted without monitor attached using GOP

Greetings,
Oliver

Created attachment 113910 [details]
Output of Intel regdumper VBIOS monitor attached
Created attachment 113911 [details]
Output of Intel regdumper VBIOS monitor not attached
Created attachment 113912 [details]
Output of Intel regdumper GOP monitor attached
Created attachment 113913 [details]
Output of Intel regdumper GOP monitor not attached
G'Day!
Oliver Grafe asked me to provide the information from the tests. The dumps of the Intel regdumper tool were taken on an Intel Server Board S1200RPM under Ubuntu 14.10 Desktop in CSM and UEFI mode. The log files are:
o VBIOS with monitor attached
o VBIOS w/o monitor attached
o GOP with monitor attached
o GOP w/o monitor attached
There is no difference in what we see, whether the OS was installed in CSM or EFI mode, or whether VBIOS or GOP is selected.
Greetings, Karin
Hello Karin (Willers),

I would like you to try only one additional test, which will tell us a lot! The current state of affairs is the following, after several tests we all did:

Attempting monitor hot plug with various OSes:
[1] HSW UEFI BIOS + vBIOS + WIN7 = OK
[2] HSW UEFI BIOS + vBIOS + WIN8.1 = NEVER Tested
[3] HSW UEFI BIOS + GOP + WIN7 = NOT Supported
[4] HSW UEFI BIOS + GOP + WIN8.1 = NEVER Tested
[5] HSW UEFI BIOS + vBIOS + F20 = FAILED
[6] HSW UEFI BIOS + GOP + F20 = FAILED

I am eager to get from this table ONLY one more result:
[4] HSW UEFI BIOS + GOP + WIN8.1 = ?!

[2] would be desirable, since it will be only a few minutes more work.

Please, please, could you test this (eventually with [2] also) combination(s) with monitor hot plug and post here your results?

Thank you!
Zoran

(In reply to Zoran Stojsavljevic from comment #13)
> I have another idea for you to try. What I am hearing is that you use vBIOS
> (you have set vBIOS option in CMOS instead GOP driver). And I think what you
> need to try is to use GOP driver, as the proper way how to use UEFI BIOS
> with UEFI compliant OSes.

(In reply to Karin Willers from comment #19)
> There is no difference in what we see, whether the OS was installed in CSM or
> EFI mode or VBIOS or GOP is selected.

(In reply to Zoran Stojsavljevic from comment #20)
> [5] HSW UEFI BIOS + vBIOS + F20 = FAILED
> [6] HSW UEFI BIOS + GOP + F20 = FAILED

Zoran, so the conclusion here is that the bug is not dependent on vBIOS vs. GOP, right?

Can we get a test result with a more recent kernel please? v3.19.1 or v4.0-rc2. Please attach the dmesg for the failing (boot without monitor, hot plug later) case.

(In reply to Jani Nikula from comment #5)
> Does hotplug work if you're using a native DP monitor without a DP->DVI
> adapter?

I seem to have asked this twice in the beginning but I don't see a reply. Does a native DP or HDMI monitor hotplug work?

> Zoran, so the conclusion here is that the bug is not dependent on vBIOS vs.
> GOP, right?
Hello Jani,
Here is what I am expecting to have:
[1] HSW UEFI BIOS + vBIOS + WIN7 = OK
[2] HSW UEFI BIOS + vBIOS + WIN8.1 = ??? (NOT POR/BKM) <<=== (BONUS QUESTION)
[3] HSW UEFI BIOS + GOP + WIN7 = NOT IMPORTANT
[4] HSW UEFI BIOS + GOP + WIN8.1 = OK <<==================== (CRITICAL)
[5] HSW UEFI BIOS + vBIOS + F20 = FAILED (NOT POR/BKM)
[6] HSW UEFI BIOS + GOP + F20 = FAILED
Then, All Clear, right? If test [4] is OK, the obvious conclusion: 99% probability Fedora 20 i915 GFX driver malfunctions.
Thank you,
Zoran
> (In reply to Jani Nikula from comment #22)
> Can we get a test result with a more recent kernel please? v3.19.1 or
> v4.0-rc2. Please attach the dmesg for the failing (boot without monitor, hot
> plug later) case.
Hello Jani,
If GE is willing to do this test (the newest kernel is in flux, and I don't think GE is willing to pay for extra testing on kernel 3.19.1), and it appears to work (which I do not believe it will), then we need the exact transition where the fix occurred (exact old kernel number -> new kernel number), with at minimum a root cause explanation and an explanation of the delta fix (best would be the fix in the form of a patch).
On the contrary, I do not understand why you need dmesg from the newest kernel.
The presented logs should be enough.
Thank you,
Zoran
The driver is under constant heavy development; features, bug fixes, new platform enablement, etc. Between v3.16 and v3.19 the diffstat for the drm/i915 driver alone is: 70 files changed, 26673 insertions(+), 12471 deletions(-). For a lot of people here the old versions are archeology, and it gets really really hard to remember what was changed and what happened in which versions of the kernel. If the later kernels work, it should be possible (for whoever can reproduce the issue) to reverse bisect when the fix was introduced, and we should be in a much better position to backport the fix to older kernels. We usually keep improving the logging we do to dmesg, so it's often easier to figure out what's going on with the dmesg. This, of course, is the upstream perspective. I've performed tests with straight DP connection, (passive) DP-to-DVI adapters and an active DP-to-DVI adapter. The probably passive adapters (I tested three out of my box) did not work and the display stayed dark after plug-in after a boot w/o display attached. I'll test with a 3.19.1 kernel tomorrow. Created attachment 114200 [details]
dmesg Kernel 3.19.1 passive DP-to-DVI
The following files are the output for Kernel 3.19.1:
Linux UEFI 3.19.1-031901-generic #201503080052 SMP Sun Mar 8 00:54:35 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
dmesg of: Linux in UEFI mode, GOP driver, no monitor plugged in at boot. Monitor with a passive DP-to-DVI cable connected later.
Created attachment 114201 [details]
intel_reg_dumper Kernel 3.19.1 passive DP-to-DVI
Created attachment 114202 [details]
dmesg Kernel 3.19.1 active DP-to-DVI
Created attachment 114203 [details]
intel_reg_dumper Kernel 3.19.1 active DP-to-DVI
So the behavior is the same with kernel 3.19.1: when the monitor is not plugged in at boot time, only an active DP-to-DVI connection makes the monitor display the Linux desktop. Tested with the otherwise unchanged Ubuntu 14.10 64bit Desktop installation.
We verified the open question from comment 24 above:

[4] HSW UEFI BIOS + GOP + WIN8.1 = OK

OS was Windows 8.1 Enterprise with graphics driver win64_153618.exe from the Intel web page. The monitor was properly detected when initially unplugged at boot and attached later with:
o native DP connection
o active DP-to-DVI adapter
o two types of passive DP-to-DVI adapters

Win 8.1 in CSM (i.e. non-UEFI mode) and VBIOS is also working as expected.

Hello Karin,

Here is what (after what you have tested) we have:
[1] HSW UEFI BIOS + vBIOS + WIN7 = OK
[2] HSW UEFI BIOS + vBIOS + WIN8.1 = OK
[3] HSW UEFI BIOS + GOP + WIN7 = NOT IMPORTANT
[4] HSW UEFI BIOS + GOP + WIN8.1 = OK
[5] HSW UEFI BIOS + vBIOS + F20 = FAILED (NOT POR/BKM)
[6] HSW UEFI BIOS + GOP + F20 = FAILED

Could you, please, do one more quick test for me? Please, use at minimum 3 cables of each:
o native DP connection
o active DP-to-DVI adapter
o two types of passive DP-to-DVI adapters

And redo tests for use cases:
[1] HSW UEFI BIOS + vBIOS + WIN7 = OK
[2] HSW UEFI BIOS + vBIOS + WIN8.1 = OK
[4] HSW UEFI BIOS + GOP + WIN8.1 = OK
[6] HSW UEFI BIOS + GOP + F20 = FAILED

Just to be sure if i915 kernel GFX driver is not on the timing edge (I expect that for [6] some use cases have chances to work)? You have everything in place, this will be +30 minutes to verify that [1], [2] and [4] work always, and [6] fails most of/ALL the time, so we eliminate cable electrical characteristics variations (same type of cables have slightly different characteristics, I know that from my Past)?

Thank you,
Zoran

(In reply to oliver.grafe from comment #7)
> Hello All,
>
> please find attached an archive with the outputs of ‘dmesg’ and
> ‘intel_reg_dumper’ with drm.debug=14 set:
>
> 1) Boot with no monitor attached
> 2) Boot with no monitor attached initially, but DP plugged in later
> 3) Boot with monitor attached via DP-to-DVI adapter
> 4) Boot with same monitor attached via DVI natively (for reference)

For the cases described in "2" and "3", can you please run the dump_more_regs.sh script and attach the output here? The intel-reg-dumper tool doesn't print all the registers that could be affecting our case here.

Created attachment 114482 [details]
dump_more_regs.sh
(In reply to Paulo Zanoni from comment #35) > Created attachment 114482 [details] > dump_more_regs.sh I forgot to mention: you need to be on the intel-gpu-tools/tools directory :) By the way, after hotplugging (and reproducing the bug), did you try to start X and run xrandr? Does it help? I have a DP-to-DVI adapter here, tried to reproduce the problem but couldn't. (In reply to Paulo Zanoni from comment #37) > By the way, after hotplugging (and reproducing the bug), did you try to > start X and run xrandr? Does it help? > > I have a DP-to-DVI adapter here, tried to reproduce the problem but couldn't. Just for the record: the adapter I used is, apparently, from StarTech. Also please attach /sys/kernel/debug/dri/0/i915_opregion. (It doesn't matter which kernel is used for this.) Created attachment 114724 [details]
i915_opregion.bin zipped
i915_opregion seems to have binary content, so I've attached it as a .zip file.

(In reply to Karin Willers from comment #41)
> i915_opregion seems to have binary content, so i've attached it as a .zip
> file.

For future reference, it's okay to attach binary files.

Updating this thread, here's what one of our DP experts has to say:

"What’s supposed to happen is that when HPD asserts, we get an interrupt that we process and fire off the appropriate work functions after we determine what to do with it. The work functions are where all the good stuff happens, calling the driver functions for ->detect() and intel_dp_hpd_pulse. Then we go about setting up all the plane/pipe/port stuff for all the connectors that have been detected as being connected. For DP, we then read the DPCD, read the EDID and parse the display modes in it, figure out what the “preferred” mode is and set up the link configuration for that (by default - userspace can override it). There’s also a user space notification that takes place that notifies the fbdev components and calls their helper functions too."

He's going to read through the dmesg logs to see how many steps in this process are taking place.

Note: Intel OTC is still unable to reproduce the problem. (See note from Paulo above who is on my team) We need to be able to reproduce the problem so the matter can be investigated, preferably in the US or EMEA.

Hello to everybody,

I will again make a summary of the problem.

HW used: As I understand, GE IP and Unify were able to reproduce this problem on both of the following platforms:
[1] INTEL Reference platform being mentioned right at the beginning of the thread: Supermicro board and the E3-1268Lv3
[2] GE Proprietary Haswell Desktop CPUs i3-4330TE, i5-4570TE

GE folks have not seen different results doing the tests on their own system vs. Supermicro.
SW used: Fedora 21 kernel 3.19.1 Problem statement/description (with the tests done so far): [1] (INTEL POR configuration) HSW UEFI BIOS + vBIOS + WIN7 = OK [2] HSW UEFI BIOS + vBIOS + WIN8.1 = OK [3] HSW UEFI BIOS + GOP + WIN7 = NOT IMPORTANT [4] (INTEL POR configuration) HSW UEFI BIOS + GOP + WIN8.1 = OK [5] HSW UEFI BIOS + vBIOS + F21 = FAILED (NOT POR/BKM) [6] (INTEL POR configuration) HSW UEFI BIOS + GOP + F21 = FAILED Please, keep in mind that these configurations were tested with the same passive DP -> DVI cables, two types of it! Thank you, Zoran Just an observation, "passive DP->DVI cable" means DP++ port with HDMI/DVI output. Created attachment 115826 [details] [review] Add loop to drm_get_edid() to make multiple attempts to probe the DDC Here is a potential fix for this problem. Please test with the attached patch and report back here. Details: drm_get_edid() only makes one attempt to probe the DDC before returning NULL. Some passive DP dongles will fail to operate correctly in this case because they need that initial transaction to "wake up" and get things running. Each time an I2C transaction needs to be done, this probe must occur or the device will not respond. In the case of the dongles, this has to happen before the EDID can be read out of the device. This patch corrects this problem by probing the DDC up to 5 times before giving up and returning NULL, waiting for 10-11ms between attempts to give the device time to get setup. Once the DDC probe completes successfully, the EDID can then be read normally from the device. drm_probe_ddc() calls drm_do_probe_ddc_edid() which attempts probing up to 5 times. With this patch we will retry up to 25 times on I2C buses where no EDID is present, which will certainly incur a noticeable delay. Your patch adds a delay between attempts, which the original code doesn't have. So if the patch really helps, maybe the proper fix would be to add this delay in the retry loop already present in drm_do_probe_ddc_edid(). 
If we end up doing that, please insert the delay _after_ the first attempt rather than before it as your previous patch did, so that we don't suffer from the extra delay when everything works fine (EDID probe succeeds at first attempt.)

Created attachment 115935 [details] [review]
Fix for passive DP->DVI/HDMI dongles

In light of Jean's comments, I looked further into this bug. This problem is related to the fix for bug #41059 - the code put in place there stops the execution of that internal loop which makes a second attempt to probe the DDC. And that's where things go wrong for the DP dongles.

This patch supersedes the previous work-around patch I posted. This is a cleaner solution that doesn't require the usleep_range() delays that the higher level function patch did. This patch also preserves (as much as possible) the spirit and intent of the solution for #41059, in that the code shouldn't repeatedly flog a non-existent or unresponsive adapter, thus delaying things unnecessarily.

With this patch, a single NAK will be tolerated and the loop will continue. If a second NAK happens, it will break out of the loop as it did previously.

If you have these passive DP->DVI/HDMI adapters, please test with this patch applied.

-T

Thanks Todd. Minor nitpicking: fdo.org doesn't make sense, it's either fdo or freedesktop.org.

This should be functionally the same as Todd's fix, but moved to another layer of the stack. Please try.

http://patchwork.freedesktop.org/patch/50631

All,

tried Jani's patch on a SLES / using 4.1-rc4. It improves the situation a lot! All active/passive DP adapters I could find now work flawlessly when hot-plugged into a running system.

I have found one DP->VGA adapter which still does not hot-plug even with this patch. Please see attached drm.debug=14 output.

Thanks so much for the continuing effort to close this!

Oliver

Created attachment 116153 [details]
drm.debug=14 for delock 61848
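The two DDC probe strategies discussed for the patches above can be sketched as a small simulation. This is an illustration only, under stated assumptions: it is not the kernel code from either patch, and `probe_with_delay`, `probe_tolerating_one_nak` and `sleepy_dongle` are hypothetical names. The first function retries up to 5 times with a roughly 10 ms pause placed after a failed attempt, as Jean requested. The second tolerates a single NAK and gives up on the second one.

```python
import time


def probe_with_delay(probe, attempts=5, delay_s=0.010, sleep=time.sleep):
    """Retry a DDC probe, pausing only after a failed attempt."""
    for attempt in range(attempts):
        if attempt > 0:
            sleep(delay_s)  # never delays the first attempt
        if probe():
            return True
    return False


def probe_tolerating_one_nak(probe, max_naks=1):
    """Continue after a single NAK, stop as soon as a second one arrives."""
    naks = 0
    while True:
        if probe():
            return True
        naks += 1
        if naks > max_naks:
            return False


def sleepy_dongle(wake_after):
    """Hypothetical passive dongle that NAKs `wake_after` probes, then answers."""
    state = {"calls": 0}

    def probe():
        state["calls"] += 1
        return state["calls"] > wake_after

    return probe
```

In this model a dongle that needs one transaction to wake up (`sleepy_dongle(1)`) is detected by both strategies, while an absent or unresponsive device is abandoned after two probes by the second strategy instead of being retried with delays.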
The failing VGA dongle is not going through the same path as the other dongles. The DP->VGA dongle is an active dongle whereas these others are passive. You'll notice in the logs that this dongle is responding to AUX transactions versus the other ones which timeout. As an initial test, please restart the machine and add the following parameter to the kernel command line: i915.disable_power_well=0 Please report here to indicate the results of this test. Also, it might be a good idea to file a separate bug to track this VGA dongle problem, since it's an active adapter and not a passive one. -T (In reply to oliver.grafe from comment #51) > tried Jani's patch on a SLES / using 4.1-rc4. It improves the situation a > lot! All active/passive DP adapters I could find now work flawlessly when > hot-plugged into running system. > > I have found one DP->VGA adapter which still does not hot-plug even with > this patch. Please see attached drm.debug=14 output. > > Thanks so much for the continuing effort to close this! That's great news! Credits go to Todd for figuring out the problem. Unfortunately there was some negative review feedback on the patch itself; here's another version that *should* be functionally equivalent in your case: http://patchwork.freedesktop.org/patch/50930 It would be much appreciated if you could test this one as well. Thanks. 
(In reply to Jani Nikula from comment #54)
> Unfortunately there was some negative review feedback on the patch itself;
> here's another version that *should* be functionally equivalent in your
> case: http://patchwork.freedesktop.org/patch/50930

Too hasty, try v4 http://patchwork.freedesktop.org/patch/50934

(In reply to Jani Nikula from comment #55)
> (In reply to Jani Nikula from comment #54)
> > Unfortunately there was some negative review feedback on the patch itself;
> > here's another version that *should* be functionally equivalent in your
> > case: http://patchwork.freedesktop.org/patch/50930
>
> Too hasty, try v4 http://patchwork.freedesktop.org/patch/50934

Okay, this is really getting embarrassing now. v5 http://patchwork.freedesktop.org/patch/50953

(In reply to Jani Nikula from comment #56)
> (In reply to Jani Nikula from comment #55)
> > (In reply to Jani Nikula from comment #54)
> > > Unfortunately there was some negative review feedback on the patch itself;
> > > here's another version that *should* be functionally equivalent in your
> > > case: http://patchwork.freedesktop.org/patch/50930
> >
> > Too hasty, try v4 http://patchwork.freedesktop.org/patch/50934
>
> Okay, this is really getting embarrassing now. v5
> http://patchwork.freedesktop.org/patch/50953

Oliver, this one passed review, I'll queue it towards v4.1 with cc: stable as soon as I get a tested-by from you. Thanks.

Oliver, I'm pretty confident now with the fix, but I'd really really appreciate you testing v5 of the patch, so that we don't have to do another fix later on. If we can get the Tested-by early next week, we may be able to get the fix to the v4.1 release, and have the backport committed to stable kernels in a few weeks, waterfalling to distributions through regular stable kernel channels. Thanks.
Fixed by

commit 3f5f1554ee715639e78d9be87623ee82772537e0
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Tue Jun 2 19:21:15 2015 +0300

    drm/i915: Fix DDC probe for passive adapters

in drm-intel-fixes. This is expected to land in the v4.1-rc8 (or final v4.1 if there is no -rc8) kernel release within a week. After that, it will usually take some weeks for the stable backports to land, and waterfall to distributions. You may be able to expedite the distro backport by filing a bug directly against the distro, citing the commit above *after* it has landed in a kernel release.

Everyone, thanks for the report, testing, debugging, and patience. Credits to Todd for root causing the issue.

Thanks Todd and Jani for finally fixing this bug, your perseverance is appreciated :-)

Good work all around on this one - thanks guys! Glad to see this one finally put to rest.

Oliver, did you file a separate bug for the VGA dongle problem you reported on 5/29? If not, can you please do that and assign it directly to me?

Thanks Oliver.

-T

Jani, Todd,

I could now verify the v5 patch with our C226 system and 4.1-rc4: all passive DP->DVI and DP->HDMI cables / adapters I could get hold of are now working fine when hot-inserted into a running system. This is a really great success! Thank you all for searching and finding a solution for this!

I'll work with our customer/Novell to get this officially back ported for SLES 11 SP3 (3.0.101..).

I'll file the active VGA dongle bug in a minute.

Thanks again and greetings,
Oliver

Oliver, thanks for confirming the patch fixes the issues. I expect it to be merged to Linus' upstream kernel tree this Sunday - and that's the requirement for official backports.