Summary: | [bisected][DC] commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4 prevents X11 from starting | ||
---|---|---|---|
Product: | DRI | Reporter: | dwagner <jb5sgc1n.nya> |
Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | blocker | ||
Priority: | medium | CC: | fdsfgs, harry.wentland, Hi-Angel, jordan.lazare |
Version: | DRI git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Description
dwagner
2017-09-17 11:46:39 UTC
Please attach your xorg log, dmesg output, and xorg conf if you are using one.

Created attachment 134299 [details]
Xorg.0.log written while symptoms of Bug 102820 occur

Created attachment 134300 [details]
dmesg (filtered by "grep -i -w -e drm -e amdgpu")

I attached the Xorg.0.log as requested, the "dmesg" output as a whole would be somewhat hard to edit for potentially sensitive content, thus I filtered it through "grep -i -w -e drm -e amdgpu" and hope that is good enough.

I do not use an xorg.conf file, but I do have some minor customization in /etc/X11/xorg.conf.d/* files that were supplied by the Linux Arch distribution:

    Section "Monitor"
        Identifier "Monitor0"
        DisplaySize 508 285
        HorizSync 20.0 - 150.0
        VertRefresh 23.96 - 90.0
        Option "DPMS"
        # 3840x2160p at 24Hz 16:9
        ModeLine "3840x2160@24" 297.000 3840 5116 5204 5500 2160 2168 2178 2250 +hsync +vsync
        # 3840x2160p at 25Hz 16:9
        ModeLine "3840x2160@25" 297.000 3840 4896 4984 5280 2160 2168 2178 2250 +hsync +vsync
        # 3840x2160p at 30Hz 16:9
        ModeLine "3840x2160@30" 297.000 3840 4016 4104 4400 2160 2168 2178 2250 +hsync +vsync
        # 3840x2160p at 50Hz 16:9
        ModeLine "3840x2160@50" 594.000 3840 4896 4984 5280 2160 2168 2178 2250 +hsync +vsync
        # 3840x2160p at 60Hz 16:9
        ModeLine "3840x2160@60" 594.000 3840 4016 4104 4400 2160 2168 2178 2250 +hsync +vsync
    EndSection

    Section "Device"
        Identifier "Card0"
        Option "Monitor-HDMI-A-0" "Monitor0"
        Driver "amdgpu"
        BusID "PCI:40:0:0"
        Option "TearFree" "On" # [<bool>]
    EndSection

    Section "Screen"
        Identifier "Screen0"
        Device "Card0"
        Monitor "Monitor0"
        DefaultDepth 24
        SubSection "Display"
            Viewport 0 0
            Depth 24
        EndSubSection
    EndSection

(The once manually added ModeLines are not currently in use.)

Two additional remarks:

- I meanwhile verified that X runs fine when I compile the current amd-staging-drm-next with only commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4 reverted.
- The kernel Call Trace starting with "amdgpu_dm_connector_atomic_set_property+0x10a/0x180 [amdgpu]" is probably an unrelated issue, as it also occurs without commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4.
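
For readers unfamiliar with ModeLine syntax, the following is a minimal sketch (not part of the original report) of how the refresh rate follows from the numbers in the Monitor section quoted in this comment: the pixel clock divided by the total pixels per frame, blanking included. The function name `refresh_hz` and the tuple layout are illustrative choices. The sketch shows that the 50 Hz and 60 Hz entries are the ones that need a 594 MHz pixel clock, which is presumably why they are the ones affected by a "6G" restriction.

```python
# Each tuple: (name, pixel clock in MHz, horizontal total, vertical total).
# Values are copied from the ModeLine entries in the Monitor section:
# the 5th numeric field is the horizontal total, the 9th the vertical total.
MODELINES = [
    ("3840x2160@24", 297.000, 5500, 2250),
    ("3840x2160@25", 297.000, 5280, 2250),
    ("3840x2160@30", 297.000, 4400, 2250),
    ("3840x2160@50", 594.000, 5280, 2250),
    ("3840x2160@60", 594.000, 4400, 2250),
]


def refresh_hz(pixel_clock_mhz, htotal, vtotal):
    """Vertical refresh rate: pixel clock / pixels per frame (incl. blanking)."""
    return pixel_clock_mhz * 1_000_000 / (htotal * vtotal)


for name, clock, htotal, vtotal in MODELINES:
    rate = refresh_hz(clock, htotal, vtotal)
    print(f"{name}: {clock:.0f} MHz pixel clock, {rate:.2f} Hz refresh")
```

The three 297 MHz entries come out at 24, 25 and 30 Hz, the two 594 MHz entries at 50 and 60 Hz.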

Just as an update: This very bug still occurs with https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next as of today, and it is still fixed by reverting commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4

Could somebody comment what this commit is good for, given that it seems to only prevent X11 from running with certain 4k HDMI displays?

I think it blocks modes that require 6Ghz timings if the platform isn't validated for them. Does everything work correctly if you remove the modes from your monitor section?

(In reply to Alex Deucher from comment #7)
> I think it blocks modes that require 6Ghz timings if the platform isn't
> validated for them.

What does "platform isn't validated for 6Ghz timings" mean?

Both the RX 460 and the 4k TV I use are officially advertised as supporting 4k @ 60Hz, and indeed they work just fine in that mode if commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4 is not part of the kernel.

> Does everything work correctly if you remove the modes
> from your monitor section?

Removing the (not really used) ModeLine statements from the X11 config does in fact change the symptoms when commit ebbf7337e2daacacef3e01114e6be68a2a4f11b4 is present: X11 will start then, but only to use what xrandr says to be a "3840x2160 @ 30Hz" mode, but which the display picks up as some not-really-filling-the-whole-screen signal, with black borders on both sides (so an image is shown but compressed horizontally - in the usually active "just scan" mode and also even if the display is manually forced to a 16:9 aspect ratio).

(In reply to dwagner from comment #8)
> (In reply to Alex Deucher from comment #7)
> > I think it blocks modes that require 6Ghz timings if the platform isn't
> > validated for them.
> What does "platform isn't validated for 6Ghz timings" mean?
>
> Both the RX 460 and the 4k TV I use are officially advertised as supporting
> 4k @ 60Hz, and indeed they work just fine in that mode if commit
> ebbf7337e2daacacef3e01114e6be68a2a4f11b4 is not part of the kernel.
>

Harry or Jordan can verify, but I think it means the board maker did not validate the HDMI connector for 6Ghz timings. The asic may support it, but the board has to be validated to make sure the physical connector supports it and the traces are not too long, etc. Some boards only support 4k@60 over DP.

> > Does everything work correctly if you remove the modes
> > from your monitor section?
> Removing the (not really used) ModeLine statements from the X11 config does
> in fact change the symptoms when commit
> ebbf7337e2daacacef3e01114e6be68a2a4f11b4 is present: X11 will start then,
> but only to use what xrandr says to be a "3840x2160 @ 30Hz" mode, but which
> the display picks up as some not-really-filling-the-whole-screen signal,
> with black borders on both sides (so an image is shown but compressed
> horizontally - in the usually active "just scan" mode and also even if the
> display is manually forced to a 16:9 aspect ratio).

The driver drops the higher bandwidth modes for HDMI connectors if the board does not support them.

Alex is correct about the intention of this change.

I've never played around with the modeline before so don't fully understand the impact of having that in the xorg.conf. It sounds like that's forcing certain modes which we then can't support due to the commit you mention. We have an open issue where we don't correctly filter out modes if we can't support them for whatever reason. Fixing that might help you but we don't have anyone looking at it yet.

As for the behavior you're seeing with the modelines removed, I don't fully understand what you're seeing. Mind posting a picture? It sounds like we should be driving the monitor in 4k30 at that point but seems like something goes wrong there, from your description.
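
The mode dropping described in the comments above can be illustrated with a small sketch. This is not the amdgpu code: `Mode`, `supported_modes` and the 340 MHz cut-off are all hypothetical. The cut-off is only an assumption based on the HDMI 1.4 TMDS clock ceiling; the threshold the driver actually uses is not stated in this thread.

```python
from dataclasses import dataclass

# Assumption: 340 MHz is the HDMI 1.4 TMDS clock ceiling. The cut-off the
# amdgpu driver really applies is not given in this thread and may differ.
NON_6G_MAX_PIXEL_CLOCK_KHZ = 340_000


@dataclass(frozen=True)
class Mode:
    """Hypothetical stand-in for a display mode."""
    name: str
    pixel_clock_khz: int


def supported_modes(modes, hdmi_6g_enabled):
    """Return the modes an HDMI connector may drive.

    With 6G enabled every mode passes; otherwise modes whose pixel clock
    exceeds the assumed non-6G ceiling are dropped.
    """
    if hdmi_6g_enabled:
        return list(modes)
    return [m for m in modes if m.pixel_clock_khz <= NON_6G_MAX_PIXEL_CLOCK_KHZ]


modes = [
    Mode("3840x2160@30", 297_000),
    Mode("3840x2160@60", 594_000),
]
print([m.name for m in supported_modes(modes, hdmi_6g_enabled=False)])
print([m.name for m in supported_modes(modes, hdmi_6g_enabled=True)])
```

Under these assumptions only the 30 Hz mode survives when 6G is disabled, which matches the reporter ending up in a "3840x2160 @ 30Hz" mode once the ModeLines were removed.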

(In reply to Alex Deucher from comment #9)
> Harry or Jordan can verify, but I think it means the board maker did not
> validate the HDMI connector for 6Ghz timings. The asic may support it, but
> the board has to be validated to make sure the physical connector supports
> it and the traces are not too long, etc. Some boards only support 4k@60
> over DP.

This is the manufacturer's page advertising my graphics board:

http://www.xfxforce.com/en-us/products/amd-radeon-rx-400-series/rx-460-4gb-heatsink-rx-460p4hfg5-

It couldn't be more affirmative regarding support of HDMI 2.0b and high refresh rates for 4k modes...

(And practical experience during the last months also tells me: Yes, HDMI 4k @ 60Hz output is stable - even when using a 4m length HDMI cable.

> The driver drops the higher bandwidth modes for HDMI connectors if the board
> does not support them.

I have looked if there are any firmware upgrades for this card or any hints from others regarding lack of 4k 60 Hz support, but found neither. (I only found unofficial firmware that switches on shader units the manufacturer keeps dormant, but I am not using that.)

Notice that the trailing dash in the above link is part of the link, without it the page is not found: http://www.xfxforce.com/en-us/products/amd-radeon-rx-400-series/rx-460-4gb-heatsink-rx-460p4hfg5-

(In reply to dwagner from comment #11)
>
> This is the manufacturer's page advertising my graphics board:
>
> http://www.xfxforce.com/en-us/products/amd-radeon-rx-400-series/rx-460-4gb-heatsink-rx-460p4hfg5-
>
> It couldn't be more affirmative regarding support of HDMI 2.0b and high
> refresh rates for 4k modes...

It's pretty vague unfortunately regarding HDMI:

"Latest Display Connections
Ready for the latest displays
Radeon™ GPUs with the Polaris architecture support HDMI® 2.0b and DisplayPort™ 1.3 for compatibility with a new generation of monitors that would make any gamer excited:
• 1080p @ 240Hz
• 1440p @ 240Hz
• 4K @ 120Hz
• 1440p ultra-wide @ 190Hz"

It does not explicitly say 4K@60 on HDMI. The high refresh rates may only be available on DP. HDMI 2.0b does not imply 4K@60.

>
> (And practical experience during the last months also tells me: Yes, HDMI 4k
> @ 60Hz output is stable - even when using a 4m length HDMI cable.
>

Your board may work, but others might not depending on the board, cable, monitor, etc.

> > The driver drops the higher bandwidth modes for HDMI connectors if the board
> > does not support them.
>
> I have looked if there are any firmware upgrades for this card or any hints
> from others regarding lack of 4k 60 Hz support, but found neither. (I only
> found unofficial firmware that switches on shader units the manufacturer
> keeps dormant, but I am not using that.)

The display features that are validated are stored in the vbios and the vbios is updated by the board vendor based on what features they validated on the board.

(In reply to Alex Deucher from comment #13)
> It's pretty vague unfortunately regarding HDMI:
> It does not explicitly say 4K@60 on HDMI. The high refresh rates may only
> be available on DP.

Well then, if you suspect XFX to be sneaky bitches, I should refer you to the page of the actual reseller that I bought my XFX RX 460 from:

https://www.caseking.de/en/xfx-radeon-rx-460-passive-heatsink-edition-2048-mb-gddr5-gcxf-148.html

which clearly states: "The 2.0b HDMI port carries a 4K/UHD resolution signal at 60 Hz and permits the sending and receiving of encrypted signals using the HDCP 2.2 protocol (4K streaming, 4K Blu-Rays)."

And that corresponds well to the many reports of Windows users you find on the Internet that confirm XFX RX460 cards do in fact drive HDMI displays at 4k 60Hz also under Windows - or should we assume the Windows drivers to be broken and all those users just being "lucky"?

> HDMI 2.0b does not imply 4K@60.

Support of 4k @ 60Hz was the first and foremost feature advertised by the HDMI licensors as being the main reason for introducing HDMI 2.0! - Here's their press release from back then:

https://www.hdmi.org/press/press_release.aspx?prid=133

and a clear statement on HDMI 2.0b supporting 4k@60Hz:

https://www.hdmi.org/manufacturer/hdmi_2_0/index.aspx

> Your board may work, but others might not depending on the board, cable,
> monitor, etc.

Yes, the GPU board cannot guarantee anything with regards to cabling or monitors - but how is that a reason for a driver software to keep users from even trying to use this fundamentally important, reseller-promised, broadly-reported-to-be-working-under-Windows feature?

Should audio drivers discard LFE channels because some sound card vendor cannot validate that the user connected a capable sub-woofer to his amplifier...?

(In reply to Harry Wentland from comment #10)
> As for the behavior you're seeing with the modelines removed, I don't fully
> understand what you're seeing. Mind posting a picture? It sounds like we
> should be driving the monitor in 4k30 at that point but seems like something
> goes wrong there, from your description.

The display is driven in 3840x2160 @ 30Hz with modelines removed and commit present - but the picture fills only ~80% in the middle of the screen horizontally (100% vertically).

If I use "xrandr --output HDMI-A-0 --mode 3840x2160 --rate 24" to switch to 24Hz, then the picture fills the whole screen.

Without the commit (still without Modelines), when I use "xrandr --output HDMI-A-0 --mode 3840x2160 --rate 30" to voluntarily only use 30Hz, then the picture fills the whole screen.

So with and without the commit, different parameters seem to be used to output 3840x2160 @ 30Hz.

(If really required, I can shoot a photo later, but it doesn't really show anything remarkable except for the two black bars to the left and the right of the screen.)

That's interesting. No picture needed anymore. I get it now.

This is really weird behavior. Do you have the actual TV model by any chance? If I get a chance I'd love to see if I can find something similar in the office and repro it.

As for 4k60 support, you're right that that's usually entailed by HDMI 2.0 but like Alex said HDMI 2.0 doesn't necessarily imply 4k60. In your case it looks like our Video BIOS doesn't report that 4k60 (i.e. 6GB) is validated. I'll try to find out more.

(In reply to Harry Wentland from comment #16)
> This is really weird behavior. Do you have the actual TV model by any
> chance?

It's an LG 55EG9609 TV. A link to a manual:

https://www.lg.com/de/lgecs.downloadFile.ldwf?DOC_ID=20150135519057&what=MANUAL&fromSystem=LG.COM&fileId=IMgqHFIlfEO4t7Hfb0BBA&ORIGINAL_NAME_b1_a1=4_MFL68823613_06_151020.pdf

(And of course, the GPU is connected to one of the two HDMI 2.0 ports where "HDMI ULTRA HD Deep Colour" is possible and switched on - in LG's lingo that is what enables the higher clocked modes.)

Created attachment 137476 [details] [review]
drm/amd/display: Default HDMI6G support to true. Log VBIOS table error.

Can you see if this helps? Our Windows driver definitely checks the HDMI6G flag from VBIOS but it will default to allow 6G on HDMI if the VBIOS check fails.

This patch is porting the same behavior in the hopes that it will help with your issue.

If this patch works a dmesg log with the amdgpu.dc_log=1 option on the kernel would help us understand the root cause a bit better. If it doesn't work it's back to the drawing board for me.

(In reply to Harry Wentland from comment #19)
> Created attachment 137476 [details] [review]
> drm/amd/display: Default HDMI6G support to true. Log VBIOS table error.
>
> Can you see if this helps? Our Windows driver definitely checks the HDMI6G
> flag from VBIOS but it will default to allow 6G on HDMI if the VBIOS check
> fails.
>
> This patch is porting the same behavior in the hopes that it will help with
> your issue.

I had to change "ctx->logger" into "enc110->base.ctx->logger" to make your patch compile (applied on today's head of amd-staging-drm-next).

Yes, that patch changes the behaviour for the better: HDMI 2.0 modes - especially 4k@60Hz work fine with this patch applied on my system. Tried multiple reboots, result was consistent.

> If this patch works a dmesg log with the amdgpu.dc_log=1 option on the
> kernel would help us understand the root cause a bit better.

I did enable amdgpu.dc_log=1 on the kernel command line - but there is no "Failed to get encoder_cap_info from VBIOS..." message visible in dmesg, which makes me wonder what makes the new code path differ from the old one. (Attaching dmesg output below.)

Created attachment 137487 [details]
dmesg output after Harry's recent patch for the "6G" check was applied
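
The fallback that the patch in attachment 137476 is described as porting (allow 6G when the VBIOS check fails, and log the failure) can be sketched roughly as below. This is only an illustration of the described behavior, not the patch itself: the function `hdmi_6g_enabled`, the dictionary standing in for the VBIOS capability table, and the key `hdmi_6g_en` are all hypothetical, and the warning text is made up.

```python
from typing import Optional


def hdmi_6g_enabled(vbios_cap_info: Optional[dict]) -> bool:
    """Decide whether 6G HDMI modes are allowed.

    vbios_cap_info is a hypothetical stand-in for the encoder capability
    table read from the VBIOS; None models a failed lookup.
    """
    if vbios_cap_info is None:
        # Lookup failed: log it and default to allowing 6G, which is the
        # behavior the patch description attributes to the Windows driver.
        print("warning: could not read encoder capabilities from VBIOS, "
              "assuming HDMI 6G is supported")
        return True
    # Lookup succeeded: honor whatever the board vendor recorded.
    return bool(vbios_cap_info.get("hdmi_6g_en", False))


print(hdmi_6g_enabled(None))
print(hdmi_6g_enabled({"hdmi_6g_en": False}))
print(hdmi_6g_enabled({"hdmi_6g_en": True}))
```

Note that this sketch only covers the failed-lookup case. The reporter saw no failure message in dmesg, so the sketch does not explain why the patched kernel behaved differently on this board.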

Thanks for fixing and testing the patch. I'll get it reviewed and merged.

It looks like the dc_log=1 didn't take. I'd expect a lot more spam from DC if it took. It should be fine in any 4.15 RC but there might still be a bugfix for it that didn't make it into 4.15. I don't remember. amd-staging-drm-next should be good with the log option.

Either way, looks like the VBIOS info isn't what we expect on some boards.

Marking resolved as fix has been in mainline for a while now. If this is still an issue feel free to reopen.