Bug 29141 - [Arrandale switchable] PCH eDP panel mode setting failure at boot
Status: RESOLVED FIXED
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assigned To: Jesse Barnes
Reported: 2010-07-18 23:56 UTC by Adam Hill
Modified: 2010-12-21 11:16 UTC (History)

Attachments
intel_reg_dumper output. (10.37 KB, text/plain)
2010-07-18 23:56 UTC, Adam Hill
Dmesg with drm.debug=0x4 and patches from bug 28479 (121.24 KB, text/plain)
2010-07-19 06:40 UTC, Adam Hill
add debug statements to dp failure paths (1.22 KB, patch)
2010-07-20 09:10 UTC, Jesse Barnes
DRM Error output with patch from attachment 37246 (13.17 KB, text/plain)
2010-07-20 10:41 UTC, Adam Hill
Debug with available bandwidth per link artificially high (17.19 KB, text/plain)
2010-07-21 04:28 UTC, Adam Hill
Dmesg with drm.debug=0x04 and boot without "text" (79.72 KB, text/plain)
2010-07-21 14:00 UTC, Adam Hill
Increase timeout (1.21 KB, patch)
2010-08-09 09:27 UTC, Chris Wilson
Dmesg from drm-intel-next (66.98 KB, text/plain)
2010-08-09 11:09 UTC, Adam Hill
eDP detection fix (2.10 KB, patch)
2010-08-23 13:54 UTC, Jesse Barnes
dmesg with drm.debug=4 and 2.6.36-rc2 patch from id 38111 (76.33 KB, text/plain)
2010-08-24 03:09 UTC, Adam Hill
DRM debug messages from 2.6.36-rc2 with my mods (11.59 KB, text/plain)
2010-08-25 02:56 UTC, Adam Hill
Latest 2.6.36-rc2+ dmesg (13.69 KB, text/plain)
2010-08-26 02:03 UTC, Adam Hill
dmesg drm.debug=0x4 for jbarnes kernel git tree, branch edp-testing, commit ad2456c (84.16 KB, text/plain)
2010-09-01 11:38 UTC, Knuth Posern
dmesg drm.debug=0x4 for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf (3.89 KB, text/plain)
2010-09-08 07:18 UTC, Knuth Posern
intel_reg_dump with "nomodeset" kernel parameter - otherwise same situation like in my last post (10.85 KB, text/plain)
2010-09-08 07:45 UTC, Knuth Posern
intel_reg_dump with modeset (without "nomodeset" kernel parameter) - otherwise same situation like in my last post (10.86 KB, text/plain)
2010-09-08 07:46 UTC, Knuth Posern
dmesg drm.debug=0x4 for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf (86.15 KB, text/plain)
2010-09-08 08:29 UTC, Knuth Posern
kernel config - for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf (for attachments 38553, 38554, 38556) (72.92 KB, text/plain)
2010-09-08 08:33 UTC, Knuth Posern
uname + dmesg of "modprobe drm debug=0x04" and "modprobe i915 modeset=1" + regdump + kernel-config (138.62 KB, text/plain)
2010-09-08 13:51 UTC, Knuth Posern
uname + regdump + dmesg + kernel-config for newest commit on ickles git, branch drm-intel-next (1af5fa1b7e5ff8332f8a2ee3c5fb44d93b34868d) (220.46 KB, text/plain)
2010-09-08 15:22 UTC, Knuth Posern
kernel config for ickle kernel git tree, branch drm-intel-next, commit ickle-23a0d9e4fc8e54d08bfd23e1cf943bff48d552a5 (71.43 KB, application/octet-stream)
2010-09-08 21:08 UTC, camalot
make sure fdi tx,rx and pch dplls are enabled (1.20 KB, patch)
2010-09-09 14:24 UTC, Jesse Barnes
Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 (169.22 KB, text/plain)
2010-09-09 17:03 UTC, Knuth Posern
Chris (ickle) git repo, drm-intel-staging branch + patch from attachment 38588 (216.75 KB, text/plain)
2010-09-09 17:16 UTC, Knuth Posern
Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 (220.62 KB, text/plain)
2010-09-09 18:30 UTC, Knuth Posern
uname + regdump + dmesg + kernel-config for newest commit on ickles git, branch edp-hacks (commit 050486e2f3f8aa97c5f8afdc9b898b00a7114493)) (246.74 KB, text/plain)
2010-10-05 18:56 UTC, Knuth Posern
uname + regdump + dmesg + kernel-config for newest commit on jbarnes git, branch edp-hacks (commit f0c744bbce33c64bedbe15d9785efafd5380c58c)) (164.44 KB, application/octet-stream)
2010-10-07 07:52 UTC, Knuth Posern
dmesg.out intel-drm-next (42.17 KB, text/plain)
2010-10-25 07:39 UTC, Marcel Heß

Description Adam Hill 2010-07-18 23:56:54 UTC
Created attachment 37175 [details]
intel_reg_dumper output.

When loading the i915 module with mode setting enabled, the LCD immediately goes blank, though everything else appears to be running normally ( i.e. I can still log in blindly, start X, ssh in, etc. ). To reproduce, I simply boot in text mode, log in as root, and run "modprobe i915 modeset=1".

Not sure if it's relevant, but I am using grub2 as a boot loader, and have set its resolution to 1920x1080 and used gfxpayload=keep. This results in a 1920x1080 console which appears to work correctly until the i915 module is loaded.

lspci relevant line seems to be:
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02)

Lines from dmesg when the module is loaded:
[   68.038674] agpgart-intel 0000:00:00.0: Intel HD Graphics Chipset
[   68.040445] agpgart-intel 0000:00:00.0: detected 131068K stolen memory
[   68.107161] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xa0000000
[   68.160709] [drm] Initialized drm 1.1.0 20060810
[   68.221306] i915 0000:00:02.0: power state changed by ACPI to D0
[   68.221342] i915 0000:00:02.0: power state changed by ACPI to D0
[   68.221349] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   68.221352] i915 0000:00:02.0: setting latency timer to 64
[   68.258048] mtrr: no more MTRRs available
[   68.258050] [drm] MTRR allocation failed.  Graphics performance may suffer.
[   68.258179]   alloc irq_desc for 46 on node -1
[   68.258184]   alloc kstat_irqs on node -1
[   68.258198] i915 0000:00:02.0: irq 46 for MSI/MSI-X
[   68.258212] [drm] set up 127M of stolen space
[   68.449944] checking generic (a0500000 500000) vs hw (a0000000 10000000)
[   68.449947] fb: conflicting fb hw usage inteldrmfb vs EFI VGA - removing generic driver
[   68.450010] Console: switching to colour dummy device 80x25
[   68.450704] Console: switching to colour frame buffer device 240x67
[   68.450711] fb0: inteldrmfb frame buffer device
[   68.450713] drm: registered panic notifier
[   68.450716] Slow work thread pool: Starting up
[   68.450802] Slow work thread pool: Ready
[   68.452728] acpi device:00: registered as cooling_device4
[   68.453025] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input8
[   68.453065] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
[   68.453107] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0


xrandr reports:
Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 8192 x 8192
VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 disconnected (normal left inverted right x axis y axis)
DP1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
HDMI3 disconnected (normal left inverted right x axis y axis)
DP2 disconnected (normal left inverted right x axis y axis)
DP3 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 291mm x 164mm
   1920x1080      40.0* 

I've attached the output of intel_reg_dumper, but am not sure what other info is relevant, so let me know what I can try. I've tried a few recent kernels - this is with 2.6.35-rc4

Note that this laptop has dual graphics, but there is a trick of booting an old kernel first which forces the BIOS to disable one or other of them, avoiding the need for VGA switcheroo. The above is with this trick and only the Intel VGA showing in lspci.

There seem to be quite a few mentions of blank screens with KMS and i915 so apologies if this has already been reported, or if this is not the correct place.
Comment 1 Chris Wilson 2010-07-19 03:49:20 UTC
Yes, there are still a few issues with Arrandale...

Can you add the option drm.debug=0x4 and upload the dmesg? The 40Hz refresh rate sounds a little odd, but if that is what is being reported then it must be true!

I think the patches attached to bug 28479 are relevant here, in particular the ones to use the fixed panel mode:

https://bugs.freedesktop.org/attachment.cgi?id=36699
https://bugs.freedesktop.org/attachment.cgi?id=37105
https://bugs.freedesktop.org/attachment.cgi?id=37106 (unlikely to be panel fitting, but helpful)
https://bugs.freedesktop.org/attachment.cgi?id=37107
https://bugs.freedesktop.org/attachment.cgi?id=37108

Can you try those?
Comment 2 Adam Hill 2010-07-19 06:40:42 UTC
Created attachment 37177 [details]
Dmesg with drm.debug=0x4 and patches from bug 28479

Hello Chris - firstly thanks for the rapid response!

The patches would not apply to the mainline 2.6.35-rc4 that I already had, so I've switched to the drm-intel-next version from git, and then applied all of the patches mentioned. Unfortunately the problem still persists, though now the screen backlight also turns off - however, I can't be sure whether that is down to any of the patches or just the change to the git version.

I've attached the full dmesg output in case there is anything else relevant in there, but there are certainly errors relating to mode setting it seems.

X also now fails to start. I'm using the pre-compiled stuff from the Xorg Edgers PPA, which I assume should be OK to use with this kernel, but obviously that might not be the case, or it may just be the driver picking up that drm initialisation has failed. The last bit of the Xorg.0.log reads:

[    36.121] (EE) intel(0): failed to set mode: Invalid argument
[    36.121] 
Fatal server error:
[    36.121] AddScreen/ScreenInit failed for driver 0
[    36.121] 

I won't clutter things up with more info yet because this may be all you need, though if you need any more then let me know and I'll sort it ASAP.
Comment 3 Jesse Barnes 2010-07-19 12:52:22 UTC
Can you try the last attachment from https://bugs.freedesktop.org/show_bug.cgi?id=28739 (https://bugs.freedesktop.org/attachment.cgi?id=37049)?
Comment 4 Adam Hill 2010-07-19 13:31:36 UTC
Nope, sorry... same result. Let me know if you need any updated debug info.
Comment 5 Chris Wilson 2010-07-20 08:31:31 UTC
Okay, this is worrying:

[    6.468896] [drm:drm_mode_debug_printmodeline], Modeline 26:"1920x1080" 0 99910 1920 1952 1984 2281 1080 1083 1086 1095 0x0 0xa
[    6.468901] [drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on crtc ffff88023f323000
[    6.469106] [drm:drm_crtc_helper_set_config], 
[    6.469107] [drm:drm_crtc_helper_set_config], crtc: ffff88023f323000 3 fb: ffff88024126e4b0 connectors: ffff88023e6e6620 num_connectors: 1 (x, y) (0, 0)
[    6.469111] [drm:drm_crtc_helper_set_config], crtc has no fb, full mode set
[    6.469112] [drm:drm_crtc_helper_set_config], modes are different, full mode set
[    6.469113] [drm:drm_mode_debug_printmodeline], Modeline 0:"" 0 0 0 0 0 0 0 0 0 0 0x0 0x0
[    6.469115] [drm:drm_mode_debug_printmodeline], Modeline 21:"1920x1080" 40 162840 1920 1952 1984 2281 1080 1083 1086 1095 0x40 0xa
[    6.469118] [drm:drm_crtc_helper_set_config], encoder changed, full mode switch
[    6.469119] [drm:drm_crtc_helper_set_config], crtc changed, full mode switch
[    6.469121] [drm:drm_crtc_helper_set_config], setting connector 17 crtc to ffff88023f323000
[    6.469122] [drm:drm_crtc_helper_set_config], attempting to set mode from userspace
[    6.469123] [drm:drm_mode_debug_printmodeline], Modeline 21:"1920x1080" 40 162840 1920 1952 1984 2281 1080 1083 1086 1095 0x40 0xa
[    6.469127] [drm:drm_crtc_helper_set_config] *ERROR* failed to set mode on crtc ffff88023f323000
[    6.469129] detected fb_set_par error, error code: -22
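
As an aside on the 40Hz figure questioned earlier, the refresh rate implied by a modeline can be checked by hand: it is the pixel clock divided by the total number of pixels per frame (htotal × vtotal). The sketch below assumes the fields after the mode name are refresh, pixel clock in kHz, four horizontal timings and four vertical timings; `mode_refresh_millihz` is a hypothetical helper for illustration, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Vertical refresh in millihertz for a mode with the given pixel clock
 * (in kHz) and total horizontal/vertical sizes including blanking.
 * Hypothetical helper for illustration only.
 */
uint64_t mode_refresh_millihz(uint32_t clock_khz, uint32_t htotal,
                              uint32_t vtotal)
{
    assert(htotal != 0 && vtotal != 0);
    /* kHz -> Hz is a factor of 1000, Hz -> mHz another factor of 1000. */
    return (uint64_t)clock_khz * 1000u * 1000u /
           ((uint64_t)htotal * vtotal);
}
```

For the first modeline in the log (clock 99910, htotal 2281, vtotal 1095) this gives 40000 mHz, i.e. about 40 Hz, which matches what xrandr reported.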
Comment 6 Jesse Barnes 2010-07-20 09:09:45 UTC
I'm assuming it's the encoder mode_set/commit pair that fails, but can you try adding a DRM_ERROR like so just to be sure:

@@ -387,8 +388,10 @@ bool drm_crtc_helper_set_mode(struct drm_crtc *crtc,
         * on the DPLL.
         */
        ret = !crtc_funcs->mode_set(crtc, mode, adjusted_mode, x, y, old_fb);
-       if (!ret)
-           goto done;
+       if (!ret) {
+               DRM_ERROR("crtc mode set failed\n");
+               goto done;
+       }

 
        list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
 
The DP mode_set function doesn't actually do much: it just sets some bits in a field that will ultimately be written to a register.  The commit function calls DPMS to do that, and so is probably the place where things are failing.
Comment 7 Jesse Barnes 2010-07-20 09:10:27 UTC
Created attachment 37246 [details] [review]
add debug statements to dp failure paths

Hopefully the DP code is failing somewhere that can give us a clue where things are going wrong.  Can you apply this patch and attach the debug output again?
Comment 8 Adam Hill 2010-07-20 10:41:01 UTC
Created attachment 37249 [details]
DRM Error output with patch from attachment 37246 [details] [review]

I've attached the new debug output but unfortunately it doesn't look any more helpful :(. Note that I changed the driver date yesterday to make sure that I hadn't done something daft and was loading the wrong version, in case you notice the date string!
Comment 9 Adam Hill 2010-07-21 01:32:17 UTC
I've been blindly adding debug statements to the code ( I say blindly because I know nothing about the drm code or the intel chipset :( )

It appears that intel_dp_mode_fixup is failing, my debug output is:

[   42.717167] [drm:intel_dp_mode_fixup], DP mode fixup.
[   42.717169] [drm:intel_dp_mode_fixup], Mode clock now 162840, adjusted mode is:
[   42.717175] [drm:drm_mode_debug_printmodeline], Modeline 25:"1920x1080" 40 162840 1920 1952 1984 2481 1080 1083 1086 1095 0x40 0xa
[   42.717178] [drm:intel_dp_mode_fixup], Try lane_count 1.
[   42.717180] [drm:intel_dp_mode_fixup], Try clock 0.
[   42.717181] [drm:intel_dp_mode_fixup], Availale is 129600, required is 488520.
[   42.717183] [drm:intel_dp_mode_fixup], Try clock 1.
[   42.717184] [drm:intel_dp_mode_fixup], Availale is 216000, required is 488520.
[   42.717186] [drm:intel_dp_mode_fixup], All clocks failed.
[   42.717188] [drm:intel_dp_mode_fixup], Try lane_count 2.
[   42.717189] [drm:intel_dp_mode_fixup], Try clock 0.
[   42.717191] [drm:intel_dp_mode_fixup], Availale is 259200, required is 488520.
[   42.717193] [drm:intel_dp_mode_fixup], Try clock 1.
[   42.717194] [drm:intel_dp_mode_fixup], Availale is 432000, required is 488520.
[   42.717196] [drm:intel_dp_mode_fixup], All clocks failed.
[   42.717197] [drm:intel_dp_mode_fixup], All lane counts failed.
[   42.717199] [drm:intel_dp_mode_fixup], IS_eDP false, so returning false.
[   42.717200] [drm:drm_crtc_helper_set_mode] *ERROR* mode_fixup call in list_for_each_entry failed.

Here's hoping that this helps :)
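
The numbers in that log can be reproduced with a little arithmetic. The following is a minimal sketch, not the driver code: the function names are hypothetical, the two link clocks (162000 and 270000 kHz) are inferred from the "Availale" figures, and the factor of 3 on the pixel clock is inferred from 162840 × 3 = 488520.

```c
#include <assert.h>
#include <stdbool.h>

/* Link clocks in kHz, inferred from the available-bandwidth figures. */
static const int link_clocks_khz[] = { 162000, 270000 };

/* Payload rate of a link: 8/10 of the raw rate remains after coding. */
int dp_max_data_rate(int link_clock_khz, int lanes)
{
    return (link_clock_khz * lanes * 8) / 10;
}

/* Requirement as implied by the log: pixel clock times 3. */
int dp_link_required(int pixel_clock_khz)
{
    return pixel_clock_khz * 3;
}

/*
 * Walk lane counts 1, 2, 4 ... up to max_lanes, trying each link clock.
 * Returns true and fills the outputs for the first configuration that fits.
 */
bool dp_find_link(int pixel_clock_khz, int max_lanes,
                  int *lanes_out, int *clock_out)
{
    int required = dp_link_required(pixel_clock_khz);

    assert(lanes_out && clock_out);
    for (int lanes = 1; lanes <= max_lanes; lanes <<= 1) {
        for (int c = 0; c < 2; c++) {
            if (dp_max_data_rate(link_clocks_khz[c], lanes) >= required) {
                *lanes_out = lanes;
                *clock_out = link_clocks_khz[c];
                return true;
            }
        }
    }
    return false;
}
```

With a maximum of two lanes, as tried in the log, no combination reaches 488520, so the search fails. Purely arithmetically, four lanes at the lower clock would give 518400, but the log shows only lane counts 1 and 2 being attempted.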
Comment 10 Adam Hill 2010-07-21 04:28:26 UTC
Created attachment 37263 [details]
Debug with available bandwidth per link artificially high

Just as a random thing to try, I changed the intel_dp_max_data_rate function to ignore what I assume is 8/10 encoding on the bus, expecting to get some sort of corrupted display. However, although intel_dp_mode_fixup now returns a valid mode, the end result is the same - i.e. the backlight is turned off and the link disabled. Obviously I have no idea if this is because the mode is not, in fact, valid as far as some other part of the code is concerned, or if it is still indicative of the fundamental issue.

I'm attaching some more debug output with all sorts of extra debug lines in case anything obvious is going on ( for someone who knows how this works! )
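
For reference, this is roughly why dropping the 8/10 factor lets the bandwidth check pass. The helper below is hypothetical and only parameterises the coding efficiency; the 270000 kHz link clock and the 488520 requirement are taken from the earlier debug output.

```c
#include <assert.h>

/*
 * Data rate with a coding efficiency of num/den: 8/10 for the normal
 * case, 1/1 for the experiment that ignores the coding overhead.
 */
int dp_data_rate(int link_clock_khz, int lanes, int num, int den)
{
    assert(den != 0);
    return (link_clock_khz * lanes * num) / den;
}
```

Two lanes at 270000 kHz yield 432000 with the 8/10 factor (below the required 488520) and 540000 without it (above), so the arithmetic suggests the check would now succeed at that configuration, even though the real link may not be able to carry that much data.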
Comment 11 Jesse Barnes 2010-07-21 10:43:17 UTC
You probably have a PCH attached eDP panel rather than one attached directly to the CPU output if IS_eDP fails.  So you could try changing mode_fixup to use:
  if (IS_eDP(intel_encoder) || IS_PCH_eDP(dp_priv)) {
instead.
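
To illustrate the effect of that condition, here is a minimal sketch of the fallback at the end of a mode fixup. It assumes that the eDP branch simply falls back to the panel's maximum lane count and link clock when the bandwidth search fails; the struct and function names are hypothetical and do not match the driver's types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the encoder state used by the real driver. */
struct dp_encoder_sketch {
    bool is_cpu_edp;        /* panel on the CPU eDP output */
    bool is_pch_edp;        /* panel on a PCH DisplayPort output */
    int max_lanes;
    int max_link_clock_khz;
};

/*
 * Called after the bandwidth search has failed. For any kind of eDP
 * panel, pick the highest configuration instead of rejecting the mode.
 */
bool dp_fixup_fallback(const struct dp_encoder_sketch *enc,
                       int *lanes, int *link_clock_khz)
{
    assert(enc && lanes && link_clock_khz);
    if (enc->is_cpu_edp || enc->is_pch_edp) {
        *lanes = enc->max_lanes;
        *link_clock_khz = enc->max_link_clock_khz;
        return true;
    }
    return false;   /* external DP: the mode really does not fit */
}
```

With only the CPU eDP test, a PCH-attached panel takes the `return false` path, which would match the "IS_eDP false, so returning false" line in the debug output.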

I'm also curious if removing these lines from intel_display.c helps your situation at all:

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_d
index b97a8d0..7d245d8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3885,8 +3885,6 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc,
                        pipeconf &= ~PIPEACONF_DOUBLE_WIDE;
        }
 
-       dspcntr |= DISPLAY_PLANE_ENABLE;
-       pipeconf |= PIPEACONF_ENABLE;
        dpll |= DPLL_VCO_ENABLE;
 
 
We enable them later when we commit the configuration, so it *should* be safe, however a similar patch from Carl caused regressions for some people, so it's not generally safe.
Comment 12 Adam Hill 2010-07-21 13:00:42 UTC
(In reply to comment #11)
> You probably have a PCH attached eDP panel rather than one attached directly to
> the CPU output if IS_eDP fails.  So you could try changing mode_fixup to use:
>   if (IS_eDP(intel_encoder) || IS_PCH_eDP(dp_priv)) {
> instead.

Changing the above fixed it! Oddly, I manually did an insmod i915 modeset=1 from the console, and I got a blank screen, though with the backlight still on. I typed "startx", and to my amazement up popped a desktop, and I can also switch to other consoles. There will be many VPC-Z owners who'll be happy about this progress - thanks! Now, if I can just get the nouveau driver working.... :)

> I'm also curious if removing these lines from intel_display.c helps your
> situation at all:

I tried removing those lines, but that resulted in a hard freeze.
Comment 13 Adam Hill 2010-07-21 14:00:47 UTC
Created attachment 37282 [details]
Dmesg with drm.debug=0x04 and boot without "text"

Not there yet unfortunately. If I boot with "text" kernel parameter, then:

modprobe i915 modeset=1
startx

it works. I can even use the modified Z series sony-laptop module to switch off the NVidia graphics, and the i915 driver still works ( so no booting old kernels any more. ) KDE desktop effects don't work when using OpenGL though, but this may be unrelated ( though they work with the NVidia card and proprietary drivers. )

HOWEVER, if I boot normally, I'm still faced with a black screen. dmesg with drm.debug=0x4 attached.
Comment 14 Jesse Barnes 2010-07-21 14:05:36 UTC
Hm, those should be equivalent.  Are you sure the fixed i915 module ended up in your initrd?
Comment 15 Adam Hill 2010-07-21 14:24:22 UTC
(In reply to comment #14)
> Hm, those should be equivalent.  Are you sure the fixed i915 module ended up in
> your initrd?

Yep... just changed the date string again to make 100% sure.
Comment 16 Jesse Barnes 2010-07-21 14:53:05 UTC
Do you also have the patch from 28739 applied?  I can imagine that the panel reg lock might be causing you trouble...
Comment 17 Sérgio M. Basto 2010-07-21 20:46:55 UTC
(In reply to comment #1)
> Yes, there are still a few issues with Arrandale...
> 
> Can you add the option drm.debug=0x4 and upload the dmesg? The 40Hz refresh
> rate sounds a little odd, but if that is what is being reported then it must
> true!
> 
> I think the patches attached to bug 28479 are relevant here, in particular the
> ones to use the fixed panel mode:
> 
> https://bugs.freedesktop.org/attachment.cgi?id=36699
> https://bugs.freedesktop.org/attachment.cgi?id=37105
> https://bugs.freedesktop.org/attachment.cgi?id=37106 (unlikely to be panel
> fitting, but helpful)
> https://bugs.freedesktop.org/attachment.cgi?id=37107
> https://bugs.freedesktop.org/attachment.cgi?id=37108

good review, might be useful also : 
http://lists.freedesktop.org/archives/intel-gfx/2010-June/007232.html
from https://bugs.freedesktop.org/show_bug.cgi?id=28070#c46
Comment 18 Adam Hill 2010-07-22 01:36:32 UTC
This is all getting messy and confusing, so I thought that I would try to summarise where we are with it!

I've reset my git tree and applied the various mods individually, and found that:

- The mod suggested by Jesse in comment 11 to the mode_fixup routine does not fix the issue on its own, but is required.

- The patch from attachment id 37049 in combination with that mod makes the laptop usable, but only if I:

Boot with "text" kernel parameter.
modprobe i915 modeset=1  ( at which point the screen blanks, but with backlight on. Switching VTs makes no difference, screen still blank. )
startx  ( Desktop appears. VT switching is possible and consoles are visible. )

What seems to happen after the modprobe is that the backlight turns off for a second or so, then comes on again. I believe that if X is started during the time when the light is off, it stays off, which is also why it isn't possible to boot without text ( presumably something tries to use the chip before it is fully initialised? ) This is confirmed by the fact that after booting in text mode, and just running startx to let the modules sort themselves out, the screen goes blank with no backlight and stays that way.
Comment 19 Adam Hill 2010-07-22 03:55:36 UTC
Interestingly, I tried reverting to the standard Ubuntu 10.04 X stuff ( using ppa-purge xorg-edgers ) and tried again. If I boot with the modified sony-laptop driver, so that the NVidia card is disabled but still shows in lspci, I can type "startx" and the screen goes blank, and the back light comes on. I can then switch terminals, but X failed to start because it said that all connectors ( including DP3 ) are disconnected. However, I can also boot without "text", and everything sort of works - because X is retried by init and it eventually starts ( after the driver has turned on the backlight again. ) HOWEVER, ( and perhaps oddly ) if I boot with the old kernel first, thus disabling the NVidia card completely ( even from lspci ) the behaviour is as with the latest X stuff - i.e. the backlight stays off and no display. Also, if I boot without "text" after booting the old kernel in this way, the system becomes completely unresponsive as before.

I am guessing that this behaviour is because the older X driver does not start because it thinks that there is no connected display and hence the conflict of initialising the display before the driver itself has initialised never happens.

Quite why the behaviour is different depending on how the NVidia card is disabled though I have no idea, and if anything I would have expected things to work better when it is disabled completely rather than still appearing on the device list.

Unrelated, but just in case anyone else can give me any pointers, KDE compositing still doesn't work using any combination of drivers, even though all of the required features seem to exist :(.
Comment 20 Jesse Barnes 2010-07-22 12:48:19 UTC
*** Bug 29221 has been marked as a duplicate of this bug. ***
Comment 21 Jesse Barnes 2010-07-26 12:31:57 UTC
The for-linus branch has some other changes that might help:

commit 6ba770dc5c334aff1c055c8728d34656e0f091e2
Author: Adam Jackson <ajax@redhat.com>
Date:   Fri Jul 2 16:43:30 2010 -0400

    drm/i915: Make G4X-style PLL search more permissive

commit fe27d53e5c597ee5ba5d72a29d517091f244e974
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Jun 30 11:46:17 2010 +1000

    i915: fix ironlake edp panel setup (v4)

have you tested those?
Comment 22 Adam Hill 2010-07-27 01:56:48 UTC
(In reply to comment #21)
> The for-linus branch has some other changes that might help:
> 
> commit 6ba770dc5c334aff1c055c8728d34656e0f091e2
>     drm/i915: Make G4X-style PLL search more permissive

I've tried that one now but it made no discernible difference.

> commit fe27d53e5c597ee5ba5d72a29d517091f244e974
>     i915: fix ironlake edp panel setup (v4)

This one was already in the drm-intel git that I was using. I've also now tried the mainline 2.6.35-rc6 but with intel_dp.c from git ( plus the two patches to it already identified, ) again with the same results, except that oddly now an rc.local which does the modprobes does not seem to work any more ( black screen ) whereas if I log in to the console and do it manually it works as before. I maybe need to test this with much longer time delays in the script, but if true that is another weird thing to add to the mix :(
Comment 23 Adam Hill 2010-07-27 06:54:10 UTC
Ok, more testing and hopefully some more clues... I've now managed to explain the difference in the different methods of disabling the NVidia card and that was a bit of a red herring. It also explains why the boot script never worked but manually typing the commands did...

What I've found is that you have to cause the display to scroll before typing the insmod command... so, modifying rc.local to start with:

for i in $(seq 1 100); do echo "....." >/dev/tty0; done

makes it work. Changing the string to echo -e ".....\r", or echo -n ".....", doesn't work, so text seems to have to scroll off both the right and bottom of the screen, or else modprobe i915 modeset=1 results in a black screen with no backlight but still a responsive system.

Note that all of this is with Grub2 in 1920x1080 mode and gfxpayload=keep. Removing either of those factors means that the modprobe results in a black screen PLUS a completely unresponsive system.

( The reason that the "boot the old kernel" trick behaved differently is related. The newer kernels have a problem with the sound controller on the NVidia card appearing in the device list but being disabled by ACPI. The kernel tries to turn it on, fails, but the routine in pci.c does not actually return an error code. That results in the hda-intel driver trying to initialise the card despite it being in D3 state, and hence spewing a load of errors to the console, causing me to miss the above because the display had been scrolled loads. I changed the routine in pci.c to return an error if something fails to be woken up from D3, which fixes the other issue but also causes the different behaviour because of the lack of console messages. )
Comment 24 Adam Hill 2010-08-06 13:32:36 UTC
Very small amount of progress on this... in intel_display.c there is a loop which waits for the transcoder enable bit to clear:

                       /* wait for PCH transcoder off, transcoder state */
                        while ((I915_READ(transconf_reg) & TRANS_STATE_ENABLE) != 0) {
                                n++;
                                if (n < 60) {
                                        udelay(500);
                                        continue;
                                } else {
...

If I increase the max loop count to 200 then that removes the need to send output to the console before loading the i915 module, so presumably the transcoder disable is taking longer than expected to take effect. However, this does not affect the fact that there needs to be a delay between loading the module and starting X. Also, there are versions of the same machine but with a 1600x900 display panel which still can not be made to work currently.
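
In terms of time, 60 polls at 500 µs bound the wait at about 30 ms, while 200 polls allow about 100 ms. The sketch below simulates that bounded polling loop against a fake register; the bit position, the struct, and the 100-poll delay are all made up for illustration and say nothing about how long the real hardware takes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRANS_STATE_ENABLE_SKETCH (1u << 30)   /* hypothetical bit position */

/* Fake transcoder register whose state bit clears after a number of reads. */
struct fake_transcoder {
    uint32_t value;
    int polls_until_clear;
};

uint32_t fake_read(struct fake_transcoder *t)
{
    assert(t);
    if (t->polls_until_clear > 0)
        t->polls_until_clear--;
    else
        t->value &= ~TRANS_STATE_ENABLE_SKETCH;
    return t->value;
}

/* Poll up to max_polls times; returns true once the state bit is clear. */
bool wait_transcoder_off(struct fake_transcoder *t, int max_polls)
{
    for (int n = 0; n < max_polls; n++) {
        if ((fake_read(t) & TRANS_STATE_ENABLE_SKETCH) == 0)
            return true;
        /* the driver would udelay(500) here before polling again */
    }
    return false;
}
```

If the fake register needs 100 reads before the bit clears, a limit of 60 times out and a limit of 200 succeeds, which is the behaviour the larger loop count is working around.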

If I load the module and try to use it immediately then there seems to be an aux channel timeout message on the i2c bus which is significant, and as a result displayport 3 is read as disconnected... I'm still looking into the cause of this.
Comment 25 Adam Hill 2010-08-07 02:11:50 UTC
Actually, the mod in my previous comment does seem to fix things, I think that I was wrong about the aux_ch timeout message. However, the screen takes a couple of seconds to initialise, and if X is started before it completes, it fails to load because it does not detect any outputs.

So, increasing the transcoder disable loop count is definitely a workaround, though I'm assuming that the length of time that it takes for the screen to finally initialise and turn on the backlight is not optimal!
Comment 26 Chris Wilson 2010-08-09 09:27:59 UTC
Created attachment 37732 [details] [review]
Increase timeout

Adam, it would be good if you could test this patch which should do exactly the same timeout tweak. The patch is based on anholt/drm-intel-next which should be pushed upstream in the near future.
Comment 27 Adam Hill 2010-08-09 11:07:46 UTC
(In reply to comment #26)
> Created an attachment (id=37732) [details]
> Increase timeout
> 
> Adam, it would be good if you could test this patch which should do exactly the
> same timeout tweak. The patch is based on anholt/drm-intel-next which should be
> push upstream in the near future.

Hi Chris, as it happens I pulled the latest drm-intel-next code this afternoon and already tried changing the timeout for both transcoder on and off ( though note that it was turning the transcoder off that was causing the problem in 2.6.35 ). Unfortunately, the drm-intel-next branch has an even more fundamental problem - it detects no outputs... and so whilst I guess the patch is needed ( though as I said, mainly the transcoder off wait which you haven't changed ) I can not be 100% sure. It seems that there are so many issues with this machine at the moment :(. I'm not sure how many bugs this "it doesn't work" bug ID represents!

The significant difference seems to be that whereas 2.6.35 detects DisplayPort-1,2 and 3, with 3 being the one connected, drm-intel-next detects "Embedded DisplayPort-1", and DisplayPort-1 and 2, with none being connected. Also, previously drm_enable_connectors reported that connector 17 was connected, now connector 17 is not listed.
Comment 28 Adam Hill 2010-08-09 11:09:27 UTC
Created attachment 37739 [details]
Dmesg from drm-intel-next

Sorry - dmesg output with drm.debug=0xff from drm-intel-next attached.
Comment 29 Adam Hill 2010-08-10 01:44:18 UTC
In intel_dp.c ( drm-intel-next, ) function intel_dp_aux_ch, if I change the clock divider etc logic thus:

	if (HAS_PCH_SPLIT(dev)) {
		aux_clock_divider = 62; /* IRL input clock fixed at 125Mhz */
		I915_WRITE(PCH_PP_CONTROL, I915_READ(PCH_PP_CONTROL) | (1<<3));
	} else if (IS_eDP(intel_dp)) {
		if (IS_GEN6(dev))
			aux_clock_divider = 200; /* SNB eDP input clock at 400Mhz */
		else
			aux_clock_divider = 225; /* eDP input clock at 450Mhz */
	} else
		aux_clock_divider = intel_hrawclk(dev) / 2;

( note the addition of I915_WRITE(PCH_PP_CONTROL, I915_READ(PCH_PP_CONTROL) | (1<<3));, a previously suggested patch ) then I now get a blank screen with backlight on, and no aux_ch timeouts for Embedded DisplayPort-1. The correct modes are detected etc. When starting X and the module is initialised, I get:

connector (null)1 is connected.

So, again progress but I now get a backlit blank screen instead of a dark one!

If I make the same change to the 2.6.35-rc6 kernel that I have made work, there are no adverse effects, BUT the 60Hz mode which was initially detected as 0Hz and rejected is now detected correctly as 60Hz. The fact that having the clock divider set at 62 still works leads me to believe that it is the correct value. Also, I added a debug at the start of the aux_ch subroutine to read the previous control register contents, and the divider was indeed 0x3e before loading the i915 module.
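
The divider values in the snippet above are all the input clock in MHz divided by two, with integer truncation. That would correspond to an AUX clock of roughly 2 MHz, though that target is an inference from the numbers and is not stated in this thread. A tiny sketch with a hypothetical helper name:

```c
#include <assert.h>

/* Divider for a given input clock in MHz: half the clock, truncated. */
int aux_clock_divider_for(int input_clock_mhz)
{
    assert(input_clock_mhz > 0);
    return input_clock_mhz / 2;
}
```

For the 125 MHz PCH input clock this gives 62, which is 0x3e, the same value read back from the control register before the module was loaded.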

I'm going to see if this change has any effect on the 1600x900 panel with rc6.
Comment 30 Jesse Barnes 2010-08-10 14:15:48 UTC
Sounds very similar to 29278 and also to an issue I'm seeing here on an HP 8440p.  Moving the panel_on call above the dp link train call in the dpms function worked around the issue for me.
Comment 31 Adam Hill 2010-08-12 03:11:04 UTC
(In reply to comment #30)
> Sounds very similar to 29278 and also to an issue I'm seeing here on an HP
> 8440p.  Moving the panel_on call above the dp link train call in the dpms
> function worked around the issue for me.

Doing this kills the backlight again for me.

Does anyone know if there has been a fundamental change in the way this driver sets up the chip compared to the 2.6.35 mainline? The reason that I ask is that there are quite a few differences in the register dump from after a modprobe ( but before X ) for the two versions.

I've started to try to make them look the same, and to this end I've tried commenting out the if(!HAS_eDP) test from ironlake_crtc_dpms in intel_display.c, which causes transcoder A to be enabled. Then I added:

	temp &= ~(7 << 19);
	temp |= (intel_crtc->fdi_lanes - 1) << 19;
	temp |= 0x2010;

after the read of fdi_tx_reg in ironlake_fdi_link_train which stops link train 1 from failing and makes the FDI_RXA_CTL register look the same... but I still can't get a display :(.
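
For anyone following along, the three lines above can be read as "clear a 3-bit lane-count field starting at bit 19, store lanes - 1 in it, then OR in some extra bits". The sketch below restates that as a standalone function so it can be checked outside the kernel. The macro and function names are hypothetical, and the meaning of 0x2010 is not established here; it is simply the value used above.

```c
#include <stdint.h>

/* Hypothetical names; the bit positions come from the snippet above. */
#define FDI_LANE_SHIFT	19
#define FDI_LANE_MASK	(7u << FDI_LANE_SHIFT)
#define FDI_EXTRA_BITS	0x2010u	/* value taken verbatim from the snippet; meaning unknown here */

/* Return temp with the lane-count field replaced and the extra bits set. */
uint32_t fdi_tx_with_lanes(uint32_t temp, unsigned int fdi_lanes)
{
	temp &= ~FDI_LANE_MASK;
	/* The field stores (lanes - 1); mask so a bad count cannot spill over. */
	temp |= ((uint32_t)(fdi_lanes - 1) << FDI_LANE_SHIFT) & FDI_LANE_MASK;
	temp |= FDI_EXTRA_BITS;
	return temp;
}
```

Starting from a zeroed register, four lanes would give 0x182010 and two lanes 0x82010.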
Comment 32 Jesse Barnes 2010-08-23 13:54:59 UTC
Created attachment 38111 [details] [review]
eDP detection fix

Does 2.6.36-rc2 with the attached patch work ok?
Comment 33 Adam Hill 2010-08-24 03:09:33 UTC
Created attachment 38120 [details]
dmesg with drm.debug=4 and 2.6.36-rc2 patch from id 38111

Nope, sorry... no backlight and lots of errors in the dmesg output. The attached is from a manual "modprobe i915 modeset=1" then a "startx".
Comment 34 Jesse Barnes 2010-08-24 09:46:42 UTC
The *ERROR* panel on wait timed out: 0x00000000 messages are disturbing.  If we can't turn the panel on, link training and detection will fail, and you won't see the backlight come on.


Can you change the code in the last patch to set the VDD AUX bit instead of powering on the panel fully?  I thought that powering on the panel would be safe on all machines, but it may be that on newer ones we need to use the VDD AUX bit after all.  (That's the 1<<3 bit in PCH_PP_CONTROL.)
Comment 35 Adam Hill 2010-08-25 02:56:32 UTC
Created attachment 38138 [details]
DRM debug messages from 2.6.36-rc2 with my mods

Afraid it didn't make any difference, though I don't have any documentation for the chip and so wasn't 100% sure what to do. Essentially though I copied the stuff from ironlake_edp_panel_on and created an ironlake_edp_vdd_on function which I called in the dp_detect routine instead of on. I removed the "disable reset around power sequence" and changed the write to pp |= PANEL_UNLOCK_REGS | EDP_FORCE_VDD, and removed the reset of that bit further down.

However, a couple of things I have tried since:
Again, swapping the if clauses of if (HAS_PCH_SPLIT(dev)) and if (IS_eDP(intel_dp)) in intel_dp_aux_ch FIXES ALL OF THE AUX TIMEOUTS FOR AUX D. Therefore I believe as I said earlier that the clock divider needs to be 62 rather than 225 as it is otherwise. I don't think that the panel power thing was the cause of the aux channel failing at all.

Also, changing references to PP_ON to be PANEL_POWER_ON for the PCH_PP_STATUS tests in edp_panel_on/off seems to fix most of the errors waiting for the panel to come on and turning off the pipe. This is pure guesswork based on the fact that the bit definitions for PCH_PP_CONTROL are completely different to PP_CONTROL, so I guessed that PCH_PP_STATUS would follow suit. This may be bogus.

After making these changes, the backlight comes on after about 12 seconds from the modprobe, but then goes off several seconds later. The drm messages are attached for this case... note that there is still a power on ERROR which happens around 16 seconds after the modprobe, which seems to coincide with when the panel goes off again, so presumably it fails to detect that the light is already on? Also, 12 seconds seems an age for the light to come on - there seems to be no debug output during most of this pause.
Comment 36 Jesse Barnes 2010-08-25 09:03:33 UTC
(In reply to comment #35)
> Created an attachment (id=38138) [details]
> DRM debug messages from 2.6.36-rc2 with my mods
> 
> Afraid it didn't make any difference, though I don't have any documentation for
> the chip and so wasn't 100% sure what to do. Essentially though I copied the
> stuff from ironlake_edp_panel_on and created an ironlake_edp_vdd_on function
> which I called in the dp_detect routine instead of on. I removed the "disable
> reset around power sequence" and changed the write to pp |= PANEL_UNLOCK_REGS |
> EDP_FORCE_VDD, and removed the reset of that bit further down.

Yeah, that sounds right.  Setting EDP_FORCE_VDD is an alternative to turning the panel on fully; it just gives it enough power that AUX transactions should work and we can retrieve the panel config.

> However, a couple of things I have tried since:
> Again, swapping the if clauses of if (HAS_PCH_SPLIT(dev)) and if
> (IS_eDP(intel_dp)) in intel_dp_aux_ch FIXES ALL OF THE AUX TIMEOUTS FOR AUX D.
> Therefore I believe as I said earlier that the clock divider needs to be 62
> rather than 225 as it is otherwise. I don't think that the panel power thing
> was the cause of the aux channel failing at all.

Ok, I'll check on the correct frequency, that sounds promising.

> Also, changing references to PP_ON to be PANEL_POWER_ON for the PCH_PP_STATUS
> tests in edp_panel_on/off seems to fix most of the errors waiting for the panel
> to come on and turning off the pipe. This is pure guesswork based on the fact
> that the bit definitions for PCH_PP_CONTROL are completely different to
> PP_CONTROL, so I guessed that PCH_PP_STATUS would follow suit. This may be
> bogus.

PCH_PP_STATUS has a similar format to PP_STATUS though, in particular the 'on' bit is still bit 31, so I think these checks are correct already (we only ever compare PP_ON with PCH_PP_STATUS; if we had mixed up the control bits there would definitely be problems though).
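
To make the distinction concrete, here is a small sketch of the bits discussed in this thread: the status "on" bit (bit 31), the control power-on bit (bit 0) and the VDD AUX bit (1<<3). The helper function names are hypothetical, and the PANEL_UNLOCK_REGS value is an assumption that should be checked against i915_reg.h before being relied on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in this thread. */
#define PP_ON			(1u << 31)	/* status register: panel is on */
#define PANEL_POWER_ON		(1u << 0)	/* control register: full power on */
#define EDP_FORCE_VDD		(1u << 3)	/* control register: VDD for AUX only */
/* Assumption: unlock key value; verify against i915_reg.h. */
#define PANEL_UNLOCK_REGS	(0xabcdu << 16)

/* Request VDD only; the full power-on bit is left as it was. */
uint32_t pp_control_force_vdd(uint32_t pp)
{
	return pp | PANEL_UNLOCK_REGS | EDP_FORCE_VDD;
}

/* The "panel is on" test looks at bit 31 of the status register. */
bool pp_status_panel_on(uint32_t status)
{
	return (status & PP_ON) != 0;
}
```

Note that a status value with only bit 0 set does not count as "on" under this test, which is why swapping PP_ON for PANEL_POWER_ON in the status checks changes what is being waited for.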

> After making these changes, the backlight comes on after about 12 seconds from
> the modprobe, but then goes off several seconds later. The drm messages are
> attached for this case... note that there is still a power on ERROR which
> happens around 16 seconds after the modprobe, which seems to coincide with when
> the panel goes off again, so presumably it fails to detect that the light is
> already on? Also, 12 seconds seems an age for the light to come on - there
> seems to be no debug output during most of this pause.

Yeah there's definitely something funky going on.
Comment 37 Jesse Barnes 2010-08-25 10:40:33 UTC
> > However, a couple of things I have tried since:
> > Again, swapping the if clauses of if (HAS_PCH_SPLIT(dev)) and if
> > (IS_eDP(intel_dp)) in intel_dp_aux_ch FIXES ALL OF THE AUX TIMEOUTS FOR AUX D.
> > Therefore I believe as I said earlier that the clock divider needs to be 62
> > rather than 225 as it is otherwise. I don't think that the panel power thing
> > was the cause of the aux channel failing at all.
> 
> Ok, I'll check on the correct frequency, that sounds promising.

Got confirmation on this.  For PCH attached eDP panels, we need to use the 125MHz reference clock as you found.  So the block should look more like:

	if (IS_eDP(intel_dp) && !IS_PCH_eDP(intel_dp)) {
		if (IS_GEN6(dev))
			aux_clock_divider = 200; /* SNB eDP input clock at 400Mhz */
		else
			aux_clock_divider = 225; /* eDP input clock at 450Mhz */
	} else if (HAS_PCH_SPLIT(dev))
		aux_clock_divider = 62; /* IRL input clock fixed at 125Mhz */
	else
		aux_clock_divider = intel_hrawclk(dev) / 2;

Still working on the other issues...
Comment 38 Jesse Barnes 2010-08-25 14:50:40 UTC
Just pushed some bits including the change above into the edp-testing branch of my drm repo:
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel.git
can you give it a try?
Comment 39 Adam Hill 2010-08-26 02:03:33 UTC
Created attachment 38174 [details]
Latest 2.6.36-rc2+ dmesg 

The aux ch timeouts are now gone, but no other progress. This may be a complete coincidence but the:

[   33.329676] [drm:ironlake_crtc_dpms] *ERROR* failed to turn off cpu pipe

goes away if I change the check of PP_ON to PANEL_POWER_ON in the backlight function, but if I then add a delay after the wait ( so it waits for bit 0 clear, then a delay of a second ) it comes back again. From what you say bit 0 is not relevant, BUT it definitely changes some time after the panel power off is written ( and some time after the PP_ON bit clears. ) I've put some additional prints of the status/control reg around the writes in the attached dmesg, though the delay here isn't long enough to see it ( or maybe it has something to do with blanking, hence the fact that an additional delay causes the pipe disable to fail again. Maybe something else is still enabled which means that the pipe can only be disabled during blanking, when the "something else" is inherently disabled? ) Sorry - random ramblings!

Jesse - I sent you an e-mail yesterday, not sure if you saw it or if it helps at all?
Comment 40 Adam Hill 2010-08-26 03:59:54 UTC
Just another observation that comment 31 still holds true for this kernel - there is no link_train/transcoder enable call unless I comment out the if(!HAS_eDP) in ironlake_crtc_dpms, though of course I don't know what the FDI link or transcoder do or whether they have any purpose on an embedded display port :). All that I can say though is that without these, the backlight doesn't come on, whilst with them it does. So my guess is that either the FDI link/transcoder do have a purpose, or something used instead isn't being set up? It doesn't fix the fundamental problem though - there is still no display... I just get a backlit blank screen instead of a dark one ;).

Thanks for putting effort into this Jesse - just wish I could be more help!
Comment 41 Jesse Barnes 2010-08-30 09:50:22 UTC
Ok, those are good clues, lemme see what else I can come up with.
Comment 42 Knuth Posern 2010-09-01 11:38:14 UTC
Created attachment 38366 [details]
dmesg drm.debug=0x4 for jbarnes kernel git tree, branch edp-testing, commit ad2456c

I am having the same problem as Adam (probably nearly the same hardware - Sony Vaio VPCZ1, intel i7 with built-in intel):

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02)

If I can be of any help somehow: Please let me know.
Comment 43 Knuth Posern 2010-09-01 11:42:57 UTC
Comment on attachment 38366 [details]
dmesg drm.debug=0x4 for jbarnes kernel git tree, branch edp-testing, commit ad2456c

I am having the same problem as Adam (probably nearly the same hardware):

Sony Vaio VPCZ1290S notebook with intel i-7, built-in intel:

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02)

There is also a built-in nvidia gpu, but it is deactivated on boot (--> lspci only shows the one intel VGA device above).

uname -a:
Linux seven 2.6.36-rc2-pixel-80560-gaf096b3 #1 SMP PREEMPT Wed Sep 1 13:55:26 EDT 2010 x86_64 Intel(R) Core(TM) i7 CPU M 620 @ 2.67GHz GenuineIntel GNU/Linux

Distro: Gentoo Linux amd64 multilib (with 32-bit lib support)

If you need anything else or I can be of any help somehow: Please let me know.
Comment 44 Knuth Posern 2010-09-01 11:44:04 UTC
(In reply to comment #43)
> (From update of attachment 38366 [details])
...

Sorry, I hit submit too early ;)
Comment 45 Adam Hill 2010-09-04 09:47:13 UTC
I've now had a chance to try the Grub 1 boot loader rather than Grub 2. With Grub 1, everything seems to work properly for some odd reason. I can't use Grub 1 as a permanent solution, though; it was just a test.

One very important difference seems to be that the console driver is using ( presumably ) the Intel frame buffer when booted with Grub 1, because the scrolling is MUCH faster... when booted using Grub 2 even after my fix which allows X to work and switching to another VT, scrolling is painfully slow.

So, the question has to be, why should the boot loader make a difference to whether or not the display driver then works once the kernel is booted? Is the driver assuming that something is initialised when in fact Grub 2 sets some strange values? Is it to do with the frame buffer set-up?

Please let me know if any of this is useful and hence whether it would be worth me attaching drm messages or register dumps for Grub 1 vs Grub 2. I will also try to test the 2.6.36-rc2+ kernel when I get a chance to see if that too works with Grub 1.
Comment 46 Michael Zugelder 2010-09-05 04:45:13 UTC
(In reply to comment #45)
> I've now had a chance to try the Grub 1 boot loader rather than Grub 2. With
> Grub 1, everything seems to work properly for some odd reason. I can't use Grub
> 1 as a permanent solution though but it was just a test.

I can confirm this as I just dumped syslinux (which works, too) for grub 1.
This causes those strange situations where a live cd/usb works flawlessly and booting after installation freezes the system.
Comment 47 camalot 2010-09-05 15:03:32 UTC
I'm also experiencing what I believe may be this issue on my Dell Latitude E6510.  Certainly the symptoms appear to be the same.  I'd be happy to help in troubleshooting or verifying fixes.
Comment 48 Jesse Barnes 2010-09-07 12:00:33 UTC
Can you try the drm-intel-next branch at git://people.freedesktop.org/~ickle/drm-intel?  It has all my edp fixes merged in along with a few others that might help.
Comment 49 camalot 2010-09-07 20:54:43 UTC
I see the same behavior using commit  from drm-intel-next at git://people.freedesktop.org/~ickle/drm-intel.  I have the drm code compiled into my kernel (not as a module) and no initramfs.  What I see is that after selecting the boot option in grub I get approximately two screenfuls of kernel text in a large font (standard console?) and then at the point where normally I would expect the system to switch to the framebuffer and the text to become much smaller the screen goes completely dark as if the backlight is off.  After a few seconds I think the backlight comes back on because the screen brightens although it remains black.

The system is clearly up because I know the IP address it is configured with and can SSH into it.  I can start the X server but nothing is displayed.  From "xrandr -d :0" I get:

Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 8192 x 8192
(null)1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 194mm
   1920x1080      60.0*+   40.0  
VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 disconnected (normal left inverted right x axis y axis)
DP1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
HDMI3 disconnected (normal left inverted right x axis y axis)
DP2 disconnected (normal left inverted right x axis y axis)
DP3 disconnected (normal left inverted right x axis y axis)

If I then connect my external VGA monitor "xrandr -d :0" reports:

Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 8192 x 8192
(null)1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 194mm
   1920x1080      60.0*+   40.0  
VGA1 connected (normal left inverted right x axis y axis)
   1024x768       60.0  
   800x600        60.3     56.2  
   848x480        60.0  
   640x480        59.9     59.9  
HDMI1 disconnected (normal left inverted right x axis y axis)
DP1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
HDMI3 disconnected (normal left inverted right x axis y axis)
DP2 disconnected (normal left inverted right x axis y axis)
DP3 disconnected (normal left inverted right x axis y axis)

The external monitor is a Dell 2940W 1920x1600 flat panel being driven by its VGA input so reported information is suspect.  However, if I attempt to direct the X output to the external VGA with "xrandr -d :0 --output VGA1 --auto" I *do* see the X server running at what appears to be 1024x768.   If I do "xrandr -d :0 --output '(null)1' --off" the backlight goes off and "xrandr -d :0 --output '(null)1' --auto" turns it back on but I still don't see any output on that display.  Notice the somewhat suspect name of this output.

For what it is worth I don't understand exactly how the grub version is related to this but I am using grub1 (0.97, as patched by Gentoo).  The interaction may explain some other weird behavior I see.  Sometimes I never saw the grub screen after booting.  I assumed that this was some other problem.  However, twice when this has happened I've hit the enter key and then linux starts booting as witnessed by the regular non-console display text until the switch to the frame buffer goes wrong.  I could try another version of grub if that would be helpful.
Comment 50 camalot 2010-09-07 20:56:45 UTC
(In reply to comment #49)
> I see the same behavior using commit...

I left out the commit number I tested with.  It was 23a0d9e4fc8e54d08bfd23e1cf943bff48d552a5.
Comment 51 Knuth Posern 2010-09-08 07:18:02 UTC
Created attachment 38550 [details]
dmesg drm.debug=0x4 for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf

ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf

Still the same problem: Screen goes blank.
Comment 52 Knuth Posern 2010-09-08 07:45:23 UTC
Created attachment 38553 [details]
intel_reg_dump with "nomodeset" kernel parameter - otherwise same situation like in my last post

intel_reg_dump with "nomodeset" kernel parameter - otherwise same situation like in my last post
Comment 53 Knuth Posern 2010-09-08 07:46:36 UTC
Created attachment 38554 [details]
intel_reg_dump with modeset (without "nomodeset" kernel parameter) - otherwise same situation like in my last post

intel_reg_dump with modeset (without "nomodeset" kernel parameter) - otherwise same situation like in my last post
Comment 54 Jesse Barnes 2010-09-08 07:56:35 UTC
If you don't see the GRUB screen, you're probably hitting a VBIOS bug of some kind; the VBIOS should be able to set a mode whatever the state of the GPU, but in your case it sounds like it can't in all cases.

Can you attach your kernel .config so we can double check that against known issues, and maybe I can reproduce your issue here?
Comment 55 Knuth Posern 2010-09-08 08:29:30 UTC
Created attachment 38556 [details]
dmesg drm.debug=0x4 for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf

dmesg drm.debug=0x4 for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf
Comment 56 Knuth Posern 2010-09-08 08:33:34 UTC
Created attachment 38558 [details]
kernel config - for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf (for attachments 38553, 38554, 38556)

kernel config - for ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf (for attachments 38553, 38554, 38556)
Comment 57 Knuth Posern 2010-09-08 13:51:53 UTC
Created attachment 38565 [details]
uname + dmesg of "modprobe drm debug=0x04" and "modprobe i915 modeset=1" + regdump + kernel-config

uname + dmesg of "modprobe drm debug=0x04" and "modprobe i915 modeset=1" + regdump + kernel-config

kernel config now based on:
https://bugs.freedesktop.org/show_bug.cgi?id=29278
(after make oldconfig + with initramfs + device-mapper all modules + raid-0 1 in kernel)

kernel source still:
ickle kernel git tree, branch drm-intel-next, commit 434ed097245423c5ea277d18121c0fad0df42abf
Comment 58 Knuth Posern 2010-09-08 15:22:01 UTC
Created attachment 38566 [details]
uname + regdump + dmesg + kernel-config for newest commit on ickles git, branch drm-intel-next (1af5fa1b7e5ff8332f8a2ee3c5fb44d93b34868d)

j.barnes:  [drm:ironlake_edp_panel_on] *ERROR* panel on wait timed out: 0x00000000 would appear to be the current sticking point
Comment 59 camalot 2010-09-08 21:08:47 UTC
Created attachment 38572 [details]
kernel config for ickle kernel git tree, branch drm-intel-next, commit ickle-23a0d9e4fc8e54d08bfd23e1cf943bff48d552a5

My kernel config corresponding to the behavior described in comment 49.  This was manually configured with the goal of compiling in a minimal set of stuff used by the laptop and the things I usually attach.  There are a few modules due to wireless driver issues, etc.
Comment 60 Jesse Barnes 2010-09-09 14:24:18 UTC
Created attachment 38588 [details] [review]
make sure fdi tx,rx and pch dplls are enabled

This lets my Dell E6510 work without having to unlock the panel regs for power on.  Can you guys give it a try on top of Chris's drm-intel-staging branch?
Comment 61 Knuth Posern 2010-09-09 17:03:34 UTC
Created attachment 38591 [details]
Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 [details] [review]

Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 [details] [review]

RESULT:
Screen is blank, but backlight is still on very early in boot process (maybe before initramfs, and I think before init)
Comment 62 Knuth Posern 2010-09-09 17:05:21 UTC
(In reply to comment #61)
> Created an attachment (id=38591) [details]
> Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 [details] [review]

Sorry, I forgot: commit: 1af5fa1b7e5ff8332f8a2ee3c5fb44d93b34868d

Plus: Here the patch applied without "fuzz" (see my next attachment)
Comment 63 Knuth Posern 2010-09-09 17:16:40 UTC
Created attachment 38592 [details]
Chris (ickle) git repo, drm-intel-staging branch + patch from attachment 38588 [details] [review]


Commit: b7ffdc988523fb57ac1ef454b77d6ecc01dda4d3
(I locally committed the patch (don't know why ;) 
--> kernel probably shows commit 357d51d0515677e794eccd9480c595b169fb8a63)

RESULT:
Screen is OFF (black) (like blank + backlight off)
... and I think this time LATER than for the "-next + patch" (attachment 38591 [details] case) only on start of udev --> after init (and initramfs) ... (but I am not 100% sure - but can verify if you need)

P.S.: Here the patch applied with: "Hunk #1 succeeded at 769 with fuzz 1 (offset -7 lines)."
Comment 64 Knuth Posern 2010-09-09 18:30:35 UTC
Created attachment 38593 [details]
Chris (ickle) git repo, drm-intel-next branch + patch from attachment 38588 [details] [review]

Commit: df0e924883d029a8651a2a0c7b8da67a07611ed2 + patch from attachment 38588 [details] [review]

RESULT:
Screen is blank, backlight is still on
(--> eDP panel detection seems to work)
(--> no errors, so presumably a fail in setting up the dp/fdi link?)
Also there is still the error:
     [drm:ironlake_crtc_dpms] *ERROR* failed to turn off cpu pipe
Comment 65 camalot 2010-09-09 21:48:57 UTC
In my E6510 with a kernel built from ickle repo drm-intel-staging branch commit b7ffdc988523fb57ac1ef454b77d6ecc01dda4d3 plus the patch in attachment 38588 with the config in attachment 38572 [details] the behavior I get at boot is that the display turns off (goes blank plus backlight turns off) at the point I expect the framebuffer starts being used and stays that way.  It boots fine.

I tried starting X and then turning on and off the display using xrandr.  This is kind of interesting:

kea-nicira-lt i915 # time xrandr -d :0 --output '(null)1' --auto

real	0m5.299s
user	0m0.000s
sys	0m0.005s


kea-nicira-lt i915 # time xrandr -d :0 --output '(null)1' --off

real	0m0.123s
user	0m0.001s
sys	0m0.002s

These timings are repeatable and the backlight only comes on when xrandr exits when turning on the display.  Presumably this is due to this loop:


                        /* Enable CPU FDI TX PLL, always on for Ironlake */
                        temp = I915_READ(fdi_tx_reg);
                        if ((temp & FDI_TX_PLL_ENABLE) == 0) {
                                I915_WRITE(fdi_tx_reg, temp | FDI_TX_PLL_ENABLE);
                                I915_READ(fdi_tx_reg);
                                udelay(100);
                        }

I don't know anything about this hardware but it seems crazy that it should take something like 50 writes of a register before you get back the value you were trying to set unless there is something else about the state of the hardware that is blocking the write.  I have no idea what that might be though.
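
To make the timing question concrete, here is a minimal sketch of a register write followed by a bounded poll, run against a simulated register rather than real hardware. Nothing here is driver code: struct fake_reg, reg_read, reg_write, enable_bit_with_timeout and PLL_ENABLE_BIT (including its bit position) are all hypothetical, and the fake register only latches a written bit after a configurable number of reads, to mimic hardware that is slow to respond.

```c
#include <stdint.h>

#define PLL_ENABLE_BIT	(1u << 14)	/* placeholder position, for illustration only */

/* Simulated register: written bits become visible only after some reads. */
struct fake_reg {
	uint32_t value;			/* what a read currently returns */
	uint32_t pending;		/* bits written but not yet visible */
	unsigned int reads_until_latch;	/* 0 means "never latches" */
};

uint32_t reg_read(struct fake_reg *r)
{
	if (r->pending && r->reads_until_latch > 0 &&
	    --r->reads_until_latch == 0) {
		r->value |= r->pending;
		r->pending = 0;
	}
	return r->value;
}

/* Simplification: only newly set bits are modelled, clears are ignored. */
void reg_write(struct fake_reg *r, uint32_t v)
{
	r->pending = v & ~r->value;
}

/* Returns the number of polls used, 0 if already set, -1 on timeout. */
int enable_bit_with_timeout(struct fake_reg *r, uint32_t bit, int max_polls)
{
	uint32_t temp = reg_read(r);
	int i;

	if (temp & bit)
		return 0;

	reg_write(r, temp | bit);
	for (i = 1; i <= max_polls; i++) {
		if (reg_read(r) & bit)
			return i;
		/* a real driver would delay here, e.g. udelay(100) */
	}
	return -1;
}
```

With a poll count returned like this, it would be easy to log how many iterations the hardware actually needed, which might help tell a slow register apart from one that is locked by some other state.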
Comment 66 Jesse Barnes 2010-09-10 09:52:58 UTC
Two things jump out of those last two comments:
  1) CPU pipe disable still fails
  2) panel power sequencing is timing out when turning on the panel

(Note the "(null)" output names are probably due to an old libdrm; the git version has support for EmbeddedDisplayPort, but the last released version doesn't; updating should fix that.)

Unfortunately, all the fixes I've come up with so far are for machines with the panel attached directly to the CPU; I don't have one with eDP connected to the PCH.  I'll have to get one and see if I can reproduce your issues.

It sounds like our FDI training and panel sequence handling are broken for that case.

I'd expect problem (1) to be caused by the panel still being on when we try to disable the pipe.  We shouldn't be doing that...  One of the transcoder or FDI bits may also lock that register and prevent the pipe from shutting down.  If we fail to bring it down I would expect other problems later on.

Problem (2) is generally caused by not having the correct panel source bits enabled before turning on the panel.  We need the plane, pipe, PLL, FDI and transcoders all up and running before we try to enable the panel, or it will refuse to turn on.  As a workaround you can try to use PANEL_UNLOCK_REGS when starting the power on sequence.  But really, a timeout indicates a bug elsewhere, so that workaround would just be temporary.
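
The ordering requirement in problem (2) can be written down as a simple precondition check. The sketch below is illustrative only: the driver does not track state in a struct like this, and struct panel_sources, enum panel_src and first_missing_source are hypothetical names. It just encodes "plane, pipe, PLL, FDI and transcoder must all be up before the panel is enabled" in the order listed above.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; not how i915 actually tracks this state. */
struct panel_sources {
	bool plane;
	bool pipe;
	bool pll;
	bool fdi;
	bool transcoder;
};

enum panel_src {
	PANEL_SRC_OK = 0,	/* everything is up, panel may be enabled */
	PANEL_SRC_PLANE,
	PANEL_SRC_PIPE,
	PANEL_SRC_PLL,
	PANEL_SRC_FDI,
	PANEL_SRC_TRANSCODER
};

/* Report the first source that is not yet running, or PANEL_SRC_OK. */
enum panel_src first_missing_source(const struct panel_sources *s)
{
	if (!s->plane)
		return PANEL_SRC_PLANE;
	if (!s->pipe)
		return PANEL_SRC_PIPE;
	if (!s->pll)
		return PANEL_SRC_PLL;
	if (!s->fdi)
		return PANEL_SRC_FDI;
	if (!s->transcoder)
		return PANEL_SRC_TRANSCODER;
	return PANEL_SRC_OK;
}
```

A check along these lines, placed before the power-on sequence in a debug build, could name the missing piece directly instead of leaving it to be inferred from a timeout.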
Comment 67 camalot 2010-09-11 14:49:02 UTC
(In reply to comment #66)
> (Note the "(null)" output names are probably due to an old libdrm; the git
> version has support for EmbeddedDisplayPort, but the last released version
> doesn't; updating should fix that.)

I updated libdrm to commit 7ec9a1effa4f551897f91f3b017723a8adf011d9 of the repo at git://anongit.freedesktop.org/git/mesa/drm but I still see the '(null)1' name for the screen.  Is that something I should report as a separate bug against libdrm?

> Unfortunately, all the fixes I've come up with so far are for machines with the
> panel attached directly to the CPU; I don't have one with eDP connected to the
> PCH.  I'll have to get one and see if I can reproduce your issues.

I'm afraid I can't help much with making any changes (lack of knowledge and time) but I'd be happy to continue to test speculative patches.  The laptop on which I'm having this issue is not going to be used for anything until I get these resolved so I can try things out without too much concern.
Comment 68 Chris Wilson 2010-09-13 10:16:11 UTC
(In reply to comment #67)
> I updated libdrm to commit 7ec9a1effa4f551897f91f3b017723a8adf011d9 of the repo
> at git://anongit.freedesktop.org/git/mesa/drm but I still see the '(null)1'
> name for the screen.  Is that something I should report as a separate bug
> against libdrm?

No, the array of names is in xf86-video-intel which also needs to be updated.

> I'm afraid I can't help much with making any changes (lack of knowledge and
> time) but I'd be happy to continue to test speculative patches.  The laptop on
> which I'm having this issue is not going to be used for anything until I get
> these resolved so I can try things out without too much concern.

If you can keep tracking drm-intel-next and warn if things get any worse or if they magically work...
Comment 69 camalot 2010-09-13 19:22:10 UTC
(In reply to comment #68)
> No, the array of names is in xf86-video-intel which also needs to be updated.

Yup, updating xf86-video-intel to current git solves that problem.  It now shows:

Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 8192 x 8192
eDP1 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 344mm x 194mm
   1920x1080      60.0*+   40.0  
   1400x1050      60.0  
   1280x1024      60.0  
   1280x960       60.0  
   1024x768       60.0  
   800x600        60.3     56.2  
   640x480        59.9  
VGA1 disconnected (normal left inverted right x axis y axis)
HDMI1 disconnected (normal left inverted right x axis y axis)
DP1 disconnected (normal left inverted right x axis y axis)
HDMI2 disconnected (normal left inverted right x axis y axis)
HDMI3 disconnected (normal left inverted right x axis y axis)
DP2 disconnected (normal left inverted right x axis y axis)
DP3 disconnected (normal left inverted right x axis y axis)


> If you can keep tracking drm-intel-next and warn if things get any worse or if
> they magically work...

Will do.  Should I follow the project repository or Chris's (the ickle one...)?  For now I'm assuming the normal project one.
Comment 70 Knuth Posern 2010-09-15 17:47:25 UTC
I can report that not only with grub1 but also with grub2 you can avoid this bug by using the linux16/initrd16 commands of grub2 (for now I also use "set gfxpayload=text" but I have word from the grub2 community that this should not be necessary - yet unconfirmed).

AFAIK: The linux16 does basically what grub1 does: It makes grub leave the graphics alone and like this the intel drm/i915 code finds an untouched gpu which it then can use successfully :)

As ickle mentioned: Still this remains a bug, as the intel code should be capable of initializing the gpu in whatever state grub2 left it.

Also noteworthy is probably the fact that there seems to be a regression in the drm/i915 code in the sense that so far all git repo (ickle / jbarnes) kernels that identify as 2.6.36-rc* do NOT work anymore with grub1/grub2!

Please advise (e.g. if and what you need).
Comment 71 Knuth Posern 2010-09-15 17:59:51 UTC
Hi camalot.

> I updated libdrm to commit 7ec9a1effa4f551897f91f3b017723a8adf011d9 of the repo
> at git://anongit.freedesktop.org/git/mesa/drm (In reply to comment #67)

Did you build / Should I build libdrm with libkms support? There is a new option for this for the git repository version of libdrm in gentoo.

Advice would be greatly appreciated!

Knuth

P.S.: You wouldn't happen to be running gentoo, would you?
Comment 72 Michael Zugelder 2010-09-16 13:26:54 UTC
(In reply to comment #70)
> Also noteworthy is probably the fact that there seems to be a regression in the
> drm/i915 code in the sense that so far all git repo (ickle / jbarnes) kernels
> that identify as 2.6.36-rc* do NOT work anymore with grub1/grub2!

I just bisected this issue in the mainline kernel (git bisect start v2.6.35 v2.6.36-rc1) and got to commit 2bd34f6ca86b5a5f9b749624f73310820e7a93fd.
Link: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2bd34f6ca86b5a5f9b749624f73310820e7a93fd;hp=-c

Since I'm very inexperienced using git, how could I bisect this further, since this is a merge?
I guess bisecting the a2757b6 branch (since 9fe6206 is the working 2.6.35) back to the root would find the issue?
That would be just 2 commits until I hit an already tested and working commit.
Comment 73 Jesse Barnes 2010-09-21 09:11:33 UTC
An update on this.

I received my Sony Vaio with switchable graphics and installed Ubuntu 10.04 on it (just a stock install using grub2).

So far Linux can't fully light up the panel; neither the stock Ubuntu kernel nor the current drm-intel-next branch.  The backlight and panel come on, but the panel is blank.

Still debugging it, but so far I've found that disabling the pipe corresponding to the panel will time out unless the panel power is in a certain state (I can get it to shut off every other mode set if I play with the VDD power setting, or shut it down by hand using intel_reg_write after a mode set).

I don't think the pipe disable failure is the root cause of the issue though; it's likely we're programming the PCH incorrectly (either with the wrong clocks, or training improperly), so that's where I'm going to look next.

I haven't tried Grub's linux16 option yet, but will do today.
Comment 74 Jesse Barnes 2010-10-04 17:23:31 UTC
Ok, finally got some bits to show up with recent code.  Looks like a bad combination of panel power sequencing and DP training; hope to have a reasonable patch for testers tomorrow.
Comment 75 Adam Williamson 2010-10-05 13:54:37 UTC
Jesse: I have this issue and will be able to test a patch when you have one.
Comment 76 Jesse Barnes 2010-10-05 17:24:45 UTC
Please try the edp-hacks branch of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel . It works on my system and I want to confirm that it works for you guys as well.  I still have to fix the power sequencing code (not disabling the panel is not acceptable), but it should work ok hopefully.
Comment 77 Knuth Posern 2010-10-05 18:56:31 UTC
Created attachment 39202 [details]
uname + regdump + dmesg + kernel-config for newest commit on ickles git, branch edp-hacks (commit 050486e2f3f8aa97c5f8afdc9b898b00a7114493)

Nope :(

It only fixes the regression I had with I think drm-intel-next.
-->
With my linux16 grub2 parameter I can still boot and everything behaves as with my 2.6.35.4 kernel.

I tried suspend: the screen is still turned off/blank afterwards. If I then start to hibernate (with tux-on-ice), the screen is turned on for the textual progress indicator.

If I remove the "16" (i.e. use the grub2 "linux" command to start your kernel version), I get a blank screen and the machine HANGs again, so I have no regdump or anything for you :(

Attached you will find the uname + regdump + dmesg + kernel-config from the linux16 boot.

Let me know, thanks!
Comment 78 Knuth Posern 2010-10-05 18:58:12 UTC
(In reply to comment #77)
> It only fixes the regression I had with I think drm-intel-next.

Sorry I meant to explain the "regression" I was referring to:
With the drm-intel-next branch I tried my linux16 trick and it did NOT work.
Comment 79 Michael Zugelder 2010-10-05 23:28:17 UTC
(In reply to comment #76)
> Please try the edp-hacks branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel . It works on
> my system and I want to confirm that it works for you guys as well.  I still
> have to fix the power sequencing code (not disabling the panel is not
> acceptable), but it should work ok hopefully.

Booting with syslinux/grub1/grub2+linux16 works again, like in 2.6.35.
Booting "normally" results in a black screen, but working system.
If you're interested, I can make a reg dump before/after loading i915.
Comment 80 Knuth Posern 2010-10-06 13:39:48 UTC
I verified again and if I start the above kernel (branch edp-hacks (commit 050486e2f3f8aa97c5f8afdc9b898b00a7114493)) with

(a)
menuentry "Linux ACTUAL (intel testing modeset=0)" {
    insmod ext2
    set root=(hd0,msdos5)
    linux /kernels/actual drm.debug=0x04 i915.modeset=0
}

(b)
menuentry "Linux ACTUAL (intel testing modeset=1)" {
    insmod ext2
    set root=(hd0,msdos5)
    linux /kernels/actual drm.debug=0x04 i915.modeset=1
}

(a) works, but with (b) nothing is recorded in syslog, the magic SysRq key (Alt+SysRq + R E I S U B B B B) does NOT work, and there is no acpi-shutdown, no network, no hdd activity and a black screen --> the system definitely hangs ;)

And as mentioned before: (a) and (b) work if I replace linux with linux16 (in grub2).
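
The four boot combinations discussed above (grub2 "linux" vs. "linux16", each with i915.modeset=0 and 1) can be written out mechanically. The following is only a sketch: the kernel path, root device and kernel parameters are copied from entries (a) and (b), while the entry titles and the helper names menuentry and all_entries are made up for illustration.

```python
from itertools import product

# Values copied from menu entries (a) and (b) above.
KERNEL_PATH = "/kernels/actual"
ROOT_DEVICE = "(hd0,msdos5)"


def menuentry(loader, modeset):
    """Render one grub2 menuentry; the title text is illustrative only."""
    title = f"Linux ACTUAL ({loader}, modeset={modeset})"
    return "\n".join([
        f'menuentry "{title}" {{',
        "    insmod ext2",
        f"    set root={ROOT_DEVICE}",
        f"    {loader} {KERNEL_PATH} drm.debug=0x04 i915.modeset={modeset}",
        "}",
    ])


def all_entries():
    """Return the four loader/modeset combinations as menuentry strings."""
    return [menuentry(loader, modeset)
            for loader, modeset in product(("linux", "linux16"), (0, 1))]


if __name__ == "__main__":
    print("\n\n".join(all_entries()))
```

Pasting the printed entries into a custom grub2 configuration file gives one boot option per combination, which makes it easier to report exactly which of them hang.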
Comment 81 camalot 2010-10-06 13:56:46 UTC
(In reply to comment #76)
> Please try the edp-hacks branch of
> git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-intel . It works on
> my system and I want to confirm that it works for you guys as well.  I still
> have to fix the power sequencing code (not disabling the panel is not
> acceptable), but it should work ok hopefully.

Unfortunately these changes didn't work for me either.  The behavior I see is the same as before.  On KMS initialization the screen goes black with the backlight still on and stays that way.
Comment 82 Jesse Barnes 2010-10-06 19:27:50 UTC
Ok just pushed one more fix that will hopefully let things work.  Please pull the latest changes from edp-hacks and try again (and cross your fingers! :).
Comment 83 Fatih Aşıcı 2010-10-06 22:55:23 UTC
(In reply to comment #82)
> Ok just pushed one more fix that will hopefully let things work.  Please pull
> the latest changes from edp-hacks and try again (and cross your fingers! :).

Great! The last commit fixes the issue (also https://bugs.freedesktop.org/show_bug.cgi?id=27220 seems fixed). Tested on Sony VPCZ114GX.
Comment 84 Fatih Aşıcı 2010-10-06 23:36:28 UTC
BTW, dmesg has these lines:

[    0.672669] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    0.673199] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    0.674850] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807
Comment 85 Michael Zugelder 2010-10-07 03:12:26 UTC
(In reply to comment #82)
> Ok just pushed one more fix that will hopefully let things work.  Please pull
> the latest changes from edp-hacks and try again (and cross your fingers! :).

Works well. As KMS kicks in, the backlight gets turned off and on, and booting completes fine.

A bit more information about those 3 error lines:
With both Intel and NVidia graphics active (default) I get the same 3 error lines:

[    3.310646] [drm] set up 31M of stolen space
[    3.311672] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    3.312778] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    3.313666] VGA switcheroo: detected DSM switching method \_SB_.PCI0.P0P2.DGPU handle
[    3.313696] PM: Adding info for No Bus:card0-Embedded DisplayPort-1
[    3.314338] PM: Adding info for i2c:i2c-0
[    3.314566] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807

When booting with just Intel graphics (bios hack or old kernel+reboot) I get just 2:

[    2.846341] [drm] set up 31M of stolen space
[    2.847315] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    2.848120] PM: Adding info for No Bus:card0-Embedded DisplayPort-1
[    2.848763] PM: Adding info for i2c:i2c-0
[    2.848991] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807
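
To compare the two boots above without reading the logs by eye, the drm *ERROR* lines can be counted per reporting function. This is a minimal sketch: the regular expression is an assumption based only on the line format shown in this comment, the helper names are made up, and the sample text is a shortened copy of the excerpts above.

```python
import re
from collections import Counter

# Matches kernel log lines of the form seen above, e.g.
# [    0.674850] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807
DRM_ERROR = re.compile(r"\[\s*\d+\.\d+\]\s+\[drm:(\w+)\]\s+\*ERROR\*\s+(.*)")


def drm_errors(dmesg_text):
    """Return (function, message) pairs for every drm *ERROR* line."""
    found = []
    for line in dmesg_text.splitlines():
        match = DRM_ERROR.search(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


def count_by_function(dmesg_text):
    """Count drm *ERROR* lines per reporting function."""
    return Counter(func for func, _ in drm_errors(dmesg_text))


# Shortened copies of the two excerpts above.
BOTH_GPUS = """\
[    3.310646] [drm] set up 31M of stolen space
[    3.311672] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    3.312778] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    3.314566] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807
"""

INTEL_ONLY = """\
[    2.846341] [drm] set up 31M of stolen space
[    2.847315] [drm:intel_dsm_pci_probe] *ERROR* _DSM function support bitmask: 0x00000003
[    2.848991] [drm:intel_dp_init] *ERROR* fuse regs: 0x00002807
"""

if __name__ == "__main__":
    print("both GPUs :", dict(count_by_function(BOTH_GPUS)))
    print("Intel only:", dict(count_by_function(INTEL_ONLY)))
```

On a real system the input would be the saved output of dmesg rather than the embedded samples.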
Comment 86 Adam Hill 2010-10-07 07:03:21 UTC
Just to confirm that it works for me too, with the same errors (though they don't *seem* to affect anything in use).
Comment 87 Knuth Posern 2010-10-07 07:52:40 UTC
Created attachment 39256 [details]
uname + regdump + dmesg + kernel-config for newest commit on jbarnes git, branch edp-hacks (commit f0c744bbce33c64bedbe15d9785efafd5380c58c)

I can also confirm that bootup now works with linux16 and with linux.
I attached the usual regdump+dmesg, in case that might be helpful.

Plus: Shame on me:
As the 3rd line of my *last* attachment shows, I was still booting 2.6.35.5 :(
--> The kernel hangs /are/ real... but only for 2.6.35.5.
Comment 88 Jesse Barnes 2010-10-08 10:22:45 UTC
Fixed in drm-intel-next by the "PCH eDP fixes" patchset.
Comment 89 Adam Williamson 2010-10-09 12:43:03 UTC
I tested with a drm-intel-next kernel and confirmed that this is fixed, but resume from suspend fails for me. Filed that as https://bugs.freedesktop.org/show_bug.cgi?id=30738 .
Comment 90 Knuth Posern 2010-10-09 13:46:32 UTC
Yes, the patched version from Jesse works...
BUT for me it brings a lot of additional flickering!

I generated 2 videos to demonstrate what I mean:
http://posern.org/start-with-2.6.35.5_so-WITHOUT-flickering_.avi
http://posern.org/start-with-2.6.36-rc5_so-WITH-flickering_.avi

So in words: With the vanilla 2.6.35.5 I get ONE flicker when i915 loads, then NO flickering whatsoever.
With the patched version from Jesse I get NO flickering when i915 loads, BUT it flickers:
   (a) every time I start X
   (b) when I switch BACK from a console/tty to X
   (c) I think when compiz is first loaded
   (d) When I start picasa via wine

:(

I would like your feedback: Do you have/see the same problem?
Comment 91 Marcel Heß 2010-10-25 07:39:03 UTC
Created attachment 39762 [details]
dmesg.out intel-drm-next

Hello,

this is my first bug report. Please be lenient toward me and give me some feedback, so I can learn something.

I have a Sony Vaio VPCY21S1E with an Intel® Pentium® processor U5400 and Intel® HD Graphics. I think the platform is quite similar to the Core i5 one.

When I start the machine with the new drm-intel-next, it ends up with a black LCD and a working backlight. I tried an external LCD via HDMI, and it works fine.

If I start the machine with i915.modeset=0, it works. At the moment I have no X11 installed yet.

I added the output of dmesg, with the parameter drm.debug=0x4 and no external monitor.

If what I attached is not adequate, please tell me what to post.

Thanks for your attention!
Comment 92 Jorge Maroto 2010-11-12 04:02:29 UTC
Hello Marcel.

There is a specific report for this hardware (although it has not advanced yet).

https://bugs.freedesktop.org/show_bug.cgi?id=30758

Check it out.
Comment 93 Marcel Heß 2010-11-19 05:07:02 UTC
Thank you Jorge :)

(In reply to comment #92)
> Hello Marcel.
> 
> There is a specific report for this hardware (although it has not advanced yet).
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=30758
> 
> Check it out.
Comment 94 Adam Williamson 2010-12-21 09:37:58 UTC
Jesse, has this regressed, or has the fix not gone into 2.6.37? Current 2.6.37 builds do not work for me: the panel is blank after KMS init (but the system isn't hung now; a blind switch to a console, logging in as root and running 'poweroff' switches off the system).
Comment 95 Jesse Barnes 2010-12-21 11:16:36 UTC
Yes, this has regressed for some people.  My earlier fixes for my Vaio broke things for other people, so I need to come up with a better solution.  I found another fix for my Vaio that's safe for others, but it doesn't work everywhere.

We're tracking the revert and related issues in https://bugs.freedesktop.org/show_bug.cgi?id=31988.