Description
Mathieu Desnoyers
2012-03-08 15:09:01 UTC
Created attachment 58213 [details]
dmesg of working and non-working runs with drm.debug=0xe
Another piece of information on this issue: I tried the monitor with an Apple PowerBook running MacOS X, and the monitor works flawlessly (monitor always brought up, and the same video works without glitch with vlc). I also simplified my routine that successfully bring the monitor into a working state: running 5 to 20 times the following script sometimes ends up doing the trick: xrandr --output DP1 --off sleep 1 xrandr --output DP1 --mode 2560x1600 But most of the time I must resign to put my laptop in suspend-to-ram: when I bring it back from suspend-to-ram, the screen usually shows up fine. Please let me know if I can provide more information on the issue than the detailed logs (success/fail) below. I would also like to know if there is a knob somewhere in the Intel driver I could play with to provide more time for the screen to send its EDID information. Based on this info: http://www.intel.com/support/graphics/sb/CS-028366.htm, Intel provide a modified Windows driver that might target this kind of issue, and I assume they possibly let more time than the spec requires for the screen to send EDID info. Also, if there is specific timing information you need, let me know which functions you want me to trace, so I can use my LTTng kernel tracer for what it's made for. :) Reference to the original LKML thread I posted on this issue (for cross-reference): https://lkml.org/lkml/2012/1/21/149 Turning of audio might help, can you please try this: xrandr --output DP1 --mode 2560x1600 --set audio off (In reply to comment #4) > Turning of audio might help, can you please try this: > > xrandr --output DP1 --mode 2560x1600 --set audio off I tried it with "--set audio off", and I get the same success/failure rate as without. Currently, the one way I achieve getting my DisplayPort to work at 2560x1600 each and every time is by putting my laptop in sleep mode and bringing it back up. Thanks for the suggestion, Mathieu In that case, can you please install intel-gpu-tools and attach the output of intel_reg_dumper, once right after suspend (i.e. in the working case) and once when you set the xrandr configuration manually, and things are broken. Created attachment 59087 [details]
intel_reg_dumper, ok run 1
First reg dump with screen working correctly.
Created attachment 59088 [details]
intel_reg_dumper, fail run
Dumping registers in state where screen is black, going to power saving mode (experienced failure).
Created attachment 59089 [details]
intel_reg_dumper, ok run 2
A second reg dump (after doing to laptop sleep and going back) to have a second OK run, so the delta between the two OK runs can be discarded from diffs with the failed run.
What I see as interesting parts of the diffs between register dumps: OK runs: PCH_DREF_CONTROL: 0x00001400 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 disable, ssc4 disable) FAIL run: PCH_DREF_CONTROL: 0x00001402 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 enable, ssc4 disable) (ssc1 is enabled in the failed run) Other possibly interesting different fields: OK: TRANS_HTOTAL_C: 0x00000000 (1 active, 1 total) TRANSC_DP_LINK_N2: 0x00ffffff (val 0xffffff 16777215) FAIL: TRANS_HTOTAL_C: 0x20800000 (1 active, 8321 total) TRANSC_DP_LINK_N2: 0x00000000 (val 0x0 0) The PCH_PP_STATUS and PCH_PP_CONTROL fields do not seem to be that interesting, since each of the two OK runs have different values for those. > --- Comment #10 from Mathieu Desnoyers <mathieu.desnoyers@efficios.com> 2012-03-26 13:15:49 PDT --- > What I see as interesting parts of the diffs between register dumps: > > OK runs: > PCH_DREF_CONTROL: 0x00001400 (cpu source disable, ssc_source enable, > nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 > disable, ssc4 disable) > > FAIL run: > PCH_DREF_CONTROL: 0x00001402 (cpu source disable, ssc_source enable, > nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 > enable, ssc4 disable) > > (ssc1 is enabled in the failed run) That's rather interesting. Can you boot with i915.lvds_use_ssc=0 appended to your kernel cmdline? That should disable the use of ssc. If that helps, please attach the output of lspic -vmmn, so we can add a corresponding quirk for your machine. Created attachment 59101 [details] intel_reg_dumper, fail run, boot with i915.lvds_use_ssc=0 cmdline arg Interesting diff between ok and fail-nossc: < PCH_DREF_CONTROL: 0x00001400 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 disable, ssc4 disable) --- > PCH_DREF_CONTROL: 0x00001402 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 enable, ssc4 disable) (i915.lvds_use_ssc=0 seems to still have ssc1 enable, is it normal ?) < PCH_DPLL_A: 0x88046004 (enable, sdvo high speed no, mode LVDS, p2 Div 14, FPA0 P1 3, FPA1 P1 3, refclk SSC, sdvo/hdmi mul 1) --- > PCH_DPLL_A: 0x88040004 (enable, sdvo high speed no, mode LVDS, p2 Div 14, FPA0 P1 3, FPA1 P1 3, refclk default 120Mhz, sdvo/hdmi mul 1) I'm a bit puzzled as PCH_DREF_CONTROL is written once just initialisation, it should not vary at runtime, not even for DPMS or mode changes. I'm decently confused, so I need more information. Can you please grab the _full_ dmesg (the one you've attached is cut of at the top) with drm.debug=0xe after one suspend cycle (i.e. boot into broken state, then fix it with suspend/resume). Please make sure that you don't loose any dmesg output at the beginning of the boot. Correction re Chris' comment: We _do_ reinit the PCH_DREF_CONTROL on resume by calling ilk_init_pch_refclk from i915_drm_thaw. Which makes it even more puzzling that a resume cycle magically fixes this. Hopefully the full drm debug log will shed some light on this ... Created attachment 59156 [details]
Complete dmesg output with drm.debug=0xe
In this dmesg, you will find the complete execution including the following timestamps:
[ 19.055042] eth0: no IPv6 routers present
-> this is the point where my login screen appears, at lower resolution, so I get something shown on my DP screen.
[ 57.864385] [drm:intel_crtc_cursor_set], cursor off
-> this timestamp is where my DP screen became blank, following attempt to switch to 2560x1600 after login.
[ 145.001230] [drm:intel_crtc_cursor_set], cursor off
-> this is after suspend/resume cycle of the laptop, where the DP screen shows up OK at 2560x1600.
Hoping this helps!
Thanks,
Mathieu
Created attachment 59162 [details] [review] properly clear SSC1 Not much more clue yet, but there's a small thing I've noticed. Please test the attached patch. Created attachment 59226 [details]
Complete dmesg output with drm.debug=0xe with "Properly clear ssc1" patch applied
It turns out the patch to properly clear ssc1 does not help. Here is the full dmesg with the patch applied. If I can provide more detailed information about some code paths (by e.g. adding tracing instrumentation or adding printk), please let me know what functions sites/variables are of particular interest.
In this dmesg, you will find the complete execution including the following
timestamps:
[ 18.283007] eth0: no IPv6 routers present
-> this is the point where my login screen appears, at lower resolution, so I
get something shown on my DP screen.
[ 45.274438] [drm:intel_crtc_cursor_set], cursor off
-> this timestamp is where my DP screen became blank, following attempt to
switch to 2560x1600 after login.
[ 67.982331] [drm:intel_crtc_cursor_set], cursor off
-> this is after suspend/resume cycle of the laptop, where the DP screen shows
up OK at 2560x1600.
Hm, my hopes vanished. Can you please again grab the output of intel_reg_dumper with the SSC1 clear patch attached from the bad state? Created attachment 59280 [details] intel_reg_dumper, fail run, with "properly clear SSC1" patch applied Interesting part of the diff between prior "failed" runs and this failed run with patch applied: < PCH_DREF_CONTROL: 0x00001402 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 enable, ssc4 disable) --- > PCH_DREF_CONTROL: 0x00001400 (cpu source disable, ssc_source enable, nonspread_source enable, superspread_source disable, ssc4_mode downspread, ssc1 disable, ssc4 disable) so it looks like this patch does indeed clear the ssc1 enable flag, but it does not fix the underlying problem. (In reply to comment #18) > Created attachment 59226 [details] > Complete dmesg output with drm.debug=0xe with "Properly clear ssc1" patch > applied > > It turns out the patch to properly clear ssc1 does not help. Here is the full > dmesg with the patch applied. If I can provide more detailed information about > some code paths (by e.g. adding tracing instrumentation or adding printk), > please let me know what functions sites/variables are of particular interest. > > In this dmesg, you will find the complete execution including the following > timestamps: > > [ 18.283007] eth0: no IPv6 routers present > -> this is the point where my login screen appears, at lower resolution, so I > get something shown on my DP screen. > > [ 45.274438] [drm:intel_crtc_cursor_set], cursor off > -> this timestamp is where my DP screen became blank, following attempt to > switch to 2560x1600 after login. > > [ 67.982331] [drm:intel_crtc_cursor_set], cursor off > -> this is after suspend/resume cycle of the laptop, where the DP screen shows > up OK at 2560x1600. There are interesting activity differences between the last sequence that lead to the failed and OK states. The failed state seems to have some activity triggered by a "drm:intel_ironlake_crt_detect_hotplug", while the sequence that leads to the OK state (when coming back from suspend) does not have this: Just before failed state: [ 39.334403] [drm:intel_crtc_cursor_set], [ 39.334408] [drm:intel_crtc_cursor_set], cursor off [ 39.334416] [drm:intel_crtc_cursor_set], [ 39.334419] [drm:intel_crtc_cursor_set], cursor off [ 39.336658] [drm:intel_crtc_cursor_set], [ 39.336669] [drm:intel_crtc_cursor_set], [ 40.497637] [drm:intel_ironlake_crt_detect_hotplug], ironlake hotplug adpa=0xf40000, result 0 [ 40.497644] [drm:intel_crt_detect], CRT not detected via hotplug [ 40.497648] [drm:output_poll_execute], [CONNECTOR:12:VGA-1] status updated from 2 to 2 [ 40.509056] [drm:output_poll_execute], [CONNECTOR:15:HDMI-A-1] status updated from 2 to 2 [ 40.509372] [drm:intel_dp_detect], DPCD: 110a840101000100 [ 40.510427] [drm:i2c_algo_dp_aux_xfer], dp_aux_xfer return 2 [ 40.537548] [drm:i2c_algo_dp_aux_xfer], dp_aux_xfer return 2 [ 40.546890] wlan0: authenticate with 00:10:18:90:20:db (try 1) [ 40.549653] wlan0: authenticated [ 40.550595] wlan0: associate with 00:10:18:90:20:db (try 1) [ 40.552925] wlan0: RX AssocResp from 00:10:18:90:20:db (capab=0x411 status=0 aid=3) [ 40.552931] wlan0: associated [ 40.552938] wlan0: moving STA 00:10:18:90:20:db to state 1 [ 40.552942] wlan0: moving STA 00:10:18:90:20:db to state 2 [ 40.560345] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready [ 40.564755] [drm:i2c_algo_dp_aux_xfer], dp_aux_xfer return 2 [ 40.564761] [drm:drm_detect_monitor_audio], Monitor has basic audio support [ 40.564765] [drm:output_poll_execute], [CONNECTOR:18:DP-1] status updated from 1 to 1 [ 40.584823] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni) [ 40.587655] wlan0: moving STA 00:10:18:90:20:db to state 3 [ 45.274417] [drm:intel_crtc_cursor_set], [ 45.274424] [drm:intel_crtc_cursor_set], cursor off [ 45.274434] [drm:intel_crtc_cursor_set], [ 45.274438] [drm:intel_crtc_cursor_set], cursor off Just before OK state: [ 62.548521] [drm:drm_mode_getconnector], [CONNECTOR:18:?] [ 62.548943] [drm:intel_crtc_cursor_set], [ 62.548950] [drm:intel_crtc_cursor_set], [ 63.107716] [drm:intel_crtc_cursor_set], [ 63.107721] [drm:intel_crtc_cursor_set], cursor off [ 63.107728] [drm:intel_crtc_cursor_set], [ 63.107731] [drm:intel_crtc_cursor_set], cursor off [ 63.157934] [drm:intel_crtc_cursor_set], [ 63.157942] [drm:intel_crtc_cursor_set], [ 63.203489] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx [ 66.461456] [drm:intel_crtc_cursor_set], [ 66.461462] [drm:intel_crtc_cursor_set], cursor off [ 66.461472] [drm:intel_crtc_cursor_set], [ 66.461475] [drm:intel_crtc_cursor_set], cursor off [ 66.538491] [drm:intel_crtc_cursor_set], [ 66.538502] [drm:intel_crtc_cursor_set], [ 67.982314] [drm:intel_crtc_cursor_set], [ 67.982320] [drm:intel_crtc_cursor_set], cursor off [ 67.982328] [drm:intel_crtc_cursor_set], [ 67.982331] [drm:intel_crtc_cursor_set], cursor off I also used LTTng 2.0 to trace the kernel to see where those "cursor_set" calls were coming from, and it seems to be an ioctl from the Xorg server. So if the cursor_set ever needs to happen in a specific timing with respect to the screen, it may get delayed by Xorg preemption due to gnome (and others) initialization. (further thoughts on this hypothesis are welcome) I also tried disabling wlan entirely from my kernel config, as it appears in the log just before the failed state (I thought it might be causing a timing-related issue). But it did not fix things. I also tried to run Xorg at nice -20, and it did not help. I also tried to disable the if(intel_crt_detect_hotplug(connector)) branch within intel_crt_detect (replacing it by an if (0)), but it did not help. I also tried changing "DRM_OUTPUT_POLL_PERIOD (10*HZ)" to "(30*HZ)", but not effect. Other ideas are welcome, Thanks, Mathieu First success! I tried with Linux master branch, currently at commit 9613bebb223dea3179c265dc31e1bb41ae39f321 and it works fine without any patch applied. Let me know if there are things I could do to spot the fix and help backporting it to older stable kernels. From what I know, the fix came in between v3.3 and 9613bebb223dea3179c265dc31e1bb41ae39f321. Thanks, Mathieu Well, let's just be happy then. If you could, attaching a intel_reg_dump with the fixed version would be great (for documentation). I'll close this as fixed now, please reopen if it blows up again. And thanks for reporting this issue and doing all the testing. Created attachment 59299 [details] [review] intel_reg_dumper, ok, with 3.3 kernel and backported drm/i2c from 9613bebb223dea3179c265dc31e1bb41ae39f321 Diff between failed setup and this fixed kernel: 34c34 < DSPASURF: 0x047ab000 --- > DSPASURF: 0x03754000 47c47 < DSPBSURF: 0x047ab000 --- > DSPBSURF: 0x03754000 165c165 < BLC_PWM_CPU_CTL: 0x000002a3 --- > BLC_PWM_CPU_CTL: 0x00000ef4 A patch referencing this bug report has been merged in Linux v3.4-rc2: commit e77166b5a653728f312d07e60a80819d1c54fca4 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Mar 30 22:14:05 2012 +0200 drm/i915: properly clear SSC1 bit in the pch refclock init code Closing - patch merged 2012. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.