Bug 29005

Summary: [Arrandale] blanks at xrandr after plugging VGA
Product: xorg
Reporter: Brice Goglin <brice.goglin>
Component: Driver/intel
Assignee: Carl Worth <cworth>
Status: RESOLVED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: critical
Priority: medium
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                          Flags
dmesg from 2.6.34-1-amd64                            none
Xorg log                                             none
output of intel_gpu_dump                             none
Backtrace that appeared in dmesg a bit later         none
Experimental patch                                   none
Another minor patch in the vicinity.                 none
And a cleanup patch!                                 none
1280901139-8922-1-git-send-email-airlied@gmail.com   none

Description Brice Goglin 2010-07-11 01:48:37 UTC
Hello,

Laptop is Dell Latitude E6410 with
00:02.0 VGA compatible controller [0300]: Intel Corporation Core Processor Integrated Graphics Controller [8086:0046] (rev 02)

It runs intel driver 2.12.0, libdrm 2.4.21, Xserver 1.7.7, mesa 7.7.1, with Debian's 2.6.34-1 or Debian's latest 2.6.32-5 or my custom 2.6.35-rc4.

When I plug in a VGA monitor (tried several different monitors and video projectors) and run xrandr, X blanks and freezes. If I boot with VGA plugged in, I get a black screen as soon as KMS starts.

It looks like I just can't use VGA together with another output. The only way I got VGA to work is to boot with the lid closed and VGA plugged in. This works fine as long as the lid is closed when KMS starts. Then I can keep the lid open, even when X starts, and the internal panel will remain off. But then I can't re-enable the internal panel later. It's marked as disconnected and I found no way to force it back on (tried adding modes, ...).

This may be related to bug#28811 except that:
* The problem seems specific to the VGA output: I don't have any problem when using a DVI monitor on this machine (plugged into output HDMI1 through a docking station), but I didn't test all outputs intensively
* The crash occurs at xrandr after plugging VGA, not when disconnecting later

Brice
Comment 1 Brice Goglin 2010-07-11 01:49:12 UTC
Created attachment 36941 [details]
dmesg from 2.6.34-1-amd64
Comment 2 Brice Goglin 2010-07-11 01:49:40 UTC
Created attachment 36942 [details]
Xorg log
Comment 3 Brice Goglin 2010-07-11 01:53:24 UTC
Created attachment 36943 [details]
output of intel_gpu_dump

Note that /sys/kernel/debug/dri/0/i915_error_state didn't report any problem.
Comment 4 Brice Goglin 2010-07-11 01:54:27 UTC
Created attachment 36944 [details]
Backtrace that appeared in dmesg a bit later

It's probably related to the bug since it talks about Ironlake, but I don't see why it appeared later (about 30 s to 1 min later), and I can't tell whether it's an oops or something else.
Comment 5 Chris Wilson 2010-07-11 02:05:07 UTC
I wonder if the backtrace is a soft-lockup from:

while ((I915_READ(transconf_reg) & TRANS_STATE_ENABLE) == 0)
    ;

I think there are reasonable similarities between this and bug 28911.
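
For illustration only: a minimal user-space sketch of why an unbounded poll like the one above can show up as a soft lockup, and how a bounded poll avoids spinning forever. This is not the i915 code and not any of the attached patches. The register, the bit value, and the names fake_read(), wait_for_bit() and FAKE_TRANS_STATE_ENABLE are hypothetical stand-ins invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit, standing in for a "state enabled" flag. */
#define FAKE_TRANS_STATE_ENABLE (1u << 30)

static uint32_t fake_reg;         /* simulated register contents */
static unsigned reads_until_set;  /* simulated latency, in reads; 0 = bit never sets */

/* Simulated register read: the bit appears on the Nth read, if ever. */
static uint32_t fake_read(void)
{
    if (reads_until_set > 0 && --reads_until_set == 0)
        fake_reg |= FAKE_TRANS_STATE_ENABLE;
    return fake_reg;
}

/*
 * Bounded poll: give up after max_polls reads instead of spinning
 * forever when the hardware never sets the bit.
 */
static bool wait_for_bit(uint32_t mask, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (fake_read() & mask)
            return true;
    }
    return false; /* timed out; the caller can log and recover */
}
```

With reads_until_set left at 0 the bit never appears, which is the case where the unbounded loop would spin indefinitely; wait_for_bit() returns false instead, so the caller gets a chance to report the failure.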
Comment 6 Chris Wilson 2010-07-12 03:43:24 UTC
Created attachment 36959 [details] [review]
Experimental patch
Comment 7 Chris Wilson 2010-07-12 03:46:36 UTC
Created attachment 36961 [details] [review]
Another minor patch in the vicinity.
Comment 8 Chris Wilson 2010-07-12 03:47:10 UTC
Created attachment 36962 [details] [review]
And a cleanup patch!
Comment 9 Brice Goglin 2010-07-12 10:08:22 UTC
Applied all 3 patches on top of 2.6.35-rc4, no change.

By the way, your first patch wrongly removes the semi-colon at the end of the while loop. And I had to remove the call to intel_update_fbc() since it's not there in 2.6.35-rc4. Do you need me to test your patches on top of another drm tree?

I can confirm that this is a soft lockup, I found the message in another instance of the backtrace in dmesg.
Comment 10 Chris Wilson 2010-07-12 10:16:19 UTC
(In reply to comment #9)
> Applied all 3 patches on top of 2.6.35-rc4, no change.

Ok, something a bit more sinister then.
 
> By the way, your first patch wrongly removes the semi-colon at the end of the
> while loop. And I had to remove the call to intel_update_fbc() since it's not
> there in 2.6.35-rc4. Do you need me to test your patches on top of another drm
> tree?

I tend to work against Eric's drm-intel-next unless I am targeting a stable tree.

Keith mentioned that he has seen some spurious hangs on DPMS-on on gm45, so we may be doing something wrong on the common DP paths -- which is drifting away from the original bug...
Comment 11 Chris Wilson 2010-08-04 01:40:42 UTC
Created attachment 37568 [details] [review]
1280901139-8922-1-git-send-email-airlied@gmail.com

Can you please test this patch by Dave Airlie? Thanks.
Comment 12 Brice Goglin 2010-08-04 04:44:41 UTC
Tested the patch on top of 2.6.35-rc4, no change.
Comment 13 Corona 2010-08-05 04:39:59 UTC
I have an E6510 with an intel core i5 processor and intel integrated graphics. I believe the hardware is the same as yours.

For me, this issue was solved in the latest drm-intel-next kernel. I had to revert a patch which caused a regression giving me a blank screen on boot (see https://bugzilla.kernel.org/show_bug.cgi?id=16496), but now the VGA output works without problems.
Comment 14 Chris Wilson 2010-08-05 05:39:05 UTC
It pays to be cautious and not assume that the same model from the same manufacturer contains the same components. Lenovo, for example, is infamous for shipping wildly different configurations under the same model number, and they all seem to fail in different ways.

Thanks for the feedback, as indeed quite a few patches for Arrandale are now upstream in drm-core-next (and drm-intel-next) and now seems to be a good point to check whether one of those holds the fix.
Comment 15 Corona 2010-08-05 07:02:01 UTC
It might indeed be possible that the e6510 is slightly different from the e6410. There is also a bug about a black screen when resuming from suspend, and the fix that is reported to work for the e6410 doesn't solve the issue on my e6510. Anyway, here's the output from sudo lshw :
*-cpu
          description: CPU
          product: Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz
          vendor: Intel Corp.
          physical id: 4
          bus info: cpu@0
          version: 6.5.2
          serial: 0002-0652-0000-0000-0000-0000
          slot: CPU 1
          size: 2399MHz
          capacity: 4GHz

If you need somebody to test patches, let me know. 

(In reply to comment #14)
> It pays to be cautious and not assume that the same model from the same
> manufacturer contains the same components. Lenovo, for example, is infamous for
> shipping wildly different configurations under the same model number, and they
> all seem to fail in different ways.
> 
> Thanks for the feedback, as indeed quite a few patches for Arrandale are now
> upstream in drm-core-next (and drm-intel-next) and now seems to be a good point
> to check whether one of those holds the fix.
Comment 16 Corona 2010-08-05 07:04:16 UTC
And here's the info about the gpu:

*-display
             description: VGA compatible controller
             product: Core Processor Integrated Graphics Controller
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 02
             width: 64 bits
             clock: 33MHz
             capabilities: msi pm bus_master cap_list rom
             configuration: driver=i915 latency=0
             resources: irq:48 memory:90000000-903fffff memory:80000000-8fffffff ioport:70b0(size=8)
Comment 17 Brice Goglin 2010-08-05 07:18:28 UTC
Indeed, drm-core-next gives a blank screen at boot, and it finally works after reverting the commit specified in https://bugzilla.kernel.org/show_bug.cgi?id=16496

Looks like we can close this bug? I'll subscribe to the above one.
Comment 18 Chris Wilson 2010-08-05 08:27:50 UTC
Thank you Brice and Corona for testing the patches and reporting back.
