Bug 58436

Summary: [GM45] System freeze when plugging/unplugging the VGA connector
Product: DRI
Reporter: Lionel Duriez <lionel.duriez>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED NOTOURBUG
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: jani.nikula
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
See Also: https://bugzilla.kernel.org/show_bug.cgi?id=66771
i915 platform:
i915 features:
Attachments:
- Netconsole log when unplugging the VGA connector
- dmesg obtained after boot with drm.debug=0xe
- new dmesg obtained after boot with drm.debug=0xe (flags: none)

Description Lionel Duriez 2012-12-17 21:34:25 UTC
The system freezes when plugging or unplugging the VGA connector on a Sony Vaio VGN-SR19XN with an Intel 4500MHD graphics card (Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)).
The problem happens on both Ubuntu 12.10 and Ubuntu 13.04, within a gnome session and also in console mode after exiting lightdm using "sudo service lightdm stop".

See below for the tested configurations.

1) Ubuntu 12.10 configuration

Kernel: 3.5.0-19 x86_64
dpkg -l | grep intel
ii  intel-gpu-tools 1.3-0ubuntu2 amd64 tools for debugging the Intel graphics driver
ii  libdrm-intel1:amd64 2.4.39-0ubuntu1 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii  whois 5.0.19 amd64 intelligent WHOIS client
ii  xserver-xorg-video-intel 2:2.20.9-0ubuntu2 amd64 X.Org X server -- Intel i8xx, i9xx display driver

2) Ubuntu 13.04 configuration (daily build 2012/12/15)

Kernel: 3.7.0-6 x86_64
dpkg -l | grep intel
ii  intel-gpu-tools 1.3-0ubuntu2 amd64 tools for debugging the Intel graphics driver
ii  libdrm-intel1:amd64 2.4.40-1 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii  xserver-xorg-video-intel 2:2.20.14-0ubuntu1 amd64 X.Org X server -- Intel i8xx, i9xx display driver
Comment 1 Chris Wilson 2012-12-17 21:38:11 UTC
How hard is the freeze? Is the machine still accessible over the network? Are you able to switch to a VT?
Comment 2 Lionel Duriez 2012-12-17 21:51:41 UTC
The freeze is total:
- the machine is no longer accessible over the network
- the mouse and the keyboard no longer respond, so it is not possible to switch to another VT

In a nutshell I would declare it legally dead.
Comment 3 Daniel Vetter 2012-12-17 23:31:47 UTC
Hm, that's pretty bad. If you have a second machine, can you try to set up netconsole logging (will require some yelling&screaming to get that going, I can try to dig out howtos if you want)? Once you have that set up, enable full drm debugging with

# echo 0xf > /sys/module/drm/parameter/debug

while the netconsole is logged on the other side, then hang your machine with the VGA unplug/plug. If we're lucky the netconsole output has some useful information.
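
For reference, a minimal sketch of the sender-side netconsole setup. The `netconsole=` parameter layout follows the kernel's netconsole documentation; every port, IP address, interface name and MAC address below is a placeholder, not a value from this report. The script only prints the modprobe command so it can be checked before it is run.

```shell
#!/bin/sh
# Build the netconsole= module parameter string:
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
# All values passed in below are placeholders (assumptions), not from this bug.
netconsole_param() {
    # $1 = source port, $2 = source IP, $3 = network device,
    # $4 = target port, $5 = target IP, $6 = target MAC address
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

param=$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)

# Print the command instead of running it, so it can be reviewed first.
echo "sudo modprobe netconsole netconsole=$param"
```

On the receiving machine, a UDP listener on the target port (for example a netcat variant) collects the output; the exact flags depend on which netcat is installed.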
Comment 4 Lionel Duriez 2012-12-18 21:35:27 UTC
Created attachment 71754 [details]
Netconsole log when unplugging the VGA connector
Comment 5 Lionel Duriez 2012-12-18 21:41:28 UTC
I have grabbed the logs with netconsole, following this howto: https://wiki.ubuntu.com/Kernel/Netconsole

On Ubuntu, the drm debug file is in a directory called "parameters" so to enable full drm debugging I used:

echo 0xf > /sys/module/drm/parameters/debug

I don't know if the logs contain useful information; they end with these 2 lines:

[  277.344263] [drm:i965_irq_handler], hotplug event received, stat 0x08000800
[  277.344291] [drm:i915_hotplug_work_func], running encoder hotplug functions

Hope this helps.
Comment 6 Daniel Vetter 2012-12-18 21:46:26 UTC
Hm, seems to be fairly garbled, which can happen if too much goes over console and the netconsole tx/rx side gets overloaded (it doesn't resend obviously ...). It looks though as if we attempt a modeset, so new stuff for you to test:
- Maybe try to enable the netconsole only at runtime, ime that helps with such cases (it's already garbled in early boot ...).
- Please start X with a really dumb desktop environment (failsafe with just a shell) to check whether the automatic modeset is the issue, or the hotplug handling itself. You can't just stop X, since the kms fbcon will also do an automatic modeset.
- Please boot with VGA plugged in and enable/disable the VGA output with xrandr manually (your DE might again get in the way, some do a lot of automagic for display configuration).

If we're really unlucky, the above works and the kernel only dies when the modeset follows the hotplug right away ...
Comment 7 Daniel Vetter 2012-12-18 21:48:38 UTC
Disregard my comment, I've looked at the wrong netconsole output - too many bugs ...

It seems to indeed die right when the hotplug handling is run, which is strange. I need to cook further debug patches. For reference, can you please boot with drm.debug=0xe added to your kernel cmdline and attach the complete dmesg? It'll tell us tons about what your hw looks like.
Comment 8 Lionel Duriez 2012-12-18 22:37:42 UTC
Created attachment 71767 [details]
dmesg obtained after boot with drm.debug=0xe
Comment 9 Daniel Vetter 2013-03-20 11:54:10 UTC
The "non asle set request??" line is a bit interesting, which happens when probing the VGA output:

[    9.492349] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:9:VGA-1]
[    9.492356] non asle set request??

Can you please check whether that always happens (you need drm.debug=0xe enabled, ofc)? You should be able to force a probe cycle by running xrandr.
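
One possible way to run that check is sketched below. It assumes the message text stays exactly "non asle set request" and that the live kernel ring buffer is read through the `dmesg` command (a saved log file may only hold boot-time messages). The helper name `count_asle` is made up for this example, and the sample text is the two lines quoted above.

```shell
#!/bin/sh
# Count "non asle set request" lines in kernel log text read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing the count.
count_asle() {
    grep -c 'non asle set request' || true
}

# On the affected machine (not run here):
#   xrandr >/dev/null      # force a probe cycle
#   dmesg | count_asle     # count occurrences in the live ring buffer

# Self-contained demonstration using the two lines quoted in this comment.
sample='[    9.492349] [drm:drm_helper_probe_single_connector_modes], [CONNECTOR:9:VGA-1]
[    9.492356] non asle set request??'
n=$(printf '%s\n' "$sample" | count_asle)
echo "matches in sample: $n"
```

If the count grows by one after each xrandr run, the message is tied to every probe of the VGA output rather than to boot only.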
Comment 10 Lionel Duriez 2013-03-21 22:22:35 UTC
Created attachment 76881 [details]
new dmesg obtained after boot with drm.debug=0xe

Obtained with boot parameter drm.debug=14
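
Note that drm.debug=14 and drm.debug=0xe are the same value. A small sketch of how the bitmask breaks down, assuming the DRM_UT_* bit layout of kernels from this period (CORE=0x1, DRIVER=0x2, KMS=0x4, PRIME=0x8); later kernels define more bits, and the helper name `decode_drm_debug` is made up for this example.

```shell
#!/bin/sh
# Decode a drm.debug value into the message categories it enables.
# Bit names are an assumption based on the DRM_UT_* flags of 3.x kernels.
decode_drm_debug() {
    val=$(( $1 ))          # accepts decimal (14) or hex (0xe)
    out=""
    [ $(( val & 1 )) -ne 0 ] && out="$out CORE"
    [ $(( val & 2 )) -ne 0 ] && out="$out DRIVER"
    [ $(( val & 4 )) -ne 0 ] && out="$out KMS"
    [ $(( val & 8 )) -ne 0 ] && out="$out PRIME"
    echo "${out# }"        # strip the leading space
}

decode_drm_debug 14     # same mask as 0xe
decode_drm_debug 0xe
decode_drm_debug 0xf    # additionally enables CORE messages
```

So 0xf, as used at runtime earlier in this report, adds the core messages on top of what 0xe logs.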
Comment 11 Lionel Duriez 2013-03-21 22:32:52 UTC
No logs are written to /var/logs/dmesg when running xrandr with drm.debug=0xe.
I have tried xrandr and xrandr --output VGA1 --auto

I have attached the dmesg file obtained after booting the system.
Comment 12 Daniel Vetter 2013-11-18 17:47:35 UTC
Presuming fixed on latest kernel versions, please reopen if this is still broken. Thanks.
Comment 13 Lionel Duriez 2013-11-23 09:41:57 UTC
The problem is still present with latest kernel 3.12.0.
Tested on Ubuntu 13.10 64 bit.
Comment 14 Daniel Vetter 2013-11-25 09:37:50 UTC
Shot in the dark: If you disable acpi on the kernel cmdline, does this still happen? I wonder whether we have a hw issue at hand here or whether this is just a really ugly interaction with the firmware.
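
For the test itself, `acpi=off` is the kernel parameter that disables ACPI; for a one-off boot it can be appended at the boot loader prompt. Below is a rough sketch for confirming that the running kernel really was booted with it. The helper name `has_param` and the example command line are made up for illustration.

```shell
#!/bin/sh
# Succeed if the given kernel command line contains the exact parameter.
has_param() {
    # $1 = command line text, $2 = parameter to look for
    for word in $1; do
        [ "$word" = "$2" ] && return 0
    done
    return 1
}

# On a running system: has_param "$(cat /proc/cmdline)" acpi=off
# The command line below is a made-up example, not taken from this report.
cmdline='BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet splash acpi=off'

if has_param "$cmdline" acpi=off; then
    echo "ACPI disabled on this boot"
else
    echo "ACPI still enabled"
fi
```

Checking the command line first avoids drawing conclusions from a boot where the parameter was mistyped or dropped.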
Comment 15 Lionel Duriez 2013-11-30 09:19:40 UTC
The problem disappears when acpi is off.
Comment 16 Daniel Vetter 2013-12-04 08:42:32 UTC
Cool, occasionally a wild guess succeeds. Looks like your acpi implementation is doing something truly nasty behind our backs. Can you please file a bug on bugzilla.kernel.org for this issue and link to this report here?

If it turns out it's an issue on the gfx side then we can easily move it back on kernel bz to the intel driver.
Comment 17 Lionel Duriez 2013-12-08 09:42:43 UTC
Thanks Daniel!
Kernel bug report:
