Bug 96761 - skylake dual monitor display freezes when using xserver-master, fine with xserver-1.18.x
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2016-07-01 10:35 UTC by Hans de Goede
Modified: 2017-03-08 14:39 UTC

i915 platform: SKL
i915 features: display/Other


Attachments
dmesg of hang (via ssh) with drm.debug=14 (243.63 KB, text/plain)
2016-07-07 10:53 UTC, Hans de Goede

Description Hans de Goede 2016-07-01 10:35:38 UTC
Hi,

I've been experiencing this freeze a couple of minutes (1-5 min) after logging into a GNOME 3 session, usually while using Firefox.

I'm seeing this with Fedora kernels 4.6.3 and 4.7-rc4 (*), both of which contain a backport of this series:
http://www.spinics.net/lists/intel-gfx/msg95595.html

Fedora carries that backport because it fixes a ton of Skylake issues Fedora users are experiencing.

This works fine (excellent even) with the xserver shipped with Fedora-24, which is 1.18.3. But when I build the xserver from upstream master (as I'm a gfx developer working on it), I hit this issue: the screen freezes and Xorg is in an unkillable state. I can still ssh in and do a "reboot --force".

I'm using the modesetting driver in both cases.

When this happens dmesg shows:

jun 30 18:33:31 shalem.localdomain kernel: ------------[ cut here ]------------
jun 30 18:33:31 shalem.localdomain kernel: WARNING: CPU: 1 PID: 1279 at drivers/gpu/drm/drm_modeset_lock.c:184 drm_modeset_lock_crtc+0xf0/0x100 [drm]
jun 30 18:33:31 shalem.localdomain kernel: Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat tun ebtable_filter ebtables xt_physdev br_netfilter bridge stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip6table_filter nf_conntrack ip6_tables joydev snd_usb_audio intel_rapl snd_usbmidi_lib snd_rawmidi x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek tuner_simple tuner_types snd_hda_codec_generic tuner ppdev mxm_wmi snd_hda_intel msp3400 vfat fat irqbypass crct10dif_pclmul crc32_pclmul bttv snd_hda_codec snd_hda_core tea575x tveeprom snd_bt87x videobuf_dma_sg snd_hwdep ghash_clmulni_intel hci_uart snd_seq btbcm btqca snd_seq_device btintel videobuf_core bluetooth
jun 30 18:33:31 shalem.localdomain kernel:  rc_core v4l2_common videodev snd_pcm media snd_timer snd soundcore i2c_i801 shpchp parport_pc pinctrl_sunrisepoint parport wmi rfkill pinctrl_intel intel_lpss_acpi intel_lpss acpi_pad acpi_als kfifo_buf industrialio tpm_tis tpm binfmt_misc uas usb_storage i915 e1000e crc32c_intel i2c_algo_bit drm_kms_helper drm serio_raw ptp nvme pps_core nvme_core video i2c_hid fjes i2c_dev
jun 30 18:33:31 shalem.localdomain kernel: CPU: 1 PID: 1279 Comm: Xorg Not tainted 4.6.3-300.fc24.x86_64 #1
jun 30 18:33:31 shalem.localdomain kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B150M Pro4S/D3, BIOS P1.70 01/25/2016
jun 30 18:33:31 shalem.localdomain kernel:  0000000000000286 0000000062edda06 ffff880068fe7cb0 ffffffff813dbcff
jun 30 18:33:31 shalem.localdomain kernel:  0000000000000000 0000000000000000 ffff880068fe7cf0 ffffffff810a740b
jun 30 18:33:31 shalem.localdomain kernel:  000000b84c89f028 ffff8807dc038980 ffff88084c89f028 ffff88084c0e6000
jun 30 18:33:31 shalem.localdomain kernel: Call Trace:
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff813dbcff>] dump_stack+0x63/0x84
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff810a740b>] __warn+0xcb/0xf0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff810a753d>] warn_slowpath_null+0x1d/0x20
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc0076c00>] drm_modeset_lock_crtc+0xf0/0x100 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc006cafe>] drm_mode_page_flip_ioctl+0x7e/0x310 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc005c922>] drm_ioctl+0x152/0x540 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc006ca80>] ? drm_mode_gamma_get_ioctl+0xe0/0xe0 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff813e4b89>] ? timerqueue_add+0x59/0xb0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff8111556d>] ? enqueue_hrtimer+0x3d/0x90
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff8125d4a3>] do_vfs_ioctl+0xa3/0x5d0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff8125da49>] SyS_ioctl+0x79/0x90
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff817dd3f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
jun 30 18:33:31 shalem.localdomain kernel: ---[ end trace dc5cdee6ba5718c5 ]---
jun 30 18:33:31 shalem.localdomain kernel: ------------[ cut here ]------------
jun 30 18:33:31 shalem.localdomain kernel: WARNING: CPU: 2 PID: 1287 at drivers/gpu/drm/drm_modeset_lock.c:233 drm_modeset_unlock_crtc+0x45/0x50 [drm]
jun 30 18:33:31 shalem.localdomain kernel: Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat tun ebtable_filter ebtables xt_physdev br_netfilter bridge stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip6table_filter nf_conntrack ip6_tables joydev snd_usb_audio intel_rapl snd_usbmidi_lib snd_rawmidi x86_pkg_temp_thermal coretemp kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek tuner_simple tuner_types snd_hda_codec_generic tuner ppdev mxm_wmi snd_hda_intel msp3400 vfat fat irqbypass crct10dif_pclmul crc32_pclmul bttv snd_hda_codec snd_hda_core tea575x tveeprom snd_bt87x videobuf_dma_sg snd_hwdep ghash_clmulni_intel hci_uart snd_seq btbcm btqca snd_seq_device btintel videobuf_core bluetooth
jun 30 18:33:31 shalem.localdomain kernel:  rc_core v4l2_common videodev snd_pcm media snd_timer snd soundcore i2c_i801 shpchp parport_pc pinctrl_sunrisepoint parport wmi rfkill pinctrl_intel intel_lpss_acpi intel_lpss acpi_pad acpi_als kfifo_buf industrialio tpm_tis tpm binfmt_misc uas usb_storage i915 e1000e crc32c_intel i2c_algo_bit drm_kms_helper drm serio_raw ptp nvme pps_core nvme_core video i2c_hid fjes i2c_dev
jun 30 18:33:31 shalem.localdomain kernel: CPU: 2 PID: 1287 Comm: Xorg Tainted: G        W       4.6.3-300.fc24.x86_64 #1
jun 30 18:33:31 shalem.localdomain kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B150M Pro4S/D3, BIOS P1.70 01/25/2016
jun 30 18:33:31 shalem.localdomain kernel:  0000000000000286 000000006b79f9de ffff8808300e7ca0 ffffffff813dbcff
jun 30 18:33:31 shalem.localdomain kernel:  0000000000000000 0000000000000000 ffff8808300e7ce0 ffffffff810a740b
jun 30 18:33:31 shalem.localdomain kernel:  000000e900000000 0000000000000000 ffff88084c89f068 ffff88084c821400
jun 30 18:33:31 shalem.localdomain kernel: Call Trace:
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff813dbcff>] dump_stack+0x63/0x84
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff810a740b>] __warn+0xcb/0xf0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff810a753d>] warn_slowpath_null+0x1d/0x20
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc0076655>] drm_modeset_unlock_crtc+0x45/0x50 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc00680b9>] drm_mode_cursor_common+0x89/0x180 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc006b910>] drm_mode_cursor_ioctl+0x50/0x70 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc005c922>] drm_ioctl+0x152/0x540 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff81610f80>] ? evdev_read+0xd0/0x2d0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffffc006b8c0>] ? drm_mode_setcrtc+0x520/0x520 [drm]
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff8125d4a3>] do_vfs_ioctl+0xa3/0x5d0
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff8125da49>] SyS_ioctl+0x79/0x90
jun 30 18:33:31 shalem.localdomain kernel:  [<ffffffff817dd3f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
jun 30 18:33:31 shalem.localdomain kernel: ---[ end trace dc5cdee6ba5718c6 ]---

So it seems that the newer xserver or modesetting driver somehow triggers a locking issue in the kernel.

Regards,

Hans




*) I've also tried drm-intel/drm-intel-next, but with that:
1) The gdm Wayland login session only shows a cursor on an otherwise black screen.
2) Starting at runlevel 3 and then running startx only lights up the primary monitor of my two 1920x1080 monitors, and the display on that one flickers. The secondary monitor gets no signal, even though GNOME / X think it is getting a signal (xrandr settings show it as on, the same as always).
Note that no messages show up in dmesg in either case.
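
Editor's note: the two warnings in the description both report "Comm: Xorg" but different PIDs (1279 and 1287), and they are reached through different ioctls (drm_mode_page_flip_ioctl and drm_mode_cursor_ioctl). That suggests, but does not prove, that two Xorg tasks touched the CRTC lock at about the same time. A minimal sketch for pulling those facts out of a saved dmesg follows; the function name summarize_warnings and the regular expression are illustrative assumptions, not part of any kernel or Xorg tooling.

```python
import re

# Matches the kernel warning header:
#   "WARNING: CPU: n PID: n at file:line func+0x.../0x... [module]"
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)


def summarize_warnings(log_text):
    """Return one dict per kernel WARNING header found in log_text."""
    found = []
    for m in WARN_RE.finditer(log_text):
        found.append({
            "cpu": int(m.group("cpu")),
            "pid": int(m.group("pid")),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "func": m.group("func"),
        })
    return found


# The two warning headers copied from the description above.
SAMPLE = """\
jun 30 18:33:31 shalem.localdomain kernel: WARNING: CPU: 1 PID: 1279 at drivers/gpu/drm/drm_modeset_lock.c:184 drm_modeset_lock_crtc+0xf0/0x100 [drm]
jun 30 18:33:31 shalem.localdomain kernel: WARNING: CPU: 2 PID: 1287 at drivers/gpu/drm/drm_modeset_lock.c:233 drm_modeset_unlock_crtc+0x45/0x50 [drm]
"""

if __name__ == "__main__":
    for w in summarize_warnings(SAMPLE):
        print(w["pid"], w["func"], f'{w["file"]}:{w["line"]}')
```

Running the same function over a full "dmesg" or "journalctl -k" capture shows at a glance whether every warning comes from the same task or from several.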
Comment 1 yann 2016-07-04 18:12:10 UTC
I am not seeing an execution path going through i915 in your backtrace.
Can you attach a full log with drm.debug=14?
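
Editor's note: drm.debug is a bitmask. The sketch below decodes the requested value, assuming the bit assignments given in the drm module's parameter description for kernels of this era (CORE 0x01, DRIVER 0x02, KMS 0x04, PRIME 0x08, ATOMIC 0x10, VBL 0x20). Check the running kernel with "modinfo drm", since the set of bits has changed over time. The table and the function name decode_drm_debug are illustrative.

```python
# Bit meanings of the drm.debug module parameter. These are an assumption
# taken from the drm module parameter description of kernels of this era;
# verify against the running kernel before relying on them.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by value."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


if __name__ == "__main__":
    # 14 == 0x0e, the value requested in the comment above.
    print(decode_drm_debug(14))
```

Under that assumption, 14 (0x0e) turns on driver, KMS, and PRIME messages while leaving the noisier core messages off.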
Comment 2 Hans de Goede 2016-07-07 10:53:58 UTC
Created attachment 124942 [details]
dmesg of hang (via ssh) with drm.debug=14
Comment 3 Jari Tahvanainen 2017-03-08 09:57:26 UTC
Hello Hans. Sorry for the delay in getting back to you. Do you still see this problem with the latest Fedora (with the newest kernel) and X? If you can still reproduce this problem with the latest versions, please attach a new dmesg (like you did in comment 2). Please also check whether card0/error has data and, if so, attach the output of "cat /sys/class/drm/card0/error | bzip2 > error.bz2".
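
Editor's note: the compressor's command name is bzip2, not bz2. The request above can also be scripted so that nothing is written when there is no error state. This is a minimal sketch under two assumptions: that the file may contain a placeholder line beginning with "no error state" when nothing has been captured, and that reading it usually needs root. The function name save_error_state is hypothetical.

```python
import bz2
from pathlib import Path


def save_error_state(src="/sys/class/drm/card0/error", dst="error.bz2"):
    """Compress the GPU error state at src into dst.

    Returns True if data was written, and False if src is missing,
    unreadable, empty, or only holds a "no error state" placeholder.
    """
    try:
        data = Path(src).read_bytes()
    except OSError:
        return False
    stripped = data.strip()
    # Assumption: the driver reports the absence of an error state with a
    # line that starts with "no error state" (case may vary by kernel).
    if not stripped or stripped.lower().startswith(b"no error state"):
        return False
    Path(dst).write_bytes(bz2.compress(data))
    return True


if __name__ == "__main__":
    if save_error_state():
        print("wrote error.bz2")
    else:
        print("no error state to attach")
```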
Comment 4 Hans de Goede 2017-03-08 10:07:02 UTC
Hi,

(In reply to Jari Tahvanainen from comment #3)
> Hello Hans. Sorry for the delay in getting back to you. Do you still see
> this problem with the latest Fedora (with the newest kernel) and X? If you
> can still reproduce this problem with the latest versions, please attach a
> new dmesg (like you did in comment 2). Please also check whether card0/error
> has data and, if so, attach the output of
> "cat /sys/class/drm/card0/error | bzip2 > error.bz2".

I had completely forgotten about this bug. In other words, this has long been fixed; closing.

Regards,

Hans
Comment 5 yann 2017-03-08 14:39:20 UTC
(In reply to Hans de Goede from comment #4)
> Hi,
> 
> (In reply to Jari Tahvanainen from comment #3)
> > Hello Hans. Sorry for the delay in getting back to you. Do you still see
> > this problem with the latest Fedora (with the newest kernel) and X? If you
> > can still reproduce this problem with the latest versions, please attach a
> > new dmesg (like you did in comment 2). Please also check whether card0/error
> > has data and, if so, attach the output of
> > "cat /sys/class/drm/card0/error | bzip2 > error.bz2".
> 
> I had completely forgotten about this bug. In other words, this has long been fixed; closing.
> 
> Regards,
> 
> Hans

Thanks Hans

