Bug 94912 - Stuttering delays in graphics, sound, USB after resume from suspend (i915)
Status: CLOSED DUPLICATE of bug 89055
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2016-04-13 03:27 UTC by dwk128
Modified: 2017-07-24 22:42 UTC

See Also:
i915 platform: SKL
i915 features: power/suspend-resume


Attachments
syslog while the stuttering crash is happening (177.62 KB, text/plain)
2016-04-13 03:27 UTC, dwk128

Description dwk128 2016-04-13 03:27:18 UTC
Created attachment 122887
syslog while the stuttering crash is happening

I have a Dell XPS 13 9350 running Debian. Suspend and hibernate are unreliable (I'll talk more about suspend because it seems closer to working). I am using the i915 graphics driver and I replaced the wireless card with an Intel 7265 (using driver iwlwifi).

With stock Debian kernels, the system wakes up immediately whenever I try to suspend. I suspect i915 is preventing the system from sleeping. I have tried 4.2.0, 4.3.0, 4.4.0-rc4, 4.4.0-1, and 4.5.0-trunk (from experimental), all with the same result (with i915.preliminary_hw_support as needed). There is nothing in dmesg about this; I tried the . The only clue is a stack trace right when graphics are first enabled (the login screen shows for about a second, then blinks black for half a second and recovers):

[    4.343461] ------------[ cut here ]------------
[    4.343493] WARNING: CPU: 3 PID: 24 at /build/linux-T0g1Eb/linux-4.5/drivers/gpu/drm/i915/intel_pm.c:3553 skl_update_other_pipe_wm+0x15d/0x170 [i915]()
[    4.343494] WARN_ON(!wm_changed)
[    4.343519] Modules linked in: bnep binfmt_misc hid_multitouch intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel nls_utf8 nls_cp437 arc4 i2c_designware_platform i2c_designware_core vfat dell_laptop dell_wmi dcdbas fat sparse_keymap iwlmvm mac80211 sha256_ssse3 sha256_generic hmac iwlwifi rtsx_pci_ms drbg ansi_cprng memstick aesni_intel cfg80211 aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd efi_pstore pcspkr serio_raw efivars snd_hda_intel snd_hda_codec i2c_i801 snd_hda_core idma64 snd_hwdep virt_dma snd_pcm snd_timer snd soundcore shpchp mei_me mei intel_lpss_pci btusb btrtl i915 drm_kms_helper drm i2c_algo_bit processor_thermal_device intel_soc_dts_iosf wmi battery hci_uart btbcm btqca btintel bluetooth rfkill
[    4.343536]  video int3400_thermal acpi_thermal_rel int3403_thermal button int340x_thermal_zone intel_lpss_acpi intel_lpss tpm_tis acpi_pad ac tpm processor acpi_als kfifo_buf industrialio joydev hid_generic usbhid evdev uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media parport_pc ppdev lp parport efivarfs autofs4 ext4 crc16 mbcache jbd2 dm_mod rtsx_pci_sdmmc mmc_core crc32c_intel psmouse nvme rtsx_pci mfd_core xhci_pci xhci_hcd usbcore usb_common fan thermal i2c_hid hid fjes
[    4.343538] CPU: 3 PID: 24 Comm: kworker/3:0 Tainted: G     U          4.5.0-trunk-amd64 #1 Debian 4.5-1~exp1
[    4.343539] Hardware name: Dell Inc. XPS 13 9350/07TYC2, BIOS 1.0.0 09/10/2015
[    4.343544] Workqueue: events output_poll_execute [drm_kms_helper]
[    4.343546]  0000000000000286 000000002635a511 ffffffff81301755 ffff8802b6817a38
[    4.343547]  ffffffffc05eb828 ffffffff8107904d ffff8802b6263000 ffff8802b6817a90
[    4.343549]  ffff8802b5389000 ffff8802b53893a8 ffff8802b6817aac ffffffff810790dc
[    4.343549] Call Trace:
[    4.343553]  [<ffffffff81301755>] ? dump_stack+0x5c/0x77
[    4.343556]  [<ffffffff8107904d>] ? warn_slowpath_common+0x7d/0xb0
[    4.343557]  [<ffffffff810790dc>] ? warn_slowpath_fmt+0x5c/0x80
[    4.343578]  [<ffffffffc0528d2d>] ? skl_update_other_pipe_wm+0x15d/0x170 [i915]
[    4.343596]  [<ffffffffc0528ec1>] ? skl_update_wm+0x181/0x5c0 [i915]
[    4.343624]  [<ffffffffc05afdfc>] ? intel_ddi_enable_transcoder_func+0x17c/0x260 [i915]
[    4.343651]  [<ffffffffc0593e68>] ? haswell_crtc_enable+0x308/0x880 [i915]
[    4.343677]  [<ffffffffc058f8f5>] ? intel_atomic_commit+0x6d5/0x16e0 [i915]
[    4.343692]  [<ffffffffc048d355>] ? drm_atomic_check_only+0x185/0x600 [drm]
[    4.343704]  [<ffffffffc048dc02>] ? drm_atomic_add_affected_connectors+0x22/0xe0 [drm]
[    4.343710]  [<ffffffffc04f3193>] ? restore_fbdev_mode+0x223/0x250 [drm_kms_helper]
[    4.343715]  [<ffffffffc04f523e>] ? drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70 [drm_kms_helper]
[    4.343719]  [<ffffffffc04f52a9>] ? drm_fb_helper_set_par+0x29/0x50 [drm_kms_helper]
[    4.343723]  [<ffffffffc04f51c5>] ? drm_fb_helper_hotplug_event+0xc5/0x110 [drm_kms_helper]
[    4.343727]  [<ffffffffc04e884b>] ? output_poll_execute+0x18b/0x1d0 [drm_kms_helper]
[    4.343729]  [<ffffffff810909fa>] ? process_one_work+0x15a/0x410
[    4.343730]  [<ffffffff81090cfd>] ? worker_thread+0x4d/0x480
[    4.343732]  [<ffffffff81090cb0>] ? process_one_work+0x410/0x410
[    4.343734]  [<ffffffff81096b3d>] ? kthread+0xcd/0xf0
[    4.343736]  [<ffffffff81096a70>] ? kthread_create_on_node+0x190/0x190
[    4.343739]  [<ffffffff815ac68f>] ? ret_from_fork+0x3f/0x70
[    4.343740]  [<ffffffff81096a70>] ? kthread_create_on_node+0x190/0x190
[    4.343742] ---[ end trace 680ca7b652016555 ]---


So I've been using drm-intel-nightly kernels instead. They produce the same warning trace that you see above. Suspend (+ resume) works with them for a while, but I run into a stuttering bug which I'll detail below. I first used 4.4.0-994-generic when I got the system in November, then more recently 4.6.0-994-generic (compiled April 1st), and I also compiled a 4.5 kernel from source using the same config to see how that worked. No luck.

What happens with all of these kernels is that suspend seems to work, but occasionally after a resume there will be a lot of stuttering in graphics. When I watch a video, it will play fine if the cursor is in the corner of the screen and it will stutter terribly if the cursor is even one pixel away from the edge. I assume this is a 2D blending bug. My sound starts stuttering at the same time, very severely, like maybe only 60% of audio packets are getting through. My USB subsystem starts dropping communications, and USB keyboards, mice, and hubs stop working (sometimes a keyboard will infinitely press keys until I unplug it). USB 3 hubs are worse (higher frequency interrupts I guess); I actually bought multiple USB 2 hubs with dedicated on/off toggle buttons to try and mitigate this problem.

Once the stuttering has begun, suspend still works, but upon resume the stuttering will never disappear. I have to reboot the system to get back to normal. Unfortunately the system works just well enough that I often continue working, cursing at my keyboard and mice, unplugging and replugging them every three minutes or so :( I can usually last two or three days, suspending several times a day, before the bug hits. So it happens maybe one in twenty times.

I have gone through the dmesg logs several times. There are often messages from the USB subsystem e.g.

Feb 23 15:20:33 foo kernel: [98592.061593] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
Feb 23 15:20:33 foo kernel: [98592.091441] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 29
Feb 23 15:20:33 foo kernel: [98592.091444] evbug: Event. Dev: input4, Type: 1, Code: 29, Value: 2
Feb 23 15:20:33 foo kernel: [98592.091445] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
Feb 23 15:20:33 foo kernel: [98592.121606] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 29
Feb 23 15:20:33 foo kernel: [98592.121610] evbug: Event. Dev: input4, Type: 1, Code: 29, Value: 2
Feb 23 15:20:33 foo kernel: [98592.121611] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
Feb 23 15:20:33 foo kernel: [98592.151943] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 29

but I'm pretty sure that the root cause is i915 because I also sometimes see i915 stack traces when I suspend (after the bug is triggered). It must mess up some internal kernel timers or something. Here is one such trace:

Apr 12 20:44:44 foo kernel: [125705.196800] ------------[ cut here ]------------
Apr 12 20:44:44 foo kernel: [125705.196818] WARNING: CPU: 1 PID: 30705 at drivers/gpu/drm/i915/intel_pm.c:3553 skl_update_other_pipe_wm+0x147/0x150 [i915]()
Apr 12 20:44:44 foo kernel: [125705.196819] WARN_ON(!wm_changed)
Apr 12 20:44:44 foo kernel: [125705.196843] Modules linked in: nls_utf8 btrfs xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c drbg ansi_cprng ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables snd_hda_codec_hdmi bnep binfmt_misc dell_led snd_hda_codec_realtek snd_hda_codec_generic hid_multitouch intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp arc4 kvm_intel kvm dell_laptop irqbypass dcdbas i2c_designware_platform nls_iso8859_1 dell_wmi crct10dif_pclmul i2c_designware_core sparse_keymap crc32_pclmul ghash_clmulni_intel snd_soc_skl iwlmvm snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp aesni_intel mac80211 snd_hda_ext_core aes_x86_64 lrw snd_soc_sst_match gf128mul snd_soc_core glue_helper ablk_helper cryptd snd_compress iwlwifi ac97_bus snd_pcm_dmaengine dw_dmac_core rtsx_pci_ms cfg80211 memstick pcspkr i915 snd_hda_intel snd_hda_codec snd_hda_core uvcvideo serio_raw snd_hwdep videobuf2_vmalloc snd_pcm videobuf2_memops videobuf2_v4l2 snd_timer videobuf2_core snd videodev i2c_i801 soundcore drm_kms_helper media drm joydev input_leds idma64 i2c_algo_bit virt_dma fb_sys_fops btusb mei_me syscopyarea btrtl sysfillrect mei sysimgblt shpchp processor_thermal_device intel_soc_dts_iosf intel_lpss_pci wmi hci_uart btbcm btqca btintel bluetooth video intel_lpss_acpi intel_lpss int3403_thermal int340x_thermal_zone int3400_thermal acpi_thermal_rel acpi_pad mac_hid acpi_als kfifo_buf industrialio parport_pc ppdev lp parport autofs4 hid_generic usbhid rtsx_pci_sdmmc psmouse rtsx_pci nvme i2c_hid hid pinctrl_sunrisepoint pinctrl_intel fjes
Apr 12 20:44:44 foo kernel: [125705.196873] CPU: 1 PID: 30705 Comm: kworker/u8:17 Tainted: G     U  W       4.5.0intsusp #1
Apr 12 20:44:44 foo kernel: [125705.196874] Hardware name: Dell Inc. XPS 13 9350/07TYC2, BIOS 1.0.0 09/10/2015
Apr 12 20:44:44 foo kernel: [125705.196877] Workqueue: events_unbound async_run_entry_fn
Apr 12 20:44:44 foo kernel: [125705.196879]  0000000000000286 00000000c5dc1ff8 ffff880100b5b948 ffffffff813d9213
Apr 12 20:44:44 foo kernel: [125705.196880]  ffff880100b5b990 ffffffffc05797f0 ffff880100b5b980 ffffffff8107ebf2
Apr 12 20:44:44 foo kernel: [125705.196881]  ffff8802b68ab000 ffff880100b5ba0c ffff880036179bd4 ffff8802b52f0000
Apr 12 20:44:44 foo kernel: [125705.196882] Call Trace:
Apr 12 20:44:44 foo kernel: [125705.196885]  [<ffffffff813d9213>] dump_stack+0x63/0x90
Apr 12 20:44:44 foo kernel: [125705.196888]  [<ffffffff8107ebf2>] warn_slowpath_common+0x82/0xc0
Apr 12 20:44:44 foo kernel: [125705.196889]  [<ffffffff8107ec8c>] warn_slowpath_fmt+0x5c/0x80
Apr 12 20:44:44 foo kernel: [125705.196899]  [<ffffffffc04b25b7>] skl_update_other_pipe_wm+0x147/0x150 [i915]
Apr 12 20:44:44 foo kernel: [125705.196908]  [<ffffffffc04b272a>] skl_update_wm+0x16a/0x5d0 [i915]
Apr 12 20:44:44 foo kernel: [125705.196923]  [<ffffffffc053d3ff>] ? intel_ddi_enable_transcoder_func+0x17f/0x260 [i915]
Apr 12 20:44:44 foo kernel: [125705.196933]  [<ffffffffc04b604e>] intel_update_watermarks+0x1e/0x20 [i915]
Apr 12 20:44:44 foo kernel: [125705.196947]  [<ffffffffc0520b61>] haswell_crtc_enable+0x321/0x8c0 [i915]
Apr 12 20:44:44 foo kernel: [125705.196961]  [<ffffffffc051c40e>] intel_atomic_commit+0x73e/0x1880 [i915]
Apr 12 20:44:44 foo kernel: [125705.196975]  [<ffffffffc02db511>] ? drm_atomic_check_only+0x181/0x600 [drm]
Apr 12 20:44:44 foo kernel: [125705.196984]  [<ffffffffc02db9c7>] drm_atomic_commit+0x37/0x60 [drm]
Apr 12 20:44:44 foo kernel: [125705.196998]  [<ffffffffc0526aff>] intel_display_resume+0x10f/0x150 [i915]
Apr 12 20:44:44 foo kernel: [125705.197006]  [<ffffffffc04a10dd>] i915_drm_resume+0xdd/0x170 [i915]
Apr 12 20:44:44 foo kernel: [125705.197014]  [<ffffffffc04a1195>] i915_pm_resume+0x25/0x30 [i915]
Apr 12 20:44:44 foo kernel: [125705.197016]  [<ffffffff8142a2c4>] pci_pm_resume+0x64/0xa0
Apr 12 20:44:44 foo kernel: [125705.197017]  [<ffffffff8142a260>] ? pci_pm_thaw+0x90/0x90
Apr 12 20:44:44 foo kernel: [125705.197020]  [<ffffffff81548a7e>] dpm_run_callback+0x4e/0x130
Apr 12 20:44:44 foo kernel: [125705.197021]  [<ffffffff81549013>] device_resume+0xd3/0x1f0
Apr 12 20:44:44 foo kernel: [125705.197023]  [<ffffffff8154914d>] async_resume+0x1d/0x50
Apr 12 20:44:44 foo kernel: [125705.197024]  [<ffffffff810a0a48>] async_run_entry_fn+0x48/0x150
Apr 12 20:44:44 foo kernel: [125705.197025]  [<ffffffff81097af5>] process_one_work+0x165/0x480
Apr 12 20:44:44 foo kernel: [125705.197027]  [<ffffffff81097e5b>] worker_thread+0x4b/0x500
Apr 12 20:44:44 foo kernel: [125705.197028]  [<ffffffff81097e10>] ? process_one_work+0x480/0x480
Apr 12 20:44:44 foo kernel: [125705.197030]  [<ffffffff8109e048>] kthread+0xd8/0xf0
Apr 12 20:44:44 foo kernel: [125705.197031]  [<ffffffff8109df70>] ? kthread_create_on_node+0x1a0/0x1a0
Apr 12 20:44:44 foo kernel: [125705.197034]  [<ffffffff8181500f>] ret_from_fork+0x3f/0x70
Apr 12 20:44:44 foo kernel: [125705.197035]  [<ffffffff8109df70>] ? kthread_create_on_node+0x1a0/0x1a0
Apr 12 20:44:44 foo kernel: [125705.197036] ---[ end trace 3cc60a24822ab8de ]---

See attached syslog. I got the crash sometime earlier today and tried suspending a few times at 10:30 before giving up and rebooting.
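
For anyone going through a long syslog like the attached one, the warning blocks can be pulled out mechanically instead of by eye. The following is only a rough sketch in Python, not an official tool: it assumes the log uses the same "cut here" / "WARNING: ... at file:line symbol+off/len" / "end trace" layout as the two traces quoted above, and the function name `extract_warnings` is made up for illustration.

```python
import re

# Markers as they appear in the traces quoted in this report.
CUT_HERE = "[ cut here ]"
END_TRACE = re.compile(r"---\[ end trace ([0-9a-f]+) \]---")
WARNING = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\S+)")


def extract_warnings(lines):
    """Yield one dict per 'cut here' ... 'end trace' block found in lines."""
    current = None
    for line in lines:
        if CUT_HERE in line:
            # Start of a new warning block.
            current = {"file": None, "line": None, "symbol": None,
                       "frames": 0, "trace_id": None}
            continue
        if current is None:
            continue  # not inside a block
        m = WARNING.search(line)
        if m:
            current["file"] = m.group(1)
            current["line"] = int(m.group(2))
            # Drop the "+0x147/0x150" offset part, keep the function name.
            current["symbol"] = m.group(3).split("+")[0]
        elif "[<" in line:
            # Call trace entries look like " [<ffffffff...>] func+0x.../0x..."
            current["frames"] += 1
        m = END_TRACE.search(line)
        if m:
            current["trace_id"] = m.group(1)
            yield current
            current = None
```

Calling `list(extract_warnings(open("syslog")))` (the file name is a placeholder) gives one entry per warning, which makes it easy to see whether every trace comes from the same source line, such as intel_pm.c:3553 in both traces above.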

Please let me know how I should debug this. It's incredibly frustrating. I'm willing to compile my own kernels, try patches, etc. I see many people online running Ubuntu on exactly the same hardware and suspend seems to work for them. It's possible that my Debian is slightly different, or there could even be a firmware or hardware bug in my unit. If I knew what the problem was I could address it, but I don't... :(
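
One thing that can be checked from the logs already quoted: in the Linux input event codes, Type 1 is EV_KEY, Code 29 is KEY_LEFTCTRL, and Value 2 on a key event means autorepeat (Type 4 / Code 4 is EV_MSC / MSC_SCAN, Type 0 is EV_SYN). So the evbug excerpt above shows a Ctrl key being reported as held, with repeats roughly 30 ms apart, which would fit the "keyboard infinitely presses keys" symptom. Below is a minimal Python sketch that summarizes such repeats per device and key; the helper name `autorepeat_summary` is hypothetical, and the regular expression only assumes the evbug line format seen in this report.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... kernel: [98592.091444] evbug: Event. Dev: input4, Type: 1, Code: 29, Value: 2
EVBUG = re.compile(
    r"\[\s*(\d+\.\d+)\] evbug: Event\. "
    r"Dev: (\S+), Type: (\d+), Code: (\d+), Value: (-?\d+)"
)

EV_KEY = 1          # event type for key events (linux/input-event-codes.h)
KEY_AUTOREPEAT = 2  # EV_KEY value: 0 = release, 1 = press, 2 = autorepeat


def autorepeat_summary(lines):
    """Return {(dev, code): (count, mean_gap_seconds or None)} for autorepeat events."""
    stamps = defaultdict(list)
    for line in lines:
        m = EVBUG.search(line)
        if not m:
            continue
        ts, dev, typ, code, val = m.groups()
        if int(typ) == EV_KEY and int(val) == KEY_AUTOREPEAT:
            stamps[(dev, int(code))].append(float(ts))

    summary = {}
    for key, ts in stamps.items():
        # Gaps between consecutive autorepeat events for the same key.
        gaps = [b - a for a, b in zip(ts, ts[1:])]
        mean_gap = sum(gaps) / len(gaps) if gaps else None
        summary[key] = (len(ts), mean_gap)
    return summary
```

A long run of autorepeat events for one key with no matching release (Value 0) in between would be a hint that the key-up event was lost rather than that the key was physically held.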
Comment 1 Jani Nikula 2016-06-16 07:33:00 UTC
Please reopen if the problem persists with current drm-intel-nightly.

*** This bug has been marked as a duplicate of bug 89055 ***
