Bug 99024 - [SNB] WARNING: CPU: 0 PID: 1334 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_display.c:3906 intel_atomic_commit+0x559/0x560 [i915]
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-12-08 09:07 UTC by Laurent Bonnaud
Modified: 2017-02-24 14:19 UTC

See Also:
i915 platform: SNB
i915 features: GPU hang


Attachments
Full dmesg from boot (29.67 KB, application/x-bzip)
2016-12-08 09:08 UTC, Laurent Bonnaud
/sys/class/drm/card0/error content (117.71 KB, application/x-bzip)
2016-12-08 09:16 UTC, Laurent Bonnaud
Content of /sys/devices/pci0000:00/0000:00:02.0/rom (64.00 KB, application/octet-stream)
2016-12-08 09:20 UTC, Laurent Bonnaud

Description Laurent Bonnaud 2016-12-08 09:07:35 UTC
Hi,

here is a GPU hang I got with this configuration:
 - CPU/GPU : Core i7-2640M
 - system : Ubuntu 16.10
 - kernel: linux-image-4.8.12-040812-generic (Ubuntu mainline kernel)

This kind of event is quite rare, so I cannot reproduce it at will on a drm-intel-nightly kernel.

However I will try to provide as much information as possible...

[231716.438851] ------------[ cut here ]------------
[231716.438881] WARNING: CPU: 0 PID: 1334 at /home/kernel/COD/linux/drivers/gpu/drm/i915/intel_display.c:3906 intel_atomic_commit+0x559/0x560 [i915]
[231716.438882] Removing stuck page flip
[231716.438905] Modules linked in: ccm ses enclosure scsi_transport_sas ufs qnx4 hfsplus hfs minix ntfs msdos libcrc32c cpuid uas usb_storage nls_iso8859_1 mmc_block rfcomm cmac bnep dm_crypt dell_wmi sparse_keymap dell_rbtn dell_laptop dell_smbios dcdbas dell_smm_hwmon binfmt_misc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi irqbypass btusb crct10dif_pclmul crc32_pclmul snd_hda_codec_idt snd_hda_codec_generic ghash_clmulni_intel btrtl btbcm snd_hda_intel aesni_intel btintel snd_hda_codec bluetooth snd_hda_core snd_hwdep snd_pcm aes_x86_64 lrw glue_helper ablk_helper cryptd snd_seq_midi snd_seq_midi_event snd_rawmidi intel_cstate intel_rapl_perf arc4 joydev input_leds serio_raw iwlmvm snd_seq mac80211 snd_seq_device snd_timer iwlwifi lpc_ich snd cfg80211
[231716.438917]  soundcore shpchp mac_hid dell_smo8800 mei_me mei tpm_rng parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs xor raid6_pq hid_generic usbhid hid i915 i2c_algo_bit psmouse drm_kms_helper syscopyarea sysfillrect ahci sysimgblt libahci firewire_ohci sdhci_pci sdhci firewire_core crc_itu_t wmi fb_sys_fops drm e1000e ptp fjes video pps_core [last unloaded: jfs]
[231716.438919] CPU: 0 PID: 1334 Comm: Xorg Not tainted 4.8.12-040812-generic #201612020431
[231716.438919] Hardware name: Dell Inc. Latitude E6520/0NVF5K, BIOS A19 11/14/2013
[231716.438922]  0000000000000086 000000008f1c6b0f ffffa2e41cd8f878 ffffffffb3e12e02
[231716.438923]  ffffa2e41cd8f8c8 0000000000000000 ffffa2e41cd8f8b8 ffffffffb3a82d9b
[231716.438924]  00000f421cd8f898 0000000000000000 ffffa2e41ac26000 ffffa2e423708000
[231716.438924] Call Trace:
[231716.438927]  [<ffffffffb3e12e02>] dump_stack+0x63/0x81
[231716.438929]  [<ffffffffb3a82d9b>] __warn+0xcb/0xf0
[231716.438931]  [<ffffffffb3a82e1f>] warn_slowpath_fmt+0x5f/0x80
[231716.438932]  [<ffffffffb3ac6845>] ? finish_wait+0x55/0x70
[231716.438946]  [<ffffffffc051e039>] intel_atomic_commit+0x559/0x560 [i915]
[231716.438947]  [<ffffffffb3ac6cf0>] ? wake_atomic_t_function+0x60/0x60
[231716.438960]  [<ffffffffc036fc17>] drm_atomic_commit+0x37/0x60 [drm]
[231716.438968]  [<ffffffffc043fe3c>] restore_fbdev_mode+0x14c/0x270 [drm_kms_helper]
[231716.438972]  [<ffffffffc0441ab4>] drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
[231716.438975]  [<ffffffffc0441b2d>] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[231716.438990]  [<ffffffffc0537cea>] intel_fbdev_set_par+0x1a/0x60 [i915]
[231716.438992]  [<ffffffffb3e9e325>] ? fb_set_var+0x2f5/0x460
[231716.438993]  [<ffffffffb3e9e266>] fb_set_var+0x236/0x460
[231716.438994]  [<ffffffffb3ab7826>] ? update_curr+0x66/0x170
[231716.438995]  [<ffffffffb3ab4ccc>] ? __enqueue_entity+0x6c/0x70
[231716.438996]  [<ffffffffb3abbba8>] ? enqueue_entity+0x2e8/0x8b0
[231716.438997]  [<ffffffffb3e9420f>] fbcon_blank+0x30f/0x350
[231716.438998]  [<ffffffffb3aad9bf>] ? ttwu_do_activate+0x6f/0x80
[231716.439000]  [<ffffffffb3f2d8d2>] do_unblank_screen+0xd2/0x1a0
[231716.439001]  [<ffffffffb3f22fa9>] complete_change_console+0x59/0xe0
[231716.439003]  [<ffffffffb3f23739>] vt_ioctl+0x709/0x12b0
[231716.439008]  [<ffffffffc034f55a>] ? drm_dropmaster_ioctl+0x4a/0x70 [drm]
[231716.439014]  [<ffffffffc0354dd6>] ? drm_ioctl+0x236/0x4f0 [drm]
[231716.439017]  [<ffffffffb3f17793>] tty_ioctl+0x363/0xc70
[231716.439019]  [<ffffffffb3bda936>] ? handle_mm_fault+0x8a6/0x13c0
[231716.439021]  [<ffffffffb3c46603>] do_vfs_ioctl+0xa3/0x600
[231716.439022]  [<ffffffffb3a6b3a3>] ? __do_page_fault+0x203/0x4d0
[231716.439023]  [<ffffffffb3c46bd9>] SyS_ioctl+0x79/0x90
[231716.439025]  [<ffffffffb42832b6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
[231716.439026] ---[ end trace 5a1628be00cd83d9 ]---
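When reading a trace like the one above, it can help to separate the frames the unwinder is sure about from the ones marked with "?", which are addresses found on the stack that may not be part of the real call chain. Below is a minimal sketch of that filtering in Python. The names `FRAME_RE` and `parse_frames` are made up for this illustration, and the regular expression only assumes the 4.8-era frame format shown in this report.

```python
import re

# Matches frames in the format used by this report, for example:
#   [231716.438946]  [<ffffffffc051e039>] intel_atomic_commit+0x559/0x560 [i915]
#   [231716.438932]  [<ffffffffb3ac6845>] ? finish_wait+0x55/0x70
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return one dict per stack frame found in log_text (hypothetical helper)."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack frame line (header, module list, ...)
        frames.append({
            "symbol": match.group("symbol"),
            # None means no module tag was printed for this frame.
            "module": match.group("module"),
            # Frames prefixed with "?" are not guaranteed to be in the call chain.
            "reliable": match.group("unreliable") is None,
        })
    return frames


# Four lines copied from the trace in this report.
SAMPLE = """\
[231716.438932]  [<ffffffffb3ac6845>] ? finish_wait+0x55/0x70
[231716.438946]  [<ffffffffc051e039>] intel_atomic_commit+0x559/0x560 [i915]
[231716.438947]  [<ffffffffb3ac6cf0>] ? wake_atomic_t_function+0x60/0x60
[231716.438960]  [<ffffffffc036fc17>] drm_atomic_commit+0x37/0x60 [drm]
"""

for frame in parse_frames(SAMPLE):
    if frame["reliable"]:
        print(frame["symbol"], frame["module"])
```

Applied to the full trace, the reliable frames give the path from the VT switch ioctl through fbcon_blank and the fbdev restore down to intel_atomic_commit.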
Comment 1 Laurent Bonnaud 2016-12-08 09:08:24 UTC
Created attachment 128376
Full dmesg from boot
Comment 2 Laurent Bonnaud 2016-12-08 09:15:25 UTC
Here is more information:

$ uname -a
Linux vougeot 4.8.12-040812-generic #201612020431 SMP Fri Dec 2 09:33:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

# dmidecode
[...]
BIOS Information
        Vendor: Dell Inc.
        Version: A19
        Release Date: 11/14/2013
[...]
System Information
        Manufacturer: Dell Inc.
        Product Name: Latitude E6520
        Version: 01
[...]
Base Board Information
        Manufacturer: Dell Inc.
        Product Name: 0NVF5K
        Version: A01

Display: only the laptop panel at the time of the crash (I used an external VGA projector the previous day and did 2 suspend/resume cycles in between).
Comment 3 Laurent Bonnaud 2016-12-08 09:16:49 UTC
Created attachment 128377
/sys/class/drm/card0/error content
Comment 4 Laurent Bonnaud 2016-12-08 09:20:27 UTC
Created attachment 128378
Content of /sys/devices/pci0000:00/0000:00:02.0/rom
Comment 5 yann 2016-12-13 08:37:17 UTC
Laurent, can you update to the latest kernel and see whether you can reproduce the issue? If the error happens again, add "drm.debug=0x1e log_buf_len=1M" to your boot command line, then attach the new kernel log and GPU crash dump.
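For reference, drm.debug is a bitmask that selects which categories of DRM debug messages are logged. The sketch below decodes the requested value 0x1e in Python. The bit names follow the drm module parameter description in kernels of that era as far as I know, so treat the table as an assumption if you are on a much newer kernel. `DRM_DEBUG_BITS` and `decode_drm_debug` are illustrative names, not part of any kernel tool.

```python
# Debug categories of the drm.debug module parameter (assumed from the
# parameter description in kernels around 4.8; newer kernels define more bits).
DRM_DEBUG_BITS = {
    0x01: "CORE",    # drm core code
    0x02: "DRIVER",  # drm controller code
    0x04: "KMS",     # modesetting code
    0x08: "PRIME",   # prime code
    0x10: "ATOMIC",  # atomic code
    0x20: "VBL",     # vblank code
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(decode_drm_debug(0x1E))
```

So 0x1e enables everything in this table except the CORE and VBL messages. On Ubuntu, boot parameters are typically added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub and rebooting. The larger log_buf_len keeps the extra debug output from pushing earlier messages out of the kernel log buffer.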
Comment 6 Ricardo 2017-02-23 22:01:54 UTC
A request for updated logs with the latest kernel has been placed with the submitter. If there is no response within 30 days, the bug will be closed.

Laurent, if the GPU hang no longer occurs, please close the bug.
Comment 7 Laurent Bonnaud 2017-02-24 14:10:53 UTC
Since my initial report I ran all kernels from 4.9.0 to 4.9.12 on this system and never saw a GPU hang again.

