Bug 89660 - OOM, then kernel BUG at drivers/gpu/drm/i915/i915_gem.c:4256!
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-18 14:57 UTC by mikhail.v.gavrilov
Modified: 2016-10-10 11:52 UTC
CC List: 2 users

See Also:
i915 platform:
i915 features: display/atomic


Attachments
kernel 4.0.0-0.rc3 log (254.18 KB, text/plain)
2015-03-18 14:57 UTC, mikhail.v.gavrilov

Description mikhail.v.gavrilov 2015-03-18 14:57:23 UTC
Created attachment 114444
kernel 4.0.0-0.rc3 log

[ 6313.275028] Out of memory: Kill process 13778 (chrome) score 344 or sacrifice child
[ 6313.275032] Killed process 13778 (chrome) total-vm:405592kB, anon-rss:79944kB, file-rss:6716kB
[ 6318.491501] systemd-journald[14041]: File /var/log/journal/736fbfa4372649649ba18e438d4cdd62/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 7934.344804] systemd-journald[14041]: File /var/log/journal/736fbfa4372649649ba18e438d4cdd62/user-1000.journal corrupted or uncleanly shut down, renaming and replacing.
[ 9038.579417] show_signal_msg: 24 callbacks suppressed
[ 9038.579424] Chrome_ChildThr[14474]: segfault at 0 ip 0805c93a sp b013b9e0 error 6 in plugin-container[8048000+8b000]
[ 9359.479516] Chrome_ChildThr[15322]: segfault at 0 ip 0805c93a sp b013b9e0 error 6 in plugin-container[8048000+8b000]
[ 9381.060524] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[ 9381.060561] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder A
[ 9381.060581] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
[ 9413.864798] Adding 16383996k swap on /dev/sda2.  Priority:-1 extents:1 across:16383996k FS
[13231.445374] polkitd[642]: segfault at 0 ip b70bb45a sp bff23e00 error 4 in libmozjs-17.0.so[b7000000+3a0000]
[16374.943598] perf interrupt took too long (2508 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[16697.250968] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[16697.251024] [drm:intel_set_pch_fifo_underrun_reporting [i915]] *ERROR* uncleared pch fifo underrun on pch transcoder A
[16697.251045] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
[16728.168760] ------------[ cut here ]------------
[16728.168794] WARNING: CPU: 3 PID: 15797 at drivers/gpu/drm/i915/i915_gem.c:4284 i915_gem_object_unpin_fence+0x6e/0x90 [i915]()
[16728.168795] WARN_ON(dev_priv->fence_regs[obj->fence_reg].pin_count <= 0)
[16728.168796] Modules linked in:
[16728.168798]  vfat fat bnep bluetooth rfkill fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp kvm_intel snd_hda_intel iTCO_wdt snd_hda_controller iTCO_vendor_support snd_hda_codec gpio_ich hid_logitech_hidpp lpc_ich kvm crc32_pclmul crc32c_intel usblp snd_hwdep snd_seq snd_seq_device
[16728.168822]  mfd_core snd_pcm snd_timer snd soundcore serio_raw mei_me mei i2c_i801 nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc i915 i2c_algo_bit video 8021q drm_kms_helper uas garp stp usb_storage drm llc mrp r8169 mii hid_logitech_dj
[16728.168837] CPU: 3 PID: 15797 Comm: Xorg Not tainted 4.0.0-0.rc3.git0.1.fc22.i686+PAE #1
[16728.168838] Hardware name: Gigabyte Technology Co., Ltd. H67N-USB3-B3/H67N-USB3-B3, BIOS F9 03/27/2012
[16728.168840]  c0d3a947 bf9bd342 00000000 e83a7c6c c0a8b26f e83a7cac e83a7c9c c04669c7
[16728.168844]  f8367f2c e83a7ccc 00003db5 f8367d24 000010bc f82da79e f82da79e f6480000
[16728.168847]  d1921f00 f835dca0 e83a7cb8 c0466a3e 00000009 e83a7cac f8367f2c e83a7ccc
[16728.168851] Call Trace:
[16728.168857]  [<c0a8b26f>] dump_stack+0x41/0x52
[16728.168862]  [<c04669c7>] warn_slowpath_common+0x87/0xc0
[16728.168878]  [<f82da79e>] ? i915_gem_object_unpin_fence+0x6e/0x90 [i915]
[16728.168890]  [<f82da79e>] ? i915_gem_object_unpin_fence+0x6e/0x90 [i915]
[16728.168910]  [<c0466a3e>] warn_slowpath_fmt+0x3e/0x60
[16728.168922]  [<f82da79e>] i915_gem_object_unpin_fence+0x6e/0x90 [i915]
[16728.168949]  [<f8311a70>] intel_unpin_fb_obj+0x20/0x50 [i915]
[16728.168952]  [<c0a8e470>] ? mutex_lock+0x10/0x30
[16728.168968]  [<f831ccf4>] intel_cleanup_plane_fb+0x34/0x90 [i915]
[16728.168975]  [<f80fd2c3>] drm_plane_helper_commit+0x1c3/0x2b0 [drm_kms_helper]
[16728.168982]  [<f80fd428>] drm_plane_helper_update+0x78/0xc0 [drm_kms_helper]
[16728.168998]  [<f831c8b4>] intel_crtc_set_config+0xc34/0xe10 [i915]
[16728.169012]  [<f812ebbe>] drm_mode_set_config_internal+0x4e/0xc0 [drm]
[16728.169023]  [<f813297c>] drm_mode_setcrtc+0x31c/0x590 [drm]
[16728.169033]  [<f8132660>] ? drm_mode_setplane+0x250/0x250 [drm]
[16728.169041]  [<f8124169>] drm_ioctl+0x1b9/0x570 [drm]
[16728.169045]  [<c057537d>] ? do_set_pte+0xcd/0x110
[16728.169053]  [<f8132660>] ? drm_mode_setplane+0x250/0x250 [drm]
[16728.169056]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.169065]  [<f8123fb0>] ? drm_getmap+0xc0/0xc0 [drm]
[16728.169068]  [<c05bb9b2>] do_vfs_ioctl+0x2f2/0x510
[16728.169072]  [<c0693a8d>] ? file_has_perm+0x8d/0xc0
[16728.169074]  [<c0697bfb>] ? selinux_file_ioctl+0x4b/0xe0
[16728.169076]  [<c05bbc38>] SyS_ioctl+0x68/0x80
[16728.169078]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.169080]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.169083]  [<c0a9099f>] sysenter_do_call+0x12/0x12
[16728.169085]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.169087]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.169089] ---[ end trace f7767a0335d2fb13 ]---
[16728.169104] ------------[ cut here ]------------
[16728.169135] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:4256!
[16728.169166] invalid opcode: 0000 [#1] SMP 
[16728.169192] Modules linked in: vfat fat bnep bluetooth rfkill fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp kvm_intel snd_hda_intel iTCO_wdt snd_hda_controller iTCO_vendor_support snd_hda_codec gpio_ich hid_logitech_hidpp lpc_ich kvm crc32_pclmul crc32c_intel usblp snd_hwdep
[16728.169632]  snd_seq snd_seq_device mfd_core snd_pcm snd_timer snd soundcore serio_raw mei_me mei i2c_i801 nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc i915 i2c_algo_bit video 8021q drm_kms_helper uas garp stp usb_storage drm llc mrp r8169 mii hid_logitech_dj
[16728.169796] CPU: 3 PID: 15797 Comm: Xorg Tainted: G        W       4.0.0-0.rc3.git0.1.fc22.i686+PAE #1
[16728.169844] Hardware name: Gigabyte Technology Co., Ltd. H67N-USB3-B3/H67N-USB3-B3, BIOS F9 03/27/2012
[16728.169891] task: ce31d0a0 ti: e83a6000 task.ti: e83a6000
[16728.169920] EIP: 0060:[<f82da5c2>] EFLAGS: 00013246 CPU: 3
[16728.169960] EIP is at i915_gem_object_ggtt_unpin+0xb2/0xc0 [i915]
[16728.169991] EAX: d1921f00 EBX: 00000000 ECX: e12c1680 EDX: e12c1680
[16728.170022] ESI: f6484354 EDI: d1921f78 EBP: e83a7cc4 ESP: e83a7cb4
[16728.170055]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[16728.170084] CR0: 80050033 CR2: b6001014 CR3: 36634000 CR4: 000407f0
[16728.170116] Stack:
[16728.170127]  009bd342 d1921f00 d1921f00 f835dca0 e83a7cd4 f82da5e1 d1921f00 d1921f00
[16728.170179]  e83a7cec f8311a77 00000000 e83a7cec c0a8e470 f6a94434 e83a7d08 f831ccf4
[16728.170229]  f693a000 e83a7d28 e83a7d08 f68b8200 e83a7d28 e83a7d40 f80fd2c3 f68b8200
[16728.170280] Call Trace:
[16728.170307]  [<f82da5e1>] i915_gem_object_unpin_from_display_plane+0x11/0x70 [i915]
[16728.170361]  [<f8311a77>] intel_unpin_fb_obj+0x27/0x50 [i915]
[16728.170392]  [<c0a8e470>] ? mutex_lock+0x10/0x30
[16728.170432]  [<f831ccf4>] intel_cleanup_plane_fb+0x34/0x90 [i915]
[16728.170469]  [<f80fd2c3>] drm_plane_helper_commit+0x1c3/0x2b0 [drm_kms_helper]
[16728.170512]  [<f80fd428>] drm_plane_helper_update+0x78/0xc0 [drm_kms_helper]
[16728.170562]  [<f831c8b4>] intel_crtc_set_config+0xc34/0xe10 [i915]
[16728.170606]  [<f812ebbe>] drm_mode_set_config_internal+0x4e/0xc0 [drm]
[16728.170649]  [<f813297c>] drm_mode_setcrtc+0x31c/0x590 [drm]
[16728.170687]  [<f8132660>] ? drm_mode_setplane+0x250/0x250 [drm]
[16728.170724]  [<f8124169>] drm_ioctl+0x1b9/0x570 [drm]
[16728.170753]  [<c057537d>] ? do_set_pte+0xcd/0x110
[16728.170784]  [<f8132660>] ? drm_mode_setplane+0x250/0x250 [drm]
[16728.170816]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.170853]  [<f8123fb0>] ? drm_getmap+0xc0/0xc0 [drm]
[16728.170882]  [<c05bb9b2>] do_vfs_ioctl+0x2f2/0x510
[16728.170910]  [<c0693a8d>] ? file_has_perm+0x8d/0xc0
[16728.170936]  [<c0697bfb>] ? selinux_file_ioctl+0x4b/0xe0
[16728.170965]  [<c05bbc38>] SyS_ioctl+0x68/0x80
[16728.170990]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.171018]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.171047]  [<c0a9099f>] sysenter_do_call+0x12/0x12
[16728.171073]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.171103]  [<c06864a2>] ? SyS_mq_timedsend+0x302/0x310
[16728.171132] Code: e3 0f 09 d9 84 db 88 4a 68 75 07 80 a0 a5 00 00 00 7f 83 c4 04 5b 5e 5f 5d c3 8d b4 26 00 00 00 00 0f 0b 8d b6 00 00 00 00 0f 0b <0f> 0b 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 56 53 66 66
[16728.171327] EIP: [<f82da5c2>] i915_gem_object_ggtt_unpin+0xb2/0xc0 [i915] SS:ESP 0068:e83a7cb4
[16728.184338] ---[ end trace f7767a0335d2fb14 ]---
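
For triage, the two failure sites in the log above (the WARNING at i915_gem.c:4284 and the BUG at i915_gem.c:4256) can be pulled out mechanically. The following is a minimal Python sketch, not part of the original report; the regular expression and the function name `find_failure_sites` are illustrative assumptions and only cover the two header formats that appear in this log.

```python
import re

# Matches the two oops-style headers seen in this log, for example:
#   "WARNING: CPU: 3 PID: 15797 at drivers/gpu/drm/i915/i915_gem.c:4284 ..."
#   "kernel BUG at drivers/gpu/drm/i915/i915_gem.c:4256!"
# Other kernel versions may format these headers differently (assumption).
SITE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?:(?P<warn>WARNING): .*? at |kernel (?P<bug>BUG) at )"
    r"(?P<file>[\w/.\-]+):(?P<line>\d+)"
)


def find_failure_sites(log_text):
    """Return (timestamp, kind, file, line) for each WARNING/BUG header."""
    sites = []
    for raw in log_text.splitlines():
        m = SITE_RE.search(raw)
        if m:
            kind = m.group("warn") or m.group("bug")
            sites.append(
                (float(m.group("ts")), kind, m.group("file"), int(m.group("line")))
            )
    return sites


# Three lines copied from the log in this report.
sample = """\
[16728.168794] WARNING: CPU: 3 PID: 15797 at drivers/gpu/drm/i915/i915_gem.c:4284 i915_gem_object_unpin_fence+0x6e/0x90 [i915]()
[16728.168795] WARN_ON(dev_priv->fence_regs[obj->fence_reg].pin_count <= 0)
[16728.169135] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:4256!
"""

for site in find_failure_sites(sample):
    print(site)
```

The WARN_ON condition line is skipped on purpose, since it carries no file and line information of its own.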
Comment 1 Paulo Zanoni 2015-03-20 19:15:45 UTC
Hi

What are the steps to reproduce this problem? How reproducible is the bug?

Which platform are you using? Please provide the output of "lspci -nn".

Is there any way for you to provide us a complete dmesg? Even better if you are able to boot with drm.debug=0xe.

When did this start happening? Are you able to bisect this bug? Bisecting is a _really_ good way to help fix the bug, and since you're running RC kernels, the bisect interval may be small.

Thanks,
Paulo
Comment 2 Ander Conselvan de Oliveira 2015-06-02 11:29:41 UTC
Timeout. Closing. Please reopen if you are able to provide the requested information.
Comment 3 Chris Wilson 2015-06-02 11:39:36 UTC
(In reply to Ander Conselvan de Oliveira from comment #2)
> Timeout. Closing. Please reopen if you are able to provide the requested
> information.

Pardon? The requested information is not relevant to the bug.
Comment 4 Ander Conselvan de Oliveira 2015-06-02 12:06:01 UTC
If you know how to reproduce it and have all the relevant information, then please go ahead and fix it.
Comment 5 Chris Wilson 2015-06-02 15:11:41 UTC
Yes, we just revert atomic...
Comment 6 Ander Conselvan de Oliveira 2015-06-03 07:55:01 UTC
(In reply to Chris Wilson from comment #5)
> Yes, we just revert atomic...

I'm afraid it's too late for 4.0.

Paulo did ask at least two relevant questions:

 - how can the bug be reproduced, and how often does it happen?

 - on which platform was this issue seen? (this might be useful even if the bug is in generic code)

Given that over two months have gone by since he asked those questions and 4.0 was released in the meantime, Mikhail, please verify whether you still see the issue with that release.

Also, please include dmesg with 'drm.debug=0x1e log_buf_len=4M' in your kernel command line.
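
Before capturing dmesg, it may help to confirm that the requested parameters actually reached the kernel. The sketch below is an illustration only: it parses a command-line string (on a running system this would come from /proc/cmdline) and decodes the drm.debug mask. The bit-to-category table reflects my understanding of the drm.debug parameter and should be treated as an assumption, in particular the "atomic" bit, which only exists in kernels that define it. The example string and its BOOT_IMAGE value are hypothetical.

```python
# Category bits for drm.debug (assumed layout; check the documentation
# of the kernel you are running before relying on it).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
}


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value-or-None} dict."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def decode_drm_debug(value):
    """Translate a drm.debug value such as '0x1e' into category names."""
    mask = int(value, 0)  # base 0 accepts both '0x1e' and '30'
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


# Hypothetical command line carrying the two requested parameters.
example = "BOOT_IMAGE=/vmlinuz ro drm.debug=0x1e log_buf_len=4M"
params = parse_cmdline(example)
print(decode_drm_debug(params["drm.debug"]))
print(params.get("log_buf_len"))
```

Under this table, the 0xe requested in comment 1 selects driver, kms, and prime messages, and 0x1e additionally selects atomic.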
Comment 7 mikhail.v.gavrilov 2015-06-05 14:21:47 UTC
[mikhail@corei5 ~]$ journalctl --no-pager | grep 'cut here'
[mikhail@corei5 ~]$ 

This means that no further incidents have occurred on this machine since 2015-04-17.
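
The same check can be scripted so that it reports an explicit count instead of empty grep output. This is a rough Python sketch under the assumption that `journalctl --no-pager` is available and readable by the current user; the helper names `count_markers` and `journal_lines` are made up for illustration.

```python
import subprocess

# Marker the kernel prints at the start of each WARNING/BUG report.
MARKER = "cut here"


def count_markers(lines):
    """Count log lines that contain the 'cut here' marker."""
    return sum(1 for line in lines if MARKER in line)


def journal_lines():
    """Return the full journal as a list of lines (requires journalctl)."""
    result = subprocess.run(
        ["journalctl", "--no-pager"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Usage on a systemd machine:
#   print(count_markers(journal_lines()))
```

A result of 0 corresponds to the empty grep output shown above.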
Comment 8 Jani Nikula 2015-08-18 14:01:02 UTC
(In reply to mikhail.v.gavrilov from comment #7)
> [mikhail@corei5 ~]$ journalctl --no-pager | grep 'cut here'
> [mikhail@corei5 ~]$ 
> 
> This means that no further incidents have occurred on this machine since 2015-04-17.

Fingers crossed. Please reopen if the problem reappears. Thanks for the report.
Comment 9 Jari Tahvanainen 2016-10-10 11:52:01 UTC
Closing resolved+worksforme based on the verification by Mikhail.

