Bug 104566

Summary: [CI] igt@kms_flip@busy-flip[-interruptible] - Incomplete - owatch Softdog
Product: DRI
Reporter: Marta Löfstedt <marta.lofstedt>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: BXT, GLK, HSW, KBL, SNB
i915 features: display/atomic

Description Marta Löfstedt 2018-01-10 07:08:10 UTC
The regression started on CI_DRM_3615 and has reproduced on every subsequent run so far:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-snb5/igt@kms_flip@busy-flip.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-hsw5/igt@kms_flip@busy-flip.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-apl3/igt@kms_flip@busy-flip.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-kbl6/igt@kms_flip@busy-flip.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-glkb3/igt@kms_flip@busy-flip.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-snb3/igt@kms_flip@busy-flip-interruptible.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-hsw4/igt@kms_flip@busy-flip-interruptible.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-apl1/igt@kms_flip@busy-flip-interruptible.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-kbl5/igt@kms_flip@busy-flip-interruptible.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3615/shard-glkb1/igt@kms_flip@busy-flip-interruptible.html

<4>[   41.044561] stack segment: 0000 [#1] PREEMPT SMP PTI
<0>[   41.044565] Dumping ftrace buffer:
<0>[   41.044570]    (ftrace buffer empty)
<4>[   41.044572] Modules linked in: snd_hda_codec_hdmi i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel r8169 snd_hda_codec_realtek mii snd_hda_codec_generic lpc_ich snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me mei prime_numbers
<4>[   41.044594] CPU: 0 PID: 1647 Comm: kms_flip Tainted: G     U           4.15.0-rc7-CI-CI_DRM_3615+ #1
<4>[   41.044596] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
<4>[   41.044600] RIP: 0010:kfree+0x66/0x2d0
<4>[   41.044602] RSP: 0018:ffffc90000b1bbd0 EFLAGS: 00010203
<4>[   41.044605] RAX: ffffea0000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001
<4>[   41.044606] RDX: 00000000000003d6 RSI: ffffffff82246460 RDI: ffff88040370d818
<4>[   41.044608] RBP: 01ad998dadadad80 R08: 0000000000000000 R09: 0000000000000000
<4>[   41.044609] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff815a32cd
<4>[   41.044611] R13: 0000000000000000 R14: 0000000000000000 R15: ffffc90000b1bce0
<4>[   41.044613] FS:  00007fdb45f4ba40(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000
<4>[   41.044615] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   41.044616] CR2: 00007fdb40dfa220 CR3: 0000000403400006 CR4: 00000000001606f0
<4>[   41.044618] Call Trace:
<4>[   41.044623]  __drm_atomic_helper_crtc_destroy_state+0x3d/0xb0
<4>[   41.044626]  drm_atomic_helper_crtc_destroy_state+0xc/0x20
<4>[   41.044629]  drm_atomic_state_default_clear+0xbe/0x1c0
<4>[   41.044665]  intel_atomic_state_clear+0x9/0x20 [i915]
<4>[   41.044668]  __drm_atomic_state_free+0xe/0x50
<4>[   41.044672]  drm_atomic_helper_page_flip+0x67/0x90
<4>[   41.044677]  drm_mode_page_flip_ioctl+0x467/0x4e0
<4>[   41.044683]  ? drm_mode_cursor2_ioctl+0x10/0x10
<4>[   41.044686]  drm_ioctl_kernel+0x60/0xa0
<4>[   41.044689]  drm_ioctl+0x290/0x330
<4>[   41.044692]  ? drm_mode_cursor2_ioctl+0x10/0x10
<4>[   41.044698]  ? trace_hardirqs_on_caller+0xde/0x1c0
<4>[   41.044701]  ? _raw_spin_unlock_irq+0x2f/0x50
<4>[   41.044705]  ? finish_task_switch+0xa5/0x210
<4>[   41.044707]  ? finish_task_switch+0x6a/0x210
<4>[   41.044712]  do_vfs_ioctl+0x8a/0x670
<4>[   41.044715]  ? entry_SYSCALL_64_fastpath+0x5/0x89
<4>[   41.044718]  ? trace_hardirqs_on_caller+0xde/0x1c0
<4>[   41.044722]  SyS_ioctl+0x36/0x70
<4>[   41.044725]  entry_SYSCALL_64_fastpath+0x1c/0x89
<4>[   41.044727] RIP: 0033:0x7fdb4414a587
<4>[   41.044729] RSP: 002b:00007ffdb0524358 EFLAGS: 00000246
<4>[   41.044733] Code: 48 01 dd 0f 82 7b 02 00 00 48 b8 00 00 00 80 ff 77 00 00 48 01 c5 48 b8 00 00 00 00 00 ea ff ff 48 c1 ed 0c 48 c1 e5 06 48 01 c5 <48> 8b 45 20 48 8d 50 ff a8 01 48 0f 45 ea 48 8b 55 20 48 8d 42 
<1>[   41.044783] RIP: kfree+0x66/0x2d0 RSP: ffffc90000b1bbd0
<4>[   41.044785] ---[ end trace 1f87461e46e8d75b ]---
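
In the register dump above, RBX holds 6b6b6b6b6b6b6b6b. The byte 0x6b is the kernel's slab free-poison value (POISON_FREE), so a register filled with it usually suggests that a pointer was read out of memory that had already been freed. The report itself does not state a root cause, so treat this as a reading of the trace, not a diagnosis. A minimal sketch for spotting such values in an oops log follows; the helper names `is_free_poison` and `poisoned_registers` and the regular expression are illustrative assumptions, not part of any kernel or CI tooling.

```python
import re

# Byte the kernel slab allocator writes into freed objects when poisoning is enabled.
POISON_FREE = 0x6B

# Matches "RBX: 6b6b6b6b6b6b6b6b" style pairs in an x86-64 register dump.
# (Hypothetical pattern written for this log layout only.)
REGISTER_RE = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def is_free_poison(value: int) -> bool:
    """Return True if all eight bytes of a 64-bit value equal POISON_FREE."""
    return value.to_bytes(8, "little") == bytes([POISON_FREE]) * 8


def poisoned_registers(log_lines):
    """Return the names of registers whose value is pure free-poison."""
    hits = []
    for line in log_lines:
        for name, hex_value in REGISTER_RE.findall(line):
            if is_free_poison(int(hex_value, 16)):
                hits.append(name)
    return hits


oops = [
    "<4>[   41.044605] RAX: ffffea0000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000001",
    "<4>[   41.044608] RBP: 01ad998dadadad80 R08: 0000000000000000 R09: 0000000000000000",
]
print(poisoned_registers(oops))  # prints ['RBX']
```

Only fully poisoned 64-bit values are flagged; partially overwritten values, or other poison bytes, would need additional checks.
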
Comment 1 Marta Löfstedt 2018-01-10 07:48:03 UTC
I tested on my BDW; the regression appears to be caused by:

commit a10195bbe7f4e6ba540083ba13126ef745116cae
Author: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Date:   Thu Jan 4 14:47:33 2018 -0500

    drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits
    
    During a non-blocking commit, it is possible to return before the
    commit_tail work is queued (-ERESTARTSYS, for example).
    
    Since a reference on the crtc commit object is obtained for the pending
    vblank event when preparing the commit, the above situation will leave
    us with an extra reference.
    
    Therefore, if the commit_tail worker has not consumed the event at the
    end of a commit, release it's reference.
    
    Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
    Acked-by: Harry Wentland <harry.wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/1515095253-29817-1-git-send-email-sunpeng.li@amd.com

I'll send the revert to try-bot to check it on the shards.
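
The reference counting described in the commit message above can be sketched with a toy model. This is not the DRM code: the `Commit` class, `run_commit`, and its two flags are hypothetical names invented for illustration. It shows only the situation the message describes: an extra reference taken for the pending event, which leaks when the commit returns early unless teardown releases it on the condition that the worker has not consumed the event.

```python
class Commit:
    """Toy stand-in for a reference-counted commit object (hypothetical)."""

    def __init__(self):
        self.refs = 1        # reference held by the state that created it
        self.freed = False

    def get(self):
        assert not self.freed, "get on freed object"
        self.refs += 1

    def put(self):
        assert not self.freed, "put on freed object (use after free)"
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


def run_commit(interrupted: bool, release_if_unconsumed: bool) -> Commit:
    """Model one non-blocking commit and the teardown of its state."""
    commit = Commit()
    commit.get()             # extra reference for the pending event
    event_consumed = False

    if not interrupted:
        # The worker ran: it consumes the event and drops that reference.
        event_consumed = True
        commit.put()

    # State teardown: optionally release the event reference, but only
    # when the worker never consumed it.
    if release_if_unconsumed and not event_consumed:
        commit.put()
    commit.put()             # the state's own reference
    return commit


leaked = run_commit(interrupted=True, release_if_unconsumed=False)
fixed = run_commit(interrupted=True, release_if_unconsumed=True)
print(leaked.refs, leaked.freed)  # prints 1 False
print(fixed.refs, fixed.freed)    # prints 0 True
```

The `not event_consumed` guard is the delicate part of this model: dropping the event reference without it would release one reference too many on the normal path, and the assertion in `put` would fire. Whether the regression in this bug came from that kind of over-release is not stated in the report.
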
Comment 2 Marta Löfstedt 2018-01-10 09:08:15 UTC
Try-bot run with the revert:
https://patchwork.freedesktop.org/series/36250/

The shards are green on these tests.
Comment 3 Maarten Lankhorst 2018-01-10 09:41:09 UTC
Ah, before committing I put the series out on trybot:

https://patchwork.freedesktop.org/series/36185/

Because it fixed a memory leak, it should have carried the appropriate Fixes tags and been committed through the drm-misc-fixes branch.

commit 60ccc38f53ad50128bf33616f4e1745947eb726c (HEAD -> drm-misc-next, drm-misc/drm-misc-next)
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Wed Jan 10 10:32:18 2018 +0100

    Revert "drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits"
Comment 4 Marta Löfstedt 2018-01-10 13:42:14 UTC
The revert on drm-misc-next was integrated into CI_DRM_3618, and the issue is no longer reproducible. I will close this bug.
