Bug 106098 - [CI] igt@kms_cursor_legacy@2x-(long-)nonblocking-modeset-vs-cursor-atomic - incomplete
Summary: [CI] igt@kms_cursor_legacy@2x-(long-)nonblocking-modeset-vs-cursor-atomic - i...
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high normal
Assignee: Stanislav Lisovskiy
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-04-17 10:41 UTC by Martin Peres
Modified: 2018-05-22 20:41 UTC

See Also:
i915 platform: SKL
i915 features: display/Other


Attachments

Description Martin Peres 2018-04-17 10:41:06 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_17/fi-skl-6770hq/igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_17/fi-skl-6770hq/igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic.html

<4>[  126.536169] general protection fault: 0000 [#1] PREEMPT SMP PTI
<0>[  126.536173] Dumping ftrace buffer:
<0>[  126.536176] ---------------------------------
[...]
<0>[  691.011666] ---------------------------------
<4>[  691.011669] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal snd_hda_intel intel_powerclamp btusb btrtl coretemp snd_hda_codec btbcm i915 btintel snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul bluetooth ghash_clmulni_intel snd_pcm ecdh_generic e1000e mei_me mei prime_numbers
<4>[  691.011711] CPU: 3 PID: 1649 Comm: kms_cursor_lega Tainted: G     U           4.16.0-rc7-g98590826e4a4-drmtip_17+ #1
<4>[  691.011714] Hardware name:  /NUC6i7KYB, BIOS KYSKLi70.86A.0042.2016.0929.1933 09/29/2016
<4>[  691.011775] RIP: 0010:intel_atomic_check+0x984/0x1080 [i915]
<4>[  691.011779] RSP: 0018:ffff9c7700e1faf0 EFLAGS: 00010202
<4>[  691.011783] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000001
<4>[  691.011787] RDX: ffff8807594d6278 RSI: ffffffff8e0eac19 RDI: 00000000ffffffff
<4>[  691.011790] RBP: 0000000000000001 R08: ffff880768695948 R09: 000000000a21ef20
<4>[  691.011793] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
<4>[  691.011796] R13: ffff880752ba0000 R14: ffff880752ba0000 R15: ffff8807686db528
<4>[  691.011799] FS:  00007faafce48980(0000) GS:ffff88077ecc0000(0000) knlGS:0000000000000000
<4>[  691.011803] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  691.011806] CR2: 00005562cdfc50b0 CR3: 000000049429c006 CR4: 00000000003606e0
<4>[  691.011809] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  691.011812] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  691.011815] Call Trace:
<4>[  691.011824]  ? lock_acquire+0xaf/0x200
<4>[  691.011832]  drm_atomic_check_only+0x3fd/0x510
<4>[  691.011837]  ? trace_hardirqs_on_caller+0xde/0x1c0
<4>[  691.011844]  drm_atomic_nonblocking_commit+0xe/0x50
<4>[  691.011848]  drm_mode_atomic_ioctl+0x888/0x9f0
<4>[  691.011861]  ? drm_atomic_set_property+0x510/0x510
<4>[  691.011866]  drm_ioctl_kernel+0xa3/0xe0
<4>[  691.011873]  drm_ioctl+0x2e2/0x380
<4>[  691.011877]  ? drm_atomic_set_property+0x510/0x510
<4>[  691.011888]  do_vfs_ioctl+0x9a/0x6a0
<4>[  691.011894]  ? __task_pid_nr_ns+0xb9/0x1c0
<4>[  691.011900]  ? entry_SYSCALL_64_after_hwframe+0x52/0xb7
<4>[  691.011905]  SyS_ioctl+0x36/0x70
<4>[  691.011911]  do_syscall_64+0x65/0x1a0
<4>[  691.011916]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[  691.011920] RIP: 0033:0x7faafc2dd5d7
<4>[  691.011923] RSP: 002b:00007fff99602dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  691.011928] RAX: ffffffffffffffda RBX: 00005562cdfd9810 RCX: 00007faafc2dd5d7
<4>[  691.011931] RDX: 00007fff99602e30 RSI: 00000000c03864bc RDI: 0000000000000003
<4>[  691.011934] RBP: 00007fff99602e30 R08: 00005562cdf7fbc0 R09: 00005562cdfda500
<4>[  691.011938] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000c03864bc
<4>[  691.011940] R13: 0000000000000003 R14: 00005562cdfd8ca0 R15: 00005562cdfd5580
<4>[  691.011948] Code: 63 90 d0 00 00 00 41 3b 57 28 0f 8c d1 01 00 00 48 8b 90 50 04 00 00 48 8b 42 10 48 85 c0 74 d0 48 83 7a 08 00 0f 84 64 05 00 00 <8b> 50 70 83 fa 0a 0f 84 dc 01 00 00 0f 87 be 01 00 00 83 ea 06 
<1>[  691.012086] RIP: intel_atomic_check+0x984/0x1080 [i915] RSP: ffff9c7700e1faf0
<4>[  691.012115] ---[ end trace 7bcc1481cf759876 ]---
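One detail in the register dump above is worth checking mechanically: RAX holds 6b6b6b6b6b6b6b6b, and 0x6b is the byte the kernel writes over freed slab objects when poisoning is enabled (POISON_FREE in include/linux/poison.h). That suggests the pointer being dereferenced may have been read from freed memory. The following is a minimal sketch of such a check, not an existing tool; `poisoned_registers` and the regular expression are hypothetical helpers written for this illustration, and the sample input is copied from the trace above.

```python
import re

# Byte the kernel writes over freed slab objects when poisoning is enabled
# (POISON_FREE in include/linux/poison.h).
POISON_FREE = 0x6B

# Matches "RAX: 6b6b6b6b6b6b6b6b" style pairs in an oops register dump.
# Segment-prefixed values such as "RSP: 0018:ffff..." are deliberately skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def poisoned_registers(oops_text):
    """Return the names of registers whose 8 bytes all equal POISON_FREE."""
    pattern = f"{POISON_FREE:02x}" * 8
    return [name for name, value in REG_RE.findall(oops_text) if value == pattern]


# Two register lines taken from the oops in this report.
sample = """\
<4>[  691.011783] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000001
<4>[  691.011787] RDX: ffff8807594d6278 RSI: ffffffff8e0eac19 RDI: 00000000ffffffff
"""

print(poisoned_registers(sample))
```

A match is only a hint: it shows that the register value has the poison pattern, not which object was freed or by whom.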
Comment 1 Martin Peres 2018-04-17 10:44:02 UTC
This has been reproduced on all subsequent runs.
Comment 2 Stanislav Lisovskiy 2018-04-23 14:12:57 UTC
I will take a look at this, if nobody objects.
Comment 3 Stanislav Lisovskiy 2018-05-11 13:27:31 UTC
I am still not able to reproduce the oops, despite using the exact same test sequence (extracted from shards x0027, x0030), the same skl-6770hq hardware, and even the same kernel version.

Those tests seem to fail more frequently with a vblank_matches_assertion failure, though, especially if all drm.debug bits are set (which probably affects timings). However, no similar crash happens.

So I still need more information on this one.
Comment 4 Maarten Lankhorst 2018-05-14 13:11:09 UTC
eu-addr2line on the backtrace points to encoder being NULL in check_digital_port_conflicts.

However, that should only be called when a modeset is requested. Still, let's figure out what happens.

https://patchwork.freedesktop.org/series/43132/
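
To resolve a frame such as `intel_atomic_check+0x984/0x1080 [i915]` to a source line, the symbol, offset, function size, and module first have to be separated before they are handed to a resolver. A minimal sketch of that split follows; `parse_frame` is a hypothetical helper for illustration and is not part of eu-addr2line or any other tool.

```python
import re

# symbol+0xOFFSET/0xSIZE, optionally followed by " [module]".
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_frame(text):
    """Split a kernel backtrace frame into symbol, offset, size and module."""
    m = FRAME_RE.search(text)
    if m is None:
        return None
    return {
        "symbol": m["symbol"],
        "offset": int(m["offset"], 16),  # bytes from the start of the symbol
        "size": int(m["size"], 16),      # total size of the function in bytes
        "module": m["module"],           # None for built-in (non-module) code
    }


# RIP line from the oops in this report.
frame = parse_frame("RIP: 0010:intel_atomic_check+0x984/0x1080 [i915]")
print(frame)
```

The offset is relative to the start of `intel_atomic_check` inside the i915 module, so the lookup has to be done against an i915 build with debug information that matches the crashing kernel.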
Comment 5 Stanislav Lisovskiy 2018-05-14 14:11:36 UTC
(In reply to Maarten Lankhorst from comment #4)
> eu-addr2line on the backtrace points to encoder being NULL in
> check_digital_port_conflicts.
> 
> However that should only be called when a modeset is requested. Still lets
> figure out what happens..
> 
> https://patchwork.freedesktop.org/series/43132/

Ioctls that change the cursor position are sent in parallel with disabling/enabling pipe2. To me it looks like a concurrency issue; I got the same result with objdump. The question, then, is why it is not reproducible.
Comment 6 Jani Saarinen 2018-05-22 11:14:30 UTC
Not seen on CI lately.
Comment 7 Martin Peres 2018-05-22 20:41:01 UTC
(In reply to Jani Saarinen from comment #6)
> Not seen on CI lately.

Agreed, it was very consistent, but not anymore (past ~12 runs). I guess we can call this fixed!

