Bug 103738 - [IGT] igt@gem_exec/igt@gem_reloc/igt@sync: some subtests have a dmesg-warn "watchdog: BUG: soft lockup - CPU#1 stuck for 23s!"
Status: CLOSED NOTABUG
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: low enhancement
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2017-11-14 16:21 UTC by Hector Velazquez
Modified: 2018-04-09 10:23 UTC

i915 platform: BXT, CFL, CNL, GLK, KBL
i915 features: GEM/execlists


Attachments
dmesg-warn (23.04 KB, text/plain)
2017-11-14 16:21 UTC, Hector Velazquez
kern-log (160.35 MB, text/plain)
2018-01-29 20:31 UTC, Ricardo Perez
kernel log (comment 13) (342.41 KB, text/plain)
2018-01-30 15:49 UTC, Hector Velazquez
dmesg -H (comment 13) (192.87 KB, text/plain)
2018-01-30 15:50 UTC, Hector Velazquez
Kernel log CFL (245.22 KB, text/plain)
2018-02-27 18:25 UTC, Octavio

Description Hector Velazquez 2017-11-14 16:21:24 UTC
Created attachment 135452
dmesg-warn
Comment 1 Hector Velazquez 2017-11-14 16:21:30 UTC
These tests have a dmesg-warn on CFL QA:

igt@gem_exec_reloc@cpu-32
igt@gem_exec_reloc@readonly-32
igt@gem_reloc_overflow@wrapped-overflow

====================================================
dmesg-warn Sample
====================================================
[  234.116432] Setting dangerous option reset - tainting kernel
[  260.176003] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [gem_exec_reloc:1940]
[  260.176006] Modules linked in: snd_hda_codec_hdmi pegasus mii ip6table_filter ip6_tables iptable_filter bnep snd_hda_codec_realtek snd_hda_codec_generic binfmt_misc nls_iso8859_1 intel_rapl 8250_dw x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_intel kvm snd_hda_codec irqbypass snd_hda_core crct10dif_pclmul snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel aes_x86_64 snd_seq crypto_simd glue_helper btusb cryptd btrtl idma64 btbcm virt_dma intel_cstate btintel iwlwifi snd_seq_device bluetooth snd_timer intel_rapl_perf input_leds intel_lpss_pci snd ecdh_generic cfg80211 wmi_bmof serio_raw soundcore intel_lpss acpi_als kfifo_buf industrialio soc_button_array winbond_cir rc_core spidev tpm_crb acpi_pad intel_hid mac_hid sparse_keymap
[  260.176059]  parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid i915 i2c_algo_bit prime_numbers drm_kms_helper syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops ahci ptp pps_core libahci drm wmi video i2c_hid hid
[  260.176068] CPU: 1 PID: 1940 Comm: gem_exec_reloc Tainted: G     U          4.14.0-rc8-drm-tip-ww45-commit-1342299+ #1
[  260.176069] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X098.A00.1707301945 07/30/2017
[  260.176084] task: ffff9dbeff0f6c80 task.stack: ffffc0fc05a00000
[  260.176115] RIP: 0010:i915_exit+0x49/0x13b [i915]
[  260.176116] RSP: 0018:ffffc0fc05a038a8 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff10
[  260.176117] RAX: 0000000000000000 RBX: ffffc0fc05a03b40 RCX: 0000000000000000
[  260.176117] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  260.176117] RBP: ffffc0fc05a03af0 R08: 000000000536d1b3 R09: 0000000000001000
[  260.176118] R10: ffff9dbf047163c0 R11: 0000000000000000 R12: ffff9dbf047163c0
[  260.176118] R13: 00007f6001e2b810 R14: 00007f6001e2b670 R15: ffffc0fc05a03920
[  260.176119] FS:  00007f605dc73880(0000) GS:ffff9dbf0b240000(0000) knlGS:0000000000000000
[  260.176119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  260.176120] CR2: 00007f6001e2b670 CR3: 0000000842a8b006 CR4: 00000000003606e0
[  260.176120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  260.176121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  260.176121] Call Trace:
[  260.176134]  i915_gem_do_execbuffer+0x55e/0x1060 [i915]
[  260.176137]  ? shmem_getpage_gfp+0x735/0xc00
[  260.176139]  ? get_page_from_freelist+0x24e/0xaa0
[  260.176140]  ? get_page_from_freelist+0x24e/0xaa0
[  260.176141]  ? __kmalloc_node+0x1e6/0x2b0
[  260.176151]  i915_gem_execbuffer2+0x1b0/0x390 [i915]
[  260.176153]  ? find_next_bit+0xb/0x10
[  260.176155]  ? cpumask_any_but+0x2c/0x40
[  260.176165]  ? i915_gem_execbuffer+0x2c0/0x2c0 [i915]
[  260.176171]  drm_ioctl_kernel+0x69/0xb0 [drm]
[  260.176175]  drm_ioctl+0x340/0x450 [drm]
[  260.176185]  ? i915_gem_execbuffer+0x2c0/0x2c0 [i915]
[  260.176187]  do_vfs_ioctl+0xa1/0x5e0
[  260.176188]  SyS_ioctl+0x79/0x90
[  260.176189]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[  260.176190] RIP: 0033:0x7f605c1834b7
[  260.176190] RSP: 002b:00007ffe6ff901a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  260.176191] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f605c1834b7
[  260.176192] RDX: 00007ffe6ff90260 RSI: 0000000040406469 RDI: 0000000000000003
[  260.176192] RBP: 00007f5f5b088000 R08: ffffffffffffffff R09: 0000000000000000
[  260.176192] R10: 0000000000000487 R11: 0000000000000246 R12: 0000000100000000
[  260.176193] R13: 00007ffe6ff90220 R14: 0000000000000100 R15: 0000000008000000
[  260.176193] Code: ff ff ff e9 a8 89 f1 ff b9 f2 ff ff ff e9 af 89 f1 ff b8 f2 ff ff ff 40 30 f6 e9 d3 b1 f2 ff b8 f2 ff ff ff 30 d2 e9 d8 b1 f2 ff <ba> f2 ff ff ff e9 df cb f2 ff b8 f2 ff ff ff e9 91 d0 f2 ff b8

This is my configuration:

======================================
	Graphic stack
======================================
Component: drm
    tag: libdrm-2.4.81-96-g931f019
    commit: 931f01964a2f2a75e8563feccc70ac2eb0296d99

Component: cairo
    tag: 1.15.6-82-g164be89
    commit: 164be896603ceb419c5bc47c7348781f791f70e4

Component: intel-gpu-tools
    tag: intel-gpu-tools-1.19-481-g7d75119
    commit: 7d75119b7f23fb49af52463da9bcd62e64fe6a6f

Component: piglit
    tag: piglit-v1
    commit: 733e3ab212fcce735f47ed9f8659ccdf6f625a70

======================================
	     Software
======================================
kernel version              : 4.14.0-rc8-drm-tip-ww45-commit-1342299+
hostname                    : gfx-desktop
architecture                : x86_64
os version                  : Ubuntu 16.10
os codename                 : yakkety
kernel driver               : i915
bios revision               : 98.0
bios release date           : 07/30/2017
ksc                         : 1.5
hardware acceleration       : disabled
swap partition              : enabled on (/dev/sda3)

======================================
	Graphic drivers
======================================
libdrm                      : 2.4.88
cairo                       : 1.15.9
intel-gpu-tools (tag)       : intel-gpu-tools-1.19-481-g7d75119
intel-gpu-tools (commit)    : 7d75119

======================================
	     Hardware
======================================
motherboard model          : CoffeeLakeClientPlatform
motherboard id             : CoffeeLakeHDDR4RVP
form factor                : Laptop
manufacturer               : IntelCorporation
cpu family                 : Other
cpu family id              : 6
cpu information            : Genuine Intel(R) CPU 0000 @ 2.80GHz
gpu card                   : Intel Corporation Device 3e9b (prog-if 00 [VGA controller])
memory ram                 : 31.3 GB
max memory ram             : 32 GB
cpu thread                 : 12
cpu core                   : 6
cpu model                  : 158
cpu stepping               : 10
socket                     : Other
hard drive                 : 74GiB (80GB)
current cd clock frequency : 337500 kHz
maximum cd clock frequency : 675000 kHz
displays connected         : DP-1 DP-2 DP-3

======================================
	     Firmware
======================================
dmc fw loaded             : yes
dmc version               : 1.1
guc fw loaded             : fetch SUCCESS, load SUCCESS
guc version wanted        : wanted 9.14, found 9.14
guc version found         : wanted 9.14, found 9.14

======================================
	     kernel parameters
======================================
quiet splash drm.debug=0xe intel_iommu=igfx_off i915.alpha_support=1 i915.enable_guc_loading=2 i915.enable_guc_submission=2 resume=/dev/sda3
Comment 2 Chris Wilson 2017-11-14 17:17:40 UTC
It's processing nearly a billion entries, and that takes time. It's an unusual igt stress case, and I'm reluctant to add cond_resched() just to eliminate the warning (at the cost of slowing down the typical case).
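
To put the scale mentioned in comment 2 in perspective, here is a back-of-the-envelope sketch in Python. It assumes the 32-byte field layout of struct drm_i915_gem_relocation_entry from the i915 uAPI header and the default soft-lockup threshold of 20 s (twice the default watchdog_thresh of 10 s). The entry count is simply the "nearly a billion" figure quoted above, not a measured value, and the helper names are made up for illustration.

```python
# Assumed layout of struct drm_i915_gem_relocation_entry (i915 uAPI):
# u32 target_handle, u32 delta, u64 offset, u64 presumed_offset,
# u32 read_domains, u32 write_domain.
RELOC_ENTRY_BYTES = 4 + 4 + 8 + 8 + 4 + 4


def reloc_bytes(entries: int) -> int:
    """Bytes of relocation data the kernel has to read for `entries` entries."""
    return entries * RELOC_ENTRY_BYTES


def per_entry_budget_ns(entries: int, threshold_s: float = 20.0) -> float:
    """Time available per entry if the whole loop had to finish before the
    soft-lockup watchdog fires (threshold_s is an assumed default)."""
    return threshold_s * 1e9 / entries


# Illustrative count only: "nearly a billion entries" from the comment above.
entries = 10**9
print(f"relocation data: {reloc_bytes(entries) / 2**30:.1f} GiB")
print(f"budget per entry: {per_entry_budget_ns(entries):.1f} ns")
```

Under these assumptions the loop has only a few tens of nanoseconds per entry before the watchdog complains, which is consistent with the explanation that the warning reflects the size of the workload rather than a hang.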
Comment 3 Hector Velazquez 2017-11-17 15:36:02 UTC
This test has the same failure on GLK QA, which is expected (see comment 2):

igt@gem_reloc_overflow@wrapped-overflow

IGT-Version: 1.20-g88d6550 (x86_64) (Linux: 4.14.0-drm-tip-ww46-commit-1fc4fe8+ x86_64), fastfeedback-nov-ww46-thursday-07-03-33-code-179785857
Comment 4 Hector Velazquez 2017-12-07 20:17:11 UTC
These tests have the same dmesg-warn on CNL QA (see comment 2):

igt@gem_reloc_overflow@wrapped-overflow
igt@gem_exec_reloc@readonly-31
igt@gem_sync@vebox

using IGT-Version: 1.20-g1db1246 (x86_64) (Linux: 4.15.0-rc2-drm-tip-ww49-commit-66be577+ x86_64)
Component: intel-gpu-tools
    tag: intel-gpu-tools-1.20-189-g1db1246
    commit: 1db12466cb5ad8483cd469753d2e312a62d717b7
Comment 5 Chris Wilson 2017-12-07 20:20:17 UTC
(In reply to Hector Velazquez from comment #4)
> These tests have the same dmesg-warn on CNL QA (see comment 2):
> 
> igt@gem_reloc_overflow@wrapped-overflow
> igt@gem_exec_reloc@readonly-31
> igt@gem_sync@vebox

gem_sync is very much the odd one out there. Please attach the kernel logs for the gem_sync error.
Comment 6 Octavio 2017-12-12 17:55:03 UTC
The tests below have a dmesg-warn on BXT:

gem_exec_reloc@cpu-31 
gem_exec_reloc@readonly-31
gem_reloc_overflow@wrapped-overflow

====================================================
Configuration
====================================================


BXT IGT-Version: 1.20-g39ac6b8 (x86_64) (Linux: 4.15.0-rc2-drm-intel-qa-ww49-commit-bdf9b36+ x86_64)
Comment 7 Octavio 2017-12-28 21:28:01 UTC
The below tests are still failing on BXT 

igt@gem_exec_reloc@cpu-31
igt@gem_exec_reloc@readonly-31

using IGT-Version: 1.20-g4cd4cc4 (x86_64) (Linux: 4.15.0-rc5-drm-tip-ww52-commit-42a41a5+ x86_64)
Comment 8 Chris Wilson 2018-01-03 19:17:10 UTC
The dmesg-warn is not a fail. It's just telling you that it takes a long time to read and process many gigabytes of relocation data. What's the fail you see?

I note that the requested information about the odd-one-out has not materialised.
Comment 9 Elizabeth 2018-01-03 21:45:31 UTC
Octavio, Hector, please attach the kernel log. Thanks.
Comment 10 Ricardo Perez 2018-01-29 20:31:43 UTC
Created attachment 137033
kern-log

Kernel Log for CoffeeLake S UDIMM RVP
Comment 11 Ricardo Perez 2018-01-29 20:34:02 UTC
For the CoffeeLake S UDIMM RVP QA system, the following test is failing:

igt@gem_reloc_overflow@wrapped-overflow

Running: 
IGT-Version: 1.21-g37bd27f (x86_64) (Linux: 4.15.0-rc9-drm-intel-qa-ww4-commit-59275f1+ x86_64)

-----------------------------------------------------------------------------

[  132.043996] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [gem_reloc_overf:8093]
[  132.044001] Modules linked in: asix usbnet mii ip6table_filter ip6_tables iptable_filter cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi 8250_dw snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec snd_hda_core kvm_intel snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event crct10dif_pclmul crc32_pclmul snd_rawmidi ghash_clmulni_intel pcbc snd_seq aesni_intel snd_seq_device iwlwifi aes_x86_64 btusb crypto_simd snd_timer btrtl glue_helper btbcm cryptd btintel intel_cstate bluetooth idma64 intel_rapl_perf snd virt_dma input_leds wmi_bmof serio_raw intel_lpss_pci ecdh_generic soundcore cfg80211 intel_lpss acpi_pad intel_pch_thermal mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid
[  132.044049]  hid i915 e1000e ptp ahci pps_core libahci wmi video
[  132.044052] CPU: 5 PID: 8093 Comm: gem_reloc_overf Tainted: G     U           4.15.0-rc9-drm-intel-qa-ww4-commit-59275f1+ #1
[  132.044053] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[  132.044083] RIP: 0010:eb_relocate_entry+0xe3/0xbc0 [i915]
[  132.044084] RSP: 0018:ffffb6614b4c7858 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[  132.044084] RAX: 0000000000000000 RBX: ffff9eaa92cdcdc0 RCX: ffff9eaa8c3a3870
[  132.044085] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000101
[  132.044085] RBP: ffff9eaa924bd600 R08: 0000000000000404 R09: 0000000000000010
[  132.044085] R10: 0000000008000000 R11: ffff9eaa90f7cc40 R12: ffffb6614b4c7b70
[  132.044086] R13: 00007f72471a1c30 R14: ffff9eaa924bd600 R15: ffffb6614b4c7918
[  132.044086] FS:  00007f725d5d1880(0000) GS:ffff9eaa9cf40000(0000) knlGS:0000000000000000
[  132.044087] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  132.044087] CR2: 0000558d26beca88 CR3: 00000004547c6003 CR4: 00000000003606e0
[  132.044088] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  132.044088] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  132.044088] Call Trace:
[  132.044100]  eb_relocate_vma+0x123/0x1c0 [i915]
[  132.044111]  i915_gem_do_execbuffer+0x599/0x1060 [i915]
[  132.044114]  ? unwind_next_frame+0x25c/0x690
[  132.044115]  ? __module_text_address+0xe/0x60
[  132.044116]  ? __save_stack_trace+0x92/0x100
[  132.044118]  ? create_object+0x24e/0x300
[  132.044128]  i915_gem_execbuffer2+0xee/0x360 [i915]
[  132.044137]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  132.044139]  drm_ioctl_kernel+0x65/0xb0
[  132.044140]  drm_ioctl+0x2e5/0x3e0
[  132.044149]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  132.044150]  ? security_file_free+0x20/0x30
[  132.044152]  ? _cond_resched+0x16/0x40
[  132.044153]  do_vfs_ioctl+0x9f/0x5f0
[  132.044154]  ? _cond_resched+0x16/0x40
[  132.044155]  ? task_work_run+0x33/0xa0
[  132.044156]  SyS_ioctl+0x74/0x80
[  132.044157]  ? SyS_rt_sigprocmask+0x8b/0xc0
[  132.044158]  entry_SYSCALL_64_fastpath+0x24/0x87
[  132.044159] RIP: 0033:0x7f725bae34b7
[  132.044160] RSP: 002b:00007ffc06480e68 EFLAGS: 00000246
[  132.044160] Code: 48 63 ff 48 39 fa 73 cc 49 8b 4c 24 20 48 8b 1c d1 48 85 db 74 be 41 8b 47 1c 8d 50 ff 85 c2 0f 85 11 05 00 00 45 8b 4f 18 89 c2 <44> 09 ca 83 e2 c1 0f 85 2f 05 00 00 85 c0 0f 85 4f 01 00 00 48 
[  160.047996] watchdog: BUG: soft lockup - CPU#11 stuck for 23s! [gem_reloc_overf:8093]
[  160.047999] Modules linked in: asix usbnet mii ip6table_filter ip6_tables iptable_filter cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi 8250_dw snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec snd_hda_core kvm_intel snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event crct10dif_pclmul crc32_pclmul snd_rawmidi ghash_clmulni_intel pcbc snd_seq aesni_intel snd_seq_device iwlwifi aes_x86_64 btusb crypto_simd snd_timer btrtl glue_helper btbcm cryptd btintel intel_cstate bluetooth idma64 intel_rapl_perf snd virt_dma input_leds wmi_bmof serio_raw intel_lpss_pci ecdh_generic soundcore cfg80211 intel_lpss acpi_pad intel_pch_thermal mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid
[  160.048047]  hid i915 e1000e ptp ahci pps_core libahci wmi video
[  160.048051] CPU: 11 PID: 8093 Comm: gem_reloc_overf Tainted: G     U       L   4.15.0-rc9-drm-intel-qa-ww4-commit-59275f1+ #1
[  160.048051] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[  160.048054] RIP: 0010:check_stack_object+0x0/0x40
[  160.048054] RSP: 0018:ffffb6614b4c78b8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[  160.048055] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000027
[  160.048056] RDX: 000017bb0b4c78f8 RSI: 0000000000000200 RDI: ffffb6614b4c78f8
[  160.048056] RBP: 0000000000000000 R08: 0000000000000404 R09: 0000000000000010
[  160.048056] R10: 0000000000000000 R11: 0000000000000040 R12: ffffb6614b4c7af8
[  160.048057] R13: ffffb6614b4c78f8 R14: 00000000003ca5e0 R15: ffff9eaa90f7cc40
[  160.048071] FS:  00007f725d5d1880(0000) GS:ffff9eaa9d0c0000(0000) knlGS:0000000000000000
[  160.048072] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  160.048072] CR2: 00007ffb767b9008 CR3: 00000004547c6002 CR4: 00000000003606e0
[  160.048072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  160.048073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  160.048073] Call Trace:
[  160.048075]  __check_object_size+0xf6/0x190
[  160.048090]  eb_relocate_vma+0xc0/0x1c0 [i915]
[  160.048102]  i915_gem_do_execbuffer+0x599/0x1060 [i915]
[  160.048104]  ? unwind_next_frame+0x25c/0x690
[  160.048106]  ? __module_text_address+0xe/0x60
[  160.048107]  ? __save_stack_trace+0x92/0x100
[  160.048108]  ? create_object+0x24e/0x300
[  160.048118]  i915_gem_execbuffer2+0xee/0x360 [i915]
[  160.048127]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  160.048129]  drm_ioctl_kernel+0x65/0xb0
[  160.048130]  drm_ioctl+0x2e5/0x3e0
[  160.048139]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  160.048140]  ? security_file_free+0x20/0x30
[  160.048142]  ? _cond_resched+0x16/0x40
[  160.048143]  do_vfs_ioctl+0x9f/0x5f0
[  160.048144]  ? _cond_resched+0x16/0x40
[  160.048145]  ? task_work_run+0x33/0xa0
[  160.048146]  SyS_ioctl+0x74/0x80
[  160.048147]  ? SyS_rt_sigprocmask+0x8b/0xc0
[  160.048148]  entry_SYSCALL_64_fastpath+0x24/0x87
[  160.048149] RIP: 0033:0x7f725bae34b7
[  160.048150] RSP: 002b:00007ffc06480e78 EFLAGS: 00000246
[  160.048150] Code: c7 45 04 00 00 00 00 5b 5d 41 5c c3 e8 3a d9 f7 ff eb e1 e8 c3 fe ff ff 85 c0 4c 63 e0 75 a8 eb a2 0f ff eb db 90 90 90 90 90 90 <0f> 1f 44 00 00 65 48 8b 04 25 c0 5b 01 00 48 8b 40 40 48 01 fe 
[  188.043993] watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [gem_reloc_overf:8093]
[  188.043997] Modules linked in: asix usbnet mii ip6table_filter ip6_tables iptable_filter cmac bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi 8250_dw snd_hda_codec_realtek snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel snd_hda_codec snd_hda_core kvm_intel snd_hwdep kvm snd_pcm irqbypass snd_seq_midi snd_seq_midi_event crct10dif_pclmul crc32_pclmul snd_rawmidi ghash_clmulni_intel pcbc snd_seq aesni_intel snd_seq_device iwlwifi aes_x86_64 btusb crypto_simd snd_timer btrtl glue_helper btbcm cryptd btintel intel_cstate bluetooth idma64 intel_rapl_perf snd virt_dma input_leds wmi_bmof serio_raw intel_lpss_pci ecdh_generic soundcore cfg80211 intel_lpss acpi_pad intel_pch_thermal mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid
[  188.044045]  hid i915 e1000e ptp ahci pps_core libahci wmi video
[  188.044048] CPU: 5 PID: 8093 Comm: gem_reloc_overf Tainted: G     U       L   4.15.0-rc9-drm-intel-qa-ww4-commit-59275f1+ #1
[  188.044049] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[  188.044078] RIP: 0010:eb_relocate_entry+0x32/0xbc0 [i915]
[  188.044079] RSP: 0018:ffffb6614b4c7858 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff11
[  188.044080] RAX: 0a708ac0f8b29d00 RBX: ffffb6614b4c7b70 RCX: 0000000000000000
[  188.044080] RDX: 0000000000000000 RSI: ffff9eaa92514840 RDI: 00000000fffffeff
[  188.044080] RBP: ffff9eaa92514840 R08: 0000000000000404 R09: 0000000000000010
[  188.044081] R10: 0000000000000000 R11: 0000000000000040 R12: ffffb6614b4c7b70
[  188.044081] R13: 00007f725523f830 R14: ffff9eaa92514840 R15: ffffb6614b4c7918
[  188.044082] FS:  00007f725d5d1880(0000) GS:ffff9eaa9cf40000(0000) knlGS:0000000000000000
[  188.044082] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  188.044082] CR2: 00007fbb307109b8 CR3: 00000004547c6006 CR4: 00000000003606e0
[  188.044083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  188.044083] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  188.044084] Call Trace:
[  188.044095]  eb_relocate_vma+0x123/0x1c0 [i915]
[  188.044106]  i915_gem_do_execbuffer+0x599/0x1060 [i915]
[  188.044108]  ? unwind_next_frame+0x25c/0x690
[  188.044110]  ? __module_text_address+0xe/0x60
[  188.044111]  ? __save_stack_trace+0x92/0x100
[  188.044113]  ? _raw_write_lock_irqsave+0x2b/0x30
[  188.044114]  ? create_object+0x24e/0x300
[  188.044124]  i915_gem_execbuffer2+0xee/0x360 [i915]
[  188.044133]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  188.044134]  drm_ioctl_kernel+0x65/0xb0
[  188.044135]  drm_ioctl+0x2e5/0x3e0
[  188.044144]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  188.044145]  ? security_file_free+0x20/0x30
[  188.044146]  ? _cond_resched+0x16/0x40
[  188.044147]  do_vfs_ioctl+0x9f/0x5f0
[  188.044148]  ? _cond_resched+0x16/0x40
[  188.044149]  ? task_work_run+0x33/0xa0
[  188.044150]  SyS_ioctl+0x74/0x80
[  188.044151]  ? SyS_rt_sigprocmask+0x8b/0xc0
[  188.044152]  entry_SYSCALL_64_fastpath+0x24/0x87
[  188.044153] RIP: 0033:0x7f725bae34b7
[  188.044153] RSP: 002b:00007ffc06480e78 EFLAGS: 00000246
[  188.044154] Code: 41 56 49 89 d7 41 55 41 54 49 89 fc 55 53 49 89 f6 48 83 ec 58 8b 12 8b bf 68 01 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 50 <31> c0 85 ff 48 89 d6 78 74 69 c2 47 86 c8 61 b9 20 00 00 00 29 
[  198.543277] Setting dangerous option prefault_disable - tainting kernel
[  198.543547] Setting dangerous option prefault_disable - tainting kernel
[  198.543754] Setting dangerous option prefault_disable - tainting kernel
Comment 12 Jari Tahvanainen 2018-01-30 08:12:12 UTC
Dropping the priority from Highest back to Medium; I missed comment 8,
which says this is kind of expected behavior.
Comment 13 Hector Velazquez 2018-01-30 15:48:35 UTC
igt@gem_sync@vebox

These tests have a dmesg-warn on CFL QA

Tests List:

igt@gem_exec_reloc@readonly-32
igt@gem_reloc_overflow@wrapped-overflow

Note: kernel log has been attached...

IGT-Version: 1.21-g098a401 (x86_64) (Linux: 4.15.0-drm-tip-ww5-commit-d0eb027+ x86_64)

======================================
        dmesg-warn sample
======================================
. . .
[  580.165105] Setting dangerous option reset - tainting kernel
[  604.056004] watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [gem_exec_reloc:1857]
[  604.056038] Modules linked in: snd_hda_codec_hdmi asix usbnet mii bnep arc4 binfmt_misc nls_iso8859_1 iwlmvm snd_hda_codec_realtek snd_hda_codec_generic mac80211 8250_dw snd_hda_intel snd_hda_codec intel_rapl x86_pkg_temp_thermal snd_hda_core intel_powerclamp snd_hwdep coretemp snd_pcm kvm_intel snd_seq_midi kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_seq_midi_event snd_rawmidi btusb btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel glue_helper cryptd snd_seq iwlwifi intel_cstate intel_rapl_perf bluetooth snd_seq_device input_leds snd_timer snd serio_raw idma64 wmi_bmof ecdh_generic virt_dma cfg80211 soundcore intel_lpss_pci intel_pch_thermal intel_lpss tpm_crb acpi_pad mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid uas hid usb_storage i915 e1000e
[  604.056063]  ptp ahci pps_core libahci wmi video
[  604.056079] CPU: 11 PID: 1857 Comm: gem_exec_reloc Tainted: G     U           4.15.0-drm-tip-ww5-commit-d0eb027+ #1
[  604.056080] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[  604.056114] RIP: 0010:i915_exit+0x44/0x5f4 [i915]
[  604.056114] RSP: 0018:ffffb15004f678e8 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff11
[  604.056115] RAX: 0000000000000000 RBX: ffffb15004f67b70 RCX: 0000000000000000
[  604.056115] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  604.056116] RBP: ffff9c333bbc1340 R08: 0000000000000400 R09: 0000000004b9dc13
[  604.056116] R10: ffff9c333bbc1340 R11: 0000000000000001 R12: 00007fb9976a8410
[  604.056116] R13: 00007fb9976a8270 R14: ffffb15004f67958 R15: ffff9c3341206ac0
[  604.056117] FS:  00007fba02eac8c0(0000) GS:ffff9c334b4c0000(0000) knlGS:0000000000000000
[  604.056117] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  604.056118] CR2: 00007fb9976a8270 CR3: 00000008398bc004 CR4: 00000000003606e0
[  604.056119] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  604.056119] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  604.056119] Call Trace:
[  604.056133]  i915_gem_do_execbuffer+0x599/0x1060 [i915]
[  604.056136]  ? shmem_getpage_gfp+0x79a/0xcb0
[  604.056137]  ? is_bpf_text_address+0xa/0x20
[  604.056139]  ? __save_stack_trace+0x92/0x100
[  604.056140]  ? create_object+0x24e/0x300
[  604.056151]  i915_gem_execbuffer2+0xee/0x360 [i915]
[  604.056162]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  604.056164]  drm_ioctl_kernel+0x65/0xb0
[  604.056165]  drm_ioctl+0x2e5/0x3e0
[  604.056174]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  604.056176]  do_vfs_ioctl+0x9f/0x5f0
[  604.056177]  SyS_ioctl+0x74/0x80
[  604.056179]  entry_SYSCALL_64_fastpath+0x24/0x87
[  604.056180] RIP: 0033:0x7fba013b3f47
[  604.056180] RSP: 002b:00007ffe0475bb38 EFLAGS: 00000246
[  604.056181] Code: ff ff ff e9 02 65 f1 ff b9 f2 ff ff ff e9 0a 65 f1 ff b8 f2 ff ff ff 40 30 f6 e9 5d 83 f2 ff b8 f2 ff ff ff 30 d2 e9 62 83 f2 ff <ba> f2 ff ff ff e9 18 9e f2 ff b8 f2 ff ff ff e9 c9 a2 f2 ff b8
. . .
Comment 14 Hector Velazquez 2018-01-30 15:49:54 UTC
Created attachment 137051
kernel log (comment 13)
Comment 15 Hector Velazquez 2018-01-30 15:50:26 UTC
Created attachment 137052
dmesg -H (comment 13)
Comment 16 Chris Wilson 2018-01-30 15:55:25 UTC
(In reply to Hector Velazquez from comment #13)
> igt@gem_sync@vebox

And where is the info about gem_sync? The reloc warnings are the kernel functioning as expected and the test should succeed, just slowly. But for some reason, you've flagged gem_sync as an issue and gem_sync should not be triggering any of these warnings.
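
One way to check which test binary actually triggered the warnings is to tally the soft-lockup lines in a kernel log by task name. The following is a minimal Python sketch, not a tool used in this report: the function name and the regular expression are assumptions based on the watchdog line format seen in the logs pasted in this bug, and the sample lines are copied from those logs. Note that the kernel truncates the task name to 15 characters, so gem_reloc_overflow appears as gem_reloc_overf.

```python
import re
from collections import Counter

# Matches lines such as:
# [  260.176003] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [gem_exec_reloc:1940]
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.+):(\d+)\]"
)


def lockups_by_task(dmesg_text: str) -> Counter:
    """Count soft-lockup warnings per task name (comm) in a dmesg dump."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts


# Sample lines taken from the logs pasted earlier in this report.
sample = """\
[  260.176003] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [gem_exec_reloc:1940]
[  132.043996] watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [gem_reloc_overf:8093]
[  160.047996] watchdog: BUG: soft lockup - CPU#11 stuck for 23s! [gem_reloc_overf:8093]
[  604.056004] watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [gem_exec_reloc:1857]
"""

for task, count in lockups_by_task(sample).most_common():
    print(task, count)
```

Run against a full kernel log, a tally like this would show directly whether gem_sync ever appears as the stuck task, or only the reloc tests.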
Comment 17 Hector Velazquez 2018-01-30 16:42:22 UTC
Comment 13 was uploaded incomplete, sorry about that.
This is the missing part:

These tests passed successfully on CFL QA:

igt@gem_exec_reloc@cpu-32
igt@gem_exec_reloc@readonly-31
igt@gem_sync@vebox
Comment 18 Hector Velazquez 2018-02-09 17:56:30 UTC
This test has a dmesg-warn on GLK QA 

Tests List:

igt@gem_exec_reloc@readonly-31

IGT-Version: 1.21-g94bd67c (x86_64) (Linux: 4.15.0-drm-tip-ww6-commit-078873d+ x86_64)

======================================
        dmesg-warn sample
======================================
. . .
[  199.267219] Setting dangerous option reset - tainting kernel
[  225.359139] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [gem_exec_reloc:2023]
[  225.359148] Modules linked in: spi_pxa2xx_platform 8250_dw intel_rapl intel_telemetry_pltdrv intel_pmc_ipc intel_punit_ipc intel_telemetry_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf serio_raw wmi_bmof binfmt_misc lpc_ich nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq joydev input_leds snd_seq_device snd_timer idma64 virt_dma snd intel_lpss_pci shpchp intel_lpss soundcore
[  225.359189]  mei_me mei rfkill_gpio soc_button_array intel_vbtn dptf_power int3406_thermal int3403_thermal int340x_thermal_zone intel_hid int3400_thermal acpi_thermal_rel sparse_keymap mac_hid parport_pc ppdev lp parport autofs4 ahci r8169 i915 mii libahci wmi i2c_hid video hid_generic usbhid hid
[  225.359210] CPU: 1 PID: 2023 Comm: gem_exec_reloc Tainted: G     U           4.15.0-drm-tip-ww6-commit-078873d+ #1
[  225.359211] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0077.B50.1712072148 12/07/2017
[  225.359261] RIP: 0010:i915_exit+0x44/0xab9 [i915]
[  225.359262] RSP: 0018:ffffb71c819478e8 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff11
[  225.359263] RAX: 0000000000000000 RBX: ffffb71c81947b70 RCX: 0000000000000000
[  225.359264] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  225.359265] RBP: ffff8c356abdec00 R08: 0000000000000400 R09: 000000000374c84b
[  225.359266] R10: ffff8c356abdec00 R11: 0000000000000001 R12: 00007f23199f4a10
[  225.359266] R13: 00007f23199f4970 R14: ffffb71c81947a58 R15: ffff8c3569c83900
[  225.359267] FS:  00007f232dc238c0(0000) GS:ffff8c357fc80000(0000) knlGS:0000000000000000
[  225.359268] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  225.359269] CR2: 00007f23199f4970 CR3: 000000027060e000 CR4: 0000000000340ee0
[  225.359270] Call Trace:
[  225.359305]  i915_gem_do_execbuffer+0x599/0x1060 [i915]
[  225.359311]  ? shmem_getpage_gfp+0x79a/0xcb0
[  225.359315]  ? is_bpf_text_address+0xa/0x20
[  225.359318]  ? __save_stack_trace+0x92/0x100
[  225.359321]  ? create_object+0x24e/0x300
[  225.359349]  i915_gem_execbuffer2+0xee/0x360 [i915]
[  225.359376]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  225.359380]  drm_ioctl_kernel+0x65/0xb0
[  225.359382]  drm_ioctl+0x2e5/0x3e0
[  225.359409]  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
[  225.359412]  do_vfs_ioctl+0x9f/0x5f0
[  225.359414]  SyS_ioctl+0x74/0x80
[  225.359418]  entry_SYSCALL_64_fastpath+0x24/0x87
[  225.359420] RIP: 0033:0x7f232c127f47
[  225.359421] RSP: 002b:00007fff4fe99a68 EFLAGS: 00000246
[  225.359422] Code: ff ff ff e9 37 5b f1 ff b9 f2 ff ff ff e9 3f 5b f1 ff b8 f2 ff ff ff 40 30 f6 e9 12 79 f2 ff b8 f2 ff ff ff 30 d2 e9 17 79 f2 ff <ba> f2 ff ff ff e9 cd 93 f2 ff b8 f2 ff ff ff e9 7e 98 f2 ff b8
. . .
Comment 19 Armando Antonio 2018-02-26 21:11:26 UTC
The following test case is working well on CNL with the latest configuration

====================
Test case
====================
igt@gem_sync@vebox
Comment 20 Octavio 2018-02-27 18:24:56 UTC
The test below fails on CFL QA:

igt@gem_exec_reloc@readonly-31

IGT-Version: 1.21-ga2664f8 (x86_64) (Linux: 4.16.0-rc2-drm-intel-qa-ww9-commit-01a067a+ x86_64)
Subtest readonly-31: SUCCESS (61.994s)

[   80.321442] Setting dangerous option reset - tainting kernel
[  104.043998] watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [gem_exec_reloc:1642]
[  104.044001] Modules linked in: cmac bnep snd_hda_codec_hdmi arc4 snd_hda_codec_realtek snd_hda_codec_generic 8250_dw nls_iso8859_1 iwlmvm intel_rapl mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep kvm snd_pcm snd_seq_midi snd_seq_midi_event irqbypass crct10dif_pclmul snd_rawmidi crc32_pclmul ghash_clmulni_intel pcbc iwlwifi snd_seq snd_seq_device snd_timer btusb aesni_intel btrtl btbcm aes_x86_64 crypto_simd btintel glue_helper cryptd input_leds bluetooth snd intel_cstate idma64 wmi_bmof serio_raw cfg80211 ecdh_generic intel_rapl_perf virt_dma soundcore intel_lpss_pci shpchp intel_lpss intel_pch_thermal acpi_pad mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 e1000e ahci libahci wmi video
[  104.044053] CPU: 10 PID: 1642 Comm: gem_exec_reloc Tainted: G     U           4.16.0-rc2-drm-intel-qa-ww9-commit-01a067a+ #1
[  104.044054] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[  104.044087] RIP: 0010:i915_exit+0x44/0xc56 [i915]
[  104.044087] RSP: 0018:ffffa3e1c3bdf8d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  104.044088] RAX: 0000000000000000 RBX: ffffa3e1c3bdfb58 RCX: 0000000000000000
[  104.044089] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  104.044089] RBP: ffff90f7d0721340 R08: 0000000000000400 R09: 000000000341d575
[  104.044090] R10: ffff90f7d0721340 R11: 0000000000000001 R12: 00007f4e79c3a010
[  104.044090] R13: 00007f4e79c39eb0 R14: ffffa3e1c3bdf980 R15: ffff90f7cf513d00
[  104.044091] FS:  00007f4e94c598c0(0000) GS:ffff90f7dd080000(0000) knlGS:0000000000000000
[  104.044091] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  104.044092] CR2: 00007f4e79c39eb0 CR3: 0000000453078001 CR4: 00000000003606e0
[  104.044092] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  104.044093] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  104.044093] Call Trace:
[  104.044107]  i915_gem_do_execbuffer+0x598/0x1060 [i915]
[  104.044110]  ? shmem_getpage_gfp+0x79f/0xcd0
[  104.044111]  ? is_bpf_text_address+0xa/0x20
[  104.044113]  ? __save_stack_trace+0x92/0x100
[  104.044115]  ? create_object+0x24e/0x300
[  104.044126]  i915_gem_execbuffer2_ioctl+0xee/0x360 [i915]
[  104.044135]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  104.044137]  drm_ioctl_kernel+0x67/0xb0
[  104.044139]  drm_ioctl+0x2f4/0x3f0
[  104.044148]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  104.044150]  ? vma_merge+0xc8/0x330
[  104.044152]  do_vfs_ioctl+0xa2/0x610
[  104.044153]  SyS_ioctl+0x74/0x80
[  104.044155]  do_syscall_64+0x6e/0x120
[  104.044156]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  104.044157] RIP: 0033:0x7f4e92e39ef7
[  104.044158] RSP: 002b:00007ffe8c904368 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  104.044158] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f4e92e39ef7
[  104.044159] RDX: 00007ffe8c904420 RSI: 0000000040406469 RDI: 0000000000000003
[  104.044159] RBP: 00007ffe8c904420 R08: ffffffffffffffff R09: 0000000000000000
[  104.044160] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  104.044160] R13: 0000000000000003 R14: 00007ffe8c904420 R15: 0000000004000000
[  104.044161] Code: ff ff ff e9 84 0c f1 ff b9 f2 ff ff ff e9 8c 0c f1 ff b8 f2 ff ff ff 40 30 f6 e9 72 34 f2 ff b8 f2 ff ff ff 30 d2 e9 7a 34 f2 ff <ba> f2 ff ff ff e9 0a 4f f2 ff b8 f2 ff ff ff e9 bb 53 f2 ff b8
Comment 21 Octavio 2018-02-27 18:25:48 UTC
Created attachment 137656 [details]
Kernel log CFL
Comment 22 Hector Velazquez 2018-02-27 18:38:55 UTC
This test has a dmesg-warn on GLK QA

Test List

igt@gem_exec_reloc@readonly-30

IGT-Version: 1.21-ga2664f8 (x86_64) (Linux: 4.16.0-rc2-drm-tip-ww9-commit-3a86cab+ x86_64)

======================================
        dmesg-warn sample
======================================
. . .
[  +0.000117] [IGT] gem_exec_reloc: executing
[  +0.008802] Setting dangerous option reset - tainting kernel
[  +0.000187] [IGT] gem_exec_reloc: starting subtest readonly-30
[  +0.000239] gem_exec_reloc (667): drop_caches: 4
[  +0.334621] random: crng init done
[  +0.003440] idma64 idma64.5: Found Intel integrated DMA 64-bit
[  +0.038183] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
[  +0.000035] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
[  +0.000032] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
[  +0.000038] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
[  +0.000031] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
[  +0.000031] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
[  +0.000030] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
[  +0.000030] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
[  +0.007423] input: HDA Intel PCH Mic as /devices/pci0000:00/0000:00:0e.0/sound/card0/input10
[  +0.000385] input: HDA Intel PCH Headphone as /devices/pci0000:00/0000:00:0e.0/sound/card0/input11
[  +0.000714] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:0e.0/sound/card0/input12
[  +0.000314] input: HDA Intel PCH HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:0e.0/sound/card0/input13
[  +0.000297] input: HDA Intel PCH HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:0e.0/sound/card0/input14
[  +0.000321] input: HDA Intel PCH HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:0e.0/sound/card0/input15
[  +0.000465] input: HDA Intel PCH HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:0e.0/sound/card0/input16
[  +2.969758] r8169 0000:01:00.0 enp1s0: link up
[  +0.000017] IPv6: ADDRCONF(NETDEV_CHANGE): enp1s0: link becomes ready
[Feb25 02:26] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [gem_exec_reloc:667]
[  +0.000025] Modules linked in: intel_rapl_perf(+) serio_raw wmi_bmof snd_hda_codec_hdmi lpc_ich snd_soc_skl nls_iso8859_1 snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep input_leds snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer idma64 snd virt_dma shpchp mei_me intel_lpss_pci(+) intel_lpss soundcore mei rfkill_gpio intel_vbtn soc_button_array dptf_power int3400_thermal int3406_thermal intel_hid acpi_thermal_rel int3403_thermal int340x_thermal_zone sparse_keymap mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic ahci r8169 usbhid mii i915 libahci wmi i2c_hid hid
[  +0.000050]  video
[  +0.000004] CPU: 0 PID: 667 Comm: gem_exec_reloc Tainted: G     U           4.16.0-rc2-drm-tip-ww9-commit-3a86cab+ #1
[  +0.000002] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0077.B50.1712072148 12/07/2017
[  +0.000066] RIP: 0010:i915_exit+0x44/0xc86 [i915]
[  +0.000002] RSP: 0018:ffffb2fcc16fb8d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  +0.000002] RAX: 0000000000000000 RBX: ffffb2fcc16fbb58 RCX: 0000000000000000
[  +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  +0.000000] RBP: ffff89942a114dc0 R08: 0000000000000400 R09: 0000000001b2d62e
[  +0.000001] R10: ffff89942a114dc0 R11: 0000000000000001 R12: 00007f4227dc5610
[  +0.000001] R13: 00007f4227dc55d0 R14: ffffb2fcc16fbaa0 R15: ffff89942d6e0e40
[  +0.000002] FS:  00007f42354248c0(0000) GS:ffff89943f800000(0000) knlGS:0000000000000000
[  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000001] CR2: 00007f4227dc55d0 CR3: 000000026d830000 CR4: 0000000000340ef0
[  +0.000001] Call Trace:
[  +0.000037]  i915_gem_do_execbuffer+0x598/0x1060 [i915]
[  +0.000008]  ? is_bpf_text_address+0xa/0x20
[  +0.000003]  ? __save_stack_trace+0x92/0x100
[  +0.000004]  ? create_object+0x24e/0x300
[  +0.000027]  i915_gem_execbuffer2_ioctl+0xee/0x360 [i915]
[  +0.000029]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  +0.000004]  drm_ioctl_kernel+0x67/0xb0
[  +0.000003]  drm_ioctl+0x2f4/0x3f0
[  +0.000028]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  +0.000002]  ? vma_merge+0xc8/0x330
[  +0.000004]  do_vfs_ioctl+0xa2/0x610
[  +0.000003]  SyS_ioctl+0x74/0x80
[  +0.000003]  do_syscall_64+0x6e/0x120
[  +0.000003]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  +0.000003] RIP: 0033:0x7f4233604ef7
[  +0.000001] RSP: 002b:00007fffd7296f08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  +0.000002] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f4233604ef7
[  +0.000000] RDX: 00007fffd7296fc0 RSI: 0000000040406469 RDI: 0000000000000003
[  +0.000001] RBP: 00007fffd7296fc0 R08: ffffffffffffffff R09: 0000000000000000
[  +0.000001] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  +0.000001] R13: 0000000000000003 R14: 00007fffd7296fc0 R15: 0000000002000000
[  +0.000001] Code: ff ff ff e9 b4 0c f1 ff b9 f2 ff ff ff e9 bc 0c f1 ff b8 f2 ff ff ff 40 30 f6 e9 a2 34 f2 ff b8 f2 ff ff ff 30 d2 e9 aa 34 f2 ff <ba> f2 ff ff ff e9 3a 4f f2 ff b8 f2 ff ff ff e9 eb 53 f2 ff b8 
[  +4.106061] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off
[  +0.000036] [drm:edp_panel_vdd_off_sync [i915]] PP_STATUS: 0x80000008 PP_CONTROL: 0x00000067
[  +0.000029] [drm:intel_power_well_disable [i915]] disabling AUX A
[  +0.005901] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
[  +0.000002] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
[  +0.000000] RAPL PMU: hw unit of domain package 2^-14 Joules
[  +0.000001] RAPL PMU: hw unit of domain dram 2^-14 Joules
[  +0.000001] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[  +0.010316] cryptd: max_cpu_qlen set to 1000
[  +0.031105] SSE version of gcm_enc/dec engaged.
[  +0.090212] idma64 idma64.6: Found Intel integrated DMA 64-bit
[  +0.017921] intel_telemetry_core Init
[  +0.058801] idma64 idma64.7: Found Intel integrated DMA 64-bit
[  +0.059638] idma64 idma64.8: Found Intel integrated DMA 64-bit
[  +0.001535] idma64 idma64.9: Found Intel integrated DMA 64-bit
[  +0.002904] idma64 idma64.11: Found Intel integrated DMA 64-bit
[  +0.001767] idma64 idma64.12: Found Intel integrated DMA 64-bit
[  +0.001685] idma64 idma64.13: Found Intel integrated DMA 64-bit
[  +0.196054] intel_rapl: Found RAPL domain package
[  +0.000002] intel_rapl: Found RAPL domain core
[  +0.000002] intel_rapl: Found RAPL domain uncore
[  +0.000001] intel_rapl: Found RAPL domain dram
[  +0.820175] dw-apb-uart.8: ttyS4 at MMIO 0xa122b000 (irq = 4, base_baud = 115200) is a 16550A
[  +0.128048] dw-apb-uart.9: ttyS5 at MMIO 0xa122d000 (irq = 5, base_baud = 115200) is a 16550A
[  +0.128005] dw-apb-uart.10: ttyS6 at MMIO 0xa122f000 (irq = 7, base_baud = 115200) is a 16550A
[ +31.907336] [IGT] gem_exec_reloc: exiting, ret=0
. . .
Comment 23 Octavio 2018-03-06 17:31:33 UTC
This test has a dmesg-warn on CFL QA

Using IGT-Version: 1.21-g68fb759 (x86_64) (Linux: 4.16.0-rc4-drm-intel-qa-ww10-commit-a994c52+ x86_64)

[   59.150255] Setting dangerous option reset - tainting kernel
[   84.047998] watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [gem_exec_reloc:1786]
[   84.048001] Modules linked in: snd_hda_codec_hdmi cmac bnep arc4 nls_iso8859_1 8250_dw snd_hda_codec_realtek iwlmvm snd_hda_codec_generic mac80211 intel_rapl x86_pkg_temp_thermal snd_hda_intel intel_powerclamp coretemp snd_hda_codec snd_hda_core kvm_intel snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_seq btusb aesni_intel snd_seq_device btrtl snd_timer aes_x86_64 crypto_simd glue_helper cryptd btbcm intel_cstate iwlwifi btintel snd idma64 virt_dma intel_rapl_perf input_leds bluetooth serio_raw ecdh_generic mei_me wmi_bmof cfg80211 soundcore shpchp intel_pch_thermal mei intel_lpss_pci intel_lpss acpi_pad mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 e1000e ahci libahci prime_numbers wmi video
[   84.048053] CPU: 5 PID: 1786 Comm: gem_exec_reloc Tainted: G     U           4.16.0-rc4-drm-intel-qa-ww10-commit-a994c52+ #1
[   84.048054] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B07.1801040709 01/04/2018
[   84.048087] RIP: 0010:i915_exit+0x44/0xe08 [i915]
[   84.048087] RSP: 0018:ffffb42f0263f8d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[   84.048088] RAX: 0000000000000000 RBX: ffffb42f0263fb58 RCX: 0000000000000000
[   84.048089] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[   84.048089] RBP: ffff950f8bd3aa00 R08: 0000000000000400 R09: 000000000362b653
[   84.048090] R10: ffff950f8bd3aa00 R11: 0000000000000001 R12: 00007f57c2926c10
[   84.048090] R13: 00007f57c2926a70 R14: ffffb42f0263f940 R15: ffff950f8f529e80
[   84.048091] FS:  00007f57d97848c0(0000) GS:ffff950f9cf40000(0000) knlGS:0000000000000000
[   84.048091] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   84.048092] CR2: 00007f57c2926a70 CR3: 000000044b436001 CR4: 00000000003606e0
[   84.048092] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   84.048093] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   84.048093] Call Trace:
[   84.048106]  i915_gem_do_execbuffer+0x598/0x1060 [i915]
[   84.048110]  ? shmem_getpage_gfp+0x7a1/0xcf0
[   84.048111]  ? is_bpf_text_address+0xa/0x20
[   84.048113]  ? __save_stack_trace+0x92/0x100
[   84.048114]  ? create_object+0x24e/0x300
[   84.048125]  i915_gem_execbuffer2_ioctl+0xee/0x360 [i915]
[   84.048135]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[   84.048137]  drm_ioctl_kernel+0x67/0xb0
[   84.048139]  drm_ioctl+0x2f4/0x3f0
[   84.048148]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[   84.048150]  ? vma_merge+0xc8/0x330
[   84.048152]  do_vfs_ioctl+0xa2/0x610
[   84.048153]  SyS_ioctl+0x74/0x80
[   84.048155]  do_syscall_64+0x6e/0x120
[   84.048156]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   84.048157] RIP: 0033:0x7f57d7964ef7
[   84.048158] RSP: 002b:00007ffebfa762b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   84.048159] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f57d7964ef7
[   84.048159] RDX: 00007ffebfa76370 RSI: 0000000040406469 RDI: 0000000000000003
[   84.048159] RBP: 00007ffebfa76370 R08: ffffffffffffffff R09: 0000000000000000
[   84.048160] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[   84.048160] R13: 0000000000000003 R14: 00007ffebfa76370 R15: 0000000004000000
[   84.048161] Code: ff ff ff e9 d6 08 f1 ff b9 f2 ff ff ff e9 de 08 f1 ff b8 f2 ff ff ff 40 30 f6 e9 94 30 f2 ff b8 f2 ff ff ff 30 d2 e9 9c 30 f2 ff <ba> f2 ff ff ff e9 2c 4b f2 ff b8 f2 ff ff ff e9 dd 4f f2 ff b8
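As a rough cross-check of the trace above: the user-space register dump shows ORIG_RAX = 0x10 (syscall 16, ioctl on x86-64) and RSI = 0x40406469, which is the ioctl request number. The sketch below decodes that value, assuming the generic Linux ioctl bit layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The helper name decode_ioctl is only illustrative and is not part of IGT or the kernel.

```python
# Field widths of a Linux ioctl request number (generic layout, as used on x86).
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

# Direction is named from the point of view of user space.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into direction, size, type and number."""
    return {
        "dir": DIRECTIONS[(request >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)],
        "size": (request >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1),
        "type": chr((request >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1)),
        "nr": (request >> NR_SHIFT) & ((1 << NR_BITS) - 1),
    }


if __name__ == "__main__":
    # RSI value taken from the register dump in the log above.
    print(decode_ioctl(0x40406469))
```

The request decodes to type 'd' (the DRM ioctl type), number 0x69 and a 64-byte write payload, which would match the i915_gem_execbuffer2_ioctl entry in the call trace.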
Comment 24 Hector Velazquez 2018-03-07 15:41:31 UTC
These tests have a dmesg-warn on CNL QA
Tests list:
igt@gem_exec_reloc@cpu-30
igt@gem_exec_reloc@cpu-31
igt@gem_exec_reloc@readonly-30
igt@gem_exec_reloc@readonly-31

dmesg-warn sample:
. . .
[  892.043983] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [gem_exec_reloc:22440]
[  892.043990] Modules linked in: snd_hda_codec_hdmi cmac bnep 8250_dw nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc arc4 snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus iwlmvm mac80211 snd_hda_intel x86_pkg_temp_thermal snd_hda_codec intel_powerclamp coretemp snd_hda_core snd_hwdep snd_pcm kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_seq_midi snd_seq_midi_event pcbc snd_rawmidi aesni_intel aes_x86_64 crypto_simd glue_helper snd_seq cryptd input_leds serio_raw snd_seq_device iwlwifi btusb btrtl btbcm snd_timer wmi_bmof btintel asix bluetooth usbnet mii shpchp idma64 snd virt_dma ecdh_generic soundcore cfg80211 intel_lpss_pci mei_me intel_lpss mei intel_pch_thermal
[  892.044025]  acpi_pad mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid i915 e1000e prime_numbers wmi video
[  892.044034] CPU: 0 PID: 22440 Comm: gem_exec_reloc Tainted: G     U       L   4.16.0-rc3-drm-intel-qa-ww9-commit-b2e10fd+ #1
[  892.044035] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[  892.044065] RIP: 0010:i915_exit+0x44/0x118 [i915]
[  892.044066] RSP: 0018:ffffaad0422778d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  892.044068] RAX: 0000000000000000 RBX: ffffaad042277b58 RCX: 0000000000000000
[  892.044069] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000ff8
[  892.044069] RBP: ffff9e501c291500 R08: 0000000000000400 R09: 0000000001e23ff0
[  892.044070] R10: ffff9e501c291500 R11: 0000000000000001 R12: 00007f0718023010
[  892.044071] R13: 00007f0718022e10 R14: ffffaad0422778e0 R15: ffff9e4fc24c8000
[  892.044072] FS:  00007f075ef6d8c0(0000) GS:ffff9e502f800000(0000) knlGS:0000000000000000
[  892.044073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  892.044074] CR2: 00007f0718022e10 CR3: 0000000208252006 CR4: 0000000000760ef0
[  892.044075] PKRU: 55555554
[  892.044075] Call Trace:
[  892.044099]  i915_gem_do_execbuffer+0x598/0x1060 [i915]
[  892.044104]  ? shmem_getpage_gfp+0x79f/0xcf0
[  892.044107]  ? is_bpf_text_address+0xa/0x20
[  892.044109]  ? __save_stack_trace+0x92/0x100
[  892.044112]  ? create_object+0x24e/0x300
[  892.044132]  i915_gem_execbuffer2_ioctl+0xee/0x360 [i915]
[  892.044150]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  892.044153]  drm_ioctl_kernel+0x67/0xb0
[  892.044156]  drm_ioctl+0x2f4/0x3f0
[  892.044173]  ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 [i915]
[  892.044176]  ? vma_merge+0xc8/0x330
[  892.044179]  do_vfs_ioctl+0xa2/0x610
[  892.044181]  SyS_ioctl+0x74/0x80
[  892.044184]  do_syscall_64+0x6e/0x120
[  892.044187]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  892.044189] RIP: 0033:0x7f075d14def7
[  892.044190] RSP: 002b:00007fffc117d6f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  892.044191] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f075d14def7
[  892.044192] RDX: 00007fffc117d7b0 RSI: 0000000040406469 RDI: 0000000000000003
[  892.044193] RBP: 00007fffc117d7b0 R08: ffffffffffffffff R09: 0000000000000000
[  892.044193] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  892.044194] R13: 0000000000000003 R14: 00007fffc117d7b0 R15: 0000000004000000
[  892.044195] Code: ff ff ff e9 e6 0b f1 ff b9 f2 ff ff ff e9 ee 0b f1 ff b8 f2 ff ff ff 40 30 f6 e9 94 33 f2 ff b8 f2 ff ff ff 30 d2 e9 9c 33 f2 ff <ba> f2 ff ff ff e9 2c 4e f2 ff b8 f2 ff ff ff e9 dd 52 f2 ff b8
. . .

software:
IGT-Version: 1.21-gbddfb8d (x86_64) (Linux: 4.16.0-rc3-drm-intel-qa-ww9-commit-b2e10fd+ x86_64)
Comment 25 Octavio 2018-03-21 16:44:57 UTC
The test below has a dmesg-warn on CFL QA

igt@gem_exec_reloc@readonly-31

IGT-Version: 1.22-g94e8862 (x86_64) (Linux: 4.16.0-rc6-drm-intel-qa-ww12-commit-9d737ce+ x86_64)
Subtest readonly-31: SUCCESS (65.823s)

[   61.528380] Setting dangerous option reset - tainting kernel
[   88.043998] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [gem_exec_reloc:1819]
[   88.044001] Modules linked in: snd_hda_codec_hdmi cmac bnep arc4 nls_iso8859_1 8250_dw iwlmvm snd_hda_codec_realtek snd_hda_codec_generic intel_rapl mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_intel kvm snd_hda_codec irqbypass snd_hda_core crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel snd_pcm pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel btusb aes_x86_64 crypto_simd btrtl glue_helper cryptd snd_seq btbcm asix btintel iwlwifi usbnet intel_cstate snd_seq_device mii bluetooth intel_rapl_perf input_leds snd_timer idma64 ecdh_generic serio_raw wmi_bmof virt_dma snd mei_me cfg80211 intel_lpss_pci mei intel_pch_thermal intel_lpss soundcore acpi_pad mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 e1000e ahci libahci prime_numbers
[   88.044051]  wmi video
[   88.044053] CPU: 1 PID: 1819 Comm: gem_exec_reloc Tainted: G     U           4.16.0-rc6-drm-intel-qa-ww12-commit-9d737ce+ #1
[   88.044053] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B09.1801120459 01/12/2018
[   88.044100] RIP: 0010:i915_exit+0x44/0xa01 [i915]
[   88.044100] RSP: 0018:ffffbe46c3e278d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[   88.044101] RAX: 0000000000000000 RBX: ffffbe46c3e27b58 RCX: ffff98a3d6933004
[   88.044102] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   88.044102] RBP: ffff98a3d60f2400 R08: 0000000000001000 R09: 0000000003637064
[   88.044103] R10: ffff98a3d60f2400 R11: 0000000000000001 R12: 00007f59831fde10
[   88.044103] R13: 00007f59831fdc90 R14: ffffbe46c3e27960 R15: ffff98a3cf35bd00
[   88.044104] FS:  00007f5999ee78c0(0000) GS:ffff98a3dd240000(0000) knlGS:0000000000000000
[   88.044104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.044105] CR2: 00007f59831fdc90 CR3: 000000044b738002 CR4: 00000000003606e0
[   88.044105] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   88.044106] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   88.044106] Call Trace:
[   88.044120]  i915_gem_do_execbuffer+0x591/0x1020 [i915]
[   88.044123]  ? shmem_getpage_gfp+0x7a1/0xcf0
[   88.044125]  ? is_bpf_text_address+0xa/0x20
[   88.044126]  ? __save_stack_trace+0x92/0x100
[   88.044128]  ? create_object+0x24e/0x300
[   88.044139]  i915_gem_execbuffer2_ioctl+0xe7/0x340 [i915]
[   88.044149]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[   88.044151]  drm_ioctl_kernel+0x67/0xb0
[   88.044152]  drm_ioctl+0x2d4/0x3c0
[   88.044161]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[   88.044163]  ? vma_merge+0xc8/0x330
[   88.044165]  do_vfs_ioctl+0xa2/0x610
[   88.044166]  SyS_ioctl+0x74/0x80
[   88.044168]  do_syscall_64+0x6e/0x120
[   88.044170]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   88.044171] RIP: 0033:0x7f59980c7ef7
[   88.044171] RSP: 002b:00007ffc5bb6a6d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   88.044172] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f59980c7ef7
[   88.044172] RDX: 00007ffc5bb6a790 RSI: 0000000040406469 RDI: 0000000000000003
[   88.044173] RBP: 00007ffc5bb6a790 R08: ffffffffffffffff R09: 0000000000000000
[   88.044173] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[   88.044174] R13: 0000000000000003 R14: 00007ffc5bb6a790 R15: 0000000004000000
[   88.044174] Code: ff ff ff e9 ef 15 f1 ff b9 f2 ff ff ff e9 f7 15 f1 ff b8 f2 ff ff ff 40 30 f6 e9 7d 3f f2 ff b8 f2 ff ff ff 30 d2 e9 85 3f f2 ff <ba> f2 ff ff ff e9 15 5a f2 ff b8 f2 ff ff ff e9 c6 5e f2 ff b8
Comment 26 Jani Saarinen 2018-03-29 07:10:11 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), could you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 27 Ricardo Perez 2018-04-05 18:43:15 UTC
For the following test on CNL:

igt@gem_exec_reloc@cpu-31

Software version:

IGT-Version: 1.22-g0721161 (x86_64) (Linux: 4.16.0-rc6-drm-intel-qa-ww12-commit-4db112a+ x86_64)
Subtest cpu-31: SUCCESS (4.202s)

We are seeing the following dmesg-warn:

[  562.977732] Setting dangerous option reset - tainting kernel
[  567.865254] Setting dangerous option reset - tainting kernel
[  577.855254] Setting dangerous option reset - tainting kernel
[  579.573912] Setting dangerous option reset - tainting kernel
[  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [gem_exec_reloc:2144]
[  604.039997] Modules linked in: snd_hda_codec_hdmi bnep 8250_dw nls_iso8859_1 arc4 snd_soc_skl x86_pkg_temp_thermal intel_powerclamp snd_soc_skl_ipc coretemp snd_soc_sst_ipc snd_soc_sst_dsp kvm_intel snd_hda_ext_core snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic kvm snd_soc_core snd_compress snd_pcm_dmaengine irqbypass ac97_bus crct10dif_pclmul iwlmvm crc32_pclmul ghash_clmulni_intel pcbc mac80211 aesni_intel snd_hda_intel aes_x86_64 crypto_simd glue_helper cryptd snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi serio_raw snd_seq wmi_bmof btusb btrtl snd_seq_device btbcm iwlwifi asix snd_timer btintel usbnet mii bluetooth snd input_leds soundcore ecdh_generic idma64 shpchp virt_dma mei_me intel_lpss_pci cfg80211 mei intel_pch_thermal intel_lpss acpi_pad
[  604.040035]  mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 uas usb_storage hid_generic usbhid hid i915 dwc3 udc_core ulpi e1000e dwc3_pci prime_numbers wmi video
[  604.040049] CPU: 3 PID: 2144 Comm: gem_exec_reloc Tainted: G     U  W        4.16.0-rc6-drm-intel-qa-ww12-commit-4db112a+ #1
[  604.040049] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[  604.040086] RIP: 0010:i915_exit+0x44/0x3b7 [i915]
[  604.040087] RSP: 0018:ffffbba98215f8d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  604.040089] RAX: 0000000000000000 RBX: ffffbba98215fb58 RCX: ffff9eb231e55004
[  604.040090] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  604.040091] RBP: ffff9eb22d796100 R08: 0000000000001000 R09: 0000000001e2ed51
[  604.040092] R10: ffff9eb22d796100 R11: 0000000000000001 R12: 00007f7ce694ac10
[  604.040092] R13: 00007f7ce694aa30 R14: ffffbba98215f900 R15: ffff9eb22d2295c0
[  604.040094] FS:  00007f7ced73a8c0(0000) GS:ffff9eb23f980000(0000) knlGS:0000000000000000
[  604.040095] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  604.040095] CR2: 00007f7ce694aa30 CR3: 00000002af862002 CR4: 0000000000760ee0
[  604.040096] PKRU: 55555554
[  604.040097] Call Trace:
[  604.040124]  i915_gem_do_execbuffer+0x591/0x1020 [i915]
[  604.040131]  ? shmem_getpage_gfp+0x7a1/0xcf0
[  604.040135]  ? is_bpf_text_address+0xa/0x20
[  604.040138]  ? __save_stack_trace+0x92/0x100
[  604.040140]  ? create_object+0x24e/0x300
[  604.040161]  i915_gem_execbuffer2_ioctl+0xe7/0x340 [i915]
[  604.040180]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  604.040184]  drm_ioctl_kernel+0x67/0xb0
[  604.040186]  drm_ioctl+0x2d4/0x3c0
[  604.040205]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  604.040208]  ? vma_merge+0xc8/0x330
[  604.040211]  do_vfs_ioctl+0xa2/0x610
[  604.040214]  SyS_ioctl+0x74/0x80
[  604.040217]  do_syscall_64+0x6e/0x120
[  604.040220]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  604.040222] RIP: 0033:0x7f7ceb91aef7
[  604.040222] RSP: 002b:00007ffebad27fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  604.040224] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7ceb91aef7
[  604.040225] RDX: 00007ffebad28090 RSI: 0000000040406469 RDI: 0000000000000003
[  604.040225] RBP: 00007ffebad28090 R08: ffffffffffffffff R09: 0000000000000000
[  604.040226] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  604.040227] R13: 0000000000000003 R14: 00007ffebad28090 R15: 0000000002000000
[  604.040228] Code: ff ff ff e9 75 11 f1 ff b9 f2 ff ff ff e9 7d 11 f1 ff b8 f2 ff ff ff 40 30 f6 e9 d3 3a f2 ff b8 f2 ff ff ff 30 d2 e9 db 3a f2 ff <ba> f2 ff ff ff e9 6b 55 f2 ff b8 f2 ff ff ff e9 1c 5a f2 ff b8 
[  641.307239] Setting dangerous option reset - tainting kernel
[  643.512634] Setting dangerous option reset - tainting kernel
[  653.165474] Setting dangerous option reset - tainting kernel
[  673.001889] Setting dangerous option reset - tainting kernel
[  676.264418] Setting dangerous option reset - tainting kernel
[  700.039987] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [gem_exec_reloc:9761]
[  700.039992] Modules linked in: snd_hda_codec_hdmi bnep 8250_dw nls_iso8859_1 arc4 snd_soc_skl x86_pkg_temp_thermal intel_powerclamp snd_soc_skl_ipc coretemp snd_soc_sst_ipc snd_soc_sst_dsp kvm_intel snd_hda_ext_core snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic kvm snd_soc_core snd_compress snd_pcm_dmaengine irqbypass ac97_bus crct10dif_pclmul iwlmvm crc32_pclmul ghash_clmulni_intel pcbc mac80211 aesni_intel snd_hda_intel aes_x86_64 crypto_simd glue_helper cryptd snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi serio_raw snd_seq wmi_bmof btusb btrtl snd_seq_device btbcm iwlwifi asix snd_timer btintel usbnet mii bluetooth snd input_leds soundcore ecdh_generic idma64 shpchp virt_dma mei_me intel_lpss_pci cfg80211 mei intel_pch_thermal intel_lpss acpi_pad
[  700.040022]  mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 uas usb_storage hid_generic usbhid hid i915 dwc3 udc_core ulpi e1000e dwc3_pci prime_numbers wmi video
[  700.040032] CPU: 2 PID: 9761 Comm: gem_exec_reloc Tainted: G     U  W    L   4.16.0-rc6-drm-intel-qa-ww12-commit-4db112a+ #1
[  700.040033] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[  700.040077] RIP: 0010:i915_exit+0x44/0x3b7 [i915]
[  700.040077] RSP: 0018:ffffbba983da78d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  700.040079] RAX: 0000000000000000 RBX: ffffbba983da7b58 RCX: ffff9eb232a2d004
[  700.040079] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  700.040080] RBP: ffff9eb201f1a400 R08: 0000000000001000 R09: 00000000021c42c8
[  700.040081] R10: ffff9eb201f1a400 R11: 0000000000000001 R12: 00007f80c7a13a10
[  700.040082] R13: 00007f80c7a13910 R14: ffffbba983da79e0 R15: ffff9eb1d976d700
[  700.040083] FS:  00007f81075588c0(0000) GS:ffff9eb23f900000(0000) knlGS:0000000000000000
[  700.040083] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  700.040084] CR2: 00007f80c7a13910 CR3: 00000002af480006 CR4: 0000000000760ee0
[  700.040085] PKRU: 55555554
[  700.040085] Call Trace:
[  700.040104]  i915_gem_do_execbuffer+0x591/0x1020 [i915]
[  700.040108]  ? is_bpf_text_address+0xa/0x20
[  700.040111]  ? __save_stack_trace+0x92/0x100
[  700.040113]  ? create_object+0x24e/0x300
[  700.040128]  i915_gem_execbuffer2_ioctl+0xe7/0x340 [i915]
[  700.040143]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  700.040146]  drm_ioctl_kernel+0x67/0xb0
[  700.040147]  drm_ioctl+0x2d4/0x3c0
[  700.040161]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  700.040164]  ? vma_merge+0xc8/0x330
[  700.040167]  do_vfs_ioctl+0xa2/0x610
[  700.040169]  SyS_ioctl+0x74/0x80
[  700.040171]  do_syscall_64+0x6e/0x120
[  700.040173]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  700.040174] RIP: 0033:0x7f8105738ef7
[  700.040175] RSP: 002b:00007ffd7a1aef18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  700.040176] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8105738ef7
[  700.040177] RDX: 00007ffd7a1aefd0 RSI: 0000000040406469 RDI: 0000000000000003
[  700.040177] RBP: 00007ffd7a1aefd0 R08: ffffffffffffffff R09: 0000000000000000
[  700.040178] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  700.040179] R13: 0000000000000003 R14: 00007ffd7a1aefd0 R15: 0000000004000000
[  700.040180] Code: ff ff ff e9 75 11 f1 ff b9 f2 ff ff ff e9 7d 11 f1 ff b8 f2 ff ff ff 40 30 f6 e9 d3 3a f2 ff b8 f2 ff ff ff 30 d2 e9 db 3a f2 ff <ba> f2 ff ff ff e9 6b 55 f2 ff b8 f2 ff ff ff e9 1c 5a f2 ff b8 
[  789.011827] Setting dangerous option reset - tainting kernel
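The log above contains two separate soft-lockup reports (CPU#3 and CPU#2). For anyone triaging the larger attached logs, a minimal sketch like the one below can pull those reports out of a dmesg dump. It assumes the watchdog message format seen in this bug; the function name find_soft_lockups is hypothetical and not part of IGT.

```python
import re

# Matches lines such as:
# [  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [gem_exec_reloc:2144]
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(dmesg_text):
    """Return one dict per soft-lockup report found in a dmesg dump."""
    hits = []
    for line in dmesg_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            hits.append({
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return hits


if __name__ == "__main__":
    # Two watchdog lines copied from the log in this comment.
    sample = (
        "[  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [gem_exec_reloc:2144]\n"
        "[  700.039987] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [gem_exec_reloc:9761]\n"
    )
    for hit in find_soft_lockups(sample):
        print(hit)
```

Because the pattern is searched anywhere in the line, it also works on the relative-timestamp output of dmesg -H used in some of the other comments.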
Comment 28 Ricardo Perez 2018-04-05 19:16:47 UTC
The following tests are failing on CNL QA systems:

gem_exec_reloc@readonly-30
gem_exec_reloc@readonly-31

Software version:
	
IGT-Version: 1.22-g0721161 (x86_64) (Linux: 4.16.0-rc6-drm-intel-qa-ww12-commit-4db112a+ x86_64)
Subtest readonly-30: SUCCESS (61.594s)


Dmesg-warn:

[  579.573912] Setting dangerous option reset - tainting kernel
[  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [gem_exec_reloc:2144]
[  604.039997] Modules linked in: snd_hda_codec_hdmi bnep 8250_dw nls_iso8859_1 arc4 snd_soc_skl x86_pkg_temp_thermal intel_powerclamp snd_soc_skl_ipc coretemp snd_soc_sst_ipc snd_soc_sst_dsp kvm_intel snd_hda_ext_core snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic kvm snd_soc_core snd_compress snd_pcm_dmaengine irqbypass ac97_bus crct10dif_pclmul iwlmvm crc32_pclmul ghash_clmulni_intel pcbc mac80211 aesni_intel snd_hda_intel aes_x86_64 crypto_simd glue_helper cryptd snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi serio_raw snd_seq wmi_bmof btusb btrtl snd_seq_device btbcm iwlwifi asix snd_timer btintel usbnet mii bluetooth snd input_leds soundcore ecdh_generic idma64 shpchp virt_dma mei_me intel_lpss_pci cfg80211 mei intel_pch_thermal intel_lpss acpi_pad
[  604.040035]  mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 uas usb_storage hid_generic usbhid hid i915 dwc3 udc_core ulpi e1000e dwc3_pci prime_numbers wmi video
[  604.040049] CPU: 3 PID: 2144 Comm: gem_exec_reloc Tainted: G     U  W        4.16.0-rc6-drm-intel-qa-ww12-commit-4db112a+ #1
[  604.040049] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[  604.040086] RIP: 0010:i915_exit+0x44/0x3b7 [i915]
[  604.040087] RSP: 0018:ffffbba98215f8d0 EFLAGS: 00050246 ORIG_RAX: ffffffffffffff12
[  604.040089] RAX: 0000000000000000 RBX: ffffbba98215fb58 RCX: ffff9eb231e55004
[  604.040090] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  604.040091] RBP: ffff9eb22d796100 R08: 0000000000001000 R09: 0000000001e2ed51
[  604.040092] R10: ffff9eb22d796100 R11: 0000000000000001 R12: 00007f7ce694ac10
[  604.040092] R13: 00007f7ce694aa30 R14: ffffbba98215f900 R15: ffff9eb22d2295c0
[  604.040094] FS:  00007f7ced73a8c0(0000) GS:ffff9eb23f980000(0000) knlGS:0000000000000000
[  604.040095] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  604.040095] CR2: 00007f7ce694aa30 CR3: 00000002af862002 CR4: 0000000000760ee0
[  604.040096] PKRU: 55555554
[  604.040097] Call Trace:
[  604.040124]  i915_gem_do_execbuffer+0x591/0x1020 [i915]
[  604.040131]  ? shmem_getpage_gfp+0x7a1/0xcf0
[  604.040135]  ? is_bpf_text_address+0xa/0x20
[  604.040138]  ? __save_stack_trace+0x92/0x100
[  604.040140]  ? create_object+0x24e/0x300
[  604.040161]  i915_gem_execbuffer2_ioctl+0xe7/0x340 [i915]
[  604.040180]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  604.040184]  drm_ioctl_kernel+0x67/0xb0
[  604.040186]  drm_ioctl+0x2d4/0x3c0
[  604.040205]  ? i915_gem_execbuffer_ioctl+0x2b0/0x2b0 [i915]
[  604.040208]  ? vma_merge+0xc8/0x330
[  604.040211]  do_vfs_ioctl+0xa2/0x610
[  604.040214]  SyS_ioctl+0x74/0x80
[  604.040217]  do_syscall_64+0x6e/0x120
[  604.040220]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  604.040222] RIP: 0033:0x7f7ceb91aef7
[  604.040222] RSP: 002b:00007ffebad27fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  604.040224] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7ceb91aef7
[  604.040225] RDX: 00007ffebad28090 RSI: 0000000040406469 RDI: 0000000000000003
[  604.040225] RBP: 00007ffebad28090 R08: ffffffffffffffff R09: 0000000000000000
[  604.040226] R10: 000000000000049e R11: 0000000000000246 R12: 0000000040406469
[  604.040227] R13: 0000000000000003 R14: 00007ffebad28090 R15: 0000000002000000
[  604.040228] Code: ff ff ff e9 75 11 f1 ff b9 f2 ff ff ff e9 7d 11 f1 ff b8 f2 ff ff ff 40 30 f6 e9 d3 3a f2 ff b8 f2 ff ff ff 30 d2 e9 db 3a f2 ff <ba> f2 ff ff ff e9 6b 55 f2 ff b8 f2 ff ff ff e9 1c 5a f2 ff b8
Comment 29 Jani Saarinen 2018-04-06 06:10:28 UTC
These tests are blacklisted for a reason. Do we care about these tests?
https://cgit.freedesktop.org/drm/igt-gpu-tools/tree/tests/intel-ci/blacklist.txt
Comment 30 Joonas Lahtinen 2018-04-06 08:25:56 UTC
The tests are blacklisted because they take a long time to complete, and the errors seen here are watchdog warnings triggered by that long runtime; no actual bug exists. This can be "fixed" by allocating more runtime for the tests to complete.
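
For triage, it can help to pull the soft-lockup warnings out of a dmesg capture and see which process and CPU they name. The following is only a minimal sketch: the regular expression is modelled on the watchdog lines quoted in this report, and the function name `find_soft_lockups` and the dictionary keys are hypothetical, not part of IGT or the kernel.

```python
import re

# Assumption: the line format matches the watchdog messages quoted in this
# report, e.g. "[  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for
# 22s! [gem_exec_reloc:2144]". Other kernel versions may word it differently.
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(dmesg_text):
    """Return one dict per soft-lockup line found in the given dmesg text."""
    hits = []
    for line in dmesg_text.splitlines():
        match = SOFT_LOCKUP.search(line)
        if match:
            hits.append({
                "timestamp": float(match.group("ts")),
                "cpu": int(match.group("cpu")),
                "seconds": int(match.group("secs")),
                "comm": match.group("comm"),
                "pid": int(match.group("pid")),
            })
    return hits


# Two lines copied from comment 28; only the second is a watchdog warning.
sample = (
    "[  579.573912] Setting dangerous option reset - tainting kernel\n"
    "[  604.039990] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! "
    "[gem_exec_reloc:2144]\n"
)

for hit in find_soft_lockups(sample):
    print(hit["comm"], "on CPU", hit["cpu"], "stuck for", hit["seconds"], "s")
```

If every hit names a long-running test such as gem_exec_reloc and the subtest itself still reports SUCCESS, that is consistent with the explanation above rather than with a driver hang.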

