Bug 104982 - [IGT][CNL Only] igt@debugfs_test@read_all_entries dmesg-warn WARN_ON_ONCE(mcr & ((((3) & 3) << 26) | (((3) & 3) << 24)))
Summary: [IGT][CNL Only] igt@debugfs_test@read_all_entries dmesg-warn WARN_ON_ONCE(mcr...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-06 23:28 UTC by Elizabeth
Modified: 2018-09-04 12:07 UTC

See Also:
i915 platform: CNL
i915 features: display/Other


Attachments
dmesg_log_warn_on_once (231.63 KB, text/plain)
2018-02-12 23:14 UTC, Elizabeth

Description Elizabeth 2018-02-06 23:28:36 UTC
This warning has been appearing in FF runs with commits badc039+ and 4244f98+:
once with test igt@gem_ringfill@basic-default-hang and commit badc039+,

IGT-Version: 1.21-g37bd27f (x86_64) (Linux: 4.15.0-drm-intel-qa-ww5-commit-badc039+ x86_64)
Ring size: 153 batches
Subtest basic-default-hang: SUCCESS (10.879s)

and with test igt@debugfs_test@read_all_entries with both of the commits mentioned:

IGT-Version: 1.21-g37bd27f (x86_64) (Linux: 4.15.0-drm-intel-qa-ww5-commit-4244f98+ x86_64)
Subtest read_all_entries: SUCCESS (1.308s)

[  156.394252] ------------[ cut here ]------------
[  156.394254] WARN_ON_ONCE(mcr & ((((3) & 3) << 26) | (((3) & 3) << 24)))
[  156.394372] WARNING: CPU: 2 PID: 2633 at drivers/gpu/drm/i915/intel_engine_cs.c:757 intel_engine_get_instdone+0x39c/0x3f0 [i915]
[  156.394374] Modules linked in: 8250_dw snd_hda_codec_hdmi asix usbnet mii cmac ip6table_filter ip6_tables bnep iptable_filter binfmt_misc nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core x86_pkg_temp_thermal snd_soc_acpi intel_powerclamp snd_soc_core snd_hda_codec_realtek coretemp snd_hda_codec_generic snd_compress snd_pcm_dmaengine kvm_intel ac97_bus kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm aesni_intel aes_x86_64 snd_seq_midi snd_seq_midi_event crypto_simd glue_helper snd_rawmidi cryptd snd_seq snd_seq_device snd_timer snd wmi_bmof serio_raw input_leds shpchp iwlwifi btusb btrtl btbcm idma64 soundcore virt_dma btintel bluetooth intel_lpss_pci intel_lpss ecdh_generic cfg80211
[  156.394445]  intel_pch_thermal tpm_crb mac_hid acpi_pad parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid i915 e1000e ptp pps_core wmi video
[  156.394469] CPU: 2 PID: 2633 Comm: debugfs_test Tainted: G     U           4.15.0-drm-intel-qa-ww6-commit-078873d+ #1
[  156.394471] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018
[  156.394535] RIP: 0010:intel_engine_get_instdone+0x39c/0x3f0 [i915]
[  156.394537] RSP: 0018:ffffa1f10342fcb0 EFLAGS: 00010086
[  156.394540] RAX: 0000000000000000 RBX: ffff8c0dd9478000 RCX: ffffffffab257dc8
[  156.394542] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000002
[  156.394544] RBP: ffffa1f10342fd48 R08: 000000000000003b R09: 0000000000000a95
[  156.394546] R10: 00000000ffffffff R11: 0000000000000a95 R12: 0000000000000001
[  156.394548] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000011000000
[  156.394551] FS:  00007fa5dd020a00(0000) GS:ffff8c0def900000(0000) knlGS:0000000000000000
[  156.394553] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  156.394555] CR2: 00007f331bacb9a4 CR3: 0000000259dda005 CR4: 0000000000660ee0
[  156.394558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  156.394560] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  156.394561] PKRU: 55555554
[  156.394562] Call Trace:
[  156.394616]  i915_hangcheck_info+0x10e/0x400 [i915]
[  156.394626]  ? create_object+0x24e/0x300
[  156.394634]  seq_read+0x9d/0x410
[  156.394642]  full_proxy_read+0x4a/0x70
[  156.394647]  __vfs_read+0x33/0x160
[  156.394652]  vfs_read+0x8e/0x130
[  156.394657]  SyS_read+0x52/0xc0
[  156.394665]  entry_SYSCALL_64_fastpath+0x24/0x87
[  156.394668] RIP: 0033:0x7fa5dbf2e690
[  156.394670] RSP: 002b:00007fff21134c48 EFLAGS: 00000246
[  156.394673] Code: 0d fe ff ff 80 3d 69 b8 16 00 00 0f 85 93 fe ff ff 48 c7 c6 38 20 48 c0 48 c7 c7 31 c4 46 c0 c6 05 4e b8 16 00 01 e8 84 b9 cd e9 <0f> ff e9 72 fe ff ff 80 3d 3b b8 16 00 00 0f 85 51 ff ff ff 48 
[  156.394740] ---[ end trace 489fff2b714cdf7d ]---

I'll update logs as soon as I'm able to replicate.
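For reference, the condition printed in the warning can be evaluated by hand. The following is a minimal Python sketch, not kernel code, that expands the mask from the WARN_ON_ONCE text. The sample value is the one visible in R15 in the trace; treating it as the contents of `mcr` is only an assumption, since the trace does not say which register holds `mcr`.

```python
# Mask exactly as printed in the warning text.
MASK = ((3 & 3) << 26) | ((3 & 3) << 24)


def warn_fires(mcr: int) -> bool:
    """Return True when the WARN_ON_ONCE condition (mcr & MASK) is non-zero."""
    return (mcr & MASK) != 0


# Hypothetical sample value: 0x11000000 appears in R15 in the trace,
# but nothing in the log confirms that R15 holds mcr.
sample_mcr = 0x11000000

print(hex(MASK))               # 0xf000000
print(hex(sample_mcr & MASK))  # 0x1000000
print(warn_fires(sample_mcr))  # True
```

In other words, the warning fires whenever any of bits 27:24 are already set in the value that was read back.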
Comment 1 Lionel Landwerlin 2018-02-08 16:38:44 UTC
Chris asked me to look into this, just in case I had any idea.
At first I was confused by the internal documentation (it contains a GEN9 change that makes it look like GEN11), but it turns out there is nothing wrong there.

I couldn't reproduce the issue on my side, although I did get an unclaimed register access. That unclaimed access is fixed by: https://patchwork.freedesktop.org/series/37901/
Comment 2 Elizabeth 2018-02-09 16:21:00 UTC
Hello Lionel, thanks for your time. I haven't been able to reproduce this issue, and our FF results haven't shown this warning ever since commit-94ca1eb.
But taking a closer look at our results, I'm inclined to think this is platform dependent, since our CNL-1 hasn't shown the warning, while our CNL-2 is the one showing it.
Comment 3 Elizabeth 2018-02-12 23:13:51 UTC
As I thought, I'm able to replicate the warning only on the CNL-2, which has a newer BIOS version than the CNL-1. I will update the CNL-1's BIOS and check whether the results are the same.

(02:35 PM) [gfx@CNL-2] [tests]$ : time sudo -E ./debugfs_test --r read_all_entries
IGT-Version: 1.21-g37bd27f (x86_64) (Linux: 4.16.0-rc1-drm-intel-qa-ww7-commit-28dc2a5+ x86_64)
Subtest read_all_entries: SUCCESS (1.330s)

[  +0.000430] [drm:sandybridge_pcode_read [i915]] warning: pcode (read from mbox 5) mailbox access failed for i915_drpc_info [i915]: -6
[  +0.000138] ------------[ cut here ]------------
[  +0.000002] WARN_ON_ONCE(mcr & ((((3) & 3) << 26) | (((3) & 3) << 24)))
[  +0.000120] WARNING: CPU: 0 PID: 3389 at drivers/gpu/drm/i915/intel_engine_cs.c:757 intel_engine_get_instdone+0x39c/0x3f0 [i915]
[  +0.000001] Modules linked in: snd_hda_codec_hdmi asix usbnet mii ip6table_filter ip6_tables cmac iptable_filter bnep 8250_dw binfmt_misc nls_iso8859_1 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec snd_hda_core pcbc snd_hwdep snd_pcm snd_seq_midi aesni_intel snd_seq_midi_event aes_x86_64 crypto_simd glue_helper cryptd snd_rawmidi snd_seq serio_raw wmi_bmof snd_seq_device snd_timer input_leds snd btusb soundcore iwlwifi btrtl btbcm btintel shpchp bluetooth idma64 virt_dma intel_lpss_pci cfg80211 intel_lpss ecdh_generic
[  +0.000070]  intel_pch_thermal tpm_crb mac_hid acpi_pad parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid dwc3 udc_core ulpi i915 e1000e dwc3_pci wmi video
[  +0.000026] CPU: 0 PID: 3389 Comm: debugfs_test Tainted: G     U           4.16.0-rc1-drm-intel-qa-ww7-commit-28dc2a5+ #1
[  +0.000003] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018
[  +0.000063] RIP: 0010:intel_engine_get_instdone+0x39c/0x3f0 [i915]
[  +0.000003] RSP: 0018:ffffb7e103ccfc98 EFLAGS: 00010082
[  +0.000003] RAX: 0000000000000000 RBX: ffff919dd9ba8000 RCX: ffffffff8da58208
[  +0.000002] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000002
[  +0.000002] RBP: ffffb7e103ccfd30 R08: 000000000000003b R09: 0000000000000a07
[  +0.000002] R10: 00000000ffffffff R11: 0000000000000a07 R12: 0000000000000001
[  +0.000002] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000011000000
[  +0.000003] FS:  00007fde5299ba00(0000) GS:ffff919def800000(0000) knlGS:0000000000000000
[  +0.000002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000002] CR2: 00007fa088fb7000 CR3: 000000025e0e0006 CR4: 0000000000660ef0
[  +0.000002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.000002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.000001] PKRU: 55555554
[  +0.000002] Call Trace:
[  +0.000054]  i915_hangcheck_info+0x10e/0x400 [i915]
[  +0.000011]  ? create_object+0x24e/0x300
[  +0.000007]  seq_read+0xad/0x420
[  +0.000008]  full_proxy_read+0x4a/0x70
[  +0.000006]  __vfs_read+0x33/0x160
[  +0.000007]  vfs_read+0x8e/0x130
[  +0.000005]  SyS_read+0x52/0xc0
[  +0.000007]  do_syscall_64+0x6b/0x120
[  +0.000008]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[  +0.000003] RIP: 0033:0x7fde518a9690
[  +0.000002] RSP: 002b:00007ffc71877218 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  +0.000004] RAX: ffffffffffffffda RBX: 000055e554b39cfb RCX: 00007fde518a9690
[  +0.000002] RDX: 0000000000000200 RSI: 00007ffc71877240 RDI: 0000000000000006
[  +0.000002] RBP: 0000000000000006 R08: 000055e554af9b50 R09: 0000000000000000
[  +0.000001] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc71877240
[  +0.000002] R13: 000055e554b39770 R14: 0000000000000005 R15: 00007ffc71877230
[  +0.000003] Code: 0d fe ff ff 80 3d 75 b6 16 00 00 0f 85 93 fe ff ff 48 c7 c6 18 20 68 c0 48 c7 c7 3c c4 66 c0 c6 05 5a b6 16 00 01 e8 84 ba 2d cc <0f> ff e9 72 fe ff ff 80 3d 47 b6 16 00 00 0f 85 51 ff ff ff 48 
[  +0.000067] ---[ end trace 9d3219a9ed0bffa4 ]---
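Since the suspicion here is a BIOS difference between machines, it may help to pull the BIOS string out of each dmesg capture before comparing results. This is a small Python sketch under the assumption that the log contains a "Hardware name:" line in the same format as the traces in this report; the helper name `bios_info` is made up for illustration.

```python
import re

# Matches the "Hardware name: <board>, BIOS <version> <MM/DD/YYYY>" format
# seen in the traces in this report (assumed to be stable across captures).
HW_RE = re.compile(
    r"Hardware name: (?P<board>.+), BIOS (?P<bios>\S+) (?P<date>\d{2}/\d{2}/\d{4})"
)


def bios_info(dmesg_text: str):
    """Return (bios_version, bios_date) from the first Hardware name line, or None."""
    m = HW_RE.search(dmesg_text)
    if m is None:
        return None
    return m.group("bios"), m.group("date")


line = (
    "[  +0.000003] Hardware name: Intel Corporation CannonLake Client Platform/"
    "CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X122.B01.1801151045 01/15/2018"
)
print(bios_info(line))
```

Running the same helper over the logs from both machines gives a quick way to confirm which BIOS each result came from.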
Comment 4 Elizabeth 2018-02-12 23:14:31 UTC
Created attachment 137307
dmesg_log_warn_on_once
Comment 5 Elizabeth 2018-02-20 23:15:13 UTC
igt@debugfs_test@read_all_entries

Out	
IGT-Version: 1.21-gdd61508 (x86_64) (Linux: 4.16.0-rc1-drm-intel-qa-ww8-commit-67f1480+ x86_64)
Subtest read_all_entries: SUCCESS (1.325s)

Dmesg	
[  155.464514] ------------[ cut here ]------------
[  155.464516] WARN_ON_ONCE(mcr & ((((3) & 3) << 26) | (((3) & 3) << 24)))...
Comment 6 Elizabeth 2018-03-12 20:29:30 UTC
We're not using the CNL-2 to run FF right now, and haven't seen this on our CNL-1. I'm closing for now.
Comment 7 Elizabeth 2018-03-26 18:30:30 UTC
The warning keeps appearing on our CNL-2:

Results for igt@debugfs_test@read_all_entries
Result: dmesg-warn

Out	
IGT-Version: 1.22-ga9741da (x86_64) (Linux: 4.16.0-rc6-drm-intel-qa-ww13-commit-94f5d91+ x86_64)
Subtest read_all_entries: SUCCESS (1.551s)

Dmesg	
[   74.686683] ------------[ cut here ]------------
[   74.686684] WARN_ON_ONCE(mcr & mcr_slice_subslice_mask)
[   74.686777] WARNING: CPU: 2 PID: 2120 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.686778] Modules linked in: snd_hda_codec_hdmi cmac bnep 8250_dw snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi nls_iso8859_1 arc4 snd_soc_core snd_hda_codec_realtek snd_hda_codec_generic snd_compress snd_pcm_dmaengine ac97_bus iwlmvm mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core pcbc snd_hwdep snd_pcm snd_seq_midi aesni_intel snd_seq_midi_event aes_x86_64 crypto_simd glue_helper cryptd snd_rawmidi snd_seq snd_seq_device serio_raw snd_timer wmi_bmof iwlwifi snd btusb btrtl btbcm btintel input_leds bluetooth soundcore ax88179_178a usbnet mii mei_me shpchp idma64 ecdh_generic cfg80211 virt_dma mei intel_lpss_pci intel_lpss intel_pch_thermal
[   74.686842]  acpi_pad mac_hid parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid dwc3 udc_core ulpi i915 e1000e dwc3_pci prime_numbers wmi video
[   74.686865] CPU: 2 PID: 2120 Comm: debugfs_test Not tainted 4.16.0-rc6-drm-intel-qa-ww13-commit-94f5d91+ #1
[   74.686867] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[   74.686915] RIP: 0010:intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.686918] RSP: 0018:ffffb13a43907bf0 EFLAGS: 00010082
[   74.686921] RAX: 0000000000000000 RBX: 000000000f000000 RCX: ffffffffa3258448
[   74.686923] RDX: 0000000000000001 RSI: 0000000000000082 RDI: 0000000000000002
[   74.686925] RBP: ffffb13a43907cb8 R08: 000000000000002b R09: 0000000000000b04
[   74.686927] R10: 00000000ffffffff R11: 0000000000000b04 R12: 0000000000000001
[   74.686929] R13: ffff9816a90c0000 R14: 0000000000000000 R15: 0000000011000000
[   74.686932] FS:  00007f6ceba6ba40(0000) GS:ffff9816bf900000(0000) knlGS:0000000000000000
[   74.686934] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.686936] CR2: 00007f5298d33000 CR3: 00000002ae8f6001 CR4: 0000000000760ee0
[   74.686938] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   74.686940] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   74.686941] PKRU: 55555554
[   74.686942] Call Trace:
[   74.686985]  i915_hangcheck_info+0x107/0x3f0 [i915]
[   74.686994]  ? is_bpf_text_address+0xa/0x20
[   74.686999]  ? __save_stack_trace+0x92/0x100
[   74.687007]  seq_read+0xb6/0x450
[   74.687012]  full_proxy_read+0x50/0x70
[   74.687017]  __vfs_read+0x36/0x170
[   74.687021]  vfs_read+0x8e/0x130
[   74.687024]  SyS_read+0x52/0xc0
[   74.687029]  do_syscall_64+0x6e/0x120
[   74.687034]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   74.687037] RIP: 0033:0x7f6ce9c28d11
[   74.687039] RSP: 002b:00007ffece4d9788 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   74.687043] RAX: ffffffffffffffda RBX: 00000000010e91e3 RCX: 00007f6ce9c28d11
[   74.687044] RDX: 0000000000000200 RSI: 00007ffece4d97a0 RDI: 0000000000000006
[   74.687046] RBP: 00000000010e8c30 R08: 0000000000000000 R09: 0000000000000000
[   74.687048] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[   74.687050] R13: 0000000000000005 R14: 0000000000000006 R15: 0000000000000000
[   74.687052] Code: e7 fe ff ff 80 3d 2a 11 17 00 00 0f 85 40 fe ff ff 48 c7 c6 50 0a 54 c0 48 c7 c7 ad c4 52 c0 c6 05 0f 11 17 00 01 e8 c2 06 c2 e1 <0f> 0b 8b 4c 24 10 e9 1b fe ff ff 80 3d f8 10 17 00 00 0f 85 07 
[   74.687158] WARNING: CPU: 2 PID: 2120 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.687160] ---[ end trace 953653dc4e9d12c7 ]---
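The warning text and source line differ between kernel builds (line 757 in the earlier traces, line 835 here), so matching on the literal WARN_ON_ONCE string can miss repeats. A rough Python sketch that instead extracts the source location and function from each WARNING line; the regular expression is an assumption based only on the line format seen in this report, and `warn_sites` is a hypothetical helper name.

```python
import re

# Format assumed from the traces in this report:
# "WARNING: CPU: <n> PID: <n> at <file>:<line> <function>+<offset>/<size> [module]"
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>\w+)\+"
)


def warn_sites(dmesg_text: str):
    """Return the set of (file, line, function) triples for every WARNING line."""
    return {
        (m.group("file"), int(m.group("line")), m.group("func"))
        for m in WARN_RE.finditer(dmesg_text)
    }


old = ("[  156.394372] WARNING: CPU: 2 PID: 2633 at "
       "drivers/gpu/drm/i915/intel_engine_cs.c:757 "
       "intel_engine_get_instdone+0x39c/0x3f0 [i915]")
new = ("[   74.686777] WARNING: CPU: 2 PID: 2120 at "
       "drivers/gpu/drm/i915/intel_engine_cs.c:835 "
       "intel_engine_get_instdone+0x45e/0x4b0 [i915]")

for site in sorted(warn_sites(old + "\n" + new + "\n" + new)):
    print(site)
```

Both entries report the same function, `intel_engine_get_instdone`, which is a more stable key for grouping results than the line number.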
Comment 8 Jani Saarinen 2018-03-29 07:11:29 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but with this we are trying to understand whether the bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you also do that to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 9 Elizabeth 2018-04-02 22:24:33 UTC
Results for igt@debugfs_test@read_all_entries
Result: dmesg-warn

Out	
IGT-Version: 1.21-ge3a0ed9 (x86_64) (Linux: 4.16.0-rc7-drm-intel-qa-ww14-commit-c46052c+ x86_64)
Subtest read_all_entries: SUCCESS (1.551s)

Dmesg	
[   74.742586] ------------[ cut here ]------------
[   74.742587] WARN_ON_ONCE(mcr & mcr_slice_subslice_mask)
[   74.742676] WARNING: CPU: 3 PID: 2074 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.742677] Modules linked in: snd_hda_codec_hdmi cmac bnep 8250_dw nls_iso8859_1 arc4 snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi snd_soc_core iwlmvm snd_compress snd_hda_codec_realtek snd_pcm_dmaengine ac97_bus snd_hda_codec_generic mac80211 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_intel kvm snd_hda_codec irqbypass snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep pcbc snd_pcm aesni_intel aes_x86_64 crypto_simd snd_seq_midi glue_helper snd_seq_midi_event cryptd snd_rawmidi input_leds btusb snd_seq serio_raw btrtl btbcm btintel snd_seq_device wmi_bmof snd_timer bluetooth iwlwifi snd ax88179_178a idma64 virt_dma usbnet soundcore mei_me mii shpchp ecdh_generic cfg80211 intel_lpss_pci mei intel_lpss intel_pch_thermal
[   74.742741]  mac_hid acpi_pad parport_pc ppdev lp parport ip_tables x_tables autofs4 dwc3 udc_core ulpi i915 e1000e dwc3_pci prime_numbers wmi video
[   74.742761] CPU: 3 PID: 2074 Comm: debugfs_test Not tainted 4.16.0-rc7-drm-intel-qa-ww14-commit-c46052c+ #1
[   74.742763] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[   74.742811] RIP: 0010:intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.742814] RSP: 0018:ffffa91c039ffbf0 EFLAGS: 00010082
[   74.742817] RAX: 0000000000000000 RBX: 000000000f000000 RCX: ffffffff84e58448
[   74.742819] RDX: 0000000000000001 RSI: 0000000000000082 RDI: 0000000000000002
[   74.742821] RBP: ffffa91c039ffcb8 R08: 000000000000002b R09: 0000000000000afb
[   74.742822] R10: 00000000ffffffff R11: 0000000000000afb R12: 0000000000000001
[   74.742824] R13: ffff8cb5e90d8000 R14: 0000000000000000 R15: 0000000011000000
[   74.742827] FS:  00007f8b3ca8ca40(0000) GS:ffff8cb5ff980000(0000) knlGS:0000000000000000
[   74.742829] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.742831] CR2: 00007f3d4395adc0 CR3: 00000002b0190003 CR4: 0000000000760ee0
[   74.742833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   74.742835] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   74.742836] PKRU: 55555554
[   74.742838] Call Trace:
[   74.742880]  i915_hangcheck_info+0x107/0x3f0 [i915]
[   74.742888]  ? is_bpf_text_address+0xa/0x20
[   74.742893]  ? __save_stack_trace+0x92/0x100
[   74.742899]  seq_read+0xb6/0x450
[   74.742904]  full_proxy_read+0x50/0x70
[   74.742909]  __vfs_read+0x36/0x170
[   74.742913]  vfs_read+0x8e/0x130
[   74.742916]  SyS_read+0x52/0xc0
[   74.742921]  do_syscall_64+0x6e/0x120
[   74.742926]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   74.742929] RIP: 0033:0x7f8b3ac49d11
[   74.742930] RSP: 002b:00007ffc62e3fbc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   74.742934] RAX: ffffffffffffffda RBX: 0000000000b32423 RCX: 00007f8b3ac49d11
[   74.742935] RDX: 0000000000000200 RSI: 00007ffc62e3fbe0 RDI: 0000000000000006
[   74.742937] RBP: 0000000000b31e70 R08: 0000000000000000 R09: 0000000000000000
[   74.742939] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[   74.742940] R13: 0000000000000005 R14: 0000000000000006 R15: 0000000000000000
[   74.742943] Code: e7 fe ff ff 80 3d 7a 20 17 00 00 0f 85 40 fe ff ff 48 c7 c6 78 66 3d c0 48 c7 c7 6e 1d 3c c0 c6 05 5f 20 17 00 01 e8 22 b7 98 c3 <0f> 0b 8b 4c 24 10 e9 1b fe ff ff 80 3d 48 20 17 00 00 0f 85 07 
[   74.743050] WARNING: CPU: 3 PID: 2074 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.743051] ---[ end trace 1c66f8f80e4df2f0 ]---
Comment 10 Elizabeth 2018-04-09 22:03:23 UTC
Results for igt@debugfs_test@read_all_entries
Result: dmesg-warn

Stdout	
IGT-Version: 1.22-g8cc6f71 (x86_64) (Linux: 4.16.0-rc7-drm-intel-qa-ww15-commit-1be0731+ x86_64)
Subtest read_all_entries: SUCCESS (1.544s)
Stderr	
Environment	
PIGLIT_SOURCE_DIR="/home/gfx/intel-graphics/intel-gpu-tools/piglit" PIGLIT_PLATFORM="mixed_glx_egl"
Command	/home/gfx/intel-graphics/intel-gpu-tools/tests/debugfs_test --run-subtest read_all_entries
dmesg	
[   74.653930] ------------[ cut here ]------------
[   74.653931] WARN_ON_ONCE(mcr & mcr_slice_subslice_mask)
[   74.654020] WARNING: CPU: 1 PID: 2062 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.654023] Modules linked in: snd_hda_codec_hdmi cmac bnep 8250_dw snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus arc4 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_intel snd_hda_codec snd_hda_core kvm snd_hwdep snd_pcm irqbypass crct10dif_pclmul crc32_pclmul iwlmvm ghash_clmulni_intel pcbc mac80211 snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq aesni_intel aes_x86_64 crypto_simd glue_helper snd_seq_device cryptd snd_timer input_leds snd iwlwifi wmi_bmof serio_raw btusb asix btrtl btbcm usbnet btintel idma64 virt_dma shpchp soundcore mii bluetooth intel_lpss_pci intel_lpss cfg80211 mei_me mei ecdh_generic intel_pch_thermal
[   74.654086]  mac_hid acpi_pad parport_pc ppdev lp parport ip_tables x_tables autofs4 i915 dwc3 udc_core ulpi e1000e dwc3_pci prime_numbers wmi video
[   74.654106] CPU: 1 PID: 2062 Comm: debugfs_test Tainted: G     U           4.16.0-rc7-drm-intel-qa-ww15-commit-1be0731+ #1
[   74.654108] Hardware name: Intel Corporation CannonLake Client Platform/CannonLake Y LPDDR4 RVP, BIOS CNLSFWR1.R00.X124.B02.1802051422 02/05/2018
[   74.654156] RIP: 0010:intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.654159] RSP: 0018:ffffaeb6c384fbf0 EFLAGS: 00010082
[   74.654162] RAX: 0000000000000000 RBX: 000000000f000000 RCX: ffffffff9aa58448
[   74.654164] RDX: 0000000000000001 RSI: 0000000000000082 RDI: 0000000000000002
[   74.654165] RBP: ffffaeb6c384fcb8 R08: 000000000000002b R09: 0000000000000ac7
[   74.654167] R10: 00000000ffffffff R11: 0000000000000ac7 R12: 0000000000000001
[   74.654169] R13: ffff9d4aa8c80000 R14: 0000000000000000 R15: 0000000011000000
[   74.654172] FS:  00007fea70a3fa40(0000) GS:ffff9d4abf880000(0000) knlGS:0000000000000000
[   74.654174] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.654176] CR2: 00007f7d194a09b8 CR3: 00000002a93f6001 CR4: 0000000000760ee0
[   74.654178] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   74.654180] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   74.654181] PKRU: 55555554
[   74.654182] Call Trace:
[   74.654225]  i915_hangcheck_info+0x107/0x3f0 [i915]
[   74.654233]  ? is_bpf_text_address+0xa/0x20
[   74.654238]  ? __save_stack_trace+0x92/0x100
[   74.654244]  seq_read+0xb6/0x450
[   74.654249]  full_proxy_read+0x50/0x70
[   74.654255]  __vfs_read+0x36/0x170
[   74.654258]  vfs_read+0x8e/0x130
[   74.654261]  SyS_read+0x52/0xc0
[   74.654266]  do_syscall_64+0x6e/0x120
[   74.654271]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   74.654274] RIP: 0033:0x7fea6ebfcd11
[   74.654276] RSP: 002b:00007ffc559521b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   74.654279] RAX: ffffffffffffffda RBX: 00000000019fc743 RCX: 00007fea6ebfcd11
[   74.654281] RDX: 0000000000000200 RSI: 00007ffc559521d0 RDI: 0000000000000006
[   74.654282] RBP: 00000000019fc190 R08: 0000000000000000 R09: 0000000000000000
[   74.654284] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
[   74.654286] R13: 0000000000000005 R14: 0000000000000006 R15: 0000000000000000
[   74.654288] Code: e7 fe ff ff 80 3d a7 2e 17 00 00 0f 85 40 fe ff ff 48 c7 c6 98 a6 40 c0 48 c7 c7 b1 5c 3f c0 c6 05 8c 2e 17 00 01 e8 22 85 55 d9 <0f> 0b 8b 4c 24 10 e9 1b fe ff ff 80 3d 75 2e 17 00 00 0f 85 07 
[   74.654394] WARNING: CPU: 1 PID: 2062 at drivers/gpu/drm/i915/intel_engine_cs.c:835 intel_engine_get_instdone+0x45e/0x4b0 [i915]
[   74.654396] ---[ end trace 1c1702ada2409c7b ]---
Comment 11 Elizabeth 2018-04-13 20:23:45 UTC
Results for igt@debugfs_test@read_all_entries
Result: dmesg-warn

Out	
IGT-Version: 1.22-g80e4910 (x86_64) (Linux: 4.16.0-rc7-drm-intel-qa-ww15-commit-6b9e85a+ x86_64)
Subtest read_all_entries: SUCCESS (1.549s)

Dmesg	
[   73.371442] ------------[ cut here ]------------
[   73.371444] WARN_ON_ONCE(mcr & mcr_slice_subslice_mask)

I still believe this may be HW related, but I'm not an expert, so I'll leave it open. Thank you.
Comment 12 Jani Saarinen 2018-04-24 06:54:42 UTC
Rodrigo, is this a real issue?
Comment 13 Martin Peres 2018-09-04 12:03:42 UTC
Still could not reproduce this issue on our multiple platforms in CI.

Closing as INVALID.
Comment 14 Chris Wilson 2018-09-04 12:07:26 UTC
commit 1e40d4aea57bbbd277777dd1fe18599dd77c55ab
Author: Yunwei Zhang <yunwei.zhang@intel.com>
Date:   Fri May 18 15:39:57 2018 -0700

    drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads
    
    WaProgramMgsrForCorrectSliceSpecificMmioReads dictate that before any MMIO
    read into Slice/Subslice specific registers, MCR packet control
    register(0xFDC) needs to be programmed to point to any enabled
    slice/subslice pair. Otherwise, incorrect value will be returned.
    
    However, that means each subsequent MMIO read will be forwarded to a
    specific slice/subslice combination as read is unicast. This is OK since
    slice/subslice specific register values are consistent in almost all cases
    across slice/subslice. There are rare occasions such as INSTDONE that this
    value will be dependent on slice/subslice combo, in such cases, we need to
    program 0xFDC and recover this after. This is already covered by
    read_subslice_reg.
    
    Also, 0xFDC will lose its information after TDR/engine reset/power state
    change.
    
    References: HSD#1405586840, BSID#0575
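
The program-read-restore sequence described in the commit message can be illustrated with a toy model. The Python sketch below is not the i915 implementation and is not the kernel's `read_subslice_reg`: `FakeMmio`, `read_steered`, and the register offset 0x1234 are invented for illustration, and the bit positions (slice at bits 27:26, subslice at bits 25:24) are assumed from the mask printed in the warning.

```python
MCR_SELECTOR = 0xFDC  # register offset named in the commit message


def mcr_slice(s: int) -> int:
    return (s & 3) << 26   # assumed: bits 27:26 select the slice


def mcr_subslice(ss: int) -> int:
    return (ss & 3) << 24  # assumed: bits 25:24 select the subslice


MCR_MASK = mcr_slice(3) | mcr_subslice(3)


class FakeMmio:
    """Toy register file: reads are steered by the current selector value."""

    def __init__(self, steered):
        self.mcr = 0
        self.steered = steered  # {(slice, subslice, reg): value}

    def read(self, reg):
        if reg == MCR_SELECTOR:
            return self.mcr
        s = (self.mcr >> 26) & 3
        ss = (self.mcr >> 24) & 3
        return self.steered.get((s, ss, reg), 0)

    def write(self, reg, value):
        if reg == MCR_SELECTOR:
            self.mcr = value


def read_steered(mmio, s, ss, reg):
    """Program the selector, read one register, then restore the old selector."""
    old = mmio.read(MCR_SELECTOR)
    mmio.write(MCR_SELECTOR, (old & ~MCR_MASK) | mcr_slice(s) | mcr_subslice(ss))
    try:
        return mmio.read(reg)
    finally:
        mmio.write(MCR_SELECTOR, old)


HYPOTHETICAL_REG = 0x1234  # made-up offset standing in for a per-subslice register
mmio = FakeMmio({(0, 1, HYPOTHETICAL_REG): 0xABCD})
mmio.write(MCR_SELECTOR, 0x10000000)  # unrelated bits that must survive
value = read_steered(mmio, 0, 1, HYPOTHETICAL_REG)
print(hex(value), hex(mmio.read(MCR_SELECTOR)))
```

The point of the `finally` clause in this sketch is that the previous selector value is written back even if the read fails, which mirrors the "program 0xFDC and recover this after" wording above.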

