Bug 105524 - [CI] igt@* - incomplete - irq/124-mei_me - Call Trace: <4>[ 56.184872] mei_irq_read_handler+0x26d/0x650 [mei]
Summary: [CI] igt@* - incomplete - irq/124-mei_me - Call Trace: <4>[ 56.184872] mei...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Jani Saarinen
QA Contact: Intel GFX Bugs mailing list
URL: https://bugzilla.kernel.org/show_bug....
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-15 15:33 UTC by Marta Löfstedt
Modified: 2019-01-16 11:54 UTC

See Also:
i915 platform: CFL, KBL, SKL
i915 features:



Description Marta Löfstedt 2018-03-15 15:33:08 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_1/fi-cfl-s2/igt@gem_softpin@noreloc-s3.html

From pstore:
<4>[   56.184849] Modules linked in: ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp e1000e crct10dif_pclmul crc32_pclmul i915 ghash_clmulni_intel mei_me mei prime_numbers
<4>[   56.184857] CPU: 2 PID: 1520 Comm: irq/124-mei_me Tainted: G     U           4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
<4>[   56.184857] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X104.B11.1710091318 10/09/2017
<4>[   56.184861] RIP: 0010:mei_hbm_dispatch+0x15b/0xbe0 [mei]
<4>[   56.184862] RSP: 0018:ffffb9e5c0fafdb0 EFLAGS: 00010297
<4>[   56.184863] RAX: 0000000000000000 RBX: ffffa1214ef06690 RCX: 0000000000000003
<4>[   56.184864] RDX: ffffb9e5c01a9004 RSI: ffffb9e5c01a9004 RDI: 000000008024240c
<4>[   56.184864] RBP: ffffa1214ef06a48 R08: ffffa121477230f0 R09: 000000000f171054
<4>[   56.184865] R10: ffffb9e5c0fafe30 R11: 0000000000000001 R12: ffffb9e5c0fafe40
<4>[   56.184865] R13: ffffa1214ef06c48 R14: ffffa1214ef06690 R15: ffffffffb50f3410
<4>[   56.184866] FS:  0000000000000000(0000) GS:ffffa1215b280000(0000) knlGS:0000000000000000
<4>[   56.184867] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   56.184867] CR2: 00007fc001a73240 CR3: 0000000415210005 CR4: 00000000003606e0
<4>[   56.184868] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   56.184868] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[   56.184869] Call Trace:
<4>[   56.184872]  mei_irq_read_handler+0x26d/0x650 [mei]
<4>[   56.184874]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[   56.184876]  ? irq_thread+0x80/0x1b0
<4>[   56.184878]  mei_me_irq_thread_handler+0x3c4/0xa50 [mei_me]
<4>[   56.184880]  ? irq_thread+0xb5/0x1b0
<4>[   56.184881]  ? irq_thread+0x80/0x1b0
<4>[   56.184882]  irq_thread_fn+0x16/0x40
<4>[   56.184883]  irq_thread+0x14e/0x1b0
<4>[   56.184885]  ? irq_forced_thread_fn+0x60/0x60
<4>[   56.184886]  ? wake_threads_waitq+0x30/0x30
<4>[   56.184888]  kthread+0xfb/0x130
<4>[   56.184889]  ? irq_thread_dtor+0x90/0x90
<4>[   56.184891]  ? _kthread_create_on_node+0x30/0x30
<4>[   56.184893]  ret_from_fork+0x3a/0x50
<4>[   56.184895] Code: 8b 3b be 01 00 00 00 e8 24 d9 4f f5 31 c0 e9 02 ff ff ff 3c 8a 0f 84 37 03 00 00 3c 90 0f 84 5c 02 00 00 3c 87 0f 84 79 03 00 00 <0f> 0b 3c 03 0f 84 96 01 00 00 3c 07 75 f2 0f 1f 44 00 00 48 8b 
<1>[   56.184921] RIP: mei_hbm_dispatch+0x15b/0xbe0 [mei] RSP: ffffb9e5c0fafdb0
<4>[   56.184949] ---[ end trace 869670b1217b5f1e ]---
<7>[   56.197650] [drm:intel_power_well_enable [i915]] enabling DDI C IO power well
<7>[   56.197680] [drm:intel_power_well_enable [i915]] enabling DDI D IO power well
<7>[   56.222551] [drm:intel_atomic_check [i915]] New cdclk calculated to be logical 337500 kHz, actual 337500 kHz
<7>[   56.232900] [drm:intel_atomic_check [i915]] New voltage level calculated to be logical 0, actual 0
<7>[   56.233747] [drm:intel_edp_backlight_off [i915]] 
<7>[   56.445244] [drm:intel_panel_actually_set_backlight [i915]] set backlight PWM = 0
<7>[   56.456221] [drm:intel_disable_pipe [i915]] disabling pipe A
<7>[   56.470178] [drm:intel_edp_panel_off.part.30 [i915]] Turn eDP port A panel power off
<7>[   56.470437] [drm:intel_edp_panel_off.part.30 [i915]] Wait for panel power off time
<7>[   56.482455] [drm:gen8_de_irq_handler [i915]] hotplug event received, stat 0x01000000, dig 0x11101010, pins 0x00000010
<1>[   56.482456] BUG: unable to handle kernel NULL pointer dereference at 0000000000000006
<7>[   56.482485] [drm:intel_hpd_irq_handler [i915]] digital hpd port A - short
<1>[   56.482486] IP: 0x6
<7>[   56.482542] [drm:wait_panel_status [i915]] mask b0000000 value 00000000 status a0000002 control 00000060
<6>[   56.482543] PGD 0 
<7>[   56.482571] [drm:intel_dp_hpd_pulse [i915]] got hpd irq on port A - short
<4>[   56.482571] P4D 0 
<4>[   56.482573] Oops: 0010 [#2] PREEMPT SMP PTI
<0>[   56.482575] Dumping ftrace buffer:
<0>[   56.482579]    (ftrace buffer empty)
<4>[   56.482580] Modules linked in: ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp e1000e crct10dif_pclmul crc32_pclmul i915 ghash_clmulni_intel mei_me mei prime_numbers
<4>[   56.482589] CPU: 2 PID: 1520 Comm: irq/124-mei_me Tainted: G     UD          4.16.0-rc5-g613eb885b69e-drmtip_1+ #1
<4>[   56.482589] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X104.B11.1710091318 10/09/2017
<4>[   56.482590] RIP: 0010:0x6
<4>[   56.482591] RSP: 0018:ffffb9e5c0fafea0 EFLAGS: 00010282
<4>[   56.482592] RAX: ffffb9e5c0fafed0 RBX: ffffa12147722fd0 RCX: 0000000000000001
<4>[   56.482593] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffb9e5c0fafed0
<4>[   56.482593] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
<4>[   56.482594] R10: 0000000000000000 R11: ffffffffb50a1c99 R12: ffffa12147722840
<4>[   56.482594] R13: ffffffffb605d245 R14: ffffa12147723030 R15: 0000000000000000
<4>[   56.482595] FS:  0000000000000000(0000) GS:ffffa1215b280000(0000) knlGS:0000000000000000
<4>[   56.482596] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   56.482597] CR2: 0000000000000006 CR3: 0000000415210005 CR4: 00000000003606e0
<4>[   56.482597] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   56.482598] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[   56.482598] Call Trace:
<4>[   56.482601]  ? task_work_run+0x88/0xb0
<4>[   56.482603]  ? do_exit+0x304/0xcb0
<4>[   56.482605]  ? kthread+0xfb/0x130
<4>[   56.482607]  ? irq_thread_dtor+0x90/0x90
<4>[   56.482609]  ? rewind_stack_do_exit+0x17/0x20
<4>[   56.482612] Code:  Bad RIP value.
<1>[   56.482615] RIP: 0x6 RSP: ffffb9e5c0fafea0
<4>[   56.482615] CR2: 0000000000000006
<4>[   56.482617] ---[ end trace 869670b1217b5f1f ]---
<7>[   56.517747] [drm:wait_panel_status [i915]] Wait complete
<7>[   56.517767] [drm:gen8_de_irq_handler [i915]] hotplug event received, stat 0x01000000, dig 0x12101010, pins 0x00000010
<7>[   56.517960] [drm:__intel_fbc_disable [i915]] Disabling FBC on pipe A
<7>[   56.528855] [drm:intel_hpd_irq_handler [i915]] digital hpd port A - long
<7>[   56.528888] [drm:edp_panel_vdd_on [i915]] Turning eDP port A VDD on
<7>[   56.528913] [drm:intel_disable_shared_dpll [i915]] disable DPLL 0 (active 1, on? 1) for crtc 41
<7>[   56.529531] [drm:intel_hpd_irq_handler [i915]] Received HPD interrupt on PIN 4 - cnt: 0
<7>[   56.529557] [drm:wait_panel_power_cycle [i915]] Wait for panel power cycle
<7>[   56.529580] [drm:intel_disable_shared_dpll [i915]] disabling DPLL 0
<7>[   56.539233] [drm:hsw_audio_codec_disable [i915]] Disable audio codec on pipe B
<7>[   56.539429] [drm:intel_disable_pipe [i915]] disabling pipe B
<1>[   56.722868] Fixing recursive fault but reboot is needed!
<7>[   57.132398] [drm:wait_panel_status [i915]] mask b800000f value 00000000 status 00000000 control 00000060
<7>[   57.132580] [drm:wait_panel_status [i915]] Wait complete
<7>[   57.132956] [drm:edp_panel_vdd_on [i915]] PP_STATUS: 0x00000000 PP_CONTROL: 0x00000068
<7>[   57.133022] [drm:edp_panel_vdd_on [i915]] eDP port A panel power wasn't enabled
<7>[   57.183051] [drm:gen8_de_irq_handler [i915]] hotplug event received, stat 0x01000000, dig 0x12101010, pins 0x00000010
<7>[   57.183194] [drm:intel_hpd_irq_handler [i915]] digital hpd port A - long
<7>[   57.183342] [drm:intel_hpd_irq_handler [i915]] Received HPD interrupt on PIN 4 - cnt: 1
<7>[   57.340799] [drm:intel_dp_read_dpcd [i915]] DPCD: 12 0a 84 41 00 00 01 01 02 00 00 00 00 0b 00
<7>[   57.341003] [drm:intel_disable_shared_dpll [i915]] disable DPLL 1 (active 2, on? 1) for crtc 55
<7>[   57.341104] [drm:intel_disable_shared_dpll [i915]] disabling DPLL 1
<7>[   57.341145] [drm:intel_atomic_commit_tail [i915]] [ENCODER:70:DDI A]
<7>[   57.341175] [drm:intel_atomic_commit_tail [i915]] [ENCODER:77:DDI B]
<7>[   57.341206] [drm:intel_atomic_commit_tail [i915]] [ENCODER:79:DP-MST A]
<7>[   57.341228] [drm:intel_atomic_commit_tail [i915]] [ENCODER:80:DP-MST B]
<7>[   57.341250] [drm:intel_atomic_commit_tail [i915]] [ENCODER:81:DP-MST C]
<7>[   57.341277] [drm:intel_atomic_commit_tail [i915]] [ENCODER:84:DDI C]
<7>[   57.341305] [drm:intel_atomic_commit_tail [i915]] [ENCODER:86:DP-MST A]
<7>[   57.341332] [drm:intel_atomic_commit_tail [i915]] [ENCODER:87:DP-MST B]
<7>[   57.341359] [drm:intel_atomic_commit_tail [i915]] [ENCODER:88:DP-MST C]
<7>[   57.341385] [drm:intel_atomic_commit_tail [i915]] [ENCODER:93:DDI D]
<7>[   57.341411] [drm:intel_atomic_commit_tail [i915]] [ENCODER:95:DP-MST A]
<7>[   57.341439] [drm:intel_atomic_commit_tail [i915]] [ENCODER:96:DP-MST B]
<7>[   57.341467] [drm:intel_atomic_commit_tail [i915]] [ENCODER:97:DP-MST C]
<7>[   57.341493] [drm:verify_connector_state.isra.78 [i915]] [CONNECTOR:71:eDP-1]
<7>[   57.341558] [drm:verify_connector_state.isra.78 [i915]] [CONNECTOR:78:DP-1]
<7>[   57.341602] [drm:verify_single_dpll_state.isra.79 [i915]] DPLL 0
<7>[   57.341650] [drm:verify_single_dpll_state.isra.79 [i915]] DPLL 1
<7>[   57.341678] [drm:verify_single_dpll_state.isra.79 [i915]] DPLL 2
<7>[   57.341706] [drm:verify_single_dpll_state.isra.79 [i915]] DPLL 3
<7>[   57.341749] [drm:intel_atomic_commit_tail [i915]] [CRTC:41:pipe A]
<7>[   57.341777] [drm:intel_atomic_commit_tail [i915]] [CRTC:55:pipe B]
<7>[   57.341928] [drm:intel_enable_sagv [i915]] Enabling the SAGV
<7>[   57.353138] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off
<7>[   57.353396] [drm:edp_panel_vdd_off_sync [i915]] PP_STATUS: 0x00000000 PP_CONTROL: 0x00000060
<0>[  119.788355] mei_me 0000:00:16.0: **** DPM device timeout ****
<4>[  119.788359] Call Trace:
<4>[  119.788374]  ? __schedule+0x3c7/0xb00
<4>[  119.788381]  ? pci_pm_freeze+0xd0/0xd0
<4>[  119.788387]  schedule+0x37/0x90
<4>[  119.788394]  synchronize_irq+0x57/0x90
<4>[  119.788400]  ? wait_woken+0x90/0x90
<4>[  119.788412]  mei_stop+0x57/0xa0 [mei]
<4>[  119.788421]  mei_me_pci_suspend+0x27/0x70 [mei_me]
<4>[  119.788426]  pci_pm_suspend+0x74/0x140
<4>[  119.788433]  dpm_run_callback+0x5f/0x310
<4>[  119.788439]  __device_suspend+0xfa/0x5b0
<4>[  119.788445]  ? dpm_watchdog_set+0x60/0x60
<4>[  119.788453]  async_suspend+0x15/0x90
<4>[  119.788459]  async_run_entry_fn+0x2e/0x160
<4>[  119.788465]  process_one_work+0x215/0x620
<4>[  119.788474]  worker_thread+0x48/0x3a0
<4>[  119.788483]  kthread+0xfb/0x130
<4>[  119.788487]  ? process_one_work+0x620/0x620
<4>[  119.788492]  ? _kthread_create_on_node+0x30/0x30
<4>[  119.788499]  ret_from_fork+0x3a/0x50
<0>[  119.788511] Kernel panic - not syncing: mei_me 0000:00:16.0: unrecoverable failure
<0>[  119.788511] 
<0>[  119.788590] Dumping ftrace buffer:
<0>[  119.788601]    (ftrace buffer empty)
<0>[  119.788606] Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Comment 1 Marta Löfstedt 2018-03-28 05:51:26 UTC
This machine is no longer in the lab, so I will close this bug.
Comment 2 Marta Löfstedt 2018-04-04 06:14:51 UTC
reopened:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_11/fi-skl-6700k2/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html

<4>[  180.842987] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me prime_numbers mei
<4>[  180.843005] CPU: 3 PID: 1547 Comm: irq/130-mei_me Tainted: G     U           4.16.0-rc7-gc46052cde6a5-drmtip_11+ #1
<4>[  180.843006] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 0802 09/02/2015
<4>[  180.843010] RIP: 0010:mei_hbm_dispatch+0x8b/0xbc0 [mei]
<4>[  180.843012] RSP: 0018:ffffa0984109fdb8 EFLAGS: 00010297
<4>[  180.843013] RAX: 0000000000000000 RBX: ffff896a4579d3f0 RCX: 0000000000000003
<4>[  180.843015] RDX: ffffa09840181004 RSI: ffffa09840181004 RDI: 000000008014140c
<4>[  180.843016] RBP: ffff896a4579d7a8 R08: 0000000051816e62 R09: 0000000000000001
<4>[  180.843017] R10: ffffa0984109fe38 R11: 0000000000000000 R12: ffffa0984109fe44
<4>[  180.843018] R13: ffff896a4579d9a8 R14: ffffa0984109fe48 R15: ffffffff840f3f21
<4>[  180.843020] FS:  0000000000000000(0000) GS:ffff896a55cc0000(0000) knlGS:0000000000000000
<4>[  180.843021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  180.843022] CR2: 000055edee791680 CR3: 0000000144210005 CR4: 00000000003606e0
<4>[  180.843023] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  180.843024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  180.843025] Call Trace:
<4>[  180.843030]  mei_irq_read_handler+0x2d5/0x630 [mei]
<4>[  180.843035]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  180.843039]  ? irq_thread+0x131/0x1c0
<4>[  180.843041]  mei_me_irq_thread_handler+0x429/0xab0 [mei_me]
<4>[  180.843044]  ? __schedule+0x2a6/0xbb0
<4>[  180.843047]  ? irq_thread+0x87/0x1c0
<4>[  180.843050]  ? irq_forced_thread_fn+0x70/0x70
<4>[  180.843052]  ? irq_thread+0x131/0x1c0
<4>[  180.843054]  irq_thread_fn+0x1c/0x50
<4>[  180.843056]  ? irq_thread+0x131/0x1c0
<4>[  180.843058]  ? irq_thread+0xbe/0x1c0
<4>[  180.843060]  irq_thread+0x15f/0x1c0
<4>[  180.843063]  ? wake_threads_waitq+0x30/0x30
<4>[  180.843066]  ? irq_thread_dtor+0x90/0x90
<4>[  180.843069]  kthread+0x119/0x130
<4>[  180.843071]  ? _kthread_create_on_node+0x30/0x30
<4>[  180.843074]  ret_from_fork+0x3a/0x50
<4>[  180.843078] Code: 02 00 00 76 49 3c 86 0f 84 0f 05 00 00 0f 86 b1 00 00 00 3c 8a 0f 84 01 04 00 00 3c 90 0f 84 7b 02 00 00 3c 87 0f 84 49 04 00 00 <0f> 0b 48 8b 33 48 c7 c2 b8 cd 0b c0 48 c7 c7 38 46 0c c0 e8 9d 
<1>[  180.843127] RIP: mei_hbm_dispatch+0x8b/0xbc0 [mei] RSP: ffffa0984109fdb8
<4>[  180.843142] ---[ end trace cbdda785752d805e ]---
<4>[  182.112176] sched: RT throttling activated
<1>[  182.424544] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
<1>[  182.424558] IP: 0x4
<6>[  182.424562] PGD 0 P4D 0 
<4>[  182.424574] Oops: 0010 [#2] PREEMPT SMP PTI
<0>[  182.424581] Dumping ftrace buffer:
<0>[  182.424589]    (ftrace buffer empty)
<4>[  182.424592] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
<5>[  182.424605] sd 0:0:0:0: [sda] Synchronizing SCSI cache
<4>[  182.424606]  i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me prime_numbers mei
<4>[  182.424646] CPU: 3 PID: 1547 Comm: irq/130-mei_me Tainted: G     UD          4.16.0-rc7-gc46052cde6a5-drmtip_11+ #1
<4>[  182.424649] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 0802 09/02/2015
<4>[  182.424652] RIP: 0010:0x4
<4>[  182.424656] RSP: 0018:ffffa0984109fea8 EFLAGS: 00010282
<4>[  182.424662] RAX: 0000000000000004 RBX: ffff896a3cef57d0 RCX: 0000000000000001
<4>[  182.424665] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffa0984109fed0
<4>[  182.424668] RBP: ffffffff8504dd01 R08: 0000000000000000 R09: 0000000000000000
<4>[  182.424672] R10: ffffa0984109fe30 R11: 0000000000000000 R12: ffff896a3cef5040
<4>[  182.424675] R13: ffffffff8505dbdd R14: ffffa0984109fed0 R15: ffff896a3cef5830
<4>[  182.424680] FS:  0000000000000000(0000) GS:ffff896a55cc0000(0000) knlGS:0000000000000000
<4>[  182.424683] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  182.424687] CR2: 0000000000000004 CR3: 0000000144210005 CR4: 00000000003606e0
<4>[  182.424690] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  182.424693] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  182.424696] Call Trace:
<4>[  182.424705]  ? task_work_run+0x82/0xb0
<4>[  182.424714]  ? do_exit+0x386/0xc90
<4>[  182.424723]  ? irq_thread_dtor+0x90/0x90
<4>[  182.424729]  ? kthread+0x119/0x130
<4>[  182.424738]  ? rewind_stack_do_exit+0x17/0x20
<4>[  182.424749] Code:  Bad RIP value.
<1>[  182.424764] RIP: 0x4 RSP: ffffa0984109fea8
<4>[  182.424767] CR2: 0000000000000004
<4>[  182.424773] ---[ end trace cbdda785752d805f ]---
<5>[  182.437751] sd 0:0:0:0: [sda] Stopping disk
<1>[  183.681918] Fixing recursive fault but reboot is needed!
<0>[  242.659014] mei_me 0000:00:16.0: **** DPM device timeout ****
<4>[  242.659017] Call Trace:
<4>[  242.659031]  ? __schedule+0x29e/0xbb0
<4>[  242.659040]  ? prepare_to_wait_event+0x83/0x160
<4>[  242.659048]  schedule+0x2d/0x90
<4>[  242.659056]  synchronize_irq+0x57/0x90
<4>[  242.659063]  ? wait_woken+0x90/0x90
<4>[  242.659076]  mei_stop+0x63/0xb0 [mei]
<4>[  242.659086]  mei_me_pci_suspend+0x27/0x80 [mei_me]
<4>[  242.659094]  pci_pm_suspend+0x7c/0x130
<4>[  242.659100]  ? pci_pm_freeze+0xb0/0xb0
<4>[  242.659107]  dpm_run_callback+0x5d/0x2f0
<4>[  242.659115]  __device_suspend+0xfb/0x5e0
<4>[  242.659123]  ? dpm_watchdog_set+0x60/0x60
<4>[  242.659133]  async_suspend+0x15/0x90
<4>[  242.659140]  async_run_entry_fn+0x34/0x160
<4>[  242.659147]  process_one_work+0x224/0x680
<4>[  242.659157]  worker_thread+0x35/0x380
<4>[  242.659165]  ? process_one_work+0x680/0x680
<4>[  242.659172]  kthread+0x119/0x130
<4>[  242.659179]  ? _kthread_create_on_node+0x30/0x30
<4>[  242.659188]  ret_from_fork+0x3a/0x50
<0>[  242.659202] Kernel panic - not syncing: mei_me 0000:00:16.0: unrecoverable failure
<0>[  242.659202] 
<0>[  242.659270] Dumping ftrace buffer:
<0>[  242.659277]    (ftrace buffer empty)
<0>[  242.659283] Kernel Offset: 0x3000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Comment 3 Marta Löfstedt 2018-04-06 09:30:45 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_12/fi-kbl-7500u/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

from pstore:
<4>[  192.623055] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul btusb crc32_pclmul btrtl ghash_clmulni_intel btbcm btintel bluetooth snd_hda_intel snd_hda_codec e1000e snd_hwdep snd_hda_core ecdh_generic mei_me snd_pcm mei prime_numbers
<4>[  192.623106] CPU: 3 PID: 1554 Comm: irq/123-mei_me Tainted: G     U           4.16.0-rc7-g8a51883453a9-drmtip_12+ #1
<4>[  192.623109] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F7 06/28/2017
<4>[  192.623120] RIP: 0010:mei_hbm_dispatch+0x17e/0xc10 [mei]
<4>[  192.623123] RSP: 0018:ffff9ae58043bd98 EFLAGS: 00010297
<4>[  192.623128] RAX: 0000000000000000 RBX: ffff8a76e0219c10 RCX: 0000000000000003
<4>[  192.623131] RDX: ffff9ae580175004 RSI: ffff9ae580175004 RDI: 00000000800a0a0c
<4>[  192.623133] RBP: ffff8a76e0219fc8 R08: ffff8a76c905d8f8 R09: 00000000236bfc32
<4>[  192.623136] R10: ffff9ae58043be20 R11: 0000000000000001 R12: ffff9ae58043be30
<4>[  192.623138] R13: ffff8a76e021a1c8 R14: ffff8a76e0219c10 R15: ffffffff890f9e60
<4>[  192.623142] FS:  0000000000000000(0000) GS:ffff8a76edd80000(0000) knlGS:0000000000000000
<4>[  192.623144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  192.623147] CR2: 00007f0973a52200 CR3: 0000000182210001 CR4: 00000000003606e0
<4>[  192.623149] Call Trace:
<4>[  192.623163]  mei_irq_read_handler+0x26d/0x650 [mei]
<4>[  192.623173]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  192.623181]  ? irq_thread+0x90/0x1e0
<4>[  192.623188]  mei_me_irq_thread_handler+0x3e8/0xa70 [mei_me]
<4>[  192.623196]  ? irq_thread+0xc5/0x1e0
<4>[  192.623202]  ? irq_thread+0x90/0x1e0
<4>[  192.623207]  irq_thread_fn+0x16/0x40
<4>[  192.623214]  irq_thread+0x172/0x1e0
<4>[  192.623219]  ? irq_forced_thread_fn+0x60/0x60
<4>[  192.623227]  ? wake_threads_waitq+0x30/0x30
<4>[  192.623234]  kthread+0xfb/0x130
<4>[  192.623240]  ? irq_thread_dtor+0x90/0x90
<4>[  192.623245]  ? _kthread_create_on_node+0x60/0x60
<4>[  192.623253]  ret_from_fork+0x3a/0x50
<4>[  192.623264] Code: 8b 3b be 01 00 00 00 e8 11 df 3f c9 31 c0 e9 ef fe ff ff 3c 8a 0f 84 39 03 00 00 3c 90 0f 84 7c 01 00 00 3c 87 0f 84 fb 03 00 00 <0f> 0b 3c 03 0f 84 ba 00 00 00 3c 07 75 f2 0f 1f 44 00 00 48 8b 
<1>[  192.623384] RIP: mei_hbm_dispatch+0x17e/0xc10 [mei] RSP: ffff9ae58043bd98
<4>[  192.623423] ---[ end trace d206cb97d4e75227 ]---
<1>[  192.803911] BUG: unable to handle kernel NULL pointer dereference at 0000000000000006
<1>[  192.803913] IP: 0x6
<6>[  192.803914] PGD 0 P4D 0 
<4>[  192.803916] Oops: 0010 [#2] PREEMPT SMP PTI
<0>[  192.803917] Dumping ftrace buffer:
<0>[  192.803918]    (ftrace buffer empty)
<4>[  192.803919] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul btusb crc32_pclmul btrtl ghash_clmulni_intel btbcm btintel bluetooth snd_hda_intel snd_hda_codec e1000e snd_hwdep snd_hda_core ecdh_generic mei_me snd_pcm mei prime_numbers
<4>[  192.803934] CPU: 3 PID: 1554 Comm: irq/123-mei_me Tainted: G     UD          4.16.0-rc7-g8a51883453a9-drmtip_12+ #1
<4>[  192.803934] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F7 06/28/2017
<4>[  192.803935] RIP: 0010:0x6
<4>[  192.803936] RSP: 0018:ffff9ae58043be98 EFLAGS: 00010282
<4>[  192.803937] RAX: ffff9ae58043bec8 RBX: ffff8a76c905d7d8 RCX: 0000000000000001
<4>[  192.803938] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffff9ae58043bec8
<4>[  192.803938] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
<4>[  192.803939] R10: 0000000000000000 R11: ffff8a76c905d040 R12: ffff8a76c905d040
<4>[  192.803940] R13: ffffffff8a05d735 R14: ffff8a76c905d838 R15: 0000000000000000
<4>[  192.803940] FS:  0000000000000000(0000) GS:ffff8a76edd80000(0000) knlGS:0000000000000000
<4>[  192.803941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  192.803942] CR2: 0000000000000006 CR3: 0000000182210001 CR4: 00000000003606e0
<4>[  192.803942] Call Trace:
<4>[  192.803945]  ? task_work_run+0x88/0xb0
<4>[  192.803947]  ? do_exit+0x314/0xd30
<4>[  192.803949]  ? kthread+0xfb/0x130
<4>[  192.803950]  ? rewind_stack_do_exit+0x17/0x20
<4>[  192.803953] Code:  Bad RIP value.
<1>[  192.803956] RIP: 0x6 RSP: ffff9ae58043be98
<4>[  192.803957] CR2: 0000000000000006
<4>[  192.803958] ---[ end trace d206cb97d4e75228 ]---
Comment 4 Marta Löfstedt 2018-04-16 07:45:54 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_21/fi-skl-6700k2/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html

pstore:
<4>[  266.280526] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec e1000e snd_hwdep snd_hda_core snd_pcm mei_me mei prime_numbers
<4>[  266.280542] CPU: 6 PID: 4144 Comm: irq/130-mei_me Tainted: G     U           4.16.0-rc7-ga0e39233b887-drmtip_21+ #1
<4>[  266.280544] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 0802 09/02/2015
<4>[  266.280548] RIP: 0010:mei_hbm_dispatch+0x17e/0xc10 [mei]
<4>[  266.280549] RSP: 0018:ffffb75740dc7d98 EFLAGS: 00010297
<4>[  266.280551] RAX: 0000000000000000 RBX: ffff9971440e5d40 RCX: 0000000000000003
<4>[  266.280552] RDX: ffffb75740189004 RSI: ffffb75740189004 RDI: 000000008014140c
<4>[  266.280553] RBP: ffff9971440e60f8 R08: ffff99713f03b0f8 R09: 000000005e09e028
<4>[  266.280554] R10: ffffb75740dc7e20 R11: 0000000000000001 R12: ffffb75740dc7e30
<4>[  266.280555] R13: ffff9971440e62f8 R14: ffff9971440e5d40 R15: ffffffffac0f9e60
<4>[  266.280556] FS:  0000000000000000(0000) GS:ffff997155d80000(0000) knlGS:0000000000000000
<4>[  266.280557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  266.280558] CR2: 000055d7d568cd40 CR3: 000000004c210006 CR4: 00000000003606e0
<4>[  266.280559] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  266.280560] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  266.280561] Call Trace:
<4>[  266.280566]  mei_irq_read_handler+0x26d/0x650 [mei]
<4>[  266.280570]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[  266.280573]  ? irq_thread+0x90/0x1e0
<4>[  266.280576]  mei_me_irq_thread_handler+0x3e8/0xa70 [mei_me]
<4>[  266.280579]  ? irq_thread+0xc5/0x1e0
<4>[  266.280581]  ? irq_thread+0x90/0x1e0
<4>[  266.280583]  irq_thread_fn+0x16/0x40
<4>[  266.280585]  irq_thread+0x172/0x1e0
<4>[  266.280587]  ? irq_forced_thread_fn+0x60/0x60
<4>[  266.280590]  ? wake_threads_waitq+0x30/0x30
<4>[  266.280593]  kthread+0xfb/0x130
<4>[  266.280595]  ? irq_thread_dtor+0x90/0x90
<4>[  266.280597]  ? _kthread_create_on_node+0x60/0x60
<4>[  266.280601]  ret_from_fork+0x3a/0x50
<4>[  266.280605] Code: 8b 3b be 01 00 00 00 e8 11 df 24 ec 31 c0 e9 ef fe ff ff 3c 8a 0f 84 39 03 00 00 3c 90 0f 84 7c 01 00 00 3c 87 0f 84 fb 03 00 00 <0f> 0b 3c 03 0f 84 ba 00 00 00 3c 07 75 f2 0f 1f 44 00 00 48 8b 
<1>[  266.280649] RIP: mei_hbm_dispatch+0x17e/0xc10 [mei] RSP: ffffb75740dc7d98
<4>[  266.280664] ---[ end trace 9faad50c8ef83572 ]---
<5>[  266.286016] sd 0:0:0:0: [sda] Synchronizing SCSI cache
<5>[  266.293372] sd 0:0:0:0: [sda] Stopping disk
<4>[  267.576854] sched: RT throttling activated
<1>[  268.467906] BUG: unable to handle kernel NULL pointer dereference at 0000000000000006
<1>[  268.467917] IP: 0x6
<6>[  268.467919] PGD 0 P4D 0 
<4>[  268.467928] Oops: 0010 [#2] PREEMPT SMP PTI
<0>[  268.467933] Dumping ftrace buffer:
<0>[  268.467940]    (ftrace buffer empty)
<4>[  268.467943] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec e1000e snd_hwdep snd_hda_core snd_pcm mei_me mei prime_numbers
<4>[  268.467990] CPU: 6 PID: 4144 Comm: irq/130-mei_me Tainted: G     UD          4.16.0-rc7-ga0e39233b887-drmtip_21+ #1
<4>[  268.467993] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 0802 09/02/2015
<4>[  268.467996] RIP: 0010:0x6
<4>[  268.468000] RSP: 0018:ffffb75740dc7e98 EFLAGS: 00010282
<4>[  268.468005] RAX: ffffb75740dc7ec8 RBX: ffff99713f03afd8 RCX: 0000000000000001
<4>[  268.468008] RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffb75740dc7ec8
<4>[  268.468011] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
<4>[  268.468013] R10: 0000000000000000 R11: ffff99713f03a840 R12: ffff99713f03a840
<4>[  268.468016] R13: ffffffffad05d735 R14: ffff99713f03b038 R15: 0000000000000000
<4>[  268.468020] FS:  0000000000000000(0000) GS:ffff997155d80000(0000) knlGS:0000000000000000
<4>[  268.468023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  268.468026] CR2: 0000000000000006 CR3: 000000004c210006 CR4: 00000000003606e0
<4>[  268.468028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  268.468031] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  268.468033] Call Trace:
<4>[  268.468042]  ? task_work_run+0x88/0xb0
<4>[  268.468050]  ? do_exit+0x314/0xd30
<4>[  268.468058]  ? kthread+0xfb/0x130
<4>[  268.468067]  ? rewind_stack_do_exit+0x17/0x20
<4>[  268.468078] Code:  Bad RIP value.
<1>[  268.468091] RIP: 0x6 RSP: ffffb75740dc7e98
<4>[  268.468093] CR2: 0000000000000006
Comment 5 Jani Saarinen 2018-04-18 17:35:26 UTC
Pinged mei folks, waiting for reply
Comment 6 Jani Saarinen 2018-04-27 13:41:58 UTC
Martin, did you file a kernel bug for this? I guess we could drop CFL from this, as CFL-S2 is now CFL-S3 and no issues have been seen there.
Comment 7 Martin Peres 2018-04-27 14:14:54 UTC
Also reported on kernel.org: https://bugzilla.kernel.org/show_bug.cgi?id=199541
Comment 8 Francesco Balestrieri 2018-12-04 05:44:15 UTC
According to the kernel bug, there is a patch that fixes the issue when applied to our CI. If the patch is not yet merged in any of the kernels we use and we can't close the bug, we could at least reduce the priority. This bug is currently at the top of the list of our overdue bugs, but it doesn't look like there is much we can do.
Comment 9 Martin Peres 2019-01-16 11:54:03 UTC
(In reply to Francesco Balestrieri from comment #8)
> According to the kernel bug, there is a patch that fixes the issue when
> applied to our CI. If the patch is not yet merged in any of the kernels we
> use and we can't close the bug, we could at least reduce the priority. This
> bug is currently at the top of the list of our overdue bugs, but it doesn't
> look like there is much we can do.

It has been merged, and it is not part of Core-for-CI, so we are good to go. Closing!

