Bug 111221 - [CI][BAT] igt@i915_selftest@live_contexts - dmesg-warn - BUG: Bad page state in process kworker
Status: RESOLVED NOTOURBUG
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-07-26 08:26 UTC by Martin Peres
Modified: 2019-07-26 08:38 UTC

See Also:
i915 platform: ICL
i915 features: GEM/Other


Description Martin Peres 2019-07-26 08:26:21 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6553/fi-icl-dsi/igt@i915_selftest@live_contexts.html

<1> [435.321134] BUG: Bad page state in process kworker/u16:22  pfn:260fc0
<4> [435.321196] page:ffffea000983f000 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
<4> [435.321198] flags: 0x8000000000000000()
<4> [435.321201] raw: 8000000000000000 dead000000000100 dead000000000122 0000000000000000
<4> [435.321203] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000040000
<4> [435.321204] page dumped because: page still charged to cgroup
<4> [435.321205] page->mem_cgroup:0000000000040000
<4> [435.321207] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul snd_hda_codec ax88179_178a crc32_pclmul usbnet snd_hwdep mii snd_hda_core e1000e ghash_clmulni_intel snd_pcm ptp pps_core mei_me mei prime_numbers [last unloaded: i915]
<4> [435.321222] CPU: 6 PID: 2368 Comm: kworker/u16:22 Tainted: G     U            5.3.0-rc1-CI-CI_DRM_6553+ #1
<4> [435.321223] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3102.A00.1903052247 03/05/2019
<4> [435.321282] Workqueue: i915 __i915_gem_free_work [i915]
<4> [435.321284] Call Trace:
<4> [435.321289]  dump_stack+0x67/0x9b
<4> [435.321294]  bad_page+0xc2/0x120
<4> [435.321297]  free_pcppages_bulk+0x470/0x680
<4> [435.321304]  ? sg_alloc_table+0x50/0x50
<4> [435.321306]  free_unref_page+0x3b/0x60
<4> [435.321309]  __sg_free_table+0x49/0x80
<4> [435.321353]  huge_free_pages.isra.2+0x61/0x70 [i915]
<4> [435.321395]  huge_put_pages+0x1c/0x30 [i915]
<4> [435.321436]  __i915_gem_object_put_pages+0x6e/0xe0 [i915]
<4> [435.321475]  __i915_gem_free_objects+0x1df/0x4f0 [i915]
<4> [435.321514]  __i915_gem_free_work+0x5a/0x90 [i915]
<4> [435.321518]  process_one_work+0x245/0x5f0
<4> [435.321524]  worker_thread+0x1d0/0x380
<4> [435.321528]  ? process_one_work+0x5f0/0x5f0
<4> [435.321530]  kthread+0x119/0x130
<4> [435.321532]  ? kthread_park+0xa0/0xa0
<4> [435.321536]  ret_from_fork+0x3a/0x50
<4> [435.321544] Disabling lock debugging due to kernel taint
<6> [435.452367] i915: Running i915_gem_context_live_selftests/igt_vm_isolation
<7> [435.452969] [drm:intel_power_well_enable [i915]] enabling always-on
<6> [440.456642] Checked 26116 scratch offsets across 5 engines
<7> [440.458026] [drm:intel_power_well_disable [i915]] disabling always-on
<7> [441.191647] [drm:intel_power_well_enable [i915]] enabling always-on
<7> [441.191719] [drm:intel_power_well_enable [i915]] enabling DC off
<7> [441.192136] [drm:gen9_set_dc_state [i915]] Setting DC state from 02 to 00
<7> [441.192221] [drm:intel_combo_phy_init [i915]] Combo PHY A already enabled, won't reprogram it.
<7> [441.192286] [drm:intel_combo_phy_init [i915]] Combo PHY B already enabled, won't reprogram it.
<7> [441.192330] [drm:intel_power_well_enable [i915]] enabling power well 2
<7> [441.192380] [drm:intel_power_well_enable [i915]] enabling power well 3
<7> [441.192432] [drm:intel_power_well_enable [i915]] enabling power well 4
<4> [441.272941] i915: probe of 0000:00:02.0 failed with error -25
<6> [441.468776] [IGT] i915_selftest: exiting, ret=0
Comment 1 CI Bug Log 2019-07-26 08:27:04 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@i915_selftest@live_contexts - dmesg-warn - BUG: Bad page state in process kworker
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6553/fi-icl-dsi/igt@i915_selftest@live_contexts.html
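As a rough illustration of what a filter like this matches, the sketch below scans dmesg lines for the "BUG: Bad page state" signature named in the filter title. The regular expression and the function name are assumptions made for this example; the actual CI Bug Log filter definition is not shown in this report.

```python
import re

# Hypothetical pattern for illustration only; not the real CI Bug Log filter.
BAD_PAGE_RE = re.compile(
    r"^<(?P<level>\d)> \[(?P<ts>\d+\.\d+)\] "
    r"BUG: Bad page state in process (?P<comm>\S+)\s+pfn:(?P<pfn>[0-9a-f]+)"
)


def find_bad_page_reports(lines):
    """Return (timestamp, process name, pfn) for each matching dmesg line."""
    hits = []
    for line in lines:
        m = BAD_PAGE_RE.match(line)
        if m:
            hits.append(
                (float(m.group("ts")), m.group("comm"), int(m.group("pfn"), 16))
            )
    return hits


# Two lines copied from the log in the description.
sample = [
    "<1> [435.321134] BUG: Bad page state in process kworker/u16:22  pfn:260fc0",
    "<4> [435.321204] page dumped because: page still charged to cgroup",
]
print(find_bad_page_reports(sample))
```

Only the first sample line matches; the second is a follow-up line of the same report and carries no signature of its own.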
Comment 2 Chris Wilson 2019-07-26 08:38:59 UTC
The test just before this one shows that there is HW memory corruption. We need to replace this pre-production ICL machine ASAP.
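One quick sanity check on the dump in the description is whether the reported page->mem_cgroup value looks like a kernel pointer at all. The sketch below assumes x86-64, where kernel virtual addresses lie in the upper canonical half (starting no lower than 0xff00000000000000, the 5-level paging case); the helper names are hypothetical. It is only a heuristic and does not by itself prove hardware corruption.

```python
# Lowest possible start of kernel address space on x86-64 (5-level paging).
# Assumption for this sketch: anything below this cannot be a kernel pointer.
KERNEL_SPACE_START = 0xFF00000000000000


def looks_like_kernel_pointer(value):
    """Heuristic: is the value inside the kernel half of the address space?"""
    return value >= KERNEL_SPACE_START


def set_bits(value):
    """Return the positions of all bits set in a 64-bit value."""
    return [i for i in range(64) if (value >> i) & 1]


# Value taken from the "page->mem_cgroup:" line in the log.
mem_cgroup = int("0000000000040000", 16)
print(looks_like_kernel_pointer(mem_cgroup))
print(set_bits(mem_cgroup))
```

For this value the check reports that it is not a kernel pointer and that only bit 18 is set, which is consistent with a corrupted field, although it does not prove it.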

