Bug 108926 - [CI] igt@runner@aborted - fail - Previous test: pm_rpm (module-reload), check: Corrupted low memory at 000000007cfb63b3 (1000 phys) = 1864a00118649001
Summary: [CI] igt@runner@aborted - fail - Previous test: pm_rpm (module-reload), check...
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: highest normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-03 13:38 UTC by Martin Peres
Modified: 2019-02-14 16:16 UTC

See Also:
i915 platform: I915G
i915 features: power/suspend-resume


Attachments
Dmesg during the execution of this test (2.06 MB, text/x-log)
2018-12-03 14:12 UTC, Martin Peres

Description Martin Peres 2018-12-03 13:38:28 UTC
We have been very reliably hitting the following error message on gdg:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_102/fi-gdg-551/igt@runner@aborted.html

<3>[  308.655471] check: Corrupted low memory at 000000007cfb63b3 (1000 phys) = 1864a00118649001
<3>[  308.655596] check: Corrupted low memory at 00000000492f0651 (1008 phys) = 1864c0011864b001
<3>[  308.655604] check: Corrupted low memory at 00000000fe0b4fbc (1010 phys) = 1864e0011864d001
<3>[  308.655612] check: Corrupted low memory at 00000000b66fdaa9 (1018 phys) = 186500011864f001
<3>[  308.655620] check: Corrupted low memory at 00000000f720d8c9 (1020 phys) = 1865200118651001
<3>[  308.655627] check: Corrupted low memory at 0000000018a04d7a (1028 phys) = 1865400118653001
<3>[  308.655634] check: Corrupted low memory at 00000000d36a3096 (1030 phys) = 1865600118655001
<3>[  308.655642] check: Corrupted low memory at 0000000034e45896 (1038 phys) = 1865800118657001
<3>[  308.655649] check: Corrupted low memory at 0000000019fbe82c (1040 phys) = 1865a00118659001
<3>[  308.655656] check: Corrupted low memory at 000000008509ff77 (1048 phys) = 1865c0011865b001
<3>[  308.655663] check: Corrupted low memory at 000000008b445056 (1050 phys) = 1865e0011865d001
<3>[  308.655671] check: Corrupted low memory at 00000000856cf92c (1058 phys) = 186600011865f001
<3>[  308.655678] check: Corrupted low memory at 0000000094da0e4d (1060 phys) = 1866200118661001
<3>[  308.655685] check: Corrupted low memory at 000000001922f546 (1068 phys) = 1866400118663001
<3>[  308.655693] check: Corrupted low memory at 00000000e71e4205 (1070 phys) = 1866600118665001
<3>[  308.655700] check: Corrupted low memory at 0000000014e7dd31 (1078 phys) = 1866800118667001
<3>[  308.655707] check: Corrupted low memory at 00000000a7bbeab6 (1080 phys) = 1866a00118669001
<3>[  308.655714] check: Corrupted low memory at 0000000060f6520b (1088 phys) = 1866c0011866b001
<3>[  308.655722] check: Corrupted low memory at 00000000f79fdab7 (1090 phys) = 1866e0011866d001
<3>[  308.655729] check: Corrupted low memory at 00000000952b1d17 (1098 phys) = 186700011866f001
<3>[  308.655736] check: Corrupted low memory at 00000000be69864c (10a0 phys) = 1867200118671001
<3>[  308.655743] check: Corrupted low memory at 000000002fef4fad (10a8 phys) = 1867400118673001
<3>[  308.655751] check: Corrupted low memory at 00000000f296e6e8 (10b0 phys) = 1867600118675001
<3>[  308.655758] check: Corrupted low memory at 00000000b560cfea (10b8 phys) = 1867800118677001
<3>[  308.655765] check: Corrupted low memory at 000000002b51c646 (10c0 phys) = 1867a00118679001
<3>[  308.655773] check: Corrupted low memory at 00000000c5732bff (10c8 phys) = 1867c0011867b001
<3>[  308.655780] check: Corrupted low memory at 0000000017d83472 (10d0 phys) = 1867e0011867d001
<3>[  308.655787] check: Corrupted low memory at 00000000dac8bb35 (10d8 phys) = 187000011867f001
<3>[  308.655794] check: Corrupted low memory at 0000000008419ee5 (10e0 phys) = 1870200118701001
<3>[  308.655802] check: Corrupted low memory at 000000001ca88fd9 (10e8 phys) = 1870400118703001
[...]
Comment 1 Martin Peres 2018-12-03 13:41:18 UTC
This was caught in 50% of the runs on CI_DRM_5186.
Comment 4 Tomi Sarvela 2018-12-03 13:52:49 UTC
Oops, the earlier link was wrong.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_102/fi-gdg-551/dmesg0.log
Comment 5 Chris Wilson 2018-12-03 13:53:19 UTC
So PTE entries? Starting from 0x18649001. No obvious match for that page range in boot.log though.

I honestly expect it's the BIOS doing strange things after we unload.
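
To illustrate the PTE reading above: each reported 64-bit value splits into two 32-bit words that look like page addresses with bit 0 set. The sketch below is only an illustration of that guess. The helper names are hypothetical, and the layout (bit 0 = valid, bits 31:12 = page address) is an assumption, not something confirmed in this report.

```python
import re

# Matches the "Corrupted low memory" lines quoted in the description.
LINE_RE = re.compile(
    r"Corrupted low memory at [0-9a-f]+ \(([0-9a-f]+) phys\) = ([0-9a-f]+)"
)

def split_qword(line):
    """Return (phys_offset, low_dword, high_dword) for one dmesg line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    phys = int(m.group(1), 16)
    value = int(m.group(2), 16)
    # x86 is little-endian: the low dword sits at phys, the high dword at phys + 4.
    return phys, value & 0xFFFFFFFF, value >> 32

def decode_entry(dword):
    """ASSUMED layout: bit 0 = valid, bits 31:12 = page address."""
    return {"valid": bool(dword & 1), "page": dword & 0xFFFFF000}

sample = ("<3>[  308.655471] check: Corrupted low memory at "
          "000000007cfb63b3 (1000 phys) = 1864a00118649001")
phys, low, high = split_qword(sample)
print(hex(phys), hex(low), hex(high))  # 0x1000 0x18649001 0x1864a001
print(decode_entry(low), decode_entry(high))
```

Read this way, the first line holds entries for pages 0x18649000 and 0x1864a000, which matches the "starting from 0x18649001" observation.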
Comment 6 Martin Peres 2018-12-03 14:12:58 UTC
Created attachment 142702
Dmesg during the execution of this test

Also hit with igt@i915_module_load@reload-with-fault-injection: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_114/fi-gdg-551/igt@runner@aborted.html

Aborting.
Previous test: i915_module_load (reload-with-fault-injection)
Next test: pm_rpm (module-reload)

Kernel tainted (0x240 -- 200)
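
For reference, the taint mask 0x240 can be decoded against the kernel's documented taint bits. This is a minimal sketch with a hypothetical helper name. Reading the trailing "200" as a hexadecimal mask of the taints the runner treats as fatal is an assumption, not something stated in this report.

```python
# Kernel taint bits 0..13 as listed in the kernel's tainted-kernels documentation.
TAINT_FLAGS = {
    0: ("P", "proprietary module loaded"),
    1: ("F", "module force-loaded"),
    2: ("S", "out-of-spec system"),
    3: ("R", "module force-unloaded"),
    4: ("M", "machine check exception"),
    5: ("B", "bad page referenced"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel died (OOPS or BUG)"),
    8: ("A", "ACPI table overridden"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver loaded"),
    11: ("I", "firmware bug workaround applied"),
    12: ("O", "out-of-tree module loaded"),
    13: ("E", "unsigned module loaded"),
}

def decode_taint(mask):
    """Return the taint letters whose bits are set in mask."""
    return [TAINT_FLAGS[bit][0] for bit in sorted(TAINT_FLAGS) if mask & (1 << bit)]

print(decode_taint(0x240))  # ['U', 'W']
print(decode_taint(0x200))  # ['W'] (ASSUMED meaning: the taint that triggered the abort)
```

Under that reading, the abort was triggered by a kernel warning (W), consistent with the error-level dmesg output in the description.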
Comment 7 Chris Wilson 2018-12-04 11:48:42 UTC
Seems like it's prevalent on drm-intel-fixes runs, but rare enough that it hasn't struck a BAT run?
Comment 8 Francesco Balestrieri 2019-01-08 13:27:30 UTC
Not seen in more than a month. Closing.
Comment 9 CI Bug Log 2019-02-14 16:16:00 UTC
The CI Bug Log issue associated with this bug has been archived.

New failures matching the above filters will no longer be associated with this bug.

