We have been very reliably hitting the following error message on gdg:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_102/fi-gdg-551/igt@runner@aborted.html

<3>[ 308.655471] check: Corrupted low memory at 000000007cfb63b3 (1000 phys) = 1864a00118649001
<3>[ 308.655596] check: Corrupted low memory at 00000000492f0651 (1008 phys) = 1864c0011864b001
<3>[ 308.655604] check: Corrupted low memory at 00000000fe0b4fbc (1010 phys) = 1864e0011864d001
<3>[ 308.655612] check: Corrupted low memory at 00000000b66fdaa9 (1018 phys) = 186500011864f001
<3>[ 308.655620] check: Corrupted low memory at 00000000f720d8c9 (1020 phys) = 1865200118651001
<3>[ 308.655627] check: Corrupted low memory at 0000000018a04d7a (1028 phys) = 1865400118653001
<3>[ 308.655634] check: Corrupted low memory at 00000000d36a3096 (1030 phys) = 1865600118655001
<3>[ 308.655642] check: Corrupted low memory at 0000000034e45896 (1038 phys) = 1865800118657001
<3>[ 308.655649] check: Corrupted low memory at 0000000019fbe82c (1040 phys) = 1865a00118659001
<3>[ 308.655656] check: Corrupted low memory at 000000008509ff77 (1048 phys) = 1865c0011865b001
<3>[ 308.655663] check: Corrupted low memory at 000000008b445056 (1050 phys) = 1865e0011865d001
<3>[ 308.655671] check: Corrupted low memory at 00000000856cf92c (1058 phys) = 186600011865f001
<3>[ 308.655678] check: Corrupted low memory at 0000000094da0e4d (1060 phys) = 1866200118661001
<3>[ 308.655685] check: Corrupted low memory at 000000001922f546 (1068 phys) = 1866400118663001
<3>[ 308.655693] check: Corrupted low memory at 00000000e71e4205 (1070 phys) = 1866600118665001
<3>[ 308.655700] check: Corrupted low memory at 0000000014e7dd31 (1078 phys) = 1866800118667001
<3>[ 308.655707] check: Corrupted low memory at 00000000a7bbeab6 (1080 phys) = 1866a00118669001
<3>[ 308.655714] check: Corrupted low memory at 0000000060f6520b (1088 phys) = 1866c0011866b001
<3>[ 308.655722] check: Corrupted low memory at 00000000f79fdab7 (1090 phys) = 1866e0011866d001
<3>[ 308.655729] check: Corrupted low memory at 00000000952b1d17 (1098 phys) = 186700011866f001
<3>[ 308.655736] check: Corrupted low memory at 00000000be69864c (10a0 phys) = 1867200118671001
<3>[ 308.655743] check: Corrupted low memory at 000000002fef4fad (10a8 phys) = 1867400118673001
<3>[ 308.655751] check: Corrupted low memory at 00000000f296e6e8 (10b0 phys) = 1867600118675001
<3>[ 308.655758] check: Corrupted low memory at 00000000b560cfea (10b8 phys) = 1867800118677001
<3>[ 308.655765] check: Corrupted low memory at 000000002b51c646 (10c0 phys) = 1867a00118679001
<3>[ 308.655773] check: Corrupted low memory at 00000000c5732bff (10c8 phys) = 1867c0011867b001
<3>[ 308.655780] check: Corrupted low memory at 0000000017d83472 (10d0 phys) = 1867e0011867d001
<3>[ 308.655787] check: Corrupted low memory at 00000000dac8bb35 (10d8 phys) = 187000011867f001
<3>[ 308.655794] check: Corrupted low memory at 0000000008419ee5 (10e0 phys) = 1870200118701001
<3>[ 308.655802] check: Corrupted low memory at 000000001ca88fd9 (10e8 phys) = 1870400118703001
[...]
This was caught in 50% of the runs on CI_DRM_5186.
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186/fi-gdg-551/dmesg0.log
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_102/fi-gdg-551/boot0.log
Oops, the earlier link was wrong. The correct one is:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_102/fi-gdg-551/dmesg0.log
So these look like PTE entries, starting from 0x18649001? No obvious match for that page range in boot.log, though. I honestly expect it's the BIOS doing strange things after we unload.
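To make the PTE reading easier to check, here is a minimal Python sketch that splits each reported 64-bit value into its two 32-bit words (x86 is little-endian, so the low word sits at the lower physical address) and separates the page-aligned part from the low 12 bits. Treating the low bits as PTE flags, with bit 0 as a valid bit, is an assumption made for illustration and is not confirmed anywhere in this report; the helper names `split_words` and `decode` are hypothetical.

```python
import re

# Matches the "(<phys> phys) = <64-bit value>" tail of a "Corrupted low memory" line.
LINE_RE = re.compile(r"\(([0-9a-f]+) phys\) = ([0-9a-f]{16})")


def split_words(log_line):
    """Return [(phys_addr, 32-bit word), ...] for one dmesg line, or [] if no match."""
    m = LINE_RE.search(log_line)
    if m is None:
        return []
    phys = int(m.group(1), 16)
    value = int(m.group(2), 16)
    low = value & 0xFFFFFFFF   # stored at phys (little-endian)
    high = value >> 32         # stored at phys + 4
    return [(phys, low), (phys + 4, high)]


def decode(word):
    """Split a 32-bit word into a 4 KiB-aligned address and its low 12 bits.

    Reading the low bits as PTE flags is an assumption, not something the log states.
    """
    return {"page": word & 0xFFFFF000, "flags": word & 0xFFF}


if __name__ == "__main__":
    sample = ("<3>[ 308.655471] check: Corrupted low memory at "
              "000000007cfb63b3 (1000 phys) = 1864a00118649001")
    for phys, word in split_words(sample):
        d = decode(word)
        print(f"{phys:#06x}: page={d['page']:#010x} flags={d['flags']:#05x}")
```

Run over the lines quoted above, every word has low bits 0x001 and the page addresses step by 0x1000, which fits a table of consecutive page entries, with one jump from 0x1867f000 to 0x18700000 at physical offset 0x10dc.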
Created attachment 142702
Dmesg during the execution of this test

Also hit with igt@i915_module_load@reload-with-fault-injection:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5186_114/fi-gdg-551/igt@runner@aborted.html

Aborting.
Previous test: i915_module_load (reload-with-fault-injection)
Next test: pm_rpm (module-reload)
Kernel tainted (0x240 -- 200)
Seems like it's prevalent on drm-intel-fixes runs, but rare enough that it hasn't struck a BAT run?
Not seen in more than a month. Closing.
The CI Bug Log issue associated with this bug has been archived. New failures matching the above filters will no longer be associated with this bug.