Summary: | [GM45/ILK/SNB/BYT/BDW]igt/gem_evict_everything/forked-swapping-multifd-mempressure-normal causes OOM killer
---|---
Product: | DRI
Reporter: | lu hua <huax.lu>
Component: | DRM/Intel
Assignee: | Daniel Vetter <daniel>
Status: | CLOSED FIXED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | major
Priority: | high
CC: | intel-gfx-bugs, wendy.wang, xunx.fang
Version: | unspecified
Hardware: | All
OS: | Linux (All)
Whiteboard: |
i915 platform: |
i915 features: |
Bug Depends on: | 72742
Bug Blocks: |

Roughly 750MB of free swap ;-) In other words, our shrinker failed to clear out a sufficient number of objects. Known issue, and it will take a long time to fix.

To be fair, our shrinker probably did exactly what it was asked to do...

Hm, I have to admit I don't really have much clue how the shrinker interacts with the page swapout code. But it looks like all_unreclaimable might have misfired a bit ...

Created attachment 85711: patch 1
Please test whether this patch helps to avoid the OOM.

Created attachment 85712: patch 2
If OOMs still happen, please also apply this patch on top of patch 1 (so both patches) for testing.

Ran with patch 1 and patch 2: it times out.

Created attachment 85750: dmesg with patch1 patch2

(In reply to comment #6)
> Ran with patch 1 and patch 2: it times out.

Hm, just tried to run the test and it says SUCCESS after a bit of time, but then seems to get stuck before exit. Do you see the same? Also please test with just patch 1 to see whether that also prevents the OOM.

Ok, there was a small bug in igt which resulted in testcases getting stuck. Fixed with

commit a031a1bf93b828585e7147f06145fc5030814547
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Sep 13 16:43:22 2013 +0200

    lib/drmtest: ducttape over fork race

The gem_evict_everything subtest here now completes for me in roughly 45 s on my snb, so hopefully you're now unblocked to test my patches.

Please test first patch 1 alone (repeat the test a few times to make sure there's really no OOM any more). Then test patch 1+2 (again please repeat).

Also: Does the OOM always happen on your testbox when running this test or only occasionally?

(In reply to comment #9)
> Ok, there was a small bug in igt which resulted in testcases getting stuck.
> Fixed with
>
> commit a031a1bf93b828585e7147f06145fc5030814547
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date: Fri Sep 13 16:43:22 2013 +0200
>
> lib/drmtest: ducttape over fork race
>
> The gem_evict_everything subtest here now completes for me in roughly 45 s
> on my snb, so hopefully you're now unblocked to test my patches.
>
> Please test first patch 1 alone (repeat the test a few times to make sure
> there's really no OOM any more). Then test patch 1+2 (again please repeat).
>
> Also: Does the OOM always happen on your testbox when running this test or
> only occasionally?

Tested 3 cycles on latest igt and latest -nightly kernel: it works well.
Tested patch 1 alone and patch 1+2: the OOM still happens.

Oops, patch 1 alone was actually broken. I'll attach a new one.

Created attachment 85896: patch 1, fixed
Please retest with just this patch applied, thanks.
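The repeat-and-check procedure requested above could be scripted roughly as follows. This is only a sketch: the marker string is taken from the dmesg excerpt in this report, the subtest invocation is the one quoted in the reproduce steps, and the helper names `count_oom_events` and `run_cycles` are made up for illustration. It assumes `dmesg` is readable by the current user and that the kernel log buffer does not wrap between two reads.

```python
import subprocess

# Marker taken from the dmesg excerpt in this report
# ("gem_evict_every invoked oom-killer: ...").
OOM_MARKER = "invoked oom-killer"


def count_oom_events(dmesg_text):
    """Count kernel log lines that report an OOM-killer invocation."""
    return sum(1 for line in dmesg_text.splitlines() if OOM_MARKER in line)


def read_dmesg():
    """Return the current kernel log as text (assumes `dmesg` is permitted)."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout


def run_cycles(cycles, subtest="forked-swapping-multifd-mempressure-normal"):
    """Run the subtest `cycles` times; return the cycle numbers that hit the OOM killer.

    Hypothetical helper: expects ./gem_evict_everything in the working directory.
    """
    oom_cycles = []
    for cycle in range(1, cycles + 1):
        before = count_oom_events(read_dmesg())
        subprocess.run(["./gem_evict_everything", "--run-subtest", subtest])
        after = count_oom_events(read_dmesg())
        if after > before:
            oom_cycles.append(cycle)
    return oom_cycles
```

Calling `run_cycles(5)` from the igt tests directory would then report which of five runs triggered the OOM killer, which is the kind of "1 in 3 runs" figure the testers report in this thread.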
If one needs to analyze what actually happens with system resources, this is a pretty good tool for collecting the information, post-processing and visualizing it:
https://maemo.gitorious.org/maemo-tools/sp-endurance

It's used by taking snapshots at suitable intervals, and processing those snapshots offline.

[1] A snapshot contains a huge amount of data from the system, so getting one takes quite a lot of time. The interval between snapshots should be at least tens of seconds, preferably minutes, otherwise collecting the data can affect the test too much.

(In reply to comment #12)
> Created attachment 85896
> patch 1, fixed
>
> Please retest with just this patch applied, thanks.

Tested this patch: it happens in 1 of 3 runs.

Created attachment 85966: shrinker tuning
Ok, slight variation of the previous patches, please test again a few times to see how this version fares.

Heh, I actually had a patch with similar intent to push the batch_size loop into the shrinkers.

(In reply to comment #16)
> Heh, I actually had a patch with similar intent to push the batch_size loop
> into the shrinkers.

I think the logic makes more sense outside of the actual shrinker so that we can take memory pressure (i.e. how many loops through the entire reclaim dance we've done so far) into account. We probably shouldn't use the minimal reclaim size if memory is still easy to get, but ramp it up aggressively if memory is getting really tight.

As soon as we have testing results on this patch (I've asked QA to also test this a bit on their OOM-prone byt platform) I'll send an RFC to mm.

btw for testing the patch: Please test on all affected platforms so that we really know it's robust.

(In reply to comment #15)
> Created attachment 85966
> shrinker tuning
>
> Ok, slight variation of the previous patches, please test again a few times
> to see how this version fares.

Tested 3 cycles with this patch: it works well.
It also happens on Baytrail.

Ok, I've rebased my kernel trees to be based on 3.12-rc2. There have been some shrinker changes in upstream, so we need to retest everything.

I'll work on an updated patch, meanwhile can you please check on all the affected platforms (please list them) that the bug is still there or whether anything changed?

Created attachment 86556: Updated patch for 3.12-rc2 kernels
Under the assumption that the bug is still there please test all affected platforms with this updated patch.

(In reply to comment #21)
> Created attachment 86556
> Updated patch for 3.12-rc2 kernels
>
> Under the assumption that the bug is still there please test all affected
> platforms with this updated patch.

Ran 5 cycles with this patch: it works well.

(In reply to comment #20)
> Ok, I've rebased my kernel trees to be based on 3.12-rc2. There have been
> some shrinker changes in upstream, so we need to retest everything.
>
> I'll work on an updated patch, meanwhile can you please check on all the
> affected platforms (please list them) that the bug is still there or whether
> anything changed?

And what's with plain -nightly based on -rc2?

(In reply to comment #22)
> (In reply to comment #21)
> > Created attachment 86556
> > Updated patch for 3.12-rc2 kernels
> >
> > Under the assumption that the bug is still there please test all affected
> > platforms with this updated patch.
>
> Ran 5 cycles with this patch: it works well.

Ran on kernel-3.12.0rc2 (commit 8153de8b327e89bad0e36f82b098e37a6e9ef5bb) with this patch.

1. Ran 5 cycles on sandybridge with this patch: it works well.
2. It still causes the OOM killer on ILK.

Created attachment 86782: ILK dmesg
(In reply to comment #24)
> Run on kernel-3.12.0rc2(commit 8153de8b327e89bad0e36f82b098e37a6e9ef5bb)
> with this patch.
>
> 1. Run 5 cycles on sandybridge with this patch, it works well.
>
> 2. It still causes OOM killer on ILK.

Is the OOM killer with the patch? Also I've asked you to retest _without_ the patch applied, on a -rc2 based -nightly. This is to check whether the patch is still effective or whether something else changed. There have been many core mm changes which are relevant.

Tested on ILK with the patch: it still happens.
Tested on ILK with the latest -nightly kernel (commit a411305bdabef2) 3.12.0-rc2 without any patch: it also happens.

(In reply to comment #27)
> Test on ILK with the patch, It still happens.
> Test on ILK with the latest -nightly kernel(commit a411305bdabef2)
> 3.12.0-rc2 without any patch, It also happens.

And what about snb/byt? Again I'm interested in how well it works both with the patch and without.

Xun, please follow up on behalf of Hua during his vacation.

Tested with the latest -nightly kernel (commit ae5be842311c9108c6dbbbe0e2abc1c306016f12) 3.12.0-rc4 on both snb and byt. Results:

snb without patch: works well
snb with patch: works well
byt without patch: still causes OOM killer
byt with patch: still causes OOM killer

Daniel, any update? Do you need more info?

(In reply to comment #31)
> Daniel, any updated? do you need more info?

It looks like we're back to square one since the patch doesn't seem to actually work.

Can you please update the summary with the affected platforms? SNB seems to work now according to comment #30

(In reply to comment #32)
> (In reply to comment #31)
> > Daniel, any updated? do you need more info?
>
> It looks like we're back to square one since the patch doesn't seem to
> actually work.
>
> Can you please update the summary with the affected platforms? SNB seems to
> work now according to comment #30

It still happens on SNB randomly.
Tested on latest -nightly kernel (commit db86e5): it happens in 1 of 3 runs.

It also happens on gm45.

Updated the status: it also happens on BDW. Raised the priority, since this hard hang has been a blocker for a long time.

Created attachment 90006: Include active objects in the shrinker count
Try this with your fingers crossed.

(In reply to comment #36)
> Created attachment 90006
> Include active objects in the shrinker count
>
> Try this with your fingers crossed.

Tested this patch: the issue still exists.

Created attachment 90124: Tune the shrinker
Try this in conjunction with the previous patch (https://bugs.freedesktop.org/attachment.cgi?id=90006). (Similar to Daniel's suggestion)

Many gem_concurrent_blit subcases also cause the OOM killer.

(In reply to comment #38)
> Created attachment 90124
> Tune the shrinker
>
> Try this in conjunction with the previous patch
> (https://bugs.freedesktop.org/attachment.cgi?id=90006). (Similar to Daniel's
> suggestion)

Tested these 2 patches: it still causes the OOM killer.

(In reply to comment #39)
> Many gem_concurrent_blit subcases also cause OOM killer.

This is likely a different issue, now tracked in bug #72255

Can you please double-check that Chris' patches don't help with the issue at hand here, namely gem_evict_everything subtests going nuts?

(In reply to comment #41)
> (In reply to comment #39)
> > Many gem_concurrent_blit subcases also cause OOM killer.
>
> This is likely a different issue, now tracked in bug #72255
>
> Can you please double-check that Chris' patches don't help with the issue at
> hand here, namely gem_evict_everything subtests going nuts?

Ran ./gem_evict_everything --run-subtest forked-swapping-multifd-mempressure-normal with these 2 patches.
It works well on BDW.
It still causes the OOM killer on BYT.

Can we move on?
Many tests have to be disabled in nightly due to this bug, impacting the execution rate of BYT/BDW.
(In reply to comment #43)
> Can we move on?
> Many tests have to be disabled in nightly due to this bug, impacting the
> execution rate of BYT/BDW.

Which? Do you mean other than the mempressure and swapping tests? Does failure in these impact upon other tests?

(In reply to comment #44)
> (In reply to comment #43)
> > Can we move on?
> > Many tests have to be disabled in nightly due to this bug, impacting the
> > execution rate of BYT/BDW.
>
> Which? Do you mean other than the mempressure and swapping tests? Does
> failure in these impact upon other tests?

gem_evict_everything fails with the OOM killer and the system becomes unresponsive. We disabled the gem_evict_everything subcases, so it impacts the execution rate.

Please be sure to test the patches posted in the related bug 72742 for this one too. Thanks.

(In reply to comment #46)
> Please be sure to test the patches posted in the related bug 72742 for this
> one too. Thanks.

Tested this patch: it still occurs.

Please do update the dmesg after testing the patches.

Retested the patches on BYT with latest igt. Ran subtest forked-swapping-multifd-interruptible for more than 30 minutes and it doesn't exit. The OOM killer doesn't happen.
output:
# ./gem_evict_everything
IGT-Version: 1.5-g072d358 (x86_64) (Linux: 3.14.0-rc4_prts_aa1fe3_20140304 x86_6
Subtest forked-normal: SUCCESS
Subtest forked-interruptible: SUCCESS
Subtest forked-swapping-normal: SUCCESS
Subtest forked-swapping-interruptible: SUCCESS
Subtest forked-multifd-normal: SUCCESS
Subtest forked-multifd-interruptible: SUCCESS
Subtest forked-swapping-multifd-normal: SUCCESS
Subtest forked-swapping-multifd-interruptible: SUCCESS
Subtest forked-mempressure-normal: SUCCESS
Subtest forked-mempressure-interruptible: SUCCESS
Subtest forked-swapping-mempressure-normal: SUCCESS

Tested on Ironlake with latest -nightly kernel and -igt. The OOM killer doesn't happen.
output:
IGT-Version: 1.5-g072d358 (x86_64) (Linux: 3.14.0-rc5_drm-intel-nightly_2bbdb4_20140304+ x86_64)
Subtest forked-normal: SUCCESS
Subtest forked-interruptible: SUCCESS
Subtest forked-swapping-normal: SUCCESS
Subtest forked-swapping-interruptible: SUCCESS
Subtest forked-multifd-normal: SUCCESS
Subtest forked-multifd-interruptible: SUCCESS
Subtest forked-swapping-multifd-normal: SUCCESS
Subtest forked-swapping-multifd-interruptible: SUCCESS
Subtest forked-mempressure-normal: SUCCESS
Subtest forked-mempressure-interruptible: SUCCESS
Subtest forked-swapping-mempressure-normal: SUCCESS
Subtest forked-swapping-mempressure-interruptible: SUCCESS
Subtest forked-multifd-mempressure-normal: SUCCESS
Subtest forked-multifd-mempressure-interruptible: SUCCESS
Subtest forked-swapping-multifd-mempressure-normal: SUCCESS
Subtest forked-swapping-multifd-mempressure-interruptible: SUCCESS
Subtest swapping-normal: SUCCESS
Subtest minor-normal: SUCCESS
Test requirement not met in function major_evictions, file eviction_common.c:109:
Last errno: 28, No space left on device
Test requirement: (!((uint64_t)nr_surfaces * surface_size / (1024 * 1024) < intel_get_total_ram_mb() * 9 / 10))
Subtest major-normal: SKIP
Subtest swapping-interruptible: SUCCESS
Subtest minor-interruptible: SUCCESS
Test requirement not met in function major_evictions, file eviction_common.c:109:
Last errno: 28, No space left on device
Test requirement: (!((uint64_t)nr_surfaces * surface_size / (1024 * 1024) < intel_get_total_ram_mb() * 9 / 10))
Subtest major-interruptible: SKIP

Created attachment 95070: dmesg(BYT)
Can you pls do an overnight run on the byt to see whether the oom killer is really gone now on restricted memory platforms like it?

It would be good to log the output of vmstat 10 or something to make sure the kernel keeps on thrashing the swap. If the columns si and so under the --swap-- heading are zero for a long time the test is stuck.

I know that we don't really care about tests which take positively forever, but it sounds like we're finally getting somewhere with Chris' patches ...

(In reply to comment #51)
> Can you pls do an overnight run on the byt to see whether the oom killer is
> really gone now on restricted memory platforms like it?
>
> It would be good to log the output of vmstat 10 or something to make sure
> the kernel keeps on thrashing the swap. If the columns si and so under the
> --swap-- heading are zero for a long time the test is stuck.
>
> I know that we don't really care about tests which take positively forever,
> but it sounds like we're finally getting somewhere with Chris' patches ...

Ran on Baytrail with latest -nightly kernel: some subcases still fail with the OOM killer. I will try Chris' patches.

Created attachment 95294: dmesg(byt)
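The vmstat check suggested above (si and so staying at zero for a long time means the test is stuck) can be automated over a saved `vmstat 10` log. The sketch below is illustrative only: the function name `swap_stalled` and the default threshold of six samples (about one minute at a 10 s interval) are assumptions, since the thread does not say how long "a long time" is. The columns are located from vmstat's own header line rather than hard-coded.

```python
def swap_stalled(vmstat_lines, max_idle_samples=6):
    """Return True if si and so are both zero for more than
    `max_idle_samples` consecutive samples of vmstat output.

    The threshold is an assumed value; tune it to the sampling interval.
    """
    si_col = so_col = None
    idle = 0
    for line in vmstat_lines:
        fields = line.split()
        # vmstat repeats its column header periodically; (re)locate si/so there.
        if "si" in fields and "so" in fields:
            si_col, so_col = fields.index("si"), fields.index("so")
            continue
        if si_col is None or len(fields) <= max(si_col, so_col):
            continue  # banner line or incomplete row
        try:
            si, so = int(fields[si_col]), int(fields[so_col])
        except ValueError:
            continue  # not a data row
        idle = idle + 1 if (si == 0 and so == 0) else 0
        if idle > max_idle_samples:
            return True
    return False
```

A log captured with something like `vmstat 10 > vmstat.log` during the overnight run could then be checked with `swap_stalled(open("vmstat.log"))`.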
Tested the patches on BYT for 5 cycles: the OOM killer doesn't occur.
output:
IGT-Version: 1.5-g072d358 (x86_64) (Linux: 3.14.0-rc4_prts_aa1fe3_20140304 x86_64)
Subtest forked-normal: SUCCESS
Subtest forked-interruptible: SUCCESS
Subtest forked-swapping-normal: SUCCESS
Subtest forked-swapping-interruptible: SUCCESS
Subtest forked-multifd-normal: SUCCESS
Subtest forked-multifd-interruptible: SUCCESS
Subtest forked-swapping-multifd-normal: SUCCESS
Subtest forked-swapping-multifd-interruptible: SUCCESS
Subtest forked-mempressure-normal: SUCCESS
Subtest forked-mempressure-interruptible: SUCCESS
Subtest forked-swapping-mempressure-normal: SUCCESS
Subtest forked-swapping-mempressure-interruptible: SUCCESS
Subtest forked-multifd-mempressure-normal: SUCCESS
Subtest forked-multifd-mempressure-interruptible: SUCCESS
Subtest forked-swapping-multifd-mempressure-normal: SUCCESS
Subtest forked-swapping-multifd-mempressure-interruptible: SUCCESS
Subtest swapping-normal: SUCCESS
Test assertion failure function copy, file gem_evict_everything.c:124:
Last errno: 2, No such file or directory
Failed assertion: ret == error
Subtest minor-normal: FAIL
Test requirement not met in function major_evictions, file eviction_common.c:109:
Last errno: 2, No such file or directory
Test requirement: (!((uint64_t)nr_surfaces * surface_size / (1024 * 1024) < intel_get_total_ram_mb() * 9 / 10))
Subtest major-normal: SKIP
Subtest swapping-interruptible: SUCCESS
Test assertion failure function copy, file gem_evict_everything.c:124:
Last errno: 2, No such file or directory
Failed assertion: ret == error
Subtest minor-interruptible: FAIL
Test requirement not met in function major_evictions, file eviction_common.c:109:
Last errno: 2, No such file or directory
Test requirement: (!((uint64_t)nr_surfaces * surface_size / (1024 * 1024) < intel_get_total_ram_mb() * 9 / 10))
Subtest major-interruptible: SKIP
Hm, that smells more like a bug in the testcase where we supply an invalid bo reference. At least we're making good progress on the OOM issue!

The current patch series under consideration is:
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=bug72742&id=294c593fd65b6de37006da9eceb6860f3b9d6f26
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=bug72742&id=224f66e5cce9575fb5433dd7ec287e3b84d2ecbd
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=bug72742&id=8d7e8626fb7132ac029b8174a6962e4aa82a950f
http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=bug72742&id=ab3095d304159ec5312bf79e68ea662ff4f1767e

(In reply to comment #55)
> The current patch series under considerations is:
> http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=bug72742&id=8d7e8626fb7132ac029b8174a6962e4aa82a950f

This patch fails.

On Mon, Mar 10, 2014 at 7:11 AM, <bugzilla-daemon@freedesktop.org> wrote:
> This patch fail.

Please clarify: Does the testcase fail, or do you see the OOM killer in action? This bug here is _only_ about the OOM killer firing when it shouldn't, we need to track the testcase failure itself in a new bug once the oom issue is resolved.

patching file drivers/gpu/drm/i915/i915_gem.c
Hunk #1 FAILED at 4920.
Hunk #2 succeeded at 4911 with fuzz 1 (offset -28 lines).
Hunk #3 succeeded at 5007 with fuzz 1 (offset -24 lines).
1 out of 3 hunks FAILED -- saving rejects to file drivers/gpu/drm/i915/i915_gem.c.rej

drivers/gpu/drm/i915/i915_gem.c:

    static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
    {
        if (!mutex_is_locked(mutex))
            return false;

    #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_MUTEXES)
        return mutex->owner == task;
    #else
        /* Since UP may be pre-empted, we cannot assume that we own the lock */
        return false;
    #endif
    }

    static unsigned long
    i915_gem_inactive_count(struct shrinker *shrinker, struct shrink_control *sc)
    {

patch:

    @@ -4920,6 +4920,22 @@ static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
     #endif
     }

    +static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
    +{
    +    if (!mutex_trylock(&dev->struct_mutex)) {
    +        if (!mutex_is_locked_by(&dev->struct_mutex, current))
    +            return false;
    +
    +        if (to_i915(dev)->mm.shrinker_no_lock_stealing)
    +            return false;
    +
    +        *unlock = false;
    +    } else
    +        *unlock = true;
    +
    +    return true;
    +}
    +
     static int num_vma_bound(struct drm_i915_gem_object *obj)
     {
         struct i915_vma *vma;

Chris, could you help Hua to resolve the patching issue? I really want this bug to move on.

The branch is at http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=bug72742

(In reply to comment #60)
> The branch is at http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=bug72742

Ran 3 cycles on this patch: the OOM killer goes away.

Chris, would you upstream the fix?

Ping again. Chris & Daniel, when will the fix land upstream?

The patches had r-b tags, just waiting upon Daniel.

I wanted a 2nd review but apparently that one's slow to come about. Poked relevant people+managers ...

Sounds like a dupe of the filp leak ... Please retest and reopen if it still happens.

Nevermind, got lost.
commit ceabbba524fb43989875f66a6c06d7ce0410fe5c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Mar 25 13:23:04 2014 +0000

    drm/i915: Include bound and active pages in the count of shrinkable objects

    When the machine is under a lot of memory pressure and being stressed by
    multiple GPU threads, we quite often report fewer than shrinker->batch
    (i.e. SHRINK_BATCH) pages to be freed. This causes the shrink_control to
    skip calling into i915.ko to release pages, despite the GPU holding onto
    most of the physical pages in its active lists.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=72742
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: Robert Beckett <robert.beckett@intel.com>
    Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Created attachment 99464: dmesg(BDW)
Tested on commit ceabbb: it still fails with the OOM killer.
output:
IGT-Version: 1.6-gd71add5 (x86_64) (Linux: 3.14.0_kcloud_ceabbb_20140521+ x86_64)
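The commit message for ceabbba5 quoted above says the driver often reported fewer than shrinker->batch pages, so the core reclaim code skipped calling into i915. A toy model of that effect is sketched below. It is not kernel code. It assumes a batch size of 128 for SHRINK_BATCH and assumes the scan target equals the reported count, whereas the real reclaim loop also scales the count by memory pressure. The page numbers in the usage note are invented.

```python
# Assumed batch size, used here purely for illustration.
SHRINK_BATCH = 128


def scan_calls(reported_pages, batch=SHRINK_BATCH):
    """Toy model: how many times a batch-driven reclaim loop would call the
    driver's scan callback for a given reported object count.

    Simplification: the scan target is taken to be the reported count itself.
    """
    calls = 0
    remaining = reported_pages
    while remaining >= batch:
        calls += 1
        remaining -= batch
    return calls


def reported_count(idle_pages, bound_and_active_pages, include_bound_and_active):
    """Hypothetical count callback, before and after the change described above."""
    if include_bound_and_active:
        return idle_pages + bound_and_active_pages
    return idle_pages
```

With, say, 100 idle pages and 5000 pages on bound or active lists, `scan_calls(reported_count(100, 5000, False))` is 0 in this model, so the driver is never asked to free anything, while counting the bound and active pages as well gives `scan_calls(reported_count(100, 5000, True))` a non-zero result.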
Try again with the right kernel.

Created attachment 99601: kernel config
I used this config to build the latest drm-intel-nightly commit.

(In reply to comment #70)
> Try again with the right kernel.

Attached the kernel config. Is it incorrect?

It was that your dmesg did not have the warning that was added to -nightly in relation to oom. Run and reattach the dmesg.

Ran it on latest -nightly for 5 cycles: it works well. I will double check; if it is fixed, I will close it.

Fixed on latest -nightly kernel.

Verified. Fixed.

Closing old verified+fixed.
Created attachment 85686: dmesg

System Environment:
--------------------------
Platform: Ironlake/Sandybridge
Kernel: (drm-intel-fixes)3cea210f2c7c50e67287207a6548314491f49f31

Bug detailed description:
-----------------------------
It causes the OOM killer on Ironlake/Sandybridge with -fixes, -nightly, -queued kernels. It's a new case.
The following cases also have this issue:
igt/gem_evict_everything/forked-swapping-mempressure-interruptible
igt/gem_evict_everything/forked-swapping-mempressure-normal
igt/gem_evict_everything/forked-swapping-multifd-mempressure-interruptible

Call Trace:
[ 107.576559] [<c0870f9d>] ? dump_stack+0x3e/0x4e
[ 107.577836] [<c086e300>] ? dump_header.isra.9+0x53/0x15e
[ 107.579103] [<c028ced4>] ? oom_kill_process+0x6b/0x2a3
[ 107.580367] [<c028f64a>] ? get_page_from_freelist+0x382/0x3b6
[ 107.581633] [<c02960a0>] ? try_to_free_pages+0x20b/0x25b
[ 107.582972] [<c028cd00>] ? find_lock_task_mm+0x12/0x40
[ 107.584171] [<c028d40d>] ? out_of_memory+0x1c3/0x1f0
[ 107.585335] [<c028fb67>] ? __alloc_pages_nodemask+0x4e9/0x5e9
[ 107.586392] [<c028c712>] ? filemap_fault+0x23f/0x336
[ 107.587525] [<c029dea6>] ? __do_fault+0x89/0x33e
[ 107.588668] [<c02ba9ae>] ? pipe_read+0x323/0x331
[ 107.589791] [<c02a0386>] ? handle_pte_fault+0x274/0x5e3
[ 107.590983] [<c02a07ac>] ? handle_mm_fault+0xb7/0xd5
[ 107.592346] [<c0878161>] ? __do_page_fault+0x400/0x43b
[ 107.593833] [<c02c18d4>] ? poll_select_set_timeout+0x44/0x64
[ 107.595065] [<c02c2614>] ? SyS_poll+0x3d/0x85
[ 107.596278] [<c087819c>] ? __do_page_fault+0x43b/0x43b
[ 107.597508] [<c0875e1e>] ? error_code+0x5a/0x60
[ 107.599158] [<c087819c>] ? __do_page_fault+0x43b/0x43b
[ 107.719514] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[ 107.722936] [ 2489] 0 2489 1142 279 8 0 0 systemd-journal
[ 107.725752] [ 2494] 0 2494 902 348 4 0 -1000 udevd
[ 107.728702] [ 3445] 0 3445 901 258 4 0 -1000 udevd
[ 107.731462] [ 3509] 0 3509 560 558 5 0 -1000 watchdog
[ 107.735411] [ 3512] 0 3512 1016 261 4 0 0 smartd
[ 107.738669] [ 3514] 0 3514 606 65 6 0 0 gpm
[ 107.741852] [ 3515] 0 3515 2749 176 3 0 -1000 auditd
[ 107.745762] [ 3517] 0 3517 7914 356 12 0 0 NetworkManager
[ 107.748613] [ 3523] 0 3523 1454 201 7 0 0 abrtd
[ 107.752326] [ 3525] 0 3525 2879 171 4 0 0 audispd
[ 107.755672] [ 3526] 0 3526 961 104 6 0 0 irqbalance
[ 107.758716] [ 3527] 0 3527 746 150 6 0 0 sedispatch
[ 107.761549] [ 3531] 0 3531 1439 162 7 0 0 abrt-watch-log
[ 107.764693] [ 3536] 0 3536 550 124 5 0 0 acpid
[ 107.768113] [ 3541] 0 3541 854 214 8 0 0 systemd-logind
[ 107.770899] [ 3542] 70 3542 829 191 9 0 0 avahi-daemon
[ 107.775436] [ 3547] 70 3547 829 41 9 0 0 avahi-daemon
[ 107.779583] [ 3549] 0 3549 643 176 5 0 0 mcelog
[ 107.783916] [ 3554] 0 3554 1389 301 6 0 0 crond
[ 107.788580] [ 3558] 81 3558 826 261 7 0 -900 dbus-daemon
[ 107.793588] [ 3562] 0 3562 7805 261 7 0 0 rsyslogd
[ 107.798416] [ 3567] 0 3567 1333 112 7 0 0 ksmtuned
[ 107.802911] [ 3588] 0 3588 6188 241 12 0 -900 polkitd
[ 107.807768] [ 3592] 0 3592 1392 287 7 0 -900 modem-manager
[ 107.811827] [ 3621] 0 3621 3716 883 6 0 0 dhclient
[ 107.816322] [ 3639] 0 3639 2465 314 5 0 -1000 sshd
[ 107.820565] [ 3646] 0 3646 683 169 3 0 0 rpcbind
[ 107.823938] [ 3661] 29 3661 751 260 3 0 0 rpc.statd
[ 107.826943] [ 3676] 0 3676 3314 364 6 0 0 sendmail
[ 107.829945] [ 3699] 51 3699 3184 304 6 0 0 sendmail
[ 107.832821] [ 3731] 0 3731 681 152 3 0 0 atd
[ 107.835430] [ 3732] 0 3732 1073 166 5 0 0 agetty
[ 107.837801] [ 3799] 0 3799 1057 100 5 0 0 sleep
[ 107.840774] [ 3803] 0 3803 3348 424 6 0 0 sshd
[ 107.843105] [ 3811] 0 3811 901 215 4 0 -1000 udevd
[ 107.845449] [ 3816] 0 3816 1630 544 6 0 0 bash
[ 107.847652] [ 3980] 0 3980 2177 185 13 0 0 gem_evict_every
[ 107.849804] [ 3981] 0 3981 2177 86 13 0 0 gem_evict_every
[ 107.851971] [ 3982] 0 3982 2177 86 13 0 0 gem_evict_every
[ 107.853962] [ 3983] 0 3983 2177 142 13 0 0 gem_evict_every
[ 107.856008] [ 3984] 0 3984 2177 88 13 0 0 gem_evict_every
[ 107.857975] [ 3985] 0 3985 2177 86 13 0 0 gem_evict_every
[ 107.859952] [ 3986] 0 3986 2177 144 13 0 0 gem_evict_every
[ 107.861901] [ 3987] 0 3987 2177 86 13 0 0 gem_evict_every
[ 107.863679] [ 3988] 0 3988 2177 86 13 0 0 gem_evict_every
[ 107.865651] [ 3989] 0 3989 2177 86 13 0 0 gem_evict_every
[ 107.867548] [ 3990] 0 3990 2177 86 13 0 0 gem_evict_every
[ 107.869382] [ 3991] 0 3991 2177 144 13 0 0 gem_evict_every
[ 107.871157] [ 3992] 0 3992 2177 86 13 0 0 gem_evict_every
[ 107.872898] [ 3993] 0 3993 2177 88 13 0 0 gem_evict_every
[ 107.874425] [ 3994] 0 3994 2177 86 13 0 0 gem_evict_every
[ 107.876093] [ 3995] 0 3995 2177 86 13 0 0 gem_evict_every
[ 107.877817] [ 3996] 0 3996 2177 86 13 0 0 gem_evict_every
[ 107.879956] [ 4009] 0 4009 560 78 5 1 -1000 watchdog
[ 107.881805] Out of memory: Kill process 3699 (sendmail) score 0 or sacrifice child
[ 107.883481] Killed process 3699 (sendmail) total-vm:12736kB, anon-rss:1024kB, file-rss:192kB
[ 109.808431] gem_evict_every invoked oom-killer: gfp_mask=0xa00d2, order=0, oom_score_adj=0
[ 109.809912] gem_evict_every cpuset=/ mems_allowed=0

Reproduce steps:
----------------------------
1. ./gem_evict_everything --run-subtest forked-swapping-multifd-mempressure-normal
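The per-task table in the dmesg excerpt above can be parsed to see how much memory user processes held when the OOM killer fired. The sketch below is illustrative: the helper names are made up, and it assumes the rss column is in 4 KiB pages. That assumption matches the log, where pid 3699 shows rss 304 and the kill message reports anon-rss 1024 kB plus file-rss 192 kB, i.e. 1216 kB. The regular expression expects exactly the column layout shown in this report.

```python
import re

PAGE_KB = 4  # assumed page size; consistent with the kill message for pid 3699

# Matches rows such as "[ 3699] 51 3699 3184 304 6 0 0 sendmail".
# Timestamps like "[ 107.829945]" contain a dot and therefore do not match.
TASK_ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def parse_task_dump(dmesg_text):
    """Extract (pid, name, rss in kB) records from an OOM-killer task dump."""
    tasks = []
    for match in TASK_ROW.finditer(dmesg_text):
        tasks.append({
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "rss_kb": int(match.group("rss")) * PAGE_KB,
        })
    return tasks


def total_rss_kb(tasks):
    """Sum the resident set sizes of all listed tasks, in kB."""
    return sum(task["rss_kb"] for task in tasks)
```

Comparing `total_rss_kb(parse_task_dump(log))` against the machine's RAM gives a rough idea of how much memory sits outside ordinary process RSS, which is the kind of evidence behind the shrinker discussion in this thread.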