From 48a20724e872e7dd7487fcf975957ac3dd52bd1a Mon Sep 17 00:00:00 2001
From: Daniel Vetter
Date: Thu, 12 Sep 2013 13:21:42 +0200
Subject: [PATCH] mm/shrinker: Add a shrinker flag to always shrink a bit

The drm/i915 gpu driver loves to hang onto as much memory as it can -
we cache pinned pages, dma mappings and obviously also gpu address
space bindings of buffer objects. On top of that userspace has its own
opportunistic cache which is managed by an madvise-like ioctl to tell
the kernel which objects are purgeable and which are actually used.
This is to cache userspace mmappings and a bit of other metadata about
buffer objects needed to be able to hit fastpaths even on fresh
objects.

We have routine encounters with the OOM killer due to all this craving
for memory. The latest one seems to be an artifact of the mm core
trying really hard to balance page lru evictions with shrinking caches:
The shrinker in drm/i915 doesn't actually free memory, but only drops
all the dma mappings and page refcounts so that the backing storage
(which is just shmemfs nodes) can actually be evicted. This means that
if the core mm hasn't found anything to evict from the page lru (most
likely because drm/i915 has pinned down everything available) it will
also not shrink any of the caches, which leads to a premature OOM even
though tons of pages used by gpu buffer objects could still be swapped
out.

For a quick hack I've added a shrink-me-harder flag to make sure
there's at least a bit of forward progress. It seems to work. I've
called the flag evicts_to_page_lru, but that might just be uninformed
me talking ...

We should also probably have something with a bit more smarts to be
more aggressive when in a tight spot and avoid the minimal shrinking
when it's not really required, so maybe take scan_control->priority
into account somehow. But since I utterly lack clue I've figured
sending out a quick RFC first is better.

v2:
- Rebase on top of the new shrinker code in 3.12.
- I've tried to make it a bit more adaptive to the memory pressure but
  got lost in mm code. Instead just limit the scan count to what's
  available to avoid hitting the i915 shrinker too hard.

Cc: Glauber Costa
Cc: Andrew Morton
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Michal Hocko
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69247
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_gem.c |  1 +
 include/linux/shrinker.h        | 14 ++++++++++++++
 mm/vmscan.c                     |  7 +++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cdfb9da..f9dde11 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4565,6 +4565,7 @@ i915_gem_load(struct drm_device *dev)
 	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
 	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
 	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
+	dev_priv->mm.inactive_shrinker.evicts_to_page_lru = true;
 	register_shrinker(&dev_priv->mm.inactive_shrinker);
 }
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 68c0970..4508090 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -55,6 +55,20 @@ struct shrinker {
 	long batch;	/* reclaim batch size, 0 = default */
 	unsigned long flags;

+	/*
+	 * Some shrinkers (especially gpu drivers using gem as backing storage)
+	 * hold onto gobloads of pinned pagecache memory (from shmem nodes).
+	 * When those caches get shrunk the memory only gets unpinned and
+	 * so is available to be evicted by the page launderer.
+	 *
+	 * The problem is that the core mm tries to balance eviction from
+	 * the page lru with shrinking caches. So if there's nothing on the
+	 * page lru to evict we'll never shrink the gpu driver caches and
+	 * so will OOM despite tons of memory used by gpu buffer objects
+	 * that could be swapped out. Setting this flag ensures forward
+	 * progress.
+	 */
+	bool evicts_to_page_lru;
+
 	/* These are for internal use */
 	struct list_head list;
 	/* objs pending delete, per node */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8ed1b77..12bb6a5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -287,6 +287,13 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 	if (total_scan > max_pass * 2)
 		total_scan = max_pass * 2;

+	/*
+	 * For shrinkers that evict to the page lru make sure we have some
+	 * forward progress, but don't try to shrink more than what's there.
+	 */
+	if (shrinker->evicts_to_page_lru)
+		total_scan = min(max(total_scan, batch_size), max_pass);
+
 	trace_mm_shrink_slab_start(shrinker, shrinkctl, nr, nr_pages_scanned,
 				lru_pages, max_pass, delta, total_scan);
-- 
1.8.4.rc3
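
PS: For anyone wanting to try this from another driver, here's a
minimal sketch (not part of the patch) of a shrinker opting into the
new flag. Everything named foo_* is made up for illustration,
including the foo_unpin_pages() helper; only the 3.12
count_objects/scan_objects shrinker API and the evicts_to_page_lru
flag added above are real.

#include <linux/shrinker.h>
#include <linux/atomic.h>

/* Made-up bookkeeping: how many pages the driver could unpin. */
static atomic_long_t foo_unpinnable_pages;

/*
 * Made-up helper: drops dma mappings and page refcounts for up to nr
 * pages so the shmem backing storage lands back on the page lru.
 * Note that it doesn't free any memory directly, hence the flag.
 */
static unsigned long foo_unpin_pages(unsigned long nr);

static unsigned long
foo_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
{
	/* Report how many pages we could unpin right now. */
	return atomic_long_read(&foo_unpinnable_pages);
}

static unsigned long
foo_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
{
	/* Only unpins, the actual eviction happens off the page lru. */
	return foo_unpin_pages(sc->nr_to_scan);
}

static struct shrinker foo_shrinker = {
	.count_objects = foo_shrinker_count,
	.scan_objects = foo_shrinker_scan,
	.seeks = DEFAULT_SEEKS,
	/* We evict to the page lru instead of freeing memory, so ask
	 * for the guaranteed minimal scan from the patch above. */
	.evicts_to_page_lru = true,
};

static int __init foo_init(void)
{
	return register_shrinker(&foo_shrinker);
}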