18 we call it "memory cgroup". When you see git-log and source code, you'll
30 Memory-hungry applications can be isolated and limited to a smaller
36 d. A CD/DVD burner could control the amount of memory used by the
42 Current Status: linux-2.6.34-mmotm (development version of 2010/April)
46 - accounting anonymous pages, file caches, swap caches usage and limiting them.
47 - pages are linked to per-memcg LRU exclusively, and there is no global LRU.
48 - optionally, memory+swap usage can be accounted and limited.
49 - hierarchical accounting
50 - soft limit
51 - moving (recharging) a task's charges when the task migrates is selectable.
52 - usage threshold notifier
53 - memory pressure notifier
54 - oom-killer disable knob and oom-notifier
55 - Root cgroup has no limit controls.
59 <cgroup-v1-memory-kernel-extension>`)
160 -----------
168 ---------------
170 .. code-block::
173 +--------------------+
176 +--------------------+
179 +---------------+ | +---------------+
182 +---------------+ | +---------------+
184 +---------------+
186 +---------------+ +------+--------+
187 | page +----------> page_cgroup|
189 +---------------+ +---------------+
204 If everything goes well, a page metadata structure called page_cgroup is
206 (*) page_cgroup structure is allocated at boot/memory-hotplug time.
209 ------------------------
224 A swapped-in page is accounted after being added to the swapcache.
226 Note: The kernel does swapin-readahead and reads multiple swap pages at once.
232 Note: we only account pages on the LRU, because our purpose is to control the
233 amount of used pages; pages not on the LRU tend to be out of the VM's control.
236 --------------------------
242 the cgroup that brought it in -- this will happen on memory pressure).
245 --------------------------------------
252 - memory.memsw.usage_in_bytes.
253 - memory.memsw.limit_in_bytes.
267 The global LRU (kswapd) can swap out arbitrary pages. Swap-out means
269 memory+swap. In other words, when we want to limit the usage of swap without
276 When a cgroup hits memory.memsw.limit_in_bytes, doing swap-out in this
277 cgroup is useless. Swap-out will then not be done by the cgroup routine, and file
283 -----------
290 cgroup. (See :ref:`10. OOM Control <cgroup-v1-memory-oom-control>` below.)
293 pages that are selected for reclaiming come from the per-cgroup LRU
304 (See :ref:`oom_control <cgroup-v1-memory-oom-control>` section)
307 -----------
312 mm->page_table_lock or split pte_lock
313 folio_memcg_lock (memcg->move_lock)
314 mapping->i_pages lock
315 lruvec->lru_lock.
317 The per-node, per-memcg LRU (the cgroup's private LRU) is guarded by
318 lruvec->lru_lock; the folio LRU flag is cleared before
319 isolating a page from its LRU under lruvec->lru_lock.
321 .. _cgroup-v1-memory-kernel-extension:
324 -----------------------------------------------
332 it can be disabled system-wide by passing cgroup.memory=nokmem to the kernel
347 -----------------------------------------------
359 belong to the same memcg. This only fails to hold when a task is migrated to a
371 ----------------------
384 deployments where the total amount of memory per-cgroup is overcommitted.
386 box can still run out of non-reclaimable memory.
409 <cgroups-why-needed>` for the background information)::
411 # mount -t tmpfs none /sys/fs/cgroup
413 # mount -t cgroup none /sys/fs/cgroup/memory -o memory
435 We can write "-1" to ``*.limit_in_bytes`` to reset the limit (unlimited).
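As a sketch of the value encoding these limit files accept: a plain byte count, an optional size suffix, or "-1" for unlimited (the k/K, m/M, g/G suffix set is an assumption for illustration):

```python
# Sketch of the value encoding accepted by the limit files: a byte count,
# an optional k/K, m/M or g/G suffix, or "-1" for unlimited. The suffix
# set is an assumption for illustration.
_SUFFIX = {"k": 2**10, "m": 2**20, "g": 2**30}

def limit_to_bytes(value):
    value = value.strip()
    if value == "-1":
        return None  # unlimited
    if value[-1].lower() in _SUFFIX:
        return int(value[:-1]) * _SUFFIX[value[-1].lower()]
    return int(value)

print(limit_to_bytes("4M"))  # 4194304
print(limit_to_bytes("-1"))  # None
```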
449 availability of memory on the system. The user is required to re-read
467 Performance testing is also important. To see the memory controller's pure overhead,
471 Page-fault scalability is also important. When measuring parallel
472 page faults, a multi-process test may be better than a multi-thread
473 test because the latter has noise from shared objects/state.
476 Running the usual tests under the memory controller is always helpful.
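As a sketch of the multi-process approach suggested above (worker and page counts are arbitrary), each worker faults in its own private anonymous pages so the processes share no objects or interpreter state:

```python
import mmap
import multiprocessing

PAGE = mmap.PAGESIZE

def touch_pages(n_pages):
    """Fault in n_pages of private anonymous memory, one write per page."""
    buf = mmap.mmap(-1, n_pages * PAGE)
    for i in range(n_pages):
        buf[i * PAGE] = 1  # each first write is a minor page fault
    buf.close()
    return n_pages

if __name__ == "__main__":
    # Multi-process rather than multi-thread, as noted above, so the
    # workers do not share objects or state.
    with multiprocessing.Pool(processes=4) as pool:
        total = sum(pool.map(touch_pages, [256] * 4))
    print(total)  # 1024
```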
478 .. _cgroup-v1-memory-test-troubleshoot:
481 -------------------
493 <cgroup-v1-memory-oom-control>` (below) and seeing what happens will be
496 .. _cgroup-v1-memory-test-task-migration:
499 ------------------
507 See :ref:`8. "Move charges at task migration" <cgroup-v1-memory-move-charges>`
510 ---------------------
513 <cgroup-v1-memory-test-troubleshoot>` and :ref:`4.2
514 <cgroup-v1-memory-test-task-migration>`, a cgroup might have some charge
529 ---------------
539 charged file caches. Some out-of-use page caches may stay charged until
543 -------------
547 * per-memory cgroup local status
570 inactive_file # of bytes of file-backed memory and MADV_FREE anonymous
572 active_file # of bytes of file-backed memory on active LRU list.
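memory.stat is a flat list of "name value" lines, so it is straightforward to parse; a small sketch (the sample numbers are invented for illustration):

```python
# memory.stat is a flat "name value" list; a minimal parser. The sample
# numbers are invented for illustration.
SAMPLE = """\
cache 11492564992
rss 255968
inactive_file 3345408
active_file 8810496
"""

def parse_memory_stat(text):
    """Parse memory.stat-style output into a dict of integer byte counts."""
    return {name: int(value)
            for name, value in (line.split() for line in text.splitlines())}

stats = parse_memory_stat(SAMPLE)
print(stats["active_file"])  # 8810496
```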
617 --------------
628 -----------
640 ------------------
650 -------------
652 This is similar to numa_maps but operates on a per-memcg basis. This is
659 per-node page counts including "hierarchical_<counter>" which sums up all
694 ---------------------------------------
720 Please note that soft limits are a best-effort feature; they come with
727 -------------
746 .. _cgroup-v1-memory-move-charges:
754 to it will always return -EINVAL.
765 - create an eventfd using eventfd(2);
766 - open memory.usage_in_bytes or memory.memsw.usage_in_bytes;
767 - write string like "<event_fd> <fd of memory.usage_in_bytes> <threshold>" to
773 It is applicable to both root and non-root cgroups.
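The registration steps above can be sketched in Python. The cgroup path is a hypothetical example, and the filesystem part is shown as comments since it requires a mounted v1 memory hierarchy:

```python
import os

def event_control_line(event_fd, usage_fd, threshold_bytes):
    """Compose the string written to cgroup.event_control (step 3 above)."""
    return "%d %d %d" % (event_fd, usage_fd, threshold_bytes)

# Registering the notifier needs a mounted v1 memory hierarchy, so the
# filesystem steps are shown as comments (the /sys/fs/cgroup/memory/foo
# path is a hypothetical example):
#
#   efd = os.eventfd(0)  # Python >= 3.10; otherwise wrap eventfd(2) yourself
#   ufd = os.open("/sys/fs/cgroup/memory/foo/memory.usage_in_bytes", os.O_RDONLY)
#   cfd = os.open("/sys/fs/cgroup/memory/foo/cgroup.event_control", os.O_WRONLY)
#   os.write(cfd, event_control_line(efd, ufd, 64 * 2**20).encode())
#   os.read(efd, 8)  # blocks until the 64 MiB threshold is crossed

print(event_control_line(3, 4, 64 * 2**20))  # 3 4 67108864
```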
775 .. _cgroup-v1-memory-oom-control:
790 - create an eventfd using eventfd(2)
791 - open memory.oom_control file
792 - write string like "<event_fd> <fd of memory.oom_control>" to
798 You can disable the OOM-killer by writing "1" to the memory.oom_control file, as:
802 If the OOM-killer is disabled, tasks under the cgroup will hang/sleep
803 on the memory cgroup's OOM waitqueue when they request accountable memory.
819 - oom_kill_disable 0 or 1
820 (if 1, oom-killer is disabled)
821 - under_oom 0 or 1
823 - oom_kill integer counter
847 resources that can be easily reconstructed or re-read from a disk.
850 about to run out of memory (OOM), or the in-kernel OOM killer is even on its
856 events are not pass-through. For example, suppose you have three cgroups: A->B->C. Now
866 - "default": this is the default behavior specified above. This mode is the
870 - "hierarchy": events always propagate up to the root, similar to the default
875 - "local": events are pass-through, i.e. they only receive notifications when
884 specified by a comma-delimited string, e.g. "low,hierarchy" specifies
885 hierarchical, pass-through notification for all ancestor memcgs. Notification
886 with the default, non-pass-through behavior does not specify a mode.
887 "medium,local" specifies pass-through notification for the medium level.
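A small sketch of composing the <level[,mode]> argument from the levels and modes described above (the "critical" level is an assumption alongside low and medium; the validation itself is illustrative):

```python
# Illustrative validation of the <level[,mode]> argument. The "critical"
# level is an assumption alongside the low and medium levels mentioned
# in the text; the modes come from the list above.
LEVELS = {"low", "medium", "critical"}
MODES = {"default", "hierarchy", "local"}

def pressure_event_spec(level, mode=None):
    if level not in LEVELS:
        raise ValueError("unknown pressure level: %r" % level)
    if mode is None:
        return level  # default, non-pass-through notification
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % mode)
    return "%s,%s" % (level, mode)

print(pressure_event_spec("medium", "local"))  # medium,local
print(pressure_event_spec("low", "hierarchy"))  # low,hierarchy
```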
892 - create an eventfd using eventfd(2);
893 - open memory.pressure_level;
894 - write string as "<event_fd> <fd of memory.pressure_level> <level[,mode]>"
901 Test:
907 # cd /sys/fs/cgroup/memory/
909 # cd foo
916 (Expect a bunch of notifications, and eventually, the oom-killer will
922 1. Make the per-cgroup scanner reclaim non-shared pages first
923 2. Teach the controller to account for shared pages