Commit d02be50

Minchan Kim authored and Linus Torvalds committed
zsmalloc: zsmalloc documentation
Create zsmalloc doc which explains design concept and stat information.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Juneho Choi <juno.choi@lge.com>
Cc: Gunho Lee <gunho.lee@lge.com>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Seth Jennings <sjennings@variantweb.net>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 248ca1b commit d02be50

File tree: 3 files changed, +71 -29 lines


Documentation/vm/zsmalloc.txt

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
zsmalloc
--------

This allocator is designed for use with zram. Thus, the allocator is
supposed to work well under low memory conditions. In particular, it
never attempts higher-order page allocation, which is very likely to
fail under memory pressure. On the other hand, if we just used single
(0-order) pages, it would suffer from very high fragmentation --
any object of size PAGE_SIZE/2 or larger would occupy an entire page.
This was one of the major issues with its predecessor (xvmalloc).

To overcome these issues, zsmalloc allocates a bunch of 0-order pages
and links them together using various 'struct page' fields. These linked
pages act as a single higher-order page, i.e. an object can span 0-order
page boundaries. The code refers to these linked pages as a single entity
called a zspage.

For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE,
since this satisfies the requirements of all its current users (in the
worst case, a page is incompressible and is thus stored "as-is", i.e. in
uncompressed form). For allocation requests larger than this size, failure
is returned (see zs_malloc).

Additionally, zs_malloc() does not return a dereferenceable pointer.
Instead, it returns an opaque handle (unsigned long) which encodes the
actual location of the allocated object. The reason for this indirection is
that zsmalloc does not keep zspages permanently mapped, since that would
cause issues on 32-bit systems, where the VA region for kernel-space
mappings is very small. So, before using the allocated memory, the object
has to be mapped using zs_map_object() to get a usable pointer and
subsequently unmapped using zs_unmap_object().
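
Taken together, a caller's allocation lifecycle looks roughly like the
kernel-side sketch below. This is an illustration, not a reference: the
signatures shown match roughly this era of the API but have shifted across
kernel versions (for example, the gfp argument has moved between
zs_create_pool() and zs_malloc()), and error handling is trimmed:

```c
#include <linux/zsmalloc.h>

/* Illustrative sketch only; signatures vary across kernel versions. */
static int zs_example(void)
{
	struct zs_pool *pool;
	unsigned long handle;
	void *vaddr;

	pool = zs_create_pool("example", GFP_KERNEL);
	if (!pool)
		return -ENOMEM;

	/* Returns an opaque handle, not a pointer; 0 means failure. */
	handle = zs_malloc(pool, 176);
	if (!handle)
		goto out;

	/* Map to get a usable pointer, write through it, unmap promptly. */
	vaddr = zs_map_object(pool, handle, ZS_MM_WO);
	memset(vaddr, 0, 176);
	zs_unmap_object(pool, handle);

	zs_free(pool, handle);
out:
	zs_destroy_pool(pool);
	return 0;
}
```

Note that the pointer returned by zs_map_object() is only valid until the
matching zs_unmap_object(); the handle, by contrast, stays valid until
zs_free().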
stat
----

With CONFIG_ZSMALLOC_STAT, we can see zsmalloc internal information via
/sys/kernel/debug/zsmalloc/<user name>. Here is a sample of the stat output:

# cat /sys/kernel/debug/zsmalloc/zram0/classes

 class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
    ..
    ..
     9   176           0            1           186        129          8                4
    10   192           1            0          2880       2872        135                3
    11   208           0            1           819        795         42                2
    12   224           0            1           219        159         12                4
    ..
    ..

class: index
size: object size the zspage stores
almost_empty: the number of ZS_ALMOST_EMPTY zspages (see below)
almost_full: the number of ZS_ALMOST_FULL zspages (see below)
obj_allocated: the number of objects allocated
obj_used: the number of objects allocated to the user
pages_used: the number of pages allocated for the class
pages_per_zspage: the number of 0-order pages needed to make one zspage

We assign a zspage to the ZS_ALMOST_EMPTY fullness group when:
     n <= N / f, where
n = number of allocated objects
N = total number of objects the zspage can store
f = fullness_threshold_frac (i.e., 4 at the moment)

Similarly, we assign a zspage to:
     ZS_ALMOST_FULL  when n > N / f
     ZS_EMPTY        when n == 0
     ZS_FULL         when n == N

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -10972,6 +10972,7 @@ L: linux-mm@kvack.org
 S: Maintained
 F: mm/zsmalloc.c
 F: include/linux/zsmalloc.h
+F: Documentation/vm/zsmalloc.txt

 ZSWAP COMPRESSED SWAP CACHING
 M: Seth Jennings <sjennings@variantweb.net>

mm/zsmalloc.c

Lines changed: 0 additions & 29 deletions
@@ -12,35 +12,6 @@
  */

 /*
- * This allocator is designed for use with zram. Thus, the allocator is
- * supposed to work well under low memory conditions. In particular, it
- * never attempts higher order page allocation which is very likely to
- * fail under memory pressure. On the other hand, if we just use single
- * (0-order) pages, it would suffer from very high fragmentation --
- * any object of size PAGE_SIZE/2 or larger would occupy an entire page.
- * This was one of the major issues with its predecessor (xvmalloc).
- *
- * To overcome these issues, zsmalloc allocates a bunch of 0-order pages
- * and links them together using various 'struct page' fields. These linked
- * pages act as a single higher-order page i.e. an object can span 0-order
- * page boundaries. The code refers to these linked pages as a single entity
- * called zspage.
- *
- * For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
- * since this satisfies the requirements of all its current users (in the
- * worst case, page is incompressible and is thus stored "as-is" i.e. in
- * uncompressed form). For allocation requests larger than this size, failure
- * is returned (see zs_malloc).
- *
- * Additionally, zs_malloc() does not return a dereferenceable pointer.
- * Instead, it returns an opaque handle (unsigned long) which encodes actual
- * location of the allocated object. The reason for this indirection is that
- * zsmalloc does not keep zspages permanently mapped since that would cause
- * issues on 32-bit systems where the VA region for kernel space mappings
- * is very small. So, before using the allocating memory, the object has to
- * be mapped using zs_map_object() to get a usable pointer and subsequently
- * unmapped using zs_unmap_object().
- *
  * Following is how we use various fields and flags of underlying
  * struct page(s) to form a zspage.
  *
