
Commit 673ab87

Merge branch 'akpm' (more patches from Andrew)
Merge patches from Andrew Morton:
 "Most of the rest of MM, plus a few dribs and drabs. I still have quite
  a few irritating patches left around: ones with dubious testing
  results, lack of review, ones which should have gone via maintainer
  trees but the maintainers are slack, etc. I need to be more activist
  in getting these things wrapped up outside the merge window, but
  they're such a PITA."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (48 commits)
  mm/vmscan.c: avoid possible deadlock caused by too_many_isolated()
  vmscan: comment too_many_isolated()
  mm/kmemleak.c: remove obsolete simple_strtoul
  mm/memory_hotplug.c: improve comments
  mm/hugetlb: create hugetlb cgroup file in hugetlb_init
  mm/mprotect.c: coding-style cleanups
  Documentation: ABI: /sys/devices/system/node/
  slub: drop mutex before deleting sysfs entry
  memcg: add comments clarifying aspects of cache attribute propagation
  kmem: add slab-specific documentation about the kmem controller
  slub: slub-specific propagation changes
  slab: propagate tunable values
  memcg: aggregate memcg cache values in slabinfo
  memcg/sl[au]b: shrink dead caches
  memcg/sl[au]b: track all the memcg children of a kmem_cache
  memcg: destroy memcg caches
  sl[au]b: allocate objects from memcg cache
  sl[au]b: always get the cache from its page in kmem_cache_free()
  memcg: skip memcg kmem allocations in specified code regions
  memcg: infrastructure to match an allocation to the right cache
  ...
2 parents d7b96ca + 3cf2384 commit 673ab87

38 files changed, +2548 -345 lines changed
Lines changed: 95 additions & 1 deletion
@@ -1,7 +1,101 @@
+What: /sys/devices/system/node/possible
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Nodes that could be possibly become online at some point.
+
+What: /sys/devices/system/node/online
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Nodes that are online.
+
+What: /sys/devices/system/node/has_normal_memory
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Nodes that have regular memory.
+
+What: /sys/devices/system/node/has_cpu
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Nodes that have one or more CPUs.
+
+What: /sys/devices/system/node/has_high_memory
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Nodes that have regular or high memory.
+	Depends on CONFIG_HIGHMEM.
+
 What: /sys/devices/system/node/nodeX
 Date: October 2002
 Contact: Linux Memory Management list <linux-mm@kvack.org>
 Description:
 	When CONFIG_NUMA is enabled, this is a directory containing
 	information on node X such as what CPUs are local to the
-	node.
+	node. Each file is detailed next.
+
+What: /sys/devices/system/node/nodeX/cpumap
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	The node's cpumap.
+
+What: /sys/devices/system/node/nodeX/cpulist
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	The CPUs associated to the node.
+
+What: /sys/devices/system/node/nodeX/meminfo
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Provides information about the node's distribution and memory
+	utilization. Similar to /proc/meminfo, see Documentation/filesystems/proc.txt
+
+What: /sys/devices/system/node/nodeX/numastat
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	The node's hit/miss statistics, in units of pages.
+	See Documentation/numastat.txt
+
+What: /sys/devices/system/node/nodeX/distance
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	Distance between the node and all the other nodes
+	in the system.
+
+What: /sys/devices/system/node/nodeX/vmstat
+Date: October 2002
+Contact: Linux Memory Management list <linux-mm@kvack.org>
+Description:
+	The node's zoned virtual memory statistics.
+	This is a superset of numastat.
+
+What: /sys/devices/system/node/nodeX/compact
+Date: February 2010
+Contact: Mel Gorman <mel@csn.ul.ie>
+Description:
+	When this file is written to, all memory within that node
+	will be compacted. When it completes, memory will be freed
+	into blocks which have as many contiguous pages as possible
+
+What: /sys/devices/system/node/nodeX/scan_unevictable_pages
+Date: October 2008
+Contact: Lee Schermerhorn <lee.schermerhorn@hp.com>
+Description:
+	When set, it triggers scanning the node's unevictable lists
+	and move any pages that have become evictable onto the respective
+	zone's inactive list. See mm/vmscan.c
+
+What: /sys/devices/system/node/nodeX/hugepages/hugepages-<size>/
+Date: December 2009
+Contact: Lee Schermerhorn <lee.schermerhorn@hp.com>
+Description:
+	The node's huge page size control/query attributes.
+	See Documentation/vm/hugetlbpage.txt
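
The ABI entries above name files such as `online`, `possible` and `nodeX/cpulist` without spelling out their contents. As a rough sketch, assuming they use the kernel's usual list notation of comma-separated IDs and ranges (for example `0-3,8`), a userspace reader could expand such a string as follows. The helper name `parse_id_list` is hypothetical and not part of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: expand a list such as "0-3,8" into individual IDs.
 * Returns the number of IDs written to out, or -1 on malformed input or
 * when out is too small. A trailing newline is tolerated.
 */
int parse_id_list(const char *s, int *out, size_t max)
{
	size_t n = 0;

	assert(s != NULL && out != NULL);
	while (*s != '\0' && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;	/* no number where one was expected */
		s = end;
		if (*s == '-') {	/* range of the form "lo-hi" */
			s++;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
			s = end;
		}
		for (long id = lo; id <= hi; id++) {
			if (n == max)
				return -1;	/* caller's buffer is full */
			out[n++] = (int)id;
		}
		if (*s == ',')
			s++;
		else if (*s != '\0' && *s != '\n')
			return -1;	/* unexpected separator */
	}
	return (int)n;
}
```

A caller would read the file into a buffer first and pass that buffer in; the sketch deliberately leaves file I/O out so the parsing rule stays visible.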

Documentation/cgroups/memory.txt

Lines changed: 65 additions & 1 deletion
@@ -71,6 +71,11 @@ Brief summary of control files.
  memory.oom_control # set/show oom controls.
  memory.numa_stat # show the number of memory usage per numa node
 
+ memory.kmem.limit_in_bytes # set/show hard limit for kernel memory
+ memory.kmem.usage_in_bytes # show current kernel memory allocation
+ memory.kmem.failcnt # show the number of kernel memory usage hits limits
+ memory.kmem.max_usage_in_bytes # show max kernel memory usage recorded
+
  memory.kmem.tcp.limit_in_bytes # set/show hard limit for tcp buf memory
  memory.kmem.tcp.usage_in_bytes # show current tcp buf memory allocation
  memory.kmem.tcp.failcnt # show the number of tcp buf memory usage hits limits
@@ -268,20 +273,73 @@ the amount of kernel memory used by the system. Kernel memory is fundamentally
 different than user memory, since it can't be swapped out, which makes it
 possible to DoS the system by consuming too much of this precious resource.
 
+Kernel memory won't be accounted at all until limit on a group is set. This
+allows for existing setups to continue working without disruption. The limit
+cannot be set if the cgroup have children, or if there are already tasks in the
+cgroup. Attempting to set the limit under those conditions will return -EBUSY.
+When use_hierarchy == 1 and a group is accounted, its children will
+automatically be accounted regardless of their limit value.
+
+After a group is first limited, it will be kept being accounted until it
+is removed. The memory limitation itself, can of course be removed by writing
+-1 to memory.kmem.limit_in_bytes. In this case, kmem will be accounted, but not
+limited.
+
 Kernel memory limits are not imposed for the root cgroup. Usage for the root
-cgroup may or may not be accounted.
+cgroup may or may not be accounted. The memory used is accumulated into
+memory.kmem.usage_in_bytes, or in a separate counter when it makes sense.
+(currently only for tcp).
+The main "kmem" counter is fed into the main counter, so kmem charges will
+also be visible from the user counter.
 
 Currently no soft limit is implemented for kernel memory. It is future work
 to trigger slab reclaim when those limits are reached.
 
 2.7.1 Current Kernel Memory resources accounted
 
+* stack pages: every process consumes some stack pages. By accounting into
+kernel memory, we prevent new processes from being created when the kernel
+memory usage is too high.
+
+* slab pages: pages allocated by the SLAB or SLUB allocator are tracked. A copy
+of each kmem_cache is created everytime the cache is touched by the first time
+from inside the memcg. The creation is done lazily, so some objects can still be
+skipped while the cache is being created. All objects in a slab page should
+belong to the same memcg. This only fails to hold when a task is migrated to a
+different memcg during the page allocation by the cache.
+
 * sockets memory pressure: some sockets protocols have memory pressure
 thresholds. The Memory Controller allows them to be controlled individually
 per cgroup, instead of globally.
 
 * tcp memory pressure: sockets memory pressure for the tcp protocol.
 
+2.7.3 Common use cases
+
+Because the "kmem" counter is fed to the main user counter, kernel memory can
+never be limited completely independently of user memory. Say "U" is the user
+limit, and "K" the kernel limit. There are three possible ways limits can be
+set:
+
+    U != 0, K = unlimited:
+    This is the standard memcg limitation mechanism already present before kmem
+    accounting. Kernel memory is completely ignored.
+
+    U != 0, K < U:
+    Kernel memory is a subset of the user memory. This setup is useful in
+    deployments where the total amount of memory per-cgroup is overcommited.
+    Overcommiting kernel memory limits is definitely not recommended, since the
+    box can still run out of non-reclaimable memory.
+    In this case, the admin could set up K so that the sum of all groups is
+    never greater than the total memory, and freely set U at the cost of his
+    QoS.
+
+    U != 0, K >= U:
+    Since kmem charges will also be fed to the user counter and reclaim will be
+    triggered for the cgroup for both kinds of memory. This setup gives the
+    admin a unified view of memory, and it is also useful for people who just
+    want to track kernel memory usage.
+
 3. User Interface
 
 0. Configuration
@@ -290,6 +348,7 @@ a. Enable CONFIG_CGROUPS
 b. Enable CONFIG_RESOURCE_COUNTERS
 c. Enable CONFIG_MEMCG
 d. Enable CONFIG_MEMCG_SWAP (to use swap extension)
+d. Enable CONFIG_MEMCG_KMEM (to use kmem extension)
 
 1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
 # mount -t tmpfs none /sys/fs/cgroup
@@ -406,6 +465,11 @@ About use_hierarchy, see Section 6.
 Because rmdir() moves all pages to parent, some out-of-use page caches can be
 moved to the parent. If you want to avoid that, force_empty will be useful.
 
+Also, note that when memory.kmem.limit_in_bytes is set the charges due to
+kernel pages will still be seen. This is not considered a failure and the
+write will still return success. In this case, it is expected that
+memory.kmem.usage_in_bytes == memory.usage_in_bytes.
+
 About use_hierarchy, see Section 6.
 
 5.2 stat file

Documentation/cgroups/resource_counter.txt

Lines changed: 4 additions & 3 deletions
@@ -83,16 +83,17 @@ to work with it.
 	res_counter->lock internally (it must be called with res_counter->lock
 	held). The force parameter indicates whether we can bypass the limit.
 
- e. void res_counter_uncharge[_locked]
+ e. u64 res_counter_uncharge[_locked]
 			(struct res_counter *rc, unsigned long val)
 
 	When a resource is released (freed) it should be de-accounted
 	from the resource counter it was accounted to. This is called
-	"uncharging".
+	"uncharging". The return value of this function indicate the amount
+	of charges still present in the counter.
 
 	The _locked routines imply that the res_counter->lock is taken.
 
- f. void res_counter_uncharge_until
+ f. u64 res_counter_uncharge_until
 		(struct res_counter *rc, struct res_counter *top,
 		 unsinged long val)
 
arch/cris/include/asm/io.h

Lines changed: 33 additions & 6 deletions
@@ -133,12 +133,39 @@ static inline void writel(unsigned int b, volatile void __iomem *addr)
 #define insb(port,addr,count) (cris_iops ? cris_iops->read_io(port,addr,1,count) : 0)
 #define insw(port,addr,count) (cris_iops ? cris_iops->read_io(port,addr,2,count) : 0)
 #define insl(port,addr,count) (cris_iops ? cris_iops->read_io(port,addr,4,count) : 0)
-#define outb(data,port) if (cris_iops) cris_iops->write_io(port,(void*)(unsigned)data,1,1)
-#define outw(data,port) if (cris_iops) cris_iops->write_io(port,(void*)(unsigned)data,2,1)
-#define outl(data,port) if (cris_iops) cris_iops->write_io(port,(void*)(unsigned)data,4,1)
-#define outsb(port,addr,count) if(cris_iops) cris_iops->write_io(port,(void*)addr,1,count)
-#define outsw(port,addr,count) if(cris_iops) cris_iops->write_io(port,(void*)addr,2,count)
-#define outsl(port,addr,count) if(cris_iops) cris_iops->write_io(port,(void*)addr,3,count)
+static inline void outb(unsigned char data, unsigned int port)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *) &data, 1, 1);
+}
+static inline void outw(unsigned short data, unsigned int port)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *) &data, 2, 1);
+}
+static inline void outl(unsigned int data, unsigned int port)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *) &data, 4, 1);
+}
+static inline void outsb(unsigned int port, const void *addr,
+			 unsigned long count)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *)addr, 1, count);
+}
+static inline void outsw(unsigned int port, const void *addr,
+			 unsigned long count)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *)addr, 2, count);
+}
+static inline void outsl(unsigned int port, const void *addr,
+			 unsigned long count)
+{
+	if (cris_iops)
+		cris_iops->write_io(port, (void *)addr, 4, count);
+}
 
 /*
  * Convert a physical pointer to a virtual kernel pointer for /dev/mem
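
Three things change in the hunk above. The removed `outb`/`outw`/`outl` macros cast the data value itself to a pointer, while the new functions pass `&data`. The removed `outsl` passed an element size of 3 where the new one passes 4. And the bare `if` in each macro could capture a following `else` at the call site, which an inline function cannot. The userspace sketch below models only the first point, assuming that `write_io` dereferences its `addr` argument. All names in it are hypothetical stand-ins for `cris_iops`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the cris_iops operations table. */
struct io_ops_model {
	void (*write_io)(unsigned int port, void *addr, int size, int count);
};

unsigned char last_write[4];	/* bytes seen by the most recent write */
size_t last_len;

/* A backend that copies size * count bytes from addr, as a real one might. */
static void record_write(unsigned int port, void *addr, int size, int count)
{
	(void)port;
	last_len = (size_t)size * (size_t)count;
	assert(last_len <= sizeof(last_write));
	memcpy(last_write, addr, last_len);
}

struct io_ops_model model_ops = { record_write };
struct io_ops_model *iops = &model_ops;

/* Function form, as in the new code: pass the address of the value. */
void model_outw(unsigned short data, unsigned int port)
{
	if (iops)
		iops->write_io(port, &data, 2, 1);
}
```

Had `model_outw` passed `(void *)(unsigned)data` as the removed macros did, `record_write` would have copied from whatever address the data value happened to spell.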

arch/h8300/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ config H8300
 	default y
 	select HAVE_IDE
 	select HAVE_GENERIC_HARDIRQS
+	select GENERIC_ATOMIC64
 	select HAVE_UID16
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select GENERIC_IRQ_SHOW

arch/x86/platform/iris/iris.c

Lines changed: 57 additions & 10 deletions
@@ -23,6 +23,7 @@
 
 #include <linux/moduleparam.h>
 #include <linux/module.h>
+#include <linux/platform_device.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
 #include <linux/delay.h>
@@ -62,29 +63,75 @@ static void iris_power_off(void)
  * by reading its input port and seeing whether the read value is
  * meaningful.
  */
-static int iris_init(void)
+static int iris_probe(struct platform_device *pdev)
 {
-	unsigned char status;
-	if (force != 1) {
-		printk(KERN_ERR "The force parameter has not been set to 1 so the Iris poweroff handler will not be installed.\n");
-		return -ENODEV;
-	}
-	status = inb(IRIS_GIO_INPUT);
+	unsigned char status = inb(IRIS_GIO_INPUT);
 	if (status == IRIS_GIO_NODEV) {
-		printk(KERN_ERR "This machine does not seem to be an Iris. Power_off handler not installed.\n");
+		printk(KERN_ERR "This machine does not seem to be an Iris. "
+			"Power off handler not installed.\n");
 		return -ENODEV;
 	}
 	old_pm_power_off = pm_power_off;
 	pm_power_off = &iris_power_off;
 	printk(KERN_INFO "Iris power_off handler installed.\n");
-
 	return 0;
 }
 
-static void iris_exit(void)
+static int iris_remove(struct platform_device *pdev)
 {
 	pm_power_off = old_pm_power_off;
 	printk(KERN_INFO "Iris power_off handler uninstalled.\n");
+	return 0;
+}
+
+static struct platform_driver iris_driver = {
+	.driver = {
+		.name = "iris",
+		.owner = THIS_MODULE,
+	},
+	.probe = iris_probe,
+	.remove = iris_remove,
+};
+
+static struct resource iris_resources[] = {
+	{
+		.start = IRIS_GIO_BASE,
+		.end = IRIS_GIO_OUTPUT,
+		.flags = IORESOURCE_IO,
+		.name = "address"
+	}
+};
+
+static struct platform_device *iris_device;
+
+static int iris_init(void)
+{
+	int ret;
+	if (force != 1) {
+		printk(KERN_ERR "The force parameter has not been set to 1."
+			" The Iris poweroff handler will not be installed.\n");
+		return -ENODEV;
+	}
+	ret = platform_driver_register(&iris_driver);
+	if (ret < 0) {
+		printk(KERN_ERR "Failed to register iris platform driver: %d\n",
+			ret);
+		return ret;
+	}
+	iris_device = platform_device_register_simple("iris", (-1),
+				iris_resources, ARRAY_SIZE(iris_resources));
+	if (IS_ERR(iris_device)) {
+		printk(KERN_ERR "Failed to register iris platform device\n");
+		platform_driver_unregister(&iris_driver);
+		return PTR_ERR(iris_device);
+	}
+	return 0;
+}
+
+static void iris_exit(void)
+{
+	platform_device_unregister(iris_device);
+	platform_driver_unregister(&iris_driver);
 }
 
 module_init(iris_init);
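
In the iris.c diff above, the probe step saves the current `pm_power_off` handler before installing its own, and the remove step puts the saved handler back. The userspace model below isolates that save-and-restore pattern. It is only a sketch: every `model_` name is invented here, and nothing in it touches the real platform driver API.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*power_off_fn)(void);

/* Stand-ins for the kernel's pm_power_off hook and the driver's saved copy. */
power_off_fn model_pm_power_off;	/* NULL means no handler installed */
power_off_fn model_old_power_off;
int model_power_off_calls;

/* Stand-in for iris_power_off(): just count invocations. */
void model_iris_power_off(void)
{
	model_power_off_calls++;
}

/* Mirrors the probe step: remember the previous handler, install ours. */
int model_probe(void)
{
	model_old_power_off = model_pm_power_off;
	model_pm_power_off = model_iris_power_off;
	return 0;
}

/* Mirrors the remove step: put back whatever was there before. */
int model_remove(void)
{
	model_pm_power_off = model_old_power_off;
	return 0;
}

/* What a shutdown path would do: call the installed handler. */
void model_power_off(void)
{
	assert(model_pm_power_off != NULL);
	model_pm_power_off();
}
```

Restoring the saved pointer, rather than writing NULL, keeps any handler that was installed before this driver loaded.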

drivers/message/fusion/mptscsih.c

Lines changed: 1 addition & 0 deletions
@@ -792,6 +792,7 @@ mptscsih_io_done(MPT_ADAPTER *ioc, MPT_FRAME_HDR *mf, MPT_FRAME_HDR *mr)
 			 * than an unsolicited DID_ABORT.
 			 */
 			sc->result = DID_RESET << 16;
+			break;
 
 		case MPI_IOCSTATUS_SCSI_EXT_TERMINATED:		/* 0x004C */
 			if (ioc->bus_type == FC)
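
The one-line change above adds a `break` so that the case which sets `DID_RESET` no longer runs on into the following `case`. The self-contained sketch below shows why that matters in general: without the `break`, the next case can overwrite the result that was just assigned. The enums and function names are invented for illustration and do not correspond to the driver's real status codes.

```c
#include <assert.h>

enum model_status { ST_TASK_TERMINATED, ST_EXT_TERMINATED, ST_OTHER };
enum model_result { RES_OK, RES_RESET, RES_ERROR };

/* Before the fix: the first case runs on into the second. */
enum model_result map_status_fallthrough(enum model_status st)
{
	enum model_result res = RES_OK;

	assert(st <= ST_OTHER);
	switch (st) {
	case ST_TASK_TERMINATED:
		res = RES_RESET;
		/* fall through - this is the bug being modelled */
	case ST_EXT_TERMINATED:
		res = RES_ERROR;	/* overwrites RES_RESET from above */
		break;
	default:
		break;
	}
	return res;
}

/* After the fix: each case ends with its own break. */
enum model_result map_status_fixed(enum model_status st)
{
	enum model_result res = RES_OK;

	assert(st <= ST_OTHER);
	switch (st) {
	case ST_TASK_TERMINATED:
		res = RES_RESET;
		break;
	case ST_EXT_TERMINATED:
		res = RES_ERROR;
		break;
	default:
		break;
	}
	return res;
}
```

The two functions differ only for `ST_TASK_TERMINATED`, where the first reports the second case's result and the fixed one reports a reset.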
