
Commit d706751

Josef Bacik authored and Jens Axboe (axboe) committed
block: introduce blk-iolatency io controller
Current IO controllers for the block layer are less than ideal for our use case. The io.max controller is great at hard limiting, but it is not work conserving. This patch introduces io.latency. You provide a latency target for your group, and we monitor IO in short windows to make sure we are not exceeding those latency targets. This makes use of the rq-qos infrastructure and works much like the wbt stuff. There are a few differences from wbt:

- It's bio based, so the latency covers the whole block layer in addition to the actual IO.
- We will throttle all IO types that come in here if we need to.
- We use the mean latency over the 100ms window. This is because writes can be particularly fast, which could give us a false sense of the impact of other workloads on our protected workload.
- By default there's no throttling; we set the queue_depth to INT_MAX so that we can have as many outstanding bios as we're allowed to. Only at throttle time do we pay attention to the actual queue depth.
- We backcharge cgroups for root-cg-issued IO and induce artificial delays in order to deal with cases like metadata-only or swap-heavy workloads.

In testing this has worked out relatively well. Protected workloads will throttle noisy workloads down to 1 IO at a time if they are doing normal IO on their own, or induce up to a 1 second delay per syscall if they are doing a lot of root-issued IO (metadata/swap IO).

Our testing has revolved mostly around our production web servers, where we have hhvm (the web server application) in a protected group and everything else in another group. We see slightly higher requests per second (RPS) on the test tier vs the control tier, and much more stable RPS across all machines in the test tier vs the control tier.

Another test we run is a slow memory allocator in the unprotected group. Before, this would eventually push us into swap and cause the whole box to die and not recover at all. With these patches we see slight RPS drops (usually 10-15%) before the memory consumer is properly killed and things recover within seconds.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
1 parent 67b42d0 commit d706751

File tree

6 files changed: +957 −2 lines

block/Kconfig

Lines changed: 12 additions & 0 deletions

@@ -149,6 +149,18 @@ config BLK_WBT
 	  dynamically on an algorithm loosely based on CoDel, factoring in
 	  the realtime performance of the disk.
 
+config BLK_CGROUP_IOLATENCY
+	bool "Enable support for latency based cgroup IO protection"
+	depends on BLK_CGROUP=y
+	default n
+	---help---
+	Enabling this option enables the .latency interface for IO throttling.
+	The IO controller will attempt to maintain average IO latencies below
+	the configured latency target, throttling anybody with a higher latency
+	target than the victimized group.
+
+	Note, this is an experimental interface and could be changed someday.
+
 config BLK_WBT_SQ
 	bool "Single queue writeback throttling"
 	default n
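Once a kernel is built with CONFIG_BLK_CGROUP_IOLATENCY, the controller is driven through the cgroup v2 io.latency file. A hedged usage sketch (the mount point, cgroup name, device numbers, and target value are illustrative; the exact target units and syntax should be checked against your kernel's cgroup-v2 documentation):

```shell
# Enable the io controller for children of the root cgroup (cgroup v2).
echo "+io" > /sys/fs/cgroup/cgroup.subtree_control

# Create a protected group and give it a latency target on device 8:0.
mkdir /sys/fs/cgroup/protected
echo "8:0 target=10" > /sys/fs/cgroup/protected/io.latency
```

Per the commit message, groups with looser (higher) targets are the ones throttled when this group's target is being missed.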

block/Makefile

Lines changed: 1 addition & 0 deletions

@@ -17,6 +17,7 @@ obj-$(CONFIG_BLK_DEV_BSG)	+= bsg.o
 obj-$(CONFIG_BLK_DEV_BSGLIB)	+= bsg-lib.o
 obj-$(CONFIG_BLK_CGROUP)	+= blk-cgroup.o
 obj-$(CONFIG_BLK_DEV_THROTTLING)	+= blk-throttle.o
+obj-$(CONFIG_BLK_CGROUP_IOLATENCY)	+= blk-iolatency.o
 obj-$(CONFIG_IOSCHED_NOOP)	+= noop-iosched.o
 obj-$(CONFIG_IOSCHED_DEADLINE)	+= deadline-iosched.o
 obj-$(CONFIG_IOSCHED_CFQ)	+= cfq-iosched.o

block/blk-cgroup.c

Lines changed: 8 additions & 0 deletions

@@ -1238,6 +1238,14 @@ int blkcg_init_queue(struct request_queue *q)
 	if (preloaded)
 		radix_tree_preload_end();
 
+	ret = blk_iolatency_init(q);
+	if (ret) {
+		spin_lock_irq(q->queue_lock);
+		blkg_destroy_all(q);
+		spin_unlock_irq(q->queue_lock);
+		return ret;
+	}
+
 	ret = blk_throtl_init(q);
 	if (ret) {
 		spin_lock_irq(q->queue_lock);
