
Commit 616486a

mlyle authored and axboe committed
bcache: fix writeback target calc on large devices
Bcache needs to scale the dirty data in the cache over the multiple backing
disks in order to calculate writeback rates for each. The previous code did
this by multiplying the target number of dirty sectors by the backing device
size, and expected it to fit into a uint64_t; this blows up on relatively
small backing devices.

The new approach figures out the bdev's share in 16384ths of the overall
cached data. This is chosen to cope well when bdevs drastically vary in size
and to ensure that bcache can cross the petabyte boundary for each backing
device.

This has been improved based on Tang Junhui's feedback to ensure that every
device gets a share of dirty data, no matter how small it is compared to the
total backing pool.

The existing mechanism is very limited; this is purely a bug fix to remove
limits on volume size. However, there still needs to be change to make this
"fair" over many volumes where some are idle.

Reported-by: Jack Douglas <jack@douglastechnology.co.uk>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
1 parent 5138ac6 commit 616486a
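
To make the overflow described in the commit message concrete, here is a minimal userspace sketch (not kernel code; the device sizes are made up for illustration) contrasting the old multiply-then-divide target calculation with the 16384ths-share calculation this commit introduces. With a ~1 PiB backing device and a ~1 TiB cache-wide dirty target, the old 64-bit product wraps and the per-device target collapses to zero, while the share-based form returns the expected half of the dirty target:

/*
 * Illustrative userspace sketch, not kernel code: the sizes are made up.
 * It contrasts the old writeback-target calculation, whose 64-bit product
 * wraps, with the share-based calculation from this commit.
 */
#include <stdint.h>
#include <stdio.h>

#define WRITEBACK_SHARE_SHIFT 14

int main(void)
{
        uint64_t cache_dirty_target = 1ULL << 31; /* ~1 TiB dirty target, in 512 B sectors */
        uint64_t bdev_sectors       = 1ULL << 41; /* ~1 PiB backing device */
        uint64_t cached_dev_sectors = 1ULL << 42; /* ~2 PiB across all backing devices */

        /* Old approach: the product is 2^72, which wraps in 64 bits. */
        uint64_t old_target = cache_dirty_target * bdev_sectors / cached_dev_sectors;

        /* New approach: first express the device's share in 16384ths... */
        uint64_t bdev_share = (bdev_sectors << WRITEBACK_SHARE_SHIFT) / cached_dev_sectors;
        if (bdev_share < 1)
                bdev_share = 1; /* ...and give every backing device at least one share */
        uint64_t new_target = (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;

        printf("old target: %llu sectors (product wrapped)\n",
               (unsigned long long)old_target);
        printf("new target: %llu sectors (half the dirty target, as expected)\n",
               (unsigned long long)new_target);
        return 0;
}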

File tree

2 files changed: +34 -4 lines changed

drivers/md/bcache/writeback.c
drivers/md/bcache/writeback.h

drivers/md/bcache/writeback.c

Lines changed: 27 additions & 4 deletions
@@ -18,17 +18,39 @@
 #include <trace/events/bcache.h>
 
 /* Rate limiting */
-
-static void __update_writeback_rate(struct cached_dev *dc)
+static uint64_t __calc_target_rate(struct cached_dev *dc)
 {
         struct cache_set *c = dc->disk.c;
+
+        /*
+         * This is the size of the cache, minus the amount used for
+         * flash-only devices
+         */
         uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
                                 bcache_flash_devs_sectors_dirty(c);
+
+        /*
+         * Unfortunately there is no control of global dirty data. If the
+         * user states that they want 10% dirty data in the cache, and has,
+         * e.g., 5 backing volumes of equal size, we try and ensure each
+         * backing volume uses about 2% of the cache for dirty data.
+         */
+        uint32_t bdev_share =
+                div64_u64(bdev_sectors(dc->bdev) << WRITEBACK_SHARE_SHIFT,
+                                c->cached_dev_sectors);
+
         uint64_t cache_dirty_target =
                 div_u64(cache_sectors * dc->writeback_percent, 100);
-        int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
-                                c->cached_dev_sectors);
 
+        /* Ensure each backing dev gets at least one dirty share */
+        if (bdev_share < 1)
+                bdev_share = 1;
+
+        return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
+}
+
+static void __update_writeback_rate(struct cached_dev *dc)
+{
         /*
          * PI controller:
          * Figures out the amount that should be written per second.
@@ -49,6 +71,7 @@ static void __update_writeback_rate(struct cached_dev *dc)
          * This acts as a slow, long-term average that is not subject to
          * variations in usage like the p term.
          */
+        int64_t target = __calc_target_rate(dc);
         int64_t dirty = bcache_dev_sectors_dirty(&dc->disk);
         int64_t error = dirty - target;
         int64_t proportional_scaled =
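
The new comment in __calc_target_rate() states the intent: with a 10% dirty-data setting and, e.g., five backing volumes of equal size, each volume should end up using about 2% of the cache for dirty data. The sketch below (hypothetical userspace C; the volume sizes are invented) re-runs the same integer arithmetic for a mixed pool and shows why the minimum-share clamp added after Tang Junhui's review matters: a volume smaller than 1/16384 of the pool would otherwise truncate to a zero share and never be allowed any dirty data:

/*
 * Hypothetical userspace rendering of __calc_target_rate()'s arithmetic;
 * device sizes are made up for illustration.
 */
#include <stdint.h>
#include <stdio.h>

#define WRITEBACK_SHARE_SHIFT 14

static uint64_t calc_target(uint64_t cache_dirty_target,
                            uint64_t bdev_sectors,
                            uint64_t cached_dev_sectors)
{
        uint64_t share = (bdev_sectors << WRITEBACK_SHARE_SHIFT) / cached_dev_sectors;

        if (share < 1)
                share = 1; /* tiny volumes still get one 16384th */

        return (cache_dirty_target * share) >> WRITEBACK_SHARE_SHIFT;
}

int main(void)
{
        /* Four 50 TiB volumes plus one 8 GiB volume, in 512 B sectors. */
        uint64_t bdevs[] = { 50ULL << 31, 50ULL << 31, 50ULL << 31,
                             50ULL << 31, 8ULL << 21 };
        uint64_t dirty_target = 200ULL << 21; /* 200 GiB cache-wide dirty target */
        uint64_t pool = 0;
        int i;

        for (i = 0; i < 5; i++)
                pool += bdevs[i];

        /* The big volumes each get ~1/4 of the target; the 8 GiB volume
         * rounds down to a zero share and is clamped to 1/16384. */
        for (i = 0; i < 5; i++)
                printf("bdev %d: target %llu dirty sectors\n", i,
                       (unsigned long long)calc_target(dirty_target, bdevs[i], pool));
        return 0;
}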

drivers/md/bcache/writeback.h

Lines changed: 7 additions & 0 deletions
@@ -8,6 +8,13 @@
 #define MAX_WRITEBACKS_IN_PASS  5
 #define MAX_WRITESIZE_IN_PASS   5000    /* *512b */
 
+/*
+ * 14 (16384ths) is chosen here as something that each backing device
+ * should be a reasonable fraction of the share, and not to blow up
+ * until individual backing devices are a petabyte.
+ */
+#define WRITEBACK_SHARE_SHIFT   14
+
 static inline uint64_t bcache_dev_sectors_dirty(struct bcache_device *d)
 {
         uint64_t i, ret = 0;
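
As a rough cross-check of the headroom a shift of 14 leaves (this is back-of-the-envelope arithmetic, not reasoning taken from the commit), the two 64-bit intermediates in __calc_target_rate(), bdev_sectors << WRITEBACK_SHARE_SHIFT and cache_dirty_target * bdev_share, both stay inside a uint64_t as long as the corresponding sector count is below 2^50:

/*
 * Back-of-the-envelope check, not from the commit: with a shift of 14,
 * the intermediates in __calc_target_rate() keep roughly 50 bits of
 * sector headroom before a uint64_t would wrap.
 */
#include <stdint.h>
#include <stdio.h>

#define WRITEBACK_SHARE_SHIFT 14

int main(void)
{
        /* Largest sector count that can still be shifted left by 14
         * (or multiplied by a full 16384/16384 share) without wrapping. */
        uint64_t limit = UINT64_MAX >> WRITEBACK_SHARE_SHIFT;

        printf("headroom: %llu sectors (~%llu PiB)\n",
               (unsigned long long)limit,
               (unsigned long long)(limit >> 41)); /* 1 PiB = 2^41 sectors of 512 B */
        return 0;
}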
