Skip to content

Commit d0edc24

Browse files
Algodev-githubaxboe
authored andcommitted
block, bfq: inject other-queue I/O into seeky idle queues on NCQ flash
The Achilles' heel of BFQ is its failing to reach a high throughput with sync random I/O on flash storage with internal queueing, in case the processes doing I/O have differentiated weights. The cause of this failure is as follows. If at least two processes do sync I/O, and have a different weight from each other, then BFQ plugs I/O dispatching every time one of these processes, while it is being served, remains temporarily without pending I/O requests. This plugging is necessary to guarantee that every process enjoys a bandwidth proportional to its weight; but it empties the internal queue(s) of the drive. And this kills throughput with random I/O. So, if some processes have differentiated weights and do both sync and random I/O, the end result is a throughput collapse. This commit tries to counter this problem by injecting the service of other processes, in a controlled way, while the process in service happens to have no I/O. This injection is performed only if the medium is non rotational and performs internal queueing, and the process in service does random I/O (service injection might be beneficial for sequential I/O too, we'll work on that). As an example of the benefits of this commit, on a PLEXTOR PX-256M5S SSD, and with five processes having differentiated weights and doing sync random 4KB I/O, this commit makes the throughput with bfq grow by 400%, from 25 to 100MB/s. This higher throughput is 10MB/s lower than that reached with none. As some less random I/O is added to the mix, the throughput becomes equal to or higher than that with none. This commit is a very first attempt to recover throughput without losing control, and certainly has many limitations. One is, e.g., that the processes whose service is injected are not chosen so as to distribute the extra bandwidth they receive in accordance to their weights. Thus there might be loss of weighted fairness in some cases. Anyway, this loss concerns extra service, which would not have been received at all without this commit. Other limitations and issues will probably show up with usage. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
1 parent cbeb869 commit d0edc24

File tree

2 files changed

+88
-6
lines changed

2 files changed

+88
-6
lines changed

block/bfq-iosched.c

Lines changed: 62 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -3182,6 +3182,13 @@ static unsigned long bfq_bfqq_softrt_next_start(struct bfq_data *bfqd,
31823182
jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
31833183
}
31843184

3185+
static bool bfq_bfqq_injectable(struct bfq_queue *bfqq)
3186+
{
3187+
return BFQQ_SEEKY(bfqq) && bfqq->wr_coeff == 1 &&
3188+
blk_queue_nonrot(bfqq->bfqd->queue) &&
3189+
bfqq->bfqd->hw_tag;
3190+
}
3191+
31853192
/**
31863193
* bfq_bfqq_expire - expire a queue.
31873194
* @bfqd: device owning the queue.
@@ -3291,6 +3298,8 @@ void bfq_bfqq_expire(struct bfq_data *bfqd,
32913298
if (ref == 1) /* bfqq is gone, no more actions on it */
32923299
return;
32933300

3301+
bfqq->injected_service = 0;
3302+
32943303
/* mark bfqq as waiting a request only if a bic still points to it */
32953304
if (!bfq_bfqq_busy(bfqq) &&
32963305
reason != BFQQE_BUDGET_TIMEOUT &&
@@ -3629,6 +3638,30 @@ static bool bfq_bfqq_must_idle(struct bfq_queue *bfqq)
36293638
return RB_EMPTY_ROOT(&bfqq->sort_list) && bfq_better_to_idle(bfqq);
36303639
}
36313640

3641+
static struct bfq_queue *bfq_choose_bfqq_for_injection(struct bfq_data *bfqd)
3642+
{
3643+
struct bfq_queue *bfqq;
3644+
3645+
/*
3646+
* A linear search; but, with a high probability, very few
3647+
* steps are needed to find a candidate queue, i.e., a queue
3648+
* with enough budget left for its next request. In fact:
3649+
* - BFQ dynamically updates the budget of every queue so as
3650+
* to accommodate the expected backlog of the queue;
3651+
* - if a queue gets all its requests dispatched as injected
3652+
* service, then the queue is removed from the active list
3653+
* (and re-added only if it gets new requests, but with
3654+
* enough budget for its new backlog).
3655+
*/
3656+
list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
3657+
if (!RB_EMPTY_ROOT(&bfqq->sort_list) &&
3658+
bfq_serv_to_charge(bfqq->next_rq, bfqq) <=
3659+
bfq_bfqq_budget_left(bfqq))
3660+
return bfqq;
3661+
3662+
return NULL;
3663+
}
3664+
36323665
/*
36333666
* Select a queue for service. If we have a current queue in service,
36343667
* check whether to continue servicing it, or retrieve and set a new one.
@@ -3710,10 +3743,19 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
37103743
* No requests pending. However, if the in-service queue is idling
37113744
* for a new request, or has requests waiting for a completion and
37123745
* may idle after their completion, then keep it anyway.
3746+
*
3747+
* Yet, to boost throughput, inject service from other queues if
3748+
* possible.
37133749
*/
37143750
if (bfq_bfqq_wait_request(bfqq) ||
37153751
(bfqq->dispatched != 0 && bfq_better_to_idle(bfqq))) {
3716-
bfqq = NULL;
3752+
if (bfq_bfqq_injectable(bfqq) &&
3753+
bfqq->injected_service * bfqq->inject_coeff <
3754+
bfqq->entity.service * 10)
3755+
bfqq = bfq_choose_bfqq_for_injection(bfqd);
3756+
else
3757+
bfqq = NULL;
3758+
37173759
goto keep_queue;
37183760
}
37193761

@@ -3803,6 +3845,14 @@ static struct request *bfq_dispatch_rq_from_bfqq(struct bfq_data *bfqd,
38033845

38043846
bfq_dispatch_remove(bfqd->queue, rq);
38053847

3848+
if (bfqq != bfqd->in_service_queue) {
3849+
if (likely(bfqd->in_service_queue))
3850+
bfqd->in_service_queue->injected_service +=
3851+
bfq_serv_to_charge(rq, bfqq);
3852+
3853+
goto return_rq;
3854+
}
3855+
38063856
/*
38073857
* If weight raising has to terminate for bfqq, then next
38083858
* function causes an immediate update of bfqq's weight,
@@ -3821,13 +3871,12 @@ static struct request *bfq_dispatch_rq_from_bfqq(struct bfq_data *bfqd,
38213871
* belongs to CLASS_IDLE and other queues are waiting for
38223872
* service.
38233873
*/
3824-
if (bfqd->busy_queues > 1 && bfq_class_idle(bfqq))
3825-
goto expire;
3826-
3827-
return rq;
3874+
if (!(bfqd->busy_queues > 1 && bfq_class_idle(bfqq)))
3875+
goto return_rq;
38283876

3829-
expire:
38303877
bfq_bfqq_expire(bfqd, bfqq, false, BFQQE_BUDGET_EXHAUSTED);
3878+
3879+
return_rq:
38313880
return rq;
38323881
}
38333882

@@ -4232,6 +4281,13 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
42324281
bfq_mark_bfqq_has_short_ttime(bfqq);
42334282
bfq_mark_bfqq_sync(bfqq);
42344283
bfq_mark_bfqq_just_created(bfqq);
4284+
/*
4285+
* Aggressively inject a lot of service: up to 90%.
4286+
* This coefficient remains constant during bfqq life,
4287+
* but this behavior might be changed, after enough
4288+
* testing and tuning.
4289+
*/
4290+
bfqq->inject_coeff = 1;
42354291
} else
42364292
bfq_clear_bfqq_sync(bfqq);
42374293

block/bfq-iosched.h

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -351,6 +351,32 @@ struct bfq_queue {
351351
unsigned long split_time; /* time of last split */
352352

353353
unsigned long first_IO_time; /* time of first I/O for this queue */
354+
355+
/* max service rate measured so far */
356+
u32 max_service_rate;
357+
/*
358+
* Ratio between the service received by bfqq while it is in
359+
* service, and the cumulative service (of requests of other
360+
* queues) that may be injected while bfqq is empty but still
361+
* in service. To increase precision, the coefficient is
362+
* measured in tenths of unit. Here are some example of (1)
363+
* ratios, (2) resulting percentages of service injected
364+
* w.r.t. to the total service dispatched while bfqq is in
365+
* service, and (3) corresponding values of the coefficient:
366+
* 1 (50%) -> 10
367+
* 2 (33%) -> 20
368+
* 10 (9%) -> 100
369+
* 9.9 (9%) -> 99
370+
* 1.5 (40%) -> 15
371+
* 0.5 (66%) -> 5
372+
* 0.1 (90%) -> 1
373+
*
374+
* So, if the coefficient is lower than 10, then
375+
* injected service is more than bfqq service.
376+
*/
377+
unsigned int inject_coeff;
378+
/* amount of service injected in current service slot */
379+
unsigned int injected_service;
354380
};
355381

356382
/**

0 commit comments

Comments
 (0)