
Commit 40767b0

Peter Zijlstra authored and Ingo Molnar committed
sched/deadline: Fix deadline parameter modification handling
Commit 67dfa1b ("sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl()") removed the hrtimer_try_cancel() call from init_dl_task_timer(), which gets called from __setparam_dl(). The result is that we can now re-init the timer while it is active -- this is bad and corrupts timer state.

Furthermore, changing the parameters of an active deadline task is tricky in that you want to maintain guarantees, while an immediately effective change would allow one to circumvent the CBS guarantees -- this too is bad, as one (bad) task should not be able to affect the others.

Rework things to avoid both problems.

We only need to initialize the timer once, so move that to __sched_fork() for new tasks.

Then make sure __setparam_dl() doesn't affect the current running state but only updates the parameters used to calculate the next scheduling period -- this guarantees the CBS functions as expected (albeit slightly pessimistically).

This however means __dl_clear_params() needs to reset the active state, otherwise new tasks (and tasks flipping between classes) will not properly (re)compute their first instance.

Todo: close class flipping CBS hole.
Todo: implement delayed BW release.

Reported-by: Luca Abeni <luca.abeni@unitn.it>
Acked-by: Juri Lelli <juri.lelli@arm.com>
Tested-by: Luca Abeni <luca.abeni@unitn.it>
Fixes: 67dfa1b ("sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl()")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Kirill Tkhai <tkhai@yandex.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150128140803.GF23038@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent 3e87523 commit 40767b0
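
For context, the __setparam_dl() path touched below is reached from user space through sched_setattr(2). The following minimal caller is illustrative only and not part of this commit: it assumes <sys/syscall.h> exposes SYS_sched_setattr, defines its own struct sched_attr with the layout documented for SCHED_DEADLINE around the time of this commit (which may conflict with newer libcs that already provide one), and uses arbitrary 10ms/100ms numbers. Calling it a second time with different runtime/deadline/period is the parameter-modification case this fix is about; root or CAP_SYS_NICE is normally required.

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE	6
#endif

/* Layout as documented for SCHED_DEADLINE; defined locally on purpose. */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	/* SCHED_DEADLINE parameters, in nanoseconds */
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr = {
		.size		= sizeof(attr),
		.sched_policy	= SCHED_DEADLINE,
		.sched_runtime	= 10 * 1000 * 1000,	/* 10ms budget */
		.sched_deadline	= 100 * 1000 * 1000,	/* 100ms deadline */
		.sched_period	= 100 * 1000 * 1000,	/* 100ms period */
	};

	/*
	 * The first call admits the task; calling again with different
	 * runtime/deadline/period exercises the modification path that
	 * this commit fixes.
	 */
	if (syscall(SYS_sched_setattr, 0, &attr, 0)) {
		perror("sched_setattr");
		return 1;
	}
	return 0;
}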

File tree

2 files changed (+30, -6 lines)

kernel/sched/core.c

Lines changed: 28 additions & 5 deletions
@@ -1814,6 +1814,10 @@ void __dl_clear_params(struct task_struct *p)
 	dl_se->dl_period = 0;
 	dl_se->flags = 0;
 	dl_se->dl_bw = 0;
+
+	dl_se->dl_throttled = 0;
+	dl_se->dl_new = 1;
+	dl_se->dl_yielded = 0;
 }
 
 /*
@@ -1839,7 +1843,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 #endif
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
-	hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	init_dl_task_timer(&p->dl);
 	__dl_clear_params(p);
 
 	INIT_LIST_HEAD(&p->rt.run_list);
@@ -2049,6 +2053,9 @@ static inline int dl_bw_cpus(int i)
  * allocated bandwidth to reflect the new situation.
  *
  * This function is called while holding p's rq->lock.
+ *
+ * XXX we should delay bw change until the task's 0-lag point, see
+ * __setparam_dl().
  */
 static int dl_overflow(struct task_struct *p, int policy,
 		       const struct sched_attr *attr)
@@ -3251,15 +3258,31 @@ __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
 
-	init_dl_task_timer(dl_se);
 	dl_se->dl_runtime = attr->sched_runtime;
 	dl_se->dl_deadline = attr->sched_deadline;
 	dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
 	dl_se->flags = attr->sched_flags;
 	dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-	dl_se->dl_throttled = 0;
-	dl_se->dl_new = 1;
-	dl_se->dl_yielded = 0;
+
+	/*
+	 * Changing the parameters of a task is 'tricky' and we're not doing
+	 * the correct thing -- also see task_dead_dl() and switched_from_dl().
+	 *
+	 * What we SHOULD do is delay the bandwidth release until the 0-lag
+	 * point. This would include retaining the task_struct until that time
+	 * and change dl_overflow() to not immediately decrement the current
+	 * amount.
+	 *
+	 * Instead we retain the current runtime/deadline and let the new
+	 * parameters take effect after the current reservation period lapses.
+	 * This is safe (albeit pessimistic) because the 0-lag point is always
+	 * before the current scheduling deadline.
+	 *
+	 * We can still have temporary overloads because we do not delay the
+	 * change in bandwidth until that time; so admission control is
+	 * not on the safe side. It does however guarantee tasks will never
+	 * consume more than promised.
+	 */
 }
 
 /*
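
To make the semantics of the new comment in __setparam_dl() concrete, here is a small user-space toy model (not kernel code; names only loosely mirror sched_dl_entity): the configured parameters and the state of the current reservation are kept separate, and a parameter change only becomes visible when the next period is replenished.

#include <stdint.h>
#include <stdio.h>

struct toy_dl_entity {
	/* configured parameters -- what a __setparam_dl()-like call updates */
	uint64_t dl_runtime;
	uint64_t dl_deadline;
	/* state of the current reservation -- left untouched on change */
	uint64_t runtime;
	uint64_t deadline;
};

/* Parameter change: only the configuration is touched. */
static void toy_setparam(struct toy_dl_entity *dl, uint64_t rt, uint64_t ddl)
{
	dl->dl_runtime  = rt;
	dl->dl_deadline = ddl;
}

/* Start of the next period: the new configuration takes effect here. */
static void toy_replenish(struct toy_dl_entity *dl, uint64_t now)
{
	dl->runtime  = dl->dl_runtime;
	dl->deadline = now + dl->dl_deadline;
}

int main(void)
{
	struct toy_dl_entity dl = { .dl_runtime = 10, .dl_deadline = 100 };

	toy_replenish(&dl, 0);		/* first instance: 10 units until t=100 */
	toy_setparam(&dl, 50, 100);	/* raise the budget mid-period...       */
	printf("current runtime %llu\n",	/* ...still 10: no immediate effect */
	       (unsigned long long)dl.runtime);
	toy_replenish(&dl, 100);	/* new budget only applies from t=100 */
	printf("next runtime %llu\n",
	       (unsigned long long)dl.runtime);
	return 0;
}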

kernel/sched/deadline.c

Lines changed: 2 additions & 1 deletion
@@ -1094,6 +1094,7 @@ static void task_dead_dl(struct task_struct *p)
 	 * Since we are TASK_DEAD we won't slip out of the domain!
 	 */
 	raw_spin_lock_irq(&dl_b->lock);
+	/* XXX we should retain the bw until 0-lag */
 	dl_b->total_bw -= p->dl.dl_bw;
 	raw_spin_unlock_irq(&dl_b->lock);
 
@@ -1614,8 +1615,8 @@ static void cancel_dl_timer(struct rq *rq, struct task_struct *p)
 
 static void switched_from_dl(struct rq *rq, struct task_struct *p)
 {
+	/* XXX we should retain the bw until 0-lag */
 	cancel_dl_timer(rq, p);
-
 	__dl_clear_params(p);
 
 	/*
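
For reference, the dl_bw value that the total_bw accounting above works with is the fixed-point utilization to_ratio(period, runtime), i.e. runtime/period scaled by 2^20, so that dl_overflow() and task_dead_dl() can add and subtract bandwidths as integers. The sketch below illustrates that arithmetic; it is not a copy of the in-tree helper.

#include <stdint.h>
#include <stdio.h>

#define TOY_BW_SHIFT	20

/* runtime/period in fixed point, scaled by 2^TOY_BW_SHIFT */
static uint64_t toy_to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)	/* returning 0 keeps the callers simple */
		return 0;
	return (runtime << TOY_BW_SHIFT) / period;
}

int main(void)
{
	/* 10ms runtime every 100ms -> ~10% utilization ~= 104857 / 2^20 */
	printf("dl_bw = %llu\n",
	       (unsigned long long)toy_to_ratio(100 * 1000 * 1000ULL,
						10 * 1000 * 1000ULL));
	return 0;
}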
