Commit f303fcc

workqueue: implement "workqueue.debug_force_rr_cpu" debug feature

Workqueue used to guarantee local execution for work items queued
without explicit target CPU. The guarantee is gone now which can
break some usages in subtle ways. To flush out those cases, this
patch implements a debug feature which forces round-robin CPU
selection for all such work items.

The debug feature defaults to off and can be enabled with a kernel
parameter. The default can be flipped with a debug config option.

If you hit this commit during bisection, please refer to 041bd12
("Revert "workqueue: make sure delayed work run in local cpu"") for
more information and ping me.

Signed-off-by: Tejun Heo <tj@kernel.org>

1 parent ef55718 commit f303fcc

3 files changed, 47 additions and 2 deletions


Documentation/kernel-parameters.txt

Lines changed: 11 additions & 0 deletions
@@ -4230,6 +4230,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			The default value of this parameter is determined by
 			the config option CONFIG_WQ_POWER_EFFICIENT_DEFAULT.

+	workqueue.debug_force_rr_cpu
+			Workqueue used to implicitly guarantee that work
+			items queued without explicit CPU specified are put
+			on the local CPU. This guarantee is no longer true
+			and while local CPU is still preferred work items
+			may be put on foreign CPUs. This debug option
+			forces round-robin CPU selection to flush out
+			usages which depend on the now broken guarantee.
+			When enabled, memory and cache locality will be
+			impacted.
+
 	x2apic_phys	[X86-64,APIC] Use x2apic physical mode instead of
 			default x2apic cluster mode on platforms
 			supporting x2apic.
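
To make "usages which depend on the now broken guarantee" concrete, here is a minimal userspace sketch of the kind of bug this option is meant to expose. It is not kernel code: `demo_counter`, `struct demo_work`, and both work functions are hypothetical names, and a plain array stands in for per-CPU data. In the kernel, a caller that needs a specific CPU can name it explicitly with `queue_work_on()` instead of relying on where the item happens to run.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Stands in for per-CPU data owned by each CPU. */
static int demo_counter[DEMO_NR_CPUS];

/* Modelled work item: who queued it and where the workqueue ran it. */
struct demo_work {
	int queued_from;	/* CPU that queued the item */
	int runs_on;		/* CPU the modelled workqueue picked */
};

/* Fragile: assumes the executing CPU is the CPU that queued the item. */
static void demo_fragile_fn(const struct demo_work *w)
{
	assert(w->runs_on >= 0 && w->runs_on < DEMO_NR_CPUS);
	demo_counter[w->runs_on]++;
}

/* Robust: carries the intended CPU explicitly instead of inferring it. */
static void demo_robust_fn(const struct demo_work *w)
{
	assert(w->queued_from >= 0 && w->queued_from < DEMO_NR_CPUS);
	demo_counter[w->queued_from]++;
}
```

If an item queued from CPU 0 is executed on CPU 1, the fragile function updates CPU 1's slot while the robust one still updates CPU 0's. Forced round-robin selection makes that mismatch happen on nearly every queueing, so the fragile pattern fails quickly instead of rarely.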

kernel/workqueue.c

Lines changed: 21 additions & 2 deletions
@@ -307,6 +307,18 @@ static cpumask_var_t wq_unbound_cpumask;
 /* CPU where unbound work was last round robin scheduled from this CPU */
 static DEFINE_PER_CPU(int, wq_rr_cpu_last);

+/*
+ * Local execution of unbound work items is no longer guaranteed. The
+ * following always forces round-robin CPU selection on unbound work items
+ * to uncover usages which depend on it.
+ */
+#ifdef CONFIG_DEBUG_WQ_FORCE_RR_CPU
+static bool wq_debug_force_rr_cpu = true;
+#else
+static bool wq_debug_force_rr_cpu = false;
+#endif
+module_param_named(debug_force_rr_cpu, wq_debug_force_rr_cpu, bool, 0644);
+
 /* the per-cpu worker pools */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
 				     cpu_worker_pools);
@@ -1309,10 +1321,17 @@ static bool is_chained_work(struct workqueue_struct *wq)
  */
 static int wq_select_unbound_cpu(int cpu)
 {
+	static bool printed_dbg_warning;
 	int new_cpu;

-	if (cpumask_test_cpu(cpu, wq_unbound_cpumask))
-		return cpu;
+	if (likely(!wq_debug_force_rr_cpu)) {
+		if (cpumask_test_cpu(cpu, wq_unbound_cpumask))
+			return cpu;
+	} else if (!printed_dbg_warning) {
+		pr_warn("workqueue: round-robin CPU selection forced, expect performance impact\n");
+		printed_dbg_warning = true;
+	}
+
 	if (cpumask_empty(wq_unbound_cpumask))
 		return cpu;
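
The control flow changed by the second hunk can be modelled in userspace. The following is a minimal sketch, not the kernel implementation: the `model_*` names and the 8-CPU bitmask are hypothetical, and the wrap-around step after the empty-mask check is an assumption, since the hunk does not show the remainder of `wq_select_unbound_cpu()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 8	/* hypothetical CPU count for this model */

static unsigned int model_unbound_mask = 0xffu;	/* stands in for wq_unbound_cpumask */
static bool model_force_rr;			/* stands in for wq_debug_force_rr_cpu */
static int model_rr_last[MODEL_NR_CPUS];	/* stands in for per-CPU wq_rr_cpu_last */

/* Next CPU set in @mask strictly after @prev, or MODEL_NR_CPUS if there is none. */
static int model_next_cpu(int prev, unsigned int mask)
{
	for (int cpu = prev + 1; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* Pick the CPU for an unbound work item queued from @cpu. */
static int model_select_unbound_cpu(int cpu)
{
	static bool printed_dbg_warning;
	int new_cpu;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (!model_force_rr) {
		/* Normal mode: prefer the local CPU when the mask allows it. */
		if (model_unbound_mask & (1u << cpu))
			return cpu;
	} else if (!printed_dbg_warning) {
		/* Debug mode: warn once, then fall through to round-robin. */
		fprintf(stderr, "model: round-robin CPU selection forced\n");
		printed_dbg_warning = true;
	}

	if (model_unbound_mask == 0)
		return cpu;

	/* Assumed round-robin step: advance from the last pick, wrap at the end. */
	new_cpu = model_next_cpu(model_rr_last[cpu], model_unbound_mask);
	if (new_cpu >= MODEL_NR_CPUS) {
		new_cpu = model_next_cpu(-1, model_unbound_mask);
		if (new_cpu >= MODEL_NR_CPUS)
			return cpu;
	}
	model_rr_last[cpu] = new_cpu;
	return new_cpu;
}
```

With `model_force_rr` clear, a call from an allowed CPU returns that same CPU. With it set and all eight CPUs allowed, successive calls from CPU 0 walk through 1, 2, 3 and so on, wrapping back to 0 after 7, which is the spreading behaviour the debug option relies on to expose locality assumptions.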
13181337

lib/Kconfig.debug

Lines changed: 15 additions & 0 deletions
@@ -1400,6 +1400,21 @@ config RCU_EQS_DEBUG

 endmenu # "RCU Debugging"

+config DEBUG_WQ_FORCE_RR_CPU
+	bool "Force round-robin CPU selection for unbound work items"
+	depends on DEBUG_KERNEL
+	default n
+	help
+	  Workqueue used to implicitly guarantee that work items queued
+	  without explicit CPU specified are put on the local CPU. This
+	  guarantee is no longer true and while local CPU is still
+	  preferred work items may be put on foreign CPUs. Kernel
+	  parameter "workqueue.debug_force_rr_cpu" is added to force
+	  round-robin CPU selection to flush out usages which depend on the
+	  now broken guarantee. This config option enables the debug
+	  feature by default. When enabled, memory and cache locality will
+	  be impacted.
+
 config DEBUG_BLOCK_EXT_DEVT
 	bool "Force extended block device numbers and spread them"
 	depends on DEBUG_KERNEL
