Skip to content

Commit 406f992

Browse files
committed
x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
On Intel hardware, native_play_dead() uses mwait_play_dead() by default and only falls back to the other methods if that fails. That also happens during resume from hibernation, when the restore (boot) kernel runs disable_nonboot_cpus() to take all of the CPUs except for the boot one offline. However, that is problematic, because the address passed to __monitor() in mwait_play_dead() is likely to be written to in the last phase of hibernate image restoration and that causes the "dead" CPU to start executing instructions again. Unfortunately, the page containing the address in that CPU's instruction pointer may not be valid any more at that point. First, that page may have been overwritten with image kernel memory contents already, so the instructions the CPU attempts to execute may simply be invalid. Second, the page tables previously used by that CPU may have been overwritten by image kernel memory contents, so the address in its instruction pointer is impossible to resolve then. A report from Varun Koyyalagunta and investigation carried out by Chen Yu show that the latter sometimes happens in practice. To prevent it from happening, temporarily change the smp_ops.play_dead pointer during resume from hibernation so that it points to a special "play dead" routine which uses hlt_play_dead() and avoids the inadvertent "revivals" of "dead" CPUs this way. A slightly unpleasant consequence of this change is that if the system is hibernated with one or more CPUs offline, it will generally draw more power after resume than it did before hibernation, because the physical state entered by CPUs via hlt_play_dead() is higher-power than the mwait_play_dead() one in the majority of cases. It is possible to work around this, but it is unclear how much of a problem that's going to be in practice, so the workaround will be implemented later if it turns out to be necessary. Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371 Reported-by: Varun Koyyalagunta <cpudebug@centtech.com> Original-by: Chen Yu <yu.c.chen@intel.com> Tested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org>
1 parent 4c0b6c1 commit 406f992

File tree

5 files changed

+40
-2
lines changed

5 files changed

+40
-2
lines changed

arch/x86/include/asm/smp.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,7 @@ int native_cpu_up(unsigned int cpunum, struct task_struct *tidle);
135135
int native_cpu_disable(void);
136136
int common_cpu_die(unsigned int cpu);
137137
void native_cpu_die(unsigned int cpu);
138+
void hlt_play_dead(void);
138139
void native_play_dead(void);
139140
void play_dead_common(void);
140141
void wbinvd_on_cpu(int cpu);

arch/x86/kernel/smpboot.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1622,7 +1622,7 @@ static inline void mwait_play_dead(void)
16221622
}
16231623
}
16241624

1625-
static inline void hlt_play_dead(void)
1625+
void hlt_play_dead(void)
16261626
{
16271627
if (__this_cpu_read(cpu_info.x86) >= 4)
16281628
wbinvd();

arch/x86/power/cpu.c

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,7 @@
1212
#include <linux/export.h>
1313
#include <linux/smp.h>
1414
#include <linux/perf_event.h>
15+
#include <linux/tboot.h>
1516

1617
#include <asm/pgtable.h>
1718
#include <asm/proto.h>
@@ -266,6 +267,35 @@ void notrace restore_processor_state(void)
266267
EXPORT_SYMBOL(restore_processor_state);
267268
#endif
268269

270+
#if defined(CONFIG_HIBERNATION) && defined(CONFIG_HOTPLUG_CPU)
271+
static void resume_play_dead(void)
272+
{
273+
play_dead_common();
274+
tboot_shutdown(TB_SHUTDOWN_WFS);
275+
hlt_play_dead();
276+
}
277+
278+
int hibernate_resume_nonboot_cpu_disable(void)
279+
{
280+
void (*play_dead)(void) = smp_ops.play_dead;
281+
int ret;
282+
283+
/*
284+
* Ensure that MONITOR/MWAIT will not be used in the "play dead" loop
285+
* during hibernate image restoration, because it is likely that the
286+
* monitored address will be actually written to at that time and then
287+
* the "dead" CPU will attempt to execute instructions again, but the
288+
* address in its instruction pointer may not be possible to resolve
289+
* any more at that point (the page tables used by it previously may
290+
* have been overwritten by hibernate image data).
291+
*/
292+
smp_ops.play_dead = resume_play_dead;
293+
ret = disable_nonboot_cpus();
294+
smp_ops.play_dead = play_dead;
295+
return ret;
296+
}
297+
#endif
298+
269299
/*
270300
* When bsp_check() is called in hibernate and suspend, cpu hotplug
271301
* is disabled already. So it's unnessary to handle race condition between

kernel/power/hibernate.c

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -409,6 +409,11 @@ int hibernation_snapshot(int platform_mode)
409409
goto Close;
410410
}
411411

412+
int __weak hibernate_resume_nonboot_cpu_disable(void)
413+
{
414+
return disable_nonboot_cpus();
415+
}
416+
412417
/**
413418
* resume_target_kernel - Restore system state from a hibernation image.
414419
* @platform_mode: Whether or not to use the platform driver.
@@ -433,7 +438,7 @@ static int resume_target_kernel(bool platform_mode)
433438
if (error)
434439
goto Cleanup;
435440

436-
error = disable_nonboot_cpus();
441+
error = hibernate_resume_nonboot_cpu_disable();
437442
if (error)
438443
goto Enable_cpus;
439444

kernel/power/power.h

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,8 @@ static inline char *check_image_kernel(struct swsusp_info *info)
3838
}
3939
#endif /* CONFIG_ARCH_HIBERNATION_HEADER */
4040

41+
extern int hibernate_resume_nonboot_cpu_disable(void);
42+
4143
/*
4244
* Keep some memory free so that I/O operations can succeed without paging
4345
* [Might this be more than 4 MB?]

0 commit comments

Comments
 (0)