Commit 6453dbd

Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:

"Again, the majority of changes go into the cpufreq subsystem, but
there are no big features this time.

The cpufreq changes that stand out somewhat are the governor interface
rework and improvements related to the handling of frequency tables.
Apart from those, there are fixes and new device/CPU IDs in drivers,
cleanups and an improvement of the new schedutil governor.

Next, there are some changes in the hibernation core, including a fix
for a nasty problem related to the MONITOR/MWAIT usage by CPU offline
during resume from hibernation, a few core improvements related to
memory management during resume, a couple of additional debug features
and cleanups.

Finally, we have some fixes and cleanups in the devfreq subsystem,
generic power domains framework improvements related to system
suspend/resume, support for some new chips in intel_idle and in the
power capping RAPL driver, a new version of the AnalyzeSuspend utility
and some assorted fixes and cleanups.

Specifics:

 - Rework the cpufreq governor interface to make it more
   straightforward and modify the conservative governor to avoid using
   transition notifications (Rafael Wysocki).

 - Rework the handling of frequency tables by the cpufreq core to make
   it more efficient (Viresh Kumar).

 - Modify the schedutil governor to reduce the number of wakeups it
   causes to occur in cases when the CPU frequency doesn't need to be
   changed (Steve Muckle, Viresh Kumar).

 - Fix some minor issues and clean up code in the cpufreq core and
   governors (Rafael Wysocki, Viresh Kumar).

 - Add Intel Broxton support to the intel_pstate driver (Srinivas
   Pandruvada).

 - Fix problems related to the config TDP feature and to the validity
   of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka,
   Srinivas Pandruvada).

 - Make intel_pstate update the cpu_frequency tracepoint even if the
   frequency doesn't change to avoid confusing powertop (Rafael
   Wysocki).

 - Clean up the usage of __init/__initdata in intel_pstate, mark some
   of its internal variables as __read_mostly and drop an unused
   structure element from it (Jisheng Zhang, Carsten Emde).

 - Clean up the usage of some duplicate MSR symbols in intel_pstate
   and turbostat (Srinivas Pandruvada).

 - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay
   Adiga, Viresh Kumar, Ben Dooks).

 - Fix a regression (introduced during the 4.5 cycle) in the
   pcc-cpufreq driver by reverting the problematic commit (Andreas
   Herrmann).

 - Add support for Intel Denverton to intel_idle, clean up Broxton
   support in it and make it explicitly non-modular (Jacob Pan, Jan
   Beulich, Paul Gortmaker).

 - Add support for Denverton and Ivy Bridge server to the Intel RAPL
   power capping driver and make it more careful about the handling of
   MSRs that may not be present (Jacob Pan, Xiaolong Wang).

 - Fix resume from hibernation on x86-64 by making the CPU offline
   during resume avoid using MONITOR/MWAIT in the "play dead" loop
   which may lead to an inadvertent "revival" of a "dead" CPU and a
   page fault leading to a kernel crash from it (Rafael Wysocki).

 - Make memory management during resume from hibernation more
   straightforward (Rafael Wysocki).

 - Add debug features that should help to detect problems related to
   hibernation and resume from it (Rafael Wysocki, Chen Yu).

 - Clean up hibernation core somewhat (Rafael Wysocki).

 - Prevent KASAN from instrumenting the hibernation core which leads
   to large numbers of false-positives from it (James Morse).

 - Prevent PM (hibernate and suspend) notifiers from being called
   during the cleanup phase if they have not been called during the
   corresponding preparation phase which is possible if one of the
   other notifiers returns an error at that time (Lianwei Wang).

 - Improve suspend-related debug printout in the tasks freezer and
   clean up suspend-related console handling (Roger Lu, Borislav
   Petkov).

 - Update the AnalyzeSuspend script in the kernel sources to version
   4.2 (Todd Brandt).

 - Modify the generic power domains framework to make it handle system
   suspend/resume better (Ulf Hansson).

 - Make the runtime PM framework avoid resuming devices synchronously
   when user space changes the runtime PM settings for them and
   improve its error reporting (Rafael Wysocki, Linus Walleij).

 - Fix error paths in devfreq drivers (exynos, exynos-ppmu,
   exynos-bus) and in the core, make some devfreq code explicitly
   non-modular and change some of it into tristate (Bartlomiej
   Zolnierkiewicz, Peter Chen, Paul Gortmaker).

 - Add DT support to the generic PM clocks management code and make it
   export some more symbols (Jon Hunter, Paul Gortmaker).

 - Make the PCI PM core code slightly more robust against possible
   driver errors (Andy Shevchenko).

 - Make it possible to change DESTDIR and PREFIX in turbostat (Andy
   Shevchenko)"

* tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits)
  Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency"
  PM / hibernate: Introduce test_resume mode for hibernation
  cpufreq: export cpufreq_driver_resolve_freq()
  cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()
  PCI / PM: check all fields in pci_set_platform_pm()
  cpufreq: acpi-cpufreq: use cached frequency mapping when possible
  cpufreq: schedutil: map raw required frequency to driver frequency
  cpufreq: add cpufreq_driver_resolve_freq()
  cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
  intel_pstate: Update cpu_frequency tracepoint every time
  cpufreq: intel_pstate: clean remnant struct element
  PM / tools: scripts: AnalyzeSuspend v4.2
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  cpufreq: powernv: Replacing pstate_id with frequency table index
  intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate()
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  ...

2 parents 27b7902 + bc841e2 commit 6453dbd

65 files changed, +4332 -2821 lines changed

Documentation/cpu-freq/core.txt

Lines changed: 2 additions & 2 deletions
@@ -96,7 +96,7 @@ new - new frequency
 For details about OPP, see Documentation/power/opp.txt

 dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
-	cpufreq_frequency_table_cpuinfo which is provided with the list of
+	cpufreq_table_validate_and_show() which is provided with the list of
 	frequencies that are available for operation. This function provides
 	a ready to use conversion routine to translate the OPP layer's internal
 	information about the available frequencies into a format readily
@@ -110,7 +110,7 @@ dev_pm_opp_init_cpufreq_table - cpufreq framework typically is initialized with
 	/* Do things */
 	r = dev_pm_opp_init_cpufreq_table(dev, &freq_table);
 	if (!r)
-		cpufreq_frequency_table_cpuinfo(policy, freq_table);
+		cpufreq_table_validate_and_show(policy, freq_table);
 	/* Do other things */
 }

Documentation/cpu-freq/cpu-drivers.txt

Lines changed: 4 additions & 6 deletions
@@ -231,7 +231,7 @@ if you want to skip one entry in the table, set the frequency to
 CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending
 order.

-By calling cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
+By calling cpufreq_table_validate_and_show(struct cpufreq_policy *policy,
 					struct cpufreq_frequency_table *table);
 the cpuinfo.min_freq and cpuinfo.max_freq values are detected, and
 policy->min and policy->max are set to the same values. This is
@@ -244,14 +244,12 @@ policy->max, and all other criteria are met. This is helpful for the
 ->verify call.

 int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
-                                   struct cpufreq_frequency_table *table,
                                    unsigned int target_freq,
-                                   unsigned int relation,
-                                   unsigned int *index);
+                                   unsigned int relation);

 is the corresponding frequency table helper for the ->target
-stage. Just pass the values to this function, and the unsigned int
-index returns the number of the frequency table entry which contains
+stage. Just pass the values to this function, and this function
+returns the number of the frequency table entry which contains
 the frequency the CPU shall be set to.

 The following macros can be used as iterators over cpufreq_frequency_table:
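
The documentation change above says that cpufreq_frequency_table_target() now returns the table index instead of writing it through a pointer. The sketch below is a simplified user-space model of that idea, not the kernel implementation. All names in it (freq_entry, table_target, ENTRY_INVALID, TABLE_END, RELATION_L, RELATION_H) are hypothetical stand-ins, and the fallback to the nearest entry on the other side of the target is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical markers for this sketch only (not taken from kernel headers). */
#define ENTRY_INVALID	(~0u)	/* entry to be skipped */
#define TABLE_END	(~1u)	/* terminates the table */

/* RELATION_L: lowest frequency at or above the target.
 * RELATION_H: highest frequency at or below the target. */
enum relation { RELATION_L, RELATION_H };

struct freq_entry {
	unsigned int frequency;	/* in kHz */
};

/*
 * Return the index of the selected entry, or -1 if the table has no
 * valid entry. The table does not need to be sorted. If no entry
 * satisfies the relation, the closest entry on the other side of the
 * target is returned instead (an assumption of this sketch).
 */
int table_target(const struct freq_entry *table, unsigned int target,
		 enum relation rel)
{
	int best = -1, fallback = -1;
	int i;

	for (i = 0; table[i].frequency != TABLE_END; i++) {
		unsigned int f = table[i].frequency;

		if (f == ENTRY_INVALID)
			continue;

		if (rel == RELATION_L) {
			if (f >= target) {
				if (best < 0 || f < table[best].frequency)
					best = i;
			} else if (fallback < 0 || f > table[fallback].frequency) {
				fallback = i;
			}
		} else {
			if (f <= target) {
				if (best < 0 || f > table[best].frequency)
					best = i;
			} else if (fallback < 0 || f < table[fallback].frequency) {
				fallback = i;
			}
		}
	}

	return best >= 0 ? best : fallback;
}
```

Returning the index directly lets a caller write `idx = table_target(table, freq, RELATION_L);` and use the result without a separate output variable.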

Documentation/cpu-freq/pcc-cpufreq.txt

Lines changed: 2 additions & 2 deletions
@@ -159,8 +159,8 @@ to be strictly associated with a P-state.

 2.2 cpuinfo_transition_latency:
 -------------------------------
-The cpuinfo_transition_latency field is CPUFREQ_ETERNAL. The PCC specification
-does not include a field to expose this value currently.
+The cpuinfo_transition_latency field is 0. The PCC specification does
+not include a field to expose this value currently.

 2.3 cpuinfo_cur_freq:
 ---------------------

Documentation/kernel-parameters.txt

Lines changed: 3 additions & 0 deletions
@@ -3598,6 +3598,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 				present during boot.
 		nocompress	Don't compress/decompress hibernation images.
 		no		Disable hibernation and resume.
+		protect_image	Turn on image protection during restoration
+				(that will set all pages holding image data
+				during restoration read-only).

 	retain_initrd	[RAM] Keep initrd memory after extraction

arch/powerpc/platforms/cell/cpufreq_spudemand.c

Lines changed: 34 additions & 38 deletions
@@ -85,61 +85,57 @@ static void spu_gov_cancel_work(struct spu_gov_info_struct *info)
 	cancel_delayed_work_sync(&info->work);
 }

-static int spu_gov_govern(struct cpufreq_policy *policy, unsigned int event)
+static int spu_gov_start(struct cpufreq_policy *policy)
 {
 	unsigned int cpu = policy->cpu;
-	struct spu_gov_info_struct *info, *affected_info;
+	struct spu_gov_info_struct *info = &per_cpu(spu_gov_info, cpu);
+	struct spu_gov_info_struct *affected_info;
 	int i;
-	int ret = 0;

-	info = &per_cpu(spu_gov_info, cpu);
-
-	switch (event) {
-	case CPUFREQ_GOV_START:
-		if (!cpu_online(cpu)) {
-			printk(KERN_ERR "cpu %d is not online\n", cpu);
-			ret = -EINVAL;
-			break;
-		}
+	if (!cpu_online(cpu)) {
+		printk(KERN_ERR "cpu %d is not online\n", cpu);
+		return -EINVAL;
+	}

-		if (!policy->cur) {
-			printk(KERN_ERR "no cpu specified in policy\n");
-			ret = -EINVAL;
-			break;
-		}
+	if (!policy->cur) {
+		printk(KERN_ERR "no cpu specified in policy\n");
+		return -EINVAL;
+	}

-		/* initialize spu_gov_info for all affected cpus */
-		for_each_cpu(i, policy->cpus) {
-			affected_info = &per_cpu(spu_gov_info, i);
-			affected_info->policy = policy;
-		}
+	/* initialize spu_gov_info for all affected cpus */
+	for_each_cpu(i, policy->cpus) {
+		affected_info = &per_cpu(spu_gov_info, i);
+		affected_info->policy = policy;
+	}

-		info->poll_int = POLL_TIME;
+	info->poll_int = POLL_TIME;

-		/* setup timer */
-		spu_gov_init_work(info);
+	/* setup timer */
+	spu_gov_init_work(info);

-		break;
+	return 0;
+}

-	case CPUFREQ_GOV_STOP:
-		/* cancel timer */
-		spu_gov_cancel_work(info);
+static void spu_gov_stop(struct cpufreq_policy *policy)
+{
+	unsigned int cpu = policy->cpu;
+	struct spu_gov_info_struct *info = &per_cpu(spu_gov_info, cpu);
+	int i;

-		/* clean spu_gov_info for all affected cpus */
-		for_each_cpu (i, policy->cpus) {
-			info = &per_cpu(spu_gov_info, i);
-			info->policy = NULL;
-		}
+	/* cancel timer */
+	spu_gov_cancel_work(info);

-		break;
+	/* clean spu_gov_info for all affected cpus */
+	for_each_cpu (i, policy->cpus) {
+		info = &per_cpu(spu_gov_info, i);
+		info->policy = NULL;
 	}
-
-	return ret;
 }

 static struct cpufreq_governor spu_governor = {
 	.name = "spudemand",
-	.governor = spu_gov_govern,
+	.start = spu_gov_start,
+	.stop = spu_gov_stop,
 	.owner = THIS_MODULE,
 };
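
The spudemand change above replaces a single event-multiplexed ->governor() callback with separate ->start() and ->stop() callbacks. The following user-space sketch illustrates that shape only. The types and helpers (policy_model, governor_model, governor_model_start, governor_model_stop) are hypothetical and much simpler than the kernel's struct cpufreq_policy and struct cpufreq_governor, and treating a missing callback as a no-op is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cpufreq_policy. */
struct policy_model {
	unsigned int cpu;
	unsigned int cur;	/* current frequency, 0 if unknown */
	int online;
};

/* Hypothetical governor descriptor with one callback per operation. */
struct governor_model {
	const char *name;
	int (*start)(struct policy_model *policy);	/* may fail */
	void (*stop)(struct policy_model *policy);	/* cannot fail */
};

static int demo_active;

/* Each callback returns early on error; no shared "ret" or switch. */
static int demo_start(struct policy_model *policy)
{
	if (!policy->online || !policy->cur)
		return -EINVAL;

	demo_active = 1;
	return 0;
}

static void demo_stop(struct policy_model *policy)
{
	(void)policy;
	demo_active = 0;
}

static const struct governor_model demo_governor = {
	.name = "demo",
	.start = demo_start,
	.stop = demo_stop,
};

/* Core-side helpers: in this sketch a missing callback is a no-op. */
int governor_model_start(const struct governor_model *gov,
			 struct policy_model *policy)
{
	return gov->start ? gov->start(policy) : 0;
}

void governor_model_stop(const struct governor_model *gov,
			 struct policy_model *policy)
{
	if (gov->stop)
		gov->stop(policy);
}
```

With one callback per operation, the stop path can have a void return type, which matches spu_gov_stop() in the diff: there is no error for it to report.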

arch/x86/include/asm/msr-index.h

Lines changed: 0 additions & 2 deletions
@@ -64,8 +64,6 @@

 #define MSR_OFFCORE_RSP_0 0x000001a6
 #define MSR_OFFCORE_RSP_1 0x000001a7
-#define MSR_NHM_TURBO_RATIO_LIMIT 0x000001ad
-#define MSR_IVT_TURBO_RATIO_LIMIT 0x000001ae
 #define MSR_TURBO_RATIO_LIMIT 0x000001ad
 #define MSR_TURBO_RATIO_LIMIT1 0x000001ae
 #define MSR_TURBO_RATIO_LIMIT2 0x000001af

arch/x86/include/asm/smp.h

Lines changed: 1 addition & 0 deletions
@@ -135,6 +135,7 @@ int native_cpu_up(unsigned int cpunum, struct task_struct *tidle);
 int native_cpu_disable(void);
 int common_cpu_die(unsigned int cpu);
 void native_cpu_die(unsigned int cpu);
+void hlt_play_dead(void);
 void native_play_dead(void);
 void play_dead_common(void);
 void wbinvd_on_cpu(int cpu);

arch/x86/kernel/smpboot.c

Lines changed: 1 addition & 1 deletion
@@ -1644,7 +1644,7 @@ static inline void mwait_play_dead(void)
 	}
 }

-static inline void hlt_play_dead(void)
+void hlt_play_dead(void)
 {
 	if (__this_cpu_read(cpu_info.x86) >= 4)
 		wbinvd();

arch/x86/power/cpu.c

Lines changed: 30 additions & 0 deletions
@@ -12,6 +12,7 @@
 #include <linux/export.h>
 #include <linux/smp.h>
 #include <linux/perf_event.h>
+#include <linux/tboot.h>

 #include <asm/pgtable.h>
 #include <asm/proto.h>
@@ -266,6 +267,35 @@ void notrace restore_processor_state(void)
 EXPORT_SYMBOL(restore_processor_state);
 #endif

+#if defined(CONFIG_HIBERNATION) && defined(CONFIG_HOTPLUG_CPU)
+static void resume_play_dead(void)
+{
+	play_dead_common();
+	tboot_shutdown(TB_SHUTDOWN_WFS);
+	hlt_play_dead();
+}
+
+int hibernate_resume_nonboot_cpu_disable(void)
+{
+	void (*play_dead)(void) = smp_ops.play_dead;
+	int ret;
+
+	/*
+	 * Ensure that MONITOR/MWAIT will not be used in the "play dead" loop
+	 * during hibernate image restoration, because it is likely that the
+	 * monitored address will be actually written to at that time and then
+	 * the "dead" CPU will attempt to execute instructions again, but the
+	 * address in its instruction pointer may not be possible to resolve
+	 * any more at that point (the page tables used by it previously may
+	 * have been overwritten by hibernate image data).
+	 */
+	smp_ops.play_dead = resume_play_dead;
+	ret = disable_nonboot_cpus();
+	smp_ops.play_dead = play_dead;
+	return ret;
+}
+#endif
+
 /*
  * When bsp_check() is called in hibernate and suspend, cpu hotplug
  * is disabled already. So it's unnessary to handle race condition between
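
hibernate_resume_nonboot_cpu_disable() in the hunk above saves the current ->play_dead() pointer, installs a HLT-based replacement for the duration of one operation, and then restores the saved pointer. A minimal user-space sketch of that save, override and restore pattern follows. Every name in it (ops_model, smp_ops_model, disable_cpus_model and so on) is hypothetical; the real callbacks halt a CPU, whereas these only record which one ran.

```c
/* Hypothetical operations table with a single replaceable callback. */
struct ops_model {
	void (*play_dead)(void);
};

static int last_used;	/* 1 = default callback ran, 2 = safe callback ran */

static void default_play_dead(void)
{
	last_used = 1;
}

static void safe_play_dead(void)
{
	last_used = 2;
}

static struct ops_model smp_ops_model = {
	.play_dead = default_play_dead,
};

/* Stand-in for the operation that ends up invoking ->play_dead(). */
static int disable_cpus_model(void)
{
	smp_ops_model.play_dead();
	return 0;
}

int resume_disable_cpus_model(void)
{
	void (*saved)(void) = smp_ops_model.play_dead;
	int ret;

	/* Override only for the duration of this one call... */
	smp_ops_model.play_dead = safe_play_dead;
	ret = disable_cpus_model();
	/* ...and restore the original callback whatever the result. */
	smp_ops_model.play_dead = saved;

	return ret;
}
```

The restore happens unconditionally, so later callers see the default callback again even if the wrapped operation returned an error.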

drivers/base/power/clock_ops.c

Lines changed: 45 additions & 0 deletions
@@ -121,6 +121,7 @@ int pm_clk_add(struct device *dev, const char *con_id)
 {
 	return __pm_clk_add(dev, con_id, NULL);
 }
+EXPORT_SYMBOL_GPL(pm_clk_add);

 /**
  * pm_clk_add_clk - Start using a device clock for power management.
@@ -136,8 +137,41 @@ int pm_clk_add_clk(struct device *dev, struct clk *clk)
 {
 	return __pm_clk_add(dev, NULL, clk);
 }
+EXPORT_SYMBOL_GPL(pm_clk_add_clk);


+/**
+ * of_pm_clk_add_clk - Start using a device clock for power management.
+ * @dev: Device whose clock is going to be used for power management.
+ * @name: Name of clock that is going to be used for power management.
+ *
+ * Add the clock described in the 'clocks' device-tree node that matches
+ * with the 'name' provided, to the list of clocks used for the power
+ * management of @dev. On success, returns 0. Returns a negative error
+ * code if the clock is not found or cannot be added.
+ */
+int of_pm_clk_add_clk(struct device *dev, const char *name)
+{
+	struct clk *clk;
+	int ret;
+
+	if (!dev || !dev->of_node || !name)
+		return -EINVAL;
+
+	clk = of_clk_get_by_name(dev->of_node, name);
+	if (IS_ERR(clk))
+		return PTR_ERR(clk);
+
+	ret = pm_clk_add_clk(dev, clk);
+	if (ret) {
+		clk_put(clk);
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(of_pm_clk_add_clk);
+
 /**
  * of_pm_clk_add_clks - Start using device clock(s) for power management.
  * @dev: Device whose clock(s) is going to be used for power management.
@@ -192,6 +226,7 @@ int of_pm_clk_add_clks(struct device *dev)

 	return ret;
 }
+EXPORT_SYMBOL_GPL(of_pm_clk_add_clks);

 /**
  * __pm_clk_remove - Destroy PM clock entry.
@@ -252,6 +287,7 @@ void pm_clk_remove(struct device *dev, const char *con_id)

 	__pm_clk_remove(ce);
 }
+EXPORT_SYMBOL_GPL(pm_clk_remove);

 /**
  * pm_clk_remove_clk - Stop using a device clock for power management.
@@ -285,6 +321,7 @@ void pm_clk_remove_clk(struct device *dev, struct clk *clk)

 	__pm_clk_remove(ce);
 }
+EXPORT_SYMBOL_GPL(pm_clk_remove_clk);

 /**
  * pm_clk_init - Initialize a device's list of power management clocks.
@@ -299,6 +336,7 @@ void pm_clk_init(struct device *dev)
 	if (psd)
 		INIT_LIST_HEAD(&psd->clock_list);
 }
+EXPORT_SYMBOL_GPL(pm_clk_init);

 /**
  * pm_clk_create - Create and initialize a device's list of PM clocks.
@@ -311,6 +349,7 @@ int pm_clk_create(struct device *dev)
 {
 	return dev_pm_get_subsys_data(dev);
 }
+EXPORT_SYMBOL_GPL(pm_clk_create);

 /**
  * pm_clk_destroy - Destroy a device's list of power management clocks.
@@ -345,6 +384,7 @@ void pm_clk_destroy(struct device *dev)
 		__pm_clk_remove(ce);
 	}
 }
+EXPORT_SYMBOL_GPL(pm_clk_destroy);

 /**
  * pm_clk_suspend - Disable clocks in a device's PM clock list.
@@ -375,6 +415,7 @@ int pm_clk_suspend(struct device *dev)

 	return 0;
 }
+EXPORT_SYMBOL_GPL(pm_clk_suspend);

 /**
  * pm_clk_resume - Enable clocks in a device's PM clock list.
@@ -400,6 +441,7 @@ int pm_clk_resume(struct device *dev)

 	return 0;
 }
+EXPORT_SYMBOL_GPL(pm_clk_resume);

 /**
  * pm_clk_notify - Notify routine for device addition and removal.
@@ -480,6 +522,7 @@ int pm_clk_runtime_suspend(struct device *dev)

 	return 0;
 }
+EXPORT_SYMBOL_GPL(pm_clk_runtime_suspend);

 int pm_clk_runtime_resume(struct device *dev)
 {
@@ -495,6 +538,7 @@ int pm_clk_runtime_resume(struct device *dev)

 	return pm_generic_runtime_resume(dev);
 }
+EXPORT_SYMBOL_GPL(pm_clk_runtime_resume);

 #else /* !CONFIG_PM_CLK */

@@ -598,3 +642,4 @@ void pm_clk_add_notifier(struct bus_type *bus,
 	clknb->nb.notifier_call = pm_clk_notify;
 	bus_register_notifier(bus, &clknb->nb);
 }
+EXPORT_SYMBOL_GPL(pm_clk_add_notifier);
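
The new of_pm_clk_add_clk() above validates its arguments, acquires a clock reference, tries to register it, and drops the reference again if registration fails. The user-space sketch below models only that error-handling shape. The resource type and helpers (res_model, res_get, res_put, res_register, add_named_res) are hypothetical and stand in for the clock API calls seen in the diff; the lookup here fails only for an empty name, which is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reference-counted resource standing in for a clock. */
struct res_model {
	int refcount;
};

static struct res_model pool;	/* the only resource in this sketch */
static int list_full;		/* set to 1 to simulate a registration failure */
static int registered;		/* number of successful registrations */

/* Take a reference; an empty name models "not found". */
static struct res_model *res_get(const char *name)
{
	if (name[0] == '\0')
		return NULL;
	pool.refcount++;
	return &pool;
}

static void res_put(struct res_model *res)
{
	res->refcount--;
}

static int res_register(struct res_model *res)
{
	(void)res;
	if (list_full)
		return -ENOMEM;
	registered++;
	return 0;
}

int add_named_res(const char *name)
{
	struct res_model *res;
	int ret;

	if (!name)
		return -EINVAL;

	res = res_get(name);
	if (!res)
		return -ENOENT;

	ret = res_register(res);
	if (ret) {
		/* Registration failed: drop the reference taken above. */
		res_put(res);
		return ret;
	}

	return 0;
}
```

On the failure path the reference count ends where it started, which is the property the clk_put() call in of_pm_clk_add_clk() preserves.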
