|
| 1 | + ==================== |
| 2 | + Energy Model of CPUs |
| 3 | + ==================== |
| 4 | + |
| 5 | +1. Overview |
| 6 | +----------- |
| 7 | + |
| 8 | +The Energy Model (EM) framework serves as an interface between drivers that |
| 9 | +know the power consumed by CPUs at various performance levels and the kernel |
| 10 | +subsystems that want to use that information to make energy-aware decisions. |
| 11 | + |
| 12 | +The source of the information about the power consumed by CPUs can vary greatly |
| 13 | +from one platform to another. These power costs can be estimated using |
| 14 | +devicetree data in some cases. In others, the firmware will know better. |
| 15 | +Alternatively, userspace might be best positioned. To spare each client |
| 16 | +subsystem from re-implementing support for every possible source of |
| 17 | +information on its own, the EM framework acts as an abstraction layer that |
| 18 | +standardizes the format of power cost tables in the kernel and so avoids |
| 19 | +redundant work. |
| 20 | + |
| 21 | +The figure below depicts an example of drivers (Arm-specific here, but the |
| 22 | +approach is applicable to any architecture) providing power costs to the EM |
| 23 | +framework, and interested clients reading the data from it. |
| 24 | + |
| 25 | + +---------------+ +-----------------+ +---------------+ |
| 26 | + | Thermal (IPA) | | Scheduler (EAS) | | Other | |
| 27 | + +---------------+ +-----------------+ +---------------+ |
| 28 | + | | em_pd_energy() | |
| 29 | + | | em_cpu_get() | |
| 30 | + +---------+ | +---------+ |
| 31 | + | | | |
| 32 | + v v v |
| 33 | + +---------------------+ |
| 34 | + | Energy Model | |
| 35 | + | Framework | |
| 36 | + +---------------------+ |
| 37 | + ^ ^ ^ |
| 38 | + | | | em_register_perf_domain() |
| 39 | + +----------+ | +---------+ |
| 40 | + | | | |
| 41 | + +---------------+ +---------------+ +--------------+ |
| 42 | + | cpufreq-dt | | arm_scmi | | Other | |
| 43 | + +---------------+ +---------------+ +--------------+ |
| 44 | + ^ ^ ^ |
| 45 | + | | | |
| 46 | + +--------------+ +---------------+ +--------------+ |
| 47 | + | Device Tree | | Firmware | | ? | |
| 48 | + +--------------+ +---------------+ +--------------+ |
| 49 | + |
| 50 | +The EM framework manages power cost tables per 'performance domain' in the |
| 51 | +system. A performance domain is a group of CPUs whose performance is scaled |
| 52 | +together. Performance domains generally have a 1-to-1 mapping with CPUFreq |
| 53 | +policies. All CPUs in a performance domain are required to have the same |
| 54 | +micro-architecture. CPUs in different performance domains can have different |
| 55 | +micro-architectures. |
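
The grouping described above can be pictured with a small userspace sketch. It is not kernel code: struct perf_domain, pd_attach(), pd_of_cpu() and the NR_CPUS value are hypothetical names chosen for illustration, and the rule that a CPU may belong to only one domain is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8	/* assumption: small fixed CPU count for the sketch */

/* Hypothetical, simplified stand-in for a performance domain. */
struct perf_domain {
	unsigned long cpus;	/* bitmask of the CPUs scaled together */
	int id;
};

/* Per-CPU lookup table: which domain, if any, each CPU belongs to. */
static struct perf_domain *cpu_to_pd[NR_CPUS];

/* Attach every CPU in pd->cpus to pd; fail if a CPU is already claimed. */
int pd_attach(struct perf_domain *pd)
{
	int cpu;

	assert(pd != NULL);

	/* First pass: refuse domains that overlap an existing one. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if ((pd->cpus & (1UL << cpu)) && cpu_to_pd[cpu])
			return -1;

	/* Second pass: record the domain for each of its CPUs. */
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (pd->cpus & (1UL << cpu))
			cpu_to_pd[cpu] = pd;

	return 0;
}

/* Return the domain of a CPU, or NULL if none was attached. */
struct perf_domain *pd_of_cpu(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return NULL;
	return cpu_to_pd[cpu];
}
```

With a "little" domain covering CPUs 0-3 (mask 0x0f) and a "big" domain covering CPUs 4-7 (mask 0xf0), pd_of_cpu(2) would return the little domain and pd_of_cpu(5) the big one.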
| 56 | + |
| 57 | + |
| 58 | +2. Core APIs |
| 59 | +------------ |
| 60 | + |
| 61 | + 2.1 Config options |
| 62 | + |
| 63 | +CONFIG_ENERGY_MODEL must be enabled to use the EM framework. |
| 64 | + |
| 65 | + |
| 66 | + 2.2 Registration of performance domains |
| 67 | + |
| 68 | +Drivers are expected to register performance domains into the EM framework by |
| 69 | +calling the following API: |
| 70 | + |
| 71 | + int em_register_perf_domain(cpumask_t *span, unsigned int nr_states, |
| 72 | + struct em_data_callback *cb); |
| 73 | + |
| 74 | +Drivers must specify the CPUs of the performance domains using the cpumask |
| 75 | +argument, and provide a callback function returning <frequency, power> tuples |
| 76 | +for each capacity state. The callback function provided by the driver is free |
| 77 | +to fetch data from any relevant location (DT, firmware, ...), and by any means |
| 78 | +deemed necessary. See Section 3 for an example of a driver implementing this |
| 79 | +callback, and kernel/power/energy_model.c for further documentation on this |
| 80 | +API. |
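
One way to picture how such a callback could be consumed is the userspace sketch below. It is an illustration under stated assumptions, not the code in kernel/power/energy_model.c: build_table(), struct cap_state, the active_power_fn typedef and the demo data are all hypothetical. The sketch assumes the callback rounds the requested frequency up to the next available state, as est_power() does in Section 3, so asking for "previous frequency + 1" walks the states in ascending order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical mirror of the callback shape used in this document. */
typedef int (*active_power_fn)(unsigned long *mW, unsigned long *KHz, int cpu);

struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

/*
 * Build a table of nr_states <frequency, power> tuples by asking the
 * callback for the lowest state at or above a rising frequency floor.
 * Returns NULL on a callback error or a non-increasing frequency.
 */
struct cap_state *build_table(int cpu, unsigned int nr_states,
			      active_power_fn cb)
{
	struct cap_state *table;
	unsigned long freq = 0, prev = 0, power = 0;
	unsigned int i;

	assert(cb != NULL);
	if (!nr_states)
		return NULL;

	table = calloc(nr_states, sizeof(*table));
	if (!table)
		return NULL;

	/* freq++ moves the floor just above the state found last time. */
	for (i = 0; i < nr_states; i++, freq++) {
		if (cb(&power, &freq, cpu) || (i > 0 && freq <= prev)) {
			free(table);
			return NULL;
		}
		table[i].frequency = prev = freq;
		table[i].power = power;
	}

	return table;
}

/* Hypothetical data source: three states of a made-up platform. */
static const struct cap_state demo_opps[] = {
	{ 500000, 100 }, { 1000000, 300 }, { 1500000, 700 },
};

/* Example callback: ceil *KHz to the next demo state and report its power. */
int demo_active_power(unsigned long *mW, unsigned long *KHz, int cpu)
{
	size_t i;

	(void)cpu;	/* the demo data is the same for every CPU */
	for (i = 0; i < sizeof(demo_opps) / sizeof(demo_opps[0]); i++) {
		if (demo_opps[i].frequency >= *KHz) {
			*KHz = demo_opps[i].frequency;
			*mW = demo_opps[i].power;
			return 0;
		}
	}

	return -1;	/* no state at or above the requested frequency */
}
```

Calling build_table(0, 3, demo_active_power) yields the three demo states in ascending order. Asking for a fourth state fails because the callback has nothing above 1500000 kHz.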
| 81 | + |
| 82 | + |
| 83 | + 2.3 Accessing performance domains |
| 84 | + |
| 85 | +Subsystems interested in the energy model of a CPU can retrieve it using the |
| 86 | +em_cpu_get() API. The energy model tables are allocated once upon creation of |
| 87 | +the performance domains, and kept in memory untouched. |
| 88 | + |
| 89 | +The energy consumed by a performance domain can be estimated using the |
| 90 | +em_pd_energy() API. The estimation is performed assuming that the schedutil |
| 91 | +CPUFreq governor is in use. |
| 92 | + |
| 93 | +More details about the above APIs can be found in include/linux/energy_model.h. |
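
To give a feel for what an estimation "assuming schedutil" could look like, here is a userspace sketch. The formula is an assumption of this sketch and is not taken from this document: it supposes the governor requests roughly 1.25 * max_freq * util / capacity, picks the lowest state that satisfies the request, and scales that state's power by the summed utilization of the domain. estimate_energy(), struct cap_state and the capacity scale of 1024 used in the usage note are hypothetical.

```c
#include <assert.h>

struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

/*
 * Estimate the energy of a domain whose states are in table[] (ascending
 * frequency). max_util is the highest utilization among the domain's CPUs,
 * sum_util the total over all of them, capacity the scale both are
 * expressed in.
 */
unsigned long estimate_energy(const struct cap_state *table,
			      unsigned int nr_states,
			      unsigned long max_util,
			      unsigned long sum_util,
			      unsigned long capacity)
{
	unsigned long max_freq, freq;
	unsigned int i;

	assert(table && nr_states > 0 && capacity > 0);
	max_freq = table[nr_states - 1].frequency;

	/* Assumed schedutil-like request: 1.25 * max_freq * util / capacity. */
	freq = (max_freq + (max_freq >> 2)) * max_util / capacity;

	/* Lowest state at or above the request; fall back to the highest. */
	for (i = 0; i < nr_states - 1; i++)
		if (table[i].frequency >= freq)
			break;

	/*
	 * Running slower stretches the busy time by max_freq / frequency,
	 * so scale the state's power by that ratio before weighting it by
	 * the total utilization.
	 */
	return table[i].power * max_freq / table[i].frequency
		* sum_util / capacity;
}
```

For a table of { 500000, 100 }, { 1000000, 300 }, { 1500000, 700 } and a capacity scale of 1024, a domain with max_util = 256 and sum_util = 512 maps to the lowest state and the function returns 150.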
| 94 | + |
| 95 | + |
| 96 | +3. Example driver |
| 97 | +----------------- |
| 98 | + |
| 99 | +This section provides a simple example of a CPUFreq driver registering a |
| 100 | +performance domain in the Energy Model framework using the (fake) 'foo' |
| 101 | +protocol. The driver implements an est_power() function to be provided to the |
| 102 | +EM framework. |
| 103 | + |
| 104 | + -> drivers/cpufreq/foo_cpufreq.c |
| 105 | + |
| 106 | +    static int est_power(unsigned long *mW, unsigned long *KHz, int cpu) |
| 107 | +    { |
| 108 | +        long freq, power; |
| 109 | + |
| 110 | +        /* Use the 'foo' protocol to ceil the frequency */ |
| 111 | +        freq = foo_get_freq_ceil(cpu, *KHz); |
| 112 | +        if (freq < 0) |
| 113 | +            return freq; |
| 114 | + |
| 115 | +        /* Estimate the power cost for the CPU at the relevant freq. */ |
| 116 | +        power = foo_estimate_power(cpu, freq); |
| 117 | +        if (power < 0) |
| 118 | +            return power; |
| 119 | + |
| 120 | +        /* Return the values to the EM framework */ |
| 121 | +        *mW = power; |
| 122 | +        *KHz = freq; |
| 123 | + |
| 124 | +        return 0; |
| 125 | +    } |
| 126 | + |
| 127 | +    static int foo_cpufreq_init(struct cpufreq_policy *policy) |
| 128 | +    { |
| 129 | +        struct em_data_callback em_cb = EM_DATA_CB(est_power); |
| 130 | +        int nr_opp, ret; |
| 131 | + |
| 132 | +        /* Do the actual CPUFreq init work ... */ |
| 133 | +        ret = do_foo_cpufreq_init(policy); |
| 134 | +        if (ret) |
| 135 | +            return ret; |
| 136 | + |
| 137 | +        /* Find the number of OPPs for this policy */ |
| 138 | +        nr_opp = foo_get_nr_opp(policy); |
| 139 | + |
| 140 | +        /* And register the new performance domain */ |
| 141 | +        em_register_perf_domain(policy->cpus, nr_opp, &em_cb); |
| 142 | + |
| 143 | +        return 0; |
| 144 | +    } |