Commit f4d5b8a

Author: Ingo Molnar
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/urgent

Pull EFI/arm64 fix from Matt Fleming:

" * Fix a boot crash on arm64 caused by a recent commit to mark the EFI
    memory map as 'MEMBLOCK_NOMAP' which causes the regions to be
    omitted from the kernel direct mapping - Ard Biesheuvel "

Signed-off-by: Ingo Molnar <mingo@kernel.org>

2 parents c05c2ec + 7cc8cbc commit f4d5b8a

14 files changed

+288
-33
lines changed

Documentation/x86/protection-keys.txt

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+Memory Protection Keys for Userspace (PKU aka PKEYs) is a CPU feature
+which will be found on future Intel CPUs.
+
+Memory Protection Keys provides a mechanism for enforcing page-based
+protections, but without requiring modification of the page tables
+when an application changes protection domains. It works by
+dedicating 4 previously ignored bits in each page table entry to a
+"protection key", giving 16 possible keys.
+
+There is also a new user-accessible register (PKRU) with two separate
+bits (Access Disable and Write Disable) for each key. Being a CPU
+register, PKRU is inherently thread-local, potentially giving each
+thread a different set of protections from every other thread.
+
+There are two new instructions (RDPKRU/WRPKRU) for reading and writing
+to the new register. The feature is only available in 64-bit mode,
+even though there is theoretically space in the PAE PTEs. These
+permissions are enforced on data access only and have no effect on
+instruction fetches.
+
+=========================== Config Option ===========================
+
+This config option adds approximately 1.5kb of text and 50 bytes of
+data to the executable. A workload which does large O_DIRECT reads
+of holes in XFS files was run to exercise get_user_pages_fast(). No
+performance delta was observed with the config option
+enabled or disabled.
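
Note on protection-keys.txt: the per-key bit pair in PKRU can be modelled in plain C as a reading aid. This is a minimal user-space sketch, not kernel code. The helper names (`pkru_set_perms`, `pkru_read_allowed`, `pkru_write_allowed`) and the `PKEY_AD`/`PKEY_WD` constants are hypothetical, and the layout is an assumption taken from the Intel SDM rather than from this patch: key i uses bit 2i for Access Disable and bit 2i+1 for Write Disable. The code only manipulates a 32-bit value and never executes RDPKRU/WRPKRU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; assumed layout: bit 2i = Access Disable, bit 2i+1 = Write Disable. */
#define PKEY_AD  0x1u
#define PKEY_WD  0x2u
#define NR_PKEYS 16u	/* 4 bits per page table entry give 16 keys */

/* Return a copy of pkru with the two permission bits of 'key' replaced by 'perms'. */
static inline uint32_t pkru_set_perms(uint32_t pkru, unsigned int key, uint32_t perms)
{
	unsigned int shift = 2 * key;

	assert(key < NR_PKEYS);
	assert((perms & ~(PKEY_AD | PKEY_WD)) == 0);

	pkru &= ~(0x3u << shift);	/* clear the old AD/WD pair */
	pkru |= perms << shift;		/* install the new pair */
	return pkru;
}

/* Data reads are allowed unless Access Disable is set for the key. */
static inline int pkru_read_allowed(uint32_t pkru, unsigned int key)
{
	assert(key < NR_PKEYS);
	return !((pkru >> (2 * key)) & PKEY_AD);
}

/* Data writes are allowed only if neither AD nor WD is set for the key. */
static inline int pkru_write_allowed(uint32_t pkru, unsigned int key)
{
	assert(key < NR_PKEYS);
	return ((pkru >> (2 * key)) & (PKEY_AD | PKEY_WD)) == 0;
}
```

For example, `pkru_set_perms(0, 1, PKEY_WD)` yields a value under which pages tagged with key 1 remain readable but not writable by the current thread, while all other keys are unaffected.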

Documentation/x86/topology.txt

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
+x86 Topology
+============
+
+This documents and clarifies the main aspects of x86 topology modelling and
+representation in the kernel. Update/change when doing changes to the
+respective code.
+
+The architecture-agnostic topology definitions are in
+Documentation/cputopology.txt. This file holds x86-specific
+differences/specialities which do not necessarily apply to the generic
+definitions. Thus, the way to read up on Linux topology on x86 is to start
+with the generic one and look at this one in parallel for the x86 specifics.
+
+Needless to say, code should use the generic functions - this file is *only*
+here to *document* the inner workings of x86 topology.
+
+Started by Thomas Gleixner <tglx@linutronix.de> and Borislav Petkov <bp@alien8.de>.
+
+The main aim of the topology facilities is to present adequate interfaces to
+code which needs to know/query/use the structure of the running system wrt
+threads, cores, packages, etc.
+
+The kernel does not care about the concept of physical sockets because a
+socket has no relevance to software. It's an electromechanical component. In
+the past a socket always contained a single package (see below), but with the
+advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
+there might still be references to sockets in the code, but they are of
+historical nature and should be cleaned up.
+
+The topology of a system is described in the units of:
+
+    - packages
+    - cores
+    - threads
+
+* Package:
+
+  Packages contain a number of cores plus shared resources, e.g. DRAM
+  controller, shared caches etc.
+
+  AMD nomenclature for package is 'Node'.
+
+  Package-related topology information in the kernel:
+
+    - cpuinfo_x86.x86_max_cores:
+
+      The number of cores in a package. This information is retrieved via CPUID.
+
+    - cpuinfo_x86.phys_proc_id:
+
+      The physical ID of the package. This information is retrieved via CPUID
+      and deduced from the APIC IDs of the cores in the package.
+
+    - cpuinfo_x86.logical_id:
+
+      The logical ID of the package. As we do not trust BIOSes to enumerate the
+      packages in a consistent way, we introduced the concept of logical package
+      ID so we can sanely calculate the number of maximum possible packages in
+      the system and have the packages enumerated linearly.
+
+    - topology_max_packages():
+
+      The maximum possible number of packages in the system. Helpful for per
+      package facilities to preallocate per package information.
+
+
+* Cores:
+
+  A core consists of 1 or more threads. It does not matter whether the threads
+  are SMT- or CMT-type threads.
+
+  AMD's nomenclature for a CMT core is "Compute Unit". The kernel always uses
+  "core".
+
+  Core-related topology information in the kernel:
+
+    - smp_num_siblings:
+
+      The number of threads in a core. The number of threads in a package can be
+      calculated by:
+
+        threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
+
+
+* Threads:
+
+  A thread is a single scheduling unit. It is the equivalent of a logical Linux
+  CPU.
+
+  AMD's nomenclature for CMT threads is "Compute Unit Core". The kernel always
+  uses "thread".
+
+  Thread-related topology information in the kernel:
+
+    - topology_core_cpumask():
+
+      The cpumask contains all online threads in the package to which a thread
+      belongs.
+
+      The number of online threads is also printed in /proc/cpuinfo "siblings."
+
+    - topology_sibling_mask():
+
+      The cpumask contains all online threads in the core to which a thread
+      belongs.
+
+    - topology_logical_package_id():
+
+      The logical package ID to which a thread belongs.
+
+    - topology_physical_package_id():
+
+      The physical package ID to which a thread belongs.
+
+    - topology_core_id():
+
+      The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
+      "core_id."
+
+
+
+System topology examples
+
+Note:
+
+The alternative Linux CPU enumeration depends on how the BIOS enumerates the
+threads. Many BIOSes enumerate all threads 0 first and then all threads 1.
+That has the "advantage" that the logical Linux CPU numbers of threads 0 stay
+the same whether threads are enabled or not. That's merely an implementation
+detail and has no practical impact.
+
+1) Single Package, Single Core
+
+   [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+
+2) Single Package, Dual Core
+
+   a) One thread per core
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                  -> [core 1] -> [thread 0] -> Linux CPU 1
+
+   b) Two threads per core
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                              -> [thread 1] -> Linux CPU 1
+                  -> [core 1] -> [thread 0] -> Linux CPU 2
+                              -> [thread 1] -> Linux CPU 3
+
+      Alternative enumeration:
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                              -> [thread 1] -> Linux CPU 2
+                  -> [core 1] -> [thread 0] -> Linux CPU 1
+                              -> [thread 1] -> Linux CPU 3
+
+      AMD nomenclature for CMT systems:
+
+      [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                   -> [Compute Unit Core 1] -> Linux CPU 1
+               -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                   -> [Compute Unit Core 1] -> Linux CPU 3
+
+3) Dual Package, Dual Core
+
+   a) One thread per core
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                  -> [core 1] -> [thread 0] -> Linux CPU 1
+
+      [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                  -> [core 1] -> [thread 0] -> Linux CPU 3
+
+   b) Two threads per core
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                              -> [thread 1] -> Linux CPU 1
+                  -> [core 1] -> [thread 0] -> Linux CPU 2
+                              -> [thread 1] -> Linux CPU 3
+
+      [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4
+                              -> [thread 1] -> Linux CPU 5
+                  -> [core 1] -> [thread 0] -> Linux CPU 6
+                              -> [thread 1] -> Linux CPU 7
+
+      Alternative enumeration:
+
+      [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                              -> [thread 1] -> Linux CPU 4
+                  -> [core 1] -> [thread 0] -> Linux CPU 1
+                              -> [thread 1] -> Linux CPU 5
+
+      [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                              -> [thread 1] -> Linux CPU 6
+                  -> [core 1] -> [thread 0] -> Linux CPU 3
+                              -> [thread 1] -> Linux CPU 7
+
+      AMD nomenclature for CMT systems:
+
+      [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                   -> [Compute Unit Core 1] -> Linux CPU 1
+               -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                   -> [Compute Unit Core 1] -> Linux CPU 3
+
+      [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4
+                                   -> [Compute Unit Core 1] -> Linux CPU 5
+               -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6
+                                   -> [Compute Unit Core 1] -> Linux CPU 7
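
Note on topology.txt: the threads-per-package formula and the two enumeration orders in the examples can be reproduced with a few lines of arithmetic. This is a user-space sketch under stated assumptions, not kernel code: `struct topo` and the three helpers are hypothetical names, every package is assumed to have the same number of cores and threads, and the "alternative" order is assumed to list all threads 0 first and then all threads 1, as the note in the document describes. Real CPU numbering depends on how the BIOS enumerates the threads.

```c
#include <assert.h>

/* Hypothetical, uniform topology description used only for this sketch. */
struct topo {
	unsigned int packages;		/* number of packages */
	unsigned int cores_per_pkg;	/* stands in for cpuinfo_x86.x86_max_cores */
	unsigned int threads_per_core;	/* stands in for smp_num_siblings */
};

/* threads_per_package = x86_max_cores * smp_num_siblings */
static inline unsigned int threads_per_package(const struct topo *t)
{
	return t->cores_per_pkg * t->threads_per_core;
}

/* Default order in the examples: sibling threads get adjacent CPU numbers. */
static inline unsigned int cpu_linear(const struct topo *t, unsigned int pkg,
				      unsigned int core, unsigned int thread)
{
	assert(pkg < t->packages && core < t->cores_per_pkg &&
	       thread < t->threads_per_core);
	return (pkg * t->cores_per_pkg + core) * t->threads_per_core + thread;
}

/* Alternative order: all threads 0 first, then all threads 1, and so on. */
static inline unsigned int cpu_alternative(const struct topo *t, unsigned int pkg,
					   unsigned int core, unsigned int thread)
{
	assert(pkg < t->packages && core < t->cores_per_pkg &&
	       thread < t->threads_per_core);
	return thread * (t->packages * t->cores_per_pkg) +
	       pkg * t->cores_per_pkg + core;
}
```

With `{ .packages = 2, .cores_per_pkg = 2, .threads_per_core = 2 }`, package 1, core 0, thread 1 maps to Linux CPU 5 in the default order and to Linux CPU 6 in the alternative order, matching the dual-package diagrams.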

arch/x86/events/amd/core.c

Lines changed: 18 additions & 3 deletions
@@ -369,7 +369,7 @@ static int amd_pmu_cpu_prepare(int cpu)

 	WARN_ON_ONCE(cpuc->amd_nb);

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
 		return NOTIFY_OK;

 	cpuc->amd_nb = amd_alloc_nb(cpu);
@@ -388,7 +388,7 @@ static void amd_pmu_cpu_starting(int cpu)

 	cpuc->perf_ctr_virt_mask = AMD64_EVENTSEL_HOSTONLY;

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
 		return;

 	nb_id = amd_get_nb_id(cpu);
@@ -414,7 +414,7 @@ static void amd_pmu_cpu_dead(int cpu)
 {
 	struct cpu_hw_events *cpuhw;

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
 		return;

 	cpuhw = &per_cpu(cpu_hw_events, cpu);
@@ -648,6 +648,8 @@ static __initconst const struct x86_pmu amd_pmu = {
 	.cpu_prepare = amd_pmu_cpu_prepare,
 	.cpu_starting = amd_pmu_cpu_starting,
 	.cpu_dead = amd_pmu_cpu_dead,
+
+	.amd_nb_constraints = 1,
 };

 static int __init amd_core_pmu_init(void)
@@ -674,6 +676,11 @@ static int __init amd_core_pmu_init(void)
 	x86_pmu.eventsel = MSR_F15H_PERF_CTL;
 	x86_pmu.perfctr = MSR_F15H_PERF_CTR;
 	x86_pmu.num_counters = AMD64_NUM_COUNTERS_CORE;
+	/*
+	 * AMD Core perfctr has separate MSRs for the NB events, see
+	 * the amd/uncore.c driver.
+	 */
+	x86_pmu.amd_nb_constraints = 0;

 	pr_cont("core perfctr, ");
 	return 0;
@@ -693,6 +700,14 @@ __init int amd_pmu_init(void)
 	if (ret)
 		return ret;

+	if (num_possible_cpus() == 1) {
+		/*
+		 * No point in allocating data structures to serialize
+		 * against other CPUs, when there is only the one CPU.
+		 */
+		x86_pmu.amd_nb_constraints = 0;
+	}
+
 	/* Events are common for all AMDs */
 	memcpy(hw_cache_event_ids, amd_hw_cache_event_ids,
 	       sizeof(hw_cache_event_ids));
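
Note on arch/x86/events/amd/core.c: the hunks above replace the `x86_max_cores < 2` test with a one-bit flag that starts at 1 and is cleared in two places. The decision can be modelled outside the kernel roughly as follows. `struct pmu_model`, `amd_pmu_model_init()` and `needs_nb_allocation()` are hypothetical names for this sketch, and the `has_core_perfctr` parameter rests on an assumption not visible in the hunk: that the assignment in `amd_core_pmu_init()` is reached only when the core performance counter extension is in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single flag added to struct x86_pmu. */
struct pmu_model {
	unsigned int amd_nb_constraints : 1;
};

/*
 * Mirror the order of the patch: the flag defaults to 1, is cleared when
 * core perfctrs are used (NB events then have separate MSRs), and is
 * cleared when only one CPU is possible (nothing to serialize against).
 */
static inline struct pmu_model amd_pmu_model_init(bool has_core_perfctr,
						  unsigned int possible_cpus)
{
	struct pmu_model pmu = { .amd_nb_constraints = 1 };

	assert(possible_cpus >= 1);

	if (has_core_perfctr)
		pmu.amd_nb_constraints = 0;
	if (possible_cpus == 1)
		pmu.amd_nb_constraints = 0;
	return pmu;
}

/* The CPU hotplug callbacks return early when this is false. */
static inline bool needs_nb_allocation(const struct pmu_model *pmu)
{
	return pmu->amd_nb_constraints;
}
```

In this model, only a multi-CPU system without core perfctrs keeps the flag set, which is the case where the per-northbridge bookkeeping in `amd_pmu_cpu_prepare()`, `amd_pmu_cpu_starting()` and `amd_pmu_cpu_dead()` still runs.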

arch/x86/events/perf_event.h

Lines changed: 5 additions & 0 deletions
@@ -607,6 +607,11 @@ struct x86_pmu {
 	 */
 	atomic_t lbr_exclusive[x86_lbr_exclusive_max];

+	/*
+	 * AMD bits
+	 */
+	unsigned int amd_nb_constraints : 1;
+
 	/*
 	 * Extra registers for events
 	 */

arch/x86/include/asm/msr-index.h

Lines changed: 1 addition & 7 deletions
@@ -190,6 +190,7 @@
 #define MSR_PP1_ENERGY_STATUS 0x00000641
 #define MSR_PP1_POLICY 0x00000642

+/* Config TDP MSRs */
 #define MSR_CONFIG_TDP_NOMINAL 0x00000648
 #define MSR_CONFIG_TDP_LEVEL_1 0x00000649
 #define MSR_CONFIG_TDP_LEVEL_2 0x0000064A
@@ -210,13 +211,6 @@
 #define MSR_GFX_PERF_LIMIT_REASONS 0x000006B0
 #define MSR_RING_PERF_LIMIT_REASONS 0x000006B1

-/* Config TDP MSRs */
-#define MSR_CONFIG_TDP_NOMINAL 0x00000648
-#define MSR_CONFIG_TDP_LEVEL1 0x00000649
-#define MSR_CONFIG_TDP_LEVEL2 0x0000064A
-#define MSR_CONFIG_TDP_CONTROL 0x0000064B
-#define MSR_TURBO_ACTIVATION_RATIO 0x0000064C
-
 /* Hardware P state interface */
 #define MSR_PPERF 0x0000064e
 #define MSR_PERF_LIMIT_REASONS 0x0000064f

arch/x86/include/asm/processor.h

Lines changed: 0 additions & 2 deletions
@@ -132,8 +132,6 @@ struct cpuinfo_x86 {
 	u16 logical_proc_id;
 	/* Core id: */
 	u16 cpu_core_id;
-	/* Compute unit id */
-	u8 compute_unit_id;
 	/* Index into per_cpu list: */
 	u16 cpu_index;
 	u32 microcode;

arch/x86/include/asm/smp.h

Lines changed: 1 addition & 0 deletions
@@ -155,6 +155,7 @@ static inline int wbinvd_on_all_cpus(void)
 	wbinvd();
 	return 0;
 }
+#define smp_num_siblings 1
 #endif /* CONFIG_SMP */

 extern unsigned disabled_cpus;

arch/x86/include/asm/thread_info.h

Lines changed: 2 additions & 4 deletions
@@ -276,11 +276,9 @@ static inline bool is_ia32_task(void)
  */
 #define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)

-#endif /* !__ASSEMBLY__ */
-
-#ifndef __ASSEMBLY__
 extern void arch_task_cache_init(void);
 extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 extern void arch_release_task_struct(struct task_struct *tsk);
-#endif
+#endif /* !__ASSEMBLY__ */
+
 #endif /* _ASM_X86_THREAD_INFO_H */

arch/x86/kernel/amd_nb.c

Lines changed: 2 additions & 4 deletions
@@ -170,15 +170,13 @@ int amd_get_subcaches(int cpu)
 {
 	struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
 	unsigned int mask;
-	int cuid;

 	if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
 		return 0;

 	pci_read_config_dword(link, 0x1d4, &mask);

-	cuid = cpu_data(cpu).compute_unit_id;
-	return (mask >> (4 * cuid)) & 0xf;
+	return (mask >> (4 * cpu_data(cpu).cpu_core_id)) & 0xf;
 }

 int amd_set_subcaches(int cpu, unsigned long mask)
@@ -204,7 +202,7 @@ int amd_set_subcaches(int cpu, unsigned long mask)
 		pci_write_config_dword(nb->misc, 0x1b8, reg & ~0x180000);
 	}

-	cuid = cpu_data(cpu).compute_unit_id;
+	cuid = cpu_data(cpu).cpu_core_id;
 	mask <<= 4 * cuid;
 	mask |= (0xf ^ (1 << cuid)) << 26;
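
Note on arch/x86/kernel/amd_nb.c: both hunks index a packed register by core ID, four bits per core. The arithmetic can be checked in isolation with the sketch below. The function names are hypothetical, the register value is passed in rather than read from PCI config space, and the limit of four cores is an assumption made here so the shifts stay within 32 bits. The sketch mirrors only the bit manipulation visible in the diff and makes no claim about what the hardware fields mean.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit field for core_id, as amd_get_subcaches() does after the read. */
static inline unsigned int subcache_mask_for_core(uint32_t reg, unsigned int core_id)
{
	assert(core_id < 4);	/* assumption for this sketch */
	return (reg >> (4 * core_id)) & 0xf;
}

/*
 * Build the value that amd_set_subcaches() computes before its write:
 * the 4-bit mask moved to the core's slot, plus (0xf ^ (1 << core_id))
 * placed at bit 26.
 */
static inline uint32_t subcache_write_value(uint32_t mask, unsigned int core_id)
{
	assert(core_id < 4);	/* assumption for this sketch */
	assert(mask <= 0xf);

	mask <<= 4 * core_id;
	mask |= (uint32_t)(0xf ^ (1u << core_id)) << 26;
	return mask;
}
```

After this patch the index comes from `cpu_core_id` instead of the removed `compute_unit_id` field; the shifting itself is unchanged.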
