Commit 0d1e8b8

Merge tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
 "ARM:
   - Improved guest IPA space support (32 to 52 bits)
   - RAS event delivery for 32bit
   - PMU fixes
   - Guest entry hardening
   - Various cleanups
   - Port of dirty_log_test selftest

  PPC:
   - Nested HV KVM support for radix guests on POWER9. The performance
     is much better than with PR KVM. Migration and arbitrary level of
     nesting is supported.
   - Disable nested HV-KVM on early POWER9 chips that need a particular
     hardware bug workaround
   - One VM per core mode to prevent potential data leaks
   - PCI pass-through optimization
   - merge ppc-kvm topic branch and kvm-ppc-fixes to get a better base

  s390:
   - Initial version of AP crypto virtualization via vfio-mdev
   - Improvement for vfio-ap
   - Set the host program identifier
   - Optimize page table locking

  x86:
   - Enable nested virtualization by default
   - Implement Hyper-V IPI hypercalls
   - Improve #PF and #DB handling
   - Allow guests to use Enlightened VMCS
   - Add migration selftests for VMCS and Enlightened VMCS
   - Allow coalesced PIO accesses
   - Add an option to perform nested VMCS host state consistency check
     through hardware
   - Automatic tuning of lapic_timer_advance_ns
   - Many fixes, minor improvements, and cleanups"

* tag 'kvm-4.20-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  KVM/nVMX: Do not validate that posted_intr_desc_addr is page aligned
  Revert "kvm: x86: optimize dr6 restore"
  KVM: PPC: Optimize clearing TCEs for sparse tables
  x86/kvm/nVMX: tweak shadow fields
  selftests/kvm: add missing executables to .gitignore
  KVM: arm64: Safety check PSTATE when entering guest and handle IL
  KVM: PPC: Book3S HV: Don't use streamlined entry path on early POWER9 chips
  arm/arm64: KVM: Enable 32 bits kvm vcpu events support
  arm/arm64: KVM: Rename function kvm_arch_dev_ioctl_check_extension()
  KVM: arm64: Fix caching of host MDCR_EL2 value
  KVM: VMX: enable nested virtualization by default
  KVM/x86: Use 32bit xor to clear registers in svm.c
  kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD
  kvm: vmx: Defer setting of DR6 until #DB delivery
  kvm: x86: Defer setting of CR2 until #PF delivery
  kvm: x86: Add payload operands to kvm_multiple_exception
  kvm: x86: Add exception payload fields to kvm_vcpu_events
  kvm: x86: Add has_payload and payload to kvm_queued_exception
  KVM: Documentation: Fix omission in struct kvm_vcpu_events
  KVM: selftests: add Enlightened VMCS test
  ...
2 parents 83c4087 + 22a7cdc commit 0d1e8b8

138 files changed (+12456 additions, -3259 deletions)

Documentation/s390/vfio-ap.txt

Lines changed: 837 additions & 0 deletions
Large diffs are not rendered by default.

Documentation/virtual/kvm/api.txt

Lines changed: 129 additions & 6 deletions
@@ -123,6 +123,37 @@ memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
 flag KVM_VM_MIPS_VZ.
 
 
+On arm64, the physical address size for a VM (IPA Size limit) is limited
+to 40bits by default. The limit can be configured if the host supports the
+extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
+KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
+identifier, where IPA_Bits is the maximum width of any physical
+address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
+machine type identifier.
+
+e.g, to configure a guest to use 48bit physical address size :
+
+    vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
+
+The requested size (IPA_Bits) must be :
+  0 - Implies default size, 40bits (for backward compatibility)
+
+  or
+
+  N - Implies N bits, where N is a positive integer such that,
+      32 <= N <= Host_IPA_Limit
+
+Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
+is dependent on the CPU capability and the kernel configuration. The limit can
+be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION
+ioctl() at run-time.
+
+Please note that configuring the IPA size does not affect the capability
+exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
+size of the address translated by the stage2 level (guest physical to
+host physical address translations).
+
+
 4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
 
 Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
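As a concrete illustration of the documentation added above, here is a hedged userspace sketch (not part of the patch) that probes KVM_CAP_ARM_VM_IPA_SIZE and creates a VM with a 48-bit IPA space. The fd names and the helper create_vm_with_48bit_ipa() are illustrative assumptions, and building it needs uapi headers from a kernel that carries this series.

    /*
     * Sketch: check Host_IPA_Limit, then create an arm64 VM whose stage-2
     * translates 48-bit guest physical addresses.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int create_vm_with_48bit_ipa(void)
    {
            int dev_fd, ipa_limit;

            dev_fd = open("/dev/kvm", O_RDWR);
            if (dev_fd < 0)
                    return -1;

            /* Host_IPA_Limit; 0 means the extension is not supported. */
            ipa_limit = ioctl(dev_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_VM_IPA_SIZE);
            if (ipa_limit < 48) {
                    fprintf(stderr, "48-bit IPA not supported (limit %d)\n", ipa_limit);
                    return -1;
            }

            /* IPA_Bits is encoded in bits[7-0] of the machine type identifier. */
            return ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
    }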
@@ -850,7 +881,7 @@ struct kvm_vcpu_events {
 		__u8 injected;
 		__u8 nr;
 		__u8 has_error_code;
-		__u8 pad;
+		__u8 pending;
 		__u32 error_code;
 	} exception;
 	struct {
@@ -873,15 +904,23 @@ struct kvm_vcpu_events {
 		__u8 smm_inside_nmi;
 		__u8 latched_init;
 	} smi;
+	__u8 reserved[27];
+	__u8 exception_has_payload;
+	__u64 exception_payload;
 };
 
-Only two fields are defined in the flags field:
+The following bits are defined in the flags field:
 
-- KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
+- KVM_VCPUEVENT_VALID_SHADOW may be set to signal that
   interrupt.shadow contains a valid state.
 
-- KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
-  smi contains a valid state.
+- KVM_VCPUEVENT_VALID_SMM may be set to signal that smi contains a
+  valid state.
+
+- KVM_VCPUEVENT_VALID_PAYLOAD may be set to signal that the
+  exception_has_payload, exception_payload, and exception.pending
+  fields contain a valid state. This bit will be set whenever
+  KVM_CAP_EXCEPTION_PAYLOAD is enabled.
 
 ARM/ARM64:
 
@@ -961,6 +1000,11 @@ shall be written into the VCPU.
 
 KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.
 
+If KVM_CAP_EXCEPTION_PAYLOAD is enabled, KVM_VCPUEVENT_VALID_PAYLOAD
+can be set in the flags field to signal that the
+exception_has_payload, exception_payload, and exception.pending fields
+contain a valid state and shall be written into the VCPU.
+
 ARM/ARM64:
 
 Set the pending SError exception state for this VCPU. It is not possible to
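For the exception-payload plumbing documented above, here is a hedged sketch of how userspace might inject a #PF with a payload via KVM_SET_VCPU_EVENTS. It assumes KVM_CAP_EXCEPTION_PAYLOAD has already been enabled on the VM (see 7.17 below); vcpu_fd and the helper name are illustrative, not part of the patch.

    /* Sketch: queue a pending #PF whose faulting address rides in the payload. */
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int inject_page_fault(int vcpu_fd, __u64 fault_address, __u32 error_code)
    {
            struct kvm_vcpu_events events;

            memset(&events, 0, sizeof(events));
            events.flags = KVM_VCPUEVENT_VALID_PAYLOAD;
            events.exception.pending = 1;
            events.exception.nr = 14;                 /* #PF vector */
            events.exception.has_error_code = 1;
            events.exception.error_code = error_code;
            events.exception_has_payload = 1;
            events.exception_payload = fault_address; /* becomes CR2 on delivery */

            return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
    }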
@@ -1922,6 +1966,7 @@ registers, find a list below:
   PPC   | KVM_REG_PPC_TIDR              | 64
   PPC   | KVM_REG_PPC_PSSCR             | 64
   PPC   | KVM_REG_PPC_DEC_EXPIRY        | 64
+  PPC   | KVM_REG_PPC_PTCR              | 64
   PPC   | KVM_REG_PPC_TM_GPR0           | 64
           ...
   PPC   | KVM_REG_PPC_TM_GPR31          | 64
@@ -2269,6 +2314,10 @@ The supported flags are:
     The emulated MMU supports 1T segments in addition to the
     standard 256M ones.
 
+ - KVM_PPC_NO_HASH
+	This flag indicates that HPT guests are not supported by KVM,
+	thus all guests must use radix MMU mode.
+
 The "slb_size" field indicates how many SLB entries are supported
 
 The "sps" array contains 8 entries indicating the supported base
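A hedged sketch of how userspace could test the new KVM_PPC_NO_HASH flag through KVM_PPC_GET_SMMU_INFO; vm_fd and the helper name are illustrative assumptions, and the ioctl is only meaningful on a Book3S ppc host.

    /* Sketch: report whether the VM is restricted to radix MMU guests. */
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int guest_must_use_radix(int vm_fd)
    {
            struct kvm_ppc_smmu_info info;

            if (ioctl(vm_fd, KVM_PPC_GET_SMMU_INFO, &info) < 0)
                    return -1;

            /* KVM_PPC_NO_HASH: HPT guests unsupported, radix only. */
            return !!(info.flags & KVM_PPC_NO_HASH);
    }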
@@ -3676,6 +3725,34 @@ Returns: 0 on success, -1 on error
 This copies the vcpu's kvm_nested_state struct from userspace to the kernel. For
 the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
 
+4.116 KVM_(UN)REGISTER_COALESCED_MMIO
+
+Capability: KVM_CAP_COALESCED_MMIO (for coalesced mmio)
+	    KVM_CAP_COALESCED_PIO (for coalesced pio)
+Architectures: all
+Type: vm ioctl
+Parameters: struct kvm_coalesced_mmio_zone
+Returns: 0 on success, < 0 on error
+
+Coalesced I/O is a performance optimization that defers hardware
+register write emulation so that userspace exits are avoided. It is
+typically used to reduce the overhead of emulating frequently accessed
+hardware registers.
+
+When a hardware register is configured for coalesced I/O, write accesses
+do not exit to userspace and their value is recorded in a ring buffer
+that is shared between kernel and userspace.
+
+Coalesced I/O is used if one or more write accesses to a hardware
+register can be deferred until a read or a write to another hardware
+register on the same device. This last access will cause a vmexit and
+userspace will process accesses from the ring buffer before emulating
+it. That will avoid exiting to userspace on repeated writes.
+
+Coalesced pio is based on coalesced mmio. There is little difference
+between coalesced mmio and pio except that coalesced pio records accesses
+to I/O ports.
+
 5. The kvm_run structure
 ------------------------
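A hedged sketch (not part of the patch) of registering a coalesced PIO zone with the ioctl described above; the port, length, and helper name are illustrative. The same ioctl handles MMIO zones when .pio is left at 0.

    /* Sketch: coalesce writes to a device data port so they stop exiting to userspace. */
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int register_coalesced_pio(int vm_fd, unsigned short port, unsigned int len)
    {
            struct kvm_coalesced_mmio_zone zone = {
                    .addr = port,   /* I/O port number for PIO zones */
                    .size = len,    /* width of the register, in bytes */
                    .pio  = 1,      /* 1 = PIO, 0 = MMIO */
            };

            if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_COALESCED_PIO) <= 0)
                    return -1;      /* fall back to ordinary exits */

            return ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &zone);
    }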

@@ -4522,7 +4599,7 @@ hpage module parameter is not set to 1, -EINVAL is returned.
 While it is generally possible to create a huge page backed VM without
 this capability, the VM will not be able to run.
 
-7.14 KVM_CAP_MSR_PLATFORM_INFO
+7.15 KVM_CAP_MSR_PLATFORM_INFO
 
 Architectures: x86
 Parameters: args[0] whether feature should be enabled or not
@@ -4531,6 +4608,45 @@ With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
 a #GP would be raised when the guest tries to access. Currently, this
 capability does not enable write permissions of this MSR for the guest.
 
+7.16 KVM_CAP_PPC_NESTED_HV
+
+Architectures: ppc
+Parameters: none
+Returns: 0 on success, -EINVAL when the implementation doesn't support
+	 nested-HV virtualization.
+
+HV-KVM on POWER9 and later systems allows for "nested-HV"
+virtualization, which provides a way for a guest VM to run guests that
+can run using the CPU's supervisor mode (privileged non-hypervisor
+state). Enabling this capability on a VM depends on the CPU having
+the necessary functionality and on the facility being enabled with a
+kvm-hv module parameter.
+
+7.17 KVM_CAP_EXCEPTION_PAYLOAD
+
+Architectures: x86
+Parameters: args[0] whether feature should be enabled or not
+
+With this capability enabled, CR2 will not be modified prior to the
+emulated VM-exit when L1 intercepts a #PF exception that occurs in
+L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
+the emulated VM-exit when L1 intercepts a #DB exception that occurs in
+L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
+#DB) exception for L2, exception.has_payload will be set and the
+faulting address (or the new DR6 bits*) will be reported in the
+exception_payload field. Similarly, when userspace injects a #PF (or
+#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
+exception.has_payload and to put the faulting address (or the new DR6
+bits*) in the exception_payload field.
+
+This capability also enables exception.pending in struct
+kvm_vcpu_events, which allows userspace to distinguish between pending
+and injected exceptions.
+
+
+* For the new DR6 bits, note that bit 16 is set iff the #DB exception
+  will clear DR6.RTM.
+
 8. Other capabilities.
 ----------------------
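A hedged sketch of opting a VM into KVM_CAP_EXCEPTION_PAYLOAD with KVM_ENABLE_CAP, as section 7.17 above describes; the same KVM_ENABLE_CAP pattern applies to KVM_CAP_PPC_NESTED_HV on a ppc host. vm_fd and the helper name are illustrative assumptions.

    /* Sketch: turn on exception payloads for the whole VM before running vCPUs. */
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int enable_exception_payload(int vm_fd)
    {
            struct kvm_enable_cap cap;

            memset(&cap, 0, sizeof(cap));
            cap.cap = KVM_CAP_EXCEPTION_PAYLOAD;
            cap.args[0] = 1;        /* args[0]: 1 = enable, 0 = disable */

            return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }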

@@ -4772,3 +4888,10 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
 AArch64, this value will be reported in the ISS field of ESR_ELx.
 
 See KVM_CAP_VCPU_EVENTS for more details.
+8.20 KVM_CAP_HYPERV_SEND_IPI
+
+Architectures: x86
+
+This capability indicates that KVM supports paravirtualized Hyper-V IPI send
+hypercalls:
+HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
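A hedged sketch of probing for this capability before userspace (a VMM such as QEMU) advertises the corresponding Hyper-V cluster-IPI recommendation to the guest; kvm_fd and the helper name are illustrative assumptions.

    /* Sketch: returns nonzero when KVM handles the Hyper-V send-IPI hypercalls. */
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int host_supports_hyperv_send_ipi(int kvm_fd)
    {
            return ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_HYPERV_SEND_IPI) > 0;
    }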

MAINTAINERS

Lines changed: 12 additions & 0 deletions
@@ -12800,6 +12800,18 @@ W:	http://www.ibm.com/developerworks/linux/linux390/
 S:	Supported
 F:	drivers/s390/crypto/
 
+S390 VFIO AP DRIVER
+M:	Tony Krowiak <akrowiak@linux.ibm.com>
+M:	Pierre Morel <pmorel@linux.ibm.com>
+M:	Halil Pasic <pasic@linux.ibm.com>
+L:	linux-s390@vger.kernel.org
+W:	http://www.ibm.com/developerworks/linux/linux390/
+S:	Supported
+F:	drivers/s390/crypto/vfio_ap_drv.c
+F:	drivers/s390/crypto/vfio_ap_private.h
+F:	drivers/s390/crypto/vfio_ap_ops.c
+F:	Documentation/s390/vfio-ap.txt
+
 S390 ZFCP DRIVER
 M:	Steffen Maier <maier@linux.ibm.com>
 M:	Benjamin Block <bblock@linux.ibm.com>

arch/arm/include/asm/kvm_arm.h

Lines changed: 1 addition & 2 deletions
@@ -133,8 +133,7 @@
  * space.
  */
 #define KVM_PHYS_SHIFT	(40)
-#define KVM_PHYS_SIZE	(_AC(1, ULL) << KVM_PHYS_SHIFT)
-#define KVM_PHYS_MASK	(KVM_PHYS_SIZE - _AC(1, ULL))
+
 #define PTRS_PER_S2_PGD	(_AC(1, ULL) << (KVM_PHYS_SHIFT - 30))
 
 /* Virtualization Translation Control Register (VTCR) bits */

arch/arm/include/asm/kvm_host.h

Lines changed: 12 additions & 1 deletion
@@ -273,7 +273,7 @@ static inline void __cpu_init_stage2(void)
 	kvm_call_hyp(__init_stage2_translation);
 }
 
-static inline int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
+static inline int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 {
 	return 0;
 }
@@ -354,4 +354,15 @@ static inline void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu) {}
 struct kvm *kvm_arch_alloc_vm(void);
 void kvm_arch_free_vm(struct kvm *kvm);
 
+static inline int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
+{
+	/*
+	 * On 32bit ARM, VMs get a static 40bit IPA stage2 setup,
+	 * so any non-zero value used as type is illegal.
+	 */
+	if (type)
+		return -EINVAL;
+	return 0;
+}
+
 #endif /* __ARM_KVM_HOST_H__ */

arch/arm/include/asm/kvm_mmu.h

Lines changed: 10 additions & 5 deletions
@@ -35,23 +35,26 @@
 	addr;								\
 })
 
-/*
- * KVM_MMU_CACHE_MIN_PAGES is the number of stage2 page table translation levels.
- */
-#define KVM_MMU_CACHE_MIN_PAGES	2
-
 #ifndef __ASSEMBLY__
 
 #include <linux/highmem.h>
 #include <asm/cacheflush.h>
 #include <asm/cputype.h>
+#include <asm/kvm_arm.h>
 #include <asm/kvm_hyp.h>
 #include <asm/pgalloc.h>
 #include <asm/stage2_pgtable.h>
 
 /* Ensure compatibility with arm64 */
 #define VA_BITS			32
 
+#define kvm_phys_shift(kvm)		KVM_PHYS_SHIFT
+#define kvm_phys_size(kvm)		(1ULL << kvm_phys_shift(kvm))
+#define kvm_phys_mask(kvm)		(kvm_phys_size(kvm) - 1ULL)
+#define kvm_vttbr_baddr_mask(kvm)	VTTBR_BADDR_MASK
+
+#define stage2_pgd_size(kvm)		(PTRS_PER_S2_PGD * sizeof(pgd_t))
+
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **kaddr,
@@ -355,6 +358,8 @@ static inline int hyp_map_aux_data(void)
 
 #define kvm_phys_to_vttbr(addr)		(addr)
 
+static inline void kvm_set_ipa_limit(void) {}
+
 static inline bool kvm_cpu_has_cnp(void)
 {
 	return false;

arch/arm/include/asm/stage2_pgtable.h

Lines changed: 32 additions & 22 deletions
@@ -19,43 +19,53 @@
 #ifndef __ARM_S2_PGTABLE_H_
 #define __ARM_S2_PGTABLE_H_
 
-#define stage2_pgd_none(pgd)			pgd_none(pgd)
-#define stage2_pgd_clear(pgd)			pgd_clear(pgd)
-#define stage2_pgd_present(pgd)		pgd_present(pgd)
-#define stage2_pgd_populate(pgd, pud)		pgd_populate(NULL, pgd, pud)
-#define stage2_pud_offset(pgd, address)	pud_offset(pgd, address)
-#define stage2_pud_free(pud)			pud_free(NULL, pud)
-
-#define stage2_pud_none(pud)			pud_none(pud)
-#define stage2_pud_clear(pud)			pud_clear(pud)
-#define stage2_pud_present(pud)		pud_present(pud)
-#define stage2_pud_populate(pud, pmd)		pud_populate(NULL, pud, pmd)
-#define stage2_pmd_offset(pud, address)	pmd_offset(pud, address)
-#define stage2_pmd_free(pmd)			pmd_free(NULL, pmd)
-
-#define stage2_pud_huge(pud)			pud_huge(pud)
+/*
+ * kvm_mmu_cache_min_pages() is the number of pages required
+ * to install a stage-2 translation. We pre-allocate the entry
+ * level table at VM creation. Since we have a 3 level page-table,
+ * we need only two pages to add a new mapping.
+ */
+#define kvm_mmu_cache_min_pages(kvm)	2
+
+#define stage2_pgd_none(kvm, pgd)		pgd_none(pgd)
+#define stage2_pgd_clear(kvm, pgd)		pgd_clear(pgd)
+#define stage2_pgd_present(kvm, pgd)		pgd_present(pgd)
+#define stage2_pgd_populate(kvm, pgd, pud)	pgd_populate(NULL, pgd, pud)
+#define stage2_pud_offset(kvm, pgd, address)	pud_offset(pgd, address)
+#define stage2_pud_free(kvm, pud)		pud_free(NULL, pud)
+
+#define stage2_pud_none(kvm, pud)		pud_none(pud)
+#define stage2_pud_clear(kvm, pud)		pud_clear(pud)
+#define stage2_pud_present(kvm, pud)		pud_present(pud)
+#define stage2_pud_populate(kvm, pud, pmd)	pud_populate(NULL, pud, pmd)
+#define stage2_pmd_offset(kvm, pud, address)	pmd_offset(pud, address)
+#define stage2_pmd_free(kvm, pmd)		pmd_free(NULL, pmd)
+
+#define stage2_pud_huge(kvm, pud)		pud_huge(pud)
 
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
-static inline phys_addr_t stage2_pgd_addr_end(phys_addr_t addr, phys_addr_t end)
+static inline phys_addr_t
+stage2_pgd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;
 
 	return (boundary - 1 < end - 1) ? boundary : end;
 }
 
-#define stage2_pud_addr_end(addr, end)		(end)
+#define stage2_pud_addr_end(kvm, addr, end)	(end)
 
-static inline phys_addr_t stage2_pmd_addr_end(phys_addr_t addr, phys_addr_t end)
+static inline phys_addr_t
+stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t boundary = (addr + PMD_SIZE) & PMD_MASK;
 
 	return (boundary - 1 < end - 1) ? boundary : end;
 }
 
-#define stage2_pgd_index(addr)			pgd_index(addr)
+#define stage2_pgd_index(kvm, addr)		pgd_index(addr)
 
-#define stage2_pte_table_empty(ptep)		kvm_page_empty(ptep)
-#define stage2_pmd_table_empty(pmdp)		kvm_page_empty(pmdp)
-#define stage2_pud_table_empty(pudp)		false
+#define stage2_pte_table_empty(kvm, ptep)	kvm_page_empty(ptep)
+#define stage2_pmd_table_empty(kvm, pmdp)	kvm_page_empty(pmdp)
+#define stage2_pud_table_empty(kvm, pudp)	false
 
 #endif /* __ARM_S2_PGTABLE_H_ */
