Commit 63b5cf0

Merge tag 'kvm-s390-20140422' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into queue

Lazy storage key handling
-------------------------
Linux does not use the ACC and F bits of the storage key. Newer Linux
versions also do not use the storage keys for dirty and reference
tracking. We can optimize the guest handling for those guests for faults
as well as page-in and page-out by simply not caring about the guest
visible storage key. We trap guest storage key instructions to enable
those keys only on demand.

Migration bitmap
----------------
Until now s390 never provided a proper dirty bitmap. Let's provide a
proper migration bitmap for s390. We also change the user dirty tracking
to a fault based mechanism. This makes the host completely independent
from the storage keys. Long term this will allow us to back guest memory
with large pages.

per-VM device attributes
------------------------
To avoid the introduction of new ioctls, let's provide the attribute
semantic also on the VM-"device".

Userspace controlled CMMA
-------------------------
The CMMA assist is changed from "always on" to "on if requested" via
per-VM device attributes. In addition a callback to reset all usage
states is provided.

Proper guest DAT handling for intercepts
----------------------------------------
While instructions handled by SIE take care of all addressing aspects,
KVM/s390 currently does not care about guest address translation of
intercepts. This worked out fine, because
- the s390 Linux kernel has a 1:1 mapping between kernel virtual<->real
  for all pages up to memory size
- intercepts happen only for a small number of cases
- all of these intercepts happen to be in the kernel text for current
  distros

Of course we need to be better for other intercepts, kernel modules etc.
We provide the infrastructure and rework all in-kernel intercepts to work
on logical addresses (paging etc) instead of real ones. The code has
been running internally for several months now, so it is time for going
public.

GDB support
-----------
We provide breakpoints, single stepping and watchpoints.

Fixes/Cleanups
--------------
- Improve program check delivery
- Factor out the handling of transactional memory on program checks
- Use the existing define __LC_PGM_TDB
- Several cleanups in the lowcore structure
- Documentation

NOTES
-----
- All patches touching base s390 are either ACKed or written by the s390
  maintainers
- One base KVM patch "KVM: add kvm_is_error_gpa() helper"
- One patch introduces the notion of VM device attributes

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Conflicts:
	include/uapi/linux/kvm.h

2 parents 5c7411e + e325fe6 commit 63b5cf0

33 files changed, +2831 -501 lines changed

Documentation/virtual/kvm/api.txt

Lines changed: 4 additions & 4 deletions
@@ -2314,8 +2314,8 @@ struct kvm_create_device {
 
 4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
 
-Capability: KVM_CAP_DEVICE_CTRL
-Type: device ioctl
+Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
+Type: device ioctl, vm ioctl
 Parameters: struct kvm_device_attr
 Returns: 0 on success, -1 on error
 Errors:
@@ -2340,8 +2340,8 @@ struct kvm_device_attr {
 
 4.81 KVM_HAS_DEVICE_ATTR
 
-Capability: KVM_CAP_DEVICE_CTRL
-Type: device ioctl
+Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
+Type: device ioctl, vm ioctl
 Parameters: struct kvm_device_attr
 Returns: 0 on success, -1 on error
 Errors:
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+Generic vm interface
+====================================
+
+The virtual machine "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
+KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same
+struct kvm_device_attr as other devices, but targets VM-wide settings
+and controls.
+
+The groups and attributes per virtual machine, if any, are architecture
+specific.
+
+1. GROUP: KVM_S390_VM_MEM_CTRL
+Architectures: s390
+
+1.1. ATTRIBUTE: KVM_S390_VM_MEM_CTRL
+Parameters: none
+Returns: -EBUSY if a vcpu is already defined, otherwise 0
+
+Enables CMMA for the virtual machine
+
+1.2. ATTRIBUTE: KVM_S390_VM_CLR_CMMA
+Parameters: none
+Returns: 0
+
+Clear the CMMA status for all guest pages, so any pages the guest marked
+as unused are again used and may not be reclaimed by the host.
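
The two attributes documented above boil down to one ordering rule and one reset. The following is a minimal model of that behaviour, not kernel code: `struct vm_model`, `vm_enable_cmma()` and `vm_clear_cmma()` are hypothetical names, and the four-page array only stands in for per-page usage state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for per-VM state; not a kernel structure. */
enum page_usage { PAGE_USED = 0, PAGE_UNUSED = 1 };

#define MODEL_PAGES 4

struct vm_model {
	int online_vcpus;                   /* vcpus created so far */
	int use_cmma;                       /* set once CMMA was requested */
	enum page_usage usage[MODEL_PAGES]; /* guest-declared usage states */
};

/* Enabling CMMA is only allowed before the first vcpu exists. */
static int vm_enable_cmma(struct vm_model *vm)
{
	assert(vm != NULL);
	if (vm->online_vcpus > 0)
		return -EBUSY;
	vm->use_cmma = 1;
	return 0;
}

/* Clearing always succeeds: every page counts as "used" again. */
static int vm_clear_cmma(struct vm_model *vm)
{
	assert(vm != NULL);
	for (size_t i = 0; i < MODEL_PAGES; i++)
		vm->usage[i] = PAGE_USED;
	return 0;
}
```

In this model userspace has to request CMMA before creating any vcpu, which matches the documented -EBUSY return; the clear operation has no failure path, matching "Returns: 0".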

Documentation/virtual/kvm/s390-diag.txt

Lines changed: 2 additions & 0 deletions
@@ -78,3 +78,5 @@ DIAGNOSE function code 'X'501 - KVM breakpoint
 
 If the function code specifies 0x501, breakpoint functions may be performed.
 This function code is handled by userspace.
+
+This diagnose function code has no subfunctions and uses no parameters.

arch/s390/include/asm/ctl_reg.h

Lines changed: 14 additions & 0 deletions
@@ -57,6 +57,20 @@ static inline void __ctl_clear_bit(unsigned int cr, unsigned int bit)
 void smp_ctl_set_bit(int cr, int bit);
 void smp_ctl_clear_bit(int cr, int bit);
 
+union ctlreg0 {
+	unsigned long val;
+	struct {
+#ifdef CONFIG_64BIT
+		unsigned long : 32;
+#endif
+		unsigned long : 3;
+		unsigned long lap : 1; /* Low-address-protection control */
+		unsigned long : 4;
+		unsigned long edat : 1; /* Enhanced-DAT-enablement control */
+		unsigned long : 23;
+	};
+};
+
 #ifdef CONFIG_SMP
 # define ctl_set_bit(cr, bit) smp_ctl_set_bit(cr, bit)
 # define ctl_clear_bit(cr, bit) smp_ctl_clear_bit(cr, bit)
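
The bit-field layout of `union ctlreg0` only reads correctly on a big-endian target such as s390, where the first member occupies the most significant bits. A portable way to check the positions is to restate them as masks. This is a sketch under that assumption; `CR0_LAP_MASK`, `CR0_EDAT_MASK`, `cr0_lap()` and `cr0_edat()` are hypothetical names, and the 64-bit variant of the union is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* The member widths of the 64-bit layout must fill one register. */
_Static_assert(32 + 3 + 1 + 4 + 1 + 23 == 64, "ctlreg0 widths");

/*
 * Hypothetical mask names. Counting from the least significant bit:
 *   lap  sits below 32 + 3 skipped bits      -> 64 - 35 - 1 = 28
 *   edat sits below a further 1 + 4 bits     -> 64 - 40 - 1 = 23
 */
#define CR0_LAP_MASK  (UINT64_C(1) << 28)
#define CR0_EDAT_MASK (UINT64_C(1) << 23)

/* Return 1 if low-address protection is switched on in cr0. */
static int cr0_lap(uint64_t cr0)
{
	return (cr0 & CR0_LAP_MASK) != 0;
}

/* Return 1 if enhanced DAT is enabled in cr0. */
static int cr0_edat(uint64_t cr0)
{
	return (cr0 & CR0_EDAT_MASK) != 0;
}
```

The union gives the same answers through `lap` and `edat` without any shifting, which is presumably why the patch adds it; the masks above are only useful for cross-checking on a host with a different bit-field order.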

arch/s390/include/asm/kvm_host.h

Lines changed: 137 additions & 12 deletions
@@ -39,9 +39,17 @@ struct sca_entry {
 	__u64 reserved2[2];
 } __attribute__((packed));
 
+union ipte_control {
+	unsigned long val;
+	struct {
+		unsigned long k : 1;
+		unsigned long kh : 31;
+		unsigned long kg : 32;
+	};
+};
 
 struct sca_block {
-	__u64 ipte_control;
+	union ipte_control ipte_control;
 	__u64 reserved[5];
 	__u64 mcn;
 	__u64 reserved2;
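
`union ipte_control` overlays three fields on a single 64-bit word, so the word can be read or replaced as one unit while the fields stay individually nameable. The sketch below restates that layout with shifts, assuming the big-endian bit-field order used on s390 (first member in the most significant bit). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack k (1 bit), kh (31 bits) and kg (32 bits) into one word. */
static uint64_t ipte_pack(unsigned int k, uint32_t kh, uint32_t kg)
{
	assert(k <= 1);
	assert(kh < (UINT32_C(1) << 31));
	return ((uint64_t)k << 63) | ((uint64_t)kh << 32) | kg;
}

/* Extract the single k bit from the top of the word. */
static unsigned int ipte_k(uint64_t val)
{
	return (unsigned int)(val >> 63);
}

/* Extract the 31-bit kh counter below k. */
static uint32_t ipte_kh(uint64_t val)
{
	return (uint32_t)((val >> 32) & UINT32_C(0x7fffffff));
}

/* Extract the 32-bit kg counter from the low half. */
static uint32_t ipte_kg(uint64_t val)
{
	return (uint32_t)val;
}
```

For example, `ipte_pack(1, 5, 7)` yields `0x8000000500000007`, and the three accessors return 1, 5 and 7 again.
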
@@ -85,12 +93,26 @@ struct kvm_s390_sie_block {
 	__u8 reserved40[4]; /* 0x0040 */
 #define LCTL_CR0 0x8000
 #define LCTL_CR6 0x0200
+#define LCTL_CR9 0x0040
+#define LCTL_CR10 0x0020
+#define LCTL_CR11 0x0010
 #define LCTL_CR14 0x0002
 	__u16 lctl; /* 0x0044 */
 	__s16 icpua; /* 0x0046 */
-#define ICTL_LPSW 0x00400000
+#define ICTL_PINT 0x20000000
+#define ICTL_LPSW 0x00400000
+#define ICTL_STCTL 0x00040000
+#define ICTL_ISKE 0x00004000
+#define ICTL_SSKE 0x00002000
+#define ICTL_RRBE 0x00001000
 	__u32 ictl; /* 0x0048 */
 	__u32 eca; /* 0x004c */
+#define ICPT_INST 0x04
+#define ICPT_PROGI 0x08
+#define ICPT_INSTPROGI 0x0C
+#define ICPT_OPEREXC 0x2C
+#define ICPT_PARTEXEC 0x38
+#define ICPT_IOINST 0x40
 	__u8 icptcode; /* 0x0050 */
 	__u8 reserved51; /* 0x0051 */
 	__u16 ihcpu; /* 0x0052 */
@@ -109,9 +131,21 @@ struct kvm_s390_sie_block {
 	psw_t gpsw; /* 0x0090 */
 	__u64 gg14; /* 0x00a0 */
 	__u64 gg15; /* 0x00a8 */
-	__u8 reservedb0[30]; /* 0x00b0 */
-	__u16 iprcc; /* 0x00ce */
-	__u8 reservedd0[48]; /* 0x00d0 */
+	__u8 reservedb0[28]; /* 0x00b0 */
+	__u16 pgmilc; /* 0x00cc */
+	__u16 iprcc; /* 0x00ce */
+	__u32 dxc; /* 0x00d0 */
+	__u16 mcn; /* 0x00d4 */
+	__u8 perc; /* 0x00d6 */
+	__u8 peratmid; /* 0x00d7 */
+	__u64 peraddr; /* 0x00d8 */
+	__u8 eai; /* 0x00e0 */
+	__u8 peraid; /* 0x00e1 */
+	__u8 oai; /* 0x00e2 */
+	__u8 armid; /* 0x00e3 */
+	__u8 reservede4[4]; /* 0x00e4 */
+	__u64 tecmc; /* 0x00e8 */
+	__u8 reservedf0[16]; /* 0x00f0 */
 	__u64 gcr[16]; /* 0x0100 */
 	__u64 gbea; /* 0x0180 */
 	__u8 reserved188[24]; /* 0x0188 */
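
The offset comments in this hunk can be checked mechanically: the new fields must start at 0xb0 and end exactly where `gcr[]` begins at 0x100. The sketch below mirrors only the changed region in a hypothetical `struct sie_pgm_area` with fixed-width types and verifies the commented offsets at compile time; `SIE_BASE` and `SIE_OFF()` are illustrative names, not kernel macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIE_BASE 0xb0 /* offset of reservedb0 in the real block */

/* Hypothetical mirror of the fields between 0xb0 and 0x100. */
struct sie_pgm_area {
	uint8_t  reservedb0[28];
	uint16_t pgmilc;
	uint16_t iprcc;
	uint32_t dxc;
	uint16_t mcn;
	uint8_t  perc;
	uint8_t  peratmid;
	uint64_t peraddr;
	uint8_t  eai;
	uint8_t  peraid;
	uint8_t  oai;
	uint8_t  armid;
	uint8_t  reservede4[4];
	uint64_t tecmc;
	uint8_t  reservedf0[16];
};

/* Offset of a member as it would appear in the full SIE block. */
#define SIE_OFF(member) (SIE_BASE + offsetof(struct sie_pgm_area, member))

_Static_assert(SIE_OFF(pgmilc) == 0xcc, "pgmilc");
_Static_assert(SIE_OFF(iprcc) == 0xce, "iprcc keeps its old offset");
_Static_assert(SIE_OFF(dxc) == 0xd0, "dxc");
_Static_assert(SIE_OFF(peraddr) == 0xd8, "peraddr");
_Static_assert(SIE_OFF(armid) == 0xe3, "armid");
_Static_assert(SIE_OFF(tecmc) == 0xe8, "tecmc");
_Static_assert(sizeof(struct sie_pgm_area) == 0x100 - SIE_BASE,
	       "area must end where gcr[] begins");
```

Note that `iprcc` stays at 0x00ce: the patch carves named fields out of the two former reserved arrays without moving anything that already existed.
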
@@ -146,6 +180,8 @@ struct kvm_vcpu_stat {
 	u32 exit_instruction;
 	u32 instruction_lctl;
 	u32 instruction_lctlg;
+	u32 instruction_stctl;
+	u32 instruction_stctg;
 	u32 exit_program_interruption;
 	u32 exit_instr_and_program;
 	u32 deliver_external_call;
@@ -164,6 +200,7 @@ struct kvm_vcpu_stat {
 	u32 instruction_stpx;
 	u32 instruction_stap;
 	u32 instruction_storage_key;
+	u32 instruction_ipte_interlock;
 	u32 instruction_stsch;
 	u32 instruction_chsc;
 	u32 instruction_stsi;
@@ -183,13 +220,58 @@ struct kvm_vcpu_stat {
 	u32 diagnose_9c;
 };
 
-#define PGM_OPERATION 0x01
-#define PGM_PRIVILEGED_OP 0x02
-#define PGM_EXECUTE 0x03
-#define PGM_PROTECTION 0x04
-#define PGM_ADDRESSING 0x05
-#define PGM_SPECIFICATION 0x06
-#define PGM_DATA 0x07
+#define PGM_OPERATION 0x01
+#define PGM_PRIVILEGED_OP 0x02
+#define PGM_EXECUTE 0x03
+#define PGM_PROTECTION 0x04
+#define PGM_ADDRESSING 0x05
+#define PGM_SPECIFICATION 0x06
+#define PGM_DATA 0x07
+#define PGM_FIXED_POINT_OVERFLOW 0x08
+#define PGM_FIXED_POINT_DIVIDE 0x09
+#define PGM_DECIMAL_OVERFLOW 0x0a
+#define PGM_DECIMAL_DIVIDE 0x0b
+#define PGM_HFP_EXPONENT_OVERFLOW 0x0c
+#define PGM_HFP_EXPONENT_UNDERFLOW 0x0d
+#define PGM_HFP_SIGNIFICANCE 0x0e
+#define PGM_HFP_DIVIDE 0x0f
+#define PGM_SEGMENT_TRANSLATION 0x10
+#define PGM_PAGE_TRANSLATION 0x11
+#define PGM_TRANSLATION_SPEC 0x12
+#define PGM_SPECIAL_OPERATION 0x13
+#define PGM_OPERAND 0x15
+#define PGM_TRACE_TABEL 0x16
+#define PGM_SPACE_SWITCH 0x1c
+#define PGM_HFP_SQUARE_ROOT 0x1d
+#define PGM_PC_TRANSLATION_SPEC 0x1f
+#define PGM_AFX_TRANSLATION 0x20
+#define PGM_ASX_TRANSLATION 0x21
+#define PGM_LX_TRANSLATION 0x22
+#define PGM_EX_TRANSLATION 0x23
+#define PGM_PRIMARY_AUTHORITY 0x24
+#define PGM_SECONDARY_AUTHORITY 0x25
+#define PGM_LFX_TRANSLATION 0x26
+#define PGM_LSX_TRANSLATION 0x27
+#define PGM_ALET_SPECIFICATION 0x28
+#define PGM_ALEN_TRANSLATION 0x29
+#define PGM_ALE_SEQUENCE 0x2a
+#define PGM_ASTE_VALIDITY 0x2b
+#define PGM_ASTE_SEQUENCE 0x2c
+#define PGM_EXTENDED_AUTHORITY 0x2d
+#define PGM_LSTE_SEQUENCE 0x2e
+#define PGM_ASTE_INSTANCE 0x2f
+#define PGM_STACK_FULL 0x30
+#define PGM_STACK_EMPTY 0x31
+#define PGM_STACK_SPECIFICATION 0x32
+#define PGM_STACK_TYPE 0x33
+#define PGM_STACK_OPERATION 0x34
+#define PGM_ASCE_TYPE 0x38
+#define PGM_REGION_FIRST_TRANS 0x39
+#define PGM_REGION_SECOND_TRANS 0x3a
+#define PGM_REGION_THIRD_TRANS 0x3b
+#define PGM_MONITOR 0x40
+#define PGM_PER 0x80
+#define PGM_CRYPTO_OPERATION 0x119
 
 struct kvm_s390_interrupt_info {
 	struct list_head list;
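
Most values in the PGM_* list are plain enumerated codes, but `PGM_PER` (0x80) is a single bit. The sketch below assumes, as on z/Architecture, that a PER event can be reported together with another program interruption by setting that bit in the same code, so a consumer has to strip it before comparing. `pgm_base_code()` and `pgm_has_per()` are hypothetical helpers, and only three of the constants are repeated here.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the hunk above. */
#define PGM_ADDRESSING       0x05
#define PGM_PAGE_TRANSLATION 0x11
#define PGM_PER              0x80

/* Strip the PER bit so the rest can be compared against PGM_* codes. */
static uint16_t pgm_base_code(uint16_t code)
{
	return (uint16_t)(code & (uint16_t)~PGM_PER);
}

/* Return 1 if the code also reports a PER event. */
static int pgm_has_per(uint16_t code)
{
	return (code & PGM_PER) != 0;
}
```

With these helpers a code of 0x91 decodes as a page translation exception with a concurrent PER event, while a bare 0x80 decodes as a PER event with no other condition.
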
@@ -229,6 +311,45 @@ struct kvm_s390_float_interrupt {
 	unsigned int irq_count;
 };
 
+struct kvm_hw_wp_info_arch {
+	unsigned long addr;
+	unsigned long phys_addr;
+	int len;
+	char *old_data;
+};
+
+struct kvm_hw_bp_info_arch {
+	unsigned long addr;
+	int len;
+};
+
+/*
+ * Only the upper 16 bits of kvm_guest_debug->control are arch specific.
+ * Further KVM_GUESTDBG flags which can be used from userspace can be found in
+ * arch/s390/include/uapi/asm/kvm.h
+ */
+#define KVM_GUESTDBG_EXIT_PENDING 0x10000000
+
+#define guestdbg_enabled(vcpu) \
+		(vcpu->guest_debug & KVM_GUESTDBG_ENABLE)
+#define guestdbg_sstep_enabled(vcpu) \
+		(vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
+#define guestdbg_hw_bp_enabled(vcpu) \
+		(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
+#define guestdbg_exit_pending(vcpu) (guestdbg_enabled(vcpu) && \
+		(vcpu->guest_debug & KVM_GUESTDBG_EXIT_PENDING))
+
+struct kvm_guestdbg_info_arch {
+	unsigned long cr0;
+	unsigned long cr9;
+	unsigned long cr10;
+	unsigned long cr11;
+	struct kvm_hw_bp_info_arch *hw_bp_info;
+	struct kvm_hw_wp_info_arch *hw_wp_info;
+	int nr_hw_bp;
+	int nr_hw_wp;
+	unsigned long last_bp;
+};
 
 struct kvm_vcpu_arch {
 	struct kvm_s390_sie_block *sie_block;
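
The `guestdbg_*` macros are plain flag tests, with one twist: `guestdbg_exit_pending()` is only true while debugging is enabled at all. The sketch below reproduces that behaviour on a hypothetical `struct vcpu_model`. The values of `KVM_GUESTDBG_ENABLE`, `KVM_GUESTDBG_SINGLESTEP` and `KVM_GUESTDBG_USE_HW_BP` are not part of this hunk and are assumptions here; the macro argument is parenthesised in this version, which the original does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; these flags are defined outside the hunk above. */
#define KVM_GUESTDBG_ENABLE       0x00000001
#define KVM_GUESTDBG_SINGLESTEP   0x00000002
#define KVM_GUESTDBG_USE_HW_BP    0x00010000
/* Defined in the hunk above. */
#define KVM_GUESTDBG_EXIT_PENDING 0x10000000

/* Hypothetical stand-in for struct kvm_vcpu: only the field we read. */
struct vcpu_model {
	uint32_t guest_debug;
};

#define guestdbg_enabled(vcpu) \
	((vcpu)->guest_debug & KVM_GUESTDBG_ENABLE)
#define guestdbg_sstep_enabled(vcpu) \
	((vcpu)->guest_debug & KVM_GUESTDBG_SINGLESTEP)
#define guestdbg_exit_pending(vcpu) \
	(guestdbg_enabled(vcpu) && \
	 ((vcpu)->guest_debug & KVM_GUESTDBG_EXIT_PENDING))

/* Return 1 when the run loop should hand control to the debugger. */
static int should_exit_to_debugger(const struct vcpu_model *vcpu)
{
	assert(vcpu != NULL);
	return guestdbg_exit_pending(vcpu) ? 1 : 0;
}
```

A vcpu with only `KVM_GUESTDBG_EXIT_PENDING` set does not exit to the debugger in this model, because the enable flag gates the pending flag.
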
@@ -238,11 +359,13 @@ struct kvm_vcpu_arch {
 	struct kvm_s390_local_interrupt local_int;
 	struct hrtimer ckc_timer;
 	struct tasklet_struct tasklet;
+	struct kvm_s390_pgm_info pgm;
 	union {
 		struct cpuid cpu_id;
 		u64 stidp_data;
 	};
 	struct gmap *gmap;
+	struct kvm_guestdbg_info_arch guestdbg;
 #define KVM_S390_PFAULT_TOKEN_INVALID (-1UL)
 	unsigned long pfault_token;
 	unsigned long pfault_select;
@@ -285,7 +408,9 @@ struct kvm_arch{
 	struct gmap *gmap;
 	int css_support;
 	int use_irqchip;
+	int use_cmma;
 	struct s390_io_adapter *adapters[MAX_S390_IO_ADAPTERS];
+	wait_queue_head_t ipte_wq;
 };
 
 #define KVM_HVA_ERR_BAD (-1UL)

arch/s390/include/asm/lowcore.h

Lines changed: 6 additions & 4 deletions
@@ -56,13 +56,14 @@ struct _lowcore {
 	__u16 pgm_code; /* 0x008e */
 	__u32 trans_exc_code; /* 0x0090 */
 	__u16 mon_class_num; /* 0x0094 */
-	__u16 per_perc_atmid; /* 0x0096 */
+	__u8 per_code; /* 0x0096 */
+	__u8 per_atmid; /* 0x0097 */
 	__u32 per_address; /* 0x0098 */
 	__u32 monitor_code; /* 0x009c */
 	__u8 exc_access_id; /* 0x00a0 */
 	__u8 per_access_id; /* 0x00a1 */
 	__u8 op_access_id; /* 0x00a2 */
-	__u8 ar_access_id; /* 0x00a3 */
+	__u8 ar_mode_id; /* 0x00a3 */
 	__u8 pad_0x00a4[0x00b8-0x00a4]; /* 0x00a4 */
 	__u16 subchannel_id; /* 0x00b8 */
 	__u16 subchannel_nr; /* 0x00ba */
@@ -196,12 +197,13 @@ struct _lowcore {
 	__u16 pgm_code; /* 0x008e */
 	__u32 data_exc_code; /* 0x0090 */
 	__u16 mon_class_num; /* 0x0094 */
-	__u16 per_perc_atmid; /* 0x0096 */
+	__u8 per_code; /* 0x0096 */
+	__u8 per_atmid; /* 0x0097 */
 	__u64 per_address; /* 0x0098 */
 	__u8 exc_access_id; /* 0x00a0 */
 	__u8 per_access_id; /* 0x00a1 */
 	__u8 op_access_id; /* 0x00a2 */
-	__u8 ar_access_id; /* 0x00a3 */
+	__u8 ar_mode_id; /* 0x00a3 */
 	__u8 pad_0x00a4[0x00a8-0x00a4]; /* 0x00a4 */
 	__u64 trans_exc_code; /* 0x00a8 */
 	__u64 monitor_code; /* 0x00b0 */
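
Splitting `per_perc_atmid` into two bytes does not change the memory layout: on big-endian s390 the byte at 0x0096 is the high byte of the old halfword and the byte at 0x0097 is the low byte. The sketch below shows that correspondence in portable C; `struct per_fields` and `split_perc_atmid()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair mirroring the two new lowcore members. */
struct per_fields {
	uint8_t per_code;  /* byte at 0x0096: high byte of the halfword */
	uint8_t per_atmid; /* byte at 0x0097: low byte of the halfword */
};

/* Split a value read through the old 16-bit member. */
static struct per_fields split_perc_atmid(uint16_t per_perc_atmid)
{
	struct per_fields f = {
		.per_code  = (uint8_t)(per_perc_atmid >> 8),
		.per_atmid = (uint8_t)(per_perc_atmid & 0xff),
	};
	return f;
}
```

Code that previously shifted and masked the combined halfword can read the two members directly after this change.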

arch/s390/include/asm/mmu.h

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@ typedef struct {
 	unsigned long vdso_base;
 	/* The mmu context has extended page tables. */
 	unsigned int has_pgste:1;
+	/* The mmu context uses storage keys. */
+	unsigned int use_skey:1;
 } mm_context_t;
 
 #define INIT_MM_CONTEXT(name) \

arch/s390/include/asm/mmu_context.h

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ static inline int init_new_context(struct task_struct *tsk,
 	mm->context.asce_bits |= _ASCE_TYPE_REGION3;
 #endif
 	mm->context.has_pgste = 0;
+	mm->context.use_skey = 0;
 	mm->context.asce_limit = STACK_TOP_MAX;
 	crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
 	return 0;
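
The new `use_skey` flag starts at 0 and, per the commit message, storage keys are enabled only when the guest first executes a storage key instruction. The sketch below models that on-demand switch. It is an assumption about how the pieces fit together, not the code of this merge: `struct mm_model`, `struct vcpu_model`, `vcpu_setup()`, `skey_intercept()` and `ICTL_SKEY_ALL` are hypothetical, while the three ICTL_* values are copied from the kvm_host.h hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interception controls copied from the kvm_host.h hunk. */
#define ICTL_ISKE 0x00004000
#define ICTL_SSKE 0x00002000
#define ICTL_RRBE 0x00001000
/* Hypothetical convenience mask for all three instructions. */
#define ICTL_SKEY_ALL (ICTL_ISKE | ICTL_SSKE | ICTL_RRBE)

/* Hypothetical stand-ins for mm_context_t and the vcpu. */
struct mm_model {
	unsigned int use_skey:1;
};

struct vcpu_model {
	uint32_t ictl;
	struct mm_model *mm;
};

/* A new guest starts without storage keys and traps key instructions. */
static void vcpu_setup(struct vcpu_model *vcpu, struct mm_model *mm)
{
	assert(vcpu != NULL && mm != NULL);
	mm->use_skey = 0;
	vcpu->mm = mm;
	vcpu->ictl = ICTL_SKEY_ALL;
}

/*
 * Called when a storage key instruction is intercepted.
 * Returns 1 if this call enabled storage keys, 0 if they were on already.
 */
static int skey_intercept(struct vcpu_model *vcpu)
{
	assert(vcpu != NULL && vcpu->mm != NULL);
	if (!(vcpu->ictl & ICTL_SKEY_ALL))
		return 0;
	vcpu->mm->use_skey = 1;
	vcpu->ictl &= ~(uint32_t)ICTL_SKEY_ALL; /* stop trapping from now on */
	return 1;
}
```

Guests that never touch storage keys pay nothing in this model, and a guest that does use them pays for one intercept before running at full speed again.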

arch/s390/include/asm/pgalloc.h

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ unsigned long *page_table_alloc(struct mm_struct *, unsigned long);
 void page_table_free(struct mm_struct *, unsigned long *);
 void page_table_free_rcu(struct mmu_gather *, unsigned long *);
 
-void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long);
+void page_table_reset_pgste(struct mm_struct *, unsigned long, unsigned long,
+			    bool init_skey);
 int set_guest_storage_key(struct mm_struct *mm, unsigned long addr,
 			  unsigned long key, bool nq);
 