APICV
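On top of the hybrid qemu/kernel emulation, hardware introduced APICv to further optimize interrupt emulation.
The hybrid emulation flow in the previous section already touched some apicv code, but we did not look closely at what it does. Now let's pull it out and examine it carefully.
Hardware background
Since this is a hardware-assist feature, it makes sense to first learn what the hardware provides. Most of this material is in SDM Chapter 29, APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS.
Relevant VM-execution controls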
24.6.8 Controls for APIC Virtualization
APIC-access address (64 bits). This field contains the physical address of the 4-KByte APIC-access page
Virtual-APIC address (64 bits). This field contains the physical address of the 4-KByte virtual-APIC page
Certain VM-execution controls enable the processor to virtualize certain accesses to the APIC-access page without a VM exit. In general, this virtualization causes these accesses to be made to the virtual-APIC page instead of the APIC-access page.
TPR threshold (32 bits). Bits 3:0 of this field determine the threshold below which bits 7:4 of VTPR (see Section 29.1.1) cannot fall.
EOI-exit bitmap (4 fields; 64 bits each). These fields are supported only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. They are used to determine which virtualized writes to the APIC’s EOI register cause VM exits.
Posted-interrupt notification vector (16 bits). Its low 8 bits contain the interrupt vector that is used to notify a logical processor that virtual interrupts have been posted. See Section 29.6 for more information on the use of this field.
Posted-interrupt descriptor address (64 bits).
24.4.2 Guest Non-Register State
Guest interrupt status (16 bits) This field is supported only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control.
Requesting virtual interrupt (RVI)(low byte)
Servicing virtual interrupt (SVI)(high byte)
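The layout of the guest interrupt status field can be sketched in C as follows. The helper names (`gis_rvi`, `gis_svi`, `gis_set_rvi`) are made up for illustration and do not come from the kernel; only the split into a low RVI byte and a high SVI byte comes from the SDM text above.

```c
#include <stdint.h>

/* Guest interrupt status is a 16-bit field:
 *   bits 7:0  = RVI (requesting virtual interrupt)
 *   bits 15:8 = SVI (servicing virtual interrupt)
 * Helper names are hypothetical. */
static inline uint8_t gis_rvi(uint16_t status)
{
    return (uint8_t)(status & 0xffu);
}

static inline uint8_t gis_svi(uint16_t status)
{
    return (uint8_t)(status >> 8);
}

/* Replace RVI while leaving SVI untouched. */
static inline uint16_t gis_set_rvi(uint16_t status, uint8_t rvi)
{
    return (uint16_t)((status & 0xff00u) | rvi);
}
```

For example, a status value of 0x2031 means vector 0x31 is requesting service while vector 0x20 is being serviced.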
29 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS
The following are the VM-execution controls relevant to APIC virtualization and virtual interrupts (see Section 24.6 for information about the locations of these controls):
Virtual-interrupt delivery. This control enables the evaluation and delivery of pending virtual interrupts (Section 29.2). It also enables the emulation of writes (memory-mapped or MSR-based, as enabled) to the APIC registers that control interrupt prioritization.
Use TPR shadow. This control enables emulation of accesses to the APIC’s task-priority register (TPR) via CR8 (Section 29.3) and, if enabled, via the memory-mapped or MSR-based interfaces.
Virtualize APIC accesses. This control enables virtualization of memory-mapped accesses to the APIC (Section 29.4) by causing VM exits on accesses to a VMM-specified APIC-access page. Some of the other controls, if set, may cause some of these accesses to be emulated rather than causing VM exits.
Virtualize x2APIC mode. This control enables virtualization of MSR-based accesses to the APIC (Section 29.5).
APIC-register virtualization. This control allows memory-mapped and MSR-based reads of most APIC registers (as enabled) by satisfying them from the virtual-APIC page. It directs memory-mapped writes to the APIC-access page to the virtual-APIC page, following them by VM exits for VMM emulation.
Process posted interrupts. This control allows software to post virtual interrupts in a data structure and send a notification to another logical processor; upon receipt of the notification, the target processor will process the posted interrupts by copying them into the virtual-APIC page (Section 29.6).
Virtual APIC Page
29.1.1 Virtualized APIC Registers
Virtual task-priority register (VTPR): the 32-bit field located at offset 080H on the virtual-APIC page.
Virtual processor-priority register (VPPR): the 32-bit field located at offset 0A0H on the virtual-APIC page.
Virtual end-of-interrupt register (VEOI): the 32-bit field located at offset 0B0H on the virtual-APIC page.
Virtual interrupt-service register (VISR)
Virtual interrupt-request register (VIRR)
Virtual interrupt-command register (VICR_LO): the 32-bit field located at offset 300H on the virtual-APIC page
Virtual interrupt-command register (VICR_HI): the 32-bit field located at offset 310H on the virtual-APIC page.
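As a rough sketch, the register offsets can be written down as C constants together with helpers that locate one vector's bit in a 256-bit register. The names are hypothetical. The VISR and VIRR base offsets (100H and 200H, eight 32-bit fields spaced 10H apart) are not in the excerpt above; they follow the layout given in SDM Section 29.1.1, so treat them as an assumption to verify against your copy of the manual.

```c
#include <stdint.h>
#include <string.h>

#define VAPIC_PAGE_SIZE 4096u
#define VAPIC_VTPR      0x080u
#define VAPIC_VPPR      0x0A0u
#define VAPIC_VEOI      0x0B0u
#define VAPIC_VISR      0x100u /* assumed: fields at 0x100, 0x110, ..., 0x170 */
#define VAPIC_VIRR      0x200u /* assumed: fields at 0x200, 0x210, ..., 0x270 */
#define VAPIC_VICR_LO   0x300u
#define VAPIC_VICR_HI   0x310u

/* Read or write one 32-bit field of the page (memcpy avoids alignment issues). */
static inline uint32_t vapic_read(const uint8_t *page, uint32_t off)
{
    uint32_t v;
    memcpy(&v, page + off, sizeof(v));
    return v;
}

static inline void vapic_write(uint8_t *page, uint32_t off, uint32_t v)
{
    memcpy(page + off, &v, sizeof(v));
}

/* Offset of the 32-bit field that holds the bit for `vector`. */
static inline uint32_t vapic_vec_offset(uint32_t base, uint8_t vector)
{
    return base | ((uint32_t)(vector & 0xE0u) >> 1);
}

static inline int vapic_test_vector(const uint8_t *page, uint32_t base, uint8_t vector)
{
    uint32_t field = vapic_read(page, vapic_vec_offset(base, vector));
    return (field >> (vector & 0x1Fu)) & 1u;
}

static inline void vapic_set_vector(uint8_t *page, uint32_t base, uint8_t vector)
{
    uint32_t off = vapic_vec_offset(base, vector);
    vapic_write(page, off, vapic_read(page, off) | (1u << (vector & 0x1Fu)));
}
```

With this layout, vector 0x31 in VIRR lives at bit 17 of the field at offset 0x210.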
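Diagram of the relevant registers
Virtual APIC state
29.1 VIRTUAL APIC STATE
Hardware virtualization uses a 4-KByte page of memory, the virtual-APIC page, to emulate accesses to APIC registers and to manage virtual interrupts. Its physical address is given by the virtual-APIC address field listed above.
If I understand correctly, this page is set up in the vmx_vcpu_reset function.
Virtual interrupts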
29.2 EVALUATION AND DELIVERY OF VIRTUAL INTERRUPTS
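Virtual-interrupt handling has two parts: evaluation and delivery.
Certain operations on the virtual-APIC page trigger an evaluation of pending virtual interrupts. If the evaluation recognizes a virtual interrupt, it is delivered to the guest without causing a VM exit.
Evaluation of pending virtual interrupts
The following events trigger the evaluation: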
VM entry(Section 26.3.2.5)
TPR virtualization(Section 29.1.2)
EOI virtualization(Section 29.1.4)
self-IPI virtualization(Section 29.1.5)
posted-interrupt processing(Section 29.6)
The SDM gives pseudocode for this evaluation in Section 29.2.1.
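As I read SDM Section 29.2.1, a pending virtual interrupt is recognized when "interrupt-window exiting" is 0 and the priority class of RVI (bits 7:4) is higher than that of VPPR. Here is a minimal C sketch of that rule; the function name and parameters are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the evaluation rule: recognize a pending virtual interrupt only if
 * interrupt-window exiting is off and RVI[7:4] > VPPR[7:4].
 * The function name is hypothetical. */
static inline bool pending_virtual_interrupt(bool interrupt_window_exiting,
                                             uint8_t rvi, uint32_t vppr)
{
    uint8_t rvi_class  = rvi >> 4;
    uint8_t vppr_class = (uint8_t)((vppr >> 4) & 0xFu);

    return !interrupt_window_exiting && rvi_class > vppr_class;
}
```

Note that the comparison uses only the upper nibble: RVI 0x31 is not recognized while VPPR is 0x30, because both fall in priority class 3.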
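Virtual-interrupt delivery
Virtual-interrupt delivery updates the guest interrupt status (RVI/SVI) and delivers an interrupt in VMX non-root operation, so it does not cause a VM exit.
The SDM gives pseudocode for delivery in Section 29.2.2.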
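The delivery steps, as I understand SDM Section 29.2.2, can be modeled in plain C. The struct and function names below are hypothetical, and the model keeps VIRR and VISR as eight 32-bit words rather than as fields of a real virtual-APIC page.

```c
#include <stdint.h>

/* Hypothetical software model of the state touched by delivery. */
struct vapic_model {
    uint32_t virr[8];   /* 256-bit VIRR */
    uint32_t visr[8];   /* 256-bit VISR */
    uint32_t vppr;
    uint8_t  rvi;
    uint8_t  svi;
};

/* Index of the highest set bit in a 256-bit register, or -1 if none is set. */
static int highest_set_bit(const uint32_t reg[8])
{
    for (int word = 7; word >= 0; word--)
        for (int bit = 31; bit >= 0; bit--)
            if (reg[word] & (1u << bit))
                return word * 32 + bit;
    return -1;
}

/* Deliver the interrupt named by RVI; returns the vector that would be
 * delivered through the guest IDT. */
static uint8_t deliver_virtual_interrupt(struct vapic_model *s)
{
    uint8_t vector = s->rvi;

    s->visr[vector / 32] |= 1u << (vector % 32);    /* VISR[Vector] := 1 */
    s->svi  = vector;                               /* SVI := Vector */
    s->vppr = vector & 0xF0u;                       /* VPPR := Vector & F0H */
    s->virr[vector / 32] &= ~(1u << (vector % 32)); /* VIRR[Vector] := 0 */

    int next = highest_set_bit(s->virr);            /* next highest request */
    s->rvi = next >= 0 ? (uint8_t)next : 0;
    return vector;
}
```

After one delivery, RVI already points at the next highest vector still pending in VIRR, so the following evaluation can be done without any VM exit.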
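Having seen what the hardware provides, let's look at how software makes use of it.
Software side
Detecting APICv support
The kvm code first has to detect whether the hardware supports apicv and record the result.
The detection happens when the kvm module is loaded, and the result is recorded in enable_apicv.
So what does cpu_has_vmx_apicv actually do?
Its checks line up with the hardware features listed above.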
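In the kernel versions I have looked at, cpu_has_vmx_apicv amounts to requiring three controls at once: APIC-register virtualization, virtual-interrupt delivery, and process posted interrupts. The stand-alone sketch below models that conjunction; `struct vmx_caps` and `has_apicv` are hypothetical names, and the exact kernel code varies between versions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Control bit positions as defined by the SDM. */
#define PIN_BASED_POSTED_INTR                0x00000080u /* pin-based, bit 7 */
#define SECONDARY_EXEC_APIC_REGISTER_VIRT    0x00000100u /* secondary, bit 8 */
#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY 0x00000200u /* secondary, bit 9 */

/* Hypothetical container for the controls the CPU allows to be set to 1. */
struct vmx_caps {
    uint32_t pin_based;
    uint32_t secondary;
};

/* APICv is usable only when all three features are available. */
static inline bool has_apicv(const struct vmx_caps *c)
{
    return (c->secondary & SECONDARY_EXEC_APIC_REGISTER_VIRT) &&
           (c->secondary & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) &&
           (c->pin_based & PIN_BASED_POSTED_INTR);
}
```

If any one of the three is missing, the module would fall back to the non-APICv path by clearing enable_apicv.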
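Where it is used
With the hardware features detected, the next question is where they are used. We do not use the enable_apicv value directly, of course; it is assigned to the vcpu instead.
As you have probably guessed, the next step is to find where apicv_active is tested: those are the places where the apicv feature is actually used.
Posted Interrupt
Posted Interrupt is an important enough feature that we study it separately.
Concept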
Posted-interrupt processing is a feature by which a processor processes the virtual interrupts by recording them as pending on the virtual-APIC page.
If the “external-interrupt exiting” VM-execution control is 1, any unmasked external interrupt causes a VM exit (see Section 25.2). If the “process posted interrupts” VM-execution control is also 1, this behavior is changed and the processor handles an external interrupt as follows.
The local APIC is acknowledged; this provides the processor core with an interrupt vector, called here the physical vector.
If the physical vector equals the posted-interrupt notification vector, the logical processor continues to the next step. Otherwise, a VM exit occurs as it would normally due to an external interrupt; the vector is saved in the VM-exit interruption-information field. (Processing continues only if the physical vector equals the posted-interrupt notification vector.)
The processor clears the outstanding-notification bit in the posted-interrupt descriptor. This is done atomically so as to leave the remainder of the descriptor unmodified (e.g., with a locked AND operation). (Clear the ON bit.)
The processor writes zero to the EOI register in the local APIC; this dismisses the interrupt with the posted-interrupt notification vector from the local APIC. (Write EOI.)
The logical processor performs a logical-OR of PIR into VIRR and clears PIR. No other agent can read or write a PIR bit (or group of bits) between the time it is read (to determine what to OR into VIRR) and when it is cleared. (PIR -> VIRR.)
The logical processor sets RVI to be the maximum of the old value of RVI and the highest index of all bits that were set in PIR; if no bit was set in PIR, RVI is left unmodified. (Compute RVI.)
The logical processor evaluates pending virtual interrupts as described in Section 29.2.1.
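Put simply: previously a running virtual machine had to exit in order to handle an interrupt. Now it does not exit; the host uses a special interrupt to inject the real interrupt into the virtual machine. The vector of the interrupt actually injected into the guest is recorded in the PIR (Posted Interrupt Requests).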
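The processor-side steps quoted from the SDM can be modeled in C. This is a single-threaded sketch: real hardware performs the ON clear and the PIR-to-VIRR transfer atomically, and the EOI write to the local APIC is not modeled. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical models of the posted-interrupt descriptor and the vCPU state. */
struct pi_desc_model {
    uint32_t pir[8];    /* 256-bit posted-interrupt requests */
    bool     on;        /* outstanding-notification bit */
};

struct vcpu_model {
    uint32_t virr[8];
    uint8_t  rvi;
};

/* Index of the highest set bit in a word, or -1 if the word is zero. */
static int top_bit(uint32_t w)
{
    int idx = -1;
    while (w) {
        idx++;
        w >>= 1;
    }
    return idx;
}

/* Returns true if the interrupt was consumed as a posted-interrupt
 * notification, false if it would cause a normal VM exit instead. */
static bool process_posted_interrupt(uint8_t physical_vector,
                                     uint8_t notification_vector,
                                     struct pi_desc_model *pi,
                                     struct vcpu_model *v)
{
    if (physical_vector != notification_vector)
        return false;                   /* ordinary external interrupt */

    pi->on = false;                     /* clear ON */

    int highest = -1;
    for (int i = 0; i < 8; i++) {       /* OR PIR into VIRR, then clear PIR */
        uint32_t bits = pi->pir[i];
        pi->pir[i] = 0;
        v->virr[i] |= bits;
        if (bits)
            highest = i * 32 + top_bit(bits);
    }

    if (highest > v->rvi)               /* RVI := max(old RVI, highest PIR bit) */
        v->rvi = (uint8_t)highest;
    return true;                        /* evaluation of pending interrupts follows */
}
```

The last step, evaluating pending virtual interrupts, then decides whether the guest takes the interrupt immediately.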
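Code for software-generated interrupts
We came across this in the earlier code analysis but did not dwell on it. It appears in the kernel part of the interrupt-sending path.
For the APIC_DM_FIXED delivery mode, if apicv_active is true, the posted-interrupt path is used.
What the delivery function does is spelled out clearly in its own comment. The point I want to stress is that pi_test_and_set_pir writes the real interrupt vector into the PIR.
Digging further into kvm_vcpu_trigger_posted_interrupt, what happens? It sends an interrupt with vector POSTED_INTR_VECTOR to the physical CPU on which the target vcpu is running. That interrupt is what is called the Posted Interrupt.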
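The sender side can be sketched with C11 atomics. This is a user-space model based on my recollection of the kernel logic: set the vector's bit in PIR, set ON, and only then either send the notification IPI or kick the vcpu. The type and function names are hypothetical, and the returned action stands in for the real IPI or kick.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the descriptor as seen by the sender. */
struct pi_desc_atomic {
    _Atomic uint32_t pir[8];    /* posted-interrupt requests */
    atomic_bool      on;        /* outstanding-notification bit */
};

enum pi_action {
    PI_NOTHING,     /* already pending or already notified */
    PI_SEND_IPI,    /* vcpu in guest mode: send the notification vector */
    PI_KICK         /* vcpu not in guest mode: fall back to a kick */
};

static enum pi_action post_interrupt(struct pi_desc_atomic *pi, uint8_t vector,
                                     bool vcpu_in_guest_mode)
{
    uint32_t mask = 1u << (vector % 32);

    /* Test-and-set the vector in PIR; if it was already set, nothing to do. */
    uint32_t old = atomic_fetch_or(&pi->pir[vector / 32], mask);
    if (old & mask)
        return PI_NOTHING;

    /* Test-and-set ON; if a notification is already outstanding, the target
     * will pick up the new PIR bit when it processes that notification. */
    if (atomic_exchange(&pi->on, true))
        return PI_NOTHING;

    return vcpu_in_guest_mode ? PI_SEND_IPI : PI_KICK;
}
```

The two early returns are what keep the number of notification IPIs low: several vectors posted in quick succession share one notification.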
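Code for hardware interrupts
From the code path above, the posted interrupt is only used to deliver the interrupt when the vcpu is in guest mode. Otherwise the path still goes through kvm_vcpu_kick.
For a hardware interrupt, however, the interrupt goes straight into the interrupt handler when it fires, without any check of the vcpu state. How is that case handled?
As far as I can see for now, when the vcpu is in the blocked state the notification vector is switched, so a different interrupt handler responds to the event.
I do not fully understand what that handler does yet and will study it properly later.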