APICV

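Building on the hybrid QEMU/kernel emulation, hardware vendors introduced APICv to further optimize interrupt emulation.

The hybrid-emulation flow in the previous section already touched APICv code, but we did not look closely at what it does. Here we pull it out and examine it in detail.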

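Hardware background

Since APICv is a hardware-assist feature, it makes sense to first learn what the hardware provides. Most of this material is in SDM Chapter 29, APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS.

Relevant VM-execution controls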

24.6.8 Controls for APIC Virtualization

  • APIC-access address (64 bits). This field contains the physical address of the 4-KByte APIC-access page

  • Virtual-APIC address (64 bits). This field contains the physical address of the 4-KByte virtual-APIC page

Certain VM-execution controls enable the processor to virtualize certain accesses to the APIC-access page without a VM exit. In general, this virtualization causes these accesses to be made to the virtual-APIC page instead of the APIC-access page.

  • TPR threshold (32 bits). Bits 3:0 of this field determine the threshold below which bits 7:4 of VTPR (see Section 29.1.1) cannot fall.

  • EOI-exit bitmap (4 fields; 64 bits each). These fields are supported only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. They are used to determine which virtualized writes to the APIC’s EOI register cause VM exits.

  • Posted-interrupt notification vector (16 bits). Its low 8 bits contain the interrupt vector that is used to notify a logical processor that virtual interrupts have been posted. See Section 29.6 for more information on the use of this field.

  • Posted-interrupt descriptor address (64 bits).

24.4.2 Guest Non-Register State

  • Guest interrupt status (16 bits). This field is supported only on processors that support the 1-setting of the “virtual-interrupt delivery” VM-execution control. It comprises two 8-bit subfields:

  • Requesting virtual interrupt (RVI)(low byte)

  • Servicing virtual interrupt (SVI)(high byte)
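To make the RVI/SVI layout concrete, here is a minimal sketch that splits the 16-bit guest interrupt status into its two bytes and packs it back. The struct and function names are hypothetical and exist only for illustration; they are not KVM identifiers.

```c
#include <stdint.h>

/* Hypothetical helper type: the two halves of the guest interrupt status. */
struct guest_intr_status {
    uint8_t rvi; /* requesting virtual interrupt: low byte */
    uint8_t svi; /* servicing virtual interrupt: high byte */
};

/* Split the raw 16-bit field into RVI and SVI. */
static struct guest_intr_status unpack_status(uint16_t raw)
{
    struct guest_intr_status s = {
        .rvi = (uint8_t)(raw & 0xff),
        .svi = (uint8_t)(raw >> 8),
    };
    return s;
}

/* Rebuild the raw 16-bit field from RVI and SVI. */
static uint16_t pack_status(struct guest_intr_status s)
{
    return (uint16_t)(((uint16_t)s.svi << 8) | s.rvi);
}
```

For example, a raw value of 0x3150 would mean vector 0x50 is requesting service while vector 0x31 is in service.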

29 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS

The following are the VM-execution controls relevant to APIC virtualization and virtual interrupts (see Section 24.6 for information about the locations of these controls):

  • Virtual-interrupt delivery. This control enables the evaluation and delivery of pending virtual interrupts (Section 29.2). It also enables the emulation of writes (memory-mapped or MSR-based, as enabled) to the APIC registers that control interrupt prioritization.

  • Use TPR shadow. This control enables emulation of accesses to the APIC’s task-priority register (TPR) via CR8 (Section 29.3) and, if enabled, via the memory-mapped or MSR-based interfaces.

  • Virtualize APIC accesses. This control enables virtualization of memory-mapped accesses to the APIC (Section 29.4) by causing VM exits on accesses to a VMM-specified APIC-access page. Some of the other controls, if set, may cause some of these accesses to be emulated rather than causing VM exits.

  • Virtualize x2APIC mode. This control enables virtualization of MSR-based accesses to the APIC (Section 29.5).

  • APIC-register virtualization. This control allows memory-mapped and MSR-based reads of most APIC registers (as enabled) by satisfying them from the virtual-APIC page. It directs memory-mapped writes to the APIC-access page to the virtual-APIC page, following them by VM exits for VMM emulation.

  • Process posted interrupts. This control allows software to post virtual interrupts in a data structure and send a notification to another logical processor; upon receipt of the notification, the target processor will process the posted interrupts by copying them into the virtual-APIC page (Section 29.6).

Virtual APIC Page

29.1.1 Virtualized APIC Registers

  • Virtual task-priority register (VTPR): the 32-bit field located at offset 080H on the virtual-APIC page.

  • Virtual processor-priority register (VPPR): the 32-bit field located at offset 0A0H on the virtual-APIC page.

  • Virtual end-of-interrupt register (VEOI): the 32-bit field located at offset 0B0H on the virtual-APIC page.

  • Virtual interrupt-service register (VISR)

  • Virtual interrupt-request register (VIRR)

  • Virtual interrupt-command register (VICR_LO): the 32-bit field located at offset 300H on the virtual-APIC page

  • Virtual interrupt-command register (VICR_HI): the 32-bit field located at offset 310H on the virtual-APIC page.
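The register offsets above can be turned into a small accessor sketch over a 4-KByte buffer standing in for the virtual-APIC page. The VISR and VIRR base offsets (100H and 200H, eight 32-bit registers each at a 16-byte stride) follow the local APIC register layout; the macro and function names are assumptions made for this example, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VAPIC_PAGE_SIZE 4096u
#define VAPIC_TPR       0x080u
#define VAPIC_PPR       0x0A0u
#define VAPIC_EOI       0x0B0u
#define VAPIC_ISR       0x100u /* VISR: 8 x 32-bit registers, 16-byte stride */
#define VAPIC_IRR       0x200u /* VIRR: same layout as VISR */
#define VAPIC_ICR_LO    0x300u
#define VAPIC_ICR_HI    0x310u

/* Read a 32-bit register at byte offset `off` of the page. */
static uint32_t vapic_read32(const uint8_t *page, unsigned off)
{
    uint32_t v;
    assert(off + sizeof(v) <= VAPIC_PAGE_SIZE);
    memcpy(&v, page + off, sizeof(v));
    return v;
}

/* Write a 32-bit register at byte offset `off` of the page. */
static void vapic_write32(uint8_t *page, unsigned off, uint32_t v)
{
    assert(off + sizeof(v) <= VAPIC_PAGE_SIZE);
    memcpy(page + off, &v, sizeof(v));
}

/* Offset of the 32-bit register that holds `vector` in a 256-bit bitmap. */
static unsigned vapic_vec_off(unsigned base, unsigned vector)
{
    assert(vector < 256);
    return base + (vector / 32) * 0x10;
}

/* Set the bit for `vector` in the bitmap starting at `base` (VISR or VIRR). */
static void vapic_set_vector(uint8_t *page, unsigned base, unsigned vector)
{
    unsigned off = vapic_vec_off(base, vector);
    vapic_write32(page, off, vapic_read32(page, off) | (1u << (vector % 32)));
}

/* Test the bit for `vector` in the bitmap starting at `base`. */
static bool vapic_test_vector(const uint8_t *page, unsigned base, unsigned vector)
{
    uint32_t reg = vapic_read32(page, vapic_vec_off(base, vector));
    return (reg & (1u << (vector % 32))) != 0;
}
```

With this layout, vector 0x41 in VIRR lives in bit 1 of the register at offset 220H.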

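Diagram of the relevant registers

Virtual APIC state

29.1 VIRTUAL APIC STATE

With hardware virtualization, the VMM designates a 4-KByte page of memory, the virtual-APIC page (located by the virtual-APIC address field listed above), which the processor uses to virtualize accesses to APIC registers and to manage virtual interrupts.

If I understand correctly, this page is set up in the vmx_vcpu_reset function.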

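Virtual interrupts

29.2 EVALUATION AND DELIVERY OF VIRTUAL INTERRUPTS

Virtual-interrupt handling consists of two parts: evaluation and delivery.

Certain operations on the virtual-APIC page trigger an evaluation of pending virtual interrupts. If the evaluation recognizes a virtual interrupt, the interrupt is delivered to the guest without causing a VM exit.

Evaluation of virtual interrupts

The following events trigger an evaluation of pending virtual interrupts: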

  • VM entry(Section 26.3.2.5)

  • TPR virtualization(Section 29.1.2)

  • EOI virtualization(Section 29.1.4)

  • self-IPI virtualization(Section 29.1.5)

  • posted-interrupt processing(Section 29.6)

The SDM describes the evaluation with pseudocode in Section 29.2.1.
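As a rough C rendering of the SDM's evaluation rule: a pending virtual interrupt is recognized only if "interrupt-window exiting" is 0 and the priority class of RVI (bits 7:4) is higher than that of VPPR. The function name and parameters below are made up for this sketch; the real check is done by the processor, not by software.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the evaluation of pending virtual interrupts.
 * Returns true when a pending virtual interrupt would be recognized.
 */
static bool pending_virtual_interrupt(bool interrupt_window_exiting,
                                      uint8_t rvi, uint32_t vppr)
{
    if (interrupt_window_exiting)
        return false;
    /* Compare priority classes: RVI[7:4] against VPPR[7:4]. */
    return (uint32_t)(rvi >> 4) > ((vppr >> 4) & 0xf);
}
```

Note that the comparison is on the upper nibble only, so vector 0x4f is not recognized while VPPR is 0x40, but vector 0x51 is.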

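Delivery of virtual interrupts

Delivering a virtual interrupt updates the guest interrupt status (RVI/SVI) and delivers the interrupt in VMX non-root operation, so it does not cause a VM exit.

The SDM describes delivery with pseudocode in Section 29.2.2.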
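The delivery steps in the SDM can be modeled in C roughly as follows: the vector in RVI moves from VIRR to VISR, SVI and VPPR are updated, and RVI is recomputed from what remains in VIRR. The `vapic_model` struct is a simplified stand-in invented for this sketch (plain arrays rather than the real page layout), and the final steps, delivering through the IDT and ceasing recognition, are left out.

```c
#include <stdint.h>

/* Simplified, hypothetical model of the state touched by delivery. */
struct vapic_model {
    uint32_t virr[8]; /* 256-bit virtual IRR */
    uint32_t visr[8]; /* 256-bit virtual ISR */
    uint32_t vppr;
    uint8_t rvi;
    uint8_t svi;
};

/* Index of the highest set bit in a 256-bit bitmap, or -1 if empty. */
static int highest_set(const uint32_t bm[8])
{
    for (int i = 7; i >= 0; i--)
        for (int b = 31; b >= 0; b--)
            if (bm[i] & (1u << b))
                return i * 32 + b;
    return -1;
}

/* Apply the state changes of virtual-interrupt delivery; returns the vector. */
static uint8_t deliver_virtual_interrupt(struct vapic_model *v)
{
    uint8_t vector = v->rvi;
    uint32_t mask = 1u << (vector % 32);

    v->visr[vector / 32] |= mask;   /* VISR[Vector] := 1 */
    v->svi = vector;                /* SVI := Vector */
    v->vppr = vector & 0xF0;        /* VPPR := Vector & F0H */
    v->virr[vector / 32] &= ~mask;  /* VIRR[Vector] := 0 */

    int next = highest_set(v->virr);
    v->rvi = next >= 0 ? (uint8_t)next : 0;
    return vector;
}
```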

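Now that we have seen what the hardware offers, let's look at how software takes advantage of it.

Related software

Detecting APICv support

The KVM code first has to detect whether the hardware supports APICv and record the result.

This check happens when the kvm module is loaded: it tests whether APICv is available and sets enable_apicv accordingly.

So what does cpu_has_vmx_apicv do?

It matches the hardware features listed above.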
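As a sketch of what such a check boils down to: in the kernel versions I am familiar with, cpu_has_vmx_apicv requires APIC-register virtualization, virtual-interrupt delivery, and posted-interrupt processing all to be supported. The standalone model below mirrors that idea with plain bit tests. The `vmx_caps` struct and `has_apicv` function are hypothetical; the bit values follow the VMX control definitions as I recall them and should be checked against the kernel source in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Control bits (assumed to match the SDM / kernel definitions). */
#define PIN_BASED_POSTED_INTR                0x00000080u
#define SECONDARY_EXEC_APIC_REGISTER_VIRT    0x00000100u
#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY 0x00000200u

/* Hypothetical container for the allowed-1 settings of two control fields. */
struct vmx_caps {
    uint32_t pin_based;
    uint32_t secondary;
};

/* APICv is usable only when all three features are available. */
static bool has_apicv(const struct vmx_caps *c)
{
    return (c->secondary & SECONDARY_EXEC_APIC_REGISTER_VIRT) &&
           (c->secondary & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) &&
           (c->pin_based & PIN_BASED_POSTED_INTR);
}
```

The all-or-nothing test keeps the later code simple: a single flag can then stand for the whole feature set.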

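Where it is used

With the hardware feature detected, the next question is where it is used. The code does not use enable_apicv directly; its value is assigned to the vcpu.

As sharp readers will have guessed, the next step is to find where apicv_active is tested, because those are the places where APICv is used.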

Posted Interrupt

Posted interrupts are an important feature, so we study them separately.

Concept

Posted-interrupt processing is a feature by which a processor processes the virtual interrupts by recording them as pending on the virtual-APIC page.

If the “external-interrupt exiting” VM-execution control is 1, any unmasked external interrupt causes a VM exit (see Section 25.2). If the “process posted interrupts” VM-execution control is also 1, this behavior is changed and the processor handles an external interrupt as follows.

  1. The local APIC is acknowledged; this provides the processor core with an interrupt vector, called here the physical vector.

  2. If the physical vector equals the posted-interrupt notification vector, the logical processor continues to the next step. Otherwise, a VM exit occurs as it would normally due to an external interrupt; the vector is saved in the VM-exit interruption-information field. (Processing continues only if the physical vector equals the posted-interrupt notification vector.)

  3. The processor clears the outstanding-notification bit in the posted-interrupt descriptor. This is done atomically so as to leave the remainder of the descriptor unmodified (e.g., with a locked AND operation). (Clear the ON bit.)

  4. The processor writes zero to the EOI register in the local APIC; this dismisses the interrupt with the posted-interrupt notification vector from the local APIC. (Write EOI.)

  5. The logical processor performs a logical-OR of PIR into VIRR and clears PIR. No other agent can read or write a PIR bit (or group of bits) between the time it is read (to determine what to OR into VIRR) and when it is cleared. (PIR -> VIRR.)

  6. The logical processor sets RVI to be the maximum of the old value of RVI and the highest index of all bits that were set in PIR; if no bit was set in PIR, RVI is left unmodified. (Compute RVI.)

  7. The logical processor evaluates pending virtual interrupts as described in Section 29.2.1.

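Put simply, a running guest used to have to exit so that the interrupt could be handled; now it does not exit. Instead, the host uses a special interrupt to inject the real interrupt into the guest. The vector of the interrupt actually injected into the guest is recorded in the PIR (posted-interrupt requests).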
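Steps 3, 5 and 6 of the hardware flow can be modeled with a short, single-threaded C sketch: clear ON, fold PIR into VIRR while clearing PIR, then raise RVI if a higher vector was posted. The structs are simplified stand-ins invented for this example, and the atomicity the hardware guarantees is deliberately not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified posted-interrupt descriptor. */
struct pi_desc_model {
    uint32_t pir[8]; /* 256-bit posted-interrupt requests */
    bool on;         /* outstanding-notification bit */
};

/* Hypothetical, simplified view of the target vCPU's virtual-APIC state. */
struct vcpu_model {
    uint32_t virr[8];
    uint8_t rvi;
};

static void process_posted_interrupts(struct pi_desc_model *pi,
                                      struct vcpu_model *v)
{
    int highest = -1;

    pi->on = false;                  /* step 3: clear ON */

    for (int i = 0; i < 8; i++) {    /* step 5: PIR -> VIRR, clear PIR */
        uint32_t bits = pi->pir[i];
        pi->pir[i] = 0;
        v->virr[i] |= bits;
        for (int b = 0; b < 32; b++)
            if (bits & (1u << b))
                highest = i * 32 + b; /* ascending scan: last hit is highest */
    }

    if (highest > v->rvi)            /* step 6: RVI := max(RVI, highest) */
        v->rvi = (uint8_t)highest;
}
```

If nothing was posted, `highest` stays at -1 and RVI is left untouched, matching the last clause of step 6.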

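Software-side interrupt handling code

We met this code in the earlier analysis but did not dwell on it. In the kernel part of the interrupt delivery path we saw that, for the APIC_DM_FIXED delivery mode, the posted-interrupt path is used when apicv_active is true.

The comments on that function describe its job clearly. The point I want to stress is that pi_test_and_set_pir records the real interrupt vector in the PIR.

What happens inside kvm_vcpu_trigger_posted_interrupt? It sends an interrupt with vector POSTED_INTR_VECTOR to the physical CPU on which the target vcpu is running. That interrupt is what we call the posted interrupt.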
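The sender side can be pictured with the following simplified model using C11 atomics: record the vector in PIR, set ON, and then either send the notification (vCPU in guest mode) or kick the vCPU. This is only a sketch of the idea; the types, the enum, and the function name are invented here, and the real KVM code differs in detail across kernel versions.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified posted-interrupt descriptor. */
struct pi_desc_model {
    _Atomic uint32_t pir[8];
    atomic_bool on;
};

/* What the sender ended up doing (names are illustrative only). */
enum deliver_result {
    ALREADY_PENDING,   /* vector was already set in PIR */
    ALREADY_NOTIFIED,  /* ON was already set, a notification is in flight */
    SENT_NOTIFICATION, /* would send the notification IPI */
    KICKED             /* vCPU not in guest mode: would kick it instead */
};

static enum deliver_result deliver_posted_interrupt(struct pi_desc_model *pi,
                                                    uint8_t vector,
                                                    bool vcpu_in_guest_mode)
{
    uint32_t mask = 1u << (vector % 32);

    /* Atomically set the vector's bit in PIR and test the old value. */
    if (atomic_fetch_or(&pi->pir[vector / 32], mask) & mask)
        return ALREADY_PENDING;

    /* Atomically set ON; if it was already set, nothing more to do. */
    if (atomic_exchange(&pi->on, true))
        return ALREADY_NOTIFIED;

    return vcpu_in_guest_mode ? SENT_NOTIFICATION : KICKED;
}
```

Testing the old value of each bit is what lets several senders race safely: only the first one to set ON needs to send the notification.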

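Hardware-side interrupt handling code

As the code above shows, the interrupt is sent as a posted interrupt only when the vcpu is in guest mode; otherwise the path falls back to kvm_vcpu_kick.

A hardware interrupt, however, goes straight into its interrupt handler when it fires, with no check of the vcpu state. How is that case handled?

From what I can see so far, when a vcpu is blocked its notification vector is swapped, so a different interrupt handler responds to the event.

I do not yet fully understand what that handler does and will study it later.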
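One way to picture the vector swap is the toy model below: the descriptor carries a notification vector, which is switched to a wakeup vector while the vCPU is blocked and switched back afterwards. Everything here is an assumption made for illustration: the struct, the function names, and the two vector values (written as I recall them from the kernel's irq_vectors.h) should all be verified against the actual source.

```c
#include <stdint.h>

/* Vector values are assumptions; check them against the kernel headers. */
#define POSTED_INTR_VECTOR        0xf2
#define POSTED_INTR_WAKEUP_VECTOR 0xf1

/* Hypothetical slice of the posted-interrupt descriptor. */
struct pi_ctrl_model {
    uint8_t nv; /* notification vector */
};

/* Who reacts when the notification arrives (illustrative names). */
enum pi_handling {
    HW_SYNCS_PIR_TO_VIRR, /* vCPU in guest mode: processor handles it */
    HOST_WAKEUP_HANDLER   /* vCPU blocked: a host handler wakes it up */
};

/* Before blocking: route notifications to the host wakeup handler. */
static void vcpu_pre_block(struct pi_ctrl_model *pi)
{
    pi->nv = POSTED_INTR_WAKEUP_VECTOR;
}

/* After blocking: restore the normal notification vector. */
static void vcpu_post_block(struct pi_ctrl_model *pi)
{
    pi->nv = POSTED_INTR_VECTOR;
}

static enum pi_handling who_handles(const struct pi_ctrl_model *pi)
{
    return pi->nv == POSTED_INTR_WAKEUP_VECTOR ? HOST_WAKEUP_HANDLER
                                               : HW_SYNCS_PIR_TO_VIRR;
}
```

The model ignores other vCPU states, such as being preempted, which the real code also has to deal with.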
