Why does my guest VM get paused during the boot process?


#1

Hi there,

I run a guest VM in KVM, but it always gets paused during the boot process. Below is the info about my environment:

  1. Host OS: Ubuntu 12.04 64-bit with Linux kernel 3.2.0.36-generic and the default kvm package
  2. Guest OS: Linux kernel 2.6.23, 64-bit, two CPUs, 4 GB memory

What’s weird is that the guest VM works fine with a single CPU.

The guest VM is paused with the error “emulation failure”. I found that it is caused by an instruction in the trampoline_data section (see arch/x86_64/kernel/trampoline.S). I’m not familiar with the SMP boot process, but as I understand it, the BSP uses IPIs to start the APs, and the APs then start running the trampoline code in real mode. As shown in trampoline.S, the first 4 bytes are overwritten with a5a5a5a5 as a marker to let the BSP know that the code has been executed by the APs. However, I found that the BSP runs the same code section after the APs have changed the first 4 bytes, and it starts from the fifth byte (eip=4), which is ‘c8’, an invalid instruction.

I’m not sure whether it’s normal for the BSP to run the trampoline code, but it’s definitely wrong to start executing from that offset. It has been suggested that this may be caused by the APIC emulation in kernel space, but I’m not sure about this.

Can anyone help me? Any suggestions will be much appreciated.


#2

Below is a code snippet from arch/x86/kernel/trampoline_64.S, with the raw bytes of each instruction on the left:

ENTRY(trampoline_data)
r_base = .
fa                             cli                     # We should be safe anyway
0f 09                          wbinvd
8c c8                          mov     %cs, %ax        # Code and data in the same place
8e d8                          mov     %ax, %ds
8e d0                          mov     %ax, %es
8e c0                          mov     %ax, %ss

66 c7 06 00 00 a5 a5 a5 a5     movl    $0xA5A5A5A5, trampoline_status - r_base
                               # write marker for master knows we're running

The raw bytes for this snippet are “fa 0f 09 8c c8 8e d8 8e d0 8e c0 66 c7 06 00 00 a5 a5 a5 a5”, and after the APs run the code they become “a5 a5 a5 a5 c8 8e d8 8e d0 8e c0 66 c7 06 00 00 a5 a5 a5 a5”. So when the BSP starts running at the fifth byte, ‘c8’, the emulation fails, since ‘c8’ on its own is an invalid instruction, whereas ‘8c c8’ is a valid one.
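To make the byte layout concrete, here is a minimal Python sketch of what I think happens. It is only an illustration, not kernel or KVM code, and the names (ORIGINAL, MARKER, after_marker_write) are made up for this post. It overwrites the first 4 bytes with the marker and prints what is left at offset 4:

```python
# Illustration only: simulate the marker write over the start of trampoline_data.
# The byte values are the ones quoted in this post; all names are hypothetical.

ORIGINAL = bytes.fromhex(
    "fa 0f 09 8c c8 8e d8 8e d0 8e c0 66 c7 06 00 00 a5 a5 a5 a5"
)
MARKER = bytes.fromhex("a5 a5 a5 a5")


def after_marker_write(code: bytes, marker: bytes = MARKER) -> bytes:
    """Return the code bytes with the first len(marker) bytes replaced by the marker."""
    return marker + code[len(marker):]


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


patched = after_marker_write(ORIGINAL)

print("before:", hexdump(ORIGINAL))
print("after: ", hexdump(patched))
# Offsets 3..4 originally hold the two-byte 'mov %cs, %ax' (8c c8).
print("offsets 3..4 before:", hexdump(ORIGINAL[3:5]))
# Execution starting at eip=4 sees only the second byte of that instruction.
print("byte at eip=4 after:", f"{patched[4]:02x}")
```

The marker covers offsets 0 to 3, so it destroys the ‘8c’ prefix byte of the mov at offset 3 and leaves ‘c8’ alone at offset 4, which is exactly where the BSP starts executing.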

Any tips for debugging this problem? Thanks a lot in advance.