Migration bug in qemu-kvm-0.12.1.2

Zhiming · January 12, 2012, 9:43pm

Hi, does anyone know whether there is a bug in VM migration in qemu-kvm-0.12.1.2? If the VM is very busy when migration is triggered (say 100% CPU usage, a lot of network connections), the VM will get stuck after being migrated (100% CPU usage, not responding, but the QEMU monitor is still working). Any help will be appreciated. Thanks very much.

frido · January 21, 2012, 4:00pm

Hello,

I have a similar problem: the VM becomes unresponsive after a live migration, and each virtual CPU then consumes 100% CPU. My impression is that SMP guests are more prone to these lockups.
I’m also using qemu-kvm-0.12.1.2. It does not happen every time, but usually within 3-4 live migrations the guest shows this behaviour, especially with SMP.
This also happens when the machine is just idle.

After some searching, I found that it could be a problem with the clocksource on the guest. By default, the para-virtualized kvm-clock driver is used in the guest:

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
kvm-clock

I have these available on my guest:

# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
kvm-clock tsc acpi_pm

To further test this, I switched the clocksource to tsc, since the host has a constant tsc flag:

# cat /proc/cpuinfo | grep constant_tsc
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm **constant_tsc** arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm arat epb dts tpr_shadow vnmi flexpriority ept vpid

On the guest (SLES11 SP1):
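
A minimal sketch of how such a switch can be done at runtime, assuming the standard sysfs clocksource interface shown above. The helper name `switch_clocksource` and the `CLOCKSOURCE_DIR` variable are hypothetical and only there for illustration; this is not necessarily the exact command that was used.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the post. CLOCKSOURCE_DIR can be
# overridden so the function can be tried against a scratch directory
# before touching a real guest.
CLOCKSOURCE_DIR="${CLOCKSOURCE_DIR:-/sys/devices/system/clocksource/clocksource0}"

switch_clocksource() {
    want="$1"

    # Refuse names that are not listed in available_clocksource.
    if ! tr ' ' '\n' < "$CLOCKSOURCE_DIR/available_clocksource" \
            | grep -qxF -- "$want"; then
        echo "clocksource '$want' is not available" >&2
        return 1
    fi

    # Writing the name into current_clocksource selects it at runtime
    # (this needs root on a real guest).
    echo "$want" > "$CLOCKSOURCE_DIR/current_clocksource"

    # Print the clocksource that is now selected.
    cat "$CLOCKSOURCE_DIR/current_clocksource"
}
```

As root on the guest, `switch_clocksource tsc` would then select tsc until the next reboot. To keep the setting across reboots, the `clocksource=tsc` kernel boot parameter is the usual route.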

From this moment on, live migrations were stable and no longer made the VM unresponsive after a migration (or while idle).
Normally the kvm-clock source is recommended, but tsc seems to work more reliably in my case.

Perhaps you could try the same and see if it helps in your case?

Ghilteras · April 18, 2012, 3:50pm

I have the exact same problem with migration, but sadly this hack of changing the clock doesn’t help at all.

I’m running

ii qemu-kvm 0.12.5+dfsg-5+squeeze8 Full virtualization on x86 hardware

When I live migrate from serverA (Dell PowerEdge R210) to serverB (dual-core Intel 3.20GHz), my guest always gets stuck. Oddly enough, this does not happen the other way around.

Both servers run Debian 6.