I have a similar problem: the VM becomes unresponsive after a live migration, and each virtual CPU then consumes 100% CPU. My impression is that SMP guests are more prone to these lockups.
I’m also using qemu-kvm-0.12.1.2. It doesn’t happen on every migration, but usually within 3-4 live migrations the guest shows this behaviour, especially with SMP.
This also happens when the machine is just idle.
After some searching, I found that it could be a problem with the clocksource on the guest. By default, the para-virtualized kvm-clock driver is used in the guest:
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
I have these available on my guest:
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
kvm-clock tsc acpi_pm
To test this further, I switched the clocksource to tsc, since the host CPU has the constant_tsc flag:
# cat /proc/cpuinfo | grep constant_tsc
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm **constant_tsc** arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm arat epb dts tpr_shadow vnmi flexpriority ept vpid
On the guest (SLES11 SP1):
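Switching the clocksource at runtime is normally done by writing its name into `current_clocksource`. A minimal sketch, assuming the standard sysfs path shown above; the `set_clocksource` helper and the `CS_DIR` override are illustrative names, not part of any tool:

```shell
# Hypothetical helper: switch the clocksource, but only if the kernel
# lists the requested one in available_clocksource.
# CS_DIR is an assumed override so the function can be tried on a scratch directory.
set_clocksource() {
    want="$1"
    dir="${CS_DIR:-/sys/devices/system/clocksource/clocksource0}"
    for cs in $(cat "$dir/available_clocksource"); do
        if [ "$cs" = "$want" ]; then
            # Writing the name makes the kernel switch immediately (needs root).
            echo "$want" > "$dir/current_clocksource"
            return
        fi
    done
    echo "clocksource '$want' is not available" >&2
    return 1
}

# Run as root on the guest:
# set_clocksource tsc
```

A change made this way does not survive a reboot; the `clocksource=tsc` kernel boot parameter is the usual way to make it permanent.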
From then on, live migrations were stable and the VM no longer became unresponsive after a migration (or while idle).
Normally the kvm-clock source is recommended, but tsc seems to work more reliably in my case.
Perhaps you could try the same and see if it helps in your case?