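Overall Architecture
Let's now look at the overall architecture of migration.
Sender side
Starting from the migrate command
As the example in the previous section showed, a migration can be started by running a command in the monitor, so that is where we begin.
Explaining exactly how the monitor executes commands would take quite some time. After some study, I found that the entry function for migration is hmp_migrate. Below is my summary of the flow from hmp_migrate down to the main migration function.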
hmp_migrate(), invoked from handle_hmp_command()
    qmp_migrate()
        migrate_get_current(), global MigrationState
        migrate_prepare()
            migrate_init()
        tcp_start_outgoing_migration()
            socket_start_outgoing_migration()
        unix_start_outgoing_migration()
            socket_start_outgoing_migration()
                socket_outgoing_migration
                    migration_channel_connect(s, sioc, hostname, err)
        exec_start_outgoing_migration()
            migration_channel_connect(s, ioc, NULL, NULL)
        fd_start_outgoing_migration()
            migration_channel_connect(s, ioc, NULL, NULL)
                migrate_fd_connect(s, NULL)
        rdma_start_outgoing_migration()
            migrate_fd_connect(s, NULL)
                migration_thread()
As you can see, migration supports the following transports:
tcp
unix
exec
fd
rdma
However different they look, they all end up starting the migration_thread thread to do the actual work.
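To make the "many transports, one destination" idea concrete, here is a minimal sketch of choosing a transport from the prefix of a migration URI. It assumes the target is written as a string such as tcp:0:4444; the Transport enum and classify_uri() are names made up for this illustration and are not QEMU's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport identifiers, invented for this sketch. */
typedef enum {
    TRANSPORT_TCP,
    TRANSPORT_UNIX,
    TRANSPORT_EXEC,
    TRANSPORT_FD,
    TRANSPORT_RDMA,
    TRANSPORT_UNKNOWN
} Transport;

/*
 * Match the URI against the known prefixes. On success, *rest (if not
 * NULL) points at the text after the prefix, which is what a per-transport
 * start function would go on to parse.
 */
Transport classify_uri(const char *uri, const char **rest)
{
    static const struct {
        const char *prefix;
        Transport transport;
    } table[] = {
        { "tcp:",  TRANSPORT_TCP  },
        { "unix:", TRANSPORT_UNIX },
        { "exec:", TRANSPORT_EXEC },
        { "fd:",   TRANSPORT_FD   },
        { "rdma:", TRANSPORT_RDMA },
    };
    size_t i;

    assert(uri != NULL);
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        size_t n = strlen(table[i].prefix);
        if (strncmp(uri, table[i].prefix, n) == 0) {
            if (rest != NULL) {
                *rest = uri + n;
            }
            return table[i].transport;
        }
    }
    if (rest != NULL) {
        *rest = uri;
    }
    return TRANSPORT_UNKNOWN;
}
```

In this sketch, whichever branch is taken, the caller would continue into the same common path, which mirrors how every *_start_outgoing_migration() in the tree above eventually reaches migration_thread().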
The main migration function: migration_thread
So the key piece is the main migration function, migration_thread. Let's open it up as well.
migration_thread()
    qemu_savevm_state_header
        qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
        qemu_put_be32(f, QEMU_VM_FILE_VERSION);
        qemu_put_be32(f, QEMU_VM_CONFIGURATION);
        vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
    qemu_savevm_send_open_return_path(s->to_dst_file);
    qemu_savevm_send_ping(s->to_dst_file, 1);
        qemu_savevm_command_send(f, MIG_CMD_PING, , (uint8_t *)&buf)
    ; iterate savevm_state and call save_setup
    qemu_savevm_state_setup(s->to_dst_file);
        save_section_header(f, se, QEMU_VM_SECTION_START)
        se->ops->save_setup(f, se->opaque)
        save_section_footer(f, se)
        precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)
    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_ACTIVE);
    migration_iteration_run
        ; iterate savevm_state and call save_live_pending
        qemu_savevm_state_pending(pend_pre/compat/post)
            se->ops->save_live_pending()
        ; iterate savevm_state and call save_live_iterate
        qemu_savevm_state_iterate()
            save_section_header(f, se, QEMU_VM_SECTION_PART)
            se->ops->save_live_iterate(f, se->opaque)
            save_section_footer(f, se)
        migration_completion()
            qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
            vm_stop_force_state(RUN_STATE_FINISH_MIGRATE)
            qemu_savevm_state_complete_precopy(s->to_dst_file, false, inactivate);
                ; iterate savevm_state and call save_live_complete_precopy
                cpu_synchronize_all_states();
                save_section_header(f, se, QEMU_VM_SECTION_END);
                se->ops->save_live_complete_precopy(f, se->opaque)
                save_section_footer(f, se);
                ; iterate savevm_state and call vmstate_save
                save_section_header(f, se, QEMU_VM_SECTION_FULL);
                vmstate_save(f, se, vmdesc)
                save_section_footer(f, se);
    migration_detect_error
    migration_update_counters
    migration_iteration_finish
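The function is long, but its overall structure is fairly clear. It can be roughly divided into these stages:
Send the header
Set up for migration
Transfer iteratively
Complete the migration
Most of this work is carried out through the various se->ops callbacks.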
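The setup, iterate and complete stages can be modelled with a small driver that walks a list of entries and calls each one's callbacks. This is only a sketch under simplifying assumptions: Stream stands in for QEMUFile, Entry and Ops stand in for SaveStateEntry and its ops, the section constants are local stand-ins, section footers are omitted, and the loop simply stops once every entry reports it has nothing left (the real code decides based on the pending size).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for QEMU_VM_SECTION_START/PART/END; values are for this sketch only. */
enum { SEC_START = 1, SEC_PART = 2, SEC_END = 3 };

/* A tiny in-memory byte stream standing in for QEMUFile. */
typedef struct Stream {
    unsigned char buf[64];
    size_t len;
} Stream;

void put_byte(Stream *f, unsigned char b)
{
    assert(f->len < sizeof(f->buf));
    f->buf[f->len++] = b;
}

/* Hypothetical callback table, modelled on the se->ops calls in the tree. */
typedef struct Ops {
    void (*save_setup)(Stream *f, void *opaque);
    int (*save_live_iterate)(Stream *f, void *opaque); /* nonzero: nothing left */
    void (*save_live_complete_precopy)(Stream *f, void *opaque);
} Ops;

typedef struct Entry {
    unsigned char section_id;
    const Ops *ops;
    void *opaque;
    struct Entry *next;
} Entry;

void section_header(Stream *f, const Entry *se, unsigned char type)
{
    put_byte(f, type);
    put_byte(f, se->section_id);
}

void run_precopy(Stream *f, Entry *head)
{
    Entry *se;
    int all_done;

    /* Stage: setup, one SECTION_START per entry. */
    for (se = head; se != NULL; se = se->next) {
        section_header(f, se, SEC_START);
        se->ops->save_setup(f, se->opaque);
    }
    /* Stage: iterate, SECTION_PART rounds until every entry is drained. */
    do {
        all_done = 1;
        for (se = head; se != NULL; se = se->next) {
            section_header(f, se, SEC_PART);
            if (!se->ops->save_live_iterate(f, se->opaque)) {
                all_done = 0;
            }
        }
    } while (!all_done);
    /* Stage: completion, one SECTION_END per entry. */
    for (se = head; se != NULL; se = se->next) {
        section_header(f, se, SEC_END);
        se->ops->save_live_complete_precopy(f, se->opaque);
    }
}

/* A demo "device" that has a fixed number of chunks left to send. */
typedef struct Demo {
    int remaining;
} Demo;

void demo_setup(Stream *f, void *opaque)
{
    (void)f;
    (void)opaque;
}

int demo_iterate(Stream *f, void *opaque)
{
    Demo *d = (Demo *)opaque;
    if (d->remaining > 0) {
        put_byte(f, 0xAA); /* one chunk of payload */
        d->remaining--;
    }
    return d->remaining == 0;
}

void demo_complete(Stream *f, void *opaque)
{
    (void)opaque;
    put_byte(f, 0xBB); /* final state */
}

const Ops demo_ops = { demo_setup, demo_iterate, demo_complete };
```

The driver never looks inside an entry; it only knows which callback belongs to which stage, which is the point of routing everything through se->ops.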
Receiver side
Starting from incoming
The receiver has to be launched with the -incoming option, so we start from incoming as well.
qemu_start_incoming_migration()
    deferred_incoming_migration()
    tcp_start_incoming_migration()
        socket_start_incoming_migration()
    rdma_start_incoming_migration()
        rdma_accept_incoming_migration()
            migration_fd_process_incoming()
                migration_incoming_setup()
                migration_incoming_process()
    exec_start_incoming_migration()
        exec_accept_incoming_migration()
            migration_channel_process_incoming()
    unix_start_incoming_migration()
        socket_start_incoming_migration()
            socket_accept_incoming_migration()
                migration_channel_process_incoming()
    fd_start_incoming_migration()
        fd_accept_incoming_migration()
            migration_channel_process_incoming()
                migration_tls_channel_process_incoming()
                migration_ioc_process_incoming()
                    migration_incoming_process()
                        process_incoming_migration_co()
                            qemu_loadvm_state()
This looks more tangled than the sender side, but fortunately every transport ends up calling qemu_loadvm_state().
qemu_loadvm_state
qemu_loadvm_state()
    qemu_get_be32, QEMU_VM_FILE_MAGIC
    qemu_get_be32, QEMU_VM_FILE_VERSION
    qemu_loadvm_state_setup
        se->ops->load_setup
    vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0)
    cpu_synchronize_all_pre_loadvm
        cpu_synchronize_pre_loadvm(cpu)
    qemu_loadvm_state_main
        section_type = qemu_get_byte(f)
        QEMU_VM_SECTION_START | QEMU_VM_SECTION_FULL
            qemu_loadvm_section_start_full
                section_id = qemu_get_be32
                vmstate_load
        QEMU_VM_SECTION_PART | QEMU_VM_SECTION_END
            qemu_loadvm_section_part_end
                section_id = qemu_get_be32
                vmstate_load
        QEMU_VM_COMMAND
            loadvm_process_command
        QEMU_VM_EOF
    qemu_loadvm_state_cleanup
        se->ops->load_cleanup
    cpu_synchronize_all_post_init
        cpu_synchronize_post_init(cpu);
Compared with sending, the receive path is "simple": the main work is hidden in the three section cases.
QEMU_VM_SECTION_START | QEMU_VM_SECTION_FULL
QEMU_VM_SECTION_PART | QEMU_VM_SECTION_END
QEMU_VM_COMMAND
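The first case covers the setup stage and the final vmstate_save stage. The second covers the intermediate iterations and the final completion pass. I have not looked closely at the third yet.
At least in the first two cases, everything ends up in vmstate_load. Nice.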
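The dispatch on section type can be sketched as a loop over a byte buffer. This is a simplified model, not the real qemu_loadvm_state_main(): the RX_* constants are local stand-ins for the QEMU_VM_* names, each section here is just a type byte followed by a one-byte id (the real stream reads a 32-bit id and then the payload), and LoadStats only counts which branch was taken.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for QEMU_VM_EOF, QEMU_VM_SECTION_* and QEMU_VM_COMMAND. */
enum {
    RX_EOF = 0,
    RX_SECTION_START = 1,
    RX_SECTION_PART = 2,
    RX_SECTION_END = 3,
    RX_SECTION_FULL = 4,
    RX_COMMAND = 8
};

/* Hypothetical counters recording which branch handled each section. */
typedef struct LoadStats {
    int start_full; /* would go through qemu_loadvm_section_start_full */
    int part_end;   /* would go through qemu_loadvm_section_part_end */
    int commands;   /* would go through loadvm_process_command */
} LoadStats;

/* Returns 0 on a clean EOF marker, -1 on an unknown type or truncated input. */
int load_main(const unsigned char *buf, size_t len, LoadStats *st)
{
    size_t pos = 0;

    assert(st != NULL);
    while (pos < len) {
        unsigned char type = buf[pos++];

        switch (type) {
        case RX_SECTION_START:
        case RX_SECTION_FULL:
            if (pos >= len) {
                return -1;
            }
            pos++; /* skip the section id */
            st->start_full++;
            break;
        case RX_SECTION_PART:
        case RX_SECTION_END:
            if (pos >= len) {
                return -1;
            }
            pos++; /* skip the section id */
            st->part_end++;
            break;
        case RX_COMMAND:
            st->commands++;
            break;
        case RX_EOF:
            return 0;
        default:
            return -1;
        }
    }
    return -1; /* ran out of data before seeing EOF */
}
```

Grouping START with FULL and PART with END in the switch matches the pairing in the tree above: the first pair introduces a section, the second continues one that already exists.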
SaveStateEntry
The data structure that plays the key role during migration is SaveStateEntry, which is the se seen throughout the code.
SaveState(savevm_state)
+--------------------------------------+
|global_section_id                     |
|   (int)                              |
|name                                  |
|   (char*)                            |
|len                                   |
|target_page_bits                      |
|caps_count                            |
|   (uint32_t)                         |
+--------------------------------------+
|capabilities                          |
|   (MigrationCapability*)             |
+--------------------------------------+
|handlers                              |
|   (list of SaveStateEntry)           |
+--------------------------------------+
                  |
                  |
      ------------+-----+----------------------------------+-------------------------------------------+---
                        |                                  |                                           |
                        |                                  |                                           |
         SaveStateEntry                     SaveStateEntry                          SaveStateEntry
         +-----------------------------+    +-----------------------------+         +------------------------------------+
         |idstr                        |    |idstr                        |         |idstr                               |
         |   = "block"                 |    |   = "ram"                   |         |   = "dirty-bitmap"                 |
         |ops                          |    |ops                          |         |ops                                 |
         |   = savevm_block_handlers   |    |   = savevm_ram_handlers     |         |   = savevm_dirty_bitmap_handlers   |
         |opaque                       |    |opaque                       |         |opaque                              |
         |   = block_mig_state         |    |   = ram_state               |         |   = dirty_bitmap_mig_state         |
         |vmsd                         |    |vmsd                         |         |vmsd                                |
         |   = NULL                    |    |   = NULL                    |         |   = NULL                           |
         +-----------------------------+    +-----------------------------+         +------------------------------------+
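All SaveStateEntry structures are linked into the global savevm_state, on its handlers list. The diagram above shows a few of the more important ones. For example, the entry named ram is the one that manages the RAMBlocks. Interestingly, vmsd is NULL for all of these entries.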
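As a rough model of the diagram, the entries can be pictured as nodes on a list that is searched by idstr. The sketch below uses a plain singly linked list and trimmed-down types whose names end in Sketch to make clear they are not QEMU's definitions; only the four fields shown in the diagram are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder for a handler table such as savevm_ram_handlers. */
typedef struct SaveVMHandlersSketch {
    const char *label;
} SaveVMHandlersSketch;

/* Trimmed-down model of one entry: the four fields from the diagram plus a link. */
typedef struct SaveStateEntrySketch {
    const char *idstr;
    const SaveVMHandlersSketch *ops;
    void *opaque;
    const void *vmsd;
    struct SaveStateEntrySketch *next;
} SaveStateEntrySketch;

/* Walk the list and return the entry with the given idstr, or NULL. */
SaveStateEntrySketch *find_entry(SaveStateEntrySketch *head, const char *idstr)
{
    SaveStateEntrySketch *se;

    assert(idstr != NULL);
    for (se = head; se != NULL; se = se->next) {
        if (strcmp(se->idstr, idstr) == 0) {
            return se;
        }
    }
    return NULL;
}

/* Count entries that are driven purely by ops, i.e. have no vmsd. */
size_t count_ops_only(const SaveStateEntrySketch *head)
{
    const SaveStateEntrySketch *se;
    size_t n = 0;

    for (se = head; se != NULL; se = se->next) {
        if (se->ops != NULL && se->vmsd == NULL) {
            n++;
        }
    }
    return n;
}
```

In this model the three entries from the diagram would all be counted by count_ops_only(), which matches the observation that their vmsd is NULL and their behaviour comes entirely from ops.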
The hidden key point
In the migration_thread flow above there is one easily overlooked function, vmstate_save. Open it up and a whole new world appears. That's all for today.