整体架构

现在就让我们来看看迁移这件事的总体架构吧。

发送端

从migrate命令开始

通过上一小节的例子,我们可以看到迁移可以通过在monitor中执行命令开始。既然如此,那我们就从这里开始。

话说要讲清楚monitor中命令执行的机制,还真是要花费一些事件。通过一些学习,发现迁移的入口函数是hmp_migrate。下面就是本人总结的从hmp_migrate开始到迁移主函数的流程。

     hmp_migrate(), invoked from handle_hmp_command()
         qmp_migrate()
             migrate_get_current(), global MigrationState
             migrate_prepare()
                 migrate_init()

             tcp_start_outgoing_migration()
                 socket_start_outgoing_migration()
             unix_start_outgoing_migration()
                 socket_start_outgoing_migration()
                     socket_outgoing_migration
                         migration_channel_connect(s, sioc, hostname, err)
             exec_start_outgoing_migration()
                 migration_channel_connect(s, ioc, NULL, NULL)
             fd_start_outgoing_migration()
                 migration_channel_connect(s, ioc, NULL, NULL)
                     migrate_fd_connect(s, NULL)
             rdma_start_outgoing_migration()
                 migrate_fd_connect(s, NULL)
                     migration_thread()

可以看到,迁移的接口有

  • tcp

  • unix

  • exec

  • fd

  • rdma

但是万变不离其宗,最后都启动了migration_thread这个线程处理。

迁移主函数 migration_thread

所以最关键的就是这个迁移的主函数migration_thread。那我们把这个函数也打开。

     migration_thread()

       qemu_savevm_state_header
           qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
           qemu_put_be32(f, QEMU_VM_FILE_VERSION);
           qemu_put_be32(f, QEMU_VM_CONFIGURATION);
           vmstate_save_state(f, &vmstate_configuration, &savevm_state, 0);
       qemu_savevm_send_open_return_path(s->to_dst_file);
       qemu_savevm_send_ping(s->to_dst_file, 1);
           qemu_savevm_command_send(f, MIG_CMD_PING, , (uint8_t *)&buf)

       ; iterate savevm_state and call save_setup
       qemu_savevm_state_setup(s->to_dst_file);
           save_section_header(f, se, QEMU_VM_SECTION_START)
           se->ops->save_setup(f, se->opaque)
           save_section_footer(f, se)
           precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)

       migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_ACTIVE);
       migration_iteration_run

           ; iterate savevm_state and call save_live_pending
           qemu_savevm_state_pending(pend_pre/compat/post)
               se->ops->save_live_pending()

           ; iterate savevm_state and call save_live_iterate
           qemu_savevm_state_iterate()
               save_section_header(f, se, QEMU_VM_SECTION_PART)
               se->ops->save_live_iterate(f, se->opaque)
               save_section_footer(f, se)
           migration_completion()
               qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
               vm_stop_force_state(RUN_STATE_FINISH_MIGRATE)

               qemu_savevm_state_complete_precopy(s->to_dst_file, false, inactivate);
                   ; iterate savevm_state and call save_live_complete_precopy
                   cpu_synchronize_all_states();
                   save_section_header(f, se, QEMU_VM_SECTION_END);
                   se->ops->save_live_complete_precopy(f, se->opaque)
                   save_section_footer(f, se);

                   ; iterate savevm_state and call vmstate_save
                   save_section_header(f, se, QEMU_VM_SECTION_FULL);
                   vmstate_save(f, se, vmdesc)
                   save_section_footer(f, se);

       migration_detect_error
       migration_update_counters
       migration_iteration_finish

虽然这个函数很长,不过整体的结构还算清晰。大致可以分成这么几个阶段:

  • 发送header

  • 建立迁移的准备

  • 迭代传输

  • 完成迁移

其中主要就是通过几个不同的se->ops来实现的。

接收端

从incoming开始

接收端在运行时需要加上-incoming选项,所以我们也从incoming开始。

      qemu_start_incoming_migration()
         deferred_incoming_migration()
         tcp_start_incoming_migration()
             socket_start_incoming_migration()
         rdma_start_incoming_migration()
             rdma_accept_incoming_migration()
                 migration_fd_process_incoming()
                     migration_incoming_setup()
                     migration_incoming_process()
         exec_start_incoming_migration()
             exec_accept_incoming_migration()
                 migration_channel_process_incoming()
         unix_start_incoming_migration()
             socket_start_incoming_migration()
                 socket_accept_incoming_migration()
                     migration_channel_process_incoming()
         fd_start_incoming_migration()
             fd_accept_incoming_migration()
                 migration_channel_process_incoming()
                     migration_tls_channel_process_incoming()
                     migration_ioc_process_incoming()
                         migration_incoming_process()
                             process_incoming_migration_co()
                                 qemu_loadvm_state()

看着要比发送端麻烦些,不过还好找到了各种方式最终都执行到qemu_loadvm_state()。

qemu_loadvm_state

     qemu_loadvm_state()
         qemu_get_be32, QEMU_VM_FILE_MAGIC
         qemu_get_be32, QEMU_VM_FILE_VERSION
         qemu_loadvm_state_setup
             se->ops->load_setup
         vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0)
         cpu_synchronize_all_pre_loadvm
             cpu_synchronize_pre_loadvm(cpu)

         qemu_loadvm_state_main
             section_type = qemu_get_byte(f)
             QEMU_VM_SECTION_START | QEMU_VM_SECTION_FULL
             qemu_loadvm_section_start_full
                section_id = qemu_get_be32
                vmstate_load
             QEMU_VM_SECTION_PART | QEMU_VM_SECTION_END
             qemu_loadvm_section_part_end
                 section_id = qemu_get_be32
                 vmstate_load
             QEMU_VM_COMMAND
             loadvm_process_command
             QEMU_VM_EOF

         qemu_loadvm_state_cleanup
             se->ops->load_cleanup
         cpu_synchronize_all_post_init
             cpu_synchronize_post_init(cpu);

接收的过程相对发送要“简单”,主要的工作都隐藏在了section的三种情况中。

  • QEMU_VM_SECTION_START | QEMU_VM_SECTION_FULL

  • QEMU_VM_SECTION_PART | QEMU_VM_SECTION_END

  • QEMU_VM_COMMAND

第一种代表了setup阶段和最后vmstate_save阶段。 第二种代表了iteration的中间阶段和最后一次完成。 第三种还没有仔细看。

但是至少从前两种情况看,大家都走到了vmstate_load。不错。

SaveStateEntry

迁移过程中起到关键作用的数据结构名字是SaveStateEntry,也就是代码中的se。

    SaveState(savevm_state)
    +--------------------------------------+
    |global_section_id                     |
    |     (int)                            |
    |name                                  |
    |     (char*)                          |
    |len                                   |
    |target_page_bits                      |
    |caps_count                            |
    |     (uint32_t)                       |
    +--------------------------------------+
    |capabilities                          |
    |     (MigrationCapability*)           |
    +--------------------------------------+
    |handlers                              |
    |     (list of SaveStateEntry)         |
    +--------------------------------------+
                      |
                      |
    ------------+-----+----------------------------------+-------------------------------------------+---
                |                                        |                                           |
                |                                        |                                           |
    SaveStateEntry                          SaveStateEntry                       SaveStateEntry                 
    +-----------------------------+         +-----------------------------+      +------------------------------------+
    |idstr                        |         |idstr                        |      |idstr                               |
    |     = "block"               |         |     = "ram"                 |      |     = "dirty-bitmap"               |
    |ops                          |         |ops                          |      |ops                                 |
    |     = savevm_block_handlers |         |     = savevm_ram_handlers   |      |     = savevm_dirty_bitmap_handlers |
    |opaque                       |         |opaque                       |      |opaque                              |
    |     = block_mig_state       |         |     = ram_state             |      |     = dirty_bitmap_mig_state       |
    |vmsd                         |         |vmsd                         |      |vmsd                                |
    |     = NULL                  |         |     = NULL                  |      |     = NULL                         |
    +-----------------------------+         +-----------------------------+      +------------------------------------+

所有的SaveStateEntry结构都链接在全局链表savevm_state上。上图列举了几个比较重要的SaveStateEntry。比如名字叫ram的就是管理RAMBlock的。而且有意思的是,这几个的vmsd都是空。

隐藏的重点

在上述migrate_thread的流程中,有一个隐藏的函数vmstate_save。如果你把这个函数打开,那又将是一番新的天地。今天就先到这里把。

Last updated