Now it's time to dive into CPU virtualization. I have to mention here that ESXi 5.x was used as the testing environment for the concepts covered in this and the next posts. Since ESXi is a NUMA-aware OS, the way CPU virtualization works differs based on the underlying system (NUMA or non-NUMA).
For non-NUMA SMP systems
In this type, the ESXi CPU scheduler splits the load from all VMs across all cores or HT logical processors in a round-robin manner (similar to how other operating systems, such as Windows, distribute threads across cores, as we mentioned in the previous posts). Keep in mind that each core or HT logical processor serves one VM at a time. What does this mean?
Assume we have an SMP system with 4 cores and 2 VMs running on top of it, each with 1 vCPU. In this case the ESXi CPU scheduler will split the load of both VMs across the 4 cores in a round-robin manner (Thread01 of VM1 + VM2 on cores 0 & 1, Thread02 of VM1 + VM2 on cores 2 & 3, Thread03 of VM1 + VM2 on cores 1 & 3, etc.).
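To make this round-robin placement concrete, here is a minimal sketch in plain Python (not any VMware API); the core numbering, the two-VM setup, and the hunting pattern are assumptions chosen to mirror the example above, not the scheduler's real internals.

```python
# Minimal sketch of round-robin placement of 1-vCPU VM worlds across pCPUs.
# Illustration only -- not the actual ESXi scheduler algorithm or API.

CORES = [0, 1, 2, 3]      # 4-core non-NUMA SMP host (assumption)
VMS = ["VM1", "VM2"]      # two 1-vCPU VMs

def placements(intervals):
    """Yield a {vm: core} mapping per interval, rotating the starting core."""
    for t in range(intervals):
        start = (t * len(VMS)) % len(CORES)          # hunt to the next core pair
        yield {vm: CORES[(start + i) % len(CORES)]   # one VM per core at a time
               for i, vm in enumerate(VMS)}

for t, placement in enumerate(placements(3)):
    print(f"interval {t}: {placement}")
# interval 0: {'VM1': 0, 'VM2': 1}
# interval 1: {'VM1': 2, 'VM2': 3}
# interval 2: {'VM1': 0, 'VM2': 1}
```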
Below is a test done by loading the vCPU of the MS-AD VM. From ESXTOP we can see that %PCPU hunts across the cores in a round-robin manner.
In case you have 8 VMs, each with 1 vCPU, running on top of this system, the scheduler will split the load of those VMs across the 4 cores. Since each core serves one VM at a time, you will find the %RDY counter increased on all VMs. The %RDY counter describes the amount of time a VM is waiting for a free core, as decided by the CPU scheduler. During this waiting time, the VM is frozen.
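As a quick way to reason about %RDY, the sketch below converts a %RDY sample into ready time in milliseconds, assuming the default 5-second esxtop sample interval; the example value is made up.

```python
# Convert an esxtop %RDY sample into ready time per sample interval.
# The 5-second default interval and the example value are illustrative assumptions.

SAMPLE_INTERVAL_MS = 5_000   # esxtop default refresh interval (5 s)

def ready_time_ms(rdy_percent, interval_ms=SAMPLE_INTERVAL_MS):
    """Time the VM spent ready but not scheduled during one sample."""
    return rdy_percent / 100.0 * interval_ms

# A VM showing %RDY = 10 spent roughly 500 ms of every 5 s waiting for a core.
print(ready_time_ms(10))   # 500.0
```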
Let's consider another example of a vSMP VM on top of a non-NUMA SMP system.
Assume that we have 2 VMs, one with 1 vCPU and one with 2 vCPUs, running on top of a 4-core SMP system. We said previously that the load from all VMs is spread across all cores. Does this mean that VM1 will use core-01 while VM2 will use core-02?
In fact, no! The ESXi CPU scheduler splits a vSMP VM across multiple cores based on its number of vCPUs. In our example, VM01 will use core-01 while VM02 will use cores 02 & 03. Then round-robin hunting starts (VM01 moves to core-02 while VM02 moves to cores 03 & 04, etc.). This gives the VM the extra cycles expected from adding extra vCPUs. On the other hand, it has a major drawback: you need two free cores (or HT logical processors) simultaneously to execute VM02. Until then, the VM is frozen, i.e. a higher %RDY count. Therefore, adding extra vCPUs isn't always an advantage and needs to be sized properly.
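To see why this drives %RDY up, here is a toy simulation under a strict co-scheduling assumption (the real ESXi scheduler uses relaxed co-scheduling, so treat this only as an illustration of the idea): the 2-vCPU VM runs only when two cores happen to be free at the same time.

```python
# Toy illustration of the vSMP drawback: a 2-vCPU VM needs 2 free cores at once.
# This is NOT the real ESXi (relaxed) co-scheduling algorithm, only a sketch.
import random

random.seed(1)
CORES = 4
TICKS = 1_000
waiting_ticks = 0                     # ticks VM02 (2 vCPUs) spent ready but not running

for _ in range(TICKS):
    busy = random.randint(2, 4)       # assume other 1-vCPU VMs occupy 2-4 cores
    if CORES - busy < 2:              # fewer than 2 free cores -> VM02 must wait
        waiting_ticks += 1

print(f"VM02 waited for {waiting_ticks / TICKS:.0%} of the time (higher %RDY)")
```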
Note: Also keep in mind that extra vCPUs need to be considered from a DRS and HA point of view, to make sure that other hosts in the cluster have the same CPU capacity.
Below is a test done by loading a VM named VM-01 which has 2 vCPUs. We can see that two %PCPU columns are loaded at a time, with round-robin hunting. Note that the total %USED is almost 200%, which is the sum of the usage of the 2 vCPUs (you may refer to the ESXTOP Bible to understand %USED in more detail and see exactly how it's calculated).
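As a rough sanity check of the ~200% figure, the snippet below simply sums assumed per-vCPU usage values; the numbers are hypothetical, and the real %USED calculation involves further adjustments described in the ESXTOP Bible.

```python
# Rough sanity check: the per-VM %USED here is read as the sum over its vCPUs.
# Per-vCPU values are made up; real %USED includes further adjustments.

vcpu_used = {"vcpu0": 98.7, "vcpu1": 97.9}   # hypothetical per-vCPU %USED
print(f"VM-01 %USED ~= {sum(vcpu_used.values()):.1f}%")   # ~196.6%, close to 200%
```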
Last in this section, I thought I'd add this example. Two VMs named MS-AD (1 vCPU) and VM-01 (4 vCPUs) are loaded at the same time on top of a 4-core system. Let's look at the ESXTOP output.
If you think that VM-01's %USED is 400% and MS-AD's %USED is 100%, then you are wrong!
The total number of vCPUs is 5 while the total number of physical cores is 4. This means the CPU scheduler will give one core to MS-AD (during this time VM-01 freezes; look at its %RDY). Next, the CPU scheduler will give 4 cores to VM-01 (during this time MS-AD freezes; look at its %RDY). As we said above, the drawback of extra vCPUs is that you need the same number of physical cores free simultaneously to run the VM.
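To put rough numbers on that time slicing, here is a back-of-the-envelope sketch; the 50/50 split between the two VMs is an assumption just to show that neither VM can reach its theoretical maximum when 5 vCPUs compete for 4 cores.

```python
# Back-of-the-envelope: 5 vCPUs (MS-AD: 1, VM-01: 4) competing for 4 cores.
# The 50/50 time split is an assumption purely for illustration.

share = 0.5                      # assumed fraction of time each VM is scheduled
used_ms_ad = share * 1 * 100     # 1 vCPU running half the time  -> ~50%
used_vm01 = share * 4 * 100      # 4 vCPUs running half the time -> ~200%

print(f"MS-AD %USED ~= {used_ms_ad:.0f}%, VM-01 %USED ~= {used_vm01:.0f}%")
# Neither VM reaches 100% / 400%; the missing time shows up as %RDY instead.
```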
Note: From the guest OS, you will find that CPU utilization is 100% for both VMs, since each guest is utilizing the maximum CPU cycles it is given.