Sunday, 28 October 2012

PCI Passthrough (DirectPath I/O or SR-IOV) with PCIe devices behind a non-ACS switch in vSphere


As mentioned in the previous post, older PCIe devices won't work with SR-IOV or DirectPath I/O because the ACS capability is missing; the VMkernel blocks the passthrough because of how important ACS is. The following excerpt is from VMware KB 1036811:

"Access Control Services (ACS) was introduced by the PCI-SIG to address potential data corruption with direct assignment of devices. Passthrough of a PCIe device to a virtual machine, using DirectPath I/O, is disallowed by the VMKernel if the device is connected to a switch that does not support Access Control Services"

The check is a sensible protection for traffic flow, but my lab had older PCIe hardware and I still wanted to test these features. SR-IOV initialization was failing, as I noticed in the VMkernel logs. Also, DirectPath I/O PCIe devices weren't recognized even after a reboot (the host kept asking for a reboot).

I wondered how to bypass the ACS capability check, and I found a way!

Select the host and navigate to Configuration > Advanced Settings (Software) > VMkernel > Boot. Search for the parameter VMkernel.Boot.disableACSCheck and enable its check-box.
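I did this through the vSphere Client, but newer ESXi builds also expose kernel boot options through esxcli. A hedged sketch, assuming the kernel settings namespace exists on your build (it may not on older 5.x releases); the host must be rebooted for the boot option to take effect:

~ # esxcli system settings kernel set -s disableACSCheck -v TRUE   # skip the ACS check on next boot
~ # esxcli system settings kernel list | grep -i disableACSCheck   # confirm the configured value
~ # reboot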

Single Root I/O Virtualization (SR-IOV) - Configuration


Here are the configuration steps and the caveats I faced during implementation.

1. Upgrade your server BIOS and NIC firmware to versions that support SR-IOV. For the Dell PowerEdge R710, the minimum BIOS is 6.2.3 and the Intel 82576 NIC firmware is 13.5. Refer to my previous blog "Upgrade Dell PowerEdge Components" to see how to upgrade server components.
2. From the BIOS, enable Intel VT-d and SR-IOV Global.
 
3. Identify the driver for the NIC that will be used for SR-IOV (in VMware terms it is called a kernel module). This can be identified from the vSphere Client or the CLI (I prefer the CLI since it provides more details).

From vSphere Client,
a. Navigate to Home > Inventory > Hosts and Clusters > #Select the host# > Configuration Tab > Networking.
b. For a standard switch, select Properties > Network Adapters tab. For a distributed switch, select Manage Physical Adapters.
c. Select the adapter to be used for SR-IOV. The right pane shows all the details for the NIC.
From CLI,
a. SSH to the server
b. List all the NICs available on the host (see the example below)
c. Identify the NIC of interest and get more details about it. Remember, this NIC should be one of the models that support SR-IOV; refer to the previous blog for the supported NICs.
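For example, from the ESXi shell (vmnic5 is the adapter in my host; substitute your own):

~ # esxcli network nic list              # every vmnic with its driver, PCI address and description
~ # esxcli network nic get -n vmnic5     # detailed properties for a single NIC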

~ # ethtool -i vmnic5
driver: igb
version: 2.1.11.1
firmware-version: 1.5-1
bus-info: 0000:06:00.1

This NIC (Intel Corporation 82576 Gigabit Network Connection) supports SR-IOV, as listed by both Intel and VMware.

4. The driver version for vmnic5 is 2.1.11.1. This version doesn't support SR-IOV, as both the VMware and the Intel support lists show, so you need to upgrade the NIC driver. For the Intel 82576, I upgraded to version 3.4.7.3. Here is a brief outline of how to upgrade the driver.

a. Download the driver from VMware download center based on the version of ESXi (in my case it was VMware ESXi 5.0 Driver for Intel 82580 and I350 Gigabit Ethernet Controllers)
b. Unzip the driver file and identify the file named #driver_name#-#version#-offline_bundle-#build_number# (in my case it was igb-3.4.7.3-offline_bundle-804663).
c. From vSphere Client, navigate to Home > Solutions and Applications > Update Manager > Patch Repository.
d. Select the Import Patches hyperlink and select the file from your machine (igb-3.4.7.3-offline_bundle-804663).
e. Create a new Baseline with type as Extension and include your driver in this baseline.
f. Navigate to Home > Inventory > Hosts and Clusters > #Select the desired Host# > Update Manager Tab.
g. Attach the baseline to your host and Stage/Remediate.

You can verify the update by logging in to the CLI.

~ # ethtool -i vmnic5
driver: igb
version: 3.4.7.3
firmware-version: 1.5-1
bus-info: 0000:06:00.1

Also, you can update the driver from the CLI instead of VUM, as sketched below.
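A rough sketch of the CLI route, assuming the offline bundle has been copied to a datastore on the host (the path below is only an example); put the host in maintenance mode first and reboot afterwards:

~ # esxcli software vib install -d /vmfs/volumes/datastore1/igb-3.4.7.3-offline_bundle-804663.zip
~ # esxcli software vib list | grep igb    # confirm the igb VIB version now installed
~ # reboot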

5. Create a Host Profile for the host that has the NIC to be used for SR-IOV. To create the profile, right-click the host > Host Profile > Manage Profile.
6. Navigate to Home > Management > Host Profiles. Select the profile of interest and Edit Settings.
7. From the settings, navigate to Kernel Module Configuration > Kernel Module > #Driver_Name# > Kernel Module Parameters > max_vfs > Module Parameter Settings.
Note: If your driver doesn't support SR-IOV, the max_vfs kernel parameter won't be listed under the driver. This is another way to check whether your driver supports SR-IOV.

Enter the maximum number of Virtual Functions (VFs) per Physical Function (PF). For a dual-port 82576, enter the value as max_vfs=x,x. The maximum number of VFs per PF depends on the NIC; for the Intel 82576 it is 16.

Note: A value of '0' means that SR-IOV is disabled for the specified PF.
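If you prefer not to use a host profile, the same module parameter can be set directly from the ESXi shell. A sketch assuming the igb driver and a dual-port NIC (the host still needs a reboot for the change to take effect):

~ # esxcli system module parameters set -m igb -p "max_vfs=4,4"   # request 4 VFs on each of the two PFs
~ # esxcli system module parameters list -m igb | grep max_vfs    # verify the configured value
~ # reboot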

8. Apply the profile to the host after the modifications and reboot the host. 
9. Once the host is up, select the host and navigate to Configuration > Network Adapters. The NICs enabled for SR-IOV won't be listed there. Navigate to Advanced Settings under the Hardware section and you will see the VFs listed as passthrough devices. At this stage, you have completed the SR-IOV configuration (a quick CLI check is sketched below).
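You can also confirm from the ESXi shell that the VFs were created; each VF shows up as its own PCI device (the device name is what my 82576 reports, yours may differ):

~ # lspci | grep -i "Virtual Function"                      # one line per VF created by the driver
~ # esxcli hardware pci list | grep -i "Virtual Function"   # matches the device-name line of each VF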
10. Shut down your VM > Edit Settings > Add > PCI Device and select a VF. Your VM's guest OS should support SR-IOV and have the VF drivers installed.
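Assuming a Linux guest, a quick check inside the VM that the VF is visible and its driver is loaded (igbvf is the Linux VF driver for the 82576; other NICs use different VF drivers):

# lspci | grep -i "Virtual Function"    # the VF appears as a PCI network device in the guest
# lsmod | grep igbvf                    # confirm the VF driver is loaded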