Migration of physical SQL Server to a new VMware Environment.
As announced in my last post about Migration of SQL Server with PowerShell dbatools #PSBlogWeek I’ll write a recap of the first three parts of the project – and here we come.
- Why (do we need to do this project?)
- Ordering the hardware
- Configuration of the VMware environment
- Detail configuration of the individual virtual machines
- Migration of SQL Server with PowerShell dbatools (separate article)
It all begun with the idea of reducing the amount of seven physical server we use for our SQL Server databases.
- old servers are running out of support and have to be replaced.
- the amount of SQL Server Core licenses can be reduced.
- better High Availability and shorter restore times in case of a Desaster Recovery.
- fewer unused ressources of the physical server.
To analyze and select the hardware, we used a Dell tool (DPack) to record the performance data of the seven physical servers for over a week. We saw a peek there of 11.000 IOPS that needs to be fulfilled by the VMware Environment.
But that’s only a peek and therefore considered as the maximum value. The mean at 95% is 2475 IOPS.
Here is the (german) summary for the 7 servers with the determined data.
It says we need 661 MB/s, or 11074 IOPS (2475 by mean 95%) for the seven server with 22 CPUs and 116 Cores.
Therefore a selection of 3 Dell Poweredge servers (R730), two Dell 10G switches (S4048T) and two SANs (Compellent SC4020 with SSDs) was proposed as a central storage.
For the VMware version, the Essentials Plus version is enough for us, as this will be a completely independent environment. Windows 2016 Datacenter with the correct core number serves as the operating system in the VMs.
The SQL Server are licensed with SQL Server 2016 Standard incl. SA in the Core variant, a single one is raised to Enterprise 2016 for a Mobile Reports Reporting Server (SSRS).
Unfortunately, one of the two SANs was canceled for the order due to financial reasons, which reduced the total price by about 50K. The lack of the SAN does not affect the topic of HA to any great extent, but is crucial for disaster recovery.
In the event of a complete failure of the SAN, the mission-critical productive databases are no longer available on another SAN in a very short time, but must be restored from the backup. Here the goal of the shorter Disaster Recovery times is in my opinion again far off.
The technology behind the implementation:
The three servers (Dell PowerEdge R730) each contain 2 CPUs with 12 cores, that is a total of 72 cores and 512 GB RAM each. There are two micro SD cards for the operating system in the servers (VMware 6.0 – 6.5 is not yet supported by the SAN). The Compellent SC4020 SAN has 9 SSDs of 3.82 TB each, one of which is configured as a HotSpare. The volumes are available as RAID 6 or 10.
Server and SAN are connected via the Dell S4048T switches with 4 x 10G each.
First, the physical installation of the environment was done in the data center. The problem arose that the switches were delivered without OS. So it was necessary for the switches and then the two controllers of the SAN to provide serial ports to the device. My colleague provided me with a 4-way serial to USB / network adapter based on a Raspberry. At the same time, FTP, TFTP and several other servers proved to be very helpful for the setup.
At the same time, the servers and the SAN could still be reached via their management ports (Dell iDRAC), which was also necessary for the basic installation.
On the basis of the basic structure (picture below) you can see the architecture of the area well. At the top the two servers, including the two switches to which then the SAN is connected. Of course, everything is wired twice to achieve the greatest possible reliability.
The environment shown in the picture differs from ours only that two servers are used here, but three servers are used in our environment as hosts. Of course, there are other internal connections between servers, switches and the SAN to separate, for example, VMware, VMotion, iSCSI and other traffic from the actual traffic to the rest of the network. Its connection is then made via two breakout cables with 4 x 10G which are connected directly to our central core switches.
The basic installation was then carried out by the mentioned Dell Remote technician.
• Server firmware updates
• VMware basic installation on the SD cards
• Switch OS installation and configuration
• SAN setup, creation of volumes
• iSCSI setup in VMware and connection of SAN volumes
During the subsequent tests of the various failure scenarios (switch, server, SAN controller) an error occurred during the reboot of one of the switches. Dell support has then replaced the switch directly.
This section is now about the detail configuration of the individual virtual machines.
This SQL Server VMware environment is now the third stand-alone VMware environment. Each environment consists of three ESXi hosts connected to one or two SANs.
So far, in the second or „Application-VMware“ called environment, we have not thought so much about the VMs and their possible performance requirements.
But that changed with the introduction of this SQL Server on VMware environment, as we wanted to turn high performance productive servers from physics to virtual without sacrificing speed.
Basically it was, as described in the beginning of the article, of course, the replacement of existing hardware with new – in the form of VMware Hosts. Behind it was, of course, in the implementation and the requirement that it should not slow down in any case.
That’s why I’ve studied the requirements of SQL Server in such an environment and read various guides and best practices. These are available from Microsoft, VMware and several other sources (I have linked a part of them below). Particularly helpful was a series of articles by David Klee on the page sqlservercentral.com called „Stairway to SQL Server Virtualization“ (1).
The majority of the following settings / configurations is taken from this article because there, unlike the other sources, directly and understandably the most important points are highlighted and shown by example.
1. Storage / Disk Partitioning
Basically the same applies here as for physical hardware: distribution of access to as many disks as possible, or here SAN paths (LUNs). Since this is often the same target on the SAN, only distinguished by different RAIDs (6 or 10), the only noticeable difference is the buffer of each individual path (LUN).
This also means that the correct RAIDs have to be passed through the different LUNs to the VM. Accordingly, hard disk configuration and allocation is as complex as in „physical life“.
It looks like this, for example – closely following the recommendations of David Klee (1):
As mentioned in the article, it is also of particular importance to use the correct hard disk adapter, ie controller (see SCSI ID), not the standard but the Paravirtualized Adapters (Paravirtual SCSI Driver – PVSCSI). For partitions C: (OS) and Y: (Pagefile) in the table above, this can be left with the default LSI SAS Controller.
The other should be set up on the mentioned PVSCSI. However, the corresponding controller and thus the drives are only present in the VM when the VMware Tools are installed.
Unfortunately, the assignment of the hard disks to the appropriate controller now only works in the vSphere web client.
The hard disk type used is fixed disk sizes (thick provision lazy-zeroed or eager-zeroed) as static VHDs. Although thin provision may be more economical in the beginning when consuming disk space, it will consume additional resources when the VMDK files are enlarged in the background because the space is needed. This is – similar to the autogrowth of a database – most often exactly the wrong time and then possibly causes problems in the VM.
The CPU settings are also a bit more complicated.
For the whole topic, note how many physical CPUs the ESX Host has the VMs on. This should be considered as the maximum that should be allocated in the VMs – an overbooking of the CPU resources, assigning more CPU than the host has available does not make sense for SQL Server.
In addition, one could still consider in connection with the size of the memory (point 3) the subject NUMA (Non Uniform Memory Access). This means that accessing the memory directly allocated to a CPU is faster than accessing „remote“ memory. To what extent this has an impact on the performance of the VMs, I can only estimate so far, as I have not received any information from Dell, which tells me the correct configuration of our servers used.
The basic CPU settings are simply distributed between the virtual sockets (processors) and cores per socket (cores per processor).
In the screenshot you can see that there are 12 cores associated with this VM, which then have to be licensed under current license conditions for the SQL Server (no guarantee!).
The allocation of the necessary memory for a VM mainly depends on the size of the databases, which should be as completely as possible in the main memory for performance reasons. To add is still a share for OS and SQL Server itself. The screenshot above currently has 192 GB allocated.
As already indicated in the case of CPU there are still missing technical details missing for the configuration.
For the network, you should use VMXNET3 adapters, which also do not arrive in the VM until the VMware Tools are installed. Basically, there is still room for teaming two network adapters in the VM – not necessarily as a redundancy topic, but again with two additional buffers Speed advantage – though probably minimal. That still needs to be tested. This point was later added by another document from Idera „Moving SQL Server to a virtual Platform“ (6). For this reason, this is also not included in the original configuration.
The entire configuration refers to the settings of one VM. Of course, the complete installation incl. SQL Server was finally migrated to a template to quickly create another SQL Server.
• Set Powerplan to highest performance – in the VM as well as the ESXi hosts.
The configuration for the VM goes, in addition to the way through the energy settings, also with the Powershell dbatools with the command Test-DbaPowerPlan (https://dbatools.io/functions/test-dbapowerplan)
More about the dbatools in the 4th article of this article series „Migration“:
1. Stairway to SQL Server Virtualization by David Klee on sqlservercentral.com
2. Understanding NUMA – https://technet.microsoft.com/en-us/library/ms178144(v=sql.105).aspx
3. VMware Best Practices SQL Server – http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/sql-server-on-vmware-best-practices-guide.pdf
4. VMware Performances Best Practices – https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-perfbest-practices-vsphere6-0-white-paper.pdf
5. Moving SQL Server to a Virtual Platform from Idera (registration required) – https://www.idera.com/resourcecentral/whitepapers/movingsqlservertoavirtualplatform
6. Brent Ozar’s page on Virtualization Best Practices: https://www.brentozar.com/sql/virtualization-best-practices/ (the page does not seem to exist anymore)
Topics for the other distributions of the project
• Describe migration of a SQL Server with dbatools, W2K16, SQL 2016 SP1 – done in this article: https://blog.volkerbachmann.de/2017/10/19/migration-of-sql-server-with-powershell-dbatools/
• Performance comparison physical <-> virtual – to be done!
Any comments are welcome!
Thanks for reading,