I've been having performance issues with a CRM environment that is housed in 3 VMs.
Application VM1: 2 vCPUs and 8GB RAM
Application VM2: 2 vCPUs and 8GB RAM
SQL database VM: 4 vCPUs and 12GB RAM
Users are experiencing performance issues when running reports, which run against the database as you'd imagine. They reckon the system slows down considerably when they do this. There are about 400 users accessing this system; I don't know how many of them use it concurrently. I have also had reports of people experiencing slowness even when not generating reports. The CRM sys admin has reported 100% CPU utilization when people are generating these reports and 80-90% utilization at other times.
I had a look at the SQL VM's advanced performance charts and Ready times are quite high (almost 10%). The screenshot shows the average to be 1732ms; I have seen it higher than 2200ms. Rounding 1732 up to 1800, that works out to 1800 / 20000 * 100 = 9%. But spread across the VM's 4 vCPUs that's only ~2.25% per vCPU. That's not high, is it?
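To make that arithmetic easy to re-run against other samples, here is a minimal sketch of the conversion. It assumes the 20000 divisor is the 20-second sample interval of the real-time chart, and that dividing the VM-level figure by the vCPU count is a fair way to get a per-vCPU number; `ready_percent` is just a name made up for this post.

```python
def ready_percent(ready_ms, vcpus=1, interval_s=20):
    """Convert a CPU ready summation value (ms) into percentages.

    Assumption: interval_s is the chart's sample interval
    (20 s here, matching the 20000 ms divisor used above).
    Returns (VM-level percent, per-vCPU percent).
    """
    vm_pct = ready_ms * 100 / (interval_s * 1000)
    return vm_pct, vm_pct / vcpus


# Values quoted above for the 4-vCPU SQL VM
for sample_ms in (1732, 1800, 2200):
    vm_pct, per_vcpu_pct = ready_percent(sample_ms, vcpus=4)
    print(f"{sample_ms} ms: {vm_pct:.2f}% for the VM, {per_vcpu_pct:.2f}% per vCPU")
```

If the numbers come from a chart with a different rollup interval, pass that interval in seconds as `interval_s` instead of 20.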
One host runs all 3 VMs, and there are 3 other hosts in the cluster. Each host has 2 sockets with 6 cores per socket and HT enabled. This host runs five VMs with 4 vCPUs, ten VMs with 2 vCPUs and nine VMs with 1 vCPU. Going by what I read in the link below, the problem is that these larger VMs are running on a host that has lots of smaller VMs.
CPU Ready Time in VMware and How to Interpret its Real Meaning - Jonathan Kehayias
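For reference, the overcommit on this host can be worked out from the VM mix above. This is only a rough sketch: it assumes the counts listed are every VM on the host, and the variable names are my own.

```python
# VM mix on the host: vCPUs per VM -> number of VMs (figures from the post)
vm_mix = {4: 5, 2: 10, 1: 9}

sockets = 2
cores_per_socket = 6

total_vcpus = sum(vcpus * count for vcpus, count in vm_mix.items())
physical_cores = sockets * cores_per_socket
logical_cpus = physical_cores * 2  # HT enabled: 2 threads per core

print(f"vCPUs allocated: {total_vcpus}")
print(f"Physical cores:  {physical_cores} ({logical_cpus} logical with HT)")
print(f"vCPU to core ratio:        {total_vcpus / physical_cores:.2f}:1")
print(f"vCPU to logical CPU ratio: {total_vcpus / logical_cpus:.2f}:1")
```

That comes to 49 vCPUs on 12 physical cores, so roughly 4 vCPUs per core, or about 2 per logical CPU if the HT threads are counted.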
The CPU ready times on the other two VMs are very low, around 1-2%. Disk latency for the host is low; GAVG values are about 2.5.
I intend to do the following. What do the smart cats at TE recommend?
- Decrease the vCPU count on all 3 VMs. Bring the SQL VM down to 2 vCPUs and the other two down to 1 vCPU each.
- Put in a DRS rule to separate these VMs onto different hosts. This may or may not be a good idea because the other hosts are running similar workloads.
- Introduce a culture shift away from believing that more vCPUs are always better. The CRM sys admin wanted me to give him 12 vCPUs.

It's difficult to obtain downtime for this environment, so do I bring down the vCPU count on all VMs in one hit, or just do the SQL VM for now and see how it goes? Bear in mind that the ready times for the other two VMs aren't high.