Reserved resources not released?

Essendon Member Posts: 4,546 ■■■■■■■■■■
Odd one this, guys, so I thought I'd hit up the wise folks at TE.

I had 9 VMs:

- 4 vCPUs and 32 GB RAM each
- all Citrix boxes running a Windows OS
- 32GB RAM reserved (equal to the configured memory). Don't ask why, this was set by another admin.
- some amount of reserved CPU (don't remember the exact figure, but about 3 GHz)

Cluster:

- 3 hosts running ESXi 5, 128GB RAM each, 2-way 6-core units with hyperthreading enabled.
- HA policy: host failures the cluster tolerates = 1
- Several DRS rules in place as requested by the client and vendor. Moderate setting on DRS aggressiveness.
- No resource pools
- About 25 VMs in the cluster in total

Now there was significant ballooning going on in the cluster, with a number of other VMs suffering as a result of these reservations. I got handed the troubleshooting. I checked utilization within the 9 Citrix VMs: none of the guests were using more than 10GB, and CPU utilization sat at about 25%.
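To put numbers on it, here's a back-of-the-envelope Python sketch using the figures above (the per-VM and per-host values are the ones from this thread; treat it as illustrative only):

```python
# How much cluster memory do the 9 Citrix reservations pin?
# Figures are the ones quoted in this thread.

HOSTS = 3
RAM_PER_HOST_GB = 128
CITRIX_VMS = 9
RESERVED_PER_VM_GB = 32  # reservation == configured memory

total_ram_gb = HOSTS * RAM_PER_HOST_GB        # 384 GB in the cluster
pinned_gb = CITRIX_VMS * RESERVED_PER_VM_GB   # 288 GB hard-reserved
left_over_gb = total_ram_gb - pinned_gb       # shared by the other ~16 VMs

# With HA set to tolerate one host failure, usable capacity is one host less:
failover_capacity_gb = (HOSTS - 1) * RAM_PER_HOST_GB

print(total_ram_gb, pinned_gb, left_over_gb)  # 384 288 96
print(pinned_gb > failover_capacity_gb)       # True: reservations alone exceed n-1 capacity
```

With 288GB pinned out of 384GB, the other ~16 VMs were squeezed into the remaining 96GB, which lines up with the ballooning described above.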

I decided to remove the reservations for both RAM and CPU: powered down the VMs, removed both the RAM and CPU reservations, left the CPU shares at High, and powered the VMs back up. Left the configured memory at 32GB.

After the change:

Similar stats within the VMs today (RAM at about 10GB, CPU at 25%). Ballooning is still happening for some VMs, though not quite as much. The hosts don't seem to have released the physical memory backing the VMs back into the cluster. See the screenshots below:



See how the Host Mem - MB column is still sitting at almost equal to the configured memory. The Host CPU column is also sitting at what was previously reserved for the VMs. The second picture is of the Resource Allocation tab of one of the VMs, and the third is the Summary tab of the same VM.

Does anyone know why the previously reserved memory has not been freed up for use by the rest of the cluster?
NSX, NSX, more NSX..

Blog >> http://virtual10.com

Comments

  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    Hosts tab on the cluster:

  • QHalo Member Posts: 1,488
    Active is what you should be looking at. The memory looks so high because those VMs are severely over-provisioned. The RAM should be dropped. What does Windows say about the RAM usage from within the guest?
  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    I completely agree that Active is what I should be looking at, but if you look at the 4th screenshot you'll see that the hosts' RAM usage is very high. Windows reports about 12GB used, which is almost equal to the Active figure, but Private is what's physically backed by the host and not given out to other VMs. This is causing ballooning in other VMs. It's almost as if there's some kind of 'ghost' reservation still there.
  • QHalo Member Posts: 1,488
    My Exchange server shows the exact same thing your res_alloc pic shows. Are there any other cluster reservations being put into play?

  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    Just one other VM has a reservation of 8GB. Has your Exchange server ever had a reservation?
  • QHalo Member Posts: 1,488
    Once upon a time, but I removed it. Don't remember having this problem though. There are some suggestions here. Looks like the easiest is to try a vMotion to another host to see if it frees the balloon driver.

    VMware KB: Balloon driver retains hold on memory causing virtual machine guest operating system performance issues
  • TheProf Users Awaiting Email Confirmation Posts: 331 ■■■■□□□□□□
    Sorry in advance if I misunderstood the question :)

    The reason you're seeing the 29GB of private memory is the way private memory is calculated. To get that figure, the host performs the following calculation:

    the amount of memory allocated to the VM (32GB) - Unaccessed (822MB) - Shared (2.41GB), which works out to the 28.79GB of private memory you're seeing here. It's normal to see that, and it can also change if you over-provision the VMs on the host they're currently running on.
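That calculation, as a quick Python sketch (the MB/GB figures are the ones quoted above):

```python
# Private memory = configured - Unaccessed - Shared, per the breakdown above.

configured_mb = 32 * 1024    # 32 GB configured memory
unaccessed_mb = 822          # pages the guest has never touched
shared_mb = 2.41 * 1024      # pages deduplicated/shared with other VMs

private_gb = (configured_mb - unaccessed_mb - shared_mb) / 1024
print(round(private_gb, 2))  # 28.79
```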

    Now, if at some point you had this virtual machine's memory set to 32GB and a reservation in place, that reservation would be static; in other words, when you change the memory configuration of a VM, the reservation remains the same. To get around that, you can use the "Reserve all guest memory" option, which keeps the reservation equal to the amount of memory allocated to the VM.

    Hope this helps.
  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    @TheProf, I'm afraid you have misunderstood the question :)

    What I meant was that when I removed the reservation, the host hasn't released the memory backing the VM so it can be used by other VMs. The reservation itself is no longer there.
  • scott28tt Member Posts: 686 ■■■■■□□□□□
    I don't think what's happened has anything to do with ESXi or with you removing the reservation. I think it's much simpler: the guest OS or the applications/services inside it have "claimed" the 32GB of memory allocated to the VMs - they've used it because it's there. They might not be actively using it, but they have requested it, and ESXi has physically allocated it as it is designed to do - ESXi is reacting to the demand from inside the VMs.

    The guest OS or application/service behaviour has remained the same when you started each VM back up after removing the reservation, which is why removing the reservation has made no difference.

    Is performance of any of your VMs compromised because of this situation? Do these VMs need to be configured with that amount of memory?
    VCP2 / VCP3 / VCP4 / VCP5 / VCAP4-DCA / VCI / vExpert 2010-2012
    Blog - http://vmwaretraining.blogspot.com
    Twitter - http://twitter.com/vmtraining
    Email - vmtraining.blog@gmail.com
  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    Yes, I realize that ESXi reacts to demand inside the guest OS, but monitoring inside the VMs has revealed that the maximum memory touched was about 12GB. ESXi should have allocated 12GB to each, which is fine - that's how it should be. But the VMs never demanded more than 12GB.

    I suspect this happened because the "Reserve all guest memory (all locked)" setting was checked. So somehow ESXi hasn't released the memory even after the reservations were removed and the VMs were power-cycled.

    The performance of some other VMs has been impacted because the hosts have very little memory to go around. There's ballooning in some VMs during peak usage and performance suffers. Some app servers and a SQL server were the most impacted. Yesterday the SQL VM was hit hard, and the host it was on was swapping heavily to disk with free memory down to a few GB.

    I've logged a case with VMware, let's see what they suggest.

    I believe these VMs were highly over-provisioned to begin with. Citrix said 10 VMs with 2 vCPU/12GB were good, but the decision was made to go with 5 VMs with 4 vCPUs/32GB (with full reservations - don't ask me why!). In the meantime, I've asked that the configuration be dropped down to what Citrix recommended.
  • jibbajabba Member Posts: 4,317 ■■■■■■■■□□
    Did you try moving the VMs to different hosts?
    My own knowledge base made public: http://open902.com :p
  • blargoe Member Posts: 4,174 ■■■■■■■■■□
    When you did your power cycle, did you just restart, or did you power off... wait... power on? I don't think just a restart releases the memory. I think the VM "touches" all of its allocated memory when it starts up, and if it did so while the reservation was in place, the physical memory would have been mapped to the VM at that time.

    No limit values configured for any of the VMs, right?
    IT guy since 12/00

    Recent: 11/2019 - RHCSA (RHEL 7); 2/2019 - Updated VCP to 6.5 (just a few days before VMware discontinued the re-cert policy...)
    Working on: RHCE/Ansible
    Future: Probably continued Red Hat Immersion, Possibly VCAP Design, or maybe a completely different path. Depends on job demands...
  • blargoe Member Posts: 4,174 ■■■■■■■■■□
    Essendon wrote: »
    I believe these VMs were highly over-provisioned to begin with. Citrix said 10 VMs with 2 vCPU/12GB were good, but the decision was made to go with 5 VMs with 4 vCPUs/32GB (with full reservations - don't ask me why!)

    I'll tell you why: either the guy that set this up didn't really understand how VMware works, or the manager required it because he didn't. I've been fighting the same thing. Lots of people want to use VMware to manage their workload, but they are afraid to actually "trust" VMware to manage it. I've gotten the limits and reservations out of my environment, but now I think that has had the unintended consequence of a peer of mine requesting more physical servers because I won't let him have reservations in VMware.
  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    @jibba, yes I have tried moving them to different hosts. Had no effect.

    @blargoe, yes I waited after powering the VMs down. No limits on any VMs. The guy who set this up is not bad; what's needed is a culture shift here. Too many other people "know" vSphere and call the shots.

    The request to drop the amount of configured resources has gone through, albeit partially. These VMs are now 4 vCPU/20GB instead. A few more hours before users walk in for work; let's see what happens!
  • scott28tt Member Posts: 686 ■■■■■□□□□□
    How can you tell that the guest OS never demanded more than 12GB since the moment you powered the VM on? I've never heard of ESXi not releasing memory when a VM is powered off...
  • Essendon Member Posts: 4,546 ■■■■■■■■■■
    Monitoring told me that, Scott.

    Looks like dropping the configured RAM on the machines has fixed the 'issue'. The hosts are now sitting at about 70% RAM utilization, and another host is expected to be added to the cluster in the near future.

    Perhaps this wasn't an issue at all, just behaviour that I wasn't aware of.
  • blargoe Member Posts: 4,174 ■■■■■■■■■□
    Glad you were able to get the memory allocation reduced.

    Take a look at this VMware article.

    https://communities.vmware.com/docs/DOC-10398

    So apparently, whatever memory a VM touches will continue to show as "consumed" memory, whether or not it is currently using it, and will continue to display that way until overcommitment occurs. I'm thinking the Host Mem value for each VM will normally be close to the configured memory because Windows touches all the memory when the VM is powered on. When the host hits the soft/low/hard thresholds (or if you set a limit on a VM), ballooning will occur and the Host Mem value for the VM will drop, and I think it would remain at that reduced level even when the host memory pressure subsides (or the limit is removed), unless the VM touches additional memory again. Some of this activity should be fine as long as the increase in overall memory pressure is gradual enough that the host doesn't hit the swapping threshold.
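    A toy Python model of that behaviour - consumed as a high-water mark that only ballooning pulls back down. This is purely illustrative, not how ESXi is actually implemented:

```python
# Toy model: "consumed" tracks the high-water mark of memory the guest
# has touched; only ballooning under host pressure brings it back down.

class VmMemory:
    def __init__(self, configured_gb):
        self.configured = configured_gb
        self.consumed = 0.0   # host memory physically backing the VM
        self.active = 0.0     # what the guest is touching right now

    def touch(self, gb):
        self.active = min(gb, self.configured)
        # consumed only ever grows with guest demand...
        self.consumed = max(self.consumed, self.active)

    def balloon(self, reclaim_gb):
        # ...until the host reclaims memory under pressure
        self.consumed = max(self.active, self.consumed - reclaim_gb)

vm = VmMemory(32.0)
vm.touch(12.0)      # guest peaks at 12 GB
vm.touch(4.0)       # demand drops; consumed stays at the high-water mark
print(vm.consumed)  # 12.0
vm.balloon(5.0)     # host under pressure reclaims 5 GB
print(vm.consumed)  # 7.0
```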
  • scott28tt Member Posts: 686 ■■■■■□□□□□
    I'm still of the opinion that the guest OS demanded/touched the allocated memory.

    Good to hear that things are fine now.