Wednesday, March 16, 2011

Bin Packing Problem...

I have a collection of blade servers which host virtual machine guests. Each blade has a known RAM capacity that can safely be allocated to guests (80GB each in my current environment, but eventually some hosts will be larger than others). Each guest consumes a known quantity of RAM, which is variable across the environment. 

The goal is to optimize placement of the VMs to reduce unusable space and leave as many blades as possible unused (so they can be powered off).

I’ve found that for my current environment with uniformly sized hosts, I can calculate a theoretical best consumption like this:
hostCount = 34; // 9 chassis * 4 blades each, -2 utilized elsewhere
hostRAM = 80;
demandRAM = [sum of VM demands]
maxFreeHosts = (int)( hostCount – (demandRAM/hostRAM)) // Integer division should eliminate the need for a cast here, but I was originally working with an Excel formula.

Obviously this would only be true for a “liquid” with variable size occupying a host, not a “solid” with a fixed size.

I’m still working on an algorithm to arrange these optimally, but as best I can tell it is an example of the Bin Packing Problem, Which is either NP-Hard or NP-Complete depending on who you ask.
https://secure.wikimedia.org/wikipedia/en/wiki/Bin_packing_problem

As an additional complexity, Ideally I would like to take note of what host a given guest is on, and minimize migrations, which have a time penalty of a ~5 minutes and cause a <30 second ‘freeze’ of the VM during the move.

Go figure I would stumble on something that sound easy, and turns out to be NP-Hard. When it comes to packing actual bins, my mother has an unnatural ability to do this sort of optimizing in her head, but it turns out that computers aren't very good at it at all.