I was talking with a customer today about an application we're considering virtualizing. The vendor only recommends purchasing HP DL580 servers, loaded with lots of RAM and 4 CPUs. I personally like the idea of DL380 servers with 2 CPUs each, so instead of having 2 or 3 DL580 servers, you get 4 or 6 DL380 servers… the same number of CPUs and the same amount of RAM.
In my opinion, if I am building a cluster and a 2-CPU-socket box goes offline, my resource pool shrinks by a smaller amount than if I lose a 4-CPU box. Same goes for RAM… I would much rather lose 64GB of RAM than 128GB of RAM (assuming we put 32GB per CPU socket).
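To put rough numbers on that, here's a quick sketch comparing how much of the cluster one host failure takes offline in the two designs. The host counts (3×DL580 vs 6×DL380) and the 32GB-per-socket figure come from the scenario above; the function name is just for illustration:

```python
# Compare the share of cluster capacity lost when a single host fails,
# for scale-up vs scale-out designs with equal total sockets and RAM.

def failure_impact(hosts, sockets_per_host, gb_per_socket=32):
    """Return total cluster RAM, RAM lost with one dead host, and the
    percentage of the resource pool that failure represents."""
    total_sockets = hosts * sockets_per_host
    return {
        "total_ram_gb": total_sockets * gb_per_socket,
        "ram_lost_gb": sockets_per_host * gb_per_socket,
        "pct_pool_lost": 100 * sockets_per_host / total_sockets,
    }

scale_up = failure_impact(hosts=3, sockets_per_host=4)   # 3x DL580
scale_out = failure_impact(hosts=6, sockets_per_host=2)  # 6x DL380

# Both clusters total 384GB, but one dead DL580 takes 128GB (a third of
# the pool) offline, while one dead DL380 takes only 64GB (a sixth).
print(scale_up)
print(scale_out)
```

Same total capacity either way; the scale-out cluster just loses half as much per failure.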
So my question is which do you prefer? Am I missing something in my theory?
Your comments are appreciated.
Do you know the probability of an HP server going offline? One reseller of mine claims it's very likely. I've only been involved with small shops of half a dozen servers or so, but in 20 years I've never experienced a single Compaq server going offline. Am I just really lucky? How do you do your risk analysis?
Also, what's the difference in power consumption between your models above? Energy costs never seem to come up in any of our planning, but there are a couple of guys here who are obsessed with reducing our company's overall power consumption, and the servers are a big part of that. Not because reducing fuel bills will have a big impact on the bottom line as such, but because we feel a moral duty of care, as a company, to the environment.
Finally, six HP carepacks are going to be a lot more expensive than three, aren't they?
What about the cost of I/O? Smaller 2-CPU boxes will need twice the I/O cards, network ports, and FC ports. That all adds up. Depending on the complexity of your network, that can play a big part in the decision.
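The port math is easy to sketch out. The per-host counts below are assumptions (a typical dual-fabric FC setup and a handful of NICs per host), not anything from the original scenario:

```python
# Each host needs the same set of edge connections regardless of its
# size, so doubling the host count doubles the switch ports consumed.

NICS_PER_HOST = 4      # assumed: mgmt, vMotion, 2x VM traffic
FC_PORTS_PER_HOST = 2  # assumed: one HBA port per fabric

def edge_ports(hosts):
    """Total network and fibre channel ports a cluster consumes."""
    return {"nic_ports": hosts * NICS_PER_HOST,
            "fc_ports": hosts * FC_PORTS_PER_HOST}

print(edge_ports(3))  # 3x DL580: 12 NIC ports, 6 FC ports
print(edge_ports(6))  # 6x DL380: 24 NIC ports, 12 FC ports
```

With FC switch ports and licensed SAN ports being as expensive as they are, that doubling is real money.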
Guess the answer is as always “it depends”.
What is the advantage of the HP carepacks anyway? With HA you can generally afford a few days' downtime.
We are an HP Enterprise partner who sells, then manages, the customers' systems… On the subject of carepacks – we typically use 6hr Call-to-Repair carepacks on physical servers, as that gives you a "committed" repair time and access to the local spare parts depot. On clustered virtual servers we go with 4hr onsite response – the 4hr onsite response is a couple of hundred dollars cheaper, although you don't have a "committed" repair time and the parts may or may not be available in the local depot.
As for failing boxes, we've seen higher failure rates on G6 and G7 boxes than on all the G1 to G5 boxes we've ever deployed combined. Typically it's been the integrated P410i, resulting in system board replacements.
Thanks for the comment Dean,
We have deployed a lot of G6s and G7s and haven't had any trouble with the motherboards… so far, anyhow. Let's hope it keeps going that way 🙂
I'm for scaling out. Obviously it does mean more servers to administer, but as you said, it's advantageous if you have a server failure. Another factor could be environmental: your DC cooling may cope better if your servers are smaller and spread out more.
We have a mixed bag of G4, G5, G6, and G7 series rackmount and blade servers, and I have never had a complete failure of a server. Components have failed, but they are mostly hot-pluggable. This is the beauty of HP hardware in my view.