Picture this. An executive at your organization gets an
idea for a big project, one that adds a new product line to your company
and could result in millions of additional dollars in revenue per year.
The whole company is gung ho about this. The new mantra each workday is
“what are we doing to advance Project X?” Cheers are sung each morning.
And, of course, the IT team gets involved and spins up a number of
servers, both physical and virtual, to help out the development team and
put the new product or service into production.
There’s just one thing: All of this happened in 2005. It is
now 2015, a full decade later, and Project X has been replaced by
Projects Y, Z and Omega. Omega is now hosted in the cloud, on
Google Compute Engine, Amazon Web Services or Microsoft Azure. The
executive who championed Project X in the first place is long gone, and
the original IT team that set up all of the computing power for Project X
has transitioned into other teams or out of the company.
Now answer me this: What is the disposition of all of those Project X servers?
It’s 8 p.m. Do you know where your servers are?
Unless yours is among the smallest of organizations, you probably
don't. The authors of a new study on deprecated equipment
agree. Jonathan Koomey of Stanford University teamed up with the
Anthesis Group to study 4,000 servers and found that up to 30 percent of servers in datacenters are turned on,
ready for service, and actively drawing power and consuming resources …
but are not actually doing anything. The study refers to these types of
machines as comatose servers, in that the “bodies” are there and
working and breathing but, like an unfortunate accident victim who is
brain dead, the servers are not actually doing anything. Previous
studies by TSO Logic and the Natural Resources Defense Council (NRDC) reported much the same findings.
To get a
sense of the cost of the problem, think about how much you could save if
you just turned off a third of the hardware that you manage: retired
or reused the licensing, unplugged the hardware, and liquidated the
rest of it. It's a problem with an enormous cost, and even if the study
overstates the figure by half, at 15 percent the cost is still significant.
Why does this happen? Fundamentally it comes down to the
problem of not knowing what you have and what it is doing. It used to be
a little easier to keep track of things because in order to roll out
new servers, you had to requisition one, send a PO, receive it,
inventory it and mark it, so at least you knew what type of silicon you
had in your server closet racks. The operating system and software were
another story, but at least you had a fighting chance.
Virtualization changed all that – now spinning up a new Web
server to host one dinky little authentication task takes minutes and
requires no input at all from finance. Virtual machine sprawl is a real
problem, and while management software like System Center Virtual
Machine Manager and similar VMware tools has long tried to help catalog
and inventory virtual machines – as well as make sense of how they’re
deployed – not every organization has either invested in such tools or
is actively using them. What’s more, it’s incredibly easy to spin up new
virtual machines to take over and consolidate tasks old virtual
machines were handling, and it’s perhaps even easier to forget to
decommission the old virtual machine. Now you have three or four VMs for
every physical server you used to have. What a nightmare.
And then there’s the fact that business process owners do
not always inform IT when things have changed or priorities have
shifted. IT may be unaware that an outsourcer or third party has taken
over a workload, especially if the project was only minimally staffed by
your own IT team. New IT folks might be reluctant to turn off old
servers, because there might be a process or dependent resource they
don’t even know about since they don’t have the full institutional
memory of the previous team.
The cloud does not solve this problem either – in fact, it
might even make it worse. At first blush you might think comatose servers
are the cloud provider's problem to work around – "scale up, guys!" you
might think – but remember who's ultimately footing the bill for that.
Plus, unlike comatose servers in your datacenter, which only eat up
power and network bandwidth, unused servers sitting active on a cloud
solution platform like Azure or AWS are costing you hourly fees. A
reasonably equipped virtual machine might run $0.50 per hour, which
doesn't sound like much until you realize that equates to burning $4,380
every year for every single cloud server you have running that's not
doing anything. Shutting those down is a quick way to reduce expenses
and look great in the next budget review.
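The math behind that idle-VM bill is easy to sketch. In the snippet below, the hourly rate is an assumed illustrative figure, not any vendor's actual pricing:

```python
# Back-of-the-envelope cost of comatose cloud VMs.
# HOURLY_RATE is an assumed illustrative figure, not vendor pricing.
HOURLY_RATE = 0.50          # USD per VM-hour
HOURS_PER_YEAR = 24 * 365   # 8,760 hours

def annual_idle_cost(idle_vm_count, hourly_rate=HOURLY_RATE):
    """Dollars spent per year on VMs that run but do no useful work."""
    return idle_vm_count * hourly_rate * HOURS_PER_YEAR

print(annual_idle_cost(1))    # one idle VM: 4380.0
print(annual_idle_cost(10))   # ten idle VMs: 43800.0
```

Even at a fraction of that rate, a handful of forgotten instances adds up to real money every budget cycle.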
Solving the problem
What are some ways you can reduce the comatose servers in
your organization? Ultimately, the solution is to know what you have and
understand its lifecycle. Barring that, however, there are ways to get
your head around the problem:
- Use a free network scanning tool to get a sense of exactly what you have. This won’t pick up everything, mostly because machines have different network security settings, but it will give you a good starting point and may jog your memory or the memory of your teammates about a group of machines that might still be around.
- Consult with your finance department to see if you can get records of hardware and software purchases by year to piece together a history of machines. If you can grab MAC addresses off an invoice, or you know a particular line of business application was purchased for some group based on the expenditure justification report, it might be easier to track down those machines and find out if they are still around or not.
- Pick low intensity, low usage periods during shoulder and off seasons in your business and just turn old machines off. Chances are, you’ll have a group of machines – maybe a Virtual Server host and a bunch of guests from 2006 – that are still on but you suspect aren’t doing anything. Wait until the week after Christmas, or Spring Break or (for universities) the interim period between terms, and just turn them off. See what happens. Note who complains.
- Put procedures in place. IT should manage server requests with a justification, however brief, and an expected lifecycle. IT should know who owns each workload and whom to contact yearly so that an audit of necessary services can be performed. This should go for physical machines, virtual machines and cloud services alike.
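The first suggestion above, scanning to see what's actually reachable, can be sketched in a few lines of Python. The addresses in the example are placeholders, and a purpose-built scanner such as Nmap will find far more than this single-port TCP sweep; this is only a rough first pass:

```python
import socket

def powered_on(addresses, port=22, timeout=0.3):
    """Rough inventory: attempt a TCP connect to one port on each
    address; hosts that answer are at least powered on and reachable."""
    alive = []
    for addr in addresses:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                alive.append(addr)
        except OSError:
            pass  # refused, filtered, or nothing there
    return alive

# Example sweep of an assumed /24 (addresses are placeholders):
# candidates = [f"192.168.1.{i}" for i in range(1, 255)]
# print(powered_on(candidates))
```

A host that answers on any port is drawing power and deserves a line in your inventory, even if nobody remembers what it does.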
This story, "Are comatose servers your next big IT headache?" was originally published by
CIO.