Head in the clouds
January 10th, 2010
It seems that due to my recent post, Bioinformatics and cloud computing, I have been labeled a cloud skeptic. While I don’t reject that label outright, I won’t accept it either. If I may label myself, I would call myself a cloud realist. My first piece of evidence is that at the end of my previous post I specifically state, “This is all not to say that there is not a place for cloud and other distributed computing frameworks in bioinformatics, but that’s the topic of a future post.” Unfortunately, this is not the future post to which that statement refers. The purpose of this post is to respond to some of the comments made on that post and around the web.
First, Ben Langmead said,
My main comment is that you're comparing the cloud cost against only one type of cost: the one-time cost of buying new machines and adding them to your (already large; at least at Wash U) pool of computers. That isn't the only relevant number for a lot of people, especially those in smaller institutions and academic departments, because (a) there are recurring costs for electricity, cooling, space, and (b) there isn't necessarily a huge pool of computers (and support staff, and space) to begin with, so the initial cost and effort barrier can be much larger than the cost of the machines per se.
Bob Carpenter then adds similar comments,
To repeat what Ben Langmead said above, the total cost of ownership of a computer, even for a university, is much higher than its purchase cost. For instance, how many computers does each sysadmin manage (or how much time does it take to manage new operating system patches, software installs, etc.)? How much space do they take up? The power for these beasts is not inconsiderable… My wife’s having trouble with her cluster at NYU because the building’s heating and cooling are both tied to the same faulty plumbing system; so even though it’s winter here in NYC, when the heat went out, so did the machine room cooling, so they had to shut down all the machines for a day or two. Just like when the AC went out in the summer.
Finally, a third commenter, writing on another blog, says,
What his numbers don’t take into account is the overhead of running a (possibly single node) cluster. While the fixed cost of purchasing computer equipment might be manageable, especially compared to chemical reagents, the operational costs of running a data center are substantial. Computer equipment needs to be continually serviced, be it for software, security, or kernel patches, or for unscheduled maintenance. In addition, energy costs for running a data center are high and expected to increase in the near future.
Yes, it is true that the cost for the Dell server I quoted was just the purchase price. But the price I quoted for a computing core in our cluster, $500, was a fully loaded cost. As indicated in the post, that fully loaded cost includes the server, rack, networking, electrical hookup, installation, 3-year warranty, etc. In other words, that is the cost to add a core to an existing cluster, and it was provided for those researchers who do have clusters (as opposed to the cost of the Dell, which was provided for those who do not). It does not include system administration, electrical power, or cooling; that is, it covers capital costs only, not ongoing costs. Why did I not include those ongoing costs? Because I did not need to. To keep pace with the sequence data generated by an Illumina GA IIx or two, you don't need any of that stuff! Adding a few cores to an existing computing infrastructure is not going to make a substantive difference in power or cooling. For a lab without an existing computing cluster, all you need is the desk where you sit your bioinformatician. If you are at a normally operating university, the electrical power and cooling for office space are provided from the overhead your university takes out of your grants. If you operate a core facility at a university, then you simply work these costs into the fees you charge (their contributions are several orders of magnitude less than the sequencing reagents). What about labs that have lots of sequencers but not a lot of computing power? Well, that's bad planning and allocation of assets; no one can help you.
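For the skeptics, here is a back-of-the-envelope sketch, in Python, of what a few extra cores actually cost in electricity. The wattage and electricity rate are assumed round numbers for illustration, not figures from my cluster; plug in your own.

    # Rough annual electricity cost of adding a few cores.
    # Both inputs are illustrative assumptions, not measured figures.
    incremental_watts = 100.0   # assumed extra draw for a few added cores
    hours_per_year = 24 * 365   # worst case: running flat out all year
    price_per_kwh = 0.10        # assumed commercial electricity rate, USD

    kwh_per_year = incremental_watts * hours_per_year / 1000.0
    annual_cost = kwh_per_year * price_per_kwh
    print("Added power cost: ~$%.0f/year" % annual_cost)  # ~$88/year

Even before the university's overhead swallows it, that is pocket change next to a single run's worth of sequencing reagents.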
Systems administration costs are a similar story. For researchers with existing clusters, the addition of a few cores to keep pace with a few Illumina instruments will not require them to hire additional IT staff. For researchers without a cluster, I posit that it does not take more system administration effort to manage a single desktop workstation than it would to manage a cluster of Amazon EC2 nodes. Amazon EC2 provides machine images with a stock installation of an operating system. Aside from the fact that you can purchase computers from Dell with Red Hat Enterprise [GNU/]Linux, any bioinformatician worth her salt (or any 12-year-old, for that matter) can install Ubuntu on a computer. Just as the Dell customer will have to install their bioinformatics tools on the system, so too will the Amazon EC2 customer, except they will need to install them on all the nodes they have rented. Regarding maintaining security patches and other updates, that is also dead simple in Ubuntu (although I will readily admit that just because something is easy, it does not necessarily follow that people will do it). The bottom line is that maintaining a workstation used for day-to-day activities and analyzing data from one or two Illumina instruments is more likely to be within the capabilities of a bioinformatician than setting up and maintaining an Amazon EC2 cluster.
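If you doubt the "dead simple" claim, here is roughly what keeping a single Ubuntu workstation patched amounts to, sketched as a Python wrapper around the standard apt commands. Ubuntu's unattended-upgrades package will do the same job with no scripting at all; this is just to show the scale of the task.

    #!/usr/bin/env python
    # Minimal sketch: keep an Ubuntu workstation patched by wrapping
    # the standard apt commands. Run as root, e.g., from a nightly cron job.
    import subprocess

    def apply_updates():
        subprocess.check_call(["apt-get", "update"])         # refresh the package index
        subprocess.check_call(["apt-get", "-y", "upgrade"])  # install available updates

    if __name__ == "__main__":
        apply_updates()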
Another point brought up in the above comments was the reliability of the systems. One of the arguments in this area is that with your own hardware you are responsible for maintaining the equipment, while with Amazon EC2 they manage all the hardware. This is not really the case, though: all of the costs I quoted include a 3-year warranty with on-site service. The reliability argument also involves downtime. If your local systems go down, whether from hardware failures, network outages, power outages, or Armageddon, it is true that you will not be able to do any computations on them, but you are also not going to be able to access your EC2 systems, and those EC2 systems will not be able to pull data from your systems (and in the case of Armageddon, Amazon EC2 will probably also be down).
So that leaves us with the question: what would the fully loaded cost of the Dell workstation be, and what is the break-even point with Amazon EC2? The cost of the quad-core system was roughly $1700, and you only need one core for data analysis. Since you need to buy your bioinformatician a workstation anyway, and it needs an operating system, bioinformatics software, power, and cooling regardless, we'll ignore those costs; the purchase price becomes the fully loaded cost for comparison purposes. Assuming you would otherwise buy your bioinformatician a dual-core system with 1 GiB of RAM (Firefox uses a lot of memory), which costs about $1000, the incremental cost of getting a machine capable of analyzing data is $700, or only $350 per additional computing core. That dollar amount will buy you less than three genomes' worth of analysis on Amazon EC2.
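If you want to rerun that arithmetic with your own numbers, here it is as a short Python sketch. Note that the EC2 per-genome figure is a placeholder assumption, chosen only to be consistent with the "less than three genomes" statement; substitute the actual quote for your analysis.

    # Break-even arithmetic from the paragraph above.
    quad_core_cost = 1700.0   # quoted quad-core Dell workstation, USD
    dual_core_cost = 1000.0   # baseline dual-core machine you'd buy anyway
    extra_cores = 2           # cores gained going from dual- to quad-core

    incremental_cost = quad_core_cost - dual_core_cost   # $700
    cost_per_core = incremental_cost / extra_cores       # $350

    ec2_cost_per_genome = 120.0  # ASSUMED placeholder; plug in your own quote
    genomes = cost_per_core / ec2_cost_per_genome
    print("Incremental cost per core: $%.0f" % cost_per_core)
    print("Genomes that buys on EC2:  %.1f" % genomes)   # ~2.9 genomes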
Bob Carpenter had a few other points worth addressing: viruses and running analyses multiple times. I would argue that the former is an issue regardless of where you run your analysis; plus, for the GNU/Linux systems we are talking about in these scenarios, viruses are much less of an issue than they are for Microsoft Windows. Regarding running analyses multiple times, sure, it may mean you need more than one core to keep up, but it also means you are going to pay Amazon a lot more. With the quad-core system quoted above, you have a whole extra core to spill over into at no cost (two cores for the desktop, one for the single-pass analysis, and one to spare).
Before I close, I would like to thank all the commenters for raising the above points. All of the issues they raised are very important to consider when jumping into the next-generation informatics space. They also made it clear that my previous post was not as thorough as I thought it was when I hit the publish button. In addition to the excellent comments I quoted above, there were also several other good points regarding software in the comments of the previous post that I hope to incorporate in future posts (and hopefully this post will generate a few comments as well).
Bob Carpenter
January 12th, 2010 at 7:15 pm
If you can get away with a single desktop workstation for analyses, by all means go for it. That’s exactly what we do at work. I’d recommend a spare workstation, as you don’t want your desktop computing getting jammed by jobs hammering away multi-threaded.
Admittedly, if you can do all your sysadmin work yourself, and you have time to spare, it’s a very different issue than for those of us whose time is already overbooked.
Clusters just up the ante. My wife’s having to run jobs that take hours over dozens of compute nodes, and often runs and reruns lots of them at the same time for different projects and analyses. The work just won’t fit on a desktop workstation.
I believe part of the argument in this post involves the same quantitative mistake people make to get themselves deep into debt: one more purchase can’t hurt, can it?
If it’s really no trouble for sysadmins to manage “just one more machine” or “one more core”, then why ever have more than one sysadmin? In fact, if a 12-year-old can do it, why have a sysadmin at all?
Have you ever had Dell out for repairs or sat on their phone queue? It’s not fun, even if they do come out pretty quickly. Our Dells broke down all the time when we were using them for speech processing and telephony at SpeechWorks. It’s at least half a day’s lost work if you need to wait for the machine being repaired.
You may not be paying heating/cooling costs for your machines or even space or maybe even sysadmins (different institutions do this differently), and that can seriously change a decision point. CPUs run at least 25 watts these days, and cheaper quad cores run around 80 watts or more per CPU. There’s also memory, disks, heating and cooling the server room, etc. It adds up. So much so that your power circuits may need to be upgraded. Again, if it’s just a single workstation and you have a free high-amp relatively clean power circuit, no problem. If not, at least get a UPS!
Also, there’s the problem with adding the machine that breaks the camel’s back in terms of heating/cooling. This is what happened to me back in the late 1990s at Bell Labs. We bought a bunch of SGI rack computers, mostly for speech and vision statistical processing. Lucent was charging us a fortune for space on the Murray Hill campus (more than Manhattan in the dot com boom), so our cheapskate lab director decided to put the new machines in the same room with the old machines. The problem was that the cooling system couldn’t keep up and temps in the machine room quickly soared over 110 degrees and disks started failing. So the whole thing got shut down for months while they rethought where they could put the new machines and how the lab’s budget could pay for it.
That’s also what just happened to my wife’s cluster at NYU (which went down again this weekend). They added new machines on the “what can a few more hurt?” principle without adding new cooling beyond the building’s own. Not surprisingly, they overheat all the time and get shut down.
Typically, you create a single machine image on EC2 and it is shared across the nodes — you don’t literally have to manage each node in a job. At least for what I’ve seen people do, they run something like Hadoop in a prepackaged configuration, so there’s no configuring the OS at all.
I’m not sure if the cluster software that sequencers require is available prepackaged on EC2 or other clouds, so that could be a huge hassle with EC2. Unless, perhaps, you have that wily 12-year-old on hand!
January 17th, 2010 at 2:23 pm
Well. I’m a skeptic – and proud!