Feeling the squeeze: Workflow management under sequester

(First of all, let me welcome you to the Lab7 Systems blog, which will be full of our thoughts, insights, and musings for the scientific community and our users.  Enjoy the ride with us! – Varshal)

Whatever your political leaning, whatever your thoughts on the budget mess that the US government is in, you are no doubt starting to feel the effects of the sequester. The NIH has slashed its budget by 5%, or about $1.6B (in a move that Howard Fineman calls “boneheaded”). As a result, researchers need to examine their own labs’ budgets closely and be much more diligent in their operations.

This turn of events comes, rather conveniently, at a time of rapidly decreasing costs for generating DNA sequencing data. The decline is due in large part to cheaper sequencing instrumentation, but it does not account for the cost of managing (which includes, but is not limited to, analyzing) the flood of data generated by the massive increase in adoption of sequencing as a routine scientific technique. Management of NGS data has unfortunately been taken for granted by those who are rushing to sequence. And for what it’s worth, even with the massive cuts, furloughs, and belt-tightening going on, the data still need to be analyzed, and research still needs to move forward. It would seem that streamlining the process of data analysis, and thereby reducing its human costs, would make a lot of sense, right?

Lab7 Systems has done quite a bit of work to try to define the costs of data management, and I’d like to share some of it with you. Given that the driver for the “bioinformatics bottleneck” is the lowered cost of instrumentation, we’ve based our calculations on the growth of the sequencing instrumentation market. Depending on whom you ask, this market is growing by as much as 64% annually (DeciBio, 2011). A more conservative approach, and one the instrumentation industry would still consider a rapidly growing market, would assume a 25% growth rate. Under the latter growth rate, assuming there are about 2,500 NGS instruments in place at the end of 2012 (including Illumina’s HiSeq and MiSeq, LifeTech’s Proton and PGM, and a handful of PacBios, old 454s and SOLiDs), there will be upwards of 15,000 instruments in use by 2016. Making further assumptions about the types of labs in which instruments are placed, how many are in each lab, and how many people are tasked with managing them and their data production, we estimate that the money spent topped $400,000,000 in 2012 alone. This number, assuming the status quo in data management processes, will balloon to almost $2,500,000,000 by 2016.

Oof.

An industry squeezed under sequester, at the mercy of a government that needs to face some austerity, simply cannot support these numbers. So how can we overcome another constriction of the bottleneck? Make it less expensive to manage the data workflow. Refocus the scientists, bioinformaticians, and IT teams who are losing too much time to inefficiency on what they want to do: make scientific progress. Lab7 Systems is working on providing the means to do just this.

Remember the $400 million I mentioned before? If 100% of the efforts reflected in this number were to be managed through workflow software (I know it's not realistic to think that everyone would convert all at once, but I'm making a point here), that huge number would shrink to just under $50 million. That’s over $350 million of savings/repurposing that could have been realized in 2012 alone. Seems like a no-brainer when we think of it that way. Just some food for thought…
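For anyone who wants to play with these figures, here is a minimal Python sketch of the savings arithmetic. It uses only the round numbers quoted in this post and treats $50 million as an upper bound on the post-workflow spend. The last step, which applies the same ratio to the 2016 projection, is purely an illustrative assumption and not one of our estimates.

```python
# Figures quoted in the post, in US dollars.
SPEND_2012 = 400_000_000          # estimated 2012 data-management spend
SPEND_WITH_WORKFLOW = 50_000_000  # "just under $50 million", used as an upper bound
PROJECTED_2016 = 2_500_000_000    # "almost $2.5 billion" projected for 2016


def savings(current, with_workflow):
    """Return (dollars saved, fraction of current spend saved)."""
    saved = current - with_workflow
    return saved, saved / current


saved_2012, fraction = savings(SPEND_2012, SPEND_WITH_WORKFLOW)
print(f"2012 savings: at least ${saved_2012:,} ({fraction:.1%} of spend)")

# Assumption (not from the post): the same fraction carries over to 2016.
saved_2016 = PROJECTED_2016 * fraction
print(f"2016 savings at the same ratio: about ${saved_2016:,.0f}")
```

With these inputs the script reports savings of at least $350,000,000 for 2012, or 87.5% of the spend. Swap in your own lab's numbers to see how the ratio moves.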

Anyhow, this is the beginning of Lab7 Systems’ journey to help out in a time of need, and to address the ballooning costs of NGS data management. We hope that you will follow along with us on our mission to streamline these processes. We’re not out to reinvent the wheel, or to force-feed our agenda to the community; we simply want to enable process optimization, and thereby ease the bottleneck.

Stay tuned to this space. We have a lot in store…
