In a previous post, we talked about sequencing as a big data problem, focusing on the scale of data generated by sequencing instruments. In this post, we'll look at the problem from the perspective of biology to understand the fundamental reasons why sequencing will always be a big data problem.
Informatics 101: Sequencing is a BIG Data Problem
Posted on May 08, 2013
In the early days of next-generation sequencing, I was approached by a local R&D group about designing a cluster for their two new SOLiD instruments. The budget was modest (tens of thousands of dollars) and the expectations were high: a run every ten days for each instrument, or one run a week for the lab.[...]
Posted in Big Data, Bioinformatics, HPC, NGS
Feeling the squeeze: Workflow management under sequester
Posted on April 11, 2013
(First of all, let me welcome you to the Lab7 Systems blog, which will be full of our thoughts, insights, and musings for the scientific community and our users. Enjoy the ride with us! – Varshal)
Posted in Big Data, Bioinformatics, NGS