Next generation sequencing (NGS) has given the genomic research community ultra-high throughput, scalability, and speed that could only be dreamed of 20 years ago. After he won the Breakthrough Prize last month, we speak to Professor Shankar Balasubramanian about his work on NGS, his incredible translational journey and the power of the pub…

First and foremost, congratulations! You have been awarded some of the most prestigious science and technology awards over the last 2 years, including the Breakthrough Prize and Millennium Technology Prize. Do big awards like this validate your work?

Thank you very much. Prizes like this are of course a great honor, but I guess I’d say, I think we as scientists don’t do what we do for prizes. So actually, I think the biggest validation of anything we do or contribute to is whether it ends up providing useful knowledge or useful capabilities. That’s the most important validation for things that we do.

The recent big prizes have all centered around your work on next generation sequencing – just how big has the shift been in sequencing capability since its invention?

Well firstly, it’s important to say many people have contributed to this – there’s a big team of people behind it. When I first started working in the chemistry of DNA and genomics, around 1996, it was the early phases of the Human Genome Project and the machines of that time used Sanger sequencing. Typically, they were sequencing in the order of hundreds of thousands, perhaps approaching a million, letters of DNA per experimental run. So that was the benchmark then. But for NGS now, with the latest automated machines, they will sequence trillions of DNA bases per experiment. So, the difference is about a million-fold.

So, if you have one of these machines, you have a capacity that’s probably about 1,000 times higher than the global capacity as it was during the early phases of the Human Genome Project. That means laboratories have a capacity to do things that were unthinkable 20 years ago. Now, along with capacity and speed comes a reduction in cost.

Yes, there was a time not too long ago that getting a full genome sequence for less than $1,000 was the ultimate, almost unimaginable, goal of sequencing…

That’s right. And I suppose the motivation behind what people such as myself were trying to enable, going back 25 years, was really to enable population-scale human genome sequencing. While a reference genome was the breakthrough in the Human Genome Project, we’re all different. And in the case of diseases, like cancer, each cancer is unique in a unique individual, so you’re not going to get all that information from just one genome. So there was a need to develop a capability to ultimately make it scalable to a population.

Now, it’s slightly unfair to draw a comparison with the cost of the Human Genome Project, because that was the very first genome, and there was no reference to go on. It was a lot more work than it now takes to sequence a human genome, given a reference genome to align it to. But nonetheless, the cost of the Human Genome Project was of the order of billions of US dollars. And the cost of a high-quality human genome today is, I think, quoted at around $600. So again, there’s a sort of almost million-fold shift in the economics of tackling this sort of project.

Can you take us back to the early days of how you wanted to approach this scalability issue – you and Professor David Klenerman worked together closely, and like all good Cambridge-based DNA breakthroughs, initial ideas were kicked around in the pub…

Well, I’ll get to the pub in a moment! I’m an organic chemist and biochemist and David is a physical chemist and a laser spectroscopist. So really, scientifically, we’re from quite different worlds. And we happen to be in the same department. And we were brought together because I had a problem that needed a particular laser which I didn’t have. I’d heard about this clever guy in the department and someone said I should talk to him, and that’s what brought David and me together and got us talking.

We created an idea for a project to build a single molecule fluorescence microscope in order to watch a DNA polymerase, one of the enzymes that synthesise and copy DNA when cells replicate. We started the work with a couple of postdocs – Mark Osborne and Colin Barnes, who were here right at the beginning of all this. And so, the fundamental idea was, in fact, not to sequence DNA at all.

But it was during this work that I thought it would be exciting to apply this new observation technique to more closely understand arguably one of the most important chemical reactions in life – the synthesis of DNA. This is where the pub comes in.

We used to go to a pub around the corner… I think pubs and tea rooms are important places to relax, brainstorm and have conversations. Often you do that if you’ve got stuck with a problem and need a different sort of ambience to try and unstick the problem. When we did that, we saw the potential of our approach. What we were doing was watching DNA being synthesised one molecule at a time on a surface. But that is very hard to do and you can easily miss it – so we decided to watch lots of molecules in parallel. That way, there’s a better chance of catching events. And we went from that, to seeing the potential to actually decode the sequence of lots of DNA molecules attached to a surface in parallel.

Partly, it was these pub discussions where we saw it so strongly that, you know, we felt we had to do something about this.

From quite an early stage the route you took to develop the idea was a commercial one – forming a company and raising funding. Why was that?

We knew the idea had enormous potential and we saw a biochemical and chemical way of doing this, and we could see how to image it. And even once we’d done some of the proof of concept to show it would be possible, there was a lot of hard work still ahead to bring it all together.

Now, early on we did go and talk to some colleagues at the Sanger Institute, who were involved in the early phases of the Human Genome Project. And actually, one of the people we spoke to was David Bentley, who’s now chief scientist of Illumina, to just understand the Human Genome Project and share with them that we had an idea that could increase the capacity. We were left feeling that if we could make this happen, somehow, there was just no question that the world in 10 years’ time would be ready for such a system.

We thought about writing grants – we even wrote a couple of preliminary grants to support what had turned from basic exploration into a focused development project. But in the end, we took it on a commercial route. We’d encountered some people from a venture capital company and we decided to start a company, primarily, because we couldn’t think of any other way of pulling together the kind of resource that you’d need to do this. There weren’t Cancer Grand Challenges and things back then in the mid to late 90s! We couldn’t find any other mechanisms. So we raised money from a venture fund, and started a company that we called Solexa – which is now part of Illumina Inc.

We went out to raise the money in 1997, and formally formed the company in 1998. For two years we incubated in the university, in the chemistry department, and built a team to do some of the early work. As the headcount increased, we packaged the company and exported it out of the university and raised more money to build a management and development team to fully develop the technology into a working commercial system.

You went from a blue skies researcher to a very translational researcher involved in company formation and spinouts – what advice can you give to researchers who might take this route?

The first thing to say is that had we not been doing the blue skies research to begin with, research that had no specific application in mind other than to build knowledge, we certainly would not have gone down this pathway of developing the sequencing technology. So, the blue skies research was important. And the observations we made are what led us in a translational direction, even though that wasn’t what we set out to do.

If you see the potential of blue skies research, it’s definitely worth pausing at that point and thinking about what the use might be – you know, asking how it might change things, even though it might take 10 or 20 years. One of the questions the first investor who funded this at the beginning asked was: “Say all this stuff that you’re telling me works at the level that you say it might. What’s the market for that sort of thing today?” And I said: “That’s easy, it’s zero.”

Because, of course, there was the Human Genome Project ongoing, but no one was really setting out to sequence whole genomes routinely at that point in time. But that didn’t stop us. And it also ultimately didn’t stop the investor. So, a lot of translation is about finding another solution to a problem that may be incremental, but for certain types of problems, it needs to be more than incremental, because the world will change by the time you have a fully working method or technology. By the time it’s working, it still has to be attractive. And sometimes that’s going to take a decade or even more.

I’d also say it’s easy to start a venture or company, but actually, you have to ask yourself, ‘Am I still going to be motivated to drive this years down the line?’ And if you’re one of the people who started it, your enthusiasm to continue pushing it is important, because if you lose interest, it’s possible that everyone else will as well. So, you need to make a commitment that’s medium to long term.

What do you see as the future for sequencing?

So, one aspect of this is to go for faster and cheaper solutions. I think that will continue as long as there’s commercial pressure to drive those sorts of improvements. Now, much of the work on extracting value from DNA to provide insights about basic biology and cancer as well as other diseases has been driven by genetics. So that’s the nucleotide sequence. What I would say is that there’s much more information than just that sequence and the mutations and variants that one can get from DNA. There are other layers of information that can be extracted that can tell you about what’s going on in the cell – what’s going on in the system – during normal biology, and what gets perturbed in disease biology.

One of the areas that I’ve been active in is epigenetics, but there are other modifications in DNA. And more than one actually – DNA methylation is one, but there’s also hydroxymethylation, which was only really proven in human DNA in 2009. So they are, if you like, the fifth and sixth letters of the DNA, and, in fact, there are also several other modifications that occur in human DNA and we don’t quite know what they’re doing.

Now, these letters provide other information about what’s going on in biology that is orthogonal to the information provided by traditional sequence data. I started another company some time back, in 2012, called Cambridge Epigenetix. They developed a technology that allows you to get five or six letters from sequencing, not just four, with the extra letters providing information about the biology in addition to the genetic information.

I think we’re going to learn more about biology, normal biology and disease, by looking at not just genetics, but genetics plus epigenetics. So, I think there’s a lot that will come out of the scientific and clinical literature with this sort of capability.

The Breakthrough Prize in Life Sciences is awarded for transformative advances toward understanding living systems and extending human life.

The Prize was founded in 2013 by Sergey Brin, Priscilla Chan and Mark Zuckerberg, Yuri and Julia Milner, and Anne Wojcicki.

Professor Sir Shankar Balasubramanian is Herchel Smith Professor of Medicinal Chemistry in the Department of Chemistry at the University of Cambridge. He is also Senior Group Leader at the Cancer Research UK Cambridge Institute.
