The International Congress of Genomics 10 is a conference held annually in Shenzhen, China, where BGI is headquartered. It is largely attended by life science researchers in China, and the invited speakers have a natural connection to the genetics and genomics community there, in particular to Huanming Yang. Over the past seven or eight years BGI has been a major collaborative and commercial sequencing organization, having grown to some 5,000 employees worldwide.
This year my new employer SeraCare Life Sciences decided to become a sponsor, in order to gauge the NGS clinical market in China for our new oncology and maternal health reference materials, and I was privileged to attend. My relationship with China had previously been purely personal, having lived in Beijing as an expatriate way back in 1997 and 1998; viewing China from a commercial perspective is something of a revelation, as is observing first-hand how much things have changed in the 17 years since I last lived here.
At this year’s European Society of Human Genetics meeting in Glasgow, Scotland, BGI announced the commercialization of the Complete Genomics technology that they acquired in 2013 for $117M. For those not familiar with Complete Genomics, they developed a service-only, human genomes-only model for high-quality whole genome sequencing, and in the marketplace they initially offered 5 human genomes for $100,000 in 2009. (At that time, for the amount of sequencing required for high-quality WGS data, $20K per sample was something of a bargain.) In the ensuing years Complete Genomics (known as CGI) was locked in a turf- and price-war with Illumina for WGS services, driving the price down over time.
Over that timeframe of 2009 to 2013, CGI was able to publish high-visibility papers, have their data included in the Genome In A Bottle consortium (organized by the US National Institute of Standards and Technology), and publish one memorable technique called Long Fragment Reads (LFR), which I wrote about here. It’s hard to believe that, after the acquisition by BGI, they were able to deliver new platforms at two different scales and to change the business model of CGI from service-only to one of serving individual customers.
The Revolocity, announced at the Glasgow ESHG conference, is a high-throughput instrument (if you can call a set of robotics and cabinets an ‘instrument’) on par with the HiSeq X Ten, designed for 10,000 whole genomes or 95,000 whole exomes per year. That is roughly 192 whole genomes or 1,800 whole exomes per week. I spoke to a friend at the recent ASHG meeting (who just happens to work in the Netherlands, where the first system is being installed now): they certainly have a huge amount of existing sequencing capacity using Illumina technology, but need even more capacity for clinical applications. One important application is to determine the cause of intellectual disability in children by clinical WGS or WES of trios (that is, the affected child and the parents, or the affected child and siblings where a parent is not available), looking for de novo mutations that may be causative and could suggest therapeutic options.
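As a quick sanity check on those throughput figures, here is a minimal sketch that converts the quoted annual capacity into a weekly rate. It assumes a flat 52-week year with no downtime, which is my simplification and not something BGI stated; the function name is purely illustrative.

```python
# Back-of-the-envelope check of the quoted Revolocity throughput.
# Assumption (mine, not BGI's): 52 working weeks per year, no downtime.
WEEKS_PER_YEAR = 52


def per_week(per_year: int) -> float:
    """Convert an annual sample count into an average weekly count."""
    return per_year / WEEKS_PER_YEAR


genomes_per_week = per_week(10_000)   # quoted annual whole-genome capacity
exomes_per_week = per_week(95_000)    # quoted annual whole-exome capacity

print(f"{genomes_per_week:.0f} genomes/week, {exomes_per_week:.0f} exomes/week")
```

The genome figure lands on about 192 per week, matching the presentation; the exome figure comes out slightly above the rounded 1,800 quoted.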
The Revolocity is positioned not for the research market per se, but rather for the applied / clinical one. With a flow-cell about 12 inches square (reminiscent of the tiles in my kitchen; I’ll be heading over to the BGI booth today to take a closer look at it), a ton of engineering has clearly gone into this development. There is an industrial arm to manage movement of parts and pieces, and physical integration of sample preparation through library construction through template preparation (they use DNBs, or DNA NanoBalls, on an ordered array) through sequencing by combinatorial Probe Anchor Ligation (cPAL™), with imaging and analysis, all in one physically imposing instrument.
On that note, it takes 1,500 square feet of space, and of course it is too large to exhibit at a booth. (At trade shows like the recent ASHG in Baltimore, they had a miniature model of the Revolocity to show to prospective customers.) Also, for the price ($12M) it comes with a full-time engineer to keep things running. Another item of note (that wasn’t said explicitly in yesterday’s presentation) is that this unit was designed to keep things very simple for the intended clinical customer: its output is not at the aligned-read level (the typical output, where a FASTQ file of reads is trimmed and aligned to a reference, and the analysis is left at that) but at the VCF-file level (VCF = Variant Call Format). Other technical details include the ability to do paired-end sequencing with 375bp inserts, to call high-quality CNVs based upon read depth, and to report on the structural variations observed (there was no mention of read length, but historically CGI data was on the order of 40bp).
But the star of the event (and the purpose of this post while working on a Sunday morning) was the BGISEQ-500, shown here in the early morning of the first day of the conference. It is a benchtop device, has two flow-cells, and some remarkable attributes, the first of which is that this instrument has been designed for sale to the Chinese market. Second, it was designed and built in China. Third, I cannot remember the last time China developed an instrument for genomics; there may have been some activity in the microarray world but this is certainly a first for NGS instrumentation.
The two flow-cell configuration has the flexibility to run two different applications (library and sequencing formats) simultaneously. The presentation claimed ’16 modes’, which in plain terms likely means 16 distinct applications, including WGS, WES, 16S, RNA-Seq, small RNA-Seq, single-cell RNA-Seq (by the way, I saw several single-cell talks that used BGI’s technology, so apparently they will offer it as a kit), WG bisulfite sequencing (WGBS), RRBS (reduced representation bisulfite sequencing, for those not familiar), MeDIP-Seq, ChIP-Seq, NIFTY (their clinical NIPT), EmbryoSeq (their Preimplantation Genetic Diagnosis assay), and a host of constitutional and oncology panels.
BGI apparently modified the cPAL chemistry mentioned above into something they call cPAS (combinatorial Probe Anchor Synthesis), which enables single-end reads of 50bp or 100bp and paired-end reads of 50bp or 100bp. Having 1×50 through 2×100 options is great: 50bp single-end reads suit tag-counting applications (RNA-Seq for gene quantitation, or ChIP-Seq), while 100bp paired-end reads are useful for many other applications (de novo assembly, etc.). The presentation was in Chinese (and I wasn’t able to score one of the cool electronic translation boxes they offered to their VIPs), but one diagram appeared to illustrate a modification of the sequencing on the DNB, which would explain the change in nomenclature to cPAS.
The flow-cells themselves come in two formats: FCL (‘flow-cell large’) and FCS. FCS has 300M reads; for an RNA-Seq experiment requiring 30M reads/sample, that’s 10 samples per run. It is also 3/4 the capacity of the NextSeq 500 from Illumina, which has 400M reads per run. But the FCL has 1,600M reads (you read that right, over five times the capacity of the FCS), meaning the BGISEQ-500 outdoes what the HiSeq 2500 can do in Rapid Run mode (1,200M reads in 27 hours). The BGISEQ-500 FCS flow-cell yields 8-40GB/run, the FCL 40-200GB/run, and both can go from ‘sample processing to sequencing analysis’ in 24 hours. Note: it’s not clear whether the 24h is only instrument run time (i.e. not including library preparation), nor whether ‘sequencing analysis’ means raw data or aligned data (important distinctions all!).
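To make the flow-cell comparison concrete, here is a rough sketch of the arithmetic using only the read counts quoted above. It assumes every read on the flow-cell is usable and that samples are pooled perfectly evenly, which real runs never quite achieve; the dictionary and function names are my own, not anything from BGI or Illumina.

```python
# Reads per run as quoted in the presentation and in this post.
# Assumption (mine): all reads are usable and pooling is perfectly even.
READS_PER_RUN = {
    "BGISEQ-500 FCS": 300_000_000,
    "BGISEQ-500 FCL": 1_600_000_000,
    "NextSeq 500": 400_000_000,
    "HiSeq 2500 Rapid Run": 1_200_000_000,
}


def samples_per_run(platform: str, reads_per_sample: int) -> int:
    """Whole samples that fit in one run at a given read budget per sample."""
    return READS_PER_RUN[platform] // reads_per_sample


def capacity_ratio(platform_a: str, platform_b: str) -> float:
    """How many times more reads platform_a delivers than platform_b."""
    return READS_PER_RUN[platform_a] / READS_PER_RUN[platform_b]


rna_seq_budget = 30_000_000  # reads per RNA-Seq sample, as in the text
print(samples_per_run("BGISEQ-500 FCS", rna_seq_budget))
print(samples_per_run("BGISEQ-500 FCL", rna_seq_budget))
print(round(capacity_ratio("BGISEQ-500 FCS", "NextSeq 500"), 2))
print(round(capacity_ratio("BGISEQ-500 FCL", "BGISEQ-500 FCS"), 2))
```

In practice you would budget some headroom for filtered reads and uneven pooling, so treat these as upper bounds rather than planning numbers.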
They also offer library automation (but no details were given about that), and mentioned one-touch operation. Orders will be taken starting December 25, 2015, and the first shipments are slated for February 2016. Do note that this is only for the Chinese market, priced at ‘25% less than comparable systems’, and I am told the comparable system is a NextSeq 500, not a HiSeq 2500. So 25% off a $250K list price for a NextSeq puts the BGISEQ-500 at a bit less than $200K (about $187.5K).
One common concern when a new system launches is the quality of the data it produces. Having been with a few NGS providers, I know this is often a source of pressure on the organization’s application scientists. BGI had a remarkable volume of data from the BGISEQ-500: 2,036 NIFTY tests, with T21, T18 and T13 statistics compared on the same samples to performance on the BGISEQ-1000 (which I understand is BGI’s modified Ion Proton); the concordance was 100% across all 2,036 samples.
Also presented was their BGI-Osmart™ targeted oncology assay, run on FFPE samples, cell lines and reference samples; specificity and sensitivity were 100% for all except FFPE sensitivity, which was 97.7% (the assay’s average depth was about 900x). By the way, on read accuracy, >85% of the bases are at Q30.
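For readers less familiar with the Q30 shorthand, here is a small sketch of the standard Phred relationship between a quality score Q and the base-call error probability p, namely Q = -10·log10(p). The function names are illustrative only; nothing here is specific to BGI’s software.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability implied by a Phred quality score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for an error probability: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


for q in (20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 error in {round(1 / p):,} bases ({(1 - p) * 100:.2f}% accuracy)")
```

So ‘>85% of bases at Q30’ means that more than 85% of base calls carry an estimated error rate of 1 in 1,000 or better.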
At last night’s banquet I heard why everyone in Shenzhen was so young – it is a metropolis that is only about 30 years old, so the term ‘local’ doesn’t really apply. And in a young world, ‘the sky’s the limit’, as you can see from this building going up right across the street from the conference hotel, the Ping An Finance Center. Makes me a bit optimistic.
For BGI’s new website on the BGISEQ-500, click here.