DG:Q:Genotypes are fine; but lots of different types of clin data? A:Too hard to do all at once. Agree all datatypes needed #ASHG14
12:00pm October 19th 2014 via Hootsuite
DG: Finishes with this XKCD cartoon http://t.co/l6neUc1L8S #ASHG14
11:58am October 19th 2014 via Hootsuite
DG: Points to @GA4GH GitHub Schema here http://t.co/tgSLPKvyJc Build tools once and know they fit the data wherever it is. #ASHG14
11:57am October 19th 2014 via Hootsuite
DG: Points to a messy Venn from O'Rawe 2013 http://t.co/CIuGh5Wttm Now looking at benchmarking w/ @GA4GH, also interoperability #ASHG14
11:56am October 19th 2014 via Hootsuite
DG: Standardization: an absolute requirement. Global Alliance for Genomics and Health mentioned @GA4GH #ASHG14
11:54am October 19th 2014 via Hootsuite
DG:Went from 2h on 60 8-core machines, to 10m on one computer. Able to scale 1K to 1M without hit in computer time #ASHG14
11:52am October 19th 2014 via Hootsuite
DG:This was 1K, what about 1M? Thruput vs. latency is genomes/hr vs. how many genomes processed? 'Let's process 1M at once' #ASHG14
11:50am October 19th 2014 via Hootsuite
DG:Principal Coord Analysis=build similarity matrix, transform data, graph 2 dim's. 1000x1000 computational problem #ASHG14
11:49am October 19th 2014 via Hootsuite
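The three steps in the tweet above (build a matrix, transform it, plot two dimensions) can be sketched roughly as follows. This is a minimal illustration of classical Principal Coordinates Analysis in NumPy, not the code shown in the talk. The toy genotype matrix, the Manhattan distance, and the function name `pcoa` are all assumptions made for the example, and it starts from a distance (dissimilarity) matrix rather than a similarity matrix.

```python
import numpy as np

def pcoa(dist, n_dims=2):
    """Classical PCoA: embed samples from a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances (Gower's transform).
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:n_dims]
    # Clip small negative eigenvalues from non-Euclidean or noisy input.
    top_vals = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(top_vals)

# Hypothetical toy data: 6 samples x 4 variants, coded as alt-allele counts.
genotypes = np.array([
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
    [2, 1, 0, 0],
])
# Pairwise Manhattan distance between samples (an assumed choice of metric).
dist = np.abs(genotypes[:, None, :] - genotypes[None, :, :]).sum(axis=-1)
coords = pcoa(dist)  # one row per sample, columns are the first two axes
```

Plotting `coords[:, 0]` against `coords[:, 1]` gives the kind of 2-D scatter described in the talk. For 1000 samples the distance matrix is 1000x1000, which is the size the tweet refers to.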
DG: A different fast: fast analysis. Input data was var calls across TGP dataset. Principal Coord Analysis:3 clusters, populations #ASHG14
11:48am October 19th 2014 via Hootsuite
DG: 3rd slide of code, SNP distribution, showed graph of heterozygosity by population of origin. 'The tool is working' #ASHG14
11:46am October 19th 2014 via Hootsuite
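For readers unfamiliar with the heterozygosity-by-population summary mentioned above, here is a small plain-Python sketch of the idea. It is not the BigQuery code from the slide. The sample names, population labels, and genotype calls are invented toy data, and `heterozygosity_by_population` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical toy data: sample -> (population label, allele pair per site).
samples = {
    "s1": ("POP_A", [(0, 1), (0, 0), (1, 1), (0, 1)]),
    "s2": ("POP_A", [(0, 0), (0, 1), (1, 1), (0, 0)]),
    "s3": ("POP_B", [(0, 0), (0, 0), (1, 1), (0, 1)]),
    "s4": ("POP_B", [(0, 0), (0, 0), (0, 0), (0, 0)]),
}

def heterozygosity_by_population(samples):
    """Fraction of genotype calls per population whose two alleles differ."""
    het = defaultdict(int)
    total = defaultdict(int)
    for population, calls in samples.values():
        for allele_a, allele_b in calls:
            total[population] += 1
            het[population] += int(allele_a != allele_b)
    return {pop: het[pop] / total[pop] for pop in total}

rates = heterozygosity_by_population(samples)
```

Each value in `rates` is one bar or point in a per-population graph of the kind described in the tweet.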
DG: Another 10 lines of code, 'shared variation across samples', showed a U-shaped graph by # of samples. Also validated #ASHG14
11:45am October 19th 2014 via Hootsuite
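The 'shared variation across samples' summary can be illustrated with a short sketch: for each variant, count how many samples carry it, then count how many variants fall at each level of sharing. This is an assumed reading of the slide, written in plain Python rather than the query language used in the talk. The carrier matrix is toy data and `sharing_histogram` is a hypothetical name.

```python
from collections import Counter

# Hypothetical toy data: rows are variants, columns are samples;
# 1 means the sample carries the non-reference allele.
carriers = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

def sharing_histogram(matrix):
    """Map 'number of samples carrying a variant' -> 'number of such variants'."""
    counts = (sum(row) for row in matrix)
    # Skip variants that no sample carries.
    return Counter(c for c in counts if c > 0)

hist = sharing_histogram(carriers)
```

Plotting `hist` with the number of samples on the x-axis gives the graph described in the tweet. In this toy input most variants are either private to one sample or shared by all four, which is the pattern behind a U-shaped curve.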
DG: Shows 10 lines of query code for 'super-population'; 10s run, graphed in R, segregation by population, validated the biology #ASHG14
11:44am October 19th 2014 via Hootsuite
DG: Cycle-time: tweak 1 var, get result. 'A key to creativity', to iteratively explore the question. TGP dataset: validating results #ASHG14
11:43am October 19th 2014 via Hootsuite
DG: Fast: on top of Dremel, using BigQuery applied to dig thru .vcf files. Using 1000 WGS, 'variants by super-population' #ASHG14
11:41am October 19th 2014 via Hootsuite
DG: But video is only ~2 WGS per min. Search index ~1M WGS. '# of active Gmail users is 150x the # of US PhD's' #ASHG14
11:40am October 19th 2014 via Hootsuite
DG:The tools: Big, Fast and Standard. Not always obv. which is easiest. Big: 100h YouTube video per minute; search index 100PB+ #ASHG14
11:39am October 19th 2014 via Hootsuite
DG:"A biologist considers Big Data as anything that doesn't fit in Excel" Not condescending - a description of the need for tools #ASHG14
11:38am October 19th 2014 via Hootsuite
DG: Artisan bread baking vs. assembly line: how we get to an N of Millions, affecting 'everything you do with the data' #ASHG14
11:37am October 19th 2014 via Hootsuite
DG: Google Dremel - http://t.co/BzvM3Fx3JH working with trillions of rows of data. How can this be applied with genomics? #ASHG14
11:36am October 19th 2014 via Hootsuite
DG: A few developments: MapReduce ('04), Hadoop ('05), Apache Spark research ('09), Google Dremel paper ('10) #ASHG14
11:35am October 19th 2014 via Hootsuite
DG: "Genomics is becoming an N of Millions activity" #ASHG14
11:34am October 19th 2014 via Hootsuite
DG:18mo ago, as an outsider: combining data science with life science. Brings up NHGRI seq cost slide http://t.co/8eZpNEplDZ #ASHG14
Next: David Glazer (Google): Lessons from a Mixed Marriage: Big Sequencing Meets Big Data #ASHG14
11:32am October 19th 2014 via Hootsuite
AR:Q:Survivability data? A:Working on it, will be added later to train and incorporate outcome data #ASHG14
11:30am October 19th 2014 via Hootsuite
AR:A2: Even at the single-cell sequencing stage. #ASHG14
11:29am October 19th 2014 via Hootsuite
AR:Q:Does alg. include % of heterogeneity? A:System will ingest input given; finer-grain analysis will enable better analysis #ASHG14
AR: Credits include Robert Darnell and Toby Bloom of NYGC, team of 12 at IBM. #ASHG14
11:28am October 19th 2014 via Hootsuite
AR: Timeline: '14 is NY Genome Center (on GBM), 2015 in beta, later 2015 'avail for commercial use' #ASHG14
11:27am October 19th 2014 via Hootsuite
AR: Illus. w/Novartis Signature, AZ Personalized Healthcare, CRUK "Matrix", NCI MATCH (2400 inst, 14K invest.) 'extremely ambitious' #ASHG14
11:26am October 19th 2014 via Hootsuite
AR: Drills down into particular drug and their pathways, an interactive map. Illustrating the transparency of choices for drug trtmt #ASHG14
11:25am October 19th 2014 via Hootsuite
AR: Report is verbose for many treatment options; MOI, targets, pathway from driver mutation downstream. Incl clinical trials too #ASHG14
11:22am October 19th 2014 via Hootsuite
AR:Cloud implemented, shows video of an instance from login through uploading data, metadata, report in minutes incl rec's #ASHG14
11:21am October 19th 2014 via Hootsuite
AR: HIPAA/HITECH compliant, db's eg NCI PID, DrugBank; building a conceptual model incl. treatment options, 'clin. experience db' #ASHG14
11:20am October 19th 2014 via Hootsuite
AR: Watson to precision oncology: NGS on left thru annotation; Translation ID pathways, compounds, integration to machine learning #ASHG14
11:18am October 19th 2014 via Hootsuite
AR:Watson has a deep foundation in CS; machine learning, Q&A, other attributes incorporated into Jeopardy! play. #ASHG14
11:16am October 19th 2014 via Hootsuite
AR: Can it be done in minutes? Can it be transparent enough to see all the reasoning involved in the judgement? #ASHG14
11:14am October 19th 2014 via Hootsuite
AR: Slide of 5 components: comprehensive, objective, fast, transparent, scalable. Can we do this for millions of affecteds? #ASHG14
AR: Can 'every pt be considered an exceptional responder'? By scaling, learning can be fast enough, provide this then to all. #ASHG14
11:12am October 19th 2014 via Hootsuite
AR: Charts of PubMed articles, Genes ass'd with COSMIC somatic mut's, only increasing over time. Needs contextualizing; not manual #ASHG14
11:11am October 19th 2014 via Hootsuite
AR: Investigational: Foundation One, Paradigm, SQNM, Ion Ampliseq. Research: MyCancerGenome etc. Oncology is vast 'deserves help' #ASHG14
11:09am October 19th 2014 via Hootsuite
AR: Shows chart - approved oncology panels, investigational & research. Approved: ex OncotypeDx, BRCA, DAKO etc #ASHG14
11:08am October 19th 2014 via Hootsuite
AR: Ledford 2013 Nature piece, revisiting 'failed' clinical trials: http://t.co/h61RsiH3Em #ASHG14
11:07am October 19th 2014 via Hootsuite
AR:Starts w/ importance of personalized oncology - at scale, reproducibly. Affordability is enabling; exceptional responders too #ASHG14
11:05am October 19th 2014 via Hootsuite
First up: Ajay Royyuru (IBM Watson Research) Genomic Analytics with IBM Watson #ASHG14
11:02am October 19th 2014 via Hootsuite
Sun 7pm after #ASHG14 @LifeTech services group I'll present 'From Microarrays to RNA-Seq' (& a movie too) http://t.co/9msUeYZVEp
10:06am October 19th 2014 via Hootsuite
.@eperlste Likewise! If you are free Monday PM, I'd like to invite you to a @LifeTech event at the Science Center. http://t.co/z7j3AL0DfI
9:43am October 19th 2014 via Hootsuite in reply to
A story of a reporter / book author / entrepreneur with a China connection | Huffington Post http://t.co/oVCqA0UW2k
9:01am October 19th 2014 via Hootsuite
Thank you @geneticssociety @girscientist and @nalinip for the #ASHG14 tweetup. Great seeing @eperlste @massgenomics @genomenathan & othe
8:50am October 19th 2014 via Hootsuite
RT @hootsuite: Want to know the secret to a great Tweet? Here's 6 lessons we've learned: http://t.co/OVMu5vX1kT http://t.co/PORjd1ZIaM
8:05am October 19th 2014 via Hootsuite
With $2M Seed, Perlstein Lab Tests Unorthodox Rare-Disease Plan | Xconomy http://t.co/sYZkj6AFYC About @eperlste new effort
7:30am October 19th 2014 via Hootsuite