Faction I Genome Browser Group
Group members: Adam Dabrowski, Mrunal Dehankar, Shareef Khalid, Hubert Pan, Ajay Ramakrishnan, Ankit Srivastava, Kris Wang, Seyed Alireza Zamani
Develop an on-line platform that would allow users to view the annotated genome results and run BLAST queries against the VFDB database.
Genome browsers are tools that allow you to view, edit, and possibly annotate genomic data. They usually consist of a data store coupled with a front end. The data store can be a set of flat files (as in JBrowse) or a full-blown database management system such as MySQL, which underpins the UCSC Genome Browser. The front end can range from a desktop application to a full-blown web platform. For this project, we focus primarily on web-based, open-source browsers.
In the first stage of this project, we collected a list of web-based genome browsers by querying Google for "genome browsers". The table below lists some of the browsers we looked at.
|UCSC Genome Browser||https://genome.ucsc.edu/goldenpath/help/gbib.html|
Then using selected metrics from the CompaGB project, we created a weighted decision matrix and selected the highest scoring candidate. In terms of weighting, we gave higher priority to ease of setup due to time constraints.
The full weighted decision matrix can be seen here: https://docs.google.com/spreadsheets/d/1vsbuzWiJztOwOxotdcOTXSQWVCICYOa5GkParN83FYc/edit?usp=sharing A presentation containing an introduction to this project and detailing the various Genome Browsers considered can be viewed here: https://docs.google.com/presentation/d/1OVm9BUJhFbNdQ4w2T3ftVvHfqTq2s6FMikZADpHhH70/edit?usp=sharing
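As a rough illustration of how a weighted decision matrix picks a winner, the Python sketch below scores two made-up candidates. The criterion names, weights, and scores are hypothetical placeholders, not the values from the spreadsheet linked above; only the heavier weight on ease of setup mirrors the choice described here.

```python
from typing import Dict

def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the per-criterion scores."""
    total_weight = sum(weights.values())
    return sum(scores[criterion] * w for criterion, w in weights.items()) / total_weight

def best_candidate(matrix: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> str:
    """Name of the candidate with the highest weighted score."""
    return max(matrix, key=lambda name: weighted_score(matrix[name], weights))

# Hypothetical weights: ease of setup counts three times as much as the rest.
weights = {"ease_of_setup": 3.0, "features": 1.0, "documentation": 1.0}

# Hypothetical scores on a 1-5 scale for two placeholder browsers.
matrix = {
    "Browser A": {"ease_of_setup": 5, "features": 3, "documentation": 4},
    "Browser B": {"ease_of_setup": 2, "features": 5, "documentation": 5},
}

winner = best_candidate(matrix, weights)
```

With these placeholder numbers, the candidate that is easier to set up wins even though the other one scores higher on features and documentation, which is the effect the weighting is meant to have.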
In addition to the genome browser, the project called for a surrounding site with general overview information. As a stretch goal, we were also to add the ability to run BLAST queries against the VFDB databases. We built the website with Mustache.js and Java, which generate static pages that are then loaded onto the website. To handle the VFDB queries, we used a Perl CGI script that interacts with local VFDB and BLAST instances.
The home of all our results is located at http://gbrowse2017a.biology.gatech.edu
Genome Browser (JBrowse)
JBrowse does not require any backend components to be set up. It ships with a set of 'run once' data formatting scripts that convert the data to the required format. All our tracks were input in the GFF3 format. Tweaking the JBrowse configuration files allowed us to add features such as color-coding individual tracks and providing link-outs to relevant websites.
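For readers unfamiliar with GFF3, each feature line has nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes), and the attributes column holds URL-encoded `key=value` pairs separated by semicolons. The Python sketch below parses one such line. It is only an illustration of the format, not part of the JBrowse formatting scripts, and the example line (scaffold name, source, gene ID) is made up.

```python
from typing import Dict, NamedTuple
from urllib.parse import unquote

class Gff3Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]

def parse_gff3_line(line: str) -> Gff3Feature:
    """Parse a single GFF3 feature line into its nine columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    attributes: Dict[str, str] = {}
    for pair in cols[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters in attributes.
            attributes[unquote(key)] = unquote(value)
    return Gff3Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                       cols[5], cols[6], cols[7], attributes)

# Made-up example line; real tracks come from the annotation pipeline.
example = "scaffold_1\texample_source\tgene\t100\t400\t.\t+\t.\tID=gene_0001;Name=hypothetical%20protein"
feature = parse_gff3_line(example)
```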
The information for every isolate is displayed separately. All the isolates can be accessed from the 'Genome' menu of the browser. For every isolate, the track information is attached to different scaffolds in the final multi-FASTA files provided by the assembly group.
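Because each track is attached to scaffolds by name, the sequence IDs in a GFF3 file need to match the record IDs in the isolate's multi-FASTA file. A minimal Python sketch of such a consistency check follows. The helper names and sample data are hypothetical, and the sketch assumes that the FASTA record ID is the first whitespace-delimited token of each header line.

```python
from typing import Iterable, List, Set

def fasta_record_ids(lines: Iterable[str]) -> List[str]:
    """Return the ID (first token after '>') of every FASTA header line."""
    ids: List[str] = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids

def unplaced_seqids(gff3_lines: Iterable[str], scaffold_ids: Iterable[str]) -> Set[str]:
    """Return GFF3 seqids that do not correspond to any scaffold ID."""
    known = set(scaffold_ids)
    missing: Set[str] = set()
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # anything after this directive is sequence data, not features
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        seqid = line.split("\t", 1)[0]
        if seqid not in known:
            missing.add(seqid)
    return missing

# Made-up sample data for illustration only.
fasta = [">scaffold_1 length=1200", "ACGT", ">scaffold_2 length=800", "GGCC"]
gff3 = [
    "##gff-version 3",
    "scaffold_1\texample_source\tgene\t100\t400\t.\t+\t.\tID=g1",
    "scaffold_9\texample_source\tgene\t10\t90\t.\t-\t.\tID=g2",
]
scaffolds = fasta_record_ids(fasta)
missing = unplaced_seqids(gff3, scaffolds)
```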
As seen in the image above, the tracks to be displayed can be selected via the checkboxes on the left. Specific information about some of the tracks is discussed below.
The "All genes" track shows the genes predicted for every isolate. This track is color-coded to distinguish known genes, shown as royal blue blocks, from putative genes, shown as deep sky blue blocks.
In addition, left-clicking on the genes pops up a modal box with gene attributes and a link-out to the NCBI entry for the corresponding protein.
The "InterProScan results" track shows all the annotations produced by InterProScan. This track is color-coded by the source database of each annotation. The color legend for the databases is as follows:
- CDD : maroon
- Gene3D : orange
- HAMAP : medium violet red
- PANTHER : pink
- Pfam : medium orchid
- PIRSF : magenta
- PRINTS : tomato
- ProSite Families : light coral
- SUPERFAMILY : red
- TIGRFAMs : orange red
Annotations from any other database are colored pale violet red.
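The legend above amounts to a lookup table with a fallback color. The Python sketch below expresses it that way using CSS color names. It is only an illustration: on the site the coloring is done through the JBrowse configuration files, and the keys here are the legend labels, which may differ from the exact source strings that appear in the InterProScan output.

```python
# Legend labels mapped to CSS color names, as listed above.
SOURCE_COLORS = {
    "CDD": "maroon",
    "Gene3D": "orange",
    "HAMAP": "mediumvioletred",
    "PANTHER": "pink",
    "Pfam": "mediumorchid",
    "PIRSF": "magenta",
    "PRINTS": "tomato",
    "ProSite Families": "lightcoral",
    "SUPERFAMILY": "red",
    "TIGRFAMs": "orangered",
}

# Fallback for annotations from any database not in the legend.
DEFAULT_COLOR = "palevioletred"

def color_for_source(source: str) -> str:
    """Return the display color for an annotation's source database."""
    return SOURCE_COLORS.get(source.strip(), DEFAULT_COLOR)
```

For example, `color_for_source("Pfam")` gives `"mediumorchid"`, while an unlisted source falls back to `"palevioletred"`.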
As with the "All genes" track, left-clicking on an annotation pops up a modal box with its attributes and link-outs to the corresponding database entries.
The VFDB Query page allows users to execute a BLAST query against a local copy of VFDB.
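The site itself uses a Perl CGI script for this step. As a language-neutral illustration of what such a script has to do, the Python sketch below builds a BLAST+ command line against a local database and parses one line of tabular (`-outfmt 6`) output. The function names, paths, default thresholds, and the sample hit line are assumptions made for the example, not details of the actual script.

```python
import subprocess
from typing import Dict, List

# Only allow known BLAST+ programs, since the value would come from a web form.
ALLOWED_PROGRAMS = {"blastn", "blastp", "blastx"}

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                   "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def build_blast_command(program: str, query_path: str, db_path: str,
                        evalue: float = 1e-5, max_hits: int = 10) -> List[str]:
    """Build the argument list for a BLAST+ search (thresholds are assumed defaults)."""
    if program not in ALLOWED_PROGRAMS:
        raise ValueError(f"unsupported BLAST program: {program!r}")
    return [program, "-query", query_path, "-db", db_path, "-outfmt", "6",
            "-evalue", str(evalue), "-max_target_seqs", str(max_hits)]

def parse_tabular_hit(line: str) -> Dict[str, object]:
    """Parse one -outfmt 6 line into a dict with numeric fields converted."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(OUTFMT6_COLUMNS):
        raise ValueError(f"expected {len(OUTFMT6_COLUMNS)} columns, got {len(fields)}")
    hit: Dict[str, object] = dict(zip(OUTFMT6_COLUMNS, fields))
    for key in ("pident", "evalue", "bitscore"):
        hit[key] = float(fields[OUTFMT6_COLUMNS.index(key)])
    for key in ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"):
        hit[key] = int(fields[OUTFMT6_COLUMNS.index(key)])
    return hit

def run_blast(command: List[str]) -> List[Dict[str, object]]:
    """Run BLAST (requires BLAST+ to be installed) and return the parsed hits."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return [parse_tabular_hit(l) for l in result.stdout.splitlines() if l.strip()]

# Placeholder paths; a real deployment would point at its own query file and database.
command = build_blast_command("blastp", "/path/to/query.fasta", "/path/to/vfdb")

# Made-up hit line used only to exercise the parser.
sample_hit = parse_tabular_hit("query1\tsubject1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370")
```

Passing the arguments as a list rather than a shell string avoids shell interpretation of user-supplied input, which matters for anything exposed through a web form.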
The code used to generate this site can be found at https://github.com/GenomeBrowser2017
- Zweig, Ann S. et al. “UCSC Genome Browser Tutorial.” Genomics 92.2 (2008): 75–84. ScienceDirect. Web.
- “UCSC Genome Browser Home.” N.p., n.d. Web. 12 Apr. 2017.
- Robinson, James T., et al. "Integrative genomics viewer." Nature biotechnology 29.1 (2011): 24-26.
- Igv.js http://igv.org/doc/doc.html
- Fiume M. et al. (2012) Savant Genome Browser 2: visualization and analysis for population-scale genomics. Nucleic Acids Res., 40, W615–W621
- Medina, Ignacio, et al. "Genome Maps, a new generation genome browser." Nucleic acids research 41.W1 (2013): W41-W46.
- Skinner, M. E., A. V. Uzilov, L. D. Stein, C. J. Mungall, and I. H. Holmes. "JBrowse: A next-generation genome browser." Genome Research 19.9 (2009): 1630-1638. Web.
- Lacroix, Thomas et al. “CompaGB: An Open Framework for Genome Browsers Comparison.” BMC research notes 4 (2011): 133. PubMed. Web.