Abstract
We present the Genome Browser, an interactive graphical tool for visualizing and curating the nucleotide sequences of large genomes, in particular the human genome. Developed by Celera Genomics and used by Celera’s scientists and customers, the tool displays raw nucleotide sequence together with its accompanying annotation and provides interactive capabilities for human curation of genes. The software is written in Java and has a three-tiered architecture: a high-performance "thick" graphical client, an EJB-based middle-tier server, and an Oracle database backend. This architecture allows a terabyte-sized genomic database, containing annotations on sequences exceeding 3 billion base pairs in length, to be viewed through a direct-manipulation graphical user interface that displays tens of thousands of zoomable data points at a time. An XML import capability also allows additional user-specified data to be layered on top of the database data. Users perform curation operations in an interactive "drag-and-drop" style to create and modify gene and transcript information. Curation information is exported as XML files, which can then be loaded into the database using a separate curation "promotion" utility. This combination of XML and a three-tiered data architecture provides sufficient flexibility to support a variety of genomic data formats and curation workflows.
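To make the idea of layering user-specified XML data over database data concrete, the following is a minimal Java sketch under stated assumptions. It is not the Genome Browser's code: the class `FeatureLayering`, the `Feature` type, and the `<feature id start end/>` XML format are all hypothetical names chosen for illustration, and the in-memory list stands in for features that the real system would fetch from the middle tier.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: these are NOT the Genome Browser's classes or its XML schema.
class FeatureLayering {

    /** One annotation on a genomic axis; "source" records which layer it came from. */
    static final class Feature {
        final String id;
        final long start;    // inclusive, in base pairs
        final long end;      // exclusive, in base pairs
        final String source; // "database" or "user"

        Feature(String id, long start, long end, String source) {
            this.id = id;
            this.start = start;
            this.end = end;
            this.source = source;
        }
    }

    /** Parses <feature id=".." start=".." end=".."/> elements (an assumed format). */
    static List<Feature> parseUserFeatures(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("feature");
            List<Feature> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                out.add(new Feature(e.getAttribute("id"),
                        Long.parseLong(e.getAttribute("start")),
                        Long.parseLong(e.getAttribute("end")),
                        "user"));
            }
            return out;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse user feature XML", ex);
        }
    }

    /**
     * Returns the features from both layers that overlap the half-open view
     * window [viewStart, viewEnd), sorted by start position. The sort is stable,
     * so database features precede user features that start at the same position.
     */
    static List<Feature> visibleFeatures(List<Feature> database, List<Feature> user,
                                         long viewStart, long viewEnd) {
        List<Feature> out = new ArrayList<>();
        for (Feature f : database) {
            if (f.start < viewEnd && f.end > viewStart) out.add(f);
        }
        for (Feature f : user) {
            if (f.start < viewEnd && f.end > viewStart) out.add(f);
        }
        out.sort(Comparator.comparingLong((Feature f) -> f.start));
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for annotations that would come from the database tier.
        List<Feature> database = new ArrayList<>();
        database.add(new Feature("gene-1", 100, 200, "database"));
        database.add(new Feature("gene-2", 500, 600, "database"));

        // Stand-in for a user-supplied XML file in the assumed format.
        String xml = "<features>"
                + "<feature id=\"user-1\" start=\"150\" end=\"250\"/>"
                + "<feature id=\"user-2\" start=\"900\" end=\"950\"/>"
                + "</features>";

        for (Feature f : visibleFeatures(database, parseUserFeatures(xml), 0, 300)) {
            System.out.println(f.source + " " + f.id + " [" + f.start + ", " + f.end + ")");
        }
    }
}
```

Keeping the user layer as a separate list that is merged only at display time means imported data never has to be written to the database, which is one plausible reading of "layering" here; the actual tool may organize this differently.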