Sequence alignment is a way of arranging sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
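As a rough illustration of arranging two sequences to expose regions of similarity, the sketch below performs a global pairwise alignment using the classic Needleman-Wunsch dynamic program. It is not taken from any of the tools listed here. The function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen only for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment (illustrative scoring, not from the source).

    Returns (score, aligned_a, aligned_b), where '-' marks a gap.
    """
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, top, bottom = global_align("ACGT", "AGT")
    print(s)
    print(top)
    print(bottom)
```

This quadratic-time approach is practical only for short sequences. Whole-genome and multi-genome aligners rely on indexing and heuristics to scale.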
(New in 2014) Harvest is a suite of core-genome alignment and visualization tools for quickly analyzing thousands of intraspecific microbial genomes. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Combined, they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees.
Optical Mapping Data as a Guide for Genome Assembly
Genome assembly -- the task of reconstructing a genome from the small
fragments of DNA that can be sequenced by modern technologies -- is a difficult
computational problem, in no small part due to the fact that the shotgun
sequencing process cannot preserve the long-range structure of the genome
being assembled. Optical mapping is a genomic technology, pioneered by David
Schwartz, which can map the location of restriction sites along a genomic
chromosome. Thus, optical mapping provides a long-range sparse representation
A comparative genome assembler that uses one genome as a reference on which to assemble the genome of another, closely related species. See the journal paper here.
AMOS is a set of tools, libraries, and freestanding genome assemblers, all open source. AMOS is also an open consortium that includes TIGR, the University of Maryland, the Karolinska Institutet, and the Marine Biological Laboratory.
New-generation DNA sequencing technologies are revolutionizing modern biological research. Scientists can now generate the rough equivalent of an entire human genome (~3 billion base pairs of DNA) in just a few days with a single sequencing instrument. Until recently, such amounts of data could only be generated at large genome centers using hundreds of sequencers. The analysis of these data is complicated by their size: a single run of a sequencing instrument yields terabytes of information, often requiring a significant scale-up of the existing computational infrastructure.