FASTG-An expressive representation for genome assemblies | all4bioinformatics

Friday, 21 June 2013

FASTG: An expressive representation for genome assemblies

Introduction

FASTG is a format for faithfully representing genome assemblies in the face of allelic polymorphism and assembly uncertainty. It is called FASTG, like FASTA, but the G stands for ‘graph’.
Currently, genome assemblies are represented linearly, as sequences of bases recorded in FASTA files. Since chromosomes are in fact linear or circular, this makes sense so long as one has complete knowledge of the genome. However, almost all assemblies contain errors and omissions, which can lead to incorrect biological inferences. Moreover, in most cases these assemblies do not represent polymorphism at all.
Today, using high-coverage data, assembly algorithms 'see' almost all bases of the genome. Errors in assemblies therefore result primarily from defects in the algorithms and in the assembly representation. Indeed, where a particular locus in an assembly is wrong, the assembly algorithm could generally have prevented the error by emitting an ambiguous call. However, the current linear representation precludes such ambiguities. Similarly, complex polymorphisms cannot easily be represented, and simple polymorphisms must be captured in a supporting file.
Just as physical measurements come with error bars, genome assemblies should come with structures that capture the uncertainties in our knowledge. FASTG is designed to provide them. At its heart it is FASTA, which allows existing tools to run and provides coordinates that facilitate computation. On top of this are global and local layers of markup.

Specification

The current version of the FASTG Specification is available for download.
The toy genome FASTG and FASTA files described in section 2 of the specification are also available.
