English: First, the repetitive regions of an assembled genome are masked using a repeat library. Then, optionally, the masked sequence is aligned against all available evidence (ESTs, RNAs, and proteins) for the organism being annotated. In eukaryotic genomes, splice sites must also be identified. Finally, the coding and noncoding sequences contained in the genome are predicted with the help of databases of known DNA, RNA, and protein sequences, as well as other supporting information.
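The first and third steps of the flowchart can be illustrated with a toy sketch. It is only an assumption-laden illustration: exact string matching stands in for a real repeat-masking tool, the canonical GT...AG dinucleotides stand in for a real splice-site model, and the function names `mask_repeats` and `find_candidate_introns` are hypothetical. Evidence alignment and gene prediction are not shown.

```python
from typing import List, Tuple


def mask_repeats(sequence: str, repeat_library: List[str]) -> str:
    """Hard-mask every exact occurrence of a library repeat with 'N'.

    Assumption: real pipelines use similarity search, not exact matching.
    """
    seq = sequence.upper()
    masked = list(seq)
    for repeat in repeat_library:
        repeat = repeat.upper()
        if not repeat:
            continue
        # Search the unmasked sequence so overlapping hits are all found.
        start = seq.find(repeat)
        while start != -1:
            masked[start:start + len(repeat)] = "N" * len(repeat)
            start = seq.find(repeat, start + 1)
    return "".join(masked)


def find_candidate_introns(masked: str, min_len: int = 4) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans that begin with GT and end with AG.

    Spans containing masked bases ('N') are skipped. This is a naive
    enumeration of candidates, not a splice-site predictor.
    """
    last = len(masked) - 1
    donors = [i for i in range(last) if masked[i:i + 2] == "GT"]
    acceptors = [i for i in range(last) if masked[i:i + 2] == "AG"]
    return [
        (d, a + 2)
        for d in donors
        for a in acceptors
        if a + 2 - d >= min_len and "N" not in masked[d:a + 2]
    ]


if __name__ == "__main__":
    genome = "ATGCCGTAAAGCCACACACATT"   # made-up example sequence
    library = ["CACACACA"]              # made-up repeat library
    masked_genome = mask_repeats(genome, library)
    print(masked_genome)
    print(find_candidate_introns(masked_genome))
```

Masking before the splice-site scan mirrors the order in the flowchart: candidate features that overlap repetitive sequence are discarded early, so later steps only consider unmasked regions.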
to share – to copy, distribute and transmit the work
to remix – to adapt the work
Under the following conditions:
attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
Captions
Generalized flowchart of a structural genome annotation pipeline