GeneTack: tools for frameshift prediction
Links
-
1,106 prokaryotic genomes were used for frameshift prediction. They were downloaded from GenBank in April 2010.
-
146 programmed frameshift clusters contain 5,632 fs-genes. Each cluster contains at least 5 fs-genes and more than 50% of the fs-genes have the same programmed frameshift motif in the vicinity of the predicted frameshift.
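The inclusion rule above (at least 5 fs-genes, with more than 50% of them sharing one motif) can be written as a small predicate. This is a minimal sketch, not the GeneTack implementation: the function name and the input representation (one motif string per fs-gene, or `None` when no programmed frameshift motif was found near the predicted frameshift) are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def is_programmed_frameshift_cluster(
    motifs: Sequence[Optional[str]],
    min_genes: int = 5,
    min_fraction: float = 0.5,
) -> bool:
    """Check the two stated criteria for one cluster.

    `motifs` holds one entry per fs-gene (hypothetical representation):
    the motif found near the predicted frameshift, or None if there is none.
    """
    if len(motifs) < min_genes:
        return False
    # Count only fs-genes that actually carry a motif.
    counts = Counter(m for m in motifs if m is not None)
    if not counts:
        return False
    _, top_count = counts.most_common(1)[0]
    # "More than 50%" is a strict inequality over all fs-genes in the cluster.
    return top_count / len(motifs) > min_fraction
```

Under this reading, a cluster in which exactly half of the fs-genes share a motif does not qualify.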
-
11 phase variation clusters contain 2,330 fs-genes. Each cluster contains 100+ fs-genes.
-
18 translational coupling clusters contain 3,198 fs-genes. Each cluster contains 100+ fs-genes.
-
2,810 pseudogene clusters contain 10,290 fs-genes with 5,484 of them annotated as pseudogenes. Each cluster contains at least one annotated pseudogene and fs-genes from no more than two different genera (programmed frameshift clusters are excluded).
-
1,200 clusters of genes with indel mutations contain 3,522 fs-genes. In each cluster, more than 50% of the fs-genes are BLASTp-validated, and the fs-genes come from no more than two different genera (programmed frameshift clusters are excluded).
-
11,433 pseudogene singletons (fs-genes that are annotated as pseudogenes but did not cluster):
- Intact fs-genes -- nucleotide sequences of the fs-genes containing frameshifts
- Corrected fs-genes -- nucleotide sequences of the single-ORF fs-genes, obtained by correcting the predicted frameshift: either 1 nt (for +1 frameshifts) or 2 nt (for -1 frameshifts) is removed at the predicted frameshift position
- Conceptual translations -- amino acid sequences obtained by translating the corrected fs-genes
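The relationship between the three file types can be illustrated with a short sketch: remove 1 nt (+1 frameshift) or 2 nt (-1 frameshift) at the predicted position, then translate the resulting single ORF. This is an illustration under stated assumptions, not the GeneTack code: the function names are hypothetical, the position is assumed to be a 0-based index of the first removed nucleotide, and the codon assignments are those of the standard genetic code (which the bacterial table 11 shares).

```python
from itertools import product

# Standard codon assignments, enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def correct_frameshift(seq: str, pos: int, shift: int) -> str:
    """Return the corrected fs-gene sequence.

    pos   -- assumed 0-based index of the first nucleotide to remove
    shift -- +1 or -1, the direction of the predicted frameshift
    """
    if shift == 1:
        n_removed = 1
    elif shift == -1:
        n_removed = 2
    else:
        raise ValueError("shift must be +1 or -1")
    return seq[:pos] + seq[pos + n_removed:]


def translate(seq: str) -> str:
    """Translate frame 0 codon by codon; a trailing partial codon is ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    # Codons with ambiguous bases are rendered as 'X'.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# Toy example: one extra nucleotide ("C") disrupts the ORF ATG AAA GGG TTT TAA.
intact = "ATGAAACGGGTTTTAA"
corrected = correct_frameshift(intact, pos=6, shift=1)
protein = translate(corrected)
```

In this toy case `intact` plays the role of an intact fs-gene, `corrected` of the corrected fs-gene, and `protein` of its conceptual translation.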
-
3,244 singletons that are either sequencing errors or indel mutations (BLASTp- and Pfam-validated fs-genes that are not annotated as pseudogenes):
-
Clusters grouped by the overrepresented heptamer in the frameshift vicinity
-
Clusters containing 100+ fs-genes
-
Clusters containing 50 - 100 fs-genes