
Background Information: What are Transposons?

Jumping genes

Transposons are sequences of DNA that can move, or transpose, themselves to new positions within the genome of a single cell. The mechanism of transposition can be either "copy and paste" (Class I) or "cut and paste" (Class II).
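The difference between the two mechanisms can be sketched on a toy string genome. This is only an illustration under simplifying assumptions: the function names, the toy sequence and the random choice of insertion site are all hypothetical, and the copy-and-paste function skips the RNA intermediate that real Class I elements use.

```python
import random

def cut_and_paste(genome, start, length, rng):
    """Class II sketch: excise the element and reinsert it somewhere else."""
    element = genome[start:start + length]
    remainder = genome[:start] + genome[start + length:]
    target = rng.randrange(len(remainder) + 1)  # hypothetical random insertion site
    return remainder[:target] + element + remainder[target:]

def copy_and_paste(genome, start, length, rng):
    """Class I sketch: keep the original element and insert a new copy."""
    element = genome[start:start + length]
    target = rng.randrange(len(genome) + 1)  # hypothetical random insertion site
    return genome[:target] + element + genome[target:]

rng = random.Random(0)
genome = "AAAAAA" + "TGCATG" + "CCCCCC"  # toy genome, element TGCATG starts at index 6
moved = cut_and_paste(genome, 6, 6, rng)
copied = copy_and_paste(genome, 6, 6, rng)
print(len(genome), len(moved), len(copied))  # 18 18 24
```

Cut and paste leaves the genome length unchanged, while each copy-and-paste event adds one element length to the genome.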

Transposons can create mutations and significantly alter genome size.

Class I / Retrotransposons: DNA → RNA → (RT) → DNA → inserted into the genome (similar to retroviruses such as HIV).

RT: reverse transcriptase, an enzyme often encoded by the transposon itself.

3 classes: viral, LINEs, and the nonviral superfamily.

LTR retrotransposons - long terminal repeats (LTRs), ~100 bp to 5 kb in size
a) Ty1-copia retrotransposons
b) Ty3-gypsy retrotransposons

Non-LTR retrotransposons - high copy number (250,000) in plants; prevalent in eukaryotic genomes.

LINEs - high copy number. They are transcribed to RNA from an RNA polymerase II promoter that lies within the LINE itself. A LINE encodes either an AP-type or a REL-type endonuclease, and the REL type shows specificity in where the DNA is inserted. The 5′ UTR contains the promoter sequence and the 3′ UTR contains the polyadenylation sequence. Because they are copy-and-paste elements, they enlarge the genome. LINEs make up 17% of the human genome.

SINEs - reverse-transcribed RNA molecules that were originally transcribed by RNA polymerase III into tRNA, rRNA and other small RNAs. They do not encode an RT enzyme; instead they depend on other mobile elements for transposition. The most common SINEs in primates are Alu elements (~280 bp), with 1,500,000 copies in the human genome (~11% of genome size). B1 elements are found in mouse.

While historically viewed as "junk DNA", recent research suggests that in some rare cases both LINEs and SINEs were incorporated into novel genes and so evolved new functionality.
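As a rough sanity check on the Alu figures, copy number times element length can be compared with the genome size. The sketch below assumes a human genome of about 3.1 billion bp, a value that is not given in these notes, and it assumes every copy is full length, so it should be read as an upper-bound estimate only.

```python
def coverage_fraction(copies, mean_length_bp, genome_size_bp):
    """Fraction of the genome occupied if every copy has the given length."""
    return copies * mean_length_bp / genome_size_bp

ALU_COPIES = 1_500_000           # from the notes
ALU_LENGTH_BP = 280              # from the notes
HUMAN_GENOME_BP = 3_100_000_000  # assumption, not stated in the notes

alu_total_bp = ALU_COPIES * ALU_LENGTH_BP
alu_fraction = coverage_fraction(ALU_COPIES, ALU_LENGTH_BP, HUMAN_GENOME_BP)
print(alu_total_bp)                  # 420000000
print(round(100 * alu_fraction, 1))  # 13.5
```

The naive product lands in the same range as the ~11% quoted. The gap may come from truncated copies or from a different genome size estimate, but the notes do not say.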

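[Figure: classification of mobile genetic elements. Mobile genetic elements divide into transposons and retroposons; retroposons divide into viral retroposons and non-viral retroposons. Leaf labels as drawn: LTR retroposons, retroviruses, LINEs; SINEs, processed pseudogenes.]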

LINEs usually show a moderately or severely truncated 5′ end, which suggests that the RTase they encode recognizes the 3′ end of the RNA template to begin first-strand synthesis.

SINEs are distinguished by large copy number, relatively short length, and an inability to encode enzymes such as RTase. They are typically between 70 and 500 bases in length, with ~10,000 copies. They are among the most repetitive elements in mammalian genomes and have exceptional diagnostic power for establishing common ancestry among taxa.

Most SINEs are derived from tRNA and are believed to recombine and interact functionally with corresponding LINEs in order to acquire RTase activity. Although SINEs do not encode the necessary enzymes themselves, there is increasing evidence that they may acquire the enzymes needed for retroposition from corresponding LINEs, which do code for reverse transcriptase, thus becoming capable of self-replication. A major clue to the possible functional relationship between SINEs and corresponding LINEs is the direct homology evident between the 3′ sequence tails in the tRNA-unrelated region of SINEs and those of certain LINEs.

Insertion events from hundreds of thousands of SINEs add great fluidity to genomes and have the potential to influence their architecture to a much greater degree than other common molecular genetic mechanisms. The retroposition of SINEs in concert with corresponding partner LINEs is quite likely an ancient molecular process, consistent with the intriguing hypothesis that historical retroposon activity may have played an important role in landmark radiations of eukaryotic taxa.

The two SINE families that were discovered first, B1 of mice and Alu of humans, originate from 7SL RNA. Both families are composed of sequences corresponding to the terminal regions of 7SL RNA, with the central 144 to 184 nucleotides deleted. Alu (300 bp) is a dimer, apparently formed by fusion of two similar but not identical monomers, whereas murine B1 (140 bp) is a monomer. However, B1 contains an internal 29 bp duplication, so it can be called a quasi-dimer.

Alu and B1 Repeats Have Been Selectively Retained in the Upstream and Intronic Regions of Genes of Specific Functional Classes
Aristotelis Tsirigos et al.

Alu and B1 elements have been selectively retained in the upstream and intronic regions of genes belonging to specific functional classes. At the same time, the authors found no evidence for selective loss of these elements in any functional class. A subset of the functional links the authors discovered corresponds to functions where Alu involvement has been experimentally validated, whereas the majority of the functional links they report are novel. Finally, the unexpected finding that Alu and B1 elements show similar biases in their distribution across functional classes, despite having spread independently in their respective genomes, further supports the authors' claim that the extant instances of Alu and B1 elements are the result of positive selection.

The presence of these SINEs in a wide range of rodents (15 families) as well as in primates has been reported, along with their absence in other mammals.

Alu and B1 element distribution in the genome:
- Alu elements: ~1.1 million copies, covering about 5.4% of the human genome
- B1 elements: ~1.2 million copies, covering about 3.6% of the mouse genome

Their densities were studied separately using parameters such as distance from gene transcript start positions, direction (upstream vs. downstream), and orientation (sense vs. antisense). Alu and B1 elements are significantly over-represented in the upstream regions of genes, with the highest densities observed within the window ending 16 kb upstream of gene transcript start positions. Likewise, on the intronic downstream side, enrichment was observed in the 16 kb region, more so in the antisense orientation. Alu and B1 elements are significantly under-represented in the exon regions of genes.
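The bookkeeping behind such a density analysis can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data layout (lists of position and strand tuples), the function names and the toy coordinates are all assumptions, and only the 16 kb window comes from the text.

```python
from collections import Counter

def classify(element_pos, element_strand, tss, gene_strand, window_bp=16_000):
    """Return (direction, orientation) relative to a gene, or None outside the window."""
    offset = element_pos - tss
    if gene_strand == "-":
        offset = -offset  # flip so that a negative offset always means upstream
    if abs(offset) > window_bp:
        return None
    direction = "upstream" if offset < 0 else "downstream"
    orientation = "sense" if element_strand == gene_strand else "antisense"
    return direction, orientation

def densities(elements, genes, window_bp=16_000):
    """Elements per kb of window for each (direction, orientation) label."""
    counts = Counter()
    for tss, gene_strand in genes:
        for pos, strand in elements:
            label = classify(pos, strand, tss, gene_strand, window_bp)
            if label is not None:
                counts[label] += 1
    total_kb = len(genes) * window_bp / 1000
    return {label: n / total_kb for label, n in counts.items()}

# Toy data: (transcript start, strand) for genes, (position, strand) for elements.
genes = [(100_000, "+"), (200_000, "-")]
elements = [(90_000, "+"), (95_000, "-"), (105_000, "-"), (210_000, "-"), (260_000, "+")]
result = densities(elements, genes)
for label, value in sorted(result.items()):
    print(label, value)
```

Flipping the offset for minus-strand genes keeps "upstream" defined relative to the direction of transcription, which is what lets sense and antisense elements be compared across genes on both strands.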

The authors associated Alu elements with functional classes by performing a genome-wide analysis on the latest release of the human genome annotations and applying a distribution-free statistical test with multiple hypothesis testing correction. For the upstream analysis they examined a 16 kb window, i.e. the window where they find that the Alu density is maximized (see above). They also examined the possibility that intronic instances might be linked to specific functional classes, treating sense and antisense orientations separately. As a result, the authors were able to associate at least four times more functional classes with Alu elements than they would have been able to had they considered only 5 kb upstream regions.

After finding the functional associations, tests were done to find an explanation for the observed functional biases. Each gene's upstream or intronic region was labeled with the GO terms attributed to the corresponding spliced transcripts, and the authors tested whether Alu densities are significantly higher in the upstream or intronic regions of genes associated with certain GO terms. The result was that upstream and intronic Alu instances are not randomly distributed; instead they are located, significantly more frequently than expected, inside upstream and intronic regions (in either the sense or antisense direction) of genes belonging to specific functional classes, i.e. GO terms.
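The notes do not name the distribution-free test or the correction method, so the sketch below swaps in two common choices as assumptions: a one-sided permutation test on mean density, and the Benjamini-Hochberg adjustment. The GO term names and density values are toy data, and all function names are hypothetical.

```python
import random

def permutation_p_value(in_class, others, n_perm=2000, seed=0):
    """One-sided p-value that mean density in `in_class` exceeds that in `others`."""
    rng = random.Random(seed)
    observed = sum(in_class) / len(in_class) - sum(others) / len(others)
    pooled = list(in_class) + list(others)
    k = len(in_class)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign genes to the two groups at random
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy Alu densities: (genes annotated with the term, all other genes).
go_terms = {
    "term_A": ([9, 8, 10, 9, 11, 10], [3, 4, 2, 5, 3, 4, 2, 3]),
    "term_B": ([4, 3, 5, 2], [3, 4, 2, 5, 3, 4, 3, 4]),
}
names = list(go_terms)
raw = [permutation_p_value(*go_terms[name]) for name in names]
adjusted = benjamini_hochberg(raw)
significant = [name for name, p in zip(names, adjusted) if p < 0.01]
print(significant)
```

A permutation test makes no assumption about the shape of the density distribution, which is the property "distribution-free" refers to. A rank-based test would be another reasonable choice under the same description.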

The colors in the columns labeled Alu and B show, for each GO term, whether it is associated with upstream (U), sense intronic (I+), or antisense intronic (I-) regions. GO terms are considered significant when their adjusted p-values are less than 0.01.

Average pair-wise sequence similarities involving Alu and B1 elements

Given that the sequences of B1 elements are so different from those of Alu elements, and that the current distribution of Alu and B1 elements has been shaped independently in each of the two genomes through initial random spreading and subsequent loss of certain copies, one would expect the functional associations of B1 elements in upstream and intronic regions of genes to differ from the ones described in the previous section.

However, the authors found that the set of functions associated with B1 elements contains 83.2% of the functions associated with Alu elements (expected = 12.2 ± 2.0%). The fact that this result is observed independently in the mouse genome further strengthens the authors' claim that these two types of SINE elements have been selectively retained in genes of certain functional classes, rather than selectively lost from certain genes.
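The containment figure is a simple set calculation, sketched below on toy GO term labels. How the authors derived the expected value is not described in these notes; the random-draw baseline here is an assumption made for illustration, as are the function names and the size of the term universe.

```python
import random

def containment_percent(reference, other):
    """Percent of `reference` terms that also appear in `other`."""
    reference, other = set(reference), set(other)
    return 100.0 * len(reference & other) / len(reference)

def expected_containment(reference, other_size, universe, n_draws=1000, seed=0):
    """Mean and standard deviation of containment when `other` is drawn at random."""
    rng = random.Random(seed)
    pool = sorted(universe)  # fixed order so that the draws are reproducible
    values = [
        containment_percent(reference, rng.sample(pool, other_size))
        for _ in range(n_draws)
    ]
    mean = sum(values) / n_draws
    sd = (sum((v - mean) ** 2 for v in values) / n_draws) ** 0.5
    return mean, sd

# Toy term sets; the labels are placeholders, not real GO identifiers.
universe = {f"t{i}" for i in range(1, 51)}
alu_terms = {"t1", "t2", "t3", "t4", "t5", "t6"}
b1_terms = {"t1", "t2", "t3", "t4", "t5", "t9", "t10"}

observed = containment_percent(alu_terms, b1_terms)
mean, sd = expected_containment(alu_terms, len(b1_terms), universe)
print(round(observed, 1))  # 83.3
print(round(mean, 1), round(sd, 1))
```

An observed containment far above the random baseline is the pattern the comparison relies on: independent spreading alone would be expected to give an overlap close to the baseline.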

Approaches to identify TEs in the genome

TE features for computational identification:
- the existence of inverted repeat sequences at either end of the TE
- possession of a transposase ORF (Open Reading Frame)
- their existence as multiple copies

Identification of TEs based on their features has enabled the construction of libraries of consensus sequences for various types of TEs.

RepBase Update (RU), a service of the Genetic Information Research Institute (GIRI), is a comprehensive database of repetitive element consensus sequences.

RepeatMasker is software that screens DNA sequences for low-complexity sequences and repetitive elements/TEs, including small RNA pseudogenes, Alus, LINEs, SINEs, LTR elements, and others, producing a detailed annotation that identifies all of the repetitive elements in a query sequence. RepeatMasker makes use of RepBase libraries, which act as reference points for the identification of repetitive elements in a query sequence, and it employs a scoring system to ensure that only statistically significant alignments are shown.

GenBank actually returns cDNAs.
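The first feature listed above, inverted repeats at either end of an element, can be checked with a few lines of code. This is a minimal sketch that only finds exact terminal inverted repeats in a candidate sequence. It is not how RepeatMasker works, real tools tolerate mismatches, and the function names and toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_inverted_repeat_length(seq, max_len=50):
    """Length of the longest prefix whose reverse complement equals the suffix."""
    seq = seq.upper()
    best = 0
    limit = min(max_len, len(seq) // 2)
    for k in range(1, limit + 1):
        if reverse_complement(seq[:k]) == seq[-k:]:
            best = k
        else:
            break  # a mismatch at length k rules out every longer exact repeat
    return best

# Toy element: an 8 bp inverted repeat flanking an unrelated internal region.
candidate = "CAGGGTTG" + "ACGTACGTAA" + "CAACCCTG"
print(terminal_inverted_repeat_length(candidate))  # 8
```

A candidate with a non-zero result would still need the other two features, a transposase ORF and multiple genomic copies, before being treated as a likely TE.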
