
4/2/12: Lecture 1

Chapter 6:
• About 98.5% of human DNA is noncoding (does not encode proteins or functional RNAs)
• Noncoding DNA contains many repetitious DNA regions
• Transposable DNA elements- sequences that can copy themselves and move throughout the genome -> contribute to the evolution of multicellular organisms
• Genes- DNA regions encoding proteins or functional RNAs
• Introns- noncoding regions within genes
• Exons- coding regions
• Mitochondria and chloroplasts also contain DNA
• Chromatin- complex of DNA and the proteins that organize it

6.1: Eukaryotic Gene Structure
• Gene- the entire nucleic acid sequence that is necessary for the synthesis of a functional gene product (polypeptide or RNA)
  o Contains not only the coding regions (exons) but also introns and the DNA sequences required for synthesis of a particular RNA transcript (ex: enhancers, promoters, splice sites)
  o DNA sequences that encode tRNAs, rRNAs, and microRNAs are also considered genes
• Polycistronic mRNA- mRNA that includes the coding regions for several proteins that function together in a biological process
  o Cistron- a genetic unit encoding a single polypeptide
  o Monocistronic- encodes a single protein
• Eukaryotic vs. Bacterial Cells:
  o Bacterial- a ribosome-binding site is located near the start site of each cistron -> translation can begin at any of these sites; genes usually lack introns
  o Eukaryotic- the 5′ cap directs ribosome binding and translation begins at the closest AUG start codon -> each mRNA produces a single type of polypeptide
    ▪ Genes contain introns (usually longer than the exons), which are removed during RNA processing in the nucleus before the mRNA is transported to the cytosol for translation
• Transcription unit- a region of DNA transcribed from a specific promoter to a termination site -> produces a single primary transcript
  o Bacteria- a single transcription unit contains several genes when they are part of an operon
  o Eukaryotes- genes are expressed from separate transcription units
  o Simple transcription unit- produces a single monocistronic mRNA, which is translated into a single protein
  o Complex transcription unit- transcribed into a primary transcript that can be processed into two or more different monocistronic mRNAs
    ▪ Due to alternative splice sites, multiple poly(A) sites, or multiple promoters
    ▪ Isoforms- the various proteins encoded by the alternatively spliced mRNAs expressed from one gene
    ▪ Many complex transcription units express one mRNA in one cell type and an alternate mRNA in a different cell type
• Two types of protein-coding genes:
  o Solitary genes- represented only once in the haploid genome (about 50% of protein-coding genes); example: chicken lysozyme
  o Duplicated genes- genes with close but nonidentical sequences that often are located within 5-50 kb of one another
    ▪ Gene family- a set of duplicated genes that encode proteins with similar (homologous) but not identical amino acid sequences; the encoded proteins have similar but not identical properties and functions and form a protein family
    ▪ Examples: hemoglobin; proteins that make up the cytoskeleton (actin, tubulin, intermediate filament proteins)
    ▪ How does it occur? Unequal crossing over during meiotic recombination -> two copies of the gene -> over time the copies accumulated random mutations -> mutations were retained by natural selection (sequence drift)
    ▪ Sometimes duplicated genes are found on different chromosomes, separated by chromosomal translocation
  o Pseudogenes- nonfunctional sequences similar to the functional gene; they also result from gene duplication (see above) but with NO selective pressure to maintain function; they contain sequences that either terminate translation or block mRNA processing
• Heavily Used Gene Products Are Encoded by Multiple Copies of Genes
  o Tandemly repeated arrays- copies of a sequence appear one after the other over a long stretch of DNA
    ▪ Include genes encoding ribosomal RNAs and some other nonprotein-coding RNAs
    ▪ Encode identical or nearly identical proteins or functional RNAs
    ▪ Nontranscribed spacer regions between the transcribed regions can vary
    ▪ Needed to meet the great cellular demand for transcripts (more RNA is required than can be transcribed from one gene)
  o Note: While tRNA and histone genes often occur in clusters, they do not occur in tandem arrays in the human genome
• Nonprotein-Coding Genes
  o snRNAs (small nuclear RNAs)- function in RNA splicing
  o snoRNAs (small nucleolar RNAs)- function in rRNA processing and base modification in the nucleolus
  o RNase P RNA- functions in tRNA processing
  o miRNAs (microRNAs)- regulate the stability and translation of specific mRNAs
  o RNA in telomerase- functions in maintaining the sequence at the ends of chromosomes
  o 7SL RNA- functions in the import of secreted proteins and most membrane proteins into the endoplasmic reticulum

6.2: Chromosomal Organization of Genes and Noncoding DNA
• Genomes of many organisms contain nonfunctional DNA
  o The majority is intron sequence that is removed by RNA splicing
  o Only about 1/3 is transcribed into pre-mRNA precursors or nonprotein-coding RNAs -> the remaining 2/3 is noncoding DNA between genes, as well as the repeated DNA sequences that make up the centromeres and telomeres of the human chromosomes (exons = 1.5% in humans!)
  o Why are there differences in the # of introns between species? Different selective pressures -> synthesizing nonfunctional DNA requires time, nutrients, and energy. For vertebrates, this cost is trivial compared to that needed for the function of muscles -> little selective pressure to eliminate nonfunctional DNA in vertebrates
• Classes of Eukaryotic Genomic DNA:
  o Genes encoding proteins and functional RNAs
  o Repetitious DNA
  o Spacer DNA
• Most Simple-Sequence DNAs Are Concentrated in Specific Chromosomal Locations
  o Repetitious DNA:
    ▪ Simple-sequence DNA (satellite DNA)- constitutes 6% of the human genome; composed of perfect or nearly perfect repeats of relatively short sequences
    ▪ Microsatellites- repeats contain 1-13 base pairs; caused by backward slippage of a daughter strand on its template strand during DNA replication, so that the same short sequence is copied twice
    ▪ Interspersed repeats- composed of much longer sequences
  o Simple-sequence DNAs are localized to specific chromosomal regions (near centromeres, which attach to spindle microtubules) -> the sequences are required to form centromeric heterochromatin, which is necessary for proper segregation of chromosomes to daughter cells. Also found at the telomeres.
• DNA Fingerprinting Depends on Differences in Length of Simple-Sequence DNA
  o Nucleotide sequences of the repeat units are highly conserved among individuals, but the number of repeats, and thus the length of simple-sequence tandem arrays, is quite variable among individuals
    ▪ Cause: unequal crossing over during meiosis
  o Minisatellites- simple-sequence DNA in relatively short 1- to 5-kb regions made up of 20-40 repeat units, each containing 14-100 base pairs -> create DNA polymorphisms
  o Differences can be detected by PCR using different primers -> DNA fingerprinting
• Unclassified Spacer DNA Occupies a Significant Portion of the Genome
  o 25% of human DNA lies between transcription units and is not repeated anywhere else in the genome
    ▪ Cause: arose from transposable elements that accumulated so many mutations over time that they can no longer be recognized as such
    ▪ Transcriptional-control regions also lie in this DNA -> help regulate transcription from distant promoters
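
A rough sketch of the fingerprinting idea from this section: the repeat unit is the same in every individual, but the copy number differs, so a PCR product that spans the array differs in length. The repeat unit, flanking sequences, and copy numbers below are made up for illustration, and `count_tandem_repeats` and `build_allele` are hypothetical helper names, not functions from any library.

```python
def count_tandem_repeats(sequence: str, unit: str) -> int:
    """Return the longest run of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Walk forward one unit at a time while the unit keeps matching.
        while sequence.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


def build_allele(left_flank: str, unit: str, copies: int, right_flank: str) -> str:
    """Assemble a toy minisatellite allele: flank + tandem array + flank."""
    return left_flank + unit * copies + right_flank


# Assumed values, for illustration only.
LEFT_FLANK = "ACGTACGTAC"          # stands in for the site of one PCR primer
RIGHT_FLANK = "TTGACCATGA"         # stands in for the site of the other primer
REPEAT_UNIT = "GGAGGTGGGCAGGAAG"   # 16 bp, inside the 14-100 bp range in the notes

allele_a = build_allele(LEFT_FLANK, REPEAT_UNIT, 22, RIGHT_FLANK)
allele_b = build_allele(LEFT_FLANK, REPEAT_UNIT, 35, RIGHT_FLANK)

for name, allele in (("A", allele_a), ("B", allele_b)):
    copies = count_tandem_repeats(allele, REPEAT_UNIT)
    print(f"allele {name}: {copies} repeats, PCR product {len(allele)} bp")
```

With these toy values the two products come out at 372 bp and 580 bp, a difference that would separate cleanly on a gel even though the repeat unit itself is identical in both alleles.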

4/3/12: Lecture 2

6.3 Transposable (Mobile) DNA Elements
• Interspersed repeats- composed of a very large # of copies of relatively few sequence families; can move in the genome -> called transposable DNA elements or mobile DNA elements
  o No specific function in the biology of the organism; they appear to exist only to maintain themselves
  o Have multiplied and slowly accumulated in eukaryotic genomes over evolutionary time
  o Can bring about chromosomal DNA rearrangements during evolution
• Transposition- the process by which these sequences are copied and inserted into a new site in the genome
  o Occurs rarely
  o Usually has no deleterious effects
• Movement of Mobile Elements Involves a DNA or an RNA Intermediate
  o DNA transposons- transpose directly as DNA -> excise themselves from one place and insert into another (cut and paste)
  o Retrotransposons- transpose via an RNA intermediate -> first transcribed into an RNA molecule, which is then reverse-transcribed into double-stranded DNA (copy and paste)
• DNA Transposons Are Present in Prokaryotes and Eukaryotes
  o Bacteria- mostly DNA transposons
  o Eukaryotes- mostly retrotransposons
• Bacterial Insertion Sequences (IS)- stretches of DNA that insert into new sites by transposition (IS elements)
  o Inverted repeat- ~50 base pairs that are invariably present at each end of an insertion sequence. The 5′->3′ sequence on one strand is repeated on the other strand (also reading 5′->3′)
  o Direct-repeat sequence- 5-11 base pairs, immediately adjacent to both ends of the inserted element. The length is characteristic of each type of IS, but the sequence depends on the target site where a particular copy of the IS element is inserted
  o Transposase- an enzyme required for transposition of the IS element to a new site
    ▪ Encoded in the region between the inverted repeats
    ▪ Expressed very rarely
  o Mechanism: cut and paste, as for other DNA transposons
    ▪ Transposase excises the IS element from the donor DNA
    ▪ Makes staggered cuts in a short sequence in the target DNA
    ▪ Ligates the 3′ termini of the IS element to the 5′ ends of the cut target DNA
• Eukaryotic DNA Transposons:
  o Activator (Ac) element- agent responsible for mutations that revert at high frequency; equivalent to IS elements
    ▪ Contains inverted terminal repeat sequences that flank the coding region for a transposase
  o Dissociation (Ds) element- responsible for mutations that do not revert unless they occur in the presence of the mutations that revert at high frequency (i.e., in the presence of Ac)
    ▪ Deleted forms of the Ac element in which a portion of the sequence encoding transposase is missing
    ▪ Cannot move by itself
• Retrotransposons
  o LTR (long terminal repeat) retrotransposons- common in yeast and in Drosophila
    ▪ Have LTRs flanking the central protein-coding region
    ▪ Mechanism:
      • The left LTR directs RNA polymerase to initiate transcription -> formation of a primary transcript extending beyond the right LTR
      • Enzymes cleave the primary transcript at the right LTR and add a poly(A) tail
      • The RNA intermediate exits the nucleus
      • Reverse transcription in the cytosol -> produces DNA
      • The DNA is transported back into the nucleus in a complex with integrase
      • Integrase inserts the DNA into the chromosome; short direct repeats are generated at the ends of the inserted DNA
  o Non-LTR retrotransposons- nonviral retrotransposons. Form two classes in mammalian genomes:
    ▪ LINEs (long interspersed elements)- L1, L2, and L3 families; only L1 still transposes in the contemporary human genome
      • Flanked by short direct repeats and contain 2 ORFs (open reading frames)
      • Mechanism:
        o LINE DNA is transcribed into LINE RNA -> translated into ORF1 and ORF2 proteins -> multiple copies of ORF1 protein and one copy of ORF2 protein bind the LINE RNA (ORF2 at the poly(A) tail) -> the complex moves into the nucleus
        o ORF2 protein makes staggered nicks in chromosomal DNA
        o Reverse transcription of LINE RNA by ORF2 is primed by the single-stranded T-rich sequence generated by the nick in the bottom strand, which hybridizes to the poly(A) tail -> forms LINE DNA
        o ORF2 continues DNA synthesis, now using the upper DNA strand as the template (no longer the RNA)
        o Cellular enzymes then hydrolyze the RNA and extend the 3′ end of the chromosomal DNA top strand, replacing the LINE RNA strand with DNA
        o The ends of the DNA strands are ligated
        o A short direct repeat is generated at the insertion site
    ▪ SINEs (short interspersed elements)
      • Transcribed by the same nuclear RNA polymerase that transcribes genes encoding tRNAs, etc. (RNA polymerase III)
      • ORF1 and ORF2 proteins expressed from LINEs mediate reverse transcription and integration of SINEs
      • SINE RNAs compete with LINE RNAs for binding to these proteins
      • Alu elements- SINEs that contain a single recognition site for the restriction enzyme AluI; scattered at sites where insertion does not disrupt gene expression
  o Processed pseudogenes- DNA segments that are retrotransposed copies of spliced and polyadenylated mRNA
    ▪ Lack introns and do not have flanking sequences similar to those of the functional gene copies
    ▪ Contain multiple mutations
    ▪ Flanked by short direct repeats
• Transposons Influenced Evolution
  o Cause mutations -> hemophilia, myotonic dystrophy
  o Promote generation of gene families via gene duplication
    ▪ Unequal homologous crossing over between two L1 sequences
  o Promote the creation of new genes via shuffling of preexisting exons
  o Promote formation of more complex regulatory regions that provide multifaceted control of gene expression
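
The two sequence signatures of an IS element covered earlier in this section (inverted repeats at the ends of the element, direct repeats of the target site on either side of it) can be illustrated with a small string model. This is only a sketch under stated assumptions: the sequences are invented, the inverted repeats are shortened to 6 bp for readability, and the function names are hypothetical. It also assumes the standard picture that the single-stranded gaps left by the staggered cuts are filled in by a DNA polymerase, which is what duplicates the target site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_inverted_repeats(element: str, repeat_len: int) -> bool:
    """True if the last `repeat_len` bases are the reverse complement
    of the first `repeat_len` bases of the element."""
    return element[-repeat_len:] == reverse_complement(element[:repeat_len])


def insert_element(target: str, element: str, cut_pos: int, stagger: int) -> str:
    """Insert `element` into `target` at a staggered cut.

    The `stagger` bases starting at `cut_pos` are left single-stranded on
    both sides of the insert. After gap filling, that short target sequence
    is present twice, as a direct repeat flanking the element.
    """
    if cut_pos < 0 or stagger < 1 or cut_pos + stagger > len(target):
        raise ValueError("staggered cut must lie inside the target")
    site = target[cut_pos:cut_pos + stagger]
    return target[:cut_pos] + site + element + site + target[cut_pos + stagger:]


# Toy IS element: 6-bp inverted repeats (real ones are ~50 bp) around a
# placeholder standing in for the transposase-coding region.
LEFT_END = "GGCCAT"
ELEMENT = LEFT_END + "TTTTAAAACCCC" + reverse_complement(LEFT_END)
TARGET = "AAACGTACGGTTT"

inserted = insert_element(TARGET, ELEMENT, cut_pos=3, stagger=5)
print(has_inverted_repeats(ELEMENT, 6))
print(inserted)
```

Changing `cut_pos` changes the sequence of the flanking direct repeat but not its length, which matches the note that the repeat length is characteristic of the IS type while its sequence depends on the target site.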

6.5: Genomics: Genome-Wide Analysis of Gene Structure and Expression
• Bioinformatics- field that uses computers to analyze gene sequence data
• Stored Sequences Suggest Functions of Newly Identified Genes and Proteins
  o Newly cloned gene with unknown function -> compare the amino acid sequence of its protein with that of a protein of known function to infer the function
  o Comparing amino acid sequences is more reliable than comparing DNA sequences (due to the degeneracy of the genetic code)
  o Use BLAST (basic local alignment search tool) to compare protein sequences
  o Structural motifs- short segments that recur in many different proteins
• Comparison of related sequences from different species can give clues to evolutionary relationships among proteins
  o Homologous- similar enough in sequence to suggest a common ancestral sequence
  o Paralogous- sequences that diverged as a result of gene duplication
  o Orthologous- sequences that diverged as a result of speciation (similar function in different organisms)
• Genes can be identified within genomic DNA sequences
  o Open reading frame (ORF)- a stretch of DNA containing at least 100 codons that begins with a start codon and ends with a stop codon; most of the time it encodes a protein
    ▪ Use ORF analysis for yeast and bacteria
    ▪ Poor method for finding genes in higher eukaryotes, since their genes contain more intron than exon sequence -> instead use alignment or hybridization of the query sequence to a full-length cDNA
  o Currently about 25,000 identified genes -> for about 10,000 of them it is not certain whether they encode proteins or RNAs
  o Compare the human genome with the mouse genome -> regions that are similar are likely to be functional coding regions
• The number of protein-coding genes in an organism's genome is not directly related to its biological complexity. Higher eukaryotes have fewer genes than their complexity might suggest, but are complex because:
  o Alternative splicing of pre-mRNA yields multiple functional mRNAs corresponding to a particular gene
  o Variations in the post-translational modification of some proteins produce functional differences
  o The increased # of cells can interact in a more complex manner
• Single Nucleotide Polymorphisms and Gene Copy-Number Variation Are Important Determinants of Differences Between Individuals of a Species
  o Single nucleotide polymorphisms (SNPs)- single-base differences in DNA sequence between individual humans who are not closely related
    ▪ Usually not functionally significant because most occur in noncoding DNA such as introns; however, they account for differences between individuals
    ▪ Important markers for measuring the frequency of recombination between genes
  o Gene copy number- # of copies of a DNA sequence per cell
    ▪ Varying deletions and duplications- arose from unequal crossing over between chromosomes during meiotic recombination in a direct ancestor
    ▪ Examples of how different gene copy numbers arise -> a deletion on one chromosome but not the other, a duplication on one chromosome but not the other, or a duplication on both
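
The ORF definition used in this section (start codon, at least 100 codons, stop codon) translates almost directly into a scan over reading frames. The sketch below is a simplified illustration, not a real gene finder: `find_orfs` is a hypothetical name, only the three frames of the given strand are scanned, only the first ATG in each open stretch is tracked, and the 100-codon threshold is assumed to count the start codon but not the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100) -> list[tuple[int, int]]:
    """Return (start, end) positions of ORFs on the given strand.

    `start` is the index of the A in ATG; `end` is one past the last base of
    the stop codon, so dna[start:end] is the full ORF including the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3    # ATG through the last sense codon
                if n_codons >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ATG
    return orfs


# Toy sequence: ATG plus 99 alanine codons, then a stop, with 2 extra
# bases on each side so the ORF sits in the third frame.
toy = "CC" + "ATG" + "GCT" * 99 + "TAA" + "GG"
print(find_orfs(toy))
```

On a bacterial or yeast sequence this kind of scan finds most genes; on a higher eukaryote it would mostly miss them, because introns interrupt the reading frame, which is the limitation noted above.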

6.6 Structural Organization of Eukaryotic Chromosomes
• Histones- nuclear proteins that compact and organize chromosomal DNA
• DNA is condensed during metaphase but is dispersed throughout the nucleus during interphase (when the cell is not in mitosis)
• Chromatin Exists in Extended and Condensed Forms
  o Chromatin- nucleoprotein complex consisting of DNA and histone proteins
  o 5 major types of histones: H1, H2A, H2B, H3, and H4 -> positively charged
  o Low salt concentration -> looks like beads on a string (free DNA called linker DNA connects beadlike structures called nucleosomes) -> extended form (10 nm diameter)
  o Physiological ionic strength -> more condensed fiber, 30 nm in diameter
• Structure of Nucleosomes- the form of chromatin present during transcription
  o DNA in nucleosomes is less susceptible to nuclease digestion than linker DNA
  o Consists of a protein core with DNA wound around its surface
  o Core = two copies each of H2A, H2B, H3, and H4 (octamer)
  o Chaperones- bind to histones and assemble them together with newly replicated DNA into nucleosomes
• Structure of the 30 nm Fiber- not transcribed
  o Zig-zag structure wound into a two-start helix, with 2 strands of nucleosomes stacked on top of each other like coins
  o Includes H1
  o Left-handed
• Conservation of Chromatin Structure
  o Chromatin structure is similar among species -> early optimization in ancestral eukaryotes
  o Amino acid sequences of histones are very similar
  o H1- varies more from organism to organism
• Slight changes in histone sequence
  o Vertebrates- a special form of H2A called H2AX -> at sites of double-stranded breaks in chromosomal DNA, H2AX becomes phosphorylated and participates in the chromosome-repair process (serves as a binding site for repair proteins)
  o H3 is replaced by CENP-A at the centromere -> participates in the binding of spindle microtubules during mitosis
• Modification of Histone Tails Controls Chromatin Condensation and Function
  o Each histone in the nucleosome contains a flexible N-terminus, and H2A and H2B also contain a flexible C-terminus, extending from the core -> called histone tails -> required for chromatin to condense from the beads-on-a-string to the 30 nm conformation
  o N-terminal tails of H4 have lysines, which are positively charged and can interact with a negative patch at the H2A-H2B interface of the next nucleosome to produce the stacked nucleosomes of the 30 nm fiber
  o Histone tails are subject to multiple post-translational modifications
    ▪ Phosphorylation, methylation, acetylation, and ubiquitination -> a single histone never carries all of them simultaneously, but the histones in a single nucleosome together may have all
    ▪ Modifications create a histone code- influences chromatin function by creating or removing binding sites for chromatin-associated proteins
  o Histone Acetylation
    ▪ Acetylation and deacetylation -> an acetylated lysine is neutral; a deacetylated lysine is positively charged
    ▪ When acetylated -> less condensed, beads-on-a-string form
      • Increased sensitivity to digestion of the DNA
    ▪ The condensed form is more resistant, since the DNA is inaccessible due to histones
    ▪ HATs (histone acetylases) are required for full activation of transcription -> control of acetylation controls transcription of genes. In the condensed form, RNA polymerase and other proteins cannot access the DNA.
  o Other Histone Modifications:
    ▪ Methylation- prevents acetylation and maintains the positive charge
    ▪ Serine residues can be phosphorylated -> negative charge introduced
    ▪ A ubiquitin molecule can be reversibly added to a lysine
• Heterochromatin- highly condensed chromatin
  o Remains in a compacted state during interphase
  o Found at centromeres and telomeres
  o Stains very darkly
  o Transcriptionally inactive
• Euchromatin- less condensed chromatin
  o Stains lightly
  o Transcriptionally active
• Reading the Histone Code
  o Read by proteins that bind to modified histone tails -> promote condensation or decondensation of chromatin
  o Chromodomain- a domain, found in a # of proteins, that binds to histone tails when they are methylated at specific lysines -> contributes to condensed structure
    ▪ H3K9 HMT (histone methyltransferase) tri-methylates the H3 N-terminal tail at lysine 9
    ▪ HP1- binds to this methylated lysine residue. Contains a chromoshadow domain, which binds to other chromoshadow domains and to H3K9 HMT -> more methylation
    ▪ Causes more HP1 to bind -> spreads until a boundary element is encountered (where several nonhistone proteins bind to DNA and thus block methylation)
    ▪ This model explains how heterochromatic regions of a chromosome are re-established following DNA replication during S phase, when the histone octamers are distributed to both daughter chromosomes
  o Bromodomain (euchromatin)- binds to acetylated histone tails (e.g., TFIID includes 2 bromodomains, which help it associate with euchromatin)
    ▪ TFIID also has histone acetylase activity -> maintains chromatin in a hyperacetylated state conducive to transcription
• X-Chromosome Inactivation in Mammalian Females
  o Random inactivation/condensation of one X chromosome -> becomes heterochromatin
  o Dosage compensation- generates equal expression of genes in males and females
    ▪ Visible as a dark-staining Barr body
  o Some cells have inactivated the X chromosome from mom, others the one from dad
  o Inactivation is associated with hypoacetylation and lack of methylation at histone H3 lysine 4
    ▪ Controlled by the X-inactivation center -> determines which X is inactivated
  o Mechanism of inactivation:
    ▪ Involves the action of polycomb protein complexes- bind to H3 tails when they are tri-methylated at lysine 27. Contain an HMT for H3 lysine 27 -> promotes condensation
  o Epigenetic process- affects the expression of genes and is inherited by daughter cells but is not the result of a change in DNA sequence
  o The inactivated X chromosome is maintained as an inactive chromosome in the progeny of all future cell divisions because its histones are modified in a specific, repressing manner
• Nonhistone Proteins Provide a Structural Scaffold for Long Chromatin Loops
  o SARs (scaffold-associated regions) or MARs (matrix-attachment regions)- provide structural support of the chromosome during metaphase. In chromosomes without histones, the scaffold (dark region) is attached to loops of DNA (light strands)
    ▪ Found between transcription units -> genes are in loops
    ▪ Required for transcription of genes near SARs/MARs
    ▪ Can act as insulators -> insulate transcription units from each other
  o Interphase chromosomes are organized into chromosome territories (not completely spread out)
• Ringlike Structure of SMC Protein Complexes
  o SMC (structural maintenance of chromosomes) proteins- nonhistone proteins that maintain the structure of chromosomes
    ▪ If mutated -> daughter chromatids fail to associate properly following DNA replication and do not segregate properly to daughter cells
  o An SMC monomer contains 2 globular domains, a head domain and a hinge domain, separated by a long coiled-coil domain; head = N and C termini folded together. The hinge is where the polypeptide folds back on itself -> U shape
    ▪ Head domain- ATPase activity -> heads are linked by members of a protein family called kleisins
  o Interphase Chromosome Structure
    ▪ An SMC complex can link two 30 nm chromatin fibers by encircling them
    ▪ Long loops of chromatin are tethered at the base by several SMC complexes
    ▪ DNA cleavage results in rapid dissolution of chromosome structure
• Metaphase Chromosome Structure
  o Condensation during prophase -> formation of many more loops of chromatin (length of each loop is reduced)
  o Chromonema fiber- 100- to 130-nm fiber
  o Middle prophase chromatid -> 500- to 750-nm-diameter chromatid during metaphase
• Additional Nonhistone Proteins Regulate Transcription and Replication
  o Transcription factors- associated with interphase chromatin
  o HMG (high mobility group) proteins- nonhistone DNA-binding proteins, present in large amounts, that bind DNA cooperatively with transcription factors that bind specific DNA sequences -> stabilize multiprotein complexes that regulate transcription

6.7 Morphology and Functional Elements of Eukaryotic Chromosomes
