Sequence similarity, homology, and alignment
Nature is a tinkerer and not an inventor [François Jacob 1977]
New sequences are adapted from pre-existing sequences rather than invented de novo. We can often recognize a significant similarity between a new sequence and a sequence about which something is already known. In this case we can transfer information about structure and/or function to the new sequence.
We say that the two related sequences are homologous and that we are transferring information by homology. Evolving sequences accumulate insertions and deletions as well as substitutions, so before the similarity of two sequences can be evaluated, one typically begins by finding a plausible alignment between them.
An early step forward was the introduction of probabilistic matrices for scoring pairwise amino acid alignments [Dayhoff et al. 1972 & 1978]; these serve to quantify evolutionary preferences for certain substitutions over others.
Probabilities & probabilistic models
What do we mean by a probabilistic model?
When we talk about a model, normally we mean a system that simulates the object under consideration. A probabilistic model is a model that produces different outcomes with different probabilities.
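As a minimal sketch of this idea, a loaded die can be treated as a probabilistic model: each face is an outcome with its own probability, and sampling from the model simulates rolls. The face probabilities and the function name `roll` are illustrative assumptions, not values taken from the slides.

```python
import random

# Hypothetical loaded die: face 6 has probability 0.5, the other faces 0.1 each.
FACE_PROBS = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}

def roll(n, probs=FACE_PROBS, seed=None):
    """Draw n rolls from the probabilistic model defined by probs."""
    rng = random.Random(seed)
    faces = list(probs)
    weights = [probs[f] for f in faces]
    return rng.choices(faces, weights=weights, k=n)

rolls = roll(10_000, seed=0)
freq_six = rolls.count(6) / len(rolls)
print(f"observed frequency of six: {freq_six:.3f}")  # approaches 0.5 as n grows
```

Different outcomes appear with different frequencies, and those frequencies approach the model's probabilities as the number of rolls increases.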
Conditional, joint, and marginal probabilities
Suppose we have two dice, D1 and D2. The probability of rolling an i with die D1 is called P(i|D1). This is the conditional probability of rolling i given die D1. If we pick a die at random with probability P(Dj), j = 1, 2, the probability for picking die j and rolling an i is the product of the two probabilities, P(Dj) and P(i|Dj). So: P(i, Dj) = P(i|Dj) P(Dj).
The term P(i, Dj) is called the joint probability. The statement p(x, y) = p(x|y) p(y) applies universally to any events X and Y.
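The product rule can be checked numerically. The sketch below builds the full joint table P(i, Dj) for two dice; the specific probabilities (a fair D1, a D2 that favours six, each die picked half the time) are hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Hypothetical setup: D1 is fair, D2 favours six; each die is picked half the time.
p_die = {"D1": Fraction(1, 2), "D2": Fraction(1, 2)}
p_face_given_die = {
    "D1": {i: Fraction(1, 6) for i in range(1, 7)},
    "D2": {**{i: Fraction(1, 10) for i in range(1, 6)}, 6: Fraction(1, 2)},
}

# Joint probability P(i, Dj) = P(i | Dj) * P(Dj) for every face i and die Dj.
joint = {
    (i, d): p_face_given_die[d][i] * p_die[d]
    for d in p_die
    for i in range(1, 7)
}

print(joint[(6, "D1")])     # 1/12
print(joint[(6, "D2")])     # 1/4
print(sum(joint.values()))  # 1
```

Using `Fraction` keeps the arithmetic exact, so the joint table sums to exactly 1, as any probability distribution must.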
When conditional or joint probabilities are known, we can calculate a marginal probability that removes one of the variables by using:
$$p(x) = \sum_y p(x, y) = \sum_y p(x \mid y)\, p(y)$$
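A short sketch of marginalization follows: the die identity y is summed out to obtain the overall probability of a six. The numbers and the helper name `marginal` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical numbers: P(y) for two dice and P(x = six | y) for each die.
p_y = {"D1": Fraction(1, 2), "D2": Fraction(1, 2)}
p_six_given_y = {"D1": Fraction(1, 6), "D2": Fraction(1, 2)}

def marginal(p_x_given_y, p_y):
    """Return p(x) = sum over y of p(x | y) * p(y)."""
    return sum(p_x_given_y[y] * p_y[y] for y in p_y)

p_six = marginal(p_six_given_y, p_y)
print(p_six)  # 1/3
```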
Exercise
Consider an occasionally dishonest casino that uses two kinds of dice. Of the dice, 99% are fair but 1% are loaded so that a six comes up 50% of the time. We pick up a die from a table at random. What are P(six|D_loaded) and P(six|D_fair)? What are P(six, D_loaded) and P(six, D_fair)? What is the probability of rolling a six from the die we picked up?
Bayes' theorem and model comparison
In the same occasionally dishonest casino as in the previous exercise, we pick a die at random and roll it three times, getting three consecutive sixes. We are suspicious that this is a loaded die. How can we evaluate whether that is the case?
What we want to know is P(D_loaded | 3 sixes); what we can directly calculate is P(3 sixes | D_loaded).
Bayes' theorem:
$$p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}$$
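One way to carry out this calculation is sketched below, using the casino numbers stated in the text (1% loaded dice, a loaded die shows six half the time). It assumes that successive rolls are independent; the function name `posterior` is illustrative.

```python
from fractions import Fraction

# Casino numbers from the text: 1% of dice are loaded; a loaded die shows six half the time.
prior = {"loaded": Fraction(1, 100), "fair": Fraction(99, 100)}
p_six = {"loaded": Fraction(1, 2), "fair": Fraction(1, 6)}

def posterior(hypothesis, n_sixes):
    """P(hypothesis | n_sixes sixes in a row), assuming independent rolls."""
    likelihood = {d: p_six[d] ** n_sixes for d in prior}
    evidence = sum(likelihood[d] * prior[d] for d in prior)  # marginal p(y)
    return likelihood[hypothesis] * prior[hypothesis] / evidence

post = posterior("loaded", 3)
print(post, float(post))  # 3/14, about 0.214
```

Under these assumptions the die is still more likely to be fair than loaded after three sixes, because loaded dice are rare to begin with.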
Exercises
A rare genetic disease is discovered. Although only one in 10,000 people carry it, you consider getting screened in a population of 1,000,000. You are told that the genetic test is extremely good; it is 100% sensitive (it is always correct if you have the disease) and 99.96% specific (it gives a false positive result only 0.04% of the time). Using Bayes' theorem, explain why you might decide not to take the test.
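To check your reasoning afterwards, the following sketch applies Bayes' theorem to the numbers given in the exercise. The function and variable names are illustrative, and the population counts are expected values rather than observed data.

```python
from fractions import Fraction

def p_disease_given_positive(prevalence, sensitivity, false_positive_rate):
    """Bayes' theorem: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Numbers from the exercise: 1 in 10,000 carriers, 100% sensitive, 0.04% false positives.
post = p_disease_given_positive(Fraction(1, 10_000), Fraction(1), Fraction(4, 10_000))
print(post, float(post))  # 2500/12499, about 0.20

# Expected counts in a screened population of 1,000,000.
population = 1_000_000
carriers = population * Fraction(1, 10_000)
false_alarms = (population - carriers) * Fraction(4, 10_000)
print(carriers, float(false_alarms))  # 100 and 399.96
```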