
4/8/2016

What are Kernels in Machine Learning and SVM? - Quora


What are Kernels in Machine Learning and SVM?
I'm trying to get into SVM, but I cannot get the idea of kernels. What are they and why do we need them?

12 Answers
Bharath Hariharan, PhD Student in Computer Vision
54.4k Views. Upvoted by Vladimir Novakovski, Led machine learning at Quora, and Luis Argerich, Data Science Professor at the University of Buenos Aires (UBA)
Most Viewed Writer in Classification (machine learning)

Great answers here already, but there are some additional things that I would want to say. So here goes.
What are kernels?
A kernel is a similarity function. It is a function that you, as the domain expert, provide to a machine learning algorithm. It takes two inputs and spits out how similar they are.
Suppose your task is to learn to classify images. You have (image, label) pairs as training data. Consider the typical machine learning pipeline: you take your images, you compute features, you string the features for each image into a vector, and you feed these "feature vectors" and labels into a learning algorithm.


Data -> Features -> Learning algorithm

Kernels offer an alternative. Instead of defining a slew of features, you define a single kernel function to compute similarity between images. You provide this kernel, together with the images and labels, to the learning algorithm, and out comes a classifier.


Of course, the standard SVM/logistic regression/perceptron formulation doesn't work with kernels: it works with feature vectors. How on earth do we use kernels then? Two beautiful mathematical facts come to our rescue:
1. Under some conditions, every kernel function can be expressed as a dot product in a (possibly infinite dimensional) feature space (Mercer's theorem).


2. Many machine learning algorithms can be expressed entirely in terms of dot products.
These two facts mean that I can take my favorite machine learning algorithm, express it in terms of dot products, and then, since my kernel is also a dot product in some space, replace the dot product by my favorite kernel. Voila!
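As a rough sketch of that recipe, here is a perceptron written so that it touches the data only through a kernel function. The function names, the degree-2 polynomial kernel, and the XOR toy data are illustrative assumptions of mine, not something taken from the answer.

```python
def linear_kernel(x, y):
    # Plain dot product: recovers the ordinary perceptron.
    return sum(a * b for a, b in zip(x, y))


def poly2_kernel(x, y):
    # Degree-2 polynomial kernel (an illustrative choice).
    return (1.0 + linear_kernel(x, y)) ** 2


def decision_score(x, X, labels, alpha, kernel):
    # The weight vector is never formed; it is implicit in the
    # mistake counts alpha[j] attached to each training point.
    return sum(a * t * kernel(xj, x) for a, t, xj in zip(alpha, labels, X))


def train_kernel_perceptron(X, labels, kernel, epochs=20):
    alpha = [0] * len(X)  # alpha[i] = number of mistakes made on point i
    for _ in range(epochs):
        mistakes = 0
        for i, (x, t) in enumerate(zip(X, labels)):
            if t * decision_score(x, X, labels, alpha, kernel) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha


def predict(x, X, labels, alpha, kernel):
    return 1 if decision_score(x, X, labels, alpha, kernel) > 0 else -1


# XOR: not linearly separable in the original 2-D space.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]

alpha = train_kernel_perceptron(X, labels, poly2_kernel)
predictions = [predict(x, X, labels, alpha, poly2_kernel) for x in X]
```

Swapping `poly2_kernel` for `linear_kernel` gives back the usual linear perceptron, which cannot fit XOR; only the kernel argument changes, not the algorithm.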
Why kernels?
Why kernels, as opposed to feature vectors? One big reason is that in many cases, computing the kernel is easy, but computing the feature vector corresponding to the kernel is really, really hard. The feature vector for even simple kernels can blow up in size, and for kernels like the RBF kernel (k(x, y) = exp(-||x - y||^2), see Radial basis function kernel) the corresponding feature vector is infinite dimensional. Yet, computing the kernel is almost trivial.
Many machine learning algorithms can be written to only use dot products, and then we can replace the dot products with kernels. By doing so, we don't have to use the feature vector at all. This means that we can work with highly complex, efficient-to-compute, and yet high-performing kernels without ever having to write down the huge and potentially infinite-dimensional feature vector. Thus, if not for the ability to use kernel functions directly, we would be stuck with relatively low-dimensional, low-performance feature vectors. This "trick" is called the kernel trick (Kernel trick).
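To make the cost contrast concrete, here is a minimal sketch. The `gamma` parameter is my addition (`gamma=1` corresponds to the formula quoted above), and `poly_feature_count` counts the monomials behind an inhomogeneous polynomial kernel `(1 + <x, y>)^d`, which is one assumed example of a "simple kernel" whose feature vector blows up.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def poly_feature_count(n_inputs, degree):
    """Number of monomials of total degree <= `degree` in `n_inputs` variables."""
    return math.comb(n_inputs + degree, degree)


x = (1.0, 2.0, 3.0)
y = (1.5, 1.0, 3.0)

# Evaluating the kernel is a single pass over the coordinates.
similarity = rbf_kernel(x, y)

# An explicit degree-5 feature vector for 1000 inputs would be enormous.
explicit_size = poly_feature_count(1000, 5)
```

The kernel evaluation costs a handful of arithmetic operations per coordinate, whereas the explicit polynomial feature vector has more entries than could reasonably be stored.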
End note
I want to clear up two confusions which seem prevalent on this page:


1. A function that transforms one feature vector into a higher dimensional feature vector is not a kernel function. Thus f(x) = [x, x^2] is not a kernel. It is simply a new feature vector. You do not need kernels to do this. You need kernels if you want to do this, or more complicated feature transformations, without blowing up dimensionality.
2. A kernel is not restricted to SVMs. Any learning algorithm that only works with dot products can be written down using kernels. The idea of SVMs is beautiful, the kernel trick is beautiful, and convex optimization is beautiful, and they stand quite independent.
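To illustrate the first point with a small sketch (the function names are mine, and x is assumed to be a scalar): the feature map takes one input and returns a longer vector, while the kernel it induces takes two inputs and returns a single number.

```python
def feature_map(x):
    # f(x) = [x, x^2]: a new feature vector, not a kernel.
    return [x, x ** 2]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def induced_kernel(x, y):
    # k(x, y) = <f(x), f(y)> = x*y + (x*y)^2, computed without building f.
    xy = x * y
    return xy + xy ** 2


via_features = dot(feature_map(3), feature_map(-2))
via_kernel = induced_kernel(3, -2)
```

Both routes give the same number; the kernel simply skips the intermediate vectors.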
Written Dec 21, 2013. Answer requested by Aniket Gurav.

Vicent Ribas Ripoll, I solve classification problems.
30.9k Views. Upvoted by Sean Owen, Director, Data Science @ Cloudera, and Yisong Yue, Machine Learning Researcher
Most Viewed Writer in Support Vector Machines

Intuitively, a kernel is just a transformation of your input data that allows you (or an algorithm like SVMs) to treat/process it more easily. Imagine that we have the toy problem of separating the red circles from the blue crosses on a plane as shown below.
Our separating surface would be the ellipse drawn on the left figure. However, transforming our data into a 3-dimensional space through the mapping shown in the figure would make the problem much easier since, now, our points are separated by a simple plane. This embedding in a higher dimension is called the kernel trick.
In conclusion, and very informally, a kernel consists of embedding general points into an inner product space.
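The figure is not reproduced here, so the following sketch assumes the commonly used degree-2 map (x1, x2) -> (x1^2, sqrt(2)*x1*x2, x2^2). The ellipse axes and the sample points are made up for illustration. Under that map, an axis-aligned ellipse in the plane becomes a flat plane in 3-D.

```python
import math


def lift(p):
    # Assumed mapping from 2-D to 3-D (not confirmed by the original figure).
    x1, x2 = p
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical ellipse x1^2/A^2 + x2^2/B^2 = 1 in the original plane.
A, B = 2.0, 1.0
# In lifted coordinates (z1, z2, z3) it is the plane z1/A^2 + z3/B^2 - 1 = 0.
PLANE_NORMAL = (1 / A ** 2, 0.0, 1 / B ** 2)
PLANE_OFFSET = -1.0


def side_of_plane(z):
    # Negative inside the ellipse, positive outside.
    return dot(PLANE_NORMAL, z) + PLANE_OFFSET


inside = [(0.0, 0.0), (1.0, 0.5), (-1.5, 0.2)]
outside = [(2.5, 0.0), (0.0, 1.5), (-2.0, -1.0)]

inside_scores = [side_of_plane(lift(p)) for p in inside]
outside_scores = [side_of_plane(lift(p)) for p in outside]
```

With this particular map, the dot product of two lifted points equals the squared dot product of the originals, which is why a kernel can stand in for the lifting step.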

PS: I have taken this graph from http://www.sussex.ac.uk/Users/ch... , which is also shown in Hastie and Tibshirani: Elements of Statistical Learning.
Written Aug 16, 2012

Lili Jiang, Data Scientist at Quora
32.3k Views

Briefly speaking, a kernel is a shortcut that helps us do certain calculations faster which otherwise would involve computations in higher-dimensional space.
Mathematical definition: K(x, y) = <f(x), f(y)>. Here K is the kernel function, and x, y are n-dimensional inputs. f is a map from n-dimensional to m-dimensional space. <x, y> denotes the dot product. Usually m is much larger than n.
Intuition: normally calculating <f(x), f(y)> requires us to calculate f(x), f(y) first, and then do the dot product. These two computation steps can be quite expensive as they involve manipulations in m-dimensional space, where m can be a large number. But after all the trouble of going to the high-dimensional space, the result of the dot product is really


a scalar: we come back to one-dimensional space again! Now, the question we have is: do we really need to go through all the trouble to get this one number? Do we really have to go to the m-dimensional space? The answer is no, if you find a clever kernel.
Simple Example: x = (x1, x2, x3); y = (y1, y2, y3). Then for the function f(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3), the kernel is K(x, y) = (<x, y>)^2.


Let's plug in some numbers to make this more intuitive: suppose x = (1, 2, 3); y = (4, 5, 6). Then:
f(x) = (1, 2, 3, 2, 4, 6, 3, 6, 9)
f(y) = (16, 20, 24, 20, 25, 30, 24, 30, 36)
<f(x), f(y)> = 16 + 40 + 72 + 40 + 100 + 180 + 72 + 180 + 324 = 1024
A lot of algebra, mainly because f is a mapping from 3-dimensional to 9-dimensional space.
Now let us use the kernel instead:
K(x, y) = (4 + 10 + 18)^2 = 32^2 = 1024
Same result, but this calculation is so much easier.
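The same arithmetic can be checked in a few lines. This is only a sketch; the helper names `feature_map`, `dot`, and `kernel` are mine.

```python
def feature_map(v):
    # All pairwise products v_i * v_j, in the order used in the example above.
    return [a * b for a in v for b in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kernel(x, y):
    # K(x, y) = (<x, y>)^2, computed entirely in the original 3-D space.
    return dot(x, y) ** 2


x = (1, 2, 3)
y = (4, 5, 6)

explicit = dot(feature_map(x), feature_map(y))  # goes through 9-D space
shortcut = kernel(x, y)                         # stays in 3-D space
```

Both `explicit` and `shortcut` come out to 1024. The explicit route needs 9 products per feature map plus a 9-term dot product, while the kernel needs a 3-term dot product and one squaring.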
Additional beauty of kernels: kernels allow us to do stuff in infinite dimensions! Sometimes going to a higher dimension is not just computationally expensive, but also impossible. f(x) can be a mapping from n dimensions to infinite dimensions, which we may have little idea of how to deal with. Then the kernel gives us a wonderful shortcut.
Relation to SVM: now how is this related to SVM? The idea of SVM is that y = <w, phi(x)> + b, where w is the weight, phi is the feature vector, and b is the bias. If y > 0, then we classify the datum to class 1, else to class 0. We want to find a set of weights and bias such that the margin is maximized. Previous answers mention that the kernel makes data linearly separable for SVM. I think a more precise way to put this is: kernels do not make the data linearly separable. The feature vector phi(x) makes the data linearly separable. The kernel is there to make the calculation process faster and easier, especially when the feature vector phi is of very high dimension (for example, x1, x2, x3, ..., x_D^n, x1^2, x2^2, ..., x_D^2).
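One way to see where the kernel enters this formula is the standard dual expansion of the weight vector, which the answer does not spell out. The coefficients \alpha_i \ge 0 and the labels t_i \in \{-1, +1\} are symbols introduced here for illustration (the answer itself uses classes 1 and 0).

```latex
\begin{aligned}
w &= \sum_{i=1}^{N} \alpha_i \, t_i \, \phi(x_i), \\
y(x) &= \langle w, \phi(x) \rangle + b
      = \sum_{i=1}^{N} \alpha_i \, t_i \, \langle \phi(x_i), \phi(x) \rangle + b
      = \sum_{i=1}^{N} \alpha_i \, t_i \, K(x_i, x) + b .
\end{aligned}
```

Under this assumption, phi only ever appears inside an inner product, so the classifier can be evaluated with kernel calls alone.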
Why it can also be understood as a measure of similarity:
if we put the definition of kernel above, <f(x), f(y)>, in the context of SVM and feature vectors, it becomes <phi(x), phi(y)>. The inner product means the projection of phi(x) onto phi(y), or colloquially, how much overlap x and y have in their feature space. In other words, how similar they are.
Updated Sep 14, 2015

Sameer Gupta, Topoi[n:]...//e
14.6k Views

The analogy of the equilibrium of a spring-mass configuration is yet another way to understand it; equilibrium is attained when the configuration has minimum potential energy.
// Caveat: Low math, limited scope.
We can compare:
The learning model to the horizontal line in the figure, which contains the data point y queried.
The data points y are like weighted particles.
The similarity of data points with the query point is comparable to spring length,
and the spring constants k to kernel functions, adding weights to data points corresponding to their similarity with the one queried.
Potential energy U = 1/2 k (y - y0)^2 to the error in predicting the data points similar to the one queried from the learning model,
and equilibrium to the state when this error is minimum.
All springs are identical until kernel functions are used for weighting the data according to their similarity.


Once used, the springs are not equal anymore.

Sometimes no values of the parameters of a [nonlinear] global model can provide a good approximation of the true function. There are two approaches to this problem.
First, we could use a larger, more complex global model and hope that it can approximate the data sufficiently.
The second approach is to fit the simple model to local patches instead of the whole region of interest.
In the second approach:
Now the error in prediction behaves much as thin-plate splines minimize a bending energy of a plate and the energy of the constraints pulling on the plate. A planar local model [the line in the figure] can now rotate as well as translate. The springs are forced to remain oriented vertically, rather than move to the smallest distance between the data points and the line.
The fit (the line in equilibrium) produced by equally strong springs to a set of data points (the black dots), minimizing the criterion, looks like the following:

As the kernels are put into action, springs nearer to the query point are strengthened and the springs further away are weakened. The strengths of the springs are given by K(d(xi, q)), and the fit that minimizes the criterion [equilibrium] looks like the following:
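The spring picture can be sketched as a locally weighted line fit, where each data point's "spring strength" is K(d(xi, q)). The Gaussian form of K, the `bandwidth` parameter, and the toy data below are my assumptions; the answer does not say which kernel or criterion the figures used.

```python
import math


def gaussian_kernel(distance, bandwidth=1.0):
    # Spring strength: close to 1 near the query, close to 0 far away.
    return math.exp(-((distance / bandwidth) ** 2))


def locally_weighted_line(xs, ys, query, bandwidth=1.0):
    """Weighted least-squares fit of y ~ intercept + slope * x around `query`."""
    w = [gaussian_kernel(abs(x - query), bandwidth) for x in xs]
    total = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, xs)) / total
    mean_y = sum(wi * y for wi, y in zip(w, ys)) / total
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_at(xs, ys, query, bandwidth=1.0):
    intercept, slope = locally_weighted_line(xs, ys, query, bandwidth)
    return intercept + slope * query


# Toy data on a parabola: no single straight line fits it well.
xs = [i * 0.5 for i in range(-6, 7)]  # -3.0, -2.5, ..., 3.0
ys = [x * x for x in xs]

# A very wide kernel makes all springs (almost) equally strong: a global fit.
global_estimate = predict_at(xs, ys, query=0.0, bandwidth=1e6)
# A narrow kernel strengthens only the springs near the query point.
local_estimate = predict_at(xs, ys, query=0.0, bandwidth=0.5)
```

At the query point 0 the true value is 0. The equally weighted line sits far above it, while the kernel-weighted line lands much closer, which is the effect the two figures are meant to show.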


http://www.qou.edu/arabic/resear...
Written Nov 27, 2012

Rahul Agarwal, Senior Statistical Analyst at WalmartLabs
7.2k Views

Found this on Reddit: Please explain Support Vector Machines (SVM) like I am a 5 year old. /r/MachineLearning
Simply the best explanation of SVM I ever found.

We have 2 colors of balls on the table that we want to separate.

We get a stick and put it on the table, this works pretty well right?


Some villain comes and places more balls on the table, it kind of works but one of the balls is on the wrong side and there is probably a better place to put the stick now.

SVMs try to put the stick in the best possible place by having as big a gap on either side of the stick as possible.


Now when the villain returns the stick is still in a pretty good spot.

There is another trick in the SVM toolbox that is even more important. Say the villain has seen how good you are with a stick so he gives you a new challenge.


There's no stick in the world that will let you split those balls well, so what do you do? You flip the table of course! Throwing the balls into the air. Then, with your pro ninja skills, you grab a sheet of paper and slip it between the balls.

Now, looking at the balls from where the villain is standing, the balls will look split by some curvy line.

Boring adults call the balls data, the stick a classifier, the biggest gap trick optimization, flipping the table kernelling, and the piece of paper a hyperplane.

Now see this:


One other point that I like to mention about the SVM (unrelated to this question) is how it is defined by the boundary case examples. (taken from CS109 course: SVM Explanation)
Assume you want to separate apples from oranges using SVM. The red squares are apples and the blue circles are oranges.

Now see the support vectors (the filled points) which define the margin here.
...

Charles H Martin, Calculation Consulting - we predict things
9.8k Views. Most Viewed Writer in Machine Learning with 420+ answers

The intent of the Kernel Trick: to allow us to represent an infinite set of discrete functions with a family of continuous functions. And in many cases in machine learning, we can express our dot products with a simple analytic function, with some adjustable parameters.
Please see my blog posts for details:

Kernels Part 1: What is an RBF Kernel? Really?


Kernels Part 2: Affine Quantum Gravity
Kernels and Quantum Gravity Part 3: Coherent States
Written Jun 13, 2013