calculation in cardiac MRI using
Convolutional Neural Network
Name: Tencia Lee & Qi Liu
Location: Los Angeles, CA & New York, NY, USA
Email: tencia@gmail.com, liu.qi.alex@gmail.com
Date: 03/16/2016
Competition: Second Annual Data Science Bowl, 1st place solution
1. Background on you/your team
Tencia: I graduated from Caltech in 2009 with a B.Sc. in applied mathematics and economics. I
then worked in quantitative finance for about six years, first as a researcher and then as a
portfolio manager at a Los Angeles-based hedge fund. I recently transitioned to a robotics startup
as a research engineer. I became interested in deep learning almost a year ago and have been
studying it since then. I decided to enter this competition because I thought it would
be a great educational experience and a chance to apply the methods I have been learning. I
spent approximately 160 hours working on this competition.
Qi: I got my Ph.D. in theoretical physics (on lattice quantum chromodynamics and
charge-parity violation of the weak interaction) from Columbia University four years ago, and
after that I worked as a quantitative trader at a hedge fund. I am taking a long vacation at the
moment, so I have a lot of time to work on Kaggle competitions. This one has a large prize, and
the problem is both complicated (which means that the signal-to-noise ratio on the leaderboard
is higher) and interesting, so I decided to give it a shot.
We originally decided to form a team because Qi was working with a dynamic programming
segmentation method and Tencia was working with neural networks, and we thought the two would
be complementary. However, after a few weeks it became clear that the neural
network approach was capable of higher precision, so we dropped the dynamic programming
segmentation method completely to simplify our work and code.
2. Summary
Our primary model was an ensemble of fully convolutional neural networks that
calculated areas from DICOM images; we then used these areas to calculate the heart volume at
different times in the heartbeat cycle. As part of our ensemble, we also included a fully
convolutional segmentation network for 4-chamber view DICOMs, a single-slice model, and an
age-sex model.
Our time was split as follows: 10% data cleaning methodology, 30% neural network design and
experimentation, 30% identifying model weaknesses, 20% calculation of volume from
segmentation, and 10% manually labeling data.
One of the trickiest aspects of this dataset was the number of cases with missing or poor-quality
data. We developed several heuristics to evaluate segmentation performance, to detect
outliers, and to decide for each case which models to include.
One of the most difficult problems with our approach was that the segmentation
network would often fail to recognize a ventricle. Image normalization was essential to
remediate this problem; in addition, we used pseudo-active learning to select examples for
manual segmentation. In total we segmented 130 SAX-view DICOM images by hand.
Our approach was guided by the view that, since we are approximating a derived number, we
should find the inputs and then calculate the end result in the same way the ground truths had
been calculated. For this reason we steered away from an end-to-end prediction algorithm for
our primary model, opting instead to approximate the segmentation for each slice as closely as
possible to how a doctor would, and then to calculate the heart volume from those areas.
For neural network training, we used minibatch gradient descent with the Adam optimizer (4),
and our networks used convolutions and batch normalization. For ensembling, we optimized
a linear combination of different models. Training and prediction were done using two GPUs,
and all code was written in Python, with Theano, Lasagne, and cuDNN for the neural
networks.
3. Features Selection / Engineering
Fully Convolutional Neural Network for segmenting the left ventricle
Our main model used several Fully Convolutional Neural Networks (CNN) to segment the left
ventricle in each MRI image. The output of each network was an image of the same size as the
input image, with pixel values in the range [0, 1]. Each pixel's output represents the
probability that the corresponding pixel in the input image is part of the left ventricle.
The architecture of the networks from the CNN_B folder is as follows, where:
- b = batch size.
- The four dimensions of the tensor represent batch size, channels or number of filters,
  and the two dimensions of the image, respectively.
- Conv = convolution with a stack of square kernels of (#filters) x (filter size) x (filter size).
- BN = batch normalization as in reference (2), implemented in Lasagne, in reference (3).
- ReLU(x) = max(0, x)
- Sigmoid(x) = 1 / (1 + exp(-x))
- Convolutional layers with valid padding output a value only for filter positions where
  every value in the filter has a corresponding value in the input tensor (shrinks the
  tensor).
- Convolutional layers with full padding output a value for all filter positions for which at
  least one value in the filter has a corresponding value in the input tensor (expands the
  tensor), with all other values zero-padded.
- MaxPool and Upscale are done across the last two dimensions of the tensor.
Layer Op/Type   #Filters / Pool /   Filter Size   Padding   Output Shape
                Upscale Factor
Input                                                       (b, 1, 246, 246)
Conv+BN+ReLU    8                   7             valid     (b, 8, 240, 240)
Conv+BN+ReLU    16                  3             valid     (b, 16, 238, 238)
MaxPool         2                                           (b, 16, 119, 119)
Conv+BN+ReLU    32                  3             valid     (b, 32, 117, 117)
MaxPool         2                                           (b, 32, 58, 58)
Conv+BN+ReLU    64                  3             valid     (b, 64, 56, 56)
MaxPool         2                                           (b, 64, 28, 28)
Conv+BN+ReLU    64                  3             valid     (b, 64, 26, 26)
Conv+BN+ReLU    64                  3             full      (b, 64, 28, 28)
Upscale         2                                           (b, 64, 56, 56)
Conv+BN+ReLU    64                  3             full      (b, 64, 58, 58)
Upscale         2                                           (b, 64, 116, 116)
Conv+BN+ReLU    32                  7             full      (b, 32, 122, 122)
Upscale         2                                           (b, 32, 244, 244)
Conv+BN+ReLU    16                  3             full      (b, 16, 246, 246)
Conv+BN+ReLU    8                   7             valid     (b, 8, 240, 240)
Conv+sigmoid    1                   7             full      (b, 1, 246, 246)
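The output shapes in the table follow from simple arithmetic: a stride-1 valid convolution with filter size k shrinks each image dimension by k - 1, a full convolution grows it by k - 1, MaxPool divides it (rounding down), and Upscale multiplies it. The following is a minimal sketch that traces the spatial size through the CNN_B layers. The function name, the layer encoding, and the stride-1 assumption are ours, and the filter sizes and pool factors are the values implied by the listed output shapes.

```python
def trace_shapes(size, layers):
    """Return the spatial size after each layer (square images, stride-1 convs)."""
    sizes = []
    for kind, k in layers:
        if kind == "valid":      # output only where the filter fits entirely
            size = size - k + 1
        elif kind == "full":     # output wherever the filter overlaps the input
            size = size + k - 1
        elif kind == "pool":     # max-pooling with factor k, rounding down
            size = size // k
        elif kind == "upscale":  # upscaling by factor k
            size = size * k
        else:
            raise ValueError("unknown layer kind: %r" % kind)
        sizes.append(size)
    return sizes


# (kind, filter size or pool/upscale factor), inferred from the output shapes
CNN_B_LAYERS = [
    ("valid", 7), ("valid", 3), ("pool", 2),
    ("valid", 3), ("pool", 2),
    ("valid", 3), ("pool", 2),
    ("valid", 3), ("full", 3), ("upscale", 2),
    ("full", 3), ("upscale", 2),
    ("full", 7), ("upscale", 2),
    ("full", 3), ("valid", 7), ("full", 7),
]

cnn_b_sizes = trace_shapes(246, CNN_B_LAYERS)
print(cnn_b_sizes)
```

The final entry equals the input size, which is what lets the network emit one probability per input pixel.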
The CNN_A networks are similar to the above, with minor changes in the number of layers and
filter sizes, so for brevity we omit their exact architecture here. At test time, CNN_A evaluated
an image cropped around the approximate center of the left ventricle, while CNN_B evaluated the
full image resized to match the input size.
Example outputs from the segmentation network, with the ventricle highlighted in purple.
Volume calculation from segmented images
With the segmented result from the CNN, the area A_{i,t} for each slice i at time t can be
calculated. We remove the slices that have an area of 0. Then the volume V_t at any time t is
calculated as a sum of the pieces:
[Equation (1): V_t as a sum over the N slices; only the summation limit N is legible.]
where z_i is the slice location and h is the slice thickness. We tried other methods to calculate
the volume, but this method seems to give a slightly better result after adjustments for
systematic error. It would be best if the organizers of this competition could release the actual
formula so that we do not need to guess one.
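To illustrate the step from per-slice areas to a volume curve, here is a minimal sketch. It assumes a trapezoidal sum between adjacent slices plus half a slice thickness at each end; this is one plausible choice and may differ from the formula the authors actually used. The function names and the units (areas in mm^2, locations and thickness in mm, volumes in mL) are also assumptions.

```python
import numpy as np


def volume_at_time(areas_t, z, h):
    """Volume (mL) at one time frame from slice areas (mm^2) and locations (mm)."""
    areas_t = np.asarray(areas_t, dtype=float)
    z = np.asarray(z, dtype=float)
    keep = areas_t > 0                      # drop slices with zero area
    a, zz = areas_t[keep], z[keep]
    if a.size == 0:
        return 0.0
    order = np.argsort(zz)                  # sort slices along the long axis
    a, zz = a[order], zz[order]
    # Assumed rule: trapezoids between neighbours, half-thickness caps at the ends.
    body = np.sum(0.5 * (a[:-1] + a[1:]) * np.diff(zz))
    caps = 0.5 * h * (a[0] + a[-1])
    return (body + caps) / 1000.0           # mm^3 -> mL


def volumes_over_time(areas, z, h):
    """areas has shape (n_slices, n_times); returns one volume per time frame."""
    areas = np.asarray(areas, dtype=float)
    return np.array([volume_at_time(areas[:, t], z, h) for t in range(areas.shape[1])])


# Toy example: 3 slices, 2 time frames.
example_areas = np.array([[100.0, 50.0], [100.0, 50.0], [100.0, 50.0]])
example_volumes = volumes_over_time(example_areas, z=[0.0, 10.0, 20.0], h=10.0)
systole, diastole = example_volumes.min(), example_volumes.max()
```

The smallest and largest values of the resulting curve give the systolic and diastolic volume estimates.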
The systole and diastole volumes are taken as the minimum and the maximum of V_t, respectively.
We observed that for many cases our result is larger than the true value, while for many other
cases the two agree very well. We believe this is a systematic error: our method tends
to include all good-looking slices, whereas a doctor might decide to exclude some of them. So
we applied a correction with the formula:
V_adj = V_CNN ... V_CNN    (2)
for systole and diastole volume separately. Here we have two parameters to be determined, one
for systole (subscript sys) and one for diastole (subscript dias). We also experimented with a
simple linear fit, but it did not perform as well.
There are a few cases in which our method might fail. The CNN might fail to detect the left
ventricle, or the case might have missing data (e.g., some data sets have only 3 SAX slices). So
we developed some other, simpler models to avoid getting something completely wrong.
Other models for failure cases
One-slice model: We took the 80th percentile of the area vector, A_dias and A_sys for diastole
and systole respectively, and used them to fit the volumes. We use the 80th percentile instead
of the maximum to avoid wrong or extreme contour readings from the CNN. Because the
length of the heart is more or less proportional to the diameter of the largest cross-sectional
area, the volume should scale roughly as A \sqrt{A}, so we fit
V = V_0 + \beta A \sqrt{A}    (4)
for diastole and systole volume separately. This model achieves a score of 0.015 on the training set.
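As an illustration of the one-slice model, the sketch below takes the 80th percentile of a case's slice areas and fits the volume by least squares. It assumes the fitted form is V = V0 + beta * A^1.5, which follows from the volume being an area times a length proportional to the square root of that area. The function names and the use of ordinary least squares are our assumptions.

```python
import numpy as np


def one_slice_feature(areas, q=80):
    """q-th percentile of the slice areas of one case at one time frame."""
    return float(np.percentile(np.asarray(areas, dtype=float), q))


def fit_one_slice(A, V):
    """Least-squares fit of V = V0 + beta * A**1.5; returns (V0, beta)."""
    A = np.asarray(A, dtype=float)
    V = np.asarray(V, dtype=float)
    X = np.column_stack([np.ones_like(A), A ** 1.5])
    coef = np.linalg.lstsq(X, V, rcond=None)[0]
    return float(coef[0]), float(coef[1])


# Synthetic check: data generated from the assumed form is recovered by the fit.
A_train = np.array([100.0, 400.0, 900.0, 1600.0])
V_train = 5.0 + 0.002 * A_train ** 1.5
v0_fit, beta_fit = fit_one_slice(A_train, V_train)
```

One fit would be made for systole and another for diastole, each using the corresponding area feature.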
4-chamber model: We hand-segmented 736 4-chamber view images and used them to train a
fully convolutional segmentation network in an identical manner to the CNN_B main model. We
trained five networks, each with 20% of the training data held out, and used the average output of
these five as the segmentation output. The output of this network was a segmentation mask
indicating whether each pixel was part of the left ventricle, as seen from the 4-chamber view.
Examples of segmentation output of the 4-chamber view
From here we used PCA to find the main axis of the ventricle, then calculated a disk at each
pixel along the main axis and summed the disks to arrive at a volume. This volume was
then adjusted using a linear fit, and the standard deviation for the CRPS curve was calculated as in (2).
V_final = \alpha V_seg + V_0    (5)
This model achieves a score of 0.017.
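The PCA and disk-summation step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the one-pixel disk thickness, the way the diameter is measured (extent across the axis plus one pixel), and the isotropic pixel size are all assumptions.

```python
import numpy as np


def four_chamber_volume(mask, pixel_mm=1.0):
    """Estimate a volume (mL) from a binary 4-chamber mask by stacking disks."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    pts -= pts.mean(axis=0)
    # PCA: the main axis is the covariance eigenvector with the largest eigenvalue.
    evals, evecs = np.linalg.eigh(np.cov(pts.T))
    axis = evecs[:, np.argmax(evals)]
    normal = np.array([-axis[1], axis[0]])
    along = pts @ axis                       # position along the main axis
    across = pts @ normal                    # signed distance from the axis
    bins = np.rint(along - along.min()).astype(int)   # one-pixel-thick disks
    volume = 0.0
    for b in np.unique(bins):
        w = across[bins == b]
        diameter = (w.max() - w.min() + 1.0) * pixel_mm
        volume += np.pi * (diameter / 2.0) ** 2 * pixel_mm
    return volume / 1000.0                   # mm^3 -> mL


# Toy mask: a 40 x 10 pixel rectangle, i.e. 40 disks of diameter 10.
toy_mask = np.zeros((60, 30), dtype=bool)
toy_mask[10:50, 10:20] = True
toy_volume = four_chamber_volume(toy_mask)
```

The raw value would then be passed through the fitted linear adjustment before being used as a prediction.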
Age-sex model: It is clear that the heart grows with age before it reaches a more or
less fixed size, and that its size also depends significantly on the sex of the patient. So we fit
some linear models using age and sex. Our method closely follows reference (1), which was posted
in the forum. This model scores 0.037 on the training set.
We took a combination of the one-slice model and the 4-chamber model as the first default model;
together they achieve a score of 0.0134 on the training data set. When our original SAX-view-based
model fails, we take the result from this default model. If that also fails, we take the
result from the age-sex model, which has no failure cases.
4. Training Method(s)
Training data
For training the segmentation networks, we used the Sunnybrook contour data in a similar
manner to the Deep Learning Tutorial, as well as 130 hand-labeled SAX images selected from
the training set.
Training procedure:
The neural networks were trained to minimize a modified version of the Sørensen-Dice index:
Loss = -\frac{2 \sum_{i,j} pred_{ij} \, target_{ij} + s}{\sum_{i,j} (pred_{ij} + target_{ij}) + s}    (6)
In equation (6), pred for each pixel was the output of the network after the sigmoid nonlinearity,
with values in the range [0, 1], and target was whether that pixel was part of the left ventricle in
the ground truth segmentation of that image, with value 0 if not and 1 if so. We found that this
objective function gave much better results than binary cross-entropy. The hyperparameter s
was used to give a nonzero loss even when the image did not contain any part of the ventricle, and
was usually set to 100.
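For concreteness, the loss can be written out in NumPy as below. This is a sketch that assumes the loss is the negated, smoothed Dice index, so that minimizing it maximizes overlap; the training code itself was written with Theano and Lasagne, and the function name here is ours.

```python
import numpy as np


def dice_loss(pred, target, s=100.0):
    """Negated smoothed Dice index between a probability map and a 0/1 mask."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = np.sum(pred * target)
    return -(2.0 * intersection + s) / (np.sum(pred) + np.sum(target) + s)


# With an empty target the loss still depends on pred, thanks to s:
empty = np.zeros((2, 2))
loss_all_zero = dice_loss(np.zeros((2, 2)), empty)   # -100 / 100
loss_all_one = dice_loss(np.ones((2, 2)), empty)     # -100 / 104
```

Without s, an image with no ventricle would give 0/0 for an all-zero prediction; with s, false positives on such an image raise the loss, so the network is still penalized for them.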
Each neural network was trained for between 150 and 300 epochs using minibatch gradient
descent, with batch sizes of either 8 or 16 images, using the Adam optimizer (4) with a learning
rate of 3e-3. Hyperparameters were hand-tuned for effectiveness, but due to time constraints
were not automatically optimized.
For some models in the ensemble, training was done with some part of the data held out as a
validation set. In these cases, the parameters saved as the result of that run were the set
that yielded the lowest validation loss. In others, training was done with all available data, and
in these cases the parameters at the end of training were saved.
Data preprocessing and augmentation:
Image normalization was necessary as a preprocessing step to homogenize the dataset and
facilitate neural network training. Two normalization methods were used:
- Mean and standard deviation: the mean and standard deviation of the pixel values were
  calculated over the middle 60% of the image; the mean was then subtracted from
  the image and the result divided by the standard deviation. This was done because
  extreme values tended to occur at the edges, so this provided a more stable
  normalization.
- Percentile: the image intensity was rescaled, with a set percentile range being stretched
  to the entire range, as in reference (6).
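The two normalization methods can be sketched in NumPy as follows. The interpretation of "the middle 60%" as a central crop covering 60% of each axis, the default percentile values, and the function names are assumptions made for illustration.

```python
import numpy as np


def normalize_center(img, frac=0.6):
    """Standardize using statistics from the central crop (frac of each axis)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    dh = int(round(h * (1.0 - frac) / 2.0))
    dw = int(round(w * (1.0 - frac) / 2.0))
    center = img[dh:h - dh, dw:w - dw]
    return (img - center.mean()) / (center.std() + 1e-8)


def normalize_percentile(img, lo=1.0, hi=99.0):
    """Stretch the [lo, hi] percentile range to [0, 1], clipping the tails."""
    img = np.asarray(img, dtype=float)
    p_lo, p_hi = np.percentile(img, [lo, hi])
    return np.clip((img - p_lo) / (p_hi - p_lo + 1e-8), 0.0, 1.0)


# Toy image with extreme values along one edge.
rng = np.random.default_rng(0)
toy = rng.normal(100.0, 20.0, size=(100, 100))
toy[:5] = 5000.0
toy_std = normalize_center(toy)
toy_pct = normalize_percentile(toy)
```

Because the bright edge rows fall outside the central crop, they do not distort the mean and standard deviation used for scaling.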
Due to the small size of the segmentation training set, data augmentation during training was
essential for successful results. Both CNN_A and CNN_B used random rotations and
translations during training. CNN_B used a fixed normalization method for each training run,
whereas CNN_A used randomized image normalization (randomizing the percentile values) as an
augmentation method. CNN_A also used random scaling of the images.
An approximate center and bounding box of the left ventricle can be estimated from the
time variance of the images. We can also rotate the images based on the DICOM information to
align all the cases roughly into the same orientation. To diversify our list of models, we let only
CNN_A crop from the approximate center of the images and apply the rotation alignment, while
CNN_B uses the full image and rotates only those few cases that are rotated by approximately
90 degrees.
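The idea of locating the ventricle from the time variance can be sketched as below: pixels inside the beating heart change the most over the cardiac cycle, so a per-pixel variance map highlights them. The relative threshold, the function name, and the use of a simple centroid and bounding box are hypothetical choices, not the authors' procedure.

```python
import numpy as np


def estimate_roi(frames, rel_thresh=0.5):
    """Estimate (center, bbox) of the moving region from frames of shape (T, H, W)."""
    frames = np.asarray(frames, dtype=float)
    var = frames.var(axis=0)                       # per-pixel variance over time
    # Hypothetical rule: keep pixels whose variance is at least rel_thresh * max.
    # Note: a completely static series has zero variance and selects every pixel.
    ys, xs = np.nonzero(var >= rel_thresh * var.max())
    center = (float(ys.mean()), float(xs.mean()))
    bbox = (int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max()))
    return center, bbox


# Toy series: a block that switches on and off in every other frame.
toy_frames = np.zeros((10, 64, 64))
toy_frames[::2, 20:28, 30:38] = 1.0
toy_center, toy_bbox = estimate_roi(toy_frames)
```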
5. Interesting findings
Most of our most useful tricks related to detecting and fixing bad data. In many cases a
patient's folder would contain DICOM images from multiple scans, only some of which belonged
in the volume calculation, so we used information from the DICOM metadata to decide
which slices should be kept or dropped. Several fairly non-intuitive methods were used at this
step, and this may have been one of the factors that separated us most effectively from the
other competitors.
Post-processing the FCN output was also essential. We came up with two main
heuristics to clean the segmentation results.
First, we noticed that the network would often output false positives, mistakenly
labeling pixels outside the ventricle as part of it. To eliminate these, for each case we took
the segmentation mask outputs for every image from that patient and calculated the average
mask value for each pixel. This creates a mean mask image (center frame in the figure below).
We then fit a 2D isotropic Gaussian kernel centered at the maximum value. For every
contiguous patch returned by the segmenter, we calculated the average likelihood across all of
its pixels under this Gaussian distribution, and eliminated every patch whose average was less
than a set threshold. If more than one patch remained, we eliminated all but the one with the
highest cumulative likelihood. In the figure below, the extraneous patches visible in the first
image have been removed using this method, yielding the clean mask in the third image.
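A minimal sketch of this patch filter follows. The Gaussian width sigma, the threshold min_avg, the use of an unnormalized Gaussian, and the 4-connected labelling are all assumptions for illustration; the text does not state how the Gaussian was fit or what threshold was used.

```python
from collections import deque

import numpy as np


def label_patches(mask):
    """Label 4-connected patches of a boolean mask; returns (labels, count)."""
    height, width = mask.shape
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:                                  # breadth-first flood fill
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and mask[ny, nx] and not labels[ny, nx]:
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count


def clean_mask(mask, mean_mask, sigma=15.0, min_avg=0.1):
    """Keep the single patch best supported by a Gaussian around the mean-mask peak."""
    cy, cx = np.unravel_index(np.argmax(mean_mask), mean_mask.shape)
    yy, xx = np.indices(mask.shape)
    gauss = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
    labels, count = label_patches(np.asarray(mask) > 0.5)
    best, best_score = 0, -1.0
    for k in range(1, count + 1):
        patch = labels == k
        # Drop patches with a low average likelihood; keep the highest total.
        if gauss[patch].mean() >= min_avg and gauss[patch].sum() > best_score:
            best, best_score = k, gauss[patch].sum()
    if best == 0:
        return np.zeros(mask.shape, dtype=bool)
    return labels == best


# Toy case: one central patch plus a stray patch in the corner.
toy_seg = np.zeros((64, 64))
toy_seg[30:36, 30:36] = 1.0
toy_seg[2:5, 2:5] = 1.0
toy_mean = np.zeros((64, 64))
toy_mean[30:36, 30:36] = 1.0
toy_clean = clean_mask(toy_seg, toy_mean)
```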
Second, once the SAX-view images for a patient are segmented, we arrange the areas found
after applying the above filter into a matrix, with the x-axis corresponding to time and the y-axis
to slice ordering. These matrices can then be compared to each other to infer a quality-of-read
measurement for each patient. To facilitate comparisons, the matrices were resized using
bilinear sampling to 10 x 30 and normalized to values in [0, 1], and for each of the 300 pixels
we calculated a mean \mu_{ts} and standard deviation \sigma_{ts} across all patients in the
training set. Each patient's resized matrix, with entries a_{ts}, can then be given a
log-likelihood score of:
LL = \sum_{t,s} \log p(a_{ts} \mid \mu_{ts}, \sigma_{ts})    (7)
When the map contains unusual values, LL as calculated above will be low, and this usually
corresponded to the segmentation net being unable to segment some or all of the images
correctly. Therefore, in some of our models we used LL as a filter to determine when not to use
the SAX model. In retrospect, this filter was fairly imprecise, because the pattern of relative areas
is not fixed across patients, and because LL considers each pixel separately without considering
the joint distribution of pixels across a map. This filter could likely be improved by adding a term
representing variation, because bad reads usually also corresponded to highly noisy or
bumpy area values in the map.
Examples of heat maps with low (bottom right) and high (other positions) LL scores
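The quality-of-read score can be sketched as follows, assuming an independent Gaussian per pixel of the resized area map. The separable linear interpolation used for resizing, the normalization by the map maximum, the floor on the standard deviation, and the function names are assumptions for illustration.

```python
import numpy as np


def resize_area_map(m, shape=(10, 30)):
    """Bilinearly resize a (slices x time) area map and scale it into [0, 1]."""
    m = np.asarray(m, dtype=float)
    rows = np.linspace(0, m.shape[0] - 1, shape[0])
    cols = np.linspace(0, m.shape[1] - 1, shape[1])
    # Linear interpolation along time, then along slices (separable bilinear).
    tmp = np.array([np.interp(cols, np.arange(m.shape[1]), r) for r in m])
    out = np.array([np.interp(rows, np.arange(m.shape[0]), c) for c in tmp.T]).T
    peak = out.max()
    return out / peak if peak > 0 else out


def log_likelihood(a, mu, sigma):
    """Sum of per-pixel Gaussian log-densities of map a given mu and sigma."""
    sigma = np.maximum(sigma, 1e-3)       # avoid division by zero
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma ** 2)
                        - (a - mu) ** 2 / (2.0 * sigma ** 2)))


# Toy "training set" of 50 random area maps with 7 slices and 25 frames each.
rng = np.random.default_rng(0)
train_maps = np.array([resize_area_map(rng.uniform(0.5, 1.0, size=(7, 25)))
                       for _ in range(50)])
mu_map, sigma_map = train_maps.mean(axis=0), train_maps.std(axis=0)
ll_typical = log_likelihood(mu_map, mu_map, sigma_map)
ll_failed = log_likelihood(np.zeros((10, 30)), mu_map, sigma_map)
```

A map of all zeros, as would result from a segmenter that found nothing, scores far below a typical map, which is the behavior the filter relies on.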
One interesting aspect of this dataset is that it is much more highly structured even than our
method reflects. The true shape of the ventricle is usually smooth, with no outlying values, and
the area profiles described by the SAX view and the 4-chamber view have to agree.
Nevertheless, we did not use any collaborative method for detecting failures between these two
views, nor did we enforce any smoothness constraints. We did notice that in some cases outlier
values would make their way into the volume calculation; however, we did not have time to build a
more complete model that included all available information.
6. Simple Methods
We took an ensemble of 10 CNNs, but it only slightly improves on the average of the two
configurations under CNN_A/config_v3/6.py. Training a single CNN takes about 3
hours for an input image size of 256 x 256, and 2 hours for 196 x 196. Prediction takes about 20
seconds per case.
To further simplify the model, we can directly take the volume calculated from eq. (1), which uses
the segmentation output of the CNN. The adjustment in equation (2) is used to match the ground
truth provided by human annotation of the contours, which itself carries a lot of uncertainty about
whether or not to include a slice at the base of the heart. From the many cases we examined, it
seems that the ground truth results are not fully consistent about keeping the end slices.
Appendix
A1. Model Execution Time
For the 8 versions of the CNN in CNN_A (2 different architectures with different input image
sizes, 256 and 196, and 4 different kinds of parameters), each CNN takes about 3 hours to train
on a GTX 970 GPU. It takes about 10-20 seconds to predict each case (depending on how many SAX
slices it has). For all 700 test cases, prediction takes less than 3.5 hours for each CNN.
For the 2 versions of the CNN in CNN_B, since we train 5 folds of each of the two models (each
with 20% of the data held out) in addition to one full run, it takes about 30 hours to train and
8 hours to evaluate on a GTX 980 Ti with cuDNN. The 4-chamber model trains only 5 folds (no full
run), and training and evaluation together take around 9 hours.
A2. Dependencies
All the code except that in CNN_B runs with Python 2.7.6 on Ubuntu 14.04 with a
GTX 970 GPU, and it imports the following packages: cv2 3.1.0, theano
0.8.0.dev0.dev-RELEASE, lasagne 0.2.dev1, pandas 0.14.1, numpy 1.12.0.dev0+a2f5392. CUDA
7.5 is used and cuDNN is enabled.
To run the code in CNN_B, you need to use cv2 version 2.4.12.
A3. How To Generate the Solution (aka README file)
A. Download and prepare data
Change the directories in SETTINGS.py to your settings, and download the Sunnybrook data
set and the train, valid, and test data sets. Append the rows from validate.csv to train.csv and
rename the result train_valid.csv.
The directory manual_data/ includes all the hand-labeled images and their contours; they are
combined with the Sunnybrook data to train the CNNs.
B. Train CNNs to predict the contours of the LV
Part A
1. run >> bash CNN_A/run_train.sh
   a) It preprocesses the image data for the CNNs to train on.
   b) It trains many versions of the CNN models with different parameters. To save time, you
      can run only versions 3 and 6 and get a slightly worse result in 1/4 of the total
      time. Each version of the CNN takes about 3 hours to train on a GTX 970 GPU,
      and 20 seconds to predict each case.
   c) It loads the trained CNN models and predicts the contours for all cases.
   d) It extracts the sex-age information for later use in building the sex-age-based default model.
2. If there are additional cases for which you need to make predictions, just run the run_test.sh
   script:
   run >> bash CNN_A/run_test.sh
   a) It predicts the contours for the test cases.
   b) It extracts the sex-age information for the test cases.
Part B
run >> python train.py
C. Calculate the volumes
Combine (average) all the processed results that contain the areas of the contours, calculate the
volumes for each case, fit simple models on the training data set to correct systematic
errors, and predict for the unknowns.
run >> ./train_pred.py
A4. References
1) https://www.kaggle.com/c/second-annual-data-science-bowl/forums/t/18375/0-036023-score-without-looking-at-the-images
2) http://arxiv.org/abs/1502.03167
3) https://github.com/Lasagne/Lasagne/blob/master/lasagne/layers/normalization.py
4) http://arxiv.org/abs/1412.6980
5) https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient
6) http://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.rescale_intensity