doi:10.14355/jpsr.2014.0302.04
Automatic Identification of Formation Lithology from Well Log Data: A Machine Learning Approach
Seyyed Mohsen Salehi*1, Bizhan Honarvar2
Department of Petroleum Engineering, Omidiyeh Branch, Islamic Azad University, Omidiyeh, Iran
*1
Islamic Azad University, Fars Science and Research Branch, Shiraz, Iran
Emails: *1 smohsen_salehi@yahoo.com; 2 honarvar2@gmail.com
Received 22 December 2013; Accepted 10 February 2014; Published 14 April 2014
© 2014 Science and Engineering Publishing Company
Abstract
Determination of the hydrocarbon content and successful drilling of petroleum wells are highly contingent upon the lithology of the underground formation. Conventional lithology identification methods are either uneconomical or highly uncertain. The main aim of this study is to develop an intelligent model based on the Least Squares Support Vector Machine (LSSVM) and the Coupled Simulated Annealing (CSA) algorithm, called CSA-LSSVM, for predicting the lithology in one of the Iranian oilfields. To this end, photoelectric index (PEF) values were simulated by the CSA-LSSVM algorithm based on valid well logging data generally known as lithology indicators. Model predictions were compared to the real data obtained from the well logging operation, and an overall Correlation Coefficient (R²) of 0.993 and an Average Absolute Relative Deviation (AARD) of 1.6% were obtained for the total dataset (3243 data points), which shows the robustness of the CSA-LSSVM algorithm in predicting accurate PEF values. In order to check the validity of the employed well log data, a leverage value statistical method was implemented for detecting possible outliers. Only one data point was diagnosed as suspected data or a probable outlier, which supports the validity of the recorded data points and indicates a wide applicability domain for the proposed model.
Keywords
Lithology; Least Squares Support Vector Machine (LSSVM); Coupled Simulated Annealing (CSA); Outlier
Introduction
Efficient drilling of hydrocarbon wells in an oilfield certainly entails identification of the lithologies crossed by the well. Knowledge of the lithology of a hydrocarbon well can be employed in determining a
www.jpsr.org Journal of Petroleum Science Research (JPSR) Volume 3 Issue 2, April 2014
This research employed a least squares modification of the SVM approach, called Least Squares Support Vector Machine (LSSVM), in an effort to alleviate the shortcomings and deficiencies of conventional well log interpretation methods and previously applied algorithmic approaches. Our main focus is the determination of lithology from the data recorded in a wireline logging operation in one of the Iranian oil wells in the Ahwaz oilfield. In this study, caliper log (CALI), sonic log (DT), deep induction resistivity log (ILD), neutron log (NPHI), density log (RHOB), and gamma ray log (CGR) were identified as lithology indicators. All raw data obtained from wireline logging were initially corrected for environmental effects owing to borehole size, mud salinity, etc. These corrections are indispensable prior to any interpretation of well log data.
TABLE I RANGES OF INPUT/OUTPUT VARIABLES USED FOR DEVELOPING AND TESTING THE MODEL
Data Acquisition
Borehole geophysical data were obtained from an oil well in the Ahwaz oilfield, Iran. Some of the well logs were selected as indicators of lithology. For each data point, these are caliper log (CALI), sonic log (DT), deep induction resistivity log (ILD), neutron log (NPHI), density log (RHOB), and gamma ray log (CGR). These readings were then related to the photoelectric index (PEF), a supplementary measurement that records the absorption of low-energy gamma rays by the formation in units of barns per electron. The logged values increase with the aggregate atomic number of the elements in the formation; thus PEF is a sensitive indicator of mineralogy and has to be predicted with high accuracy. Figure 1 indicates different values of PEF in different formation lithologies. A total of 3243 log readings were assembled into a dataset including 7 inputs (depth and the six lithology indicator logs) and 1 output (PEF values). The overall ranges of the recorded data, along with their averages and standard deviations, are summarized in Table I.
Parameter             Minimum     Maximum     Average     Standard deviation
Depth (m)             2575.712    3075.889    2827.878    124.5312
CALI (in)             8.1504      22.2763     9.345049    0.659798
DT (μs/ft)            53.1954     113.1356    77.09043    9.722123
ILD (Ω·m)             0.1975      1705.562    12.79944    15.99413
NPHI (p.u.)           0.041645    0.494965    0.199554    0.047319
RHOB (g/cm³)          1.4736      2.8639      2.420654    0.158964
CGR (API)             0.0139      111.2971    30.33772    19.87745
PEF (barn/electron)   1.8121      6.635       3.096314    0.845851
FIGURE 1 MEASUREMENTS OF PHOTOELECTRIC INDEX (PEF) FOR DIFFERENT UNDERGROUND LITHOLOGIES
Least Squares Support Vector Machine (LSSVM)
$$\begin{cases} f(x_i) \ge 1 & \text{if } y_i = 1 \\ f(x_i) \le -1 & \text{if } y_i = -1 \end{cases} \qquad (2)$$

$$\min\left(\frac{1}{2}\lVert w \rVert^2 + C\sum_{i=1}^{n}\xi_i\right) \qquad (3)$$

$$L = \frac{1}{2}w^t w + C\sum_{i=1}^{n}\xi_i - \sum_{i=1}^{n}\alpha_i\left(y_i\left(w^t x_i + b\right) - 1 + \xi_i\right) - \sum_{i=1}^{n}\beta_i\xi_i \qquad (4)$$

where $\alpha_i$ and $\beta_i$ are the Lagrange multipliers. The solution is defined through the saddle point of the Lagrangian when the value of $\alpha_i$ is greater than zero (Cristianini

$$\frac{1}{2}w^t w + \gamma\sum_{i=1}^{n} e_i^2 \qquad (5)$$

where it is subjected to the following linear constraints:

$$y_i = w^t \varphi(x_i) + b + e_i, \quad i = 1, 2, \ldots, n \qquad (6)$$

where $\gamma$ is the relative weight of the summation of regression errors compared to the regression weight. The regression weight coefficient ($w$) can be written in terms of the Lagrange multipliers ($\alpha_i$) and the input vectors ($x_i$) as represented below (Farasat et al., 2013; Fazavi et al., 2013; Rafiee-Taghanaki et al., 2013; Shokrollahi et al., 2013):

$$w = \sum_{i=1}^{n}\alpha_i x_i$$

where

$$\alpha_i = 2\gamma e_i \qquad (7)$$

$$y = \sum_{i=1}^{n}\alpha_i x_i^t x + b \qquad (8)$$
Coupled Simulated Annealing
Simulated Annealing (SA) is a stochastic search method which is usually used for combinatorial optimization problems. The method was initially proposed by Metropolis et al. (1953) and was later popularized by Kirkpatrick et al. (1983). The motivation behind this method lies in the physical process of annealing, during which a metal is heated to a liquid state and then cooled slowly enough that all crystal grains eventually reach the lowest internal energy. Like the metal cooling process, SA gradually converges toward the optimum solution, which helps it reach the global optimum and avoid local optima (Fabian, 1997).
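The annealing idea described above can be sketched in a few lines of Python. This is a minimal single-chain simulated annealing loop, not the coupled variant used in this study; the function name `simulated_annealing`, the geometric cooling schedule, the step size, and the toy energy function are all illustrative assumptions rather than settings taken from the paper.

```python
import math
import random


def simulated_annealing(energy, x0, step=0.5, t0=1.0, cooling=0.95,
                        iters_per_temp=50, t_min=1e-4, seed=0):
    """Minimize a 1-D energy function with plain simulated annealing."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t0
    while t > t_min:
        for _ in range(iters_per_temp):
            # Propose a random neighbour of the current solution.
            cand = x + rng.uniform(-step, step)
            e_cand = energy(cand)
            # Metropolis rule: always accept improvements; accept worse
            # moves with probability exp(-(E_new - E_old) / T).
            if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / t):
                x, e = cand, e_cand
                if e < best_e:
                    best_x, best_e = x, e
        # Geometric cooling (an assumed schedule): lower the temperature.
        t *= cooling
    return best_x, best_e


# Toy energy with a single minimum at x = 2 (hypothetical example).
best_x, best_e = simulated_annealing(lambda x: (x - 2.0) ** 2, x0=-5.0)
```

At high temperature the chain accepts many uphill moves and explores widely; as the temperature drops it behaves more and more like a greedy descent. CSA runs several such chains and couples their acceptance probabilities, which this sketch does not attempt.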
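$$\alpha_i = \frac{y_i - b}{x_i^t x + (2\gamma)^{-1}} \qquad (9)$$

$$f(x) = \sum_{i=1}^{n}\alpha_i K(x, x_i) + b \qquad (10)$$

$$K(x, x_i) = \varphi(x_i)^t \cdot \varphi(x) \qquad (11)$$

$$K(x, x_i) = \exp\left(-\lVert x_i - x \rVert^2 / \sigma^2\right) \qquad (12)$$

where $\sigma^2$ is the squared bandwidth, which is optimized through an external optimization technique during the training process.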
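To make the kernel form of the regression function concrete, the following NumPy sketch fits an LSSVM regressor with the Gaussian kernel above. It is a minimal illustration under stated assumptions, not the toolbox implementation used in this study: the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the synthetic sine data, and the values of `gamma` and `sigma2` are hypothetical. The linear system follows from the standard LSSVM optimality conditions with the convention that the Lagrange multiplier equals 2 * gamma * e_i, which adds the term I / (2 * gamma) to the kernel matrix.

```python
import numpy as np


def rbf_kernel(A, B, sigma2):
    """K(x, x_i) = exp(-||x_i - x||^2 / sigma^2) for all row pairs of A and B."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / sigma2)


def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LSSVM linear system for the bias b and multipliers alpha.

        [ 0   1^T             ] [ b     ]   [ 0 ]
        [ 1   K + I/(2*gamma) ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / (2.0 * gamma)
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def lssvm_predict(X_train, alpha, b, X_new, sigma2):
    """f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b


# Synthetic 1-D example (hypothetical data, not the well logs of this study).
X = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y = np.sin(X).ravel()
gamma, sigma2 = 100.0, 1.0
b, alpha = lssvm_fit(X, y, gamma, sigma2)
pred = lssvm_predict(X, alpha, b, X, sigma2)
```

Because training reduces to one linear solve, the only quantities left to tune are `gamma` and `sigma2`, which is where an external optimizer such as CSA comes in.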
The mean squared error (MSE) between the real PEF values and those predicted by the LSSVM algorithm was defined as (Farasat et al., 2013; Rafiee-Taghanaki et al., 2013; Shokrollahi et al., 2013):

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(PEF_{pred,i} - PEF_{real,i}\right)^2 \qquad (13)$$

where PEF represents the PEF values, N is the number of training objects, and the subscripts pred and real denote the predicted and real PEF values, respectively. The LSSVM algorithm employed in this study to train the well log data has been developed by Pelckmans et al. (2002) and Suykens and Vandewalle (1999). In order to enhance model performance during learning process,
In the next step, the assembled well log data were divided into three subsets, namely train, validation, and test. The Train set is employed to generate the model structure; the Validation set is applied for adjusting the model parameters and for checking the validity of the patterns learned by CSA-LSSVM over the whole range of the dataset; and the Test set is used to investigate the final performance and validity of the proposed model for unseen data. To increase the model applicability and robustness, the whole database was divided randomly into 70%, 15%, and 15% fractions of the main dataset for the Train set (2270 data points), the Validation set (486 data points), and the Test set (487 data points), respectively.
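The random partition described above can be reproduced along the following lines. This is a sketch only: the helper name `split_indices` and the random seed are assumptions, and the paper does not state how its own random split was generated. The subset sizes are the ones reported in the text.

```python
import numpy as np


def split_indices(n_samples, n_train, n_val, seed=0):
    """Randomly partition sample indices into train, validation, and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Sizes reported in the text: 3243 readings split into 2270 / 486 / 487.
train_idx, val_idx, test_idx = split_indices(3243, 2270, 486)
```

Shuffling indices rather than the data itself keeps the depth-ordered log intact, so the same index arrays can be applied to the inputs and to the PEF output.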
[Flowchart labels: Trn. set, Vldn. set, Tst. set; Employ feature subset; Meet stopping criteria? (Yes/No); Final CSA-LSSVM model]
FIGURE 2 A TYPICAL FLOWCHART REPRESENTING THE CSA-LSSVM ALGORITHM
$$\exp\left(\frac{E(x_i) - \max_{x_i \in \Theta} E(x_i)}{T_k^{a}}\right) \qquad (15)$$

Model Accuracy And Validation
In this research, the CSA-LSSVM algorithm was implemented in order to obtain PEF as a function of several other measurements recorded during the well logging operation. PEF can be used as a general indicator of the lithologies and mineralogical complexities of different formation layers. In this study, PEF was linked to some other parameters generally known as lithology indicators:

$$PEF = f(\text{Depth}, \text{CALI}, \text{DT}, \text{ILD}, \text{NPHI}, \text{RHOB}, \text{CGR}) \qquad (16)$$

The LSSVM parameters obtained at the end of the optimization process were: γ = 284.8173 and σ² = 0.9916.

TABLE II STATISTICAL PARAMETERS OF THE PROPOSED CSA-LSSVM MODEL
Statistical parameter                      Train set   Validation set   Test set   Total
R²                                         0.995       0.987            0.985      0.993
Average absolute relative deviation (%)    1.3         2.2              2.2        1.6
Standard deviation error                   0.84        0.82             0.86       0.84
Root mean square error                     0.07        0.11             0.12       0.08
Number of data points                      2270        486              487        3243
FIGURE 3 GRAPHICAL REPRESENTATION OF PEF VALUES PREDICTED BY CSA-LSSVM ALGORITHM VERSUS REAL PEF VALUES. [Cross plot of LSSVM prediction against Real PEF for the Train, Validation, and Test sets, with the 45° line; R² = 0.993]
FIGURE 4 COMPARISON BETWEEN CSA-LSSVM MODEL PREDICTIONS AND REAL DATA FOR TRAIN DATASET [PEF values against Total number of train data; series: Real PEF, LSSVM prediction]
FIGURE 5 COMPARISON BETWEEN CSA-LSSVM MODEL PREDICTIONS AND REAL DATA FOR VALIDATION DATASET [PEF values against Total number of validation data; series: Real PEF, LSSVM prediction]
FIGURE 6 COMPARISON BETWEEN CSA-LSSVM MODEL PREDICTIONS AND REAL DATA FOR TEST DATASET [PEF values against Total number of test data; series: Real PEF, LSSVM prediction]
FIGURE 7 HISTOGRAM OF ERROR FREQUENCY SKETCHED FOR ALL DATA INCLUDING TRAIN, VALIDATION, AND TEST SETS [Data frequency against Relative deviation]
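$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(\mathrm{pred}(i) - \mathrm{exp.}(i)\right)^2}{\sum_{i=1}^{N}\left(\mathrm{exp.}(i) - \overline{\mathrm{exp.}}\right)^2}$$

$$\%AARD = \frac{100}{N}\sum_{i=1}^{N}\frac{\left|\mathrm{pred}(i) - \mathrm{exp.}(i)\right|}{\mathrm{exp.}(i)}$$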
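The error statistics reported in Table II can be computed along the following lines. This is a sketch using the common textbook definitions of R², AARD, and RMSE; the exact formulas used in the study, in particular for the standard deviation error, may differ, and the function names and the two-point example are hypothetical.

```python
import numpy as np


def r_squared(pred, exp):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    pred, exp = np.asarray(pred, float), np.asarray(exp, float)
    ss_res = np.sum((pred - exp) ** 2)
    ss_tot = np.sum((exp - exp.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def aard_percent(pred, exp):
    """Average absolute relative deviation, in percent."""
    pred, exp = np.asarray(pred, float), np.asarray(exp, float)
    return 100.0 * np.mean(np.abs(pred - exp) / np.abs(exp))


def rmse(pred, exp):
    """Root mean squared error."""
    pred, exp = np.asarray(pred, float), np.asarray(exp, float)
    return np.sqrt(np.mean((pred - exp) ** 2))


# Tiny hand-checkable example (made-up numbers, not data from the paper).
exp_vals = [2.0, 4.0]
pred_vals = [2.2, 3.8]
r2 = r_squared(pred_vals, exp_vals)
aard = aard_percent(pred_vals, exp_vals)
err = rmse(pred_vals, exp_vals)
```

AARD is scale-free, which makes it convenient for comparing subsets, while RMSE stays in the units of PEF (barn/electron).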
$0 \le H \le H^*$ and $-3 \le R \le 3$ reveals the high applicability and reliability of the developed model. Based on these values, suspected data may be categorized into two types, namely leverage points and regression outliers. Leverage points are further subdivided into two groups, namely good leverage points and bad leverage points. Good leverage points are those data points located
Outlier Detection In PEF Measurements
To develop a valid and highly applicable model for predicting PEF values from well log measurements, the recorded data must be reliable and accurate. However, fully accurate measurement of well log data is rarely feasible, and environmental interferences may in some cases introduce flawed measurements into the recorded database. These observations usually differ from the bulk of the data and threaten successful lithology prediction. Thus, constructing an accurate and reliable model is highly dependent upon detecting these values in the well logging data.

$$H = X\left(X^t X\right)^{-1} X^t \qquad (17)$$
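The hat matrix above leads directly to a leverage-based screening of the data. The sketch below computes the diagonal of H and flags points whose hat value exceeds a warning leverage or whose standardized residual falls outside ±3. Two choices here are assumptions rather than values taken from the paper: the warning leverage is set to the commonly used 3(p + 1)/n for n data points and p model inputs, and residuals are standardized by their sample standard deviation. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def hat_diagonal(X):
    """Diagonal of H = X (X^T X)^{-1} X^T, computed without forming H."""
    xtx_inv = np.linalg.pinv(X.T @ X)
    return np.einsum("ij,jk,ik->i", X, xtx_inv, X)


def flag_suspected(X, residuals, r_limit=3.0):
    """Return (flags, hat values, standardized residuals, warning leverage)."""
    n, p = X.shape
    h = hat_diagonal(X)
    h_star = 3.0 * (p + 1) / n                    # assumed warning leverage
    r_std = residuals / np.std(residuals, ddof=1)  # assumed standardization
    flags = (h > h_star) | (np.abs(r_std) > r_limit)
    return flags, h, r_std, h_star


# Synthetic check: 200 well-behaved points plus one far-away input vector.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
X_demo[0] = [50.0, 50.0, 50.0]                     # deliberate high-leverage point
res_demo = rng.normal(scale=0.1, size=200)
flags, h, r_std, h_star = flag_suspected(X_demo, res_demo)
```

Plotting `r_std` against `h` with the lines `h = h_star` and `r_std = ±3` gives the usual Williams plot, in which points inside the box are considered within the applicability domain of the model.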
FIGURE 8 DETECTION OF PROBABLE OUTLIERS OR SUSPECTED DATA FROM THE WHOLE RECORDED DATASET [Standardized residual against Hat value]

Conclusions
In this study, the Least Squares Support Vector Machine (LSSVM) was implemented to obtain formation lithology from well log data obtained from an oil well in the Ahwaz oilfield, Iran. In order to optimize the LSSVM parameters, the Coupled Simulated Annealing (CSA) algorithm was implemented to construct a hybrid approach called CSA-LSSVM. Using the CSA-LSSVM algorithm, the photoelectric index (PEF) was simulated based on the well logging data obtained from the underground formation. Model predictions were compared with real PEF values, and an overall Correlation Coefficient (R²) of 0.993 and an Average Absolute Relative Deviation (AARD) of 1.6% were obtained, showing the high accuracy of CSA-LSSVM in predicting PEF values. The close agreement between simulated and real PEF values corroborates the validity of the developed model. Also, a statistical approach was implemented for determining the suspected data and possible outliers in the overall PEF recordings. The employed database was found to be highly accurate: only one data point was diagnosed as following a different pattern from the rest of the dataset. This suggests a wide applicability domain for the developed CSA-LSSVM model in predicting PEF values from well log data. The developed model can further be implemented in adjacent wells with acceptable accuracy for lithology prediction during drilling operations.

NOMENCLATURE
Acceptance probability function
AARD: Average Absolute Relative Deviations, %
b: Bias term
C: Positive constant
CALI: Caliper log
CGR: Corrected gamma ray
CLM: Coupled Local Minimizers
CSA: Coupled Simulated Annealing
DT: Sonic log
e_i: Regression error
g(x): Mapping function
H: Hat matrix
ILD: Deep induction resistivity log
K(x, x_i): Kernel function
LSSVM: Least Squares Support Vector Machine
Number of employed data
MSE: Mean Squared Error
Total number of model parameters
N: Number of training objects
NPHI: Neutron log
Injection rate, cc/min
R: Residual
R²: Coefficient of determination
RMSE: Root Mean Squared Errors
RHOB: Density log
Set of all possible solutions
SA: Simulated Annealing
STD: Standard Deviation Error
t: Transpose
T_k^a: Acceptance temperature
A nonlinear function
x: Inputs
X: A two-dimensional matrix (m × n)
y: Outputs

GREEK LETTERS
σ²: Squared bandwidth
Θ: A subset of all possible solutions
Coupling term
α_i: Lagrange multipliers
γ: Relative weight of the summation of the regression errors
ξ: Slack variable

REFERENCES:
Akinyokun, O.C., Enikanselu, P.A., Adeyemo, A.B., Adesida, A., 2009. Well Log Interpretation Model for the Determination of Lithology and Fluid Contents. The Pacific Journal of Science and Technology 10, 507-517.
Baylar, A., Hanbay, D., Batan, M., 2009. Application of least square support vector machines in the prediction of aeration performance of plunging overfall jets from weirs. Expert Syst. Appl. 36, 8368-8374.
& Geosciences 31, 263-275.
Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P., 1983. Optimization by Simulated Annealing. Science 220, 671-680.
43, 1882-1889.
Physics 21, 1087-1092.
Mohammadi, A.H., Eslamimanesh, A., Gharagheizi, F.,
organizing maps. Computers & Geosciences 28, 223-229.
asphaltene precipitation titration data. Chemical Engineering Science 78, 181-185.
Moser, G., Serpico, S.B., 2009. Modeling the Error Statistics in Support Vector Regression of Surface Temperature From Infrared Data. IEEE Geosci. Remote Sens. Lett. 6, 448-452.
learning methods. Cambridge University Press.
Eslamimanesh, A., Gharagheizi, F., Mohammadi, A.H.,
Pelckmans, K., Suykens, J.A.K., Gestel, T.V., Brabanter, J.D.,
gases. Fuel Processing Technology 110, 133-140.
Maps. PennWell.
Support Vector Machines, Leuven, Belgium.
Fabian, V., 1997. Simulated annealing simulated. Computers & Mathematics with Applications 33, 81-94.
Farasat, A., Shokrollahi, A., Arabloo, M., Gharagheizi, F., Mohammadi, A.H., 2013. Toward an intelligent approach for determination of saturation pressure of crude oil. Fuel Process. Technol.
M., Zargari, M.H., Adelzadeh, M.R., 2013.
properties of reservoir oil. Fluid Phase Equilib. 346, 25-32.
Fazavi, M., Hosseini, S.M., Arabloo, M.,
and Beyond. University Press Group Limited.
Dispersion Sci. Technol.
Gharagheizi, F., Eslamimanesh, A., Farjood, F., Mohammadi,
Petroleum Engineers Journal 22, 117-131.
Strategy. Ind. Eng. Chem. 50, 11382-11395.
375-384.
Goodall, C.R., 1993. Computation using the QR decomposition, Handbook of Statistics. Elsevier, pp. 467-508.
293-300.
Suykens, J.A.K., 2001. Support Vector Machines: A Nonlinear Modelling and Control Perspective. Eur. J. Control 7, 311-327.
internal and external. QSAR & Combinatorial Science 26, 694-701.
Springer Verlag, New York.
279-283.
identification of aquifers from geophysical well logs and
on 40, 320-335.