Professional Documents
Culture Documents
hadoop.hdfs.configuration.version
value
1
dfs.https.enable
false
dfs.http.policy
HTTP_ONLY
dfs.client.https.needauth
false
dfs.client.cached.conn.retry
dfs.https.server.keystore.resource
sslserver.xml
description
versionofthisconfigurationfile
Thelogginglevelfordfsnamenode.Othervaluesare"dir"(tracenamespacemutations),
"block"(traceblockunder/overreplicationsandblockcreations/deletions),or"all".
RPCaddressthathandlesallclientsrequests.InthecaseofHA/Federationwheremultiple
namenodesexist,thenameserviceidisaddedtothenamee.g.dfs.namenode.rpc
address.ns1dfs.namenode.rpcaddress.EXAMPLENAMESERVICEThevalueofthis
propertywilltaketheformofnnhost1:rpcport.
TheactualaddresstheRPCserverwillbindto.Ifthisoptionaladdressisset,itoverrides
onlythehostnameportionofdfs.namenode.rpcaddress.Itcanalsobespecifiedpername
nodeornameserviceforHA/Federation.Thisisusefulformakingthenamenodelisten
onallinterfacesbysettingitto0.0.0.0.
RPCaddressforHDFSServicescommunication.BackupNode,Datanodesandallother
servicesshouldbeconnectingtothisaddressifitisconfigured.Inthecaseof
HA/Federationwheremultiplenamenodesexist,thenameserviceidisaddedtothename
e.g.dfs.namenode.servicerpcaddress.ns1dfs.namenode.rpc
address.EXAMPLENAMESERVICEThevalueofthispropertywilltaketheformofnn
host1:rpcport.Ifthevalueofthispropertyisunsetthevalueofdfs.namenode.rpcaddress
willbeusedasthedefault.
TheactualaddresstheserviceRPCserverwillbindto.Ifthisoptionaladdressisset,it
overridesonlythehostnameportionofdfs.namenode.servicerpcaddress.Itcanalsobe
specifiedpernamenodeornameserviceforHA/Federation.Thisisusefulformakingthe
namenodelistenonallinterfacesbysettingitto0.0.0.0.
Thesecondarynamenodehttpserveraddressandport.
ThesecondarynamenodeHTTPSserveraddressandport.
Thedatanodeserveraddressandportfordatatransfer.
Thedatanodehttpserveraddressandport.
Thedatanodeipcserveraddressandport.
Thenumberofserverthreadsforthedatanode.
Theaddressandthebaseportwherethedfsnamenodewebuiwilllistenon.
TheactualadresstheHTTPserverwillbindto.Ifthisoptionaladdressisset,itoverrides
onlythehostnameportionofdfs.namenode.httpaddress.Itcanalsobespecifiedpername
nodeornameserviceforHA/Federation.ThisisusefulformakingthenamenodeHTTP
serverlistenonallinterfacesbysettingitto0.0.0.0.
Deprecated.Use"dfs.http.policy"instead.
DecideifHTTPS(SSL)issupportedonHDFSThisconfigurestheHTTPendpointfor
HDFSdaemons:Thefollowingvaluesaresupported:HTTP_ONLY:Serviceis
providedonlyonhttpHTTPS_ONLY:Serviceisprovidedonlyonhttps
HTTP_AND_HTTPS:Serviceisprovidedbothonhttpandhttps
WhetherSSLclientcertificateauthenticationisrequired
ThenumberoftimestheHDFSclientwillpullasocketfromthecache.Oncethisnumber
isexceeded,theclientwilltrytocreateanewsocket.
Resourcefilefromwhichsslserverkeystoreinformationwillbeextracted
dfs.namenode.logging.level
info
dfs.client.https.keystore.resource
dfs.datanode.https.address
sslclient.xml
0.0.0.0:50475
Resourcefilefromwhichsslclientkeystoreinformationwillbeextracted
Thedatanodesecurehttpserveraddressandport.
dfs.namenode.rpcaddress
dfs.namenode.rpcbindhost
dfs.namenode.servicerpcaddress
dfs.namenode.servicerpcbindhost
dfs.namenode.secondary.httpaddress
dfs.namenode.secondary.httpsaddress
dfs.datanode.address
dfs.datanode.http.address
dfs.datanode.ipc.address
dfs.datanode.handler.count
dfs.namenode.httpaddress
0.0.0.0:50090
0.0.0.0:50091
0.0.0.0:50010
0.0.0.0:50075
0.0.0.0:50020
10
0.0.0.0:50070
dfs.namenode.httpbindhost
dfs.namenode.httpsaddress
0.0.0.0:50470
dfs.namenode.httpsbindhost
dfs.datanode.dns.interface
default
dfs.datanode.dns.nameserver
default
dfs.namenode.backup.address
0.0.0.0:50100
dfs.namenode.backup.httpaddress
0.0.0.0:50105
dfs.namenode.replication.considerLoad
true
dfs.default.chunk.view.size
dfs.datanode.du.reserved
32768
0
dfs.namenode.name.dir
file://${hadoop.tmp.dir}/dfs/name
dfs.namenode.name.dir.restore
false
dfs.namenode.fslimits.maxcomponentlength
255
dfs.namenode.fslimits.maxdirectoryitems
1048576
dfs.namenode.fslimits.minblocksize
1048576
dfs.namenode.fslimits.maxblocksperfile
1048576
dfs.namenode.edits.dir
${dfs.namenode.name.dir}
dfs.namenode.shared.edits.dir
dfs.namenode.edits.journalplugin.qjournal
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager
dfs.permissions.enabled
true
dfs.permissions.superusergroup
supergroup
dfs.namenode.acls.enabled
false
Thenamenodesecurehttpserveraddressandport.
TheactualadresstheHTTPSserverwillbindto.Ifthisoptionaladdressisset,itoverrides
onlythehostnameportionofdfs.namenode.httpsaddress.Itcanalsobespecifiedper
namenodeornameserviceforHA/Federation.Thisisusefulformakingthenamenode
HTTPSserverlistenonallinterfacesbysettingitto0.0.0.0.
ThenameoftheNetworkInterfacefromwhichadatanodeshouldreportitsIPaddress.
ThehostnameorIPaddressofthenameserver(DNS)whichaDataNodeshoulduseto
determinethehostnameusedbytheNameNodeforcommunicationanddisplaypurposes.
Thebackupnodeserveraddressandport.Iftheportis0thentheserverwillstartonafree
port.
Thebackupnodehttpserveraddressandport.Iftheportis0thentheserverwillstartona
freeport.
DecideifchooseTargetconsidersthetarget'sloadornot
Thenumberofbytestoviewforafileonthebrowser.
Reservedspaceinbytespervolume.Alwaysleavethismuchspacefreefornondfsuse.
DetermineswhereonthelocalfilesystemtheDFSnamenodeshouldstorethename
table(fsimage).Ifthisisacommadelimitedlistofdirectoriesthenthenametableis
replicatedinallofthedirectories,forredundancy.
SettotruetoenableNameNodetoattemptrecoveringapreviouslyfailed
dfs.namenode.name.dir.Whenenabled,arecoveryofanyfaileddirectoryisattempted
duringcheckpoint.
DefinesthemaximumnumberofbytesinUTF8encodingineachcomponentofapath.
Avalueof0willdisablethecheck.
Definesthemaximumnumberofitemsthatadirectorymaycontain.Avalueof0will
disablethecheck.
Minimumblocksizeinbytes,enforcedbytheNamenodeatcreatetime.Thispreventsthe
accidentalcreationoffileswithtinyblocksizes(andthusmanyblocks),whichcan
degradeperformance.
Maximumnumberofblocksperfile,enforcedbytheNamenodeonwrite.Thisprevents
thecreationofextremelylargefileswhichcandegradeperformance.
DetermineswhereonthelocalfilesystemtheDFSnamenodeshouldstorethetransaction
(edits)file.Ifthisisacommadelimitedlistofdirectoriesthenthetransactionfileis
replicatedinallofthedirectories,forredundancy.Defaultvalueissameas
dfs.namenode.name.dir
AdirectoryonsharedstoragebetweenthemultiplenamenodesinanHAcluster.This
directorywillbewrittenbytheactiveandreadbythestandbyinordertokeepthe
namespacessynchronized.Thisdirectorydoesnotneedtobelistedin
dfs.namenode.edits.dirabove.ItshouldbeleftemptyinanonHAcluster.
If"true",enablepermissioncheckinginHDFS.If"false",permissioncheckingisturned
off,butallotherbehaviorisunchanged.Switchingfromoneparametervaluetotheother
doesnotchangethemode,ownerorgroupoffilesordirectories.
Thenameofthegroupofsuperusers.
SettotruetoenablesupportforHDFSACLs(AccessControlLists).Bydefault,ACLs
aredisabled.WhenACLsaredisabled,theNameNoderejectsallRPCsrelatedtosetting
orgettingACLs.
dfs.namenode.lazypersist.file.scrub.interval.sec
300
dfs.block.access.token.enable
false
dfs.block.access.key.update.interval
dfs.block.access.token.lifetime
600
600
dfs.datanode.data.dir
file://${hadoop.tmp.dir}/dfs/data
dfs.datanode.data.dir.perm
700
dfs.replication
dfs.replication.max
dfs.namenode.replication.min
512
1
dfs.blocksize
134217728
dfs.client.block.write.retries
dfs.client.block.write.replacedatanodeonfailure.enable
true
dfs.client.block.write.replacedatanodeonfailure.policy
DEFAULT
dfs.client.block.write.replacedatanodeonfailure.besteffort false
dfs.blockreport.intervalMsec
dfs.blockreport.initialDelay
21600000
0
TheNameNodeperiodicallyscansthenamespaceforLazyPersistfileswithmissing
blocksandunlinksthemfromthenamespace.Thisconfigurationkeycontrolstheinterval
betweensuccessivescans.Setittoanegativevaluetodisablethisbehavior.
If"true",accesstokensareusedascapabilitiesforaccessingdatanodes.If"false",no
accesstokensarecheckedonaccessingdatanodes.
Intervalinminutesatwhichnamenodeupdatesitsaccesskeys.
Thelifetimeofaccesstokensinminutes.
DetermineswhereonthelocalfilesystemanDFSdatanodeshouldstoreitsblocks.Ifthis
isacommadelimitedlistofdirectories,thendatawillbestoredinallnameddirectories,
typicallyondifferentdevices.Directoriesthatdonotexistareignored.
PermissionsforthedirectoriesononthelocalfilesystemwheretheDFSdatanodestore
itsblocks.Thepermissionscaneitherbeoctalorsymbolic.
Defaultblockreplication.Theactualnumberofreplicationscanbespecifiedwhenthefile
iscreated.Thedefaultisusedifreplicationisnotspecifiedincreatetime.
Maximalblockreplication.
Minimalblockreplication.
Thedefaultblocksizefornewfiles,inbytes.Youcanusethefollowingsuffix(case
insensitive):k(kilo),m(mega),g(giga),t(tera),p(peta),e(exa)tospecifythesize(suchas
128k,512m,1g,etc.),Orprovidecompletesizeinbytes(suchas134217728for128
MB).
Thenumberofretriesforwritingblockstothedatanodes,beforewesignalfailuretothe
application.
Ifthereisadatanode/networkfailureinthewritepipeline,DFSClientwilltrytoremove
thefaileddatanodefromthepipelineandthencontinuewritingwiththeremaining
datanodes.Asaresult,thenumberofdatanodesinthepipelineisdecreased.Thefeatureis
toaddnewdatanodestothepipeline.Thisisasitewidepropertytoenable/disablethe
feature.Whentheclustersizeisextremelysmall,e.g.3nodesorless,cluster
administratorsmaywanttosetthepolicytoNEVERinthedefaultconfigurationfileor
disablethisfeature.Otherwise,usersmayexperienceanunusuallyhighrateofpipeline
failuressinceitisimpossibletofindnewdatanodesforreplacement.Seealso
dfs.client.block.write.replacedatanodeonfailure.policy
Thispropertyisusedonlyifthevalueofdfs.client.block.write.replacedatanodeon
failure.enableistrue.ALWAYS:alwaysaddanewdatanodewhenanexistingdatanodeis
removed.NEVER:neveraddanewdatanode.DEFAULT:Letrbethereplication
number.Letnbethenumberofexistingdatanodes.Addanewdatanodeonlyifris
greaterthanorequalto3andeither(1)floor(r/2)isgreaterthanorequaltonor(2)ris
greaterthannandtheblockishflushed/appended.
Thispropertyisusedonlyifthevalueofdfs.client.block.write.replacedatanodeon
failure.enableistrue.Besteffortmeansthattheclientwilltrytoreplaceafaileddatanode
inwritepipeline(providedthatthepolicyissatisfied),however,itcontinuesthewrite
operationincasethatthedatanodereplacementalsofails.Supposethedatanode
replacementfails.false:Anexceptionshouldbethrownsothatthewritewillfail.true:
Thewriteshouldberesumedwiththeremainingdatandoes.Notethatsettingthisproperty
totrueallowswritingtoapipelinewithasmallernumberofdatanodes.Asaresult,it
increasestheprobabilityofdataloss.
Determinesblockreportingintervalinmilliseconds.
Delayforfirstblockreportinseconds.
IfthenumberofblocksontheDataNodeisbelowthisthresholdthenitwillsendblock
dfs.blockreport.split.threshold
1000000
dfs.datanode.directoryscan.interval
21600
dfs.datanode.directoryscan.threads
dfs.heartbeat.interval
dfs.namenode.handler.count
3
10
dfs.namenode.safemode.thresholdpct
0.999f
dfs.namenode.safemode.min.datanodes
dfs.namenode.safemode.extension
30000
dfs.namenode.resource.check.interval
5000
dfs.namenode.resource.du.reserved
104857600
dfs.namenode.resource.checked.volumes
dfs.namenode.resource.checked.volumes.minimum
dfs.datanode.balance.bandwidthPerSec
1048576
dfs.hosts
dfs.hosts.exclude
dfs.namenode.max.objects
dfs.namenode.datanode.registration.iphostnamecheck
true
dfs.namenode.decommission.interval
30
dfs.namenode.decommission.nodes.per.interval
reportsforallStorageDirectoriesinasinglemessage.Ifthenumberofblocksexceeds
thisthresholdthentheDataNodewillsendblockreportsforeachStorageDirectoryin
separatemessages.Settozerotoalwayssplit.
IntervalinsecondsforDatanodetoscandatadirectoriesandreconcilethedifference
betweenblocksinmemoryandonthedisk.
Howmanythreadsshouldthethreadpoolusedtocompilereportsforvolumesinparallel
have.
Determinesdatanodeheartbeatintervalinseconds.
Thenumberofserverthreadsforthenamenode.
Specifiesthepercentageofblocksthatshouldsatisfytheminimalreplicationrequirement
definedbydfs.namenode.replication.min.Valueslessthanorequalto0meannottowait
foranyparticularpercentageofblocksbeforeexitingsafemode.Valuesgreaterthan1will
makesafemodepermanent.
Specifiesthenumberofdatanodesthatmustbeconsideredalivebeforethenamenode
exitssafemode.Valueslessthanorequalto0meannottotakethenumberoflive
datanodesintoaccountwhendecidingwhethertoremaininsafemodeduringstartup.
Valuesgreaterthanthenumberofdatanodesintheclusterwillmakesafemode
permanent.
Determinesextensionofsafemodeinmillisecondsafterthethresholdlevelisreached.
TheintervalinmillisecondsatwhichtheNameNoderesourcecheckerruns.Thechecker
calculatesthenumberoftheNameNodestoragevolumeswhoseavailablespacesaremore
thandfs.namenode.resource.du.reserved,andenterssafemodeifthenumberbecomes
lowerthantheminimumvaluespecifiedby
dfs.namenode.resource.checked.volumes.minimum.
Theamountofspacetoreserve/requireforaNameNodestoragedirectoryinbytes.The
defaultis100MB.
AlistoflocaldirectoriesfortheNameNoderesourcecheckertocheckinadditiontothe
localeditsdirectories.
TheminimumnumberofredundantNameNodestoragevolumesrequired.
Specifiesthemaximumamountofbandwidththateachdatanodecanutilizeforthe
balancingpurposeintermofthenumberofbytespersecond.
Namesafilethatcontainsalistofhoststhatarepermittedtoconnecttothenamenode.
Thefullpathnameofthefilemustbespecified.Ifthevalueisempty,allhostsare
permitted.
Namesafilethatcontainsalistofhoststhatarenotpermittedtoconnecttothe
namenode.Thefullpathnameofthefilemustbespecified.Ifthevalueisempty,nohosts
areexcluded.
Themaximumnumberoffiles,directoriesandblocksdfssupports.Avalueofzero
indicatesnolimittothenumberofobjectsthatdfssupports.
Iftrue(thedefault),thenthenamenoderequiresthataconnectingdatanode'saddressmust
beresolvedtoahostname.Ifnecessary,areverseDNSlookupisperformed.Allattempts
toregisteradatanodefromanunresolvableaddressarerejected.Itisrecommendedthat
thissettingbeleftontopreventaccidentalregistrationofdatanodeslistedbyhostnamein
theexcludesfileduringaDNSoutage.Onlysetthistofalseinenvironmentswherethere
isnoinfrastructuretosupportreverseDNSlookup.
Namenodeperiodicityinsecondstocheckifdecommissioniscomplete.
Thenumberofnodesnamenodechecksifdecommissioniscompleteineach
dfs.namenode.decommission.interval.
dfs.namenode.replication.interval
Theperiodicityinsecondswithwhichthenamenodecomputesrepliactionworkfor
datanodes.
dfs.namenode.accesstime.precision
3600000
TheaccesstimeforHDFSfileispreciseuptothisvalue.Thedefaultvalueis1hour.
Settingavalueof0disablesaccesstimesforHDFS.
dfs.datanode.plugins
Commaseparatedlistofdatanodepluginstobeactivated.
dfs.namenode.plugins
Commaseparatedlistofnamenodepluginstobeactivated.
Thesizeofbuffertostreamfiles.Thesizeofthisbuffershouldprobablybeamultipleof
hardwarepagesize(4096onIntelx86),anditdetermineshowmuchdataisbuffered
duringreadandwriteoperations.
dfs.streambuffersize
4096
dfs.bytesperchecksum
dfs.clientwritepacketsize
512
65536
Thenumberofbytesperchecksum.Mustnotbelargerthandfs.streambuffersize
Packetsizeforclientstowrite
ThemaximumperiodtokeepaDNintheexcludednodeslistataclient.Afterthisperiod,
inmilliseconds,thepreviouslyexcludednode(s)willberemovedautomaticallyfromthe
cacheandwillbeconsideredgoodforblockallocationsagain.Usefultolowerorraisein
situationswhereyoukeepafileopenforverylongperiods(suchasaWriteAheadLog
(WAL)file)tomakethewritertoleranttoclustermaintenancerestarts.Defaultsto10
minutes.
DetermineswhereonthelocalfilesystemtheDFSsecondarynamenodeshouldstorethe
temporaryimagestomerge.Ifthisisacommadelimitedlistofdirectoriesthentheimage
isreplicatedinallofthedirectoriesforredundancy.
dfs.client.write.exclude.nodes.cache.expiry.interval.millis
600000
dfs.namenode.checkpoint.dir
file://${hadoop.tmp.dir}/dfs/namesecondary
dfs.namenode.checkpoint.edits.dir
${dfs.namenode.checkpoint.dir}
dfs.namenode.checkpoint.period
3600
dfs.namenode.checkpoint.txns
1000000
dfs.namenode.checkpoint.check.period
60
TheSecondaryNameNodeandCheckpointNodewillpolltheNameNodeevery
'dfs.namenode.checkpoint.check.period'secondstoquerythenumberofuncheckpointed
transactions.
dfs.namenode.checkpoint.maxretries
TheSecondaryNameNoderetriesfailedcheckpointing.Ifthefailureoccurswhileloading
fsimageorreplayingedits,thenumberofretriesislimitedbythisvariable.
dfs.namenode.num.checkpoints.retained
dfs.namenode.num.extra.edits.retained
1000000
dfs.namenode.max.extra.edits.segments.retained
10000
DetermineswhereonthelocalfilesystemtheDFSsecondarynamenodeshouldstorethe
temporaryeditstomerge.Ifthisisacommadelimitedlistofdirectoiresthenteheditsis
replicatedinallofthedirectoiresforredundancy.Defaultvalueissameas
dfs.namenode.checkpoint.dir
Thenumberofsecondsbetweentwoperiodiccheckpoints.
TheSecondaryNameNodeorCheckpointNodewillcreateacheckpointofthenamespace
every'dfs.namenode.checkpoint.txns'transactions,regardlessofwhether
'dfs.namenode.checkpoint.period'hasexpired.
ThenumberofimagecheckpointfilesthatwillberetainedbytheNameNodeand
SecondaryNameNodeintheirstoragedirectories.Alleditlogsnecessarytorecoveran
uptodatenamespacefromtheoldestretainedcheckpointwillalsoberetained.
Thenumberofextratransactionswhichshouldberetainedbeyondwhatisminimally
necessaryforaNNrestart.ThiscanbeusefulforauditpurposesorforanHAsetupwhere
aremoteStandbyNodemayhavebeenofflineforsometimeandneedtohavealonger
backlogofretainededitsinordertostartagain.Typicallyeacheditisontheorderofa
fewhundredbytes,sothedefaultof1millioneditsshouldbeontheorderofhundredsof
MBsorlowGBs.NOTE:Fewerextraeditsmayberetainedthanvaluespecifiedforthis
settingifdoingsowouldmeanthatmoresegmentswouldberetainedthanthenumber
configuredbydfs.namenode.max.extra.edits.segments.retained.
Themaximumnumberofextraeditlogsegmentswhichshouldberetainedbeyondwhat
isminimallynecessaryforaNNrestart.Whenusedinconjunctionwith
dfs.namenode.num.extra.edits.retained,thisconfigurationpropertyservestocapthe
numberofextraeditsfilestoareasonablevalue.
dfs.namenode.delegation.key.updateinterval
dfs.namenode.delegation.token.maxlifetime
86400000
604800000
Theupdateintervalformasterkeyfordelegationtokensinthenamenodeinmilliseconds.
Themaximumlifetimeinmillisecondsforwhichadelegationtokenisvalid.
dfs.namenode.delegation.token.renewinterval
86400000
Therenewalintervalfordelegationtokeninmilliseconds.
dfs.datanode.failed.volumes.tolerated
Thenumberofvolumesthatareallowedtofailbeforeadatanodestopsofferingservice.
Bydefaultanyvolumefailurewillcauseadatanodetoshutdown.
dfs.image.compress
false
dfs.image.compression.codec
org.apache.hadoop.io.compress.DefaultCodec
dfs.image.transfer.timeout
60000
Shouldthedfsimagebecompressed?
Ifthedfsimageiscompressed,howshouldtheybecompressed?Thishastobeacodec
definedinio.compression.codecs.
Sockettimeoutforimagetransferinmilliseconds.Thistimeoutandtherelated
dfs.image.transfer.bandwidthPerSecparametershouldbeconfiguredsuchthatnormal
imagetransfercancompletesuccessfully.Thistimeoutpreventsclienthangswhenthe
senderfailsduringimagetransfer.Thisissockettimeoutduringimagetranfer.
dfs.image.transfer.bandwidthPerSec
Maximumbandwidthusedforimagetransferinbytespersecond.Thiscanhelpkeep
normalnamenodeoperationsresponsiveduringcheckpointing.Themaximumbandwidth
andtimeoutindfs.image.transfer.timeoutshouldbesetsuchthatnormalimagetransfers
cancompletesuccessfully.Adefaultvalueof0indicatesthatthrottlingisdisabled.
dfs.image.transfer.chunksize
65536
Chunksizeinbytestouploadthecheckpoint.Chunkedstreamingisusedtoavoidinternal
bufferingofcontentsofimagefileofhugesize.
dfs.namenode.support.allow.format
true
DoesHDFSnamenodeallowitselftobeformatted?Youmayconsidersettingthistofalse
foranyproductioncluster,toavoidanypossibilityofformattingarunningDFS.
dfs.datanode.max.transfer.threads
4096
Specifiesthemaximumnumberofthreadstousefortransferringdatainandoutofthe
DN.
dfs.datanode.scan.period.hours
dfs.block.scanner.volume.bytes.per.second
1048576
dfs.datanode.readahead.bytes
4193404
dfs.datanode.drop.cache.behind.reads
false
dfs.datanode.drop.cache.behind.writes
false
Ifthisis0ornegative,theDataNode'sblockscannerwillbedisabled.Ifthisispositive,
theDataNodewillnotscananyindividualblockmorethanonceinthespecifiedscan
period.
Ifthisis0,theDataNode'sblockscannerwillbedisabled.Ifthisispositive,thisisthe
numberofbytespersecondthattheDataNode'sblockscannerwilltrytoscanfromeach
volume.
Whilereadingblockfiles,iftheHadoopnativelibrariesareavailable,thedatanodecan
usetheposix_fadvisesystemcalltoexplicitlypagedataintotheoperatingsystembuffer
cacheaheadofthecurrentreader'sposition.Thiscanimproveperformanceespecially
whendisksarehighlycontended.Thisconfigurationspecifiesthenumberofbytesahead
ofthecurrentreadpositionwhichthedatanodewillattempttoreadahead.Thisfeature
maybedisabledbyconfiguringthispropertyto0.Ifthenativelibrariesarenotavailable,
thisconfigurationhasnoeffect.
Insomeworkloads,thedatareadfromHDFSisknowntobesignificantlylargeenough
thatitisunlikelytobeusefultocacheitintheoperatingsystembuffercache.Inthiscase,
theDataNodemaybeconfiguredtoautomaticallypurgealldatafromthebuffercache
afteritisdeliveredtotheclient.Thisbehaviorisautomaticallydisabledforworkloads
whichreadonlyshortsectionsofablock(e.gHBaserandomIOworkloads).Thismay
improveperformanceforsomeworkloadsbyfreeingbuffercachespageusageformore
cacheabledata.IftheHadoopnativelibrariesarenotavailable,thisconfigurationhasno
effect.
Insomeworkloads,thedatawrittentoHDFSisknowntobesignificantlylargeenough
thatitisunlikelytobeusefultocacheitintheoperatingsystembuffercache.Inthiscase,
theDataNodemaybeconfiguredtoautomaticallypurgealldatafromthebuffercache
afteritiswrittentodisk.Thismayimproveperformanceforsomeworkloadsbyfreeing
dfs.datanode.sync.behind.writes
false
dfs.client.failover.max.attempts
15
dfs.client.failover.sleep.base.millis
500
buffercachespageusageformorecacheabledata.IftheHadoopnativelibrariesarenot
available,thisconfigurationhasnoeffect.
Ifthisconfigurationisenabled,thedatanodewillinstructtheoperatingsystemtoenqueue
allwrittendatatothediskimmediatelyafteritiswritten.ThisdiffersfromtheusualOS
policywhichmaywaitforupto30secondsbeforetriggeringwriteback.Thismay
improveperformanceforsomeworkloadsbysmoothingtheIOprofilefordatawrittento
disk.IftheHadoopnativelibrariesarenotavailable,thisconfigurationhasnoeffect.
Expertonly.Thenumberofclientfailoverattemptsthatshouldbemadebeforethe
failoverisconsideredfailed.
Expertonly.Thetimetowait,inmilliseconds,betweenfailoverattemptsincreases
exponentiallyasafunctionofthenumberofattemptsmadesofar,witharandomfactorof
+/50%.Thisoptionspecifiesthebasevalueusedinthefailovercalculation.Thefirst
failoverwillretryimmediately.The2ndfailoverattemptwilldelayatleast
dfs.client.failover.sleep.base.millismilliseconds.Andsoon.
dfs.client.failover.sleep.max.millis
15000
Expertonly.Thetimetowait,inmilliseconds,betweenfailoverattemptsincreases
exponentiallyasafunctionofthenumberofattemptsmadesofar,witharandomfactorof
+/50%.Thisoptionspecifiesthemaximumvaluetowaitbetweenfailovers.Specifically,
thetimebetweentwofailoverattemptswillnotexceed+/50%of
dfs.client.failover.sleep.max.millismilliseconds.
dfs.client.failover.connection.retries
Expertonly.IndicatesthenumberofretriesafailoverIPCclientwillmaketoestablisha
serverconnection.
dfs.client.failover.connection.retries.on.timeouts
Expertonly.ThenumberofretryattemptsafailoverIPCclientwillmakeonsocket
timeoutwhenestablishingaserverconnection.
30
Expertonly.Thetimetowait,inseconds,fromreceptionofandatanodeshutdown
notificationforquickrestart,untildeclaringthedatanodedeadandinvokingthenormal
recoverymechanisms.Thenotificationissentbyadatanodewhenitisbeingshutdown
usingtheshutdownDatanodeadmincommandwiththeupgradeoption.
dfs.client.datanoderestart.timeout
dfs.nameservices
Commaseparatedlistofnameservices.
TheIDofthisnameservice.IfthenameserviceIDisnotconfiguredormorethanone
nameserviceisconfiguredfordfs.nameservicesitisdeterminedautomaticallyby
matchingthelocalnode'saddresswiththeconfiguredaddress.
dfs.nameservice.id
dfs.internal.nameservices
Commaseparatedlistofnameservicesthatbelongtothiscluster.Datanodewillreportto
allthenameservicesinthislist.Bydefaultthisissettothevalueofdfs.nameservices.
dfs.ha.namenodes.EXAMPLENAMESERVICE
Theprefixforagivennameservice,containsacommaseparatedlistofnamenodesfora
givennameservice(egEXAMPLENAMESERVICE).
dfs.ha.namenode.id
TheIDofthisnamenode.IfthenamenodeIDisnotconfigureditisdetermined
automaticallybymatchingthelocalnode'saddresswiththeconfiguredaddress.
dfs.ha.logroll.period
120
Howoften,inseconds,theStandbyNodeshouldasktheactivetorolleditlogs.Sincethe
StandbyNodeonlyreadsfromfinalizedlogsegments,theStandbyNodewillonlybeas
uptodateashowoftenthelogsarerolled.Notethatfailovertriggersalogrollsothe
StandbyNodewillbeuptodatebeforeitbecomesactive.
dfs.ha.tailedits.period
60
Howoften,inseconds,theStandbyNodeshouldcheckfornewfinalizedlogsegmentsin
thesharededitslog.
dfs.ha.automaticfailover.enabled
false
Whetherautomaticfailoverisenabled.SeetheHDFSHighAvailabilitydocumentation
fordetailsonautomaticHAconfiguration.
dfs.support.append
dfs.client.use.datanode.hostname
true
false
DoesHDFSallowappendstofiles?
Whetherclientsshouldusedatanodehostnameswhenconnectingtodatanodes.
dfs.datanode.use.datanode.hostname
false
dfs.client.local.interfaces
dfs.datanode.shared.file.descriptor.paths
/dev/shm,/tmp
dfs.short.circuit.shared.memory.watcher.interrupt.check.ms 60000
dfs.namenode.kerberos.internal.spnego.principal
${dfs.web.authentication.kerberos.principal}
dfs.secondary.namenode.kerberos.internal.spnego.principal
${dfs.web.authentication.kerberos.principal}
dfs.namenode.avoid.read.stale.datanode
false
dfs.namenode.avoid.write.stale.datanode
false
dfs.namenode.stale.datanode.interval
30000
dfs.namenode.write.stale.datanode.ratio
0.5f
dfs.namenode.invalidate.work.pct.per.iteration
0.32f
dfs.namenode.replication.work.multiplier.per.iteration
Whetherdatanodesshouldusedatanodehostnameswhenconnectingtootherdatanodes
fordatatransfer.
Acommaseparatedlistofnetworkinterfacenamestousefordatatransferbetweenthe
clientanddatanodes.Whencreatingaconnectiontoreadfromorwritetoadatanode,the
clientchoosesoneofthespecifiedinterfacesatrandomandbindsitssockettotheIPof
thatinterface.Individualnamesmaybespecifiedaseitheraninterfacename(eg"eth0"),a
subinterfacename(eg"eth0:0"),oranIPaddress(whichmaybespecifiedusingCIDR
notationtomatcharangeofIPs).
Acommaseparatedlistofpathstousewhencreatingfiledescriptorsthatwillbeshared
betweentheDataNodeandtheDFSClient.Typicallyweuse/dev/shm,sothatthefile
descriptorswillnotbewrittentodisk.Systemsthatdon'thave/dev/shmwillfallbackto
/tmpbydefault.
Thelengthoftimeinmillisecondsthattheshortcircuitsharedmemorywatcherwillgo
betweencheckingforjavainterruptionssentfromotherthreads.Thisisprovidedmainly
forunittests.
Indicatewhetherornottoavoidreadingfrom"stale"datanodeswhoseheartbeatmessages
havenotbeenreceivedbythenamenodeformorethanaspecifiedtimeinterval.Stale
datanodeswillbemovedtotheendofthenodelistreturnedforreading.See
dfs.namenode.avoid.write.stale.datanodeforasimilarsettingforwrites.
Indicatewhetherornottoavoidwritingto"stale"datanodeswhoseheartbeatmessages
havenotbeenreceivedbythenamenodeformorethanaspecifiedtimeinterval.Writes
willavoidusingstaledatanodesunlessmorethanaconfiguredratio
(dfs.namenode.write.stale.datanode.ratio)ofdatanodesaremarkedasstale.See
dfs.namenode.avoid.read.stale.datanodeforasimilarsettingforreads.
Defaulttimeintervalformarkingadatanodeas"stale",i.e.,ifthenamenodehasnot
receivedheartbeatmsgfromadatanodeformorethanthistimeinterval,thedatanodewill
bemarkedandtreatedas"stale"bydefault.Thestaleintervalcannotbetoosmallsince
otherwisethismaycausetoofrequentchangeofstalestates.Wethussetaminimumstale
intervalvalue(thedefaultvalueis3timesofheartbeatinterval)andguaranteethatthe
staleintervalcannotbelessthantheminimumvalue.Astaledatanodeisavoidedduring
lease/blockrecovery.Itcanbeconditionallyavoidedforreads(see
dfs.namenode.avoid.read.stale.datanode)andforwrites(see
dfs.namenode.avoid.write.stale.datanode).
Whentheratioofnumberstaledatanodestototaldatanodesmarkedisgreaterthanthis
ratio,stopavoidingwritingtostalenodessoastopreventcausinghotspots.
*Note*:Advancedproperty.Changewithcaution.Thisdeterminesthepercentageamount
ofblockinvalidations(deletes)todooverasingleDNheartbeatdeletioncommand.The
finaldeletioncountisdeterminedbyapplyingthispercentagetothenumberoflivenodes
inthesystem.Theresultantnumberisthenumberofblocksfromthedeletionlistchosen
forproperinvalidationoverasingleheartbeatofasingleDN.Valueshouldbeapositive,
nonzeropercentageinfloatnotation(X.Yf),with1.0fmeaning100%.
*Note*:Advancedproperty.Changewithcaution.Thisdeterminesthetotalamountof
blocktransferstobegininparallelataDN,forreplication,whensuchacommandlistis
beingsentoveraDNheartbeatbytheNN.Theactualnumberisobtainedbymultiplying
thismultiplierwiththetotalnumberoflivenodesinthecluster.Theresultnumberisthe
numberofblockstobegintransfersimmediatelyfor,perDNheartbeat.Thisnumbercan
beanypositive,nonzerointeger.
nfs.server.port
2049
SpecifytheportnumberusedbyHadoopNFS.
nfs.mountd.port
4242
nfs.dump.dir
/tmp/.hdfsnfs
SpecifytheportnumberusedbyHadoopmountdaemon.
ThisdirectoryisusedtotemporarilysaveoutoforderwritesbeforewritingtoHDFS.For
eachfile,theoutoforderwritesaredumpedaftertheyareaccumulatedtoexceedcertain
threshold(e.g.,1MB)inmemory.Oneneedstomakesurethedirectoryhasenoughspace.
nfs.rtmax
1048576
nfs.wtmax
1048576
nfs.keytab.file
nfs.kerberos.principal
nfs.allow.insecure.ports
true
dfs.webhdfs.enabled
true
hadoop.fuse.connection.timeout
300
hadoop.fuse.timer.period
dfs.metrics.percentiles.intervals
dfs.encrypt.data.transfer
false
dfs.encrypt.data.transfer.algorithm
dfs.encrypt.data.transfer.cipher.suites
dfs.encrypt.data.transfer.cipher.key.bitlength
dfs.trustedchannel.resolver.class
128
ThisisthemaximumsizeinbytesofaREADrequestsupportedbytheNFSgateway.If
youchangethis,makesureyoualsoupdatethenfsmount'srsize(addrsize=#ofbytesto
themountdirective).
ThisisthemaximumsizeinbytesofaWRITErequestsupportedbytheNFSgateway.If
youchangethis,makesureyoualsoupdatethenfsmount'swsize(addwsize=#ofbytes
tothemountdirective).
*Note*:Advancedproperty.Changewithcaution.Thisisthepathtothekeytabfilefor
thehdfsnfsgateway.Thisisrequiredwhentheclusteriskerberized.
*Note*:Advancedproperty.Changewithcaution.Thisisthenameofthekerberos
principal.Thisisrequiredwhentheclusteriskerberized.Itmustbeofthisformat:nfs
gatewayuser/nfsgatewayhost@kerberosrealm
Whensettofalse,clientconnectionsoriginatingfromunprivilegedports(thoseabove
1023)willberejected.ThisistoensurethatclientsconnectingtothisNFSGatewaymust
havehadrootprivilegeonthemachinewherethey'reconnectingfrom.
EnableWebHDFS(RESTAPI)inNamenodesandDatanodes.
Theminimumnumberofsecondsthatwe'llcachelibhdfsconnectionobjectsinfuse_dfs.
Lowervalueswillresultinlowermemoryconsumptionhighervaluesmayspeedup
accessbyavoidingtheoverheadofcreatingnewconnectionobjects.
Thenumberofsecondsbetweencacheexpirychecksinfuse_dfs.Lowervalueswillresult
infuse_dfsnoticingchangestoKerberosticketcachesmorequickly.
Commadelimitedsetofintegersdenotingthedesiredrolloverintervals(inseconds)for
percentilelatencymetricsontheNamenodeandDatanode.Bydefault,percentilelatency
metricsaredisabled.
Whetherornotactualblockdatathatisread/writtenfrom/toHDFSshouldbeencrypted
onthewire.ThisonlyneedstobesetontheNNandDNs,clientswilldeducethis
automatically.Itispossibletooverridethissettingperconnectionbyspecifyingcustom
logicviadfs.trustedchannel.resolver.class.
Thisvaluemaybesettoeither"3des"or"rc4".Ifnothingisset,thentheconfiguredJCE
defaultonthesystemisused(usually3DES.)Itiswidelybelievedthat3DESismore
cryptographicallysecure,butRC4issubstantiallyfaster.NotethatifAESissupportedby
boththeclientandserverthenthisencryptionalgorithmwillonlybeusedtoinitially
transferkeysforAES.(Seedfs.encrypt.data.transfer.cipher.suites.)
ThisvaluemaybeeitherundefinedorAES/CTR/NoPadding.Ifdefined,then
dfs.encrypt.data.transferusesthespecifiedciphersuitefordataencryption.Ifnotdefined,
thenonlythealgorithmspecifiedindfs.encrypt.data.transfer.algorithmisused.By
default,thepropertyisnotdefined.
Thekeybitlengthnegotiatedbydfsclientanddatanodeforencryption.Thisvaluemaybe
settoeither128,192or256.
TrustedChannelResolverisusedtodeterminewhetherachannelistrustedforplaindata
transfer.TheTrustedChannelResolverisinvokedonbothclientandserverside.Ifthe
resolverindicatesthatthechannelistrusted,thenthedatatransferwillnotbeencrypted
evenifdfs.encrypt.data.transferissettotrue.Thedefaultimplementationreturnsfalse
indicatingthatthechannelisnottrusted.
AcommaseparatedlistofSASLprotectionvaluesusedforsecuredconnectionstothe
DataNodewhenreadingorwritingblockdata.Possiblevaluesareauthentication,integrity
andprivacy.authenticationmeansauthenticationonlyandnointegrityorprivacy
integrityimpliesauthenticationandintegrityareenabledandprivacyimpliesallof
authentication,integrityandprivacyareenabled.Ifdfs.encrypt.data.transferissettotrue,
thenitsupersedesthesettingfordfs.data.transfer.protectionandenforcesthatall
connectionsmustuseaspecializedencryptedSASLhandshake.Thispropertyisignored
forconnectionstoaDataNodelisteningonaprivilegedport.Inthiscase,itisassumed
thattheuseofaprivilegedportestablishessufficienttrust.
SaslPropertiesResolverusedtoresolvetheQOPusedforaconnectiontotheDataNode
whenreadingorwritingblockdata.Ifnotspecified,thevalueof
hadoop.security.saslproperties.resolver.classisusedasthedefaultvalue.
dfs.data.transfer.protection
dfs.data.transfer.saslproperties.resolver.class
dfs.datanode.hdfsblocksmetadata.enabled
false
dfs.client.fileblockstoragelocations.numthreads
10
dfs.client.fileblockstoragelocations.timeout.millis
1000
dfs.journalnode.rpcaddress
0.0.0.0:8485
dfs.journalnode.httpaddress
0.0.0.0:8480
dfs.journalnode.httpsaddress
0.0.0.0:8481
dfs.namenode.audit.loggers
default
dfs.datanode.availablespacevolumechoosing
policy.balancedspacethreshold
10737418240
dfs.datanode.availablespacevolumechoosing
policy.balancedspacepreferencefraction
0.75f
dfs.namenode.edits.noeditlogchannelflush
false
Booleanwhichenablesbackenddatanodesidesupportfortheexperimental
DistributedFileSystem#getFileVBlockStorageLocationsAPI.
NumberofthreadsusedformakingparallelRPCsin
DistributedFileSystem#getFileBlockStorageLocations().
Timeout(inmilliseconds)fortheparallelRPCsmadein
DistributedFileSystem#getFileBlockStorageLocations().
TheJournalNodeRPCserveraddressandport.
TheaddressandporttheJournalNodeHTTPserverlistenson.Iftheportis0thenthe
serverwillstartonafreeport.
TheaddressandporttheJournalNodeHTTPSserverlistenson.Iftheportis0thenthe
serverwillstartonafreeport.
Listofclassesimplementingauditloggersthatwillreceiveauditevents.Theseshouldbe
implementationsoforg.apache.hadoop.hdfs.server.namenode.AuditLogger.Thespecial
value"default"canbeusedtoreferencethedefaultauditlogger,whichusesthe
configuredlogsystem.Installingcustomauditloggersmayaffecttheperformanceand
stabilityoftheNameNode.Refertothecustomlogger'sdocumentationformoredetails.
Onlyusedwhenthedfs.datanode.fsdataset.volume.choosing.policyissetto
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
ThissettingcontrolshowmuchDNvolumesareallowedtodifferintermsofbytesoffree
diskspacebeforetheyareconsideredimbalanced.Ifthefreespaceofallthevolumesare
withinthisrangeofeachother,thevolumeswillbeconsideredbalancedandblock
assignmentswillbedoneonapureroundrobinbasis.
Onlyusedwhenthedfs.datanode.fsdataset.volume.choosing.policyissetto
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
Thissettingcontrolswhatpercentageofnewblockallocationswillbesenttovolumes
withmoreavailablediskspacethanothers.Thissettingshouldbeintherange0.01.0,
thoughinpractice0.51.0,sincethereshouldbenoreasontopreferthatvolumeswith
lessavailablediskspacereceivemoreblockallocations.
Specifieswhethertoflusheditlogfilechannel.Whenset,expensiveFileChannel#force
callsareskippedandsynchronousdiskwritesareenabledinsteadbyopeningtheeditlog
filewithRandomAccessFile("rws")flags.Thiscansignificantlyimprovetheperformance
ofeditlogwritesontheWindowsplatform.Notethatthebehaviorofthe"rws"flagsis
platformandhardwarespecificandmightnotprovidethesamelevelofguaranteesas
FileChannel#force.Forexample,thewritewillskipthediskcacheonSASandSCSI
deviceswhileitmightnotonSATAdevices.Thisisanexpertlevelsetting,changewith
caution.
Justlikedfs.datanode.drop.cache.behind.writes,thissettingcausesthepagecachetobe
droppedbehindHDFSwrites,potentiallyfreeingupmorememoryforotheruses.Unlike
dfs.datanode.drop.cache.behind.writes,thisisaclientsidesettingratherthanasettingfor
theentiredatanode.Ifpresent,thissettingwilloverridetheDataNodedefault.Ifthe
nativelibrariesarenotavailabletotheDataNode,thisconfigurationhasnoeffect.
dfs.client.cache.drop.behind.writes
dfs.client.cache.drop.behind.reads
dfs.client.cache.readahead
dfs.namenode.enable.retrycache
true
dfs.namenode.retrycache.expirytime.millis
600000
dfs.namenode.retrycache.heap.percent
0.03f
dfs.client.mmap.enabled
true
dfs.client.mmap.cache.size
256
dfs.client.mmap.cache.timeout.ms
3600000
dfs.client.mmap.retry.timeout.ms
300000
dfs.client.short.circuit.replica.stale.threshold.ms
1800000
dfs.namenode.path.based.cache.block.map.allocation.percent 0.25
Justlikedfs.datanode.drop.cache.behind.reads,thissettingcausesthepagecachetobe
droppedbehindHDFSreads,potentiallyfreeingupmorememoryforotheruses.Unlike
dfs.datanode.drop.cache.behind.reads,thisisaclientsidesettingratherthanasettingfor
theentiredatanode.Ifpresent,thissettingwilloverridetheDataNodedefault.Ifthe
nativelibrariesarenotavailabletotheDataNode,thisconfigurationhasnoeffect.
Whenusingremotereads,thissettingcausesthedatanodetoreadaheadintheblockfile
usingposix_fadvise,potentiallydecreasingI/Owaittimes.Unlike
dfs.datanode.readahead.bytes,thisisaclientsidesettingratherthanasettingfortheentire
datanode.Ifpresent,thissettingwilloverridetheDataNodedefault.Whenusinglocal
reads,thissettingdetermineshowmuchreadaheadwedoinBlockReaderLocal.Ifthe
nativelibrariesarenotavailabletotheDataNode,thisconfigurationhasnoeffect.
Thisenablestheretrycacheonthenamenode.Namenodetracksfornonidempotent
requeststhecorrespondingresponse.Ifaclientretriestherequest,theresponsefromthe
retrycacheissent.Suchoperationsaretaggedwithannotation@AtMostOncein
namenodeprotocols.Itisrecommendedthatthisflagbesettotrue.Settingittofalse,will
resultinclientsgettingfailureresponsestoretriedrequest.Thisflagmustbeenabledin
HAsetupfortransparentfailovers.Theentriesinthecachehaveexpirationtime
configurableusingdfs.namenode.retrycache.expirytime.millis.
Thetimeforwhichretrycacheentriesareretained.
Thisparameterconfigurestheheapsizeallocatedforretrycache(excludingtheresponse
cached).Thiscorrespondstoapproximately4096entriesforevery64MBofnamenode
processjavaheapsize.Assumingretrycacheentryexpirationtime(configuredusing
dfs.namenode.retrycache.expirytime.millis)of10minutes,thisenablesretrycacheto
support7operationspersecondsustainedfor10minutes.Astheheapsizeisincreased,
theoperationratelinearlyincreases.
Ifthisissettofalse,theclientwon'tattempttoperformmemorymappedreads.
Whenzerocopyreadsareused,theDFSClientkeepsacacheofrecentlyusedmemory
mappedregions.Thisparametercontrolsthemaximumnumberofentriesthatwewill
keepinthatcache.Thelargerthisnumberis,themorefiledescriptorswewillpotentially
useformemorymappedfiles.mmapedfilesalsousevirtualaddressspace.Youmayneed
toincreaseyourulimitvirtualaddressspacelimitsbeforeincreasingtheclientmmap
cachesize.Notethatyoucanstilldozerocopyreadswhenthissizeissetto0.
Theminimumlengthoftimethatwewillkeepanmmapentryinthecachebetweenuses.
Ifanentryisinthecachelongerthanthis,andnobodyusesit,itwillberemovedbya
backgroundthread.
Theminimumamountoftimethatwewillwaitbeforeretryingafailedmmapoperation.
Themaximumamountoftimethatwewillconsiderashortcircuitreplicatobevalid,if
thereisnocommunicationfromtheDataNode.Afterthistimehaselapsed,wewillre
fetchtheshortcircuitreplicaevenifitisinthecache.
ThepercentageoftheJavaheapwhichwewillallocatetothecachedblocksmap.The
cachedblocksmapisahashmapwhichuseschainedhashing.Smallermapsmaybe
accessedmoreslowlyifthenumberofcachedblocksislargelargermapswillconsume
morememory.
Theamountofmemoryinbytestouseforcachingofblockreplicasinmemoryonthe
datanode.Thedatanode'smaximumlockedmemorysoftulimit(RLIMIT_MEMLOCK)
dfs.datanode.max.locked.memory
dfs.namenode.list.cache.directives.num.responses
100
dfs.namenode.list.cache.pools.num.responses
100
dfs.namenode.path.based.cache.refresh.interval.ms
30000
dfs.namenode.path.based.cache.retry.interval.ms
30000
dfs.datanode.fsdatasetcache.max.threads.per.volume
dfs.cachereport.intervalMsec
10000
dfs.namenode.edit.log.autoroll.multiplier.threshold
2.0
dfs.namenode.edit.log.autoroll.check.interval.ms
dfs.webhdfs.user.provider.user.pattern
300000
^[AZaz_][AZaz09._]*[$]?$
dfs.client.context
default
dfs.client.read.shortcircuit
false
dfs.domain.socket.path
mustbesettoatleastthisvalue,elsethedatanodewillabortonstartup.Bydefault,this
parameterissetto0,whichdisablesinmemorycaching.Ifthenativelibrariesarenot
availabletotheDataNode,thisconfigurationhasnoeffect.
ThisvaluecontrolsthenumberofcachedirectivesthattheNameNodewillsendoverthe
wireinresponsetoalistDirectivesRPC.
ThisvaluecontrolsthenumberofcachepoolsthattheNameNodewillsendoverthewire
inresponsetoalistPoolsRPC.
Theamountofmillisecondsbetweensubsequentpathcacherescans.Pathcacherescans
arewhenwecalculatewhichblocksshouldbecached,andonwhatdatanodes.Bydefault,
thisparameterissetto30seconds.
WhentheNameNodeneedstouncachesomethingthatiscached,orcachesomethingthat
isnotcached,itmustdirecttheDataNodestodosobysendingaDNA_CACHEor
DNA_UNCACHEcommandinresponsetoaDataNodeheartbeat.Thisparameter
controlshowfrequentlytheNameNodewillresendthesecommands.
Themaximumnumberofthreadspervolumetouseforcachingnewdataonthedatanode.
ThesethreadsconsumebothI/OandCPU.Thiscanaffectnormaldatanodeoperations.
Determinescachereportingintervalinmilliseconds.Afterthisamountoftime,the
DataNodesendsafullreportofitscachestatetotheNameNode.TheNameNodeusesthe
cachereporttoupdateitsmapofcachedblockstoDataNodelocations.Thisconfiguration
hasnoeffectifinmemorycachinghasbeendisabledbysetting
dfs.datanode.max.locked.memoryto0(whichisthedefault).Ifthenativelibrariesarenot
availabletotheDataNode,thisconfigurationhasnoeffect.
Determineswhenanactivenamenodewillrollitsowneditlog.Theactualthreshold(in
numberofedits)isdeterminedbymultiplyingthisvalueby
dfs.namenode.checkpoint.txns.Thispreventsextremelylargeeditfilesfromaccumulating
ontheactivenamenode,whichcancausetimeoutsduringnamenodestartupandposean
administrativehassle.Thisbehaviorisintendedasafailsafeforwhenthestandbyor
secondarynamenodefailtorolltheeditlogbythenormalcheckpointthreshold.
Howoftenanactivenamenodewillcheckifitneedstorollitseditlog,inmilliseconds.
Validpatternforuserandgroupnamesforwebhdfs,itmustbeavalidjavaregex.
ThenameoftheDFSClientcontextthatweshoulduse.Clientsthatshareacontextshare
asocketcacheandshortcircuitcache,amongotherthings.Youshouldonlychangethisif
youdon'twanttosharewithanothersetofthreads.
Thisconfigurationparameterturnsonshortcircuitlocalreads.
Optional.ThisisapathtoaUNIXdomainsocketthatwillbeusedforcommunication
betweentheDataNodeandlocalHDFSclients.Ifthestring"_PORT"ispresentinthis
path,itwillbereplacedbytheTCPportoftheDataNode.
Ifthisconfigurationparameterisset,shortcircuitlocalreadswillskipchecksums.Thisis
normallynotrecommended,butitmaybeusefulforspecialsetups.Youmightconsider
usingthisifyouaredoingyourownchecksummingoutsideofHDFS.
dfs.client.read.shortcircuit.skip.checksum
false
dfs.client.read.shortcircuit.streams.cache.size
256
TheDFSClientmaintainsacacheofrecentlyopenedfiledescriptors.Thisparameter
controlsthesizeofthatcache.Settingthishigherwillusemorefiledescriptors,but
potentiallyprovidebetterperformanceonworkloadsinvolvinglotsofseeks.
dfs.client.read.shortcircuit.streams.cache.expiry.ms
300000
Thiscontrolstheminimumamountoftimefiledescriptorsneedtositintheclientcache
contextbeforetheycanbeclosedforbeinginactivefortoolong.
dfs.datanode.shared.file.descriptor.paths
/dev/shm,/tmp
Commaseparatedpathstothedirectoryonwhichsharedmemorysegmentsarecreated.
TheclientandtheDataNodeexchangeinformationviathissharedmemorysegment.It
triespathsinorderuntilcreationofsharedmemorysegmentsucceeds.
dfs.client.use.legacy.blockreader.local
false
dfs.block.localpathaccess.user
dfs.client.domain.socket.data.traffic
false
dfs.namenode.rejectunresolveddntopologymapping
false
dfs.client.slow.io.warning.threshold.ms
30000
dfs.datanode.slow.io.warning.threshold.ms
300
dfs.namenode.xattrs.enabled
dfs.namenode.fslimits.maxxattrsperinode
dfs.namenode.fslimits.maxxattrsize
true
32
16384
dfs.namenode.startup.delay.block.deletion.sec
dfs.namenode.list.encryption.zones.num.responses
100
dfs.namenode.inotify.max.events.per.rpc
1000
dfs.user.home.dir.prefix
/user
dfs.datanode.cache.revocation.timeout.ms
900000
dfs.datanode.cache.revocation.polling.ms
500
dfs.datanode.block.id.layout.upgrade.threads
12
dfs.encryption.key.provider.uri
dfs.storage.policy.enabled
true
LegacyshortcircuitreaderimplementationbasedonHDFS2246isusedifthis
configurationparameteristrue.ThisisfortheplatformsotherthanLinuxwherethenew
implementationbasedonHDFS347isnotavailable.
Commaseparatedlistoftheusersallowdtoopenblockfilesonlegacyshortcircuitlocal
read.
ThiscontrolwhetherwewilltrytopassnormaldatatrafficoverUNIXdomainsocket
ratherthanoverTCPsocketonnodelocaldatatransfer.Thisiscurrentlyexperimental
andturnedoffbydefault.
Ifthevalueissettotrue,thennamenodewillrejectdatanoderegistrationifthetopology
mappingforadatanodeisnotresolvedandNULLisreturned(scriptdefinedby
net.topology.script.file.namefailstoexecute).Otherwise,datanodewillberegisteredand
thedefaultrackwillbeassignedasthetopologypath.Topologypathsareimportantfor
dataresiliency,sincetheydefinefaultdomains.Thusitmaybeunwantedbehaviorto
allowdatanoderegistrationwiththedefaultrackiftheresolvingtopologyfailed.
Thethresholdinmillisecondsatwhichwewilllogaslowiowarninginadfsclient.By
default,thisparameterissetto30000milliseconds(30seconds).
Thethresholdinmillisecondsatwhichwewilllogaslowiowarninginadatanode.By
default,thisparameterissetto300milliseconds.
WhethersupportforextendedattributesisenabledontheNameNode.
Maximumnumberofextendedattributesperinode.
Themaximumcombinedsizeofthenameandvalueofanextendedattributeinbytes.
ThedelayinsecondsatwhichwewillpausetheblocksdeletionafterNamenodestartup.
Bydefaultit'sdisabled.Inthecaseadirectoryhaslargenumberofdirectoriesandfiles
aredeleted,suggesteddelayisonehourtogivetheadministratorenoughtimetonotice
largenumberofpendingdeletionblocksandtakecorrectiveaction.
Whenlistingencryptionzones,themaximumnumberofzonesthatwillbereturnedina
batch.Fetchingthelistincrementallyinbatchesimprovesnamenodeperformance.
MaximumnumberofeventsthatwillbesenttoaninotifyclientinasingleRPCresponse.
ThedefaultvalueattemptstoamortizeawaytheoverheadforthisRPCwhileavoiding
hugememoryrequirementsfortheclientandNameNode(1000eventsshouldconsumeno
morethan1MB.)
Thedirectorytoprependtousernametogettheuser'shomedirecotry.
WhentheDFSClientreadsfromablockfilewhichtheDataNodeiscaching,the
DFSClientcanskipverifyingchecksums.TheDataNodewillkeeptheblockfileincache
untiltheclientisdone.Iftheclienttakesanunusuallylongtime,though,theDataNode
mayneedtoevicttheblockfilefromthecacheanyway.Thisvaluecontrolshowlongthe
DataNodewillwaitfortheclienttoreleaseareplicathatitisreadingwithoutchecksums.
HowoftentheDataNodeshouldpolltoseeiftheclientshavestoppedusingareplicathat
theDataNodewantstouncache.
Thenumberofthreadstousewhencreatinghardlinksfromcurrenttopreviousblocks
duringupgradeofaDataNodetoblockIDbasedblocklayout(seeHDFS6482fordetails
onthelayout).
TheKeyProvidertousewheninteractingwithencryptionkeysusedwhenreadingand
writingtoanencryptionzone.
Allowuserstochangethestoragepolicyonfilesanddirectories.
Determineswheretosavethenamespaceintheoldfsimageformatduringcheckpointing
bystandbyNameNodeorSecondaryNameNode.Userscandumpthecontentsoftheold
dfs.namenode.legacyoivimage.dir
formatfsimagebyoiv_legacycommand.Ifthevalueisnotspecified,oldformatfsimage
willnotbesavedincheckpoint.
dfs.namenode.top.enabled
dfs.namenode.top.window.num.buckets
dfs.namenode.top.num.users
true
10
10
Enablenntop:reportingtopusersonnamenode
Numberofbucketsintherollingwindowimplementationofnntop
Numberoftopusersreturnedbythetoptool
dfs.namenode.top.windows.minutes
dfs.namenode.blocks.per.postponedblocks.rescan
1,5,25
10000
commaseparatedlistofnntopreportingperiodsinminutes
NumberofblockstorescanforeachiterationofpostponedMisreplicatedBlocks.