Hive Function Cheat Sheet

(http://qubole2.wpengine.com/wp-content/uploads/2014/01/hive-function-cheatsheet.pdf)

Hive Function Meta Commands

SHOW FUNCTIONS
  Lists Hive functions and operators.
DESCRIBE FUNCTION [function name]
  Displays a short description of the function.
DESCRIBE FUNCTION EXTENDED [function name]
  Accesses the extended description of the function.

Types of Hive Functions

UDF
  A function that takes one or more columns from a row as arguments and returns a single value or object, e.g. concat(col1, col2).
UDTF
  Takes zero or more inputs and produces multiple columns or rows of output, e.g. explode().
Macro
  A function that uses other Hive functions.

How To Develop UDFs

    package org.apache.hadoop.hive.contrib.udf.example;

    import java.util.Date;
    import java.text.SimpleDateFormat;
    import org.apache.hadoop.hive.ql.exec.Description;  // needed for the @Description annotation
    import org.apache.hadoop.hive.ql.exec.UDF;

    // Template only: replace InputDataType and the ".." placeholders with real code.
    @Description(name = "YourUDFName",
        value = "_FUNC_(InputDataType) using the input datatype X argument, "
            + "returns YYY.",
        extended = "Example:\n"
            + "  > SELECT _FUNC_(InputDataType) FROM tablename")
    public class YourUDFName extends UDF {
      ..
      public YourUDFName(InputDataType InputValue) {
        ..
      }
      // Hive calls evaluate() once per row.
      public String evaluate(InputDataType InputValue) {
        ..
      }
    }

How To Develop UDFs, GenericUDFs, UDAFs, and UDTFs

    public class YourUDFName extends UDF {..}
    public class YourGenericUDFName extends GenericUDF {..}
    public class YourGenericUDAFName extends AbstractGenericUDAFResolver {..}
    public class YourGenericUDTFName extends GenericUDTF {..}

How To Deploy / Drop UDFs

At the start of each session:

    ADD JAR /full_path_to_jar/YourUDFName.jar;
    CREATE TEMPORARY FUNCTION YourUDFName AS 'org.apache.hadoop.hive.contrib.udf.example.YourUDFName';

At the end of each session:

    DROP TEMPORARY FUNCTION IF EXISTS YourUDFName;

Date Functions

The following built-in date functions are supported in Hive:

from_unixtime(bigint unixtime[, string format])
  Return type: string
  Converts the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone, in the format of '1970-01-01 00:00:00'.

unix_timestamp()
  Return type: bigint
  Gets the current timestamp using the default time zone.

unix_timestamp(string date)
  Return type: bigint
  Converts a time string in the format yyyy-MM-dd HH:mm:ss to a Unix timestamp; returns 0 on failure.
  Example: unix_timestamp('2009-03-20 11:30:01') = 1237573801

unix_timestamp(string date, string pattern)
  Return type: bigint
  Converts a time string with the given pattern to a Unix timestamp; returns 0 on failure.
  Example: unix_timestamp('2009-03-20', 'yyyy-MM-dd') = 1237532400

to_date(string timestamp)
  Return type: string
  Returns the date part of a timestamp string.
  Example: to_date('1970-01-01 00:00:00') = '1970-01-01'

year(string date)
  Return type: int
  Returns the year part of a date or a timestamp string.
  Example: year('1970-01-01 00:00:00') = 1970, year('1970-01-01') = 1970

month(string date)
  Return type: int
  Returns the month part of a date or a timestamp string.
  Example: month('1970-11-01 00:00:00') = 11, month('1970-11-01') = 11

day(string date), dayofmonth(date)
  Return type: int
  Returns the day part of a date or a timestamp string.
  Example: day('1970-11-01 00:00:00') = 1, day('1970-11-01') = 1

hour(string date)
  Return type: int
  Returns the hour of the timestamp.
  Example: hour('2009-07-30 12:58:59') = 12, hour('12:58:59') = 12

minute(string date)
  Return type: int
  Returns the minute of the timestamp.

second(string date)
  Return type: int
  Returns the second of the timestamp.

weekofyear(string date)
  Return type: int
  Returns the week number of a timestamp string.
  Example: weekofyear('1970-11-01 00:00:00') = 44, weekofyear('1970-11-01') = 44

datediff(string enddate, string startdate)
  Return type: int
  Returns the number of days from startdate to enddate.
  Example: datediff('2009-03-01', '2009-02-27') = 2

date_add(string startdate, int days)
  Return type: string
  Adds a number of days to startdate.
  Example: date_add('2008-12-31', 1) = '2009-01-01'

date_sub(string startdate, int days)
  Return type: string
  Subtracts a number of days from startdate.
  Example: date_sub('2008-12-31', 1) = '2008-12-30'

from_utc_timestamp(timestamp, string timezone)
  Return type: timestamp
  Assumes the given timestamp is UTC and converts it to the given timezone (as of Hive 0.8.0).

to_utc_timestamp(timestamp, string timezone)
  Return type: timestamp
  Assumes the given timestamp is in the given timezone and converts it to UTC (as of Hive 0.8.0).
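
The day arithmetic behind datediff, date_add and date_sub can be checked outside Hive. The following is a minimal sketch that reproduces the documented example results with java.time; the class name HiveDateSketch and its method names are hypothetical, and this is not Hive's own implementation.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Illustrative sketch only: mirrors the documented results of Hive's
// datediff, date_add and date_sub for plain yyyy-MM-dd strings.
final class HiveDateSketch {

    // datediff(enddate, startdate): number of days from startdate to enddate.
    static long datediff(String endDate, String startDate) {
        return ChronoUnit.DAYS.between(LocalDate.parse(startDate), LocalDate.parse(endDate));
    }

    // date_add(startdate, days): startdate plus the given number of days.
    static String dateAdd(String startDate, int days) {
        return LocalDate.parse(startDate).plusDays(days).toString();
    }

    // date_sub(startdate, days): startdate minus the given number of days.
    static String dateSub(String startDate, int days) {
        return LocalDate.parse(startDate).minusDays(days).toString();
    }

    public static void main(String[] args) {
        System.out.println(datediff("2009-03-01", "2009-02-27")); // prints 2
        System.out.println(dateAdd("2008-12-31", 1));             // prints 2009-01-01
        System.out.println(dateSub("2008-12-31", 1));             // prints 2008-12-30
    }
}
```

Note the argument order of datediff: the end date comes first, so swapping the arguments flips the sign of the result.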

Mathematical Functions

The following built-in mathematical functions are supported in Hive; most return NULL when the argument(s) are NULL:

round(double a)
  Return type: BIGINT
  Returns the rounded BIGINT value of the double.

round(double a, int d)
  Return type: DOUBLE
  Returns the double rounded to d decimal places.

floor(double a)
  Return type: BIGINT
  Returns the maximum BIGINT value that is equal to or less than the double.

ceil(double a), ceiling(double a)
  Return type: BIGINT
  Returns the minimum BIGINT value that is equal to or greater than the double.

rand(), rand(int seed)
  Return type: double
  Returns a random number (that changes from row to row) that is distributed uniformly from 0 to 1. Specifying the seed makes the generated random number sequence deterministic.

exp(double a)
  Return type: double
  Returns e^a, where e is the base of the natural logarithm.

ln(double a)
  Return type: double
  Returns the natural logarithm of the argument.

log10(double a)
  Return type: double
  Returns the base-10 logarithm of the argument.

log2(double a)
  Return type: double
  Returns the base-2 logarithm of the argument.

log(double base, double a)
  Return type: double
  Returns the base-"base" logarithm of the argument.

pow(double a, double p), power(double a, double p)
  Return type: double
  Returns a^p.

sqrt(double a)
  Return type: double
  Returns the square root of a.

bin(BIGINT a)
  Return type: string
  Returns the number in binary format.

hex(BIGINT a), hex(string a)
  Return type: string
  If the argument is an int, hex returns the number as a string in hex format. Otherwise, if the argument is a string, it converts each character into its hex representation and returns the resulting string.

unhex(string a)
  Return type: string
  Inverse of hex. Interprets each pair of characters as a hexadecimal number and converts it to the character represented by the number.

conv(BIGINT num, int from_base, int to_base), conv(STRING num, int from_base, int to_base)
  Return type: string
  Converts a number from a given base to another.

abs(double a)
  Return type: double
  Returns the absolute value.

pmod(int a, int b), pmod(double a, double b)
  Return type: int, double
  Returns the positive value of a mod b.

sin(double a)
  Return type: double
  Returns the sine of a (a is in radians).

asin(double a)
  Return type: double
  Returns the arc sine of a if -1 <= a <= 1, or null otherwise.

cos(double a)
  Return type: double
  Returns the cosine of a (a is in radians).

acos(double a)
  Return type: double
  Returns the arc cosine of a if -1 <= a <= 1, or null otherwise.

tan(double a)
  Return type: double
  Returns the tangent of a (a is in radians).

atan(double a)
  Return type: double
  Returns the arctangent of a.

degrees(double a)
  Return type: double
  Converts the value of a from radians to degrees.

radians(double a)
  Return type: double
  Converts the value of a from degrees to radians.

positive(int a), positive(double a)
  Return type: int, double
  Returns a.

negative(int a), negative(double a)
  Return type: int, double
  Returns -a.

sign(double a)
  Return type: float
  Returns the sign of a as '1.0' or '-1.0'.

e()
  Return type: double
  Returns the value of e.

pi()
  Return type: double
  Returns the value of pi.
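
The difference between pmod and a plain remainder is easiest to see with a negative dividend. The sketch below shows one common way to compute a positive modulus, ((a % b) + b) % b; the class name HivePmodSketch is hypothetical, and the code assumes a positive b and ignores integer overflow.

```java
// Illustrative sketch only: a positive modulus built on Java's % operator,
// which by itself keeps the sign of the dividend.
final class HivePmodSketch {

    // Integer variant: shifts a negative remainder into the range [0, b).
    static int pmod(int a, int b) {
        return ((a % b) + b) % b;
    }

    // Double variant: same formula on floating point values.
    static double pmod(double a, double b) {
        return ((a % b) + b) % b;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 3);           // prints -1 (plain remainder)
        System.out.println(pmod(-7, 3));      // prints 2
        System.out.println(pmod(7, 3));       // prints 1
        System.out.println(pmod(-7.5, 2.0));  // prints 0.5
    }
}
```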

String Functions

The following built-in string functions are supported in Hive:

ascii(string str)
  Return type: int
  Returns the numeric value of the first character of str.

concat(string|binary A, string|binary B...)
  Return type: string
  Returns the string or bytes resulting from concatenating the strings or bytes passed in as parameters, in order. e.g. concat('foo', 'bar') results in 'foobar'. Note that this function can take any number of input strings.

context_ngrams(array<array<string>>, array<string>, int K, int pf)
  Return type: array<struct<string,double>>
  Returns the top-k contextual N-grams from a set of tokenized sentences, given a string of "context". See StatisticsAndDataMining for more information.

concat_ws(string SEP, string A, string B...)
  Return type: string
  Like concat() above, but with a custom separator SEP.

concat_ws(string SEP, array<string>)
  Return type: string
  Like concat_ws() above, but taking an array of strings (as of Hive 0.9.0).

find_in_set(string str, string strList)
  Return type: int
  Returns the first occurrence of str in strList, where strList is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas. e.g. find_in_set('ab', 'abc,b,ab,c,def') returns 3.

format_number(number x, int d)
  Return type: string
  Formats the number X to a format like '#,###,###.##', rounded to D decimal places, and returns the result as a string. If D is 0, the result has no decimal point or fractional part (as of Hive 0.10.0).

get_json_object(string json_string, string path)
  Return type: string
  Extracts a json object from a json string based on the json path specified, and returns the json string of the extracted json object. It returns null if the input json string is invalid. NOTE: The json path can only have the characters [0-9a-z_], i.e., no upper-case or special characters. Also, the keys *cannot start with numbers.* This is due to restrictions on Hive column names.

in_file(string str, string filename)
  Return type: boolean
  Returns true if the string str appears as an entire line in filename.

instr(string str, string substr)
  Return type: int
  Returns the position of the first occurrence of substr in str.

length(string A)
  Return type: int
  Returns the length of the string.

locate(string substr, string str[, int pos])
  Return type: int
  Returns the position of the first occurrence of substr in str after position pos.

lower(string A), lcase(string A)
  Return type: string
  Returns the string resulting from converting all characters of A to lower case.

lpad(string str, int len, string pad)
  Return type: string
  Returns str, left-padded with pad to a length of len.

ltrim(string A)
  Return type: string
  Returns the string resulting from trimming spaces from the beginning (left hand side) of A. e.g. ltrim(' foobar ') results in 'foobar '.

ngrams(array<array<string>>, int N, int K, int pf)
  Return type: array<struct<string,double>>
  Returns the top-k N-grams from a set of tokenized sentences, such as those returned by the sentences() UDAF. See StatisticsAndDataMining for more information.

parse_url(string urlString, string partToExtract[, string keyToExtract])
  Return type: string
  Returns the specified part from the URL. Valid values for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, AUTHORITY, FILE, and USERINFO. e.g. parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'HOST') returns 'facebook.com'. The value of a particular key in QUERY can also be extracted by providing the key as the third argument, e.g. parse_url('http://facebook.com/path1/p.php?k1=v1&k2=v2#Ref1', 'QUERY', 'k1') returns 'v1'.

printf(String format, Obj... args)
  Return type: string
  Returns the input formatted according to printf-style format strings (as of Hive 0.9.0).

regexp_extract(string subject, string pattern, int index)
  Return type: string
  Returns the string extracted using the pattern. e.g. regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns 'bar'. Note that some care is necessary in using predefined character classes: using '\s' as the second argument will match the letter s; '\\s' is necessary to match whitespace, etc. The index parameter is the Java regex Matcher group() method index. See docs/api/java/util/regex/Matcher.html for more information on the index or the Java regex group() method.

regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT)
  Return type: string
  Returns the string resulting from replacing all substrings in INITIAL_STRING that match the Java regular expression syntax defined in PATTERN with instances of REPLACEMENT. e.g. regexp_replace('foobar', 'oo|ar', '') returns 'fb'. Note that some care is necessary in using predefined character classes: using '\s' as the second argument will match the letter s; '\\s' is necessary to match whitespace, etc.

repeat(string str, int n)
  Return type: string
  Repeats str n times.

reverse(string A)
  Return type: string
  Returns the reversed string.

rpad(string str, int len, string pad)
  Return type: string
  Returns str, right-padded with pad to a length of len.

rtrim(string A)
  Return type: string
  Returns the string resulting from trimming spaces from the end (right hand side) of A. e.g. rtrim(' foobar ') results in ' foobar'.

sentences(string str, string lang, string locale)
  Return type: array<array<string>>
  Tokenizes a string of natural language text into words and sentences, where each sentence is broken at the appropriate sentence boundary and returned as an array of words. The lang and locale are optional arguments. e.g. sentences('Hello there! How are you?') returns (('Hello', 'there'), ('How', 'are', 'you')).

space(int n)
  Return type: string
  Returns a string of n spaces.

split(string str, string pat)
  Return type: array
  Splits str around pat (pat is a regular expression).

str_to_map(text[, delimiter1, delimiter2])
  Return type: map<string,string>
  Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ',' for delimiter1 and '=' for delimiter2.

substr(string|binary A, int start), substring(string|binary A, int start)
  Return type: string
  Returns the substring or slice of the byte array of A starting from the start position till the end of string A. e.g. substr('foobar', 4) results in 'bar'.

substr(string|binary A, int start, int len), substring(string|binary A, int start, int len)
  Return type: string
  Returns the substring or slice of the byte array of A starting from the start position with length len. e.g. substr('foobar', 4, 1) results in 'b'.

translate(string input, string from, string to)
  Return type: string
  Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string. This is similar to the translate function in PostgreSQL. If any of the parameters to this UDF are NULL, the result is NULL as well (available as of Hive 0.10.0).

trim(string A)
  Return type: string
  Returns the string resulting from trimming spaces from both ends of A. e.g. trim(' foobar ') results in 'foobar'.

upper(string A), ucase(string A)
  Return type: string
  Returns the string resulting from converting all characters of A to upper case. e.g. upper('fOoBaR') results in 'FOOBAR'.
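
Because the index argument of regexp_extract is described as the Java regex Matcher group() index, the documented examples can be reproduced with java.util.regex directly. The following is a minimal sketch under that assumption; the class name HiveRegexSketch and its method names are hypothetical, and returning an empty string when nothing matches is an assumption of this sketch rather than something the cheat sheet states.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: reproduces the documented regexp_extract and
// regexp_replace examples with plain java.util.regex calls.
final class HiveRegexSketch {

    // Returns capture group `index` of the first match (0 is the whole match).
    // Assumption: an empty string is returned when the pattern does not match.
    static String regexpExtract(String subject, String pattern, int index) {
        Matcher m = Pattern.compile(pattern).matcher(subject);
        return m.find() ? m.group(index) : "";
    }

    // Replaces every substring that matches `pattern` with `replacement`.
    static String regexpReplace(String initial, String pattern, String replacement) {
        return Pattern.compile(pattern).matcher(initial).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(regexpExtract("foothebar", "foo(.*?)(bar)", 1)); // prints the
        System.out.println(regexpExtract("foothebar", "foo(.*?)(bar)", 2)); // prints bar
        System.out.println(regexpReplace("foobar", "oo|ar", ""));           // prints fb
        // In Java source, "\\s" is the two-character regex \s (whitespace).
        System.out.println(regexpReplace("a b", "\\s", ""));                // prints ab
    }
}
```

Group 0 is the entire match, so an index of 1 or 2 selects the first or second parenthesized group in the pattern.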

Collection Functions

The following built-in collection functions are supported in Hive:

size(Map)
  Return type: int
  Returns the number of elements in the map type.

size(Array)
  Return type: int
  Returns the number of elements in the array type.

map_keys(Map)
  Return type: array
  Returns an unordered array containing the keys of the input map.

map_values(Map)
  Return type: array
  Returns an unordered array containing the values of the input map.

array_contains(Array, value)
  Return type: boolean
  Returns TRUE if the array contains value.

sort_array(Array)
  Return type: array
  Sorts the input array in ascending order according to the natural ordering of the array elements and returns it (as of version 0.9.0).

Built-in Aggregate Functions (UDAF)

The following built-in aggregate functions are supported in Hive:

count(*), count(expr), count(DISTINCT expr[, expr_.])
  Return type: bigint
  count(*): Returns the total number of retrieved rows, including rows containing NULL values.
  count(expr): Returns the number of rows for which the supplied expression is non-NULL.
  count(DISTINCT expr[, expr]): Returns the number of rows for which the supplied expression(s) are unique and non-NULL.

sum(col), sum(DISTINCT col)
  Return type: double
  Returns the sum of the elements in the group or the sum of the distinct values of the column in the group.

avg(col), avg(DISTINCT col)
  Return type: double
  Returns the average of the elements in the group or the average of the distinct values of the column in the group.

min(col)
  Return type: double
  Returns the minimum of the column in the group.

max(col)
  Return type: double
  Returns the maximum value of the column in the group.

variance(col), var_pop(col)
  Return type: double
  Returns the variance of a numeric column in the group.

var_samp(col)
  Return type: double
  Returns the unbiased sample variance of a numeric column in the group.

stddev_pop(col)
  Return type: double
  Returns the standard deviation of a numeric column in the group.

stddev_samp(col)
  Return type: double
  Returns the unbiased sample standard deviation of a numeric column in the group.

covar_pop(col1, col2)
  Return type: double
  Returns the population covariance of a pair of numeric columns in the group.

covar_samp(col1, col2)
  Return type: double
  Returns the sample covariance of a pair of numeric columns in the group.

corr(col1, col2)
  Return type: double
  Returns the Pearson coefficient of correlation of a pair of numeric columns in the group.

percentile(BIGINT col, p)
  Return type: double
  Returns the exact p-th percentile of a column in the group (does not work with floating point types). p must be between 0 and 1. NOTE: A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral.

percentile(BIGINT col, array(p1[, p2]...))
  Return type: array
  Returns the exact percentiles p1, p2, ... of a column in the group (does not work with floating point types). Each pi must be between 0 and 1. NOTE: A true percentile can only be computed for integer values. Use PERCENTILE_APPROX if your input is non-integral.

percentile_approx(DOUBLE col, p[, B])
  Return type: double
  Returns an approximate p-th percentile of a numeric column (including floating point types) in the group. The B parameter controls approximation accuracy at the cost of memory. Higher values yield better approximations, and the default is 10,000. When the number of distinct values in col is smaller than B, this gives an exact percentile value.

percentile_approx(DOUBLE col, array(p1[, p2]...)[, B])
  Return type: array
  Same as above, but accepts and returns an array of percentile values instead of a single one.

histogram_numeric(col, b)
  Return type: array
  Computes a histogram of a numeric column in the group using b non-uniformly spaced bins. The output is an array of size b of double-valued (x, y) coordinates that represent the bin centers and heights.

collect_set(col)
  Return type: array
  Returns a set of objects with duplicate elements eliminated.

Built-in Table-Generating Functions (UDTF)

Normal user-defined functions, such as concat(), take in a single input row and output a single output row. In contrast, table-generating functions transform a single input row to multiple output rows.

inline(ARRAY<STRUCT[,STRUCT]>)
  Explodes an array of structs into a table (as of Hive 0.10).

explode()
  explode() takes in an array as an input and outputs the elements of the array as separate rows. UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.

Conditional Functions

if(boolean testCondition, T valueTrue, T valueFalseOrNull)
  Returns valueTrue when testCondition is true; returns valueFalseOrNull otherwise.

COALESCE(T v1, T v2, ...)
  Returns the first v that is not NULL, or NULL if all v's are NULL.

CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END
  When a = b, returns c; when a = d, returns e; else returns f.

CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END
  When a = true, returns b; when c = true, returns d; else returns e.

Functions for Text Analytics

ngrams(array<array<string>>, int N, int K, int pf)
  Return type: array<struct<string,double>>
  Returns the top-k N-grams from a set of tokenized sentences, such as those returned by the sentences() UDAF. See StatisticsAndDataMining for more information. N-grams are subsequences of length N drawn from a longer sequence. The purpose of the ngrams() UDAF is to find the k most frequent n-grams from one or more sequences. It can be used in conjunction with the sentences() UDF to analyze unstructured natural language text, or the collect() function to analyze more general string data.

context_ngrams(array<array<string>>, array<string>, int K, int pf)
  Return type: array<struct<string,double>>
  Returns the top-k contextual N-grams from a set of tokenized sentences, given a string of "context". See StatisticsAndDataMining for more information. Contextual n-grams are similar to n-grams, but allow you to specify a "context" string around which n-grams are to be estimated. For example, you can specify that you're only interested in finding the most common two-word phrases in text that follow the context "I love". You could achieve the same result by manually stripping sentences of non-contextual content and then passing them to ngrams(), but context_ngrams() makes it much easier.
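
To make the idea of "the k most frequent n-grams" concrete, here is a minimal sketch that counts n-grams exactly over already tokenized sentences. The class name NgramSketch and the method topNgrams are hypothetical; the sketch does exact counting in memory and does not model the pf parameter or whatever estimation Hive performs internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: exact top-k n-gram counting over tokenized
// sentences. This is not Hive's ngrams() implementation.
final class NgramSketch {

    // Returns up to k (n-gram, count) entries, most frequent first.
    static List<Map.Entry<String, Integer>> topNgrams(List<List<String>> sentences, int n, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> words : sentences) {
            // Slide a window of n words across each sentence.
            for (int i = 0; i + n <= words.size(); i++) {
                String gram = String.join(" ", words.subList(i, i + n));
                counts.merge(gram, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Sort by descending count; the order among ties is unspecified.
        entries.sort((x, y) -> Integer.compare(y.getValue(), x.getValue()));
        return entries.subList(0, Math.min(k, entries.size()));
    }

    public static void main(String[] args) {
        List<List<String>> sentences = List.of(
            List.of("i", "love", "hive"),
            List.of("i", "love", "pig"));
        // The bigram "i love" occurs twice; every other bigram occurs once.
        System.out.println(topNgrams(sentences, 2, 1)); // prints [i love=2]
    }
}
```

A contextual variant would first keep only the windows that follow the context words, which is the manual stripping step the text says context_ngrams() spares you.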


2016 Qubole, Inc. All rights reserved.
