You are on page 1of 1

In Hive.

Let�s have a example first :

CREATE TABLE table_name (id INT, name STRING, published_year INT)


ROW FORMAT DELIMITED
FIELDS TERMINATED BY �,�
STORED AS TEXTFILE;

ROW FORMAT DELIMITED: This line is telling Hive to expect the file to contain one
row per line.
So basically, we are telling Hive that when it finds a new line character that
means is a new records.

FIELDS TERMINATED BY �,�: This is really similar to the one above, but instead of
meaning rows this one means columns,
this way Hive knows what delimiter you are using in your files to separate each
column. If none is set the default will
be used which is ctrl-A

STORED AS TEXTFILE: This is to tell Hive what type of file to expect. The other
type of file that can be consumed is
Sequence Files (Hadoop�s binary file format).

Two types of UDF


1) Regular UDF (User Defined Aggregate Fuction) ---->Applicabe for rows
2) User Defined Aggregate Function (UDAF) ----> Work on group of results

Steps are same to create UDF

Problem Statement-2
--------------------
Find the mean of marks obtained in maths by students

see first thing that we need to remember is that all the java code is written by
the professional
programmers. We will just use the jar file created by the

First we will import all the jar file into the Hive
Then we will run that aggregate function on them
and we will pass

You might also like