
Learning Hadoop 2

The HiveQL language can be extended by means of plugins and third-party functions. In Hive, there are three types of functions characterized by the number of rows they take as input and produce as output:
User Defined Functions (UDFs): simple functions that act on one row at a time, taking a single row as input and producing a single value as output.
User Defined Aggregate Functions (UDAFs): take multiple rows as input and generate a single row as output. These are aggregate functions to be used in conjunction with a GROUP BY statement (similar to COUNT(), AVG(), MIN(), MAX(), and so on).
User Defined Table Functions (UDTFs): take a single row as input and generate a logical table comprising multiple rows that can be used in join expressions (for example, through LATERAL VIEW).
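The difference between the three types is easiest to see as a difference in row cardinality. The following is a minimal plain-Python sketch of those shapes only; it does not use the Hive Java APIs, and the function names (udf_upper, udaf_average, udtf_explode) are hypothetical names chosen for illustration.

```python
from typing import Iterable, Iterator, List


def udf_upper(value: str) -> str:
    """UDF shape: one input row in, one output value out."""
    return value.upper()


def udaf_average(values: Iterable[float]) -> float:
    """UDAF shape: many input rows (one group) in, one output value out."""
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
    # An empty group has no meaningful average; NaN is an arbitrary choice here.
    return total / count if count else float("nan")


def udtf_explode(row: List[str]) -> Iterator[str]:
    """UDTF shape: one input row in, zero or more output rows out."""
    for item in row:
        yield item


if __name__ == "__main__":
    print(udf_upper("hive"))
    print(udaf_average([1.0, 2.0, 3.0]))
    print(list(udtf_explode(["a", "b", "c"])))
```

In Hive itself, the aggregate shape corresponds to functions such as AVG() applied per GROUP BY group, and the table-generating shape to functions such as explode().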
These APIs are provided only in Java. For other languages, it is possible to stream data through a user-defined script using the TRANSFORM, MAP, and REDUCE clauses, which act as a frontend to Hadoop's streaming capabilities.
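As an illustration of the streaming route, the sketch below shows what a script invoked through TRANSFORM might look like. With the default settings, Hive writes each input row to the script's standard input as one line of tab-separated fields and reads tab-separated lines back from its standard output. The script name (text_length.py), the table (tweets), and the columns (user_name, text) are assumptions made up for this example.

```python
"""Hypothetical streaming script for Hive's TRANSFORM clause.

Assumed HiveQL usage (table and column names are illustrative only):

    ADD FILE text_length.py;
    SELECT TRANSFORM (user_name, text)
    USING 'python text_length.py'
    AS (user_name, text_length)
    FROM tweets;
"""
import sys


def transform_line(line: str) -> str:
    """Turn one 'user_name<TAB>text' row into 'user_name<TAB>length of text'."""
    fields = line.rstrip("\n").split("\t")
    user_name, text = fields[0], fields[1]
    return f"{user_name}\t{len(text)}"


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # One input row per line; blank lines are skipped.
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Because the script only reads and writes text, it can be checked locally by piping a tab-separated file into it before referencing it from a query.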
Two APIs are available to write UDFs. A simple API org.apache.hadoop...