
HBase High Performance Cookbook

HBase does not provide a direct pipeline for importing data from relational databases such as Oracle and MySQL. The basic approach is the same in each case: first extract the data into flat text files (in a format ImportTsv can read), then transform those files into HFiles, and finally load the HFiles into HBase by telling the region servers where to find them.
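The extract, transform, and load steps can be sketched as follows. This is a minimal illustration, not the book's own code: it converts CSV text into the tab-separated layout that ImportTsv reads by default, then builds the two command lines for generating HFiles and bulk loading them. The helper names, the `cf` column family, and the paths are hypothetical, and the fully qualified class name of the bulk-load tool differs between HBase versions, so check it against your installation.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert CSV text to tab-separated text, dropping the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header so it is not loaded as a data row
    out = io.StringIO()
    for row in reader:
        # Tabs and line breaks inside a field would break the
        # one-record-per-line layout, so replace them with spaces.
        cleaned = [
            field.replace("\t", " ").replace("\r", " ").replace("\n", " ")
            for field in row
        ]
        out.write("\t".join(cleaned) + "\n")
    return out.getvalue()


def import_commands(table, columns, tsv_dir, hfile_dir):
    """Build the HFile-generation and bulk-load command lines (not executed)."""
    generate = [
        "hbase", "org.apache.hadoop.hbase.mapreduce.ImportTsv",
        "-Dimporttsv.columns=" + ",".join(columns),
        # With bulk.output set, ImportTsv writes HFiles instead of issuing Puts.
        "-Dimporttsv.bulk.output=" + hfile_dir,
        table, tsv_dir,
    ]
    # Assumption: HBase 1.x class location; newer releases moved this tool.
    load = [
        "hbase", "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles",
        hfile_dir, table,
    ]
    return generate, load
```

For example, `import_commands("wdi_country", ["HBASE_ROW_KEY", "cf:short_name"], "/tmp/wdi_tsv", "/tmp/wdi_hfiles")` returns two argument lists that could be passed to `subprocess.run` on a machine where the `hbase` launcher is on the path.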
Let's start by downloading public data from the following URL:
http://databank.worldbank.org/data/download/WDI_csv.zip
The archive contains the following files:
WDI_Data.csv
WDI_Country.csv (this is the file we will use)
WDI_Series.csv
WDI_CS_Notes.csv
WDI_ST_Notes.csv
WDI_Footnotes.csv
WDI_Description.csv
We will use only WDI_Country.csv as our data set; it is freely available from the World Bank site mentioned above.
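Fetching the archive and pulling out the one file we need could look like the sketch below. It uses only the Python standard library; the function names are illustrative, and the UTF-8 decoding is an assumption about how the World Bank encodes the CSV. The download function is defined but not called, since the URL may change over time.

```python
import io
import urllib.request
import zipfile

WDI_URL = "http://databank.worldbank.org/data/download/WDI_csv.zip"


def fetch(url=WDI_URL):
    """Download the archive and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def list_members(zip_bytes):
    """Return the file names stored in a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()


def read_member(zip_bytes, member="WDI_Country.csv"):
    """Return the text of one file from the archive.

    Assumption: the file is UTF-8, possibly with a byte-order mark.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as handle:
            return handle.read().decode("utf-8-sig")
```

Calling `read_member(fetch())` would return the contents of WDI_Country.csv as a string, ready to be loaded into the Oracle table or converted to a tab-separated file.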
Next, create a table in your Oracle schema from the SQL prompt. The column names exactly match those in WDI_Country.csv:
CREATE TABLE WDI_COUNTRY ( "COUNTRY_CODE" VARCHAR2(100 BYTE), "SHORT_NAME"...