phoenix-user mailing list archives

From Amit Sela <>
Subject Insert data into HBase with Phoenix
Date Tue, 04 Mar 2014 12:50:01 GMT
Hi all,

I'm using HBase 0.94.12 with Hadoop 1.0.4.

I'm trying to load ~3GB of data into an HBase table using the CSV bulk load tool.
This is very slow: the MR job takes about 5x as long as a plain HBase bulk load.

I was wondering if that is the best way? I also wonder whether it supports
continuous pre-splitting - meaning that before each bulk load I add new
regions to the table. Another issue I have with the CSV bulk load is dynamic
columns - I tried setting null (actually "") in the CSV where there is
no value, but that defeats the HBase benefit of not storing null values...
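(For context on the pre-splitting point: Phoenix can seed regions with split points declared in the DDL, though as far as I know this only applies at table creation time, not before each load - adding regions to an existing table would need the HBase API or shell. Table and key values below are made up:)

```sql
-- Hypothetical table; SPLIT ON creates the underlying
-- HBase table pre-split at the given row-key boundaries.
CREATE TABLE MY_TABLE (
    PK  VARCHAR NOT NULL PRIMARY KEY,
    VAL VARCHAR
) SPLIT ON ('a', 'm', 't');
```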

Do you think using UPSERT in batches would work better? Can it handle 3GB
(uncompressed)? Has anyone done it from an MR context (a Reducer executing
the UPSERT batches)?
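What I had in mind for the Reducer is roughly the sketch below - each reducer opens one Phoenix JDBC connection, turns autocommit off so mutations are buffered client-side, and commits every N rows. The table name, ZK quorum, and batch size are placeholders, and error handling/retries are elided:

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sketch: one Phoenix connection per reducer, flushed every BATCH_SIZE rows.
public class UpsertReducer extends Reducer<Text, Text, NullWritable, NullWritable> {

    private static final int BATCH_SIZE = 1000; // placeholder; tune per cluster
    private Connection conn;
    private PreparedStatement stmt;
    private long pending = 0;

    @Override
    protected void setup(Context ctx) throws IOException {
        try {
            conn = DriverManager.getConnection("jdbc:phoenix:zk-host"); // placeholder quorum
            conn.setAutoCommit(false); // buffer mutations client-side until commit()
            stmt = conn.prepareStatement("UPSERT INTO MY_TABLE (PK, VAL) VALUES (?, ?)");
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
            throws IOException {
        try {
            for (Text v : values) {
                stmt.setString(1, key.toString());
                stmt.setString(2, v.toString());
                stmt.executeUpdate();
                if (++pending % BATCH_SIZE == 0) {
                    conn.commit(); // flush one batch of buffered mutations
                }
            }
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context ctx) throws IOException {
        try {
            conn.commit(); // flush any remainder
            conn.close();
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }
}
```

(Not runnable without a live HBase/Phoenix cluster and the Hadoop/Phoenix jars on the classpath - it's just the shape of what I'm considering.)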
