phoenix-user mailing list archives

From Adi Meller <>
Subject Csvbulkloadtool
Date Mon, 20 Mar 2017 04:55:21 GMT
I need to move 5-6 large tables (about 2 TB each) from Hive to Phoenix
every day.

I have CDH 5.7 and installed Phoenix 4.7 through a parcel.
I have 4 region servers, each with 94 GB of physical memory and 32 cores.

1. I created CSV files from Hive (by running CREATE TABLE), created a table
with 16 regions through Phoenix, and then bulk loaded it using CsvBulkLoadTool.
It took 1 day to load 1 TB of data.
Are there any recommendations for making the bulk load faster? How can
I find out what my bottleneck is?

2. What is the best method to load from Hive tables into Phoenix?

3. I read that Hive-Phoenix integration is included in Phoenix 4.8, but I cannot
find a parcel for CDH other than Phoenix 4.7. Are there any plans to create
parcels for 4.8 and higher for Cloudera?
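
For scale, the figures in item 1 can be turned into a rough throughput number. This is only a back-of-the-envelope sketch: it assumes "tera" means 10**12 bytes, "1 day" means 24 hours, and the daily volume is 5-6 tables of 2 TB each. The function names are illustrative and not part of any Phoenix tooling.

```python
# Rough throughput check based on the numbers quoted in this message.
# Assumptions (not stated in the message): 1 TB = 10**12 bytes, 1 day = 24 h.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def throughput_mb_per_s(bytes_loaded, seconds):
    """Average load rate in megabytes (10**6 bytes) per second."""
    return bytes_loaded / seconds / 10**6


def required_speedup(daily_bytes, observed_bytes_per_day):
    """Factor by which the load must speed up to finish the daily volume in one day."""
    return daily_bytes / observed_bytes_per_day


# Observed: 1 TB loaded in 1 day.
observed = throughput_mb_per_s(1 * TB, SECONDS_PER_DAY)

# Target: 5-6 tables of 2 TB each, every day.
low = required_speedup(5 * 2 * TB, 1 * TB)
high = required_speedup(6 * 2 * TB, 1 * TB)

print(f"observed load rate: {observed:.1f} MB/s across the whole cluster")
print(f"speedup needed to keep up daily: {low:.0f}x to {high:.0f}x")
```

Under these assumptions the observed rate is roughly 11.6 MB/s for the whole cluster, and the daily target needs a 10x to 12x improvement.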

Thanks in advance
