Hello.
I need to move 5-6 large tables (about 2 TB each) from Hive to Phoenix
every day.
I have CDH 5.7 and installed Phoenix 4.7 through a parcel.
I have 4 region servers with 94 GB of physical memory and 32 cores each.
1. I created CSV files from Hive (by running a CREATE TABLE statement),
created a table with 16 regions through Phoenix, and then bulk loaded the
files using CsvBulkLoadTool. It took 1 day to load 1 TB of data.
Are there any recommendations for making the bulk load faster? How can
I find out what my bottleneck is?
2. What is the best method for loading Hive tables into Phoenix?
3. I read that the Hive-Phoenix integration is included in Phoenix 4.8, but
I cannot find a parcel for CDH other than Phoenix 4.7. Are there any plans
to create a parcel for 4.8 and higher for Cloudera?
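
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes), that "1 day" means a full 24 hours, and that the load is spread evenly over the 4 region servers. The function names are only for illustration.

```python
# Rough throughput estimate for the bulk load described in this post.
# Assumptions (not measured): 1 TB = 10**12 bytes, "1 day" = 24 hours,
# and work spread evenly across the 4 region servers.

TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60
REGION_SERVERS = 4


def throughput_mb_per_s(bytes_loaded, seconds):
    """Average throughput in MB/s (decimal megabytes)."""
    return bytes_loaded / seconds / 10**6


def required_speedup(tables, tb_per_table, observed_tb_per_day=1.0):
    """How many times faster the load must run to finish all tables in one day."""
    return tables * tb_per_table / observed_tb_per_day


# Observed: 1 TB loaded in 1 day.
observed = throughput_mb_per_s(1 * TB, SECONDS_PER_DAY)
per_server = observed / REGION_SERVERS

print(f"observed: {observed:.1f} MB/s total, {per_server:.1f} MB/s per region server")
print(f"needed speedup: {required_speedup(5, 2):.0f}x to {required_speedup(6, 2):.0f}x")
```

Under these assumptions the current load averages roughly 11.6 MB/s across the whole cluster, and moving 5-6 tables of 2 TB each per day would need it to run about 10 to 12 times faster.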
Thanks in advance,
Adi.