phoenix-user mailing list archives

From Radha krishna <grkmc...@gmail.com>
Subject Re: PHOENIX SPARK - DataFrame for BulkLoad
Date Tue, 17 May 2016 11:03:10 GMT
Hi

I have the same scenario. Can you share your metrics: the column count per row,
the number of SALT_BUCKETS, the compression technique you used, and how long it
takes to load the complete data set?

In my case I have to load 1.9 billion records (approximately 20 files, each
containing 100 million rows with 102 columns per row), and it currently takes
35 to 45 minutes to load one file's data.
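For context, my write path is roughly the sketch below (a minimal sketch only; the
table name, ZooKeeper URL, and input path are placeholders, and as far as I
understand the phoenix-spark DataFrame save issues Phoenix UPSERTs rather than
writing HFiles):

    import org.apache.spark.sql.{SaveMode, SQLContext}

    val sqlContext = new SQLContext(sc)  // sc: existing SparkContext

    // Read one of the ~20 input files into a DataFrame
    // (the source format here is a placeholder; any DataFrame works).
    val df = sqlContext.read
      .format("com.databricks.spark.csv")
      .option("header", "false")
      .load("/data/file01.csv")

    // Save through the phoenix-spark connector; INPUT_TABLE must already
    // exist in Phoenix with column names matching the DataFrame's columns.
    df.write
      .format("org.apache.phoenix.spark")
      .mode(SaveMode.Overwrite)
      .option("table", "INPUT_TABLE")
      .option("zkUrl", "zk-host:2181")
      .save()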



On Tue, May 17, 2016 at 3:51 PM, Mohanraj Ragupathiraj <
mohanaugust@gmail.com> wrote:

> I have 100 million records to be inserted into an HBase table (Phoenix) as a
> result of a Spark job. I would like to know: if I convert the data to a
> DataFrame and save it, will that perform a bulk load, or is it not an
> efficient way to write data to a Phoenix HBase table?
>
> --
> Thanks and Regards
> Mohan
>



-- 
Thanks & Regards
   Radha krishna
