phoenix-user mailing list archives

From Nagarjuna Kanamarlapudi
Subject Bulk Load in Phoenix from avro
Date Tue, 06 Dec 2016 13:49:38 GMT
I have my base data in Avro format with the following properties:

   - Number of records: 500 million
   - Size of data: 400 GB

I evaluated the options below, but none of them seems viable for me.

   - Load the data into Phoenix using the phoenix-pig connector. Time to
   load: 14 hours (batch size = 200 records).
   - Phoenix bulk load tool: my data is raw, so I can't define a delimiter
   explicitly (if ',' is the delimiter, a few of the columns contain the
   ',' character; likewise with other delimiters).
      - In addition, to produce the data in delimited format I would need
      to run another MR job, which I want to avoid.
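
The delimiter problem described in the list above can be illustrated with a
minimal Python sketch. The sample record is hypothetical (not taken from the
actual data), and the sketch only shows why a plain ',' split breaks on raw
text and how quoting would keep such a record intact. Whether a given loader
honors this kind of quoting is an assumption that would need to be checked.

```python
import csv
import io

# Hypothetical record: the second field contains the delimiter itself.
record = ["42", "Smith, John", "raw text"]

# Naive join/split on ',' corrupts the record: 3 fields come back as 4.
naive_line = ",".join(record)
naive_fields = naive_line.split(",")

# Quoting any field that contains the delimiter keeps the record intact.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
writer.writerow(record)
quoted_line = buf.getvalue().strip()

# Reading the quoted line back recovers the original three fields.
parsed = next(csv.reader(io.StringIO(quoted_line), delimiter=",", quotechar='"'))

print(naive_fields)
print(quoted_line)
print(parsed)
```

Note that this still requires a conversion pass from Avro to delimited text,
which is the extra job mentioned above.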

Is there a way to bulk load Avro data into Phoenix directly?

Nagarjuna K
