phoenix-user mailing list archives

From Radha krishna <grkmc...@gmail.com>
Subject Fwd: How to perform Read and Write operations( for TB's of data) on Phoenix tables using spark
Date Tue, 10 May 2016 11:51:52 GMT
Hi All,

In one of my projects, we are planning to use HBase as the back end.

My use case: I have 1 TB of data that arrives as multiple files (each file
is around 40 GB, with 100 million rows and 102 columns per row). When I
load these files using Spark + Phoenix, it takes around 2 hours.
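
In case it is useful, the load is essentially of this shape (a minimal
sketch against the phoenix-spark DataFrame API; the table name, input
path, and zkUrl are placeholders, and the actual create/load scripts
are attached):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.{SQLContext, SaveMode}

    val sc = new SparkContext(new SparkConf().setAppName("PhoenixLoad"))
    val sqlContext = new SQLContext(sc)

    // Parse the input files into a DataFrame; the column names must
    // match the Phoenix table's columns (spark-csv here is just an
    // assumed example reader, our actual parsing may differ).
    val input = sqlContext.read
      .format("com.databricks.spark.csv")
      .load("hdfs:///data/input/*")

    // Write to Phoenix; the connector requires SaveMode.Overwrite,
    // though internally it executes UPSERTs.
    input.write
      .format("org.apache.phoenix.spark")
      .mode(SaveMode.Overwrite)
      .options(Map("table" -> "MY_TABLE", "zkUrl" -> "zk-host:2181"))
      .save()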

Can you please suggest how to fine-tune the load process, and how to read
the data back using Spark?
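
For the read path, is something like the following the intended pattern?
(Again a sketch; the table name and zkUrl are placeholders.)

    // Load the Phoenix table as a DataFrame; column pruning and
    // predicate filters are pushed down to Phoenix.
    val df = sqlContext.read
      .format("org.apache.phoenix.spark")
      .options(Map("table" -> "MY_TABLE", "zkUrl" -> "zk-host:2181"))
      .load()

    // Example: a pushed-down filter plus a projection of two columns.
    df.filter(df("COL1") > 100).select("COL1", "COL2").show()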

Environment details
==========================
Hadoop Distribution : Hortonworks
Spark Version : 1.6
HBase Version: 1.1.2
Phoenix Version: 4.4.0
Number of nodes: 19

Please find attached the create and load scripts.


Thanks & Regards
   Radha krishna
