Yup, I am aware of the Spark-HBase integration. Phoenix-Spark integration would be even sweeter. :)

On Wed, Jan 7, 2015 at 12:40 AM, sunfl@certusnet.com.cn <sunfl@certusnet.com.cn> wrote:
Hi Anil,
Well, there are already good open source projects on GitHub for Spark on HBase, like the following:

Phoenix integration should be more convenient to build on top of that. We are considering sharing
our code for that approach.



From: anil gupta
Date: 2015-01-07 16:28
Subject: Re: Re: Fwd: Phoenix in production
Hi Sun,

Phoenix-Spark would be a nice add-on if you can open source it. I am planning to use Spark on HBase for one of my projects.


On Wed, Jan 7, 2015 at 12:17 AM, sunfl@certusnet.com.cn <sunfl@certusnet.com.cn> wrote:
Spark-Phoenix integration would be great, as the Spark community is very active now and more
and more developers are using Apache Spark.


Date: 2015-01-07 16:10
Subject: Re: Fwd: Phoenix in production
This is great, Sun! Thank you so much. Would you mind posting this on our user list in response to Siddharth's email? I think other Phoenix users would find it interesting as well.

On a side note, not sure how general what you developed is, but it would be interesting to pursue a general Spark integration in Phoenix as an open source contribution.


On Tue, Jan 6, 2015 at 5:41 PM, sunfl@certusnet.com.cn <sunfl@certusnet.com.cn> wrote:
Hi, James & Siddharth

Glad to share our experience of using Phoenix in production. I believe Siddharth has already done
sufficient testing of Phoenix performance. Here are some notes on how we are using
Phoenix in our projects:
1. We use Phoenix as a convenience for both R&D and QA engineers, as they are glad to use
standard SQL to operate on HBase without much loss of query performance.
2. In our production environment, we mainly integrate Apache Spark with Phoenix to optimize data loading into
Phoenix tables with or without secondary indexes. We are glad that write performance has been
smooth in both cases compared to the MySQL InfoBright and other SQL systems we used previously. We had tested
both secondary indexes and query optimization extensively before moving Phoenix to our production
environment. Now most of the Phoenix features we need work for our jobs.
3. There have been plenty of challenges too, such as bulk load performance with the WAL enabled, query optimization,
statistical data collection with Phoenix full table scans, and so on. However, we believe Phoenix is a sufficient
solution for SQL queries over HBase, and we are glad that even more of our projects are considering using Phoenix.



Date: 2015-01-07 09:10
Subject: Fwd: Phoenix in production
Hi Sun,
Any experiences you can share with Siddharth?

---------- Forwarded message ----------
From: Siddharth Ubale <siddharth.ubale@syncoms.com>
Date: Thu, Jan 1, 2015 at 11:21 PM
Subject: Phoenix in production
To: "user@phoenix.apache.org" <user@phoenix.apache.org>

Hi Guys,


We are seriously considering Phoenix for our production environment; however, we do not have much data on how Phoenix behaves in production.

Can anyone let us know whether they are using Phoenix in production, and what challenges they have experienced?



Siddharth Ubale,

Synchronized Communications

#43, Velankani Tech Park, Block No. II,

3rd Floor, Electronic City Phase I,

Bangalore – 560 100

Tel : +91 80 3202 4060

Web: www.syncoms.com




we innovate, plan, execute, and transform the business


Thanks & Regards,
Anil Gupta