phoenix-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Spark-Phoenix Plugin
Date Mon, 06 Aug 2018 14:41:56 GMT
Besides the distribution and parallelism of Spark as a distributed 
execution framework, I can't really see how phoenix-spark would be 
faster than the JDBC driver :). Phoenix-spark and the JDBC driver are 
using the same code under the hood.

Phoenix-spark is using the PhoenixOutputFormat (and thus, 
PhoenixRecordWriter) to write data to Phoenix. Maybe look at 
PhoenixRecordWritable, too. These ultimately are executing UPSERTs on a 
PreparedStatement.
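
To make the "UPSERTs on a PreparedStatement" point concrete, here is a minimal sketch of the shape of that work: one parameterized UPSERT per table, with rows grouped into batches. This is not the PhoenixRecordWriter code. The helper names (build_upsert, batched), the EXAMPLE table, its columns, and the batch size are all hypothetical, and committing once per batch is an assumption about how such a writer would typically be driven.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple


def build_upsert(table: str, columns: Sequence[str]) -> str:
    """Return a parameterized Phoenix UPSERT with one '?' per column."""
    if not columns:
        raise ValueError("at least one column is required")
    placeholders = ", ".join("?" for _ in columns)
    return f"UPSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"


def batched(rows: Iterable[Tuple], size: int) -> Iterator[List[Tuple]]:
    """Yield lists of at most `size` rows (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Hypothetical table, columns, and rows, for illustration only.
sql = build_upsert("EXAMPLE", ["ID", "NAME"])
rows = [(1, "a"), (2, "b"), (3, "c")]
for chunk in batched(rows, 2):
    # With a real connection, each row in the chunk would be bound to the
    # prepared statement and executed here, followed by one commit.
    print(sql, chunk)
```

Whichever client issues the statements, JDBC directly or phoenix-spark, the rows end up going through this same kind of UPSERT; Spark only changes how many writers run in parallel.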

There is also the CsvBulkLoadTool, which can create HFiles to bulk load 
data into Phoenix. I'm not sure if phoenix-spark has something wired up 
that you can use to do this out of the box (certainly, you could do it 
yourself).
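
For the bulk load route, a rough sketch of assembling a CsvBulkLoadTool invocation might look like the following. The jar name, table name, and input path are placeholders, not values from this thread, and bulk_load_command is a hypothetical helper. Only the --table and --input options are used here; a real cluster will likely need more configuration.

```python
import shlex
from typing import List

# Fully qualified class name of the MapReduce bulk load tool.
BULK_LOAD_CLASS = "org.apache.phoenix.mapreduce.CsvBulkLoadTool"


def bulk_load_command(client_jar: str, table: str, input_path: str) -> List[str]:
    """Build the argv for running CsvBulkLoadTool through `hadoop jar`."""
    return [
        "hadoop", "jar", client_jar,
        BULK_LOAD_CLASS,
        "--table", table,
        "--input", input_path,
    ]


# Placeholder values; substitute the client jar shipped with your install.
cmd = bulk_load_command("phoenix-client.jar", "EXAMPLE", "/data/example.csv")
print(shlex.join(cmd))
```

The printed command could be run by hand or passed to subprocess.run(cmd, check=True) on a host that has the Hadoop client configured.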

On 8/6/18 8:10 AM, Brandon Geise wrote:
> Thanks for the reply Yun.
> 
> I’m not quite clear how this would exactly help on the upsert side?  Are 
> you suggesting deriving the type from Phoenix then doing the 
> encoding/decoding and writing/reading directly from HBase?
> 
> Thanks,
> 
> Brandon
> 
> *From: *Jaanai Zhang <cloud.poster@gmail.com>
> *Reply-To: *<user@phoenix.apache.org>
> *Date: *Sunday, August 5, 2018 at 9:34 PM
> *To: *<user@phoenix.apache.org>
> *Subject: *Re: Spark-Phoenix Plugin
> 
> You can get the data types from the Phoenix metadata, then encode/decode 
> the data yourself to write/read it. I think this approach is effective, FYI :)
> 
> 
> ----------------------------------------
> 
>     Yun Zhang
> 
>     Best regards!
> 
> 2018-08-04 21:43 GMT+08:00 Brandon Geise <brandongeise@gmail.com>:
> 
>     Good morning,
> 
>     I’m looking at using a combination of HBase, Phoenix and Spark for a
>     project and read that using the Spark-Phoenix plugin directly is
>     more efficient than JDBC. However, it wasn’t entirely clear from the
>     examples whether an upsert is performed when writing a dataframe, or
>     how many fine-grained options there are for executing the upsert.  Any
>     information someone can share would be greatly appreciated!
> 
>     Thanks,
> 
>     Brandon
> 
