phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: phoenix-spark and pyspark
Date Mon, 21 Dec 2015 16:27:04 GMT
Just an update for anyone interested, PHOENIX-2503 was just committed for
4.7.0 and the docs have been updated to include these samples for PySpark
users.

https://phoenix.apache.org/phoenix_spark.html
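
For anyone who wants to try it end-to-end, here's a minimal sketch. The JAR
filename, ZooKeeper URL, and table names below are placeholders (adjust them
for your cluster), and the write assumes the target Phoenix table already
exists. Run it as a script with spark-submit --jars; in a pyspark shell
started with --jars, sc and sqlContext already exist, so skip creating them.

# Placeholder JAR name and paths -- substitute your Phoenix client JAR, e.g.:
#   spark-submit --jars /path/to/phoenix-<version>-client-spark.jar example.py

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="phoenix-pyspark-example")
sqlContext = SQLContext(sc)

# Load a Phoenix table as a Spark DataFrame
df = sqlContext.read \
    .format("org.apache.phoenix.spark") \
    .option("table", "TABLE1") \
    .option("zkUrl", "localhost:2181") \
    .load()

df.show()

# Write the DataFrame out to a (pre-existing) Phoenix table
df.write \
    .format("org.apache.phoenix.spark") \
    .mode("overwrite") \
    .option("table", "TABLE1_COPY") \
    .option("zkUrl", "localhost:2181") \
    .save()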

Josh

On Thu, Dec 10, 2015 at 1:20 PM, Josh Mahonin <jmahonin@gmail.com> wrote:

> Hey Nick,
>
> I think this used to work, and will again once PHOENIX-2503 gets resolved.
> With the Spark DataFrame support, all the necessary glue is there for
> Phoenix and pyspark to play nice. With that client JAR (or by overriding
> the com.fasterxml.jackson JARs), you can do something like:
>
> df = sqlContext.read \
>   .format("org.apache.phoenix.spark") \
>   .option("table", "TABLE1") \
>   .option("zkUrl", "localhost:63512") \
>   .load()
>
> And
>
> df.write \
>   .format("org.apache.phoenix.spark") \
>   .mode("overwrite") \
>   .option("table", "TABLE1") \
>   .option("zkUrl", "localhost:63512") \
>   .save()
>
>
> Yes, this should be added to the documentation. I hadn't actually tried
> this till just now. :)
>
> On Wed, Dec 9, 2015 at 6:39 PM, Nick Dimiduk <ndimiduk@apache.org> wrote:
>
>> Heya,
>>
>> Does anyone have experience using the phoenix-spark integration from pyspark
>> instead of Scala? Folks prefer Python around here...
>>
>> I did find this example [0] of using HBaseOutputFormat from pyspark, but I
>> haven't tried extending it for Phoenix. Maybe someone with more experience
>> in pyspark knows better? It would be a great addition to our documentation.
>>
>> Thanks,
>> Nick
>>
>> [0]:
>> https://github.com/apache/spark/blob/master/examples/src/main/python/hbase_outputformat.py
>>
>
>
