phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: spark plugin with java
Date Wed, 02 Dec 2015 19:02:43 GMT
Hi Krishna,

That's great to hear. You're right, the plugin itself should be backwards
compatible to Spark 1.3.1, and should work with any version of Phoenix from
4.4.0 onward, though I can't guarantee that will remain the case forever. As
well, I don't know how much usage there is of the Java API with DataFrames;
you may in fact be the first. If you encounter any errors with it, could you
please file a JIRA with any stack traces you see?

Since Spark is a very quickly changing project, they often update internal
functionality that we sometimes lag behind in supporting, and as a result
there's no direct mapping between specific Phoenix versions and specific
Spark versions. We add new support as fast as we get patches, essentially.

My general recommendation is to stay back a major version on Spark if
possible, but if you need to use the latest Spark releases, try to use the
latest Phoenix release as well. The DataFrame support in Phoenix, for
instance, has had many patches and improvements recently that older
versions are missing.

Thanks,

Josh
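
The JDBC fallback described in the quoted message below builds its connection
URL from PhoenixRuntime constants. A minimal standalone sketch of that URL and
options construction, assuming PhoenixRuntime.JDBC_PROTOCOL resolves to
"jdbc:phoenix" and JDBC_PROTOCOL_SEPARATOR to ":" (the constants are inlined
here so the snippet has no Phoenix dependency; verify against your Phoenix
version):

```java
import java.util.HashMap;
import java.util.Map;

public class PhoenixJdbcOptions {

    // Inlined stand-ins for PhoenixRuntime.JDBC_PROTOCOL and
    // PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR; check your Phoenix jar.
    static final String JDBC_PROTOCOL = "jdbc:phoenix";
    static final String JDBC_PROTOCOL_SEPARATOR = ":";

    // Build the options map that sqlContext.load("jdbc", options) expects:
    // a "url" pointing at the ZooKeeper quorum and a "dbtable" to read.
    static Map<String, String> jdbcOptions(String zkQuorum, String table) {
        Map<String, String> options = new HashMap<>();
        options.put("url", JDBC_PROTOCOL + JDBC_PROTOCOL_SEPARATOR + zkQuorum);
        options.put("dbtable", table);
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> opts = jdbcOptions("zk1,zk2,zk3", "TABLE_NAME");
        // For a quorum of zk1,zk2,zk3 this prints: jdbc:phoenix:zk1,zk2,zk3
        System.out.println(opts.get("url"));
        System.out.println(opts.get("dbtable"));
    }
}
```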

On Wed, Dec 2, 2015 at 1:40 PM, Krishna <research800@gmail.com> wrote:

> Yes, that works for Spark 1.4.x. The website says Spark 1.3.1+ for the
> Spark plugin; is that accurate?
>
> For Spark 1.3.1, I created a dataframe as follows (could not use the
> plugin):
>         Map<String, String> options = new HashMap<String, String>();
>         options.put("url", PhoenixRuntime.JDBC_PROTOCOL +
>                 PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + zkQuorum);
>         options.put("dbtable", "TABLE_NAME");
>
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame jdbcDF = sqlContext.load("jdbc", options)
>                 .filter("COL_NAME > SOME_VALUE");
>
> Also, it isn't immediately obvious which version of Spark was used in
> building the Phoenix artifacts available on Maven. Maybe it's worth putting
> it on the website. Let me know if the mapping below is incorrect.
>
> Phoenix 4.4.x <--> Spark 1.4.0
> Phoenix 4.5.x <--> Spark 1.5.0
> Phoenix 4.6.x <--> Spark 1.5.0
>
>
> On Tue, Dec 1, 2015 at 7:05 PM, Josh Mahonin <jmahonin@gmail.com> wrote:
>
> > Hi Krishna,
> >
> > I've not tried it in Java at all, but as of Spark 1.4+ the DataFrame API
> > should be unified between Scala and Java, so the following may work for
> > you:
> >
> > DataFrame df = sqlContext.read()
> >     .format("org.apache.phoenix.spark")
> >     .option("table", "TABLE1")
> >     .option("zkUrl", "<phoenix-server:2181>")
> >     .load();
> >
> > Note that 'zkUrl' must be set to your Phoenix URL, and passing a 'conf'
> > parameter isn't supported. Please let us know back here if this works out
> > for you, I'd love to update the documentation and unit tests if it works.
> >
> > Josh
> >
> > On Tue, Dec 1, 2015 at 6:30 PM, Krishna <research800@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> Is there a working example for using spark plugin in Java? Specifically,
> >> what's the java equivalent for creating a dataframe as shown here in
> scala:
> >>
> >> val df = sqlContext.phoenixTableAsDataFrame("TABLE1", Array("ID", "COL1"),
> >>     conf = configuration)
> >>
> >>
> >
>
