phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin / Various Errors
Date Thu, 10 Dec 2015 17:24:38 GMT
Thanks Jonathan,

I'm making some headway on getting the client library working again. I
thought I saw a mention that you were also using pyspark with the
DataFrame support. Are you able to confirm that this works as well?

Thanks!

Josh

On Wed, Dec 9, 2015 at 7:51 PM, Cox, Jonathan A <jacox@sandia.gov> wrote:

> Josh,
>
> I added all of those JARs separately to Spark's class paths, and it seems
> to be working fine now.
>
> Thanks a lot for your help!
>
> Sent from my iPhone
>
> On Dec 9, 2015, at 2:30 PM, Josh Mahonin <jmahonin@gmail.com> wrote:
>
> Thanks Jonathan,
>
> I'll follow-up with the issue there. In the meantime, you may have some
> luck just submitting a fat (assembly) JAR to a spark cluster.
>
> If you really want to dive into the nitty-gritty, I'm decomposing the
> client JAR down to the required components that allow for the Spark
> integration to work (especially excluding the fasterxml JARs). If you were
> to manually assemble the following libraries into the Spark classpath, I
> believe you'll be able to get the spark-shell going:
>
> guava-12.0.1.jar  hbase-common-1.1.0.jar  hbase-server-1.1.0.jar
>  phoenix-core-4.6.0-HBase-1.1.jar  hbase-client-1.1.0.jar
>  hbase-protocol-1.1.0.jar  htrace-core-3.1.0-incubating.jar
>  phoenix-spark-4.6.0-HBase-1.1.jar
>
> Thanks for the report.
>
> Josh
>
> On Wed, Dec 9, 2015 at 4:00 PM, Cox, Jonathan A <jacox@sandia.gov> wrote:
>
>> Thanks, Josh. I submitted the issue, which can be found at:
>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>
>>
>>
>> Multiple Java NoClass/Method Errors with Spark and Phoenix
>>
>>
>>
>> *From:* Josh Mahonin [mailto:jmahonin@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 1:15 PM
>>
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin
>> / Various Errors
>>
>>
>>
>> Hi Jonathan,
>>
>>
>>
>> Thanks, I'm digging into this as we speak. That SPARK-8332 issue looks
>> like the same issue, and to quote one of the comments in that issue
>> 'Classpath hell is hell'.
>>
>>
>>
>> What is interesting is that the unit tests in Phoenix 4.6.0 successfully
>> run against Spark 1.5.2 [1], so I wonder if this issue is specific to
>> the spark-shell. You may have some success compiling your app as an
>> assembly JAR and submitting it to a Spark cluster instead.
>>
>>
>>
>> Could you do me a favour and file a JIRA ticket for this, and copy all
>> the relevant information you've posted there?
>>
>>
>>
>> Thanks!
>>
>> Josh
>>
>> [1]
>> https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala
>>
>>
>>
>> On Wed, Dec 9, 2015 at 2:52 PM, Cox, Jonathan A <jacox@sandia.gov> wrote:
>>
>> Josh,
>>
>>
>>
>> I’d like to give you a little more information regarding this error. It
>> looks like when I add the Phoenix Client JAR to Spark, it causes Spark to
>> fail:
>>
>> spark.executor.extraClassPath
>> /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> spark.driver.extraClassPath
>> /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>>
>>
>> After adding this JAR, I get the following error when executing the
>> following command:
>>
>> scala> val textFile = sc.textFile("README.md")
>>
>> java.lang.NoSuchMethodError:
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>
>>                 at
>> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<init>(ScalaNumberDeserializersModule.scala:49)
>>
>>                 at
>> com.fasterxml.jackson.module.scala.deser.NumberDeserializers$.<clinit>(ScalaNumberDeserializersModule.scala)
>>
>>
>>
>> As you can see, adding this phoenix JAR is breaking other Spark
>> functionality for me. My naïve guess is that there is a different version
>> of the Jackson FasterXML classes packaged inside
>> phoenix-4.6.0-HBase-1.1-client.jar that is breaking Spark.
>>
>>
>>
>> Have you seen anything like this before?
>>
>>
>>
>> Regards,
>>
>> Jonathan
>>
>>
>>
>> *From:* Cox, Jonathan A [mailto:jacox@sandia.gov]
>> *Sent:* Wednesday, December 09, 2015 11:58 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin /
>> Various Errors
>>
>>
>>
>> Josh,
>>
>>
>>
>> So using user provided Hadoop 2.6 solved the immediate Phoenix / Spark
>> integration problem I was having. However, I now have another problem,
>> which seems to be similar to:
>>
>> https://issues.apache.org/jira/browse/SPARK-8332
>>
>> java.lang.NoSuchMethodError:
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer
>>
>>
>>
>> I’m getting this error when executing the simple example in the Phoenix /
>> Spark Plugin page:
>>
>> Spark context available as sc.
>>
>> 15/12/09 11:51:02 INFO repl.SparkILoop: Created sql context..
>>
>> SQL context available as sqlContext.
>>
>>
>>
>> scala> val df = sqlContext.load(
>>
>>      |   "org.apache.phoenix.spark",
>>
>>      |   Map("table" -> "TABLE1", "zkUrl" -> "phoenix-server:2181")
>>
>>      | )
>>
>> warning: there were 1 deprecation warning(s); re-run with -deprecation
>> for details
>>
>> java.lang.NoSuchMethodError:
>> com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>
>>
>>
>> I did try upgrading the Hadoop Jackson JARs from 2.2.3 to 2.4.3, as some
>> suggested in the link above, and including them in Spark’s classpath.
>> However, the error was the same.
>>
>>
>>
>> *From:* Josh Mahonin [mailto:jmahonin@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 11:21 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin
>> / Various Errors
>>
>>
>>
>> Definitely. I'd like to dig into what the root cause is, but it might be
>> optimistic to think I'll be able to get to that any time soon.
>>
>>
>>
>> I'll try to get the docs updated today.
>>
>>
>>
>> On Wed, Dec 9, 2015 at 1:09 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>> Would it make sense to tweak the Spark installation instructions slightly
>> with this information, Josh?
>>
>>
>>
>> On Wed, Dec 9, 2015 at 9:11 AM, Cox, Jonathan A <jacox@sandia.gov> wrote:
>>
>> Josh,
>>
>>
>>
>> Previously, I was using the SPARK_CLASSPATH, but then read that it was
>> deprecated and switched to the spark-defaults.conf file. The result was the
>> same.
>>
>>
>>
>> Also, I was using ‘spark-1.5.2-bin-hadoop2.6.tgz’, which includes some
>> Hadoop 2.6 JARs. This caused the trouble. However, by separately
>> downloading Hadoop 2.6 and Spark without Hadoop, the errors went away.
>>
>>
>>
>> -Jonathan
>>
>>
>>
>> *From:* Josh Mahonin [mailto:jmahonin@gmail.com]
>> *Sent:* Wednesday, December 09, 2015 5:57 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: [EXTERNAL] Re: Confusion Installing Phoenix Spark Plugin
>> / Various Errors
>>
>>
>>
>> Hi Jonathan,
>>
>>
>>
>> Thanks for the information. If you're able, could you also try the
>> 'SPARK_CLASSPATH' environment variable instead of the spark-defaults.conf
>> setting, and let us know if that works? Also the exact Spark package you're
>> using would be helpful as well (from source, prebuilt for 2.6+, 2.4+, CDH,
>> etc.)
>>
>> Thanks,
>>
>>
>>
>> Josh
>>
>>
>>
>> On Wed, Dec 9, 2015 at 12:08 AM, Cox, Jonathan A <jacox@sandia.gov>
>> wrote:
>>
>> Alright, I reproduced what you did exactly, and it now works. The problem
>> is that the Phoenix client JAR is not working correctly with the Spark
>> builds that include Hadoop.
>>
>>
>>
>> When I downloaded the Spark build with user provided Hadoop, and also
>> installed Hadoop manually, Spark works with Phoenix correctly!
>>
>>
>>
>> Thank you much,
>>
>> Jonathan
>>
>> Sent from my iPhone
>>
>>
>> On Dec 8, 2015, at 8:54 PM, Josh Mahonin <jmahonin@gmail.com> wrote:
>>
>> Hi Jonathan,
>>
>>
>>
>> Spark only needs the client JAR. It contains all the other Phoenix
>> dependencies as well.
>>
>>
>>
>> I'm not sure exactly what the issue you're seeing is. I just downloaded
>> and extracted fresh copies of Spark 1.5.2 (pre-built with user-provided
>> Hadoop), and the latest Phoenix 4.6.0 binary release.
>>
>>
>>
>> I copied the 'phoenix-4.6.0-HBase-1.1-client.jar' to /tmp and created a
>> 'spark-defaults.conf' in the 'conf' folder of the Spark install with the
>> following:
>>
>>
>> spark.executor.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> spark.driver.extraClassPath /tmp/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> I then launched the 'spark-shell', and was able to execute:
>>
>> import org.apache.phoenix.spark._
>>
>>
>>
>> From there, you should be able to use the methods provided by the
>> phoenix-spark integration within the Spark shell.
>>
>>
>>
>> Good luck,
>>
>>
>>
>> Josh
>>
>>
>>
>> On Tue, Dec 8, 2015 at 8:51 PM, Cox, Jonathan A <jacox@sandia.gov> wrote:
>>
>> I am trying to get Spark up and running with Phoenix, but the
>> installation instructions are not clear to me, or there is something else
>> wrong. I’m using Spark 1.5.2, HBase 1.1.2 and Phoenix 4.6.0 with a
>> standalone install (no HDFS or cluster) with Debian Linux 8 (Jessie) x64.
>> I’m also using Java 1.8.0_40.
>>
>>
>>
>> The instructions state:
>>
>> 1.       Ensure that all requisite Phoenix / HBase platform dependencies
>> are available on the classpath for the Spark executors and drivers
>>
>> 2.       One method is to add the phoenix-4.4.0-client.jar to
>> ‘SPARK_CLASSPATH’ in spark-env.sh, or setting both
>> ‘spark.executor.extraClassPath’ and ‘spark.driver.extraClassPath’ in
>> spark-defaults.conf
>>
>>
>>
>> *First off, what are “all requisite Phoenix / HBase platform
>> dependencies”?* #2 suggests that all I need to do is add
>>  ‘phoenix-4.6.0-HBase-1.1-client.jar’ to Spark’s class path. But what about
>> ‘phoenix-spark-4.6.0-HBase-1.1.jar’ or ‘phoenix-core-4.6.0-HBase-1.1.jar’?
>> Do either of these (or anything else) need to be added to Spark’s class
>> path?
>>
>>
>>
>> Secondly, if I follow the instructions exactly, and add only
>> ‘phoenix-4.6.0-HBase-1.1-client.jar’ to ‘spark-defaults.conf’:
>>
>> spark.executor.extraClassPath
>> /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> spark.driver.extraClassPath
>> /usr/local/phoenix/phoenix-4.6.0-HBase-1.1-client.jar
>>
>> Then I get the following error when starting the interactive Spark shell
>> with ‘spark-shell’:
>>
>> 15/12/08 18:38:05 WARN ObjectStore: Version information not found in
>> metastore. hive.metastore.schema.verification is not enabled so recording
>> the schema version 1.2.0
>>
>> 15/12/08 18:38:05 WARN ObjectStore: Failed to get database default,
>> returning NoSuchObjectException
>>
>> 15/12/08 18:38:05 WARN Hive: Failed to access metastore. This class
>> should not accessed in runtime.
>>
>> org.apache.hadoop.hive.ql.metadata.HiveException:
>> java.lang.RuntimeException: Unable to instantiate
>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>
>>                 at
>> org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
>>
>> …
>>
>>
>>
>> <console>:10: error: not found: value sqlContext
>>
>>        import sqlContext.implicits._
>>
>>               ^
>>
>> <console>:10: error: not found: value sqlContext
>>
>>        import sqlContext.sql
>>
>>
>>
>> On the other hand, if I include all three of the aforementioned JARs, I
>> get the same error. However, *if I include only the
>> ‘phoenix-spark-4.6.0-HBase-1.1.jar’*, spark-shell seems to launch
>> without error. Nevertheless, if I then try the simple tutorial commands in
>> spark-shell, I get the following:
>>
>> *Spark output:* SQL context available as sqlContext.
>>
>>
>>
>> *scala >>* import org.apache.spark.SparkContext
>>
>> import org.apache.spark.sql.SQLContext
>>
>> import org.apache.phoenix.spark._
>>
>>
>>
>>                                 val sqlContext = new SQLContext(sc)
>>
>>
>>
>>                                 val df =
>> sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TABLE1",
>> "zkUrl" -> "phoenix-server:2181"))
>>
>>
>>
>>                 *Spark error:*
>>
>>                                 *java.lang.NoClassDefFoundError:
>> org/apache/hadoop/hbase/HBaseConfiguration*
>>
>>                 at
>> org.apache.phoenix.spark.PhoenixRDD.getPhoenixConfiguration(PhoenixRDD.scala:71)
>>
>>                 at
>> org.apache.phoenix.spark.PhoenixRDD.phoenixConf$lzycompute(PhoenixRDD.scala:39)
>>
>>                 at
>> org.apache.phoenix.spark.PhoenixRDD.phoenixConf(PhoenixRDD.scala:38)
>>
>>                 at
>> org.apache.phoenix.spark.PhoenixRDD.<init>(PhoenixRDD.scala:42)
>>
>>                 at
>> org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:50)
>>
>>                 at
>> org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:37)
>>
>>                 at
>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>>
>>
>>
>> This final error seems similar to the one in mailing list post Phoenix-spark
>> : NoClassDefFoundError: HBaseConfiguration
>> <http://mail-archives.apache.org/mod_mbox/phoenix-user/201511.mbox/ajax/%3CCAKwwsRSEJHkotiF28kzumDZM6kgBVeTJNGUoJnZcLiuEGCTjHQ%40mail.gmail.com%3E>.
>> But the question does not seem to have been answered satisfactorily. Also
>> note, if I include all three JARs, as he did, I get an error when launching
>> spark-shell.
>>
>>
>>
>> *Can you please clarify what is the proper way to install and configure
>> Phoenix with Spark?*
>>
>>
>>
>> Sincerely,
>>
>> Jonathan
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
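
A minimal sketch of the manual classpath assembly suggested in the thread: it joins the eight JARs listed in the 9 Dec reply into the two 'spark-defaults.conf' properties used throughout. The object name 'PhoenixClasspath', its methods, and the directory '/usr/local/phoenix/lib' are hypothetical; the thread does not say where the individual JARs were placed.

```scala
// Sketch only: builds spark-defaults.conf entries from individual JARs.
// JAR names are taken from the thread; the directory passed in main() is an
// assumed location, not one the thread specifies.
object PhoenixClasspath {
  val jars: Seq[String] = Seq(
    "guava-12.0.1.jar",
    "hbase-common-1.1.0.jar",
    "hbase-server-1.1.0.jar",
    "phoenix-core-4.6.0-HBase-1.1.jar",
    "hbase-client-1.1.0.jar",
    "hbase-protocol-1.1.0.jar",
    "htrace-core-3.1.0-incubating.jar",
    "phoenix-spark-4.6.0-HBase-1.1.jar"
  )

  /** Join the JARs under `dir` into one classpath string (':' on Linux). */
  def classpath(dir: String, sep: String = java.io.File.pathSeparator): String =
    jars.map(j => s"$dir/$j").mkString(sep)

  /** The two properties the thread sets in spark-defaults.conf. */
  def confLines(dir: String, sep: String = java.io.File.pathSeparator): Seq[String] = {
    val cp = classpath(dir, sep)
    Seq(
      s"spark.executor.extraClassPath $cp",
      s"spark.driver.extraClassPath $cp"
    )
  }

  def main(args: Array[String]): Unit =
    confLines("/usr/local/phoenix/lib").foreach(println)
}
```

The printed lines can be pasted into 'spark-defaults.conf' in place of the single client JAR entry. Listing the JARs explicitly keeps the fasterxml classes bundled in the client JAR off Spark's classpath, which is the point of the workaround.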

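The guess made in the thread, that a different Jackson version packaged inside the client JAR shadows the one Spark expects, can be checked by asking the JVM where a class was loaded from. The following is a sketch using only standard JDK calls; the object name 'WhichJar' and its method are hypothetical helpers, not part of Spark or Phoenix.

```scala
// Sketch only: report which JAR (or directory) supplies a given class.
object WhichJar {
  /** Location a class was loaded from, if the JVM reports one.
    * Classes from the bootstrap loader (e.g. java.lang.String) yield None. */
  def locationOf(cls: Class[_]): Option[String] =
    for {
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      url    <- Option(source.getLocation)
    } yield url.toString

  /** Same lookup by class name; None if the class cannot be loaded.
    * The class is loaded without running its static initializer, so a
    * broken initializer does not fail the lookup itself. */
  def locationOf(className: String): Option[String] =
    try {
      val loader = Thread.currentThread.getContextClassLoader
      locationOf(Class.forName(className, false, loader))
    } catch {
      case _: ClassNotFoundException | _: LinkageError => None
    }
}
```

Pasted into spark-shell, a call such as 'WhichJar.locationOf("com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$")' (the class named in the stack trace) should show whether that class comes from the Phoenix client JAR or from Spark's own assembly.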