phoenix-user mailing list archives

From Benjamin Kim <bbuil...@gmail.com>
Subject Re: Spark Phoenix Plugin
Date Sat, 20 Feb 2016 15:44:04 GMT
Josh,

My production environment at our company is:

CDH 5.4.8:
Hadoop 2.6.0-cdh5.4.8
YARN 2.6.0-cdh5.4.8
HBase 1.0.0-cdh5.4.8

Apache:
HBase 1.1.3
Spark 1.6.0
Phoenix 4.7.0

I tried to use the Phoenix Spark Plugin against both versions of HBase.

I hope this helps.

Thanks,
Ben


> On Feb 20, 2016, at 7:37 AM, Josh Mahonin <jmahonin@gmail.com> wrote:
> 
> Hi Ben,
> 
> Can you describe in more detail what your environment is? Are you using stock installs of HBase, Spark and Phoenix? Are you using the hadoop2.4 pre-built Spark distribution as per the documentation [1]?
> 
> The "unread block data" error is commonly traced back to this issue [2], which indicates some sort of version mismatch.
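> 
> One quick sanity check on that front (just a sketch; these properties may simply be unset in your setup) is to print, from spark-shell, what extra classpath entries the driver and executors were actually configured with:
> 
> // Scala, in spark-shell: show the extra classpath settings, if any were set
> sc.getConf.getOption("spark.driver.extraClassPath").foreach(println)
> sc.getConf.getOption("spark.executor.extraClassPath").foreach(println)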
> 
> Thanks,
> 
> Josh
> 
> [1] https://phoenix.apache.org/phoenix_spark.html
> [2] https://issues.apache.org/jira/browse/SPARK-1867
> 
> On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
> Hi Josh,
> 
> When I run the following code in spark-shell for spark 1.6:
> 
> import org.apache.phoenix.spark._
> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
> df.select(df("ID")).show()
> 
> I get this error:
> 
> java.lang.IllegalStateException: unread block data
> 
> Thanks,
> Ben
> 
> 
>> On Feb 19, 2016, at 11:12 AM, Josh Mahonin <jmahonin@gmail.com> wrote:
>> 
>> What specifically doesn't work for you?
>> 
>> I have a Docker image that I've used to do some basic testing with, and I haven't run into any problems:
>> https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>> 
>> On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
>> All,
>> 
>> Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything works fine except for the Phoenix Spark Plugin. I wonder if it’s a version incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 using Spark 1.6?
>> 
>> Thanks,
>> Ben
>> 
>>> On Feb 12, 2016, at 6:33 AM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>> 
>>> Anyone know when Phoenix 4.7 will be officially released? And what Cloudera distribution versions will it be compatible with?
>>> 
>>> Thanks,
>>> Ben
>>> 
>>>> On Feb 10, 2016, at 11:03 AM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>>> 
>>>> Hi Pierre,
>>>> 
>>>> I am getting this error now.
>>>> 
>>>> Error: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;
>>>> 
>>>> I even tried running some queries with sqlline.py; it resulted in the same error. I followed the installation instructions. Is there something missing?
>>>> 
>>>> Thanks,
>>>> Ben
>>>> 
>>>> 
>>>>> On Feb 9, 2016, at 10:20 AM, Ravi Kiran <maghamravikiran@gmail.com> wrote:
>>>>> 
>>>>> Hi Pierre,
>>>>> 
>>>>> Try your luck building the artifacts from https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it helps.
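>>>>> 
>>>>> A minimal sketch of the build (the exact flags are a guess; check that repo's README for the supported profiles):
>>>>> 
>>>>> git clone https://github.com/chiastic-security/phoenix-for-cloudera
>>>>> cd phoenix-for-cloudera
>>>>> mvn clean package -DskipTests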
>>>>> 
>>>>> Regards
>>>>> Ravi .
>>>>> 
>>>>> On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>> Hi Pierre,
>>>>> 
>>>>> I found this article about how Cloudera’s version of HBase is very different from Apache HBase, so Phoenix must be compiled using Cloudera’s repo and versions. But I’m not having any success with it.
>>>>> 
>>>>> http://stackoverflow.com/questions/31849454/using-phoenix-with-cloudera-hbase-installed-from-repo
>>>>> 
>>>>> There’s also a Chinese site that does the same thing.
>>>>> 
>>>>> https://www.zybuluo.com/xtccc/note/205739
>>>>> 
>>>>> I keep getting errors like the ones below.
>>>>> 
>>>>> [ERROR] /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29] cannot find symbol
>>>>> [ERROR] symbol:   class Region
>>>>> [ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger
>>>>> …
>>>>> 
>>>>> Have you tried this also?
>>>>> 
>>>>> As a last resort, we will have to abandon Cloudera’s HBase for Apache’s HBase.
>>>>> 
>>>>> Thanks,
>>>>> Ben
>>>>> 
>>>>> 
>>>>>> On Feb 8, 2016, at 11:04 PM, pierre lacave <pierre@lacave.me> wrote:
>>>>>> 
>>>>>> Haven't met that one.
>>>>>> 
>>>>>> According to SPARK-1867, the real issue is hidden.
>>>>>> 
>>>>>> I'd proceed by elimination; maybe try local[*] mode first.
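>>>>>> 
>>>>>> e.g. something like this (the jar path is just an example; in local mode the executors share the driver JVM, so the driver classpath is enough):
>>>>>> 
>>>>>> spark-shell --master "local[*]" --driver-class-path /path/to/phoenix-4.7.0-HBase-1.1-client-spark.jar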
>>>>>> 
>>>>>> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867
>>>>>> On Tue, 9 Feb 2016, 04:58 Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>>> Pierre,
>>>>>> 
>>>>>> I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But now I get this error:
>>>>>> 
>>>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com): java.lang.IllegalStateException: unread block data
>>>>>> 
>>>>>> It happens when I do:
>>>>>> 
>>>>>> df.show()
>>>>>> 
>>>>>> Getting closer…
>>>>>> 
>>>>>> Thanks,
>>>>>> Ben
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Feb 8, 2016, at 2:57 PM, pierre lacave <pierre@lacave.me> wrote:
>>>>>>> 
>>>>>>> This is the wrong client jar; try the one named phoenix-4.7.0-HBase-1.1-client-spark.jar.
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, 8 Feb 2016, 22:29 Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>>>> Hi Josh,
>>>>>>> 
>>>>>>> I tried again by putting the settings in spark-defaults.conf.
>>>>>>> 
>>>>>>> spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>>> spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
>>>>>>> 
>>>>>>> I still get the same error using the code below.
>>>>>>> 
>>>>>>> import org.apache.phoenix.spark._
>>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>>> 
>>>>>>> Can you tell me what else you’re doing?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Ben
>>>>>>> 
>>>>>>> 
>>>>>>>> On Feb 8, 2016, at 1:44 PM, Josh Mahonin <jmahonin@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi Ben,
>>>>>>>> 
>>>>>>>> I'm not sure about the format of those command line options you're passing. I've had success with spark-shell just by setting the 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options on the Spark config, as per the docs [1].
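>>>>>>>> 
>>>>>>>> For example, in conf/spark-defaults.conf (the jar path below is just a placeholder for wherever yours lives):
>>>>>>>> 
>>>>>>>> spark.driver.extraClassPath   /path/to/phoenix-4.7.0-HBase-1.1-client-spark.jar
>>>>>>>> spark.executor.extraClassPath /path/to/phoenix-4.7.0-HBase-1.1-client-spark.jar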
>>>>>>>> 
>>>>>>>> I'm not sure if there's anything special needed for CDH or not though. I also have a Docker image I've been toying with which has a working Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be a useful reference for you as well [2].
>>>>>>>> 
>>>>>>>> Good luck,
>>>>>>>> 
>>>>>>>> Josh
>>>>>>>> 
>>>>>>>> [1] https://phoenix.apache.org/phoenix_spark.html
>>>>>>>> [2] https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark
>>>>>>>> 
>>>>>>>> On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>>>>> Hi Pierre,
>>>>>>>> 
>>>>>>>> I tried running this in spark-shell with Spark 1.6.0:
>>>>>>>> 
>>>>>>>> spark-shell --master yarn-client --driver-class-path /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar"
>>>>>>>> 
>>>>>>>> The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.
>>>>>>>> 
>>>>>>>> When I get to the line:
>>>>>>>> 
>>>>>>>> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
>>>>>>>> 
>>>>>>>> I get this error:
>>>>>>>> 
>>>>>>>> java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
>>>>>>>> 
>>>>>>>> Any ideas?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Ben
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Feb 5, 2016, at 1:36 PM, pierre lacave <pierre@lacave.me> wrote:
>>>>>>>>> 
>>>>>>>>> I don't know when the full release will be; RC1 just got pulled, and RC2 is expected soon.
>>>>>>>>> 
>>>>>>>>> You can find them here:
>>>>>>>>> 
>>>>>>>>> https://dist.apache.org/repos/dist/dev/phoenix/
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> There is a new phoenix-4.7.0-HBase-1.1-client-spark.jar; that is all you need on the Spark classpath.
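>>>>>>>>> 
>>>>>>>>> e.g. (adjust the path to wherever you unpack the tarball):
>>>>>>>>> 
>>>>>>>>> spark-shell --conf spark.driver.extraClassPath=/path/to/phoenix-4.7.0-HBase-1.1-client-spark.jar --conf spark.executor.extraClassPath=/path/to/phoenix-4.7.0-HBase-1.1-client-spark.jar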
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Pierre Lacave
>>>>>>>>> 171 Skellig House, Custom House, Lower Mayor Street, Dublin 1, Ireland
>>>>>>>>> Phone: +353879128708
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>>>>>> Hi Pierre,
>>>>>>>>> 
>>>>>>>>> When will I be able to download this version?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Ben
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Friday, February 5, 2016, pierre lacave <pierre@lacave.me> wrote:
>>>>>>>>> This was addressed in Phoenix 4.7 (currently in RC):
>>>>>>>>> https://issues.apache.org/jira/browse/PHOENIX-2503
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Pierre Lacave
>>>>>>>>> 171 Skellig House, Custom House, Lower Mayor Street, Dublin 1, Ireland
>>>>>>>>> Phone: +353879128708
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
>>>>>>>>> I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 1.6. When I try to launch spark-shell, I get:
>>>>>>>>> 
>>>>>>>>>         java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>>>>>>>> 
>>>>>>>>> I continue on and run the example code. When I get to the line below:
>>>>>>>>> 
>>>>>>>>>         val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))
>>>>>>>>> 
>>>>>>>>> I get this error:
>>>>>>>>> 
>>>>>>>>>         java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;
>>>>>>>>> 
>>>>>>>>> Can someone help?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Ben
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 

