Hi Ben,

Can you describe in more detail what your environment is? Are you using stock installs of HBase, Spark and Phoenix? Are you using the hadoop2.4 pre-built Spark distribution as per the documentation [1]?

The unread block data error is commonly traced back to this issue [2], which indicates some sort of version mismatch.
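
One rough way to look for such a mismatch is to ask the JVM which jar a suspect class was actually loaded from. The sketch below is an illustration only, not something taken from the linked issue; JarLocator and locationOf are made-up names, and the class names in main are placeholders.

```scala
// Diagnostic sketch: report which jar (or directory) a class was loaded from.
// JarLocator and locationOf are hypothetical names used for illustration only.
object JarLocator {
  def locationOf(className: String): String = {
    val cls = Class.forName(className)
    // getCodeSource is null for classes loaded by the JVM's bootstrap loader.
    val source = Option(cls.getProtectionDomain.getCodeSource)
    source
      .flatMap(s => Option(s.getLocation))
      .map(_.toString)
      .getOrElse("(bootstrap or unknown)")
  }

  def main(args: Array[String]): Unit = {
    // Placeholder class names; on a real cluster you might pass something like
    // org.apache.hadoop.hbase.client.Delete or org.apache.spark.rdd.RDD instead.
    val names = if (args.nonEmpty) args.toSeq else Seq("java.lang.String", "scala.Option")
    names.foreach(n => println(s"$n -> ${locationOf(n)}"))
  }
}
```

Pasted into spark-shell, calling JarLocator.locationOf with a class name from the stack trace should show which jar supplied that class on the driver. If it is not the jar you expected, the classpath is a likely suspect.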

Thanks,

Josh

[1] https://phoenix.apache.org/phoenix_spark.html
[2] https://issues.apache.org/jira/browse/SPARK-1867

On Fri, Feb 19, 2016 at 2:18 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
Hi Josh,

When I run the following code in spark-shell for spark 1.6:

import org.apache.phoenix.spark._
val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))
df.select(df("ID")).show()

I get this error:

java.lang.IllegalStateException: unread block data

Thanks,
Ben


On Feb 19, 2016, at 11:12 AM, Josh Mahonin <jmahonin@gmail.com> wrote:

What specifically doesn't work for you?

I have a Docker image that I used for some basic testing, and I haven't run into any problems:
https://github.com/jmahonin/docker-phoenix/tree/phoenix_spark

On Fri, Feb 19, 2016 at 12:40 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
All,

Thanks for the help. I have switched out Cloudera’s HBase 1.0.0 with the current Apache HBase 1.1.3. Also, I installed Phoenix 4.7.0, and everything works fine except for the Phoenix Spark Plugin. I wonder if it’s a version incompatibility issue with Spark 1.6. Has anyone tried compiling 4.7.0 using Spark 1.6?

Thanks,
Ben

On Feb 12, 2016, at 6:33 AM, Benjamin Kim <bbuild11@gmail.com> wrote:

Anyone know when Phoenix 4.7 will be officially released? And what Cloudera distribution versions will it be compatible with?

Thanks,
Ben

On Feb 10, 2016, at 11:03 AM, Benjamin Kim <bbuild11@gmail.com> wrote:

Hi Pierre,

I am getting this error now.

Error: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: SYSTEM.CATALOG,,1453397732623.8af7b44f3d7609eb301ad98641ff2611.: org.apache.hadoop.hbase.client.Delete.setAttribute(Ljava/lang/String;[B)Lorg/apache/hadoop/hbase/client/Delete;

I even tried to use sqlline.py to do some queries too. It resulted in the same error. I followed the installation instructions. Is there something missing?

Thanks,
Ben


On Feb 9, 2016, at 10:20 AM, Ravi Kiran <maghamravikiran@gmail.com> wrote:

Hi Pierre,

  Try your luck building the artifacts from https://github.com/chiastic-security/phoenix-for-cloudera. Hopefully it helps.

Regards
Ravi .

On Tue, Feb 9, 2016 at 10:04 AM, Benjamin Kim <bbuild11@gmail.com> wrote:
Hi Pierre,

I found this article about how Cloudera’s version of HBase is very different from Apache HBase, so Phoenix must be compiled using Cloudera’s repo and versions. But I’m not having any success with it.


There’s also a Chinese site that does the same thing.


I keep getting errors like the ones below.

[ERROR] /opt/tools/phoenix/phoenix-core/src/main/java/org/apache/hadoop/hbase/regionserver/LocalIndexMerger.java:[110,29] cannot find symbol
[ERROR] symbol:   class Region
[ERROR] location: class org.apache.hadoop.hbase.regionserver.LocalIndexMerger

Have you tried this also?

As a last resort, we will have to abandon Cloudera’s HBase for Apache’s HBase.

Thanks,
Ben


On Feb 8, 2016, at 11:04 PM, pierre lacave <pierre@lacave.me> wrote:

Haven't met that one.

According to SPARK-1867, the real issue is hidden.

I'd proceed by elimination; maybe try local[*] mode first.

https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-1867


On Tue, 9 Feb 2016, 04:58 Benjamin Kim <bbuild11@gmail.com> wrote:
Pierre,

I got it to work using phoenix-4.7.0-HBase-1.0-client-spark.jar. But, now, I get this error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, prod-dc1-datanode151.pdc1i.gradientx.com): java.lang.IllegalStateException: unread block data

It happens when I do:

df.show()

Getting closer…

Thanks,
Ben



On Feb 8, 2016, at 2:57 PM, pierre lacave <pierre@lacave.me> wrote:

This is the wrong client jar; try the one named phoenix-4.7.0-HBase-1.1-client-spark.jar.
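
The jar names quoted in this thread differ in two places: the HBase line they target and whether they carry the -spark suffix. As a minimal sketch of how to spot the difference, the snippet below splits a file name into those parts. The pattern is inferred from the names in this thread and is an assumption, not an official naming rule; PhoenixJar, PhoenixJarName, parse and suitsSpark are hypothetical names.

```scala
// Sketch: split a Phoenix client jar name into its parts so a mismatch is easy to spot.
// The pattern is inferred from the jar names quoted in this thread (an assumption).
final case class PhoenixJar(phoenixVersion: String, hbaseLine: String, sparkBundle: Boolean)

object PhoenixJarName {
  private val Pattern =
    """phoenix-(\d+(?:\.\d+)*)-HBase-(\d+\.\d+)-client(-spark)?\.jar""".r

  // Returns None when the file name does not follow the assumed pattern.
  def parse(fileName: String): Option[PhoenixJar] = fileName match {
    // The optional "-spark" group is null when it is absent from the name.
    case Pattern(phoenix, hbase, spark) => Some(PhoenixJar(phoenix, hbase, spark != null))
    case _                              => None
  }

  // True when the jar is the Spark bundle and targets the cluster's HBase major.minor line.
  def suitsSpark(fileName: String, clusterHBase: String): Boolean =
    parse(fileName).exists { jar =>
      jar.sparkBundle && clusterHBase.startsWith(jar.hbaseLine + ".")
    }
}
```

For example, PhoenixJarName.suitsSpark("phoenix-4.7.0-HBase-1.0-client.jar", "1.0.0-cdh5.4.8") is false because the name lacks the -spark suffix.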


On Mon, 8 Feb 2016, 22:29 Benjamin Kim <bbuild11@gmail.com> wrote:
Hi Josh,

I tried again by putting the settings in spark-defaults.conf.

spark.driver.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar
spark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar

I still get the same error using the code below.

import org.apache.phoenix.spark._
val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))

Can you tell me what else you’re doing?

Thanks,
Ben


On Feb 8, 2016, at 1:44 PM, Josh Mahonin <jmahonin@gmail.com> wrote:

Hi Ben,

I'm not sure about the format of those command line options you're passing. I've had success with spark-shell just by setting the 'spark.executor.extraClassPath' and 'spark.driver.extraClassPath' options on the spark config, as per the docs [1].

I'm not sure if there's anything special needed for CDH or not though. I also have a docker image I've been toying with which has a working Spark/Phoenix setup using the Phoenix 4.7.0 RC and Spark 1.6.0. It might be a useful reference for you as well [2].

Good luck,


On Mon, Feb 8, 2016 at 4:29 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
Hi Pierre,

I tried to run in spark-shell using spark 1.6.0 by running this:

spark-shell --master yarn-client --driver-class-path /opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar --driver-java-options "-Dspark.executor.extraClassPath=/opt/tools/phoenix/phoenix-4.7.0-HBase-1.0-client.jar"

The version of HBase is the one in CDH5.4.8, which is 1.0.0-cdh5.4.8.

When I get to the line:

val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zk1,zk2,zk3:2181"))

I get this error:

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$

Any ideas?

Thanks,
Ben


On Feb 5, 2016, at 1:36 PM, pierre lacave <pierre@lacave.me> wrote:

I don't know when the full release will be; RC1 just got pulled, and RC2 is expected soon.

You can find them here:

https://dist.apache.org/repos/dist/dev/phoenix/


There is a new phoenix-4.7.0-HBase-1.1-client-spark.jar, which is all you need to have on the Spark classpath.


Pierre Lacave
171 Skellig House, Custom House, Lower Mayor street, Dublin 1, Ireland
Phone :       +353879128708

On Fri, Feb 5, 2016 at 9:28 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
Hi Pierre,

When will I be able to download this version?

Thanks,
Ben


On Friday, February 5, 2016, pierre lacave <pierre@lacave.me> wrote:
This was addressed in Phoenix 4.7 (currently in RC) 

On Fri, Feb 5, 2016 at 6:17 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
I cannot get this plugin to work in CDH 5.4.8 using Phoenix 4.5.2 and Spark 1.6. When I try to launch spark-shell, I get:

        java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

I continue on and run the example code. When I get to the line below:

        val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TEST.MY_TEST", "zkUrl" -> "zookeeper1,zookeeper2,zookeeper3:2181"))

I get this error:

        java.lang.NoSuchMethodError: com.fasterxml.jackson.module.scala.deser.BigDecimalDeserializer$.handledType()Ljava/lang/Class;

Can someone help?

Thanks,
Ben