phoenix-user mailing list archives

From "dalin.qin" <dalin...@gmail.com>
Subject Re: phoenix-spark plugin not working for Spark 2.0
Date Mon, 26 Sep 2016 20:22:00 GMT
Hi Josh,

Below is the link. This is the first time in my life creating a JIRA, so I'm
not sure whether the link is a correct "JIRA ticket".
(By the way, I love the number 3333; I'll remember it.)

https://issues.apache.org/jira/browse/PHOENIX-3333

Thanks
Dalin
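
Until the phoenix-spark connector catches up with Spark 2.0 (tracked in PHOENIX-3333 above), one possible stopgap is to read the same table through Spark's generic JDBC data source using the Phoenix JDBC driver. This is only a sketch, not a tested recipe: `phoenix_jdbc_options` is a hypothetical helper, and the table name and ZooKeeper quorum are simply the ones quoted in this thread.

```python
# Sketch of a possible workaround while phoenix-spark does not support
# Spark 2.0: go through Spark's built-in JDBC source with the Phoenix
# JDBC driver. phoenix_jdbc_options() is a hypothetical helper; the
# table name and zkUrl come from the thread above.

def phoenix_jdbc_options(table, zk_url):
    """Assemble options for spark.read.format('jdbc') against Phoenix."""
    return {
        "url": "jdbc:phoenix:" + zk_url,   # Phoenix JDBC connection URL
        "dbtable": table,                  # table to expose as a DataFrame
        "driver": "org.apache.phoenix.jdbc.PhoenixDriver",
    }

opts = phoenix_jdbc_options("TABLE1", "namenode:2181:/hbase-unsecure")
print(opts["url"])  # jdbc:phoenix:namenode:2181:/hbase-unsecure

# On a live cluster, with the phoenix-client jar on the Spark classpath,
# the read would then look like:
#   df = spark.read.format("jdbc").options(**opts).load()
#   df.show()
```

The JDBC route loses the connector's parallel, per-region scans, so treat it as a temporary fallback rather than a replacement.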

On Mon, Sep 26, 2016 at 3:54 PM, Josh Mahonin <jmahonin@gmail.com> wrote:

> Hi Dalin,
>
> It looks like Spark may have gone and broken their API again for Spark
> 2.0. Could you file a JIRA ticket please?
>
> Thanks,
>
> Josh
>
> On Mon, Sep 26, 2016 at 1:17 PM, dalin.qin <dalinqin@gmail.com> wrote:
>
>> Hi, I'm running some tests with Spark 2.0 together with Phoenix 4.8. My
>> environment is HDP 2.5; I installed Phoenix 4.8 myself.
>>
>> I got everything working perfectly under spark 1.6.2
>>
>> >>> df = sqlContext.read \
>> ...   .format("org.apache.phoenix.spark") \
>> ...   .option("table", "TABLE1") \
>> ...   .option("zkUrl", "namenode:2181:/hbase-unsecure") \
>> ...   .load()
>>
>> >>> df.show()
>>
>> +---+----------+
>> | ID|      COL1|
>> +---+----------+
>> |  1|test_row_1|
>> |  2|test_row_2|
>> +---+----------+
>>
>>
>> But I get the error below (java.lang.NoClassDefFoundError for
>> org/apache/spark/sql/DataFrame) when loading data from a Phoenix table
>> using Spark 2.0 (I've made sure the necessary jar files are on the Spark
>> classpath). I checked, and there is no such class shipped with Phoenix
>> 4.8. Can somebody check and update
>> https://phoenix.apache.org/phoenix_spark.html for Spark 2.0 usage?
>>
>>
>> In [1]: df = sqlContext.read \
>>    ...:   .format("org.apache.phoenix.spark") \
>>    ...:   .option("table", "TABLE1") \
>>    ...:   .option("zkUrl", "namenode:2181:/hbase-unsecure") \
>>    ...:   .load()
>> ---------------------------------------------------------------------------
>> Py4JJavaError                             Traceback (most recent call last)
>> <ipython-input-1-e5dfb7bbb28b> in <module>()
>> ----> 1 df = sqlContext.read   .format("org.apache.phoenix.spark")   .option("table", "TABLE1")   .option("zkUrl", "namenode:2181:/hbase-unsecure")   .load()
>>
>> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/readwriter.pyc in load(self, path, format, schema, **options)
>>     151             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
>>     152         else:
>> --> 153             return self._df(self._jreader.load())
>>     154
>>     155     @since(1.4)
>>
>> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
>>     931         answer = self.gateway_client.send_command(command)
>>     932         return_value = get_return_value(
>> --> 933             answer, self.gateway_client, self.target_id, self.name)
>>     934
>>     935         for temp_arg in temp_args:
>>
>> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/utils.pyc in deco(*a, **kw)
>>      61     def deco(*a, **kw):
>>      62         try:
>> ---> 63             return f(*a, **kw)
>>      64         except py4j.protocol.Py4JJavaError as e:
>>      65             s = e.java_exception.toString()
>>
>> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>>     310                 raise Py4JJavaError(
>>     311                     "An error occurred while calling {0}{1}{2}.\n".
>> --> 312                     format(target_id, ".", name), value)
>>     313             else:
>>     314                 raise Py4JError(
>>
>> Py4JJavaError: An error occurred while calling o43.load.
>> : java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
>>         at java.lang.Class.getDeclaredMethods0(Native Method)
>>         at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>>         at java.lang.Class.getDeclaredMethod(Class.java:2128)
>>         at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
>>         at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
>>         at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
>>         at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
>>         at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
>>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
>>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>>         at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>>         at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
>>         at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
>>         at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
>>         at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>>         at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>>         at org.apache.spark.SparkContext.clean(SparkContext.scala:2037)
>>         at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:366)
>>         at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:365)
>>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
>>         at org.apache.spark.rdd.RDD.map(RDD.scala:365)
>>         at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:119)
>>         at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:59)
>>         at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:40)
>>         at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
>>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
>>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:498)
>>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
>>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>>         at py4j.Gateway.invoke(Gateway.java:280)
>>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>>         at py4j.GatewayConnection.run(GatewayConnection.java:211)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>         ... 45 more
>>
>>
>>
>
