phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: [HELP:]Save Spark Dataframe in Phoenix Table
Date Sun, 10 Apr 2016 18:45:56 GMT
Hi Divya,

No, there is a separate JAR that would look like
'phoenix-4.4.0.XXX-client-spark.jar'. If you download a binary release of
Phoenix, or compile the latest version yourself, you will be able to see
and use it. It does not come with the HDP 2.3.4 platform, at least last I
checked.
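For reference, once a client-spark JAR is available, one common way to put it on both the driver and executor classpaths looks something like the sketch below. The install path and exact JAR name here are assumptions and will differ per installation and Phoenix version:

```shell
# Sketch only: path and JAR name are assumptions; adjust to your install.
PHOENIX_SPARK_JAR=/opt/phoenix/phoenix-4.4.0-HBase-1.1-client-spark.jar

spark-shell \
  --jars "$PHOENIX_SPARK_JAR" \
  --driver-class-path "$PHOENIX_SPARK_JAR" \
  --conf "spark.executor.extraClassPath=$PHOENIX_SPARK_JAR"
```

In client mode the driver JVM is already running when Spark properties are read, so `--driver-class-path` is the reliable way to get the JAR onto the driver, while `spark.executor.extraClassPath` covers the executors.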

Regards,

Josh

On Sat, Apr 9, 2016 at 2:24 PM, Divya Gehlot <divya.htconex@gmail.com>
wrote:

> Hi Josh,
> Thank you very much for your help.
> I can see there is phoenix-spark-4.4.0.2.3.4.0-3485.jar in my
> phoenix/lib.
> Please confirm: is the above JAR the one you are talking about?
>
> Thanks,
> Divya
>
> On 9 April 2016 at 23:01, Josh Mahonin <jmahonin@gmail.com> wrote:
>
>> Hi Divya,
>>
>> You don't have the phoenix client-spark JAR in your classpath, which is
>> required for the phoenix-spark integration to work (as per the
>> documentation).
>>
>> As well, you aren't using the vanilla Apache project that this mailing
>> list supports, but are using a vendor packaged platform (Hortonworks).
>> Since they maintain their own patches and forks to the upstream Apache
>> versions, in general you should opt for filing support tickets with them
>> first. In this particular case, HDP 2.3.4 doesn't actually provide the
>> necessary phoenix client-spark JAR by default, so your options are limited
>> here. Again, I recommend filing a support ticket with Hortonworks.
>>
>> Regards,
>>
>> Josh
>>
>> On Sat, Apr 9, 2016 at 9:11 AM, Divya Gehlot <divya.htconex@gmail.com>
>> wrote:
>>
>>> Hi,
>>> The code I am using to connect to Phoenix for writing:
>>>
>>> def writeToTable(df: DataFrame, dbtable: String) = {
>>>   val phx_properties = collection.immutable.Map[String, String](
>>>     "zkUrl" -> "localhost:2181:/hbase-unsecure",
>>>     "table" -> dbtable)
>>>   df.write.format("org.apache.phoenix.spark").mode(SaveMode.Overwrite)
>>>     .options(phx_properties).saveAsTable(dbtable)
>>> }
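>>> (For comparison, the write pattern shown in the Phoenix-Spark
>>> documentation, https://phoenix.apache.org/phoenix_spark.html, calls
>>> save() rather than saveAsTable(). A sketch only, with a placeholder
>>> zkUrl value:)

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Sketch of the documented phoenix-spark write path;
// the zkUrl value below is a placeholder.
def writeToPhoenix(df: DataFrame, dbtable: String): Unit = {
  df.write
    .format("org.apache.phoenix.spark")
    .mode(SaveMode.Overwrite)
    .options(Map("table" -> dbtable,
                 "zkUrl" -> "localhost:2181:/hbase-unsecure"))
    .save() // save(), not saveAsTable()
}
```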
>>>
>>> While submitting the Spark job:
>>>
>>> spark-shell --properties-file /TestDivya/Spark/Phoenix.properties
>>> --jars
>>> /usr/hdp/2.3.4.0-3485/hive/lib/hive-hbase-handler-1.2.1.2.3.4.0-3485.jar,/usr/hdp/2.3.4.0-3485/phoenix/lib/zookeeper.jar,/usr/hdp/2.3.4.0-3485/phoenix/lib/hbase-client.jar,/usr/hdp/2.3.4.0-3485/phoenix/lib/hbase-common.jar,/usr/hdp/2.3.4.0-3485/phoenix/lib/hbase-protocol.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/phoenix-server.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/phoenix-client-4.4.0.jar
>>> --driver-class-path
>>> /usr/hdp/2.3.4.0-3485/hbase/lib/phoenix-server.jar,/usr/hdp/2.3.4.0-3485/hbase/lib/phoenix-client-4.4.0.jar
>>> --packages com.databricks:spark-csv_2.10:1.4.0 --master yarn-client -i
>>> /TestDivya/Spark/WriteToPheonix.scala
>>>
>>>
>>> Getting the below error :
>>>
>>> org.apache.spark.sql.AnalysisException: org.apache.phoenix.spark.DefaultSource
>>> does not allow user-specified schemas.;
>>>
>>> Am I on the right track, or am I missing any properties?
>>>
>>> Because of this I am unable to proceed with Phoenix and will have to
>>> find alternate options.
>>> I would really appreciate the help.
>>>
>>>
>>>
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Divya Gehlot <divya.htconex@gmail.com>
>>> Date: 8 April 2016 at 19:54
>>> Subject: Re: [HELP:]Save Spark Dataframe in Phoenix Table
>>> To: Josh Mahonin <jmahonin@gmail.com>
>>>
>>>
>>> Hi Josh,
>>> I am doing it in the same manner as described in the Phoenix-Spark
>>> documentation.
>>> I am using the latest version of HDP, 2.3.4.
>>> If there were a version mismatch or a lack of Phoenix-Spark support, it
>>> should have thrown an error on read as well, but reads are working fine
>>> as expected.
>>> I will pass on the code snippets once I log on to my system.
>>> In the meantime, I would like to understand the zkUrl parameter. If I
>>> build it with HBaseConfiguration, passing the ZK quorum, znode and port,
>>> it throws an error: for example, in localhost:2181/hbase-unsecure the
>>> localhost gets replaced by the full quorum, like
>>> quorum1,quorum2:2181/hbase-unsecure
>>>
>>> I am just providing the IP address of my HBase master.
>>>
>>> I feel like I am not on the right track, so I asked for help:
>>> how do I connect to Phoenix through Spark on a Hadoop cluster?
>>> Thanks for the help.
>>> Cheers,
>>> Divya
>>> On Apr 8, 2016 7:06 PM, "Josh Mahonin" <jmahonin@gmail.com> wrote:
>>>
>>>> Hi Divya,
>>>>
>>>> That's strange. Are you able to post a snippet of your code to look at?
>>>> And are you sure that you're saving the dataframes as per the docs (
>>>> https://phoenix.apache.org/phoenix_spark.html)?
>>>>
>>>> Depending on your HDP version, it may or may not actually have
>>>> phoenix-spark support. Double-check that your Spark configuration is set
>>>> up with the right worker/driver classpath settings, and that the phoenix JARs
>>>> contain the necessary phoenix-spark classes
>>>> (e.g. org.apache.phoenix.spark.PhoenixRelation). If not, I suggest
>>>> following up with Hortonworks.
>>>>
>>>> Josh
>>>>
>>>>
>>>>
>>>> On Fri, Apr 8, 2016 at 1:22 AM, Divya Gehlot <divya.htconex@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I have a Hortonworks Hadoop cluster with the below configuration:
>>>>> Spark 1.5.2
>>>>> HBase 1.1.x
>>>>> Phoenix 4.4
>>>>>
>>>>> I am able to connect to Phoenix through a JDBC connection and read the
>>>>> Phoenix tables, but while writing the data back to a Phoenix table
>>>>> I get the below error:
>>>>>
>>>>> org.apache.spark.sql.AnalysisException:
>>>>> org.apache.phoenix.spark.DefaultSource does not allow user-specified
>>>>> schemas.;
>>>>>
>>>>> Can anybody help in resolving the above error, or suggest any other
>>>>> way of saving Spark DataFrames to Phoenix?
>>>>>
>>>>> I would really appreciate the help.
>>>>>
>>>>> Thanks,
>>>>> Divya
>>>>>
>>>>
>>>>
>>>
>>
>
