phoenix-user mailing list archives

From Josh Mahonin <jmaho...@gmail.com>
Subject Re: Configure Hbase Client when using Phoenix Datasource in Spark?
Date Sun, 24 Jan 2016 18:14:10 GMT
Hi Pierre,

You have a few options:

1) If your app ships an hbase-site.xml resource, or you specify one on the
Spark classpath, it should get picked up by the Phoenix client and used
for client-side settings.
2) Bypass the Spark SQLContext loader and invoke the more direct
'phoenixTableAsDataFrame' method, passing in your own Configuration object,
as described in the 'Load as a DataFrame directly using a Configuration
object' section of the docs [1]. A rough sketch follows this list.
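
For reference, a minimal sketch of option 2 might look like the following.
The ZooKeeper quorum, the column names, and the timeout values below are
placeholders for your environment, not a definitive recipe:

import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._

// Build a Configuration carrying the client-side HBase settings.
val conf = new Configuration()
conf.set("hbase.zookeeper.quorum", "zk1.example.com:2181")  // placeholder quorum
conf.set("hbase.rpc.timeout", "1200000")                    // client RPC timeout (ms)
conf.set("hbase.client.scanner.timeout.period", "1200000")  // scanner timeout (ms)

val sc = new SparkContext("local", "phoenix-example")
val sqlContext = new SQLContext(sc)

// Load two (placeholder) columns of the EVENT table as a DataFrame,
// using the Configuration above instead of the default one.
val df = sqlContext.phoenixTableAsDataFrame(
  "EVENT", Array("ID", "TRADER"), conf = conf
)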

It may also be a good idea for Phoenix to support passing specific tuning
parameters at runtime through the Spark DataFrame API's 'option()' method;
that isn't supported today, so feel free to file a JIRA ticket for it as an
enhancement.
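
As the snippet you quoted shows, the data source path currently only honors
'table' and 'zkUrl', so the following is just a sketch of how such an
enhancement might look; the 'hbase.rpc.timeout' option below is hypothetical
and is ignored by the current code:

// Assumes an existing sqlContext; the quorum address is a placeholder.
val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "EVENT")
  .option("zkUrl", "zk1.example.com:2181")
  .option("hbase.rpc.timeout", "1200000")  // hypothetical, not read today
  .load()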

Thanks!

Josh

[1] https://phoenix.apache.org/phoenix_spark.html


On Sat, Jan 23, 2016 at 3:19 PM, pierre lacave <pierre@lacave.me> wrote:

> Is it possible to set an HBase config when using Phoenix in Spark as a
> datasource?
>
> Had a peek at the source, and it looks like I can only set the table and
> the zookeeper quorum.
>
> What would be my options if I want to set "hbase.rpc.timeout" in the
> client?
>
> // Override 'RelationProvider.createRelation'; this enables DataFrame.load()
> override def createRelation(sqlContext: SQLContext,
>                             parameters: Map[String, String]): BaseRelation = {
>   verifyParameters(parameters)
>
>   new PhoenixRelation(
>     parameters("table"),
>     parameters("zkUrl")
>   )(sqlContext)
> }
>
>
> A bit of background: I am getting these exceptions when running queries in
> a Spark job:
>
> Sat Jan 23 20:13:57 GMT 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60303: row '�Ԑx~R�Traders' on table 'EVENT' at region=EVENT,,1453212975336.b28c683cea467d1a2fff4c495e39a3d1., hostname=dc-hadoop-slave2.local,17020,1453578819465, seqNum=16367961
>
>   at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:159)
>   at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:143)
>   at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>   at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:97)
>   at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
>   at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
>   at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36, exceptions:
> Sat Jan 23 20:13:57 GMT 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60303: row '�Ԑx~R�Traders' on table 'EVENT' at region=EVENT,,1453212975336.b28c683cea467d1a2fff4c495e39a3d1., hostname=dc-hadoop-slave2.local,17020,1453578819465, seqNum=16367961
>
>   at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>   at org.apache.phoenix.iterate.ScanningResultIterator.next(ScanningResultIterator.java:65)
>   at org.apache.phoenix.iterate.TableResultIterator.next(TableResultIterator.java:110)
>   at org.apache.phoenix.iterate.LookAheadResultIterator$1.advance(LookAheadResultIterator.java:47)
>   at org.apache.phoenix.iterate.LookAheadResultIterator.init(LookAheadResultIterator.java:59)
>   at org.apache.phoenix.iterate.LookAheadResultIterator.next(LookAheadResultIterator.java:65)
>   at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
>   at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:152)
>   ... 25 more
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
> Sat Jan 23 20:13:57 GMT 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60303: row '�Ԑx~R�Traders' on table 'EVENT' at region=EVENT,,1453212975336.b28c683cea467d1a2fff4c495e39a3d1., hostname=dc-hadoop-slave2.local,17020,1453578819465, seqNum=16367961
>
>   at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
>   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:195)
>   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>   at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
>   at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
>   at org.apache.phoenix.iterate.ScanningResultIterator.next(ScanningResultIterator.java:55)
>   ... 31 more
> Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=60303: row '�Ԑx~R�Traders' on table 'EVENT' at region=EVENT,,1453212975336.b28c683cea467d1a2fff4c495e39a3d1., hostname=dc-hadoop-slave2.local,17020,1453578819465, seqNum=16367961
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
>   at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
>   ... 3 more
> Caused by: java.io.IOException: Call to dc-hadoop-slave2.local/172.20.1.185:17020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=257, waitTime=60001, operationTimeout=60000 expired.
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1284)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1252)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
>   at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
>   at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:213)
>   at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:346)
>   at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:320)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>   ... 4 more
> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=257, waitTime=60001, operationTimeout=60000 expired.
>   at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
>   at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1226)
>   ... 13 more
>
>
> The HBase region servers have this config:
>
>   <property>
>     <name>hbase.client.scanner.timeout.period</name>
>     <value>1200000</value>
>   </property>
>   <property>
>     <name>hbase.rpc.timeout</name>
>     <value>1200000</value>
>   </property>
>   <property>
>     <name>phoenix.query.keepAliveMs</name>
>     <value>240000</value>
>   </property>
>
>
> Thanks
>
> Pierre Lacave
