phoenix-user mailing list archives

From Lukáš Lalinský <lalin...@gmail.com>
Subject DROP COLUMN timing out
Date Wed, 11 Nov 2015 15:08:07 GMT
When running "ALTER TABLE xxx DROP COLUMN yyy" on a table with about
6M rows (which I considered small enough), it always times out, and
I can't see how to get it to execute successfully even once.

I was initially getting internal Phoenix timeouts, but after setting
the following properties, the errors changed:

hbase.client.scanner.timeout.period=6000000
phoenix.query.timeoutMs=6000000
hbase.rpc.timeout=6000000
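(For context, these are client-side settings; a minimal sketch of how I have them in the client's hbase-site.xml, assuming nothing else overrides them:)

```xml
<!-- hbase-site.xml on the client running the ALTER TABLE -->
<property>
  <name>phoenix.query.timeoutMs</name>
  <value>6000000</value>
</property>
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>6000000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>6000000</value>
</property>
```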

Now it fails with errors like this:

Wed Nov 11 13:44:25 UTC 2015,
RpcRetryingCaller{globalStartTime=1447246894248, pause=100,
retries=35}, java.io.IOException: Call to XXX/XXX:16020 failed on
local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException:
Call id=1303, waitTime=60001, operationTimeout=60000 expired.
Wed Nov 11 13:45:45 UTC 2015,
RpcRetryingCaller{globalStartTime=1447246894248, pause=100,
retries=35}, java.io.IOException: Call to XXX/XXX:16020 failed on
local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException:
Call id=1341, waitTime=60001, operationTimeout=60000 expired.

at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
... 3 more
Caused by: java.io.IOException: Call to XXX/XXX:16020 failed on local
exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
id=1341, waitTime=60001, operationTimeout=60000 expired.
at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1232)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:369)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:343)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 4 more
Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
id=1341, waitTime=60001, operationTimeout=60000 expired.
at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1174)
... 14 more

While it's still running, I see log entries like this on the region servers:

2015-11-11 15:00:49,059 WARN
[B.defaultRpcServer.handler=9,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA
2015-11-11 15:00:49,259 WARN
[B.defaultRpcServer.handler=12,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA
2015-11-11 15:00:49,537 WARN
[B.defaultRpcServer.handler=9,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA
2015-11-11 15:00:49,766 WARN
[B.defaultRpcServer.handler=12,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA
2015-11-11 15:00:49,960 WARN
[B.defaultRpcServer.handler=9,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA
2015-11-11 15:00:50,212 WARN
[B.defaultRpcServer.handler=12,queue=0,port=16020]
coprocessor.UngroupedAggregateRegionObserver: Committing bactch of
1000 mutations for MEDIA

Any ideas how to solve this? I'd even be fine with just having a way
to remove the column from the Phoenix metadata while keeping the
values in HBase, but I don't see how to do that except by running
DROP COLUMN and waiting for it to time out.
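To sketch what I mean by a metadata-only removal (untested, and I don't know whether touching SYSTEM.CATALOG directly is safe or supported; table and column names are placeholders):

```sql
-- UNTESTED sketch: delete only the Phoenix metadata row for the column,
-- leaving the cell values in the underlying HBase table untouched.
-- Editing SYSTEM.CATALOG by hand is presumably unsupported and may
-- leave other metadata (e.g. the table's column count) inconsistent.
DELETE FROM SYSTEM.CATALOG
WHERE TABLE_NAME = 'XXX'
  AND COLUMN_NAME = 'YYY';
```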

Lukas
