phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: PhoenixHBaseStorage: java.lang.OutOfMemoryError
Date Mon, 12 May 2014 23:24:49 GMT
Have you tried reducing the batch size (phoenix.upsert.batch.size) further?
That's the correct knob to dial down: it reduces client-side buffering,
which should prevent the OOM you're seeing. Also, would multiple
PhoenixHBaseStorage invocations be running in parallel on the same client?
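
In case it helps, here's a minimal sketch of dialing the batch size down
through the Pig storer itself, based on the documented PhoenixHBaseStorage
usage (the table name and quorum host are taken from your log; the input
path, schema, and the value 100 are hypothetical placeholders, not a
recommendation):

  -- Hypothetical input; only the STORE line matters here.
  raw = LOAD '/data/flow_mef' USING PigStorage('\t');

  -- The second argument sets the commit batch size (rows per commit).
  -- Smaller batches mean smaller multi-put RPCs, and therefore smaller
  -- temporary direct buffers at socket-write time, which is where your
  -- trace shows the allocation failing (ByteBuffer.allocateDirect).
  STORE raw INTO 'hbase://FLOW_MEF' USING
      com.salesforce.phoenix.pig.PhoenixHBaseStorage('hiveapp1', '-batchSize 100');

The same knob should also be settable as phoenix.upsert.batch.size in the
client-side hbase-site.xml if you'd rather not touch the script.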

Thanks,
James


On Sun, May 11, 2014 at 9:54 AM, Russell Jurney <russell.jurney@gmail.com> wrote:

> We've had good luck loading data from HDFS into Phoenix using
> PhoenixHBaseStorage up until now. Now that we're pushing hundreds of
> megabytes of data at a time, we're seeing the error at the bottom of this
> post.
>
> I've tried adding heap space to the Pig process, up to 4GB. I've also
> tried reducing the batch size from 5,000 to 500. Neither adjustment fixes
> the problem. We're stuck. Anyone got any ideas?
>
> We're on Phoenix 2.2, on CDH 4.4.
>
> 2014-05-11 09:40:30,034 INFO com.salesforce.phoenix.pig.PhoenixPigConfiguration: Phoenix Upsert Statement: UPSERT INTO FLOW_MEF VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
> 2014-05-11 09:40:30,034 INFO com.salesforce.phoenix.pig.hadoop.PhoenixOutputFormat: Initialized Phoenix connection, autoCommit=false
> 2014-05-11 09:40:30,042 WARN org.apache.hadoop.conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
> 2014-05-11 09:40:30,044 WARN org.apache.hadoop.conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
> 2014-05-11 09:40:30,074 INFO org.apache.pig.data.SchemaTupleBackend: Key [pig.schematuple] was not set... will not generate code.
> 2014-05-11 09:40:30,131 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map: Aliases being processed per job phase (AliasName[line,offset]): M: flow_mef[7,11] C:  R:
> 2014-05-11 09:40:31,289 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hiveapp1:2181 sessionTimeout=180000 watcher=hconnection
> 2014-05-11 09:40:31,290 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 27998@hivecluster5.labs.lan
> 2014-05-11 09:40:31,291 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hiveapp1/10.10.30.200:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-05-11 09:40:31,292 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hiveapp1/10.10.30.200:2181, initiating session
> 2014-05-11 09:40:31,304 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hiveapp1/10.10.30.200:2181, sessionid = 0x345ba20ce0f7383, negotiated timeout = 60000
> 2014-05-11 09:40:31,446 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x345ba20ce0f7383
> 2014-05-11 09:40:31,462 INFO org.apache.zookeeper.ZooKeeper: Session: 0x345ba20ce0f7383 closed
> 2014-05-11 09:40:31,462 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2014-05-11 09:40:31,469 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hiveapp1:2181 sessionTimeout=180000 watcher=hconnection
> 2014-05-11 09:40:31,470 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 27998@hivecluster5.labs.lan
> 2014-05-11 09:40:31,471 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hiveapp1/10.10.30.200:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-05-11 09:40:31,472 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hiveapp1/10.10.30.200:2181, initiating session
> 2014-05-11 09:40:31,479 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hiveapp1/10.10.30.200:2181, sessionid = 0x345ba20ce0f7389, negotiated timeout = 60000
> 2014-05-11 09:40:31,629 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x345ba20ce0f7389
> 2014-05-11 09:40:31,755 INFO org.apache.zookeeper.ZooKeeper: Session: 0x345ba20ce0f7389 closed
> 2014-05-11 09:40:31,755 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2014-05-11 09:40:31,761 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hiveapp1:2181 sessionTimeout=180000 watcher=hconnection
> 2014-05-11 09:40:31,762 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 27998@hivecluster5.labs.lan
> 2014-05-11 09:40:31,762 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hiveapp1/10.10.30.200:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-05-11 09:40:31,763 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hiveapp1/10.10.30.200:2181, initiating session
> 2014-05-11 09:40:31,921 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hiveapp1/10.10.30.200:2181, sessionid = 0x345ba20ce0f738c, negotiated timeout = 60000
> 2014-05-11 09:40:32,053 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x345ba20ce0f738c
> 2014-05-11 09:40:32,220 INFO org.apache.zookeeper.ZooKeeper: Session: 0x345ba20ce0f738c closed
> 2014-05-11 09:40:32,221 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> 2014-05-11 09:43:11,127 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Failed all from region=FLOW_MEF,\x03,1399822720276.35656409db81bc4a45384e11cec0e45b., hostname=hiveapp2, port=60020
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.OutOfMemoryError
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1571)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1423)
>     at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:754)
>     at com.salesforce.phoenix.execute.MutationState.commit(MutationState.java:384)
>     at com.salesforce.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:249)
>     at com.salesforce.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:86)
>     at com.salesforce.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:51)
>     at com.salesforce.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:161)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:264)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.RuntimeException: java.lang.OutOfMemoryError
>     at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:216)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1407)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1395)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.OutOfMemoryError
>     at sun.misc.Unsafe.allocateMemory(Native Method)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>     at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:58)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>     at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:62)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:153)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:114)
>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
>     at java.io.DataOutputStream.write(DataOutputStream.java:107)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:625)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:981)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
>     at com.sun.proxy.$Proxy15.multi(Unknown Source)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1400)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1398)
>     at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
>     ... 6 more
> 2014-05-11 09:43:12,366 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Failed all from region=FLOW_MEF,\x09,1399822720277.54ea08cfff2e43e5186b26fec76f3030., hostname=hiveapp3, port=60020
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.OutOfMemoryError
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1571)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1423)
>     at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:754)
>     at com.salesforce.phoenix.execute.MutationState.commit(MutationState.java:384)
>     at com.salesforce.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:249)
>     at com.salesforce.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:86)
>     at com.salesforce.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:51)
>     at com.salesforce.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:161)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:264)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.RuntimeException: java.lang.OutOfMemoryError
>     at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:216)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1407)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1395)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.OutOfMemoryError
>     at sun.misc.Unsafe.allocateMemory(Native Method)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>     at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:58)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>     at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:62)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:153)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:114)
>     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
>     at java.io.DataOutputStream.write(DataOutputStream.java:107)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.sendParam(HBaseClient.java:625)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:981)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:86)
>     at com.sun.proxy.$Proxy15.multi(Unknown Source)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1400)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1398)
>     at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
>     ... 6 more
> 2014-05-11 09:43:13,409 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hiveapp1:2181 sessionTimeout=180000 watcher=hconnection
> 2014-05-11 09:43:13,411 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 27998@hivecluster5.labs.lan
> 2014-05-11 09:43:13,411 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hiveapp1/10.10.30.200:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-05-11 09:43:13,413 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hiveapp1/10.10.30.200:2181, initiating session
> 2014-05-11 09:43:13,547 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hiveapp1/10.10.30.200:2181, sessionid = 0x345ba20ce0f73b5, negotiated timeout = 60000
> 2014-05-11 09:43:13,678 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x345ba20ce0f73b5
> 2014-05-11 09:43:13,895 INFO org.apache.zookeeper.ZooKeeper: Session: 0x345ba20ce0f73b5 closed
> 2014-05-11 09:43:13,895 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>
> --
> Russell Jurney  twitter.com/rjurney  russell.jurney@gmail.com  datasyndrome.com
>
