phoenix-user mailing list archives

From John Leach <jlea...@gmail.com>
Subject Re: CsvBulkLoadTool with ~75GB file
Date Fri, 19 Aug 2016 13:32:23 GMT
Gabriel,

Do you guys provide pre-split mechanisms (sampling of import/query data, splitting policies,
etc.) or does the admin have to determine the split points?
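
For example, is the expectation that the admin declares split points or salt buckets up front
at CREATE TABLE time, roughly along these lines (the table and key column here are just
placeholders, not our actual schema)?

    CREATE TABLE IMPORT_TABLE (
        ID  BIGINT NOT NULL PRIMARY KEY,
        VAL VARCHAR
    ) SALT_BUCKETS = 48;

    -- or, with explicit split points on the row key
    CREATE TABLE IMPORT_TABLE (
        ID  BIGINT NOT NULL PRIMARY KEY,
        VAL VARCHAR
    ) SPLIT ON (100000000, 200000000, 300000000);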

I guess that raises the question of how you would do a basic ETL operation in Phoenix.

How would you do the following, going from 100 GB of data in the import table to 50 GB in the
aggregate table? (A rough sketch of what I have in mind follows the list.)

(1) Create import table.
(2) Import data into that table.
(3) Create an aggregate table.
(4) Insert data into aggregate table based on an aggregate of the imported table.
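
Something along these lines (the tables, paths, and aggregate below are simplified
stand-ins, not our actual TPC-H schema):

    -- (1) create the import table
    CREATE TABLE IMPORT_TABLE (
        ID  BIGINT NOT NULL PRIMARY KEY,
        CAT VARCHAR,
        AMT DECIMAL(12,2)
    );

    # (2) bulk import the CSV data into it (from the shell)
    hadoop jar phoenix-<version>-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table IMPORT_TABLE --input /data/import.csv

    -- (3) create the aggregate table
    CREATE TABLE AGG_TABLE (
        CAT   VARCHAR NOT NULL PRIMARY KEY,
        TOTAL DECIMAL(18,2)
    );

    -- (4) populate it from the import table
    UPSERT INTO AGG_TABLE
    SELECT CAT, SUM(AMT) FROM IMPORT_TABLE GROUP BY CAT;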

Here is what I am gathering from the conversation...

(1) Create Import Table.
(2) Perform a pre-split based on some information about the data or via some split mechanism
(If not, import via MapReduce does not scale).
(3) Run the MapReduce job to import the data (do the other import mechanisms not scale?)
(4) Compaction/Statistics Operation on Import Table (If not, query will not scale?)
(5) Create Aggregate Table
(6) Perform a pre-split based on some information about the Aggregate data or via some split
mechanism. (If not, insert will not scale?).
(7) Run the insert query.  If the import had to go through MapReduce to scale, I suspect the
insert needs to run as MapReduce as well -- or is there some other mechanism that would scale?
(8) Compaction/Statistics Operation on Aggregate Table
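
For steps (4) and (8), I am assuming that boils down to something like the following -- please
correct me if that is the wrong idea:

    -- collect guideposts so that aggregate queries parallelize
    UPDATE STATISTICS IMPORT_TABLE;

    # and trigger a major compaction from the hbase shell
    echo "major_compact 'IMPORT_TABLE'" | hbase shell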

I really appreciate all the support.  We are trying to run a Phoenix TPCH benchmark and are
struggling a bit to understand the process.

Regards,
John Leach

> On Aug 19, 2016, at 2:09 AM, Gabriel Reid <gabriel.reid@gmail.com> wrote:
> 
> Hi Aaron,
> 
> How many regions are there in the LINEITEM table? The fact that you
> needed to bump the
> hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily setting up to
> 48 suggests that the amount of data going into a single region of that
> table is probably pretty large.
> 
> Along the same line, I believe regions are the initial unit of
> parallelism (in the absence of statistics [1]) when running
> aggregate queries in Phoenix. This means that if you have oversized
> regions, you will have poor parallelism when running aggregate
> queries, which could lead to RPC timeouts.
> 
> From what I see in the log info that you provided, your count query
> started at 14:14:06, and errored out at 14:34:15, which appears to be
> in line with the 20 minute HBase RPC timeout. This appears to indicate
> that a scan over a single region is taking more than 20 minutes, which
> again looks to me to be an indicator of an oversized region. If
> possible, I would look into splitting up your LINEITEM table into more
> (smaller) regions, which should improve both import and query
> performance.
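> 
> Roughly speaking, that can be done from the hbase shell after the fact (just a sketch --
> you would want to pick split points that suit your data):
> 
>    split 'TPCH.LINEITEM'               # ask each region of the table to split at its midpoint
>    split 'TPCH.LINEITEM', '<row key>'  # or split at an explicit row key
> 
> or up front by declaring split points (SPLIT ON ...) when the table is created.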
> 
> - Gabriel
> 
> 1. http://phoenix.apache.org/update_statistics.html
> 
> On Thu, Aug 18, 2016 at 5:22 PM, Aaron Molitor
> <amolitor@splicemachine.com> wrote:
>> Gabriel,
>> 
>> Thanks for the help -- it's good to know that those params can be passed from the
>> command line and that the order is important.
>> 
>> I am trying to load the 100GB TPC-H data set and ultimately run the TPC-H queries.
>> All of the tables loaded relatively easily, except LINEITEM (the largest), which required me to
>> increase hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily to 48.  After that the file loaded.
>> 
>> This brings me to my next question though.  What settings do I need to change in
>> order to count the [LINEITEM] table? At this point I have changed:
>> - hbase.rpc.timeout set to 20 min
>> - phoenix.query.timeoutMs set to 60 min
>> 
>> I am still getting an error -- it appears to be an RPC timeout, and as I have mentioned,
>> I have already moved that setting to an uncomfortably high value.  Are there other settings
>> I should be changing rather than just the rpc.timeout?
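>> 
>> In hbase-site.xml terms, the two overrides above are the following -- and I am wondering
>> whether something like hbase.client.scanner.timeout.period needs to be raised in the same
>> way:
>> 
>>     <property>
>>       <name>hbase.rpc.timeout</name>
>>       <value>1200000</value>  <!-- 20 min -->
>>     </property>
>>     <property>
>>       <name>phoenix.query.timeoutMs</name>
>>       <value>3600000</value>  <!-- 60 min -->
>>     </property>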
>> 
>> For reference, here's the full sqlline interaction, including the error:
>> ################################################################################
>> Latest phoenix error:
>> [splice@stl-colo-srv073 ~]$ /opt/phoenix/default/bin/sqlline.py $(hostname):2181:/hbase-unsecure
>> Setting property: [incremental, false]
>> Setting property: [isolation, TRANSACTION_READ_COMMITTED]
>> issuing: !connect jdbc:phoenix:stl-colo-srv073.splicemachine.colo:2181:/hbase-unsecure
none none org.apache.phoenix.jdbc.PhoenixDriver
>> Connecting to jdbc:phoenix:stl-colo-srv073.splicemachine.colo:2181:/hbase-unsecure
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/opt/phoenix/apache-phoenix-4.8.0-HBase-1.1-bin/phoenix-4.8.0-HBase-1.1-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/usr/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> 16/08/18 14:14:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library
for your platform... using builtin-java classes where applicable
>> 16/08/18 14:14:08 WARN shortcircuit.DomainSocketFactory: The short-circuit local
reads feature cannot be used because libhadoop cannot be loaded.
>> Connected to: Phoenix (version 4.8)
>> Driver: PhoenixEmbeddedDriver (version 4.8)
>> Autocommit status: true
>> Transaction isolation: TRANSACTION_READ_COMMITTED
>> Building list of tables and columns for tab-completion (set fastconnect to true to
skip)...
>> 147/147 (100%) Done
>> Done
>> sqlline version 1.1.9
>> 0: jdbc:phoenix:stl-colo-srv073.splicemachine> select count(*) from TPCH.LINEITEM;
>> Error: org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36,
exceptions:
>> Thu Aug 18 14:34:15 UTC 2016, null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=1200310: row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17 (state=08000,code=101)
>> org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException:
Failed after attempts=36, exceptions:
>> Thu Aug 18 14:34:15 UTC 2016, null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=1200310: row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17
>> 
>>        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:774)
>>        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:720)
>>        at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:50)
>>        at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:97)
>>        at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
>>        at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
>>        at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
>>        at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:778)
>>        at sqlline.BufferedRows.<init>(BufferedRows.java:37)
>>        at sqlline.SqlLine.print(SqlLine.java:1649)
>>        at sqlline.Commands.execute(Commands.java:833)
>>        at sqlline.Commands.sql(Commands.java:732)
>>        at sqlline.SqlLine.dispatch(SqlLine.java:807)
>>        at sqlline.SqlLine.begin(SqlLine.java:681)
>>        at sqlline.SqlLine.start(SqlLine.java:398)
>>        at sqlline.SqlLine.main(SqlLine.java:292)
>> Caused by: java.util.concurrent.ExecutionException: org.apache.phoenix.exception.PhoenixIOException:
Failed after attempts=36, exceptions:
>> Thu Aug 18 14:34:15 UTC 2016, null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=1200310: row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17
>> 
>>        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>        at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:769)
>>        ... 15 more
>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36,
exceptions:
>> Thu Aug 18 14:34:15 UTC 2016, null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=1200310: row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17
>> 
>>        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>        at org.apache.phoenix.iterate.TableResultIterator.initScanner(TableResultIterator.java:174)
>>        at org.apache.phoenix.iterate.TableResultIterator.next(TableResultIterator.java:124)
>>        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:139)
>>        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:97)
>>        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:69)
>>        at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:92)
>>        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:114)
>>        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:106)
>>        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
attempts=36, exceptions:
>> Thu Aug 18 14:34:15 UTC 2016, null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=1200310: row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17
>> 
>>        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
>>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:199)
>>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>>        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>>        at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>>        at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>>        at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
>>        at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>>        at org.apache.phoenix.iterate.TableResultIterator.initScanner(TableResultIterator.java:170)
>>        ... 12 more
>> Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=1200310:
row '' on table 'TPCH.LINEITEM' at region=TPCH.LINEITEM,,1471407572920.656deb38db6555b3eaea71944fdfdbc9.,
hostname=stl-colo-srv076.splicemachine.colo,16020,1471495858713, seqNum=17
>>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
>>        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
>>        ... 3 more
>> Caused by: java.io.IOException: Call to stl-colo-srv076.splicemachine.colo/10.1.1.176:16020
failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27, waitTime=1200001,
operationTimeout=1200000 expired.
>>        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:278)
>>        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1239)
>>        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:217)
>>        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:318)
>>        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32831)
>>        at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:373)
>>        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:200)
>>        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:350)
>>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:324)
>>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>>        ... 4 more
>> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=27, waitTime=1200001,
operationTimeout=1200000 expired.
>>        at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
>>        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1213)
>>        ... 14 more
>> 0: jdbc:phoenix:stl-colo-srv073.splicemachine>
>> ################################################################################
>> 
>>> On Aug 18, 2016, at 02:15, Gabriel Reid <gabriel.reid@gmail.com> wrote:
>>> 
>>> Hi Aaron,
>>> 
>>> I'll answer your questions directly first, but please see the bottom
>>> part of this mail for important additional details.
>>> 
>>> You can specify the
>>> "hbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily" parameter
>>> (referenced from your StackOverflow link) on the command line of your
>>> CsvBulkLoadTool command -- my understanding is that this is a purely
>>> client-side parameter. You would provide it via -D as follows:
>>> 
>>>   hadoop jar phoenix-<version>-client.jar \
>>>       org.apache.phoenix.mapreduce.CsvBulkLoadTool \
>>>       -Dhbase.mapreduce.bulkload.max.hfiles.perRegion.perFamily=64 \
>>>       <other command-line parameters>
>>> 
>>> The important point in the above example is that config-based
>>> parameters specified with -D are given before the application-level
>>> parameters, and after the class name to be run.
>>> 
>>> From my read of the HBase code, in this context you can also specify
>>> the "hbase.hregion.max.filesize" parameter in the same way (in this
>>> context it's a client-side parameter).
>>> 
>>> As far as speeding things up, the main points to consider are:
>>> - ensure that compression is enabled for map-reduce jobs on your
>>> cluster -- particularly map-output (intermediate) compression - see
>>> https://datameer.zendesk.com/hc/en-us/articles/204258750-How-to-Use-Intermediate-and-Final-Output-Compression-MR1-YARN-
>>> for a good overview
>>> - check the ratio of map output records vs spilled records in the
>>> counters on the import job. If the spilled records are higher than map
>>> output records (e.g. twice as high or three times as high), then you
>>> will probably benefit from raising the mapreduce.task.io.sort.mb
>>> setting (see https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml)
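>>> 
>>> Both of those can be passed to the bulk load job via -D in the same way as above, for
>>> example (Snappy here assumes that codec is available on your cluster):
>>> 
>>>   hadoop jar phoenix-<version>-client.jar \
>>>       org.apache.phoenix.mapreduce.CsvBulkLoadTool \
>>>       -Dmapreduce.map.output.compress=true \
>>>       -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
>>>       -Dmapreduce.task.io.sort.mb=512 \
>>>       <other command-line parameters>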
>>> 
>>> Now those are the answers to your questions, but I'm curious about why
>>> you're getting more than 32 HFiles in a single column family of a
>>> single region. I assume that this means that you're loading large
>>> amounts of data into a small number of regions. This is probably not a
>>> good thing -- it may impact performance of HBase in general (because
>>> each region has such a large amount of data), and will also have a
>>> very negative impact on the running time of your import job (because
>>> part of the parallelism of the import job is determined by the number
>>> of regions being written to). I don't think you mentioned how many
>>> regions you have on your table that you're importing to, but
>>> increasing the number of regions will likely resolve several problems
>>> for you. Another reason to do this is the fact that HBase will likely
>>> start splitting your regions after this import due to their size.
>>> 
>>> - Gabriel
>>> 
>>> 
>>> On Thu, Aug 18, 2016 at 3:47 AM, Aaron Molitor
>>> <amolitor@splicemachine.com> wrote:
>>>> Hi all, I'm running the CsvBulkLoadTool trying to pull in some data.  The
>>>> MapReduce job appears to complete, and gives some promising information:
>>>> 
>>>> 
>>>> ################################################################################
>>>>       Phoenix MapReduce Import
>>>>               Upserts Done=600037902
>>>>       Shuffle Errors
>>>>               BAD_ID=0
>>>>               CONNECTION=0
>>>>               IO_ERROR=0
>>>>               WRONG_LENGTH=0
>>>>               WRONG_MAP=0
>>>>               WRONG_REDUCE=0
>>>>       File Input Format Counters
>>>>               Bytes Read=79657289180
>>>>       File Output Format Counters
>>>>               Bytes Written=176007436620
>>>> 16/08/17 20:37:04 INFO mapreduce.AbstractBulkLoadTool: Loading HFiles from
/tmp/66f905f4-3d62-45bf-85fe-c247f518355c
>>>> 16/08/17 20:37:04 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0xa24982f
connecting to ZooKeeper ensemble=stl-colo-srv073.splicemachine.colo:2181
>>>> 16/08/17 20:37:04 INFO zookeeper.ZooKeeper: Initiating client connection,
connectString=stl-colo-srv073.splicemachine.colo:2181 sessionTimeout=1200000 watcher=hconnection-0xa24982f0x0,
quorum=stl-colo-srv073.splicemachine.colo:2181, baseZNode=/hbase-unsecure
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Opening socket connection to
server stl-colo-srv073.splicemachine.colo/10.1.1.173:2181. Will not attempt to authenticate
using SASL (unknown error)
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Socket connection established
to stl-colo-srv073.splicemachine.colo/10.1.1.173:2181, initiating session
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Session establishment complete
on server stl-colo-srv073.splicemachine.colo/10.1.1.173:2181, sessionid = 0x15696476bf90484,
negotiated timeout = 40000
>>>> 16/08/17 20:37:04 INFO mapreduce.AbstractBulkLoadTool: Loading HFiles for
TPCH.LINEITEM from /tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM
>>>> 16/08/17 20:37:04 WARN mapreduce.LoadIncrementalHFiles: managed connection
cannot be used for bulkload. Creating unmanaged connection.
>>>> 16/08/17 20:37:04 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x456a0752
connecting to ZooKeeper ensemble=stl-colo-srv073.splicemachine.colo:2181
>>>> 16/08/17 20:37:04 INFO zookeeper.ZooKeeper: Initiating client connection,
connectString=stl-colo-srv073.splicemachine.colo:2181 sessionTimeout=1200000 watcher=hconnection-0x456a07520x0,
quorum=stl-colo-srv073.splicemachine.colo:2181, baseZNode=/hbase-unsecure
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Opening socket connection to
server stl-colo-srv073.splicemachine.colo/10.1.1.173:2181. Will not attempt to authenticate
using SASL (unknown error)
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Socket connection established
to stl-colo-srv073.splicemachine.colo/10.1.1.173:2181, initiating session
>>>> 16/08/17 20:37:04 INFO zookeeper.ClientCnxn: Session establishment complete
on server stl-colo-srv073.splicemachine.colo/10.1.1.173:2181, sessionid = 0x15696476bf90485,
negotiated timeout = 40000
>>>> 16/08/17 20:37:06 INFO hfile.CacheConfig: CacheConfig:disabled
>>>> ################################################################################
>>>> 
>>>> and eventually errors out with this exception.
>>>> 
>>>> ################################################################################
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/88b40cbbc4c841f99eae906af3b93cda
first=\x80\x00\x00\x00\x08\xB3\xE7\x84\x80\x00\x00\x04 last=\x80\x00\x00\x00\x09\x92\xAEg\x80\x00\x00\x03
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/de309e5c7b3841a6b4fd299ac8fa8728
first=\x80\x00\x00\x00\x15\xC1\x8Ee\x80\x00\x00\x01 last=\x80\x00\x00\x00\x16\xA0G\xA4\x80\x00\x00\x02
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/e7ed8bc150c9494b8c064a022b3609e0
first=\x80\x00\x00\x00\x09\x92\xAEg\x80\x00\x00\x04 last=\x80\x00\x00\x00\x0Aq\x85D\x80\x00\x00\x02
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/c35e01b66d85450c97da9bb21bfc650f
first=\x80\x00\x00\x00\x0F\xA9\xFED\x80\x00\x00\x04 last=\x80\x00\x00\x00\x10\x88\xD0$\x80\x00\x00\x03
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/b5904451d27d42f0bcb4c98a5b14f3e9
first=\x80\x00\x00\x00\x13%/\x83\x80\x00\x00\x01 last=\x80\x00\x00\x00\x14\x04\x08$\x80\x00\x00\x01
>>>> 16/08/17 20:37:07 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://stl-colo-srv073.splicemachine.colo:8020/tmp/66f905f4-3d62-45bf-85fe-c247f518355c/TPCH.LINEITEM/0/9d26e9a00e5149cabcb415c6bb429a34
first=\x80\x00\x00\x00\x06\xF6_\xE3\x80\x00\x00\x04 last=\x80\x00\x00\x00\x07\xD5 f\x80\x00\x00\x05
>>>> 16/08/17 20:37:07 ERROR mapreduce.LoadIncrementalHFiles: Trying to load more
than 32 hfiles to family 0 of region with start key
>>>> 16/08/17 20:37:07 INFO client.ConnectionManager$HConnectionImplementation:
Closing master protocol: MasterService
>>>> 16/08/17 20:37:07 INFO client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x15696476bf90485
>>>> 16/08/17 20:37:07 INFO zookeeper.ZooKeeper: Session: 0x15696476bf90485 closed
>>>> 16/08/17 20:37:07 INFO zookeeper.ClientCnxn: EventThread shut down
>>>> Exception in thread "main" java.io.IOException: Trying to load more than
32 hfiles to one family of one region
>>>>       at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:420)
>>>>       at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:314)
>>>>       at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
>>>>       at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
>>>>       at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
>>>>       at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
>>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>>>>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>>>>       at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:101)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>       at java.lang.reflect.Method.invoke(Method.java:606)
>>>>       at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>>>>       at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>>>> ################################################################################
>>>> 
>>>> A count of the table shows 0 rows:
>>>> 0: jdbc:phoenix:srv073> select count(*) from TPCH.LINEITEM;
>>>> +-----------+
>>>> | COUNT(1)  |
>>>> +-----------+
>>>> | 0         |
>>>> +-----------+
>>>> 
>>>> Some quick googling gives an hbase param that could be tweaked (http://stackoverflow.com/questions/24950393/trying-to-load-more-than-32-hfiles-to-one-family-of-one-region).
>>>> 
>>>> Main Questions:
>>>> - Will the CsvBulkLoadTool pick up these params, or will I need to put them
>>>>   in hbase-site.xml?
>>>> - Is there anything else I can tune to make this run quicker? It took 5 hours
>>>>   for it to fail with the error above.
>>>> 
>>>> This is a 9 node (8 RegionServer) cluster running HDP 2.4.2 and Phoenix 4.8.0-HBase-1.1
>>>> Ambari default settings except for:
>>>> - HBase RS heap size is set to 24GB
>>>> - hbase.rpc.timeout set to 20 min
>>>> - phoenix.query.timeoutMs set to 60 min
>>>> 
>>>> all nodes are Dell R420 with 2xE5-2430 v2 CPUs (24vCPU), 64GB RAM
>>>> 
>>>> 
>>>> 
>>>> 
>> 

