phoenix-user mailing list archives

From "Perko, Ralph J" <Ralph.Pe...@pnnl.gov>
Subject RegionTooBusyException
Date Thu, 06 Nov 2014 20:31:29 GMT
Hi, I am using a combination of Pig, Phoenix, and HBase to load data on a test cluster, and I keep running into an issue with larger, longer-running jobs (smaller jobs succeed). After the job has run for several hours, once the first set of mappers has finished and the second set begins, the job dies, with each mapper failing with a RegionTooBusyException. Could this be related to how my Phoenix tables are configured, is it an HBase configuration issue, or is it something else? Do you have any suggestions?
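
For what it's worth, my rough mental model of what PhoenixHBaseStorage does per mapper is paraphrased below as plain JDBC (the class name, placeholder arguments, and row parsing are mine for illustration, not the actual Phoenix source). With -batchSize 1000, each commit() would ship a burst of 1000 buffered upserts as one HBase batch, and the stack trace below does fail inside commit():

    // Sketch only, NOT the actual PhoenixRecordWriter source: buffer UPSERTs
    // on a Phoenix JDBC connection and commit every batchSize rows. Each
    // commit() goes to HBase as one batch, which appears to be what fails
    // with RegionTooBusyException when the target regions are blocked.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class UpsertBatchSketch {
        // zkQuorum and rows are illustrative placeholders
        static void load(String zkQuorum, Iterable<String[]> rows) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:" + zkQuorum)) {
                conn.setAutoCommit(false); // Phoenix buffers mutations client-side until commit()
                PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO t1_csv_data (timestamp, location, fileid, recnum) "
                        + "VALUES (?, ?, ?, ?)");
                int batchSize = 1000; // same value as -batchSize in the Pig storer below
                int count = 0;
                for (String[] r : rows) {
                    ps.setLong(1, Long.parseLong(r[0]));
                    ps.setString(2, r[1]);
                    ps.setString(3, r[2]);
                    ps.setInt(4, Integer.parseInt(r[3]));
                    ps.executeUpdate();    // queued locally, nothing sent yet
                    if (++count % batchSize == 0) {
                        conn.commit();     // flushes the batch to HBase; this is
                    }                      // where the CommitException surfaces
                }
                conn.commit();             // flush any final partial batch
            }
        }
    }

If that picture is right, every mapper is hitting the same regions with all-or-nothing bursts, which might explain why only the larger, longer-running jobs see the error.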

Thanks for the help,
Ralph


2014-11-05 23:08:31,573 INFO [main] org.apache.hadoop.hbase.client.AsyncProcess: #1, waiting
for 200 actions to finish
2014-11-05 23:08:33,729 WARN [phoenix-1-thread-34413] org.apache.hadoop.hbase.client.AsyncProcess:
#1, table=T1_CSV_DATA, primary, attempt=36/35 failed 200 ops, last exception: null on server1,60020,1415229553858,
tracking started Wed Nov 05 22:59:40 PST 2014; not retrying 200 - final failure
2014-11-05 23:08:33,736 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running
child : java.io.IOException: Exception while committing to database.
at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:79)
at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:41)
at org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:151)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.phoenix.execute.CommitException: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
Failed 200 actions: RegionTooBusyException: 200 times,
at org.apache.phoenix.execute.MutationState.commit(MutationState.java:418)
at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:356)
at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:76)
... 19 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 200
actions: RegionTooBusyException: 200 times,
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:207)
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:187)
at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.getErrors(AsyncProcess.java:1473)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:855)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:869)
at org.apache.phoenix.execute.MutationState.commit(MutationState.java:399)
... 21 more

2014-11-05 23:08:33,739 INFO [main] org.apache.hadoop.mapred.Task: Runnning cleanup for the
task
2014-11-05 23:08:33,773 INFO [Thread-11] org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x2497d0ab7e6007e

Data size:
75 CSV files compressed with bz2
17 GB compressed, 165 GB uncompressed

Time-series data; 6-node cluster with 5 region servers. Hadoop 2.5 (HDP 2.1.5), Phoenix 4.0,
HBase 0.98.

Phoenix Table def:

CREATE TABLE IF NOT EXISTS t1_csv_data (
    timestamp BIGINT NOT NULL,
    location VARCHAR NOT NULL,
    fileid VARCHAR NOT NULL,
    recnum INTEGER NOT NULL,
    field5 VARCHAR,
    ...
    field45 VARCHAR,
    CONSTRAINT pkey PRIMARY KEY (timestamp, location, fileid, recnum)
)
IMMUTABLE_ROWS=true, COMPRESSION='SNAPPY', SALT_BUCKETS=10;

-- indexes
CREATE INDEX t1_csv_data_f1_idx ON t1_csv_data(somefield1) COMPRESSION='SNAPPY';
CREATE INDEX t1_csv_data_f2_idx ON t1_csv_data(somefield2) COMPRESSION='SNAPPY';
CREATE INDEX t1_csv_data_f3_idx ON t1_csv_data(somefield3) COMPRESSION='SNAPPY';

Simple Pig script:

register $phoenix_jar;
register $udf_jar;

Z = load '$data' as (
    fileid,
    recnum,
    dtm:chararray,
    ...
    -- lots of other fields
);

D = foreach Z generate
    gov.pnnl.pig.TimeStringToPeriod(dtm, 'yyyyMMdd HH:mm:ss', 'yyyyMMddHHmmss'),
    location,
    fileid,
    recnum,
    ...
    -- lots of other fields
;

STORE D into 'hbase://$table_name'
    using org.apache.phoenix.pig.PhoenixHBaseStorage('$zookeeper', '-batchSize 1000');

