phoenix-user mailing list archives

From Michael Young <yomaiq...@gmail.com>
Subject Getting too many open files during table scan
Date Fri, 16 Jun 2017 18:31:15 GMT
We are running a 13-node HBase cluster.  One table uses 78 SALT_BUCKETS,
which seems to work reasonably well for both reads and writes.  This table
has 130 columns with a PK of 30 columns (a fairly wide table).
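
For reference, the DDL looks roughly like this (just a minimal sketch -- the
column names below are placeholders, and the real table has 130 columns and a
30-column primary key):

CREATE TABLE ATT.PRE_ENG_CONVERSION_OLAP (
    ORG_ID       VARCHAR NOT NULL,   -- placeholder leading PK column
    EVENT_DATE   DATE NOT NULL,      -- placeholder PK column
    EVENT_ID     VARCHAR NOT NULL,   -- placeholder PK column
    -- ... remaining PK and non-PK columns omitted ...
    METRIC_VALUE DECIMAL,
    CONSTRAINT pk PRIMARY KEY (ORG_ID, EVENT_DATE, EVENT_ID)
) SALT_BUCKETS = 78;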

However, after adding several new tables, we are seeing "too many open
files" errors when running a full table scan.


Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
        at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
        at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Too many open files
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.createTempFile(File.java:2024)
        at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
        at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
        at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
        at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
        at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
        at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
        at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
        at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
        at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
        ... 10 more


When running an explain plan:
explain select count(1) from MYBIGTABLE

+------------------------------------------------------------------------------------------------------------------+
|                                                        PLAN                                                       |
+------------------------------------------------------------------------------------------------------------------+
| CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP   |
|     ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                 |
|     SERVER FILTER BY FIRST KEY ONLY                                                                               |
|     SERVER AGGREGATE INTO SINGLE ROW                                                                              |
+------------------------------------------------------------------------------------------------------------------+

It has a lot of chunks.  Normally this query would at least return a result
after running for a few minutes.  With appropriate filters in the WHERE
clause, the queries run fine.
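
For example, something like the following comes back fine (again, the column
names here are just placeholders for our real leading PK columns):

select count(1)
from ATT.PRE_ENG_CONVERSION_OLAP
where ORG_ID = 'some_org'                                 -- placeholder leading PK column
  and EVENT_DATE >= TO_DATE('2017-06-01', 'yyyy-MM-dd');  -- placeholder PK column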

Any suggestions on how to avoid this error and get better performance from
the table scans?  We realize we don't need to run full table scans regularly;
we're just trying to better understand best practices for Phoenix on HBase.

Thank you,
Michael
