phoenix-user mailing list archives

From Ankit Singhal <ankitsingha...@gmail.com>
Subject Re: Getting too many open files during table scan
Date Fri, 23 Jun 2017 06:31:42 GMT
bq. A leading date column is in our schema model
Do you have any other column that is mandatory in read queries but not
monotonically increasing during ingestion? Pre-splitting on such a column
can help you avoid hotspotting without salting.
For a parallelism/performance comparison, have you tried updating the stats
on a non-salted table and then running the same query against both it and
the salted table?
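For illustration, a rough sketch of that comparison in Phoenix SQL (the
table, columns, and split points below are hypothetical placeholders, not
from your schema):

    -- Non-salted table, pre-split on a column that is mandatory in reads
    -- but not monotonically increasing on ingest
    CREATE TABLE IF NOT EXISTS MYTABLE_NOSALT (
        REGION_CODE VARCHAR NOT NULL,
        EVENT_DATE  DATE    NOT NULL,
        METRIC      BIGINT,
        CONSTRAINT PK PRIMARY KEY (REGION_CODE, EVENT_DATE)
    ) SPLIT ON ('EU', 'NA', 'SA');

    -- Refresh guideposts so chunking/parallelism reflects the data
    UPDATE STATISTICS MYTABLE_NOSALT;

    -- Then time the same query against this table and the salted one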


On Fri, Jun 23, 2017 at 9:49 AM, Michael Young <yomaiquin@gmail.com> wrote:

> We started with no salt buckets, but the performance was terrible in our
> testing.
>
> A leading date column is in our schema model.  We don't seem to be getting
> hotspotting after salting.  Date range scans are very common as are slice
> and dice on many dimension columns.
>
> We have tested with a range of SALT values from 0 to 120 for bulk loading,
> upserts, selects at different concurrent load levels on a test cluster
> before moving to production (with some tweaking post-production).  However,
> we had fewer average regions per RS during testing.  The larger SALT
> numbers definitely gave better overall performance in our predominantly
> read-heavy environment.
>
> I'd appreciate any insights that could help identify the bottlenecks.
>
> On Thu, Jun 22, 2017 at 6:26 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>
>> My recommendation: don't use salt buckets unless you have a monotonically
>> increasing row key, for example one that leads with the current date/time.
>> Otherwise you'll be putting more load on the cluster (up to a factor of
>> the number of salt buckets, in the worst case) for bread-and-butter
>> small-range-scan Phoenix queries.
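>>
>> To make the trade-off concrete, here is a minimal sketch (hypothetical
>> table) of the one case where salting does help, a key leading with time:
>>
>>     CREATE TABLE EVENT_LOG (
>>         EVENT_TIME TIMESTAMP NOT NULL,
>>         EVENT_ID   VARCHAR   NOT NULL,
>>         PAYLOAD    VARCHAR,
>>         CONSTRAINT PK PRIMARY KEY (EVENT_TIME, EVENT_ID)
>>     ) SALT_BUCKETS = 16;
>>
>>     -- Without salting, every write lands on the "latest" region.
>>     -- The flip side: even a narrow range scan on EVENT_TIME must now
>>     -- run one scan per bucket (16 here), the extra load noted above.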
>>
>> Thanks,
>> James
>>
>> On Fri, Jun 23, 2017 at 10:06 AM Michael Young <yomaiquin@gmail.com>
>> wrote:
>>
>>> The ulimit on open files was only 1024 for the user executing the query.
>>> After increasing it, the queries behave better.
>>>
>>> How can we tell if we need to reduce/increase the number of salt buckets?
>>>
>>> Our team set this based on read/write performance testing with our data
>>> volume and the queries we expected users to run.
>>>
>>> However, now it seems the performance has degraded.  We can recreate the
>>> schemas using fewer/more buckets and reload the data, but I haven't seen a
>>> hard and fast rule for setting the number of buckets.
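>>>
>>> For reference, the recreate-and-reload step looks roughly like this
>>> (names are placeholders; the real schema is much wider):
>>>
>>>     CREATE TABLE MYTABLE_S24 (
>>>         EVENT_DATE DATE    NOT NULL,
>>>         DIM1       VARCHAR NOT NULL,
>>>         METRIC     BIGINT,
>>>         CONSTRAINT PK PRIMARY KEY (EVENT_DATE, DIM1)
>>>     ) SALT_BUCKETS = 24;  -- candidate bucket count under test
>>>
>>>     -- Reload from the existing table (bulk load for large volumes)
>>>     UPSERT INTO MYTABLE_S24 SELECT * FROM MYTABLE;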
>>>
>>> We have 12 data nodes, 4 SSDs per node, 128 GB RAM per node, and 24 cores
>>> w/ hyperthreading (running HDP 2.5, with HBase as the primary service),
>>> plus 800+ regions per RS (which seems high).
>>>
>>> Any orientation on this would be greatly appreciated.
>>>
>>>
>>> On Tue, Jun 20, 2017 at 11:54 AM, Josh Elser <josh.elser@gmail.com>
>>> wrote:
>>>
>>>> I think this is more of an issue of your 78 salt buckets than the width
>>>> of your table. Each chunk, running in parallel, is spilling incremental
>>>> counts to disk.
>>>>
>>>> I'd check your ulimit settings on the node you run this query from, and
>>>> try increasing the number of open files allowed, before going into this
>>>> one in more depth :)
>>>>
>>>>
>>>> On 6/16/17 2:31 PM, Michael Young wrote:
>>>>
>>>>>
>>>>> We are running a 13-node HBase cluster.  One table uses 78 SALT
>>>>> BUCKETS, which seems to work reasonably well for both reads and writes.
>>>>> This table has 130 columns with a PK of 30 columns (a fairly wide
>>>>> table).
>>>>>
>>>>> However, after adding several new tables we are seeing errors about
>>>>> too many open files when running a full table scan.
>>>>>
>>>>>
>>>>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
>>>>>         at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
>>>>>         at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
>>>>>         at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>         at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>> Caused by: java.io.IOException: Too many open files
>>>>>         at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>>>>         at java.io.File.createTempFile(File.java:2024)
>>>>>         at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
>>>>>         at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
>>>>>         at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
>>>>>         at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>         at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
>>>>>         at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
>>>>>         at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
>>>>>         at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
>>>>>         ... 10 more
>>>>>
>>>>>
>>>>> When running an explain plan:
>>>>> explain select count(1) from MYBIGTABLE
>>>>>
>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>> |                                                        PLAN                                                      |
>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>> | CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP  |
>>>>> |     ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                |
>>>>> |     SERVER FILTER BY FIRST KEY ONLY                                                                              |
>>>>> |     SERVER AGGREGATE INTO SINGLE ROW                                                                             |
>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>
>>>>> It has a lot of chunks.  Normally this query would return at least some
>>>>> result after running for a few minutes.  With appropriate filters in the
>>>>> WHERE clause, the queries run fine.
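>>>>> For example, a filtered variant along these lines (column name
>>>>> hypothetical) scans only the key ranges in the date window rather
>>>>> than all 8728 chunks:
>>>>>
>>>>>     SELECT count(1) FROM MYBIGTABLE
>>>>>     WHERE EVENT_DATE >= TO_DATE('2017-06-01')
>>>>>       AND EVENT_DATE <  TO_DATE('2017-06-08');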
>>>>>
>>>>> Any suggestions on how to avoid this error and get better performance
>>>>> from the table scans?  We realize we don't need to run full table scans
>>>>> regularly; we're just trying to better understand best practices for
>>>>> Phoenix/HBase.
>>>>>
>>>>> Thank you,
>>>>> Michael
>>>>>
>>>>
>>>
>
