phoenix-user mailing list archives

From Michael Young <yomaiq...@gmail.com>
Subject Re: Getting too many open files during table scan
Date Fri, 23 Jun 2017 22:35:15 GMT
Sergey, thanks for this tip!

Since our client data volume varies a lot from site to site, would
splitting only on the first letters of the client_id lead to some regions
being much larger than others?
Or does Phoenix distribute the data fairly evenly across the different region servers?

Would this continue to work as new client_ids are added to the cluster,
or does splitting need to be done again manually at some point?

As a follow-up question, is there a way to get HBase to balance the regions
based on their size rather than the region count?

Cheers,
Michael

On Fri, Jun 23, 2017 at 12:38 PM, Sergey Soldatov <sergeysoldatov@gmail.com>
wrote:

> You may check the "Are there any tips for optimizing Phoenix?" section of
> the Apache Phoenix FAQ at https://phoenix.apache.org/faq.html. It explains
> how to pre-split a table. In your case you could split on the first
> letters of client_id.
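>
> For illustration, a minimal sketch (the column set is reduced and the
> split points are hypothetical; you would pick them from your real
> client_id distribution):
>
> CREATE TABLE client_metrics (
>     client_id VARCHAR NOT NULL,
>     dt DATE NOT NULL,
>     requests BIGINT,
>     CONSTRAINT pk PRIMARY KEY (client_id, dt))
>     SPLIT ON ('d', 'h', 'l', 'p', 't');
>
> -- Each split point starts a new region, so writes for different
> -- client_id prefixes land on different region servers from day one.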
>
> When we talk about monotonically increasing data, we usually mean the
> primary key only. For example, if the primary key is an integer ID
> populated by auto-increment, all new writes will go to a single region,
> creating a hot spot there. In this case (and really only in this case)
> salting may be useful, since it adds an additional salt byte in front of
> the primary key, giving us a chance to distribute the write load across
> the cluster. In all other cases salting causes more work on the cluster,
> since we can no longer do a single point lookup/range scan by primary key
> and instead need to issue one lookup per salt key plus the PK.
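>
> As a rough sketch (the table name and bucket count are made up for the
> example):
>
> CREATE TABLE events (
>     id BIGINT NOT NULL,
>     payload VARCHAR,
>     CONSTRAINT pk PRIMARY KEY (id))
>     SALT_BUCKETS = 8;
>
> -- A range scan by primary key, e.g.
> --   SELECT * FROM events WHERE id BETWEEN 100 AND 200;
> -- can no longer read one contiguous key range; the client has to run
> -- a parallel scan in each of the 8 buckets and merge the results.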
>
> Thanks,
> Sergey
>
>
> On Fri, Jun 23, 2017 at 12:00 PM, Michael Young <yomaiquin@gmail.com>
> wrote:
>
>> >> Don't you have any other column which is obligatory in queries during
>> >> reading but not monotonically increasing during ingestion?
>> We have several columns used in typical query WHERE clauses (like
>> userID='abc', or specific user attributes and data types). However, there
>> are a number of columns which are monotonous, with many rows having the
>> same value.
>>
>> We have tried running UPDATE STATISTICS on the tables, but that would
>> be worth investigating again.
>>
>> Can you give me a hint how to pre-split the data?
>>
>> Let's say we have the following PK columns (all varchar except dt=date):
>> client_id,dt (date),rule_id,user_id,attribute_1,attribute_2,rule_name,
>> browser_type,device_type,os_type,page,group_name,period
>>
>> and non-PK columns in the same table
>> requests,connections,queues,queue_time
>>
>> What would be the suggested way to pre-split?  I'm not familiar with this
>> technique beyond very simple use cases.
>>
>> Thanks!
>>
>> On Thu, Jun 22, 2017 at 11:31 PM, Ankit Singhal <ankitsinghal59@gmail.com> wrote:
>>
>>
>>> bq. A leading date column is in our schema model:-
>>> Don't you have any other column which is obligatory in queries during
>>> reading but not monotonically increasing during ingestion? A pre-split
>>> on such a column can help you avoid hot-spotting.
>>> For parallelism/performance comparison, have you tried running a query
>>> on a non-salted table after updating the stats and comparing performance
>>> with a salted table?
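>>>
>>> For reference, a minimal sketch of that comparison (the table name is a
>>> placeholder):
>>>
>>> UPDATE STATISTICS my_table;
>>> -- then re-run the query on the non-salted table and compare the
>>> -- EXPLAIN output and timings against the salted variant.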
>>>
>>>
>>> On Fri, Jun 23, 2017 at 9:49 AM, Michael Young <yomaiquin@gmail.com>
>>> wrote:
>>>
>>>> We started with no salt buckets, but the performance was terrible in
>>>> our testing.
>>>>
>>>> A leading date column is in our schema model.  We don't seem to be
>>>> getting hotspotting after salting.  Date range scans are very common,
>>>> as are slice-and-dice queries on many dimension columns.
>>>>
>>>> We have tested with a range of SALT values from 0 to 120 for bulk
>>>> loading, upserts, and selects at different concurrent load levels on a
>>>> test cluster before moving to production (with some tweaking
>>>> post-production).  However, we had fewer average regions per RS during
>>>> the testing.  The larger SALT numbers definitely gave better overall
>>>> performance in our predominantly read-heavy environment.
>>>>
>>>> I'd appreciate any insights to help identify bottlenecks.
>>>>
>>>> On Thu, Jun 22, 2017 at 6:26 PM, James Taylor <jamestaylor@apache.org>
>>>> wrote:
>>>>
>>>>> My recommendation: don't use salt buckets unless you have a
>>>>> monotonically increasing row key, for example one that leads with the
>>>>> current date/time. Otherwise you'll be putting more load (# of salt
>>>>> buckets times more load, worst case) on bread-and-butter
>>>>> small-range-scan Phoenix queries.
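>>>>>
>>>>> As a rough illustration (the table name is hypothetical and the exact
>>>>> plan text varies by version):
>>>>>
>>>>> EXPLAIN SELECT * FROM metrics WHERE pk >= 'a' AND pk < 'b';
>>>>> -- On a table with SALT_BUCKETS = 78 even this narrow range scan
>>>>> -- shows up as PARALLEL 78-WAY, one scan per bucket; on an unsalted,
>>>>> -- pre-split table it can stay a single small range scan.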
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> On Fri, Jun 23, 2017 at 10:06 AM Michael Young <yomaiquin@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The ulimit open files was only 1024 for the user executing the
>>>>>> query.  After increasing it, the queries behave better.
>>>>>>
>>>>>> How can we tell if we need to reduce/increase the number of salt
>>>>>> buckets?
>>>>>>
>>>>>> Our team set this based on read/write performance using data volume
>>>>>> and expected queries to be run by users.
>>>>>>
>>>>>> However, now it seems the performance has degraded.  We can recreate
>>>>>> the schemas using fewer/more buckets and reload the data, but I haven't
>>>>>> seen a hard and fast rule for setting the number of buckets.
>>>>>>
>>>>>> We have 12 data nodes, 4 SSDs per node, 128 GB RAM per node, 24 cores
>>>>>> w/ hyperthreading (running HDP 2.5; HBase is the primary service),
>>>>>> and 800+ regions per RS (which seems high).
>>>>>>
>>>>>> Any orientation on this would be greatly appreciated.
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 20, 2017 at 11:54 AM, Josh Elser <josh.elser@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I think this is more of an issue of your 78 salt buckets than the
>>>>>>> width of your table. Each chunk, running in parallel, is spilling
>>>>>>> incremental counts to disk.
>>>>>>>
>>>>>>> I'd check your ulimit settings on the node you run this query from
>>>>>>> and try increasing the number of open files allowed before going
>>>>>>> into this one in more depth :)
>>>>>>>
>>>>>>>
>>>>>>> On 6/16/17 2:31 PM, Michael Young wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> We are running a 13-node HBase cluster.  One table uses 78 SALT
>>>>>>>> BUCKETS, which seems to work reasonably well for both read and
>>>>>>>> write.  This table has 130 columns with a PK having 30 columns
>>>>>>>> (a fairly wide table).
>>>>>>>>
>>>>>>>> However, after adding several new tables we are seeing errors about
>>>>>>>> too many open files when running a full table scan.
>>>>>>>>
>>>>>>>>
>>>>>>>> Caused by: org.apache.phoenix.exception.PhoenixIOException: Too many open files
>>>>>>>>          at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:152)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:84)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:63)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator$SpoolingResultIteratorFactory.newIterator(SpoolingResultIterator.java:79)
>>>>>>>>          at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:112)
>>>>>>>>          at org.apache.phoenix.iterate.ParallelIterators$1.call(ParallelIterators.java:103)
>>>>>>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>          at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>>>>>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>          at java.lang.Thread.run(Thread.java:745)
>>>>>>>> Caused by: java.io.IOException: Too many open files
>>>>>>>>          at java.io.UnixFileSystem.createFileExclusively(Native Method)
>>>>>>>>          at java.io.File.createTempFile(File.java:2024)
>>>>>>>>          at org.apache.phoenix.shaded.org.apache.commons.io.output.DeferredFileOutputStream.thresholdReached(DeferredFileOutputStream.java:176)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator$1.thresholdReached(SpoolingResultIterator.java:116)
>>>>>>>>          at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.checkThreshold(ThresholdingOutputStream.java:224)
>>>>>>>>          at org.apache.phoenix.shaded.org.apache.commons.io.output.ThresholdingOutputStream.write(ThresholdingOutputStream.java:92)
>>>>>>>>          at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>>>>>>>>          at org.apache.hadoop.io.WritableUtils.writeVLong(WritableUtils.java:273)
>>>>>>>>          at org.apache.hadoop.io.WritableUtils.writeVInt(WritableUtils.java:253)
>>>>>>>>          at org.apache.phoenix.util.TupleUtil.write(TupleUtil.java:149)
>>>>>>>>          at org.apache.phoenix.iterate.SpoolingResultIterator.<init>(SpoolingResultIterator.java:127)
>>>>>>>>          ... 10 more
>>>>>>>>
>>>>>>>>
>>>>>>>> When running an explain plan:
>>>>>>>> explain select count(1) from MYBIGTABLE
>>>>>>>>
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>> |                                                       PLAN                                                        |
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>> | CLIENT 8728-CHUNK 674830174 ROWS 2721056772632 BYTES PARALLEL 78-WAY FULL SCAN OVER ATT.PRE_ENG_CONVERSION_OLAP   |
>>>>>>>> |     ROW TIMESTAMP FILTER [0, 9223372036854775807)                                                                 |
>>>>>>>> |     SERVER FILTER BY FIRST KEY ONLY                                                                               |
>>>>>>>> |     SERVER AGGREGATE INTO SINGLE ROW                                                                              |
>>>>>>>> +------------------------------------------------------------------------------------------------------------------+
>>>>>>>>
>>>>>>>> It has a lot of chunks.  Normally this query would return at least
>>>>>>>> some result after running for a few minutes.  With appropriate
>>>>>>>> filters in the WHERE clause, the queries run fine.
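>>>>>>>>
>>>>>>>> For example (the filter values are illustrative only):
>>>>>>>>
>>>>>>>> select count(1) from MYBIGTABLE
>>>>>>>> where client_id = 'abc'
>>>>>>>>   and dt >= TO_DATE('2017-06-01', 'yyyy-MM-dd');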
>>>>>>>>
>>>>>>>> Any suggestions on how to avoid this error and get better
>>>>>>>> performance from the table scans?  We realize we don't need to run
>>>>>>>> full table scans regularly; we're just trying to understand best
>>>>>>>> practices for Phoenix/HBase.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Michael
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
