phoenix-user mailing list archives

From: Nathan Davis <nathan.da...@salesforce.com>
Subject: Re: phoenix.query.maxServerCacheBytes not used
Date: Wed, 20 Jul 2016 13:43:56 GMT
It does work fine with the sort merge hint. I was just hoping to use a
little extra memory to cache the hashes for the smaller of the two tables
and get better performance from a hash join (although, as you suggest,
that turned out not to be the case - read on below).
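(For the archives: the hint in question is Phoenix's documented
USE_SORT_MERGE_JOIN, applied to the query from my original message below
roughly like this:)

> select /*+ USE_SORT_MERGE_JOIN */ e.contact_key
> from event_20m as e
> inner join contact_5m as c on e.contact_key = c.contact_key
> where e.object = 'PURCHASE' and e.object_id = 'ITEMABD'
> limit 10;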

I finally did figure out my config issue. The following lines (91-93) in
phoenix_utils.py were making the client tools look in /etc/hbase/conf
instead of the current directory.

>     elif os.name == 'posix':
>         # default to the bigtop configuration dir
>         hbase_conf_dir = '/etc/hbase/conf'
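
For anyone who hits the same thing: rather than editing the script, note
that phoenix_utils.py consults the HBASE_CONF_DIR environment variable
before falling back to the bigtop default - at least in newer builds, so
check your copy. A rough sketch of that lookup, reconstructed from memory
(treat the details as approximate):

> import os
>
> # Use an explicitly set HBASE_CONF_DIR if present; otherwise fall back
> # to the bigtop default on posix systems (the branch quoted above).
> hbase_conf_dir = os.getenv('HBASE_CONF_DIR')
> if hbase_conf_dir is None:
>     if os.name == 'posix':
>         # default to the bigtop configuration dir
>         hbase_conf_dir = '/etc/hbase/conf'
>     else:
>         # last resort: the current working directory
>         hbase_conf_dir = '.'

If your version honors that variable, exporting HBASE_CONF_DIR to point at
the directory containing your edited hbase-site.xml should let sqlline.py
pick it up without touching the script.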


Once I was able to override phoenix.query.maxServerCacheBytes, I found that
the hash join was actually a lot slower than the sort merge - which was the
point of my experiment. This was a bit unexpected since the RHS is still
only 5 million rows, but I got my answer.

Thanks everyone for the help!

On Wed, Jul 20, 2016 at 9:16 AM, Mohanraj Ragupathiraj <mohanaugust@gmail.com> wrote:

> Use sort merge join if both are large tables
>
> On 20 Jul 2016 8:45 p.m., "Nathan Davis" <nathan.davis@salesforce.com>
> wrote:
>
>> OK, thanks Mujtaba and Samarth. I added those properties to my
>> hbase-site.xml in the same folder as psql.py and sqlline.py, but still get
>> the same MaxServerCacheSizeExceededException when executing the simple
>> join. I am using v4.7 on HBase 1.1 and my query is:
>>
>>> select e.contact_key
>>> from event_20m as e
>>> inner join contact_5m as c on e.contact_key = c.contact_key
>>> where e.object = 'PURCHASE' and e.object_id = 'ITEMABD'
>>> limit 10;
>>
>>
>> I don't think the query is the issue though, because the error reports a
>> different max size than the one I have set in my hbase-site.xml.
>>
>> On Tue, Jul 19, 2016 at 7:23 PM, Mujtaba Chohan <mujtaba@apache.org>
>> wrote:
>>
>>> phoenix.query.maxServerCacheBytes is a client side parameter. If you
>>> are using bin/sqlline.py then set this property in bin/hbase-site.xml and
>>> restart sqlline.
>>>
>>> - mujtaba
>>>
>>> On Tue, Jul 19, 2016 at 1:59 PM, Nathan Davis <nathan.davis@salesforce.com> wrote:
>>>
>>>> Hi,
>>>> I am running a standalone HBase locally with Phoenix installed by
>>>> dropping the jars into the HBase lib directory. I have added the
>>>> following to my hbase-site.xml and restarted HBase:
>>>>
>>>>>   <property>
>>>>>     <name>phoenix.query.maxServerCacheBytes</name>
>>>>>     <value>419430400</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>phoenix.query.maxGlobalMemoryPercentage</name>
>>>>>     <value>25</value>
>>>>>   </property>
>>>>
>>>>
>>>> However, I am still getting the following error when doing a regular
>>>> inner join to a 5-million-row RHS table. Notice that the error says
>>>> "...maximum allowed size (104857600 bytes)" - the 100MB default - even
>>>> though I have changed that setting to 400MB (419430400 bytes):
>>>>
>>>>> java.sql.SQLException: Encountered exception in sub plan [0] execution.
>>>>> at
>>>>> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:193)
>>>>> at
>>>>> org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:138)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:276)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:261)
>>>>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:260)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:248)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177)
>>>>> at
>>>>> org.apache.phoenix.jdbc.PhoenixConnection.executeStatements(PhoenixConnection.java:354)
>>>>> at
>>>>> org.apache.phoenix.util.PhoenixRuntime.executeStatements(PhoenixRuntime.java:298)
>>>>> at org.apache.phoenix.util.PhoenixRuntime.main(PhoenixRuntime.java:243)
>>>>> Caused by:
>>>>> org.apache.phoenix.join.MaxServerCacheSizeExceededException: Size of hash
>>>>> cache (104857638 bytes) exceeds the maximum allowed size (104857600 bytes)
>>>>> at
>>>>> org.apache.phoenix.join.HashCacheClient.serialize(HashCacheClient.java:110)
>>>>> at
>>>>> org.apache.phoenix.join.HashCacheClient.addHashCache(HashCacheClient.java:83)
>>>>> at
>>>>> org.apache.phoenix.execute.HashJoinPlan$HashSubPlan.execute(HashJoinPlan.java:381)
>>>>> at
>>>>> org.apache.phoenix.execute.HashJoinPlan$1.call(HashJoinPlan.java:162)
>>>>> at
>>>>> org.apache.phoenix.execute.HashJoinPlan$1.call(HashJoinPlan.java:158)
>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>> at
>>>>> org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:183)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>> at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>
>>>>
>>>> It seems like my `maxServerCacheBytes` setting is not getting picked
>>>> up, but I'm not sure why. I'm pretty new to Phoenix, so I'm sure it's
>>>> something simple...
>>>>
>>>> Thanks up front for the help!
>>>>
>>>> -Nathan Davis
>>>>
>>>
>>>
>>
