phoenix-user mailing list archives

From anil gupta <anilgupt...@gmail.com>
Subject Re: Excessive region splitting of Global Index table(8 Megabyte regions)
Date Sat, 12 Mar 2016 21:38:36 GMT
Here it is: https://issues.apache.org/jira/browse/PHOENIX-2762

We are having performance problems while writing to our main table from
our MapReduce job. I think this excessive splitting was degrading our
performance. I'm going to test that hypothesis.

On Sat, Mar 12, 2016 at 1:08 PM, James Taylor <jamestaylor@apache.org>
wrote:

> Yes, good idea. Please file a JIRA.
>
> On Sat, Mar 12, 2016 at 1:07 PM, anil gupta <anilgupta84@gmail.com> wrote:
>
>> To provide more insight: this table has around 1100 columns, and I created
>> this index on one column. (1/1100) * 8GB comes to around 8MB. So I think we
>> need to set a lower bound on the region size of secondary index tables in
>> Phoenix. Please let me know if you need me to file a JIRA.
>>
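The arithmetic above, together with the proposed lower bound, can be sketched as follows. This is only an illustration: the function name and the 1 GiB floor are hypothetical and are not something Phoenix implements, and 8GB is taken to mean 8 * 1024^3 bytes.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def scaled_index_region_size(data_region_bytes, data_columns, index_columns,
                             floor_bytes=0):
    """Scale the data table region size by the column ratio, then apply a floor.

    Hypothetical helper: it mirrors the (1/1100) * 8GB estimate from the
    thread and the lower bound proposed there. It is not Phoenix code.
    """
    scaled = data_region_bytes * index_columns // data_columns
    return max(scaled, floor_bytes)


# Without a floor: 8 GiB scaled by 1/1100 is only a few megabytes.
no_floor = scaled_index_region_size(8 * GIB, 1100, 1)
print(f"{no_floor / MIB:.2f} MiB")

# With an assumed 1 GiB floor, the tiny estimate is replaced by the floor.
with_floor = scaled_index_region_size(8 * GIB, 1100, 1, floor_bytes=1 * GIB)
print(f"{with_floor / GIB:.2f} GiB")
```

What the floor should actually be is a separate question; 1 GiB is used here purely as a placeholder.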
>> On Sat, Mar 12, 2016 at 12:45 PM, anil gupta <anilgupta84@gmail.com>
>> wrote:
>>
>>> Ok, oversight on my side. MAX_FILESIZE => '11994435' for the secondary
>>> index table.
>>> The main table still doesn't show a MAX_FILESIZE attribute.
>>>
>>> On Sat, Mar 12, 2016 at 12:41 PM, James Taylor <jamestaylor@apache.org>
>>> wrote:
>>>
>>>> It should show up for the index table. I did a test on my local HBase,
>>>> and this is what I see:
>>>>
>>>> hbase(main):004:0> describe 'FOO_IDX'
>>>> Table FOO_IDX is ENABLED
>>>>
>>>> FOO_IDX, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '6710886400', ...
>>>>
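When comparing tables, it can help to pull the value out of the describe output and convert it to a readable unit. A minimal sketch, assuming the output contains the attribute in the `MAX_FILESIZE => '<bytes>'` form shown above; the function name is hypothetical.

```python
import re


def parse_max_filesize(describe_output):
    """Return MAX_FILESIZE in bytes from HBase shell describe output, or None.

    None means the attribute is absent, as for a table that relies on the
    cluster-wide region size setting.
    """
    match = re.search(r"MAX_FILESIZE\s*=>\s*'(\d+)'", describe_output)
    return int(match.group(1)) if match else None


sample = "FOO_IDX, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '6710886400', ..."
size = parse_max_filesize(sample)
print(size, f"{size / 1024 ** 3:.2f} GiB")
```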
>>>> On Sat, Mar 12, 2016 at 12:36 PM, anil gupta <anilgupta84@gmail.com>
>>>> wrote:
>>>>
>>>>> The 8GB region size is set at the cluster level, so we haven't set
>>>>> MAX_FILESIZE on the main table explicitly. I ran the describe statement
>>>>> for both tables, but MAX_FILESIZE is not showing up since we didn't apply
>>>>> any custom settings to these tables. Hope this makes sense.
>>>>>
>>>>> On Sat, Mar 12, 2016 at 12:26 PM, James Taylor <jamestaylor@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Ok - before you reset the MAX_FILESIZE, it'd be helpful if you could
>>>>>> open an HBase shell and let us know what the current values are for
>>>>>> your data table and index table:
>>>>>>
>>>>>> describe 'YOUR_DATA_TABLE'
>>>>>> describe 'YOUR_INDEX_TABLE'
>>>>>>
>>>>>> If your data table is 8GB, I'd guess your index should be 4GB at the
>>>>>> smallest. I think 1GB would be too low.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Sat, Mar 12, 2016 at 12:23 PM, anil gupta <anilgupta84@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the reply, James. We have 2 global secondary indexes on
>>>>>>> this table and both of them exhibit the same behavior. Going to give
>>>>>>> your suggestion a try. I also think that the region size for a
>>>>>>> secondary index should not be 8GB. Will try setting the region size
>>>>>>> to 1GB for the secondary index and see how it goes.
>>>>>>>
>>>>>>> On Sat, Mar 12, 2016 at 12:00 PM, James Taylor
>>>>>>> <jamestaylor@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Anil,
>>>>>>>> Phoenix estimates the ratio between the data table and index table
>>>>>>>> as shown below to attempt to get the same number of splits in your
>>>>>>>> index table as your data table.
>>>>>>>>
>>>>>>>> /*
>>>>>>>>  * Approximate ratio between index table size and data table size:
>>>>>>>>  * More or less equal to the ratio between the number of key value
>>>>>>>>  * columns in each. We add one to the key value column count to
>>>>>>>>  * take into account our empty key value. We add 1/4 for any key
>>>>>>>>  * value data table column that was moved into the index table row
>>>>>>>>  * key.
>>>>>>>>  */
>>>>>>>>
>>>>>>>> Phoenix then multiplies the MAX_FILESIZE of the data table by this
>>>>>>>> ratio to come up with a reasonable default value for the index table.
>>>>>>>> Can you check in the HBase shell what the MAX_FILESIZE is for the data
>>>>>>>> table versus the index table? Maybe there's a bug in Phoenix in how it
>>>>>>>> calculates this ratio.
>>>>>>>>
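The estimate described in the quoted comment can be sketched roughly as below. This is one reading of the comment, not Phoenix's actual code: the function names, the way the 1/4 weight is applied, and the column counts in the example are all assumptions.

```python
def estimated_index_ratio(data_kv_columns, index_kv_columns, moved_to_row_key=0):
    """Approximate index-to-data size ratio, per one reading of the comment.

    Assumption: one is added to each key value column count for the empty
    key value, and 1/4 is added on the index side for each data table key
    value column moved into the index row key. Phoenix's real code may differ.
    """
    index_weight = index_kv_columns + 1 + 0.25 * moved_to_row_key
    data_weight = data_kv_columns + 1
    return index_weight / data_weight


def estimated_index_max_filesize(data_max_filesize, ratio):
    """Scale the data table MAX_FILESIZE (bytes) by the estimated ratio."""
    return int(data_max_filesize * ratio)


# Assumed shape: about 1100 key value columns, an index on a single column
# with no covered columns, and an 8 GiB data table region size.
ratio = estimated_index_ratio(data_kv_columns=1100, index_kv_columns=0,
                              moved_to_row_key=1)
estimate = estimated_index_max_filesize(8 * 1024 ** 3, ratio)
print(estimate)
```

Under these assumptions the result is on the order of 10 MB, the same ballpark as the region sizes reported in this thread. It is not expected to reproduce the exact value Phoenix computes.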
>>>>>>>> You can override the MAX_FILESIZE for your index through an ALTER
>>>>>>>> TABLE statement:
>>>>>>>>
>>>>>>>> ALTER TABLE my_table_schema.my_index_name SET MAX_FILESIZE=8589934592
>>>>>>>>
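Since MAX_FILESIZE is given in bytes, a small helper avoids getting the number of digits wrong when trying different sizes. A minimal sketch; the function name is hypothetical and it only builds the statement text, it does not run it.

```python
def alter_max_filesize_sql(qualified_index_name, size_gib):
    """Build the ALTER TABLE statement from the thread for a size in GiB."""
    size_bytes = int(size_gib * 1024 ** 3)
    return f"ALTER TABLE {qualified_index_name} SET MAX_FILESIZE={size_bytes}"


# 8 GiB reproduces the value used in the statement above.
print(alter_max_filesize_sql("my_table_schema.my_index_name", 8))
```

The same helper can generate the statement for other candidate sizes, such as 4 or 1 GiB, when experimenting with the index region size.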
>>>>>>>> You can ignore the warnings you get in sqlline, and you can verify
>>>>>>>> that the setting took effect through the HBase shell by running the
>>>>>>>> following command:
>>>>>>>>
>>>>>>>> describe 'MY_TABLE_SCHEMA.MY_INDEX_NAME'
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>>     James
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Mar 12, 2016 at 10:18 AM, anil gupta
>>>>>>>> <anilgupta84@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We are using HDP 2.3.4 and Phoenix 4.4.
>>>>>>>>> Our global index table is splitting excessively. Our cluster
>>>>>>>>> region size setting is 8 gigabytes, but the global index table
>>>>>>>>> has 18 regions and the largest region is 10.9 MB.
>>>>>>>>> This is definitely not good behavior. I looked into the tuning
>>>>>>>>> guide (https://phoenix.apache.org/tuning.html) and could not find
>>>>>>>>> anything relevant. Is this region splitting done intentionally by
>>>>>>>>> Phoenix for secondary index tables?
>>>>>>>>>
>>>>>>>>> Here is the output of du command:
>>>>>>>>> [ag@hdpclient1 ~]$ hadoop fs -du -h /apps/hbase/data/data/default/SEC_INDEX
>>>>>>>>> 761     /apps/hbase/data/data/default/SEC_INDEX/.tabledesc
>>>>>>>>> 0       /apps/hbase/data/data/default/SEC_INDEX/.tmp
>>>>>>>>> 9.3 M   /apps/hbase/data/data/default/SEC_INDEX/079db2c953c30a8270ecbd52582e81ff
>>>>>>>>> 2.9 M   /apps/hbase/data/data/default/SEC_INDEX/0952c070234c05888bfc2a01645e9e88
>>>>>>>>> 10.9 M  /apps/hbase/data/data/default/SEC_INDEX/0d69bbb8991b868f0437b624410e9bed
>>>>>>>>> 8.2 M   /apps/hbase/data/data/default/SEC_INDEX/206562491fd1de9db48cf422dd8c2059
>>>>>>>>> 7.9 M   /apps/hbase/data/data/default/SEC_INDEX/25318837ab8e1db6922f5081c840d2e7
>>>>>>>>> 9.5 M   /apps/hbase/data/data/default/SEC_INDEX/5369e0d6526b3d2cdab9937cb320ccb3
>>>>>>>>> 9.6 M   /apps/hbase/data/data/default/SEC_INDEX/62704ee3c9418f0cd48210a747e1f8ac
>>>>>>>>> 7.8 M   /apps/hbase/data/data/default/SEC_INDEX/631376fc5515d7785b2bcfc8a1f64223
>>>>>>>>> 2.8 M   /apps/hbase/data/data/default/SEC_INDEX/6648d5396ba7a3c3bf884e5e1300eb0e
>>>>>>>>> 9.4 M   /apps/hbase/data/data/default/SEC_INDEX/6e6e133580aea9a19a6b3ea643735072
>>>>>>>>> 8.1 M   /apps/hbase/data/data/default/SEC_INDEX/8535a5c8a0989dcdfad2b1e9e9f3e18c
>>>>>>>>> 7.8 M   /apps/hbase/data/data/default/SEC_INDEX/8ffa32e0c6357c2a0b413f3896208439
>>>>>>>>> 9.3 M   /apps/hbase/data/data/default/SEC_INDEX/c27e2809cd352e3b06c0f11d3e7278c6
>>>>>>>>> 8.0 M   /apps/hbase/data/data/default/SEC_INDEX/c4f5a98ce6452a6b5d052964cc70595a
>>>>>>>>> 8.1 M   /apps/hbase/data/data/default/SEC_INDEX/c578d3190363c32032b4d92c8d307215
>>>>>>>>> 7.9 M   /apps/hbase/data/data/default/SEC_INDEX/d750860bac8aa372eb28aaf055ea63e7
>>>>>>>>> 9.6 M   /apps/hbase/data/data/default/SEC_INDEX/e9756aa4c7c8b9bfcd0857b43ad5bfbe
>>>>>>>>> 8.0 M   /apps/hbase/data/data/default/SEC_INDEX/ebaae6c152e82c9b74c473babaf644dd
>>>>>>>>>
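A listing like this can be summarized with a short script to get the region count and the largest region. This is a sketch under assumptions: it expects each entry on one line in `SIZE [UNIT] PATH` form, treats units as powers of 1024, and uses a hypothetical function name. The sample is a shortened copy of the output above.

```python
import re

UNITS = {None: 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
LINE = re.compile(r"^(\d+(?:\.\d+)?)(?:\s+([KMGT]))?\s+(/\S+)$")


def region_sizes(du_output):
    """Map region directory name to approximate size in bytes.

    Skips lines that do not match and dot-prefixed entries such as
    .tabledesc and .tmp, which are not regions.
    """
    sizes = {}
    for line in du_output.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue
        value, unit, path = match.groups()
        name = path.rsplit("/", 1)[-1]
        if name.startswith("."):
            continue
        sizes[name] = int(float(value) * UNITS[unit])
    return sizes


sample = """\
761     /apps/hbase/data/data/default/SEC_INDEX/.tabledesc
0       /apps/hbase/data/data/default/SEC_INDEX/.tmp
9.3 M   /apps/hbase/data/data/default/SEC_INDEX/079db2c953c30a8270ecbd52582e81ff
2.9 M   /apps/hbase/data/data/default/SEC_INDEX/0952c070234c05888bfc2a01645e9e88
10.9 M  /apps/hbase/data/data/default/SEC_INDEX/0d69bbb8991b868f0437b624410e9bed
"""
sizes = region_sizes(sample)
largest = max(sizes, key=sizes.get)
print(len(sizes), "regions; largest:", largest, sizes[largest], "bytes")
```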
>>>>>>>>> --
>>>>>>>>> Thanks & Regards,
>>>>>>>>> Anil Gupta
>>>>>>>>>


-- 
Thanks & Regards,
Anil Gupta
