phoenix-user mailing list archives

From Gerald Sangudi <gsang...@23andme.com>
Subject Re: Salting based on partial rowkeys
Date Mon, 17 Sep 2018 05:20:41 GMT
Jaanai, Thomas,

Thanks for the feedback. My colleague or I will reply to this thread on the
dev list.

Gerald


On Thu, Sep 13, 2018 at 10:01 PM, Thomas D'Silva <tdsilva@salesforce.com>
wrote:

> For the usage example that you provided, how do the values of id_1, id_2
> and other_key vary when you write data?
> I assume id_1 and id_2 remain the same while other_key is monotonically
> increasing, and that's why the table is salted.
> If you create the salt bucket only on id_2, then wouldn't you run into
> region server hotspotting during writes?
>
> On Thu, Sep 13, 2018 at 8:02 PM, Jaanai Zhang <cloud.poster@gmail.com>
> wrote:
>
>> Sorry, I don't understand your purpose. It seems that your proposal can't
>> be achieved. You need hash partitioning. However, to clarify: HBase is a
>> range-partitioned engine, and salt buckets are used to avoid hotspots. In
>> other words, HBase as a storage engine can't support hash partitioning.
>>
>> ----------------------------------------
>>    Jaanai Zhang
>>    Best regards!
>>
>>
>>
>> On Thu, Sep 13, 2018 at 11:32 PM, Gerald Sangudi <gsangudi@23andme.com> wrote:
>>
>>> Hi folks,
>>>
>>> Any thoughts or feedback on this?
>>>
>>> Thanks,
>>> Gerald
>>>
>>> On Mon, Sep 10, 2018 at 1:56 PM, Gerald Sangudi <gsangudi@23andme.com>
>>> wrote:
>>>
>>>> Hello folks,
>>>>
>>>> We have a requirement for salting based on partial, rather than full,
>>>> rowkeys. My colleague Mike Polcari has identified the requirement and
>>>> proposed an approach.
>>>>
>>>> I found an already-open JIRA ticket for the same issue:
>>>> https://issues.apache.org/jira/browse/PHOENIX-4757. I can provide more
>>>> details from the proposal.
>>>>
>>>> The JIRA proposes a syntax of SALT_BUCKETS(col, ...) = N, whereas Mike
>>>> proposes SALT_COLUMN=col or SALT_COLUMNS=col, ... .
>>>>
>>>> The benefit at issue is that users gain more control over partitioning,
>>>> and this can be used to push some additional aggregations and hash joins
>>>> down to region servers.
>>>>
>>>> I would appreciate any go-ahead / thoughts / guidance / objections /
>>>> feedback. I'd like to be sure that the concept at least is not
>>>> objectionable. We would like to work on this and submit a patch down the
>>>> road. I'll also add a note to the JIRA ticket.
>>>>
>>>> Thanks,
>>>> Gerald
>>>>
>>>>
>>>
>
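
For readers following the thread, here is a rough sketch of the difference
being discussed: salting on the full rowkey versus salting on a subset of the
key columns. It is only an illustration under stated assumptions. The column
names id_1, id_2 and other_key come from the reply quoted above; the function
names are hypothetical, and CRC32 is a stand-in hash, not the hash Phoenix
uses internally.

```python
import zlib


def salt_byte(key_parts, num_buckets):
    """Map the given key parts to a bucket number in [0, num_buckets).

    CRC32 is a stand-in hash chosen for determinism; it is NOT the hash
    function Phoenix uses for salting.
    """
    joined = b"\x00".join(part.encode("utf-8") for part in key_parts)
    return zlib.crc32(joined) % num_buckets


def salted_rowkey_full(id_1, id_2, other_key, num_buckets):
    """Salt computed over every rowkey column (simplified current behaviour)."""
    parts = (id_1, id_2, other_key)
    return (salt_byte(parts, num_buckets),) + parts


def salted_rowkey_partial(id_1, id_2, other_key, num_buckets):
    """Salt computed over id_2 only (hypothetical partial-rowkey salting)."""
    return (salt_byte((id_2,), num_buckets), id_1, id_2, other_key)


if __name__ == "__main__":
    buckets = 8
    # Same id_1 and id_2, with other_key increasing, as assumed in the thread.
    rows = [("a", "b", str(i)) for i in range(100)]

    full = {salted_rowkey_full(*row, buckets)[0] for row in rows}
    partial = {salted_rowkey_partial(*row, buckets)[0] for row in rows}

    print("distinct buckets, full-key salt:   ", len(full))
    print("distinct buckets, partial-key salt:", len(partial))
```

With the salt derived from id_2 alone, every row sharing an id_2 value lands
in the same bucket. That co-location is what would allow some aggregations and
joins on id_2 to be evaluated per region, and it is also why writes that all
share one id_2 value would concentrate on a single bucket, which is the
hotspotting concern raised above.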
