phoenix-user mailing list archives

From Ash N <742...@gmail.com>
Subject Re: Sequences vs UUID's
Date Fri, 05 May 2017 17:25:11 GMT
James, Abraham,

I apologize if I wasn't clear with my ask.  I am neither struggling with
uniqueness nor wondering how to generate unique numbers with sequence.

I had two questions:

1.  Gaps in the sequence numbers: will they ever be backfilled?  I got the
answer: no, they will not be.  Thank you for that.

2.  There is a concept of sequence numbers that was introduced in Apache
Phoenix.
     The question is: what is the use case for these sequences?

     When should one go for UUIDs and when should one go for sequences?

     What is the recommendation?

If I create a sequence and generate values from it, how is it stored in
HBase?  Does it automatically take care of hot-spotting?  Is there
documentation around this that I can read?


Hopefully that clarifies my question.

I sincerely thank you all for coming forward to help.

Thanks,
-ash

On Fri, May 5, 2017 at 5:46 PM, Abraham Tom <work2much@gmail.com> wrote:

> In the RDBMS world this question has been debated at length, with varying
> opinions.
>
> Since this is a Phoenix (HBase) forum, the key will always be a string,
> so your performance bottleneck is the generation of the key.  If you like
> the incremental number solution, I would suggest the following:
> A composite key where the sequence restarts daily would address your
> concern of running out of numbers, and help with HBase (both distribution
> and performance).
> Use the system date formatted as yyyyMMdd, cast as a bigint, multiply it by
> 100 billion and add your autogenerated sequence number to it.   This would
> allow 100 billion unique entries per day, or roughly 1.16 million per second.
>
>
>
> On Fri, May 5, 2017 at 12:15 AM, Ash N <742000@gmail.com> wrote:
>
>> Could anyone please help with guidance on the below, or point me to any
>> documents?
>>
>> Thanks
>>
>>
>> On May 3, 2017 1:01 AM, "Ash N" <742000@gmail.com> wrote:
>>
>> John,
>>
>> Thank you so much for responding.  I appreciate the link to the
>> presentation, which I could not find myself, though I had read about
>> Snowflake.
>>   I was looking for guidance on the sequence numbers vs UUID approach.
>>
>> Could I use sequence numbers?  Are the gaps in the sequence numbers ever
>> backfilled?
>> There is not much documentation on how it works.  If someone explains, I
>> will be more than happy to update the documentation.
>>
>>
>> thanks again,
>> -ash
>>
>> On Wed, May 3, 2017 at 12:51 AM, John Leach <jleach4@gmail.com> wrote:
>>
>>> Ash,
>>>
>>> I built one a while back based on Twitter's Snowflake algorithm.
>>>
>>> Here is a link to a presentation that covers it…
>>>
>>> https://www.slideshare.net/davegardnerisme/unique-id-generation-in-distributed-systems
>>>
>>> We used it as the primary key for the table when in essence there was
>>> not a primary key (just needed uniqueness).
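
For readers unfamiliar with the Snowflake approach mentioned above, here is a minimal Python sketch of the idea. It is not the generator John built and not Twitter's code. It assumes the commonly published layout of a 41-bit millisecond timestamp, a 10-bit worker ID and a 12-bit per-millisecond sequence; the class name `SnowflakeGenerator`, the injectable `clock` parameter and the custom epoch are all hypothetical choices made for illustration.

```python
import threading
import time

# Layout assumed here: 41-bit timestamp | 10-bit worker ID | 12-bit sequence.
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

# Hypothetical custom epoch: 2017-05-01 00:00:00 UTC, in milliseconds.
CUSTOM_EPOCH_MS = 1_493_596_800_000


def _system_millis() -> int:
    """Current wall-clock time in milliseconds since the Unix epoch."""
    return time.time_ns() // 1_000_000


class SnowflakeGenerator:
    """Generates roughly time-ordered 64-bit IDs without central coordination."""

    def __init__(self, worker_id: int, clock=_system_millis):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id must fit in 10 bits")
        self._worker_id = worker_id
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards; refusing to issue IDs")
            if now == self._last_ms:
                # Same millisecond: bump the sequence, wrapping at 12 bits.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - CUSTOM_EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self._worker_id << SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeGenerator(worker_id=7)
print(generator.next_id())
```

Each worker needs a distinct `worker_id`, which is the only coordination the scheme requires; gaps between IDs are expected and are never filled.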
>>>
>>> Good luck.
>>>
>>> Regards,
>>> John Leach
>>>
>>> On May 2, 2017, at 6:46 PM, Ash N <742000@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> This is a distributed web application, with millions of users connecting
>>> to the site.
>>>
>>> We are receiving about 150,000 events/sec through a Kinesis stream.
>>> We need to store these events in a Phoenix table, identified by an ID
>>> that is the primary key for the table.
>>>
>>> What is the best way to accomplish this?
>>>
>>> Option 1
>>> I played with sequences and they seem to work well, although with a lot
>>> of gaps.
>>> Will the gaps be filled at all?  If not, we will run out of IDs pretty
>>> soon.
>>>
>>> Option 2
>>> UUIDs.
>>>
>>> What is the best way to generate UUIDs: locally or over the network?
>>>
>>> How are folks typically handling this situation?
>>>
>>> Which route is recommended: sequences or UUIDs?
>>>
>>> thanks,
>>> -ash
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
> --
> Abraham Tom
> Email:   work2much@gmail.com
> Phone:  415-515-3621
>
