phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Loffler <Alex.Loff...@telus.com>
Subject RE: Setting a TTL in an upsert
Date Wed, 23 Sep 2015 19:06:55 GMT
Hi James,

 

Thank you for the info/validation. What I had in mind was the ability to define the (HBase
per cell) timestamp arbitrarily for each upsert statement, but more broadly there seems to
be at least three levels of granularity:

 

1)      Per cell/column timestamp – where each column in the upsert statement could define
a separate timestamp (this may make the syntax quite unwieldy, but offers the most flexibility).
At first glance this hurts my head especially wrt honouring the timestamps on primary keys
& secondary indexes that are defined over multiple columns where a parts of the index/row
could disappear at different times.

2)      Per row timestamp – my original use-case where all of the cells in the row receive
the same timestamp.

3)      Per connection – As you say, probably the simplest/least impacting as this could
be passed as a connection property. I believe it will also be possible to emulate (2) by using
multiple connections, one per retention policy duration, so maybe this is a good starting
point.

 

I’m new to the project so will dive into the code to get my bearings before pulling together
a plan of attack.

 

Thanks again,

-Alex.

 

From: James Taylor [mailto:jamestaylor@apache.org] 
Sent: September 23, 2015 10:59
To: James Taylor
Cc: user
Subject: Re: Setting a TTL in an upsert

 

Also, for more information on (2), see https://phoenix.apache.org/faq.html#Can_phoenix_work_on_tables_with_arbitrary_timestamp_as_flexible_as_HBase_API

 

On Wed, Sep 23, 2015 at 10:55 AM, James Taylor <jamestaylor@apache.org> wrote:

Hi Alex,

I can think of a couple of ways to support this:

1) Surface support for per Cell TTLs (HBASE-10560) in Phoenix (PHOENIX-1335). This could have
the kind of syntax you mentioned (or alternatively rely on a connection property and no syntactic
change would be necessary, and then in MutationState (where Phoenix produces HBase Mutations),
you'd need to use the HBase API to set the TTLs. You'd also need to deal with setting secondary
index rows to have the same TTLs as their data rows.

2) Use the CurrentSCN property at connection time for UPSERT calls to future date the cell
timestamp. You'd also need to set the CurrentSCN property for readers above any value you
used at UPSERT time as otherwise you wouldn't see the data you wrote.

 

If you're up for it, (1) would be a nice contribution and definitely a viable feature.

 

Thanks,

James

 

On Wed, Sep 23, 2015 at 9:08 AM, Alex Loffler <Alex.Loffler@telus.com> wrote:

Hi,

Thanks for the response – would this be a viable feature request? We’re moving from using
raw HBase to Phoenix and would like to use this ‘countdown’ feature to allow for different
rows in the same table to have different retention times. Instead of having to index a user
created TTL column and create a script to manually garbage collect the stale rows, we could
continue to leverage HBase’s TTL mechanism to automatically exclude the rows and physically
delete them on the next major compaction.

>From the documentation, Phoenix supports TTL on secondary indexes as long as they are
created with the same value as the base table, which would be perfect!

Thanks,

-Alex.

From: Yuhao Bi [mailto:byh0831@gmail.com]
Sent: September 23, 2015 00:31
To: user
Subject: Re: Setting a TTL in an upsert

Hi,

As I know, we can only set a ttl in create table stage corresponding to HBase table ttl.

CREATE TABLE IF NOT EXISTS my_schema.my_table (
    org_id CHAR(15), entity_id CHAR(15), payload binary(1000),
    CONSTRAINT pk PRIMARY KEY (org_id, entity_id) )
    TTL=86400

See http://phoenix.apache.org/language/index.html#create_table <http://phoenix.apache.org/language/index.html>
 for more grammar detail.

Thanks,

2015-09-23 15:11 GMT+08:00 Alex Loffler <Alex.Loffler@telus.com>:

Hi,

 

Is it possible to define the TTL of a row (or even each cell in the row) during an upsert
e.g:

 

upsert into test values(1,2,3) TTL=1442988643355;

 

Assuming the table has a TTL this would allow per-row retention policies (with automatic garbage-collection
by HBase) by e.g. setting the upsert TTL to a time in the future.

 

For example if the TTL on the table is set to 60 (seconds), a row with a desired retention
policy of 1 year could be upserted with a TTL=now() + 1 year.

 

Thanks in advance,

-Alex.

 

 


Mime
View raw message