phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Help: setting hbase row timestamp in phoenix upserts ?
Date Fri, 01 Dec 2017 07:14:42 GMT
The only way I can think of accomplishing this is by using the raw HBase
APIs to write the data but using our utilities to write it in a Phoenix
compatible manner. For example, you could run an UPSERT VALUES statement,
use the PhoenixRuntime.getUncommittedDataIterator() method to get the Cells
that would have been written, update the Cell timestamp as needed, and do
an htable.batch() call to commit them.
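
To illustrate why rewriting the cell timestamp should address the
out-of-order problem, here is a minimal sketch in plain Java. It is only a
model: ModelCell, ModelTable, and withTimestamp are hypothetical stand-ins,
not HBase or Phoenix classes, and ModelTable.batch merely imitates how a
VERSIONS=1 table surfaces the cell with the highest timestamp. In real code
the cells would come from PhoenixRuntime.getUncommittedDataIterator() and be
committed through htable.batch().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Pure-Java model of the idea; none of these types are HBase or Phoenix classes.
final class LatestWinsSketch {

    // Hypothetical stand-in for an HBase cell: row, column, timestamp, value.
    static final class ModelCell {
        final String row;
        final String column;
        final long timestamp;
        final String value;

        ModelCell(String row, String column, long timestamp, String value) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Copies each cell, replacing its timestamp with the CDC event timestamp.
    static List<ModelCell> withTimestamp(List<ModelCell> cells, long eventTimestamp) {
        List<ModelCell> out = new ArrayList<>(cells.size());
        for (ModelCell c : cells) {
            out.add(new ModelCell(c.row, c.column, eventTimestamp, c.value));
        }
        return out;
    }

    // Hypothetical stand-in for a VERSIONS=1 table: a read returns the cell
    // with the highest timestamp, regardless of the order of the writes.
    static final class ModelTable {
        private final Map<String, ModelCell> latest = new HashMap<>();

        void batch(List<ModelCell> cells) {
            for (ModelCell c : cells) {
                String key = c.row + "/" + c.column;
                ModelCell current = latest.get(key);
                // Keep the incoming cell only if it is not older than the stored one.
                if (current == null || c.timestamp >= current.timestamp) {
                    latest.put(key, c);
                }
            }
        }

        String get(String row, String column) {
            ModelCell c = latest.get(row + "/" + column);
            return c == null ? null : c.value;
        }
    }

    public static void main(String[] args) {
        ModelTable table = new ModelTable();
        // Cells as an upsert would produce them, before the timestamp rewrite.
        List<ModelCell> second =
            Collections.singletonList(new ModelCell("k1", "NAME", 0L, "second change"));
        List<ModelCell> first =
            Collections.singletonList(new ModelCell("k1", "NAME", 0L, "first change"));

        // The later change (event time 2000) arrives before the earlier one (1000).
        table.batch(withTimestamp(second, 2000L));
        table.batch(withTimestamp(first, 1000L));

        System.out.println(table.get("k1", "NAME")); // prints "second change"
    }
}
```

Because the outcome depends only on the event timestamps, a redelivered
event (at-least-once delivery) leaves the stored value unchanged in this
model.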

On Wed, Nov 29, 2017 at 11:46 AM Pedro Boado <pedro.boado@gmail.com> wrote:

> Hi,
>
> I'm looking for a little help to shed some light on ROW_TIMESTAMP.
>
> Some background on the problem (simplified): I'm working on a project
> that needs to create an "enriched" replica of an RDBMS table based on a
> stream of CDC changes from that table.
>
> Each CDC event contains the timestamp of the change plus all the column
> values 'before' and 'after' the change, and each event is pushed to a
> Kafka topic. Because of certain "non-negotiable" design decisions, Kafka
> guarantees delivering each event at least once, but doesn't guarantee
> ordering for changes to the same row in the source table.
>
> The final step of the Kafka-based flow is sinking the information into
> HBase/Phoenix.
>
> As I cannot get an in-order delivery guarantee from Kafka, I need to use the
> CDC event timestamp to ensure that HBase keeps the latest change to a row.
>
> This fits perfectly well with an HBase table design with VERSIONS=1 that
> uses the source event timestamp as the HBase row/cell timestamp.
>
> The thing is that I cannot find a way to define the timestamp of the HBase
> cell from a Phoenix upsert.
>
> I came across the ROW_TIMESTAMP functionality, but I've just found (I'm
> devastated now) that a ROW_TIMESTAMP column stores the date both in
> HBase's cell timestamp and in the primary key, meaning that I cannot
> leverage that functionality to keep only the latest change.
>
> Is there a way of defining HBase's row timestamp when doing the UPSERT,
> even by setting it through some obscure hidden JDBC property?
>
> I want to avoid by all means doing a checkAndPut, as the volume of changes
> is going to be quite big.
>
>
>
> --
> Regards.
> Pedro Boado.
>
