phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Can phoenix support HBase's TimeStamp?
Date Mon, 17 Oct 2016 05:42:07 GMT
FYI, a couple of timestamp-related features that Phoenix supports today
include:
- specify/filter on timestamp of a row:
http://phoenix.apache.org/rowtimestamp.html
- query as of a past timestamp:
http://phoenix.apache.org/faq.html#Can_phoenix_work_on_tables_with_arbitrary_timestamp_as_flexible_as_HBase_API

These were determined to be a good fit with SQL and surface some of the
power of HBase. Exposing per-cell timestamp control and multi-version
queries is difficult in the SQL model, but we're open to suggestions if it
can be done in a standard, general way.
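
A minimal sketch of the second feature, assuming the Phoenix JDBC driver is on the classpath: Phoenix reads the `CurrentSCN` connection property and runs queries on that connection as of the given timestamp. The class name, helper names, JDBC URL and table name are hypothetical and only serve the illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

class AsOfQuery {
    // Phoenix connection property that names the timestamp to query as of.
    static final String CURRENT_SCN = "CurrentSCN";

    // Build connection properties that pin reads to an epoch-millisecond timestamp.
    static Properties asOf(long timestampMillis) {
        Properties props = new Properties();
        props.setProperty(CURRENT_SCN, Long.toString(timestampMillis));
        return props;
    }

    // Count the rows of a table as they looked at the given timestamp.
    // Requires the Phoenix driver and a reachable cluster; the table name is
    // concatenated only because this is a sketch with a trusted argument.
    static long countAsOf(String jdbcUrl, String table, long timestampMillis)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, asOf(timestampMillis));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next();
            return rs.getLong(1);
        }
    }
}
```

A call such as `AsOfQuery.countAsOf("jdbc:phoenix:localhost", "TEST", ts)` would then see the table as of `ts`, where the URL and table name are placeholders for your own setup.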

Thanks,
James

On Sunday, October 16, 2016, William <yhxx511@163.com> wrote:

> Hi, Zhang Yang,
>    I've implemented the multi-version feature in my own Phoenix branch.
> But this implementation is meant to work only in a very limited scenario,
> because there were so many things to consider when designing it. Here are
> the primary problems that we must solve:
>    * Add new syntax to support select with timestamps. We should support
> selecting a single version, multiple versions within a time range, and a
> given number of versions. For example:
>      // select all versions within the specified time range
>      select * from test timestamps min, max;
>      // select a specified version
>      select * from test timestamps ts;
>      // select the specified number of versions
>      select * from test version number;
>      // select the specified number of versions within a time range
>      select * from test version number timestamps min, max;
>      Note that this is not standard SQL syntax, so it is not recommended.
>    * Timestamp is a Cell-level property in HBase, so we should support the
> same thing in Phoenix. But how can we allow different timestamps for
> different columns in the same row? I modified the ResultSet class and added
> some methods like 'public Map<Long, T> getAllT(index)' to return all
> selected versions for a single column. One can call this method on
> different columns of the same row to retrieve everything needed.
> Users must use PhoenixResultSet instead of ResultSet, which is not
> recommended either.
>    * How do we handle index updates/selects for multi-version? This is a
> messy problem, so my implementation did not support multi-version for index
> tables.
>    * It does not support GROUP BY, ORDER BY, or any nested query/upsert.
>    * For batch commit, when you upsert the same row with different
> timestamps, Phoenix can only commit the last timestamp you set, so doing
> this is meaningless. I simply forbid this scenario.
>    * Phoenix encodes the KVs into one Cell on the RS side, but if we want
> to return multiple versions for different columns, especially different
> timestamps for different columns, we must not do the encoding. So we must
> modify the internals of Phoenix to support a brand new read path to do this.
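
The selection semantics proposed in the first two points above can be modeled in memory as follows. This is only a toy sketch, not code from William's branch or from Phoenix: the class `CellVersions` and its method names are hypothetical, and treating both ends of the time range as inclusive is an assumption, since the message does not say.

```java
import java.util.Collections;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of all versions of one column in one row.
class CellVersions<T> {
    // timestamp -> value, newest first.
    private final NavigableMap<Long, T> versions =
            new TreeMap<>(Collections.reverseOrder());

    void put(long timestamp, T value) {
        versions.put(timestamp, value);
    }

    // "timestamps ts": the value written at exactly this timestamp, or null.
    T at(long timestamp) {
        return versions.get(timestamp);
    }

    // "timestamps min, max": every version with min <= ts <= max (assumed
    // inclusive). Throws IllegalArgumentException if min > max.
    NavigableMap<Long, T> between(long min, long max) {
        // The map is in descending order, so the range starts at max.
        return versions.subMap(max, true, min, true);
    }

    // "version number": the newest `count` versions.
    NavigableMap<Long, T> latest(int count) {
        return firstN(versions, count);
    }

    // "version number timestamps min, max": newest `count` versions in range.
    NavigableMap<Long, T> latestBetween(int count, long min, long max) {
        return firstN(between(min, max), count);
    }

    // Copy the first `count` entries of a newest-first map.
    private static <V> NavigableMap<Long, V> firstN(NavigableMap<Long, V> source, int count) {
        NavigableMap<Long, V> out = new TreeMap<>(Collections.reverseOrder());
        for (Map.Entry<Long, V> e : source.entrySet()) {
            if (out.size() >= count) {
                break;
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}
```

Each column of a row would get its own `CellVersions` instance, which matches the idea of calling a `getAllT`-style method per column and receiving a timestamp-to-value map.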
>
>    Besides the huge implementation effort, IMHO the primary problem is
> that it's not easy to implement this feature properly, as each user may
> have different requirements. You can implement this feature in your own
> branch, but I don't know the best way to support this in an official
> Phoenix release. What do you think of this? Any suggested design?
>
>   Thanks.
>   William.
>
> At 2016-10-13 18:12:56, "Yang Zhang" <zhang.yang.dm@gmail.com> wrote:
>
> Hello everyone
>
> I saw that we can create a Phoenix table from an existing HBase table
> (for details, see
> <https://phoenix.apache.org/faq.html#How_I_map_Phoenix_table_to_an_existing_HBase_table>).
> My question is whether Phoenix can support historical versions of my row.
>
> I am trying to use Phoenix to store some info which has a lot of common
> columns, such as a table "T1 ( c1, c2, c3, c4 )": many rows share the
> same c1, c2, c3, and the variable column is c4.
> Using HBase we can put 'T1', 'key1', 'f:c4', 'new value', timestamp
>
> And I can get previous versions of this row. They all share the same
> c1, c2, c3, which HBase stores only once.
>
> Does Phoenix support querying historical versions of my row?
>
> I found this JIRA link <https://issues.apache.org/jira/browse/PHOENIX-590>,
> which is the same as my question.
>
> Hadoop is used for big data, and multiple versions can help us reduce
> unnecessary data.
> I think Phoenix should support this feature too.
>
> If Phoenix shouldn't support multiple versions, please tell me the reason.
>
>
> Anyway, thanks for your help, First
