phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Problem in finding the largest value of an indexed column
Date Thu, 02 Jul 2015 16:46:47 GMT
Thanks, Yufan. I found an issue and filed PHOENIX-2096 with a patch. Would
you mind confirming that this fixes the issue you're seeing?

    James

On Thu, Jul 2, 2015 at 9:45 AM, Yufan Liu <yliu33@kent.edu> wrote:

> I'm using 4.4.0-HBase-0.98
>
> 2015-07-01 22:31 GMT-07:00 James Taylor <jamestaylor@apache.org>:
>
>> Yufan,
>> What version of Phoenix are you using?
>> Thanks,
>> James
>>
>> On Wed, Jul 1, 2015 at 2:34 PM, Yufan Liu <yliu33@kent.edu> wrote:
>>
>>> When I made more tests, I find that this problem happens after table got
>>> split.
>>>
>>> Here is the DDL I use to create table and index:
>>> CREATE TABLE IF NOT EXISTS t1 (
>>> uid BIGINT NOT NULL,
>>> timestamp BIGINT NOT NULL,
>>> eventName VARCHAR
>>> CONSTRAINT my_pk PRIMARY KEY (uid,  timestamp)) COMPRESSION='SNAPPY';
>>>
>>> CREATE INDEX timestamp_index ON t1 (timestamp) INCLUDE (eventName)
>>>
>>> Attach is the sample data I used for test. It has about 4000 rows, when
>>> the timestamp_index table has one region, the query returns correct result:
>>> 1433334048443, but when I manually split it into 4 regions (use hbase
>>> tool), it returns 1433333024961.
>>>
>>> Let know if you find anything. Thanks!
>>>
>>>
>>> 2015-07-01 11:27 GMT-07:00 James Taylor <jamestaylor@apache.org>:
>>>
>>>> If you could put a complete test (including your DDL and upsert of
>>>> data), that would be much appreciated.
>>>> Thanks,
>>>> James
>>>>
>>>> On Wed, Jul 1, 2015 at 11:20 AM, Yufan Liu <yliu33@kent.edu> wrote:
>>>>
>>>>> I have tried to use query "SELECT timestamp FROM t1 ORDER BY timestamp
>>>>> DESC NULLS LAST LIMIT 1". But it still returns the same unexpected result.
>>>>> There seems to be some internal problems related.
>>>>>
>>>>> 2015-06-30 18:03 GMT-07:00 James Taylor <jamestaylor@apache.org>:
>>>>>
>>>>>> Yes, reverse scan will be leveraged when possible. Make you use NULLS
>>>>>> LAST in your ORDER BY as rows are ordered with nulls first.
>>>>>>
>>>>>> On Tue, Jun 30, 2015 at 5:25 PM, Yufan Liu <yliu33@kent.edu>
wrote:
>>>>>>
>>>>>>> I used the HBase reverse scan to find the last row on the index
>>>>>>> table. It returned the expected result. I would like to know
is Phoenix's
>>>>>>> "ORDER BY"
>>>>>>> and "DESC" implemented based on HBase reverse scan?
>>>>>>>
>>>>>>> 2015-06-26 17:25 GMT-07:00 Yufan Liu <yliu33@kent.edu>:
>>>>>>>
>>>>>>>> Thank you anyway, Michael!
>>>>>>>>
>>>>>>>> 2015-06-26 17:21 GMT-07:00 Michael McAllister <
>>>>>>>> mmcallister@homeaway.com>:
>>>>>>>>
>>>>>>>>>  OK, I’m a Phoenix newbie, so that was the extent of
the advice I
>>>>>>>>> could give you. There are people here far more experienced
than I am who
>>>>>>>>> should be able to give you deeper advice. Have a great
weekend!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Mike
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From:* Yufan Liu [mailto:yliu33@kent.edu]
>>>>>>>>> *Sent:* Friday, June 26, 2015 7:19 PM
>>>>>>>>> *To:* user@phoenix.apache.org
>>>>>>>>> *Subject:* Re: Problem in finding the largest value of
an indexed
>>>>>>>>> column
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Michael,
>>>>>>>>>
>>>>>>>>> Thanks for the advice, for the first one, it's "CLIENT
67-CHUNK
>>>>>>>>> PARALLEL 1-WAY FULL SCAN OVER TIMESTAMP_INDEX; SERVER
FILTER BY FIRST KEY
>>>>>>>>> ONLY; SERVER AGGREGATE INTO SINGLE ROW" which is as expected.
For the
>>>>>>>>> second one, it's "CLIENT 67-CHUNK SERIAL 1-WAY REVERSE
FULL SCAN OVER
>>>>>>>>> TIMESTAMP_INDEX; SERVER FILTER BY FIRST KEY ONLY; SERVER
1 ROW LIMIT" which
>>>>>>>>> looks correct, but still returns the unexpected result.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-06-26 16:59 GMT-07:00 Michael McAllister <
>>>>>>>>> mmcallister@homeaway.com>:
>>>>>>>>>
>>>>>>>>> Yufan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Have you tried using the EXPLAIN command to see what
plan is being
>>>>>>>>> used to access the data?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Michael McAllister
>>>>>>>>>
>>>>>>>>> Staff Data Warehouse Engineer | Decision Systems
>>>>>>>>>
>>>>>>>>> mmcallister@HomeAway.com | C: 512.423.7447 | skype:
>>>>>>>>> michael.mcallister.ha <zimmkate@hotmail.com> |
webex:
>>>>>>>>> https://h.a/mikewebex
>>>>>>>>>
>>>>>>>>> [image: Description: Description: cid:3410354473_30269081]
>>>>>>>>>
>>>>>>>>> This electronic communication (including any attachment)
is
>>>>>>>>> confidential.  If you are not an intended recipient of
this communication,
>>>>>>>>> please be advised that any disclosure, dissemination,
distribution, copying
>>>>>>>>> or other use of this communication or any attachment
is strictly
>>>>>>>>> prohibited.  If you have received this communication
in error, please
>>>>>>>>> notify the sender immediately by reply e-mail and promptly
destroy all
>>>>>>>>> electronic and printed copies of this communication and
any attachment.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *From:* Yufan Liu [mailto:yliu33@kent.edu]
>>>>>>>>> *Sent:* Friday, June 26, 2015 6:31 PM
>>>>>>>>> *To:* user@phoenix.apache.org
>>>>>>>>> *Subject:* Problem in finding the largest value of an
indexed
>>>>>>>>> column
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We have created a table (eg, t1), and a global index
of one
>>>>>>>>> numeric column of t1 (eg, timestamp). Now we want to
find the largest value
>>>>>>>>> of timestamp, we have tried two approaches:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. select max(timestamp) from t1; This query takes forever
to
>>>>>>>>> finish, so I think it maybe doing a full table scan/comparison
.
>>>>>>>>>
>>>>>>>>> 2. select timestamp from t1 order by timestamp desc limit
1; This
>>>>>>>>> query finished fast, but the result it returns is far
from the largest
>>>>>>>>> value. It seems it just return the largest value for
a certain range of
>>>>>>>>> data.
>>>>>>>>>
>>>>>>>>> Did anyone else encounter this issue/have any suggestion?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Yufan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> best,
>>>>>>>>> Yufan
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> best,
>>>>>>>> Yufan
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> best,
>>>>>>> Yufan
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> best,
>>>>> Yufan
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> best,
>>> Yufan
>>>
>>>
>>
>
>
> --
> best,
> Yufan
>
>

Mime
View raw message