phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nathan Davis <nathan.da...@salesforce.com>
Subject Re: monitoring status of CREATE INDEX operation
Date Fri, 12 Aug 2016 15:54:02 GMT
Thanks James, all CAPS did the trick!

Yes, the event table is already IMMUTABLE_ROWS=true.

Thanks again,
Nathan


On Fri, Aug 12, 2016 at 10:59 AM, James Taylor <jamestaylor@apache.org>
wrote:

> In your IndexTool invocation, try use all caps for your table and index
> name. Phoenix normalizes names by upper casing them (unless they're in
> double quotes).
>
> One other unrelated question: did you declare your event table with
> IMMUTABLE_ROWS=true (assuming it's a write-once table)? If not, you can use
> the ALTER TABLE trans.event SET IMMUTABLE_ROWS=true command to change it.
> This will give you some performance benefit.
>
> Thanks,
> James
>
> On Fri, Aug 12, 2016 at 7:39 AM, Nathan Davis <nathan.davis@salesforce.com
> > wrote:
>
>> Thanks for the detailed info. I took the advice of using the ASYNC
>> method. The CREATE statement executes fine and I end up with an index table
>> showing in state BUILDING. When I kick off the MR job with `hbase
>> org.apache.phoenix.mapreduce.index.IndexTool --schema trans --data-table
>> event --index-table event_object_id_idx_b --output-path
>> EVENT_OBJECT_ID_IDX_B_HFILES` I get this odd error:
>>
>> 2016-08-12 14:29:40,073 ERROR [main] index.IndexTool:  An exception
>>> occured while performing the indexing job : java.lang.IllegalArgumentException:
>>>  trans.event_object_id_idx_b is not an index table for trans.event
>>> at org.apache.phoenix.mapreduce.index.IndexTool.run(IndexTool.java:187)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>> at org.apache.phoenix.mapreduce.index.IndexTool.main(IndexTool.java:378)
>>>
>>
>>
>>
>>
>> My CREATE INDEX was as follows:
>>
>> create index if not exists event_object_id_idx_b on trans.event (
>>>     object_id
>>> ) ASYNC UPDATE_CACHE_FREQUENCY=60000;
>>
>>
>>
>> On Thu, Aug 11, 2016 at 9:40 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Hi Nathan,
>>> If your table is large, I'd recommend creating your index
>>> asynchronously. To do that, you'd add the ASYNC keyword to the end of your
>>> CREATE INDEX call. In this case, the index will be built through Map Reduce
>>> in a more resilient manner (i.e. the client going up or down won't impact
>>> it and you have the regular retries of a MR job). On AWS, you'll need to
>>> manually start the MR job, but at SFDC we have a CRON job that'll start it
>>> for you automatically (this is open source too, so it'd be good to get that
>>> up and running in AWS as well). See https://phoenix.apache.org
>>> /secondary_indexing.html#Index_Population for details.
>>>
>>> If you don't run it asynchronously, then you'll need to increase the
>>> query timeout (i.e. phoenix.query.timeoutMs config property) to be larger
>>> than the time it'll take to build the index. If the client goes down before
>>> the CREATE INDEX call finished (or the query times out), then the index
>>> build will stop (and unfortunately will need to be run again).
>>>
>>> To monitor the index build, there are a few ways - if running through
>>> MR, then you can monitor the MR job in the standard way. If running
>>> synchronously, you can monitor the index table - you'll see new regions
>>> created as splits occur, or you could query the SYSTEM.STATS table (which
>>> gets populated as splits and compactions happen), or you could run a
>>> count(*) query directly against the index table (though that'll put more
>>> load on your system because it'll require a full table scan).
>>>
>>> HTH. Thanks,
>>> James
>>>
>>> On Thu, Aug 11, 2016 at 5:51 PM, Nathan Davis <
>>> nathan.davis@salesforce.com> wrote:
>>>
>>>> Hi All,
>>>> I executed a CREATE INDEX against a fairly large table. And I received
>>>> a timeout error after a minute or two, which I understand is expected.
>>>> `!tables` in sqlline shows the index is still in BUILDING state after 2
>>>> hours, which may be accurate since it is a pretty large table and my
>>>> cluster is just a smallish EMR throwaway.
>>>>
>>>> My question is: Is there some way to verify that the index is in fact
>>>> still being built? Perhaps some HBase logs or the UI or some hbase shell
>>>> command? Unfortunately I am just as new to HBase as I am to Phoenix itself.
>>>>
>>>> Thanks,
>>>>  -nathan
>>>>
>>>
>>>
>>
>

Mime
View raw message