phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Bulk loading and index
Date Mon, 27 Jun 2016 19:51:22 GMT
Tongzhou,
Please file a JIRA for supporting ALTER INDEX .... REBUILD ASYNC. This
would be a good addition and not very difficult to implement. Contributions
are, of course, always welcome.
Regards,
James

On Sun, Jun 26, 2016 at 2:45 AM, Ankit Singhal <ankitsinghal59@gmail.com>
wrote:

> Hi Tongzhou,
>
> Maybe you can try dropping the current index and, after your upload is
> completed, creating an async index. You can then use IndexTool to
> build the index from scratch.
>
> Source: https://phoenix.apache.org/secondary_indexing.html
>
> CREATE INDEX async_index ON my_schema.my_table (v) ASYNC
>
>
> But if you are using only CSVBulkLoadTool for the bulk load, it will
> automatically prepare and bulk load the index data as well, so separate
> index maintenance would not be required.
>
> Regards,
> Ankit Singhal
>
> On Sat, Jun 25, 2016 at 4:13 PM, Tongzhou Wang (Simon) <
> tongzhou.wang.1994@gmail.com> wrote:
>
>> Hi Josh,
>>
>> First, thanks for the response.
>>
>> As far as I can tell, a disabled index cannot be directly changed to
>> USABLE. It must be rebuilt first. I am aware that I can do ALTER INDEX ....
>> REBUILD. But, if I understand correctly, this is single-threaded and slow.
>> I'm wondering if I can use the IndexTool MapReduce job in this case.
>>
>> About TTL, I did some experiments. It turns out that Phoenix does not
>> automatically remove an index entry when the table entry expires under
>> the TTL setting. However, it is possible to set the same TTL on the index
>> table so that the index stays in sync.
>>
>> Best,
>> Tongzhou
>>
>> > On Jun 25, 2016, at 15:31, Josh Elser <josh.elser@gmail.com> wrote:
>> >
>> > Hi Tongzhou,
>> >
>> > Maybe you can try `ALTER INDEX index ON table DISABLE`. And then the
>> > same command with USABLE after you update the index. Are you attempting
>> > to do this incrementally? Like, a bulk load of data then a bulk load of
>> > index data, repeat?
>> >
>> > Regarding the TTL, I assume so, but I'm not certain.
>> >
>> > Tongzhou Wang wrote:
>> >> Hi all,
>> >>
>> >> I am writing to ask if there is a way to disable an index, then update
>> >> it through the MapReduce job (IndexTool). I want to bulk load a huge
>> >> amount of data, but index maintenance makes it very slow. It would be
>> >> great if I could disable an index, load the data, then use a MapReduce
>> >> job to bring it back to a usable state.
>> >>
>> >> Also, does Phoenix's secondary index maintenance take TTL into account?
>> >>
>> >> Thanks,
>> >> Tongzhou
>>
>
>
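
For reference, the sequence Ankit describes (drop the index, bulk load,
recreate it as an async index, then build it with IndexTool) might look
roughly like the sketch below. The index name my_index is hypothetical, as is
the assumption that the index covers column v; my_schema.my_table and
async_index come from the example in the thread. The IndexTool invocation in
the comments follows the form shown on the secondary indexing page linked
above, and the output path is only a placeholder.

```sql
-- 1. Drop the existing index so the bulk load does not have to maintain it.
--    (my_index is a hypothetical name for the index being replaced.)
DROP INDEX IF EXISTS my_index ON my_schema.my_table;

-- 2. Bulk load the data into my_schema.my_table here.

-- 3. Recreate the index with ASYNC: this registers the index but leaves it
--    unpopulated until the MapReduce job below has run.
CREATE INDEX async_index ON my_schema.my_table (v) ASYNC;

-- 4. Populate the index with the IndexTool MapReduce job, run from a shell
--    (not from SQL); the output path is a placeholder:
--    ${HBASE_HOME}/bin/hbase org.apache.phoenix.mapreduce.index.IndexTool \
--        --schema MY_SCHEMA --data-table MY_TABLE \
--        --index-table ASYNC_INDEX --output-path ASYNC_INDEX_HFILES
```

According to the same page, an async index should only become active and be
used by queries once the MapReduce job has completed.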
