phoenix-user mailing list archives

From Cheyenne Forbes <cheyenne.osanu.for...@gmail.com>
Subject Re: How can I "use" a hbase co-processor from a User Defined Function?
Date Wed, 19 Apr 2017 23:03:10 GMT
At postOpen, the location of the Lucene directory to be used for the region
is set from the value of "h_region.getRegionInfo().getEncodedName()", so
whenever prePut is called, the index of the column is stored in the
directory that was set during postOpen. So basically the Lucene operations
are "tied" to HBase hooks.
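
A rough sketch of that wiring, in plain Java with no HBase or Lucene
dependency so it can be read on its own. The class name RegionIndexHooks,
the methods onRegionOpen/onPut and the luceneRoot parameter are all
hypothetical stand-ins for the postOpen and prePut hooks; the real
coprocessor would receive the encoded name from the region info instead
of a plain string.

```java
import java.net.URI;

// Hypothetical illustration only: not an HBase, Phoenix, or Lucene API.
final class RegionIndexHooks {
    private final String luceneRoot; // assumed root, e.g. "hdfs://localhost:9000/lucene"
    private URI indexDir;            // set once the region has been opened

    RegionIndexHooks(String luceneRoot) {
        // Drop a trailing slash so the joined path has exactly one separator.
        this.luceneRoot = luceneRoot.endsWith("/")
                ? luceneRoot.substring(0, luceneRoot.length() - 1)
                : luceneRoot;
    }

    // Stand-in for postOpen: derive the index directory from the encoded region name.
    void onRegionOpen(String encodedRegionName) {
        if (encodedRegionName == null || encodedRegionName.isEmpty()) {
            throw new IllegalArgumentException("encoded region name is required");
        }
        indexDir = URI.create(luceneRoot + "/" + encodedRegionName);
    }

    // Stand-in for prePut: every write goes to the directory chosen at open time.
    URI onPut() {
        if (indexDir == null) {
            throw new IllegalStateException("region has not been opened yet");
        }
        return indexDir;
    }
}
```

The point of the sketch is only that the directory is chosen once, when
the region opens, and every later write reuses it.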

Regards,

Cheyenne O. Forbes



On Wed, Apr 19, 2017 at 4:21 PM, Sergey Soldatov <sergeysoldatov@gmail.com>
wrote:

> How do you handle HBase region splits and merges with such an architecture?
>
> Thanks,
> Sergey
>
> On Wed, Apr 19, 2017 at 9:22 AM, Cheyenne Forbes <
> cheyenne.osanu.forbes@gmail.com> wrote:
>
>> I created an HBase co-processor that stores/deletes text indexes with
>> Lucene. The indexes are stored on HDFS (for backup, replication, etc.).
>> The indexes "mirror" the regions, so if the region for a column is at
>> "hdfs://localhost:9000/hbase/region_name" the index is stored at
>> "hdfs://localhost:9000/lucene/region_name". I did this just in case I
>> needed to delete (or perform some other operation on) an entire region
>> for whichever reason. The id of the row, the column and the query are
>> passed to a Lucene BooleanQuery to get a search score used to sort the data:
>> "SEARCH_SCORE(primary_key, text_column_name, search_query)". So I am trying
>> to find a way to get the "HRegion" of the region server the code is running
>> on, to either *1.* get the region name and the Hadoop FileSystem or *2.* get
>> access to the co-processor on that server, which already has the values in
>> option *1*.
>>
>> Regards,
>>
>> Cheyenne O. Forbes
>>
>>
>>
>> On Wed, Apr 19, 2017 at 10:59 AM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Can you describe the functionality you're after at a high level in terms
>>> of a use case (rather than an implementation idea/detail) and we can
>>> discuss any options wrt potential new features?
>>>
>>> On Wed, Apr 19, 2017 at 8:53 AM Cheyenne Forbes <
>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>
>>>> I'd still need "HRegion MyVar;", because I'd still need the name
>>>> of the region where the row of the id passed to the UDF is located and the
>>>> value returned by "getFilesystem()" of "HRegion". What do you
>>>> recommend that I do?
>>>>
>>>> Regards,
>>>>
>>>> Cheyenne O. Forbes
>>>>
>>>>
>>>>
>>>> On Tue, Apr 18, 2017 at 6:27 PM, Sergey Soldatov <
>>>> sergeysoldatov@gmail.com> wrote:
>>>>
>>>>> I mean you need to modify the Phoenix code itself to properly support
>>>>> this kind of feature.
>>>>>
>>>>> Thanks,
>>>>> Sergey
>>>>>
>>>>> On Tue, Apr 18, 2017 at 3:52 PM, Cheyenne Forbes <
>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>
>>>>>> Could you explain a little more what you mean by that?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Cheyenne O. Forbes
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 18, 2017 at 4:36 PM, Sergey Soldatov <
>>>>>> sergeysoldatov@gmail.com> wrote:
>>>>>>
>>>>>>> I may be wrong, but you have chosen the wrong approach. This kind of
>>>>>>> integration needs to be (should be) done on the Phoenix layer, in the
>>>>>>> way global/local indexes are implemented.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sergey
>>>>>>>
>>>>>>> On Tue, Apr 18, 2017 at 12:34 PM, Cheyenne Forbes <
>>>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>>>
>>>>>>>> I am creating a plugin that uses Lucene to index text fields and I
>>>>>>>> need to access *getConf()* and *getFilesystem()* of *HRegion*. The
>>>>>>>> Lucene indexes are split with the regions, so I need "HRegion
>>>>>>>> MyVar;". I am positive the UDF will run on the region server and
>>>>>>>> not the client.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Cheyenne O. Forbes
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 18, 2017 at 1:22 PM, James Taylor <
>>>>>>>> jamestaylor@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Shorter answer is "no". Your UDF may be executed on the client
>>>>>>>>> side as well (depending on the query) and there is of course no
>>>>>>>>> HRegion available from the client.
>>>>>>>>>
>>>>>>>>> On Tue, Apr 18, 2017 at 11:10 AM Sergey Soldatov <
>>>>>>>>> sergeysoldatov@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Well, theoretically there is a way of having a coprocessor that
>>>>>>>>>> keeps a public static map from the current rowkey processed by
>>>>>>>>>> Phoenix to the correlated HRegion instance, and getting this
>>>>>>>>>> HRegion using the key that is processed by the evaluate function.
>>>>>>>>>> But it's a completely wrong approach for both HBase and Phoenix.
>>>>>>>>>> And it's not clear to me why a SQL query may need access to the
>>>>>>>>>> region internals.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sergey
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 17, 2017 at 10:04 PM, Cheyenne Forbes <
>>>>>>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> So there is no way of getting HRegion in a UDF?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>
