phoenix-user mailing list archives

From Cheyenne Forbes <cheyenne.osanu.for...@gmail.com>
Subject Re: How can I "use" a hbase co-processor from a User Defined Function?
Date Fri, 12 May 2017 11:18:43 GMT
Any updates on how I'd go about getting *"HRegion"* in a UDF?

Regards,

Cheyenne O. Forbes

On Wed, Apr 19, 2017 at 6:03 PM, Cheyenne Forbes <
cheyenne.osanu.forbes@gmail.com> wrote:

> In postOpen, the location of the Lucene directory to be used for the region
> is set from the value of *"h_region.getRegionInfo().getEncodedName();"*, so
> whenever prePut is called, the index of the column is stored in the
> directory that was set during postOpen. So basically the Lucene operations
> are "tied" to the HBase hooks.
>
> Regards,
>
> Cheyenne O. Forbes
>
>
>
> On Wed, Apr 19, 2017 at 4:21 PM, Sergey Soldatov <
> sergeysoldatov@gmail.com> wrote:
>
>> How do you handle HBase region splits and merges with such an architecture?
>>
>> Thanks,
>> Sergey
>>
>> On Wed, Apr 19, 2017 at 9:22 AM, Cheyenne Forbes <
>> cheyenne.osanu.forbes@gmail.com> wrote:
>>
>>> I created an HBase co-processor that stores/deletes text indexes with
>>> Lucene. The indexes are stored on HDFS (for backup, replication, etc.).
>>> The indexes "mirror" the regions, so if the index for a column is at
>>> "hdfs://localhost:9000/hbase/region_name" the index is stored at
>>> "hdfs://localhost:9000/lucene/region_name". I did this just in case I
>>> needed to delete (or do some other operation on) an entire region for
>>> whatever reason. The id of the row, the column and the query are passed to
>>> a Lucene BooleanQuery to get a search score to use to sort the data:
>>> "SEARCH_SCORE(primary_key, text_column_name, search_query)". So I am trying
>>> to find a way to get the "HRegion" of the region server the code is running
>>> on to either *1.* get the region name and the Hadoop FileSystem or *2.* get
>>> access to the co-processor on that server, which already has the values in
>>> option *1*.
>>>
>>> Regards,
>>>
>>> Cheyenne O. Forbes
>>>
>>>
>>>
>>> On Wed, Apr 19, 2017 at 10:59 AM, James Taylor <jamestaylor@apache.org>
>>> wrote:
>>>
>>>> Can you describe the functionality you're after at a high level in
>>>> terms of a use case (rather than an implementation idea/detail) and we can
>>>> discuss any options wrt potential new features?
>>>>
>>>> On Wed, Apr 19, 2017 at 8:53 AM Cheyenne Forbes <
>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>
>>>>> I'd still need *"HRegion MyVar;"* because I'd still need the name of
>>>>> the region where the row of the id passed to the UDF is located, and the
>>>>> value returned by *"getFilesystem()"* of *"HRegion"*. What do you
>>>>> recommend that I do?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Cheyenne O. Forbes
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 18, 2017 at 6:27 PM, Sergey Soldatov <
>>>>> sergeysoldatov@gmail.com> wrote:
>>>>>
>>>>>> I mean you need to modify the Phoenix code itself to properly support
>>>>>> this kind of feature.
>>>>>>
>>>>>> Thanks,
>>>>>> Sergey
>>>>>>
>>>>>> On Tue, Apr 18, 2017 at 3:52 PM, Cheyenne Forbes <
>>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>>
>>>>>>> Could you explain a little more what you mean by that?
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Cheyenne O. Forbes
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 18, 2017 at 4:36 PM, Sergey Soldatov <
>>>>>>> sergeysoldatov@gmail.com> wrote:
>>>>>>>
>>>>>>>> I may be wrong, but you have chosen the wrong approach. This kind of
>>>>>>>> integration needs to be (should be) done on the Phoenix layer, in the
>>>>>>>> way that global/local indexes are implemented.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sergey
>>>>>>>>
>>>>>>>> On Tue, Apr 18, 2017 at 12:34 PM, Cheyenne Forbes <
>>>>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I am creating a plugin that uses Lucene to index text fields and I
>>>>>>>>> need to access *getConf()* and *getFilesystem()* of *HRegion*. The
>>>>>>>>> Lucene indexes are split with the regions, so I need
>>>>>>>>> *"HRegion MyVar;"*. I am positive the UDF will run on the region
>>>>>>>>> server and not the client.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Cheyenne O. Forbes
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Apr 18, 2017 at 1:22 PM, James Taylor <
>>>>>>>>> jamestaylor@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Shorter answer is "no". Your UDF may be executed on the client
>>>>>>>>>> side as well (depending on the query) and there is of course no
>>>>>>>>>> HRegion available from the client.
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 18, 2017 at 11:10 AM Sergey Soldatov <
>>>>>>>>>> sergeysoldatov@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Well, theoretically there is a way of having a coprocessor that
>>>>>>>>>>> will keep a static public map of the current rowkey processed by
>>>>>>>>>>> Phoenix and the correlated HRegion instance, and get this HRegion
>>>>>>>>>>> using the key that is processed by the evaluate function. But it's a
>>>>>>>>>>> completely wrong approach for both HBase and Phoenix. And it's not
>>>>>>>>>>> clear to me why a SQL query may need access to the region internals.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Sergey
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 17, 2017 at 10:04 PM, Cheyenne Forbes <
>>>>>>>>>>> cheyenne.osanu.forbes@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> so there is no way of getting HRegion in a UDF?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>
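For readers following the thread, the "static public map" idea Sergey
mentions could look roughly like the sketch below. It is plain Java with no
HBase or Phoenix classes: RegionContext and RegionRegistry are hypothetical
names invented for illustration, and the String fields stand in for the
values the thread wants from HRegion (the encoded region name and the file
system location). Sergey calls this a completely wrong approach, and James
points out that a UDF may also be evaluated on the client, where such a
lookup would find nothing, so treat this as an illustration of the idea
rather than a recommendation.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical holder for the per-region values discussed in the thread (not an HBase class). */
final class RegionContext {
    final String encodedRegionName;
    final String fileSystemUri; // assumption: e.g. "hdfs://localhost:9000"

    RegionContext(String encodedRegionName, String fileSystemUri) {
        this.encodedRegionName = encodedRegionName;
        this.fileSystemUri = fileSystemUri;
    }

    /** Mirrors the layout described in the thread: <fs>/lucene/<region_name>. */
    String luceneDirectory() {
        return fileSystemUri + "/lucene/" + encodedRegionName;
    }
}

/** Hypothetical static registry: a coprocessor hook would write, the UDF would read. */
final class RegionRegistry {
    // ByteBuffer compares by content, so it is usable as a map key for row-key bytes.
    private static final Map<ByteBuffer, RegionContext> BY_ROW_KEY = new ConcurrentHashMap<>();

    private RegionRegistry() {}

    /** Would be called from a region-side hook for the row currently being processed. */
    static void register(byte[] rowKey, RegionContext ctx) {
        // Clone so later changes to the caller's array cannot corrupt the key.
        BY_ROW_KEY.put(ByteBuffer.wrap(rowKey.clone()), ctx);
    }

    /** Would be called from the UDF; returns null when nothing is registered (e.g. on the client). */
    static RegionContext lookup(byte[] rowKey) {
        return BY_ROW_KEY.get(ByteBuffer.wrap(rowKey));
    }

    /** Removes the entry once the row is done, so the map does not grow without bound. */
    static void unregister(byte[] rowKey) {
        BY_ROW_KEY.remove(ByteBuffer.wrap(rowKey));
    }
}
```

A caller would have to handle the null case from lookup, and the sketch
leaves Sergey's question about region splits and merges unanswered, since
a registered context can go stale when a region changes.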
