At postOpen, the location of the Lucene directory to be used for the region is set using the value of "h_region.getRegionInfo().getEncodedName();", so whenever prePut is called, the index of the column is stored in the directory that was set during postOpen. So basically the Lucene operations are "tied" to the HBase hooks.

Regards,

Cheyenne O. Forbes



On Wed, Apr 19, 2017 at 4:21 PM, Sergey Soldatov <sergeysoldatov@gmail.com> wrote:
How do you handle HBase region splits and merges with such an architecture?

Thanks,
Sergey 

On Wed, Apr 19, 2017 at 9:22 AM, Cheyenne Forbes <cheyenne.osanu.forbes@gmail.com> wrote:
I created an HBase coprocessor that stores/deletes text indexes with Lucene. The indexes are stored on HDFS (for backup, replication, etc.). The indexes "mirror" the regions, so if the region holding a column is at "hdfs://localhost:9000/hbase/region_name", its index is stored at "hdfs://localhost:9000/lucene/region_name". I did this just in case I needed to delete (or perform some other operation on) an entire region for whatever reason. The id of the row, the column and the query are passed to a Lucene BooleanQuery to get a search score that is used to sort the data: "SEARCH_SCORE(primary_key, text_column_name, search_query)". So I am trying to find a way to get the "HRegion" of the region server the code is running on, to either 1. get the region name and the Hadoop FileSystem or 2. get access to the coprocessor on that server, which already has the values in option 1.
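
For illustration only, the "mirror" path convention described above could be sketched in plain Java roughly as follows. The class and method names (LuceneMirrorPath, mirror) are hypothetical and not part of the actual coprocessor, and the sketch assumes the region directory sits directly under an "/hbase" root as in the example URIs; a real HBase layout has more path components.

```java
import java.net.URI;

// Hypothetical helper: maps a region directory under /hbase to the
// mirrored Lucene index directory under /lucene, as in the example URIs.
final class LuceneMirrorPath {
    private static final String HBASE_ROOT = "/hbase/";
    private static final String LUCENE_ROOT = "/lucene/";

    static String mirror(String regionUri) {
        URI uri = URI.create(regionUri);
        String path = uri.getPath();
        if (path == null || !path.startsWith(HBASE_ROOT)) {
            throw new IllegalArgumentException(
                "Not under " + HBASE_ROOT + ": " + regionUri);
        }
        String regionName = path.substring(HBASE_ROOT.length());
        // Keep scheme and authority (host:port); swap only the root directory.
        return uri.getScheme() + "://" + uri.getAuthority()
            + LUCENE_ROOT + regionName;
    }
}
```

With this convention, deleting or moving the index for a whole region only requires knowing the region name, which is the point of mirroring the layout.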

Regards,

Cheyenne O. Forbes



On Wed, Apr 19, 2017 at 10:59 AM, James Taylor <jamestaylor@apache.org> wrote:
Can you describe the functionality you're after at a high level in terms of a use case (rather than an implementation idea/detail) and we can discuss any options wrt potential new features?

On Wed, Apr 19, 2017 at 8:53 AM Cheyenne Forbes <cheyenne.osanu.forbes@gmail.com> wrote:
I'd still need " HRegion MyVar; ", because I'd still need the name of the region where the row of the id passed to the UDF is located, and the value returned by "getFilesystem()" of "HRegion". What do you recommend that I do?

Regards,

Cheyenne O. Forbes



On Tue, Apr 18, 2017 at 6:27 PM, Sergey Soldatov <sergeysoldatov@gmail.com> wrote:
I mean you need to modify the Phoenix code itself to properly support this kind of feature.

Thanks,
Sergey

On Tue, Apr 18, 2017 at 3:52 PM, Cheyenne Forbes <cheyenne.osanu.forbes@gmail.com> wrote:
Could you explain a little more what you mean by that?

Regards,

Cheyenne O. Forbes


On Tue, Apr 18, 2017 at 4:36 PM, Sergey Soldatov <sergeysoldatov@gmail.com> wrote:
I may be wrong, but you have chosen the wrong approach. This kind of integration needs to be (should be) done on the Phoenix layer, in the way that global/local indexes are implemented.

Thanks,
Sergey 

On Tue, Apr 18, 2017 at 12:34 PM, Cheyenne Forbes <cheyenne.osanu.forbes@gmail.com> wrote:
I am creating a plugin that uses Lucene to index text fields, and I need to access getConf() and getFilesystem() of HRegion. The Lucene indexes are split with the regions, so I need " HRegion MyVar; ". I am positive the UDF will run on the region server and not the client.

Regards,

Cheyenne O. Forbes


On Tue, Apr 18, 2017 at 1:22 PM, James Taylor <jamestaylor@apache.org> wrote:
The short answer is "no". Your UDF may be executed on the client side as well (depending on the query), and there is of course no HRegion available from the client.

On Tue, Apr 18, 2017 at 11:10 AM Sergey Soldatov <sergeysoldatov@gmail.com> wrote:
Well, theoretically there is a way: have a coprocessor that keeps a static public map from the current rowkey processed by Phoenix to the correlated HRegion instance, and get this HRegion using the key that is processed by the evaluate function. But it's a completely wrong approach for both HBase and Phoenix. And it's not clear to me why a SQL query may need access to the region internals.
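
Purely to make the idea concrete (and not as a recommendation, given the objection above), such a rowkey-to-region map might look roughly like the plain-Java sketch below. All names here (RegionRegistry, register, lookup, unregister) are hypothetical, and the type parameter R stands in for HRegion so the sketch does not depend on HBase classes. In the setup described above, a single instance would be held in a public static field shared by the coprocessor and the UDF.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: R stands in for HBase's HRegion so the sketch
// compiles without HBase on the classpath.
final class RegionRegistry<R> {
    // Row keys are byte arrays; wrap them so equals/hashCode compare content.
    private static final class RowKey {
        private final byte[] bytes;
        RowKey(byte[] bytes) { this.bytes = bytes.clone(); }
        @Override public boolean equals(Object o) {
            return o instanceof RowKey && Arrays.equals(bytes, ((RowKey) o).bytes);
        }
        @Override public int hashCode() { return Arrays.hashCode(bytes); }
    }

    private final Map<RowKey, R> current = new ConcurrentHashMap<>();

    // Would be called from a coprocessor hook before the row is evaluated.
    void register(byte[] rowKey, R region) {
        current.put(new RowKey(rowKey), region);
    }

    // Would be called from the UDF's evaluate step; returns null when no
    // entry exists (for example, when the UDF runs on the client).
    R lookup(byte[] rowKey) {
        return current.get(new RowKey(rowKey));
    }

    // Must be called once the row is done, or the map grows without bound.
    void unregister(byte[] rowKey) {
        current.remove(new RowKey(rowKey));
    }
}
```

The sketch also shows why the approach is fragile: entries must be removed explicitly, and a lookup silently returns nothing whenever the function is evaluated outside the region server.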

Thanks,
Sergey  

On Mon, Apr 17, 2017 at 10:04 PM, Cheyenne Forbes <cheyenne.osanu.forbes@gmail.com> wrote:
So there is no way of getting an HRegion in a UDF?