phoenix-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Custom Indexing Plug-in for Phoenix
Date Tue, 04 Apr 2017 03:13:17 GMT
Are you talking about indexing to external systems, Randy? My initial
thought was that you were asking about different HBase indexing
schemas (essentially interfaces that decouple the logical SQL
operators from the physical table scans, allowing you to build more
efficient table structures for certain cases), but I saw Jonathan went
in a different direction :)

On Mon, Apr 3, 2017 at 8:51 PM, Jonathan Leech <jonathaz@gmail.com> wrote:
> Take a look at Solr and Lucene. You should be able to do a text search on the
> HBase data written via Phoenix. It works via the HBase replication mechanism,
> so it should be near real-time. I think you would have to use the Solr API to
> do the initial search, which would get you the HBase row key, which you could
> parse and use in a follow-up Phoenix query for additional data. Note that I
> haven't done any of the above myself, so your mileage may vary.
>
> On Apr 3, 2017, at 6:27 PM, Randy <ruweih@gmail.com> wrote:
>
> Wondering if anyone knows whether there is a way to swap in a custom
> indexing implementation while leveraging all the other functionality of
> Phoenix. The initial goal is just SELECT queries, but it would be nice to
> have custom index maintenance integrated into the record life cycle as well.
>
> Phoenix already supports secondary indexes, but we need something more
> flexible for really large data sets where the format and quality vary.
>
> For example, assume we have a table "PEOPLE" with a column "NAME" that
> stores a person's name. If there is a record with "Joe Smith" as the value of
> the "NAME" column, it would be really powerful if we could find it using
> variants or a partial name as the criteria. Ideally, all of the following
> queries would find the same record if we could plug a custom indexing
> implementation into Phoenix:
>
> SELECT * FROM PEOPLE WHERE NAME='Joe Smith';
> SELECT * FROM PEOPLE WHERE NAME='Smith,Joe';
> SELECT * FROM PEOPLE WHERE NAME='Joseph Smith';
>
> Given that secondary indexes have global and local implementations, I would
> imagine there is already some level of abstraction for consuming the index.
> I'm not expecting an officially supported API; just some guidance on
> where to start would be greatly appreciated.
>
> Thanks,
>
> Randy
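
For illustration only, the name-variant lookup Randy describes can be sketched outside Phoenix as a normalized-key index. This is not a Phoenix API or extension point: the nickname table, the function names, and the row keys below are all hypothetical, and a real deployment would need far more robust name matching. The idea is simply that each NAME value is reduced to a canonical key at write time, and the query literal is reduced the same way at read time.

```python
# Hypothetical nickname table (an assumption for this sketch, not real data).
NICKNAMES = {"joe": "joseph", "joey": "joseph"}


def normalize_name(raw: str) -> str:
    """Reduce a name to a canonical 'given family' key."""
    text = raw.strip().lower()
    if "," in text:
        # "Smith,Joe" style: family name first, given name second.
        family, given = [part.strip() for part in text.split(",", 1)]
    else:
        parts = text.split()
        if not parts:
            return ""
        # "Joe Smith" style: last token is treated as the family name.
        given, family = " ".join(parts[:-1]), parts[-1]
    # Expand known nicknames token by token.
    given = " ".join(NICKNAMES.get(tok, tok) for tok in given.split())
    return f"{given} {family}".strip()


def build_index(rows):
    """Map each canonical name key to the row keys that carry it."""
    index = {}
    for row_key, name in rows:
        index.setdefault(normalize_name(name), []).append(row_key)
    return index


def lookup(index, query):
    """Return the row keys whose canonical name matches the query."""
    return index.get(normalize_name(query), [])


# Hypothetical row keys standing in for HBase row keys of the PEOPLE table.
index = build_index([("row-001", "Joe Smith"), ("row-002", "Ann Lee")])
```

With this sketch, `lookup(index, "Joe Smith")`, `lookup(index, "Smith,Joe")`, and `lookup(index, "Joseph Smith")` all resolve to the same row key, which could then drive a follow-up Phoenix query by primary key.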
