Hi Otis,
That's an excellent idea. Phoenix does support (1) & (2), but we don't support adding a secondary index on a dynamic column. However, there's really no reason why we couldn't - we've never had anyone ask for this. Our mutable secondary index support is done at the HBase level, so as long as only Puts and Deletes are done on the data, it should work fine. We'd just need to add the syntax for dynamic column declaration for CREATE INDEX to our grammar.

What's the use case you have in mind? Keep in mind, too, that adding secondary indexes has an impact on write performance (from the HBase POV, you're doing two Puts instead of one, and there's some cost associated with the incremental maintenance).
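
To make the write cost concrete, here is a rough sketch of what incremental index maintenance looks like. It is a toy in-memory model in Python, not Phoenix or HBase code: the class ToyIndexedTable, its methods, and the write_ops counter are all made up for illustration, and it assumes a single indexed column and only Puts and Deletes against the data.

```python
from collections import defaultdict


class ToyIndexedTable:
    """Toy model of a data table plus one secondary index (hypothetical, not HBase/Phoenix)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.data = {}                  # row key -> {column: value}
        self.index = defaultdict(set)   # indexed value -> set of row keys
        self.write_ops = 0              # simulated Puts/Deletes across data and index

    def put(self, row_key, columns):
        old_row = self.data.get(row_key, {})
        old_value = old_row.get(self.indexed_column)
        merged = {**old_row, **columns}
        new_value = merged.get(self.indexed_column)

        self.data[row_key] = merged
        self.write_ops += 1             # the Put on the data table

        if old_value != new_value:
            if old_value is not None:
                self._index_remove(old_value, row_key)
                self.write_ops += 1     # remove the stale index entry
            if new_value is not None:
                self.index[new_value].add(row_key)
                self.write_ops += 1     # the extra Put on the index

    def delete(self, row_key):
        row = self.data.pop(row_key, None)
        if row is None:
            return
        self.write_ops += 1             # the Delete on the data table
        value = row.get(self.indexed_column)
        if value is not None:
            self._index_remove(value, row_key)
            self.write_ops += 1         # the matching Delete on the index

    def _index_remove(self, value, row_key):
        keys = self.index.get(value)
        if keys is None:
            return
        keys.discard(row_key)
        if not keys:
            del self.index[value]

    def lookup(self, value):
        """Return the row keys whose indexed column equals value."""
        return sorted(self.index.get(value, ()))
```

In this model, the first write of a row that sets the indexed column costs two simulated writes (one on the data, one on the index), and changing the indexed value costs three because the stale index entry has to be removed as well. A row that never sets the indexed column costs only one.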


On Tue, Mar 25, 2014 at 7:58 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

When I saw "Schema on read" my heart jumped because I thought that meant:

1) being able to insert rows without having to define columns ahead of time, and 

2) being able to query against any column in a row without having to know which columns one will be searching against.  For example, if a row with "anyRandomColumn" gets added, I could run a query like select .... where anyRandomColumn='foo' and select that row even though I didn't set a secondary index on anyRandomColumn.

But after reading a bit about Phoenix I think Phoenix can do 1), but cannot do 2) -- one has to tell it which columns to build indexes on.  Is this correct?

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/