phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Heather, James (ELS)" <james.heat...@elsevier.com>
Subject Re: querying time for Apache Phoenix
Date Wed, 27 Jul 2016 15:07:33 GMT
- select * from documents where profile_id = ? (multiple rows returned, doing a full scan)
             --- 5 sec

See https://phoenix.apache.org/secondary_indexing.html#Index_Usage for when an index is used/not
used, but by default a global index won't be used if not all columns are contained in the
index (i.e. covered index).

If the index on profile_id isn't covered, it's much quicker, it seems, to do

SELECT * FROM documents WHERE id IN (SELECT id FROM documents WHERE profile_id = ?)

where 'id' is the primary key. In other words, do it in two stages: retrieve the primary key
using the global index, and then look up the rows using the primary key.

Is there any reason that Phoenix doesn't do this by default, to avoid the full scan?


- select profile_id from documents_test order by added desc limit 1 (index on added) --- 5.5
sec

If PROFILE_ID is not in the primary key constraint of the table, make sure to include it in
your index on the ADDED column.

I don't think there's a similar trick with an ORDER BY clause, is there?

James


________________________________

Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5
1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.
Mime
View raw message