phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Creating Covering index on Phoenix
Date Sun, 23 Oct 2016 21:29:28 GMT
Keep in mind that the CsvBulkLoadTool does not handle updating data
in-place. It expects each row to be unique, not an update of existing data.
If your data is write-once/append-only, you'll be fine, but otherwise you
should stick with using the JDBC APIs.
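
For what it's worth, here is a minimal sketch of the in-place (upsert) path,
going through SQL rather than the bulk loader. The table and column names are
hypothetical, and the `phoenixdb` client (a Python adapter that talks to the
Phoenix Query Server) is an assumption for illustration, not something from
this thread:

```python
# Hypothetical upsert path: Phoenix UPSERT overwrites an existing row with
# the same primary key, so repeated loads update data in place (unlike
# CsvBulkLoadTool, which assumes write-once rows).
UPSERT_SQL = (
    'UPSERT INTO "marketDataHbase" ("ticker", "timecreated", "price") '
    "VALUES (?, ?, ?)"
)

def upsert_rows(rows, url="http://localhost:8765/"):
    """Load (ticker, timecreated, price) tuples via the Query Server."""
    import phoenixdb  # assumed: the Python Phoenix Query Server adapter
    conn = phoenixdb.connect(url, autocommit=True)
    try:
        cur = conn.cursor()
        cur.executemany(UPSERT_SQL, rows)
    finally:
        conn.close()
```

Because UPSERT is a true write-or-overwrite, Phoenix also maintains any
secondary indexes on the table as part of the same call.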

You're free to just use HBase APIs (maybe that's better for your use
case?), but you won't get:
- JDBC APIs
- SQL
- relational data model
- parallel execution for your queries
- secondary indexes
- cross row/cross table transactions
- query optimization
- views
- multi tenancy
- query server
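
To make the trade-off concrete, here's a rough sketch of what the same read
looks like with and without Phoenix. Everything in it (table name, the
`happybase` Thrift client) is an assumption for illustration, not from this
thread:

```python
# With Phoenix: a typed, optimizable SQL query that can use a secondary index.
PHOENIX_QUERY = (
    'SELECT "ticker", "price" FROM "marketDataHbase" '
    'WHERE "timecreated" > CURRENT_DATE() - 1'
)

# Without Phoenix: a raw key/value scan; filtering on column values happens
# client-side and no secondary index is consulted.
def scan_raw(host="localhost"):
    import happybase  # assumed: an HBase Thrift client
    conn = happybase.Connection(host)
    table = conn.table("marketDataHbase")
    for row_key, cells in table.scan():
        yield row_key, cells  # bytes in, bytes out: no types, no SQL
```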

HBase doesn't store data either; it relies on HDFS to do that. And HDFS in
turn stores data in a file system, relying on the OS.

Thanks,
James

On Sun, Oct 23, 2016 at 2:09 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Thanks Sergey,
>
> I have modified the design to load data into Hbase through Phoenix table.
> In that way both the table in Hbase and the index in Hbase are maintained.
> I assume the Phoenix bulk load tool (CsvBulkLoadTool) updates the underlying
> table in HBase plus all the indexes there as well.
>
> However, I noticed some ambiguity here
> <https://en.wikipedia.org/wiki/Apache_Phoenix>.
>
> "*Apache Phoenix* is an open source, massively parallel, relational
> *database* engine supporting OLTP for Hadoop using *Apache* HBase as its
> backing store."
>
> It is not a database. The underlying data store is HBase. All Phoenix does
> is allow one to run SQL on top of HBase to manipulate HBase tables with
> DDL and DQL (data query language). It does not store data itself.
>
> I trust this is the correct assessment.
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 23 October 2016 at 21:49, Sergey Soldatov <sergeysoldatov@gmail.com>
> wrote:
>
>> Hi Mich,
>> No, if you update HBase directly, the index will not be maintained.
>> Actually, I would suggest ingesting the data using the Phoenix CSV bulk load tool.
>>
>> Thanks,
>> Sergey.
>>
>> On Sat, Oct 22, 2016 at 12:49 AM, Mich Talebzadeh <
>> mich.talebzadeh@gmail.com> wrote:
>>
>>> Thanks Sergey,
>>>
>>> In this case the Phoenix view is defined on an HBase table.
>>>
>>> The HBase table is updated every 15 minutes via a cron job that uses
>>> org.apache.hadoop.hbase.mapreduce.ImportTsv to bulk load data into the
>>> HBase table.
>>>
>>> So if I create an index on my view in Phoenix, will that index be
>>> maintained?
>>>
>>> regards
>>>
>>>
>>>
>>>
>>> On 21 October 2016 at 23:35, Sergey Soldatov <sergeysoldatov@gmail.com>
>>> wrote:
>>>
>>>> Hi Mich,
>>>>
>>>> It really depends on the query you are going to use. If conditions will
>>>> be applied only to the time column, you may create an index like:
>>>>
>>>> create index I on "marketDataHbase" ("timecreated") include ("ticker",
>>>> "price");
>>>>
>>>> If conditions will be applied to other columns as well, you may use:
>>>>
>>>> create index I on "marketDataHbase" ("timecreated", "ticker", "price");
>>>>
>>>> The index is updated together with the user table if you are using the
>>>> Phoenix JDBC driver or the Phoenix bulk load tools to ingest the data.
>>>>
>>>> Thanks,
>>>> Sergey
>>>>
>>>> On Fri, Oct 21, 2016 at 4:43 AM, Mich Talebzadeh <
>>>> mich.talebzadeh@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a Phoenix table on HBase as follows:
>>>>>
>>>>> [image: Inline images 1]
>>>>>
>>>>> I want to create a covering index on the three columns: ticker,
>>>>> timecreated, and price.
>>>>>
>>>>> More importantly, I want the index to be maintained when new rows are
>>>>> added to the HBase table.
>>>>>
>>>>> What is the best way of achieving this?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
