phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Hive or Phoenix
Date Tue, 09 Sep 2014 18:26:27 GMT
Hi Prakash,
If possible, it'd be helpful if you could describe your use case a bit.

Some questions I'd have for you: is the data you'd be querying stored
in HBase? If so, would Hive run over the HBase data? Is the data
read-only or does it mutate? Approximately how much data are we
talking about, and what would your typical queries be: point
look-ups, range scans, or full table scans?
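
To make the three query shapes concrete, here is a toy sketch in Python. It is not Phoenix or HBase code, just a sorted in-memory list standing in for a table ordered by row key; the keys, values, and function names are all made up for illustration.

```python
from bisect import bisect_left

# Toy stand-in for a table whose rows are kept sorted by row key.
# Keys and values are hypothetical.
ROWS = sorted([
    ("user#0001", "alice"),
    ("user#0002", "bob"),
    ("user#0107", "carol"),
    ("user#0420", "dave"),
    ("user#0999", "erin"),
])
KEYS = [key for key, _ in ROWS]


def point_lookup(key):
    """Binary-search for one exact key; returns its value or None."""
    i = bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return ROWS[i][1]
    return None


def range_scan(start, stop):
    """Return rows with start <= key < stop, touching only that slice."""
    lo = bisect_left(KEYS, start)
    hi = bisect_left(KEYS, stop)
    return ROWS[lo:hi]


def full_scan(predicate):
    """Filter on a non-key column, so every row has to be examined."""
    return [row for row in ROWS if predicate(row[1])]
```

The first two functions only touch the rows they return because the key is part of the query; the third has to walk everything, which is the case where response time suffers as the table grows.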

As for security, HBase also provides some finer-grained mechanisms
that you could leverage through the HBase APIs. Other than the
ability to connect to a secure cluster through the connection URL,
Phoenix doesn't yet provide a SQL wrapper around these HBase APIs.
Using the HBase APIs directly is how Intuit is leveraging Phoenix +
security in HBase. Anil Gupta can likely tell you more.
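
As a rough sketch of the connection URL for a secure cluster: the helper below just assembles a string in the layout jdbc:phoenix:<quorum>:<port>:<root node>:<principal>:<keytab>. The function name, host, principal, and keytab path are hypothetical, and the exact layout should be checked against the Phoenix documentation for the version in use.

```python
def phoenix_jdbc_url(quorum, port=2181, znode="/hbase",
                     principal=None, keytab=None):
    """Assemble a Phoenix JDBC URL string (hypothetical helper).

    The principal and keytab segments are appended only when both are
    given; the layout is an assumption to verify against your version.
    """
    parts = ["jdbc:phoenix", quorum, str(port), znode]
    if principal and keytab:
        parts += [principal, keytab]
    return ":".join(parts)


# Example with made-up values for a Kerberos-secured cluster.
secure_url = phoenix_jdbc_url(
    "zk1.example.com",
    principal="app@EXAMPLE.COM",
    keytab="/etc/security/app.keytab",
)
```

Finer-grained access control would still be configured on the HBase side, outside of this URL.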

Thanks,
James

On Tue, Sep 9, 2014 at 9:28 AM, Nicolas Maillard
<nmaillard@hortonworks.com> wrote:
> Hello Prakash
>
> Framing this as Hive or Phoenix is a little misleading, since they do serve
> different needs. Let me break it down as best I can.
>
> You mention security:
> Phoenix and Hive both work on a secured Hadoop cluster, but Hive with Hive
> Atz has a more fine-grained authorization model. So from that perspective
> Hive has more features.
>
> Query performance
> On the performance side, Phoenix has random read/write access, whereas Hive
> is full data access, so there is no way to read a particular entry without
> reading the whole associated file.
> So Hive is batch or interactive, meaning a couple of tens of seconds to get
> your answer, where Phoenix can be sub-second. The response time will depend
> greatly on whether part of the Phoenix key is in your query. If you do a full
> table scan, response time will suffer. Granted, secondary indexes could help
> you there.
>
> SQL Semantics
> Hive currently has richer SQL semantics, with analytic functions, complex
> types, etc.
> Phoenix is also more limited than Hive in joins and UDFs.
>
> So I would use Hive for large data, random analysis, and ETL, and pay the
> price in response time a little.
> Phoenix, on the other hand, is great for large volumes of data where you can
> set up your schema, and especially your keys, according to specific needs and
> query patterns. In that situation you would get great query performance.
>
> To sum up, in all honesty, both are needed.
>
> Hope this helps
>
> On Tue, Sep 9, 2014 at 4:19 PM, Prakash Hosalli
> <prakash.hosalli@syncoms.com> wrote:
>>
>> Hi,
>>
>> Does Phoenix have any security layer in it, as we have in Hive?
>>
>> I am getting confused about whether to go forward with Phoenix or Hive in
>> the production environment at my company.
>>
>> Thanks & Regards,
>> Prakash Hosalli
>> Syncoms Bangalore India.