phoenix-user mailing list archives

From Harshit Bapna <hrba...@gmail.com>
Subject Re: Inserting data in Hbase Phoenix using the thrift api
Date Tue, 13 May 2014 22:05:35 GMT
Hey Alex,

Thanks for creating the ticket.
I am currently discussing this internally and will need your help if we
decide to take this approach.

Another approach, suggested by Jeffrey: hack the HBase Thrift server to use
Phoenix instead of the native HBase API.




On Sun, May 11, 2014 at 1:40 AM, alex kamil <alex.kamil@gmail.com> wrote:

> I opened a ticket: https://issues.apache.org/jira/browse/PHOENIX-974. If
> it works out for you, contributions are welcome.
> We use Kafka as a buffer between the app and the db. It supports pretty high
> throughput <https://kafka.apache.org/07/performance.html> but may not be
> suitable for very latency-sensitive writes; for those, your best bet is a
> direct socket connection or some kind of distributed cache.
>
>
>
> On Sun, May 11, 2014 at 4:06 AM, universal localhost <
> universal.localhost@gmail.com> wrote:
>
>> Thanks a lot, James & Alex, for your answers and suggestions.
>>
>> So, apart from salting and secondary indexing, are there any other
>> functionalities that I will lose by not using the Phoenix JDBC APIs
>> for inserting data?
>>     I think this is similar to the 'Map to an existing HBase table'
>> case.
>>
>> The last option that you suggested, regarding the Hive ODBC driver, sounds
>> interesting, but I need to think a bit more about what it means :)
>> The rest of the options you suggested are nice but less suitable for me,
>> as we aim to have a low lag between when the txn is first seen and when
>> it's ready for analysis.
>>
>>
>> Alex, I didn't think about this approach. The C++ app server handles
>> close to 50-60 txns per sec, so I was a bit cautious, but it's definitely
>> worth a try. Thanks.
>>
>>
>> --Unilocal
>>
>>
>> On Wed, May 7, 2014 at 11:31 PM, James Taylor <jamestaylor@apache.org> wrote:
>>
>>> Hi Unilocal,
>>> Yes, both salting and secondary indexing rely on the Phoenix client in
>>> cooperation with the server.
>>>
>>> Would it be possible for the C++ server to generate CSV files instead?
>>> Then these could be pumped into Phoenix through our CSV bulk loader (which
>>> could potentially be invoked through a variety of ways). Another
>>> alternative may be through our Apache Pig integration. Or it'd be pretty
>>> easy to adapt our Pig store func to a Hive SerDe. Then you could use the
>>> Hive ODBC driver to pump in data that's formatted in a Phoenix-compliant
>>> manner.
>>>
>>> If none of these are options, you could pump into a Phoenix table and
>>> then transfer the data (using Phoenix APIs) through UPSERT SELECT into a
>>> salted table or a table with secondary indexes.
>>>
>>> Thanks,
>>> James
>>>
>>>
>>> On Mon, May 5, 2014 at 2:42 PM, Localhost shell <
>>> universal.localhost@gmail.com> wrote:
>>>
>>>> Hey Folks,
>>>>
>>>> I have a use case where one of the apps (a C++ server) will pump data
>>>> into HBase.
>>>> Since Phoenix doesn't support ODBC APIs, the app will not be able to
>>>> use the Phoenix JDBC API and will use the HBase Thrift API to insert the
>>>> data. Note: the app that is inserting data will create the row keys the
>>>> same way the Phoenix JDBC driver does.
>>>>
>>>> Currently no data resides in HBase, and the table will be freshly
>>>> created using SQL commands (using Phoenix sqlline).
>>>> All the analysis/group-by queries will be triggered by a different app
>>>> using the Phoenix JDBC APIs.
>>>>
>>>> In the above-mentioned scenario, are there any Phoenix functionalities
>>>> (for example, salting or secondary indexing) that will not be available
>>>> because the Phoenix JDBC driver is not used for inserting data?
>>>>
>>>> Can someone please share their thoughts on this?
>>>>
>>>> Hadoop Distro: CDH5
>>>> HBase: 0.96.1
>>>>
>>>> --Unilocal
>>>>
>>>>
>>>>
>>>
>>>
>>
>
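
For anyone following the thread, James's last suggestion above (land the rows
in a plain Phoenix table, then copy them across with UPSERT SELECT) might look
roughly like the sketch below. The table names STAGING_TXN and SALTED_TXN and
the ZooKeeper host zk-host are made up for illustration, both tables are
assumed to have identical columns, and the Phoenix client jar is assumed to be
on the classpath. Treat it as a starting point, not a tested recipe.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

final class UpsertSelectSketch {
    // Builds "UPSERT INTO <target> SELECT * FROM <source>".
    // Assumes both tables have the same columns in the same order.
    static String upsertSelectSql(String target, String source) {
        return "UPSERT INTO " + target + " SELECT * FROM " + source;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical ZooKeeper quorum; replace with your own.
        String url = "jdbc:phoenix:zk-host";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            // STAGING_TXN: plain table the Thrift writer fills (hypothetical).
            // SALTED_TXN: salted and/or indexed table used for queries (hypothetical).
            int rows = stmt.executeUpdate(upsertSelectSql("SALTED_TXN", "STAGING_TXN"));
            conn.commit(); // Phoenix connections do not auto-commit by default
            System.out.println("Rows reported by the driver: " + rows);
        }
    }
}
```

Because the copy goes through the Phoenix client, the salt byte and any index
updates are handled by Phoenix rather than by the C++ writer. The cost is the
extra hop, which adds to the lag between ingest and analysis.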


-- 
--Harshit
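
On the original question of a non-Phoenix writer building row keys "the way
Phoenix does": for a salted table, that means prepending a one-byte bucket
number derived from a hash of the row key. The sketch below shows the general
shape only. The hash used here (the same 31-based rolling hash as
java.util.Arrays.hashCode(byte[])) is an assumption about what Phoenix does
internally, and the class and method names are invented for this example, so
check it against the Phoenix source for your version before relying on it.

```java
import java.nio.charset.StandardCharsets;

final class SaltSketch {
    // Rolling hash over the row-key bytes, same shape as Arrays.hashCode(byte[]).
    // ASSUMPTION: Phoenix's real hash may differ; verify against its source.
    static int hash(byte[] key) {
        int result = 1;
        for (byte b : key) {
            result = 31 * result + b;
        }
        return result;
    }

    // Bucket number in [0, buckets), stored as the first byte of the row key.
    static byte saltByte(byte[] key, int buckets) {
        return (byte) Math.abs(hash(key) % buckets);
    }

    // Returns a new array: [salt byte][original row-key bytes].
    static byte[] saltedKey(byte[] key, int buckets) {
        byte[] out = new byte[key.length + 1];
        out[0] = saltByte(key, buckets);
        System.arraycopy(key, 0, out, 1, key.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] key = "ab".getBytes(StandardCharsets.US_ASCII);
        System.out.println("salt byte for \"ab\" with 4 buckets: " + saltByte(key, 4));
    }
}
```

Even with the byte computed correctly on the write side, reads still depend on
the Phoenix client fanning out across buckets, which is why James notes that
salting relies on the client and server working together.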
