phoenix-user mailing list archives

From "dalin.qin" <dalin...@gmail.com>
Subject Re: When would/should I use spark with phoenix?
Date Tue, 13 Sep 2016 00:15:34 GMT
Hi Josh,

Before the project kicked off, we got the idea that HBase is better suited
to massive writes than to batch full-table reads (I forget where the idea
came from; probably some benchmark results posted on a website). So we
decided to read from HBase only by primary key, for query requests that
return a small amount of data. We also store the HBase results in JSON
files as each day's incremental changes (another benefit of JSON is that
you can put the files in time-based directories, so that you can query only
part of them), then use Spark to read those JSON files and do the ML
modeling or report calculation.
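
A minimal sketch of the time-based directory idea, in plain Python. The
year/month/day layout, the file name part-00000.json, and the helper names
partition_dir, write_increment and paths_for_range are all assumptions made
for illustration; they are not the layout or code used in the project
described above.

```python
import json
from datetime import date, timedelta
from pathlib import Path


def partition_dir(root: Path, day: date) -> Path:
    """Directory for one day's increment, e.g. root/2016/09/12 (assumed layout)."""
    return root / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def write_increment(root: Path, day: date, rows: list) -> Path:
    """Write one day's rows as JSON Lines (one JSON object per line)."""
    out_dir = partition_dir(root, day)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "part-00000.json"  # hypothetical file name
    with out_file.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row) + "\n")
    return out_file


def paths_for_range(root: Path, start: date, end: date) -> list:
    """Return the existing day directories between start and end, inclusive."""
    span = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(span))
    return [str(d) for d in (partition_dir(root, day) for day in days) if d.is_dir()]
```

The list returned by paths_for_range could then be handed to a reader such
as PySpark's spark.read.json(paths), so that only the selected days are
scanned rather than the whole history.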

Hope this helps :)

Dalin


On Mon, Sep 12, 2016 at 5:36 PM, Josh Mahonin <jmahonin@gmail.com> wrote:

> Hi Dalin,
>
> That's great to hear. Have you also tried reading back those rows through
> Spark for a larger "batch processing" job? Am curious if you have any
> experiences or insight there from operating on a large dataset.
>
> Thanks!
>
> Josh
>
> On Mon, Sep 12, 2016 at 10:29 AM, dalin.qin <dalinqin@gmail.com> wrote:
>
>> Hi,
>> I've used a Phoenix table to store billions of rows. Rows are
>> incrementally inserted into Phoenix by Spark every day, and the table is
>> for instant queries from a web page by primary key. So far so good.
>>
>> Thanks
>> Dalin
>>
>> On Mon, Sep 12, 2016 at 10:07 AM, Cheyenne Forbes <
>> cheyenne.osanu.forbes@gmail.com> wrote:
>>
>>> Thanks everyone, I will be using phoenix for simple input/output and
>>> the phoenix_spark plugin (https://phoenix.apache.org/phoenix_spark.html)
>>> for more complex queries, is that the smart thing?
>>>
>>> Regards,
>>>
>>> Cheyenne Forbes
>>>
>>> Chief Executive Officer
>>> Avapno Omnitech
>>>
>>> Chief Operating Officer
>>> Avapno Solutions, Co.
>>>
>>> Chairman
>>> Avapno Assets, LLC
>>>
>>> Bethel Town P.O
>>> Westmoreland
>>> Jamaica
>>>
>>> Email: cheyenne.osanu.forbes@gmail.com
>>> Mobile: 876-881-7889
>>> skype: cheyenne.forbes1
>>>
>>>
>>> On Sun, Sep 11, 2016 at 11:07 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>>
>>>> w.r.t. Resource Management, Spark also relies on other frameworks such
>>>> as YARN or Mesos.
>>>>
>>>> Cheers
>>>>
>>>> On Sun, Sep 11, 2016 at 6:31 AM, John Leach <jleach4@gmail.com> wrote:
>>>>
>>>>> Spark has a robust execution model with the following features that
>>>>> are not part of Phoenix:
>>>>>         * Scalability
>>>>>         * Fault tolerance with lineage (handles large intermediate
>>>>>           results)
>>>>>         * Memory management for tasks
>>>>>         * Resource management (fair scheduling)
>>>>>         * Additional SQL features (windowing, etc.)
>>>>>         * Machine learning libraries
>>>>>
>>>>>
>>>>> Regards,
>>>>> John
>>>>>
>>>>> > On Sep 11, 2016, at 2:45 AM, Cheyenne Forbes <
>>>>> > cheyenne.osanu.forbes@gmail.com> wrote:
>>>>> >
>>>>> > I realized there is a Spark plugin for Phoenix. Any use cases? Why
>>>>> > would I use Spark with Phoenix instead of Phoenix by itself?
