phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Phoenix response time
Date Sat, 06 Sep 2014 17:03:44 GMT
I don't have experience running Phoenix in AWS. Andrew Purtell is a
good person to ask. I'm curious if our support under their EMR helps
in any way: http://phoenix.apache.org/phoenix_on_emr.html

Thanks,
James

On Sat, Sep 6, 2014 at 12:27 AM, Alex Kamil <alex.kamil@gmail.com> wrote:
> not sure, that's not my experience with phoenix, but if you have an unstable
> network connection to your storage (which EBS is well known for) it may
> affect the results
>
>
> On Sat, Sep 6, 2014 at 3:14 AM, Vikas Agarwal <vikas@infoobjects.com> wrote:
>>
>> Of course, I can do a lot of optimizations. However, my concern is what
>> I am missing that is causing Phoenix to perform badly while, at exactly
>> the same time, HBase is giving results amazingly fast.
>>
>>
>> On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <alex.kamil@gmail.com> wrote:
>>>
>>> well, it is still network attached. If you allocate enough heap to fit the
>>> whole thing in memory (in hbase/conf/hbase-env.sh) you could probably
>>> eliminate this as a possible reason
>>>
>>>
>>> On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vikas@infoobjects.com>
>>> wrote:
>>>>
>>>> EBS but with new generation SSD not magnetic one.
>>>>
>>>>
>>>> On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <alex.kamil@gmail.com>
>>>> wrote:
>>>>>
>>>>> do you use EBS or ephemeral storage, I found EBS performance to be
>>>>> somewhat unpredictable
>>>>>
>>>>>
>>>>> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vikas@infoobjects.com>
>>>>> wrote:
>>>>>>
>>>>>> Hbase is 0.98.0
>>>>>> Phoenix is 4.0
>>>>>>
>>>>>>
>>>>>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vikas@infoobjects.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Yes, that is why it is troubling me. However, on the contrary, the
>>>>>>> HBase shell is also on the same machine and in the same environment,
>>>>>>> so if it were a resource issue (CPU or memory) it should have
>>>>>>> affected HBase too, but HBase is able to give me results within
>>>>>>> 0.0150 seconds. :(
>>>>>>>
>>>>>>> No, I haven't tested it outside AWS. I guess it should not be the
>>>>>>> case, given the much better performance of the native HBase query in
>>>>>>> the HBase shell.
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor
>>>>>>> <jamestaylor@apache.org> wrote:
>>>>>>>>
>>>>>>>> Something is up in your environment. What version of Phoenix and
>>>>>>>> HBase are you using and in what environment? Have you tried this
>>>>>>>> locally, outside of AWS, to compare?
>>>>>>>>
>>>>>>>> Take a look at our perf numbers, generated more-or-less daily, and
>>>>>>>> which run over more data than what you're testing against:
>>>>>>>>
>>>>>>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>>>>>>>
>>>>>>>> Some of these are point queries and they take in the neighborhood of
>>>>>>>> 0.01 seconds.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> James
>>>>>>>>
>>>>>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal
>>>>>>>> <vikas@infoobjects.com> wrote:
>>>>>>>> > I forgot to mention that the count query (posted in my last mail)
>>>>>>>> > is also taking a very long time to return the count.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal
>>>>>>>> > <vikas@infoobjects.com>
>>>>>>>> > wrote:
>>>>>>>> >>
>>>>>>>> >> As I mentioned, the schema is nothing but a bunch of fields (some
>>>>>>>> >> being integers, longs and text) along with a primary key (row key),
>>>>>>>> >> and I am making a simple query to get the result for a particular
>>>>>>>> >> primary key, nothing more than that.
>>>>>>>> >>
>>>>>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>>>>>>>> >> +------------+
>>>>>>>> >> |  COUNT(1)  |
>>>>>>>> >> +------------+
>>>>>>>> >> | 4667515    |
>>>>>>>> >> +------------+
>>>>>>>> >> 1 row selected (132.11 seconds)
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>>>>>>>> >> <puneet.kumar@pubmatic.com> wrote:
>>>>>>>> >>>
>>>>>>>> >>> If you can share the schema, data types, cardinality of each
>>>>>>>> >>> dimension and usual queries, I can help design a schema with
>>>>>>>> >>> performance of less than 1 sec using Phoenix.
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> Thanks
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> ------ Original message ------
>>>>>>>> >>> From: James Taylor
>>>>>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>>>>>>>> >>> To: user;
>>>>>>>> >>> Subject: Re: Phoenix response time
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>>
>>>>>>>> >>> Vikas,
>>>>>>>> >>> Please post your schema and query.
>>>>>>>> >>> Thanks,
>>>>>>>> >>> James
>>>>>>>> >>>
>>>>>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal
>>>>>>>> >>> <vikas@infoobjects.com>
>>>>>>>> >>> wrote:
>>>>>>>> >>> > Ours is also a single node setup right now, and as of now there
>>>>>>>> >>> > are fewer than 1 million rows, which is expected to grow to
>>>>>>>> >>> > around 100m at minimum.
>>>>>>>> >>> >
>>>>>>>> >>> > I am aware of secondary indexes, but when I am querying on the
>>>>>>>> >>> > primary/row key, why would it take so much time?
>>>>>>>> >>> >
>>>>>>>> >>> > I am directly querying using sqlline for Phoenix and hbase shell
>>>>>>>> >>> > for the HBase query. I am not expecting to do any fine tuning for
>>>>>>>> >>> > such a small dataset. I am assuming a minimum performance level
>>>>>>>> >>> > out of the box.
>>>>>>>> >>> >
>>>>>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar
>>>>>>>> >>> > <yeshwanth43@gmail.com> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >> hi vikas,
>>>>>>>> >>> >>
>>>>>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>>>>>>>> >>> >> used HDP 2.1
>>>>>>>> >>> >> our table has 50-70M rows,
>>>>>>>> >>> >> select on that table took less than 2 seconds.
>>>>>>>> >>> >> Aggregation queries took less than 8 seconds.
>>>>>>>> >>> >> for achieving good performance we created secondary index on the
>>>>>>>> >>> >> table.
>>>>>>>> >>> >>
>>>>>>>> >>> >> make sure you finetuned hbase,
>>>>>>>> >>> >> enabling compression on the data makes a difference in response.
>>>>>>>> >>> >> if u distribute the data and load over all regions in hbase,
>>>>>>>> >>> >> look at the performance tips mentioned in phoenix blog
>>>>>>>> >>> >>
>>>>>>>> >>> >> -yeshwanth
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >> Cheers,
>>>>>>>> >>> >> Yeshwanth
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal
>>>>>>>> >>> >> <vikas@infoobjects.com> wrote:
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> Hi,
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> Preface: We are testing Phoenix using the Hortonworks
>>>>>>>> >>> >>> distribution for HBase on an Amazon EC2 instance (r3.large,
>>>>>>>> >>> >>> 2 CPU/15 GB RAM).
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> In contrast to the performance benchmarks, I found Phoenix to
>>>>>>>> >>> >>> be very slow in querying even on primary key or row key. So, I
>>>>>>>> >>> >>> tried increasing the RAM for HBase and Phoenix and increasing
>>>>>>>> >>> >>> the CPU and RAM by upgrading the EC2 machine type to r3.xlarge
>>>>>>>> >>> >>> (4 CPU, 30 GB RAM). Results were like this:
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> Time taken to return the result of a query on row key:
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> With Storm running and very little RAM available: 50 sec
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase:
>>>>>>>> >>> >>> 18 sec
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB
>>>>>>>> >>> >>> RAM): 8 sec
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU,
>>>>>>>> >>> >>> 15 GB RAM): 0.0150 seconds. :)
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> So, the difference seems to be many fold of what native HBase
>>>>>>>> >>> >>> is providing to us. I am not able to understand how it can be
>>>>>>>> >>> >>> possible. What am I missing here?
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> --
>>>>>>>> >>> >>> Regards,
>>>>>>>> >>> >>> Vikas Agarwal
>>>>>>>> >>> >>> 91 – 9928301411
>>>>>>>> >>> >>>
>>>>>>>> >>> >>> InfoObjects, Inc.
>>>>>>>> >>> >>> Execution Matters
>>>>>>>> >>> >>> http://www.infoobjects.com
>>>>>>>> >>> >>> 2041 Mission College Boulevard, #280
>>>>>>>> >>> >>> Santa Clara, CA 95054
>>>>>>>> >>> >>> +1 (408) 988-2000 Work
>>>>>>>> >>> >>> +1 (408) 716-2726 Fax
