phoenix-user mailing list archives

From Vikas Agarwal <vi...@infoobjects.com>
Subject Re: Phoenix response time
Date Wed, 10 Sep 2014 10:27:22 GMT
It was not an issue with Amazon at all. Amazon can't slow things down this
much. :)

The issue was on our side only. Earlier we had two columns as the primary
key (the timestamp and id fields), and recently we changed it to the single
field id. The Phoenix table was not updated to reflect this. :)
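
For anyone finding this thread later, here is a rough sketch of why a key
mismatch like this can hurt so much. It is only an illustration under my own
assumptions, not Phoenix or HBase code: a sorted Python list stands in for
HBase's sorted row keys, and the names keys, point_lookup and
lookup_by_id_only are made up for the example. When the row key is
(timestamp, id) and only the id is known, the leading part of the key is
missing, so a lookup cannot seek directly and has to examine every row.

```python
from bisect import bisect_left

# Hypothetical data: 5 timestamps x 3 ids, keyed (timestamp, id) like the old
# table. HBase keeps rows sorted by row key; a sorted list stands in for that.
keys = sorted((ts, rid) for ts in range(1000, 1005) for rid in ("a", "b", "c"))


def point_lookup(sorted_keys, full_key):
    """Binary search; works only when the complete row key is known."""
    i = bisect_left(sorted_keys, full_key)
    return i < len(sorted_keys) and sorted_keys[i] == full_key


def lookup_by_id_only(sorted_keys, rid):
    """The leading timestamp is unknown, so every key must be examined."""
    examined = 0
    matches = []
    for key in sorted_keys:
        examined += 1
        if key[1] == rid:
            matches.append(key)
    return matches, examined


print(point_lookup(keys, (1002, "b")))
matches, examined = lookup_by_id_only(keys, "b")
print(len(matches), examined)
```

With the full key the search touches only a handful of entries, while the
id-only lookup walks all 15 keys to find its 5 matches. On a few million
rows that difference is roughly the gap between a point get and a full scan.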

Now a new question: do I need to drop the table to make this primary key
change? I guess yes, because behind the scenes HBase has already created row
keys using two fields. Still, I am asking just to confirm, and to get ideas
for not losing the existing data.
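
One thing to check before re-keying, sketched below under my own
assumptions: if the data is copied into a table keyed by id alone (for
example a new table filled with something like UPSERT ... SELECT), rows that
share an id but differ in timestamp would land on the same key, and the later
write would replace the earlier one. The function name rekey_by_id and the
sample rows are hypothetical, and a plain dict stands in for the table.

```python
def rekey_by_id(old_rows):
    """Copy rows keyed by (timestamp, id) into a mapping keyed by id alone.

    Mimics upsert semantics: a later write to the same key replaces the
    earlier one. Returns the new mapping and the ids that had more than one
    source row.
    """
    new_rows = {}
    collided = set()
    # Process in (timestamp, id) order so the newest value for an id wins.
    for (ts, rid), value in sorted(old_rows.items()):
        if rid in new_rows:
            collided.add(rid)
        new_rows[rid] = value
    return new_rows, sorted(collided)


# Hypothetical sample: id "a" appears under two timestamps.
old_rows = {
    (1000, "a"): "a-v1",
    (1001, "a"): "a-v2",
    (1000, "b"): "b-v1",
}
new_rows, collided = rekey_by_id(old_rows)
print(len(old_rows), len(new_rows), collided)
```

If the collided list comes back empty for the real data, the ids are unique
and the copy keeps every row. If not, some rows would be merged and I would
have to decide which version to keep before dropping the old table.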

Thanks all for the help. :)


On Sat, Sep 6, 2014 at 10:33 PM, James Taylor <jamestaylor@apache.org>
wrote:

> I don't have experience running Phoenix in AWS. Andrew Purtell is a
> good person to ask. I'm curious if our support under their EMR helps
> in any way: http://phoenix.apache.org/phoenix_on_emr.html
>
> Thanks,
> James
>
> On Sat, Sep 6, 2014 at 12:27 AM, Alex Kamil <alex.kamil@gmail.com> wrote:
> > not sure, that's not my experience with phoenix, but if you have an
> > unstable network connection to your storage (which EBS is well known
> > for) it may affect the results
> >
> >
> > On Sat, Sep 6, 2014 at 3:14 AM, Vikas Agarwal <vikas@infoobjects.com>
> > wrote:
> >>
> >> Of course, I can do a lot of optimizations. However, my concern is what
> >> I am missing that is causing Phoenix to perform badly while, at exactly
> >> the same time, HBase is giving results amazingly fast.
> >>
> >>
> >>> On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <alex.kamil@gmail.com>
> >>> wrote:
> >>>
> >>> well it is still network attached. If you allocate enough heap to fit
> >>> the whole thing in memory (in hbase/conf/hbase-env.sh) you could
> >>> probably eliminate this as a possible reason
> >>>
> >>>
> >>> On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vikas@infoobjects.com>
> >>> wrote:
> >>>>
> >>>> EBS but with new generation SSD not magnetic one.
> >>>>
> >>>>
> >>>> On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <alex.kamil@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> do you use EBS or ephemeral storage? I found EBS performance to be
> >>>>> somewhat unpredictable
> >>>>>
> >>>>>
> >>>>> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vikas@infoobjects.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hbase is 0.98.0
> >>>>>> Phoenix is 4.0
> >>>>>>
> >>>>>>
> >>>>>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vikas@infoobjects.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Yes, that is why it is troubling me. However, the HBase shell is
> >>>>>>> on the same machine and in the same environment, so if it were a
> >>>>>>> resource issue (CPU or memory) it should have affected HBase too,
> >>>>>>> but HBase is able to give me results within 0.0150 seconds. :(
> >>>>>>>
> >>>>>>> No, I haven't tested it outside AWS. I guess AWS should not be the
> >>>>>>> cause, given the much better performance of the native HBase query
> >>>>>>> in the HBase shell.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor
> >>>>>>> <jamestaylor@apache.org> wrote:
> >>>>>>>>
> >>>>>>>> Something is up in your environment. What version of Phoenix and
> >>>>>>>> HBase are you using and in what environment? Have you tried this
> >>>>>>>> locally, outside of AWS, to compare?
> >>>>>>>>
> >>>>>>>> Take a look at our perf numbers, generated more-or-less daily, and
> >>>>>>>> which run over more data than what you're testing against:
> >>>>>>>>
> >>>>>>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
> >>>>>>>>
> >>>>>>>> Some of these are point queries and they take in the neighborhood
> >>>>>>>> of 0.01 seconds.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> James
> >>>>>>>>
> >>>>>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal
> >>>>>>>> <vikas@infoobjects.com> wrote:
> >>>>>>>> > Forgot to mention that the count query (posted in my last mail)
> >>>>>>>> > is also taking a very long time to return the count.
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal
> >>>>>>>> > <vikas@infoobjects.com>
> >>>>>>>> > wrote:
> >>>>>>>> >>
> >>>>>>>> >> As I mentioned, the schema is nothing but a bunch of fields (some
> >>>>>>>> >> being integers, longs and text) along with a primary key (row
> >>>>>>>> >> key), and I am making a simple query to get the result for a
> >>>>>>>> >> particular primary key, nothing more than that.
> >>>>>>>> >>
> >>>>>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
> >>>>>>>> >> +------------+
> >>>>>>>> >> |  COUNT(1)  |
> >>>>>>>> >> +------------+
> >>>>>>>> >> | 4667515    |
> >>>>>>>> >> +------------+
> >>>>>>>> >> 1 row selected (132.11 seconds)
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
> >>>>>>>> >> <puneet.kumar@pubmatic.com> wrote:
> >>>>>>>> >>>
> >>>>>>>> >>> If you can share the schema, data types, cardinality of each
> >>>>>>>> >>> dimension and usual queries, I can help to design a schema with
> >>>>>>>> >>> performance of less than 1 sec using Phoenix.
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> Thanks
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> ------ Original message------
> >>>>>>>> >>>
> >>>>>>>> >>> From: James Taylor
> >>>>>>>> >>>
> >>>>>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
> >>>>>>>> >>>
> >>>>>>>> >>> To: user;
> >>>>>>>> >>>
> >>>>>>>> >>> Subject:Re: Phoenix response time
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>>
> >>>>>>>> >>> Vikas,
> >>>>>>>> >>> Please post your schema and query.
> >>>>>>>> >>> Thanks,
> >>>>>>>> >>> James
> >>>>>>>> >>>
> >>>>>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal
> >>>>>>>> >>> <vikas@infoobjects.com>
> >>>>>>>> >>> wrote:
> >>>>>>>> >>> > Ours is also a single node setup right now, and as of now there
> >>>>>>>> >>> > are fewer than 1 million rows, which is expected to grow to
> >>>>>>>> >>> > around 100m at minimum.
> >>>>>>>> >>> >
> >>>>>>>> >>> > I am aware of secondary indexes, but when I am querying on the
> >>>>>>>> >>> > primary/row key, why would it take so much time?
> >>>>>>>> >>> >
> >>>>>>>> >>> > I am querying directly, using sqlline for Phoenix and the hbase
> >>>>>>>> >>> > shell for the HBase query. I am not expecting to do any fine
> >>>>>>>> >>> > tuning for such a small dataset. I am assuming a minimum
> >>>>>>>> >>> > performance level out of the box.
> >>>>>>>> >>> >
> >>>>>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar
> >>>>>>>> >>> > <yeshwanth43@gmail.com> wrote:
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> hi vikas,
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node
> >>>>>>>> >>> >> setup.
> >>>>>>>> >>> >> used HDP 2.1
> >>>>>>>> >>> >> our table has 50-70M rows,
> >>>>>>>> >>> >> select on that table took less than 2 seconds.
> >>>>>>>> >>> >> Aggregation queries took less than 8 seconds.
> >>>>>>>> >>> >> for achieving good performance we created secondary index on
> >>>>>>>> >>> >> the table.
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> make sure you finetuned hbase,
> >>>>>>>> >>> >> enabling compression on the data makes a difference in
> >>>>>>>> >>> >> response.
> >>>>>>>> >>> >> if u distribute the data and load over all regions in hbase,
> >>>>>>>> >>> >> look at the performance tips mentioned in phoenix blog
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> -yeshwanth
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> Cheers,
> >>>>>>>> >>> >> Yeshwanth
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal
> >>>>>>>> >>> >> <vikas@infoobjects.com> wrote:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Hi,
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Preface: We are testing Phoenix using the Hortonworks
> >>>>>>>> >>> >>> distribution for HBase on an Amazon EC2 instance (r3.large,
> >>>>>>>> >>> >>> 2 CPU/15 GB RAM).
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> In contrast to the performance benchmarks, I found Phoenix to
> >>>>>>>> >>> >>> be very slow in querying, even on the primary key or row key.
> >>>>>>>> >>> >>> So I tried increasing the RAM for HBase and Phoenix, and then
> >>>>>>>> >>> >>> increasing the CPU and RAM by upgrading the EC2 machine type
> >>>>>>>> >>> >>> to r3.xlarge (4 CPU, 30 GB RAM). Results were like this:
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Time taken to return the result of a query on row key:
> >>>>>>>> >>> >>> With Storm running and very little RAM available: 50 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase:
> >>>>>>>> >>> >>> 18 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB
> >>>>>>>> >>> >>> RAM): 8 sec
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU,
> >>>>>>>> >>> >>> 15 GB RAM): 0.0150 seconds. :)
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> So, the difference seems to be many fold compared to what
> >>>>>>>> >>> >>> native HBase is providing to us. I am not able to understand
> >>>>>>>> >>> >>> how this is possible. What am I missing here?
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> --
> >>>>>>>> >>> >>> Regards,
> >>>>>>>> >>> >>> Vikas Agarwal
> >>>>>>>> >>> >>> 91 – 9928301411
> >>>>>>>> >>> >>>
> >>>>>>>> >>> >>> InfoObjects, Inc.
> >>>>>>>> >>> >>> Execution Matters
> >>>>>>>> >>> >>> http://www.infoobjects.com
> >>>>>>>> >>> >>> 2041 Mission College Boulevard,
#280
> >>>>>>>> >>> >>> Santa Clara, CA 95054
> >>>>>>>> >>> >>> +1 (408) 988-2000 Work
> >>>>>>>> >>> >>> +1 (408) 716-2726 Fax
> >>>>>>>> >>> >>
> >>>>>>>> >>> >>
> >>>>>>>> >>> >
> >>>>>>>> >>> >
> >>>>>>>> >>> >
> >>>>>>>> >>> >
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >>
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>> >
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
>



-- 
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax
