phoenix-user mailing list archives

From Vikas Agarwal <vi...@infoobjects.com>
Subject Re: Phoenix response time
Date Sat, 06 Sep 2014 07:14:04 GMT
Of course, I can do a lot of optimizations. However, my concern is what I
am missing that causes Phoenix to perform so badly while, at exactly the
same time, HBase is giving results amazingly fast.
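
For scale, here is a rough back-of-the-envelope check of the gap, using only the timings and the row count quoted in this thread. It is a sketch, not a benchmark: the variable and function names are illustrative, and the 8-second Phoenix figure was measured on a larger instance than the HBase shell figure, so the ratios are only indicative.

```python
# Timings quoted in the thread, in seconds. Names are illustrative only.
hbase_point_get_s = 0.0150  # HBase shell query by row key (2 CPU, 15 GB RAM)
phoenix_point_query_s = {
    "Storm running, little free RAM": 50.0,
    "Storm stopped": 18.0,
    "r3.xlarge (4 CPU, 30 GB RAM)": 8.0,
}


def slowdown(phoenix_s, hbase_s=hbase_point_get_s):
    """Return how many times slower the Phoenix query was than the HBase get."""
    return phoenix_s / hbase_s


for setup, seconds in phoenix_point_query_s.items():
    print(f"{setup}: about {slowdown(seconds):.0f}x slower than the HBase shell")

# Full-table COUNT(1) from the sqlline session: rows scanned per second.
count_rows = 4_667_515
count_seconds = 132.11
rows_per_second = count_rows / count_seconds
print(f"COUNT(1) scan rate: about {rows_per_second:.0f} rows/s")
```

Even in the best case quoted, the Phoenix point query comes out several hundred times slower than the HBase shell lookup, which is the discrepancy this thread is trying to explain.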


On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <alex.kamil@gmail.com> wrote:

> Well, it is still network-attached. If you allocate enough heap to fit the
> whole thing in memory (in hbase/conf/hbase-env.sh), you could probably
> eliminate this as a possible reason.
>
>
> On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vikas@infoobjects.com>
> wrote:
>
>> EBS, but with the new-generation SSD volumes, not the magnetic ones.
>>
>>
>> On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <alex.kamil@gmail.com> wrote:
>>
>>> Do you use EBS or ephemeral storage? I found EBS performance to be
>>> somewhat unpredictable.
>>>
>>>
>>> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vikas@infoobjects.com>
>>> wrote:
>>>
>>>> HBase is 0.98.0
>>>> Phoenix is 4.0
>>>>
>>>>
>>>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vikas@infoobjects.com>
>>>> wrote:
>>>>
>>>>> Yes, that is why it is troubling me. However, the HBase shell is on
>>>>> the same machine and in the same environment, so if it were an
>>>>> issue of resources (CPU or memory), it should have affected HBase too,
>>>>> but HBase is able to give me results within 0.0150 seconds. :(
>>>>>
>>>>> No, I haven't tested it outside AWS. I guess AWS should not be the
>>>>> cause, given the much better performance of the native HBase query in
>>>>> the HBase shell.
>>>>>
>>>>>
>>>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <jamestaylor@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Something is up in your environment. What version of Phoenix and HBase
>>>>>> are you using and in what environment? Have you tried this locally,
>>>>>> outside of AWS to compare?
>>>>>>
>>>>>> Take a look at our perf numbers, generated more-or-less daily, and
>>>>>> which run over more data than what you're testing against:
>>>>>>
>>>>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>>>>>
>>>>>> Some of these are point queries and they take in the neighborhood of
>>>>>> 0.01 seconds.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vikas@infoobjects.com>
>>>>>> wrote:
>>>>>> > I forgot to mention that the count query (posted in my last mail) is
>>>>>> > also taking a very long time to return the count.
>>>>>> >
>>>>>> >
>>>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vikas@infoobjects.com>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> As I mentioned, the schema is nothing but a bunch of fields (some being
>>>>>> >> integers, longs and text) along with a primary key (row key), and I am
>>>>>> >> making a simple query to get the result for a particular primary key,
>>>>>> >> nothing more than that.
>>>>>> >>
>>>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>>>>>> >> +------------+
>>>>>> >> |  COUNT(1)  |
>>>>>> >> +------------+
>>>>>> >> | 4667515    |
>>>>>> >> +------------+
>>>>>> >> 1 row selected (132.11 seconds)
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>>>>>> >> <puneet.kumar@pubmatic.com> wrote:
>>>>>> >>>
>>>>>> >>> If you can share the schema, data types, cardinality of each dimension
>>>>>> >>> and usual queries, I can help to design a schema with performance of
>>>>>> >>> less than 1 sec using Phoenix.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Thanks
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> ------ Original message ------
>>>>>> >>> From: James Taylor
>>>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>>>>>> >>> To: user;
>>>>>> >>> Subject: Re: Phoenix response time
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Vikas,
>>>>>> >>> Please post your schema and query.
>>>>>> >>> Thanks,
>>>>>> >>> James
>>>>>> >>>
>>>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vikas@infoobjects.com>
>>>>>> >>> wrote:
>>>>>> >>> > Ours is also a single-node setup right now, and as of now there are
>>>>>> >>> > fewer than 1 million rows, which is expected to grow to around 100m
>>>>>> >>> > at minimum.
>>>>>> >>> >
>>>>>> >>> > I am aware of secondary indexes, but when I am querying on the
>>>>>> >>> > primary/row key, why would it take so much time?
>>>>>> >>> >
>>>>>> >>> > I am querying directly, using sqlline for the Phoenix query and the
>>>>>> >>> > hbase shell for the HBase query. I am not expecting to do any fine
>>>>>> >>> > tuning for such a small dataset. I am assuming a minimum performance
>>>>>> >>> > level out of the box.
>>>>>> >>> >
>>>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar <yeshwanth43@gmail.com>
>>>>>> >>> > wrote:
>>>>>> >>> >>
>>>>>> >>> >> hi vikas,
>>>>>> >>> >>
>>>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>>>>>> >>> >> used HDP 2.1
>>>>>> >>> >> our table has 50-70M rows,
>>>>>> >>> >> select on that table took less than 2 seconds.
>>>>>> >>> >> Aggregation queries took less than 8 seconds.
>>>>>> >>> >> to achieve good performance we created a secondary index on the
>>>>>> >>> >> table.
>>>>>> >>> >>
>>>>>> >>> >> make sure you have fine-tuned hbase,
>>>>>> >>> >> enabling compression on the data makes a difference in response.
>>>>>> >>> >> it also helps if you distribute the data and load over all regions in hbase,
>>>>>> >>> >> look at the performance tips mentioned in the phoenix blog
>>>>>> >>> >>
>>>>>> >>> >> -yeshwanth
>>>>>> >>> >>
>>>>>> >>> >>
>>>>>> >>> >>
>>>>>> >>> >> Cheers,
>>>>>> >>> >> Yeshwanth
>>>>>> >>> >>
>>>>>> >>> >>
>>>>>> >>> >>
>>>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <vikas@infoobjects.com>
>>>>>> >>> >> wrote:
>>>>>> >>> >>>
>>>>>> >>> >>> Hi,
>>>>>> >>> >>>
>>>>>> >>> >>> Preface: We are testing Phoenix using the Hortonworks distribution for
>>>>>> >>> >>> HBase on an Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
>>>>>> >>> >>>
>>>>>> >>> >>> In contrast to the performance benchmarks, I found Phoenix to be very
>>>>>> >>> >>> slow in querying, even on primary key or row key. So I tried increasing
>>>>>> >>> >>> the RAM for HBase and Phoenix, and increasing the CPU and RAM by
>>>>>> >>> >>> upgrading the EC2 machine type to r3.xlarge (4 CPU, 30 GB RAM).
>>>>>> >>> >>> Results were like this:
>>>>>> >>> >>>
>>>>>> >>> >>> Time taken to return the result of a query on row key:
>>>>>> >>> >>> With Storm running and very little RAM available: 50 sec
>>>>>> >>> >>>
>>>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
>>>>>> >>> >>>
>>>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM): 8 sec
>>>>>> >>> >>>
>>>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB RAM):
>>>>>> >>> >>> 0.0150 seconds. :)
>>>>>> >>> >>>
>>>>>> >>> >>> So, the difference seems to be many fold compared to what native HBase
>>>>>> >>> >>> is providing to us. I am not able to understand how this is possible.
>>>>>> >>> >>> What am I missing here?
>>>>>> >>> >>>
>>>>>> >>> >>> --
>>>>>> >>> >>> Regards,
>>>>>> >>> >>> Vikas Agarwal
>>>>>> >>> >>> 91 – 9928301411
>>>>>> >>> >>>
>>>>>> >>> >>> InfoObjects, Inc.
>>>>>> >>> >>> Execution Matters
>>>>>> >>> >>> http://www.infoobjects.com
>>>>>> >>> >>> 2041 Mission College Boulevard, #280
>>>>>> >>> >>> Santa Clara, CA 95054
>>>>>> >>> >>> +1 (408) 988-2000 Work
>>>>>> >>> >>> +1 (408) 716-2726 Fax
>>>>>> >>> >>
>>>>>> >>> >>
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >>> >
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>


-- 
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax
