phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: query client performance
Date Fri, 25 Jul 2014 19:39:03 GMT
Yes, we're planning on cutting an RC early next week.
Thanks,
James

On Fri, Jul 25, 2014 at 12:36 PM, Abe Weinograd <abe@flonet.com> wrote:
> Thanks James.  That's very helpful.
>
> 4.1 is being released soon?
>
> Thanks,
> Abe
>
>
> On Fri, Jul 25, 2014 at 3:34 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>>
>> Hi Abe,
>> FWIW, there's an improvement in place
>> (https://issues.apache.org/jira/browse/PHOENIX-539) for our upcoming
>> release so that the first next-row call no longer pulls everything
>> over at once; instead, results are returned in chunks.
>>
>> As far as what you can do now, I'd recommend putting a LIMIT clause on
>> your queries as this will bound the number of rows that get pulled
>> over. You can also page through the results as described here:
>> http://phoenix.apache.org/paged.html and elaborated on in this email
>> thread: http://s.apache.org/588
>>
>> Thanks,
>> James
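
For illustration, a minimal sketch of the LIMIT-plus-paging approach described above, using plain JDBC against the Phoenix thick-client driver; the table (METRICS), its key columns (HOST, TS), and the ZooKeeper quorum are hypothetical, and the row value constructor syntax follows the paged-queries page linked in the message.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class PagedExtract {
        public static void main(String[] args) throws SQLException {
            // Thick-client JDBC URL; "localhost" stands in for the real ZooKeeper quorum.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
                String lastHost = "";  // position before the first key
                long lastTs = 0L;
                boolean more = true;
                while (more) {
                    more = false;
                    // The row value constructor resumes where the previous page ended;
                    // LIMIT bounds how many rows each query pulls over to the client.
                    try (PreparedStatement ps = conn.prepareStatement(
                            "SELECT host, ts, val FROM METRICS "
                          + "WHERE (host, ts) > (?, ?) "
                          + "ORDER BY host, ts LIMIT 1000")) {
                        ps.setString(1, lastHost);
                        ps.setLong(2, lastTs);
                        try (ResultSet rs = ps.executeQuery()) {
                            while (rs.next()) {
                                lastHost = rs.getString(1);
                                lastTs = rs.getLong(2);
                                // ... feed the row to the ETL step here ...
                                more = true;
                            }
                        }
                    }
                }
            }
        }
    }

Paging this way keeps the client's memory footprint bounded regardless of the total extract size, at the cost of one query per page.
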
>>
>> On Fri, Jul 25, 2014 at 10:45 AM, Nicolas Maillard
>> <nmaillard@hortonworks.com> wrote:
>> > Hello Abe
>> >
>> > You are right: currently the Phoenix client is the final step, hence
>> > some processing can happen there.
>> > One option is to put the client on the cluster itself, to avoid long and
>> > suboptimal network hops.
>> > Another is a service running in the cluster, in front of the client, to
>> > handle the last steps and even pagination/compression.
>> >
>> >
>> >
>> > On Thu, Jul 24, 2014 at 11:24 PM, Abe Weinograd <abe@flonet.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> One of our main use cases is to extract a subset of our data (usually
>> >> in the 10 million row range) from our tables in Phoenix into an ETL
>> >> tool.  The behavior I am seeing is that all rows are streamed to the
>> >> machine running the Phoenix client and then processed before the JDBC
>> >> driver gets the next row.
>> >>
>> >> We have tuned the scanner cache to 1000 rows, but it still takes a
>> >> while.  I imagine all rows are being sorted before they are streamed
>> >> out to the result set.  Is this something we can change?  What other
>> >> things can I tune for this access pattern?
>> >>
>> >> Thanks!
>> >> Abe
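
As a point of reference for the scanner-cache tuning mentioned above, here is a hypothetical sketch of passing the standard HBase scanner-caching setting through the Phoenix JDBC connection properties; whether a given Phoenix version picks it up there rather than from hbase-site.xml on the client classpath is an assumption worth verifying.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class ScannerCacheExtract {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Standard HBase client setting: rows fetched per RPC from each region server.
            // Passing it via JDBC properties (instead of hbase-site.xml) is an assumption.
            props.setProperty("hbase.client.scanner.caching", "1000");
            // "localhost" is a placeholder for the real ZooKeeper quorum.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props)) {
                // ... run the extract query here ...
            }
        }
    }
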
>> >
>> >
>> >
>
>
