phoenix-user mailing list archives

From Randy Hu <ruw...@gmail.com>
Subject Re: Best strategy for UPSERT SELECT in large table
Date Sat, 17 Jun 2017 23:06:21 GMT
If I counted the trailing zeros correctly, that's 15 billion records. Any
solution based on HBase PUT operations (such as UPSERT SELECT) would probably
take far longer than you expect. It would be better to use the
MapReduce-based bulk importer provided by Phoenix:

https://phoenix.apache.org/bulk_dataload.html

The importer uses HBase bulk loading to write all the data directly into
HBase storage files (HFiles), then hands them over to HBase in the final
stage, thus avoiding the network and random disk access costs of going
through the HBase region servers.
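
As a rough sketch of what launching the importer can look like, the helper
below only assembles the command line for the CSV bulk load tool described on
the page above. It assumes the source data has already been exported to CSV
files on HDFS (that step is not shown). The jar name, table name, input path
and ZooKeeper quorum are hypothetical placeholders; check the tool's options
against the documentation for your Phoenix version.

```python
import shlex


def bulk_load_command(client_jar, table, input_path, zookeeper=None):
    """Build the argument list for Phoenix's MapReduce CSV bulk loader.

    All argument values are supplied by the caller; nothing here talks to
    a cluster, it only assembles the command.
    """
    cmd = [
        "hadoop", "jar", client_jar,
        "org.apache.phoenix.mapreduce.CsvBulkLoadTool",
        "--table", table,       # target Phoenix table
        "--input", input_path,  # CSV input path on HDFS
    ]
    if zookeeper:
        # Optional ZooKeeper quorum, if it is not picked up from the config.
        cmd += ["--zookeeper", zookeeper]
    return cmd


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = bulk_load_command(
        "phoenix-client.jar", "TARGET_TABLE", "/data/target_table_export"
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run on a machine that has the
Hadoop client configured, or printed and run by hand.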

Randy

On Fri, Jun 16, 2017 at 9:51 AM, Pedro Boado [via Apache Phoenix User List]
<ml+s1124778n3675h74@n5.nabble.com> wrote:

> Hi guys,
>
> We are trying to populate a Phoenix table based on a 1:1 projection of
> another table with around 15.000.000.000 records via an UPSERT SELECT in
> the Phoenix client. We've noticed very poor performance (I suspect the
> client is using a single-threaded approach) and lots of issues with client
> timeouts.
>
> Is there a better way of approaching this problem?
>
> Cheers!
> Pedro
