phoenix-user mailing list archives

From "Heather, James (ELS)" <>
Subject Advice on Phoenix config
Date Wed, 03 Aug 2016 14:28:50 GMT

We've got quite a wide table (maybe 50 columns) with about 1 billion rows, currently stored in MySQL; we're looking at moving it into Phoenix. The primary key there is an auto-increment column, but each row also contains a UUID, and that would probably naturally become the primary key in Phoenix. Several other tables hang off this table, in the sense that the main table's primary key appears as a foreign key in them. There are also several indexed columns in MySQL that would need to carry over as indexes in Phoenix.
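For concreteness, a rough sketch of what this might look like as Phoenix DDL (all table and column names here are invented for illustration; BINARY(16) for the UUID key is just one possible encoding, and Phoenix does not enforce foreign keys, so the parent/child relationship is by convention only):

```sql
-- Hypothetical schema sketch: names are illustrative, not from the source.
CREATE TABLE main_table (
    id BINARY(16) NOT NULL PRIMARY KEY,  -- the UUID, stored as 16 raw bytes
    col1 VARCHAR,
    col2 BIGINT
    -- ... roughly 50 columns in total
);

-- Secondary index mirroring one of the indexed MySQL columns
CREATE INDEX idx_main_col1 ON main_table (col1);

-- A child table carrying the parent's UUID as an (unenforced) foreign key
CREATE TABLE child_table (
    child_id BINARY(16) NOT NULL PRIMARY KEY,
    main_id BINARY(16),
    payload VARCHAR
);
CREATE INDEX idx_child_main_id ON child_table (main_id);
```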

Most of the queries are reads, but maybe 20% of them are writes. Almost all of them are small,
doing point lookups or returning a few rows based on one of the indexes.
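The workload described above would translate into queries along these lines (again with invented names; note that in Phoenix all writes are expressed as UPSERT):

```sql
-- Point lookup by primary key (the UUID bound as 16 bytes by the client)
SELECT col1, col2 FROM main_table WHERE id = ?;

-- Small read served by a secondary index
SELECT id, col1 FROM main_table WHERE col1 = ? LIMIT 20;

-- One of the ~20% writes
UPSERT INTO main_table (id, col1, col2) VALUES (?, ?, ?);
```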

Can anyone suggest sensible Phoenix/HBase config to get decent performance out of this? Specifically:

  1.  How should we encode the UUID? As BINARY(16)? And if this is the PK, and they are randomly
generated UUIDs, presumably salting is unnecessary?
  2.  How many nodes should we expect to need to give us at least as good performance as our
MySQL database with 1 billion rows?
  3.  How many regions?
  4.  Presumably this will start to outperform MySQL as the number of rows in the database increases? When we've got 10 billion rows, MySQL might struggle but hopefully Phoenix will be fine?
  5.  Are there any particular HBase configs we should be aware of (RPC timeouts etc.) that
we'll need to tweak to get decent performance? This applies partly to the bulk loading process
(data migration) at the beginning, but also afterwards when it's released into production.
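On question 1, if the UUID does end up as a BINARY(16) primary key, the JDBC client would need to turn each java.util.UUID into its 16-byte big-endian form. A minimal sketch (the class and method names are ours, not from any Phoenix API):

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidBytes {
    // Encode a UUID as 16 big-endian bytes, suitable for binding
    // to a Phoenix BINARY(16) column via PreparedStatement.setBytes.
    static byte[] toBytes(UUID u) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(u.getMostSignificantBits());
        buf.putLong(u.getLeastSignificantBits());
        return buf.array();
    }

    // Decode the 16-byte form back into a UUID.
    static UUID fromBytes(byte[] b) {
        ByteBuffer buf = ByteBuffer.wrap(b);
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        UUID u = UUID.randomUUID();
        byte[] b = toBytes(u);
        assert b.length == 16;
        assert fromBytes(b).equals(u);
        System.out.println("round-trip ok");
    }
}
```

Since random (version 4) UUIDs are already uniformly distributed across the key space, this encoding also bears on the salting question: the row keys should spread evenly over regions on their own.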

We'd be extremely grateful for any tips.


