phoenix-user mailing list archives

From Kristoffer Sjögren <sto...@gmail.com>
Subject Re: Copy table between hbase clusters
Date Fri, 11 Dec 2015 21:36:42 GMT
My concern regarding the row key is that it is 23 bytes in reality
instead of the expected 26 bytes. Hmm. I guess I could write a value
manually using SQL and then inspect it with asynchbase to find where
those 3 missing bytes went.
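
Something along these lines is what I had in mind for that check; just
a rough sketch, with the ZooKeeper quorum and the values as
placeholders for our real setup:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Arrays;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;
import org.hbase.async.Scanner;

public class DumpKeys {
  public static void main(String[] args) throws Exception {
    // Write one row through Phoenix so we know exactly what went in.
    try (Connection conn =
        DriverManager.getConnection("jdbc:phoenix:zk-host")) {
      conn.createStatement().executeUpdate(
          "UPSERT INTO T1 VALUES (1, 10, 100, 1000, 'ab', 5)");
      conn.commit();
    }

    // Read the same table back with asynchbase and print each raw key,
    // so its length and layout can be compared against the DDL.
    HBaseClient client = new HBaseClient("zk-host");
    Scanner scanner = client.newScanner("T1");
    ArrayList<ArrayList<KeyValue>> rows;
    while ((rows = scanner.nextRows(100).join()) != null) {
      for (ArrayList<KeyValue> row : rows) {
        byte[] key = row.get(0).key();
        System.out.println(key.length + " bytes: " + Arrays.toString(key));
      }
    }
    client.shutdown().join();
  }
}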

I'm not sure I understand the post procedures. I'm not trying to do an
in-place upgrade. The table will be created from scratch in Phoenix
4.4.0 / HBase 1.1.2, and the data will be read from the old
installation using asynchbase and manually UPSERTed from that raw data
using SQL.
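
The copy itself would look roughly like the following; decode() is just
a placeholder for whatever key/value decoding we end up with once the
old key layout is clear, and the quorum strings are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;
import org.hbase.async.Scanner;

public class CopyTable {
  public static void main(String[] args) throws Exception {
    HBaseClient oldCluster = new HBaseClient("old-zk-host");
    Connection newCluster =
        DriverManager.getConnection("jdbc:phoenix:new-zk-host");
    PreparedStatement upsert = newCluster.prepareStatement(
        "UPSERT INTO T1 (C1, C2, C3, C4, C5, V) VALUES (?, ?, ?, ?, ?, ?)");

    Scanner scanner = oldCluster.newScanner("T1");
    ArrayList<ArrayList<KeyValue>> rows;
    while ((rows = scanner.nextRows(1000).join()) != null) {
      for (ArrayList<KeyValue> row : rows) {
        Object[] cols = decode(row);
        for (int i = 0; i < cols.length; i++) {
          upsert.setObject(i + 1, cols[i]);
        }
        upsert.executeUpdate();
      }
      // Phoenix buffers upserts on the connection until commit.
      newCluster.commit();
    }
    newCluster.close();
    oldCluster.shutdown().join();
  }

  // Placeholder: split row.get(0).key() into C1..C5 and read V from
  // the cell value once the old key layout is understood.
  private static Object[] decode(ArrayList<KeyValue> row) {
    throw new UnsupportedOperationException("TODO");
  }
}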

The reason I'm doing it this way is to avoid exactly those manual post
procedures. So just to be clear: will this approach still require
them?
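
If it does, my understanding of the CURRENT_SCN step is roughly the
following; just a sketch, with the quorum string and the timestamp as
placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class RecreateDdl {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Any timestamp (millis) earlier than the oldest cell timestamp
    // in the data being moved over.
    props.setProperty("CurrentSCN", Long.toString(1400000000000L));
    try (Connection conn = DriverManager.getConnection(
        "jdbc:phoenix:new-zk-host", props)) {
      conn.createStatement().execute(
          "CREATE TABLE T1 (C1 INTEGER NOT NULL, C2 INTEGER NOT NULL, "
        + "C3 BIGINT NOT NULL, C4 BIGINT NOT NULL, C5 CHAR(2) NOT NULL, "
        + "V BIGINT CONSTRAINT PK PRIMARY KEY (C1, C2, C3, C4, C5))");
    }
  }
}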


Thanks James.

On Fri, Dec 11, 2015 at 5:28 PM, James Taylor <jamestaylor@apache.org> wrote:
> Your analysis of the row key structure is correct. Those are all fixed types
> (4 + 4 + 8 + 8 + 2 = 26 bytes for the key).
>
> If you're going from 0.94 to 0.98, there's stuff you need to do to get your
> data into the new format. Best to ask about this on the HBase user list or
> look it up in the reference docs.
>
> Once you get your data moved over, I'd recommend reissuing your DDL,
> specifying a CURRENT_SCN at connection time of a timestamp in millis prior
> to any of your cell timestamps (so that Phoenix doesn't set empty key value
> markers for every row). If you're using any date/time types then your DDL
> should specify them as their UNSIGNED equivalent. If you have DESC row key
> declarations for variable length types or BINARY, there's some post
> processing steps you'll need to do, but otherwise you should be good to go
> (FWIW, we had an internal team successfully go through this about 6mo back).
>
> Thanks,
> James
>
> On Friday, December 11, 2015, Kristoffer Sjögren <stoffe@gmail.com> wrote:
>>
>> My plan is to try using asynchbase to read the raw data and then upsert
>> it using Phoenix SQL.
>>
>> However, when I read the old table, the data types for the row key
>> don't add up.
>>
>> CREATE TABLE T1 (C1 INTEGER NOT NULL, C2 INTEGER NOT NULL, C3 BIGINT
>> NOT NULL, C4 BIGINT NOT NULL, C5 CHAR(2) NOT NULL, V BIGINT CONSTRAINT
>> PK PRIMARY KEY ( C1, C2, C3, C4, C5 ))
>>
>> That's 4 + 4 + 8 + 8 + 2 = 26 bytes for the key. But the actual key
>> that I read from HBase is only 23 bytes.
>>
>> [0, -128, 0, 0, 0, -44, 4, 123, -32, -128, 0, 0, 10, -128, 0, 0, 0, 0,
>> 0, 0, 0, 32, 32]
>>
>> Maybe the data type definitions as described on the Phoenix site have
>> changed since version 2.2.3? Or some data type may be variable in
>> size?
>>
>>
>> On Thu, Dec 10, 2015 at 4:49 PM, Kristoffer Sjögren <stoffe@gmail.com>
>> wrote:
>> > Hi
>> >
>> > We're in the process of upgrading from Phoenix 2.2.3 / HBase 0.96 to
>> > Phoenix 4.4.0 / HBase 1.1.2 and wanted to know the simplest/easiest
>> > way to copy data from old-to-new table.
>> >
>> > The tables contain only a few hundred million rows so it's OK to
>> > export locally and then upsert.
>> >
>> > Cheers,
>> > -Kristoffer
