phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Copy table between hbase clusters
Date Sat, 12 Dec 2015 06:02:40 GMT
My mistake - I thought you were going to do an in-place upgrade. No, no
post processing will be required if the data is UPSERT'ed through the
standard Phoenix APIs.

I don't know how/why your row keys aren't the correct number of bytes.
Maybe your row key is different than what you think?

Thanks,
James

On Fri, Dec 11, 2015 at 1:36 PM, Kristoffer Sjögren <stoffe@gmail.com>
wrote:

> My concern regarding the row key is that they are 23 bytes in reality
> instead of 26 bytes? Hmm. Guess I could write a value manually using
> SQL and inspect it afterwards using asynchbase to find where those 3
> missing bytes went.
>
> Not sure I understand the post procedure. I'm not trying to do an
> in-place upgrade. The table will be created from scratch in Phoenix
> 4.4.0 / HBase 1.1.2. The data will be read from the old installation
> using asynchbase and manually UPSERT'ed from that raw data using SQL.
>
> The reason I do it this way is to avoid such manual post procedures.
> So just to be clear, will this approach still require those post
> procedures?
>
>
> Thanks James.
>
> On Fri, Dec 11, 2015 at 5:28 PM, James Taylor <jamestaylor@apache.org>
> wrote:
> > Your analysis of the row key structure is correct. Those are all fixed
> types
> > (4 + 4 + 8 +8 + 2 = 26 bytes for the key).
> >
> > If you're going from 0.94 to 0.98, there's stuff you need to do to get
> your
> > data into the new format. Best to ask about this on the HBase user list
> or
> > look it up in the reference docs.
> >
> > Once you get your data moved over, I'd recommend reissuing your DDL,
> > specifying a CURRENT_SCN at connection time of a timestamp in millis
> prior
> > to any of your cell timestamps (so that Phoenix doesn't set empty key
> value
> > markers for every row). If your using any date/time types then your DDL
> > should specify them as their UNSIGNED equivalent. If you have DESC row
> key
> > declarations for variable length types or BINARY, there's some post
> > processing steps you'll need to do, but otherwise you should be good to
> go
> > (FWIW, we had an internal team successfully go through this about 6mo
> back).
> >
> > Thanks,
> > James
> >
> > On Friday, December 11, 2015, Kristoffer Sjögren <stoffe@gmail.com>
> wrote:
> >>
> >> My plan is to try use asynchbase to read the raw data and then upsert
> >> it using Phoenix SQL.
> >>
> >> However, when I read the old table the data types for the row key
> >> doesn't add up.
> >>
> >> CREATE TABLE T1 (C1 INTEGER NOT NULL, C2 INTEGER NOT NULL, C3 BIGINT
> >> NOT NULL, C4 BIGINT NOT NULL, C5 CHAR(2) NOT NULL, V BIGINT CONSTRAINT
> >> PK PRIMARY KEY ( C1, C2, C3, C4, C5 ))
> >>
> >> That's 4 + 4 + 8 +8 + 2 = 26 bytes for the key. But the actual key
> >> that I read from HBase is only 23 bytes.
> >>
> >> [0, -128, 0, 0, 0, -44, 4, 123, -32, -128, 0, 0, 10, -128, 0, 0, 0, 0,
> >> 0, 0, 0, 32, 32]
> >>
> >> Maybe the data type definitions as described on the phoenix site have
> >> changed since version 2.2.3? Or some data type may be variable in
> >> size?
> >>
> >>
> >> On Thu, Dec 10, 2015 at 4:49 PM, Kristoffer Sjögren <stoffe@gmail.com>
> >> wrote:
> >> > Hi
> >> >
> >> > We're in the process of upgrading from Phoenix 2.2.3 / HBase 0.96 to
> >> > Phoneix 4.4.0 / HBase 1.1.2 and wanted to know the simplest/easiest
> >> > way to copy data from old-to-new table.
> >> >
> >> > The tables contain only a few hundred million rows so it's OK to
> >> > export locally and then upsert.
> >> >
> >> > Cheers,
> >> > -Kristoffer
>

Mime
View raw message