Under Phoenix 4.11 we are seeing storage discrepancies in HBase between a load via psql
and a bulk load.
To illustrate with a simple case, we have modified the example table from the bulk load reference:
https://phoenix.apache.org/bulk_dataload.html
CREATE TABLE example (
my_pk bigint not null,
m.first_name varchar(50),
m.last_name varchar(50)
CONSTRAINT pk PRIMARY KEY (my_pk))
IMMUTABLE_ROWS=true,
IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,
COLUMN_ENCODED_BYTES = 1;
HBase rows when loading via psql:
\x80\x00\x00\x00\x00\x0009 column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
\x80\x00\x00\x00\x00\x0009 column=M:1, timestamp=1524109827690, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
\x80\x00\x00\x00\x00\x01\x092 column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
\x80\x00\x00\x00\x00\x01\x092 column=M:1, timestamp=1524109827690, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02
HBase rows when loading via MapReduce using CsvBulkLoadTool:
\x80\x00\x00\x00\x00\x0009 column=M:1, timestamp=1524110486638, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
\x80\x00\x00\x00\x00\x01\x092 column=M:1, timestamp=1524110486638, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02
So the table has 4 cells for the two rows when loaded via psql, whereas after a bulk load
it has only 2 cells: the cells with column qualifier M:\x00\x00\x00\x00 are missing. Is
this behavior correct? Thanks much for any insight.
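
For anyone who wants to check the comparison mechanically, here is a minimal Python sketch that diffs the two scans. The (row key, column) pairs are transcribed by hand from the output above; the variable names are ours and purely illustrative, and nothing here talks to HBase or Phoenix.

```python
# (row key, column) pairs copied from the two scans above.
# All names below are illustrative, not Phoenix or HBase identifiers.
ROW_1 = r"\x80\x00\x00\x00\x00\x0009"
ROW_2 = r"\x80\x00\x00\x00\x00\x01\x092"
ZERO_QUALIFIER = r"M:\x00\x00\x00\x00"
DATA_QUALIFIER = "M:1"

# Cells present after loading via psql.
psql_cells = {
    (ROW_1, ZERO_QUALIFIER),
    (ROW_1, DATA_QUALIFIER),
    (ROW_2, ZERO_QUALIFIER),
    (ROW_2, DATA_QUALIFIER),
}

# Cells present after loading via CsvBulkLoadTool.
bulk_cells = {
    (ROW_1, DATA_QUALIFIER),
    (ROW_2, DATA_QUALIFIER),
}

# Cells written by psql that the bulk load did not write.
missing = sorted(psql_cells - bulk_cells)

print(f"psql: {len(psql_cells)} cells, bulk load: {len(bulk_cells)} cells")
for row, column in missing:
    print(f"missing after bulk load: row={row} column={column}")
```

The set difference contains only the two M:\x00\x00\x00\x00 cells, one per row, which is the discrepancy we are asking about.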