phoenix-user mailing list archives

From Sergey Soldatov <sergeysolda...@gmail.com>
Subject Re: hbase cell storage different between bulk load and direct api
Date Thu, 19 Apr 2018 20:58:22 GMT
Heh. That looks like a bug actually. This is a 'dummy' KV (
https://phoenix.apache.org/faq.html#Why_empty_key_value), but I have some
doubts that we need it for compacted rows.
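
For readers following along, the difference between the two dumps quoted
below can be checked offline with a small Python sketch. It is not Phoenix
code: the cell tuples are transcribed by hand from the HBase shell output in
this thread, the helper name decode_bigint_key is hypothetical, and the
decoding assumes the row key is a Phoenix BIGINT stored as 8 big-endian
bytes with the sign bit flipped.

```python
import struct

# Cells transcribed from the two dumps quoted in this thread, as
# (row key, column family, column qualifier as printed by the HBase shell).
EMPTY_QUALIFIER = b"\x00\x00\x00\x00"  # qualifier of the extra cells
ROW_1 = b"\x80\x00\x00\x00\x00\x0009"
ROW_2 = b"\x80\x00\x00\x00\x00\x01\x092"

psql_cells = {
    (ROW_1, b"M", EMPTY_QUALIFIER),
    (ROW_1, b"M", b"1"),
    (ROW_2, b"M", EMPTY_QUALIFIER),
    (ROW_2, b"M", b"1"),
}
bulk_cells = {
    (ROW_1, b"M", b"1"),
    (ROW_2, b"M", b"1"),
}


def decode_bigint_key(key: bytes) -> int:
    """Decode an 8-byte key, assuming big-endian with the sign bit flipped."""
    flipped = bytes([key[0] ^ 0x80]) + key[1:]
    (value,) = struct.unpack(">q", flipped)
    return value


# Cells present after the psql load but absent after the bulk load.
extra = sorted(psql_cells - bulk_cells)
for row, family, qualifier in extra:
    print(decode_bigint_key(row), family.decode(), qualifier)
```

Under those assumptions the only cells missing from the bulk load are the
ones with the \x00\x00\x00\x00 qualifier, one per row.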

Thanks,
Sergey

On Thu, Apr 19, 2018 at 11:30 PM, Lew Jackman <lew9090@netzero.net> wrote:

> I have not tried the master branch yet; however, on Phoenix 4.13 this
> storage discrepancy in hbase is still present, with the extra
> column=M:\x00\x00\x00\x00 cells in hbase when using psql or sqlline.
>
> Does anyone have an understanding of the meaning of the column qualifier
> \x00\x00\x00\x00 ?
>
>
> ---------- Original Message ----------
> From: "Lew Jackman" <lew9090@netzero.net>
> To: user@phoenix.apache.org
> Cc: user@phoenix.apache.org
> Subject: Re: hbase cell storage different between bulk load and direct api
> Date: Thu, 19 Apr 2018 13:59:16 GMT
>
> The upsert statement gives the same result as psql, i.e. the extra
> cells. I will try the master branch next. Thanks for the tip.
>
> ---------- Original Message ----------
> From: Sergey Soldatov <sergeysoldatov@gmail.com>
> To: user@phoenix.apache.org
> Subject: Re: hbase cell storage different between bulk load and direct api
> Date: Thu, 19 Apr 2018 12:26:25 +0600
>
> Hi Lew,
> No. The first one looks incorrect. You may file a bug on that (I believe
> that the second case is correct, but you may also check by uploading data
> using regular upserts). Also, you may check whether the master branch has
> this issue.
>
> Thanks,
> Sergey
>
> On Thu, Apr 19, 2018 at 10:19 AM, Lew Jackman <lew9090@netzero.net> wrote:
>
>> Under Phoenix 4.11 we are seeing some storage discrepancies in hbase
>> between a load via psql and a bulk load.
>>
>> To illustrate in a simple case we have modified the example table from
>> the load reference https://phoenix.apache.org/bulk_dataload.html
>>
>> CREATE TABLE example (
>>     my_pk bigint not null,
>>     m.first_name varchar(50),
>>     m.last_name varchar(50)
>>     CONSTRAINT pk PRIMARY KEY (my_pk))
>>     IMMUTABLE_ROWS=true,
>>     IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,
>>     COLUMN_ENCODED_BYTES = 1;
>>
>> Hbase Rows when Loading via PSQL
>>
>> \x80\x00\x00\x00\x00\x0009
>>     column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
>> \x80\x00\x00\x00\x00\x0009
>>     column=M:1, timestamp=1524109827690,
>>     value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
>> \x80\x00\x00\x00\x00\x01\x092
>>     column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
>> \x80\x00\x00\x00\x00\x01\x092
>>     column=M:1, timestamp=1524109827690,
>>     value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02
>>
>> Hbase Rows when Loading via MapReduce using CsvBulkLoadTool
>>
>> \x80\x00\x00\x00\x00\x0009
>>     column=M:1, timestamp=1524110486638,
>>     value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
>> \x80\x00\x00\x00\x00\x01\x092
>>     column=M:1, timestamp=1524110486638,
>>     value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02
>>
>>
>> So, the table has 4 cells for the two rows when loaded via psql, whereas
>> the bulk-loaded table has only 2 cells, since it lacks the cells with
>> column qualifier \x00\x00\x00\x00.
>>
>> Is this behavior correct?
>>
>> Thanks much for any insight.
>>
>>
>>
>
>
