phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Heppner <heppner.m...@gmail.com>
Subject Inconsistent null behavior
Date Tue, 06 Dec 2016 16:58:51 GMT
I encountered something interesting and I'm not sure if it's a bug in
Phoenix itself, the query server, or just a side effect of using a large
binary column. If I create a table like this (in sqlline):
    create table test1 (
        id integer not null primary key,
        image varbinary,
        height unsigned_int,
        width unsigned_int
    )

And insert with the Python phoenixdb client (uses the query server):
    with phoenixdb.connect('http://localhost:8765', autocommit=True) as con:
        with con.cursor() as cur:
            cur.execute('upsert into test1 values(?, ?, ?, ?)', [1, img,
900, 900])

When I "select * from test1" using sqlline, the width column appears as
null:
    +-----+--------------+---------+--------+
    | ID  |    IMAGE     | HEIGHT  | WIDTH  |
    +-----+--------------+---------+--------+
    | 1   | [B@5a2bd7c8  | 900     | null   |
    +-----+--------------+---------+--------+

However, using "select width from test1", it returns the value:
    +--------+
    | WIDTH  |
    +--------+
    | 900    |
    +--------+

Filtering by null using "select * from test1 where width is null" gives 0
rows, so it's like the value is there but doesn't get recognized when using
a "select *".

The value does show up when the binary column is not included in the select
(select id, height, width). If the binary column is present, it appears as
null (select image, height, width).

This behavior can be confirmed with phoenixdb too:
    > cur.execute('select width from test1')
    > cur.fetchall()
    [[900]]

    > cur.execute('select * from test1')
    > cur.fetchall()
    [[1, '\x00\x01 ...', 900, None]]

I don't think this is an issue with phoenixdb, since I can see the protobuf
message serializing both the height and width columns the same way. Is this
an issue with Phoenix itself, or just something the query server did on
upsert?

Is this a side effect of using a large binary column (typical length is
6480000)? In sqlline, using this works:

    upsert into test1 values (2, '\x00\x01\x02', 900, 900);
    select * from test1;

    +-----+--------------+---------+--------+
    | ID  |    IMAGE     | HEIGHT  | WIDTH  |
    +-----+--------------+---------+--------+
    | 1   | [B@1835d3ed  | 900     | null   |
    | 2   | [B@66213a0d  | 900     | 900    |
    +-----+--------------+---------+--------+

Also worth mentioning, this never happens to the height column; it always
has the correct value. It's only happening on width, the second of the
unsigned_int columns.

HBase 1.1.2 and Phoenix 4.7 from HDP 2.5.

Mime
View raw message