phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "M. Aaron Bossert" <maboss...@gmail.com>
Subject inconsistent commit behavior when using JDBC
Date Wed, 22 May 2019 17:10:22 GMT
I am using Phoenix 5 as shipped with Hortonworks HDP 3.1.  I am storing 3+
million file names in a table and then using the table to keep track of
which files I have processed using a Storm topology.  I have been doing
some testing to make sure that everything is working correctly and as part
of that, tried my code on single files to ensure that columns were being
updated as expected and that my logic was working properly.

I have found that when I do an UPSERT to set a column to the current
timestamp (now()) the first time, it works as expected, then I reset that
column to NULL so that I can redo the test, but find that the exact same
UPSERT statement fails to make a change even though I get a response that
confirms one record was affected.

In order to eliminate potential red herrings, here is an example of the
problem that is reproducible using just the phoenix-sqlline interface (I
get same exact behavior when running equivalent Scala code).

Here is the DDL for the table:

CREATE TABLE DEFAULT.FILE_INDEX(
    MTIME TIMESTAMP NOT NULL,
    FILENAME VARCHAR NOT NULL,
    TYPE VARCHAR NOT NULL,
    SUBTYPE VARCHAR NOT NULL,
    SENSOR VARCHAR NOT NULL,
    SIZE BIGINT NOT NULL,
    OWNER VARCHAR NOT NULL,
    GROUP_OWNER VARCHAR NOT NULL,
    PERMISSIONS VARCHAR NOT NULL,
    STARTED TIMESTAMP,
    PROCESSED TIMESTAMP,
    EVENT_COUNT BIGINT
    CONSTRAINT PK PRIMARY KEY(MTIME
ROW_TIMESTAMP,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS))
COMPRESSION='SNAPPY',DEFAULT_COLUMN_FAMILY='F';

and here is an example set of queries to reproduce the issue I am seeing:

0: jdbc:phoenix:master-1.> UPSERT INTO
DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED)
SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now()
AS STARTED FROM DEFAULT.FILE_INDEX WHERE
FILENAME='hdfs://filename.log';1 row affected (5.041 seconds)0:
jdbc:phoenix:master-1.> select * from default.file_index where started
is not null;+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+|
         MTIME           |
             FILENAME
         | TYPE  | SUBTYPE  |     SENSOR      |  SIZE   |   OWNER    |
GROUP_OWNER  | PERM
|+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+|
2018-11-01 00:00:00.000  | hdfs://filename.log  | BRO   | DNS      |
something  | 224500  | somebody  | hdfs         | rw-r
|+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+1
row selected (4.046 seconds)0: jdbc:phoenix:master-1.> UPSERT INTO
DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED)
SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,NULL
AS STARTED FROM DEFAULT.FILE_INDEX WHERE
FILENAME='hdfs://filename.log';1 row affected (4.541 seconds)0:
jdbc:phoenix:master-1.> select * from default.file_index where started
is not null;+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+|
MTIME  | FILENAME  | TYPE  | SUBTYPE  | SENSOR  | SIZE  | OWNER  |
GROUP_OWNER  | PERMISSIONS  | STARTED  | PROCESSED  | EVENT_COUNT
|+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------++--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+No
rows selected (4.782 seconds)0: jdbc:phoenix:master-1.> UPSERT INTO
DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED)
SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now()
AS STARTED FROM DEFAULT.FILE_INDEX WHERE
FILENAME='hdfs://filename.log';1 row affected (5.254 seconds)0:
jdbc:phoenix:master-1.> select * from default.file_index where started
is not null;+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+|
MTIME  | FILENAME  | TYPE  | SUBTYPE  | SENSOR  | SIZE  | OWNER  |
GROUP_OWNER  | PERMISSIONS  | STARTED  | PROCESSED  | EVENT_COUNT
|+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------++--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+No
rows selected (4.389 seconds)

I would greatly appreciate any insights into what I might be doing wrong
here.  My assumption is that this behavior, though undesirable, is expected
and has to do with versions or perhaps autocommit settings...but for the
life of me, I have tried every permutation of autocommit settings and have
not been able to get around this behavior...

Mime
View raw message