phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jack Steenkamp <steenkamp.j...@gmail.com>
Subject Inconsistent Results on Table Join when Index and TIMESTAMP type involved
Date Tue, 23 Jul 2019 14:32:13 GMT
Hi All,

I have a use case where I need to join a table with an index on itself in
order to find the latest entry. In doing so, I've come a case in Phoenix
(4.14.1) where I appear to get incorrect results if I do a join with
TIMESTAMPs where indexes are involved.

I have a suspicion that this is to down to the fact that TIMESTAMPs are
stored differently on index tables (at least based on what I see in the
result set meta data it is BigDecimal) and somehow conversion is not done
properly before joining. However, I thought I would reach out to check with
those that have more expertise in this area before I continue.

Attached, you can find a standalone program that you can run which
reproduces this issue. Here are the basic steps:

1) Create a table which effectively has 3 representations of the value - as
a TIMESTAMP, a VARCHAR and as a CHAR(29). Create indexes for each of these
representations:

CREATE TABLE IF NOT EXISTS TIME_TEST
(
   ID VARCHAR        NOT NULL,
   ENTRY_NAME        VARCHAR    ,
   TRANSACTION       TIMESTAMP  ,
   TRANSACTION_STR   VARCHAR    ,
   TRANSACTION_CHR   CHAR(29)
   CONSTRAINT pk PRIMARY KEY(ID)
)
IMMUTABLE_STORAGE_SCHEME=ONE_CELL_PER_COLUMN,
UPDATE_CACHE_FREQUENCY=900000,
COLUMN_ENCODED_BYTES=NONE,
IMMUTABLE_ROWS=true

CREATE LOCAL INDEX TIME_TEST_I01 ON TIME_TEST(ENTRY_NAME, TRANSACTION DESC)

CREATE LOCAL INDEX TIME_TEST_I02 ON TIME_TEST(ENTRY_NAME, TRANSACTION_STR
DESC)

CREATE LOCAL INDEX TIME_TEST_I03 ON TIME_TEST(ENTRY_NAME, TRANSACTION_CHR
DESC)

2) Fill up the table with some values, in the different representations. In
the attached test-case I do 100 entries. Such that you have something like :

ID        ENTRY_NAME   TRANSACTION                       TRANSACTION_STR
                TRANSACTION_CHR
-------------------------------------------------------------------------------------------------------------------------
0010000   0010000      1970-01-01 01:00:00.00001         1970-01-01
00:00:00.000010000     1970-01-01 00:00:00.000010000
0010001   0010001      1970-01-01 01:00:00.000010001     1970-01-01
00:00:00.000010001     1970-01-01 00:00:00.000010001
0010002   0010002      1970-01-01 01:00:00.000010002     1970-01-01
00:00:00.000010002     1970-01-01 00:00:00.000010002
0010003   0010003      1970-01-01 01:00:00.000010003     1970-01-01
00:00:00.000010003     1970-01-01 00:00:00.000010003

3) Essentially do the same MAX / Join query using each of the different
values (e.g. TIMESTAMP vs. VARCHAR vs. CHAR(29)). They should all return
the same number of rows (e.g. the total number of rows on the table). Only
the one based on VARCHAR does:

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
 INNER JOIN
(
SELECT ENTRY_NAME, MAX(TRANSACTION_STR) AS TRANSACTION_STR
FROM TIME_TEST where ENTRY_NAME like '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME AND TIME_TEST.TRANSACTION_STR =
sub.TRANSACTION_STR
WHERE TIME_TEST.ENTRY_NAME like '0%'

*TOTAL Rows : 100*

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
 INNER JOIN
(
SELECT ENTRY_NAME, MAX(TRANSACTION) AS TRANSACTION
FROM TIME_TEST where ENTRY_NAME like '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME AND TIME_TEST.TRANSACTION =
sub.TRANSACTION
WHERE TIME_TEST.ENTRY_NAME like '0%'

*TOTAL Rows : 1*

SELECT TIME_TEST.ENTRY_NAME FROM TIME_TEST
 INNER JOIN
(
SELECT ENTRY_NAME, MAX(TRANSACTION_CHR) AS TRANSACTION_CHR
FROM TIME_TEST where ENTRY_NAME like '0%' GROUP BY ENTRY_NAME
) sub
ON TIME_TEST.ENTRY_NAME = sub.ENTRY_NAME AND TIME_TEST.TRANSACTION_CHR=
sub.TRANSACTION_CHR
WHERE TIME_TEST.ENTRY_NAME like '0%'

*TOTAL Rows : 0*

Interestingly enough - in the final query, if I change the inner join
expression to be 'CAST( MAX(TRANSACTION_CHR) AS VARCHAR) AS TRANSACTION_CHR'
it works as expected. Hence, leading me to believe this is some sort of
column type conversion issue.

Either way, would appreciate it if anyone could give me any pointers on the
above.

Thanks,

Mime
View raw message