phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Afshin Moazami <Afshin.Moaz...@guavus.com>
Subject Index rows are not updated when the index key updated using bulk loader
Date Fri, 11 Dec 2015 17:29:44 GMT
Hi,

I guess it is a bug, but before posting in Jira, I want to see if anyone has an idea here.

I found out the map reduce csv bulk load tool doesn't behave the same as UPSERTs. Is it by
design or a bug?

Here is the queries for creating table and index:

CREATE TABLE mySchema.mainTable (
id varchar NOT NULL,
name varchar,
address varchar
CONSTRAINT pk PRIMARY KEY (id));


CREATE INDEX myIndex
ON mySchema.mainTable  (name, id)
INCLUDE (address);

if I execute two upserts where the second one update the name (which is the key for index),
everything works fine (the record will be updated in both table and index table)

UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'john', 'Montreal');
UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'jack', 'Montreal');

SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'jack';
  ==> one record
SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'john';
  ==> zero records

But, if I load the date using org.apache.phoenix.mapreduce.CsvBulkLoadTool to the main table,
it behaves different. The main table will be updated, but the new record will be appended
to the index table:

HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol-1.1.2.jar:/etc/hbase/conf hadoop jar  /usr/lib/hbase/phoenix-4.5.2-HBase-1.1-bin/phoenix-4.5.2-HBase-1.1-client.jar
org.apache.phoenix.mapreduce.CsvBulkLoadTool -d',' -s mySchema -t mainTable -i /tmp/input.txt

input.txt:

2,tomas,montreal

2,george,montreal

(I have tried it both with/without -it and got the same result)


SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'tomas'
==> one record;


SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable where name = 'george'
==> one record;

Mime
View raw message