In Phoenix 4.3.0 and later, the way a date column is converted to an object during bulk load was changed.
If a column is of date type and its value is not null, Phoenix first converts the value to bytes.
At this step, if the value of the column is an empty string (""), it may cause an error.

The relevant code is at line 231 of org.apache.phoenix.util.csv.CsvUpsertExecutor.java.
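
To illustrate the failure mode, here is a minimal sketch. It is not the Phoenix source: the class DateFieldSketch and both method names are hypothetical, the date pattern is assumed from the sample row in your mail, and it uses java.time, so the empty string fails with a DateTimeParseException rather than the IllegalArgumentException that Phoenix reports. The point is only that a check for null alone lets "" reach the date parser.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical illustration only; this is not Phoenix's CsvUpsertExecutor code.
class DateFieldSketch {
    // Assumed pattern, matching the sample value "2015-11-24 00:00:00".
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Guards only against null, so an empty string is handed to the parser.
    static Long parseNullCheckOnly(String field) {
        if (field == null) {
            return null;
        }
        return LocalDateTime.parse(field, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    // Treats an empty string the same way as null before parsing.
    static Long parseEmptyAsNull(String field) {
        if (field == null || field.isEmpty()) {
            return null;
        }
        return parseNullCheckOnly(field);
    }

    public static void main(String[] args) {
        System.out.println(parseEmptyAsNull(""));
        System.out.println(parseEmptyAsNull("2015-11-24 00:00:00"));
        try {
            parseNullCheckOnly("");
        } catch (DateTimeParseException e) {
            System.out.println("empty string rejected: " + e.getMessage());
        }
    }
}
```

With the null-only check, the empty field in a row such as p1~abc~~1~x is parsed as a date and rejected. With the extra isEmpty() check it is treated as null.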

You can check the difference between Phoenix 4.2.2 and Phoenix 4.3.0 at the following website.

Best regards,

2015-11-25 5:06 GMT+08:00 Hemal Parekh <hemal@bitscopic.com>:

We recently upgraded our production HDP cluster from 2.2 to 2.3.2. Phoenix was upgraded from 4.2 to 4.4. The bulk load script using psql.py, which was working in Phoenix 4.2, stopped working in Phoenix 4.4. Upon investigation, I found that psql.py was failing to upsert a null value into a date column, which worked fine in Phoenix 4.2. The .csv file has an empty string for a date column. To rule out any upgrade issue, I created a temp table in Phoenix 4.4 and tried to insert a record using psql.py, but it failed with the error below for the null date value.

java.lang.IllegalArgumentException: Invalid format: ""  

create table temp1 (pk varchar, c1 varchar, c2 date, c3 integer, c4 varchar constraint pk_temp1 primary key (pk))

Values used in the .csv file (I ran psql.py separately with each of the two records):

p1~abc~~1~x (this one gives the error)

p2~abc~2015-11-24 00:00:00~1~x (this one is inserted fine)

Bulk load command:

/usr/hdp/current/phoenix-client/bin/psql.py -t TEMP1 -d '~' -s <my host>:2181:/hbase-unsecure temp1_insert.csv

For columns of types other than date, psql.py can upsert null values.

Has anyone experienced this issue? Do I need to set any property in hbase-site.xml to allow null values in a date column?


Hemal Parekh
Senior Data Warehouse Architect