phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Lu Wei <wey...@outlook.com>
Subject Exception of phoenix hive integration
Date Sat, 14 Apr 2018 23:39:33 GMT
## Version:

phoenix: 4.13.2-cdh5.11.2

hive: 1.1.0-cdh5.11.2


There is an ColumnNotFoundException when joining a hive internal table with an Phoenix external
table.



## Table1: phoenix external table "ext_tmp":

+-----------+------------+--------------------+--+

| col_name  | data_type  |      comment       |

+-----------+------------+--------------------+--+

| cola      | string     | from deserializer  |

| colb      | string     | from deserializer  |

+-----------+------------+--------------------+--+


### Backend Phoenix table "TMP":

select * from TMP;

+-------+------------+

| cola  |    colb    |

+-------+------------+

| a     | aaaa       |

| b     | bbaaaa     |

| ccc   | cccbbaaaa  |

+-------+------------+


### Hive external table creation statement:


+----------------------------------------------------+--+

|                   createtab_stmt                   |

+----------------------------------------------------+--+

| CREATE EXTERNAL TABLE `ext_tmp`(                   |

|   `cola` string COMMENT 'from deserializer',       |

|   `colb` string COMMENT 'from deserializer')       |

| ROW FORMAT SERDE                                   |

|   'org.apache.phoenix.hive.PhoenixSerDe'           |

| STORED BY                                          |

|   'org.apache.phoenix.hive.PhoenixStorageHandler'  |

| WITH SERDEPROPERTIES (                             |

|   'serialization.format'='1')                      |

| LOCATION                                           |

|   'hdfs://st:8020/data/user/hive/warehouse/ext_tmp'|

| TBLPROPERTIES (                                    |

|   'phoenix.column.mapping'='cola:cola,colb:colb',  |

|   'phoenix.rowkeys'='cola',                        |

|   'phoenix.table.name'='tmp',                      |

|   'phoenix.zookeeper.client.port'='2181',          |

|   'phoenix.zookeeper.quorum'='st1,st2,st3',        |

|   'phoenix.zookeeper.znode.parent'='/hbase',       |

|   'transient_lastDdlTime'='1523607352')            |

+----------------------------------------------------+--+



## Table2: hive internal table "native1":

+-----------+------------+----------+--+

| col_name  | data_type  | comment  |

+-----------+------------+----------+--+

| cola      | string     |          |

| colb      | string     |          |

+-----------+------------+----------+--+



## When join the two tables:


select *  from native1  join  ext_tmp t on native1.cola= t.cola;



Exception:

org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=TMP


Detailed exception as below. There is an empty column "" as readcolumn name, which is not
exsit in Pheonix table at all. so the phoenix query will never be correct: select /*+ NO_CACHE
 */ "","cola","colb" from tmp where "cola" is not null


Any thoughts?



---------

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: [HiveServer2-Background-Pool:
Thread-304]: hive.io.file.readcolumn.ids=

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: [HiveServer2-Background-Pool:
Thread-304]: hive.io.file.readcolumn.names=,cola,colb

2018-04-14 21:13:40,923 INFO  org.apache.hadoop.hive.ql.io.HiveInputFormat: [HiveServer2-Background-Pool:
Thread-304]: Generating splits

2018-04-14 21:13:40,924 INFO  org.apache.phoenix.hive.query.PhoenixQueryBuilder: [HiveServer2-Background-Pool:
Thread-304]: Input query : select /*+ NO_CACHE  */ "","cola","colb" from tmp where "cola"
is not null

2018-04-14 21:13:40,932 ERROR org.apache.phoenix.hive.mapreduce.PhoenixInputFormat: [HiveServer2-Background-Pool:
Thread-304]: Failed to get the query plan with error [ERROR 504 (42703): Undefined column.
columnName=TMP]


Thanks,

Wei
Mime
View raw message