Hello,
I was able to successfully insert "basic" data types (int and varchar)
using the Pig StoreFunc, but I have not been able to insert a Pig
bytearray into a Phoenix VARBINARY column.
Example:
CREATE TABLE IF NOT EXISTS binary (id BIGINT NOT NULL, binary VARBINARY CONSTRAINT my_pk PRIMARY KEY (id));
phoenix> select * from binary;
+------------+------------+
| ID | BINARY |
+------------+------------+
+------------+------------+
> cat testdata.tdf
1 10
2 20
3 30
grunt> A = LOAD 'testdata.tdf' USING PigStorage('\t') AS (id:long, avro:bytearray);
grunt> describe A;
A: {id: long,avro: bytearray}
grunt> STORE A INTO 'hbase://BINARY' USING org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 1000');
This throws a ClassCastException:
java.lang.Exception: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to [B
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to [B
    at org.apache.phoenix.schema.PDataType$23.toBytes(PDataType.java:2976)
    at org.apache.phoenix.schema.PDataType$23.toObject(PDataType.java:3022)
    at org.apache.phoenix.pig.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:131)
    at org.apache.phoenix.pig.hadoop.PhoenixRecord.convertTypeSpecificValue(PhoenixRecord.java:87)
    at org.apache.phoenix.pig.hadoop.PhoenixRecord.write(PhoenixRecord.java:68)
    at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:71)
    at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:41)
    at org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:151)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Pig version 0.12, Phoenix version 3.0, on EMR AMI 3.1.
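
In case it helps narrow things down: reading the trace, it looks like
TypeUtil.castPigTypeToPhoenix hands the Pig tuple field straight to
Phoenix's VARBINARY type, which casts it to byte[], but Pig delivers
binary fields wrapped in org.apache.pig.data.DataByteArray. Below is a
minimal sketch of the unwrapping I would expect somewhere on that path.
This is hypothetical illustration code, not the actual Phoenix source;
the only API I am relying on is DataByteArray.get(), which really does
return the underlying byte[] in Pig.

import org.apache.pig.data.DataByteArray;

public class VarbinaryUnwrapSketch {

    // Hypothetical helper: Phoenix's VARBINARY conversion expects a raw
    // byte[], but Pig wraps binary fields in DataByteArray. Unwrapping
    // before the Phoenix cast would avoid the ClassCastException above.
    public static Object unwrapForPhoenix(Object pigValue) {
        if (pigValue instanceof DataByteArray) {
            // DataByteArray.get() returns the wrapped byte[] (real Pig API).
            return ((DataByteArray) pigValue).get();
        }
        // Other Pig values (Long, String, ...) pass through unchanged.
        return pigValue;
    }
}

If something along these lines belongs in TypeUtil, I'm happy to file a
JIRA, but maybe I am simply missing a supported way to declare the
field on the Pig side.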
I appreciate any help/ideas.
Thanks,
Daniel Rodriguez