phoenix-user mailing list archives

From Ravi Kiran <maghamraviki...@gmail.com>
Subject Re: pig and phoenix
Date Fri, 05 Dec 2014 23:20:38 GMT
Hi Ralph,
   Can you please try modifying the STORE command in your script to the
following:
   STORE D into 'hbase://$table_name/period,deployment,file_id,recnum'
using org.apache.phoenix.pig.PhoenixHBaseStorage('$zookeeper','-batchSize
1000');

By default, Phoenix generates the UPSERT query for the table assuming the
columns arrive in the order they appear in your CREATE TABLE statement. In
your case, I see you are reordering the columns in the STORE command, so
the positional mapping no longer lines up. With the change above, Phoenix
constructs the right UPSERT query for you using the columns you list after
$table_name.
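To make the failure mode concrete, here is a minimal sketch in plain Python
(not Phoenix internals) of the difference between positional binding and an
explicit column list. The `_SALT`-first ordering is an assumption based on
your third error message (salted tables carry a leading salt byte), and the
`bind_positional`/`bind_named` helpers are hypothetical, purely for
illustration:

```python
# Hypothetical sketch: positional vs named column binding.
# Internal order of a salted table -- _SALT first shifts every position,
# an assumption drawn from the "_SALT:BINARY" error in the thread.
table_columns = [("_SALT", bytes), ("PERIOD", int), ("DEPLOYMENT", str),
                 ("FILE_ID", str), ("RECNUM", int)]

# A tuple from the Pig relation D: (period, deployment, file_id, recnum)
pig_tuple = (1417815600, "site-a", "f-001", 7)

def bind_positional(columns, values):
    """Bind values to columns by position, checking each value's type."""
    row = {}
    for (name, expected), value in zip(columns, values):
        if not isinstance(value, expected):
            raise TypeError(f"Unable to process column {name}: "
                            f"{type(value).__name__} cannot be coerced "
                            f"to {expected.__name__}")
        row[name] = value
    return row

def bind_named(columns, names, values):
    """Bind values to an explicit column list, as in
    'hbase://$table_name/period,deployment,file_id,recnum'."""
    by_name = dict(columns)
    row = {}
    for name, value in zip(names, values):
        expected = by_name[name.upper()]
        if not isinstance(value, expected):
            raise TypeError(f"Unable to process column {name.upper()}")
        row[name.upper()] = value
    return row

# Positional binding fails: the first field lands in the _SALT slot.
try:
    bind_positional(table_columns, pig_tuple)
except TypeError as e:
    print("positional:", e)

# Naming the columns restores the intended mapping.
row = bind_named(table_columns,
                 ["period", "deployment", "file_id", "recnum"], pig_tuple)
print("named:", sorted(row))
```

The same idea applies to your script: naming the columns after $table_name
tells Phoenix exactly which tuple field belongs to which column, instead of
relying on positions that the salt byte (or any reordering) can shift.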

Also, to see the query Phoenix has generated, look for a log entry that
starts with "Phoenix Generic Upsert Statement:". That will give further
insight into the UPSERT query.

Happy to help!!

Regards
Ravi


On Fri, Dec 5, 2014 at 2:57 PM, Perko, Ralph J <Ralph.Perko@pnnl.gov> wrote:

>  Hi, I wrote a series of pig scripts to load data that were working well
> with 4.0, but since upgrading  to 4.2.x (4.2.1 currently) are now failing.
>
>  Here is an example:
>
>  Table def:
> CREATE TABLE IF NOT EXISTS t1_log_dns
> (
>   period BIGINT NOT NULL,
>   deployment VARCHAR NOT NULL,
>   file_id VARCHAR NOT NULL,
>   recnum INTEGER NOT NULL,
>   f1 VARCHAR,
>   f2 VARCHAR,
>   f3 VARCHAR,
>   f4 BIGINT,
> ...
>  CONSTRAINT pkey PRIMARY KEY (period, deployment, file_id, recnum)
> )
> IMMUTABLE_ROWS=true,COMPRESSION='SNAPPY',SALT_BUCKETS=10,SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
>
>  --- some index def’s – same error occurs with or without them
>
>  Pig script:
>
>  register $phoenix_jar;
> register $udf_jar;
>
>  Z = load '$data' as (
> file_id,
> recnum,
> period,
> deployment,
> ... more fields
> );
>
>  -- put it all together and generate final output!
> D = foreach Z generate
> period,
> deployment,
> file_id,
> recnum ,
> ... more fields;
>
>  STORE D into 'hbase://$table_name' using
> org.apache.phoenix.pig.PhoenixHBaseStorage('$zookeeper','-batchSize 1000');
>
>  Error:
> 2014-12-05 14:24:06,450 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR: Unable to process
> column RECNUM:INTEGER, innerMessage=java.lang.String cannot be coerced to
> INTEGER
> 2014-12-05 14:24:06,450 [main] ERROR
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
> 2014-12-05 14:24:06,452 [main] INFO
>  org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
>
>  HadoopVersion PigVersion UserId StartedAt FinishedAt Features
> 2.4.0.2.1.5.0-695 0.12.1.2.1.5.0-695 perko 2014-12-05 14:23:17 2014-12-05
> 14:24:06 UNKNOWN
>
>  Based on the error it would seem that some non-integer value cannot be
> cast to an integer.  But the data does not show this.  Stepping through the
> Pig script and running "dump" on each variable
> shows the data in the right place and the right coercible type – for
> example the recnum has nothing but single digits of sample data.
>
>  I have tried to set "recnum" to an int in pig but this just pushes the
> error up to the previous field - file_id:
>
>  ERROR 2999: Unexpected internal error. Unable to process column
> FILE_ID:VARCHAR, innerMessage=java.lang.Integer cannot be coerced to VARCHAR
>
>  Other times I get a different error:
>
>  Unable to process column _SALT:BINARY,
> innerMessage=org.apache.phoenix.schema.TypeMismatchException: ERROR 203
> (22005): Type mismatch. BINARY cannot be coerced to LONG
>
>  Is there something obvious I am doing wrong?  Did something significant
> change between 4.0 and 4.2.x in this regard?  I would not rule out some
> silly user error I inadvertently introduced :-/
>
>  Thanks for your help
> Ralph
>
>
