phoenix-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Unable to Use bulkloader to load Control-A delimited file
Date Thu, 31 Dec 2015 09:02:50 GMT
Hi Anil,

Sorry I've been slow to respond on this thread.

The command that you're using looks fine to me, and in any case it
shouldn't be possible to hit this kind of issue, regardless of the
command you use.

Assuming the fix that was done for PHOENIX-2238 works (and I'm quite
sure it does), my first guess on how something like this could happen
is if there is another version of the CsvBulkImportUtil class that is
earlier on the classpath than the one in the client jar that you're
supplying on the command line. I think this could happen if you've got
another version of the Phoenix jars in the lib directory of HBase, as
the hbase classpath is being added to the HADOOP_CLASSPATH variable in
the load command you're using. If that's the case, the old logic for
serializing the delimiter information would be used, leading to this
error.
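
For context: the problem with the old logic is that the raw delimiter
character is stored directly in the job Configuration, so when the
Configuration gets written out to job.xml the control character becomes the
character reference "&#1", which the XML parser rejects. Below is a rough
sketch of the two approaches; the property key and class names are purely
illustrative, not the actual Phoenix code:

    import java.nio.charset.StandardCharsets;
    import java.util.Base64;
    import org.apache.hadoop.conf.Configuration;

    public class DelimiterConfigSketch {

        // Illustrative key name only, not the real Phoenix configuration key.
        static final String FIELD_DELIMITER_KEY = "csv.field.delimiter";

        // Old-style logic: the raw character goes straight into the
        // Configuration, and from there into job.xml, where a control
        // character like '\u0001' is not a legal XML character.
        static void setDelimiterRaw(Configuration conf, char delimiter) {
            conf.set(FIELD_DELIMITER_KEY, String.valueOf(delimiter));
        }

        // PHOENIX-2238-style logic: Base64-encode the character so only
        // printable ASCII ends up in job.xml, and decode it on the task side.
        static void setDelimiterEncoded(Configuration conf, char delimiter) {
            byte[] raw = String.valueOf(delimiter).getBytes(StandardCharsets.UTF_8);
            conf.set(FIELD_DELIMITER_KEY, Base64.getEncoder().encodeToString(raw));
        }

        static char getDelimiterEncoded(Configuration conf) {
            byte[] decoded = Base64.getDecoder().decode(conf.get(FIELD_DELIMITER_KEY));
            return new String(decoded, StandardCharsets.UTF_8).charAt(0);
        }
    }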

Could you verify that the phoenix jars in the HBase lib directory (or
anywhere else on the hadoop classpath) are the same version as the
client jar that you're using?
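
If it helps, a quick way to see which jar a given class is actually loaded
from is to ask its class loader for the code source. Something along these
lines (the fully-qualified class name is taken from the Phoenix source
layout; adjust it if yours differs):

    // Prints the jar (or directory) that CsvBulkImportUtil was loaded from.
    public class WhichJar {
        public static void main(String[] args) throws Exception {
            Class<?> clazz =
                Class.forName("org.apache.phoenix.mapreduce.CsvBulkImportUtil");
            System.out.println(
                clazz.getProtectionDomain().getCodeSource().getLocation());
        }
    }

Compiling that and running it with the same classpath (and ordering) that
the job ends up with, e.g. java -cp "`hbase classpath`:<client jar>" WhichJar,
should show which copy of the class wins.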

- Gabriel


On Thu, Dec 31, 2015 at 6:27 AM, anil gupta <anilgupta84@gmail.com> wrote:
> Hi James,
>
> We use HDP 2.2.
> The commit for that JIRA was done in the 4.5-HBase-1.1 branch:
> https://github.com/apache/phoenix/commit/2ab807d1ef8cb9c1cc06bfd53e8a89ce7379c57f
>
> We also decompiled the downloaded client jar and verified that the related
> code changes are present in it. Still, the feature is not working. Are we
> running the right command?
>
> On Wed, Dec 30, 2015 at 5:27 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>>
>> Gabriel may have meant the Cloudera labs release of 4.5.2, but I'm not
>> sure if that fix is there or not. We have no plans to do a 4.5.3 release.
>> FYI, Andrew put together a 4.6 version that works with CDH here too:
>> https://github.com/chiastic-security/phoenix-for-cloudera. We also plan to
>> do a 4.7 release soon.
>>
>> Thanks,
>> James
>>
>>
>> On Wed, Dec 30, 2015 at 4:30 PM, anil gupta <anilgupta84@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I figured out that Phoenix 4.5.3 is not released yet. So, we downloaded
>>> the Phoenix client jar from the nightly builds of Phoenix:
>>> https://builds.apache.org/view/All/job/Phoenix-4.5-HBase-1.1/lastSuccessfulBuild/artifact/phoenix-assembly/target/phoenix-4.5.3-HBase-1.1-SNAPSHOT-client.jar
>>>
>>> We ran the following command:
>>> HADOOP_CLASSPATH=`hbase classpath`  hadoop jar
>>> /tmp/phoenix-4.5.3-HBase-1.1-SNAPSHOT-client.jar
>>> org.apache.phoenix.mapreduce.CsvBulkLoadTool --table ABC_DATA --input
>>> /user/abcdref/part-m-00000  -d $'\001'
>>>
>>> We are still getting the same error:
>>>
>>> 2015-12-30 15:12:05,622 FATAL [main]
>>> org.apache.hadoop.conf.Configuration: error parsing conf job.xml
>>> org.xml.sax.SAXParseException; systemId:
>>> file:///grid/5/hadoop/yarn/local/usercache/hbase/appcache/application_1448997579389_28875/container_e11_1448997579389_28875_02_000001/job.xml;
>>> lineNumber: 78; columnNumber: 74; Character reference "&#1" is an invalid
>>> XML character.
>>> 	at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>>> 	at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>>> 	at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>>> 	at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
>>> 	at
>>> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2549)
>>> 	at
>>> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2502)
>>> 	at
>>> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
>>> 	at org.apache.hadoop.conf.Configuration.get(Configuration.java:1232)
>>> 	at
>>> org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:51)
>>> 	at
>>> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1444)
>>>
>>> Is my bulkloader command incorrect?
>>>
>>>
>>> Thanks,
>>> Anil Gupta
>>>
>>> On Wed, Dec 30, 2015 at 11:23 AM, anil gupta <anilgupta84@gmail.com>
>>> wrote:
>>>>
>>>> I don't see a 4.5.3 release over here:
>>>> http://download.nextag.com/apache/phoenix/
>>>> Is 4.5.3 not released yet?
>>>>
>>>> On Wed, Dec 30, 2015 at 11:14 AM, anil gupta <anilgupta84@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Gabriel,
>>>>>
>>>>> Thanks for the info. What is the backward compatibility policy of
>>>>> Phoenix releases? Would the 4.5.3 client jar work with the Phoenix 4.4
>>>>> server jar? Are 4.4 and 4.5 considered two major releases, or minor
>>>>> releases?
>>>>>
>>>>> Thanks,
>>>>> Anil Gupta
>>>>>
>>>>> On Tue, Dec 29, 2015 at 11:11 PM, Gabriel Reid <gabriel.reid@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Anil,
>>>>>>
>>>>>> This issue was resolved a while back, via this ticket:
>>>>>> https://issues.apache.org/jira/browse/PHOENIX-2238
>>>>>>
>>>>>> Unfortunately, that fix is only available starting from Phoenix 4.6
>>>>>> and 4.5.3 (i.e. it wasn't back-ported to 4.4.x).
>>>>>>
>>>>>> - Gabriel
>>>>>>
>>>>>> On Wed, Dec 30, 2015 at 1:21 AM, anil gupta <anilgupta84@gmail.com>
>>>>>> wrote:
>>>>>> > Hi,
>>>>>> > We want to use the bulk load tool to load files that are delimited by
>>>>>> > Control-A.
>>>>>> > We are running this command on Phoenix 4.4 (HDP 2.2):
>>>>>> > hadoop jar
>>>>>> >
>>>>>> > /usr/hdp/current/phoenix-client/phoenix-4.4.0.2.3.2.0-2950-client.jar
>>>>>> > org.apache.phoenix.mapreduce.CsvBulkLoadTool --table LEAD_SALES_DATA
>>>>>> > --input
>>>>>> > /user/psawant/part-m-00000 -o /tmp/phx_bulk -d $'\001'
>>>>>> >
>>>>>> > The above command is as per this Phoenix doc:
>>>>>> > https://phoenix.apache.org/bulk_dataload.html
>>>>>> >
>>>>>> > I have tried to look here:
>>>>>> > https://issues.apache.org/jira/browse/HADOOP-7542
>>>>>> >
>>>>>> > Similar to https://issues.apache.org/jira/browse/HBASE-3623, we don't
>>>>>> > encode special characters with Base64 encoding?
>>>>>> >
>>>>>> > We get this error:
>>>>>> >
>>>>>> > 2015-12-29 15:26:43,801 INFO [main]
>>>>>> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster
>>>>>> > for
>>>>>> > application appattempt_1448997579389_27927_000001
>>>>>> > 2015-12-29 15:26:44,105 FATAL [main]
>>>>>> > org.apache.hadoop.conf.Configuration:
>>>>>> > error parsing conf job.xml
>>>>>> > org.xml.sax.SAXParseException; systemId:
>>>>>> >
>>>>>> > file:///grid/4/hadoop/yarn/local/usercache/hbase/appcache/application_1448997579389_27927/container_e11_1448997579389_27927_01_000001/job.xml;
>>>>>> > lineNumber: 78; columnNumber: 74; Character reference "&#1" is an
>>>>>> > invalid XML character.
>>>>>> >       at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>>>>>> >       at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown
>>>>>> > Source)
>>>>>> >       at
>>>>>> > javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>>>>>> >       at
>>>>>> > org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
>>>>>> >       at
>>>>>> >
>>>>>> > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2549)
>>>>>> >       at
>>>>>> >
>>>>>> > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2502)
>>>>>> >       at
>>>>>> > org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
>>>>>> >       at
>>>>>> > org.apache.hadoop.conf.Configuration.get(Configuration.java:1232)
>>>>>> >       at
>>>>>> >
>>>>>> > org.apache.hadoop.mapreduce.v2.util.MRWebAppUtil.initialize(MRWebAppUtil.java:51)
>>>>>> >       at
>>>>>> >
>>>>>> > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1444)
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Thanks & Regards,
>>>>>> > Anil Gupta
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>> Anil Gupta
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Anil Gupta
>>>
>>>
>>>
>>>
>>> --
>>> Thanks & Regards,
>>> Anil Gupta
>>
>>
>
>
>
> --
> Thanks & Regards,
> Anil Gupta
