phoenix-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Invoking org.apache.phoenix.mapreduce.CsvBulkLoadTool from phoenix-4.4.0.2.4.0.0-169-client.jar is not working properly
Date Wed, 03 Aug 2016 12:17:09 GMT
Hi Radha,

This looks to me as if there is an issue in your data somewhere past
the first thousand or so records. The bulk loader isn't supposed to
fail on problems like this; it's intended to report the bad input
lines and continue on, but that evidently isn't happening here.
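To help track down the offending record before filing the issue, a quick heuristic scan can narrow things. This is a minimal sketch (the `find_suspect_lines` name and the 20-hit cap are arbitrary choices): the "EOF reached before encapsulated token finished" error means the parser saw an opening double quote with no matching close, so input lines containing an odd number of quote characters are good candidates. Note that legitimate multi-line quoted fields are also flagged, so treat hits as leads to inspect, not definite errors.

```python
def find_suspect_lines(path, limit=20):
    """Return line numbers whose double-quote count is odd.

    An unmatched quote makes an RFC 4180-style parser read ahead
    looking for the closing quote, which -- if none exists -- ends in
    the "EOF reached before encapsulated token finished" error.
    """
    suspects = []
    # errors="replace" keeps the scan going even if the 40 GB file
    # contains bytes that aren't valid in the default encoding.
    with open(path, "r", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if line.count('"') % 2 != 0:
                suspects.append(lineno)
                if len(suspects) >= limit:
                    break
    return suspects

# usage: find_suspect_lines("/path/to/your/input.csv")
```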

Could you log an issue in the PHOENIX JIRA
(https://issues.apache.org/jira/browse/PHOENIX) for this problem?
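As a side note on why a ~1,000-record sample can load while the full file fails: a single stray quote anywhere later in the file can swallow every record after it. Here is a minimal illustration (using Python's csv module purely as a stand-in for the Commons CSV lexer in your stack trace, and commas instead of your FS delimiter):

```python
import csv
import io

# Three comma-separated records, but the first contains a stray
# opening double quote with no matching close.
bad = '1,"unterminated,3\n4,5,6\n7,8,9\n'

rows = list(csv.reader(io.StringIO(bad)))

# An RFC 4180-style parser treats everything after the stray quote as
# one quoted field and keeps reading until it finds a closing quote or
# hits end of input -- the "EOF reached before encapsulated token
# finished" failure mode. All three lines collapse into one record.
print(len(rows))   # 1 record instead of 3
```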

Thanks,

Gabriel


On Wed, Aug 3, 2016 at 9:53 AM, Radha Krishna G <grkmca95@yahoo.com> wrote:
>
> Hi All,
> I am trying to load a roughly 40 GB file using
> "org.apache.phoenix.mapreduce.CsvBulkLoadTool", but it fails with the
> error message below.
>
> INFO mapreduce.Job: Task Id : attempt_1469663368297_56967_m_000042_0, Status : FAILED
> Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
>         at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
>         at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
>         at com.google.common.collect.Iterators.getNext(Iterators.java:890)
>         at com.google.common.collect.Iterables.getFirst(Iterables.java:781)
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:287)
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:148)
>         ... 9 more
> Caused by: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
>         at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:282)
>         at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
>         at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:450)
>         at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:395)
>         ... 14 more
>
>
> Note: I extracted a sample of around 1,000 records from the same file
> and was able to load them using the same approach, but when I provide
> the full file it fails. Can anyone suggest a solution for this issue?
>
> Below is the command I used
> ===========================
>
> HADOOP_CLASSPATH=/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/conf \
>   hadoop jar phoenix-4.4.0.2.4.0.0-169-client.jar \
>   org.apache.phoenix.mapreduce.CsvBulkLoadTool \
>   --table "Table_Name" --input "HDFS input file path" -d $'\034'
>
>
> -d $'\034' --> the field separator in the file is the FS character
> (octal 034, i.e. 0x1C), so we provided it explicitly
>
> Regards
> Radha krishna G
