phoenix-user mailing list archives

From "rubysina" <ru...@sina.com>
Subject Error with lines ended with backslash when Bulk Data Loading
Date Thu, 08 Dec 2016 08:10:46 GMT
Hi, I'm new to Phoenix SQL and have run into a small problem.

I'm following this page: http://phoenix.apache.org/bulk_dataload.html
I've found that the MapReduce importer cannot load a file in which a line ends with a backslash.
Even with the -g (ignore-errors) parameter it fails with "java.io.IOException: EOF whilst
processing escape sequence".

It works fine if a line contains a backslash anywhere other than at the very end,

and there is no problem loading the same file with psql.py.
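
My guess: the MapReduce mapper hands each line to Apache Commons CSV with backslash configured
as the escape character, so a backslash at the very end of a line opens an escape sequence that
is cut off by end-of-input, while a backslash in the middle of a line just escapes the next
character. Below is a small standalone test of that guess (only commons-csv, nothing
Phoenix-specific; the class name is made up):

    import java.io.IOException;
    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVParser;

    public class BackslashTest {
        public static void main(String[] args) throws IOException {
            // A single backslash, the entire content of a.csv in the failing example.
            String line = "\\";

            // With '\' configured as the escape character (my assumption about what the
            // MapReduce CSV parser does), the trailing backslash opens an escape sequence
            // that runs into end-of-input.
            try (CSVParser parser = CSVParser.parse(line, CSVFormat.DEFAULT.withEscape('\\'))) {
                parser.getRecords();
            } catch (IOException e) {
                System.out.println("with escape: " + e.getMessage());
                // prints: EOF whilst processing escape sequence
            }

            // With no escape character configured, the backslash is ordinary data and the
            // line parses fine, which would explain the psql.py behaviour, assuming
            // psql.py's loader really leaves the escape character unset.
            try (CSVParser parser = CSVParser.parse(line, CSVFormat.DEFAULT)) {
                System.out.println("without escape: " + parser.getRecords());
            }
        }
    }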

Why does this happen, and how can I work around it?

thank you.



-----------------------------------------------------------------------------------------------
For example:


create table a(a char(100) primary key);

echo \\>a.csv
cat a.csv
\
hdfs dfs -put  a.csv  
...JsonBulkLoadTool  -g -t a  -i a.csv  
-- error
16/12/08 15:44:21 INFO mapreduce.Job: Task Id : attempt_1481093434027_0052_m_000000_0, Status
: FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: EOF whilst
processing escape sequence
        at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:202)
        at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:74)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.RuntimeException: java.io.IOException: EOF whilst processing escape sequence
        at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
        at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
        at com.google.common.collect.Iterators.getNext(Iterators.java:890)
        at com.google.common.collect.Iterables.getFirst(Iterables.java:781)
        at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:109)
        at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:91)
        at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:161)
        ... 9 more



echo \\a>a.csv
cat a.csv
\a
hdfs dfs -rm  a.csv  
hdfs dfs -put  a.csv  
...JsonBulkLoadTool -g -t a  -i a.csv  
-- success


echo \\>a.csv
cat a.csv
\
psql.py -t A zoo a.csv 
CSV Upsert complete. 1 rows upserted
-- success
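
The only workaround I can think of (an untested guess): if the MapReduce parser really does
treat backslash as an escape character, then doubling it in the CSV should make it come out
as one literal backslash:

echo \\\\>a.csv
cat a.csv
\\
hdfs dfs -rm  a.csv
hdfs dfs -put  a.csv
...JsonBulkLoadTool -g -t a  -i a.csv

but then psql.py would presumably load two backslashes from the same file, so that is not a
real answer either.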


thank you.