phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ciureanu, Constantin (GfK)" <Constantin.Ciure...@gfk.com>
Subject MapReduce bulk load into Phoenix table
Date Tue, 13 Jan 2015 09:12:13 GMT
Hello all,

Due to the slow speed of Phoenix JDBC upserts (single machine, roughly 1,000-1,500 rows/sec),
I am also reading up on loading data into Phoenix via MapReduce.
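
For reference, this is roughly the JDBC path I am measuring. It is only a sketch: the table
MY_TABLE (ID, NAME) and the ZK quorum "zk-quorum" are placeholders, but the pattern
(auto-commit off, commit every N upserts so Phoenix flushes its client-side mutation buffer
in batches) is what I have now:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PhoenixJdbcLoad {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string and table; adjust to the real cluster / schema
        Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-quorum");
        conn.setAutoCommit(false);
        PreparedStatement ps = conn.prepareStatement(
                "UPSERT INTO MY_TABLE (ID, NAME) VALUES (?, ?)");
        for (long i = 0; i < 1000000; i++) {
            ps.setLong(1, i);
            ps.setString(2, "row-" + i);
            ps.executeUpdate();
            if (i % 1000 == 0) {
                conn.commit();  // flush the client-side mutation buffer in batches
            }
        }
        conn.commit();
        conn.close();
    }
}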

So far I have understood that the row key plus the List<KeyValue> cells to be inserted into
the HBase table are obtained through a "dummy" (never committed) Phoenix connection; those
cells are then written into HFiles, and after the MR job finishes the HFiles are bulk loaded
into HBase in the usual way (a sketch of how I picture this is below).
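
Here is a minimal sketch of how I picture that mapper, loosely modelled on
CsvToKeyValueMapper. The table MY_TABLE (ID, NAME), the two-column CSV input and the ZK
quorum "zk-quorum" are placeholders of mine; if I read the Phoenix 4.x code correctly, the
relevant call is PhoenixRuntime.getUncommittedDataIterator(Connection), which hands back the
row key / KeyValue pairs without ever committing anything to the cluster:

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.phoenix.util.PhoenixRuntime;

// Sketch only: UPSERT each input line into an uncommitted Phoenix connection,
// then pull out the KeyValues Phoenix built and emit them for HFileOutputFormat.
public class UpsertToKeyValueMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {

    private Connection conn;
    private PreparedStatement upsert;
    private final ImmutableBytesWritable outKey = new ImmutableBytesWritable();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        try {
            // The "dummy" connection: auto-commit stays off, nothing is ever committed
            conn = DriverManager.getConnection("jdbc:phoenix:zk-quorum");
            conn.setAutoCommit(false);
            upsert = conn.prepareStatement("UPSERT INTO MY_TABLE (ID, NAME) VALUES (?, ?)");
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        try {
            String[] fields = line.toString().split(",");
            upsert.setLong(1, Long.parseLong(fields[0]));
            upsert.setString(2, fields[1]);
            upsert.executeUpdate();

            // The uncommitted mutations, already encoded as HBase KeyValues
            Iterator<Pair<byte[], List<KeyValue>>> it =
                    PhoenixRuntime.getUncommittedDataIterator(conn);
            while (it.hasNext()) {
                Pair<byte[], List<KeyValue>> row = it.next();
                for (KeyValue kv : row.getSecond()) {
                    outKey.set(row.getFirst());
                    context.write(outKey, kv);
                }
            }
            conn.rollback();  // drop the buffered mutations before the next line
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        try {
            if (conn != null) {
                conn.close();
            }
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }
}

The driver would then, as far as I understand, configure the job output with
HFileOutputFormat.configureIncrementalLoad(...) and run LoadIncrementalHFiles on the output
directory once the job completes, which is the bulk load step I mentioned above.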

My question: is there a better / faster approach? I assume this one cannot reach the maximum
possible speed for loading data into a Phoenix / HBase table.

I would also like to find better / newer sample code than this one:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix/4.0.0-incubating/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java#CsvToKeyValueMapper.loadPreUpsertProcessor%28org.apache.hadoop.conf.Configuration%29

Thank you,
   Constantin