phoenix-user mailing list archives

From Gaurav Kanade <gaurav.kan...@gmail.com>
Subject Using Phoenix Bulk Upload CSV to upload 200GB data
Date Fri, 11 Sep 2015 22:55:14 GMT
Hi All

I am new to Apache Phoenix (and relatively new to MR in general), but I am
trying a bulk load of a 200GB tab-separated file into an HBase table. This
seems to start off fine and kicks off about 1,400 mappers and 9 reducers (I
have 9 data nodes in my setup).
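
For reference, I am launching the job roughly along these lines; the table
name, input path, and ZooKeeper quorum below are placeholders for my real
values, and I pass the tab delimiter explicitly since the tool defaults to
commas (I believe $'\t' is the way to hand it a literal tab from bash):

    # make the HBase/Phoenix classes visible to the MapReduce job
    HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar phoenix-<version>-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table MY_TABLE \
        --zookeeper zk1,zk2,zk3 \
        --input /data/input.tsv \
        --delimiter $'\t'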

At some point the process runs into trouble: the data nodes appear to run
out of local disk capacity (from what I can see, each data node has 400GB
of local space). Certain reducers eat up most of the capacity on those
nodes, which slows the process to a crawl and ultimately leads to the
NodeManagers complaining that node health is bad (log-dirs and local-dirs
are bad).

Is there some setting I am missing that I need to configure for this
particular job?
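
Concretely, these are the kinds of knobs I have been looking at; the
property names are standard Hadoop/YARN ones rather than anything
Phoenix-specific, and the values shown are only what I understand the
defaults/examples to be, so please correct me if I am off base:

    # mapred-site.xml (or per job, if the tool accepts -D options):
    # compress intermediate map output so the shuffle data spilled to the
    # NodeManager local dirs takes less space
    mapreduce.map.output.compress=true
    mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec

    # yarn-site.xml: the NodeManager disk health checker marks a local/log
    # dir "bad" once it crosses this utilization threshold (90% by default,
    # as far as I can tell), which matches the error I am seeing
    yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage=90.0

    # yarn-site.xml: where the reducers' shuffle/spill data actually lands;
    # spreading this across more or larger volumes would be another option
    yarn.nodemanager.local-dirs=/mnt/disk1/yarn/local,/mnt/disk2/yarn/local

If compressing the map output is enough to keep the per-node spill under
the 400GB of local space, would that alone avoid the disk health check
tripping, or is there a more standard way to size this job?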

Any pointers would be appreciated

Thanks

-- 
Gaurav Kanade,
Software Engineer
Big Data
Cloud and Enterprise Division
Microsoft
