phoenix-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Using Phoenix Bulk Upload CSV to upload 200GB data
Date Sat, 19 Sep 2015 13:16:47 GMT
Good to hear you got it working.

I can't say if the numbers that you're getting are exactly expected or not.
However, I can confirm that doing a bulk load like this can take up a lot
of space on local drives of nodes which are executing map and reduce tasks.

There are two main factors to this:
1. The (uncompressed) size of a single row in Phoenix/HBase is often quite
a bit larger than that of the same row in a CSV file or something similar.
This is because the full row key is stored together with each non-row-key
value. The actual size taken up by this is minimized within HBase via
compression and block encoding schemes, but those don't apply to the way
these values are stored as intermediate data during the bulk load, which is
why it's important to enable map output compression (see the example after
point 2 below).

2. MapReduce in general can use up a lot of local (non-HDFS) space. The
output of the mappers is written to local disk, and is then also pulled onto
local storage on the reducer nodes for processing. In other words, you
definitely need to have capacity for the full intermediate map output
available as local storage on your cluster.
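
Concretely, map output compression can be turned on per-job with the
standard Hadoop properties when launching the bulk load. A minimal sketch
only - the client jar name, table, input path, ZooKeeper quorum and the
choice of Snappy below are placeholders/assumptions, not the exact command
used in this thread:

    # Turn on compression of intermediate map output for this job only.
    # The -D generic options must come before the tool-specific arguments.
    hadoop jar phoenix-<version>-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        -Dmapreduce.map.output.compress=true \
        -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
        --table MY_TABLE \
        --input /path/to/input.csv \
        --zookeeper zkhost:2181

For a sense of scale, the job counters further down in this thread report
roughly 1.5 TB of materialized map output for about 186 GB of CSV input,
all of which has to be shuffled onto the local disks of the 9 reducer nodes.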

- Gabriel


On Fri, Sep 18, 2015 at 2:29 AM Gaurav Kanade <gaurav.kanade@gmail.com>
wrote:

> Thanks guys so much for all the help - I was able to get this scenario to
> work !
>
> Now that I got this to work I am a little bit curious to see if I can
> explore more on my initial question re disk space utilization.
>
> Providing more specifics - yes I was using YARN and my table had about a
> billion rows and 22 columns.
>
> Specifically I would see several map tasks starting and completing
> successfully. Reducers (all 9 of them) would start processing the output of
> these map tasks, but at a certain stage (75% or 99% map completion,
> depending on my number of nodes) the non-DFS disk space used on a node
> reached a critical level, eventually making the node unusable; this killed
> off all the map and reduce tasks that were on that node and restarted them,
> and the job then proceeded to eat up local disk space on the other nodes
> one by one, to the point that progress eventually receded to 0%.
>
> Of course all of this disappeared and the job ran smoothly when I switched
> to datanode VMs with 800G of disk space rather than 400G - but the fact
> remains that on a workload of this kind there is a point at which a total
> of more than 3.2 TB of temp space is required.
>
> I will of course look at using compression of map output - but just wanted
> to check if this is expected behavior on workloads of this size.
>
> Thanks
> Gaurav
>
>
>
> On 16 September 2015 at 12:21, Gaurav Kanade <gaurav.kanade@gmail.com>
> wrote:
>
>> Thanks for the pointers Gabriel! Will give it a shot now!
>>
>> On 16 September 2015 at 12:15, Gabriel Reid <gabriel.reid@gmail.com>
>> wrote:
>>
>>> Yes, there is post-processing that goes on within the driver program
>>> (i.e. the command line tool with which you started the import job).
>>>
>>> The MapReduce job actually just creates HFiles, and then the
>>> post-processing simply involves telling HBase to use these HFiles. If your
>>> terminal closed while running the tool, then the HFiles won't be handed
>>> over to HBase, which will result in what you're seeing.
>>>
>>> I usually start import jobs like this using screen [1] so that losing a
>>> client terminal connection won't get in the way of the full job completing.
>>>
>>>
>>> - Gabriel
>>>
>>>
>>>
>>> 1. https://www.gnu.org/software/screen/manual/screen.html
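
A minimal sketch of that pattern, assuming the same kind of CsvBulkLoadTool
invocation as shown earlier (the session name, jar name and paths are
placeholders):

    # run the driver inside a named screen session so it survives a dropped
    # SSH connection or a client-machine reboot
    screen -S phoenix-bulkload

    # inside the session, start the import; the driver process must stay
    # alive until it has handed the generated HFiles over to HBase
    hadoop jar phoenix-<version>-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table MY_TABLE --input /path/to/input.csv --zookeeper zkhost:2181

    # detach with Ctrl-a d, then reattach later with:
    screen -r phoenix-bulkload

Running the same command under nohup with output redirected to a log file
achieves the same effect if screen isn't available.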
>>>
>>> On Wed, Sep 16, 2015 at 9:07 PM, Gaurav Kanade <gaurav.kanade@gmail.com>
>>> wrote:
>>>
>>>> Sure, attached below are the job counter values. I checked the final
>>>> status of the job and it said succeeded. I could not see whether the
>>>> import tool itself exited successfully because I ran it overnight and my
>>>> machine rebooted at some point for some updates - I wonder if there is
>>>> some post-processing after the MR job which might have failed due to this?
>>>>
>>>> Thanks for the help !
>>>> ----------------
>>>> Counters for job_1442389862209_0002
>>>> (values shown as Map / Reduce / Total)
>>>>
>>>> File System Counters
>>>>   FILE: Number of bytes read:            1520770904675 / 2604849340144 / 4125620244819
>>>>   FILE: Number of bytes written:         3031784709196 / 2616689890216 / 5648474599412
>>>>   FILE: Number of large read operations: 0 / 0 / 0
>>>>   FILE: Number of read operations:       0 / 0 / 0
>>>>   FILE: Number of write operations:      0 / 0 / 0
>>>>   WASB: Number of bytes read:            186405294283 / 0 / 186405294283
>>>>   WASB: Number of bytes written:         0 / 363027342839 / 363027342839
>>>>   WASB: Number of large read operations: 0 / 0 / 0
>>>>   WASB: Number of read operations:       0 / 0 / 0
>>>>   WASB: Number of write operations:      0 / 0 / 0
>>>>
>>>> Job Counters
>>>>   Launched map tasks:                                     0 / 0 / 348
>>>>   Launched reduce tasks:                                  0 / 0 / 9
>>>>   Rack-local map tasks:                                   0 / 0 / 348
>>>>   Total megabyte-seconds taken by all map tasks:          0 / 0 / 460560315648
>>>>   Total megabyte-seconds taken by all reduce tasks:       0 / 0 / 158604449280
>>>>   Total time spent by all map tasks (ms):                 0 / 0 / 599687911
>>>>   Total time spent by all maps in occupied slots (ms):    0 / 0 / 599687911
>>>>   Total time spent by all reduce tasks (ms):              0 / 0 / 103258105
>>>>   Total time spent by all reduces in occupied slots (ms): 0 / 0 / 206516210
>>>>   Total vcore-seconds taken by all map tasks:             0 / 0 / 599687911
>>>>   Total vcore-seconds taken by all reduce tasks:          0 / 0 / 103258105
>>>>
>>>> Map-Reduce Framework
>>>>   Combine input records:              0 / 0 / 0
>>>>   Combine output records:             0 / 0 / 0
>>>>   CPU time spent (ms):                162773540 / 90154160 / 252927700
>>>>   Failed Shuffles:                    0 / 0 / 0
>>>>   GC time elapsed (ms):               7667781 / 1607188 / 9274969
>>>>   Input split bytes:                  52548 / 0 / 52548
>>>>   Map input records:                  861890673 / 0 / 861890673
>>>>   Map output bytes:                   1488284643774 / 0 / 1488284643774
>>>>   Map output materialized bytes:      1515865164102 / 0 / 1515865164102
>>>>   Map output records:                 13790250768 / 0 / 13790250768
>>>>   Merged Map outputs:                 0 / 3132 / 3132
>>>>   Physical memory (bytes) snapshot:   192242380800 / 4546826240 / 196789207040
>>>>   Reduce input groups:                0 / 861890673 / 861890673
>>>>   Reduce input records:               0 / 13790250768 / 13790250768
>>>>   Reduce output records:              0 / 13790250768 / 13790250768
>>>>   Reduce shuffle bytes:               0 / 1515865164102 / 1515865164102
>>>>   Shuffled Maps:                      0 / 3132 / 3132
>>>>   Spilled Records:                    27580501536 / 23694179168 / 51274680704
>>>>   Total committed heap usage (bytes): 186401685504 / 3023044608 / 189424730112
>>>>   Virtual memory (bytes) snapshot:    537370951680 / 19158048768 / 556529000448
>>>>
>>>> Phoenix MapReduce Import
>>>>   Upserts Done: 861890673 / 0 / 861890673
>>>>
>>>> Shuffle Errors
>>>>   BAD_ID:       0 / 0 / 0
>>>>   CONNECTION:   0 / 0 / 0
>>>>   IO_ERROR:     0 / 0 / 0
>>>>   WRONG_LENGTH: 0 / 0 / 0
>>>>   WRONG_MAP:    0 / 0 / 0
>>>>   WRONG_REDUCE: 0 / 0 / 0
>>>>
>>>> File Input Format Counters
>>>>   Bytes Read: 186395934997 / 0 / 186395934997
>>>>
>>>> File Output Format Counters
>>>>   Bytes Written: 0 / 363027342839 / 363027342839
>>>>
>>>> On 16 September 2015 at 11:46, Gabriel Reid <gabriel.reid@gmail.com>
>>>> wrote:
>>>>
>>>>> Can you view (and post) the job counters values from the import job?
>>>>> These should be visible in the job history server.
>>>>>
>>>>> Also, did you see the import tool exit successfully (in the terminal
>>>>> where you started it?)
>>>>>
>>>>> - Gabriel
>>>>>
>>>>> On Wed, Sep 16, 2015 at 6:24 PM, Gaurav Kanade <
>>>>> gaurav.kanade@gmail.com> wrote:
>>>>> > Hi guys
>>>>> >
>>>>> > I was able to get this to work after using bigger VMs for data nodes;
>>>>> > however now the bigger problem I am facing is after my MR job completes
>>>>> > successfully I am not seeing any rows loaded in my table (count shows 0
>>>>> > both via phoenix and hbase)
>>>>> >
>>>>> > Am I missing something simple ?
>>>>> >
>>>>> > Thanks
>>>>> > Gaurav
>>>>> >
>>>>> >
>>>>> > On 12 September 2015 at 11:16, Gabriel Reid <gabriel.reid@gmail.com>
>>>>> > wrote:
>>>>> >>
>>>>> >> Around 1400 mappers sounds about normal to me -- I assume your block
>>>>> >> size on HDFS is 128 MB, which works out to 1500 mappers for 200 GB of
>>>>> >> input.
>>>>> >>
>>>>> >> To add to what Krishna asked, can you be a bit more specific on what
>>>>> >> you're seeing (in log files or elsewhere) which leads you to believe
>>>>> >> the data nodes are running out of capacity? Are map tasks failing?
>>>>> >>
>>>>> >> If this is indeed a capacity issue, one thing you should ensure is
>>>>> >> that map output compression is enabled. This doc from Cloudera
>>>>> >> explains this (and the same information applies whether you're using
>>>>> >> CDH or not) -
>>>>> >> http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_23_3.html
>>>>> >>
>>>>> >> In any case, apart from that there isn't any basic thing that you're
>>>>> >> probably missing, so any additional information that you can supply
>>>>> >> about what you're running into would be useful.
>>>>> >>
>>>>> >> - Gabriel
>>>>> >>
>>>>> >>
>>>>> >> On Sat, Sep 12, 2015 at 2:17 AM, Krishna <research800@gmail.com>
>>>>> >> wrote:
>>>>> >> > 1400 mappers on 9 nodes is about 155 mappers per datanode which
>>>>> >> > sounds high to me. There are very few specifics in your mail. Are
>>>>> >> > you using YARN? Can you provide details like table structure, # of
>>>>> >> > rows & columns, etc. Do you have an error stack?
>>>>> >> >
>>>>> >> >
>>>>> >> > On Friday, September 11, 2015, Gaurav Kanade <gaurav.kanade@gmail.com>
>>>>> >> > wrote:
>>>>> >> >>
>>>>> >> >> Hi All
>>>>> >> >>
>>>>> >> >> I am new to Apache Phoenix (and relatively new to MR in general)
>>>>> >> >> but I am trying a bulk insert of a 200GB tab-separated file into an
>>>>> >> >> HBase table. This seems to start off fine and kicks off about ~1400
>>>>> >> >> mappers and 9 reducers (I have 9 data nodes in my setup).
>>>>> >> >>
>>>>> >> >> At some point I seem to be running into problems with this process
>>>>> >> >> as it seems the data nodes run out of capacity (from what I can see
>>>>> >> >> my data nodes have 400GB local space). It does seem that certain
>>>>> >> >> reducers eat up most of the capacity on these - thus slowing down
>>>>> >> >> the process to a crawl and ultimately leading to Node Managers
>>>>> >> >> complaining that Node Health is bad (log-dirs and local-dirs are
>>>>> >> >> bad).
>>>>> >> >>
>>>>> >> >> Is there some inherent setting I am missing that I need to set up
>>>>> >> >> for the particular job?
>>>>> >> >>
>>>>> >> >> Any pointers would be appreciated
>>>>> >> >>
>>>>> >> >> Thanks
>>>>> >> >>
>>>>> >> >> --
>>>>> >> >> Gaurav Kanade,
>>>>> >> >> Software Engineer
>>>>> >> >> Big Data
>>>>> >> >> Cloud and Enterprise Division
>>>>> >> >> Microsoft
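
The "log-dirs and local-dirs are bad" message quoted above is YARN's
NodeManager disk health checker reacting to the local disks filling up. A
hedged sketch for inspecting the relevant threshold - the yarn-site.xml path
is an assumption for a typical install, and the property's stock default is
roughly 90% disk utilization:

    # NodeManagers mark local-dirs/log-dirs as bad once disk usage on them
    # crosses this percentage; raising it only delays the problem if the
    # intermediate data genuinely doesn't fit on the local disks
    grep -B1 -A2 \
        'yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage' \
        /etc/hadoop/conf/yarn-site.xml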
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Gaurav Kanade,
>>>>> > Software Engineer
>>>>> > Big Data
>>>>> > Cloud and Enterprise Division
>>>>> > Microsoft
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Gaurav Kanade,
>>>> Software Engineer
>>>> Big Data
>>>> Cloud and Enterprise Division
>>>> Microsoft
>>>>
>>>
>>>
>>
>>
>> --
>> Gaurav Kanade,
>> Software Engineer
>> Big Data
>> Cloud and Enterprise Division
>> Microsoft
>>
>
>
>
> --
> Gaurav Kanade,
> Software Engineer
> Big Data
> Cloud and Enterprise Division
> Microsoft
>
