flume-user mailing list archives

From Matt Fair <matt.f...@gmail.com>
Subject Re: multiple flume clients and memory
Date Sun, 29 Mar 2015 15:47:25 GMT
Thank you very much!  The suggestion to use VisualVM was very useful for
exploring Java resource usage, which was the real issue I was running into.
I re-architected my code to run as many threads instead of many separate
Java processes.  That alleviated all of my memory issues, which I suspect
were really just the overhead of each separate Java process, not the Flume
client code.

Thanks again!
Matt
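
A minimal sketch of the threads-instead-of-processes layout described above.
The message does not say whether the threads share one client or each hold
their own, so sharing a single sender is an assumption here. EventSender and
CountingSender are hypothetical stand-ins, not Flume classes; a real
application would put its Flume RPC client behind that interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedClientSketch {

    /** Hypothetical stand-in for whatever delivers events; not a Flume API. */
    interface EventSender {
        void sendBatch(List<String> batch);
    }

    /** Counts events instead of sending them, so the sketch runs offline. */
    static class CountingSender implements EventSender {
        private int sent;

        // Assumption: access to the shared sender is serialized with a lock.
        @Override
        public synchronized void sendBatch(List<String> batch) {
            sent += batch.size();
        }

        synchronized int sentCount() {
            return sent;
        }
    }

    /** Runs the workers as threads in this one JVM and returns events sent. */
    static int run(int workers, int eventsPerWorker) {
        CountingSender sender = new CountingSender();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            final int id = w;
            pool.submit(() -> {
                // Each worker builds its own batch, then hands it to the shared sender.
                List<String> batch = new ArrayList<>();
                for (int i = 0; i < eventsPerWorker; i++) {
                    batch.add("worker-" + id + "-event-" + i);
                }
                sender.sendBatch(batch);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sender.sentCount();
    }

    public static void main(String[] args) {
        System.out.println("events sent: " + run(4, 25));
    }
}
```

All workers live in one JVM here, so the fixed per-process cost of a JVM is
paid once rather than once per worker.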


On Wed, Mar 25, 2015 at 11:18 PM, Ashish <paliwalashish@gmail.com> wrote:

> Do all these clients have memory usage in the same range? If yes, then
> taking a heap dump would reveal what is consuming memory.
>
> As Hari said, the batch is kept in-memory, meaning Event size would
> matter. Here is what I would do to debug this:
>
> 1. Check the memory usage of all clients.
> 2. If they are in the same range, use VisualVM to take a heap dump of
> any one of the processes; otherwise take heap dumps of a few processes
> (max usage, min usage, etc.).
> 3. Use Eclipse MAT or another tool to see what's consuming the memory.
>
> You can also try tweaking the batch size to see if it makes any
> difference in memory usage.
>
> On Thu, Mar 26, 2015 at 8:33 AM, Matt Fair <matt.fair@gmail.com> wrote:
> > I have seen it on machines with 16 GB and 60 GB of memory, running
> > about 40 clients and ~4k clients respectively, in both cases using up
> > 100% of memory.  If I run without the Flume client I have no memory
> > problems, but when I instantiate a Flume RPCClient, I run into memory
> > problems.
> >
> > Thanks,
> > Matt
> >
> > On Wed, Mar 25, 2015 at 6:42 PM, Hari Shreedharan
> > <hshreedharan@cloudera.com> wrote:
> >>
> >> How much memory are you talking about? The RPC client will hold on to
> >> the batch of events you sent, plus some additional threading overhead.
> >> Under the hood, it uses a Netty client, which should not really have a
> >> big memory footprint.
> >>
> >> Thanks,
> >> Hari
> >>
> >>
> >> On Wed, Mar 25, 2015 at 3:27 PM, Matt Fair <matt.fair@gmail.com> wrote:
> >>>
> >>> I have an application that launches a bunch of processes (40+) on the
> >>> same machine, each of which connects to Flume using the default Flume
> >>> RPCClient.  However, I have noticed that each RPCClient takes up a
> >>> decent amount of memory, and when you create as many clients as I am,
> >>> it adds up to a lot of memory.  One thought I had to avoid creating all
> >>> of the clients was to create only a single RPCClient and then have my
> >>> other processes connect to it via a socket, but that seems a little
> >>> redundant since that is what the RPCClient is supposed to do anyway.
> >>> Have others found themselves in this same situation?  Is there a way to
> >>> handle memory more efficiently, or is there another RPCClient
> >>> implementation that doesn't take up as much memory?
> >>>
> >>> Thanks,
> >>> Matt
> >>
> >>
> >
>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>
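
The first debugging step in the quoted message, comparing memory usage
across clients, can be approximated from inside each JVM with the standard
java.lang.management API. This is only a rough sketch: the class name
HeapUsageSketch is hypothetical, and the figures it reports cover the heap
and non-heap pools the JVM tracks, not the full resident size of the
process. A heap dump taken with VisualVM remains the way to see which
objects hold the memory.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapUsageSketch {

    /** Returns a one-line summary of this JVM's memory usage in MiB. */
    static String summary() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        long mib = 1024L * 1024L;
        return "heap used=" + (heap.getUsed() / mib) + " MiB"
                + ", heap committed=" + (heap.getCommitted() / mib) + " MiB"
                + ", non-heap used=" + (nonHeap.getUsed() / mib) + " MiB";
    }

    public static void main(String[] args) {
        // Log this from each client process and compare the lines.
        System.out.println(summary());
    }
}
```

If every client reports similar numbers, one heap dump is probably
representative; if they differ widely, dumps from the highest and lowest
are the ones worth comparing.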
