flume-user mailing list archives

From Ahmed Vila <av...@devlogic.eu>
Subject Re: Flume restart
Date Sun, 14 Dec 2014 23:14:00 GMT
Hi Bojan,

I'm hooked now, and I'm curious where you're from :)

I'm really stunned by your configuration. I've never worked with a cluster of
that size, but I would like to have a chance to do so.
What is your expected vs. real throughput, anyway?

52 GB is really high, and I think one agent shouldn't exceed 15 GB based on
the sheer maximum numbers you've set in your config.

I'm really not into the HDFS sink and Hadoop much; we moved away from it in
favor of a custom solution.
But I think that 102 sinks in total is quite a big deal for a single
endpoint. Plus, you have 20 threads for each sink.

I also see that you're using gzip compression, so it might help to turn it
off. Play with your config, like lowering the number of threads.
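
As a rough sketch of such an experiment, using the sink name from the quoted config: the fragment below switches one sink to uncompressed output and a smaller thread pool. The value 10 is only an illustrative starting point (it is, as far as I know, the documented default for hdfs.threadsPoolSize), not a measured recommendation.

```properties
# Test variant of one sink: no gzip, fewer HDFS I/O threads.
# DataStream writes the events uncompressed.
at.sinks.foo-sink.hdfs.fileType = DataStream
# With DataStream the codec and the .gz suffix no longer apply, so leave out:
#   at.sinks.foo-sink.hdfs.codeC = gzip
#   at.sinks.foo-sink.hdfs.fileSuffix = .gz
# Illustrative value, down from 20 in the original config.
at.sinks.foo-sink.hdfs.threadsPoolSize = 10
```

Changing one sink at a time and watching memory and throughput for a while should make it easier to tell which setting actually matters.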

Unrelated to that, I would suggest an SSD-backed FileChannel... A friend of
mine once told me: "Things fail more often than we think."
FileChannel will be able to recover the channel state if Flume fails,
and will also survive a channel shutdown.
Right now, each time you trigger a config reload you actually trash whatever
was in those channels. Judging by the size of your channels, that's quite a
number of events.
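
A minimal sketch of what the channel definition could look like with a file channel, reusing the channel name from the quoted config. The directory paths are hypothetical placeholders for an SSD mount and are not taken from the setup in this thread.

```properties
# File-backed replacement for the memory channel.
# The paths below are assumptions; point them at the SSD mount.
at.channels.foo-channel.type = file
at.channels.foo-channel.checkpointDir = /mnt/ssd/flume/foo-channel/checkpoint
at.channels.foo-channel.dataDirs = /mnt/ssd/flume/foo-channel/data
# Same event limits as the memory channel in the original config.
at.channels.foo-channel.capacity = 100000
at.channels.foo-channel.transactionCapacity = 10000
```

With 17 channels per agent, each one would need its own checkpointDir and dataDirs, since file channels cannot share directories.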



On Sat, Dec 13, 2014 at 10:50 AM, Bojan Kostić <blood9raven@gmail.com>
wrote:

> Hi (Zdravo) Ahmed and Otis. :)
> Thank you for your time.
> Sorry for the late response, but I was testing and searching for clues in
> the code, and I was upgrading Flume to the newest version.
> Yes, touching the conf file triggers a restart, but for some reason the RES
> usage of Flume stays the same. I see this using the top command on the server.
> When I start the agent it is less than 1 GB, and in one week it grows to 52 GB.
>
> I have 6 Flume agents on 6 different servers. Every agent has 17 channels,
> 17 sinks, and 1 source.
>
> Here is a sample of the config:
> at.sinks.foo-sink.type = hdfs
> at.sinks.foo-sink.hdfs.path = hdfs://192.168.2.27/%{path}/%Y-%m-%d
> at.sinks.foo-sink.hdfs.filePrefix = %{filename}
> at.sinks.foo-sink.hdfs.fileSuffix = .gz
> at.sinks.foo-sink.hdfs.inUseSuffix = .tmp
> at.sinks.foo-sink.hdfs.writeFormat = Text
> at.sinks.foo-sink.hdfs.rollCount = 0
> at.sinks.foo-sink.hdfs.batchSize = 10000
> at.sinks.foo-sink.hdfs.rollSize = 125829120
> at.sinks.foo-sink.hdfs.rollInterval = 300
> at.sinks.foo-sink.hdfs.callTimeout = 25000
> at.sinks.foo-sink.hdfs.threadsPoolSize = 20
> at.sinks.foo-sink.hdfs.fileType = CompressedStream
> at.sinks.foo-sink.hdfs.codeC = gzip
> at.sinks.foo-sink.hdfs.maxOpenFiles = 100
> at.sinks.foo-sink.hdfs.idleTimeout = 320
>
> at.channels.foo-channel.type = memory
> at.channels.foo-channel.capacity = 100000
> at.channels.foo-channel.transactionCapacity = 10000
> at.channels.foo-channel.byteCapacityBufferPercentage = 20
> at.channels.foo-channel.byteCapacity = 209715200
>
> Also, I have one custom interceptor which does encoding conversion to UTF.
>
> The size of one event is up to 2 KB.
>
> I will check with VisualVM or something similar, but I think I checked
> with those in the past and did not see anything suspicious.
>
> I noticed that sometimes a channel fills to 100% and can't drain. For that I
> created a "restart" mechanism with JMX and bash: when that occurs, it touches
> the config and the agent gets back to a normal state. This rarely happens,
> but it still works.
> I searched the web, and some people blame Hadoop and HDFS. If this helps, I
> run Hadoop 2.2.0 (I plan to update it).
>
> Also, I must add that I don't have any balancing tiers for the Flume nodes
> (two/three tier architecture); all traffic comes directly to these 6 nodes,
> which write to HDFS. In the near future I plan to add Kafka for Flume and for
> some other reasons, and that should address the balancing issues and the
> 100% channel fill.
>
> Again, thanks for your time, and it is nice to see that people from this part
> of the world use this type of technology. I am not alone. :)
>
> Best regards
> Bojan
>
> On Fri, Dec 5, 2014 at 5:15 PM, Otis Gospodnetic <
> otis.gospodnetic@gmail.com> wrote:
>>
>> I have to join this thread for .... obvious reasons ;)
>>
>> Bojan - you'll get better help if you share your configs.  If you are
>> monitoring Flume with something, sharing various metrics/charts will help,
>> too.  You could also run Flume under a Java profiler and see what's eating
>> your heap.
>>
>> Otis
>> --
>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>>
>> On Thu, Dec 4, 2014 at 4:40 PM, Ahmed Vila <avila@devlogic.eu> wrote:
>>
>>> Zdravo Bojane :)
>>>
>>> Flume is well designed and it shouldn't eat up the memory. On the other
>>> hand, misconfiguration can effectively bring a server to a crawl and
>>> eventually produce event loss.
>>> Pasting your configuration in here, along with basic info on the hardware
>>> behind it and the size of a single event in bytes, would be helpful.
>>>
>>> The most common things to blame are an inappropriately large memory
>>> channel size for a given amount of memory, transaction size, HDFS sink
>>> batch size, etc., because all of them are stored in memory.
>>>
>>> Anyway, you can achieve a graceful restart by changing the modification
>>> time of Flume's configuration file - basically just touching it.
>>> It will sense the change and as a result close sources, sinks and
>>> channels, and start them again without the overhead of booting up the JVM.
>>> That should let the Java garbage collector clean up the resources
>>> associated with those closed instances of sources, sinks and channels.
>>> As a result, you might lose some events if you're using a memory channel,
>>> since I think it doesn't have a shutdown procedure. The sink should flush
>>> its batch to HDFS, but that also should be tested.
>>>
>>> Also, tweaking the Java GC could be of help, but I never had a need to do
>>> so with Flume.
>>>
>>>
>>> On Thu, Dec 4, 2014 at 9:18 PM, Bojan Kostić <blood9raven@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a problem with my Flume setup. Over time the agents just take too
>>>> much memory, and I need to restart them every now and then. I searched the
>>>> web and did not find any clue how to fix this. Some people blame HDFS...
>>>> For now I just kill the process with a TERM signal and then wait a couple
>>>> of minutes for it to shut down. Now I wish to do this automatically every
>>>> day, but I don't want to lose logs. Is there a way to do this? I checked the
>>>> flume-ng script and there is only start. I could write my own sh script
>>>> which would send a TERM signal, then check for the Flume process, and if
>>>> there is none, start it again. But first I want to check whether there is
>>>> some smarter way to do this.
>>>>
>>>> Best
>>>> Bojan
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Best regards,
>>> Ahmed Vila | Senior software developer
>>> DevLogic | Sarajevo | Bosnia and Herzegovina
>>>
>>> Office : +387 33 942 123
>>> Mobile: +387 62 139 348
>>>
>>> Website: www.devlogic.eu
>>> E-mail   : avila@devlogic.eu
>>> ---------------------------------------------------------------------
>>> This e-mail and any attachment is for authorised use by the intended
>>> recipient(s) only. This email contains confidential information. It should
>>> not be copied, disclosed to, retained or used by, any party other than the
>>> intended recipient. Any unauthorised distribution, dissemination or copying
>>> of this E-mail or its attachments, and/or any use of any information
>>> contained in them, is strictly prohibited and may be illegal. If you are
>>> not an intended recipient then please promptly delete this e-mail and any
>>> attachment and all copies and inform the sender directly via email. Any
>>> emails that you send to us may be monitored by systems or persons other
>>> than the named communicant for the purposes of ascertaining whether the
>>> communication complies with the law and company policies.
>>
>>
>>


-- 

Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina

Office : +387 33 942 123
Mobile: +387 62 139 348

Website: www.devlogic.eu
E-mail   : avila@devlogic.eu

