flume-user mailing list archives

From "Alexander C.H. Lorenz" <wget.n...@googlemail.com>
Subject Re: Flume set up help
Date Thu, 03 Nov 2011 07:55:17 GMT
Hi Subbu,

which version of Flume are you using? Does the master node report any
errors? What does dmesg say on the nodes? First I would check all servers
for errors; the dmesg output is usually helpful. Are the processes up and
running, or did they die?
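
For example, a quick check on each agent/collector box could look like
this (the Flume log path below is only a guess for your install):

```shell
# Quick health check, run on every node in the flow.
dmesg | tail -n 50 || true              # kernel messages: OOM kills, disk/NIC errors
ps aux | grep -i '[f]lume' || true      # are the Flume node processes still alive?
tail -n 100 /var/log/flume/*.log 2>/dev/null || true  # assumed log path; adjust as needed
```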

best,
 Alex

On Wed, Nov 2, 2011 at 9:55 PM, Subramanyam Satyanarayana <
subbu@attributor.com> wrote:

>
> Hi,
>      We are trying to set up flume in our production for data transfer
> between hosts. We had an implementation of one agent node running 5 logical
> nodes & talking to 5 logical collectors running on 2 boxes (one of them
> being a custom HBase sink) using the flow isolation mechanism. What we
> noticed is that it runs fine for the first couple of hours & then data just
> stops flowing for unknown reasons. There are no particular symptoms in the
> log files OR the jstacks.
>
> We need help figuring out how to debug this & find the root cause. We are
> not sure which of the following is the bottleneck choking the system:
> a) Flow isolation b) Fan out sinks c) Batching at collector d) Multi
> tailing
>
> Here is the original set up info
> ===============================================
> #Configure the 5 agents
> exec  config 'uiagent' 'uiflow'
> 'tail("/usr/local/stow/tomcat/logs/ui/uievent.log")' 'agentDFOSink'
> exec  config 'newbookagent' 'newbookflow'
> 'tail("/usr/local/stow/tomcat/logs/newbook/newbookevent.log")'
> 'agentDFOSink'
> exec  config 'friendagent' 'friendflow'
> 'tail("/usr/local/stow/tomcat/logs/friend/friendevent.log")' 'agentDFOSink'
> exec  config 'systemagent' 'systemflow'
> 'tail("/usr/local/stow/tomcat/logs/system/systemevent.log")' 'agentDFOSink'
> exec  config 'feedagent' 'feedflow'
> 'tail("/usr/local/stow/tomcat/logs/feed/feedevent.log")' 'agentDFOSink'
>
> #Configure the 5 collectors
> exec config 'uicollector' 'uiflow' 'autoCollectorSource'
> 'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/ui/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_uievent/tempinput","%{host}-%{tailSrcFile}.log")]}'
> exec config 'newbookcollector' 'newbookflow' 'autoCollectorSource'
> 'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/newbook/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_newbookevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
> exec config 'systemcollector' 'systemflow' 'autoCollectorSource'
> 'collector(300000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/system/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_systemevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
> exec config 'feedcollector' 'feedflow' 'autoCollectorSource'
> 'collector(60000){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/feed/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_feedevent/tempinput","%{host}-%{tailSrcFile}.log")]}'
> exec config 'friendcollector' 'friendflow' 'autoCollectorSource'
> 'collector(3){[collectorSink("hdfs://prod-namenode-001:54310/user/argus/events/friend/inputarchive/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_friendevent/tempinput","%{host}-%{tailSrcFile}.log"),friends2hbase("friends_list","friendslist")]}'
>
> #Mappings
> exec spawn 'agent' 'uiagent'
> exec spawn 'agent' 'systemagent'
> exec spawn 'agent' 'feedagent'
> exec spawn 'agent' 'friendagent'
> exec spawn 'agent' 'newbookagent'
>
> exec spawn 'collector2' 'uicollector'
> exec spawn 'collector2' 'systemcollector'
> exec spawn 'collector' 'feedcollector'
> exec spawn 'collector' 'friendcollector'
> exec spawn 'collector2' 'newbookcollector'
> ====================================================
>
>
> P.S : We also finally broke it down to a bare minimum (as shown below) of
> one agent talking to one collector & HDFS, & it still did not hold up over
> long hours of data flow.
>
> =====================================
> exec  config 'fastagent'
> 'multitail("/usr/local/stow/tomcat/logs/feed/feedevent.log","/usr/local/stow/tomcat/logs/friend/friendevent.log")'
> 'agentDFOSink("stage-event-001","35853")'
>
> exec config 'collector' 'collectorSource(35853)'
> 'collector(60000){[collectorSink("hdfs://stage-namenode-001:54310/user/argus/events/backup/%Y/%m/%d/","%{host}-%{tailSrcFile}.log"),collectorSink("file:///data/drivers/nile_allevents/tempinput","%{host}-%{tailSrcFile}.log")]}'
>
> ======================================
>
>
> Thanks!!
> ~Subbu
>
>
>
> --
> Thanks!!
> ~Subbu
>
>


-- 
Alexander Lorenz
http://mapredit.blogspot.com

Think of the environment: please don't print this email unless you
really need to.
