flume-user mailing list archives

From "Cochran, David M (Contractor)" <David.Coch...@bsee.gov>
Subject RE: splitting functions
Date Wed, 12 Sep 2012 20:12:56 GMT
Putting a copy of hadoop-core.jar in the lib directory did the trick... at
least it made the errors go away.

Now I'm trying to sort out why nothing is getting written to the sink's
files: when I add entries to the file being tailed, nothing makes it to the
sink's log file(s). I guess I need to run tcpdump on that port and see
whether anything is being sent, or if the problem is on the receive side now.
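
Something like this is what I have in mind, as a rough sketch (port 9432
comes from the config below; since both agents are on one box, the loopback
interface should be the one to watch):

  # is the avro source actually listening?
  netstat -tlnp | grep 9432

  # watch for traffic while appending lines to the tailed file
  tcpdump -i lo -nn -A port 9432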

Thanks for the help!
Dave



-----Original Message-----
From: Brock Noland [mailto:brock@cloudera.com]
Sent: Wed 9/12/2012 12:41 PM
To: user@flume.apache.org
Subject: Re: splitting functions
 
Yeah, that is my fault. FileChannel uses a few Hadoop classes for
serialization. I want to get rid of that, but it's just not a priority
item. You either need the hadoop command on your PATH or
hadoop-core.jar in your lib directory.
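
For example, something along these lines (the paths here are just
examples; adjust them to your install):

  # option 1: make the hadoop command visible to the flume-ng script
  export PATH=$PATH:/usr/lib/hadoop/bin

  # option 2: copy the jar into Flume's lib directory
  cp /usr/lib/hadoop/hadoop-core-*.jar /path/to/apache-flume-1.3.0-SNAPSHOT/lib/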

On Wed, Sep 12, 2012 at 1:38 PM, Cochran, David M (Contractor)
<David.Cochran@bsee.gov> wrote:
> Brock,
>
> Thanks for the sample!  Starting to see a bit more light, and it's making
> a little more sense now...
>
> If you wouldn't mind and have a couple of minutes to spare: I'm getting
> this error and am not sure how to make it go away. I can't use Hadoop for
> storage, just FILE_ROLL (ultimately the logs will need to be processed
> further in plain text). I'm just not sure why this happens....
>
> The error follows and my conf further down.
>
> 12 Sep 2012 13:18:54,120 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:211) - Starting FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] }...
> 12 Sep 2012 13:18:54,124 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:234) - Failed to start the file channel [channel=fileChannel]
> java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
>         at java.lang.ClassLoader.defineClass1(Native Method)
>         at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>         at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>         at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>         at org.apache.flume.channel.file.Log$Builder.build(Log.java:144)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:223)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>         ... 24 more
> 12 Sep 2012 13:18:54,126 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:238) - Unable to start FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] } - Exception follows.
> java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
>         at java.lang.ClassLoader.defineClass1(Native Method)
>         at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
>         at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
>         at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>         at org.apache.flume.channel.file.Log$Builder.build(Log.java:144)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:223)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>         ... 24 more
> 12 Sep 2012 13:18:54,127 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.stop:249) - Stopping FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] }...
> 12 Sep 2012 13:18:54,127 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:249) - Unsuccessful attempt to shutdown component: {} due to missing dependencies. Please shutdown the agent or disable this component, or the agent will be in an undefined state.
> java.lang.IllegalStateException: Channel closed[channel=fileChannel]
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>         at org.apache.flume.channel.file.FileChannel.getDepth(FileChannel.java:282)
>         at org.apache.flume.channel.file.FileChannel.stop(FileChannel.java:250)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:244)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> 12 Sep 2012 13:18:54,622 INFO  [conf-file-poller-0] (org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents:141) - Starting Sink filesink1
> 12 Sep 2012 13:18:54,624 INFO  [conf-file-poller-0] (org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents:152) - Starting Source avroSource
> 12 Sep 2012 13:18:54,626 INFO  [lifecycleSupervisor-1-1] (org.apache.flume.source.AvroSource.start:138) - Starting Avro source avroSource: { bindAddress: 0.0.0.0, port: 9432 }...
> 12 Sep 2012 13:18:54,641 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=fileChannel]
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>         at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:267)
>         at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:118)
>         at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:172)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
>
>
>
>
> Using your config, this is my starting point... (trying to get it
> functioning on a single host first)
>
> node105.sources = tailsource
> node105.channels = fileChannel
> node105.sinks = avroSink
>
> node105.sources.tailsource.type = exec
> node105.sources.tailsource.command = tail -F /root/Desktop/apache-flume-1.3.0-SNAPSHOT/test.log
> #node105.sources.stressSource.batchSize = 1000
> node105.sources.tailsource.channels = fileChannel
>
> ## Sink sends avro messages to localhost port 9432
> node105.sinks.avroSink.type = avro
> node105.sinks.avroSink.batch-size = 1000
> node105.sinks.avroSink.channel = fileChannel
> node105.sinks.avroSink.hostname = localhost
> node105.sinks.avroSink.port = 9432
>
> node105.channels.fileChannel.type = file
> node105.channels.fileChannel.checkpointDir = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/tmp/flume/checkpoint
> node105.channels.fileChannel.dataDirs = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/tmp/flume/tmp/flume/data
> node105.channels.fileChannel.capacity = 10000
> node105.channels.fileChannel.checkpointInterval = 3000
> node105.channels.fileChannel.maxFileSize = 5242880
>
> node102.sources = avroSource
> node102.channels = fileChannel
> node102.sinks = filesink1
>
> ## Source listens for avro messages on port 9432 on all ips
> node102.sources.avroSource.type = avro
> node102.sources.avroSource.channels = fileChannel
> node102.sources.avroSource.bind = 0.0.0.0
> node102.sources.avroSource.port = 9432
>
> node102.sinks.filesink1.type = FILE_ROLL
> node102.sinks.filesink1.batchSize = 1000
> node102.sinks.filesink1.channel = fileChannel
> node102.sinks.filesink1.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/rhel5/
> node102.channels.fileChannel.type = file
> node102.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
> node102.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
> node102.channels.fileChannel.capacity = 5000
> node102.channels.fileChannel.checkpointInterval = 45000
> node102.channels.fileChannel.maxFileSize = 5242880
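>
> Since both agent names (node105 and node102) live in this one file, I'm
> starting them as two separate processes, roughly like this (the conf
> file name is just whatever yours is called; receiver first):
>
>   bin/flume-ng agent --conf conf --conf-file single-host.conf --name node102
>   bin/flume-ng agent --conf conf --conf-file single-host.conf --name node105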
>
>
>
> Thanks!
> Dave
>
>
> -----Original Message-----
> From: Brock Noland [mailto:brock@cloudera.com]
> Sent: Wed 9/12/2012 9:11 AM
> To: user@flume.apache.org
> Subject: Re: splitting functions
>
> Hi,
>
> Below is a config I use to test out the FileChannel. See the comments
> "##" for how messages are sent from one host to another.
>
> node105.sources = stressSource
> node105.channels = fileChannel
> node105.sinks = avroSink
>
> node105.sources.stressSource.type = org.apache.flume.source.StressSource
> node105.sources.stressSource.batchSize = 1000
> node105.sources.stressSource.channels = fileChannel
>
> ## Sink sends avro messages to node102.bashkew.com port 9432
> node105.sinks.avroSink.type = avro
> node105.sinks.avroSink.batch-size = 1000
> node105.sinks.avroSink.channel = fileChannel
> node105.sinks.avroSink.hostname = node102.bashkew.com
> node105.sinks.avroSink.port = 9432
>
> node105.channels.fileChannel.type = file
> node105.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
> node105.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
> node105.channels.fileChannel.capacity = 10000
> node105.channels.fileChannel.checkpointInterval = 3000
> node105.channels.fileChannel.maxFileSize = 5242880
>
> node102.sources = avroSource
> node102.channels = fileChannel
> node102.sinks = nullSink
>
>
> ## Source listens for avro messages on port 9432 on all ips
> node102.sources.avroSource.type = avro
> node102.sources.avroSource.channels = fileChannel
> node102.sources.avroSource.bind = 0.0.0.0
> node102.sources.avroSource.port = 9432
>
> node102.sinks.nullSink.type = null
> node102.sinks.nullSink.batchSize = 1000
> node102.sinks.nullSink.channel = fileChannel
>
> node102.channels.fileChannel.type = file
> node102.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
> node102.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
> node102.channels.fileChannel.capacity = 5000
> node102.channels.fileChannel.checkpointInterval = 45000
> node102.channels.fileChannel.maxFileSize = 5242880
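>
> To sanity-check the avro leg by itself, you can also fire a test event
> at the source with the avro client (the file path is just an example):
>
>   echo "test event" > /tmp/event.txt
>   bin/flume-ng avro-client --host node102.bashkew.com --port 9432 --filename /tmp/event.txt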
>
>
>
> On Wed, Sep 12, 2012 at 10:06 AM, Cochran, David M (Contractor)
> <David.Cochran@bsee.gov> wrote:
>> Okay folks, after spending the better part of a week reading the docs and
>> experimenting, I'm lost.  I have Flume 1.3.x working pretty much as expected
>> on a single host.  It tails a log file and writes it to another rolling log
>> file via Flume.  No problem there; it seems to work flawlessly.  Where I run
>> into trouble is splitting the functions across multiple hosts: a single
>> host listening for others to send their logs to.  All of my efforts have
>> resulted in little more than headaches.
>>
>> I can't even see the specified port open on what should be the logging host.
>> I've tried the basic examples posted in various docs but can't seem to get
>> things working across multiple hosts.
>>
>> Would someone post a working example of the confs needed to get me started?
>> Something simple that works, so I can then pick it apart to gain more
>> understanding.  Apparently, I just don't have a firm enough grasp on all
>> the pieces yet, but I want to learn!
>>
>> Thanks in advance!
>> Dave
>>
>>
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/

