flume-user mailing list archives

From Siddharth Tiwari <siddharth.tiw...@live.com>
Subject RE: Flume not moving data to HDFS or local
Date Fri, 01 Nov 2013 02:05:01 GMT
Can you describe the process to set up a spooling directory source? I am sorry, I do not know
how to do that. If you can give me a step-by-step description of how to configure it,
and the configuration changes I need to make in my conf to get it done, I will be really thankful.
Appreciate your help :)
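
For reference, a minimal spooling directory source configuration looks something like the sketch below; the agent name, spool directory, and channel are illustrative placeholders, not settings from this thread:

agent.sources = spoolSrc
agent.channels = memoryChannel
agent.sources.spoolSrc.type = spooldir
# Directory your cron job drops completed files into; files must be fully
# written before they land here and must not be modified afterwards
agent.sources.spoolSrc.spoolDir = /var/spool/flume
agent.sources.spoolSrc.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 1000

Once a file is ingested, the source renames it with a .COMPLETED suffix, which is why the checkpoint-and-drop cron approach Paul describes below works well.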

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God.” 

"Maybe other people will try to limit me but I don't limit myself"


From: pchavez@verticalsearchworks.com
To: user@flume.apache.org
Date: Thu, 31 Oct 2013 14:38:54 -0700
Subject: RE: Flume not moving data to HDFS or local

It should commit when one of the various file-roll configuration values is hit; there's
a list of them and their defaults in the Flume user guide. For managing new files on your
app servers, the best option right now seems to be a spooling directory source along with
some kind of cron job that runs locally on the app servers to drop files into the spool directory
when ready. In my case I run a job that executes a custom script to checkpoint a file that
is appended to all day long, creating incremental files every minute to drop into the spool directory.
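
For example, the roll triggers on the HadoopOut sink defined in the config further down, shown here with the user guide's default values (0 disables a trigger):

# Whichever trigger fires first closes the file and drops the .tmp suffix
collector.sinks.HadoopOut.hdfs.rollInterval = 30
collector.sinks.HadoopOut.hdfs.rollSize = 1024
collector.sinks.HadoopOut.hdfs.rollCount = 10

rollInterval is in seconds, rollSize in bytes, and rollCount in events.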

From: Siddharth Tiwari [mailto:siddharth.tiwari@live.com]
Sent: Thursday, October 31, 2013 12:47 PM
To: user@flume.apache.org
Subject: RE: Flume not moving data to HDFS or local 
It got resolved; it was due to the wrong version of the guava jar file in the Flume lib. But I still
see a .tmp extension on the file in HDFS; when does it actually get committed? :) One
other question though: what should I change in my configuration file to capture new files
being generated in a directory on a remote machine? Say, for example, one new file is generated
every hour in my webserver hostlog directory. What do I change in my configuration so that
I get the new file directly in my HDFS, compressed?
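
A sketch of the compression settings in question, applied to the HadoopOut sink from the original config below (gzip is one of the codecs the HDFS sink's hdfs.codeC option accepts, per the Flume user guide):

# Switch the sink from plain text streaming to compressed output
collector.sinks.HadoopOut.hdfs.fileType = CompressedStream
collector.sinks.HadoopOut.hdfs.codeC = gzip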


From: siddharth.tiwari@live.com
To: user@flume.apache.org
Subject: RE: Flume not moving data to HDFS or local
Date: Thu, 31 Oct 2013 19:29:36 +0000

Hi Paul, I see the following error:

13/10/31 12:27:01 ERROR hdfs.HDFSEventSink: process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
        at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:45)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:490)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:445)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2429)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2463)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2445)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:363)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:165)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:347)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:275)
        at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:186)
        at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
        at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
        at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
        at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
        at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
        [identical stack trace to the one above]


From: pchavez@verticalsearchworks.com
To: user@flume.apache.org
Date: Thu, 31 Oct 2013 12:19:42 -0700
Subject: RE: Flume not moving data to HDFS or local

Try bumping your memory channel capacities up; they are currently the same as the batch size.
I would go to at least 1000 on each memory channel. Also, what do the logs and metrics show?
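
Applied to the collector config quoted below, that would be something like:

collector.channels.mc1.capacity = 1000
collector.channels.mc2.capacity = 1000

(The memory channel's transactionCapacity, which defaults to 100, must stay at or below capacity.)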

From: Siddharth Tiwari [mailto:siddharth.tiwari@live.com]

Sent: Thursday, October 31, 2013 11:53 AM
To: user@flume.apache.org
Subject: Flume not moving data to HDFS or local

Hi team, I created a Flume source and sink as follows on Hadoop YARN, and I am not getting data
transferred from source to sink. In HDFS it doesn't create any file, and on local, every time I
start the agent it creates one empty file. Below are my configs for source and sink.

Source:

agent.sources = logger1
agent.sources.logger1.type = exec
agent.sources.logger1.command = tail -f /var/log/messages
agent.sources.logger1.batchSize = 0
agent.sources.logger1.channels = memoryChannel
agent.channels = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
agent.sinks = AvroSink
agent.sinks.AvroSink.type = avro
agent.sinks.AvroSink.channel = memoryChannel
agent.sinks.AvroSink.hostname = 192.168.147.101
agent.sinks.AvroSink.port = 4545
agent.sources.logger1.interceptors = itime ihost
agent.sources.logger1.interceptors.itime.type = TimestampInterceptor
agent.sources.logger1.interceptors.ihost.type = host
agent.sources.logger1.interceptors.ihost.useIP = false
agent.sources.logger1.interceptors.ihost.hostHeader = host

Sink at one of the slaves (datanodes on my YARN cluster):

collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = 0.0.0.0
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2
collector.channels = mc1 mc2
collector.channels.mc1.type = memory
collector.channels.mc1.capacity = 100
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 100
collector.sinks = LocalOut HadoopOut
collector.sinks.LocalOut.type = file_roll
collector.sinks.LocalOut.sink.directory = /home/hadoop/flume
collector.sinks.LocalOut.sink.rollInterval = 0
collector.sinks.LocalOut.channel = mc1
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.channel = mc2
collector.sinks.HadoopOut.hdfs.path = /flume
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.writeFormat = Text
collector.sinks.HadoopOut.hdfs.rollSize = 0
collector.sinks.HadoopOut.hdfs.rollCount = 10000
collector.sinks.HadoopOut.hdfs.rollInterval = 600

Can somebody point me to what I am doing wrong? This is what I get in my local directory:

[hadoop@node1 flume]$ ls -lrt
total 0
-rw-rw-r-- 1 hadoop hadoop 0 Oct 31 11:25 1383243942803-1
-rw-rw-r-- 1 hadoop hadoop 0 Oct 31 11:28 1383244097923-1
-rw-rw-r-- 1 hadoop hadoop 0 Oct 31 11:31 1383244302225-1
-rw-rw-r-- 1 hadoop hadoop 0 Oct 31 11:33 1383244404929-1

When I restart the collector it creates one 0-byte file. Please help.
