flume-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: spoolDir source problem
Date Fri, 12 Apr 2013 20:52:42 GMT
Paul,
Here is the part of the code that throws the exception.
It is in
flume-ng-core/src/main/java/org/apache/flume/serialization/DurablePositionTracker.java

    // On windows, things get messy with renames...
    // FIXME: This is not atomic. Consider implementing a recovery procedure
    // so that if it does not exist at startup, check for a rolled version
    // before creating a new file from scratch.
    if (PlatformDetect.isWindows()) {
      if (!trackerFile.delete()) {
        throw new IOException("Unable to delete existing meta file " +
            trackerFile);
      }
    }

I am not sure why the agent is unable to delete the file. Does the
agent have both read and write permission on those directories?
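
One way to narrow it down: File.delete() only returns true or false, so
the reason for the failure is lost. A minimal sketch of a standalone
check, assuming you can run it with Java 7 or later as the same user that
runs the agent, is below. The class and method names are made up for
illustration and are not part of Flume. Files.delete() throws an
exception that says why the delete failed; on Windows a file that another
process or stream still holds open is typically reported that way.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not Flume code: reports why a delete fails.
public class DeleteDiagnostic {

    // Tries to delete the file and returns a short description of the outcome.
    static String tryDelete(Path file) {
        try {
            Files.delete(file);
            return "deleted";
        } catch (NoSuchFileException e) {
            return "missing";
        } catch (AccessDeniedException e) {
            return "access denied: " + e.getMessage();
        } catch (IOException e) {
            // Any other I/O failure, for example a file that is still open elsewhere.
            return "failed: " + e;
        }
    }

    public static void main(String[] args) {
        // Default path is the meta file named in the exception from the log.
        String defaultPath = "c:\\flume_data\\spool\\web\\.flumespool\\.flumespool-main.meta";
        Path meta = Paths.get(args.length > 0 ? args[0] : defaultPath);
        System.out.println(meta + " -> " + tryDelete(meta));
    }
}
```

Note that this really deletes the file if it can, so stop the agent and
back up the spool directory before trying it.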


I am no expert, just making a guess.



On Sat, Apr 13, 2013 at 2:18 AM, Paul Chavez <
pchavez@verticalsearchworks.com> wrote:

> We already have a CentOS cluster running half a dozen Flume nodes. We've
> been feeding it production data for about 6 months and have been very
> pleased with it so far. We are just looking to get agents on our app
> servers to smooth out cluster upgrades.
> Thanks for your help,
> Paul
>
>  ------------------------------
> *From:* Israel Ekpo [mailto:israel@aicer.org]
> *Sent:* Friday, April 12, 2013 1:42 PM
>
> *To:* user@flume.apache.org
> *Subject:* Re: spoolDir source problem
>
> It might be a good idea to set up Ubuntu 12 on a virtual machine using
> VirtualBox and then set up your test environment there.
>
> This will give you some confidence that the setup works before you deploy
> it.
>
> I don't really use Windows for development, so unfortunately I am not able
> to help you troubleshoot this.
>
> On 12 April 2013 16:37, Paul Chavez <pchavez@verticalsearchworks.com> wrote:
>
>> 1. Flume 1.3.1 I believe, whatever is packaged with latest CDH
>> distribution.
>> 2. Windows Server 2008 R2
>> 3. The meta files are created by the flume agent, so it should have full
>> rights. I went through and recreated the spool directory with more
>> explicit permissions. It wasn't clear from the exception whether the issue
>> was with the meta files or the files I'm putting in the spool dir.
>> Unfortunately it didn't seem to have an effect: I recreated the directory
>> with full access for everyone and got the same issue.
>>
>> I'm ok with not having this functionality on Windows, just don't want to
>> waste time on a solution that won't work. My current solution uses the Avro
>> client to send files to a flume agent on our HDFS cluster running an avro
>> source. The main reason I want a local Windows agent is for the HTTP Source
>> which I've already been able to verify as working.
>>
>> Thanks,
>> Paul
>>
>>
>>  ------------------------------
>> *From:* Israel Ekpo [mailto:israel@aicer.org]
>> *Sent:* Friday, April 12, 2013 1:15 PM
>> *To:* user@flume.apache.org
>> *Subject:* Re: spoolDir source problem
>>
>>   Paul,
>>
>> I have the following questions:
>>
>> (1) What version of Flume are you using?
>>
>> (2) What version of Windows are you using?
>>
>> (3) Does the user running Flume have permissions to read/write in the
>> directories used for the spooling and channels?
>>
>>
>> This will help narrow down the reasons why this could be happening.
>>
>> Nevertheless, it looks like the issue you are encountering is platform
>> specific (Windows only).
>>
>> From your log messages, it appears the class in the calling thread is
>> org.apache.flume.client.avro.ReliableSpoolingFileEventReader
>>
>> However, the problem is happening in
>> org.apache.flume.serialization.DurablePositionTracker.getInstance()
>>
>> Within the source code, there is a comment on line 94 in the file stating
>> that renames are messy on Windows and that the logic is not atomic.
>>
>> There is also a recommendation for implementing a recovery procedure so
>> that if the file does not exist on startup, it will check for a rolled
>> version before attempting to create a brand new file.
>>
>> If it is possible for you to move to a different environment other than
>> Windows, that would be great.
>>
>> If this is not possible, then try deleting your spooling directory
>> "c:\flume_data\spool\web" which will also remove the metadata files
>> recursively.
>>
>> Back up all the pending files that have not yet been processed in the
>> spooling directory before deleting the folder so that you can put the files
>> back after the directory is recreated.
>>
>> Then restart your agent to see if this works.
>>
>> Let me know if this helps.
>>
>> On 12 April 2013 14:41, Paul Chavez <pchavez@verticalsearchworks.com> wrote:
>>
>>> Anyone have any ideas on this? I can't even find the class throwing the
>>> exception to try and see what it is doing. I would really like to use this
>>> on Windows, but would like to know at least if there's some compatibility
>>> issue so I can move on.
>>>
>>> thanks,
>>> Paul
>>>
>>>
>>>  ------------------------------
>>> *From:* Paul Chavez [mailto:pchavez@verticalsearchworks.com]
>>> *Sent:* Thursday, April 11, 2013 3:15 PM
>>> *To:* user@flume.apache.org
>>> *Subject:* spoolDir source problem
>>>
>>>   Hello,
>>>
>>> I've run into a problem with the spoolDir source, on Windows, and am not
>>> sure how to proceed.
>>>
>>> The agent starts fine and the source is created without issue and is
>>> apparently ready. After the agent starts, a .flumespool directory is
>>> created in the path the source is watching. This directory remains empty
>>> as long as the agent is idle.
>>>
>>> However, as soon as I drop a file into the spool directory (parent to
>>> the .flumespool directory), I get a series of errors in the flume log and a
>>> file named '.flumespool-main.meta<string of numbers>.tmp' is created in
>>> that .flumespool directory at the rate of one per second. The file in the
>>> spool directory is never touched as far as I can tell and the /metrics web
>>> page shows no movement on the channel or sink. A possibly related note is
>>> that the sources don't show in the metrics page, even though the logs say
>>> the source(s) are started.
>>>
>>> All I have done so far is set the directory security to
>>> 'Everyone/Full Control', basically the Windows version of 'chmod 777'.
>>>
>>> Any help is appreciated!
>>>
>>> thanks,
>>> Paul
>>>
>>> Here's what the log shows.
>>> 11 Apr 2013 15:11:48,092 INFO  [conf-file-poller-0]
>>> (org.apache.flume.node.Application.startAllComponents:184)  - Starting
>>> Source spool_WebLogs
>>> 11 Apr 2013 15:11:48,092 INFO  [conf-file-poller-0]
>>> (org.apache.flume.node.Application.startAllComponents:184)  - Starting
>>> Source http_Default
>>> 11 Apr 2013 15:11:48,092 INFO  [lifecycleSupervisor-1-0]
>>> (org.apache.flume.source.SpoolDirectorySource.start:66)  -
>>> SpoolDirectorySource source starting with directory: c:\flume_data\spool\web
>>> 11 Apr 2013 15:11:48,124 INFO  [conf-file-poller-0] (
>>> org.mortbay.log.Slf4jLog.info:67)  - Logging to
>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
>>> org.mortbay.log.Slf4jLog
>>> 11 Apr 2013 15:11:48,139 INFO  [conf-file-poller-0] (
>>> org.mortbay.log.Slf4jLog.info:67)  - jetty-6.1.26
>>> 11 Apr 2013 15:11:48,155 INFO  [conf-file-poller-0] (
>>> org.mortbay.log.Slf4jLog.info:67)  - Started
>>> SocketConnector@0.0.0.0:41414
>>> 11 Apr 2013 15:11:48,202 INFO  [lifecycleSupervisor-1-2] (
>>> org.mortbay.log.Slf4jLog.info:67)  - jetty-6.1.26
>>> 11 Apr 2013 15:11:48,217 INFO  [lifecycleSupervisor-1-2] (
>>> org.mortbay.log.Slf4jLog.info:67)  - Started
>>> SocketConnector@0.0.0.0:6240
>>> 11 Apr 2013 15:11:48,389 INFO  [lifecycleSupervisor-1-1]
>>> (org.apache.flume.sink.AvroSink.start:253)  - Avro sink avro_Default
>>> started.
>>> 11 Apr 2013 15:11:48,404 ERROR [pool-6-thread-1]
>>> (org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile:442)
>>> - Exception opening file:
>>> c:\flume_data\spool\web\u_ex130411.log-201304111500.log
>>> java.io.IOException: Unable to delete existing meta file
>>> c:\flume_data\spool\web\.flumespool\.flumespool-main.meta
>>>  at
>>> org.apache.flume.serialization.DurablePositionTracker.getInstance(DurablePositionTracker.java:96)
>>>  at
>>> org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile(ReliableSpoolingFileEventReader.java:423)
>>>  at
>>> org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:212)
>>>  at
>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
>>>  at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>>  at
>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>  at java.lang.Thread.run(Thread.java:662)
>>> 11 Apr 2013 15:11:48,919 ERROR [pool-6-thread-1]
>>> (org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile:442)
>>> - Exception opening file:
>>> c:\flume_data\spool\web\u_ex130411.log-201304111500.log
>>> java.io.IOException: Unable to delete existing meta file
>>> c:\flume_data\spool\web\.flumespool\.flumespool-main.meta
>>>  at
>>> org.apache.flume.serialization.DurablePositionTracker.getInstance(DurablePositionTracker.java:96)
>>>  at
>>> org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile(ReliableSpoolingFileEventReader.java:417)
>>>  at
>>> org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:212)
>>>  at
>>> org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
>>>  at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>>  at
>>> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
>>>  at
>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>  at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>  at java.lang.Thread.run(Thread.java:662)
>>>
>>
>>
>


-- 
Nitin Pawar
